See the code for this post, or all posts in this series.
See the live demo.
Today's post is going to be far less involved than the last one, but it's an important subject, and we need to nail down the basics before we move on much further.
Thus far we've been doing a decent job of showing one thing on the screen at a time, which is great if the game you're building is "Crate in empty space" or "Look at this thing!", but that's not the game we want to build! We want to build games like "Holy crap! That's a lot of stuff on screen!" and "Look at all these things!"
...or something like that.
Point being, many times in a game we're going to need to display multiple instances of the same model. Good examples would be health packs scattered around the level, a forest where we simply duplicate and repeat 4 or 5 different tree models, a long row of computer terminals, or even just players sharing the same skin.
There are a few naive ways of going about this. The worst would be to load up a new instance of the model for each instance we want on screen, like so:
var crate1 = new Model("root/model/crate");
var crate2 = new Model("root/model/crate");
var crate3 = new Model("root/model/crate");
crate1.draw();
crate2.draw();
crate3.draw();
The horror!!! This is going to eat your game alive, because you're pumping your GPU memory full of duplicate data and you'll end up page swapping in no time flat. (Page Swapping == Bad Things for your Framerate)
What we need to think about with something like this is what is actually going to be different about each instance of the model and store only that. Now, in our case, we'll start by saying that the only difference between models is going to be the position, rotation, and scale. In other words, we need to store a transformation matrix per-instance. Further down the road we may want to add things like alternate skins or lightmaps to that, but for now let's just stick to the transform.
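To make "a transformation matrix per instance" concrete, here is a minimal sketch of packing a position, rotation, and scale into one 4x4 matrix. The `makeTransform` helper is hypothetical (it is not part of the post's code), and it assumes rotation about the Y axis only and a uniform scale to keep things short; a real project would more likely lean on a matrix library.

```javascript
// Hypothetical helper, not from the post's code: builds a column-major 4x4
// matrix (the layout WebGL expects for uniforms) from a position, a rotation
// about the Y axis in radians, and a uniform scale factor.
function makeTransform(position, yRotation, scale) {
  const c = Math.cos(yRotation) * scale;
  const s = Math.sin(yRotation) * scale;
  return new Float32Array([
    c, 0, -s, 0,                                // column 0: rotated, scaled X axis
    0, scale, 0, 0,                             // column 1: scaled Y axis
    s, 0, c, 0,                                 // column 2: rotated, scaled Z axis
    position[0], position[1], position[2], 1,   // column 3: translation
  ]);
}

// Each instance costs 16 floats, regardless of how big the model is.
const instanceMatrices = [
  makeTransform([0, 0, 0], 0, 1),
  makeTransform([5, 0, 0], Math.PI / 2, 2),
];
console.log(instanceMatrices[1].length); // 16
```

The point is the size of the per-instance data: the vertex buffers and textures live in the model once, and each instance only adds a small matrix.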
So, by tweaking our model to allow it to take in a transformation matrix, we can now do something like so:
var crate = new Model("root/model/crate");
var instances = [matrix1, matrix2, matrix3];
crate.draw(instances[0]);
crate.draw(instances[1]);
crate.draw(instances[2]);
And this will perform so much better than the first code snippet that it's not even funny! And, really, you could probably build a game like this if you wanted to. But... well, we're still repeating ourselves a little too much. Let's look at the (pseudocode) internals of draw:
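Model.prototype.draw = function(matrix) {
    // Every call re-binds all of this state, even if nothing changed
    bindShader();
    bindBuffers();
    bindUniforms();
    bindTransform(matrix); // the only per-instance state
    bindVertexAttributes();
    // Assumes meshes and submeshes are arrays
    for (const mesh of this.meshes) {
        for (const submesh of mesh.submeshes) {
            drawTriangles(submesh.start, submesh.count);
        }
    }
};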
The important part to realize here is that WebGL is a state machine, and so all of those bind... commands will continue to take effect until they're overridden. This means that in our previous draw code, although it looks really nice and clean, what's really happening is this:
bindStuff();
draw();
bindStuff(); // Unnecessary!
draw();
bindStuff(); // Unnecessary!
draw();
It's certainly not helping us to do all of that binding over and over again if we only need to change one little part of the state (the transform matrix).
(As a quick aside: there's a good possibility that the graphics driver will recognize that you are binding the same stuff again and simply ignore your call rather than spend the time to redo it. That's awesome, but unreliable. Maybe they do that on the desktop, but what about mobile? Or what if the manufacturer decides that the optimization is causing a problem and removes it one day? Point being, we don't want to rely on the driver being smart for our performance when we can improve our own code instead.)
What we'd really rather do is something like this:
bindStuff();
bindTransform(matrix1);
draw();
bindTransform(matrix2);
draw();
bindTransform(matrix3);
draw();
By only changing the stuff that actually changes from draw to draw we make our rendering as efficient as possible. Now we just need to implement it in a way that makes it easy to use. My version looks like this:
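To see the difference in numbers, here is a small sketch that swaps the real WebGL calls for stand-ins that only record what was called. The `bindStuff`, `bindTransform`, and `draw` names mirror the pseudocode above; `drawNaive`, `drawBatched`, and `countBinds` are made up for this illustration.

```javascript
// Stand-ins for the real GL work: each one just records that it was called.
const calls = [];
const bindStuff = () => calls.push("bindStuff");
const bindTransform = (matrix) => calls.push("bindTransform:" + matrix);
const draw = () => calls.push("draw");

// Naive: rebind all of the shared state for every instance.
function drawNaive(matrices) {
  for (const matrix of matrices) {
    bindStuff();
    bindTransform(matrix);
    draw();
  }
}

// Batched: bind the shared state once, then change only the transform.
function drawBatched(matrices) {
  bindStuff();
  for (const matrix of matrices) {
    bindTransform(matrix);
    draw();
  }
}

// Runs one strategy and counts how many times the shared state was bound.
function countBinds(drawFn, matrices) {
  calls.length = 0;
  drawFn(matrices);
  return calls.filter((name) => name === "bindStuff").length;
}

const matrices = ["m1", "m2", "m3"];
console.log(countBinds(drawNaive, matrices));   // 3
console.log(countBinds(drawBatched, matrices)); // 1
```

With three instances the saving is trivial, but the naive count grows with the number of instances while the batched count stays at one per model.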
var crate = new Model("root/model/crate");
var instance1 = crate.createInstance();
var instance2 = crate.createInstance();
var instance3 = crate.createInstance();
crate.drawInstances();
Since the only way to get an instance is through the original model, that lets us keep a list of the instances internally. That makes drawing all the instances easy. You can see the instancing management and rendering code in model.js.
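For readers who don't want to dig through model.js, the idea can be sketched roughly like this. This is not the actual implementation: `SketchModel`, the `renderer` argument, and its `bindSharedState`, `bindTransform`, and `drawMeshes` methods are all assumptions made so the example can run without a WebGL context.

```javascript
// A rough sketch of a model that tracks its own instances.
class SketchModel {
  constructor(url) {
    this.url = url;
    this.instances = []; // every instance ever created from this model
  }

  // The only way to get an instance, so the model always knows about it.
  createInstance() {
    const instance = {
      model: this,
      // Starts as the identity matrix; callers would overwrite this.
      matrix: new Float32Array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]),
    };
    this.instances.push(instance);
    return instance;
  }

  // Binds shared state once, then only the transform for each instance.
  drawInstances(renderer) {
    if (this.instances.length === 0) return;
    renderer.bindSharedState(this);
    for (const instance of this.instances) {
      renderer.bindTransform(instance.matrix);
      renderer.drawMeshes(this);
    }
  }
}

// Hypothetical renderer that only logs; a real one would issue WebGL calls.
const log = [];
const loggingRenderer = {
  bindSharedState: () => log.push("bindSharedState"),
  bindTransform: () => log.push("bindTransform"),
  drawMeshes: () => log.push("drawMeshes"),
};

const sketchCrate = new SketchModel("root/model/crate");
sketchCrate.createInstance();
sketchCrate.createInstance();
sketchCrate.createInstance();
sketchCrate.drawInstances(loggingRenderer);
console.log(log.join(" "));
```

Passing the renderer in is purely a convenience for the sketch; the post's `drawInstances()` takes no arguments.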
There are a few other features that I've slipped into the instancing code to make life easier. For one, in anything but the most trivial of circumstances you probably won't want to draw every instance every frame. To this end, each instance has an updateVisibility function that takes in an integer. That may sound a little odd, but it's utilizing an old trick that I picked up from the original Quake code. Here's how it works: we keep track of a number that increments with every frame (or every change of view, if you want to be more conservative). Then we can run whatever algorithm we have in place (BSP, octree, etc.) to determine which instances are visible and flag them with that same frame number. Finally, when we go to render, we only draw those instances that share the same "visibility number" as the current frame. What this gains us is a system where we don't have to hit every mesh every frame to explicitly flag it as "not visible". Instead, we just update the meshes that are visible and the rest become inherently invisible. It's a small thing, but one that's simple enough to implement and can cut down on the amount of looping we do per frame.
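The frame-number trick is easier to follow in code. The sketch below only assumes the `updateVisibility` name from the post; the `VisibleInstance` class, the `visibleFrame` field, and the `visibleThisFrame` helper are invented for illustration, and the "culling pass" is faked by flagging instances by hand.

```javascript
class VisibleInstance {
  constructor(name) {
    this.name = name;
    this.visibleFrame = -1; // has never been flagged as visible
  }

  // Called by the culling pass for each instance it decides is visible.
  updateVisibility(frameNumber) {
    this.visibleFrame = frameNumber;
  }
}

// Only instances stamped with the current frame number get drawn.
function visibleThisFrame(instances, frameNumber) {
  return instances.filter((instance) => instance.visibleFrame === frameNumber);
}

const allInstances = [
  new VisibleInstance("a"),
  new VisibleInstance("b"),
  new VisibleInstance("c"),
];
let frame = 0;

// Frame 1: the (pretend) culling pass finds "a" and "b".
frame++;
allInstances[0].updateVisibility(frame);
allInstances[1].updateVisibility(frame);
const drawnFrame1 = visibleThisFrame(allInstances, frame).map((i) => i.name);

// Frame 2: only "c" is flagged. "a" and "b" drop out on their own because
// their stamp is stale; nobody had to reset them to "not visible".
frame++;
allInstances[2].updateVisibility(frame);
const drawnFrame2 = visibleThisFrame(allInstances, frame).map((i) => i.name);

console.log(drawnFrame1); // [ 'a', 'b' ]
console.log(drawnFrame2); // [ 'c' ]
```

Notice that nothing ever writes a "hidden" flag. Bumping the frame counter invalidates every old stamp at once.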
The other thing implemented on the instances is an individual draw method, which essentially just draws the Model with that instance's matrix. This method does not attempt to optimize the draw at all; it just binds the state and draws the mesh once, but it can be used for debugging or specific effects.
To demonstrate the instancing system, I've put together a simple demo that takes four different meshes and creates 250 instances of each with random translations and rotations, for a total of 1000 models being rendered per frame. (In my tests I could get it to render more than 4000 before I started to see slowdown, but I went lower to account for slower systems.) The result looks like an explosion in a barrel factory.
The code for today's post is relatively simple, and will need to be expanded as we work out exactly how we're going to be using instances in the game proper. Also, it doesn't account for instancing skinned models just yet, as I have some more complex plans for that. Finally, although we're referring to this technique as "instancing", it's important to note that this is NOT hardware instancing. Unfortunately, the core WebGL 1.0 API doesn't support hardware instancing (it is only available through the ANGLE_instanced_arrays extension or in WebGL 2), so this is the best we can do for now. If we do move to hardware instancing later, however, it would be relatively simple to update this code to support it without changing the high-level code usage.
Still not completely certain what the next post is going to cover. I've got a couple of different things I want to talk about and it really depends on which piece of code starts working first. Hope to see you there no matter what the subject matter ends up being!