Wednesday, November 2, 2011

Building the Game: Part 4 - Static Model Instancing


See the code for this post, or all posts in this series.
See the live demo.



Today's post is going to be far less involved than the last one, but it's an important subject that we need to nail down the basics of before we move on too much further.

Thus far we've been doing a decent job of showing one thing on the screen at a time, which is great if the game you're building is "Crate in empty space" or "Look at this thing!", but that's not the game we want to build! We want to build games like "Holy crap! That's a lot of stuff on screen!" and "Look at all these things!"

...or something like that.

Point being, many times in a game we're going to need to display multiple instances of the same model. Good examples would be: Health packs scattered around the level, a forest where we simply duplicate and repeat 4 or 5 different tree models, a long row of computer terminals, or even just players sharing the same skin.

There are a few naive ways of going about this. The worst would be to load up a new instance of the model for each instance we want on screen, like so:

var crate1 = new Model("root/model/crate");
var crate2 = new Model("root/model/crate");
var crate3 = new Model("root/model/crate");


crate1.draw();
crate2.draw();
crate3.draw();

The horror!!! This is going to eat your game alive, because you're pumping your GPU memory full of duplicate data and you'll end up page swapping in no time flat. (Page Swapping == Bad Things for your Framerate)

What we need to think about with something like this is what is actually going to be different about each instance of the model and store only that. Now, in our case, we'll start by saying that the only difference between models is going to be the position, rotation, and scale. In other words, we need to store a transformation matrix per-instance. Further down the road we may want to add things like alternate skins or lightmaps to that, but for now let's just stick to the transform.
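To make "a transformation matrix per-instance" a little more concrete, here is a minimal sketch of building one from a position, a rotation, and a scale. The `makeTransform` helper is hypothetical (it is not part of the code for this post), and it assumes rotation about the Y axis only and a uniform scale to keep things short. A real project would more likely lean on a matrix library.

```javascript
// Hypothetical helper, not from the post's code: builds a 4x4 transform in
// column-major order (the layout WebGL expects) from a position, a rotation
// about the Y axis (in radians), and a uniform scale.
function makeTransform(x, y, z, yaw, scale) {
    var c = Math.cos(yaw) * scale;
    var s = Math.sin(yaw) * scale;
    return new Float32Array([
        c, 0, -s, 0,      // column 0: rotated and scaled X axis
        0, scale, 0, 0,   // column 1: scaled Y axis (unaffected by yaw)
        s, 0, c, 0,       // column 2: rotated and scaled Z axis
        x, y, z, 1        // column 3: translation
    ]);
}

// One small matrix like this is all we need to store per crate.
var crateTransform = makeTransform(10, 0, -5, Math.PI / 4, 2);
```

Sixteen floats per instance is tiny next to a full copy of the vertex data, which is the whole point.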

So, by tweaking our model to allow it to take in a transformation matrix, we can now do something like so:

var crate = new Model("root/model/crate");
var instances = [matrix1, matrix2, matrix3];


crate.draw(instances[0]);
crate.draw(instances[1]);
crate.draw(instances[2]);

And this will perform so much better than the first code snippet that it's not even funny! And, really, you could probably build a game like this if you wanted to. But... well, we're still repeating ourselves a little too much. Let's look at the (pseudocode) internals of draw:

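Model.prototype.draw = function(matrix) {
    // Bind all of the state this model needs (pseudocode helpers)
    bindShader();
    bindBuffers();
    bindUniforms();
    bindTransform(matrix);
    bindVertexAttributes();

    // Draw every submesh of every mesh
    for (var i = 0; i < this.meshes.length; i++) {
        var mesh = this.meshes[i];
        for (var j = 0; j < mesh.submeshes.length; j++) {
            var submesh = mesh.submeshes[j];
            drawTriangles(submesh.start, submesh.count);
        }
    }
};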

The important part to realize here is that WebGL is a state machine, so all of those bind... calls stay in effect until they're overridden. This means that although our previous draw code looks really nice and clean, what's really happening is this:

bindStuff();
draw();


bindStuff(); // Unnecessary!
draw();


bindStuff(); // Unnecessary!
draw();


It's certainly not helping us to do all of that binding over and over again if we only need to change one little part of the state (the transform matrix).


(As a quick aside: there's a good possibility that the graphics driver will recognize that you are binding the same stuff again and simply ignore your call rather than spend the time to redo it. That's awesome, but unreliable. Maybe they do that on the desktop, but what about mobile? Or what if the manufacturer decides that the optimization is causing a problem and removes it one day? Point being, we don't want to rely on the driver being smart for our performance when we can improve our own code instead.)


What we'd really rather do is something like this:


bindStuff();


bindTransform(matrix);
draw();


bindTransform(matrix);
draw(); 



bindTransform(matrix);
draw();
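To put some numbers on the difference between the two patterns above, here is a small sketch that swaps the real WebGL calls for counters. Everything in it is a stand-in: `bindStuff`, `bindTransform`, and `draw` only record that they were called, and `drawNaive` and `drawBatched` are hypothetical names for the two approaches.

```javascript
// Stand-ins for the real WebGL work: each function just counts its calls.
var calls = { bindStuff: 0, bindTransform: 0, draw: 0 };
function bindStuff() { calls.bindStuff++; }               // shader, buffers, uniforms, attributes
function bindTransform(matrix) { calls.bindTransform++; } // the only per-instance state
function draw() { calls.draw++; }

// Naive pattern: all of the state is rebound for every instance.
function drawNaive(matrices) {
    for (var i = 0; i < matrices.length; i++) {
        bindStuff();
        bindTransform(matrices[i]);
        draw();
    }
}

// Batched pattern: shared state is bound once, then only the transform changes.
function drawBatched(matrices) {
    bindStuff();
    for (var i = 0; i < matrices.length; i++) {
        bindTransform(matrices[i]);
        draw();
    }
}

var matrices = ["matrix1", "matrix2", "matrix3"]; // placeholders for real matrices

drawNaive(matrices);
var naiveBinds = calls.bindStuff; // one full rebind per instance

calls.bindStuff = calls.bindTransform = calls.draw = 0;
drawBatched(matrices);
var batchedBinds = calls.bindStuff; // a single rebind for the whole batch
```

Both versions issue the same number of draw calls and transform updates. The only thing that shrinks is the shared state binding, which goes from once per instance to once per model.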


By only changing the stuff that actually changes from draw to draw we make our rendering as efficient as possible. Now we just need to implement it in a way that makes it easy to use. My version looks like this:


var crate = new Model("root/model/crate");


var instance1 = crate.createInstance();
var instance2 = crate.createInstance();
var instance3 = crate.createInstance();


crate.drawInstances();


Since the only way to get an instance is through the original model, that lets us keep a list of the instances internally. That makes drawing all the instances easy. You can see the instancing management and rendering code in model.js.
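For readers who don't want to dig through model.js right away, here is a rough sketch of the idea. This is not the actual code from the repository: the `ModelInstance` type, the `instances` array, and the `renderer` object with its `bindSharedState`, `bindTransform`, and `drawMeshes` methods are all assumptions made up for illustration, and the stub renderer only logs calls instead of talking to WebGL.

```javascript
// Hypothetical per-instance record: just a back-reference and a transform.
function ModelInstance(model) {
    this.model = model;
    this.matrix = new Float32Array([1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]); // identity
}

function Model(url) {
    this.url = url;      // a real Model would start loading mesh data here
    this.instances = []; // every instance ever handed out
}

// The only way to get an instance, so the model always knows about it.
Model.prototype.createInstance = function () {
    var instance = new ModelInstance(this);
    this.instances.push(instance);
    return instance;
};

// Bind the shared state once, then change only the transform per instance.
// `renderer` is an assumed abstraction over the actual WebGL calls.
Model.prototype.drawInstances = function (renderer) {
    if (this.instances.length === 0) { return; }
    renderer.bindSharedState(this);
    for (var i = 0; i < this.instances.length; i++) {
        renderer.bindTransform(this.instances[i].matrix);
        renderer.drawMeshes(this);
    }
};

// Stub renderer that records the order of calls.
var log = [];
var renderer = {
    bindSharedState: function (model) { log.push("shared"); },
    bindTransform: function (matrix) { log.push("transform"); },
    drawMeshes: function (model) { log.push("draw"); }
};

var crate = new Model("root/model/crate");
var instance1 = crate.createInstance();
var instance2 = crate.createInstance();
crate.drawInstances(renderer);
```

Passing the renderer in is just a convenience for the sketch, since it lets the draw order be inspected without a GL context.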


There are a few other features that I've slipped into the instancing code to make life easier. For one, in anything but the most trivial of circumstances you probably won't want to draw every instance every frame. To this end each instance has an updateVisibility function that takes in an integer. That may sound a little odd, but it's utilizing an old trick that I picked up from the original Quake code. Here's how it works: We keep track of a number that increments with every frame (or every change of view, if you want to be more conservative). Then we run whatever visibility algorithm we have in place (BSP, octree, etc.) to determine which instances are visible and flag them with the current frame number. Finally, when we go to render, we only draw those instances whose "visibility number" matches the current frame. What this gains us is a system where we don't have to hit every instance every frame to explicitly flag it as "not visible". Instead, we just update the instances that are visible and the rest become inherently invisible. It's a small thing, but one that's simple enough to implement and can cut down on the amount of looping we do per frame.
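The frame number trick is easier to see in code than in prose, so here is a minimal sketch of it. Only the updateVisibility name comes from the actual code. The `Instance` constructor, the `visibleFrame` field, and the `visibleInstances` helper are made up for this example, and the "culling" is just hand-picked instances.

```javascript
// Hypothetical instance: starts at -1 so it matches no real frame number.
function Instance(name) {
    this.name = name;
    this.visibleFrame = -1;
}

// Stamp the instance with the frame in which it was last found visible.
Instance.prototype.updateVisibility = function (frameId) {
    this.visibleFrame = frameId;
};

// At render time, keep only the instances stamped with the current frame.
function visibleInstances(instances, frameId) {
    var result = [];
    for (var i = 0; i < instances.length; i++) {
        if (instances[i].visibleFrame === frameId) {
            result.push(instances[i]);
        }
    }
    return result;
}

var a = new Instance("a"), b = new Instance("b"), c = new Instance("c");
var all = [a, b, c];
var frame = 0;

// Frame 1: pretend our culling pass found a and b.
frame++;
a.updateVisibility(frame);
b.updateVisibility(frame);
var frame1Visible = visibleInstances(all, frame);

// Frame 2: only b is found. Nobody touches a or c to "hide" them.
frame++;
b.updateVisibility(frame);
var frame2Visible = visibleInstances(all, frame);
```

Notice that `a` drops out of the second frame without ever being reset: its stamp is simply stale.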


The other thing implemented on the instances is an individual draw method, which essentially just draws the Model with that instance's matrix. This draw does not attempt to optimize anything; it just binds the state and draws the mesh once, but it can be used for debugging or specific effects.


To demonstrate the instancing system, I've put together a simple demo that takes four different meshes and creates 250 instances of each with random translations and rotations for a total of 1000 models being rendered per frame. (In my tests I could get it to render more than 4000 before I started to see slowdown, but I went lower to account for slower systems.) The result looks like an explosion in a barrel factory.
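As a rough idea of how a scene like that can be set up, here is a sketch that generates the random placements. It is not the demo's actual code: the `randomPlacements` helper, the placeholder mesh names, the cube half-width of 100 units, and the choice to rotate only about the Y axis are all assumptions for illustration.

```javascript
// Hypothetical helper: returns `count` random placements inside a cube of
// half-width `range`, each with a random rotation about the Y axis.
// `random` defaults to Math.random but can be swapped out for testing.
function randomPlacements(count, range, random) {
    random = random || Math.random;
    var placements = [];
    for (var i = 0; i < count; i++) {
        placements.push({
            x: (random() * 2 - 1) * range,
            y: (random() * 2 - 1) * range,
            z: (random() * 2 - 1) * range,
            yaw: random() * Math.PI * 2
        });
    }
    return placements;
}

var meshNames = ["meshA", "meshB", "meshC", "meshD"]; // placeholder names
var scene = {};
var total = 0;
for (var m = 0; m < meshNames.length; m++) {
    scene[meshNames[m]] = randomPlacements(250, 100);
    total += scene[meshNames[m]].length;
}
// total now holds 4 * 250 placements, one per instance to create
```

Each placement would then be turned into that instance's transform matrix before drawing.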


The code for today's post is relatively simple, and will need to be expanded as we work out exactly how we're going to be using instances in the game proper. Also, it doesn't account for instancing skinned models just yet as I have some more complex plans for that. Finally, although we're referring to this technique as "Instancing" it's important to note that this is NOT hardware instancing. Unfortunately, at the time of writing WebGL doesn't support hardware instancing, so this is the best we can do for now. If we ever do gain that ability, however, it would be relatively simple to update this code to support it without changing the high-level code usage.


Still not completely certain what the next post is going to cover. I've got a couple of different things I want to talk about and it really depends on which piece of code starts working first. Hope to see you there no matter what the subject matter ends up being!


6 comments:

  1. Good to see that I used the right method in the past for my WebGL game (though right now I join all the instances, losing animation, and precalculate the vertices from the matrices to gain ~30fps on a good GPU; too bad I can't write efficient frustum culling in JavaScript...)
    Some comments:
    -> Put the "gl.activeTexture(gl.TEXTURE0);" and "gl.uniform1i(shader.uniform.diffuse, 0);" before the loop (don't need to send this stuff again and again and again)
    -> Isn't it better to do something like this :

    var m, i, j, model, instance, mesh;

    for ( m in models ) {
        model = models[m];

        for ( i in model.instances ) {
            instance = model.instances[i];

            if ( instance._visibleFlag < 0 || instance._visibleFlag >= visibilityFlag ) {
                bindmatrix( instance.matrix );

                for ( j in model.mesh ) {
                    mesh = model.mesh[j];
                    bindtexture( mesh.texture );
                    draw( mesh.elements );
                }
            }
        }
    }

  2. Vince,

    Right now you could probably order the draw loop either way and it wouldn't make much difference, but that's only because we're using a single texture. That texture is only a placeholder for what will eventually be a much more complex material system, however, and as such the cost of binding it is assumed to be higher than the cost of binding a single uniform.

    Also, consider that it will likely be fairly common to have models with a couple of different materials and therefore a couple of different methods. Under your flow this would look like so:

    bindMatrix, bindMaterial, draw, bindMaterial, draw
    bindMatrix, bindMaterial, draw, bindMaterial, draw...

    Assuming once again that a "material" is eventually going to be something more complex than a single diffuse texture, this would be a much better pattern:

    bindMaterial, bindMatrix, draw, bindMatrix, draw
    bindMaterial, bindMatrix, draw, bindMatrix, draw

    But, of course, you should always keep in mind that I'm not a professional game dev. These are techniques that I've seen work well in my previous demos and non-WebGL 3D code. There's always a possibility that they may not work when extrapolated out into an actual game. That's part of the reason I'm doing all of this, it's just as much a learning experience for me as it is for anybody else!

  3. Dude, thanks a ton for putting this series. I have done some game programming in J2ME during early 2K and know how difficult but same time exciting a game development could be. a true programming job in my opinion.

    Thanks
    Javin

  4. Just wanted to say KEEP UP THE WONDERFUL WORK.

    You, sir, are awesome. I devour your every post in this series and wait for each new installment with bated breath. I eagerly and enthusiastically encourage you to dive in and create BTG#5,6,7,8,9... post-haste.

    If I was king of the universe I'd demand more BTG posts more often, please! The eternity between posts is killing me. In a good way! I know you're super busy and respect that, but just know that there are a myriad of silent lurkers like me eagerly waiting for the next amazingly mind-blowing post. =)

  5. Thanks for the vote of confidence! :) I've been busy lately, and the Thanksgiving holiday has prevented me from getting too much personal code done. I'm still working on it, though! The biggest holdup in the code has been figuring out some more of the internals of Unity, but I feel like I'm close!

  6. I came through google and wanted some sample project structure for this application. can you share the link to the application code. We are trying to run it as part of a legacy application which is still using Vectors from Java collections.
