Sunday, October 30, 2011

Building the Game: Part 3 - Skinning & Animation

See the code for this post, or all posts in this series.
See the live demo.

In the previous BtG we got the basics of a model format in place, but it only accounts for static meshes. Now, static geometry is very important and will make up the majority of any scene in our eventual game. But we all know that the most interesting things you see on screen are the ones that are moving, whether it be the player sprinting across the screen or the rocket careening towards your head.

There are many different techniques for creating motion in a game. It can be sliding a static mesh back and forth for a door, running a rigid body through a physics simulation, generating particle effects for smoke and sparks, transforming texture coordinates to fake flowing water, or deforming a mesh to look like waving cloth. We'll end up talking about some of those methods as we work our way through this series, but today I want to talk about the big daddy of animation: Skeletal Animation and Mesh Skinning!

I'm not going to cover the basics of those topics here, because this post is going to end up being long enough as-is. But if you're unfamiliar with them and would like to know more I'd recommend checking out the Wikipedia Article on the subject for a high level overview. Beyond that, Googling for "Mesh Skinning" and "Skeletal Animation" will turn up lots of good resources. For my part, I'm more interested in talking about how these systems will be implemented in my codebase.

So, first off, we know that we're going to need a list of bones with some information about their bind pose in here somewhere. This is information that could, conceivably, be stored in the binary file, but in this case I've found it more convenient to store it in the JSON, if for no other reason than easy debugging. The data sits alongside the "meshes" block, and looks like so:

"bones": [
    {
        "name": "root",
        "parent": -1,
        "pos": [0, 0, 0],
        "rot": [0, 0, 0, 1],
        "skinned": false
    },
    {
        "name": "lower-back",
        "parent": 0,
        "pos": [0, 1.723, -0.56],
        "rot": [-0.454, 0.509, 0.551, 0.479],
        "skinned": true,
        "bindPoseMat": [1, 0, 0, 0, 0, 1, 0, 0, ...]
    }
]

As you can see, each bone has a name, a parent bone ID (-1 means no parent), a position (Vector) and rotation (Quaternion), a "skinned" boolean, and if skinned is true a bind pose matrix. The bind pose matrix will be stored as an array of 16 floats, and actually represents the inverse of the bind pose skeletal transform. We need that information to allow the vertices to transform properly. 

Now that "skinned" value might be a little confusing, so here's the logic behind it: For each bone in our skeleton we are going to need to recursively update the rotations and positions based on the parent bones, calculate a transformation matrix based off of that, then multiply that matrix by the bind pose matrix to get the ACTUAL matrix that the vertices will be transformed by. In normal circumstances that's a lot of work, and in JavaScript that's crazy expensive! But not every bone in your skeleton will have vertices attached to it! It's not unusual for the character's root bone or parts of the spine, etc., to have no vertices weighted against them, but we still need them there to guide the overall animation. If skinned is false, we know that we can skip the matrix calculation on those bones, which will save us a little bit of time. Now, to be honest I'm not sure if that will make any real difference in the long run, but it has proven useful on the meshes I've tested thus far.

Another thing that's worth pointing out: The bone structure is inherently hierarchical, and as such it's extremely tempting to make it into a nested structure here. That's probably going to cause more headaches than it's worth, however, so instead we lay them all out in a flat array. What we do enforce here, however, is that every bone in the list MUST appear after its parent bone. This allows us, when updating the bones during animation, to always do a single pass, non-recursive calculation on each bone because we always know that its parent will have already been processed and have the correct values accumulated. Plus, iterating over a flat array is faster than navigating a tree anyway and it makes the bone indexing more straightforward, so it's a win-win all around.
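To make the "parents before children" rule a bit more concrete, here is a minimal sketch of what a single-pass update over the flat array could look like. The helper names and the worldPos/worldRot properties are my own assumptions for illustration, not the actual code from this post's repository, and it only accumulates positions and rotations (no matrices, no bind pose).

```javascript
// Illustrative sketch only: quatMultiply, quatRotateVec3, updateWorldTransforms,
// worldPos and worldRot are hypothetical names, not the post's actual code.

// Multiply two quaternions stored as [x, y, z, w].
function quatMultiply(a, b) {
    return [
        a[3] * b[0] + a[0] * b[3] + a[1] * b[2] - a[2] * b[1],
        a[3] * b[1] - a[0] * b[2] + a[1] * b[3] + a[2] * b[0],
        a[3] * b[2] + a[0] * b[1] - a[1] * b[0] + a[2] * b[3],
        a[3] * b[3] - a[0] * b[0] - a[1] * b[1] - a[2] * b[2]
    ];
}

// Rotate a 3D vector by a unit quaternion.
function quatRotateVec3(q, v) {
    var x = q[0], y = q[1], z = q[2], w = q[3];
    // t = 2 * cross(q.xyz, v)
    var tx = 2 * (y * v[2] - z * v[1]);
    var ty = 2 * (z * v[0] - x * v[2]);
    var tz = 2 * (x * v[1] - y * v[0]);
    // v' = v + w * t + cross(q.xyz, t)
    return [
        v[0] + w * tx + (y * tz - z * ty),
        v[1] + w * ty + (z * tx - x * tz),
        v[2] + w * tz + (x * ty - y * tx)
    ];
}

// One non-recursive pass over the flat bone list. This is only correct
// because every bone is guaranteed to appear after its parent.
function updateWorldTransforms(bones) {
    for (var i = 0; i < bones.length; ++i) {
        var bone = bones[i];
        if (bone.parent === -1) {
            // Root bones have nothing to accumulate.
            bone.worldPos = bone.pos.slice();
            bone.worldRot = bone.rot.slice();
        } else {
            // The parent was already processed earlier in this same loop.
            var parent = bones[bone.parent];
            var offset = quatRotateVec3(parent.worldRot, bone.pos);
            bone.worldPos = [
                parent.worldPos[0] + offset[0],
                parent.worldPos[1] + offset[1],
                parent.worldPos[2] + offset[2]
            ];
            bone.worldRot = quatMultiply(parent.worldRot, bone.rot);
        }
    }
    return bones;
}
```

If a child ever showed up before its parent, parent.worldRot would be undefined at that point, which is exactly why the exporter has to enforce the ordering.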

On the vertex side, skinning information is simply added to the vertex buffer. We installed a vertex format flag in the binary file in the last post, and now we can use that to indicate that the vertices have skinning data as well. They're packed in the interleaved data just like the positions and normals and whatnot in the pattern: "Weight 0, Weight 1, Weight 2, Bone 0, Bone 1, Bone 2". I originally wanted to store the bone indices and weights as bytes, since we can only reference a small number of bones per mesh anyway, but after some experimentation I found that:
  1. WebGL doesn't allow you to use bytes as attributes. (Aww....)
  2. You can convince it to use shorts, but the drivers hate you for it. You get nasty speed hits.
So I bit the bullet in this case and simply allowed the bone indices and weights to be floats. (I cast the indices to ints in the shader). The waste of bytes makes me grind my teeth a little, but it's not a huge deal for the better performance. That said, I'd be very interested to know if mobile devices behaved differently in this regard.
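Since everything is a float, working out where the skinning data lands in the interleaved buffer is simple arithmetic. The sketch below is purely illustrative: the attribute names and the "position, texCoord, normal" portion of the layout are assumptions on my part (the real format is driven by the vertex format flags), but the stride and offset math is the same whatever the exact layout ends up being.

```javascript
// Illustrative sketch: attribute names and the non-skinning part of the
// layout are assumptions, not the actual vertex format from model.js.
var BYTES_PER_FLOAT = 4;

// Given an ordered list of float attributes, compute the byte offset of
// each one and the total stride of a single interleaved vertex.
function computeVertexLayout(attributes) {
    var offset = 0;
    var layout = { stride: 0, offsets: {} };
    attributes.forEach(function (attr) {
        layout.offsets[attr.name] = offset;
        offset += attr.size * BYTES_PER_FLOAT;
    });
    layout.stride = offset;
    return layout;
}

var skinnedLayout = computeVertexLayout([
    { name: "position", size: 3 },
    { name: "texCoord", size: 2 },
    { name: "normal", size: 3 },
    { name: "weights", size: 3 }, // Weight 0, Weight 1, Weight 2
    { name: "bones", size: 3 }    // Bone 0, Bone 1, Bone 2 (stored as floats)
]);

// These values are what would be handed to WebGL, for example:
// gl.vertexAttribPointer(weightsLocation, 3, gl.FLOAT, false,
//     skinnedLayout.stride, skinnedLayout.offsets.weights);
```

With this assumed layout the six skinning floats cost 24 bytes per vertex, where bytes would have cost 6. That is the waste I was grumbling about.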

I also made a personal call to say that I would allow three bone weights per vertex instead of four as a way of allowing reasonably complex skinning but still cut down on the amount of data we're slinging around. To be perfectly honest, I don't have enough experience with these things yet to know if that will help or hurt me in the end (maybe I could have gotten away with only two?) but it feels like a good middle ground. Nevertheless, if you find that you need more or less than this for whatever reason it won't be too difficult to change.

Implementation wise, there was a new ModelVertexFormat flag added in model.js as mentioned earlier, but the bulk of the skinning specific loading code is in a new file and new class: skinned-model.js. The SkinnedModel class inherits the Model prototype, and adds a few new bits of loading code as needed. It's worth noting, however, that rather than try to build on top of the Model's draw routine, SkinnedModel gets its own completely independent implementation. This may seem somewhat imprudent, as about 80% of the code is the same and the opportunity for abstraction is high, but I'm resisting the temptation for a very simple reason:

Abstractions slow you down.

For the loading, I don't care so much. Yes, we want it to be fast but a few milliseconds here and there while loading meshes aren't going to be missed. We're going to be executing our rendering code hundreds or thousands of times every frame, though. As such, do we really want to be forcing the system to jump through multiple layers of function calls and redirections with every draw if we don't have to? Not to mention we'd probably have to introduce some tests to determine if we were rendering skinned or unskinned so that we could set our state correctly, and we'd have to pass more data around to ensure the right shader uniforms got filled in and... You know what, it's not that much code! It's totally worth it to me to repeat the ~40 lines of code and tweak them a bit to make sure that each Model variant renders as quickly and directly as possible.

So that takes care of the mesh, but what about the animations?

There are a bunch of different ways that animations can be stored, but I'm making mine relatively simplistic: We'll store a list of the bones that are affected (so we can have animations that only modify the legs, for example) and a list of frames. Each frame will be a snapshot of the rotations and positions (I'm ignoring scaling for now) of the bones at that frame, whether or not they've changed. That keeps the calculations really easy. We'll have a few more informational items added to the header, and for the sake of simplicity I'm doing this one as JSON. (At least initially. We'll see if the space savings justify moving it to binary later on.)

{
    "animVersion": 1,
    "name": "run_forward",
    "frameRate": 30,
    "duration": 633,
    "frameCount": 18,
    "bones": [ "player_root", "Bip001", "Bip001 Pelvis", ... ],
    "keyframes": [
        [
            { "pos": [ -6.148, -0.052, 0 ],
              "rot": [ 0, 0, 0, 1 ] },
            { "pos": [ 2.850, 1.012, 0.062 ],
              "rot": [ -0.431, 0.514, 0.567, 0.476 ] },
            { "pos": [ 0, 0, 0 ],
              "rot": [ -0.499, 0.500, 0.499, 0.5 ] },
            ...
        ],
        [
            { "pos": [ -6.148, -0.052, 0 ],
              "rot": [ 0, 0, 0, 1 ] },
            { "pos": [ -0.008, 1.003, 0.062 ],
              "rot": [ -0.44, 0.506, 0.573, 0.470 ] },
            { "pos": [ 0, 0, 0 ],
              "rot": [ -0.499, 0.5, 0.499, 0.5 ] },
            ...
        ],
        ...
    ]
}

A few other points that are worth talking about with this format: The bone references are done by name, not by index. I went back and forth on this a lot, but in the end I feel like this method allows a little bit more flexibility. Otherwise you would essentially have to export the model and all its animations at the same time to ensure that the bone indexes were correct (I'm reordering the bones in the exporter to meet my "children after parents" requirement). Also, the order that the bones appear in the "bones" list is the order that their transformations will appear in for each frame. Keeps things simple that way.
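Referencing bones by name does mean somebody has to translate those names back into indices for a particular model. A minimal sketch of that lookup might look like the following. The function name is hypothetical, and returning -1 for a bone the model doesn't have is just one possible policy, not necessarily what animation.js does.

```javascript
// Illustrative sketch: resolveAnimationBones is a hypothetical helper, and
// mapping unknown names to -1 is an assumed policy.

// Map each bone name in an animation to its index in the model's flat
// bone array, so each frame's transforms can be applied by index.
function resolveAnimationBones(animBoneNames, modelBones) {
    // Build the name -> index table once per model/animation pairing.
    var lookup = Object.create(null);
    for (var i = 0; i < modelBones.length; ++i) {
        lookup[modelBones[i].name] = i;
    }
    return animBoneNames.map(function (name) {
        return (name in lookup) ? lookup[name] : -1;
    });
}

var exampleIndices = resolveAnimationBones(
    ["lower-back", "root", "tail"],
    [{ name: "root" }, { name: "lower-back" }, { name: "neck" }]
);
```

Doing this once up front keeps string comparisons out of the per-frame path, and it means the same animation file can drive any model that happens to share bone names.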

Loading of the animation and calculation of the matrices for a given frame takes place in the animation.js file.

So now the model has its bone data and the animation has the motion data. Let's tie it all together and get this mesh moving!

When it comes to skinning meshes, there are basically two approaches: Software Skinning and GPU Skinning. Software skinning means that after calculating the bone matrices we transform all the vertex positions in our application code (JavaScript, in this case) and push them out to the GPU each frame. You can actually see an example of this with my old Doom 3 model demo. This is a perfectly valid method... when you're not running in freaking JavaScript! As is, we want our JavaScript code to be as lightweight as possible, so we turn to GPU skinning. With GPU skinning we'll still calculate the bone matrices in JavaScript, so we can do complex animation blending later on, but the only thing that we have to push to the GPU each frame is the matrix array. The mesh vertices are allowed to stay static, and they're transformed into the correct position in the shader. This is invaluable in an environment such as ours! You can see the skinning shader code at the top of skinned-model.js.
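For reference, the per-vertex math is the same whichever side of the fence it runs on. The sketch below is a CPU-side illustration of blending one vertex position across three bones. It is not the shader from skinned-model.js, and it assumes column-major 4x4 matrices packed 16 floats per bone, with the inverse bind pose already multiplied in.

```javascript
// Illustrative sketch: transformPoint and skinVertex are hypothetical names.
// Assumes column-major 4x4 matrices, 16 floats per bone, with the inverse
// bind pose already folded into each matrix.

// Transform a point by the matrix of the given bone.
function transformPoint(matrices, boneIndex, p) {
    var o = boneIndex * 16;
    return [
        matrices[o] * p[0] + matrices[o + 4] * p[1] + matrices[o + 8] * p[2] + matrices[o + 12],
        matrices[o + 1] * p[0] + matrices[o + 5] * p[1] + matrices[o + 9] * p[2] + matrices[o + 13],
        matrices[o + 2] * p[0] + matrices[o + 6] * p[1] + matrices[o + 10] * p[2] + matrices[o + 14]
    ];
}

// Blend a vertex position across its three weighted bones.
function skinVertex(position, weights, boneIndices, boneMatrices) {
    var out = [0, 0, 0];
    for (var i = 0; i < 3; ++i) {
        if (weights[i] === 0) { continue; } // unused influence
        var p = transformPoint(boneMatrices, boneIndices[i], position);
        out[0] += p[0] * weights[i];
        out[1] += p[1] * weights[i];
        out[2] += p[2] * weights[i];
    }
    return out;
}
```

Software skinning runs something like this for every vertex, every frame, in script. GPU skinning runs the equivalent in the vertex shader and leaves the vertex buffer untouched.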

There is one complexity that GPU skinning introduces however. Shaders have a limited number of uniform variables that they can use at once, and since our matrices are being passed as uniforms, we can eat that limit up quickly. Not to mention, we need to have some uniforms available for other things like lighting information, textures, etc. What this means in the end is that we can only process so many bones in a single draw call. How many? Well, that's kind of a complex thing to figure out: The answer is that it depends on how many uniforms your hardware can support and how many uniforms you need for other purposes like lights. Quite honestly, I don't have a good answer for that yet, as I think it's something that we'll have to experiment with as the game moves forward. So for the time being I've just picked a number, 50, and started working with it.
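As a rough way to reason about that budget: each 4x4 bone matrix takes four vec4 uniform slots, and WebGL reports the number of available vertex shader slots through gl.getParameter(gl.MAX_VERTEX_UNIFORM_VECTORS). The helper below is a hypothetical sketch and the numbers fed into it are made-up examples, not measurements. Real drivers may also pack uniforms differently, so treat the result as an upper bound.

```javascript
// Illustrative sketch: maxBonesPerDraw is a hypothetical helper and the
// example numbers are not measurements from any particular device.
var VECTORS_PER_MAT4 = 4; // a mat4 uniform occupies four vec4 slots

// maxUniformVectors: the value of gl.getParameter(gl.MAX_VERTEX_UNIFORM_VECTORS)
// reservedVectors: slots set aside for everything that isn't a bone
//                  (projection/view/model matrices, lights, and so on)
function maxBonesPerDraw(maxUniformVectors, reservedVectors) {
    var available = maxUniformVectors - reservedVectors;
    return Math.max(0, Math.floor(available / VECTORS_PER_MAT4));
}

// Example: a device reporting 256 slots, with 20 held back for other uniforms.
var exampleBoneLimit = maxBonesPerDraw(256, 20);
```

In a browser the first argument would come straight from the WebGL context, which makes it possible to check at load time whether a fixed limit like 50 actually fits.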

So, being limited to 50 bones per draw call now, if we have a model that uses more than that we have to break the mesh up into several sub-meshes, each of which can reference at most 50 bones. Of course, we're already one step ahead of the game here, as you may recall that sub-meshes are part of our original model format! Yay! All we need to do in this case is add a couple of new bits to the submesh to make it skinnable:

"submeshes": [
    {
        "indexOffset": 0,
        "indexCount": 11760,
        "boneOffset": 0,
        "boneCount": 35
    }
]

The meanings of these new elements should be easy to guess. boneOffset is the first bone in our list that this submesh uses, and boneCount is how many bones after that need to be passed to the shader. This, of course, makes the assumption that the bones each submesh needs are grouped together. That's great, but we've ALSO dictated that bone order must place parents before children. Getting those two limitations to play together nicely may prove to be a formidable problem...
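Assuming all of the model's bone matrices live in one big Float32Array (16 floats per bone), picking out the range for a submesh could be as simple as the sketch below. The function name is hypothetical, and I'm also assuming that the bone indices stored in each submesh's vertices are relative to boneOffset.

```javascript
// Illustrative sketch: boneMatricesForSubmesh is a hypothetical helper.
// Assumes one Float32Array holding 16 floats per bone for the whole model.
var FLOATS_PER_MAT4 = 16;

// Return a view (not a copy) of just the matrices this submesh needs.
function boneMatricesForSubmesh(allBoneMatrices, submesh) {
    var start = submesh.boneOffset * FLOATS_PER_MAT4;
    var end = start + submesh.boneCount * FLOATS_PER_MAT4;
    return allBoneMatrices.subarray(start, end);
}

// At draw time the view could then be uploaded with something like:
// gl.uniformMatrix4fv(boneMatUniform, false,
//     boneMatricesForSubmesh(boneMatrices, submesh));
```

Because subarray returns a view on the same buffer, there's no per-frame allocation of matrix data. The catch is the one described above: it only works if each submesh's bones are contiguous in the list.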

... and it's one that I haven't solved yet. Actually, the exporter code that comes with this post's git branch doesn't account for submesh splitting at all, which works for now because the demo mesh has fewer bones than our bone limit. I'm putting the issue on hold for now in favor of getting other things done, and I'll tackle this problem when it actually becomes a problem. It's going to be a hairy one when I get there though, and I may have to reverse a couple of the decisions I'm making now to make it work. We'll see how it goes.

Anyway, theoretical potholes aside, now that we've figured out how all of the formats are supposed to work, we need to actually get some models out that implement them, and to that end we've extended our Unity exporter. The previous "Export Selected Meshes" menu item has been extended to handle skinned meshes as well, and we've added an "Export Selected Animations" item that will, predictably, export any animations that you highlight in the Project view. In the AngryBots project the prime testing target for these is the main player model (main_player_lorez) and his various running animations (run_forward, etc.).

So, with the files exported, we need to actually get them showing up in our renderer. Loading a skinned model is easy: we simply swap out our Model class for the SkinnedModel class. (game-renderer.js, line 50)

this.model = new model.SkinnedModel();

(It's worth mentioning that you CAN load a skinned model with the static model class. It will just ignore the bone information and render it in the bind pose.)

Loading an animation is similarly simple, and for now we're going to use a very basic and manual method for actually playing the animation. (game-renderer.js, line 54)

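// "self" is assumed to be a reference to "this" captured earlier (e.g. var self = this;)
this.anim = new animation.Animation();
this.anim.load("/root/model/run_forward", function(anim) {
    // Simple hack to get the animation to play
    var frameId = 0;
    var frameTime = 1000 / anim.frameRate; // milliseconds per frame
    setInterval(function() {
        // Don't touch the skeleton until the model has finished loading
        if(self.model.complete) {
            anim.evaluate(frameId % anim.frameCount, self.model);
            frameId++; // advance to the next frame, wrapping via the modulo above
        }
    }, frameTime);
});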

And suddenly, we run!

Of course, at this point we still have a lot of missing pieces to put in place before this will be a game-ready animation system. For one, our method of playing an animation (setInterval with a frame counter) is pretty lame, and won't do the job for anything but a simple demo. Secondly, we're applying the animation directly to the model's skeleton, which means that we would have to duplicate the model if we needed another instance that was playing a different animation. As mentioned earlier, we're not really handling the limits on bone count in our export, so we'll run into trouble as we try working with larger models. Also, we don't have any concept of mixing animations yet, nor are we attempting to interpolate animation frames at all. And that doesn't even take performance into consideration. This animation performs pretty well, but how fast will it run when we're trying to animate 20 different objects at the same time? These are just a few of the things that we'll have to fix as the game moves forward, and many of them will justify their own blog post.

But for now, the immediate goal has been achieved: We can export skinned models and the animations to play with them. Our skinning happens on the GPU, which will speed us up quite a bit, and we've got all the pieces in place to start fleshing it out as we go along. Not too shabby!

For the next post, prepare to see double (and then some) as we talk about mesh instancing.