Sunday, October 30, 2011

Building the Game: Part 3 - Skinning & Animation

See the code for this post, or all posts in this series.
See the live demo.

In the previous BtG we got the basics of a model format in place, but it only accounts for static meshes. Now, static geometry is very important and will make up the majority of any scene in our eventual game. But we all know that the most interesting things you see on screen are the ones that are moving, whether it be the player sprinting across the screen or the rocket careening towards your head.

There are many different techniques for creating motion in a game. It can be sliding a static mesh back and forth for a door, running a rigid body through a physics simulation, generating particle effects for smoke and sparks, transforming texture coordinates to fake flowing water, or deforming a mesh to look like waving cloth. We'll end up talking about some of those methods as we work our way through this series, but today I want to talk about the big daddy of animation: Skeletal Animation and Mesh Skinning!

I'm not going to cover the basics of those topics here, because this post is going to end up being long enough as-is. But if you're unfamiliar with them and would like to know more I'd recommend checking out the Wikipedia Article on the subject for a high level overview. Beyond that, Googling for "Mesh Skinning" and "Skeletal Animation" will turn up lots of good resources. For my part, I'm more interested in talking about how these systems will be implemented in my codebase.

So, first off, we know that we're going to need a list of bones with some information about their bind pose in here somewhere. This is information that could, conceivably, be stored in the binary file but in this case I've found it to be more convenient to store it in the JSON, if for no other reason than easy debugging. The data sits alongside the "meshes" block, and looks like so:

"bones": [
    {
        "name": "root",
        "parent": -1,
        "pos": [0, 0, 0],
        "rot": [0, 0, 0, 1],
        "skinned": false
    },
    {
        "name": "lower-back",
        "parent": 0,
        "pos": [0, 1.723, -0.56],
        "rot": [-0.454, 0.509, 0.551, 0.479],
        "skinned": true,
        "bindPoseMat": [1, 0, 0, 0, 0, 1, 0, 0, ...]
    }
]

As you can see, each bone has a name, a parent bone ID (-1 means no parent), a position (Vector) and rotation (Quaternion), a "skinned" boolean, and if skinned is true a bind pose matrix. The bind pose matrix will be stored as an array of 16 floats, and actually represents the inverse of the bind pose skeletal transform. We need that information to allow the vertices to transform properly. 

Now that "skinned" value might be a little confusing, so here's the logic behind it: For each bone in our skeleton we are going to need to update the rotation and position based on the parent bone, calculate a transformation matrix based off of that, then multiply that matrix by the bind pose matrix to get the ACTUAL matrix that the vertices will be transformed by. In normal circumstances that's a lot of work, and in JavaScript that's crazy expensive! But not every bone in your skeleton will have vertices attached to it! It's not unusual for the character's root bone or parts of the spine, etc., to have no vertices weighted against them, but we still need them there to guide the overall animation. If skinned is false, we know that we can skip the matrix calculation on those bones, which will save us a little bit of time. Now, to be honest I'm not sure if that will make any real difference in the long run, but it has proven useful on the meshes I've tested thus far.
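To make that last step a bit more concrete, here's a minimal sketch of the per-bone work. It assumes each bone already carries a hypothetical worldMat (its accumulated transform as 16 floats in column-major order, the layout WebGL expects) next to the bindPoseMat from the JSON. The names mat4Multiply and updateSkinMatrices are made up for this example and are not taken from the post's codebase:

```javascript
// Multiply two 4x4 column-major matrices: out = a * b.
// `out` must not be the same array as `a` or `b`.
function mat4Multiply(a, b, out) {
    for (var col = 0; col < 4; ++col) {
        for (var row = 0; row < 4; ++row) {
            var sum = 0;
            for (var k = 0; k < 4; ++k) {
                sum += a[k * 4 + row] * b[col * 4 + k];
            }
            out[col * 4 + row] = sum;
        }
    }
    return out;
}

// For every skinned bone, write (worldMat * bindPoseMat) into that bone's
// 16-float slot of outMatrices (a Float32Array). Bones with skinned === false
// are skipped, so their slots are left untouched.
// Returns how many matrices were actually computed.
function updateSkinMatrices(bones, outMatrices) {
    var computed = 0;
    for (var i = 0; i < bones.length; ++i) {
        var bone = bones[i];
        if (!bone.skinned) { continue; }
        var slot = outMatrices.subarray(i * 16, i * 16 + 16);
        mat4Multiply(bone.worldMat, bone.bindPoseMat, slot);
        computed++;
    }
    return computed;
}
```

Because bindPoseMat is already the inverse of the bind pose transform, a bone that is sitting exactly in its bind pose comes out as the identity matrix here, which leaves its vertices where they started.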

Another thing that's worth pointing out: The bone structure is inherently hierarchical, and as such it's extremely tempting to make it into a nested structure here. That's probably going to cause more headaches than it's worth, however, so instead we lay them all out in a flat array. What we do enforce here, however, is that every bone in the list MUST appear after its parent bone. This allows us, when updating the bones during animation, to always do a single pass, non-recursive calculation on each bone because we always know that its parent will have already been processed and have the correct values accumulated. Plus, iterating over a flat array is faster than navigating a tree anyway and it makes the bone indexing more straightforward, so it's a win-win all around.
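Here's a rough sketch of what that single pass could look like. It assumes that each bone's pos and rot are relative to its parent and that rotations are [x, y, z, w] quaternions, as in the JSON above. The worldPos and worldRot fields and all three function names are hypothetical and only exist for this example:

```javascript
// Hamilton product of two [x, y, z, w] quaternions: returns a * b.
function quatMultiply(a, b) {
    var ax = a[0], ay = a[1], az = a[2], aw = a[3];
    var bx = b[0], by = b[1], bz = b[2], bw = b[3];
    return [
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by + ay * bw + az * bx - ax * bz,
        aw * bz + az * bw + ax * by - ay * bx,
        aw * bw - ax * bx - ay * by - az * bz
    ];
}

// Rotate the 3D vector v by the unit quaternion q.
function quatRotateVec3(q, v) {
    var qx = q[0], qy = q[1], qz = q[2], qw = q[3];
    // t = 2 * cross(q.xyz, v)
    var tx = 2 * (qy * v[2] - qz * v[1]);
    var ty = 2 * (qz * v[0] - qx * v[2]);
    var tz = 2 * (qx * v[1] - qy * v[0]);
    // v' = v + w * t + cross(q.xyz, t)
    return [
        v[0] + qw * tx + (qy * tz - qz * ty),
        v[1] + qw * ty + (qz * tx - qx * tz),
        v[2] + qw * tz + (qx * ty - qy * tx)
    ];
}

// One non-recursive pass over the flat bone list. This only works because
// every bone is stored after its parent, so the parent's world transform
// is always finished by the time a child needs it.
function updateWorldTransforms(bones) {
    for (var i = 0; i < bones.length; ++i) {
        var bone = bones[i];
        if (bone.parent < 0) {
            // Root bone: the local transform is the world transform.
            bone.worldPos = bone.pos.slice();
            bone.worldRot = bone.rot.slice();
            continue;
        }
        var parent = bones[bone.parent];
        var offset = quatRotateVec3(parent.worldRot, bone.pos);
        bone.worldPos = [
            parent.worldPos[0] + offset[0],
            parent.worldPos[1] + offset[1],
            parent.worldPos[2] + offset[2]
        ];
        bone.worldRot = quatMultiply(parent.worldRot, bone.rot);
    }
}
```

If the ordering rule were ever violated, the loop would read a parent's worldPos before it had been written, which is why the exporter has to guarantee the order rather than the runtime checking for it.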

On the vertex side, skinning information is simply added to the vertex buffer. We had installed a vertex format flag in the binary file in the last post, and now we can use that to indicate that the vertices have skinning data as well. They're packed in the interleaved data just like the positions and normals and whatnot in the pattern: "Weight 0, Weight 1, Weight 2, Bone 0, Bone 1, Bone 2". I originally wanted to store the bone indices and weights as bytes, since we can only reference a small number of bones per mesh anyway, but after some experimentation I found that:
  1. WebGL doesn't allow you to use bytes as attributes. (Aww....)
  2. You can convince it to use shorts, but the drivers hate you for it. You get nasty speed hits.
So I bit the bullet in this case and simply allowed the bone indices and weights to be floats. (I cast the indices to ints in the shader). The waste of bytes makes me grind my teeth a little, but it's not a huge deal for the better performance. That said, I'd be very interested to know if mobile devices behaved differently in this regard.
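To illustrate how the all-float layout plays out, here's a small sketch that works out the byte stride and attribute offsets for an interleaved vertex. The flag values, the attribute order ahead of the skinning data, and the describeVertexLayout helper are all assumptions made for this example; the real flags live in model.js and may well differ:

```javascript
// Hypothetical flag values for illustration only.
var VertexFlags = {
    Position: 0x01,
    TexCoord: 0x02,
    Normal: 0x04,
    BoneWeights: 0x08
};

var FLOAT_BYTES = 4;

// Returns { stride, attributes } where stride and every offset are in
// bytes and size is the number of floats in the attribute.
function describeVertexLayout(formatFlags) {
    var layout = { stride: 0, attributes: {} };

    function add(name, floatCount) {
        layout.attributes[name] = { offset: layout.stride, size: floatCount };
        layout.stride += floatCount * FLOAT_BYTES;
    }

    if (formatFlags & VertexFlags.Position) { add("position", 3); }
    if (formatFlags & VertexFlags.TexCoord) { add("texCoord", 2); }
    if (formatFlags & VertexFlags.Normal) { add("normal", 3); }
    if (formatFlags & VertexFlags.BoneWeights) {
        // Weight 0, Weight 1, Weight 2, Bone 0, Bone 1, Bone 2
        add("weights", 3);
        add("bones", 3);
    }
    return layout;
}
```

Each entry maps directly onto a gl.vertexAttribPointer(location, size, gl.FLOAT, false, stride, offset) call. With every flag set in this assumed layout, the skinning data adds 24 bytes to each vertex, for a 56 byte stride.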

I also made a personal call to say that I would allow three bone weights per vertex instead of four as a way of allowing reasonably complex skinning but still cut down on the amount of data we're slinging around. To be perfectly honest, I don't have enough experience with these things yet to know if that will help or hurt me in the end (maybe I could have gotten away with only two?) but it feels like a good middle ground. Nevertheless, if you find that you need more or less than this for whatever reason it won't be too difficult to change.

Implementation-wise, there was a new ModelVertexFormat flag added in model.js as mentioned earlier, but the bulk of the skinning-specific loading code is in a new file and new class: skinned-model.js. The SkinnedModel class inherits the Model prototype, and adds a few new bits of loading code as needed. It's worth noting, however, that rather than try to build on top of the Model's draw routine, SkinnedModel gets its own completely independent implementation. This may seem somewhat imprudent, as about 80% of the code is the same and the opportunity for abstraction is high, but I'm resisting the temptation for a very simple reason:

Abstractions slow you down.

For the loading, I don't care so much. Yes, we want it to be fast but a few milliseconds here and there while loading meshes aren't going to be missed. We're going to be executing our rendering code hundreds or thousands of times every frame, though. As such, do we really want to be forcing the system to jump through multiple layers of function calls and redirections with every draw if we don't have to? Not to mention we'd probably have to introduce some tests to determine if we were rendering skinned or unskinned so that we could set our state correctly, and we'd have to pass more data around to ensure the right shader uniforms got filled in and... You know what, it's not that much code! It's totally worth it to me to repeat the ~40 lines of code and tweak them a bit to make sure that each Model variant renders as quickly and directly as possible.

So that takes care of the mesh, but what about the animations?

There are a bunch of different ways that animations can be stored, but I'm making mine relatively simplistic: We'll store a list of the bones that are affected (so we can have animations that only modify the legs, for example) and a list of frames. Each frame will be a snapshot of the rotations and positions (I'm ignoring scaling for now) of the bones at that frame, whether or not they've changed. That keeps the calculations really easy. We'll have a few more informational items added to the header, and for the sake of simplicity I'm doing this one as JSON. (At least initially. We'll see if the space savings justify moving it to binary later on.)

{
    "animVersion": 1,
    "name": "run_forward",
    "frameRate": 30,
    "duration": 633,
    "frameCount": 18,
    "bones": [ "player_root", "Bip001", "Bip001 Pelvis", ... ],
    "keyframes": [
        [
            { "pos": [ -6.148, -0.052, 0 ],
              "rot": [ 0, 0, 0, 1 ] },
            { "pos": [ 2.850, 1.012, 0.062 ],
              "rot": [ -0.431, 0.514, 0.567, 0.476 ] },
            { "pos": [ 0, 0, 0 ],
              "rot": [ -0.499, 0.500, 0.499, 0.5 ] },
            ...
        ],
        [
            { "pos": [ -6.148, -0.052, 0 ],
              "rot": [ 0, 0, 0, 1 ] },
            { "pos": [ -0.008, 1.003, 0.062 ],
              "rot": [ -0.44, 0.506, 0.573, 0.470 ] },
            { "pos": [ 0, 0, 0 ],
              "rot": [ -0.499, 0.5, 0.499, 0.5 ] },
            ...
        ],
        ...
    ]
}

A few other points that are worth talking about with this format: The bone references are done by name, not by index. I went back and forth on this a lot, but in the end I feel like this method allows a little bit more flexibility. Otherwise you would essentially have to export the model and all its animations at the same time to ensure that the bone indexes were correct (I'm reordering the bones in the exporter to meet my "children after parents" requirement). Also, the order that the bones appear in the "bones" list is also the order that their transformations will appear in for each frame. Keeps things simple that way.
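Referencing bones by name means the animation has to be matched up against the model's bone list at some point. A minimal sketch of how that lookup could work is below. It assumes each keyframe is an array of { pos, rot } entries in the same order as the animation's "bones" list. mapAnimationBones and applyFrame are hypothetical names for this example, not the functions in animation.js:

```javascript
// Build, once per animation/model pair, an array that maps each entry of
// the animation's bone-name list to an index into the model's bone array.
// Names the model doesn't have are mapped to -1.
function mapAnimationBones(animBoneNames, modelBones) {
    var indexByName = {};
    for (var i = 0; i < modelBones.length; ++i) {
        indexByName[modelBones[i].name] = i;
    }
    var mapping = [];
    for (var j = 0; j < animBoneNames.length; ++j) {
        var name = animBoneNames[j];
        var found = Object.prototype.hasOwnProperty.call(indexByName, name);
        mapping.push(found ? indexByName[name] : -1);
    }
    return mapping;
}

// Copy one keyframe's transforms onto the model's bones using the mapping.
// Bones the animation doesn't mention keep whatever pose they already had.
function applyFrame(frame, mapping, modelBones) {
    for (var i = 0; i < frame.length; ++i) {
        var boneIndex = mapping[i];
        if (boneIndex < 0) { continue; }
        modelBones[boneIndex].pos = frame[i].pos.slice();
        modelBones[boneIndex].rot = frame[i].rot.slice();
    }
}
```

Doing the name lookup once up front keeps the per-frame work down to plain array indexing, so the flexibility of names shouldn't cost anything while the animation is playing.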

Loading of the animation and calculation of the matrices for a given frame takes place in the animation.js file.

So now the model has its bone data and the animation has the motion data, let's tie it all together and get this mesh moving!

When it comes to skinning meshes, there are basically two approaches: Software Skinning and GPU Skinning. Software skinning means that after calculating the bone matrices we transform all the vertex positions in our application code (JavaScript, in this case) and push the results out to the GPU each frame. You can actually see an example of this with my old Doom 3 model demo. This is a perfectly valid method... when you're not running in freaking JavaScript! As is, we want our JavaScript code to be as lightweight as possible, so we turn to GPU skinning. With GPU skinning we'll still calculate the bone matrices in JavaScript, so we can do complex animation blending later on, but the only thing that we have to push to the GPU each frame is the matrix array. The mesh vertices are allowed to stay static, and they're transformed into the correct position in the shader. This is invaluable in an environment such as ours! You can see the skinning shader code at the top of skinned-model.js.
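For reference, the per-vertex math is the same in both approaches: transform the position by each influencing bone's matrix and blend the results by weight. The sketch below writes that out in JavaScript for a single vertex with three influences. It is not the shader from skinned-model.js, just an illustration of the blend, and it assumes the bone matrices live in one flat Float32Array of column-major 4x4 matrices. transformPoint and skinPosition are hypothetical names:

```javascript
// Transform a 3D point by a column-major 4x4 matrix (w is assumed to be 1).
function transformPoint(m, p) {
    return [
        m[0] * p[0] + m[4] * p[1] + m[8] * p[2] + m[12],
        m[1] * p[0] + m[5] * p[1] + m[9] * p[2] + m[13],
        m[2] * p[0] + m[6] * p[1] + m[10] * p[2] + m[14]
    ];
}

// Blend the position across three bone influences.
// weights and boneIndices are 3-element arrays; the indices are floats,
// mirroring how they are stored in the vertex buffer.
function skinPosition(position, weights, boneIndices, boneMatrices) {
    var out = [0, 0, 0];
    for (var i = 0; i < 3; ++i) {
        var w = weights[i];
        if (w === 0) { continue; }
        var bone = Math.floor(boneIndices[i]); // float index to int
        var m = boneMatrices.subarray(bone * 16, bone * 16 + 16);
        var p = transformPoint(m, position);
        out[0] += p[0] * w;
        out[1] += p[1] * w;
        out[2] += p[2] * w;
    }
    return out;
}
```

Software skinning would run something like skinPosition for every vertex on every frame, which is exactly the work GPU skinning hands off to the vertex shader.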

There is one complexity that GPU skinning introduces however. Shaders have a limited number of uniform variables that they can use at once, and since our matrices are being passed as uniforms, we can eat that limit up quickly. Not to mention, we need to have some uniforms available for other things like lighting information, textures, etc. What this means in the end is that we can only process so many bones in a single draw call. How many? Well, that's kind of a complex thing to figure out: The answer is that it depends on how many uniforms your hardware can support and how many uniforms you need for other purposes like lights. Quite honestly, I don't have a good answer for that yet, as I think it's something that we'll have to experiment with as the game moves forward. So for the time being I've just picked a number, 50, and started working with it.
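One way to get a feel for the numbers is to ask WebGL how many uniform vectors the vertex shader is allowed and divide that up. Each mat4 uniform takes four of those vectors. The sketch below does that arithmetic. The reservedVectors parameter (how much to set aside for lighting, projection and so on) is a guess you'd have to supply yourself, and both function names are made up for this example:

```javascript
// How many mat4 bone matrices fit into a budget of uniform vectors once
// reservedVectors have been set aside for everything else.
function maxBonesForBudget(maxUniformVectors, reservedVectors) {
    var available = maxUniformVectors - reservedVectors;
    return Math.max(0, Math.floor(available / 4)); // 4 vec4 slots per mat4
}

// Ask the WebGL context for its vertex shader limit and apply the budget.
function queryMaxBones(gl, reservedVectors) {
    var maxVectors = gl.getParameter(gl.MAX_VERTEX_UNIFORM_VECTORS);
    return maxBonesForBudget(maxVectors, reservedVectors);
}
```

The WebGL 1.0 minimum for MAX_VERTEX_UNIFORM_VECTORS is 128, so hardware that only offers the minimum can't fit 50 full mat4 bones (200 vectors) no matter how little is reserved. A limit of 256 with 56 vectors reserved, on the other hand, lands exactly on 50.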

So, being limited to 50 bones per draw call now, if we have a model that uses more than that we have to break the mesh up into several sub-meshes, each of which can reference at most 50 bones. Of course, we're already one step ahead of the game here, as you may recall that sub-meshes are part of our original model format! Yay! All we need to do in this case is add a couple of new bits to the submesh to make it skinnable:

"submeshes": [
    {
        "indexOffset": 0,
        "indexCount": 11760,
        "boneOffset": 0,
        "boneCount": 35
    }
]

The meanings of these new elements should be easy to guess. boneOffset is the first bone in our list that this submesh uses, and boneCount is how many bones after that need to be passed to the shader. This, of course, makes the assumption that the bones each submesh needs are grouped together. That's great, but we've ALSO dictated that bone order must place parents before children. Getting those two limitations to play together nicely may prove to be a formidable problem...
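As a sketch of how those two values might be used at draw time, the snippet below picks the submesh's window out of one flat Float32Array of bone matrices and uploads only that. It makes several assumptions that the post doesn't confirm: that indexOffset counts indices rather than bytes, that indices are 16-bit, and that the bone indices stored in the vertices are relative to boneOffset. bonesForSubmesh and drawSubmesh are hypothetical names:

```javascript
// Return a view (not a copy) of the matrices this submesh needs.
function bonesForSubmesh(allBoneMatrices, submesh) {
    var start = submesh.boneOffset * 16;
    var end = start + submesh.boneCount * 16;
    return allBoneMatrices.subarray(start, end);
}

// Upload the submesh's bone matrices, then draw its slice of the index buffer.
function drawSubmesh(gl, boneUniformLocation, allBoneMatrices, submesh) {
    gl.uniformMatrix4fv(boneUniformLocation, false,
                        bonesForSubmesh(allBoneMatrices, submesh));
    // Assumes 16-bit indices, so the byte offset is indexOffset * 2.
    gl.drawElements(gl.TRIANGLES, submesh.indexCount,
                    gl.UNSIGNED_SHORT, submesh.indexOffset * 2);
}
```

Since subarray returns a view onto the same memory, grouping each submesh's bones together means no per-draw copying is needed, which is the payoff for the ordering constraint.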

... and it's one that I haven't solved yet. Actually, the exporter code that comes with this post's git branch doesn't account for submesh splitting at all, which works for now because the demo mesh has fewer bones than our bone limit. I'm putting the issue on hold for now in favor of getting other things done, and I'll tackle this problem when it actually becomes a problem. It's going to be a hairy one when I get there though, and I may have to reverse a couple of the decisions I'm making now to make it work. We'll see how it goes.

Anyway, theoretical potholes aside, now that we've figured out how all of the formats are supposed to work, we need to actually get some models out that implement them, and to that end we've extended our Unity exporter. The previous "Export Selected Meshes" menu item has been extended to handle skinned meshes as well, and we've added an "Export Selected Animations" item that will, predictably, export any animations that you highlight in the Project view. In the AngryBots project the prime testing target for these is the main player model (main_player_lorez) and his various running animations (run_forward, etc.)

So, with the files exported, we need to actually get them showing up in our renderer. Loading a skinned model is easy: we simply swap out our Model class for the SkinnedModel class. (game-renderer.js, line 50)

this.model = new model.SkinnedModel();

(It's worth mentioning that you CAN load a skinned model with the static model class, it will just ignore the bone information and render it in the bind pose)

Loading an animation is similarly simple, and for now we're going to use a very basic and manual method for actually playing the animation. (game-renderer.js, line 54)

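var self = this; // captured so the timer callback can reach the model
this.anim = new animation.Animation();
this.anim.load("/root/model/run_forward", function(anim) {
    // Simple hack to get the animation to play
    var frameId = 0;
    var frameTime = 1000 / anim.frameRate;
    setInterval(function() {
        // Don't pose the model until it has finished loading
        if(self.model.complete) {
            anim.evaluate(frameId % anim.frameCount, self.model);
            frameId++; // advance one frame per tick
        }
    }, frameTime);
});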

And suddenly, we run!

Of course, at this point we still have a lot of missing pieces to put in place before this will be a game-ready animation system. For one, our method of playing an animation (setInterval with a frame counter) is pretty lame, and won't do the job for anything but a simple demo. Secondly, we're applying the animation directly to the model's skeleton, which means that we would have to duplicate the model if we needed another instance that was playing a different animation. As mentioned earlier, we're not really handling the limits on bone count in our export, so we'll run into trouble as we try working with larger models. Also, we don't have any concept of mixing animations yet, nor are we attempting to interpolate animation frames at all. And that doesn't even take into consideration performance. This animation performs pretty well, but how fast will it run when we're trying to animate 20 different objects at the same time? These are just a few of the things that we'll have to fix as the game moves forward, and many of them will justify their own blog post.
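As one small example of where the playback could go, a timer that counts ticks will drift whenever the browser delays a callback, whereas deriving the frame from elapsed time will not. The sketch below shows only that calculation. frameForTime is a hypothetical helper, not something in the current code, and it still snaps to whole frames rather than interpolating:

```javascript
// Pick the looping frame index for a given elapsed time in milliseconds.
function frameForTime(elapsedMs, frameRate, frameCount) {
    var frame = Math.floor(elapsedMs * frameRate / 1000);
    return frame % frameCount;
}
```

With the run_forward values above (30 frames per second, 18 frames), 100 ms into the animation gives frame 3, and 600 ms wraps back around to frame 0.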

But for now, the immediate goal has been achieved: We can export skinned models and the animations to play with them. Our skinning happens on the GPU, which will speed us up quite a bit, and we've got all the pieces in place to start fleshing it out as we go along. Not too shabby!

For the next post, prepare to see double (and then some) as we talk about mesh instancing.


  1. Excellent post. I've been implementing the same type of skeletal animation in webgl recently. I found two things in your article which contradict my experience however:

    "WebGL doesn't allow you to use bytes as attributes. (Aww....)"
    Just pass UNSIGNED_BYTE into vertexAttribPointer. Seems to work just fine, either normalized or unnormalized.

    "You can convince it to use shorts, but the drivers hate you for it. You get nasty speed hits."
    I'm using unsigned shorts and haven't detected any speed hit on my nvidia card at least.

    Maybe you're running into data alignment problems? GPUs don't like vertex data at odd alignments. For example constructing an attribute out of 3 bytes will run slowly, because the GPU can't read at that 3 byte alignment, but if you make a 4 byte attribute, that will likely run at full speed. The same is true for shorts - use 2 or 4 but not 1 or 3.

    With this in mind I chose 4 for the bones-per-vertex limit in our code, using unnormalized unsigned bytes for bone index, and normalized unsigned shorts for bone weights.

  2. It's getting more and more interesting, I wonder how far you are going to push it :)

  3. Really interesting, keep up the good work!

  4. Hi! I've been playing with your code and one strange thing I was wondering about.. In the vertex shader, bonemat is declared as a mat4 uniform of size[3], but in the code, it uploads the whole boneSet of ~30 bones.. is this intentional?

  5. hii Bradon,

    Great posts looking forward for more parts of the game blog.

    I am working on the same type of skinned model as described above. But if I can make the concession that my models don't need to be morphed or changed at runtime beyond the animations given in the model file, wouldn't it be faster to pre-compute the model's vertices and indices from the bone structure when loading the model, instead of passing the bones, vertices, etc. to the shaders and calculating the vertex positions at runtime? That way you calculate the positions once, when loading the model, instead of every frame in the shader.


  6. Flexx,

    Yes, at the most basic level rendering static buffers will always be faster than transforming the verts before or during the render. It's not a totally cut and dry win, though.

    The biggest consideration is that of memory. Depending on how big your model is, how many frames of animation you're looking at, how many animated models you need at once, and the platform that you're targeting, the method that you are describing could eat up a decent chunk of GPU memory very quickly. Skeletal animation has the advantage of storing less information and giving more flexibility (blending, etc.) for the same motion, but there is a runtime cost for it. It's a tradeoff, and it's entirely up to your project requirements as to whether or not one method works better than the other. On most desktop machines it's probably not an issue, as they'll have lots of video memory, but at the same time those are the devices that are going to be least concerned about runtime performance. (I must admit that I have a hard time coming up with a platform where it would be clearly beneficial.)

    If you DO go with the "pre-transformed mesh" method (I have no idea what the formal name for it would be) I would recommend storing two separate buffers. One would contain the information that remains static from frame to frame, such as texture coords, colors, etc. The other would contain the information that changes per-frame, like vertex positions and normals. That way you're only duplicating the data that you absolutely have to.

    As a point of interest on this subject: the method you describe is exactly how several older games and engines handled their animations, notably the Quake series up through Quake 3. You could read up on those formats for some additional information on the technique.

  7. This series is BRILLIANT! Thanks so much for sharing this, Brandon.

  8. Bradon,

    Thanks for the clear anwser.

    So if i am right both ways have great benefits but also great disadvantages.

    Maybe implementing both ways will be great way not to use up all memory recourses of all gpu power, a sort of balancer with some choices like: how many animations does the model have, how many detail does the model have, what should the model be able to do?. I guess that way it can balance the gpu memory use and calculation power and won't overload one of the two.

    But first of all got the re-code the custom binary max3d exporter to optionally export bones weights etc ;)


  9. hi Brandon,

    trying to make an skinned model exporter for autodesk 3ds max 2011.
    The models i use doesn't seem to have a traditional bind pose (i am not that great at max3d), would it also work using just the first frame of the animation as the bind pose? (offcourse should keep in mind all matrixes are relative to the same bind pose). Also max3d gives an 3x3 matrix for the transform, would that still possible to use and just convert it to an 4x4 matrix?

    kind regards

  10. Hi Brandon,
    The demo code shows just a dark green background on Chrome and error page on Firefox and Opera 12.
    I refreshed the driver of my video card but it was useless. I'm new in WebGL therefore I haven't the faintest idea what is wrong with it. Please, help me.
    Otherwise I love your tutorial but it can be more detailed.

    Thank you.

  11. Hi brandon, how do i export animations??? it gives error of loading WebGLAnim no found, although it is in the WebGLExportTemplates directory kindly do let me know ASAP..

  12. I know that you can Set up the Skeleton yourself but I was wondering what software did you use to generate the Skin.