Thursday, April 5, 2012

If I built a physics engine...

So let's be clear about something right from the very start: I'm not a physics guy. I love me a good physics engine, but I will likely never build one because that's just not where my expertise lies. Far be it from me to criticize the algorithms used by your broadphase collision pass, because I have only the vaguest, highest-level idea of what that actually is. I'm a graphics guy through and through, and that's where I'm happy to stay.

However, I do know a thing or two about writing fast Javascript code, which is something that will become fairly important to the new breed of browser-oriented physics engines that are starting to crop up, like Ammo.js, Cannon.js, and the various Box2D ports. Unfortunately, I've seen a few common threads within these engines that I think will limit their effectiveness in real-world use, and I wanted to get some of it off my chest while the development of these libraries is still in the early days. Some of these trends are inherited from the underlying C++ libs these libraries wrap (like Ammo.js, which is built from Bullet with Emscripten), but others appear to be self-imposed. That's the group I'm talking to.

In my previous post about optimizing the garbage created by my texture loading I linked to a great article by Ashley Gullen that contained a couple of lines that I don't think were given enough emphasis:
Also, avoid vector objects if at all possible (as in vector2 with x and y properties). While again it may be convenient to have functions return these objects so they can change or return both values at once, you can easily end up with hundreds of these created every frame, resulting in terrible GC performance.
This is something that every speed-oriented Javascript library needs to take to heart, and fast! Unfortunately we've all been trained by C++ and friends that encapsulation of these things is good and desirable and as such the first thing that jumps to mind when we see an x, y, and z component is "Class it!" That's a good principle in most cases, but here it's subversively damaging.

Consider the following: I'm making a simple physics demo with the latest and greatest physics engine, "Physics", that's rendered with a separate 3D framework, "Renderer". (Yeah, people don't have very creative naming conventions in my imagination.) I'll probably end up with a couple of code chunks like this somewhere:
function setupBox(pos, size) {
    var physicsBox = new Physics.Box();
    physicsBox.setPosition(new Physics.Vector3(pos.x, pos.y, pos.z));
    physicsBox.setSize(new Physics.Vector3(size.x, size.y, size.z));

    renderBox = new Renderer.Cube();
    renderBox.setPosition(new Renderer.Vector3(pos.x, pos.y, pos.z));
    renderBox.setSize(new Renderer.Vector3(size.x, size.y, size.z));
}

function updateBox() {
    var pos = physicsBox.getPosition();
    var rot = physicsBox.getRotation();

    renderBox.setPosition(new Renderer.Vector3(pos.x, pos.y, pos.z));
    renderBox.setRotation(new Renderer.Quaternion(rot.x, rot.y, rot.z, rot.w));
}
Actually, that doesn't look too bad! Very clean and readable, if overly simplified. But let's examine what's happening here. First off, since the physics and rendering libraries don't share a math lib I can't simply pass vectors from one to the other. This means that I'm doing a lot of translating back and forth, which not only costs me execution time but also produces heaps of garbage for my browser to pick up because I'm constructing a new vector every other line! This is especially harmful in the update function, which I expect to be called every frame, and probably multiple times each frame to update multiple objects. (After all, what good is a physics simulation with only one box?)

I should point out that there's nothing really objectionable about the creation of the new Physics.Box and Renderer.Cube. These are both objects that we expect to stick around for a while and track important state for us. There's no reason they shouldn't be nice encapsulated objects. It's really just the throwaway vectors that we're concerned about.

Of course the above code sample is pretty naive. We could cache off some vector instances for data shuffling like this to prevent some of the garbage creation, but there's only so much that we as the library users can do. You know that the library creators are making use of these classes internally too. Take this simplified version of the naive broadphase code from Cannon.js:
for(var i=0; i < n; ++i) {
    for(var j=0; j < n; ++j) {
        // If body[i] is a plane and body[j] is a sphere...

        r = new CANNON.Vec3(x[i]-x[j], y[i]-y[j], z[i]-z[j]);
        quat = new CANNON.Quaternion(qx[i],qy[i],qz[i],qw[i]);
        normal = quat.vmult(body[i]._shape.normal);
        q = r.dot(normal) - body[j]._shape.boundingSphereRadius();

        if(q < 0.0){
            // These two bodies may collide
        }
    }
}
There is the potential to create three new vector or quaternion objects for every potential object pair in the physics simulation per pass. (The third is hidden away inside quat.vmult, which actually returns a new CANNON.Vec3.) There's really no way the user of a library can avoid these kinds of allocations, but it's also hard to fault the library builders for making use of their own classes.

(As a side note, it's rather unfair of me to pick on Cannon.js here. A good deal of their internal simulation loop actually appears to be quite memory-conscious, and the above code is something of an outlier. Many of the user interactions with the library still require use of their vector classes, though.)

So how could this situation be improved? Well, for one we can observe that in my fake example code above both the physics and rendering libraries are relying on vectors that are basically just x, y, and z floats bundled together. What if we were to make our vector operations separate functions rather than methods on the vector prototype? So instead of this:
var v3 = v1.add(v2);
We do:
var v3 = Vector3.add(v1, v2);
Or, even better:
Vector3.add(v1, v2, vOut);
By making the output an explicit parameter rather than an implicit return we eliminate the need to create new objects inside the function and encourage object reuse.
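A minimal sketch of what that style might look like (Vector3 here is a namespace I'm inventing for illustration, not any particular library's API):

```javascript
// A standalone vector function with an explicit output parameter.
// No allocation happens inside the call; the caller owns all storage.
var Vector3 = {
    add: function(v1, v2, vOut) {
        vOut.x = v1.x + v2.x;
        vOut.y = v1.y + v2.y;
        vOut.z = v1.z + v2.z;
        return vOut; // returned for chaining convenience only
    }
};

// The inputs and output can be any objects with x, y, z properties.
var v1 = {x: 1, y: 2, z: 3};
var v2 = {x: 4, y: 5, z: 6};
var vOut = {x: 0, y: 0, z: 0};
Vector3.add(v1, v2, vOut);
```

Note that vOut can be reused across as many calls as you like, which is exactly what makes this pattern garbage-friendly.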

Yeah, it's not quite as pretty, but now v1, v2, and vOut can be ANY object with x, y, and z properties. That means that our Renderer, for example, could potentially operate directly off of the vectors used by the physics library. That would turn our example from earlier into this if both the physics and rendering libs were onboard with the idea:
function setupBox(pos, size) {
    var physicsBox = new Physics.Box();
    physicsBox.setPosition(pos);
    physicsBox.setSize(size);

    renderBox = new Renderer.Cube();
    renderBox.setPosition(pos);
    renderBox.setSize(size);
}

function updateBox() {
    renderBox.setPosition(physicsBox.getPosition());
    renderBox.setRotation(physicsBox.getRotation());
}
Quite a bit cleaner, and with no throwaway object creation! Yay!

Now that's great, and it would likely be a marked improvement over the current state of things, but if it were me personally building a physics engine I'd take it one step further. Rather than use an arbitrary object with x, y, z properties I'd discard the idea of an object entirely and just use 3-element arrays, preferably TypedArrays. (Hm... why does that sound familiar?) It's a bit awkward to use v[0] instead of v.x, but there are some real benefits to doing so! For one, array indexing is very fast in Javascript, especially for TypedArrays. Second, whenever you pass data to WebGL (or WebCL in the future, or a number of other APIs like Web Worker Transferables) it's expected to be in the form of a TypedArray anyway. Essentially, TypedArrays are fast becoming the language that the web speaks when performance matters.
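Combining the two ideas, a vector library built this way might look something like the following sketch (function names are my own, not from any real library):

```javascript
// Vectors as 3-element Float32Arrays. Creation is done rarely, up front;
// the math functions themselves never allocate.
function vec3Create(x, y, z) {
    var v = new Float32Array(3);
    v[0] = x; v[1] = y; v[2] = z;
    return v;
}

function vec3Add(v1, v2, vOut) {
    vOut[0] = v1[0] + v2[0];
    vOut[1] = v1[1] + v2[1];
    vOut[2] = v1[2] + v2[2];
    return vOut;
}

var a = vec3Create(1, 2, 3);
var b = vec3Create(4, 5, 6);
var out = vec3Create(0, 0, 0); // allocated once, reused forever
vec3Add(a, b, out);
```

Because out is already a Float32Array, it can be handed straight to any TypedArray-consuming API with no conversion step.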

If your vectors are in an object, you're going to be doing a lot of new Float32Array([v.x, v.y, v.z]); as you pass data around to different APIs, which is just another variation of what we've spent this entire post trying to get away from. If your vectors (and quaternions and matrices!) are TypedArrays from the beginning they can be passed to WebGL and friends with no data duplication and no garbage creation. In fact, you may be able to work your way down to zero garbage creation per frame! That's an extremely lofty goal, but one that's plausible if your core libraries don't force you to create throwaway objects.
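To make "zero garbage per frame" concrete, here's a sketch of the pattern under my own naming assumptions: all scratch storage is allocated once at load time, so the per-frame function itself triggers no allocations at all.

```javascript
// Module-level scratch array, created once. Nothing inside integrate()
// allocates, so calling it every frame produces zero garbage.
var scratchVec = new Float32Array(3);

function integrate(position, velocity, dt) {
    scratchVec[0] = position[0] + velocity[0] * dt;
    scratchVec[1] = position[1] + velocity[1] * dt;
    scratchVec[2] = position[2] + velocity[2] * dt;
    position.set(scratchVec); // copy back into preallocated storage
}

var pos = new Float32Array([0, 0, 0]);
var vel = new Float32Array([1, 2, 3]);
integrate(pos, vel, 0.5);
```

The tradeoff is that module-level scratch state makes the functions non-reentrant, which is usually fine for a single-threaded frame loop.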

Allow me to end by saying that despite my griping I've really been very impressed by the capabilities of the Javascript physics engine options available today. Despite their flaws they are all accomplishing something that I would have thought to be nearly impossible only a few years ago. And this is only the first round! I completely expect that they'll just get better and better as time goes on. One of the things that will make them better, though, is to embrace the realities and limitations of Javascript as a language, and at some point that means that new Vector() is gonna have to die.


UPDATE: So after seeing a couple of frameworks toy with revamping some of their vector math (most notably Three.js discussing changes to their vertex class and Cannon.js trying to optimize their vector math), it's become apparent that there's some clarification and a slight bit of backtracking that needs to be done here.

I failed to mention (because frankly I had forgotten) that creating TypedArray instances is a very expensive operation in Javascript. How expensive it is relative to other data structures will vary from environment to environment, but the point is that you want to create as few of them as possible. That, of course, is the main thrust of this entire post, regardless of your chosen data type. Depending on how your library needs to use its vector/matrix/etc. classes, though, TypedArrays may very well be the wrong choice for you!

If you need to be creating new vectors all the time, or will never be pushing your data into an API that requires TypedArrays, then there's actually a lot of benefit to either using a standard Javascript array or a lightweight object. (The rest of my post still stands, though! {x: 0, y: 0, z: 0} is still preferable to new MyCustom.Vector().)

If you are frequently passing data to WebGL, Audio APIs, Web Workers, or other TypedArray-based libraries, however, you will probably be forced to take that expensive creation hit at some point anyway. In those cases the absolute worst thing you can be doing for your performance is something like:
gl.uniform3fv(uniformX, new Float32Array([v.x, v.y, v.z]));
At that point you have probably destroyed any performance gains you may have otherwise had, with the overhead of not only the TypedArray creation but also the intermediate JS array used to package the data. It would be far better to either use TypedArrays from the get-go or keep a cache of them around explicitly for data passing.
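The caching approach might look like this sketch (the gl call is commented out since it assumes a live WebGL context; the cached array is the point):

```javascript
// One scratch Float32Array, allocated once and reused for every call.
var scratchVec3 = new Float32Array(3);

function packVec3(v) {
    scratchVec3[0] = v.x;
    scratchVec3[1] = v.y;
    scratchVec3[2] = v.z;
    return scratchVec3; // same array every time: zero per-call garbage
}

// Usage (assuming a WebGL context and uniform location exist):
// gl.uniform3fv(uniformX, packVec3(v));
```

Since WebGL copies uniform data when you call uniform3fv, reusing one scratch array across calls is safe here.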

Like so many other things in software development this is an area where there are no silver bullets. It's up to you to examine the needs of your code, experiment with the alternatives, and profile your performance every chance you get!