Monday, July 30, 2012

Sprite tile maps on the GPU

I had a fun idea yesterday while playing (what else) Spelunky and decided I'd give it a go. The result is this demo, which I'm reasonably proud of despite its simplicity.

I know it doesn't look all that impressive. After all, the SNES was doing that kind of stuff a long way back, and it didn't need any fancy WebGL to do it. Heck, even a rudimentary HTML Canvas renderer can get similar results, so what makes this so special?

Well, what if I told you the entire thing was done on the GPU by drawing a full-screen quad? Interest piqued?

This is a fun little exercise in GPU abuse that may even prove useful in the right context. I'm not going to go super in-depth on every line of code, but I did want to explain the basic technique and discuss some avenues for expansion in the future.

The core concept is this: Everything is driven by two textures. One is your standard tile-based sprite sheet.

And the other is a specially built map of each tile on the screen.

Yes, it's tiny, but that's actually the image that's generating the map you see in the demo! The trick is that in the shader we treat each pixel of that image as a lookup into the sprite sheet. Sprites are identified by storing their X and Y coordinates in the Red and Green channels respectively. So a Red value of 1 and a Green value of 2 (out of 255) indicates that the tile at (1, 2) - in this case the first cluster of buried gold - is the tile at that point on the map. Thus, while the map appears to be drawn in black and white, the black contains subtle variations that indicate different tiles. A fully white pixel is taken to mean "No tile here" and nothing is rendered.
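To make the lookup concrete, here's a rough CPU-side sketch of what the shader does for each map pixel. This is not the demo's code: `lookupTile` and its parameters are hypothetical names, and treating "Red and Green both 255" as the empty marker is my assumption about how "fully white" gets tested.

```javascript
// Hypothetical helper: a CPU-side version of the per-pixel lookup.
// `mapPixels` is an RGBA byte array (4 bytes per map pixel) and
// `mapWidth` is the width of the map image in pixels (one pixel per tile).
function lookupTile(mapPixels, mapWidth, tileX, tileY) {
  const i = (tileY * mapWidth + tileX) * 4;
  const red = mapPixels[i];
  const green = mapPixels[i + 1];
  // Assumption: 255 in both Red and Green means "No tile here".
  if (red === 255 && green === 255) {
    return null;
  }
  // Red holds the sprite sheet column, Green holds the row.
  return { sheetX: red, sheetY: green };
}

// A 2x1 map: the tile at (1, 2) on the sprite sheet, then an empty cell.
const exampleMapPixels = new Uint8Array([
  1, 2, 0, 255,
  255, 255, 255, 255,
]);
const firstTile = lookupTile(exampleMapPixels, 2, 0, 0);
const secondTile = lookupTile(exampleMapPixels, 2, 1, 0);
```

In the real thing this happens in the fragment shader with a texture read, but the arithmetic is the same idea.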

Put that basic idea in place along with some minor math-fu to properly lay out all the pixels and take into account things like view offsets and you've got an entire map that can be drawn with a single call! In order to achieve the parallax scrolling as seen on the demo background we actually render two passes, and so the demo draws two quads per frame instead of just one. But still, hard to complain about that! :)
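The "math-fu" boils down to something like the sketch below, written in plain JavaScript so it's easy to poke at. Everything here is illustrative: `screenToTile`, the `view` fields, and the `parallax` factor are names I've made up for the example, and the offset is assumed to be in unscaled map pixels.

```javascript
// Hypothetical sketch: map a screen pixel to a tile and a pixel inside that tile.
// view.scale     - render scale (2 means each tile pixel covers 2x2 screen pixels)
// view.offsetX/Y - view offset in unscaled map pixels (an assumption)
// view.parallax  - scroll factor for this layer (1 = foreground, less = slower)
// view.tileSize  - tile edge length in pixels
function screenToTile(screenX, screenY, view) {
  const worldX = Math.floor(screenX / view.scale + view.offsetX * view.parallax);
  const worldY = Math.floor(screenY / view.scale + view.offsetY * view.parallax);
  // Positive modulo so negative world coordinates still land inside a tile.
  const mod = (n, m) => ((n % m) + m) % m;
  return {
    tileX: Math.floor(worldX / view.tileSize),
    tileY: Math.floor(worldY / view.tileSize),
    pixelX: mod(worldX, view.tileSize),
    pixelY: mod(worldY, view.tileSize),
  };
}

const foregroundView = { scale: 2, offsetX: 24, offsetY: 0, parallax: 1, tileSize: 16 };
const backgroundView = { scale: 2, offsetX: 24, offsetY: 0, parallax: 0.5, tileSize: 16 };
const foregroundHit = screenToTile(40, 10, foregroundView);
const backgroundHit = screenToTile(40, 10, backgroundView);
```

The tile coordinate picks the pixel to read from the map image, and the in-tile pixel picks the texel within that sprite. Giving the background pass a smaller scroll factor is one simple way to get a parallax effect out of the same math.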

I'm not sure how novel of an idea this is. Certainly all of the elements that make it up have been used elsewhere (the tile lookup is vaguely reminiscent of virtual texturing as seen in RAGE) but I've never seen anyone actually do this kind of rendering on the GPU before. If anyone has some examples of previous implementations of the same concept I'd love to see them!

I'll get a prettified version of the main rendering code on GitHub soon, but in the meantime you can take a gander at the file used for the demo. At 250 lines it's not a lot of code to parse through, and most of the interesting bits come from the size calculations in the vertex and fragment shaders.

As it stands right now, the code is surprisingly flexible given its simplicity:
  • It can render tiles of any size, no need to be powers of two
  • It can handle 65,535 different types of tiles, which should be enough for anybody! ;)
  • The maps can be incredibly large (at least 2048x2048 tiles, depending on max texture size) with effectively no performance hit
  • Visibility culling is "free" and extremely precise (per pixel)
  • The code can scale the tiles by any factor you want (in the demo the 16x16 tiles are rendered at 2X scaling)
  • You can have as many different layers as you want (though there is a performance hit for those.)
  • I really love the idea of being able to embed full game levels in a web page as an image! It would be super easy to share them, and the nature of the image makes it almost a little preview of the map contained within!
Not bad for a morning of hacking, no? There are some downsides, though, so you'll have to take that into consideration.
  • Only supports square tiles aligned to a grid
  • Layer rendering is back-to-front to ensure proper transparency, which means there can be quite a bit of unnecessary overdraw.
  • Viewport offsets are currently snapped to the nearest pixel. Floating point offsets introduce artifacts at tile edges. (Might be able to fix this in the shader)
  • This only really makes sense for static tiles. Anything that moves (player sprites, enemies, etc) would be drawn separately
  • Changing tiles on the fly will require a gl.texSubImage2D call, which will almost certainly be slower to update than a more traditional tile renderer. (But should be acceptable for infrequent changes.)
  • It should also be noted that this IS a WebGL technique, which prevents it from being used on many mobile devices and older hardware where a traditional tile renderer would work just fine.
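For the on-the-fly tile change mentioned above, a single-texel update might look roughly like this. `setTile` is a hypothetical helper, and I'm assuming `gl` is a WebGL context and `mapTexture` is the RGBA map texture with one texel per tile.

```javascript
// Hypothetical helper: overwrite one map texel so a different tile is drawn there.
// Assumes the map texture was created as RGBA / UNSIGNED_BYTE.
function setTile(gl, mapTexture, tileX, tileY, sheetX, sheetY) {
  // Red = sprite sheet X, Green = sprite sheet Y, Blue unused, Alpha opaque.
  const texel = new Uint8Array([sheetX, sheetY, 0, 255]);
  gl.bindTexture(gl.TEXTURE_2D, mapTexture);
  // Upload a 1x1 region at (tileX, tileY) of mip level 0.
  gl.texSubImage2D(gl.TEXTURE_2D, 0, tileX, tileY, 1, 1, gl.RGBA, gl.UNSIGNED_BYTE, texel);
  return texel;
}
```

Whether `tileY` counts from the top or the bottom of the map depends on how the original image was uploaded, so it may need flipping in a real renderer.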
There are some obvious areas that can be improved: Linear filtering on the tiles is possible, but would require a gutter around each one which the current code hasn't taken into account. Animating tiles (making water bubble or grass wave) should also be possible but hasn't been implemented. You could probably reduce the overdraw too by flagging tiles with transparency to be drawn in a separate pass.

There's also a question of what to do with the remaining color channels. I'm only making use of the Red and Green channels, which means the Blue and Alpha channels are available to do... something. My first instinct would be game logic flags, like "Solid" and "Water" or tying into script triggers. You could also use the extra bits for the aforementioned animation or transparency flags.
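As a sketch of the flags idea, the Blue channel's eight bits could be treated as a small bitfield. None of this exists in the demo; the flag names and bit positions below are made up for illustration.

```javascript
// Hypothetical flag layout for the Blue channel (one bit per flag).
const TILE_FLAGS = {
  SOLID: 1 << 0,
  WATER: 1 << 1,
  TRIGGER: 1 << 2,
};

// Combine a list of flag names into a single 0-255 Blue value.
function packFlags(names) {
  return names.reduce((bits, name) => bits | TILE_FLAGS[name], 0);
}

// Test whether a Blue value has a given flag set.
function hasFlag(blueValue, name) {
  return (blueValue & TILE_FLAGS[name]) !== 0;
}

const solidTriggerBlue = packFlags(["SOLID", "TRIGGER"]);
const isSolid = hasFlag(solidTriggerBlue, "SOLID");
const isWater = hasFlag(solidTriggerBlue, "WATER");
```

Game logic could read these straight from the same pixel data that drives rendering, so the map image would double as the collision map.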

Finally, building the "map" textures for this kind of rendering by hand is a royal pain. It would need a good editor around it to be really useful, but that shouldn't be difficult to slap together.
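The core of such an editor's export step could be as small as the sketch below, which turns a grid of sprite sheet coordinates into RGBA pixel data. `encodeMap` and the grid format are hypothetical, and writing the bytes out as an actual image file is left aside.

```javascript
// Hypothetical encoder: turn a grid of sprite sheet coordinates into RGBA map pixels.
// Each cell is either [sheetX, sheetY] or null for "No tile here".
function encodeMap(grid) {
  const height = grid.length;
  const width = grid[0].length;
  const pixels = new Uint8Array(width * height * 4);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const cell = grid[y][x];
      const i = (y * width + x) * 4;
      if (cell === null) {
        // Fully white pixel marks an empty cell.
        pixels.fill(255, i, i + 4);
      } else {
        pixels[i] = cell[0];     // Red   = sprite sheet X
        pixels[i + 1] = cell[1]; // Green = sprite sheet Y
        pixels[i + 2] = 0;       // Blue  = unused here
        pixels[i + 3] = 255;     // Alpha = opaque
      }
    }
  }
  return { width, height, pixels };
}

// A 2x2 level: one empty cell, one gold tile at (1, 2), and two tiles at (0, 0).
const encodedLevel = encodeMap([
  [null, [1, 2]],
  [[0, 0], [0, 0]],
]);
```

The resulting byte array is in the layout WebGL expects for an RGBA / UNSIGNED_BYTE texture upload, so it could be handed to the renderer directly or drawn to a canvas and saved as a PNG.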

In any case, it's a good foundation to build out from. I'm not sure if I'll have the time to flesh it out any further on my own, but I'd be happy to lend a hand to anyone that would like to experiment with adding this technique to their own renderer!