Friday, December 9, 2011

Compressed Textures in WebGL

[UPDATE: The compressed textures spec has been changing, and so the original code posted with this entry stopped running. I've since fixed the code and updated the information below. Be aware, though, that the spec may STILL be tweaked at some point!]

I gave a presentation this last Friday at WebGL Camp 4, the slides of which are online now. I had a great time, met some awesome developers, and saw a lot of things that got me really excited about the future of WebGL. I highly encourage anyone that is interested in WebGL to try and make it to WebGL Camp 6!

During my talk I was able to show what I think may be the first public demo of compressed textures in WebGL! The demo isn't terribly impressive (it simply displays a DXT5 texture loaded from a DDS file), but it shows off the required code effectively enough.

http://media.tojicode.com/webgl-samples/dds.html

(Warning: That demo will only work on machines that support DXT5 compression. That should be most desktops, but the majority of mobile or tablet devices will be out of luck! You'll also need to be running a fairly new Chrome dev channel build.)

Yay! I got a textured cube on screen! Surely I'm the first person ever to do this!

Okay, yeah... it's not all that impressive. The key here is the potential that it provides. Compressed textures have been such an integral part of 3D games and many other 3D applications on the desktop, console, and mobile platforms that they've become something of an invisible, pervasive optimization that everyone tends to take for granted. Up until now, however, they've been left out of WebGL (not without reason, they're tricky to get right). The fact that we're gaining the ability to use them now is, in my view, something of a benchmark of the maturity of the standard.

So what exactly do we mean by compressed texture? If you're not already familiar with the concept from a prior life as a game developer it can be a bit confusing to really grok what we're referring to and why it matters. After all, JPEGs and PNGs are compressed images, right? What's different here?

(Note: If you already know the answers to the above questions just skip to the "Implementation" section)

Theory

A good place to start is to look at what WebGL does with a normal texture. Let's say we've got a texture that's 1024x1024 pixels. The textures from iOS RAGE are a good example:


This texture, as it's used in my demo, is saved as a JPEG image and is 187k. That's not too bad, all things considered! For download times, certainly, it's great! But what about when we create a texture out of it? What happens then?

Your graphics card doesn't know how to read the JPEG file format, and even if it did we wouldn't want to make it do so. Decompressing a format like JPEG is slow compared to reading, say, a BMP, and it doesn't lend itself to random access of texel data. (A texel is just a pixel of texture data.) So when we call gl.texImage2D with an image, what it actually does in the background is completely decompress the image and send the decompressed version to the graphics card. This is more readily apparent in OpenGL as used in a native language like C because it actually forces you to do the decompression yourself before providing it any texture data. WebGL is very, very kind to us developers in that regard.

The decompressed image data is basically just an array of RGB or RGBA values (think BMP without the header). Each color channel takes 1 byte, so each texel of an RGB texture is 24 bits, and each texel of an RGBA texture is 32 bits. This means that it's really easy to figure out how much RAM a given texture will take up on your graphics card. Given the texture from above:

1024 (height) * 1024 (width) * 3 (bytes per pixel) == 3,145,728 bytes == 3MB!
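
The same arithmetic can be written as a small helper, which is handy if you want to budget memory for a whole set of textures. This is only a sketch: the function name and parameters are made up for illustration, and it assumes 1 byte per channel, no mipmaps, and no padding added by the driver.

```javascript
// Hypothetical helper: bytes needed for one uncompressed texture level,
// assuming 1 byte per channel and no driver padding.
function uncompressedTextureBytes(width, height, channels) {
    return width * height * channels;
}

var rgbBytes = uncompressedTextureBytes(1024, 1024, 3);  // RGB: 3,145,728 bytes
var rgbaBytes = uncompressedTextureBytes(1024, 1024, 4); // RGBA: 4,194,304 bytes
```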

So that nice, compact little 187k texture is more than 16 times bigger when it reaches your GPU! Ouch! This has two practical side effects.
  1. Texture upload speeds can suffer because we're sending a lot of data (this was the topic of my presentation)
  2. Video memory can fill up pretty fast. Especially on mobile devices that won't have a lot of wiggle room to begin with.
At this point, you'd normally have to start looking at your game resources and say "Well, sorry artists. We just don't have the room to put all of these textures in the scene. We've got to make them all a quarter of the size." And your artists will now hate you for downscaling their lovingly crafted pixels.

Unless...

Cue texture compression!

Texture compression is, at its core, a set of algorithms built into your graphics card that let it decompress specific texture formats on the fly. This means that the texture actually stays compressed in your video memory and is only decompressed when your shader does a texture lookup. That may sound slow, but it's not. You can only use very specific formats for the compressed textures, and those are formats that are designed for speed rather than super high compression rates. For example, let's look at one of the more popular formats: DXT5 (the one used in the demo from above).

DXT5 offers a fixed 4:1 compression ratio over 32-bit RGBA. This means that in the space one uncompressed RGBA texel takes up, DXT5 can store four. Applying that to our above 1024x1024 texture, which takes 4MB once it carries an alpha channel, this would translate to 1MB (1/4th of the uncompressed version). That also happens to be the same amount of space that a 512x512 uncompressed RGBA texture would take. Seems like a pretty clear win, right? Unfortunately it's not without its complications.

The compression for most formats is lossy (though in different ways than JPEG), but typically the visual artifacts of the compression are offset by the fact that you can fit four times as much texture data into the same space! Ask yourself this: Would you rather have a 512x512 texture that's completely accurate or a 1024x1024 texture that's got some minor artifacts? I know which one I would choose! Of course, the nice thing is that the developer gets to choose which route they want to go: Do you need four times as much detail in your scene, or do you want your scene to fit in a fourth the memory?

Also on the list of downsides, the 1MB compressed version is obviously still several times larger than the 187k JPEG version, so (counterintuitively) compressed textures will probably not help you out much when it comes to download times, which is unfortunate. Just one more thing to consider in the course of your development.

A final hitch in the whole compressed texture thing is that not every device supports every compressed texture format. For example: Most desktop systems will support S3 compression (that's your DXT1-5 textures), but your iOS devices won't. They support PVR compression. The two methods are similar in usage, but have different properties in terms of compression, artifacts, and performance. Those aren't the only formats either. There are a lot of them out there! It's hard to argue that one is "better" than the other (though there are plenty of people who try, usually the hardware manufacturers), and on the developer's end you simply have to use whatever your platform can understand.

That does make things difficult for WebGL, however. Since pretty much any device has a browser nowadays, and many of those will probably have WebGL in the future, it means that a WebGL dev that wants to use compressed textures will have to keep three or four variants of their textures around, query the device for which ones it supports, and then download the one that meets the platform's needs. Sounds fun, right?
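
Roughly, that selection step could look like the sketch below. Everything in it is illustrative: pickTextureVariant is a made-up helper, and the variant list, file names, and PVRTC extension name are assumptions rather than anything from the demo.

```javascript
// Hypothetical helper: pick the first texture variant whose extension the
// context reports, falling back to a plain image every browser can decode.
function pickTextureVariant(gl, variants, fallbackUrl) {
    // getSupportedExtensions can return null (for example on a lost context).
    var supported = gl.getSupportedExtensions() || [];
    for (var i = 0; i < variants.length; i++) {
        if (supported.indexOf(variants[i].extension) !== -1) {
            return variants[i].url;
        }
    }
    return fallbackUrl;
}

// Example variant list; the URLs and the PVRTC extension name are assumptions.
var variants = [
    { extension: "WEBKIT_WEBGL_compressed_texture_s3tc", url: "textures/wall.dds" },
    { extension: "WEBKIT_WEBGL_compressed_texture_pvrtc", url: "textures/wall.pvr" }
];

// Usage, given a real WebGL context:
// var url = pickTextureVariant(gl, variants, "textures/wall.jpg");
```

Ordering the list by preference keeps the decision in one place, and the JPEG fallback means devices with no compressed texture support still get something to draw.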

That being said, in many cases the upsides will still outweigh the downs, and it's probably worth the pain on our end as developers to deliver a richer experience to the users.

So, now that we know all about what a compressed texture is, how do we use them?

Implementation

First, a big fat disclaimer: These APIs are all very new and very experimental. While I don't expect the overall concepts to change too much, the exact details may shift around a bit as the feature settles down. I will do my best to keep this post updated in the future with the correct API calls, but don't kill me if they're out of date from time to time. Also, I'm pretty much only looking at Webkit browsers here. I have no idea what the other browsers are going to try and do in this space, but I would imagine that it shouldn't be too terribly different than the methods described here.

Compressed textures are exposed to WebGL as an extension. To anyone that's familiar with extensions in desktop OpenGL, that may be a cringe-inducing statement, but WebGL actually makes working with extensions fairly painless!

[Updated: You used to query for texture compression as a single extension. It has since been decided that it makes more sense to query for support of individual texture formats, which is what this line now does]

var ct = gl.getExtension("WEBKIT_WEBGL_compressed_texture_s3tc");

If the extension is supported, ct will be an object that contains the enumerations for the extension. If the extension is not supported, you'll get back null. Not too bad, right? From this point on, the compressed format enums come from the ct object, while the upload functions themselves are called on the gl object.
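
Since the extension name carries a vendor prefix, it can help to wrap the lookup in a helper that tries a few candidates. This is only a sketch: getS3TCExtension is a made-up function, and the unprefixed and MOZ_ names are assumptions about what other browsers may expose, not something covered in this post.

```javascript
// Hypothetical helper: returns the first S3TC extension object the context
// exposes, or null if none of the candidate names is supported.
function getS3TCExtension(gl) {
    var names = [
        "WEBGL_compressed_texture_s3tc",        // assumed unprefixed name
        "WEBKIT_WEBGL_compressed_texture_s3tc", // name used in this post
        "MOZ_WEBGL_compressed_texture_s3tc"     // assumed Mozilla-prefixed name
    ];
    for (var i = 0; i < names.length; i++) {
        var ext = gl.getExtension(names[i]);
        if (ext) {
            return ext;
        }
    }
    return null;
}

// Usage, given a real WebGL context:
// var ct = getS3TCExtension(gl);
// if (!ct) { /* fall back to regular JPEG/PNG textures */ }
```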

Unfortunately the contents of the ct object aren't documented anywhere that I know of yet, with the exception of looking at the code used to implement it. [EDIT: Brendan pointed out in the comments that the spec for this extension is available] I got most of my information from this Webkit bug. Lucky for us, there are only a few functions that we need to look at.

First are the enumerations for the texture types. This list could conceivably grow, but as implemented right now the following symbols are available from the extension queried above:

  • COMPRESSED_RGB_S3TC_DXT1_EXT
  • COMPRESSED_RGBA_S3TC_DXT1_EXT
  • COMPRESSED_RGBA_S3TC_DXT3_EXT
  • COMPRESSED_RGBA_S3TC_DXT5_EXT

Each of these represents a compressed texture type that may be supported. It should be pretty apparent from the names which is which. Just because the symbol is defined, though, doesn't mean your system supports it. To figure that out we have to call:
var formats = gl.getParameter(gl.COMPRESSED_TEXTURE_FORMATS);
This returns a list of enumerated values for the formats the current device supports. To test and see if a particular format is supported, you'll have to loop over it doing something like this:

var i, dxt5Supported = false;
// Walk the typed array by index and stop at the first match.
for (i = 0; i < formats.length; i++) {
    if (formats[i] === ct.COMPRESSED_RGBA_S3TC_DXT5_EXT) {
        dxt5Supported = true;
        break;
    }
}
(Note: I previously had an "indexOf" there. That won't work, because the formats list is an Int32Array)

Once we know which formats are supported, we can create a texture using that format with the following calls:
gl.compressedTexImage2D(target, level, internalFormat, width, height, border, data);
gl.compressedTexSubImage2D(target, level, xOffset, yOffset, width, height, internalFormat, data); 
This works very similarly to gl.texImage2D, with the following exceptions:
  • The internalFormat must be one of the compressed texture format enums
  • You must give it a width, height, and border. There's no variant of the function that will do it automatically.
  • You always give it a typed array of the raw compressed data. No passing in image elements here.
Otherwise you treat compressed textures just as you would any other texture. 
var texture = gl.createTexture();
gl.bindTexture(gl.TEXTURE_2D, texture);

// Upload the first (and here, only) mip level of raw DXT5 data.
gl.compressedTexImage2D(gl.TEXTURE_2D, 0, ct.COMPRESSED_RGBA_S3TC_DXT5_EXT, 512, 512, 0, textureData);
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.LINEAR);
// generateMipmap can't be used on compressed textures, so with a single
// level uploaded the min filter must not be a mipmap filter.
gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
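
One thing that can trip you up here is the size of textureData. S3TC formats store texels in 4x4 blocks (8 bytes per block for DXT1, 16 bytes for DXT3 and DXT5), so the array you pass needs to match the block count for the width and height you give. The helper below is a sketch with a made-up name, meant as a sanity check before uploading.

```javascript
// Hypothetical helper: expected byte length of one S3TC-compressed level.
// bytesPerBlock is 8 for DXT1 and 16 for DXT3/DXT5.
function s3tcLevelByteLength(width, height, bytesPerBlock) {
    // Dimensions are rounded up to whole 4x4 blocks.
    var blocksWide = Math.max(1, Math.ceil(width / 4));
    var blocksHigh = Math.max(1, Math.ceil(height / 4));
    return blocksWide * blocksHigh * bytesPerBlock;
}

var dxt5Bytes = s3tcLevelByteLength(512, 512, 16); // 262,144 bytes

// Usage before uploading:
// if (textureData.byteLength !== dxt5Bytes) { /* wrong size or wrong format */ }
```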

Okay, great. So far so good. But for those of you that are used to simply giving WebGL an image tag and letting it go, I'm sure you're wondering where that textureData comes from. And thus we hit on one of the primary sticky points of compressed textures.

You see, most of the time a DXT5 image will come in the form of a DDS file, one of Microsoft's formats. But not always. Sometimes it may be wrapped in a custom container, like Valve's VTF format. In either case, the browser manufacturers have to be very careful about what they implement, because the file formats and compression schemes they use may be patent encumbered. Chrome, Firefox, and Opera all work very hard to steer clear of patents so that they can deliver a great browser to you free of charge without shouldering a financial burden themselves. If they were to start handling some of these patented formats automatically they would open themselves up to all sorts of nasty licensing issues and possibly lawsuits. Nobody wants that, but at the same time they do want to provide a way for you to use these formats without being detrimental to themselves.

So a compromise is struck: WebGL doesn't attempt to decipher compressed textures at all. It simply asks you to give it an array of texture data and tell it how big it is. WebGL then hands that information off to the graphics driver for you and says "Here! You handle it!" without looking at the data that it was given. That way while the drivers and texture creation tools need to have the appropriate licenses (and they already do), WebGL happily acts as a dumb pipe that just shuffles the data around semi-blindly, and thus stays out of the crosshairs of our crazy legal system.

What this means for you as a programmer, however, is that you're shouldering some of the burden of parsing those files. This is actually how all textures work in desktop OpenGL, so it's not a big deal if that's already your background, but it feels like a big inconvenience here since WebGL has been pretty good about sweeping the ugly bits of image handling under the rug for us.

In my demo, I've implemented a simple class to read the header of a DDS file and figure out the width, height, and data buffer. If you wanted to use other file formats like PVR, you'd have to write a parser for them too. It's not terribly difficult, the format itself is well documented, but it's certainly something that begs for a good library to hide away the details. Hm....
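
To give a feel for what such a parser involves, here is a minimal sketch. It is not the class from the demo: parseDDS and the constant names are made up for illustration, it only handles the first mip level of a DXT5 file, and the byte offsets follow the documented DDS layout (a 4-byte magic number followed by a 124-byte header of 32-bit little-endian fields).

```javascript
// Sketch of a DDS header reader, limited to level 0 of a DXT5 texture.
var DDS_MAGIC = 0x20534444;   // "DDS " read as a little-endian 32-bit integer
var FOURCC_DXT5 = 0x35545844; // "DXT5" read as a little-endian 32-bit integer
var DDS_HEADER_BYTES = 128;   // 4-byte magic + 124-byte header

function parseDDS(arrayBuffer) {
    if (arrayBuffer.byteLength < DDS_HEADER_BYTES) {
        throw new Error("File is too small to be a DDS file");
    }
    var view = new DataView(arrayBuffer);
    if (view.getUint32(0, true) !== DDS_MAGIC) {
        throw new Error("Not a DDS file");
    }
    var height = view.getUint32(12, true); // dwHeight
    var width = view.getUint32(16, true);  // dwWidth
    var fourCC = view.getUint32(84, true); // ddspf.dwFourCC
    if (fourCC !== FOURCC_DXT5) {
        throw new Error("Only DXT5 is handled in this sketch");
    }
    // DXT5 packs each 4x4 block of texels into 16 bytes.
    var byteLength = Math.max(1, Math.ceil(width / 4)) *
                     Math.max(1, Math.ceil(height / 4)) * 16;
    if (arrayBuffer.byteLength < DDS_HEADER_BYTES + byteLength) {
        throw new Error("DDS file is truncated");
    }
    return {
        width: width,
        height: height,
        // A view onto the original buffer, so no texture data is copied.
        data: new Uint8Array(arrayBuffer, DDS_HEADER_BYTES, byteLength)
    };
}

// Usage, given an ArrayBuffer loaded from a .dds file:
// var dds = parseDDS(buffer);
// gl.compressedTexImage2D(gl.TEXTURE_2D, 0, ct.COMPRESSED_RGBA_S3TC_DXT5_EXT,
//                         dds.width, dds.height, 0, dds.data);
```

Reading through a DataView with the little-endian flag set keeps the parsing independent of the machine's byte order, at the cost of being a bit more verbose than indexing into an Int32Array.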

So that's it for now! As I said earlier, I expect things in this space to change in the near future, and it's hard to recommend that anybody start using this commercially just yet, but I'm sure that a great many graphics devs (like me!) are eagerly waiting for the dust to settle here so we can all start using this great tech in our demos, games, and apps!