Tuesday, September 27, 2011

Source Engine Levels in WebGL: Tech Talk

So I've just gotten back from onGameStart and have been very pleased with how well my "Surprise Project" was received! For anyone not at the conference or following me on Twitter: at the end of my presentation I demonstrated a Source Engine level (2Fort, from Team Fortress 2) running in WebGL at an absolutely stable 60fps! Now, to be perfectly fair, there are a lot of bits of the rendering that don't work properly yet. Off the top of my head, it's still missing: normal mapping on brush surfaces, displacement surfaces, water rendering, the 3D skybox, any shaders that use cubemaps, accurate lighting on props... you get the idea. Over the next few weeks I'm going to try to fix some of the more egregious omissions, after which I'll put the code up on GitHub for any enterprising developers. I'll also post a YouTube walkthrough of the level, but don't expect a live version any time soon.


[Update: Video and Source Code are online now!]

There are a couple of reasons why I'm not planning on posting a live version of the demo. First and foremost, I obviously don't own the rights to the content. That's actually true of any of the demos that I've posted thus far, but whereas Quake 3 is 11 years old at this point(!), Team Fortress 2 is a very modern game that people are still actively playing. I feel a little different about distributing resources for a game that is still making its publisher good money, and I certainly don't want to step on any toes at Valve. Secondly, however, is the practical matter that the resources for this single level take up nearly 200MB! (In comparison, the Quake 3 demo requires about 12MB of binary and textures.) My web host (Dreamhost) is decent enough about giving me "unlimited" bandwidth, but I'm not entirely certain how well they would hold up to hundreds of people pulling that much data all at once.

Anyway, I've already had a couple of people request that I do one of my "Tech Talk" posts about the demo, which was my intention from the start. As always, this is more or less a brain dump of anything interesting I can think of saying about the project, so append all the standard disclaimers about technobabble here and let's get started!

So the first thing that struck me about the Source Engine BSP format is how extremely similar it is to the Quake 2 BSP. I had heard this before, but I had assumed that people meant it was similar in the same way that, say, Quake 3's format was similar to Quake 2's: some core shared concepts, but improved suitability for the platform. To my surprise, Source's BSP is really more accurately termed Quake 2++. It's so similar that you could probably accurately read half the format with the same loader! I'd love to say that this was a good thing, but the reality of the matter is that Quake 2's format is very unsuitable for modern, hardware-accelerated games. In fact, some readers may recall that I did a post about that exact issue a while back. Sadly, Source inherits just about every single limitation I listed there, and then adds a few more besides!

Now, I'd like to take a moment and clarify something: If it sounds like I don't like the format, well, that's because it's true. But I'm not going to go so far as to call Valve out on it and suggest they do something better. The fact is, the format as it is now is almost certainly a side effect of the fact that Source was built off the original Half-Life engine (or GoldSrc, if you will). GoldSrc, in turn, was built off of code licensed from id. All along the way, it's doubtful that anybody wanted to take a set of content creation tools that were proven and working and scrap them just because the file format was less-than-optimal. Joel Spolsky has a really excellent post about how scrapping your code and starting again is the worst possible approach to any program, and Source is a very good example of that. After all, a bad file format hasn't stopped the Valve games from being insanely popular and very profitable, has it? So, no, this isn't the format that you would build if you were starting from scratch, and I certainly wouldn't recommend its use outside the Source engine, but there's no reason for Valve to dump it just yet.

Credit where credit is due: Most of what I did was figured out from the documentation posted at the Valve Developer Community. There were also significant bits here and there that were gleaned from the Source SDK.

As far as the BSP format itself goes, there's not a whole lot new to say about it. It's very similar to previous BSPs that I've worked with in that it contains brushes (convex hulls, defined by a list of planes) which are used for collision detection and also get broken down into the triangles that make up the bulk of the world geometry. Both brushes and triangles get attached to leaves of the Binary Space Partitioning (BSP) tree, which you can use to quickly determine where on the map any given point is, and thus narrow down the elements that need to be tested for visibility and collision. The biggest difference in terms of map makeup is that the Source maps rely a lot more on "props" for their overall layout. Props are models (typically static) that are placed around the level as detail geometry, and in some cases they can make up the bulk of the geometry in a level. As a general rule, anything that you see in a Source level that isn't a flat surface like a wall or floor is probably a prop.
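For the curious, the "where is this point?" query is just a walk down the plane tree. Here's a minimal sketch of the idea; the node and leaf field names are made up for illustration and don't match the actual lump layout in my loader:

```javascript
// Walk the BSP tree to find the leaf containing a point. Hypothetical
// layout: each node has a plane and two child ids, and a negative child
// id encodes "leaf number (-id - 1)", mirroring the Quake-style trick.
function findLeaf(nodes, leaves, rootId, point) {
  var id = rootId;
  while (id >= 0) { // non-negative ids are internal nodes
    var plane = nodes[id].plane;
    // Signed distance from the point to the node's splitting plane
    var d = plane.normal[0] * point[0] +
            plane.normal[1] * point[1] +
            plane.normal[2] * point[2] - plane.dist;
    id = d >= 0 ? nodes[id].children[0] : nodes[id].children[1];
  }
  return leaves[-(id + 1)]; // negative id encodes a leaf index
}
```

Once you have the leaf, you have the short list of brushes and triangles worth testing, which is the whole point of the structure.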

A side note here: Have you ever noticed how insanely detailed some of these maps are? Seriously, start up a server with nobody but yourself in it some time and just go wander around 2fort. It's absolutely stunning how many little nooks and crannies there are in the levels that you never really notice because you're far too concerned about not dying. Especially in a game like Team Fortress 2, that everyone associates with big, colorful shapes and clean outlines, it's surprising just how beautifully cluttered the world is.

There are a couple of new types of geometry in the Source maps, neither of which I'm handling particularly well at this point. We have displacement surfaces, which can best be described as distorted planes used for landscapes, rock walls, etc. These are built with a fairly complex structure of nested vertices, presumably so that the level of detail can easily be reduced on lower-end machines. Then there are water surfaces, which you can probably guess at the usage for.

Worth mentioning is the fact that Source maps take Quake 2's approach to lightmapping, in that the lightmaps are actually stored per-face. This means that for performance reasons I once again broke out the code I did for Quake 2 to pack multiple lightmaps into a single texture and reused it here. It's not a huge pain, but it does introduce one more step that must be taken before we can start compiling the vertices, since we don't know where to set the lightmap UV coords until after the lightmaps have been fully parsed. Quake 3 had a much better system for this, where all the lightmaps were pre-packed and the texture coords were already correct.
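The packing itself doesn't need to be clever. Something along the lines of this naive "shelf" packer (the names are illustrative, not my actual code) is enough to get every per-face lightmap an offset in a shared atlas, after which the lightmap UVs can be rewritten:

```javascript
// Naive shelf packer: place each lightmap rect left-to-right on the
// current row, and start a new row when the atlas width is exceeded.
// Returns an x/y placement for each rect, in input order.
function packLightmaps(rects, atlasWidth) {
  var x = 0, y = 0, rowHeight = 0;
  var placements = [];
  for (var i = 0; i < rects.length; ++i) {
    var r = rects[i];
    if (x + r.width > atlasWidth) { // doesn't fit: start a new shelf
      x = 0;
      y += rowHeight;
      rowHeight = 0;
    }
    placements.push({ x: x, y: y });
    x += r.width;
    rowHeight = Math.max(rowHeight, r.height); // shelf grows to tallest rect
  }
  return placements;
}
```

Sorting the rects by height first would waste less space, but for lightmap-sized rectangles even the naive version packs reasonably well.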

Also, a special shout-out needs to be given to the insanity that is the texture coordinates. Rather than store UVs directly in an easily readable format, they instead store, essentially, a texture matrix that you apply against the vertex position to derive a pixel X/Y offset on the texture, which you can then divide by the texture width and height to get the appropriate UVs. At the very least, they have the decency to give you the texture dimensions in the BSP file rather than making you load the external resource like Quake 2. This is one of those things that makes a lot of sense for an editor format, and next to none in the "compiled" map. It drives me batty!
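In code, the computation described above boils down to two dot products per vertex. This is a sketch of the idea, with illustrative parameter names rather than the exact lump layout:

```javascript
// Derive UVs Source-style: two 4-component vectors are dotted against
// the vertex position (the 4th component is a pixel offset), giving
// pixel coordinates that are then divided by the texture dimensions.
function computeUV(pos, texVecS, texVecT, texWidth, texHeight) {
  var u = pos[0] * texVecS[0] + pos[1] * texVecS[1] +
          pos[2] * texVecS[2] + texVecS[3];
  var v = pos[0] * texVecT[0] + pos[1] * texVecT[1] +
          pos[2] * texVecT[2] + texVecT[3];
  return [u / texWidth, v / texHeight];
}
```

It's not hard once you know it's there, but it does mean every vertex needs this pre-processing pass before the vertex buffer can be built.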

Materials are stored in a format that strongly resembles the material format from Quake 3 (it looks a little bit like JSON), but in this case there are only a few basic shaders that are supported. These are things like lightmapped geometry, vertex lit, water, and so on. Basically, each material selects its shader, and then the rest of the material represents the variables that are fed into the shader, like texture maps, transforms, shininess, etc. The range of variables that can be applied can be quite large in some cases, making the more common shaders (like the vertex lighting shader) very robust and flexible. (Ubershaders, so to speak.) It does end up feeling slightly more limited than the Quake 3 approach, but I'm quite certain that it's much more efficient and probably one that is worth emulating in your own engines.

There are some materials that get embedded into the BSP itself, and these usually make use of the embedded cube maps that are generated at different points throughout the level and also stored in the BSP. I wasn't able to make use of these, however, since they're stored in a zipped format within the BSP and, well, frankly I don't want to write a gzip decompressor in Javascript. As a result, any surfaces that are supposed to be really shiny and reflective in the level typically show up black or with the default checker pattern texture. Sorry!

The model format, used for player meshes and props, is somewhat better in its suitability for modern hardware than the BSP file, but is still not the greatest for web use simply because the format is actually spread across three file types.
  • .MDL, which stores basically everything that isn't geometry-related. This includes bone hierarchies, materials, attachment points, animations, etc.
  • .VVD, which stores the raw vertex data. It also contains some level of detail information which is required to lay out the vertices in the correct order.
  • .VTX, which stores triangle indices, sorted by body part, model, mesh (which is associated with a material), level of detail, strip groups, and finally triangle strips.
Some data elements are repeated between files, or share extremely similar structures, which makes them all the larger to download. As I said, not the most web-friendly of formats.

The information in the VTX file is somewhat difficult to parse, as it has a lot of layers of information in it, and several indirections that you have to go through to get the raw index data. For example: the indices don't point directly at a vertex offset, but instead give an index into a "vertex table", which itself contains the actual index. Even that number, however, is not a true index, since you also have to manually calculate another offset into the vertex array based on the number of vertices in all prior meshes. (That's not documented anywhere, by the way, and was painful to figure out.) So in order to build that all-important index buffer you have to do a lot of pre-processing.
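The double indirection plus the undocumented per-mesh offset looks something like this. I'm using my own shorthand names for the structures here, not the exact SDK struct fields:

```javascript
// Rebuild a flat index buffer from VTX-style indirection: each strip
// index points into the strip group's vertex table, whose entry holds
// a mesh-relative vertex id. Adding the running total of vertices in
// all prior meshes turns that into a real offset into the VVD data.
function buildIndexBuffer(meshes) {
  var indices = [];
  var vertexBase = 0; // vertices contributed by all prior meshes
  meshes.forEach(function(mesh) {
    mesh.stripGroups.forEach(function(group) {
      group.indices.forEach(function(i) {
        // strip index -> vertex table entry -> mesh-relative id -> global id
        indices.push(vertexBase + group.vertexTable[i].origMeshVertId);
      });
    });
    vertexBase += mesh.vertexCount; // the undocumented part!
  });
  return indices;
}
```

Miss that `vertexBase` accumulation and every mesh after the first comes out garbled, which is exactly how my first few attempts looked.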

The data structure also feels a little heavy for its intended purpose. The body parts seem to have a lot to do with game logic, but I'm not certain what the difference between models and meshes is. LOD is pretty clear, and strip groups are obviously intended to allow DirectX renderers to cut down on the number of locks they need to do by providing vertex/index buffer offsets and lengths. (GL renderers aren't so lucky, as vertex offsets require a re-binding of the shader attribs.) Strip groups are broken up based on the number of bones that they are associated with, which is smart and allows for easy GPU skinning.

Still, since I don't have any skinning implemented yet and was more concerned about the performance of static props, on load I condense the entire format down into a series of meshes, one per material, and a set of triangle patches that make up each mesh. A triangle patch is nothing more than a start index and an index count. If we were concerned about a model having more than 65536 verts, a vertex offset would be required as well, but the Source Engine apparently limits that, so no worries here. When a model is a prop on the map, we actually do store the vertex offset, since we combine the vertices and indices for all of the props on the map into one big buffer, which allows us to bind it once for all props on the map.
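The prop-merging step can be sketched like so. This is a simplified illustration with made-up field names, treating vertices as opaque elements rather than interleaved float data:

```javascript
// Merge every prop mesh into one shared vertex/index buffer. Each
// prop's indices get rebased by the running vertex count, so a single
// buffer bind (and one set of attrib pointers) covers all props, and
// each prop keeps a patch (start + count) for its drawElements call.
function mergeProps(props) {
  var vertices = [];
  var indices = [];
  var patches = [];
  props.forEach(function(prop) {
    var vertexOffset = vertices.length; // rebase this prop's indices
    var start = indices.length;
    prop.indices.forEach(function(i) {
      indices.push(i + vertexOffset);
    });
    vertices = vertices.concat(prop.vertices);
    patches.push({ start: start, count: prop.indices.length });
  });
  return { vertices: vertices, indices: indices, patches: patches };
}
```

In the real loader the vertex offset is in units of the vertex stride, of course, but the bookkeeping is the same.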

I mentioned as part of my Quake 3 tech talk a while back that I wasn't doing any geometry culling. That was perhaps applicable at the time, but certainly is not a principle that applies to this format! You simply cannot brute-force render your way through the entire map! That being said, I still did keep it as simple as possible. Each leaf in the BSP tree contains a list of props and brush geometry that is built during the map load. We also parse out the Potentially Visible Set (PVS) for each leaf, which is stored as a run-length encoded series of bitflags inside the BSP. There's one for every leaf, and the number of bits corresponds to the number of leaves. To see if a leaf of the map is potentially visible from another one, you simply check the bit that corresponds with the leaf's index. 1 means it can be seen, 0 means it can't.
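Decoding the run-length encoded visibility data is straightforward. This sketch assumes the Quake-style scheme (a zero byte followed by a count of zero bytes to emit; anything else copied through), which is what the format looked like to me, but treat the details as my reading rather than gospel:

```javascript
// Decompress one leaf's run-length encoded PVS into a flat byte array
// of visibility bits: a zero byte is followed by the number of zero
// bytes to emit; any non-zero byte is copied through unchanged.
function decompressPVS(compressed, numLeaves) {
  var numBytes = Math.ceil(numLeaves / 8);
  var out = [];
  var i = 0;
  while (out.length < numBytes && i < compressed.length) {
    if (compressed[i] !== 0) {
      out.push(compressed[i]);
      i += 1;
    } else {
      var run = compressed[i + 1]; // length of the zero run
      for (var j = 0; j < run; ++j) out.push(0);
      i += 2;
    }
  }
  return out;
}

// Test the bit for a given leaf: 1 means potentially visible.
function leafVisible(vis, leafIndex) {
  return (vis[leafIndex >> 3] & (1 << (leafIndex & 7))) !== 0;
}
```

Since most leaves can't see most other leaves, the zero runs make the compressed data dramatically smaller than the raw bit matrix.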

To help speed up the visibility flagging, I use a couple of tricks. First, I only bother doing the visibility checks when you change leaves as you move around the level. I also don't bother (yet) with any sort of frustum culling, which means that until you move from one leaf to another the visible set is exactly the same. With props I flag the entire prop as visible or not, but with brush surfaces I do the flagging by material. If a material is visible for any of the potentially visible leaves, we draw all geometry associated with that material, no matter where it is on the map. It's simply not worthwhile to break it into multiple draw calls for the sake of saving some triangles. Additionally, I don't use a boolean "visible" flag on the geometry, but instead have a "visibleFrame". Each time I change leaves, I increment a "currentFrame" variable and set all visible geometry's "visibleFrame" equal to it. With props I also have them apply the flag to their prop type. Then during the render loop, if a piece of geometry's visibleFrame is less than the currentFrame, I don't render it. It's an old trick from Quake that prevents you from manually clearing the vis flags on everything. Still a very valid technique!
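The frame-counter trick is tiny but worth spelling out, since it's so handy. A minimal sketch (the names mirror the description above, not my actual code):

```javascript
// Frame-counter visibility: instead of clearing a boolean on every
// piece of geometry whenever the camera changes leaves, stamp the
// visible geometry with the current frame number and compare at draw
// time. Stale stamps from earlier frames simply compare unequal.
var currentFrame = 0;

function markVisible(visibleGeometry) {
  currentFrame += 1; // entering a new leaf starts a new "frame"
  visibleGeometry.forEach(function(geom) {
    geom.visibleFrame = currentFrame;
  });
}

function isVisible(geom) {
  return geom.visibleFrame === currentFrame;
}
```

Nothing ever has to be reset: geometry that wasn't stamped this frame fails the comparison automatically, no matter how many frames ago it was last visible.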

The core render loop for the map ends up looking like this:
  • Draw Skybox
  • Bind Brush Geometry buffers
  • Loop through Opaque Brush Materials
    • If the material is visible:
      • Bind material
      • Draw all geometry for it (one drawElements call)
  • Bind Prop Geometry buffers
  • Loop through all Prop Types with Opaque Materials
    • If prop type is visible:
      • Loop through materials
        • If material is opaque:
          • Bind Material
          • Loop through all instances of this prop type
            • If instance is visible:
              •  Bind the appropriate light and model matrices
              • Draw all geometry for that material
  • Bind Brush Geometry buffers
  • Repeat Brush Geometry loop with transparent materials
  • Bind Prop Geometry buffers
  • Repeat Prop Geometry loop with transparent materials
As you can see, the whole thing is hard-wired to switch state as little as possible. Also, we have to draw transparent geometry after everything else or it won't render correctly. Normally you would want to sort the transparent geometry from back to front, but I'm not doing that here. It's a known flaw, and you can see it manifest in a couple of places on the demo map where, for instance, a tree will disappear when you try to look at it through a fence.

Something that's worth pointing out about the way I'm rendering props: it's basically begging for instanced rendering! I'm not sure if the Source engine actually uses instancing for these meshes, but it feels like a very natural fit. I was told once that one of the big reasons that WebGL doesn't have instanced rendering is because it's hard to make use of it in real-world scenarios, so very few games do. I'd like to present my counter-example, and ask nicely for the people making decisions to reconsider!

Alright, so I think that's enough rambling for one post. I usually end up tweaking and amending these things after I've put them up, so if I've omitted anything it won't stay omitted for long. Also, there's typically some good questions that pop up in the comments, so be sure to browse through them!

Oh, and for anyone still wondering about what the last teaser post was, it was a top-down point-cloud of the Heavy:

18 comments:

  1. Very nice post! I experimented with id-game-engines myself, so its very interesting to read about its derivates and its histories. I especially find it interesting, that the source engine in parts resembles more quake2 engine, even though it is newer than quake3. Still source-engine games look awesome. I would like to see newer id-engines ported to webgl, e.g. rage-idtech4. But J.Carmack once tweeted, that he sees the future of web-games more with native clients, not with webgl. So webgl-idtech4 probably wont happen soon.

  2. Are you planning to publish the source, just like your Quake 3 and RAGE loaders?

  3. Yes, I certainly am planning on open sourcing what I've got. I do want to clean up a few things first, and maybe fix a couple of the outstanding bugs, but I will get it on github soon.

  4. That all really depends on how fast they can make javascript. However, a combination of NaCL and WebGL might see the light of day at some point (which must flow through JS if I understand NaCL correctly).

  5. Awesome! Thanks for working on something meaty like this. All the little effects demos everyone else are doing are getting old.

  6. "a combination of NaCL and WebGL might see the light of day at some point"

    Yes, that's in the works right now. I think you'll probably see a scenario emerge where we toss really complex calculations off to NaCl or WebCL, like Physics or possibly complex AI, and keep some of the other bits in Javascript (like game logic) for ease of development. And honestly, I think rendering may be one of those things that end up staying in Javascript, as weird as that sounds. The fact is that WebGL already allows the native code to handle most of the complex stuff via shaders. You just have to be frugal with your script calls.

    Frankly, most of the time that I see people claiming that it's not fast enough unless it's native it really translates into "I can't be bothered to optimize, so I just want the platform to go faster for me."

  7. "But J.Carmack once tweeted, that he sees the future of web-games more with native clients, not with webgl."

    So I wanted to comment on this, because I've seen it repeated several times. I went searching for the actual tweets, and what he said was:

    "I agree with Microsoft’s assessment that WebGL is a severe security risk. The gfx driver culture is not the culture of security."

    followed by:

    "They are orthogonal technologies, but NaCl is much, much easier to make secure than WebGL, even though it sounds scarier."

    Really he's just talking security in this case, not how fit one platform or another is for gaming. (Though security certainly factors into that!) In the end, though, I think this is just a popular misquote. To my knowledge he hasn't said anything more publicly about web game development tech.

  8. absolutely amazing.
    Have you seen this post on VTX and vertices: http://stackoverflow.com/questions/1844727/vtx-file-format
    (Some time ago I toyed with valve's 3D data formats for a modeler application)
    @NaCl: Performance hasn't been what I expected in comparison to C# for data intensive tasks. It is also unclear whether NaCl will make it as a technology. But there is a definite void to be filled, left open by the dated java-web-applets platform

  9. The VTX format that the StackOverflow question references is different than the one that Valve uses. It's an ASCII format used by Anim8tor.

  10. Nice!

    BTW, I would render the skybox at the end of the opaque geometry and before the transparent. I'm guessing you're calling gl.Clear for your skybox for now :)

  11. Ah so, I interpreted the Carmack-tweets as: 'since WebGl isnt and wont be secure, NaCl is a better web-gaming platform'. Maybe/hopefully I have read too much into it. :)

  12. The skybox is actually being rendered as a cubemapped box right now, not just a simple clear. As such, it would be more efficient to render it after the opaque geometry, yes. It was one of those things that I had planned but didn't have the time to get to before the demo. *shrug* I'm currently not fillrate limited anyway, so it's not super critical.

    I will probably fix that before I release the code, though.

  13. Awesome write-up, great to see such progress and a more practical example of webgl. Looking forward to seeing a youtube clip of it in action, and even more the running demo!
    Keep it up!

  14. Hi Brandon,
    Brandon I want to make a Game in my Final Year Project on HTML5 and I want to keep it as a third person game like your work above, my group members and myself are good in programming but are very poor in Graphics. So as HTML5 supports WebGL please Guide me how to go with WebGL & it's working from Basics.

  15. hi syed haider abbas rizvi,
    rizvi ,,look there is no one who can guide you from the basics .you have to search in the wide universe of internet to resolve your queries. although ,there are many tutorials avialable on youtube which may help u alot...

  16. Syed,

    Hi! I'm probably not the best one to give an intro to WebGL, as I tend to focus on more advanced stuff. If you're new to 3D graphics programming, I highly recommend looking at http://LearningWebGL.com They have some great introductory lessons!

    Best of luck!

  17. I just wanted to say thank you. I've been working on writing a Source renderer of my own (plain old OpenGL/C++ though, not WebGL), and had been going absolutely crazy trying to get model rendering working properly. I'd scoured the internet for references, and managed to write parsers for the MDL, VVD, and VTX formats that seemed to basically work, but multi-mesh models kept coming out garbled. I finally stumbled across your page, and your comment about the vertex indices being relative to the total count of all previous meshes was the missing piece. "Painful to figure out" indeed. :/ Thank you for documenting something that seems to be mentioned nowhere else on the internet!

    Replies
    1. I'm very happy this helped you out! Any time I find myself getting caught on odd minutia like that I try to post SOMETHING about it online, simply because I wish the guy before me had done the same. It's nice to know that I'm not just talking to myself. :)
