[Update: Video and Source Code are online now!]
There are a couple of reasons why I'm not planning on posting a live version of the demo. First and foremost, I obviously don't own the rights to the content. That's actually true of any of the demos that I've posted thus far, but whereas Quake 3 is 11 years old at this point(!), Team Fortress 2 is a very modern game that people are still actively playing. I feel a little different about distributing resources for a game that is still making its publisher good money, and I certainly don't want to step on any toes at Valve. Secondly, there's the practical matter that the resources for this single level take up nearly 200MB! (In comparison, the Quake 3 demo requires about 12MB of binary and textures.) My web host (Dreamhost) is decent enough about giving me "unlimited" bandwidth, but I'm not entirely certain how well they would hold up to hundreds of people pulling that much data all at once.
Anyway, I've already had a couple of people request that I do one of my "Tech Talk" posts about the demo, which was my intention from the start. As always, this is more or less a brain dump of anything interesting I can think of saying about the project, so append all the standard disclaimers about technobabble here and let's get started!
So the first thing that struck me about the Source Engine BSP format is how extremely similar it is to the Quake 2 BSP. I had heard this before, but I had assumed that people meant it was similar in the same way that, say, Quake 3's format was similar to Quake 2's: some core shared concepts but improved suitability for the platform. To my surprise, Source's BSP is really more accurately termed Quake 2++. It's so similar that you could probably accurately read half the format with the same loader! I'd love to say that this was a good thing, but the reality of the matter is that Quake 2's format is very unsuitable for modern, hardware-accelerated games. In fact, some readers may recall that I did a post about that exact issue a while back. Sadly, Source inherits just about every single limitation I listed there, and then adds a few more besides!
Now, I'd like to take a moment and clarify something: If it sounds like I don't like the format, well, that's because it's true. But I'm not going to go so far as to call Valve out on it and suggest they do something better. The fact is, the format as it is now is almost certainly a side effect of the fact that Source was built off the original Half-Life engine (or GoldSrc, if you will). GoldSrc, in turn, was built off of code licensed from id. All along the way, it's doubtful that anybody wanted to take a set of content creation tools that were proven and working and scrap them just because the file format was less-than-optimal. Joel Spolsky has a really excellent post about scrapping your code and starting again being the worst possible approach to any program, and Source is a very good example of that principle in action. After all, a bad file format hasn't stopped the Valve games from being insanely popular and very profitable, has it? So, no, this isn't the format that you would build if you were starting from scratch, and I certainly wouldn't recommend its use outside the Source engine, but there's no reason for Valve to dump it just yet.
Credit where credit is due: Most of what I did was figured out from the documentation posted at the Valve Developer Community. There were also significant bits here and there that were gleaned from the Source SDK.
As far as the BSP format itself goes, there's not a whole lot new to say about it. It's very similar to previous BSPs that I've worked with in that it contains brushes (convex hulls, defined by a list of planes) which are used for collision detection and also get broken down into the triangles that make up the bulk of the world geometry. Both brushes and triangles get attached to leaves of the Binary Space Partitioning (BSP) tree, which you can use to quickly determine where on the map any given point is, and thus narrow the elements that need to be tested for visibility and collision. The biggest difference in terms of map makeup is that the Source maps rely a lot more on "props" for their overall layout. Props are models (typically static) that are placed around the level as detail geometry and in some cases can make up the bulk of the geometry in a level. As a general rule, anything that you see in a Source level that isn't a flat surface like a wall or floor is probably a prop.
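To make the "where on the map is this point" lookup concrete, here's a minimal sketch of walking the tree down to a leaf. The `Plane` and `Node` classes and their field names are made up for illustration, and I'm assuming the Quake-family convention where a negative child value encodes a leaf as -(leaf index + 1). Treat it as a sketch under those assumptions rather than the demo's actual code.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class Plane:
    normal: Vec3
    dist: float

@dataclass
class Node:
    plane: int                 # index into the plane list
    children: Tuple[int, int]  # (front, back); negative means -(leaf index + 1)

def find_leaf(point: Vec3, nodes: List[Node], planes: List[Plane]) -> int:
    """Walk from the root node down to the leaf that contains `point`."""
    index = 0  # assumed: node 0 is the root
    while index >= 0:
        node = nodes[index]
        plane = planes[node.plane]
        # Signed distance from the point to the splitting plane.
        distance = sum(p * n for p, n in zip(point, plane.normal)) - plane.dist
        index = node.children[0] if distance >= 0 else node.children[1]
    return -(index + 1)
```

Each step discards one side of a splitting plane, so the lookup cost scales with the depth of the tree rather than with the amount of geometry in the map.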
A side note here: Have you ever noticed how insanely detailed some of these maps are? Seriously, start up a server with nobody but yourself in it some time and just go wander around 2fort. It's absolutely stunning how many little nooks and crannies there are in the levels that you never really notice because you're far too concerned about not dying. Especially in a game like Team Fortress 2, that everyone associates with big, colorful shapes and clean outlines, it's surprising just how beautifully cluttered the world is.
There are a couple of new types of geometry in the Source maps, neither of which I'm handling particularly well at this point. We have displacement surfaces, which can best be described as distorted planes like landscapes or rock walls, etc. These are built with a fairly complex structure of nested vertices, presumably so that we can easily reduce the level of detail on lower end machines. Then there are water surfaces, which you can probably guess at the usage for.
Worth mentioning is the fact that Source maps take Quake 2's approach to lightmapping, in that the lightmaps are actually stored per-face. This means that for performance reasons I once again broke out the code I did for Quake 2 to pack multiple lightmaps into a single texture and reused it here. It's not a huge pain, but it does introduce one more step that must be taken before we can start compiling the vertices, since we don't know where to set the lightmap UV coords until after the lightmaps have been fully parsed. Quake 3 had a much better system for this, where all the lightmaps were pre-packed and the texture coords were already correct.
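For anyone who hasn't done this sort of packing before, here's a rough sketch of the idea. It uses a simple "shelf" packer, which is one common way to do it and not necessarily what my Quake 2 code does; `LightmapAtlas`, `allocate` and `atlas_uv` are hypothetical names. The point it illustrates is the ordering problem: a face's final lightmap UVs can't be computed until its lightmap has been given a spot in the atlas.

```python
class LightmapAtlas:
    """Packs small per-face lightmaps into one square texture, row by row."""

    def __init__(self, size=1024):
        self.size = size
        self.x = 0           # next free column in the current row
        self.y = 0           # top of the current row
        self.row_height = 0  # tallest lightmap placed in the current row

    def allocate(self, width, height):
        """Return the (x, y) texel origin for a lightmap, or None if full."""
        if width > self.size or height > self.size:
            return None
        if self.x + width > self.size:
            # Current row is full: start a new one below it.
            self.y += self.row_height
            self.x = 0
            self.row_height = 0
        if self.y + height > self.size:
            return None
        origin = (self.x, self.y)
        self.x += width
        self.row_height = max(self.row_height, height)
        return origin

def atlas_uv(atlas_size, origin, local_s, local_t):
    """Convert a texel position inside one lightmap into atlas UVs."""
    return ((origin[0] + local_s) / atlas_size,
            (origin[1] + local_t) / atlas_size)
```

A shelf packer wastes some space when lightmap heights vary a lot, but it's tiny and fast, which matters when it runs at load time.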
Also, a special shout out needs to be given to the insanity that is the texture coordinates. Rather than store UVs directly in an easily readable format, they instead store, essentially, a texture matrix that you apply against the vertex position to derive a pixel X/Y offset on the texture that you can then divide by the texture width and height to get the appropriate UVs. At the very least, they have the decency to give you the texture dimensions in the BSP file rather than making you load the external resource like Quake 2. This is one of those things that makes a lot of sense for an editor format, and next to none in the "compiled" map. It drives me batty!
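Spelled out, the calculation looks roughly like this. I'm assuming the "texture matrix" is two four-component vectors (one each for S and T), with the first three components dotted against the vertex position and the fourth added as an offset; the function and parameter names are mine, not the format's.

```python
def face_uv(position, s_vec, t_vec, tex_width, tex_height):
    """Derive UVs for one vertex from a face's texture vectors.

    position: (x, y, z) vertex position
    s_vec, t_vec: (x, y, z, offset) texture vectors (assumed layout)
    tex_width, tex_height: texture dimensions in pixels
    """
    x, y, z = position
    # Pixel-space offsets on the texture.
    s = s_vec[0] * x + s_vec[1] * y + s_vec[2] * z + s_vec[3]
    t = t_vec[0] * x + t_vec[1] * y + t_vec[2] * z + t_vec[3]
    # Normalize to the 0..1 range that the GPU expects.
    return (s / tex_width, t / tex_height)
```

For example, `face_uv((64, 32, 0), (1, 0, 0, 0), (0, 1, 0, 16), 128, 64)` works out to `(0.5, 0.75)`. Since the result only depends on data that's known at load time, there's no reason to repeat the work per frame: compute it once and bake it into the vertex buffer.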
Materials are stored in a format that strongly resembles the material format from Quake 3 (it looks a little bit like JSON), but in this case there are only a few basic shaders that are supported. These are things like lightmapped geometry, vertex lit, water, and so on. Basically each material selects its shader and then the rest of the material represents the variables that are fed into the shader, like texture maps, transforms, shininess, etc. The range of variables that can be applied can be quite large in some cases, making the more common shaders (like the vertex lighting shader) very robust and flexible. (Ubershaders, so to speak.) It does end up feeling slightly more limited than the Quake 3 approach, but I'm quite certain that it's much more efficient and probably one that is worth emulating in your own engines.
The model format, used for player meshes and props, is somewhat better in its suitability for modern hardware than the BSP file but is still not the greatest for web use simply because the format is actually spread across three file types.
- .MDL, which stores basically everything that isn't geometry-related. This includes bone hierarchies, materials, attachment points, animations, etc.
- .VVD, which stores the raw vertex data. It also contains some level of detail information which is required to lay out the vertices in the correct order.
- .VTX, which stores triangle indices, sorted by body part, model, mesh (which is associated with a material), level of detail, strip groups, and finally triangle strips.
The information in the VTX file is somewhat difficult to parse, as it has a lot of layers of information in it, and several indirections that you have to go through to get the raw index data. For example: the indices don't point directly at a vertex offset, but instead give an index into a "vertex table", which itself contains the actual index. Even that number, however, is not a true index since you also have to manually calculate another offset into the vertex array based on the number of vertices in all prior meshes. (That's not documented anywhere, by the way, and was painful to figure out.) So in order to build that all-important index buffer you have to do a lot of pre-processing.
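Here's the double indirection boiled down to its essentials. This is a simplified sketch: the real vertex table entries hold more than a single number, and `mesh_vertex_offsets` and `resolve_indices` are names I've invented for illustration.

```python
def mesh_vertex_offsets(mesh_vertex_counts):
    """Running total of vertex counts: the offset of each mesh's first vertex."""
    offsets = []
    total = 0
    for count in mesh_vertex_counts:
        offsets.append(total)
        total += count
    return offsets

def resolve_indices(strip_indices, vertex_table, mesh_vertex_offset):
    """Turn raw strip indices into indices usable against the vertex array.

    strip_indices: indices as stored in the file (they point at the table)
    vertex_table: per-strip-group table mapping to mesh-relative vertex ids
    mesh_vertex_offset: vertices contained in all prior meshes
    """
    return [vertex_table[i] + mesh_vertex_offset for i in strip_indices]
```

So with three meshes of 4, 3 and 5 vertices, `mesh_vertex_offsets([4, 3, 5])` gives `[0, 4, 7]`, and the indices for the second mesh all get 4 added after the table lookup.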
The data structure also feels a little heavy for its intended purpose. The body parts seem to have a lot to do with game logic but I'm not certain what the difference between models and meshes is. LOD is pretty clear, and strip groups are obviously intended to allow DirectX renderers to cut down on the number of locks that you would need to do by providing vertex/index buffer offsets and lengths. (GL renderers aren't so lucky, as vertex offsets require a re-binding of the shader attribs.) Strip groups are broken up based on the number of bones that they are associated with, which is smart and allows for easy GPU skinning.
Still, since I don't have any skinning implemented yet and was more concerned about performance of static props, on load I condense the entire format down into a series of meshes, one per material, and a set of triangle patches that make up that mesh. A triangle patch is nothing more than a start index and index count. If we were concerned about a model having more than 65536 verts a vertex offset would be required as well, but the Source Engine apparently limits that, so no worries here. When a model is a prop on the map, we actually do store the vertex offset, since we combine the vertices and indices for all of the props on the map into one big buffer, which allows us to bind it once for all props on the map.
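As a rough picture of the condensed structure, something like the following would do. All of the class and function names here are hypothetical, and the buffers are plain Python lists standing in for GPU buffers. Note that the indices stay model-relative; the stored vertex offset is what positions them within the shared vertex buffer.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TrianglePatch:
    start_index: int   # first index of the patch within the model's indices
    index_count: int   # number of indices to draw

@dataclass
class Mesh:
    material: str
    patches: List[TrianglePatch] = field(default_factory=list)

@dataclass
class PropGeometry:
    vertex_offset: int  # where this prop's vertices start in the shared buffer
    index_offset: int   # where this prop's indices start in the shared buffer
    meshes: List[Mesh]

def append_prop(all_vertices, all_indices, vertices, indices, meshes):
    """Add one prop model to the shared buffers and remember its offsets."""
    prop = PropGeometry(len(all_vertices), len(all_indices), meshes)
    all_vertices.extend(vertices)
    all_indices.extend(indices)  # left model-relative on purpose
    return prop

def draw_range(prop, patch):
    """(first index, index count) of a patch within the shared index buffer."""
    return (prop.index_offset + patch.start_index, patch.index_count)
```

Keeping the indices model-relative means each model can still use 16-bit indices even when the combined vertex buffer grows well past 65536 entries.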
I mentioned as part of my Quake 3 tech talk a while back that I wasn't doing any geometry culling. That was perhaps applicable at the time, but certainly is not a principle that applies to this format! You simply cannot brute-force render your way through the entire map! That being said, I still did keep it as simple as possible. Each leaf in the BSP tree contains a list of props and brush geometry that is built during the map load. We also parse out the Potentially Visible Set (PVS) for each leaf, which is stored as a run-length encoded series of bitflags inside the BSP. There's one for every leaf, and the number of bits corresponds to the number of leaves. To see if a leaf of the map is potentially visible from another one, you simply check the bit that corresponds with the leaf's index. 1 means it can be seen, 0 means it can't.
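Here's a small sketch of the decode and the bit test. I'm assuming the Quake-family flavor of run-length encoding, where non-zero bytes are copied through as-is and a zero byte is followed by a count of how many zero bytes it stands for; `decompress_pvs` and `is_visible` are illustrative names, not anything from the SDK.

```python
def decompress_pvs(data, offset, num_leaves):
    """Expand one run-length encoded visibility row into a flat bit array."""
    num_bytes = (num_leaves + 7) // 8
    out = bytearray()
    pos = offset
    while len(out) < num_bytes:
        value = data[pos]
        pos += 1
        if value != 0:
            out.append(value)
        else:
            # A zero byte is followed by the length of the run of zero bytes.
            run_length = data[pos]
            pos += 1
            out.extend(bytes(run_length))
    return bytes(out[:num_bytes])

def is_visible(pvs, leaf_index):
    """Check the bit for `leaf_index`: 1 means potentially visible."""
    return bool(pvs[leaf_index >> 3] & (1 << (leaf_index & 7)))
```

Long runs of "can't see anything over there" are exactly what this encoding compresses well, which is why it works nicely for big maps where most leaves can't see each other.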
To help speed up the visibility flagging, I use a couple of tricks. First, I only bother doing the visibility checks when you change leaves as you move around the level. I also don't bother (yet) with any sort of frustum culling, which means that until you move from one leaf to another the visible set is exactly the same. With props I flag the entire prop as visible or not, but with brush surfaces I do the flagging by material. If a material is visible for any of the potentially visible leaves, we draw all geometry associated with that material, no matter where it is on the map. It's still not worthwhile to break it into multiple draw calls for the sake of saving some triangles. Additionally, I don't use a boolean "visible" flag on the geometry, but instead have a "visibleFrame". Each time I change leaves, I increment a "currentFrame" variable, and set all visible geometry's "visibleFrame" equal to it. With props I also have them apply the flag to their prop type. Then during the render loop, if an item's visibleFrame is less than the currentFrame, I don't render it. It's an old trick from Quake that saves you from manually clearing the vis flags on everything. Still a very valid technique!
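The frame-stamp trick is easier to see in code than in prose, so here's a minimal sketch. `Renderable` and `VisibilityTracker` are made-up names, and the list of visible items is assumed to have already been gathered from the potentially visible leaves.

```python
class Renderable:
    """Anything that can be flagged visible: a prop, a material, and so on."""

    def __init__(self, name):
        self.name = name
        self.visible_frame = -1  # never flagged yet

class VisibilityTracker:
    def __init__(self):
        self.current_frame = 0
        self.current_leaf = None

    def update(self, leaf_index, visible_items):
        """Re-flag visibility, but only when the viewer changes leaves."""
        if leaf_index == self.current_leaf:
            return False  # same leaf, so the visible set is unchanged
        self.current_leaf = leaf_index
        self.current_frame += 1
        for item in visible_items:
            item.visible_frame = self.current_frame
        return True

    def is_visible(self, item):
        # Anything stamped with an older frame is implicitly hidden,
        # so nothing ever has to be reset by hand.
        return item.visible_frame >= self.current_frame
```

The cost of a leaf change is proportional to the number of visible items only. Everything else goes stale automatically, because its stamp now trails the counter.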
The core render loop for the map ends up looking like this:
- Draw Skybox
- Bind Brush Geometry buffers
- Loop through Opaque Brush Materials
  - If the material is visible:
    - Bind material
    - Draw all geometry for it (one drawElements call)
- Bind Prop Geometry buffers
- Loop through all Prop Types with Opaque Materials
  - If prop type is visible:
    - Loop through materials
      - If material is opaque:
        - Bind Material
        - Loop through all instances of this prop type
          - If instance is visible:
            - Bind the appropriate light and model matrices
            - Draw all geometry for that material
- Bind Brush Geometry buffers
- Repeat Brush Geometry loop with transparent materials
- Bind Prop Geometry buffers
- Repeat Prop Geometry loop with transparent materials
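As a sanity check on the ordering, the loop above can be sketched as a function that just records the commands it would issue. The dictionary layout of `scene` is entirely hypothetical, and the strings stand in for real GL calls.

```python
def draw_brushes(commands, materials, current_frame, transparent):
    commands.append("bind brush buffers")
    for material in materials:
        if material["transparent"] != transparent:
            continue
        if material["visible_frame"] < current_frame:
            continue  # not flagged visible for the current leaf
        commands.append("bind material " + material["name"])
        commands.append("draw brush " + material["name"])  # one draw call

def draw_props(commands, prop_types, current_frame, transparent):
    commands.append("bind prop buffers")
    for prop_type in prop_types:
        if prop_type["visible_frame"] < current_frame:
            continue
        for material in prop_type["materials"]:
            if material["transparent"] != transparent:
                continue
            commands.append("bind material " + material["name"])
            for instance in prop_type["instances"]:
                if instance["visible_frame"] < current_frame:
                    continue
                commands.append("bind matrices " + instance["name"])
                commands.append(
                    "draw prop " + prop_type["name"] + "/" + material["name"])

def render_map(scene, current_frame):
    """Return the ordered list of commands for one frame."""
    commands = ["draw skybox"]
    for transparent in (False, True):  # opaque pass first, then transparent
        draw_brushes(commands, scene["brush_materials"], current_frame, transparent)
        draw_props(commands, scene["prop_types"], current_frame, transparent)
    return commands
```

Note how the material bind sits outside the instance loop: every instance of a prop type reuses the same material state, and only the matrices change between draws.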
Something that's worth pointing out about the way I'm rendering props: It's basically begging for instanced rendering! I'm not sure if the Source engine actually uses instancing for these meshes, but it feels like a very natural fit. I was told once that one of the big reasons that WebGL doesn't have instanced rendering is because it's hard to make use of it in real world scenarios, so very few games do. I'd like to present my counter example, and ask nicely for the people making decisions to reconsider!
Alright, so I think that's enough rambling for one post. I usually end up tweaking and amending these things after I've put them up, so if I've omitted anything it won't stay omitted for long. Also, there are typically some good questions that pop up in the comments, so be sure to browse through them!
Oh, and for anyone still wondering about what the last teaser post was, it was a top-down point-cloud of the Heavy: