Tuning the Geometry Pipeline | Core Techniques and Algorithms in Game Programming2003

The overall workload of the geometry stage is directly related to the amount of processed primitives (vertices, usually). It is also dependent on the amount of transforms to be applied, number of lights, and so on. Thus, being efficient in the way we handle these objects is the best way to ensure maximum performance.

We will begin by making sure the minimum number of transforms are applied to a given primitive. For example, if you are concatenating several transform matrices (such as a rotation and a translation) to the same object, make sure you compose them into one matrix, so in the end only one matrix is applied to vertices. Many modern APIs will perform this operation internally, but often we will have to combine hand-computed matrices ourselves. Remember that the following code duplicates performance because we have precomputed the resulting matrix manually:

 matrix m1,m2; for (I=0;I<1000;I++)    {    point transformedvertices[I]=m2*(m1*originalvertices[I]);    }

It needs to perform 2000 point-matrix multiplies: 1000 to compute m1*originalvertices[I] and 1000 to multiply that result by m2. Rearranging the code to:

 matrix m3=m2*m1; for (I=0;I<1000;I++) { point transformedvertices[I]=m3*originalvertices[I]; }

duplicates performance because we have precomputed the resulting matrix manually. This kind of philosophy can be applied whenever we have hierarchical transforms. A good example is a skeletal animation system, where we need to concatenate one limb's local transforms to those of the ascendants in the skeletal hierarchy.

It is also a good idea to use connected primitives whenever possible because they send the same data using less primitive calls than unconnected ones. The most popular type of such primitives is triangle strips, which avoid sending redundant vertices that add no information and take up bus bandwidth. An object stored using triangle strips can require about 60 70 percent of the primitives from the original, nonconnected object. Any model can easily be converted to triangle strips by using any of the stripping libraries available (many of them are free). NVIDIA has been offering the NVTriStrip library and header files to the developer community for some time now. Usage is very straightforward, and a number of options are supported. We can choose to merge all strips in a long strip using degenerate vertices, we can optimize for vertex caches, and so on. Other triangle stripping options exist as well. Libraries, such as Brad Grantham's ACTC, do a remarkable job of consolidating triangle meshes and building strips from them. Download the latest release from www.plunk.org/~grantham/public/actc/index.html.

Indexed primitives are also headed in the same direction. In any solid object, the same vertex will be sent many times through the geometry stage because it will belong to different, connected faces. Preindexing the mesh allows us to divide the object into geometry (that is, unique vertices that compose it) and topology (that is, face loops). Then, transforms can only be applied to the unique vertex list, which will be much shorter than the initial, nonindexed version. Remember that most APIs allow us to combine connected and indexed primitives for even better performance.

As far as lighting goes, many hardware platforms can only support a specific number of hardware light sources. Adding more lamps beyond that limit will make performance drop radically because any extra lamps will be computed on the software. Be aware of these limits and implement the required mechanisms so lamp number never exceeds them. For example, it doesn't makes sense to illuminate each vertex in the game level with all the possible lamps. Only a few of them will be within a reasonable distance to influence the resulting color. A lamp selection algorithm can then be implemented so closer lamps are used, and the rest, which offer little or no contribution, are discarded, thus making sure hardware lighting is always used.

Remember also that different kinds of light sources have different associated costs. Directional lights (which have no origin and only a direction vector, such as the sun) are usually the cheapest, followed by omni lights (lights located at a point in space, lighting equally in all directions). In addition, spotlights (cones of light with origin, direction, and aperture) are the most expensive to compute. Keep this in mind when designing your illumination model.