Displaying 3D Models


After we have defined a model of a 3D object of interest, we may want to display a view of it. The models are created in object space, but to display them in the 3D world, we need to convert them to world space coordinates. This requires three conversion steps beyond the actual creation of the model in object space.

  1. Convert to world space coordinates.

  2. Convert to view coordinates.

  3. Convert to screen coordinates.

Each of these conversions involves mathematical operations performed on the object's vertices.

The first step is accomplished by the process called transformation. Step 2 is what we call 3D rendering. Step 3 describes what is known as 2D rendering. First we will examine what the steps do for us, before getting into the gritty details.
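
To make the three steps concrete, here is a minimal Python sketch of the journey a single vertex takes. The 4x4 world, view, and projection matrices and the screen-mapping convention are assumptions for illustration; this is not engine code.

def mat_vec(m, v):
    # Multiply a 4x4 matrix (a list of four rows) by a 4-component vertex.
    return [sum(m[row][col] * v[col] for col in range(4)) for row in range(4)]

def object_to_screen(vertex, world, view, projection, screen_w, screen_h):
    # vertex is an object-space point written as (x, y, z, 1).
    v = mat_vec(world, vertex)         # Step 1: object space -> world space
    v = mat_vec(view, v)               # Step 2: world space -> view space
    v = mat_vec(projection, v)         # Step 3: view space -> screen space
    x, y, z, w = v
    x, y = x / w, y / w                            # perspective divide
    screen_x = (x * 0.5 + 0.5) * screen_w          # map -1..1 to pixel columns
    screen_y = (1.0 - (y * 0.5 + 0.5)) * screen_h  # map -1..1 to pixel rows
    return screen_x, screen_y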

Transformation

This first conversion, to world space coordinates, is necessary because we have to place our object somewhere! We call this conversion transformation. We will indicate where by applying transformations to the object: a scale operation (which controls the object's size), a rotation (which sets orientation), and a translation (which sets location).

World space transformations assume that the object starts with a transformation of (1.0,1.0,1.0) for scaling, (0,0,0) for rotation, and (0,0,0) for translation.

Every object in a 3D world can have its own 3D transformation values, often simply called transforms, that will be applied when the world is being prepared for rendering.
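
In code, an object's transforms are often stored as three triplets with exactly those starting values. A hypothetical container might look like this (the field names are illustrative, not Torque's):

from dataclasses import dataclass

@dataclass
class Transform:
    scale: tuple = (1.0, 1.0, 1.0)        # 1:1 size in all three dimensions
    rotation: tuple = (0.0, 0.0, 0.0)     # degrees around the X, Y, and Z axes
    translation: tuple = (0.0, 0.0, 0.0)  # position at the world origin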

Tip

Other terms used for these kinds of XYZ coordinates in world space are Cartesian coordinates, or rectangular coordinates.

Scaling

We scale objects based upon a triplet of scale factors where 1.0 indicates a scale of 1:1.

The scale operation is written in the same way as the XYZ coordinate triplets used elsewhere in the transformation, except that its values describe how the size of the object changes. Values greater than 1.0 indicate that the object will be made larger, and values less than 1.0 (but greater than 0) indicate that the object will shrink.

For example, 2.0 will double a given dimension, 0.5 will halve it, and a value of 1.0 means no change. Figure 3.13 shows a scale operation performed on a cube in object space. The original scale values are (1.0,1.0,1.0). After scaling, the cube is 1.6 times larger in all three dimensions, and the values are (1.6,1.6,1.6).

Figure 3.13: Scaling.
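
As arithmetic, scaling is just a component-wise multiplication of each vertex by the scale triplet; a quick sketch:

def scale_vertex(vertex, scale):
    # Multiply each coordinate by the matching scale factor.
    return (vertex[0] * scale[0], vertex[1] * scale[1], vertex[2] * scale[2])

# One corner of the cube from Figure 3.13, scaled by (1.6, 1.6, 1.6).
print(scale_vertex((1.0, 1.0, 1.0), (1.6, 1.6, 1.6)))   # -> (1.6, 1.6, 1.6)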

Rotation

The rotation is written in the same way that XYZ coordinates are used to denote the transformation, except that the rotation shows how much the object is rotated around each of its three axes. In this book, rotations will be specified using a triplet of degrees as the unit of measure. In other contexts, radians might be the unit of measure used. There are also other methods of representing rotations that are used in more complex situations, but this is the way we'll do it in this book. Figure 3.14 depicts a cube being rotated by 30 degrees around the Y-axis in its object space.

Figure 3.14: Rotation.

It is important to realize that the order of the rotations applied to the object matters a great deal. The convention we will use is the roll-pitch-yaw method, adopted from the aviation community. When we rotate the object, we roll it around its longitudinal (Z) axis. Then we pitch it around the lateral (X) axis. Finally, we yaw it around the vertical (Y) axis. Rotations on the object are applied in object space.

If we apply the rotation in a different order, we can end up with a very different orientation, despite having done the rotations using the same values.
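
Here is a small Python sketch of that roll-pitch-yaw order, rotating a vertex around Z, then X, then Y. This is illustrative code, not the engine's; degrees are converted to radians for the math library.

import math

def rot_z(v, deg):
    # Roll: rotate around the longitudinal (Z) axis.
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    x, y, z = v
    return (x * c - y * s, x * s + y * c, z)

def rot_x(v, deg):
    # Pitch: rotate around the lateral (X) axis.
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    x, y, z = v
    return (x, y * c - z * s, y * s + z * c)

def rot_y(v, deg):
    # Yaw: rotate around the vertical (Y) axis.
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    x, y, z = v
    return (x * c + z * s, y, -x * s + z * c)

def rotate(v, degrees):
    # Apply an (rx, ry, rz) triplet of degrees in roll-pitch-yaw order.
    rx, ry, rz = degrees
    return rot_y(rot_x(rot_z(v, rz), rx), ry)

# The cube corner (1, 1, 1) rotated 30 degrees around the Y axis, as in Figure 3.14.
print(rotate((1.0, 1.0, 1.0), (0.0, 30.0, 0.0)))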

Translation

Translation is the simplest of the transformations and the first that is applied to the object when transforming from object space to world space. Figure 3.15 shows a translation operation performed on an object. Note that the vertical axis is dark gray. As I said earlier, in this book, dark gray represents the Z-axis. Try to figure out what coordinate system we are using here. I'll tell you later in the chapter. To translate an object, we apply a vector to its position coordinates. Vectors can be specified in different ways, but the notation we will use is the same as the XYZ triplet, called a vector triplet. For Figure 3.15, the vector triplet is (3,9,7). This indicates that the object will be moved three units in the positive X direction, nine units in the positive Y direction, and seven units in the positive Z direction. Remember that this translation is applied in world space, so the X direction in this case would be eastward, and the Z direction would be down (toward the ground, so to speak). Neither the orientation nor the size of the object is changed.

Figure 3.15: Translation.
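
Translation really is as simple as adding the vector triplet to the object's position; a short sketch:

def translate(position, vector):
    # Add the vector triplet to the position, component by component.
    return (position[0] + vector[0],
            position[1] + vector[1],
            position[2] + vector[2])

# Applying the vector triplet (3, 9, 7) from Figure 3.15 to an object at the origin.
print(translate((0.0, 0.0, 0.0), (3.0, 9.0, 7.0)))   # -> (3.0, 9.0, 7.0)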

Full Transformation

So now we roll all the operations together. We want to orient the cube a certain way, with a certain size, at a certain location. The transformations applied are scale (s)=1.6,1.6,1.6, followed by rotation (r)=0,30,0, and then finally translation (t)=3,9,7. Figure 3.16 shows the process.

Figure 3.16: Fully transforming the cube.
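
Putting the three operations together for a single vertex looks like this in Python. To keep the sketch short, only the Y rotation from r = (0,30,0) is handled; this is illustrative code, not engine code.

import math

def transform_vertex(v, scale, y_degrees, translation):
    # Scale first...
    x, y, z = v[0] * scale[0], v[1] * scale[1], v[2] * scale[2]
    # ...then rotate around the Y axis...
    a = math.radians(y_degrees)
    c, s = math.cos(a), math.sin(a)
    x, z = x * c + z * s, -x * s + z * c
    # ...and finally translate.
    return (x + translation[0], y + translation[1], z + translation[2])

# One cube corner with s = (1.6, 1.6, 1.6), r = (0, 30, 0), t = (3, 9, 7), as in Figure 3.16.
print(transform_vertex((1.0, 1.0, 1.0), (1.6, 1.6, 1.6), 30.0, (3.0, 9.0, 7.0)))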

Note

The order that we use to apply the transformations is important. In the great majority of cases, the correct order is scaling, rotation, and then translation. The reason is that the operations are not interchangeable: applying them in a different order produces a different result.

You will recall that objects are created in object space, then moved into world space. The object's origin is placed at the world origin. When we rotate the object, we rotate it around the appropriate axes with the origin at (0,0,0), then translate it to its new position.

If you translate the object first and then rotate it (which is still going to take place around (0,0,0)), the object will end up in an entirely different position, as you can see in Figure 3.17.

Figure 3.17: Transformations applied in a different order.
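
You can see the difference with a couple of lines of Python: rotating a point 90 degrees around Y and then translating it lands somewhere quite different from translating first and rotating afterward (still around the origin).

import math

def rot_y(p, deg):
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return (p[0] * c + p[2] * s, p[1], -p[0] * s + p[2] * c)

def add(p, t):
    return (p[0] + t[0], p[1] + t[1], p[2] + t[2])

point, t = (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)
print(add(rot_y(point, 90.0), t))   # rotate, then translate: roughly (3, 0, -1)
print(rot_y(add(point, t), 90.0))   # translate, then rotate: roughly (0, 0, -4)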

Rendering

Rendering is the process of converting the 3D mathematical model of an object into an on-screen 2D image. When we render an object, our primary task is to calculate the appearance of the different faces of the object, convert those faces into a 2D form, and send the result to the video card, which will then take all the steps needed to display the object on your monitor.

We will take a look at several different techniques for rendering, the ones most often used in video game engines or by 3D video cards. There are other techniques, such as ray-casting, that aren't in wide use in computer games (with the odd exception, of course), so we won't be covering them here.

In the previous sections our simple cube model had colored faces. In case you haven't noticed (but I'm sure you did notice), we haven't covered the issue of the faces, except briefly in passing.

A face is essentially a set of one or more contiguous, coplanar triangles; that is, when taken as a whole, the triangles form a single flat surface. If you refer back to Figure 3.12, you will see that each face of the cube is made with two triangles. Of course, in that figure the faces are drawn as transparent in order to show the other parts of the cube.
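
In practice a face is usually stored as triangles that index into a shared vertex list; a hypothetical layout for one cube face might be:

# Four corners of one cube face, and the two coplanar triangles that cover it.
vertices = [
    (0.0, 0.0, 0.0),   # index 0
    (1.0, 0.0, 0.0),   # index 1
    (1.0, 1.0, 0.0),   # index 2
    (0.0, 1.0, 0.0),   # index 3
]
front_face = [(0, 1, 2), (0, 2, 3)]   # two triangles sharing an edge, one flat face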

Flat Shading

Figure 3.18 provides an example of various face configurations on an irregularly shaped object. Each face is presented with a different color (visible here as different shades). All triangles with the label A are part of the same face; the same applies to the D triangles. The triangles labeled B and C are each single-triangle faces.

Figure 3.18: Faces on an irregularly shaped object.

When we want to display 3D objects, we usually use some technique to apply color to the faces. The simplest method is flat shading, as used in Figure 3.17. A color or shade is applied to a face, and a different color or shade is applied to adjacent faces so that the user can tell them apart. In this case, the shades were selected with the sole criterion being the need to distinguish one face from the other.

One particular variation of flat shading is called Z-flat shading. The basic idea is that the farther a face is from the viewer, the darker (or, depending on the effect you want, the lighter) the face is shaded.
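
A sketch of the Z-flat idea in Python; the linear falloff and the maximum distance are arbitrary choices for illustration.

def z_flat_shade(base_color, face_depth, max_depth):
    # The farther the face, the more its flat color is darkened.
    k = max(0.0, 1.0 - face_depth / max_depth)
    return tuple(channel * k for channel in base_color)

print(z_flat_shade((1.0, 0.0, 0.0), face_depth=25.0, max_depth=100.0))  # nearby face: brighter
print(z_flat_shade((1.0, 0.0, 0.0), face_depth=90.0, max_depth=100.0))  # distant face: darker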

Lambert Shading

Usually, color and shading are applied in a manner that implies some sense of depth and lighted space. One face or collection of faces will be lighter in shade, implying that a light source lies in the direction those faces point. On the opposite side of the object, faces are shaded to imply that no light, or at least less light, reaches them. In between the light and dark faces, the shading takes intermediate values. The result is a shaded object whose face shading imparts a sense of the object sitting in a 3D world, enhancing the illusion. This is a form of flat shading known as Lambert shading (see Figure 3.19).

Figure 3.19: Lambert-shaded object.
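
The underlying calculation is simple: a face's brightness is the cosine of the angle between its normal and the direction toward the light, clamped at zero. A sketch:

import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambert_intensity(face_normal, light_direction):
    # Brightness falls off as the face turns away from the light.
    n = normalize(face_normal)
    l = normalize(light_direction)
    return max(0.0, n[0] * l[0] + n[1] * l[1] + n[2] * l[2])

print(lambert_intensity((0.0, 1.0, 0.0), (0.0, 1.0, 0.0)))   # facing the light: 1.0
print(lambert_intensity((0.0, -1.0, 0.0), (0.0, 1.0, 0.0)))  # facing away: 0.0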

Gouraud Shading

A more useful way to color or shade an object is called Gouraud shading. Take a look at Figure 3.20. The sphere on the left (A) is flat shaded, while the sphere on the right (B) is Gouraud shaded. Gouraud shading smooths the colors by averaging the normals (the vectors that indicate which way surfaces are facing) at the vertices of a surface. Those vertex normals are used to calculate a color value at each vertex, and each pixel's color value is then interpolated from the vertex colors according to the pixel's position within the face. Gouraud shading creates a much more natural appearance for the object, doesn't it? Gouraud shading is commonly used in both software and hardware rendering systems.

Figure 3.20: Flat-shaded (A) and Gouraud-shaded (B) spheres.
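
Once a color has been computed at each vertex (with a calculation like the Lambert one above), the pixel colors are just a weighted blend of those vertex colors; a sketch using barycentric weights:

def gouraud_pixel_color(vertex_colors, weights):
    # Blend the three vertex colors according to how close the pixel is to each vertex.
    return tuple(
        sum(vertex_colors[i][channel] * weights[i] for i in range(3))
        for channel in range(3)
    )

# Shades computed at a triangle's three corners, sampled at its center.
corners = [(1.0, 1.0, 1.0), (0.4, 0.4, 0.4), (0.1, 0.1, 0.1)]
print(gouraud_pixel_color(corners, (1/3, 1/3, 1/3)))   # the smoothly averaged shade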

Phong Shading

Phong shading is a much more sophisticated (and computation-intensive) technique for rendering a 3D object. Like Gouraud shading, it calculates color or shade values for each pixel. Unlike Gouraud shading (which uses only the vertex normals to calculate averaged pixel values), Phong shading computes an interpolated normal for every pixel between the vertices and then calculates that pixel's color value from it. Phong shading does a remarkably better job (see Figure 3.21), but at a substantial cost.

Figure 3.21: Phong-shaded sphere.

Phong shading requires a great deal of processing for even a simple scene, which is why you don't see Phong shading used much in real-time 3D games where frame rate performance is important. However, in games where frame rate is not as big an issue, you will often find Phong shading used.
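
The per-pixel difference from Gouraud shading is easy to see in a sketch: instead of blending colors computed at the vertices, blend the vertex normals for each pixel and run the lighting calculation with the blended normal. The weights and vectors below are illustrative.

import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def phong_pixel_intensity(vertex_normals, weights, light_direction):
    # Interpolate a normal for this pixel, then light the pixel with it.
    blended = tuple(
        sum(vertex_normals[i][axis] * weights[i] for i in range(3))
        for axis in range(3)
    )
    n = normalize(blended)
    l = normalize(light_direction)
    return max(0.0, sum(n[i] * l[i] for i in range(3)))

corner_normals = [(0.0, 1.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
print(phong_pixel_intensity(corner_normals, (1/3, 1/3, 1/3), (0.0, 1.0, 0.0)))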

Fake Phong Shading

There is a rendering technique that looks almost as good as Phong shading but still allows fast frame rates. It's called fake Phong shading, or sometimes fast Phong shading, or sometimes even Phong approximation rendering. Whatever name it goes by, it is not Phong rendering. It is useful, however, and does indeed give good performance.

Fake Phong shading basically employs a bitmap, which is variously known as a Phong map, a highlight map, a shade map, or a light map. I'm sure there are other names for it as well. In any event, the bitmap is nothing more than a generic template of how the faces should be illuminated (as shown in Figure 3.22).

Figure 3.22: Example of a fake Phong highlight map.

As you can tell by the nomenclature, there is no real consensus about fake Phong shading. There are also several different algorithms used by different people. This diversity is no doubt the result of several people independently arriving at the same general concept at roughly the same time, all in search of better performance with high-quality shading.
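
One common fake Phong approach (an assumption here, since no single algorithm is pinned down) is to use the X and Y of each pixel's unit normal to pick a texel out of the precomputed highlight map:

def highlight_lookup(highlight_map, normal):
    # Map the normal's X and Y from the -1..1 range onto the bitmap's rows and columns.
    height = len(highlight_map)
    width = len(highlight_map[0])
    u = int((normal[0] * 0.5 + 0.5) * (width - 1))
    v = int((normal[1] * 0.5 + 0.5) * (height - 1))
    return highlight_map[v][u]

# A tiny stand-in highlight map with a bright spot in the center.
tiny_map = [[0.1, 0.2, 0.1],
            [0.2, 1.0, 0.2],
            [0.1, 0.2, 0.1]]
print(highlight_lookup(tiny_map, (0.0, 0.0, 1.0)))   # normal facing the viewer: 1.0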

Texture Mapping

Texture mapping is covered in more detail in Chapters 8 and 9. For the sake of completeness, I'll just say here that texture mapping an object is something like wallpapering a room. A 2D bitmap is "draped" over the object, to impart detail and texture upon the object, as shown in Figure 3.23.

Figure 3.23: Texture-mapped and Gouraud-shaded cube.

Texture mapping is usually combined with one of the shading techniques covered in this chapter.
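
The "wallpapering" comes down to a lookup: each point on a face carries a pair of UV coordinates in the 0.0-1.0 range, and those coordinates select a texel from the bitmap. A bare-bones nearest-neighbor sketch:

def sample_texture(texture, u, v):
    # Convert the 0..1 UV coordinates into a row and column of the bitmap.
    height = len(texture)
    width = len(texture[0])
    column = min(width - 1, int(u * width))
    row = min(height - 1, int(v * height))
    return texture[row][column]

# A 2x2 checkerboard "wallpaper" draped over a face.
checker = [[(255, 255, 255), (0, 0, 0)],
           [(0, 0, 0), (255, 255, 255)]]
print(sample_texture(checker, 0.75, 0.25))   # upper-right quadrant: black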

Shaders

When the word is used alone, shaders refers to shader programs that are sent to the video hardware by the software graphics engine. These programs tell the video card, in precise procedural detail, how to manipulate vertices or pixels, depending on the kind of shader used.

Traditionally, programmers have had limited control over what happens to vertices and pixels in hardware, but the introduction of shaders allowed them to take complete control.

Vertex shaders, being easier to implement, were first out of the starting blocks. The shader program on the video card manipulates vertex data values on a 3D plane via mathematical operations on an object's vertices. The operations affect color, texture coordinates, elevation-based fog density, point size, and spatial orientation.

Pixel shaders are the conceptual siblings of vertex shaders, but they operate on each discrete viewable pixel. Pixel shaders are small programs that tell the video card how to manipulate pixel values. They rely on data from vertex shaders (either the engine-specific custom shader or the default video card shader function) to provide at least triangle, light, and view normals.

Shaders are used in addition to other rendering operations, such as texture mapping.
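
Real shaders are small programs compiled for the video card and written in a shading language, but the division of labor can be sketched on the CPU: one function runs once per vertex, another once per pixel. The data passed between the two stages here is illustrative.

def vertex_shader(position, world_view_proj):
    # Per-vertex stage: transform the (x, y, z, 1) vertex and hand data to the pixel stage.
    transformed = [
        sum(world_view_proj[row][col] * position[col] for col in range(4))
        for row in range(4)
    ]
    return {"position": transformed, "color": (1.0, 1.0, 1.0)}

def pixel_shader(interpolated):
    # Per-pixel stage: take the interpolated values and decide the final color.
    r, g, b = interpolated["color"]
    return (r * 0.5, g * 0.5, b * 0.5)   # for example, darken every pixel by half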

Bump Mapping

Bump mapping is similar to texture mapping. Where texture maps add detail to a shape, bump maps enhance the shape detail. Each pixel of the bump map contains information that describes aspects of the physical shape of the object at the corresponding point, and we use a more expansive word to describe this—the texel. The name texel derives from texture pixel.

Bump mapping gives the illusion of the presence of bumps, holes, carving, scales, and other small surface irregularities. If you think of a brick wall, a texture map will provide the shape, color, and approximate roughness of the bricks. The bump map will supply a detailed sense of the roughness of the brick, the mortar, and other details. Thus bump mapping enhances the close-in sense of the object, while texture mapping enhances the sense of the object from farther away.

Bump mapping is used in conjunction with most of the other rendering techniques.
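
A common way to build the effect (assumed here, not a specific engine's method) is to treat the bump map as a height field and tilt each point's normal by the local slope:

import math

def bumped_normal(height_map, x, y, strength=1.0):
    # Finite differences between neighboring texels give the slope in X and Y.
    dx = height_map[y][x + 1] - height_map[y][x - 1]
    dy = height_map[y + 1][x] - height_map[y - 1][x]
    n = (-dx * strength, -dy * strength, 1.0)
    length = math.sqrt(n[0] * n[0] + n[1] * n[1] + n[2] * n[2])
    return tuple(c / length for c in n)

# A small height map with a single raised texel in the middle.
bumps = [[0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 0.0]]
print(bumped_normal(bumps, 1, 1))   # the flat top of the bump: (0.0, 0.0, 1.0)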

Environment Mapping

Environment mapping is similar to texture mapping, except that it is used to represent effects where environmental features are reflected in the surfaces of an object. Things like chrome bumpers on cars, windows, and other shiny object surfaces are prime candidates for environment mapping.
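
The usual trick is to bounce the view direction off the surface and use the reflected direction to look up a texel in the environment map; the reflection itself is one line of vector math:

def reflection_vector(view_direction, normal):
    # Reflect the (unit) view direction around the (unit) surface normal.
    d = sum(view_direction[i] * normal[i] for i in range(3))
    return tuple(view_direction[i] - 2.0 * d * normal[i] for i in range(3))

# Looking straight down at an upward-facing chrome surface bounces the view back up.
print(reflection_vector((0.0, -1.0, 0.0), (0.0, 1.0, 0.0)))   # -> (0.0, 1.0, 0.0)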

Mipmapping

Mipmapping is a way of reducing the amount of computation needed to accurately texture-map an image onto a polygon. It's a rendering technique that tweaks the visual appearance of an object. It does this by using several different textures for the texture-mapping operations on an object. At least two, but usually four, textures of progressively lower resolution are assigned to any given surface, as shown in Figure 3.24. The video card or graphics engine extracts pixels from each texture, depending on the distance and orientation of the surface compared to the view screen.

Figure 3.24: Mipmap textures for a stone surface.

In the case of a flat surface that recedes away from the viewer into the distance, pixels from the high-resolution texture are used for the nearer parts of the surface (see Figure 3.25). For the middle distances, pixels from the medium-resolution textures are used. Finally, for the faraway parts of the surface, pixels from the low-resolution texture are used.

Figure 3.25: Receding mipmap textures on a stone surface.
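
Level selection can be sketched as a distance check against the chain of textures. The thresholds below are made-up numbers; real hardware bases the choice on the on-screen size of each texel.

def pick_mip_level(mip_chain, distance, nearest_range=10.0):
    # Step down to a lower-resolution texture each time the distance doubles.
    level = 0
    span = nearest_range
    while distance > span and level < len(mip_chain) - 1:
        level += 1
        span *= 2.0
    return mip_chain[level]

# Four textures of decreasing resolution, as in Figure 3.24 (sizes are placeholders).
mips = ["256x256", "128x128", "64x64", "32x32"]
print(pick_mip_level(mips, 5.0))     # close to the viewer: highest resolution
print(pick_mip_level(mips, 120.0))   # far away: lowest resolution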

Tip

Anti-aliasing is a software technique used in graphics display systems to make curved and diagonal lines appear to be continuous and smooth. On computer monitors the pixels themselves aren't curved, but collectively they combine to represent curves. Using pixels within polygon shapes to simulate curves causes the edges of objects to appear jagged. Anti-aliasing, the technique for smoothing out these jaggies, or aliasing, usually takes the form of inserting intermediate-colored pixels along the edges of the curve. The funny thing is, with textual displays this has the paradoxical effect of making text blurrier yet more readable. Go figure!
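
The intermediate colors come from mixing the shape's color with the background in proportion to how much of each edge pixel the shape covers; a sketch of that blend:

def blend(shape_color, background_color, coverage):
    # coverage is the fraction of this pixel actually covered by the shape (0.0 to 1.0).
    return tuple(
        s * coverage + b * (1.0 - coverage)
        for s, b in zip(shape_color, background_color)
    )

# A half-covered pixel on a black curve over a white background becomes mid-gray.
print(blend((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), coverage=0.5))   # -> (0.5, 0.5, 0.5)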

Scene Graphs

In addition to knowing how to construct and render 3D objects, 3D engines need to know how the objects are laid out in the virtual world and how to keep track of changes in status of the models, their orientation, and other dynamic information. This is done using a mechanism called a scene graph, a specialized form of a directed graph. The scene graph maintains information about all entities in the virtual world in structures called nodes. The 3D engine traverses this graph, examining each node one at a time to determine how to render each entity in the world. Figure 3.26 shows a simple seaside scene with its scene graph. The nodes marked by ovals are group nodes, which contain information about themselves and point to other nodes. The nodes that use rectangles are leaf nodes. These nodes contain only information about themselves.

Figure 3.26: Simple scene graph.

Note that in the seaside scene graph, the nodes do not all contain the same kinds of information about themselves.

Many of the entities in a scene don't even need to be rendered. In a scene graph, a node can be anything. The most common entity types are 3D shapes, sounds, lights (or lighting information), fog and other environmental effects, viewpoints, and event triggers.

When it comes time to render the scene, the Torque Engine will "walk" through the nodes in the tree of the scene graph, applying whatever functions are specified for each node. It then uses the node pointers to move on to the next node to be rendered.
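
A stripped-down node class and a depth-first walk show the idea. The seaside node names below are made up for illustration; they aren't taken from the figure.

class Node:
    # Group nodes carry children; leaf nodes carry only their own information.
    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def walk(self, depth=0):
        # Visit this node, then each child in turn, just as the engine walks the graph.
        print("  " * depth + self.name)
        for child in self.children:
            child.walk(depth + 1)

scene = Node("world", [
    Node("terrain"),
    Node("water"),
    Node("boat", [Node("hull"), Node("sail")]),
    Node("seagull sound"),
])
scene.walk()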

3D Audio

Audio and sound effects are used to heighten the sense of realism in a game. There are times when the illusion is greatly enhanced by using position information when generating the sound effects. A straightforward example would be the sound generated by a nearby gunshot. By calculating the amplitude (based on how far away the shot occurred) and the direction, the game software can present the sound to a computer's speakers in a way that gives the player a strong sense of where the shot occurred. This effect is even better if the player is wearing audio headphones. The player then has a good sense of the nature of any nearby threat and can deal with it accordingly, usually by massive application of return fire.

The source location of a game sound is tracked and managed in the same way as any other 3D entity via the scene graph.

Once the game engine has decided that the sound has been triggered, it then converts the location and distance information of the sound into a stereo "image" of the sound, with appropriate volumes and balance for either the right or left stereo channel. The methods used to perform these calculations are much the same as those used for 3D object rendering.

Audio has an additional set of complications—things like fade and drop-off or cutoff.
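
A rough sketch of that calculation: volume falls off with distance (with a hard cutoff), and the left/right balance comes from which side of the listener the sound sits on. The falloff curve and parameter names are assumptions for illustration.

import math

def stereo_mix(sound_pos, listener_pos, listener_right, max_distance=100.0):
    # Vector from the listener to the sound, and its length.
    dx = sound_pos[0] - listener_pos[0]
    dy = sound_pos[1] - listener_pos[1]
    dz = sound_pos[2] - listener_pos[2]
    distance = math.sqrt(dx * dx + dy * dy + dz * dz)
    if distance >= max_distance:
        return 0.0, 0.0                       # beyond the cutoff: silence
    volume = 1.0 - distance / max_distance    # simple linear drop-off
    # Project the direction onto the listener's right-hand axis to get the pan.
    side = dx * listener_right[0] + dy * listener_right[1] + dz * listener_right[2]
    pan = max(-1.0, min(1.0, side / max(distance, 1e-6)))
    left = volume * (1.0 - pan) * 0.5
    right = volume * (1.0 + pan) * 0.5
    return left, right

# A gunshot 20 units away, directly to the listener's right.
print(stereo_mix((20.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)))   # silent left, louder right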



