3D Graphics Accelerators

Since the late 1990s, 3D acceleration, once limited to exotic add-on cards designed for hardcore game players, has become commonplace in the PC world. Although mainstream business users are not likely to encounter 3D imaging until Windows Vista (previously code-named Longhorn) is released in 2006, full-motion 3D graphics are used in sports, first-person shooters, team combat, driving, and many other types of PC gaming. Because even low-cost integrated chipsets offer some 3D support and 3D video cards are now in their ninth generation of development, virtually any user of a recent-model computer has the ability to enjoy 3D lighting, perspective, texture, and shading effects in her favorite games. The latest 3D sports games provide lighting and camera angles so realistic that a casual observer could almost mistake the computer-generated game for an actual broadcast, and the latest 3D accelerator chips enable fast PCs to compete with high-performance dedicated game machines, such as Sony's PlayStation 2, Nintendo's GameCube, and Microsoft's Xbox 360, for the mind and wallet of the hardcore game player.

Note

At a minimum, Windows Vista requires graphics hardware that supports DirectX 7 3D graphics; however, for maximum functionality of its 3D AeroGlass GUI, graphics hardware that supports DirectX 9 or greater is required.

How 3D Accelerators Work

To construct an animated 3D sequence, a computer can mathematically animate the sequences between keyframes. A keyframe identifies specific points. A bouncing ball, for example, can have three keyframes: up, down, and up. Using these frames as a reference point, the computer can create all the interim images between the top and bottom. This creates the effect of a smoothly bouncing ball.
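As a rough illustration of how interim images can be computed from keyframes, the following Python sketch linearly interpolates the height of a bouncing ball between three keyframes (up, down, up). The function name, the keyframe times, and the heights are hypothetical values chosen for the example, and straight-line interpolation is only the simplest possible choice; real animation software typically uses curves that ease in and out.

```python
def interpolate_keyframes(keyframes, t):
    """Linearly interpolate a value at time t from (time, value) keyframes.

    The keyframes must be sorted by time. Times outside the keyframe range
    return the first or last keyframe value.
    """
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    if t >= keyframes[-1][0]:
        return keyframes[-1][1]
    # Find the pair of keyframes that brackets t and blend between them.
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            alpha = (t - t0) / (t1 - t0)
            return v0 + alpha * (v1 - v0)


# Hypothetical bouncing ball: height 10 at t=0 (up), 0 at t=1 (down), 10 at t=2 (up).
ball_keyframes = [(0.0, 10.0), (1.0, 0.0), (2.0, 10.0)]

# Generate nine frames, four per time unit, from the three keyframes.
frames = [interpolate_keyframes(ball_keyframes, i / 4) for i in range(9)]
```

Only the three keyframes are stored; the other six heights in `frames` are computed.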

After it has created the basic sequence, the system can then refine the appearance of the images by filling them in with color. The most primitive and least effective fill method is called flat shading, in which a shape is simply filled with a solid color. Gouraud shading, a slightly more effective technique, involves the assignment of colors to specific points on a shape. The points are then joined using a smooth gradient between the colors.
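The difference between the two fill methods can be sketched in a few lines of Python. This is a simplified, one-dimensional illustration (a single row of pixels between two points), not how any particular chipset implements shading; the function names and color values are made up for the example.

```python
def flat_fill(color, width):
    """Flat shading: every pixel in the row gets the same solid color."""
    return [color] * width


def gouraud_span(color_a, color_b, width):
    """Gouraud-style fill: blend smoothly from color_a at one end of the
    row to color_b at the other end."""
    if width == 1:
        return [color_a]
    span = []
    for x in range(width):
        t = x / (width - 1)  # 0.0 at the first pixel, 1.0 at the last
        span.append(tuple(round(a + t * (b - a)) for a, b in zip(color_a, color_b)))
    return span


# Hypothetical RGB colors assigned to the two end points of a 5-pixel row.
flat_row = flat_fill((200, 100, 0), 5)
smooth_row = gouraud_span((0, 0, 0), (200, 100, 0), 5)
```

In a full renderer the same blend runs in two dimensions across each triangle, using the colors assigned to its three corners.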

A more processor-intensive, and much more effective, type of fill is called texture mapping. The 3D application includes patterns, or textures, in the form of small bitmaps that it tiles onto the shapes in the image, just as you can tile a small bitmap to form the wallpaper for your Windows desktop. The primary difference is that the 3D application can modify the appearance of each tile by applying perspective and shading to achieve 3D effects. When lighting effects that simulate fog, glare, directional shadows, and others are added, the 3D animation comes very close indeed to matching reality.

Until the late 1990s, 3D applications had to rely on support from software routines to convert these abstractions into live images. This placed a heavy burden on the system processor in the PC, which had a significant impact on the performance not only of the visual display, but also of any other applications the computer might be running. Starting in the period from 1996 to 1997, chipsets on most video adapters began to take on many of the tasks involved in rendering 3D images, greatly lessening the load on the system processor and boosting overall system performance.

There have been roughly nine generations of 3D graphics hardware on PCs, as detailed in Table 13.21.

Table 13.21. Brief 3D Acceleration History
Generation	Dates	Technologies	Example Product/Chipset
1st	1996-1997	3D PCI card with passthrough to 2D graphics card; OpenGL and Glide APIs	3dfx Voodoo
2nd	1997-1998	2D/3D PCI card	ATI Rage, NVIDIA RIVA 128
3rd	1999	2D/3D AGP 1x/2x card	3dfx Voodoo 3, ATI Rage Pro, NVIDIA TNT2
4th	1999-2000	DirectX 7 API, AGP 4x	NVIDIA GeForce 256, ATI Radeon
5th	2001	DirectX 8 API, programmable vertex and pixel shaders	NVIDIA GeForce 3, NVIDIA GeForce 4 Ti
6th	2001-2002	DirectX 8.1 API	ATI Radeon 8500, ATI Radeon 9000
7th	2002-2003	DirectX 9 API, AGP 8x	ATI Radeon 9700, NVIDIA GeForce FX 5900
8th	2004-2005	PCI Express, DirectX 9.0c	ATI X800, NVIDIA GeForce 6800
9th	2004-present	Dual-GPU rendering with PCI Express x8, x16	ATI X1K, NVIDIA GeForce 7800; ATI CrossFire, NVIDIA nForce SLI motherboard chipsets and compatible cards

With virtually every recent graphics card on the market featuring DirectX 8.x or greater capabilities, you don't need to spend a fortune to achieve a reasonable level of 3D graphics. Many cards in the $75-$200 range use lower-performance variants of current high-end GPUs, or they might use the previous year's leading GPU. These cards typically provide more-than-adequate performance for 2D business applications. Most current 3D accelerators also support dual-display and TV-out capabilities, enabling you to work and play at the same time.

However, keep in mind that the more you spend on a 3D accelerator card, the greater the onboard memory and faster the accelerator chip you can enjoy. Current high-end video cards featuring NVIDIA or ATI's top graphics chips, 256MB-512MB of video memory, and PCI Express x16 interfaces sell for $300-$500 each. These cards are aimed squarely at hardcore gamers for whom money is no object, and some support dual-card technologies such as NVIDIA's SLI or ATI's CrossFire, which split rendering chores across the GPUs in both video cards for faster game display than with a single card.

Mid-range cards costing $200-$300 are often based on GPUs that use designs similar to the high-end products but might have slower memory and core clock speeds or a smaller number of rendering pipelines. These cards provide a good middle ground for users who play games fairly often but can't cost-justify high-end cards. Most of these cards are available in either PCI Express x16 or the older AGP 8x form factor.

Before purchasing a 3D accelerator adapter, you should familiarize yourself with some of the terms and concepts involved in the 3D image generation process.

The basic function of 3D software is to convert image abstractions into the fully realized images that are then displayed on the monitor. The image abstractions typically consist of the following elements:

Vertices. Locations of objects in three-dimensional space, described in terms of their x, y, and z coordinates on three axes representing height, width, and depth.
Primitives. The simple geometric objects the application uses to create more complex constructions, described in terms of the relative locations of their vertices. This serves not only to specify the location of the object in the 2D image, but also to provide perspective because the three axes can define any location in three-dimensional space.
Textures. Two-dimensional bitmap images or surfaces designed to be mapped onto primitives. The software enhances the 3D effect by modifying the appearance of the textures, depending on the location and attitude of the primitive. This process is called perspective correction. Some applications use another process, called MIP mapping, which uses different versions of the same texture that contain varying amounts of detail, depending on how close the object is to the viewer in the three-dimensional space. Another technique, called depth cueing, reduces the color and intensity of an object's fill as the object moves farther away from the viewer.
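To make MIP mapping and depth cueing more concrete, here is a minimal Python sketch. It is built on simplifying assumptions that are mine, not the chipset makers': the MIP level is chosen purely from the object's distance (one level coarser each time the distance doubles), and depth cueing fades color linearly to black at a maximum distance. Actual hardware typically chooses a MIP level from how many texels cover each screen pixel. All names and default values here are hypothetical.

```python
import math


def mip_level(distance, base_distance=1.0, num_levels=4):
    """Pick which version of a texture to use.

    Level 0 is the most detailed version. Each doubling of distance beyond
    base_distance selects the next, less detailed, version.
    """
    if distance <= base_distance:
        return 0
    level = int(math.log2(distance / base_distance))
    return min(level, num_levels - 1)  # never go past the coarsest version


def depth_cue(color, distance, max_distance):
    """Reduce the intensity of an RGB color as the object moves away.

    The color is unchanged at distance 0 and fully black at max_distance
    or beyond.
    """
    factor = max(0.0, 1.0 - distance / max_distance)
    return tuple(round(c * factor) for c in color)
```

For example, `mip_level(5.0)` selects level 2 under these assumptions, and `depth_cue((200, 100, 50), 50, 100)` halves the brightness of the fill.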

Using these elements, the abstract image descriptions must then be rendered, meaning they are converted to visible form. Rendering depends on two standardized functions that convert the abstractions into the completed image that is displayed onscreen. The standard functions performed in rendering are

Geometry. The sizing, orienting, and moving of primitives in space and the calculation of the effects produced by the virtual light sources that illuminate the image
Rasterization. The converting of primitives into pixels on the video display by filling the shapes with properly illuminated shading, textures, or a combination of the two
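The geometry step can be illustrated with a small Python sketch that moves a primitive in space and then projects its vertices onto a 2D image plane. This is a bare-bones example under my own assumptions (a viewer at the origin looking along the z-axis and a hypothetical `focal_length` parameter); it leaves out orientation and the lighting calculations entirely.

```python
def translate(vertex, offset):
    """Move a vertex (x, y, z) by an (dx, dy, dz) offset."""
    return tuple(v + o for v, o in zip(vertex, offset))


def project(vertex, focal_length=1.0):
    """Perspective-project a 3D vertex onto a 2D image plane.

    Dividing by z makes distant objects appear smaller.
    """
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex must be in front of the viewer (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# A hypothetical triangle primitive, described by its three vertices.
triangle = [(0.0, 0.0, 2.0), (2.0, 0.0, 2.0), (0.0, 2.0, 2.0)]

# Move the triangle two units farther from the viewer.
moved = [translate(v, (0.0, 0.0, 2.0)) for v in triangle]

near = [project(v) for v in triangle]
far = [project(v) for v in moved]
```

The projected triangle in `far` is half the size of the one in `near`, which is the perspective effect the z coordinate provides.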

A modern video adapter that includes a chipset capable of 3D video acceleration has special built-in hardware that can perform the rasterization process much more quickly than if it were done by software (using the system processor) alone. Most chipsets with 3D acceleration perform the following rasterization functions right on the adapter:

Scan conversion. The determination of which onscreen pixels fall into the space delineated by each primitive
Shading. The process of filling pixels with smoothly flowing color using the flat or Gouraud shading technique
Texture mapping. The process of filling pixels with images derived from a 2D sample picture or surface image
Visible surface determination. The identification of which pixels in a scene are obscured by other objects closer to the viewer in three-dimensional space
Animation. The process of switching rapidly and cleanly to successive frames of motion sequences
Antialiasing. The process of adjusting color boundaries to smooth edges on rendered objects
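Scan conversion, the first function in the preceding list, can be sketched in software as follows. This Python example tests the center of every pixel against the three edges of a triangle, which is one common way to describe the operation; it is an illustration only, and the function names are hypothetical. Real hardware also applies fill rules so that a pixel on an edge shared by two triangles is drawn only once, which this sketch does not attempt.

```python
def edge(a, b, p):
    """Signed area test: the sign tells which side of the line a->b point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def scan_convert(v0, v1, v2, width, height):
    """Return the (x, y) pixels whose centers fall inside the triangle v0-v1-v2.

    Pixels whose centers lie exactly on an edge are counted as inside.
    """
    covered = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            w0 = edge(v1, v2, p)
            w1 = edge(v2, v0, p)
            w2 = edge(v0, v1, p)
            # Inside if p is on the same side of all three edges
            # (either vertex winding order is accepted).
            all_non_negative = w0 >= 0 and w1 >= 0 and w2 >= 0
            all_non_positive = w0 <= 0 and w1 <= 0 and w2 <= 0
            if all_non_negative or all_non_positive:
                covered.append((x, y))
    return covered


# A right triangle covering the corner of a hypothetical 4x4-pixel screen.
pixels = scan_convert((0, 0), (4, 0), (0, 4), 4, 4)
```

Testing every pixel on the screen is wasteful; a practical rasterizer would restrict the loops to the triangle's bounding box.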

Typical 3D Techniques

Typical 3D techniques include

Fogging. Fogging simulates haze or fog in the background of a game screen and helps conceal the sudden appearance of newly rendered objects (buildings, enemies, and so on).
Gouraud shading. Interpolates colors to make circles and spheres look more rounded and smooth.
Alpha blending. One of the first 3D techniques, alpha blending creates translucent objects onscreen, making it a perfect choice for rendering explosions, smoke, water, and glass. Alpha blending also can be used to simulate textures, but it is less realistic than environment-based bump mapping.
Stencil buffering. Stencil buffering is a technique useful for games such as flight simulators in which a static graphic element (such as a cockpit windshield frame, which is known as a heads-up display [HUD] and used by real-life fighter pilots) is placed in front of dynamically changing graphics (such as scenery, other aircraft, sky detail, and so on). In this example, the area of the screen occupied by the cockpit windshield frame is not re-rendered. Only the area seen through the "glass" is re-rendered, saving time and improving frame rates for animation.
Z-buffering. The Z-buffer portion of video memory holds depth information about the pixels in a scene. As the scene is rendered, the Z-values (depth information) for new pixels are compared to the values stored in the Z-buffer to determine which pixels are in "front" of others and should be rendered. Pixels that are "behind" other pixels are not rendered. This method increases speed and can be used along with stencil buffering to create volumetric shadows and other complex 3D objects. Z-buffering was originally developed for computer aided drafting (CAD) applications.
Environment-based bump mapping. Environment-based bump mapping (standard starting in DirectX 6) introduces special lighting and texturing effects to simulate the rough texture of rippling water, bricks, and other complex surfaces. It combines three separate texture maps (for colors; for height and depth; and for environment, including lighting, fog, and cloud effects). This creates enhanced realism for scenery in games and can also be used to enhance terrain and planetary mapping, architecture, and landscape-design applications. This represents a significant step beyond alpha blending.
Displacement mapping. Special grayscale maps called displacement maps have long been used for producing accurate maps of the globe. Microsoft DirectX 9 supports the use of grayscale hardware displacement maps as a source for accurate 3D rendering. GPUs that fully support DirectX 9 in hardware support displacement mapping.
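Two of the techniques in the preceding list, Z-buffering and alpha blending, are simple enough to sketch in a few lines of Python. This is a software illustration under assumptions of my own (smaller Z-values mean closer to the viewer, and buffers are plain lists of rows); the function names are hypothetical and do not correspond to any DirectX or OpenGL call.

```python
def alpha_blend(src, dst, alpha):
    """Blend a translucent source color over a destination color.

    alpha = 1.0 means the source is fully opaque; 0.0 means fully transparent.
    """
    return tuple(round(alpha * s + (1 - alpha) * d) for s, d in zip(src, dst))


def draw_fragment(color_buffer, z_buffer, x, y, z, color):
    """Write a pixel only if it is closer than what the Z-buffer already holds.

    Returns True if the pixel was drawn and False if it was hidden.
    """
    if z < z_buffer[y][x]:
        z_buffer[y][x] = z
        color_buffer[y][x] = color
        return True
    return False


# A hypothetical 2x2 screen: black pixels, with every depth set to "infinitely far."
color_buffer = [[(0, 0, 0)] * 2 for _ in range(2)]
z_buffer = [[float("inf")] * 2 for _ in range(2)]

drew_red = draw_fragment(color_buffer, z_buffer, 0, 0, 5.0, (255, 0, 0))
drew_blue = draw_fragment(color_buffer, z_buffer, 0, 0, 9.0, (0, 0, 255))  # behind red
glass = alpha_blend((200, 100, 0), (0, 0, 100), 0.5)
```

The blue pixel is rejected because the red pixel already stored at that location is closer to the viewer.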

Advanced 3D Filtering and Rendering

To improve the quality of texture maps, several filtering techniques have been developed, including MIP mapping, bilinear filtering, trilinear filtering, and anisotropic filtering. These techniques and several other advanced techniques found in recent 3D GPUs are explained here:

Bilinear filtering. Improves the image quality of small textures placed on large polygons. The stretching of the texture that takes place can create blockiness, but bilinear filtering applies a blur to conceal this visual defect.
MIP mapping. Improves the image quality of polygons that appear to recede into the distance by mixing low-res and high-res versions of the same texture; a form of antialiasing.
Trilinear filtering. Combines bilinear filtering and MIP mapping, calculating the most realistic colors necessary for the pixels in each polygon by comparing the values in two MIP maps. This method is superior to either MIP mapping or bilinear filtering alone.

Note

Bilinear and trilinear filtering work well for surfaces viewed straight-on but might not work so well for oblique angles (such as a wall receding into the distance).
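The following Python sketch shows the arithmetic behind bilinear and trilinear filtering on a tiny grayscale texture. It assumes nonnegative texel coordinates that lie within the texture, and it assumes the second MIP map is exactly half the resolution of the first; both the function names and the sample textures are hypothetical.

```python
def bilinear_sample(texture, u, v):
    """Sample a grayscale texture (a list of rows) at fractional texel
    coordinates (u, v) by blending the four nearest texels."""
    height, width = len(texture), len(texture[0])
    x0, y0 = int(u), int(v)
    x1 = min(x0 + 1, width - 1)   # clamp at the right edge
    y1 = min(y0 + 1, height - 1)  # clamp at the bottom edge
    fx, fy = u - x0, v - y0
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def trilinear_sample(mip_a, mip_b, u, v, blend):
    """Blend bilinear samples taken from two adjacent MIP maps.

    (u, v) are in mip_a texel units; mip_b is assumed to be half the
    resolution of mip_a. blend = 0.0 uses only mip_a, 1.0 only mip_b.
    """
    a = bilinear_sample(mip_a, u, v)
    b = bilinear_sample(mip_b, u / 2, v / 2)
    return a * (1 - blend) + b * blend


detailed = [[0, 100], [100, 200]]  # 2x2 texture
coarse = [[60]]                    # hypothetical 1x1 version of the same texture

center = bilinear_sample(detailed, 0.5, 0.5)
mixed = trilinear_sample(detailed, coarse, 0.5, 0.5, 0.5)
```

Sampling halfway between the four texels returns their average, which is the blur that hides blockiness when a small texture is stretched.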

Anisotropic filtering. Some video card makers use another method, called anisotropic filtering, for more realistically rendering oblique-angle surfaces containing text. This technique is used when a texture is mapped to a surface that changes in two of three spatial domains, such as text found on a wall down a roadway (for example, advertising banners at a raceway). The extra calculations used take time, and for that reason, it can be disabled. To balance display quality and performance, you can also adjust the sampling size: Increase the sampling size to improve display quality, or reduce it to improve performance.
T-buffer. This technology eliminates aliasing (errors in onscreen images due to an undersampled original) in computer graphics, such as the "jaggies" seen in onscreen diagonal lines; motion stuttering; and inaccurate rendition of shadows, reflections, and object blur. The T-buffer replaces the normal frame buffer with a buffer that accumulates multiple renderings before displaying the image. Unlike some other 3D techniques, T-buffer technology doesn't require rewriting or optimization of 3D software to use this enhancement. The goal of T-buffer technology is to provide a movie-like realism to 3D-rendered animations. The downside of enabling antialiasing using a card with T-buffer support is that it can dramatically impact the performance of an application. This technique originally was developed by now-defunct 3dfx. However, this technology is incorporated into Microsoft DirectX 8.0 and above.
Integrated transform and lighting (T&L). The 3D display process includes transforming an object from one frame to the next and handling the lighting changes that result from those transformations. T&L is a standard feature of DirectX starting with version 7. The NVIDIA GeForce 256 and original ATI Radeon were the first GPUs to integrate the T&L engines into the accelerator chip, a now-standard feature.
Full-screen antialiasing. This technology reduces the jaggies visible at any resolution by adjusting color boundaries to provide gradual, rather than abrupt, color changes. Whereas early 3D products used antialiasing for certain objects only, recent accelerators from NVIDIA and ATI use various types of highly optimized FSAA methods that allow high visual quality at high frame rates.
Vertex skinning. Also referred to as vertex blending, this technique blends the connection between two angles, such as the joints in an animated character's arms or legs.
Keyframe interpolation. Also referred to as vertex morphing, this technique animates the transitions between two facial expressions, allowing realistic expressions when skeletal animation can't be used or isn't practical. See the ATI website for details.
Programmable vertex and pixel shading. Programmable vertex and pixel shading became a standard part of DirectX starting with version 8.0. However, NVIDIA introduced this technique with the GeForce3's nfiniteFX technology, enabling software developers to customize effects such as vertex morphing and pixel shading (an enhanced form of bump mapping for irregular surfaces that enables per-pixel lighting effects), rather than applying a narrow range of predefined effects. The NVIDIA GeForce4 Ti's nfiniteFXII pixel shader is DirectX 8 compatible and supports up to four textures, whereas its dual vertex shaders provide high-speed rendering up to 50% faster than the GeForce3. The ATI Radeon 8500 and 9000's version, SmartShader, is supported by DirectX 8.1. DirectX 8.1 supports more complex programs than nfiniteFX and provides comparable quality to nfiniteFXII. ATI 9700, 9800, and 9500 support DirectX 9's floating-point pixel shaders and more complex vertex shader. NVIDIA GeForce FX cards also support DirectX 9 pixel and vertex shaders, but they add more features. NVIDIA's 6xxx and 7xxx series and ATI's X1K (X1xxx) series support DirectX 9.0c Shader Model 3.0.
Floating-point calculations. Microsoft DirectX 9 supports floating-point data for more vivid and accurate color and polygon rendition. ATI uses standard DirectX 9 floating-point data; NVIDIA's GeForce FX uses additional-precision data; and NVIDIA's 6xxx series and 7xxx series further increase precision beyond standard DirectX 9 requirements.
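One simple way to picture full-screen antialiasing is supersampling: render the scene at twice the resolution in each direction and then average each 2x2 block down to one displayed pixel. The Python sketch below shows only that averaging step on a grayscale image. It is an illustration of the general idea and is not meant to describe the optimized FSAA methods NVIDIA and ATI actually use; the function name and sample image are hypothetical.

```python
def downsample_2x(image):
    """Average each 2x2 block of a grayscale image (a list of rows) into
    a single pixel, halving the resolution in each direction."""
    out = []
    for y in range(0, len(image) - 1, 2):
        row = []
        for x in range(0, len(image[0]) - 1, 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out


# A hypothetical 4x4 rendering of a hard diagonal edge (255 = white, 0 = black).
high_res = [
    [255, 255, 255, 0],
    [255, 255, 0, 0],
    [255, 0, 0, 0],
    [0, 0, 0, 0],
]

smoothed = downsample_2x(high_res)
```

The pixels along the diagonal come out as an intermediate gray rather than pure white or black, which is the gradual color change that hides the jaggies.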

Table 13.22 shows when various 3D rendering features were added to DirectX versions from 6.0 to 9.0c.

Table 13.22. 3D Rendering in DirectX 6.0-9.0
Feature	DirectX 6.0	DirectX 7.0	DirectX 8.x	DirectX 9.0
3D sky effects	No	Yes	Yes	Yes
Smoke and fog effects (volumetric effects)	No	Limited	Yes	Yes
Dynamic refraction	No	No	Limited	Yes
Transform and lighting methods	Fixed function in software	Fixed function in hardware	Vertex Shader 1.1; Pixel Shader 1.x	Vertex Shader 2.0, 3.0 (9.0c); Pixel Shader 2.0, 3.0 (9.0c)
Bump mapping	No	No	Yes	Yes
Texture resolutions	128x128,256x256	256x256	512x512	512x512
Displacement map resolutions	Low	Medium	Medium to high with bump mapping	High with bump mapping
Appearance of water	Poor	Fair	Good	Excellent

Single- Versus Multiple-Pass Rendering

Various video card makers handle application of these advanced rendering techniques differently. The current trend is toward applying the filters and basic rendering in a single pass rather than in multiple passes. Video cards with single-pass rendering and filtering typically provide higher frame-rate performance in 3D-animated applications and avoid the problems of visible artifacts caused by errors in multiple floating-point calculations during the rendering process.

Hardware Acceleration Versus Software Acceleration

Compared to software-only rendering, hardware-accelerated rendering provides faster animation. Although most software rendering would create more accurate and better-looking images, software rendering is too slow. Using special drivers, these 3D adapters can take over the intensive calculations needed to render a 3D image that software running on the system processor formerly performed. This is particularly useful if you are creating your own 3D images and animation, but it is also a great enhancement to the many modern games that rely extensively on 3D effects. Note that motherboard-integrated video solutions, such as those listed in Tables 13.9 and 13.10, typically have significantly lower 3D performance than even low-end GPUs because they use the CPU for more of the 3D rendering than 3D video adapter chipsets do.

To achieve greater performance, many of the latest 3D accelerators run their accelerator chips at very high speeds, and some even allow overclocking of the default RAMDAC frequencies. Just as CPUs at high speeds produce a lot of heat, so do high-speed video accelerators. Both the chipset and the memory are heat sources, so most mid-range and high-end 3D accelerator cards feature a fan to cool the chipset. Also, most current high-end 3D accelerators use finned passive heatsinks to cool the memory chips and make overclocking the video card easier (refer to Figure 13.14).

Software Optimization

It's important to realize that the presence of an advanced 3D-rendering feature on any given video card is meaningless unless game and application software designers optimize their software to take advantage of the feature. Although various 3D standards exist (OpenGL and DirectX), video card makers provide drivers that make their cards work with the leading standards. Because some cards do play better with certain games, you should read the reviews in publications such as Maximum PC to see how your favorite graphics card performs with them. Typically, it can take several months or longer after a new version of DirectX or OpenGL is introduced before 3D games take full advantage of the 3D rendering features provided by the new API.

Some video cards allow you to perform additional optimization by adjusting settings for OpenGL, Direct3D, RAMDAC, and bus clock speeds, as well as other options. Note that the bare-bones 3D graphics card drivers provided as part of Microsoft Windows usually don't provide these dialog boxes. Be sure to use the drivers provided with the graphics card or download updated versions from the graphics card vendor's website. Although you can sometimes use generic drivers provided by the GPU vendor, you should use drivers that have been specifically developed for your card to ensure that your card's particular features are fully supported.

Note

If you want to enjoy the features of your newest 3D card immediately, be sure to purchase the individual retail-packaged version of the card from a hardware vendor. These packages typically come with a sampling of games (full and demo versions) designed or compiled to take advantage of the card with which they're sold. The lower-cost OEM or "white box" versions of video cards are sold without bundled software, come only with driver software, and might differ in other ways from the retail advertised product. Some even use modified drivers, use slower memory or RAMDAC components, or lack special TV-out or other features. Some 3D card makers use different names for their OEM versions to minimize confusion, but others don't. Also, some card makers sell their cards in bulk packs, which are intended for upgrading a large organization with its own support staff. These cards might lack individual documentation or driver CDs and also might lack some of the advanced hardware features found on individual retail-pack video cards.

Application Programming Interfaces

Application programming interfaces (APIs) provide hardware and software vendors a means to create drivers and programs that can work quickly and reliably across a wide variety of platforms. When APIs exist, drivers can be written to interface with the API rather than directly with the operating system and its underlying hardware.

Currently, the leading game APIs include SGI's OpenGL and Microsoft's Direct3D (part of DirectX). OpenGL and Direct3D are available for virtually all leading graphics cards. At one time, a third popular game API was Glide, an enhanced version of OpenGL that is restricted to graphics cards that use 3dfx chipsets, which are no longer on the market.

OpenGL

The latest version of OpenGL is version 2.0, released on September 7, 2004. The OpenGL Shading Language, an optional feature of the previous version (OpenGL 1.5), is now a core feature of OpenGL 2.0. OpenGL 2.0 also supports programmable vertex and fragment shaders, multiple render targets, improved textures, and other enhancements.

Although OpenGL is a popular gaming API, it is also widely used in 3D rendering for specialized business applications, including mapping, life sciences, and other fields. To learn more about OpenGL, see the OpenGL website at www.opengl.org. OpenGL support is provided by the video card or chipset vendor through driver updates.

Microsoft DirectX

Direct3D is part of Microsoft's comprehensive multimedia API, DirectX. Although DirectX 8.0, 8.1, and 9.0 all provide support for higher-order surfaces (converting 3D surfaces into curves), vertex shaders, and pixel shaders, significant differences exist between DirectX 8.0/8.1 and 9.0 in how these operations are performed.

The difference between DirectX 8.0 (used by NVIDIA) and DirectX 8.1 (used by ATI and Matrox) involves the pixel shader portion of the 3D rendering engine. DirectX 8.1's pixel shader can handle more texture maps (6 versus 4) and more texture instructions (8 versus 4) than DirectX 8.0. DirectX 8.1 also handles integer data with 48-bit precision, versus DirectX 8.0's 32-bit precision. However, both pale in comparison to DirectX 9.0's pixel shader, which handles up to 16 texture maps, up to 32 texture instructions, and up to 64 color instructions and uses floating-point data at 128-bit precision.

To create higher-order surfaces, DirectX 9 supports continuous tessellation (the process of converting a surface into small triangles using floating-point math for greater precision) and displacement mapping in addition to the n-patches method used by DirectX 8.0 and 8.1.

DirectX 9.0's vertex shader is capable of handling many more complex commands than the DirectX 8.x shader: 1,024 instructions versus 128 and up to 256 constants versus 96. DirectX 9.0 also supports flow control.

The latest version of DirectX is version 9.0c. DirectX 9.0c supports improved pixel and vertex shading (65,535 instructions with 32-bit floating-point precision), dynamic branching, and vertex texture lookups through its support for Shader Model 3.0. (DirectX 9.0 supports only Shader Model 2.0.) Currently, NVIDIA 6xxx and 7xxx GPUs and ATI's X1K series support DirectX 9.0c (and thus Shader Model 3.0), whereas ATI's first-generation PCI Express cards (X850 through X300) use Shader Model 2.0.

Note

DirectX provides backward compatibility to support games and other programs written for earlier DirectX versions. Thus, installing the latest version of DirectX enables you to play games that require it while maintaining compatibility with older games.

For more information about DirectX or to download the latest version, see Microsoft's DirectX website at www.microsoft.com/windows/directx.

Note

DirectX 9.x is for Windows 98 and later versions (98SE, Me, 2000, and XP) only. However, Microsoft still provides DirectX 8.0a for Windows 95 users.

Dual-GPU Scene Rendering

In Table 13.21, I placed the development of dual PCI Express graphics card solutions as the ninth generation of 3D acceleration. The ability to connect two cards together to render a single display more quickly isn't exactly new: The long-defunct 3dfx Voodoo 2 offered an option called scan-line interleave (SLI), which paired two Voodoo 2 cards together on the PCI bus with each card writing half the screen in alternating lines. With 3dfx's version of SLI, card number one wrote the odd-numbered screen lines (one, three, five, and so on), while card number two wrote the even-numbered screen lines (two, four, six, and so on). While popular and effective, use of SLI with Voodoo 2 was an expensive proposition that only a handful of deep-pocketed gamers took advantage of.
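The division of labor in 3dfx's version of SLI is easy to express in code. This short Python sketch simply assigns screen lines to two cards as described above; the function names are hypothetical, and the sketch says nothing about how the Voodoo 2 hardware actually combined the output.

```python
def scan_line_owner(line_number):
    """Return which card (1 or 2) draws a screen line, counting lines from 1.

    Card 1 draws the odd-numbered lines; card 2 draws the even-numbered lines.
    """
    return 1 if line_number % 2 == 1 else 2


def split_lines(total_lines):
    """Return (lines drawn by card 1, lines drawn by card 2)."""
    card1 = [n for n in range(1, total_lines + 1) if scan_line_owner(n) == 1]
    card2 = [n for n in range(1, total_lines + 1) if scan_line_owner(n) == 2]
    return card1, card2


# A hypothetical six-line screen.
card1_lines, card2_lines = split_lines(6)
```

Each card ends up with half the lines regardless of what the scene contains, which is the fixed split that later load-balancing schemes improve upon.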

A few companies also experimented with using multiple GPUs on a single card to gain a similar performance advantage, but these cards never became popular. However, the idea of doubling graphics performance via multiple video cards has proven too good to abandon entirely, even after 3dfx went out of business.

NVIDIA SLI

When NVIDIA bought what was left of 3dfx, it inherited the SLI trademark and, in mid-2004, reintroduced the concept of using two cards to render a screen under the same acronym. However, NVIDIA's version of SLI has a different meaning and much more intelligence behind it.

NVIDIA uses the term SLI to refer to scalable link interface. The scaling refers to load-balancing, which adjusts how much of the work each card performs to render a particular scene, depending on how complex the scene is. To enable SLI, the following components are needed:

A PCI Express motherboard with an SLI-compatible chipset and two PCI Express video slots designed for SLI operation. Motherboards based on the NVIDIA nForce 4 SLI, SLI x16, and nForce Professional chipsets are the first to support SLI operation. Versions are available for Pentium 4, Pentium D, and Athlon 64 processors.
Two NVIDIA-based video cards in the GeForce 7800, 6800, or 6600 series with SLI support. A special bridge device known as a multipurpose I/O (MIO) is used to connect the cards to each other. The MIO is supplied with SLI-compatible motherboards.

Note

Originally, you needed to use two identical cards for NVIDIA SLI. With the introduction of NVIDIA ForceWare v81.85 or higher driver versions, this is no longer necessary. Just as with the ATI CrossFire multi-GPU solution, the cards need to be from the same GPU family (two 7800, two 6800, or two 6600 cards), but they don't need to be from the same manufacturer. You can obtain updated drivers from your video card maker or from the NVIDIA website (www.nvidia.com).

For best results, SLI should be used with games that have been optimized for SLI. More than 60 games feature in-box SLI support, and you can also download customized game profiles from NVIDIA's official SLI website: www.slizone.com. You can enable or disable SLI rendering and load balancing through the NVIDIA configuration software installed during the cards' installation process.

Figure 13.16 illustrates a typical SLI hardware configuration. Note the MIO device connecting the cards to each other.

Figure 13.16. How NVIDIA SLI looks in a typical installation.

ATI CrossFire

ATI's CrossFire multi-GPU technology uses three methods to speed up display performance: alternate frame rendering; supertiling, which divides the scene into alternating sections and uses each card to render parts of the scene; and load-balancing scissor operation (similar to SLI's load-balancing). The ATI Catalyst driver uses alternate frame rendering for best performance, but automatically switches to one of the other modes for games that don't work with alternate frame rendering.
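The three ways of dividing work between two GPUs can be compared with a short Python sketch. Everything specific here is an assumption made for illustration: the checkerboard layout and the 32-pixel tile size for supertiling, the horizontal split for the scissor mode, and all function names. None of it is taken from ATI's driver.

```python
def afr_gpu(frame_index):
    """Alternate frame rendering: GPU 0 draws even-numbered frames and
    GPU 1 draws odd-numbered frames."""
    return frame_index % 2


def supertile_gpu(x, y, tile_size=32):
    """Supertiling, sketched as a checkerboard: the GPU that draws pixel
    (x, y) alternates from one tile to the next."""
    return ((x // tile_size) + (y // tile_size)) % 2


def scissor_split(height, gpu0_share=0.5):
    """Scissor mode: split the screen at a horizontal line.

    Returns (rows drawn by GPU 0, rows drawn by GPU 1). Changing gpu0_share
    moves the split line, which is how the load could be rebalanced.
    """
    split = round(height * gpu0_share)
    return range(0, split), range(split, height)


# On a hypothetical 1,200-row screen, give GPU 0 the top 60% of the rows.
top_rows, bottom_rows = scissor_split(1200, 0.6)
```

Alternate frame rendering leaves each frame whole, whereas the other two modes have both cards contribute to every frame.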

To achieve better image quality than with a single card, CrossFire offers various SuperAA (antialiasing) modes, which blend the results of antialiasing by each card. CrossFire also improves anisotropic filtering by blending the filtering performed by each card.

To use CrossFire, you need the following components:

A PCI Express motherboard with a CrossFire-compatible chipset and two PCI Express video slots designed for CrossFire operation. Motherboards based on the ATI XPress 200 chipsets are the first to support CrossFire operation.
A supported combination of ATI CrossFire-supported cards.

For CrossFire support for the X800, X850, and X1800 series, ATI has developed special CrossFire Edition cards that contain the compositing engine (a Xilinx XC3S400 chip) and use a proprietary DMS-59 port for interconnecting the cards. One of these cards can be paired with a standard Radeon card from the same series. For example, you can pair an X800 CrossFire Edition with an existing X800XL. To connect these cards, a special external patch cable is used to connect the cards via the CrossFire Edition's DMS-59 port and the DVI port on the standard card. The slower X1300 and X1600 series no longer use a special CrossFire Edition card as a master. You can pair any two X1300 or any two X1600 cards together on a CrossFire motherboard, and the PCI Express bus is used to transfer information between the cards. The X1800 solution supports higher refresh rates at 1600x1200 than the X800 or X850.

CrossFire can be disabled to permit multimonitor operation. Figure 13.17 illustrates how the X800/X850/X1800 cards implement CrossFire.

Figure 13.17. How data flows in a typical ATI CrossFire implementation using X800, X850, or X1800 cards.

For more information about CrossFire, see the ATI website at www.ati.com.

3D Chipsets

Virtually every mainstream video adapter in current production features a 3D acceleration-compatible chipset. With several generations of 3D adapters on the market from the major vendors, keeping track of the latest products can be difficult. Table 13.23 lists the major 3D chipset manufacturers, the various chipsets they make, and the major features of each chipset.

Table 13.23. 3D Video Chipset Manufacturers and Products
Manufacturer: ATI
GPU (Codename)	DirectX Version	Hardware T&L	Rendering Pipelines	Programmable Vertex Shader Pipelines	Memory Bus	Mfr. Process	Interface	Notes
Radeon (R100/Rage 6C)	7	Yes	2	N/A	128-bit	.18 micron	AGP 4x, PCI
Radeon 7500, 7200 (RV200)	7	Yes	2	N/A	128-bit	.15 micron	AGP 4x, PCI	AIW (7500)
Radeon 8500 (R200)	8.1	Yes	4	2	128-bit	.15 micron	AGP 4x	AIW
Radeon 9700 PRO, 9700 (R300)	9	Yes	8	4	256-bit	.15 micron	AGP 8x	AIW
Radeon 9500 PRO	9	Yes	8	4	128-bit	.15 micron	AGP 8x	Based on 9700 Pro
Radeon 9500	9	Yes	4	4	128-bit	.15 micron	AGP 8x
Radeon 9000 PRO, 9000 (RV250)	8.1	Yes	4	2	128-bit	.15 micron	AGP 8x	Updated 8500 core; AIW
Radeon 9800 PRO (R350)	9	Yes	8	4	256-bit	.15 micron	AGP 8x	Update of 9700 Pro
Radeon 9600 PRO, XT (RV350)	9	Yes	4	4	128-bit	.13 micron	AGP 8x	AIW version has dual VGA
Radeon 9200, SE, Pro (RV280)	8.1	Yes	4	2	64-bit (SE); 128-bit	.15 micron	AGP 8x, PCI
Radeon 9250	8.1	Yes	4	2	128-bit	.15 micron	PCI
Radeon 9800XT (R360)	9	Yes	8	4	256-bit	.15 micron	AGP 8x
Radeon 9600XT (RV360)	9	Yes	4	4	128-bit	.13 micron	AGP 8x
Radeon X800 PRO (R420)	9	Yes	12	6	256-bit	.13 micron	AGP 8x	GTO uses GDDR1 or GDDR2; GTO2 uses GDDR3; supports CrossFire
X800GTO, GTO2 (R423, R480)	9	Yes	12	6	256-bit	.13 micron	PCI Express x16	GTO uses GDDR1 or GDDR2; GTO2 uses GDDR3; supports CrossFire
Radeon X800 XT, XT Platinum (R420)	9	Yes	16	6	256-bit	.13 micron	PCI Express x16; XT Platinum also available in AGP 8x	Platinum runs at faster core, memory speeds; supports CrossFire (PCIe only); AIW
Radeon X800 GT (R423, R480)	9	Yes	8	6	256-bit	.13 micron	PCI Express x16	GDDR1 or GDDR2 memory; supports CrossFire
Radeon X800; CrossFire Edition (R430)	9	Yes	16	6	256-bit	.11 micron	AGP 8x or PCI Express	Supports CrossFire (PCIe only); AIW
Radeon X800 CrossFire Edition (R430)	9	Yes	16	6	256-bit	.11 micron	PCI Express x16
Radeon X800 XL (R430)	9	Yes	12	6	256-bit	.11 micron	AGP 8x or PCI Express x16	AIW
Radeon X300SE (RV370)	9	Yes	4	2	64-bit	.13 micron	PCI Express x16	Based on X300
X300 (RV370)	9	Yes	4	4	128-bit	.13 micron	PCI Express x16
X600, X600 PRO, XT (RV380)	9	Yes	4	2	128-bit	.13 micron	PCI Express x16	Core and memory clock speeds differ
X700, PRO, XT (RV410)	9	Yes	8	6	128-bit	.13 micron	PCI Express x16 or AGP 8x	Core and memory clock speeds differ
Radeon X850 PRO (R481)	9	Yes	12	6	256-bit	.11 micron	PCI Express x16 or AGP 8x
Radeon X850 XT, Platinum Edition XT AGP	9	Yes	16	6	256-bit	.11 micron	PCI Express x16 or AGP 8x
Radeon X850 CrossFire Edition	9	Yes	16	6	256-bit	.11 micron	PCI Express x16	Core and memory clock speeds differ; supports CrossFire
Radeon X1300 HyperMemory (RV515)	9.0c	Yes	4	2	32-bit	.09 micron	PCI Express x16
Radeon X1300, PRO (RV515)	9.0c	Yes	4	2	64-bit or 128-bit	.09 micron	PCI Express x16	PRO runs at faster core, memory clock speeds; 128-bit only; AIW
Radeon X1600 PRO; CrossFire Edition (RV530)	9.0c	Yes	12	5	128-bit	.09 micron	PCI Express x16	GDDR2 memory; supports CrossFire
Radeon X1600 XT (RV530)	9.0c	Yes	12	5	128-bit	.09 micron	PCI Express x16	GDDR3, faster core and clock than PRO; supports CrossFire
Radeon X1800 XL, XT, XT OC; CrossFire Edition (R520)	9.0c	Yes	16	8	256-bit	.09 micron	PCI Express x16	XT and XT OC have faster core, memory clock speeds; up to 512MB of GDDR3; AIW; supports CrossFire
Manufacturer: Matrox
GPU/Card (Codename)	DirectX Version	Hardware T&L	Rendering Pipelines	Programmable Vertex Shader Pipelines	Memory Bus	Mfr. Process	Interface	Notes
Millennium G450	6	No	2	N/A	64-bit	.18 micron	AGP 4x or PCI	Triple-head display option
Millennium G550	6	No	2	N/A	64-bit	.18 micron	AGP 4x or PCI	Triple-head display option
Millennium P650	8.1	Yes	2	2	128-bit	.15 micron	AGP 8x or PCI	Triple-head display
Millennium P750	8.1	Yes	2	2	128-bit	.15 micron	AGP 8x or PCI Express x16	Triple-head display
Parhelia AGP	8.1	Yes	4	4	256-bit	.15 micron	AGP 8x	Has some DirectX 9 features; triple-head display
Parhelia APVe	8.1	Yes	4	4	256-bit	.15 micron	AGP 8x	Has some DirectX 9 features; triple-head display
Manufacturer: NVIDIA
GPU/Card (Codename)	DirectX Version	Hardware T&L	Rendering Pipelines	Programmable Vertex Shader Pipelines	Memory Bus	Mfr. Process	Interface	Notes
GeForce2 GTS, Ultra	7	Yes	4	N/A	128-bit	.18 micron	AGP 4x
GeForce2 Ti	7	Yes	4	N/A	128-bit	.15 micron	AGP 4x
GeForce2 MX, MX400	7	Yes	2	N/A	64-bit, 128-bit	.18 micron	AGP 4x
GeForce3, GeForce3 Ti 200, 500 (NV20)	8	Yes	4	1	128-bit	.15 micron	AGP 4x	Various core/memory speeds
GeForce2 MX200	7	Yes	2	N/A	64-bit	.18 micron	AGP 4x
GeForce4 Ti 4600, 4400, 4200 (NV25)	8	Yes	4	2	128-bit	.15 micron	AGP 4x	Most 4200 cards use standard memory; others use faster BGA memory
GeForce4 MX 400, 420, 440, 460 (NV17)	7	Yes	2	N/A	128-bit	.15 micron	AGP 4x	Updated GeForce2 MX core; PCinema
GeForce4 MX 440-8x (NV18)	7	Yes	2	N/A	128-bit	.15 micron	AGP 8x	Based on GeForce 4 MX 440
GeForce4 Ti 4600-8x, 4200-8x (NV28)	8	Yes	4	2	128-bit	.15 micron	AGP 8x	Based on GeForce 4 Ti 4600 and 4200; dual-display
GeForce FX 5800 (NV30)	9	Yes	8	1	128-bit	.13 micron	AGP 8x
GeForce FX 5600 (NV31)	9	Yes	4	1	128-bit	.13 micron		PCinema
GeForce FX 5200 (NV34)	9	Yes	4	1	128-bit	.13 micron	AGP 8x	PCinema
GeForce FX 5900 (NV35)	9	Yes	8	1	256-bit	.13 micron	AGP 8x	Requires two slots for fan; PCinema
GeForce FX 5700 (NV36)	9	Yes	4	1	128-bit	.13 micron	AGP 8x	Based on FX 5900; PCinema
GeForce FX 5950 Ultra (NV38)	9	Yes	8	1	256-bit	.13 micron	AGP 8x	Faster version of FX 5900; requires two slots for fan
GeForce 6800 Ultra (NV40/NV45 with an integrated PCIe bridge)	9.0c	Yes	16	6	256-bit	.13 micron	AGP 8x or PCI Express x16	GDDR-3 memory; requires 480-watt power supply and two separate Molex power connectors
GeForce 6800 GT (NV40)	9.0c	Yes	16	6	256-bit	.13 micron	AGP 8x
GeForce 6800 (NV40)	9.0c	Yes	12	6	256-bit	.13 micron	AGP 8x or PCI Express x16	GDDR-3 memory
GeForce 6800 GS (NV42 in PCIe or NV40 in AGP)	9.0c	Yes	12	5	256-bit	.11 micron	PCI Express x16 or AGP
GeForce 6600, LE, GT (NV43-V), 6700 XL, 6610 XL	9.0c	Yes	8	4	128-bit	.11 micron	PCI Express x16	Different versions use DDR, GDDR1, GDDR2, or GDDR3 memory at different core and memory speeds
GeForce 6500 (NV44)	9.0c	Yes	4	4	64-bit	.11 micron	PCI Express x16	Faster version of 6200 series
GeForce 6200 (NV44)	9.0c	Yes	4	4	64-bit, 128-bit	.11 micron	PCI Express x16 or AGP 8x (64-bit only)	Early PCIe versions based on NV42 core
GeForce 6200 TurboCache (NV44)	9.0c	Yes	4	4	32-bit or 64-bit TurboCache	.11 micron	PCI Express x16
GeForce 7800 GTX (G70)	9.0c	Yes	24	8	256-bit	.11 micron	PCI Express x16
GeForce 7800 GT (G70)	9.0c	Yes	20	7	256-bit	.11 micron	PCI Express x16
Manufacturer: SiS
GPU/Card (Codename)	DirectX Version	Hardware T&L	Rendering Pipelines	Programmable Vertex Shader Pipelines	Memory Bus	Mfr. Process	Interface	Notes
SiS315	7	Yes	2	N/A	64-bit, 128-bit	.15 micron	AGP 4x
Xabre 80	8.1	Yes	4	2	64-bit, 128-bit	.15 micron	AGP 4x
Xabre 200, 400	8.1	Yes	4	2	128-bit	.15 micron	AGP 8x	Dual-display (400)
Xabre 600	8.1	Yes	4	2	128-bit	.13 micron	AGP 8x	Dual-display; faster version of Xabre 400
Manufacturer: XGI
GPU/Card (Codename)	DirectX Version	Hardware T&L	Rendering Pipelines	Programmable Vertex Shader Pipelines	Memory Bus	Mfr. Process	Interface	Notes
Volari V8, V8 Ultra (V8 Duo: 2-chip version)	9.0	Yes	8 (16)	2 (4)	128-bit (256-bit)	.13 micron	AGP 8x	Dual-display
Volari V5, V5 Ultra (V5 Duo: 2-chip version)	9.0	Yes	4 (8)	2 (4)	128-bit (256-bit)	.13 micron	AGP 8x	Dual-display
Volari V3	8.1	Yes	2	1	64-bit, 128-bit	.13 micron	AGP 8x	Up to four displays; based on Trident XP4
Volari V3XT	9	Yes	2	1	64-bit	.13 micron	AGP 8x or PCI
Volari 8300	9	Yes	4	4	64-bit	.13 micron	PCI Express x16	32MB onboard; uses up to 96MB of system RAM
AIW: Indicates ATI GPUs also used in the ATI All-in-Wonder series of TV-tuner/capture graphics cards.
CrossFire Edition: Card version that can be paired with a standard Radeon in the same series on a CrossFire-enabled motherboard.
PCIe: PCI Express x16.
PCinema: Indicates NVIDIA GPUs also used in Personal Cinema TV-tuner/capture graphics cards.
XGI: Spinoff of SiS's graphics chip division; later acquired Trident's graphics subsidiary.
SiS no longer manufactures 3D graphics chips; it spun off its multimedia division as XGI (see the next section for XGI products).

Professional graphics workstation cards and chipsets from vendors such as 3Dlabs (www.3dlabs.com), ATI, and NVIDIA are not listed in Table 13.23 because these cards are not found in standard desktop computers. See the vendors' websites for details about these products.

Note

See Chapter 15 in both Upgrading and Repairing PCs, 12th Edition and 13th Edition on this book's disc for more information about older chipsets from current manufacturers and for chipsets from manufacturers who no longer produce 3D graphics hardware.

Note

Table 13.23 is designed to be a reference to recent and current GPUs (and graphics cards) from current vendors. Only GPUs that meet fourth-generation (DirectX 7) or newer standards are included. Refer to Table 13.21 for the criteria I use to describe each 3D generation. Because most graphics card vendors now use the GPU name as part of the product name, Table 13.23 does not include product examples. See Upgrading and Repairing PCs, 15th Anniversary Edition, available in electronic form on the disc packaged with this book, for a table matching older graphics card GPUs and product names.

Be sure to use this information in conjunction with application-specific and game-specific tests to help you choose the best card/chipset solution for your needs. Consult the chipset vendors' websites for the latest information about third-party video card sources using a specific chipset.
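As a starting point before running application-specific tests, a table such as Table 13.23 can be filtered to a shortlist. The sketch below is only an illustration: it copies a handful of rows from the table, and the helper names `directx_level` and `shortlist` are hypothetical.

```python
import re

# (GPU, DirectX version, interface) for a few rows of Table 13.23.
ROWS = [
    ("Radeon 8500 (R200)", "8.1", "AGP 4x"),
    ("Radeon X1600 XT (RV530)", "9.0c", "PCI Express x16"),
    ("GeForce2 Ti", "7", "AGP 4x"),
    ("GeForce 7800 GT (G70)", "9.0c", "PCI Express x16"),
    ("Xabre 600", "8.1", "AGP 8x"),
]


def directx_level(text):
    """Convert a table entry such as '9.0c' or '7' to a number."""
    match = re.match(r"\d+(\.\d+)?", text)
    if match is None:
        raise ValueError(f"unrecognized DirectX version: {text!r}")
    return float(match.group(0))


def shortlist(rows, min_directx):
    """Return the GPU names whose DirectX level meets the minimum."""
    return [
        gpu for gpu, version, _interface in rows
        if directx_level(version) >= min_directx
    ]


dx9_gpus = shortlist(ROWS, 9.0)
```

With these sample rows, `dx9_gpus` holds the Radeon X1600 XT and the GeForce 7800 GT. A shortlist of this kind only narrows the field; benchmark results for the games and applications you actually use should drive the final choice.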

Most chipsets in Table 13.23 feature some level of dual-display support, typically offering VGA, DVI-I, and TV-out. See the details for a particular video card to determine which features it implements as well as its memory size and other particulars.
