Texture Compression | Inside Direct3D (Dv-Mps Inside)

To render the most realistic-looking scenes, it's best to use high-resolution textures with rich color depth. However, such textures can consume a lot of memory. For example, a 256 × 256 texture with 16 bits of color per pixel will use 128 KB of memory. Adding mipmaps to this texture costs an additional 43 KB of memory. A scene with 50 such textures will require more than 8 MB of memory. For added realism, you can use 512 × 512 textures with 32 bits of color per pixel, but that uses eight times as much memory!
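
The figures above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of DirectX: `LevelBytes` and `MipChainBytes` are hypothetical helper names, and the sketch assumes an uncompressed texture whose mipmap chain halves each dimension down to 1 × 1.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for one uncompressed level of a texture.
constexpr std::size_t LevelBytes(std::size_t width, std::size_t height,
                                 std::size_t bitsPerPixel)
{
    return width * height * bitsPerPixel / 8;
}

// Bytes needed for a texture plus its full mipmap chain (down to 1 x 1).
// Hypothetical helper for illustration; not a DirectX function.
inline std::size_t MipChainBytes(std::size_t width, std::size_t height,
                                 std::size_t bitsPerPixel)
{
    assert(width > 0 && height > 0);
    std::size_t total = 0;
    for (;;)
    {
        total += LevelBytes(width, height, bitsPerPixel);
        if (width == 1 && height == 1)
            break;
        if (width > 1)  width  /= 2;   // each level halves the width...
        if (height > 1) height /= 2;   // ...and the height
    }
    return total;
}
```

For a 256 × 256, 16-bit texture, `LevelBytes` gives 131,072 bytes (128 KB), and the mipmap chain adds roughly one third more, which is where the 43 KB figure comes from.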

Managing this much data can hurt your application's performance in several ways. First, the texture data must be loaded from disk into the computer. Second, the texture data needs to be transferred to memory that the video card can access (unless the renderer can use textures directly from system RAM or AGP memory). If the video card doesn't have enough memory to hold all the textures you're using, expensive transfers from system to video memory will constantly occur. Finally, the rendering hardware needs to access all those textures (often many times per texel) while rasterizing primitives. Texture compression is therefore an essential technique for using high-quality textures without overwhelming the video memory subsystem. Early 3D accelerator cards didn't use compressed textures, but now that DirectX promotes a standard for compressed textures, hardware support is much more prevalent.

DXT Formats

Microsoft introduced DXT-compressed texture surfaces with DirectX 6. DXT is a type of DirectDraw surface that stores its image data in compressed form. Several 3D accelerator cards can render textures directly from DXT surfaces, which affords tremendous memory and bandwidth savings. But even when working with a renderer that doesn't use DXT surfaces directly, you can save disk space by authoring textures in the DXT format.

Five varieties of the DXT format exist: DXT1, DXT2, DXT3, DXT4, and DXT5. The compression ratio for the DXT1 format is 4:1. (A 4 × 4 block of 16-bit RGB565 texels is compressed to 64 bits: two 16-bit RGB565 values and sixteen 2-bit indices.) This level of compression isn't spectacular (when compared with compression technologies such as JPEG), but it's enough to effectively quadruple a 3D accelerator card's capacity for storing textures.

In return for this modest compression ratio, the compressed texture format has the following advantages compared with more sophisticated compression algorithms such as JPEG:

Each 4 × 4 texel block can be compressed and decompressed independently of the other blocks. Techniques that provide higher compression ratios might also require decompressing larger portions of the texture image.

Each compressed block is always the same size (for example, in the DXT1 format it's 64 bits), which simplifies the problem of finding the block that contains a particular 4 × 4 region of the texture. The blocks representing the image can be stored in an array, and the offset into this array is easily calculated from the x and y indices and the block size.

The decompression algorithm is fast. It simply uses the 2-bit indices in the compressed block to select colors from a four-value lookup table.
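
The fixed block size makes locating a block simple arithmetic. Here is a minimal sketch, assuming the blocks are stored row by row starting at the top-left corner of the image. `BlockOffset` is a hypothetical helper name, not a DirectX function.

```cpp
#include <cassert>
#include <cstddef>

// Byte offset of the compressed block that covers texel (x, y).
// blockBytes is 8 for DXT1 (64 bits) and 16 for DXT2 through DXT5.
// Assumes blocks are laid out left to right, top to bottom.
inline std::size_t BlockOffset(std::size_t x, std::size_t y,
                               std::size_t textureWidth,
                               std::size_t blockBytes)
{
    assert(x < textureWidth);
    const std::size_t blocksPerRow = (textureWidth + 3) / 4; // round up
    const std::size_t blockX = x / 4;   // column of the 4 x 4 block
    const std::size_t blockY = y / 4;   // row of the 4 x 4 block
    return (blockY * blocksPerRow + blockX) * blockBytes;
}
```

In a 256 × 256 DXT1 texture, for instance, the block covering the last texel starts 8 bytes before the end of the 32 KB image.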

Although it's not necessary to know all the details about how DXT compression works, a few concepts are worth noting. The following table describes the five DXT formats:

Setting for DDPIXELFORMAT's dwFourCC Member	Description	Alpha Premultiplied?
DXT1	Opaque; 1-bit alpha	Not applicable
DXT2	Explicit alpha	Yes
DXT3	Explicit alpha	No
DXT4	Interpolated alpha	Yes
DXT5	Interpolated alpha	No

As you can see, the primary difference among these variations is the treatment of alpha. DXT1 works for opaque images or images with 1 bit of alpha, which means that it works well for most textures, including those that have traditionally used color-keyed transparency. DXT2 and DXT3 store explicit alpha as 4 bits per texel, giving 16 levels of alpha that are evenly spaced between full transparency and full opacity. DXT4 and DXT5 can have values indicating full transparency and full opacity, plus four other levels interpolated between two values that the application specifies. As an alternative, DXT4 and DXT5 can instead simply have six alpha levels interpolated between two specified values. DXT2 and DXT4 use premultiplied alpha, in which the RGB values for each pixel have already been scaled down by the alpha value at that pixel, while DXT3 and DXT5 use non-premultiplied alpha, in which the RGB values remain independent of the alpha value. Whether you use premultiplied or non-premultiplied alpha depends on what your hardware supports and how you plan to use the texture.
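
The two interpolated-alpha modes and premultiplication can be sketched in a few lines. This is an illustration under stated assumptions rather than a reference decoder: the helper names are hypothetical, the choice between the two modes is assumed to follow the usual convention (the first stored endpoint being greater than the second selects the six-level mode), and the rounding terms are one reasonable choice.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Build the eight-entry alpha table for one DXT4/DXT5 block from its two
// stored endpoint values. Assumption: a0 > a1 selects the six-level mode.
inline std::array<std::uint8_t, 8> BuildAlphaTable(std::uint8_t a0,
                                                   std::uint8_t a1)
{
    std::array<std::uint8_t, 8> t{};
    t[0] = a0;
    t[1] = a1;
    if (a0 > a1)
    {
        // Six levels interpolated between the two endpoints.
        for (int i = 1; i <= 6; ++i)
            t[1 + i] = static_cast<std::uint8_t>(((7 - i) * a0 + i * a1 + 3) / 7);
    }
    else
    {
        // Four interpolated levels, plus full transparency and full opacity.
        for (int i = 1; i <= 4; ++i)
            t[1 + i] = static_cast<std::uint8_t>(((5 - i) * a0 + i * a1 + 2) / 5);
        t[6] = 0;
        t[7] = 255;
    }
    return t;
}

// Fetch the alpha for one texel from its 3-bit index into the table.
inline std::uint8_t LookupAlpha(const std::array<std::uint8_t, 8>& table,
                                unsigned index)
{
    assert(index < 8);
    return table[index];
}

// Scale one 8-bit color channel by an 8-bit alpha, as premultiplied
// formats (DXT2, DXT4) expect the RGB data to have been prepared.
inline std::uint8_t Premultiply(std::uint8_t channel, std::uint8_t alpha)
{
    return static_cast<std::uint8_t>((channel * alpha + 127) / 255);
}
```

With premultiplied data, a fully transparent texel always has black RGB values, which is one reason the premultiplied and non-premultiplied variants are not interchangeable.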

DXT works by breaking images into 4 × 4 texel chunks and exploiting color coherence within each chunk. For example, in the DXT1 format the compression software converts the 4 × 4 texel data into two 16-bit color values (RGB 5:6:5 format) plus 2 bits per texel. The 2 bits represent an index into a table containing the two 16-bit color values plus two more values. The compressor derives the two additional values from the 16-bit color values through linear interpolation. This compression procedure can cause images with a large number of colors in a small area to suffer some loss in quality. DXT is a fixed-rate compression format, meaning that a compressed, solid white image will take up as much space as a compressed photograph. However, DXT does a remarkable job of efficiently compressing a wide range of images that will likely be useful as texture maps. DXT1 uses 4 bits per texel, while the DXT2 through DXT5 formats use 8 bits per texel. (The DXT2 through DXT5 formats are larger because of their extra alpha information.) Therefore, compressing a 24-bit RGB image to DXT1 format can yield a sixfold compression savings, and compressing a 32-bit RGBA image to any of the DXT2 through DXT5 formats produces a fourfold compression savings. Be aware that you can simply compress a 24-bit pixel to 16-bit RGB565 without using DXT1 at all, which accounts for the difference between the 6:1 ratio for a DXT1 format shown here and the 4:1 ratio listed earlier.
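
The lookup-table scheme for DXT1 can be sketched as follows. This is a simplified illustration, not the decoder that any driver or the DirectX runtime actually uses: the type and function names are hypothetical, and the sketch assumes the usual DXT1 conventions that the two interpolated colors sit one third and two thirds of the way between the stored colors when the first stored color is numerically greater, and that otherwise the fourth entry is transparent (the 1-bit alpha case).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct Rgba { std::uint8_t r, g, b, a; };

// Expand a 16-bit RGB 5:6:5 value to 8 bits per channel.
inline Rgba Expand565(std::uint16_t c)
{
    const int r = (c >> 11) & 0x1F;
    const int g = (c >> 5) & 0x3F;
    const int b = c & 0x1F;
    return Rgba{ static_cast<std::uint8_t>((r << 3) | (r >> 2)),
                 static_cast<std::uint8_t>((g << 2) | (g >> 4)),
                 static_cast<std::uint8_t>((b << 3) | (b >> 2)),
                 255 };
}

// Weighted average of two channel values.
inline std::uint8_t Mix(int a, int b, int wa, int wb)
{
    return static_cast<std::uint8_t>((a * wa + b * wb) / (wa + wb));
}

// Build the four-entry color table for one DXT1 block.
inline std::array<Rgba, 4> BuildPalette(std::uint16_t color0,
                                        std::uint16_t color1)
{
    const Rgba c0 = Expand565(color0);
    const Rgba c1 = Expand565(color1);
    std::array<Rgba, 4> p{};
    p[0] = c0;
    p[1] = c1;
    if (color0 > color1)
    {
        // Opaque block: two colors interpolated between the endpoints.
        p[2] = Rgba{ Mix(c0.r, c1.r, 2, 1), Mix(c0.g, c1.g, 2, 1),
                     Mix(c0.b, c1.b, 2, 1), 255 };
        p[3] = Rgba{ Mix(c0.r, c1.r, 1, 2), Mix(c0.g, c1.g, 1, 2),
                     Mix(c0.b, c1.b, 1, 2), 255 };
    }
    else
    {
        // 1-bit alpha block: one midpoint color plus a transparent entry.
        p[2] = Rgba{ Mix(c0.r, c1.r, 1, 1), Mix(c0.g, c1.g, 1, 1),
                     Mix(c0.b, c1.b, 1, 1), 255 };
        p[3] = Rgba{ 0, 0, 0, 0 };
    }
    return p;
}

// Select the color of texel (x, y) within the block using its 2-bit index.
// Texel (0, 0) occupies the two lowest bits of the 32-bit index word.
inline Rgba DecodeTexel(const std::array<Rgba, 4>& palette,
                        std::uint32_t indices, int x, int y)
{
    assert(x >= 0 && x < 4 && y >= 0 && y < 4);
    const int shift = 2 * (4 * y + x);
    return palette[(indices >> shift) & 0x3];
}
```

Because only four colors are available per block, a 4 × 4 region containing many unrelated colors cannot be represented exactly, which is the quality loss described above.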

Using DXT Surfaces

In DirectDraw, you create DXT surfaces the same way you create ordinary texture surfaces except that instead of specifying the DDPF_RGB flag, you must specify the DDPF_FOURCC flag and a FOURCC code indicating which DXT format to use. Here's an example of code that creates a 256 × 256 DXT1 texture that uses automatic texture management:

    DDSURFACEDESC2       ddsd;
    LPDIRECTDRAWSURFACE7 pddsCompressed = NULL;
    HRESULT              hr;

    // Describe a 256 x 256 managed texture in DXT1 format.
    ZeroMemory(&ddsd, sizeof(ddsd));
    ddsd.dwSize = sizeof(ddsd);
    ddsd.dwFlags = DDSD_CAPS | DDSD_WIDTH |
                   DDSD_HEIGHT | DDSD_PIXELFORMAT;
    ddsd.ddsCaps.dwCaps = DDSCAPS_TEXTURE;
    ddsd.ddsCaps.dwCaps2 = DDSCAPS2_TEXTUREMANAGE;
    ddsd.dwWidth = 256;
    ddsd.dwHeight = 256;
    ddsd.ddpfPixelFormat.dwSize = sizeof(DDPIXELFORMAT);
    ddsd.ddpfPixelFormat.dwFlags = DDPF_FOURCC;
    ddsd.ddpfPixelFormat.dwFourCC = FOURCC_DXT1;

    if (FAILED(hr = pDD->CreateSurface(&ddsd, &pddsCompressed, NULL)))
        return hr;
    return S_OK;
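
A FOURCC code is simply four characters packed into a 32-bit value. The sketch below is a portable stand-in for illustration: `MakeFourCC`, `DxtVariant`, and `BytesPerBlock` are hypothetical names, and the packing order (first character in the low byte) is what the MAKEFOURCC macro is understood to produce.

```cpp
#include <cassert>
#include <cstdint>

// Pack four characters into a 32-bit code, first character in the low byte.
// Hypothetical portable equivalent of the MAKEFOURCC macro.
constexpr std::uint32_t MakeFourCC(char a, char b, char c, char d)
{
    return  static_cast<std::uint32_t>(static_cast<unsigned char>(a))        |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(b)) << 8)  |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << 16) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(d)) << 24);
}

// Return the DXT variant number (1 through 5) for a FOURCC code,
// or 0 if the code doesn't name a DXT format.
inline int DxtVariant(std::uint32_t fourCC)
{
    for (int n = 1; n <= 5; ++n)
        if (fourCC == MakeFourCC('D', 'X', 'T', static_cast<char>('0' + n)))
            return n;
    return 0;
}

// Compressed size of one 4 x 4 block: 64 bits for DXT1, 128 bits otherwise.
inline int BytesPerBlock(int dxtVariant)
{
    assert(dxtVariant >= 1 && dxtVariant <= 5);
    return dxtVariant == 1 ? 8 : 16;
}
```

An application could use a check like `DxtVariant` on a surface's dwFourCC value to decide how much memory each block of a compressed surface occupies.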

To move an image from a regular RGB surface into the compressed texture, use DirectDraw's Blt function. If the source RGB surface has alpha bits, Blt will set alpha information appropriately in the destination DXT surface. If the source RGB surface has a color key attached, Blt will set the alpha for each pixel to either opaque or transparent. The reverse operation also works: you can blit from a DXT texture to just about any format of RGB surface to decompress the image. Be aware that compressing an image into a DXT format is much slower than decompressing it. Therefore, applications should avoid compressing textures at run time and use precompressed textures instead.

Another way to fill the compressed texture is to lock the surface and then transfer DXT-compressed data directly into it. This method is the one you should use if you've stored your texture data in a file in compressed format. The DirectX SDK illustrates this approach with the DirectX Texture Tool and the Compress sample application. (You'll see the files for these applications if you install DirectX 7 from the CD accompanying this book.) The DirectX Texture Tool lets you load uncompressed images, generate mipmap levels (see the section that follows), compress the resulting surface, and save everything as a DirectDraw surface (DDS) file. The DDS file is a simple format that holds the surface description (which holds the image dimensions, compression format, and other useful parameters), followed by one or more blocks of compressed surface data (one block per mipmap level). Here's the code needed to load a DDS file:

    HRESULT ReadDDSTexture( CHAR* strTextureName, LPDIRECTDRAW7 pDD,
                            DDSURFACEDESC2* pddsdComp,
                            LPDIRECTDRAWSURFACE7* ppddsCompTop )
    {
        HRESULT              hr;
        LPDIRECTDRAWSURFACE7 pddsTop      = NULL;
        LPDIRECTDRAWSURFACE7 pdds         = NULL;
        LPDIRECTDRAWSURFACE7 pddsAttached = NULL;
        DDSURFACEDESC2       ddsd;
        DWORD                dwMagic;

        hr = E_FAIL;

        //
        // Open the compressed texture file.
        //
        FILE* file = fopen( strTextureName, "rb" );
        if( file == NULL )
            return E_FAIL;

        // Read the magic number.
        if( fread( &dwMagic, sizeof(DWORD), 1, file ) != 1 )
            goto LFail;
        if( dwMagic != MAKEFOURCC('D','D','S',' ') )
            goto LFail;

        //
        // Read the surface description.
        //
        if( fread( pddsdComp, sizeof(DDSURFACEDESC2), 1, file ) != 1 )
            goto LFail;

        //
        // Mask/set surface caps appropriately for the application.
        //
        pddsdComp->ddsCaps.dwCaps2 |= DDSCAPS2_TEXTUREMANAGE;

        //
        // Handle the special case in which the hardware doesn't
        // support mipmapping.
        //
        if( !g_bSupportsMipmaps )
        {
            pddsdComp->dwMipMapCount = 0;
            pddsdComp->dwFlags &= ~DDSD_MIPMAPCOUNT;
            pddsdComp->ddsCaps.dwCaps &= ~( DDSCAPS_MIPMAP |
                                            DDSCAPS_COMPLEX );
        }

        //
        // Does texture have mipmaps?
        //
        if( pddsdComp->dwMipMapCount == 0 )
            g_bMipTexture = FALSE;
        else
            g_bMipTexture = TRUE;

        //
        // Clear unwanted flags.
        //
        pddsdComp->dwFlags &= (~DDSD_PITCH);
        pddsdComp->dwFlags &= (~DDSD_LINEARSIZE);

        //
        // Create a new surface based on the surface description.
        //
        if( FAILED( hr = pDD->CreateSurface( pddsdComp, ppddsCompTop,
                                             NULL ) ) )
            goto LFail;

        pddsTop = *ppddsCompTop;
        pdds = pddsTop;
        pdds->AddRef();

        //
        // Fill the top-level surface, and then each attached mipmap level.
        //
        while( TRUE )
        {
            ZeroMemory( &ddsd, sizeof(DDSURFACEDESC2) );
            ddsd.dwSize = sizeof(DDSURFACEDESC2);

            if( FAILED( hr = pdds->Lock( NULL, &ddsd, DDLOCK_WAIT,
                                         NULL ) ) )
            {
                pdds->Release();   // Drop the reference held for this level.
                goto LFail;
            }

            if( ddsd.dwFlags & DDSD_LINEARSIZE )
            {
                // Compressed data: read the whole level as one block.
                fread( ddsd.lpSurface, ddsd.dwLinearSize, 1, file );
            }
            else
            {
                // Uncompressed data: read one row at a time, honoring pitch.
                DWORD yp;
                BYTE* pbDest = (BYTE*)ddsd.lpSurface;
                LONG dataBytesPerRow =
                    ddsd.dwWidth * ddsd.ddpfPixelFormat.dwRGBBitCount / 8;

                for( yp = 0; yp < ddsd.dwHeight; yp++ )
                {
                    fread( pbDest, dataBytesPerRow, 1, file );
                    pbDest += ddsd.lPitch;
                }
            }

            pdds->Unlock( NULL );

            if( !g_bSupportsMipmaps )
            {
                // For mipless hardware, don't copy mipmaps.
                pdds->Release();
                break;
            }

            // Move on to the next mipmap level, if there is one.
            ddsd.ddsCaps.dwCaps  = DDSCAPS_TEXTURE | DDSCAPS_MIPMAP |
                                   DDSCAPS_COMPLEX;
            ddsd.ddsCaps.dwCaps2 = 0;
            ddsd.ddsCaps.dwCaps3 = 0;
            ddsd.ddsCaps.dwCaps4 = 0;

            if( FAILED( hr = pdds->GetAttachedSurface(
                                       &ddsd.ddsCaps, &pddsAttached ) ) )
            {
                pdds->Release();
                break;
            }

            pdds->Release();
            pdds = pddsAttached;
        }

        hr = S_OK;  // Everything worked.

    LFail:
        fclose( file );
        return hr;
    }

One point worth mentioning about accessing DXT surfaces directly is that they always use a linear block of memory. RGB surfaces are different: their pitch (the distance, in bytes, between two memory addresses that represent the beginning of one bitmap row and the beginning of the next bitmap row) isn't necessarily the same as the row width. Pitch is therefore meaningless for DXT surfaces. The dwLinearSize field indicates the total amount of memory used for the compressed image data. The previous code sample can also load DDS files containing uncompressed RGB surfaces. The code uses the DDSD_LINEARSIZE flag to determine whether to load an entire mipmap level in a solid block or to load it one row at a time, honoring pitch.
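
The size of each block of data in the file can be estimated from the level's dimensions. The following sketch uses hypothetical helper names and assumes that a compressed level occupies a whole number of 4 × 4 blocks (so even a 1 × 1 level takes one full block). The value it computes should correspond to dwLinearSize, but treat that as an assumption to verify against the surfaces you actually lock.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Size in bytes of one DXT-compressed mipmap level.
// blockBytes is 8 for DXT1 and 16 for DXT2 through DXT5.
inline std::size_t CompressedLevelBytes(std::size_t width, std::size_t height,
                                        std::size_t blockBytes)
{
    assert(blockBytes == 8 || blockBytes == 16);
    const std::size_t blocksWide = std::max<std::size_t>(1, (width + 3) / 4);
    const std::size_t blocksHigh = std::max<std::size_t>(1, (height + 3) / 4);
    return blocksWide * blocksHigh * blockBytes;
}

// Bytes of image data in one row of an uncompressed level. This is the
// amount read per row; the surface pitch may be larger.
inline std::size_t UncompressedRowBytes(std::size_t width,
                                        std::size_t bitsPerPixel)
{
    return width * bitsPerPixel / 8;
}
```

A 256 × 256 DXT1 level therefore needs 32 KB in one contiguous read, whereas an uncompressed 16-bit level of the same size is read as 256 rows of 512 bytes each.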

If you author textures in DXT-compressed form, when the program runs you'll need to determine whether the renderer directly supports DXT textures. If it does, you can go ahead and pass the DXT textures that you've loaded to the renderer. If not, you need to decompress the texture contents into textures that are in a format supported by the renderer. The Compress sample application on the companion CD shows how to enumerate supported formats. If the renderer doesn't support DXT, Compress chooses an appropriate RGB format and decompresses the texture into it.