


C.4. Cull Unseen Geometry

If your application is geometry limited, a good way to avoid bottlenecks, or at least reduce their effects, is to send less geometry to OpenGL in the first place.

Performance-conscious applications commonly perform frustum culling to avoid sending geometry outside the view volume to OpenGL. The six view volume planes can be derived from the projection and model-view matrices whenever the view changes. To cull, test each object's bounding sphere against those planes, and skip any object whose sphere lies entirely outside one of them.
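The plane derivation and sphere test can be sketched in plain C as follows. This is a minimal illustration, not the book's code: the names Plane, extract_frustum_planes(), and sphere_in_frustum() are hypothetical, the matrices are assumed to be in OpenGL's column-major layout, and the bounding sphere is assumed to be expressed in object coordinates (the space the model-view matrix transforms from). The planes are left unnormalized, so the test compares squared distances rather than taking a square root.

```c
/* Plane a*x + b*y + c*z + d = 0; points with a non-negative value are inside. */
typedef struct {
    float a, b, c, d;
    float len2;   /* squared length of the normal (a, b, c) */
} Plane;

/* out = lhs * rhs for 4x4 matrices stored in OpenGL's column-major order. */
static void mat4_mul(const float lhs[16], const float rhs[16], float out[16])
{
    for (int col = 0; col < 4; ++col) {
        for (int row = 0; row < 4; ++row) {
            float sum = 0.0f;
            for (int k = 0; k < 4; ++k)
                sum += lhs[k * 4 + row] * rhs[col * 4 + k];
            out[col * 4 + row] = sum;
        }
    }
}

/* Fills planes[] in the order left, right, bottom, top, near, far.
 * Each plane is row 3 of (projection * modelview) plus or minus row 0, 1, or 2. */
void extract_frustum_planes(const float projection[16],
                            const float modelview[16],
                            Plane planes[6])
{
    float clip[16];
    mat4_mul(projection, modelview, clip);

    for (int axis = 0; axis < 3; ++axis) {
        for (int side = 0; side < 2; ++side) {
            float sign = side ? -1.0f : 1.0f;
            Plane *p = &planes[axis * 2 + side];
            p->a = clip[3]  + sign * clip[0  + axis];
            p->b = clip[7]  + sign * clip[4  + axis];
            p->c = clip[11] + sign * clip[8  + axis];
            p->d = clip[15] + sign * clip[12 + axis];
            p->len2 = p->a * p->a + p->b * p->b + p->c * p->c;
        }
    }
}

/* Returns 0 when the sphere lies entirely outside at least one plane,
 * 1 otherwise. The test is conservative: a sphere near a frustum corner
 * can be reported as visible even though it is outside. */
int sphere_in_frustum(const Plane planes[6],
                      float cx, float cy, float cz, float radius)
{
    for (int i = 0; i < 6; ++i) {
        const Plane *p = &planes[i];
        float dist = p->a * cx + p->b * cy + p->c * cz + p->d;
        /* dist is scaled by the normal's length, so compare squares. */
        if (dist < 0.0f && dist * dist > radius * radius * p->len2)
            return 0;
    }
    return 1;
}
```

An application would typically call extract_frustum_planes() once per view change and sphere_in_frustum() once per object before issuing that object's draw calls.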

In some instances, geometry is inside the view volume but still not visible. Imagine an automobile design application rendering a model of a car, in which the engine is modeled with complex geometry. The car body occludes the engine, so the application can gain performance by not rendering the engine. Because the engine is within the view volume, however, frustum culling alone is insufficient.

OpenGL provides the occlusion query feature for this situation, which Appendix A, "Other Features," describes in brief. Note that occlusion queries return data to the application, which in itself can cause performance problems. To avoid stalling the rendering pipeline, issue occlusion queries during rendering, but obtain occlusion query results only at the end of the frame. Use the results when you render the next frame. This technique works well for frame-coherent applications, but for initial frames or sudden changes in view, your application will need to assume that everything is visible and issue a new set of queries for use in successive frames.
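The bookkeeping for this one-frame-delayed scheme might look like the sketch below. It contains no OpenGL calls so that it stays self-contained; the comments mark where an application would use glBeginQuery(), glEndQuery(), and glGetQueryObjectuiv(). The VisibilityCache type, its functions, and the MAX_OBJECTS limit are all hypothetical names introduced for illustration.

```c
#include <stddef.h>

#define MAX_OBJECTS 256   /* arbitrary limit chosen for this sketch */

typedef struct {
    unsigned char visible[MAX_OBJECTS];  /* result of the previous frame's query */
    size_t        count;
} VisibilityCache;

/* Call for the first frame and after any sudden view change:
 * with no usable results, assume every object is visible. */
void visibility_reset(VisibilityCache *vc, size_t count)
{
    vc->count = count > MAX_OBJECTS ? MAX_OBJECTS : count;
    for (size_t i = 0; i < vc->count; ++i)
        vc->visible[i] = 1;
}

/* During rendering: decide from the PREVIOUS frame's result whether to
 * draw object i at full detail. The application would still wrap a draw
 * of the object (or a cheap bounding volume) in
 * glBeginQuery(GL_SAMPLES_PASSED, id) / glEndQuery(GL_SAMPLES_PASSED)
 * so that a fresh result exists for the next frame. */
int visibility_should_draw(const VisibilityCache *vc, size_t i)
{
    return i < vc->count ? vc->visible[i] : 1;
}

/* At the end of the frame: store the sample count obtained with
 * glGetQueryObjectuiv(id, GL_QUERY_RESULT, &samples_passed). */
void visibility_store(VisibilityCache *vc, size_t i, unsigned samples_passed)
{
    if (i < vc->count)
        vc->visible[i] = samples_passed > 0;
}
```

Reading all results in one pass at the end of the frame, rather than immediately after each query, is what keeps the application from waiting on the pipeline in the middle of rendering.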

Another popular technique for optimizing occlusion queries is to arrange your geometry so that each frame is rendered in front-to-back (or outside-to-inside) order. Rendering likely occluders first maximizes the chance that geometry rendered later is occluded.
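One simple way to approximate front-to-back order is to sort objects by the distance from the eye to each object's bounding-sphere center before drawing. The sketch below assumes that approach; DrawItem and sort_front_to_back() are hypothetical names, and center-distance ordering is only a heuristic for large or interpenetrating objects.

```c
#include <stdlib.h>

typedef struct {
    int   id;          /* hypothetical handle for the object to draw */
    float center[3];   /* bounding-sphere center, same space as the eye point */
    float dist2;       /* squared distance to the eye, filled in by the sort */
} DrawItem;

/* qsort comparator: nearer items first. */
static int compare_near_first(const void *lhs, const void *rhs)
{
    float a = ((const DrawItem *)lhs)->dist2;
    float b = ((const DrawItem *)rhs)->dist2;
    return (a > b) - (a < b);
}

/* Reorders items so that the object nearest the eye comes first. */
void sort_front_to_back(DrawItem *items, size_t count, const float eye[3])
{
    for (size_t i = 0; i < count; ++i) {
        float dx = items[i].center[0] - eye[0];
        float dy = items[i].center[1] - eye[1];
        float dz = items[i].center[2] - eye[2];
        /* Squared distance preserves the ordering and avoids a square root. */
        items[i].dist2 = dx * dx + dy * dy + dz * dz;
    }
    qsort(items, count, sizeof items[0], compare_near_first);
}
```

For frame-coherent scenes the order changes little between frames, so an application could re-sort only when the view moves appreciably.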



C.5. State Changes and Queries

In general, modern OpenGL graphics hardware performs optimally when processing an uninterrupted stream of geometry data. State changes interrupt this stream and cause delays in processing. In a worst-case scenario, applications make extensive state changes after every triangle, dramatically inhibiting performance.

Avoid unnecessary state changes with the following tips:

  • Group geometry that shares state. Group textured primitives when they share a texture to avoid redundant calls to glBindTexture().

  • To restore several changed states efficiently, use glPushAttrib() and glPopAttrib(), and their client-side equivalents, glPushClientAttrib() and glPopClientAttrib(). OpenGL implementations typically optimize these routines so that they're extremely lightweight.

  • To restore only a small number of changed-state items efficiently, make explicit calls to change and restore the states. Restoring state explicitly requires that your application track current state or query OpenGL before changing state. Nonetheless, this can be more efficient than using the attribute stacks on some implementations.

  • Avoid setting state redundantly. Although most OpenGL implementations are optimized to do nothing in this case, it still costs at least a branch and usually a call through a function pointer.

  • Avoid switching between multiple rendering contexts. If your application uses multiple contexts, limit the number of context switches to as few as possible per frame.
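The first and fourth tips, grouping by shared state and skipping redundant state sets, can be combined in a small amount of application-side code. The following is a sketch under stated assumptions: Batch, TextureCache, group_by_texture(), and texture_cache_needs_bind() are hypothetical names, and the actual glBindTexture() call is left to the caller so that the example has no OpenGL dependency.

```c
#include <stdlib.h>

typedef struct {
    unsigned texture;  /* texture object name used by this batch */
    int      mesh;     /* hypothetical handle for the geometry to draw */
} Batch;

/* Application-side record of the texture most recently bound. */
typedef struct {
    unsigned bound;
    int      valid;    /* 0 until the first bind has been recorded */
} TextureCache;

static int compare_by_texture(const void *lhs, const void *rhs)
{
    unsigned a = ((const Batch *)lhs)->texture;
    unsigned b = ((const Batch *)rhs)->texture;
    return (a > b) - (a < b);
}

/* Reorders batches so that those sharing a texture are adjacent. */
void group_by_texture(Batch *batches, size_t count)
{
    qsort(batches, count, sizeof batches[0], compare_by_texture);
}

/* Returns 1 if the caller should bind the texture (and records it as
 * bound), or 0 if it is already bound and the call would be redundant. */
int texture_cache_needs_bind(TextureCache *cache, unsigned texture)
{
    if (cache->valid && cache->bound == texture)
        return 0;
    cache->bound = texture;
    cache->valid = 1;
    return 1;
}
```

In a render loop, the application would call group_by_texture() once, then for each batch call glBindTexture(GL_TEXTURE_2D, batch.texture) only when texture_cache_needs_bind() returns 1. If any other code binds textures directly, the cache must be invalidated (valid set to 0) or it will skip binds that are actually needed.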

Obviously, OpenGL implementations set state in the underlying graphics hardware, but they also keep a shadow copy of many state values in host RAM. Querying a state item stored in shadow state requires only a data copy and, therefore, is relatively inexpensive.

glIsEnabled() is generally lightweight, because enable state is almost always shadowed in host RAM. Some OpenGL implementations keep a shadow copy of the top of the matrix stacks, so getting the matrix usually is as cheap as copying 16 GLfloats (or GLdoubles). Implementations don't, and shouldn't, shadow all state, however. For example, implementations optimally store texture maps generated from framebuffer data (using glCopyTexImage2D()) only in graphics hardware RAM. Querying OpenGL to retrieve such a texture map (using glGetTexImage()) requires a large data copy over the system bus.