8.1. Introducing CPU Caches

Figure 8.1 depicts typical caches that a CPU can use.

Figure 8.1. CPU Caches
Caches include the following:
These are the typical caches for the content of main memory, depending on the processor. Another framework for caching page translations as part of the Memory Management Unit (MMU) includes the Translation Lookaside Buffer (TLB) and Translation Storage Buffers (TSBs). These translation facilities are discussed in detail in Chapter 12 in Solaris™ Internals.

Of particular interest are the I-cache, D-cache, and E-cache, which are often listed as key specifications for a CPU type. Details of interest are their size, their cache line size, and their set-associativity. A larger cache generally improves the cache hit ratio, and a larger cache line size can improve throughput. Higher set-associativity gives the Least Recently Used (LRU) replacement policy more lines to choose from within each set, which can avoid hot spots where the cache would otherwise have flushed frequently accessed data.

A low cache hit ratio and a large number of cache misses for the I-, D-, or E-cache are likely to degrade application performance. Section 8.2 demonstrates the monitoring of different event statistics, many of which can be used to determine cache performance.

It is important to stress that each processor type is different and can have a different arrangement, type, and number of caches. For example, the UltraSPARC IV+ has a Level 3 cache of 32 Mbytes, in addition to its Level 1 and 2 caches. To highlight this further, the following describes the caches for three recent SPARC processors:
The cores are connected by a high-speed, low-latency crossbar in silicon. An UltraSPARC T1 processor can be considered SMP on a chip. Each core has an instruction cache, a data cache, an instruction translation-lookaside buffer (iTLB), and a data TLB (dTLB) shared by the four strands. A twelve-way associative unified Level 2 (L2) on-chip cache is shared by all 32 hardware threads. Memory latency is uniform across all cores: uniform memory access (UMA), not non-uniform memory access (NUMA). Figure 8.2 illustrates the structure of the UltraSPARC T1 processor.

Figure 8.2. UltraSPARC T1 Caches
For a reference on UltraSPARC caches, see the UltraSPARC Processors Documentation Web site (http://www.sun.com/processors/documentation.html). This Web site lists the processor user manuals, which are referred to by the cpustat command in the next section. Other CPU brands have similar documentation that can be found online.
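The effect of set-associativity on the LRU policy, described earlier in this section, can be illustrated with a small simulation. The following is a minimal sketch, not a model of any real processor: the function name simulate_hit_ratio, the 1-Kbyte cache size, and the 64-byte line size are all assumptions chosen only to keep the arithmetic easy to follow. It replays a trace of two addresses that compete for the same cache set and reports the resulting hit ratio.

```python
from collections import OrderedDict


def simulate_hit_ratio(addresses, cache_size, line_size, ways):
    """Replay an address trace through a set-associative LRU cache.

    Hypothetical helper for illustration only. Returns hits / accesses.
    """
    num_sets = cache_size // (line_size * ways)
    # One OrderedDict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        line = addr // line_size   # drop the byte offset within the line
        index = line % num_sets    # which set the line maps to
        tag = line // num_sets     # identifies the line within that set
        entries = sets[index]
        if tag in entries:
            hits += 1
            entries.move_to_end(tag)         # mark as most recently used
        else:
            if len(entries) >= ways:
                entries.popitem(last=False)  # evict the least recently used
            entries[tag] = True
    return hits / len(addresses)


if __name__ == "__main__":
    # Two hot addresses exactly one cache size apart (assumed values),
    # so they map to the same set in both configurations below.
    trace = [0, 1024] * 100
    for ways in (1, 2):
        ratio = simulate_hit_ratio(trace, cache_size=1024, line_size=64, ways=ways)
        print(f"{ways}-way: hit ratio {ratio:.2f}")
```

With one way (a direct-mapped cache), each access evicts the line that the next access needs, so every access misses. With two ways, both lines fit in the same set and only the first two accesses miss. Real workloads are far less regular than this trace, so the figures indicate only the direction of the effect.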