recharging the capacitor to its former condition. As a consequence, DRAM can have a shorter access time (the time taken to read a cell) than its cycle time (the time until the same cell may be accessed again). Also, isolation of the cell's storage capacitor is imperfect and the charge leaks away, requiring it to be refreshed (rewritten) every few milliseconds. Finally, because the capacitor is a passive, non-amplifying device, it takes longer to access a DRAM cell than an SRAM cell. However, the benefits are substantial. DRAM density can exceed ten times that of SRAM and its power consumption is much lower. Also, new techniques for moving data from the DRAM internal row buffers to the system bus have narrowed the memory bandwidth gap between DRAM and SRAM. As a result, main memory for all Beowulf nodes is provided by DRAM in one of its many forms.
Of the many forms of DRAM, the two most likely to be encountered in Beowulf nodes are Extended Data Output DRAM (EDO DRAM) and Synchronous DRAM (SDRAM). Both are intended to increase memory throughput. EDO DRAM provides a modified internal buffering scheme that maintains data at the output pins longer than conventional DRAM, allowing improved memory data transfer rates. While many current motherboards support EDO DRAM, the higher-speed systems likely to be used as Beowulf nodes in the immediate future will employ SDRAM instead. SDRAM is a significant advance in memory interface design. It supports a pipeline burst mode that permits a second access cycle to begin before the previous one has completed. While one cycle is putting output data on the bus, the address for the next access cycle is simultaneously applied to the memory. Effective access speeds of 10 nanoseconds can be achieved in systems using a 100 MHz system bus.
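The 10 nanosecond figure follows from the bus clock: one cycle of a 100 MHz bus lasts 1/(100 x 10^6) s = 10 ns, and pipelined burst transfers let the effective per-word access time approach that cycle time as the initial latency is amortized over the burst. The short C sketch below illustrates the arithmetic; the five-cycle initial latency and the burst lengths are assumed values chosen for illustration, not figures from the text.

    #include <stdio.h>

    int main(void)
    {
        const double bus_hz   = 100e6;         /* assumed 100 MHz system bus     */
        const double cycle_ns = 1e9 / bus_hz;  /* one bus cycle = 10 ns          */
        const int    latency  = 5;             /* assumed cycles before the      */
                                                /* first word of a burst arrives  */

        printf("bus cycle time: %.1f ns\n", cycle_ns);
        for (int burst = 1; burst <= 64; burst *= 4) {
            /* amortized time per word: setup latency spread over the burst */
            double per_word_ns = cycle_ns * (latency + burst) / burst;
            printf("burst of %2d words: %5.1f ns per word\n", burst, per_word_ns);
        }
        return 0;
    }

For a single word the assumed latency dominates, but for a 64-word burst the amortized figure is within about ten percent of the 10 ns bus cycle, which is the sense in which pipelined SDRAM approaches an effective 10 ns access.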
3.4.4 Memory Hierarchy: L1 and L2 Caches
The modern memory system is a hierarchy of memory types. Figure 3.3 shows a typical memory hierarchy. Near the processor at the top of the memory system are the high-speed Level 1 or L1 caches. Usually separate caches are used for data and instructions, providing the bandwidth to load both into the processor on the same cycle. The principal requirement is to deliver the data and instruction words needed for processing on every processor cycle. These memories run fast and hot, are relatively expensive, and are now often incorporated directly on the processor chip. For these reasons, they tend to be very small, with a typical size of 16 KBytes. Because L1 caches are so small and main memory requires long access times, modern system architectures usually include a second level or L2 cache to hold both data and instructions. Acquiring a block of data from the L2 cache may take several processor cycles. A typical L2 cache size is 512 KBytes, with future L2 caches planned to reach 2 MBytes within the next one to two years.
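One simple way to observe this hierarchy from software is to time repeated accesses to arrays of increasing size: once the working set no longer fits in the 16 KByte L1 cache, and again once it exceeds the 512 KByte L2 cache, the measured time per access jumps. The C sketch below is a minimal version of such a probe; the 64-byte stride, the range of array sizes, and the use of clock() for timing are illustrative assumptions rather than a benchmark taken from the text.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define STRIDE   64          /* assumed cache line size in bytes        */
    #define REPEAT   (1 << 24)   /* accesses timed for each array size      */

    int main(void)
    {
        /* Walk arrays from 4 KBytes to 4 MBytes; the per-access time should
         * rise once the working set no longer fits in L1, and again once it
         * exceeds L2.                                                       */
        for (size_t size = 4 * 1024; size <= 4 * 1024 * 1024; size *= 2) {
            volatile char *buf = malloc(size);
            if (buf == NULL)
                return 1;
            for (size_t i = 0; i < size; i++)
                buf[i] = (char)i;                /* touch every byte once    */

            clock_t start = clock();
            size_t index = 0;
            for (long n = 0; n < REPEAT; n++) {  /* stride through the array */
                buf[index] += 1;
                index = (index + STRIDE) % size;
            }
            double secs = (double)(clock() - start) / CLOCKS_PER_SEC;

            printf("%8zu bytes: %6.2f ns per access\n",
                   size, 1e9 * secs / REPEAT);
            free((void *)buf);
        }
        return 0;
    }

On a node of the era described here, the printed per-access time would be expected to step upward near the L1 and L2 capacities, giving a rough software-visible map of the cache sizes.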

 


