Memory Basics


Memory is the workspace for a server's processor. It is a temporary storage area where the programs and data being operated on by the processor must reside. Memory storage is considered temporary because the data and programs remain there only as long as the server has electrical power and is not reset. Before the server is shut down or reset, any data that has been changed should be saved to a more permanent storage device (usually a hard disk) so it can be reloaded into memory in the future.

Memory is often called RAM, for random access memory. RAM's contents are volatile, requiring power (and, for most types, periodic refreshing) to remain valid. Main memory is called RAM because you can randomly (as opposed to sequentially) access any location in memory. This designation is somewhat misleading and often misinterpreted. Read-only memory (ROM), for example, is also randomly accessible, yet it is usually differentiated from system RAM because ROM, unlike RAM, maintains its contents without power and can't normally be written to. Disk memory is also randomly accessible, but we don't consider that RAM, either.

Over the years, the definition of RAM has changed from being a simple acronym to referring to the primary memory workspace the processor uses to run programs, which is usually constructed of a type of chip called dynamic RAM (DRAM). One of the characteristics of DRAM chips (and, therefore, RAM in general) is that they store data dynamically, which really has two meanings. One meaning is that the information can be written to RAM repeatedly at any time. The other has to do with the fact that DRAM requires the data to be refreshed (essentially rewritten) every 15ms (milliseconds) or so. A type of RAM called static RAM (SRAM) does not require this periodic refreshing. An important characteristic of RAM in general is that data is stored only as long as the memory has electrical power.

When we talk about a computer's memory, we usually mean the RAM or physical memory in the system (the memory modules used to temporarily store currently active programs and data). It's important not to confuse memory with storage, which refers to things such as disk and tape drives (although they can be used as a substitute for RAM called virtual memory).

RAM can refer to both the physical chips/modules that make up memory in a system and the logical mapping and layout of that memory. Logical mapping and layout refer to how the memory addresses are mapped to actual chips and what address locations contain which types of system information.

Memory temporarily stores programs when they are running, along with the data being used by those programs. RAM chips are sometimes termed volatile storage because when you turn off a computer or an electrical outage occurs, whatever is stored in RAM is lost unless it has been saved to a hard disk or other storage device. Because of the volatile nature of RAM, many computer users make it a habit to save their work frequently. (Some software applications can do timed backups automatically.) Launching a computer program from either local or network storage brings its files into RAM, and as long as they are running, computer programs reside in RAM. The CPU executes programmed instructions in RAM and also stores results in RAM. The server transmits data to onboard storage or to connected workstations that request the information.

Physically, the main memory in a system is a collection of chips or modules containing chips that are usually plugged in to the motherboard. These chips or modules vary in their electrical and physical designs and must be compatible with the system into which they are being installed in order to function properly. This chapter discusses the various types of chips and modules that can be installed in different systems.

Next to the processor and motherboard, memory can be one of the most expensive components in a modern PC, although the total amount of money spent on memory for a typical system has declined over the past few years. Server memory generally sells for about $200 per gigabyte. It is somewhat more expensive than desktop memory because it supports data-protection and reliability features such as parity/error correcting code (ECC) and a signal buffer. (Memory that uses a signal buffer chip is known as registered memory.)

If you build a new server, you can't expect to be able to use just any existing server memory in your inventory. Similarly, if you upgrade the motherboard in an existing server, it's likely that the new motherboard will not support the old motherboard's memory. Therefore, you need to understand all the various types of memory on the market today so you can best determine which types are required by which systems and thus more easily plan for future upgrades and repairs.

To better understand physical memory in a system, you should see where and how it fits into the system. Three main types of physical memory are used in modern systems:

  • Read-only memory (ROM)

  • Dynamic random access memory (DRAM)

  • Static random access memory (SRAM)

ROM

ROM is a type of memory that can permanently or semipermanently store data. It is called read-only because it is either impossible or difficult to write to. ROM is also often referred to as nonvolatile memory because any data stored in ROM remains there, even if the power is turned off. Therefore, ROM is an ideal place to put a server's startup instructions (that is, the software that boots the system).

Note that ROM and RAM are not opposites, as some people seem to believe. They are both simply types of memory. In fact, ROM could be classified as technically a subset of the system's RAM. In other words, a portion of the system's RAM address space is mapped into one or more ROM chips. This is necessary to contain the software that enables the PC to boot up; otherwise, the processor would have no program in memory to execute when it was powered on.

For more information on ROM, see "Motherboard BIOS," p. 286.


The main ROM BIOS is contained in a ROM chip on the motherboard, but adapter cards can carry ROM as well. ROM on adapter cards contains auxiliary BIOS routines and drivers needed by the particular card, especially for cards that must be active early in the boot process, such as video cards. Cards that don't need drivers active at boot time typically don't have ROM because those drivers can be loaded from the hard disk later in the boot process.

Most systems today use a type of ROM called electrically erasable programmable ROM (EEPROM), which is a form of flash memory. Flash is a truly nonvolatile memory that is rewritable, enabling users to easily update the ROM or firmware in their motherboards or any other components (video cards, SCSI cards, peripherals, and so on).

DRAM

DRAM is the type of memory chip used for most of the main memory in a modern PC. The main advantages of DRAM are that it is very dense, meaning you can pack a lot of bits into a very small chip, and it is inexpensive, which makes purchasing large amounts of it affordable.

The memory cells in a DRAM chip are tiny capacitors that retain a charge to indicate a bit. The problem with DRAM is that it is dynamic. Because of its design, DRAM must be constantly refreshed; otherwise, the electrical charges in the individual memory capacitors drain, and the data is lost. A refresh occurs when the system memory controller takes a tiny break and accesses all the rows of data in the memory chips. The memory controller (built in to the North Bridge portion of the motherboard chipset on most servers, or into the processor itself on the AMD Opteron) is set for an industry-standard refresh rate of 15ms. This means that every 15ms, all the rows in the memory chip are automatically read to refresh the data.
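The refresh arithmetic above can be sketched with a short calculation. This is a hypothetical illustration rather than the behavior of any specific controller: the 15ms interval comes from the text, while the row count of 8,192 is an assumed value chosen for the example.

```python
# Hypothetical illustration: spreading DRAM row refreshes across the
# industry-standard 15ms window. The row count is an assumption.
REFRESH_INTERVAL_MS = 15.0   # every row must be refreshed within this window
ROWS = 8192                  # assumed number of rows in the chip

def row_refresh_spacing_us(interval_ms=REFRESH_INTERVAL_MS, rows=ROWS):
    """Microseconds between successive row-refresh commands when the
    controller distributes refreshes evenly across the interval."""
    return interval_ms * 1000.0 / rows

# Roughly 1.8 microseconds between row refreshes with these assumptions.
print(f"One row refreshed every {row_refresh_spacing_us():.3f} microseconds")
```

This distributed scheduling is why refreshing steals only brief, regular slices of the memory controller's time rather than stalling it for 15ms at once.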

See "Server Chipsets Overview," p. 146.


Unfortunately, refreshing memory takes processor time away from other tasks because each refresh cycle takes several CPU cycles to complete. A few servers allow you to alter the refresh timing parameters via the CMOS Setup, but be aware that increasing the time between refresh cycles to speed up your system can allow some of the memory cells to begin draining, which can cause random soft memory errors to appear. (A soft error is a data error that is not caused by a defective chip.)

On a server using ECC memory, the server automatically corrects a single-bit soft error without any user intervention. A soft error involving two or more memory bits triggers an error message. Servers using advanced ECC (chipkill) can correct up to four bit errors in the same memory module. However, a few low-end servers don't use ECC memory. On those servers, any type of memory error can cause the system to lock up and require a restart, and unsaved data is lost.

It is usually safer to stick with the recommended or default refresh timing. Because refreshing consumes less than 1% of a modern system's overall bandwidth, altering the refresh rate has little effect on performance. It is almost always best to use default or automatic settings for any memory timings in the BIOS Setup. Most servers don't allow changes to memory timings and are permanently set to automatic settings. On an automatic setting, the motherboard reads the timing parameters out of the serial presence detect (SPD) ROM found on the memory module and sets the cycling speeds to match. Even if you're accustomed to altering memory settings on desktop PCs to boost performance, such changes often require a lot of experimentation to find a balance between performance and stability. With a server, you should always opt for stability over a relatively minute gain in performance.
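The single-bit correction described above can be illustrated with a toy Hamming(7,4) code. This is only a sketch of the principle: real server ECC uses wider SECDED codes over 64-bit words (and chipkill spreads correction across chips), and the function names here are invented for the example.

```python
# Toy Hamming(7,4) code: 4 data bits protected by 3 parity bits, enough
# to locate and correct any single flipped bit. Not a real ECC DIMM code.

def hamming_encode(nibble):
    """Encode 4 data bits (list of 0/1) into a 7-bit codeword."""
    d1, d2, d3, d4 = nibble
    p1 = d1 ^ d2 ^ d4          # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4          # covers codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4          # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7

def hamming_correct(word):
    """Return (corrected data bits, error position or 0 if none)."""
    w = word[:]
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 + 2 * s2 + 4 * s3       # 0 means no error detected
    if syndrome:
        w[syndrome - 1] ^= 1              # flip the bad bit back
    return [w[2], w[4], w[5], w[6]], syndrome

data = [1, 0, 1, 1]
codeword = hamming_encode(data)
codeword[4] ^= 1                          # simulate a soft error in bit 5
decoded, errpos = hamming_correct(codeword)
print(decoded == data, errpos)            # the single-bit error is corrected
```

In this toy code a double-bit error would produce a misleading syndrome; full SECDED codes add an overall parity bit so that a two-bit error is detected (triggering the error message described above) even though it cannot be corrected.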

The transistor for each DRAM bit cell reads the charge state of the adjacent capacitor. If the capacitor is charged, the cell is read to contain a 1; no charge indicates a 0. The charge in the tiny capacitors is constantly draining, which is why the memory must be refreshed constantly. Even a momentary power interruption, or anything that interferes with the refresh cycles, can cause a DRAM memory cell to lose the charge and therefore the data. If this happens in a running system, it can lead to blue screens, global protection faults, corrupted files, and any number of system crashes. Therefore, you should use battery backup systems (UPS devices) and high-quality surge suppressors on your servers.

DRAM is used in desktop and server systems because it is inexpensive, and the chips can be densely packed, so a lot of memory capacity can fit in a small space. Unfortunately, DRAM is also slow, typically much slower than the processor. For this reason, many types of DRAM architectures have been developed to improve performance. These architectures are covered later in this chapter.

SRAM: Cache Memory

Another distinctly different type of memory exists that is significantly faster than most types of DRAM. SRAM is so named because it does not need the periodic refreshes that DRAM requires. Because of how SRAM is designed, not only is refreshing unnecessary, but SRAM is much faster than DRAM and much more capable of keeping pace with modern processors.

SRAM is available in access times of 2ns (nanoseconds) or less, so it can keep pace with processors running at 500MHz or faster. This is because of the SRAM design, which calls for a cluster of six transistors for each bit of storage. The use of transistors but no capacitors means that refreshing is unnecessary because there are no capacitors to lose their charges over time. As long as there is power, SRAM remembers what is stored. Unfortunately, SRAM is too expensive and takes up too much die space to use as main memory.

However, SRAM is a perfect choice for memory caching. Cache memory runs at speeds close to or equal to the processor speed and is the memory the processor usually directly reads from and writes to. During read operations, the high-speed cache is filled in advance with data from the lower-speed main memory (DRAM). Cache memory is built in to all modern server processors, starting with the Pentium and Pentium Pro.

Cache effectiveness is expressed as a hit ratio, which is the ratio of cache hits to total memory accesses. A hit occurs when the data the processor needs has been preloaded into the cache from the main memory, meaning that the processor can read it from the cache. A cache miss occurs when the cache controller does not anticipate the need for a specific address and the desired data is not preloaded into the cache. In the case of a miss, the processor must retrieve the data from the slower main memory instead of the faster cache. Anytime the processor reads data from main memory, the processor must wait longer because the main memory cycles at a much slower rate than the processor.

If a processor with integral on-die cache is running at 3400MHz (3.4GHz), both the processor and the integral cache would be cycling at 0.29ns, while the main memory would most likely be cycling 8.5 times more slowly, at 2.5ns (200MHz DDR). Therefore, the memory would be running at only a 400MHz equivalent rate. So, every time the 3.4GHz processor read from main memory, it would effectively slow down 8.5-fold to only 400MHz! The slowdown is accomplished by having the processor execute wait states, which are cycles in which nothing is done; the processor essentially cools its heels while waiting for the slower main memory to return the desired data. Obviously, you don't want your processors slowing down, so cache function and design become more important as system speeds increase.
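The cost of misses can be seen with a simple weighted-average model. This is a minimal sketch using the cycle times from the example above; the hit ratios chosen are arbitrary assumptions.

```python
# Average memory access time as a weighted mix of cache hits and misses.
# Cycle times follow the text's example: a 3.4GHz core (~0.29ns) and
# 200MHz DDR main memory (~2.5ns). Hit ratios are assumed for the sketch.

CACHE_NS = 0.29    # on-die cache cycles at the core clock
MEMORY_NS = 2.5    # main-memory cycle time

def avg_access_ns(hit_ratio):
    """Effective access time: hits served by cache, misses by main memory."""
    return hit_ratio * CACHE_NS + (1.0 - hit_ratio) * MEMORY_NS

for hr in (0.80, 0.95, 0.99):
    print(f"hit ratio {hr:.0%}: {avg_access_ns(hr):.3f}ns average access")
```

The model makes the point plainly: even a few percent of misses drags the effective access time far from the cache's speed, which is why hit ratio matters more than raw cache speed alone.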

To minimize the processor being forced to read data from the slow main memory, two or three levels of cache usually exist in a modern system, called Level 1 (L1), Level 2 (L2), and Level 3 (L3). The L1 cache is also called the integral, or internal, cache because it has always been built directly in to the processor as part of the processor die (the raw chip). Therefore, the L1 cache always runs at the full speed of the processor core and is the fastest cache in any system.

L1 cache has been a part of all processors since the Intel 486. To improve performance, later processor designs from Intel (starting with the Pentium Pro of 1995) and AMD (starting with the K6-III of 1999) included the L2 cache as a part of the processor package or die (earlier systems used the L2 cache on the motherboard). Although the Pentium Pro included L2 cache in the processor package, running at full speed, this was a very expensive processor to build, and Intel switched over to slot-based designs for the Pentium II, Pentium II Xeon, and early versions of the Pentium III and Pentium III Xeon, a design also used by the original AMD Athlon. These processors placed L2 cache in separate chips from the processor core, and the L2 cache ran at half the speed (or sometimes a bit less) of the processor core.

However, by late 1999, with the introduction of the Pentium III Coppermine and Pentium III Xeon (Advanced Transfer Cache) processors, all of Intel's subsequent processors for servers as well as desktops have placed full-speed L2 cache in the processor core. Likewise, AMD's Socket A Athlon (first introduced in 2000) and Athlon MP server processors led AMD's return to on-die full-speed L2 cache. Today, all server (as well as desktop) processors use on-die L2 cache. In chips with on-die L2, the cache runs at the full core speed of the processor and is much more efficient than older designs that placed L2 cache outside the processor core. For details, see Chapter 2, "Server Microprocessors."

On-die L3 cache has been present in high-end workstation and server processors such as the Xeon and Itanium families since 2001. Having more levels of cache helps mitigate the speed differential between the fast processor core and the relatively slow motherboard and main memory. L2 and L3 cache is faster and is accessed much more quickly than main memory. Thus, virtually all motherboards designed for processors with built-in cache don't have any cache on the board; the entire cache is contained in the processor or processor module instead. The key to understanding both cache and main memory is to see where they fit in the overall system architecture.

Chapter 4, "Server Motherboards and BIOS," provides diagrams showing recent systems with different types of cache memory. Table 5.1 illustrates the need for and function of cache memory in modern systems.

Table 5.1. The Relationship Between L1 (Internal) Cache, L2 (External) Cache, and Main Memory in Modern Servers

CPU Type                     CPU Speed  L1 Cache Speed    L1 Size  L2 Type  CPU/L2 Ratio  L2 Cache Speed     L2 Size    CPU Bus  Memory Bus
Pentium                      233MHz     4.3ns (233MHz)    16KB     Onboard  N/A           15ns (66MHz)       Varies[1]  66MHz    60ns (16MHz)
Pentium Pro                  200MHz     5.0ns (200MHz)    32KB     On-chip  1/1           5ns (200MHz)       256KB[2]   66MHz    60ns (16MHz)
Pentium II, Pentium II Xeon  450MHz     2.2ns (450MHz)    32KB     On-chip  1/2           4.4ns (225MHz)     512KB      100MHz   10ns (100MHz)
Pentium III Xeon             933MHz     1.07ns (933MHz)   32KB     Onboard  1/2           2.14ns (466.5MHz)  256KB[3]   133MHz   7.5ns (133MHz)
Athlon MP 2200+              1.8GHz     0.56ns (1.8GHz)   128KB    On-die   1/1           0.56ns (1.8GHz)    256KB      266MHz   3.8ns (266MHz)
Opteron 250                  2.4GHz     0.42ns (2.4GHz)   128KB    On-die   1/1           0.42ns (2.4GHz)    1MB[4]     333MHz   3.0ns (333MHz)
Pentium 4 560                3.6GHz     0.278ns (3.6GHz)  16KB     On-die   1/1           0.278ns (3.6GHz)   1MB[5]     800MHz   1.25ns (800MHz)
Xeon                         3.8GHz     0.263ns (3.8GHz)  16KB     On-die   1/1           0.263ns (3.8GHz)   2MB[6]     800MHz   1.25ns (800MHz)


[1] The L2 cache is on the motherboard, and its size depends on which board is chosen and how much memory is installed.

[2] The Pentium Pro was also available with 512KB and 1024KB L2 cache.

[3] The Pentium III Xeon was also available with 512KB, 1024KB, and 2048KB L2 cache.

[4] Dual-core versions of the Opteron have 1024KB L2 cache per core.

[5] The Pentium 4 is also available with 256KB and 512KB L2 cache.

[6] The Xeon is also available with 512KB and 1024KB L2 cache. Some models also include L3 cache.
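The paired ns/MHz figures throughout Table 5.1 are simply reciprocals of each other (cycle time = 1/frequency). These hypothetical helper functions make the conversion explicit:

```python
# Cycle time and clock frequency are reciprocals: t(ns) = 1000 / f(MHz).
# Helpers to cross-check entries like "2.2ns (450MHz)" in Table 5.1.

def mhz_to_ns(mhz):
    """Clock frequency in MHz -> cycle time in nanoseconds."""
    return 1000.0 / mhz

def ns_to_mhz(ns):
    """Cycle time in nanoseconds -> clock frequency in MHz."""
    return 1000.0 / ns

print(f"{mhz_to_ns(450):.1f}ns")   # the 450MHz Pentium II L1 entry: 2.2ns
print(f"{ns_to_mhz(15):.0f}MHz")   # ~67MHz; the table rounds the Pentium
                                   # L2 entry to 66MHz
```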

Starting with the Pentium Pro and Pentium II, the processor sets the amount of main memory that can be cached (that is, the cacheability limit). The Pentium Pro and some of the earlier Pentium IIs can address up to 64GB but only cache up to 512MB. The later Pentium IIs and all Pentium III and Pentium 4 processors can cache up to 4GB. All the server-oriented Xeon processors can cache up to 64GB. On current servers, all installed memory is cacheable.




Upgrading and Repairing Servers
ISBN: 078972815X
Year: 2006
Pages: 240