Requirements for Storage I/O

Storage networks have unique requirements for performance, reliability, and integrity that differentiate them from other types of networks. This section analyzes the requirements for each of these three categories.

Performance Requirements

There are two meaningful performance areas in storage networking: bandwidth and latency.

Bandwidth

Bandwidth is the total amount of data traffic that a network can accommodate. It is typically measured in megabits per second or megabytes per second.

Bandwidth measurements are meaningful for host bus adapters (HBAs), network links, switching hardware, and storage controllers. In general, 100 Mbps and 1 Gbps are commonly used with network-attached storage (NAS) products. 1 Gbps, 2 Gbps, and 4 Gbps are typical for Fibre Channel storage area network (SAN) products. Both SAN and NAS are likely to adopt 10 Gbps technology within a year of each other.
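As a rough rule of thumb, 1/2/4 Gbps Fibre Channel delivers roughly 100/200/400 MB/s of payload, because its 8b/10b line encoding carries one data byte in every ten bits on the wire. A minimal sketch of that conversion (the function name is illustrative):

```python
# Rough usable throughput for 1G/2G/4G Fibre Channel links.
# 8b/10b encoding means 10 line bits carry 1 data byte.
def fc_throughput_mbps(line_rate_gbps):
    """Approximate Fibre Channel payload rate in megabytes per second."""
    bits_per_byte_on_wire = 10  # 8b/10b encoding overhead
    return line_rate_gbps * 1_000_000_000 / bits_per_byte_on_wire / 1_000_000

for rate in (1, 2, 4):
    print(f"{rate} Gbps FC ~= {fc_throughput_mbps(rate):.0f} MB/s usable")
```

The same arithmetic explains why a "1 Gbps" storage link is usually quoted as a 100 MB/s link.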

A choke point in a network is a hardware or software function that constrains performance. The following four components in a storage network are potential choke points:

  • HBAs

  • Network links

  • Switching hardware

  • Storage controllers

If bandwidth is inadequate through any of these potential choke points, the storage network will not be able to provide the expected results. The choke points for a storage network are shown in Figure 3-1.

Figure 3-1. Bandwidth Choke Points in a Storage Network


Of the potential choke points shown in Figure 3-1, the two that are most likely to be problematic are (1) links between switches and storage and (2) storage controllers. The reason for this is the high degree of session multiplexing that occurs. This topic is addressed in more detail in Chapter 5, "Storage Subsystems."
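The choke-point idea can be sketched numerically: a path's effective bandwidth is simply that of its slowest component. The throughput figures below are hypothetical, assuming a 2 Gbps fabric whose storage controller is shared by many multiplexed sessions:

```python
# A path's effective bandwidth is limited by its slowest component
# (the choke point). Figures in MB/s are hypothetical.
path = {
    "HBA": 200,
    "host-switch link": 200,
    "switch": 400,
    "switch-storage link": 200,
    "storage controller": 120,  # shared by many sessions, hence often the choke point
}

choke_point = min(path, key=path.get)
print(f"choke point: {choke_point} at {path[choke_point]} MB/s")
```

In this sketch the controller caps the whole path at 120 MB/s, no matter how fast the links and switches are.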

Latency

Latency is the transmission delay that occurs in sending, receiving, or forwarding a network transmission. A good analogy for network latency is a water pipeline with a storage tank in the middle, as shown in Figure 3-2. The volume moved through the pipeline per second is equivalent to bandwidth, while the time the water is held in the storage tank is equivalent to latency. The larger the water tank, the greater the latency for water flowing through it.

Figure 3-2. Latency in a Pipeline


Sometimes networking professionals have difficulty appreciating latency issues in storage networks because latency is seldom a critical issue for data networks. Rest assured, the situation is completely different for storage networks, especially SANs. Storage I/O transmission latency can have a direct impact on system performance due to the need to ensure data integrity, as discussed in the next section.

Latency requirements depend on the applications being used. Transaction processing applications have the most demanding latency requirements, while other applications, such as e-mail, are relatively unaffected by it.

Fibre Channel SAN switches are typically designed to have 1 to 3 microseconds of latency. By contrast, many data networking switches do not even list latency measurements in their specifications because latency has not been important to that market. Latencies for Ethernet switches are typically in the hundreds of microseconds.
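The effect of per-hop latency is easiest to see for a synchronous I/O stream, where each operation must complete before the next is issued. The numbers below are hypothetical: a cached write serviced in 100 microseconds at the array, with the request and response each crossing one switch (two hops total):

```python
# Sketch: why per-hop switch latency matters for synchronous storage I/O.
# All figures are hypothetical illustrations, not vendor measurements.
def max_iops(service_us, per_hop_us, hops=2):
    """IOPS ceiling for a single outstanding synchronous I/O stream."""
    round_trip_us = service_us + hops * per_hop_us
    return 1_000_000 / round_trip_us

print(f"3 us switch per hop:   {max_iops(100, 3):.0f} IOPS")
print(f"300 us switch per hop: {max_iops(100, 300):.0f} IOPS")
```

With microsecond-class switches the fabric adds almost nothing to the round trip; with hundreds of microseconds per hop, the network itself becomes the dominant delay for cached I/O.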

NOTE

Some (still) believe Gigabit Ethernet will not effectively support storage I/O because Ethernet latencies are too high. In truth, strict latency requirements apply to only a small percentage of applications. The problem is that latency-sensitive applications tend to be the most important applications running in the data center, and they do not tolerate slack performance.

Obviously, Gigabit Ethernet switches can be built with lower latencies (and higher costs) if needed, but they can also be used (latency warts and all) to support the enormous volume of applications that do not have much sensitivity to latency.


Reliability Requirements

Network storage does not tolerate poor reliability, such as intermittent network connections. While most data network operators try to minimize network problems to maintain a smoothly operating environment, storage network operators cannot afford anything but the highest degree of reliability.

The sensitivity to reliability is rooted in the close coupling between storage I/O and system performance. Because storage I/O is a primary system function that is almost always in use during system activity, there is very little tolerance for delays or retransmissions. Without a high level of connection reliability, system performance and reliability would be completely unpredictable. Automated processing would be anything but automatic.

Reliability in storage connections is also inherent in the nature of storage writes. Data that is being written by an application probably does not exist anywhere else, and there may be no other opportunities to store it properly. Re-creating it may not be possible, depending on the application.

From a historical perspective, DAS storage was designed with the assumption that bus connections were always available and latency was never an issue. Storage network technology does not necessarily improve connection reliability over storage buses, but it certainly can be equivalent, and it extends to much greater distances. In addition, using multiple connections over a storage network improves the overall reliability and availability of the system.
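A back-of-the-envelope calculation shows the reliability gain from multiple connections. The 99.9% per-path availability is a hypothetical figure, and the model assumes path failures are independent, which shared components can violate:

```python
# Availability of a redundant storage path set, assuming independent
# path failures (an idealization -- shared components break this).
def availability(per_path, paths):
    """Probability that at least one of `paths` identical paths is up."""
    return 1 - (1 - per_path) ** paths

print(f"single path: {availability(0.999, 1):.6f}")
print(f"dual path:   {availability(0.999, 2):.6f}")
```

Under these assumptions, adding a second independent path takes availability from three nines to six, which is why multipath designs are a staple of storage networks.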

Network storage allows the concept of service level agreements (SLAs) to work with storage. The reliability of DAS storage usually depends on the reliability of the products forming the bus and attached to it, such as the HBA and the devices or subsystems connected to it. With storage networks, reliability is also an element of the network design, a dimension that never existed with DAS storage. SLAs can be established for storage networks that define multiple tiers of network reliability for different types of applications.

NOTE

The requirements for NAS storage are more varied than for SANs. Realistically, low-end NAS products used as departmental file servers work well enough with average LAN connections. However, high-end NAS products that support mission-critical applications should run only over high-speed LANs with optimal reliability.


Integrity Requirements: Write Ordering and Data Consistency

One of the most challenging aspects of storage networks is the requirement for data integrity. This does not mean that storage networks need to have better transmission integrity for their payloads than data networks. Instead, data integrity in storage networks refers to the sequence in which data is written to storage media compared to the way writes were issued by the system. Data must be written to storage media in a way that does not distort an application's intended order of write I/Os. In other words, write ordering cannot be violated.
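This is the same ordering requirement applications enforce on a single host by forcing the first write all the way to media before issuing a dependent write. A minimal sketch, assuming a POSIX-style system; the function and file names are illustrative:

```python
# Sketch of application-level write ordering: fsync the first write to
# media before the dependent write is issued, so the dependent data can
# never reach storage ahead of the data it depends on.
import os

def ordered_write(path_a, data_a, path_b, data_b):
    with open(path_a, "wb") as f:
        f.write(data_a)
        f.flush()
        os.fsync(f.fileno())  # data_a is on media before we continue
    with open(path_b, "wb") as f:
        f.write(data_b)       # dependent data is written only afterward
        f.flush()
        os.fsync(f.fileno())
```

A storage network must preserve the same guarantee end to end: completions it reports as done must actually be durable in the order the application required.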

The term data consistency is used to refer to the complete and proper writing of storage data to storage media. In a nutshell, data consistency is a concept where the intended results of all data processing operations are reflected correctly in stored data. Under normal I/O operations, this is never a problem. However, data consistency is an issue when disasters occur and when creating point-in-time copies of data (see Chapter 17, "Data Management") or when making remote copies of data (see Chapter 10, "Redundancy Over Distance with Remote Copy").

To illustrate the concept of data consistency, consider a hypothetical situation where an application writes data in sequence to two different storage locations and their respective system cache memory locations, as pictured in Figure 3-3. After the first write is made to system cache (1), the system attempts to write the data to storage (2), but something goes wrong and the write does not complete. Shortly thereafter, the application reads the first piece of data from cache memory (3), processes it, and writes new, dependent data to its cache memory location (4). Shortly thereafter, the system writes the dependent data successfully to storage (5). At this point the data is inconsistent because the dependent data is stored on media, but the original write that was used to create the dependent data is not on media. The data is inconsistent until the original data is written to media (6).

Figure 3-3. Dependent Data Written to Media Out of Order
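The sequence in Figure 3-3 can be replayed as a toy simulation; the cache and media dictionaries and the value names are illustrative only:

```python
# Toy replay of Figure 3-3: a dependent write reaches media before the
# original write it depends on, leaving the stored data inconsistent.
cache, media = {}, {}

cache["A"] = "original"               # (1) first write lands in system cache
                                      # (2) destage of A to storage fails
value = cache["A"]                    # (3) application reads A from cache
cache["B"] = "derived from " + value  # (4) dependent write goes to cache
media["B"] = cache["B"]               # (5) B destages to media successfully

consistent = "A" in media or "B" not in media
print("data on media consistent?", consistent)  # False: B stored without A

media["A"] = cache["A"]               # (6) consistency restored once A lands
```

Until step (6) completes, a crash or a point-in-time copy taken from the media would capture dependent data whose origin does not exist on media, which is exactly the inconsistency the text describes.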




Storage Networking Fundamentals: An Introduction to Storage Devices, Subsystems, Applications, Management, and File Systems (Vol 1)
ISBN: 1587051621
Year: 2006
Pages: 184
Authors: Marc Farley