NAS Workload Characterization

When characterizing workloads for Network Attached Storage, the first step is to keep in mind the inherent value that NAS brings to application storage environments. As described in Chapters 3 and 12, these include the following:

  • Compatibility with existing network resources

  • Quick deployment with bundled packaging

  • Support for multiple file protocols

  • Cost-effectiveness

These characteristics appear to reinforce NAS as a general-purpose storage solution; however, they also place some parameters around what NAS can do as well as what it can't do. Some of this was also covered in Chapters 3 and 12, which touched on processing environments with the following workload characteristics:

  • An OLTP configuration supporting write-intensive applications

  • A high-traffic transactional-intensive environment with high I/O content

  • Messaging environments that use embedded databases

  • Relational database systems supporting distributed environments

As we review each workload characteristic and environment, both pro and con, we'll begin to narrow our scope of supported applications.

What NAS Does Well

NAS provides one of the best solutions for departmental network storage requirements. In many cases, it can extend beyond these applications and move into an enterprise orientation where user access heads into the thousands and data size moves into the terabyte range. We examine each of these NAS characteristics next.

Attachment to Existing Network Resources

As discussed previously, this is NAS's capability to operate within an existing Ethernet LAN environment. This allows any server existing on the LAN to connect and utilize the NAS file server functions. It also allows clients who use network files to connect and utilize the file server functions as a general-purpose file server.

Deduction: A good file server at a greatly reduced cost.

Quick Deployment as a Bundled Solution

Quick deployment as a bundled solution allows NAS to be installed swiftly on a network, where it's defined as an addressable entity and made available for immediate use. In many cases, this can be done within a matter of minutes given the right network and network functions.

Deduction: A quick fix for storage capacity, once again at a greatly reduced cost.

Support for Multiple File Protocols

NAS supports most common remote file protocols, allowing many applications to exist transparently while application servers redirect I/O requests to a NAS device. More importantly, this allows both clients and servers to move storage requirements and I/O requests to an external device.

Deduction: A good device for separating storage requirements from application requirements, and if it supports HTTP, it can be used as a cost-effective storage device for Web Services.

Cost-Effectiveness

Although cost-effectiveness has come up with each of the previous attributes, it can be further articulated in terms of the NAS bundled solution, which provides both hardware (processors, network attachments, and storage devices) and software (the OS, file system, and so on), with several alternatives for RAID protection.

What NAS Can't Do

It is also important to keep in mind what NAS can't do. Trying to get NAS to support things it can't undermines its value as a cost-effective solution and potentially renders it obsolete. Here are some things to balance against what NAS does well.

OLTP in a Write-Intensive Application

Remember two things. First, NAS is bundled with a file system, whereas OLTP almost always requires an RDBMS, which restricts NAS from fully participating in this application. The RDBMS engine cannot run on the NAS device because the device runs an RTOS and is closed to other applications. Second, the file system will hamper performance of tables that are stored on the device, and support from database vendors for using NAS in this fashion is problematic.

Deduction: Be cautious when considering NAS storage for high-end OLTP applications supporting multiple remote end users. These are applications using traditional relational database solutions.

Transactional-Intensive Applications with High I/O Content

NAS devices, although offering multiple network access points and multiple SCSI paths to data, remain limited in their scalability to meet high-transactional traffic.

Deduction: You probably don't want to use NAS for your corporate data warehouse application, or any application that requires sophisticated RDBMS partitioning, updates, and complex data access.

Messaging-Intensive Applications Using an Embedded Database

The same can be said for messaging environments like e-mail. Here, NAS cannot sustain large user communities that combine high user traffic and high I/O rates with additional deferred I/O operations that may require the remote execution of transactions.

Deduction: Be cautious when considering NAS for your e-mail operations. Although this is a matter of scalability, most e-mail systems continue to have a monolithic file architecture that makes remote partitioning of the data difficult, and scaling problematic.

High-Performance Relational Database Systems

The relational model has yet to fit well with NAS systems given the optimization at the file-system level and the singular nature of the storage array options. Relational databases at the high end of the product spectrum do not work well with NAS storage. Although there is some value to using NAS to store database tables in elementary RDBMS installations, enhanced performance options such as the parallel processing of queries, sophisticated table partitioning, and global table locking mechanisms remain beyond the capabilities of NAS.

Deduction: You should not consider NAS for applications whose database systems may extend their requirements into advanced RDBMS functionality. Although scalability is a factor, NAS storage servers do not have the processing services necessary to handle advanced database functions.

In conclusion, there are a number of applications to which NAS will provide significant value. These applications are not necessarily the most highly political or visible, but they make up much of IT's expenditures. Thus, our findings for NAS include the following:

  • The NAS device can be used as a storage solution for Web Services, because it does speak HTTP, along with other remote dialects such as NFS, CIFS, and future file transfer technologies such as Direct Access File System.

  • NAS provides a quick solution for file servers whose storage capacity is constantly being upgraded within specific network domains.

  • Given the flexibility in attachment to an existing Ethernet LAN, NAS can be utilized as a remote storage device while controlling it through the network and vice versa.

  • NAS provides an excellent cost-effective storage device for archival- and HSM-type activities.

These conclusions establish disembarkation points for departmental solutions, data center-specific solutions, and application-specific solutions. Derivations of any one of these solutions will likely come from one of these three configurations. A brief overview follows:

  • Appliance/Departmental Configuration: This NAS deployment provides a storage-centric solution to departmental file services. Figure 19-1 shows a simple NAS configuration supporting a network domain of two servers and multiple client devices that rely on NAS storage for networked file services, shared files such as application code libraries, and the download of business info (for example, results of large data warehouse operations).

    Figure 19-1: An appliance/departmental NAS configuration

  • Enterprise/Data Center NAS Configuration: This NAS configuration provides a performance-oriented system that supports multiple web servers for both Internet and intranet storage requirements. Figure 19-2 illustrates multiple NAS storage devices supporting the Web Services servers, which connect clients as they transfer data within their own network segment using high-speed gigabit Ethernet transports.

    Figure 19-2: An enterprise NAS configuration

  • Specialized Application Support NAS Configuration: This configuration supports the large storage requirements for special applications that store unstructured data in existing file formats, archived HSM data used for near real-time access, or image/video data used in streaming applications. (See Figure 19-3.)

    Figure 19-3: A specialized application support NAS configuration

The following are some important considerations as you begin to evaluate your NAS design and implementation. Keep in mind that although NAS is easy to implement, and in some cases can be installed as a quick fix to a critical capacity problem, a level of analysis may be helpful to avert potential problems.

  • Identify and describe the I/O workloads of the applications you expect NAS to support.

  • Understand the strengths and weaknesses of each of the major networking configurations.

Assuming you have completed a general workload analysis and now have a total I/O workload transfer rate (see Chapter 17 for guidelines on estimating I/O workloads), you can estimate the number of ports required to support the I/O workloads. The sidebar "Guidelines for Estimating NAS Requirements" describes a set of activities that can be used to provide NAS implementation estimates given the size and scope of a potential NAS solution. These should be viewed as general guidelines and used as a worksheet when analyzing your potential NAS requirements. As always in capacity planning exercises, many data center specifics should be considered in context. Your results may vary according to those specific requirements.

Consider the following when contemplating specific NAS requirements:

  • The I/O workload transfer rate should be available and accurate.

  • The base configuration will use a LAN.

start sidebar
Guidelines for Estimating NAS Requirements
  • (Total I/O Workload Transfer Rate (I/OWTR) / Max Network Performance [1]) / 2 = Minimum number of Network Paths (NetPaths) required

  • Total I/O Workload Transfer Rate (I/OWTR) / Max NAS Channel Performance [2] = Minimum number of data paths (DataPaths) required

  • Add NetPaths to DataPaths = Total logical paths (Lpaths)

  • Lpaths × Redundancy/Recovery Factor (RRF) [3] = Number of paths for Redundancy and Recovery (RRPs)

  • Lpaths + RRPs = Minimum logical paths (Lpaths) required

  • Calculate the ratio of Lpaths to DataPaths: Lpaths / DataPaths = Size Factor (sf %)

  • Compare sf % factor to NAS sizing table

NAS Sizing Factor Table

NAS Device Type    Data Capacity Requirement              Sizing Factor
Appliance          Aggregate data is < 500GB              0 to 30%
Mid-Range NAS      Aggregate data is > 500GB and < 1TB    30% to 60%
Enterprise NAS     Aggregate data is > 500GB and < 5TB    > 60%

end sidebar
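The path-count steps of the sidebar can be sketched as a short worksheet function. This is a minimal illustration under stated assumptions, not a definitive sizing tool: the names estimate_paths and PathEstimate are hypothetical, all rates are assumed to be in MB/sec, fractional results are rounded up to whole paths, and the Redundancy/Recovery Factor is assumed to be applied as a percentage of Lpaths.

```python
import math
from dataclasses import dataclass


@dataclass
class PathEstimate:
    net_paths: int    # minimum network paths (IP ports)
    data_paths: int   # minimum data paths (SCSI channels)
    lpaths: int       # NetPaths + DataPaths
    rrps: int         # extra paths for redundancy and recovery
    min_lpaths: int   # Lpaths + RRPs


def estimate_paths(io_wtr, max_net, max_channel, rrf_percent):
    """Hypothetical worksheet helper.

    io_wtr, max_net, and max_channel are in MB/sec; rrf_percent is a
    whole-number percentage (for example, 20 for a file server).
    Rounding up to whole paths is an assumption, not from the sidebar.
    """
    net_paths = math.ceil(io_wtr / max_net / 2)
    data_paths = math.ceil(io_wtr / max_channel)
    lpaths = net_paths + data_paths
    # Integer ceiling of lpaths * rrf_percent / 100 (assumed multiplication).
    rrps = (lpaths * rrf_percent + 99) // 100
    return PathEstimate(net_paths, data_paths, lpaths, rrps, lpaths + rrps)


# Example inputs are illustrative only: 200 MB/sec workload, gigabit
# Ethernet (125 MB/sec), 40 MB/sec SCSI channels, 20 percent RRF.
print(estimate_paths(200, 125, 40, 20))
```

With these illustrative inputs the sketch yields one network path, five data paths, and two redundancy paths, for a minimum of eight logical paths.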
 

Conclusion: Use the guidelines to calculate estimates for data paths (number of SCSI channels), network paths (number of IP ports), and the NAS sizing factor. This will build in additional resources for redundancy and recovery operations/device transfers. Use the NAS sizing factor table to find the NAS device most appropriate for your workloads by comparing your total data storage requirements to the sizing factor.

For example, if your total data required for NAS workloads is less than 500GB and your calculated sizing factor is under 30 percent, your best solution will be an appliance-level NAS device. Size the appliance-level device with estimates that provide the minimal number of data paths and network paths as estimated in your calculations.
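The table comparison in this example can be sketched as a simple lookup. This is only an illustration: the function name pick_nas_device is hypothetical, 1TB is taken as 1,000GB, the handling of exact boundary values is an assumption, and the enterprise band is assumed to mean a sizing factor above 60 percent.

```python
# Rows: (device type, min GB inclusive, max GB exclusive,
#        min sizing factor % inclusive, max sizing factor % exclusive)
SIZING_TABLE = [
    ("Appliance", 0, 500, 0, 30),
    ("Mid-Range NAS", 500, 1000, 30, 60),
    ("Enterprise NAS", 500, 5000, 60, float("inf")),  # assumed "> 60%"
]


def pick_nas_device(data_gb, sf_percent):
    """Return the matching device type, or None if no row fits.

    Hypothetical helper; boundary handling is an assumption.
    """
    for name, gb_lo, gb_hi, sf_lo, sf_hi in SIZING_TABLE:
        if gb_lo <= data_gb < gb_hi and sf_lo <= sf_percent < sf_hi:
            return name
    return None


print(pick_nas_device(400, 25))   # under 500GB, under 30 percent
print(pick_nas_device(6000, 70))  # no row fits
```

A result of None corresponds to estimates that do not fit the table, which point toward a more scalable solution.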

Estimates that do not fit the table or cannot be sized with sufficient network or data paths are probably better suited to a more scalable solution such as a SAN, a high-end, direct-attached solution using SMP, a clustered system, or a combination of any of the three.

[1] The max network performance is based upon the type of Ethernet network segment you install on the NAS device (ranging from low-speed 10Base-T to high-speed 10Gb segments). Bits must be converted to bytes to perform the calculation.

[2] Max NAS Channel Performance is the burst-rate bandwidth specification, in MB/sec, of the SCSI channels selected. This ranges from 10MB/sec up to 50-60MB/sec, depending on the SCSI type selected.

[3] The RRF is made up of an average of 40 percent for transactional workloads, 20 percent for file servers, and 10 percent for archival and special applications.
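The inputs described in these notes can be prepared with a couple of small helpers. This is a sketch under assumptions: the names ethernet_mb_per_sec and RRF_PERCENT are hypothetical, and the conversion uses the raw line rate (8 bits per byte) without allowing for protocol overhead.

```python
def ethernet_mb_per_sec(link_mbit_per_sec):
    """Convert an Ethernet line rate in Mb/sec to MB/sec.

    Uses 8 bits per byte and ignores protocol overhead (an assumption).
    """
    return link_mbit_per_sec / 8


# Average Redundancy/Recovery Factors from note [3], as whole percentages.
RRF_PERCENT = {
    "transactional": 40,
    "file_server": 20,
    "archival_or_special": 10,
}

print(ethernet_mb_per_sec(1000))   # gigabit Ethernet line rate in MB/sec
print(RRF_PERCENT["file_server"])
```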



Storage Networks
Storage Networks: The Complete Reference
ISBN: 0072224762
Year: 2003
Pages: 192
