Block and File


At first glance, the division of networked storage platforms into block-optimized and file-optimized types might appear to have been driven by real architectural differentiators. At the protocol level, for example, NAS relies on network file system protocols, while FC fabrics use an encapsulation of SCSI across a serial interconnect. It remains to be demonstrated, however, whether these technical differences justify the segregation of file and block storage and the platforms that support them.

Traditionally, all storage was simply block storage. That is to say that, for most of the computing era, we used internal disk drives and disk arrays to store all data, whether block or file. Disk drives provided a common location for writing both the raw block output of databases and collections of blocks described with "metadata": what we refer to as files.

As two-tier client/server computing gave rise to three-tier architectures consisting of client, application server host, and database server, server administrators deliberately delegated to some disk platforms the task of file storage, and to others the task of database, or block-level, storage.

Over time, IT decision-makers came to prefer using "fast" storage platforms, delivering what vendors described as high-performance I/O, to accommodate the block storage requirements of high-volume, transaction-oriented applications. On the other hand, less expensive platforms offering commensurately lower I/O performance were often deployed to handle file system storage.

There was little to distinguish these platforms from each other save for their read/write performance (a function of controller design, buffer memory, and type and size of disk drive) and, of course, their price tags. Truth be told, any data could be written to any platform: It was up to the IT decision-maker to select the product that he or she believed best met the needs of the application, and did so at an acceptable price.

With the advent of NAS and SAN, this differentiation moved from the realm of preference to the realm of design. NAS was designed to store files. It leverages standards-based protocols such as the Network File System (NFS), Common Internet File System (CIFS), or the Hypertext Transfer Protocol (HTTP) to enable the extension of file system-based storage across multiple storage platforms.

Initially, NAS provided "elbow room" for file storage: an efficient one-stop-shop alternative to configuring a general-purpose server with internal disk or external arrays and configuring it for use across a network. In short order, NAS came into vogue as a platform for email hosting, partly because of its simpler deployment and lower cost, but also because of innovations such as "snapshots": a technique for backing up inode [1] pointers that facilitated point-in-time recovery of data. Many large email service providers, both within large enterprises and on the web, claimed that, given the propensity of early email server software to fail at random, point-in-time recovery was a key feature of NAS.
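To make the mechanism concrete, the short Python sketch below models a snapshot as nothing more than a saved copy of a file system's pointer (inode) table; the class and file names are invented for illustration and do not reflect any NAS vendor's actual implementation. Because old data blocks are left in place when new blocks are written, restoring a snapshot simply reinstates the earlier pointers, which is what makes point-in-time recovery fast.

import copy

class ToyFileSystem:
    """Illustrative only: snapshots copy pointers (the inode table), not data."""

    def __init__(self):
        self.blocks = {}       # block_id -> data actually stored on disk
        self.inodes = {}       # file name -> list of block_ids (the pointers)
        self.snapshots = {}    # label -> frozen copy of the inode table
        self._next_block = 0

    def write(self, name, data):
        # New writes go to fresh blocks; older blocks stay put, so earlier
        # snapshots can keep pointing at them.
        block_id = self._next_block
        self._next_block += 1
        self.blocks[block_id] = data
        self.inodes[name] = [block_id]

    def read(self, name):
        return "".join(self.blocks[b] for b in self.inodes[name])

    def snapshot(self, label):
        # Capturing a point-in-time image is only a copy of the pointer table.
        self.snapshots[label] = copy.deepcopy(self.inodes)

    def restore(self, label):
        # Point-in-time recovery: reinstate the saved pointers.
        self.inodes = copy.deepcopy(self.snapshots[label])

fs = ToyFileSystem()
fs.write("mailbox.db", "Monday's mail store")
fs.snapshot("before-crash")
fs.write("mailbox.db", "garbage left by a failed email server")
fs.restore("before-crash")
assert fs.read("mailbox.db") == "Monday's mail store"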

NAS, however, did not remain the pure file system play that it was touted to be by its advocates. In 1999, Microsoft announced that its Exchange Server 2000 would no longer support NFS- or CIFS-based access alone. To address failure rates and to support new software architecture, Microsoft needed to establish, in addition to a network file system connection, a block-level connection to the storage device.

So entrenched was the view of "NAS-as-file-server," and so embedded was the file server concept in product architecture, that even "brand-name" NAS vendors trembled when Redmond announced its new email server (and underlying SQL database server) and excluded virtually all NAS products from the list of supported storage platforms. In the end, they had no alternative but to comply with Microsoft's change of direction and add the necessary block channel support.

So, to enable its use with SQL Server, NAS became, like all other storage platforms, both a block- and file-system-oriented storage device: a so-called "hybrid." This evolution is discussed in greater detail below.

SANs, in contrast to NAS, first appeared as a solution for off-LAN tape backup. According to some observers, upwards of 75 percent of early SANs were deployed to establish back-end connections to a shared tape library. Not only did such a strategy provide shared access to expensive tape libraries, but it also attacked the problem of shrinking backup windows that had long plagued LAN-based tape backups in open systems environments.

The vendor community, however, saw another "killer application" to drive SANs: namely, SANs were touted to provide a nondisruptive storage scaling solution for block data storage associated with very large databases. At first, this was the meaning assigned to the expression "storage utility." Ultimately, however, SANs proved to be more like arrays of arrays, with SAN zoning substituted for "old-fashioned" array partitioning.

Can you store files on a SAN? Of course: Files are collections of blocks, after all. But, to use a SAN (or any other platform) for file storage, some means must be provided to manage and control file access by applications and end-users. As of this writing, storing files to a SAN (or any other platform in which storage capacity is shared among multiple hosts with multiple operating systems) requires the file management services of application hosts. And, because of differences in the semantics of different file systems used by different operating systems, files must be segregated on shared storage platforms so that the files used by one OS file system are kept separate from those used by another. This has traditionally been accomplished on shared direct-attached arrays through the use of partitions. With SANs, the same logical partitioning of storage can be accomplished using zones.
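The following Python sketch illustrates the parallel between partitions and zones; the zone, host, and LUN names are invented, and real fabrics enforce zoning in switches and host bus adapters rather than in application code. The point is only that a zone pairs a set of hosts with a set of LUNs, so a Linux host, for example, never addresses the LUNs holding an NTFS file system.

# Each zone pairs a set of hosts with a set of LUNs; a host may only address
# blocks on LUNs visible within one of its zones. All names are illustrative.
zones = {
    "ntfs_zone": {"hosts": {"win-mail-01", "win-sql-02"}, "luns": {"lun0", "lun1"}},
    "ext3_zone": {"hosts": {"lnx-web-01"}, "luns": {"lun2"}},
}

def visible_luns(host):
    """Return every LUN the fabric lets this host see."""
    allowed = set()
    for zone in zones.values():
        if host in zone["hosts"]:
            allowed |= zone["luns"]
    return allowed

def check_io(host, lun):
    """Refuse I/O to any LUN outside the host's zones."""
    if lun not in visible_luns(host):
        raise PermissionError(f"{host} is not zoned to {lun}")

check_io("win-mail-01", "lun0")        # allowed: host and LUN share a zone
try:
    check_io("lnx-web-01", "lun1")     # blocked: keeps the Linux file system
except PermissionError as err:         # away from the NTFS-formatted LUNs
    print(err)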

As of this writing, the development of SAN file systems is ongoing at IBM and elsewhere. In keeping with the FC SAN tradition, many approaches support only one flavor of server operating system or a single vendor's storage platform. For now, it is less expensive and less difficult to use FC SANs for OS-neutral block storage than for file system storage. Storage architects and consultants generally prefer NAS to SAN when it comes to file system storage.

Interestingly, however, developments at Microsoft and elsewhere may change this position within the next two or three years. Microsoft's next operating system, code-named "Longhorn," substitutes an SQL database and binary objects for a traditional file system. Such an approach, long advocated by database software vendors such as Oracle and IBM, has the potential to reunify "bifurcated" storage by returning to the concept of all storage as blocks. Adoption of the "new" approach and its impact on the file systems of UNIX and Linux operating systems remains to be seen.

Another interesting development is Houston, Texas-based NuView's StorageX offering. StorageX provides a means to aggregate file systems into a common or global "namespace." A global namespace was originally conceived as a logical layer that sits between clients and file systems for purposes of aggregating multiple, heterogeneous file systems and presenting file information to users and applications in a single, logical view, with obvious benefits for users and administrators. Users (and applications) are shielded from physical storage complexities, while administrators can add, move, rebalance, and reconfigure physical storage without affecting how users view and access it.

With a global namespace in place, the administrator can perform data management and data movement tasks in less time, without disrupting user access to files. When files are moved, links in the namespace are automatically updated, which reduces manual administration and ensures continuous client access to data. Figures 5-5 and 5-6 depict the "before" and "after" images of a representative implementation of global namespace technology.
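The Python sketch below captures the essential idea under stated assumptions (the folder, server, and export names are invented, and this is not StorageX's actual interface): clients address files by a single logical path, a mapping layer resolves that path to whichever physical location currently holds the data, and migrating the data only changes the link.

class GlobalNamespace:
    """Illustrative only: a mapping layer between logical paths and shares."""

    def __init__(self):
        # Logical folders -> physical locations. Server and export names
        # here are invented; they stand in for NAS heads, file servers, etc.
        self.links = {
            "/corp/engineering": "nas-01:/exports/eng",
            "/corp/finance": "filer-07:/exports/fin",
        }

    def resolve(self, logical_path):
        """Translate a client's logical path to wherever the data lives now."""
        for folder, share in self.links.items():
            if logical_path.startswith(folder):
                return share + logical_path[len(folder):]
        raise FileNotFoundError(logical_path)

    def migrate(self, folder, new_share):
        # Moving data to a new platform only updates the link; clients keep
        # using the same logical path afterward.
        self.links[folder] = new_share

ns = GlobalNamespace()
print(ns.resolve("/corp/finance/2003/q1.xls"))    # filer-07:/exports/fin/2003/q1.xls
ns.migrate("/corp/finance", "nas-02:/exports/fin")
print(ns.resolve("/corp/finance/2003/q1.xls"))    # nas-02:/exports/fin/2003/q1.xls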

Figure 5-5. Before the implementation of a StorageX Global Namespace: hurdles for file (data) management and file (data) movement increase as storage platforms increase. (Source: "Global Namespace: The Future of Distributed File Server Management," a white paper from NuView, Inc., Houston, TX, www.nuview.com. Reprinted by permission.)


Figure 5-6. After implementation of a StorageX Global Namespace: a single point of access and management. (Source: "Global Namespace: The Future of Distributed File Server Management," a white paper from NuView, Inc., Houston, TX, www.nuview.com. Reprinted by permission.)


The potential of the technology, however, goes beyond simple file system pointer aggregation. It might well become a platform for delivering value-added functions that directly support data migration, server consolidation, and disaster recovery. By tightly synchronizing these services between servers and storage devices with the metadata server, NuView's architecture is non-disruptive and preserves existing server OS and storage controller investments. Ultimately, the vision of the designers seems to be the creation of a file broker service, similar to a directory server in a distributed client/server application, that enables storage to be scaled horizontally (i.e., by deploying many storage devices) just as efficiently as, or even more efficiently than, it can be scaled vertically (i.e., by creating bigger volumes).

In the StorageX approach, the choice of deploying files to SAN, NAS, or DAS is determined simply by the security, accessibility, and performance requirements of the data itself, as well as by platform cost parameters. As depicted in Figure 5-7, the physical hosting of files is already irrelevant from the perspective of their inclusion in a global namespace.

Figure 5-7. An inclusive and non-disruptive approach: a global namespace brokers requests for files to back-end storage, regardless of topology. (Source: "Global Namespace: The Future of Distributed File Server Management," a white paper from NuView, Inc., Houston, TX, www.nuview.com. Reprinted by permission.)



