Capacity Management


When filing systems fill storage address spaces past a certain point, system performance can become unacceptably slow. In addition, old, unused data lengthens time-critical processes such as backup, virus scanning, and other system management tasks. Capacity management is the process of relocating files from primary storage to second-tier storage, where they do not interfere with production operations.

One of the questions that eventually comes up is "Should capacity management be called storage management or data management?" Although it is often referred to as storage management, it is actually a filing-level process. The point is to return empty blocks to the file system's free space pool.

NOTE

A storage address space is a range of contiguous blocks formed by virtualization and volume management. It is not always obvious, but a storage address space is already full of nonsensical bits when it is created. Writing data into an address space does not fill the address space; it creates a layout reference system within its boundaries. Over time, the layout reference system becomes increasingly convoluted as the free space pool becomes smaller and more fragmented. The best way to untangle a layout reference system is to remove files and return blocks to the free space pool. A defragmentation process finishes the job by restructuring the file layout within the address space.


The most common form of capacity management is simply to buy larger storage products and migrate data from smaller-capacity devices to them. This can become difficult in a data center environment if a lot of equipment has to be moved and storage components, rack spaces, and cabling all have to be juggled. Virtualization systems provide an excellent technology solution for these problems, as discussed in Chapter 12, "Storage Virtualization: The Power in Volume Management Software and SAN Virtualization Systems."

But purchasing more storage comes at a price. It is certainly possible to spend less money on storage equipment by managing the capacity of file systems more effectively. Three capacity management methods are discussed in the following sections:

  • Storage Resource Management

  • Hierarchical Storage Management and archiving

  • Tiered Storage

Storage Resource Management

Storage resource management (SRM) is an application that maps file systems to LUNs and monitors file system capacity trends. The goal is to predict capacity problems proactively, giving administrators an opportunity to act before a storage crisis occurs. SRM products normally record a variety of statistics about the file system, including the characteristics of the files within it. These can include the percentage of files over or under a certain size and the percentage older or newer than certain dates. In some cases it might be possible to identify the capacity used by particular applications and produce trend analyses for them.
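To make the idea of capacity trend monitoring concrete, the following is a minimal sketch of one way a tool might project when a file system will fill up. It assumes a simple least-squares straight-line fit over periodic usage samples; the function name `days_until_full` and the sample format are hypothetical and are not taken from any particular SRM product.

```python
from datetime import date


def days_until_full(samples, capacity_bytes):
    """Estimate days remaining until a file system reaches capacity.

    samples: list of (date, used_bytes) tuples in chronological order
             (a hypothetical format chosen for this sketch).
    Returns None when usage is flat or shrinking, or when there is
    too little data to fit a trend.
    """
    first_day = samples[0][0]
    xs = [(day - first_day).days for day, _ in samples]
    ys = [used for _, used in samples]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Least-squares slope: growth in bytes per day.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        return None
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    if slope <= 0:
        return None

    intercept = mean_y - slope * mean_x
    day_full = (capacity_bytes - intercept) / slope
    # Measure from the most recent sample, never returning a negative value.
    return max(0.0, day_full - xs[-1])


samples = [
    (date(2006, 1, 1), 500),
    (date(2006, 1, 11), 600),
    (date(2006, 1, 21), 700),
]
remaining = days_until_full(samples, capacity_bytes=1000)
```

A straight-line fit is only one possible model. Real usage often grows in bursts, so a projection like this is better treated as an early warning than as a precise forecast.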

Administrators establish the thresholds and set the policies that an SRM system uses. For instance, a policy could be set that would identify all files that have not been accessed in the last 90 days and that are larger than 25 MB. The policies used for individual servers can vary according to the characteristics of the applications they support. For example, a multimedia server would have different policies than a database system. Some SRM applications provide automation tie-ins to other system tools that could perform maintenance actions. Alternatively, SRM systems may be able to create scripts that an administrator can edit and run.
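The example policy above (files larger than 25 MB that have not been accessed in 90 days) can be sketched as a simple directory scan. This is an illustration only, not how any specific SRM product works; the function name `find_candidates` and its parameters are hypothetical. It also assumes the file system records access times reliably, which is not the case on volumes mounted with options such as `noatime`.

```python
import os
import time


def find_candidates(root, min_size_bytes=25 * 1024 * 1024,
                    min_idle_days=90, now=None):
    """Return (path, size, last_access) for files matching the policy.

    A file matches when it is larger than min_size_bytes AND its last
    access time is more than min_idle_days in the past.
    """
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86400  # seconds per day
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_size > min_size_bytes and info.st_atime < cutoff:
                candidates.append((path, info.st_size, info.st_atime))
    return candidates
```

Because the thresholds are parameters, a multimedia server and a database server could call the same function with different values, which mirrors the per-server policies described above.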

Hierarchical Storage Management

Hierarchical storage management (HSM) is an automated system that moves files from primary to secondary storage but continues to provide (more or less) transparent access to data as if it were still stored on primary storage. Most HSM systems use policies similar to those just described for SRM systems. Files are identified by the HSM system as candidates for migrating to secondary storage based on their size, lack of activity, or both. A capacity monitor periodically queries the file system to determine its filled capacity. When the capacity exceeds the "high water mark," the files that were identified as candidates for migration are copied to secondary storage. As each file is copied, it is replaced by a "stub file" that has the same name as the original file but occupies a minimal amount of storage capacity. Migration continues this way until the capacity level drops below the "low water mark," at which point it stops.

Figure 17-3 illustrates three steps in a basic HSM migration process. In Step 1, capacity levels in a storage address space exceed the high water mark. This starts Step 2, which migrates files to secondary storage, in this case, tape. Migration stops in Step 3 when the capacity level drops below the assigned low water mark.

Figure 17-3. An HSM System Migrates Data from Primary Storage to Secondary Storage
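The high and low water mark cycle described above can be sketched as a small in-memory simulation. Everything here is an assumption made for illustration: the `FileEntry` type, the `migrate` function, the one-byte `STUB_SIZE`, and the use of plain dictionaries to stand in for the primary file system and secondary storage. A real HSM system works against actual file systems and storage devices.

```python
from dataclasses import dataclass

STUB_SIZE = 1  # hypothetical size of a stub file, in capacity units


@dataclass
class FileEntry:
    name: str
    size: int
    is_stub: bool = False


def migrate(files, candidates, capacity, high_mark, low_mark, secondary):
    """Migrate candidate files once usage exceeds the high water mark.

    files:      dict of name -> FileEntry on primary storage
    candidates: names already chosen by policy, in migration order
    secondary:  dict standing in for secondary storage (name -> size)
    Returns the list of names migrated in this pass.
    """
    def used_fraction():
        return sum(entry.size for entry in files.values()) / capacity

    # Step 1: nothing happens until the high water mark is exceeded.
    if used_fraction() <= high_mark:
        return []

    migrated = []
    for name in candidates:
        # Step 3: stop once usage has dropped below the low water mark.
        if used_fraction() < low_mark:
            break
        entry = files[name]
        if entry.is_stub:
            continue
        # Step 2: copy to secondary storage, then leave a stub behind.
        secondary[name] = entry.size
        files[name] = FileEntry(name, STUB_SIZE, is_stub=True)
        migrated.append(name)
    return migrated


files = {n: FileEntry(n, s) for n, s in (("a", 40), ("b", 30), ("c", 20))}
tape = {}
moved = migrate(files, ["a", "b", "c"], capacity=100,
                high_mark=0.8, low_mark=0.6, secondary=tape)
```

In this run, usage starts at 90 percent of capacity. Migrating the first candidate brings it to 51 percent, which is below the 60 percent low water mark, so the remaining candidates stay on primary storage.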


HSM depends on a process running in kernel space that intercepts each file-open request to check whether the target is a stub file. If it is, the HSM system suspends the file-open task and reads the contents of the stub file, which holds the information the HSM system needs to demigrate the file. Demigration involves locating the file in secondary storage and restoring it to the file system. This process necessarily involves creating new layout reference data for the file. The speed of the demigration process depends a great deal on the type of secondary storage that holds the migrated files. For example, disk storage is quite a bit faster than tape storage. After the file has been completely restored, the file-open request is returned to the file system, which services it as it normally would.
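The open-time interception described above can be illustrated with a toy user-space model. A real HSM system does this in kernel space; the `HsmFileSystem` class, the JSON stub format, and the `tier2/` location scheme below are all hypothetical and exist only to show the control flow.

```python
import json


class HsmFileSystem:
    """Toy model of stub-file interception and demigration."""

    def __init__(self):
        self.primary = {}    # name -> bytes (file data or stub record)
        self.stubs = set()   # names currently held as stub files
        self.secondary = {}  # location -> bytes on secondary storage

    def migrate(self, name):
        # Copy the data out, then replace it with a small stub that
        # records where the data went (hypothetical stub format).
        location = f"tier2/{name}"
        self.secondary[location] = self.primary[name]
        self.primary[name] = json.dumps({"location": location}).encode()
        self.stubs.add(name)

    def open(self, name):
        # Intercept the open: if the target is a stub, restore the
        # file first, then let the normal read proceed.
        if name in self.stubs:
            self._demigrate(name)
        return self.primary[name]

    def _demigrate(self, name):
        stub = json.loads(self.primary[name].decode())
        # Assumption: the secondary copy is kept after restore. The
        # source text does not say whether it is removed.
        self.primary[name] = self.secondary[stub["location"]]
        self.stubs.discard(name)


fs = HsmFileSystem()
fs.primary["report.dat"] = b"payload"
fs.migrate("report.dat")
data = fs.open("report.dat")  # triggers demigration transparently
```

The caller of `open` never sees the stub, which is the "more or less transparent" access the text describes. What the model hides is the delay: with tape as secondary storage, the restore step could take far longer than a normal open.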



Storage Networking Fundamentals: An Introduction to Storage Devices, Subsystems, Applications, Management, and File Systems (Vol 1)
ISBN: 1587051621
EAN: 2147483647
Year: 2006
Pages: 184
Authors: Marc Farley
