THE ROLE OF BACK-UP IN DATA RECOVERY
There are many factors that affect back-up. For example:
Storage costs are decreasing.
Systems have to be on-line continuously.
The role of back-up has changed.
Storage Costs Are Decreasing
The cost per MB of primary (on-line) storage has fallen dramatically over the last several years and continues to do so as disk drive technologies advance. This has a huge impact on back-up. As users become accustomed to having immediate access to more and more information on-line, the time required to restore data from secondary media becomes unacceptable.
Systems Have to Be On-line Continuously
Seven/twenty-four (7 x 24) operations have become the norm in many of today’s businesses. The amount of data that has to be kept on-line and available (operationally ready data) is very large and constantly increasing. Higher and higher levels of fault tolerance for the primary data repository are a growing requirement. Because systems must be continuously on-line, the dilemma is that you can no longer take files off-line long enough to perform back-up.
The Role of Back-up Has Changed
It’s no longer just about restoring data. Operationally ready or mirrored data does not guard against data corruption and user error. The role of back-up is now taking on the responsibility for recovering from user errors and ensuring that good data has been saved and can quickly be restored.
CONVENTIONAL TAPE BACK-UP IN TODAY’S MARKET
Current solutions offered by storage vendors and by back-up vendors focus on network back-up. To effectively accomplish back-up in today’s environment, tape management software is generally bundled with several other components to provide a total back-up solution. A typical tape management system consists of a dedicated workstation, with the front-end interfaced to the network and the back-end controlling a repository of tape devices. This workstation, acting as the media server, runs the tape management software. It can administer back-up devices across an enterprise and can run continuous parallel back-ups and restores.
An alternative to tape back-up is to physically replicate or mirror all data and keep two copies on-line at all times. Because the cost of primary storage is falling, this is not as cost-prohibitive as it once was. The advantage is that the data does not have to be restored, so there are no issues with immediate data availability. There are, however, several drawbacks to all the back-up and data availability solutions on the market today.
ISSUES WITH TODAY’S BACK-UP
Network back-up creates network performance problems. Using the production network to carry back-up data as well as normal user data access can severely overburden today’s busy network resources. Installing a separate network exclusively for back-ups can minimize this problem, but even dedicated back-up networks may become performance bottlenecks.
Off-line back-up affects data accessibility. Host processors must be quiescent during the back-up. Back-up is not host-independent, nor is it nondisruptive to normal data access. Therefore, the time that the host is off-line for data back-up must be minimized. This requires extremely high-speed, continuous parallel back-up of the raw image of the data. Even in doing this, you have only deferred the real problem, which is the time to restore the information. Restoration of data needs to occur at the file level, not the full raw image, so that the most critical information can be brought back into operation first.
Live back-ups allow data access during the back-up process, but affect performance. Many database vendors offer live back-up features. The downside to the live back-up is that it puts a tremendous burden on the host. Redirection lives on the host, and journaling has to occur on the host. This requires local storage and host CPU cycles, and it introduces host operating system dependencies. Up to 50% of all host CPU cycles may be consumed during the back-up process, severely impacting performance.
Mirroring doesn’t protect against user error and replication of bad data. Fully replicated on-line data sounds great, albeit at twice the cost per megabyte of a single copy of on-line data. But synchronizing, breaking, and resynchronizing mirrors is not a trivial process and influences data access speeds while these activities are occurring. Also, duplicating data after a user has deleted a critical file or making a mirrored copy of a file that has been corrupted by a host process doesn’t help. Mirroring has its place in back-up/recovery, but cannot solve the problem by itself.
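The point about mirrors replicating mistakes can be illustrated with a toy model. This is only a sketch under stated assumptions: Python dictionaries stand in for volumes, and the names (primary, mirror, backup, write, delete) are hypothetical and do not correspond to any vendor's product.

```python
# Hypothetical toy model: dictionaries stand in for storage volumes.
primary = {"orders.db": "v1", "payroll.db": "v1"}
mirror = dict(primary)   # synchronized mirror of the primary
backup = dict(primary)   # point-in-time back-up taken before the errors


def write(name, data):
    """Apply a write to the primary; the mirror replicates it immediately."""
    primary[name] = data
    mirror[name] = data


def delete(name):
    """Apply a delete to the primary; the mirror faithfully repeats it."""
    primary.pop(name, None)
    mirror.pop(name, None)


delete("payroll.db")            # user error: a critical file is deleted
write("orders.db", "garbage")   # corruption written by a host process

# The mirror now holds the same damage as the primary.
recoverable_from_mirror = mirror.get("payroll.db")
# The earlier back-up still holds the good data.
recoverable_from_backup = backup.get("payroll.db")
```

In this model the mirror offers nothing to recover from, while the earlier back-up still holds both files in their good state, which is why the two techniques complement each other.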
NEW ARCHITECTURES AND TECHNIQUES ARE REQUIRED
Back-up at extremely high speed, with host-processor independence of the underlying file structures supporting the data, is required. Recovery must be available at the file level. The time that systems are off-line for back-up must be eliminated.
Mirroring, or live data replication for hot recovery, also has a role. For data that must be always available, highly fault-tolerant primary storage is not enough, nor is a time-consuming back-up/restore. Remote hot recovery sites are needed for immediate resumption of data access. Back-up of critical data is still required to guard against data errors and user errors. Back-up and mirroring are complementary, not competing, technologies.
To achieve effective back-up and recovery, the decoupling of data from its storage space is needed. Just as programs must be decoupled from the memory in which they’re executed, the stored information itself must be made independent of the storage area it occupies.
It is necessary to develop techniques to journal modified pages, so that journaling can be invoked within the primary storage device, without host intervention. Two separate pipes for file access must be created: one pipe active and the other dynamic. The primary storage device must employ memory-mapping techniques that enable the tracking of all modified information. Two copies of each change must be kept, with a thread composed of all old data stored in the journaled file.
Part of the primary storage area must be set aside for data to be backed-up. This area must be as large as the largest back-up block (file, logical volume, etc.). The point-in-time snapshot of changed data will be used for back-up, while the file itself remains in normal operation without impacting user access to data. To minimize this reserve storage area for back-ups, the storage device must support the reuse of this area by dynamically remapping.
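The journaling and point-in-time snapshot ideas described above can be sketched as a copy-on-write page store. This is a minimal illustration, not a description of any real device: the class SnapshotVolume and its methods are hypothetical, pages are plain strings, and the journal dictionary plays the part of the reserve area.

```python
class SnapshotVolume:
    """Toy page store with a copy-on-write journal (all names hypothetical)."""

    def __init__(self, pages):
        self.pages = list(pages)     # live data, always open to writes
        self.journal = {}            # page number -> old contents (reserve area)
        self.snapshot_active = False

    def begin_snapshot(self):
        """Mark the point in time that the back-up will capture."""
        self.journal.clear()
        self.snapshot_active = True

    def write(self, page_no, data):
        """Write to the live page, journaling the old data on first change."""
        if self.snapshot_active and page_no not in self.journal:
            self.journal[page_no] = self.pages[page_no]
        self.pages[page_no] = data

    def read_snapshot(self, page_no):
        """Return the page as it was when the snapshot began."""
        return self.journal.get(page_no, self.pages[page_no])

    def end_snapshot(self):
        """Release the reserve area so it can be reused by the next back-up."""
        self.journal.clear()
        self.snapshot_active = False


vol = SnapshotVolume(["a0", "b0", "c0"])
vol.begin_snapshot()
vol.write(1, "b1")   # users keep writing while the back-up runs
vol.write(1, "b2")   # only the first change to a page is journaled
backup_image = [vol.read_snapshot(i) for i in range(3)]
live_image = list(vol.pages)
reserve_used = len(vol.journal)
vol.end_snapshot()
```

The back-up reads a consistent image from the moment the snapshot began, while live writes proceed untouched. Only pages changed during the back-up occupy reserve space, and that space is freed when the snapshot ends.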
Mechanisms must be put in place to allow for the back-up of data to occur directly from the primary storage area to the back-up area without host intervention. Host CPU bottlenecks and network bottlenecks are then eliminated. The net result will be faster user response times during live back-up, normal network performance levels throughout the process, and no back-up downtime.
What about restore times? Fast, nonrandom restoration of critical data assumes that the user can select at the file level exactly which information comes back on-line first. Here again, the primary storage and its back-up software must offload that burden from the host(s) and take on the responsibility for understanding the underlying file structures of multiple heterogeneous hosts. The necessary indexing of file structures can be done in the background subsequent to a high-speed back-up. Then, at the time of restore, the indices are accessible to allow selection at the file level for the recovery of the information from the back-up device.
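File-level selection from a raw back-up image can be sketched as follows. The sketch assumes that file boundaries inside the image are already known. In practice, discovering them is the hard part, since it requires understanding each host's file structures. The names IndexEntry, build_index, and restore are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    path: str
    offset: int   # position of the file inside the raw back-up image
    length: int   # size of the file in bytes


def build_index(files):
    """Index an ordered list of (path, data) pairs, as a background pass might."""
    index, offset = {}, 0
    for path, data in files:
        index[path] = IndexEntry(path, offset, len(data))
        offset += len(data)
    return index


def restore(image, index, wanted):
    """Restore only the requested files, in the order the user asked for them."""
    restored = {}
    for path in wanted:
        entry = index[path]
        restored[path] = image[entry.offset:entry.offset + entry.length]
    return restored


files = [
    ("logs/old.txt", b"xxxx"),
    ("db/orders.dat", b"ORDERS"),
    ("home/readme", b"hi"),
]
image = b"".join(data for _, data in files)   # high-speed raw image back-up
index = build_index(files)                    # built afterwards, in the background
restored = restore(image, index, ["db/orders.dat"])   # critical file comes back first
```

Because the index records where each file sits in the image, the most critical file can be pulled back without reading the rest of the image first.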
How achievable is this scenario? Many back-up tools are available today. What has been missing is an architecture that can support journaling within the primary storage area, to enable direct, live back-ups with high-speed file-level restores. A few storage vendors, mostly in the mainframe arena, are providing some of these types of solutions. Now vendors such as Storage Computer, with Virtual Storage Architecture, provide an underlying structure to enable these back-up features in open systems environments. Thanks to this kind of progress on the part of storage vendors and their back-up partners, techniques to address continuous operations and fast data recovery in today’s 7 x 24 business environment are now becoming both more cost-effective and more widely available.