Saving Restore Time


Some DDT solutions, like electronic vaulting, seek to save time not only in data backup operations but also in data restore operations. Generally speaking, the tape backup software industry has, for a variety of reasons, always been more interested in talking about backup than about restore.

One reason is the focus of consumers on shrinking windows for backups, which has led vendors almost universally to emphasize performance in backup operations. While common sense dictates that backups are not performed for their own sake, but to provide insurance against unforeseen events that compromise the integrity of the primary copy of data, rarely has restore speed been a gating factor in the consumer mind. When it has been a serious consideration, vendors have rushed in to sell fail-over mirroring rather than tape.

Data restoral can be expedited in two main ways. The first is to stream (or multistream) data directly from tape to the primary storage hardware from which the backup was taken, a sort of server-less backup in reverse.

Such a strategy has merit, provided that the target platform to which data is being restored is configured identically (or nearly so) to the platform from which data was originally copied. This may not always be the case, especially if data is stored across many physical devices comprising a logical volume in a SAN. Ideally, such a strategy must also avoid passing data through a RAID controller or virtualization software engine that could introduce latency as it determines where to write each block of data being received. Such "bare metal" writes are available with some high-end RAID arrays today, but are not in widespread use.

The second way to speed data restore is to enable restore software itself to adjust dynamically to the environment in which it operates. An early leader in this space was BakBone Software, which uniquely touted its restore speed (about 80 percent of backup speeds in restore operations) as part of its earliest marketing campaign in the United States.

Most other products realize anywhere from 20 percent to 70 percent of backup speed on restore operations. Vendors with low restore speeds almost universally blame operating systems, file systems, virtualization engines, and RAID controllers for the difference between backup and restore timeframes. These all play a role in determining how quickly data can be written to target devices. However, the performance of leading applications in the face of similar impediments points to the obvious differentiator: Some products are simply better engineered to adapt more quickly and restore faster than others.
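The practical effect of these ratios on recovery time is simple arithmetic, and a short sketch may make it concrete. The data volume and backup throughput below are hypothetical figures chosen for illustration, not measurements of any product; only the 80, 70, and 20 percent ratios come from the discussion above.

```python
def restore_hours(data_gb: float, backup_gb_per_hour: float, restore_ratio: float) -> float:
    """Estimate restore duration when restore runs at a fraction of backup speed."""
    if not 0 < restore_ratio <= 1:
        raise ValueError("restore_ratio must be greater than 0 and at most 1")
    # Effective restore throughput is the backup throughput scaled by the ratio.
    return data_gb / (backup_gb_per_hour * restore_ratio)


# Hypothetical workload: 1,000 GB backed up at 250 GB per hour (a four-hour backup).
DATA_GB = 1000
BACKUP_GB_PER_HOUR = 250

for ratio in (0.80, 0.70, 0.20):
    hours = restore_hours(DATA_GB, BACKUP_GB_PER_HOUR, ratio)
    print(f"restore at {ratio:.0%} of backup speed: {hours:.1f} hours")
```

Under these assumed figures, a backup that completes in four hours would take about five hours to restore at 80 percent of backup speed and about twenty hours at 20 percent, which is why restore speed deserves as much scrutiny as the backup window.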

A third set of alternatives for rapid data restore approached the problem from a variety of unique perspectives. Some vendors, for example, offered hardware-based multitargeting solutions based on switches, multiported controllers, and multiported host bus adapters, intended to expedite data recovery from tape or disk platforms via multiple parallel data channels. This is roughly the same approach advanced by SAN virtualization vendors, who seek to use the capabilities of SAN switching, combined with their virtualization and data replication software, to expedite recovery through virtual volume fail-over within local or remote SANs. This is what DataCore Software illustrates in Figure 9-16.

Figure 9-16. Virtualization-based data replication in a Fibre Channel fabric using DataCore Software SANSymphony. (Source: DataCore Software Corporation, Corporate Park, 6300 NW 5th Way, Ft. Lauderdale, FL 33309, www.datacore.com.)


Another approach offered by Tacit Networks and an increasing number of NAS platform providers is to implement a network of disk caching appliances throughout an IP network. These networked caches provide another data redundancy approach that can heal breakdowns either in the storage infrastructure or in the networks that provide access to it.

In the case of Tacit's offering, caching appliances use the vendor's own protocol, called Storage Caching/Internet Protocol (SC/IP), which translates standard network file system protocols such as NFS and CIFS into WAN-optimized protocols, to facilitate file cache coherency across wide geographical areas (see Figure 9-17). The approach differs from NAS appliances using remote snapshots and mirrors in that "Tacit's caching architecture solves the WAN latency issue." According to the vendor, optimizing copy operations via SC/IP obviates the need for expensive bandwidth connections.

Figure 9-17. Tacit Networks network-based file caching architecture. (Source: Tacit Networks, 4041M Hadley Road, South Plainfield, NJ 07080, www.tacitnetworks.com.)


Network-based file caching moves the enhanced backup solution set nearer to the traditional multi-hop mirroring end of the spectrum in Figure 9-11. Another set of solutions in that domain can be referred to collectively as "Way Back Machines." More than a homage to the "Bullwinkle" television cartoon series I enjoyed in my youth, this reference is to a specific concept in mirroring: the journaling of block-level changes to storage.

The following series of illustrations from Revivio, a leader in this particular technology, may help to articulate the concept. [9]

As of this writing, Revivio is bringing its technology to market as a fully fault-tolerant, block-level appliance that allows instant access to data as it existed at any point in time. Since the technology operates at the block level, it is able to protect applications that work on file systems, SQL or nonstandard databases, or even raw disk partitions.

In normal operation, depicted in Figure 9-18, the appliance presents itself as a standard set of disk drives (LUNs) called TimeSafe volumes. The platform imposes no performance or reliability hit on the live storage or applications using it.

Figure 9-18. Revivio technology as write-only mirror during normal operation. (Source: Revivio, Inc., Lexington, MA, www.revivio.com.)


If a data corruption event occurs at, say, 3:55 P.M., the administrator can access the Revivio appliance and instruct it to "reflect" a TimeImage from a point in time before the corruption event. This point in time can be chosen arbitrarily after the event and does not require preplanning (as is the case with mirror-splits).

This reflected TimeImage is instantly presented as another complete set of volumes (disks) that contain the data from an earlier point in time (see Figure 9-19). These disks can be mounted on another host and validated for correctness.

Figure 9-19. Presentation of TimeImage prior to corruption event. (Source: Revivio, Inc., Lexington, MA, www.revivio.com.)


If the chosen data set is defective (that is, the missing or corrupt data occurred prior to the chosen TimeImage target time), the appliance can be instructed to provide another point-in-time image, which is also presented instantly.
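The general idea of journaling block-level changes so that any earlier point in time can be presented on request can be sketched in a few lines. This is a toy model written for illustration and is not Revivio's implementation; the class name, method names, and the use of minutes since midnight as timestamps are all assumptions introduced here.

```python
from bisect import bisect_right
from collections import defaultdict


class BlockJournal:
    """Toy journal that keeps every block write together with its timestamp."""

    def __init__(self):
        self._times = defaultdict(list)  # block number -> ascending write times
        self._data = defaultdict(list)   # block number -> data written at those times

    def record_write(self, timestamp, block, data):
        # The sketch assumes writes to a given block arrive in time order.
        times = self._times[block]
        if times and timestamp < times[-1]:
            raise ValueError("writes must be journaled in time order")
        times.append(timestamp)
        self._data[block].append(data)

    def image_at(self, timestamp):
        """Return {block: data} as the volume looked at the given time."""
        image = {}
        for block, times in self._times.items():
            # Number of journaled writes to this block at or before the timestamp.
            count = bisect_right(times, timestamp)
            if count:
                image[block] = self._data[block][count - 1]
        return image


# Hypothetical timeline; timestamps are minutes since midnight.
journal = BlockJournal()
journal.record_write(15 * 60, 0, b"good-0")        # 3:00 P.M.
journal.record_write(15 * 60 + 30, 1, b"good-1")   # 3:30 P.M.
journal.record_write(15 * 60 + 55, 0, b"CORRUPT")  # corruption event at 3:55 P.M.

before = journal.image_at(15 * 60 + 54)  # image from one minute before the event
after = journal.image_at(15 * 60 + 55)   # image that includes the corrupt write
print(before)
print(after)
```

Because nothing is overwritten in the journal, the point in time is chosen after the fact. If the first image turns out to contain the bad data, asking for an earlier timestamp is just another lookup rather than another copy operation.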

Database experts understand that the act of recovering a database changes the underlying storage. This is why existing snapshot and mirror-split solutions require that data first be replicated in order to preserve the contents of the source snapshot or mirror-split. TimeImages are fully read/write capable, and changes made to a TimeImage do not change the primary storage. Revivio's inclusion of read/write and nondestructive testing capabilities dramatically shortens recovery time.
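One common way to make a point-in-time image writable without disturbing its source is to keep test writes in a separate overlay. The sketch below shows that general technique under stated assumptions; it is not a description of how Revivio's appliance works internally, and the class name, method names, and block contents are hypothetical.

```python
from types import MappingProxyType


class ScratchImage:
    """Read/write view layered over a point-in-time block map."""

    def __init__(self, point_in_time_blocks):
        # Read-only proxy: this class cannot modify the underlying block map.
        self._base = MappingProxyType(point_in_time_blocks)
        self._changes = {}  # blocks rewritten during validation or recovery tests

    def read(self, block):
        # Prefer a block changed during testing; otherwise fall back to the image.
        if block in self._changes:
            return self._changes[block]
        return self._base.get(block)

    def write(self, block, data):
        # Writes land in the overlay only.
        self._changes[block] = data

    def discard(self):
        """Throw away test changes and return to the original image."""
        self._changes.clear()


# Hypothetical block contents for a point-in-time image.
point_in_time = {0: b"ledger-v1", 1: b"index-v1"}
image = ScratchImage(point_in_time)

# A database recovery run rewrites a block while validating the image.
image.write(0, b"ledger-recovered")
print(image.read(0), image.read(1))
print(point_in_time)
```

The design choice here is that a recovery test can modify the image freely, and if the test fails the changes are simply discarded. No preliminary replication of the image is needed to protect the source, which is the time savings the paragraph above describes.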

Once the image is verified, the appliance can instantly restore the validated point-in-time data onto the TimeSafe volumes, as shown in Figure 9-20. Again, this process happens without moving or copying data.

Figure 9-20. Rapid recovery with Revivio. (Source: Revivio, Inc., Lexington, MA, www.revivio.com.)


At this point, the TimeSafe volumes can stand in temporarily for the primary dataset. Database validation and application restart can immediately commence.

A background process resynchronizes the data on the TimeSafe volumes and the primary storage. Once this synchronization is complete, the host directs all read requests to the primary storage array, and the Revivio appliance returns to its normal status.

Revivio is yet another example of an innovative appliance-based solution for data protection that approaches the problems from the perspective of restore, rather than backup. Products of the "Way Back Machine" variety may find considerable interest in IT shops where mirror-splitting is currently used. Revivio, for example, boasts that deploying its product would effectively return up to 80 percent of array capacity to production use, assuming a traditional high-end array configuration with six separate local mirrored disk sets "for daily mirroring and point-in-time recovery."



The Holy Grail of Network Storage Management
ISBN: 0130284165
Year: 2003
Pages: 96
