12.3 LAN-Free and Server-Free Tape Backup

Tape backup poses a number of problems for IT operations, none of them easily addressed by traditional parallel SCSI methods. As long as disk arrays are bound to individual servers, tape backup options are limited to server-attached tape subsystems or transport of backup data as files across the messaging network. Provisioning each server with its own tape backup system is expensive and requires additional overhead for administration of scheduling and tape rotation on multiple tape units. Performing backups across the production LAN allows for the centralization of administration to one or more large tape subsystems, but it burdens the messaging network with much higher traffic volumes during backup operations. In addition, scheduling backups for multiple servers to a central tape resource creates a potential conflict between the time required to back up all servers and the time available for nondisruptive access to the network. Scheduling backups during off-peak hours, such as 8:00 PM to 6:00 AM, may not provide sufficient time to back up all data. Nor is it an option for enterprises that operate across multiple or international time zones.

In Figure 12-3, three departmental servers share a common tape backup resource across the production LAN. Even with switched 100Mbps Ethernet and no competing user traffic, the maximum sustained throughput from server to tape is approximately 25GB per hour. If each server supports a modest 100GB of data, a full backup of the department's 300GB would require 12 hours. Backups, however, are normally scheduled for incremental backup of changed files on a daily basis, with full disk backups occurring only once per month or quarter. To accommodate both full and incremental backups, the full backup routines would have to be rotated among different servers on different days, and then only during periods when full LAN bandwidth was available.
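The arithmetic behind the 12-hour figure can be checked with a minimal sketch. The function names are illustrative only, and the 10-hour window is an assumption taken from the 8:00 PM to 6:00 AM off-peak example given earlier.

```python
def full_backup_hours(servers, gb_per_server, gb_per_hour):
    """Hours needed to back up every server, one after another,
    at a given sustained throughput."""
    return servers * gb_per_server / gb_per_hour


def fits_window(hours_needed, window_hours):
    """True if the backup completes inside the available window."""
    return hours_needed <= window_hours


# Three servers, 100GB each, 25GB per hour across the LAN.
hours = full_backup_hours(servers=3, gb_per_server=100, gb_per_hour=25)
print(hours)                                # 12.0
# Assumed off-peak window: 8:00 PM to 6:00 AM = 10 hours.
print(fits_window(hours, window_hours=10))  # False
```

Under these assumptions a full backup of all three servers overruns the overnight window by two hours, which is why the full backups must be rotated across different days.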

Figure 12-3. Tape backup across a departmental network with direct-attached storage


When the volume of data exceeds the allowable backup window and stresses the bandwidth capacity of the messaging network, either the bandwidth of the messaging network must be increased or the backup data must be partitioned from the messaging network. Installing a high-speed LAN transport, such as switched Gigabit Ethernet, can alleviate the burden on the production network but leaves the server/storage relationship unchanged. Thus, the potential conflict between user traffic and storage backup requirements can be resolved only by isolating each on a separate network, either by installing a separate SAN interconnection using Fibre Channel or Gigabit Ethernet or by creating VLANs for storage data through a common interconnection. A storage network removes backup data from the production network, provides gigabit transport, and enables other backup and storage technologies to emerge.

As a transitional configuration, the SAN in Figure 12-4 is installed solely to offload the production network. The server connectivity can be either Fibre Channel HBA or iSCSI NIC, with the appropriate Fibre Channel or IP storage SAN switch providing access to a shared tape subsystem. Because the tape subsystem appears to each server as another SCSI device on a separate SCSI bus, it is accessible to the tape backup client residing on each server. The backup scheduler instructs each server in turn when to perform a backup and what kind of backup to perform. Because the backup data path is now across a dedicated interconnection, the constraints of the messaging network are removed from the backup process and the burden of backup traffic is removed from the LAN.

Figure 12-4. Transitional LAN-free backup implementation for direct-attached storage


The multigigabit performance of the SAN and the flexibility of moving backup data on its own dedicated transport, however, do not resolve every issue associated with this backup implementation. Although the SAN transport may allow backup data to move at high speed, other limiting factors include server performance, the data rate of the parallel SCSI drives, the type of data being backed up, and the performance of the tape subsystem itself. The slowest component in a backup configuration is usually the tape drive, whose sustained streaming rate sets the ceiling for the entire data path. A tape unit may be able to stream only 10MBps to 15MBps and so cannot fully utilize the bandwidth made available by the Fibre Channel or Gigabit Ethernet SAN. The overall time required for full backups is thus only moderately improved by the SAN, although the scheduling itself is no longer dependent on and no longer interferes with LAN traffic patterns.
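The effect of the slowest component can be illustrated with a short sketch. Only the 10MBps tape rate comes from the text; the SAN link and disk rates are assumed round numbers chosen for illustration, and the function names are hypothetical.

```python
def mb_per_sec_to_gb_per_hour(mb_per_sec):
    """Convert MB/s to GB/hour using decimal units (1GB = 1000MB)."""
    return mb_per_sec * 3600 / 1000


def find_bottleneck(rates_mb_per_sec):
    """Return (name, rate) of the slowest component in the data path."""
    name = min(rates_mb_per_sec, key=rates_mb_per_sec.get)
    return name, rates_mb_per_sec[name]


rates = {
    "san_link": 100.0,    # assumed: roughly 1Gbps of usable bandwidth
    "scsi_disk": 40.0,    # assumed: illustrative parallel SCSI read rate
    "tape_drive": 10.0,   # from the text: 10MBps sustained streaming
}

name, rate = find_bottleneck(rates)
gb_per_hour = mb_per_sec_to_gb_per_hour(rate)
print(name, gb_per_hour)            # tape_drive 36.0
print(round(300 / gb_per_hour, 1))  # 8.3 hours for the 300GB example
```

With these assumed figures the 300GB departmental backup drops from 12 hours to roughly 8.3 hours at 10MBps, or about 5.6 hours at 15MBps, well short of what the SAN link alone could carry.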

Because each of the three servers is now provisioned with a SAN interface, other options are available for reducing backup times. Tape subsystems can provide multiple SAN ports or sharing of tape drives so that multiple backup streams can be accommodated concurrently.
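A rough estimate of the benefit of concurrent streams is sketched below. It assumes that each tape drive sustains the same rate, that one server streams to one drive at a time, and that neither the SAN nor the servers become the bottleneck; the greedy assignment is an illustrative choice, not a description of any backup product's scheduler.

```python
def concurrent_backup_hours(server_gb, drives, drive_gb_per_hour):
    """Estimate total backup time when servers are spread over several
    tape drives. Each server goes to the currently least-loaded drive,
    largest servers first; the slowest drive decides the total time."""
    loads = [0.0] * drives
    for size in sorted(server_gb, reverse=True):
        least_loaded = loads.index(min(loads))
        loads[least_loaded] += size
    return max(loads) / drive_gb_per_hour


servers = [100, 100, 100]  # GB per server, as in the departmental example
# 36GB per hour corresponds to a 10MBps tape drive.
print(round(concurrent_backup_hours(servers, 1, 36), 2))  # 8.33
print(round(concurrent_backup_hours(servers, 3, 36), 2))  # 2.78
```

Under these assumptions, three drives running in parallel cut the full backup to about a third of the single-drive time.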

Further optimizing the backup routine requires several additional SAN components. Moving disk storage from parallel SCSI to SAN-attached arrays offers, among other things, the ability to remove the server from the backup data path. This is the most significant improvement from the standpoint of performance and nondisruptive backup operations. If server resources are freed from backup tasks, the servers are always available for user access. And if the backup process itself does not interfere with user access to data, the backup window is no longer defined by user activity or by the relatively slow performance of the tape subsystem. Backups can be performed at any time, provided that the backup software handles file permissions and updates and that a SAN-attached backup agent exists to buffer data from disk to tape.

The backup agent can be an extended copy (third-party copy) utility embedded in a SAN switch, in a dedicated SAN-attached backup server, in a SAN-to-SCSI bridge product, or in the tape target itself.

Figure 12-5 demonstrates an extension of the departmental tape backup solution that incorporates SAN-attached disk arrays and a third-party copy utility resident on a SAN switch. In this configuration, backup data is read directly from disk by the copy agent and then written to tape, bypassing the server. Although the SAN provides the vehicle to move the backup data, the backup software must control when and where to move it. Concurrent backup and user access to the same data are possible if the backup protocol maintains metadata (file information about the actual data) to track changes that users may make to data (such as records) as it is being written to tape.
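The idea of tracking changes during a server-free copy can be modeled with a toy simulation. This is a conceptual sketch only: the class and function names are hypothetical, the disk and tape are in-memory lists, and the two-pass approach is one possible way to keep the tape image consistent, not the mechanism of any specific extended copy implementation.

```python
class TrackedDisk:
    """In-memory stand-in for a SAN-attached disk with change metadata."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.dirty = set()          # block indices changed during a backup
        self.backup_active = False

    def write(self, index, data):
        """User write; recorded as metadata while a backup is running."""
        self.blocks[index] = data
        if self.backup_active:
            self.dirty.add(index)


def server_free_backup(disk, user_writes):
    """Copy agent: stream blocks from disk to tape, then re-copy any
    block that users changed during the first pass.
    user_writes maps a copy step to the writes that arrive after it."""
    tape = []                       # sequential records: (index, data)
    disk.backup_active = True
    for i in range(len(disk.blocks)):
        tape.append((i, disk.blocks[i]))
        for index, data in user_writes.get(i, []):
            disk.write(index, data)
    for index in sorted(disk.dirty):
        tape.append((index, disk.blocks[index]))
    disk.backup_active = False
    disk.dirty.clear()
    return tape


def restore(tape):
    """Rebuild the disk image; later tape records override earlier ones."""
    image = {}
    for index, data in tape:
        image[index] = data
    return [image[i] for i in sorted(image)]


disk = TrackedDisk(["a", "b", "c"])
# A user rewrites block 0 after the agent has already copied block 1.
tape = server_free_backup(disk, {1: [(0, "A")]})
print(restore(tape) == disk.blocks)  # True
```

In this model the servers never touch the backup data path; they only continue to issue user writes, and the change metadata tells the copy agent which blocks must be written to tape again.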

Figure 12-5. LAN-free and server-free tape backup with SAN-attached storage


As higher-performance native SAN-attached tape subsystems become available, the ability to back up and restore over the SAN will better accommodate the growing volume of data generated by enterprises. The SNIA Interoperability Committee has demonstrated terabyte-per-hour backup operations at workshops held in the SNIA Technology Center, highlighting the feasibility of building high-performance backup configurations for large enterprise networks. Less attention, however, has been given to the counterpart of backup procedures: restoring tape data to disk. It is not uncommon for companies to institute complex schedules for tape backup, with periodic incremental and full backups, tape labeling and rotation, and physical transport of tapes off-site, and yet never test the validity of tape data restoration to disk. A SAN infrastructure may facilitate streamlined tape backup operations, but a comprehensive backup strategy must also verify the integrity of archived data.
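One simple way to test a restore is to compare checksums of the original files against the files restored from tape to a scratch location. The sketch below assumes both trees are reachable as ordinary directories; the function names and the choice of SHA-256 are illustrative, and production backup software typically provides its own verification features.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Compare every file under original_dir with its restored copy.
    Returns a list of (relative_path, problem) tuples; empty means OK."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for source in sorted(original_dir.rglob("*")):
        if not source.is_file():
            continue
        relative = source.relative_to(original_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append((str(relative), "missing"))
        elif sha256_of(source) != sha256_of(restored):
            problems.append((str(relative), "checksum mismatch"))
    return problems
```

Calling `verify_restore("/data/dept", "/scratch/restore_test")` (both paths hypothetical) after a trial restore returns an empty list only if every original file was restored with identical contents. Files that changed after the backup was taken will also be reported, so the comparison is most meaningful against a static or snapshot copy of the source.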



Designing Storage Area Networks: A Practical Reference for Implementing Fibre Channel and IP SANs (2nd Edition)
ISBN: 0321136500
Year: 2003
Pages: 171
Authors: Tom Clark
