Any serious storage networking professional knows that backup is the "killer app" of storage networking. Backup is the one application that every SAN runs, and it is considered to be either the most mission-critical application for the organization or nearly so. Many SANs exist solely for the purpose of backup and recovery. And while it's an axiom that no one ever gets promoted for doing a great job on backups, it's also true that the life expectancy of an administrator who can't restore critical data from backups can be clocked with an egg timer.
Backup over a SAN adds significant convenience and additional capability to the task at hand that isn't there when you back up either locally or over a network. With centralized SAN backup hardware and software, you get an economy of scale that you don't have with smaller forms of backup. But the most important reason that backup is such a powerful selling point for SANs is that it removes one of the primary consumers of network bandwidth from your corporate LAN. With SAN backup, you don't have to wait for periods of inactivity to perform a backup, and you can make better use of the backup resources that you do have.
Up until a few years ago, the predominant form of SAN backup was to tape. Tape is a relatively inexpensive storage medium, and tape technologies have steadily improved in performance and capacity over the years. However, the cost and capacity of disk storage have improved even more significantly than those of tape, leading to a situation where the cost of disk capacity is now similar to the cost for the same capacity of tape storage. Each of these two media offers different capabilities that complement each other. Tape is portable and reliable, but it affords only serial access to data, and it is mind-numbingly slow. Hard disks are faster and offer parallel access to stored data, but they're not portable.
Consequently, disk backup is used more often on SANs for first-level backup and snapshots, where the most recent data is stored. Tape has taken the role of near-online backup and archival storage. Taken together, server, disk, and tape offer a multilayered approach to backup that is both flexible and prudent.
Note: You can find a detailed discussion comparing tape to disk backup on UltraBac's website, at www.ultrabac.com/techsupport/50white-papers/UBS_tapevsdisk.asp.
Tape Formats and Libraries
Tape drives have a long and storied past. Space precludes a historical presentation here, but the reality of the current situation is that there are really only three tape formats that are in common use in server/SAN deployments:
Of course, there are many other tape formats, including VXA, ADR, SLR, Mammoth, Travan, and Accelis, but these are older DAS tape formats. Of the formats mentioned previously, SDLT and LTO are the two that are still deployed on SANs. What makes these two formats the ones of choice is not only that they have high capacities and transfer rates but also that the casing for these tape formats was specially ruggedized to withstand the wear and tear of handling in robotic tape libraries. Typically, SDLT-320 drives use Ultra SCSI connections, whereas Ultrium-2 drives are available with either Ultra SCSI or Fibre Channel connections.
The most important tape backup system is the robotic multitape carousel or library. Because a modern enterprise tape cartridge holds roughly as much data as a single hard drive, tape backup must use a significant number of tapes in order to create archival copies. When snapshots are stored, an even larger capacity is required.
Let's consider some representative examples of enterprise tape storage systems. At the low end for SAN tape systems are autoloaders such as the $5,000 Exabyte Magnum 1X7 LTO-2 2U tape system shown in Figure 12.10. A carousel of seven 200GB LTO-2 tapes is passed around a circle, and the tapes are read and written by a single tape drive at the back of the unit. The capacity of the system is 1.4TB native and 2.8TB compressed, with a throughput of 169MBps, using the two Ultra 320 SCSI ports. Systems of this type can be rack mounted. Exabyte is one of the larger tape system vendors, and it offers a variety of tape systems and formats. Its LTO series scales up to models with more tape heads and more tapes. Exabyte systems come with software that lets you manage them remotely.
Figure 12.10. The Exabyte LTO-2 Magnum 1X7 autoloader.
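The capacity figures quoted for the Magnum 1X7 autoloader follow from simple arithmetic on the cartridge count and cartridge size. The short Python sketch below reproduces them. The 2:1 compression ratio is an assumption inferred from the native and compressed figures in the text (real compression depends on the data), and the function name is purely illustrative.

```python
# Figures from the text: a carousel of seven 200GB LTO-2 cartridges.
CARTRIDGES = 7
NATIVE_GB_PER_CARTRIDGE = 200
# Assumption: a 2:1 compression ratio, inferred from the quoted capacities.
COMPRESSION_RATIO = 2.0


def autoloader_capacity_tb(cartridges, native_gb, ratio):
    """Return (native, compressed) capacity in TB, using 1TB = 1000GB."""
    native_tb = cartridges * native_gb / 1000
    return native_tb, native_tb * ratio


native_tb, compressed_tb = autoloader_capacity_tb(
    CARTRIDGES, NATIVE_GB_PER_CARTRIDGE, COMPRESSION_RATIO
)
print(f"native: {native_tb:.1f}TB, compressed: {compressed_tb:.1f}TB")
```

The same function can be used to size any autoloader or library once you know the cartridge count and the native capacity of the tape format.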
Note: When purchasing or intending to use a tape system, make sure that your backup software contains an up-to-date driver for that system.
The next step up from an autoloader is a tape library. Tape libraries look like the kinds of automation you see in science fiction and spy movies. They come in sizes ranging from a desktop model, to the more commonly seen size of a file cabinet or refrigerator, up to absolutely mammoth systems that fill enormous rooms. One system, built by eMASS (now a part of ADIC) for the Internal Revenue Service, fills an entire building. Because the only way to make certain that a unit functions correctly is to see the robotics in action, tape libraries often come with see-through doors.
Unlike an autoloader, a library comes with multiple tape drives, often not of the same type. Having multiple tape drives operating at the same time enables fantastic throughputs, features such as tape RAID, internal tape calibration, redundant systems, and very broad heterogeneous networking support. Pricing is almost never standard; it is quoted per system build, depending on the components.
As an example of a tape library, consider the StorageTek 9740 shown in Figure 12.11. This system can contain up to 494 cartridges, 10 tape drives, and 6 slots, with a capacity of up to 30TB uncompressed, or 60TB compressed, when fully populated with DLT tape. Tape libraries are often expandable, with an extra cabinet added to the side of the starter and with the robotics used to service the combined unit. Companies buy these tape libraries to back up the enterprise-class storage servers described earlier in this chapter. To give this some scale, you could back up roughly 15 EMC Symmetrix DMX servers like the one you saw earlier in this chapter. Companies buy tape libraries to archive data, to back up and restore, and to serve as the last line of defense in a disaster recovery system.
Figure 12.11. The StorageTek Timberwolf 9740 enterprise-class tape library.
Disk Backup Hardware
There are a lot of good reasons people back up their SAN data to disk systems. (Chapter 11 describes some disk backup systems, although it doesn't stress backup.) People use storage arrays and storage servers to back each other up. Internally, arrays can be backed up in hardware RAID, using mirroring and replication techniques. BCV (Business Continuance Volume) is just a fancy way of saying disk-to-disk backup. Disk-based backup is more reliable, more fault-tolerant, and a great deal faster than tape.
Of course, some vendors offer storage arrays that are specially outfitted for disk-to-disk backup. For example, consider the DX series from Quantum (see www.quantum.com/am/products/eb/default.htm). The DX30 array offers 277MBps throughput with up to 16TB of disk. Its bigger brother, the DX100, stores 64TB of disk.
Organizations invest in disk-to-disk storage devices because they solve some thorny backup problems. With tape, you are always fighting to keep backups within a reasonable backup window. You can throw more and more tape hardware at the problem, but because tape is so much slower than disk, it's a much more expensive proposition. Using disk arrays to back up your email or large databases lets you restore a system much more quickly while still giving you the opportunity to do versioning using snapshots.
SAN Backup Software and Servers
When you prepare to back up a server over a SAN, typically your server is one of a pool of systems. If your server is to be backed up, it is a matter of adding your server to the backup routine and defining the parameters of the backup. The parameters might include which disk(s) to back up, how often, and using what method (full backup, incremental backup, snapshot, and so on). Some programs let you access the backup program remotely and set up the backup, or you can open a terminal session and log in to the system.
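To make the backup parameters listed above concrete, here is a minimal Python sketch of a job definition, together with a rough estimate of how long a backup would occupy the backup window. Everything in it is hypothetical: the `BackupJob` fields, the server name, and the device paths are illustrative and do not reflect any vendor's actual schema. The window estimate assumes a sustained transfer rate and decimal units (1TB = 1,000,000MB), which real systems rarely achieve exactly.

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    FULL = "full"
    INCREMENTAL = "incremental"
    SNAPSHOT = "snapshot"


@dataclass(frozen=True)
class BackupJob:
    """Hypothetical job definition; not any vendor's actual schema."""
    server: str          # which server to add to the backup routine
    disks: tuple         # which disk(s) to back up
    interval_hours: int  # how often to run
    method: Method       # full, incremental, snapshot, and so on


def window_hours(data_tb, throughput_mbps):
    """Hours needed to move data_tb at a sustained rate in MB per second."""
    seconds = data_tb * 1_000_000 / throughput_mbps
    return seconds / 3600


# Illustrative values only: server name and device paths are made up.
job = BackupJob("mail01", ("/dev/sda", "/dev/sdb"), 24, Method.INCREMENTAL)
print(job)
print(f"{window_hours(16, 277):.1f} hours to write 16TB at 277MBps")
```

Using the DX30 figures quoted earlier, writing a full 16TB at a sustained 277MBps would take roughly 16 hours, which shows why full backups of large arrays are usually supplemented with incremental backups or snapshots.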
There's nothing substantially different about setting up a system to be backed up over a SAN; the software you use is similar to what you might have used in the past to do local backup. Enterprise backup software does, however, have a number of unique features, as described later in this chapter.
Much of the real action in SAN backups becomes apparent when your server is one of the "backupers." Backup servers on a SAN run enterprise backup software, and you need to consider a number of factors to get them to perform effectively. Here's a fact for you to ponder: typically 15% of all servers deployed in an enterprise are deployed as backup servers. Thus for small workgroups or departments, a backup server might be a lone wolf, but more often SAN-level backup requires multiple servers operating cooperatively when backing up other systems.
When you select your backup software, you need to look for features such as master backup systems, backup groups, storage groups, backup policy and scripting, and other automation features that make it possible for as few IT staff as possible to run the system. Of course, you also need to look for wide device support within the software package because you never know what you might be called on to back up from or to.
The major backup software packages in the SAN marketplace are VERITAS NetBackup, Computer Associates ARCserve, Legato Systems NetWorker, and Tivoli Systems Storage Manager. However, this is a crowded category with many more players. Smaller vendors such as BakBone, NovaNET, Syncsort, and others all have products in this area. Figure 12.12 shows you the main console from VERITAS's NetBackup software.
Figure 12.12. VERITAS NetBackup is a market leader in SAN backup software. Shown here is its main console.
You are probably familiar with centralized backup systems, but SANs enable some very interesting backup options, including the following:
You need to consider two other concepts when it comes to backup software over a SAN: hot versus cold backup and vaulting. In a cold backup, you close the running applications on a server and perform a complete backup, knowing that none of the data will change during the time of your backup or snapshot. Many application servers that run over a SAN are either mission critical or do not have sufficient free time to enable a backup window to be established. You can't just shut down a corporate email server or a large transactional database. In such a situation, you need to perform a "warm" or "hot" backup.
In a warm backup, special software designed for the enterprise application you need to back up quiesces the application so that it runs slowly, and then performs the backup. A hot backup backs up while the application is running, without slowing down any of the processing. A hot backup picks a point in time, runs the backup, and then examines a transaction log to see what transactions need to be backed up in order to bring the backup successfully to completion. Hot backup software is specific to the application it is backing up (an Oracle database or Exchange Server, for example) and can be quite expensive to implement.
Hot backups lead naturally to the concept of data vaulting. Data vaulting is a backup method that is done remotely so that the data is both duplicated and protected. The transmission is compressed, encrypted, and assembled. You can purchase data vaulting software, or you can buy it as a subscription service. Companies such as LiveVault and CommVault offer special techniques for backing up enterprise applications. The advantage of a vaulting application is that if all else fails, your friendly vaulting application is there to back you up. Vaulting should be viewed in the context of disaster recovery and applied to mission-critical systems and data.
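The hot backup procedure described earlier (pick a point in time, copy the data, then consult the transaction log) can be illustrated with a toy model. The Python sketch below is only a conceptual illustration under simplifying assumptions: `ToyDatabase` and `hot_backup` are hypothetical names, the store lives in memory, and the copy is instantaneous. Real hot backup products rely on application-specific interfaces and are considerably more involved.

```python
import copy


class ToyDatabase:
    """Hypothetical in-memory store that logs every write in order."""

    def __init__(self):
        self.data = {}
        self.log = []  # entries are (sequence number, key, value)

    def write(self, key, value):
        self.data[key] = value
        self.log.append((len(self.log) + 1, key, value))


def hot_backup(db, later_writes):
    # 1. Pick a point in time: remember the last logged sequence number.
    start_seq = len(db.log)
    # 2. Copy the data as it stands at that point.
    backup = copy.deepcopy(db.data)
    # Simulate the application continuing to write after the point in
    # time was chosen; the application is never stopped.
    for key, value in later_writes:
        db.write(key, value)
    # 3. Examine the transaction log and apply every entry newer than
    #    the starting point so that the backup is brought up to date.
    for seq, key, value in db.log:
        if seq > start_seq:
            backup[key] = value
    return backup


db = ToyDatabase()
db.write("inbox", 10)
db.write("sent", 3)
backup = hot_backup(db, [("inbox", 11), ("drafts", 1)])
print(backup == db.data)
```

The design point is that the application keeps accepting writes throughout; consistency comes from replaying the log rather than from freezing the data.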