General backup software makes copies of data either by opening a file and copying the bytes out of it or by transferring raw blocks of data from a disk to the backup media. In either case, the result is a binary copy, called an image copy. In the event of a catastrophic failure, such as corruption of the volume or a disk crash, the entire object can be recovered from the backup medium and restored to its state as of the last backup.

Image copies create problems for multielement, structured objects such as databases, e-mail systems, and similar applications. The elements inside structured objects are unknown to general backup software. A database, for example, is copied to the backup media as a file or volume (depending on how it is implemented) and can be restored only as that type of object. What if the event is not a catastrophic one? Perhaps a critical table was accidentally deleted, leaving the database unusable, or an important e-mail was deleted and needs to be recovered. It would be unacceptable to return an e-mail or database system to a day-old state, losing all changes made since then, just to retrieve a single element. With image copies, the data is copied precisely, but there is no catalog of the elements within. Individual elements are inaccessible, because the software does not record what is inside the structured objects. A Microsoft Exchange Server database can be copied to backup media as a whole volume or a file, but general backup software knows nothing about the e-mails, attachments, contacts, and calendar items in the database and, hence, has no way to find them later. When it comes time to restore a single e-mail, the software has no way to reference it.

To alleviate this problem, many applications that use structured data stores have backup software that can catalog the elements within the structured objects. Oracle's Recovery Manager, also known as RMAN, is one such utility.
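The difference between an image copy and an element-aware backup can be shown with a minimal, purely illustrative Python sketch. The mail store, message IDs, and functions here are invented for illustration; they do not reflect any vendor's actual on-disk format.

```python
# Illustrative only: an image copy is one opaque blob, while an
# element-aware backup catalogs each element for individual restore.
import json

# A toy "structured object": a mail store with several messages inside.
mail_store = {
    "msg-001": {"subject": "Q3 forecast", "body": "Numbers attached."},
    "msg-002": {"subject": "Lunch?", "body": "Noon works for me."},
}

# Image copy: a single binary blob. Restoring anything means restoring
# everything, because nothing inside the blob is individually addressable.
image_copy = json.dumps(mail_store).encode("utf-8")

# Element-aware backup: each element is cataloged and stored separately,
# so a single message can be restored without touching the others.
catalog = {msg_id: json.dumps(msg).encode("utf-8")
           for msg_id, msg in mail_store.items()}

def restore_element(msg_id):
    """Restore one message from the element catalog."""
    return json.loads(catalog[msg_id].decode("utf-8"))

restored = restore_element("msg-002")
```

With the image copy, recovering `msg-002` alone is impossible; with the catalog, it is a single lookup.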
RMAN can back up a database to various media, including disk, and allows for selective restores of database elements. Most backup software vendors also offer application-specific versions of their products that catalog the internal elements of structured data objects. It is easy to find support for Oracle, Microsoft SQL Server, Microsoft Exchange Server, and IBM's Lotus Notes/Domino, either as a standalone product or as an add-on to the primary backup software.

Structured Object Backup Constraints

Structured data objects have certain constraints that must be taken into account when backing them up. To begin with, many of these objects live outside the normal file system. The server processes that create and manage them use direct block I/O to access data on the disk. Consequently, the normal discovery mechanisms that backup software uses cannot find the application data; as far as the backup software is concerned, the data doesn't exist. Luckily, most backup software vendors and application vendors have add-ons that allow these objects to be discovered. During installation of Microsoft Exchange Server, for example, a new version of the standard Windows backup software, capable of seeing and backing up the Exchange database, is loaded with it.

Most structured objects are also difficult to back up while still in use. Many applications, especially database-oriented ones, require complete control over the underlying data store to ensure the integrity of the data. The database or application software has various controls built in to ensure that two processes do not try to change data at the same time, typically by placing locks on certain elements of the data. Other software is unaware of these controls, in much the same way that it is unaware of the internal structure. Backup software, by the very act of reading the data while it is changing, can capture an inconsistent copy or disrupt the application.
Many system administrators take database-oriented applications offline while backing them up. Because this practice makes the application or group of applications unavailable, it is not always desirable. Application-aware backup alleviates this constraint. By understanding the internal structure and locking mechanisms of the underlying databases, application-aware backup software can run backups without disrupting database operations or application availability.
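The value of cooperating with the application's locking can be sketched in a few lines of illustrative Python. The two-field record, lock, and functions are invented for this example: a naive byte-by-byte copy taken in the middle of a transfer would see the money missing from one account but not yet in the other, whereas a backup that acquires the application's own lock always copies a consistent state.

```python
# Illustrative sketch: an application-aware backup takes the same lock
# the application uses, so it never observes a half-finished update.
import threading

accounts = {"a": 100, "b": 0}
lock = threading.Lock()

def transfer(amount):
    """The application's own update path, protected by its lock."""
    with lock:
        accounts["a"] -= amount
        accounts["b"] += amount   # a copy taken between these two lines
                                  # without the lock would be inconsistent

def aware_backup():
    """Cooperate with the application's lock to get a consistent copy."""
    with lock:
        return dict(accounts)

transfer(40)
snapshot = aware_backup()
```

Real databases use far finer-grained locking and snapshot mechanisms, but the principle is the same: the backup must honor the application's consistency controls rather than read raw bytes underneath them.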
Off-Site Backups

An interesting option for smaller organizations is to have a service provider perform backups to an off-site location. When the backup process is outsourced, most of the headaches are removed, and capital costs are converted into a more manageable monthly fee. This is a boon for capital-poor organizations or those that want to put their dollars into other projects.

A WAN link or a virtual private network (VPN) over the Internet connects the service provider's network to the customer's. The service provider runs backup servers on its side of the connection, which interact with agents on the customer's servers or desktop computers. The service provider then takes responsibility for ensuring that backups are performed on a prearranged schedule and that the data is available for restore operations as needed. Pricing for this type of service is usually based on the amount of data backed up, though some service providers prefer to charge by the host.

For a small company with limited IT resources, using a service provider has clear benefits. Money is not tied up in capital equipment, and capital expenditures do not increase as the amount of data grows. Limited personnel are not stretched as backup needs grow, and the organization gains the extra protection of off-site backups without the mishaps, hassles, and expense of moving tapes. What is given up is control: people outside the company are making decisions for the company, and the service provider has to be trusted to provide for the security, safety, and availability of the data it holds. A reasonable amount of bandwidth on the Internet connection is also needed; for some companies, that can be expensive.
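The two pricing models mentioned above can be compared with simple arithmetic. The rates and the example company below are entirely hypothetical, chosen only to show how the models diverge as data grows faster than host count.

```python
# Hypothetical rates only: comparing per-gigabyte and per-host pricing
# for an outsourced off-site backup service.
def monthly_cost_per_gb(data_gb, rate_per_gb=0.50):
    """Cost when the provider charges by amount of data backed up."""
    return data_gb * rate_per_gb

def monthly_cost_per_host(hosts, rate_per_host=25.0):
    """Cost when the provider charges a flat fee per protected host."""
    return hosts * rate_per_host

# A small shop with 400 GB of data spread across 12 hosts:
by_data = monthly_cost_per_gb(400)    # 200.0 per month
by_host = monthly_cost_per_host(12)   # 300.0 per month
```

Under these invented rates, per-gigabyte pricing is cheaper today, but if the data doubles while the host count stays flat, the comparison reverses, which is why it pays to model both before signing a contract.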
SAN Backup Deployment Steps

Relieving the stress of backup has been one of the most important justifications for installing a SAN. SANs remain an important tool in architecting efficient backup and restore systems: they provide fast I/O and allow for consolidation of backup resources, which saves money, because fewer personnel are needed to run the systems as they grow and storage resources achieve higher utilization rates.

A SAN-based backup system is not deployed all at once. The amount of new equipment alone creates a lot of work for systems personnel, so deployment is usually done in stages. First, the basics of the SAN are put in place, including switches and host adapters. After the infrastructure is in place, it is time to build the rest of the system. The basic SAN should be fully tested and operational before any live storage systems are placed on it; otherwise, data corruption and downtime may occur, crippling the applications that depend on those data stores.

Next, the backup drives are consolidated into one larger unit. This means replacing individual tape drives with a library; it is also an opportunity to build in a disk-to-disk virtual tape system. All backup jobs are now pointed at the consolidated backup storage. Even with storage consolidation, there is still redundancy in the operations, because several uncoordinated backup jobs are still running. Software designed for a single machine may run well in a SAN environment, but it is inefficient. Backup software designed for a SAN, on the other hand, has a more distributed architecture that better mirrors the design of the SAN itself. Once the backup system is up and running on the SAN, software that utilizes a dedicated backup server and agents running on different computers is installed.
Last, in the interest of even greater efficiency, a data mover may be brought in to perform server-less backup. It should be deployed only after everything else in the system is running well. A SAN is a great technology for backup, and a staged deployment brings components online gradually, with less disruption of operations.