Fundamentals of Backup and Recovery


The concept of backup and recovery is very simple. Data is copied to offline or secondary storage during backup operations and is kept there in case it needs to be restored following some type of disaster that causes the loss of data. Backup traditionally uses both duplication redundancy and delta redundancy with file granularity. In other words, legacy backup can copy all the files in a storage address space, or it can copy only those that have changed.
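The difference between copying every file and copying only the changed ones can be sketched in a few lines of Python. This is a minimal illustration, not how any particular backup product works: the function names `scan` and `select_for_backup` are hypothetical, and using the file modification time as the change detector is an assumption made here for simplicity.

```python
import os


def scan(root):
    """Map every file path under `root` to its last-modified time (epoch seconds)."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtimes[path] = os.path.getmtime(path)
    return mtimes


def select_for_backup(mtimes, last_backup_time=None):
    """Return the sorted list of paths to copy.

    With no previous backup time, every file is selected (a full copy).
    Otherwise only files modified after that time are selected
    (assumption: modification time is a reliable sign of change).
    """
    if last_backup_time is None:
        return sorted(mtimes)
    return sorted(p for p, m in mtimes.items() if m > last_backup_time)
```

Calling `select_for_backup(scan("/some/dir"))` would list everything under the directory, while passing the time of the previous run as `last_backup_time` would narrow the list to the files changed since then.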

Historically, backup copies have been made to tape storage devices and tape media. In recent years, this scenario has changed somewhat with the introduction of less-expensive ATA disk drives. The use of disk for backup is discussed toward the end of this chapter in the section "Disk-Based Backup." Even when disk storage is used, many of the logical operations are based on operations developed for tape devices and media. For that reason, most of this chapter is devoted to the analysis of tape backup, because an understanding of tape backup is essential to understanding most disk backup schemes.

Backup Versus Recovery

The execution of backup and recovery processes is complex and involves a surprising number of variables. The two processes are also considerably different. Backup is usually implemented with the goal of minimizing the impact on production systems; it simply copies data to secondary storage of some sort, where it can be restored by backup software at a later time. Restore, on the other hand, has to provide the mechanism for completely restoring a system and has to make sure that restored data can be used correctly by system and application software. Restoring data also carries the risk of over-restoring, which can create a number of problems associated with a disk-full condition.

NOTE

It is truly amazing how the added complexity of recovery is often either ignored or underestimated. Beyond the scary problems of backup data being incomplete or missing from backup media, there can be a lot of file system cleanup following the restore to get rid of unwanted garbage that was restored by mistake.

As the amount of stored online data increases (online data is data that is readily available to applications on disk storage), the problems of backup compound. While backup technology has made great strides in the last ten years, it never seems to be able to catch up with the expanding requirements for storage capacity and data availability.


Backup Applications

The four primary uses for backup are

  • Business continuity protection

  • Historical archiving

  • System and application migration

  • Data sharing

Business Continuity Protection

Backups are part of most disaster recovery operations. If systems or data are destroyed for any reason, the organization must be able to re-create its systems and data in order to resume normal operations. For organizations that are not using any remote copy data protection technologies, backup data on magnetic tape is usually the only automated way of re-creating data.

Historical Archiving

Companies use backup systems to archive data for historical purposes. Copies of data can be made to easily restore historically important versions of software and data. For example, software development companies typically make archival copies of projects when they reach certain milestones. Archiving to tape is also used to satisfy audit requirements by taking backup "snapshots" of monthly or quarterly financial data. Of course, today there are new regulations such as the Sarbanes-Oxley Act that mandate the storage and archiving of nearly all management communications.

System and Application Migration

One way to transfer data from one system to another is to back it up on the first system and restore it on the other. In general, it is preferable to migrate data over a network, but that is not always practical.

Data Sharing

Similarly, data sharing in the form of the "sneaker net" can be accomplished with tape copies made by a backup system. This method is still used (amazing, but true!) in situations where data sharing involves a large number of extremely large files. However, as storage capacities increase, storage networks are replacing these "socially friendly" data sharing methods.

Backup as a Filing Application

Most backup systems are designed as a filing process to allow individual files to be identified and restored easily. End users who want data restored ask for a particular file or group of files. Database backup can be done by naming files or tablespaces. Both of these are filing-level functions because they identify data as a named data object rather than by its storage block address.

File systems and database systems usually have programming interfaces that allow backup systems to copy data as files or database objects. Backup software designers usually integrate these interfaces in their backup products to provide the most complete, consistent, and orderly copies of data. In addition to copying data, most backup systems also keep their own log files or metadata database that stores additional information about the files and objects that have been backed up. For instance, the backup metadata may contain an entry for the date and time that the data object was backed up.

Backup can also be done as a storing (block-level) process to back up entire storage volumes en masse. Like many other storage management techniques, as the amount of stored data grows, this approach becomes less realistic. There have also been some backup products that had the ability to copy storage blocks for particular files, an idea that is being retried as serverless backup (discussed later in this chapter).

Offline, Online, and Near-Line Storage

The location where backup data (data that is being backed up or that has been backed up) is stored is an important element in all backup systems. Some terms used to delineate different types of storage by the location and access to data are

  • Offline

  • Online

  • Near-line

Offline Storage

Offline storage is storage or media that cannot be accessed without an administrator's making it available. The most common example of offline storage is tape, which must be loaded into a tape drive before its data can be accessed. CDs and DVDs are also offline when they are not loaded into a device. In general, optical disks lack the required write I/O performance and capacity to be broadly useful in network storage environments.

In some cases offline storage can also refer to disk storage that needs to be connected to a system in order to access it. Portable devices such as USB/FireWire external disk drives are often used this way. However, like optical storage, it is much more practical for personal storage than it is for network storage.

The major advantage of offline storage is that it can be transported easily to other locations for business continuity purposes and for data exchanges where network connections are not practical or affordable or do not exist.

Online Storage

Conversely, online storage is storage or media that can be used to access data directly, without administrator intervention. Most disk storage is designed as online storage. Backup can use online storage successfully as long as administrators understand its limitations and how it differs from the offline media methods designed into backup software.

NOTE

While most backup software supports the use of online storage, most of these products were designed primarily to use offline tape storage. The difference is enormous: online storage is managed by a file system, which imposes limitations on how it can be used.

For instance, most file systems do not allow two files with the same name in the same directory. This makes storing multiple versions of a file (a common backup requirement) difficult, because the different versions must be given different names or be placed in different directories. Either workaround makes restoring data much more difficult and error-prone than it normally is, a scenario most sane people would try to avoid like the plague.
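To make the naming problem concrete, here is a small Python sketch of one possible workaround: appending a timestamp to each stored version. The functions `versioned_name` and `latest_version` and the timestamp format are assumptions chosen for illustration, not the scheme of any particular backup product. Notice how much care the restore side needs just to find the right version.

```python
import re
from datetime import datetime

# Fixed-width stamps such as 20240105T230000 sort correctly as plain strings.
_STAMP = re.compile(r"\d{8}T\d{6}")


def versioned_name(original, backed_up_at):
    """Build a unique stored name for one version of `original`."""
    return f"{original}.{backed_up_at.strftime('%Y%m%dT%H%M%S')}"


def latest_version(stored_names, original):
    """Return the newest stored version of `original`, or None if there is none.

    The suffix check matters: without it, versions of "report.doc.bak"
    would be mistaken for versions of "report.doc".
    """
    prefix = original + "."
    candidates = [
        name
        for name in stored_names
        if name.startswith(prefix) and _STAMP.fullmatch(name[len(prefix):])
    ]
    return max(candidates, default=None)
```

Even in this tiny example, a careless prefix match would restore the wrong file, which is the kind of error the note warns about.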


Near-Line Storage

Near-line storage uses automation techniques to quickly make data available that is not online. For example, files stored in a tape library that can automatically load tapes on demand, without administrator intervention, are said to be on near-line storage.

The software that uses near-line storage is usually not backup software per se, but something related, called hierarchical storage management (HSM), which is discussed in the next chapter. However, as HSM and backup usually need to be integrated to avoid operational conflicts, the discussion of near-line storage sometimes enters into discussions of backup systems.

The Backup System

The equipment and software used for backup and recovery are usually referred to simply as the backup system. Backup systems can be relatively small, including a tape drive with single-system backup software, or they can be large, involving a SAN with automated tape libraries and distributed, multiplatform backup software. The backup system is also used for restoring data, of course, but it is rarely referred to as the backup and recovery system.

The generic parts of a network backup system are

  • Backup engine

  • Backup agents

  • Operations scheduler

  • Backup transfer network

  • Media management

  • Devices and subsystems

  • SAN or device interconnect

  • Backup metadata

The sections that follow describe these various components of backup systems.

Backup Engine

The backup engine is software that controls and processes backup and restore operations, including device control, media management, and metadata management. The backup engine can be installed on a dedicated system or on one that also processes other applications.

Multiple backup engines are often used in a company to divide the work or to optimize platform coverage. Just as other applications may run better on some platforms than others, backup engines typically work better on certain, targeted platforms. For instance, a company might use one type of backup software for Windows systems and another type of software for UNIX systems.

Backup Agents

The backup engine communicates with backup agent software to determine what data to back up and to manage the data transfer process. On a detailed level, the backup agent uses the programming interfaces of the file or database system to access data and to ensure data consistency. Backup agents typically run on the systems where the data is stored.

Backup agents are also used during restore processes to facilitate complete restore operations. Different file and database systems have their own unique set of programming interfaces, which means that backup systems must have multiple backup agents available to cover all the various platforms businesses use. In addition, backup agents are usually made to work with only specific backup engines. The backup engines and agents tend to be proprietary, even if the network protocols might be standard.

NOTE

The lack of standardization in engine/agent communications is a prime example of how a great deal of energy and money can be wasted by an industry and its market when standardization is not pursued as a priority. Without standards, every backup software company has been forced to develop, test, implement, debug, and support its own application interfaces and communications. In turn, companies that purchase and use backup systems have to learn the unique installation and operations tools provided by each backup vendor. This is not nearly as simple as it might appear. A great deal of time is spent learning how to make backup software work. The cross-training that comes with standardized interfaces would save a great deal of money every year.


Backup agents are responsible for handling the unique requirements of various operating systems, file systems, and databases. Not all agents are equal, and there can be important differences in their results. This is one of the reasons companies sometimes use different backup systems for different platforms in their environments.

Operations Scheduler

Backup operations are typically scheduled to run periodically at certain times of the day. While the backup scheduler is considered part of the backup engine, it often has its own interface and tools for managing backup operations. The operations scheduler decides what data to back up and when to copy data. Its work is usually integrated with media and device management that determines which media and devices to use.

Backup Transfer Network

Legacy network backup uses LANs for transferring data from a backup agent to a backup engine. In most cases, the backup transfer network is the same as the LAN used for normal day-to-day operations and applications. However, in some cases, companies install dedicated LANs for carrying backup traffic.

NOTE

Bypassing the LAN and backing up to devices directly over the SAN is a much better solution, if possible. Why use a backup transfer network if you don't need to?


Removable Backup Media and Media Management

The term removable media is used generically to encompass a variety of products used for storing backup data. It includes magnetic tapes, optical disks, and removable disk cartridges. By far the most common form of removable media for backup in storage network environments is tape.

Tapes are used for a certain period of time, as determined by the tape rotation schedule, and then are stored locally or remotely in accordance with corporate business continuity policies. Tapes are often copied so that one copy can be kept locally for fast restores and another copy can be stored remotely for restoration following a major disaster.

Backup tapes store corporate data assets and should be taken care of deliberately. For that reason, media management is one of the key disciplines in all of storage networking. Backup operations can easily involve thousands of backup tapes used for a variety of different data restoration goals.

Tape rotation algorithms define when and how tapes are used. For example, the popular grandfather, father, son (GFS) algorithm generates a sequence of tapes that is aligned with the calendar. Full backups are written to weekly tapes on the weekends, incremental backups are written to daily tapes on Monday through Thursday, and a monthly tape is written at or near the end of the month. Table 13-1 is an example of the sequence of tapes from a GFS algorithm.

Table 13-1. GFS Tape Rotation Scheme

Friday Through Sunday   Monday         Tuesday        Wednesday      Thursday
Weekly Tape 1           Daily Tape 1   Daily Tape 2   Daily Tape 3   Daily Tape 4
Weekly Tape 2           Daily Tape 1   Daily Tape 2   Daily Tape 3   Daily Tape 4
Weekly Tape 3           Daily Tape 1   Daily Tape 2   Daily Tape 3   Daily Tape 4
Weekly Tape 4           Daily Tape 1   Daily Tape 2   Daily Tape 3   Daily Tape 4
End of Month Tape 1
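A GFS schedule of this kind can be expressed as a function from a calendar date to a tape label. The Python sketch below is one possible reading of the scheme, under two assumptions that the table does not state: the Friday-through-Sunday window is identified by its Friday, and the last Friday of a month takes the monthly tape instead of a weekly one. The function name `gfs_tape_for` is hypothetical.

```python
from datetime import date, timedelta


def gfs_tape_for(day):
    """Return the tape label a GFS rotation would use on `day`."""
    weekday = day.weekday()  # Monday is 0, Sunday is 6

    # Monday through Thursday: incremental backup to a daily tape.
    if weekday <= 3:
        return f"Daily Tape {weekday + 1}"

    # Friday through Sunday: find the Friday that opens this backup window.
    friday = day - timedelta(days=weekday - 4)

    # Assumption: the last Friday of the month uses the monthly tape.
    if (friday + timedelta(days=7)).month != friday.month:
        return "End of Month Tape"

    # Otherwise number the weekly tape by which Friday of the month this is.
    week_index = (friday.day - 1) // 7 + 1
    return f"Weekly Tape {week_index}"
```

With these assumptions, a month with five Fridays uses Weekly Tapes 1 through 4 followed by the monthly tape, which matches the sequence in the table; a month with four Fridays would use only three weekly tapes before the monthly tape.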

    


Media management is a central part of backup software and includes not only the tape rotation algorithm, but also information about tape usage and any special-purpose tapes, such as those used for historical archiving. Media management in a storage network environment usually keeps track of the number of times a tape has been used (sometimes referred to as tape passes) and the number of errors that occur on each tape. Old or failing tapes should be removed from regular tape rotations before they fail. Media management can also recommend the movement of tapes between local and remote locations for disaster recovery protection.
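The bookkeeping described here, counting tape passes and errors and retiring a tape before it fails, might look like the following Python sketch. The class `TapeRecord`, its field names, and the idea of retiring a tape when either count reaches a limit are illustrative assumptions; the limits themselves are left as required arguments because appropriate values depend on the media and the vendor's guidance.

```python
from dataclasses import dataclass


@dataclass
class TapeRecord:
    """Usage history for one tape (hypothetical structure)."""

    label: str
    passes: int = 0          # number of times the tape has been used
    errors: int = 0          # cumulative errors seen on this tape
    location: str = "local"  # for example "local" or "remote"

    def record_use(self, errors_seen=0):
        """Log one more pass and any errors that occurred during it."""
        self.passes += 1
        self.errors += errors_seen

    def should_retire(self, *, max_passes, max_errors):
        """True once either limit is reached, so the tape can be pulled early."""
        return self.passes >= max_passes or self.errors >= max_errors
```

A rotation job could call `record_use` after each backup and check `should_retire` before reusing the tape.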

Whatever tape rotation algorithm is used, it is highly recommended that you follow it closely to avoid unexpected coverage "holes" in the backup data. Using tapes out of sequence can result in the loss of data on backup tapes that might be needed during a disaster recovery operation. Unfortunately, with so many tapes in an organization, it's not hard to understand how mistakes are made in the execution of tape rotation. Paying close attention to media management reduces backup errors. Using automated tape equipment such as autoloaders or libraries also helps reduce administrator errors.

Tape Devices and Subsystems

Tape devices and subsystems were discussed in Chapters 4 and 5. Tape equipment has been practically synonymous with network backup for many years. However, as remote storage network transmissions become more affordable and as the need for faster backup processing continues to increase, disk-based backup techniques will take a larger share of the load. Disk backup is discussed toward the end of this chapter.

Device Interconnect or SAN Connection

Tape devices and subsystems have to connect to either a device interconnect technology or a SAN. In legacy network backup, the backup server usually connects to tape equipment over the SCSI bus. In most SANs, tape devices attach to a device interconnect within a tape subsystem, which connects to systems over a SAN.

Backup Metadata

The core of most network backup systems is the internal database or catalog system that contains information about the data that has been backed up. This backup information base is also called the backup metadata (data about the data). Information kept in backup metadata could include things like the full extended name (which includes the directory path), the date/time stamp of file creation and access, the date/time of the backup copy, file size, file owner, access rights, tape name, tape location data, and the operation used for that backup task. Backup metadata can also include a field identifying whether the file has been deleted. Being able to skip files that were previously deleted can be a great benefit during a restore.
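One way to picture a catalog entry is as a record with one field per item in that list. The Python sketch below is an assumption-laden illustration: the class `BackupRecord`, its field names, and the rule that the most recent record for a path decides whether the file counts as deleted are all invented here, not taken from any real backup catalog.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BackupRecord:
    """One catalog entry for one backed-up copy of a file (hypothetical layout)."""

    path: str            # full extended name, including the directory path
    created: datetime
    accessed: datetime
    backed_up: datetime  # when this backup copy was made
    size_bytes: int
    owner: str
    access_rights: str
    tape_name: str
    tape_location: str
    operation: str       # for example "full" or "incremental"
    deleted: bool = False


def restore_candidates(records):
    """Return the newest record per path, skipping files flagged as deleted."""
    latest = {}
    for rec in records:
        current = latest.get(rec.path)
        if current is None or rec.backed_up > current.backed_up:
            latest[rec.path] = rec
    return [rec for rec in latest.values() if not rec.deleted]
```

Given the catalog for a server, `restore_candidates` would name the tape and version to read for each file while leaving out files that had been removed on purpose.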

In some backup systems, the metadata functions like a real-time transaction processing system that records information about data as it is being backed up. In other systems, the backup metadata is processed as a batch job from a log file after all the data copies have been made.

Backup metadata provides the underpinnings for most restore functions of the system by allowing administrators to easily determine which files and versions to restore. Metadata significantly speeds up restore processing compared to manual methods.

Backup metadata can become very large, requiring administrative oversight in order to maintain an efficiently running backup system. For example, a server with 250,000 files must have a minimum of 250,000 records in its metadata if each file in the system has been backed up only once. But, in fact, most files are backed up multiple times by different backup operations. Assuming each file is backed up ten times, creating ten records, the total number of records would be 2.5 million. At some point in all backup systems, the metadata reaches a size that starts impacting backup performance. The key issue is the number of files and objects being backed up and tracked individually.
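The arithmetic in this example is easy to reproduce for other servers. The short Python sketch below uses a hypothetical helper, `catalog_records`, and assumes that every file is backed up the same number of times, which is a simplification.

```python
def catalog_records(file_count, backups_per_file):
    """Metadata records needed if each file is backed up `backups_per_file` times."""
    return file_count * backups_per_file


# The figures from the example: 250,000 files, each backed up ten times.
records = catalog_records(250_000, 10)
print(f"{records:,} records")  # prints "2,500,000 records"
```

Plugging in the file counts of a real environment gives a quick sense of how fast the catalog grows as files and retained versions multiply.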

The Big Picture of Backup

Figure 13-1 shows the location of the backup functions just discussed in a legacy LAN-based backup system. The arrows indicate the direction in which data is copied by the backup process.

Figure 13-1. Legacy Network Backups




Storage Networking Fundamentals: An Introduction to Storage Devices, Subsystems, Applications, Management, and File Systems (Vol 1)
ISBN: 1587051621
EAN: 2147483647
Year: 2006
Pages: 184
Authors: Marc Farley
