Archiving Data | Professional SQL Server 2005 Administration (Wrox Professional Guides)

Archiving a large amount of data from very large tables can be challenging. For example, if you select millions of rows from a billon-row table, copy them, and then delete them from that table, that will be a long-running delete process that may escalate to a table lock and reduce concurrency, which will not be acceptable, unless no one will be using the table. A commonly used procedure is to periodically delete a small number of rows to improve table concurrency, as the smaller number of rows may take page locks and may use an index for faster access and complete faster.

An efficient procedure to archive large amounts of data is to use a sliding time window table partitioning scheme. There are two approaches to this solution: using SQL Server 2005 data partitioning or using a partition view.

SQL Server 2005 table partitioning

A new feature of SQL Server 2005 is table partitioning, where a table can be carved into 1000 pieces, where each can reside on its own filegroup, and then the partition can be independently backed up. With table partitioning, a new empty partition is created when the next monthly data becomes available. Then the oldest partition can be switched out into a table and moved to an archive without table locking or reducing concurrency, because switching out a partition is only a schema change and occurs very quickly. The basic steps to create a table partition are:

Create a partition function that describes how the data is to be partitioned.
Create a partition schema that maps the pieces to the filegroups.
Create one or more tables using the partition scheme.

The following is an example of creating a partition table using a monthly sliding window.

 --Create partition function CREATE PARTITION FUNCTION [OrderDateRangePFN](datetime) AS RANGE RIGHT FOR VALUES (N'2005-01-01 00:00:00', N'2005-02-01 00:00:00', N'2005-03-01 00:00:00', N'2005-04-01 00:00:00'); --Create partition scheme CREATE PARTITION SCHEME [OrderDatePScheme] AS PARTITION [OrderDateRangePFN] TO ([filegroup1], [filegroup2], [filegroup3], [filegroup4], [filegroup5]); --Create partitioned table SalesOrderHeader CREATE TABLE [dbo].[SalesOrderHeader](   [SalesOrderID] [int] NULL,   [RevisionNumber] [tinyint] NOT NULL,   [OrderDate] [datetime] NOT NULL,   [DueDate] [datetime] NOT NULL,   [ShipDate] [datetime] NULL,   [Status] [tinyint] NOT NULL ) ON [OrderDatePScheme]([OrderDate]);

Partition View

This technique has been around since earlier versions of SQL Server, using a partition view to group independent, identical tables together (for example, a new table for each month). Here is the procedure:

Create individual, identical tables with a check constraint to limit the data that can reside on each.
Create a view to union all of these tables together.
Load the data through the partition view. SQL Server evaluates the table constraint to insert the data in the correct table.
Before the next date period, create a new table with the date period constraint and include it as part of the view definition. Then load the current data through the view.
To archive, remove the oldest table from the view definition and then archive it. Each table can be placed on its own filegroup and can be backed up individually.

This technique does not have the 1000-partition limitation but requires more manageability, as each table is independent and managed.