Flylib.com

Books Software

 
 
 

Snapshot Isolation


Snapshot Isolation

SQL Server 2005 introduces a new isolation level called snapshot isolation , which, like REPEATABLE READ , avoids dirty reads and non-repeatable reads, but unlike with REPEATABLE READ , with snapshot isolation , readers don't block writers. Snapshot isolation is based on a technique called as row versioning , which makes use of the tempdb system database to keep a version of updated rows. With snapshot isolation, if a transaction updates or deletes a row that is already read by another transaction, SQL Server copies the original version of the row to the tempdb database. If the transaction tries to read the row again, instead of reading the row from the user database, SQL Server reads the row from the tempdb database, and hence avoids the non-repeatable read issue; at the same time, it avoids the blocking. This increases the concurrency and data availability and reduces the locking and deadlocking.

Snapshot isolation in SQL Server 2005 is exposed in two ways:

  • By turning on the READ_COMMITTED_SNAPSHOT database option. Then the default READ COMMITTED isolation level makes use of row versioning instead of locks to protect transactions from dirty reads and to support repeatable reads. Once the READ_COMMITTED_SNAPSHOT database option is turned on, all the READ COMMITTED TRansactions in that database use row versioning to run in snapshot isolation mode.

  • By turning on the ALLOW_SNAPSHOT_ISOLATION database option. Then you set the session's isolation level to snapshot isolation by running the SET TRANSACTION ISOLATION LEVEL SNAPSHOT T-SQL statement. With this, only the transactions in the current session are run in snapshot isolation mode.

Note

The ALLOW_SNAPSHOT_ISOLATION database option is by default turned off. Even if this option is off, SET TRANSACTION ISOLATION LEVEL SNAPSHOT will succeed, but as soon as you try to perform any DML or SELECT operation, SQL Server raises an error indicating that the database is not enabled for snapshot isolation.


Snapshot isolation is discussed in great detail, including its pros and cons and examples, in Chapter 9, "Performance Analysis and Tuning."



Ranking Functions

SQL Server 2005 introduces a new category of built-in functions: analytical ranking functions. This category consists of four functions ROW_NUMBER() , RANK() , DENSE_RANK() , and NTILE() that you can use to generate a sequence number or a sequential number for each row or to rank rows based on some partition criteria. The main idea behind providing these functions is to reduce the complexity and amount of query code that you have to write in order to efficiently calculate row numbers or rank the rows. Let's look at an example of this and then examine each function in detail.

The goal of this example is to output a column containing a sequential number based on a unique combination of ProductID and LocationID columns in the Production.ProductInventory table in the AdventureWorks sample database. Here is how you would do it without using any ranking function:

USE AdventureWorks;
GO
SELECT
  ( SELECT COUNT(*)
    FROM Production.ProductInventory AS pi2
    WHERE
         pi2.ProductID < pi1.ProductID OR
         (
          pi2.ProductID = pi1.ProductID AND
          pi2.LocationID <= pi1.LocationID
         )
  ) AS SequenceNo, *
FROM Production.ProductInventory AS pi1
ORDER BY ProductID, LocationID;
GO

This query produces the following output (some columns and rows have been omitted for brevity):


SequenceNo

ProductID   LocationID Shelf      Bin  Quantity

-----------

----------- ---------- ---------- ---- --------

1

1           1          A          1    408

2

1           6          B          5    324

3

1           50         A          5    353

4

2           1          A          2    427

5

2           6          B          1    318

6

2           50         A          6    364

Here is how you can obtain the same results by using the new ROW_NUMBER() ranking function:

SELECT

ROW_NUMBER() OVER (ORDER BY ProductID, LocationID)

as SequenceNo, *
FROM Production.ProductInventory;
GO

Let's assume that you want to reset the sequence number and start over from number 1 when the ProductID changes. Here is how you would do it without using any ranking function:

SELECT
  ( SELECT COUNT(*)
    FROM Production.ProductInventory AS pi2
    WHERE

pi2.ProductID = pi1.ProductID AND

(pi2.ProductID < pi1.ProductID OR
           (pi2.ProductID = pi1.ProductID AND
            pi2.LocationID <= pi1.LocationID)
         )
  ) AS SequenceNo, *
FROM Production.ProductInventory AS pi1
ORDER BY ProductID, LocationID;
GO

This query produces the following output (some columns and rows have been omitted for brevity):


SequenceNo

ProductID   LocationID Shelf      Bin  Quantity

-----------

----------- ---------- ---------- ---- --------

1

1           1          A          1    408

2

1           6          B          5    324

3

1           50         A          5    353

1

2           1          A          2    427

2

2           6          B          1    318

3

2           50         A          6    364

1

3           1          A          7    585

2

3           6          B          9    443

You can obtain the same results by using the new ROW_NUMBER() ranking function as shown here:

SELECT ROW_NUMBER() OVER
 (

PARTITION BY ProductID

ORDER BY ProductID, LocationID) as SequenceNo, *
FROM Production.ProductInventory;
GO

As you can see in these two examples, you write less code when you use the ranking functions, the queries are simple to understand and maintain. And if you study the execution plan for these queries, you notice that ranking functions perform better than the traditional approach.

The ROW_NUMBER Ranking Function

As illustrated in the preceding section, the ROW_NUMBER ranking function can be used to sequentially number starting at 1 for the first row. You can partition the result set and restart the sequential numbering by using the PARTITION BY clause, also illustrated in the preceding section.

One common application where ROW_NUMBER can be very valuable is when you are building a web application and need to implement the paging in the result set:

SELECT * FROM
   (SELECT ROW_NUMBER() OVER (ORDER BY ProductID, LocationID) as SequenceNo, *
    FROM Production.ProductInventory) AS tblRows

WHERE SequenceNo BETWEEN 101 and 120

;
GO

Note that you can directly use the alias for the ROW_NUMBER function; you simply have to nest the query as shown here to use the ROW_NUMBER column alias ( SequenceNo in this case) in the WHERE condition.

The RANK and DENSE_RANK Ranking Functions

The RANK and DENSE_RANK ranking functions are similar to the ROW_NUMBER function as they also produce ranking values according to sorting criteria, optionally partitioned into groups of rows. However, unlike the ROW_NUMBER function, which generates a different value for each row, the RANK and DENSE_RANK functions output the same value for all rows that have the same values in the sort column. The RANK and DENSE_RANK functions differ in the way the rank number value is generated when the sort order column value changes. The following example illustrates this:

SELECT

RANK() OVER (ORDER BY ProductID)

as Rank,

DENSE_RANK() OVER (ORDER BY ProductID)

as DenseRank,
       *
FROM Production.ProductInventory;

This query produces the following output (some columns and rows have been omitted for brevity):


Rank


DenseRank

ProductID   LocationID Shelf  Bin  Quantity

------


----------

----------- ---------- ------ ---- --------

1


1

1           1          A      1    408

1


1

1           6          B      5    324

1


1

1           50         A      5    353

4


2

2           1          A      2    427

4


2

2           6          B      1    318

4


2

2           50         A      6    364

7


3

3           1          A      7    585

7


3

3           6          B      9    443

7


3

3           50         A      10   324

10


4

4           1          A      6    512

When ProductID changes, the DENSE_RANK function increments the rank value by 1. However, the RANK function increments the rank value by the number of rows for the preceding rank.

The NTILE Ranking Function

The fourth and final ranking function introduced in SQL Server 2005, NTILE can be used to assemble rows into a desired number of buckets according to specified sorting criteria, optionally within partitions. This function takes an integer value that indicates how many groups you want:

SELECT

NTILE(3)

OVER (ORDER BY ShipMethodID) AS Bucket, *
FROM Purchasing.ShipMethod;

The Purchasing.ShipMethod table contains five rows. NTILE(3) divides those five rows into three buckets, returning values 1 through 3 in the first column, named Bucket , as the output of preceding query (some columns have been omitted for brevity):


Bucket

ShipMethodID Name

-------

------------ --------------------

1

1            XRQ - TRUCK GROUND

1

2            ZY - EXPRESS

2

3            OVERSEAS - DELUXE

2

4            OVERNIGHT J-FAST

3

5            CARGO TRANSPORT 5

Note how the NTILE function handles the situation when even distribution of rows is not possible. It puts two rows into the first group, two rows into the second group, and the remaining one row into the third group . When even distribution is not possible, larger groups precede smaller groups. Implementing this functionality in SQL Server 2000 is possible but a little complex, and it requires more code.