Large objects, or LOBs, come in different flavors. If a large object is to be stored in the database in binary format, it's commonly referred to as a binary large object (BLOB). If a large object is to be stored in the database as textual data, it's commonly referred to as a character large object (CLOB). Table 14-1 shows the data types for each of these large object types.
Large Object Type | SQL Server Data Type | Maximum Size | .NET Data Type |
---|---|---|---|
BLOB | varbinary(MAX) Image | 2,147,483,647 | Byte Array |
CLOB | varchar(MAX) Text | 2,147,483,647 | String |
Unicode CLOB (NCLOB) | nvarchar(MAX) NText | 1,073,741,823 | String |
XML Document | xml | 2,147,483,647 | String |
SQL Server 2005 has new data types for working with large objects: varbinary(MAX), varchar(MAX), nvarchar(MAX), and xml. Note that the Image, Text, and NText data types are used with earlier versions of SQL Server and will be removed in the next release of SQL Server. The xml data type is covered in detail in Chapter 15.
Opinions have always differed on where LOBs should be stored. Some people believe they should be stored as files on the file system, while others believe that LOBs belong in the database. From a performance perspective, storing LOBs as files is definitely faster. If the LOBs are images, it might be easier for graphic artists to access them as files when the images need to be edited.
The biggest problem with storing LOBs as files shows up when databases are moved from one machine to another; it's easy for the files to get disconnected from the database. Even in the Northwind sample database, the Employees table has a field called PhotoPath that contains invalid paths to the employee's photo, and the photo files are nowhere to be found. On the other hand, the Employees table does contain a column called Photo that has the employee's photo embedded in it. When the LOBs are embedded in the database, you can be sure that backups contain synchronized copies of the data. In short, storing the LOB in the database helps you keep your data together at all times, which is why I prefer using the database for storing LOBs.
In ADO.NET, you can work with LOBs by using a SqlDataReader object to return a result set, by using a SqlDataAdapter object to fill a DataTable object, or by using a large object SqlParameter.
When you work with CLOBs, most operations work the same with large objects as they do with their smaller counterparts, as long as the large object can be loaded into memory.
If a large object is so large that you can't load it without running out of memory, you must deal with it by reading and processing a chunk at a time.
Also, when you work with BLOBs, you might not have a lot of experience working with binary data, which can be a problem. Most of our examples in this chapter use BLOBs, so you can get up to speed quickly.
The normal operation of the DataReader object is to read one row at a time. When the row is available, all of the columns are buffered and available for you to access, in any order.
To access the DataReader object in a stream fashion, you can change the DbCommand object's behavior to a sequential stream when you execute the ExecuteReader method. In this mode, you must get the bytes from the stream, in the order of each column that is being returned, and you can't retrieve the data more than once. You essentially have access to the underlying DataReader object's stream.
To work with chunks of data, you should understand the operation of a stream object. When you read from a stream, you pass a byte array buffer that the stream populates. The stream does not have an obligation to populate the buffer, however. The stream's only obligation is to populate the buffer with at least 1 byte if the stream is not at its end. If the end has been reached, no bytes are read. When you use slow streams, such as a slow Internet network stream, data might not be available when you attempt to read the stream. In this case, the stream is not at its end but no bytes are available, and the thread will block (wait) until 1 byte has been received. Based on the stream operation described, you should always perform stream reading in a loop that continues until no more bytes are read.
To try reading BLOB data from the database and writing to a file, create a Windows application called LobTest and add a button to the form. Change the button's Text property to Photos To File. Add the Northwind database to the project. Add an App.config file with the following connection string.
App.config
<?xml version="1.0" encoding="utf-8" ?> <configuration> <connectionStrings> <add name="NwString" connectionString="Data Source=.\SQLEXPRESS; AttachDbFilename=|DataDirectory|\northwnd.mdf; Integrated Security=True;User Instance=True" providerName="System.Data.SqlClient" /> </connectionStrings> </configuration>
Double-click the button, and add the following code that opens a connection to the database, retrieves all of the employee photos, and stores each one to a disk file.
Visual Basic
Imports System.Configuration Imports System.Data Imports System.Data.SqlClient Imports System.IO Public Class Form1 Private Sub button1_Click(ByVal sender As System.Object, _ ByVal e As System.EventArgs) Handles button1.Click const employeeIdColumn as integer = 0 const employeePhotoColumn as integer= 1 'bufferSize must be bigger than oleOffset const bufferSize as integer = 100 Dim buffer(bufferSize) as byte Dim byteCountRead as integer Dim currentIndex as long = 0 Dim nwSetting as ConnectionStringSettings = _ ConfigurationManager.ConnectionStrings("NwString") using cn as new SqlConnection() cn.ConnectionString = nwSetting.ConnectionString cn.Open() using cmd as SqlCommand = cn.CreateCommand() cmd.CommandText = _ "SELECT EmployeeID, Photo FROM Employees" Dim rdr as SqlDataReader = cmd.ExecuteReader(_ CommandBehavior.SequentialAccess) while (rdr.Read()) dim employeeId as integer = _ rdr.GetInt32(employeeIdColumn) dim fileName as string = "c:\Employee" _ + employeeId.ToString().PadLeft(2, "0"c) _ + ".bin" ' Create a file to hold the output. using fs as new FileStream(_ fileName, FileMode.OpenOrCreate, _ FileAccess.Write) currentIndex = 0 byteCountRead = _ cint(rdr.GetBytes(employeePhotoColumn, _ currentIndex, buffer, 0, bufferSize)) while (byteCountRead <> 0) fs.Write(buffer, 0, byteCountRead) currentIndex += byteCountRead byteCountRead = _ cint(rdr.GetBytes(employeePhotoColumn, _ currentIndex, buffer, 0, bufferSize)) end while end using end while end using end using MessageBox.Show("Done") End Sub End Class
C#
using System; using System.Data; using System.Text; using System.Windows.Forms; using System.Configuration; using System.Data.SqlClient; using System.IO; namespace LobTest { public partial class Form1 : Form { public Form1() { InitializeComponent(); } private void button1_Click(object sender, EventArgs e) { const int employeeIdColumn = 0; const int employeePhotoColumn = 1; //bufferSize must be bigger than oleOffset const int bufferSize = 100; byte[] buffer = new byte[bufferSize]; int byteCountRead; long currentIndex = 0; ConnectionStringSettings nwSetting = ConfigurationManager.ConnectionStrings["NwString"]; using (SqlConnection cn = new SqlConnection()) { cn.ConnectionString = nwSetting.ConnectionString; cn.Open(); using (SqlCommand cmd = cn.CreateCommand()) { cmd.CommandText = "SELECT EmployeeID, Photo FROM Employees"; SqlDataReader rdr = cmd.ExecuteReader( CommandBehavior.SequentialAccess); while (rdr.Read()) { int employeeId = rdr.GetInt32(employeeIdColumn); string fileName = @"c:\Employee" + employeeId.ToString().PadLeft(2, '0') + ".bin"; // Create a file to hold the output. using (FileStream fs = new FileStream( fileName, FileMode.OpenOrCreate, FileAccess.Write)) { currentIndex = 0; byteCountRead = (int)rdr.GetBytes(employeePhotoColumn, currentIndex, buffer, 0, bufferSize); while (byteCountRead != 0) { fs.Write(buffer, 0, byteCountRead); currentIndex += byteCountRead; byteCountRead = (int)rdr.GetBytes(employeePhotoColumn, currentIndex, buffer, 0, bufferSize); } } } } } MessageBox.Show("Done"); } } }
This code gives you the pattern for reading the BLOB and writing it to a file. The ExecuteReader method is executed with the CommandBehavior.SequentialAccess parameter. Next, a loop runs to read row data, and within the loop and for each row, the employee's ID is read to create the filename. A new FileStream object is created, which opens the file for writing.
Next, a loop reads bytes into a byte array buffer and then writes the bytes to the file. The buffer size is set to 100 bytes, which keeps the amount of data in memory to a minimum.
Notice that the filename has a .bin extension. In Chapter 7 and Chapter 8, I explained that the employee photos were stored in the database with an OLE header. To retrieve the photos and strip off the OLE header, add a second button and add the following code, which reads the header and tests it to verify that the OLE header exists.
Visual Basic
Private Sub button2_Click(ByVal sender As System.Object, _ ByVal e As System.EventArgs) Handles button2.Click Const oleOffset As Integer = 78 Const oleTypeStart As Integer = 20 Const oleTypeLength As Integer = 12 Const employeeIdColumn As Integer = 0 Const employeePhotoColumn As Integer = 1 Const bufferSize As Integer = 100 'must be bigger than oleOffset Dim buffer(bufferSize) As Byte Dim bufferStart As Integer = 0 Dim byteCountRead As Integer Dim currentIndex As Long = 0 Dim nwSetting As ConnectionStringSettings = _ ConfigurationManager.ConnectionStrings("NwString") Using cn As New SqlConnection() cn.ConnectionString = nwSetting.ConnectionString cn.Open() Using cmd As SqlCommand = cn.CreateCommand() cmd.CommandText = _ "SELECT EmployeeID, Photo FROM Employees" Dim rdr As SqlDataReader = cmd.ExecuteReader( _ CommandBehavior.SequentialAccess) While (rdr.Read()) Dim employeeId As Integer = _ rdr.GetInt32(employeeIdColumn) Dim fileName As String = "c:\Employee" + _ employeeId.ToString().PadLeft(2, "0"c) + ".bmp" ' Create a file to hold the output. Using fs As New FileStream(_ fileName, FileMode.OpenOrCreate, FileAccess.Write) currentIndex = 0 'read until we have the oleheader, if possible While (currentIndex < oleOffset) byteCountRead = _ CInt(rdr.GetBytes(employeePhotoColumn, _ currentIndex, buffer, CInt(currentIndex), _ bufferSize - CInt(currentIndex))) If (byteCountRead = 0) Then Exit While currentIndex += byteCountRead End While byteCountRead = CInt(currentIndex) 'process oleheader, if it exists If (byteCountRead >= oleOffset) Then Dim type As String = Encoding.ASCII.GetString( _ buffer, oleTypeStart, oleTypeLength) If (type = "Bitmap Image") Then bufferStart = oleOffset byteCountRead = byteCountRead - oleOffset End If End If While (byteCountRead <> 0) fs.Write(buffer, bufferStart, byteCountRead) bufferStart = 0 byteCountRead = _ CInt(rdr.GetBytes(employeePhotoColumn, _ currentIndex, buffer, 0, bufferSize)) currentIndex += byteCountRead End While End Using End While End Using End Using MessageBox.Show("Done") End Sub
C#
private void button2_Click(object sender, EventArgs e) { const int oleOffset = 78; const int oleTypeStart = 20; const int oleTypeLength = 12; const int employeeIdColumn = 0; const int employeePhotoColumn = 1; const int bufferSize = 100; //must be bigger than oleOffset byte[] buffer = new byte[bufferSize]; int bufferStart = 0; int byteCountRead; long currentIndex = 0; ConnectionStringSettings nwSetting = ConfigurationManager.ConnectionStrings["NwString"]; using (SqlConnection cn = new SqlConnection()) { cn.ConnectionString = nwSetting.ConnectionString; cn.Open(); using (SqlCommand cmd = cn.CreateCommand()) { cmd.CommandText = "SELECT EmployeeID, Photo FROM Employees"; SqlDataReader rdr = cmd.ExecuteReader( CommandBehavior.SequentialAccess); while (rdr.Read()) { int employeeId = rdr.GetInt32(employeeIdColumn); string fileName = @"c:\Employee" + employeeId.ToString().PadLeft(2, '0') + ".bmp"; // Create a file to hold the output. using (FileStream fs = new FileStream( fileName, FileMode.OpenOrCreate, FileAccess.Write)) { currentIndex = 0; //read until we have the oleheader, if possible while (currentIndex < oleOffset) { byteCountRead = (int)rdr.GetBytes(employeePhotoColumn, currentIndex, buffer, (int)currentIndex, bufferSize - (int)currentIndex); if (byteCountRead == 0) break; currentIndex += byteCountRead; } byteCountRead = (int)currentIndex; //process oleheader, if it exists if (byteCountRead >= oleOffset) { string type = Encoding.ASCII.GetString( buffer, oleTypeStart, oleTypeLength); if (type == "Bitmap Image") { bufferStart = oleOffset; byteCountRead = byteCountRead - oleOffset; } } while (byteCountRead != 0) { fs.Write(buffer, bufferStart, byteCountRead); bufferStart = 0; byteCountRead = (int)rdr.GetBytes(employeePhotoColumn, currentIndex, buffer, 0, bufferSize); currentIndex += byteCountRead; } } } } } MessageBox.Show("Done"); }
This code still takes into account that when the stream is read, it might not fill the buffer. This means a loop is created to read enough bytes to get the complete OLE header, if one exists. This code also has checks to verify that the photo is at least as large as the OLE header, in case a very small photo is loaded into the database. After the code tests for the OLE header, the main read/write loop starts, but the first time through, the writing might start after the OLE header, if it exists.
You can write BLOB data to a database by issuing the appropriate INSERT or UPDATE statement and passing the BLOB value as an input parameter. Because the BLOB is stored in binary format, you can pass an array of type byte as a binary parameter.
BLOBs that are quite large will consume too many system resources; your application performance will suffer and possibly throw out-of-memory exceptions.
You can use the SQL Server UPDATETEXT function to write the BLOB data in chunks of a specified size. The UPDATETEXT function requires a pointer to the BLOB field being updated, so the SQL Server TEXTPTR function is first called to get a pointer to the field of the record to be updated.
The following code example updates the Employees table, replacing a photo with a new one from a file.
Visual Basic
Private Sub button3_Click(ByVal sender As System.Object, _ ByVal e As System.EventArgs) Handles button3.Click Const bufferSize As Integer = 100 Dim buffer(bufferSize) As Byte Dim currentIndex As Long = 0 Dim photoPtr() As Byte Dim nwSetting As ConnectionStringSettings = _ ConfigurationManager.ConnectionStrings("NwString") Using cn As New SqlConnection() cn.ConnectionString = nwSetting.ConnectionString cn.Open() Using cmd As SqlCommand = cn.CreateCommand() cmd.CommandText = _ "SELECT TEXTPTR(Photo) FROM Employees WHERE EmployeeID = 1" photoPtr = CType(cmd.ExecuteScalar(), Byte()) End Using Using cmd As SqlCommand = cn.CreateCommand() cmd.CommandText = _ "UPDATETEXT Employees.Photo @Pointer @Offset null @Data" Dim ptrParm As SqlParameter = _ cmd.Parameters.Add("@Pointer", SqlDbType.Binary, 16) ptrParm.Value = photoPtr Dim photoParm As SqlParameter = _ cmd.Parameters.Add("@Data", SqlDbType.Image) Dim offsetParm As SqlParameter = _ cmd.Parameters.Add("@Offset", SqlDbType.Int) offsetParm.Value = 0 Using fs As New FileStream("Girl.gif", _ FileMode.Open, FileAccess.Read) Dim count As Integer = fs.Read(buffer, 0, bufferSize) While (count <> 0) photoParm.Value = buffer photoParm.Size = count cmd.ExecuteNonQuery() currentIndex += count offsetParm.Value = currentIndex count = fs.Read(buffer, 0, bufferSize) End While End Using End Using End Using MessageBox.Show("Done") End Sub
C#
private void button3_Click(object sender, EventArgs e) { const int bufferSize = 100; byte[] buffer = new byte[bufferSize]; long currentIndex = 0; byte[] photoPtr; ConnectionStringSettings nwSetting = ConfigurationManager.ConnectionStrings["NwString"]; using (SqlConnection cn = new SqlConnection()) { cn.ConnectionString = nwSetting.ConnectionString; cn.Open(); using (SqlCommand cmd = cn.CreateCommand()) { cmd.CommandText = "SELECT TEXTPTR(Photo) FROM Employees WHERE EmployeeID = 1"; photoPtr = (byte[])cmd.ExecuteScalar(); } using (SqlCommand cmd = cn.CreateCommand()) { cmd.CommandText = "UPDATETEXT Employees.Photo @Pointer @Offset null @Data"; SqlParameter ptrParm = cmd.Parameters.Add("@Pointer", SqlDbType.Binary, 16); ptrParm.Value = photoPtr; SqlParameter photoParm = cmd.Parameters.Add("@Data", SqlDbType.Image); SqlParameter offsetParm = cmd.Parameters.Add("@Offset", SqlDbType.Int); offsetParm.Value = 0; using (FileStream fs = new FileStream("Girl.gif", FileMode.Open, FileAccess.Read)) { int count = fs.Read(buffer, 0, bufferSize); while (count != 0) { photoParm.Value = buffer; photoParm.Size = count; cmd.ExecuteNonQuery(); currentIndex += count; offsetParm.Value = currentIndex; count = fs.Read(buffer, 0, bufferSize); } } } } MessageBox.Show("Done"); }
This code opens a connection and retrieves a pointer to the photo that is to be updated by calling the TEXTPTR function using a SqlCommand object. Then a new SqlCommand object is created, and its CommandText property is set to the following.
"UPDATETEXT Employees.Photo @Pointer @Offset null @Data "
Note that the null parameter defines the quantity of bytes to delete. Passing null indicates that all existing data should be deleted. Passing a 0 (zero) indicates that no data should be deleted; the new data simply overwrites the exist data. (You pass a number if you want to delete some of the data.) The other parameters represent the pointer to the start of the photo, the current offset to insert data, and the data being sent to the database.
After the file is opened, a loop starts that reads chunks of the file into the buffer and then sends the chunk to the database.