Choosing the Right Data Access Interface | Hitchhikers Guide to Visual Studio and SQL Server: Best Practice Architectures and Examples, 7th Edition (Microsoft Windows Server System Series)

This is not a book just about ADO.NET (although I'll spend a lot of words discussing it and showing you how to use it)it's a book about solving data access problems with Visual Studio and SQL Server. However, one of the problems I've seen of late is the tendency to use ADO.NET where it does not belong. ADO.NET (as DAO, RDO, ADOc, and the low-level data interfaces) are simple query and, some would say, update interfaces. None of these interfaces is particularly good at handling large rowsets or for moving data from one place to another. Using ADO.NET to move data from one system to another is like moving a ton of coal in a bicycle basket. ADO.NET is designed for, optimized for, and should be principally used for executing queries, managing client-side data, and posting minor updates to a database. I know there are many, many situations where you need to move data from one system to another. I spent a significant part of my time at Electronic Data Systems (EDS) using custom utilities to do just that.

SQL Server has always included tools for importing and exporting data. One of the oldest of these is the Bulk Copy utility (BCP). SQL Server 2000 added a new tool called "Data Transformation Services" (DTS). SQL Server 2005 replaced DTS with "SQL Server Integration Services" (SSIS). To make things even easier for data access developers, ADO.NET now supports a new SqlBulkCopy class to permit you to upload data from virtually any data source to SQL Server. Why go to the trouble of using BCP? Well, in one word: performancestartlingly fast performance. What might take days using ADO action commands can be done in minutes to hours. Because BCP, SSIS, and the SqlBulkCopy method use proprietary high-speed interfaces that bypass much of SQL Server's overhead, they can work far faster than using UPDATE or INSERT SQL commands.

When you need to make bulk changes to the database but these changes require complex logic (but isn't driven by user input), some developers have (unwisely) chosen to fetch many thousands of rows from the server, process them on the client, and send back the updated rows to the server. This approach is typical of developers who are not familiar with SQL Server stored procedures or other server-side executables. Consider that you can transport bulk data from virtually any trusted or untrusted source into a "temporary" SQL Server table. Once on the server, it's simply a matter of coding a stored procedure to munge the new data and merge it into the permanent (trusted) tables using whatever business rules or referential integrity constraints make sense. There is rarely a way to make the case to have a client perform bulk operations using ADO.NET queries and updates.

Choosing the Right Data Access Provider

As I'll discuss when I get to the later chapters, ADO.NET is simply a portal to a variety of data access "providers." Each provider exposes standard classes to implement custom native protocols (using managed code) or OSFA protocols like ODBC or OLE DB. These OSFA protocols are simply bridge interfaces to custom ODBC (managed code) drivers or COM-based (unmanaged) OLE DB providers written for specific backends. As a rule of thumb, it's best to use the most efficient protocolwith ADO.NET that means using a managed .NET data provider like SqlClient for SQL Servernot an OSFA interface. As a result, I won't be discussing the other providers very much (if at all), but I will make it abundantly clear how to use the SQL Server SqlClient provider in later chapters.

Given that developers sometimes wish to create a single application that needs to connect to a variety of backend data engines, using SqlClient can be problematic. In this case, it might be better to use a "generic" data interface that permits you to subclass the low-level data access calls and not have to specify a specific DBMS interface class. ADO.NET exposes new "factory" classes that make this process easier. That said, I'm not convinced that it's possible to write an efficient, competitive generic application. While it's possible to create classes that are backend agnostic, you may access only those features on the backend server that can be accessed on all other targeted backends. This approach means you might not be able to use stored procedures, as there are no two DBMS engines that implement them the same way. You won't be able to use XML datatypes, CLR functions, user-defined types, ANSI-SQL extensions, and other features that are not implemented universally. If you stick to SQL Server implementations like Workgroup, Standard, Enterprise, and Express, you'll have a better chance of creating a single application that can work on a wider variety of platforms.

What About SQL Server Everywhere Edition?

Just as the final chapter of this book was turned in for editing, Microsoft re-re-released a CTP of its free "SQL Server CE" database engine as "SQL Server Everywhere" (SQLEv). This CTP release was preceded by the SQL Server Mobile 3.0 edition, which was targeted (by license) for use on Windows Smartphones, Pocket PC, PDAs, and Windows Tablet PC systems. Microsoft previewed this announcement over a year ago at an author's summit, where I gave it my support. I could see any number of interesting and important ways to leverage this tiny DBMS. Over the last year or so, I worked with the Product/Program managers at Microsoft to refine this "new and improved" version of SQL CE. I've decided to publish an EBook on SQL Server Everywhere^[7], so this book would not be delayed any further. The first release of SQLEv is expected to be in November, 2006, so it should be available on www.microsoft.com/SQL/Everywhere.

^[7] See the support site www.hitchikerguides.net for details on the EBooks "Hitchhiker's Guide to SQL Server Everywhere" and "Hitchhiker's Guide to SQL Server EverywhereReplication."

When compared to other versions of SQL Server, SQLEv is different. Different should not be equated to bad or taken in a negative lightin this case, it means better performance, cheaper development costs and more secure databases. Permit me to illuminate a few of these similarities and differences.

Implementation

SSEv is implemented as a set of DLLsnot a service like SQL Server. This is an advantage in its typical use scenarios as it does not have to be started or stopped to release its resources. This makes Xcopy or embedded deployment easy and safe. I discuss several deployment strategies in the Ebookone of which does not require administrator rights.
SSEv database files can use the default file extension of .sdf but do not need to have any particular file extension. Unlike SQL Server, a SSEv database cannot span more than one file. In addition, SSEv databases are backward-compatible with SQL Mobile so that data files can be passed in both directions between applications.
SSEv has the same database size limitation as SQL Server Express Edition (4 GB), but there are plans to remove this SSEv restriction to permit far larger databases.
SSEv can be included in a bootstrap deployment package, so it can be used in a web- (or intranet-) based click-once scenarioSQL Server deployment is still a lot more complex. SSEv can also be deployed using a Microsoft Installer (MSI) to permit automatic updates to the DLLs through Windows Update.
SSEv does not use named instances like SQL Server. Each application that uses SSEv simply references the DLLs to access a private (albeit sometimes shared) database file. This also means you don't have to worry about whether or not SSEv is "started" or pausedit's always available.
SSEv databases (unlike SQL Server) don't have a "name" like "Pubs" or "Biblio"they are referenced through their data path such as "C:\Databases\Accounting.sdf". However, the file can be moved at any time (unlike SQL Server)at least as long as the file is not open. Erasing or moving a SSEv database is bad unless your application can automatically relocate the .sdf file.
SSEv cannot link to other databases like SQL Server (and JET) but it can replicate to SQL Server which can link to other servers or instances of SQL Server.
SSEv SQL uses a simpler way to refer to objectsyou don't have to (can't) specify an "owner," "schema," or alternate database but you can change databases in SSEv SQL or codeas long as they aren't password-protected. I expect that issue to be fixed.
SSEv databases can be attached to email or simply copied as a file to any target platformsomething that's very difficult for SQL Server databases. This makes installation and deployment a snapthe database and all of its DLLs (about 1.7MB uncompressed) can actually fit on a floppy or in the limited memory on a Windows Smart Phone or PDA.
SSEv databases don't use a .log file to manage their atomic operations (like SQL Server)logging is done, but via shadow pages managed by SSEv. While a .log file might appear, it's used as a temporary workspace when the database is compacted.

Security

SSEv is as secure as any DBMS engine provided by Microsoftif not more so. It has a far smaller surface area of attack which makes it far easier to secure.
SSEv databases can be password protected, encrypted, and further protected by the Windows file system encryption and rights schemes.
SSEv runs in process with the application, so it's not vulnerable to many of the attacks made against stand-alone or service-based DBMS systems (like SQL Server) that might be exposed to the network.
SSEv does not run as a service and it does not listen on network ports.
SSEv has a code-free data file architecture, which means it does not have the ability to run externally supplied code.
SSEv does not have the ability to run imbedded secondary SQL commands inserted by hackers which means it's not as prone to SQL injection attacks.

Sharing and Replication

SSEv supports a sophisticated replication interface that works with BCP/DTS/SSIS (to some limited extent), multiple subscriptions, configurable compression, column-level tracking and synchronization progress status events. Replication synchronization can be executed synchronously or asynchronously. I'll show you how in my EBook focused on SSEv Replication.
SSEv does not include the new (for ADO.NET 2.0) SqlBulkCopy method, so you'll have to programmatically configure client-side replication routines to upload data.
SSEv databases can be replication Subscribers to SQL Server versions that support replication Publicationthat does not include SQL Server Express. This means if you expect to replicate an SSEv database, you're going to need SQL Server Workplace or a better edition. However, in the next release of Visual Studio (Orcas), it's expected that SSEv databases can sync with other SSEv databasesone on a device and another on the desktop.
SSEv can support up to 256 connections but with the same caveatsa practical limit of simultaneous connected operations is closer to 70. However I don't expect you'll need more than a couple of connections for most applications.
SSEv's sharing mechanism is designed to share the .sdf database file with applications on the same systemit's not designed for multi-user or network file-sharing configurations like JET. It does support multiple connections from different processesbut all on the same system.
SSEv data files can reside on a file share on another computer. It doesn't support multiple machines sharing the same data file as this would fall into the multi-user scenario. However, users who keep their documents on another server for backup reasons can use SSEv data files. This isn't the case with Express where files must physically live on the server machine.
SSEv databases can be shared among applications (including SQL Server Management Studio and Visual Studio) on the same system but must be closed to perform repair, manual compression, and validation. There are also sharing issues when trying to replicate the database that I discuss later. Unfortunately, there is no way to tell if other "users" have the database open other than trying to perform one of these maintenance operations. There is no way to force the other users to close their connectionsas there is with SQL Server.
SSEv database connections do not support connection pooling (they don't need to), and the database engine can't be (and does not have to be) paused, stopped, or started (like SQL Server).

Programming and Maintenance

SSEv uses unique techniques to manage data that makes it especially interesting for a number of lightweight data storage needs, but these techniques are different from those used by SQL Server. These innovations include the SqlCeResultSet class that can implement a scrollable, updatable, bindable, in-memory cursor, as well as the TableDirect CommandType. When coupled with a named index, a TableDirect query can permit high-speed index-ordered access to a selected table.
SSEv's functionality is exposed via the SqlCE namespace. This is similar in some respects to ADO.NET (1.0) so it's missing functionality like SqlBulkCopy, connection GetSchema, and a few other classes. This keeps SSEv light but also might cause you to search for alternative techniques to solve some problems easily addressed by ADO.NET 2.0.
SSEv exposes a SqlCeDataAdapter, but Visual Studio does not expose a wizard to populate it. However, the Visual Studio TableAdapter wizards work against SSEv databases, so you might not find a need to use the SqlCeDataAdapter (at least not directly).
SSEv's data provider "implements" a subset of SqlClient functionality so it sometimes behaves differently. Its system classes can be used to populate System.Data DataTable and DataSet classes with a few restrictions (and surprises) that I'll illuminate later. I'll caution you now that knowing SqlClient programming can be a disadvantage when working with SSEv. While the objects, properties, methods, and events might look similar, there are many differences in how these objects are implemented. These differences can play a significant role in how you design your application and how it interacts with SSEv.
SSEv exposes a number of maintenance methods like Compact, Verify, Repair, and CreateDatabase that imply that the database needs periodic compaction, verification, and repair. Actually, after version 3.1 database compaction is done automatically but you still might find a need to repair and verify your databaseand no, this is not the same kind of compaction that your cat gets when it eats too many mice. You'll also find that SSEv databases are far less fragile than file-share databases like JET.
SSEv's compact and repair feature can be done only when there are no live connections. If you have a corrupted DB, you can use repair to remove the corrupted pages. The Compact method is more like a disk defragmenter. It will move all the free pages to the end and shrink the file at the end.
SSEv 3.1 and later support auto shrink (automatic compression), true ACID support, and row-level locking. The database is automatically compressed when the last connection is closed.

The SQL Server Everywhere SQL Engine

SSEv does not support triggers, stored procedures, SQL Views, temp tables, or familiar SQL Server metadata tables (like sysobjects) and most system views. In addition, its SQL language is a subset if TSQLcontaining just the commands you need to manage a client-side database. In some ways, this is an advantagebecause SSEv does not support any form of programmability, it's not possible to embed malware into the database. The Microsoft team is considering some sort of trigger mechanism (events that fire on data change) but this won't appear until a later version.
SSEv SQL supports a subset of SQL commands. This means statements like TRUNCATE <table> and SELECT Name FROM SYSOBEJECTS won't work. There are system views such as INFORMATION_SCHEMA.TABLES that expose lists of database objects like columns, indexes, tables, and RI constraints. Even some of the more common operations like a simple SELECT INTO to create a new table are not supported. This also means some of the features added to TSQL recently are not implemented in SSEv's SQLincluding TOP, ROWCOUNT, and others.
SSEv supports the ability to "seek" based on a named index valuea feature inherited from ISAM database engines. Just remember, it's not supported on SQL Server so don't get spoiled. It also exposes the TableDirect CommandType to support the ability to seek, find, and return rows based on a selected index.
SSEv SQL supports DDL commands like CREATE DATABASE and CREATE INDEX, but the syntax is only similar to TSQL and less sophisticated. SQL Server Management Studio also does not script SSEv objects to SQL as it does for SQL Server objects. While you can create a database with SQL, you can't alter anything but the tables. This means you have to use code to compact the database if you need to change the password or encryption attributes.
SSEv supports Identity and rowguid columns, as well as column defaults, and CONSTRAINTs but not RULEs, or specific column collation settings. However, there are no GUI tools provided to help create PK/FK relationshipsthis can only be accomplished by using an SQL script to create (or alter) the database tables. You'll also find there's no mechanism to disable identity value generation during data import routines.
SSEv supports full atomicity, consistency, isolation, and durability (ACID) operations just like SQL Server. This is essential for target platforms that experience power outages (most PDAs and cell phones run on batteries) and communications link failures (most platforms depend on wireless connections that are prone to connectivity disruptions).
SSEv queries do not support multiple statements in a batch (like SQL Server). This means no "multiple resultset" scenarios are possibleeven though the NextResult method is exposed. Don't be fooled by the fact that this seems to work in SQL Server Management Studioit's playing "helpful" tricks behind the scenes.
SSEv returns different errors than those returned by SQL Server. This means that everything from the simplest of syntax errors to the most complex synchronization errors are exposed differently. I discuss error handling later in my EBook.
SSEv also does not support [bracketing] reserved wordsyou have to use "double" quotes to delineate reserved words in SSEv SQL. They tell me this might be fixed.
SSEv supports a subset of the SQL Server datatypes and none of the CLR UDT datatypes. This means that VarChar, VarChar(max), Char, Text, XML, Timestamp, and several other familiar and heavily used SQL types are not supported. You will find NVarChar, Nchar, and NText are supported, and the Microsoft team tells me that Timestamp might be part of a subsequent release. By reducing the number of datatypes and simplifying the SQL language, SSEv can be smaller, faster, and less complex to program.

SQL Server Tools and Visual Studio Support

SSEv databases and queries can be managed with Visual Studio's Server Explorer and SQL Server Management Studio. This permits developers to tune queries with a graphical query plan, setup subscriptions, and perform DTS/SSIS operations. There are some disconnects with SQL Server Management Studio such as the inability to directly update or generate SQL scripts from Tables. SQL Server Management Studio also does not support the ability to "syntax check" SQL commands as it does for SQL Server queries nor does it support the ability to script database objects like tables. I've been told that some of these issues are being addressed in future versions.
SSEv supports indexes (and a query optimizer), referential integrity, and constraintsall of the basics for a relational datastore. It also supports dynamic schema alteration which many of its competitors do not.
SSEv databases can be completely encrypted and password-protectedSQL Server databases cannot. While you can encrypt selected columns in a SQL Server database, its security scheme depends on protecting the SQL Server's host system physical securitySSEv does not.
SSEv cannot be monitored with the SQL Profiler or the new Trace tools. This means you're on your own when it comes to debugging what's being sent to SSEv.
SSEv databases can also be used as a data source for Reporting Services or any other product that can accept data from a DataReader.
SSEv is not intended to run on IIS. Yes, there is a documented unlocking mechanism if you really need to enable this scenariobut I'm not providing it.
SSEv's documentation is not up to the same standards as SQL Server. There are many disconnects in the indexes; the topics are all-too-often incomplete and sometimes just wrong. Add to that the three-generations of help topics and web-based content that talks about SQL CE, SQL Mobile, and now SSEv, you can appreciate how hard it is to find help online. Now that SSEv is targeted to more application platforms, it should get more funding and attention from the documentation teams.
And finally, SSEv is not a V1.0 product. It's been around for many years as SQL CE and more recently SQL Mobile. Over the years, it's been tuned, debugged, and refined so it's ready for more platforms and more applications.

Don't get me wrong. I like SQL Server Everywhere. It addresses a host of problems that developers experience with JET (and SQL Server) and provides a fast, light, and fairly flexible DBMS platform. It's a welcome edition to the database application developer's toolset. I provide a lot more detail on where SSEv makes sense, how it works, how to implement and deploy it, and how to maintain it in the field. If you're interested in more information on SSEv, visit the support site for information on a couple of EBooks that discuss SSEv in far more detail.

Using the SQL Server Native Client

Microsoft has also added a new data access layer beneath ADOc when you need to connect to SQL Serverthe SQL Server Native Client (SNC). This new COM-based interface is designed to replace all ODBC and OLE DB interfaces to SQL Server, but without use of the MDAC stack and its associated issues. This means you can take your existing ADOc applications and recode the ConnectionString property to bypass MDAC completely, which should alleviate many DLL Hell issues. In addition, SNC is designed to expose all of the new SQL Server 2005 features, so your Visual Basic 6.0, middle-tier, or ASP applications will be able to use multiple active resultsets (MARS), XML datatypes, and more. I'll show you how to use SNC when I get to the ADO.NET chapter.