Understanding and Managing the Connection Pool | Hitchhikers Guide to Visual Studio and SQL Server: Best Practice Architectures and Examples, 7th Edition (Microsoft Windows Server System Series)

When connection pooling is enabled (and it is by default), the ADO.NET data provider initializes a sophisticated mechanism to manage connections to the .NET data provider. This mechanism is called a "connection pool". When your application code^[13] executes the SqlConnection.Open method, the SqlClient provider creates a new connection pool, opens the connection to SQL Server, and places a handle to that connection in the connection pool. When the SqlConnection object's Close method is executed, in your code, the pooled connection is marked as "idle" but the actual server connection is left open. When another instance of your application (using the same ConnectionString) tries to open a connection, the pooler attempts to draw a connection from the pool of pre-opened, idle connections. If no idle connections are found, the pooling mechanism opens another new connection to the database and the pool grows in size. This process continues until the pool reaches 100 connections (a settable value) or the application/process that owns the connection pool endsin which case, the pool is discarded.

^[13] Visual Studio also opens pooled connections as you work with the Server (or Database) Explorer or the other data tools.

This means when you close your database connection in code, you're really just releasing the connection back to a pool of used connections; the database connection to SQL Server remains open and continues to hold resources on the SQL Server instance until the pooler times out the connection and closes it. The idle timeout always takes between 4 and 8 minutesno, this is not a configurable number. However, if the same process using the same ConnectionString needs to reconnect, the connection pool simply plugs it back into an existing "idle" connectionassuming the transaction scope and other factors haven't changed. No, you might not get the same connection again, but in theory, the connection you get should be functionally equivalent to the original. This means that by pooling connections, your application can save the time that it would otherwise have to expend establishing a brand-new connection to the server.

As with any specialized functionality, connection pooling has a number of rules for its use. Here are some common questions you might have about the rules that govern connection pools.

How does one enable connection pooling? Connection pooling is enabled by default in all ADO.NET providers. You can disable pooling using the Pooling=False key/value pair in the ConnectionString. Don't disable pooling to correct problems with timeout exceptions.
Tip

Don't disable pooling in an attempt to correct problems with timeout exceptions.
When does the .NET Data Provider create a new pool? Your application might not always be able to use an existing pooled connection. If any of the following occurs, the SqlClient .NET Data Provider creates a new pool:
The ConnectionString property changes This property must be the same each time ADO.NET uses the Open method. Even if the only change is that the arguments are in a different order, the string is different. If you add, change, or remove argumentseven if they simply reflect the default behaviorthe property is different, and you get a new pool. The SqlConnectionStringBuilder can help make sure the ConnectionString is the same each time.
- The User ID (UID) or password changes If you use a specific UID for each connection, each UID gets its own pool.
- The process identifier (PID) changes Each process, program, component, Web Service, or web application domain gets its own pool. I discuss this aspect in a moment.
- The transaction scope or enlistment changes If you're using transactions, each Transaction scope gets its own pool.
- MARS enlistment changes If the connection uses multiple active resultsets (MARS), you get a unique pool.
Where are connection pools created? The provider always creates connection pools on the system where the code creating the connection runs: on the client. In a client/server application (for example, a Windows Forms application opens a connection to a remote server), the provider establishes the connection pool on the client (local system). In a Web Service, ASP, ASP.NET, or COM+ architecture, the code that opens the connection runs on a remote server that's hosting Microsoft IIS or Microsoft Transaction Server (MTS). In these cases, the "client" is the code running on the remote code server. Connection pools aren't created on the database server unless it happens to be hosting an application that opens a connection.
How does the security setting affect pooling? As I mentioned, if you use a specific UID for each connection, each UID gets its own pool. But if you specify Integrated Security=SSPI in the ConnectionString, you get one pool for your process's User Account. This means that you can't specify a role or UID and share the pool that the .NET Data Provider created for another role or user.
Why does the PID affect pooling? Like every database interface since the dawn of time, ADO.NET can't share connections across processes, so each process must have its own connection pool. A client/server application gets its own PID each time you run it. If you have five instances of an application, each gets its own PID, regardless of whether you run the application on one system or five systems. If you run a middle-tier componentsuch as a Web Service, a COM+ component, or a web applicationthat component also has its own PID; but in this case, multiple instances are running the same process. This means that a Web Service can serve a few clients or a few thousand clientsall of which share the same connection pool. When you're developing your Web Service, each time you compile and create a new assembly, the new executable gets a new PIDand a new connection pool. So, you could potentially have two nearly identical copies of an executable on your system, but each gets its own connection pool.
What happens when the pool is full? If the pool fills up with connections (the default limit is 100 connections), the .NET Data Provider waits for another instance of the application to close an existing connection and release it back to pool. If the .NET Data Provider can't get a connection in time (i.e., within the limit that the Connection Timeout setting specifies), it throws a System.InvalidOperationException with the message "Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached." You don't want to show this message to your users because it's not their fault and, generally, users can't debug your code (which caused the problem in the first place)^[14] Note that ADO.NET uses System.InvalidOperation Exception for a variety of situations, so you need to parse the Exception Message string to find out what's really wrong.
^[14] When I get this message, I usually send a message to the web master offering to fix the problem.
When are pools destroyed? Connection pools are destroyed when the process that creates them ends. Because of this, an ADO.NET Data Provider running under IIS or MTS creates connection pools and destroys them when the process or Application Domain (AppDomain) ends. ADO.NET 2.0 also permits you to clear a specific pool or all pools associated with an application.

Tip

A bug exists in the Version 1.0 and 1.1 Framework performance counters that the SQL Server .NET Data Provider (System.Data.SqlClient) exposes. The counters falsely report the existence of active pools. To make sure that what you're seeing isn't a "ghost" pool, stop the Performance Monitor application, take a drink of coffee, and check again. When you're working with Windows Forms applications, pools drop off when the application or process ends. This behavior makes sense because the process creating the remote connection pool wouldn't necessarily end if IIS is acting as an agent for the web application or Web Service. This bug has been sidestepped in the 2.0 Frameworkthey replaced the counters with new ones. I'll discuss these next.

Be aware that if you shut down the SQL Server system or it shuts down on its own, any existing SqlClient connection pools remain in a zombie state unless IIS or MTS shuts down, too. In the 2.0 Framework, connection pools containing connections to dead or missing SQL Server instances are automatically flushed and rebuilt on next open.

Before I get any deeper into managing a connection pool, consider that 99% of this discussion is for ASP.NET applications that typically use dozens of pooled connections. A typical Windows Forms application will use one or two connections and really have little need for pooling.

Effectively Using Connection Pools

So what's the best way to use connection pools? Given the constraints that the .NET Data Provider implements, and if you aren't careful what valves you open and knobs you twist, you might think that it's easy to overflow the pool or server with connections. Remember that the limiting resources are the number of connections that a pool can handle, the number of connections available on the server, and the capacity of the server to meet the demand of these connections. As I noted earlier, the default number of connections a pool can handle is 100; when the number of connections within the pool reaches this value, the .NET Data Provider won't create more pooled connections. Ideally, your application should use a connection for a moment and release it back to the pool for other application instances to share, so 100 connections per pool should be more than enough.

Another tactic is to let the .NET Data Provider create any number of pools, again limited by the number of connections SQL Server allows according to your license or administrator settings. If some of your components are likely to compete for connections in a heavily used pool, it's a good idea to separate those components by creating two or more poolsone for each functional operation. One way to establish this separation is to set the application name in the ConnectionString to a unique value for each pool you want the .NET Data Provider to create. Watch out, though, because when a pool overflows (that is, you've exceed the set number of connections), your component performance grinds to a halt and your customers take their business to other sites.

Client/Server Pool Rules

Before you wade too deeply into how to manage your connection pool, you should reflect on what kind of application will benefit from connection management. Not all applications do. Client/server applications use pooled connections, but the specific instance of the client application is the only beneficiary. That's because each application runs in a unique process space, so each client application gets its own pool. Multiple copies or instances of the same Windows Forms application don't share the same pool even when they run on the same system; however, when a single application makes multiple connections (as many applications do), the .NET Data Provider manages all the connections in a common pool.

When you're working with client/server applications that need to manage the server-side DBMS more closely, the pool can be more of a hindrance than a help. In such cases, you might want the functionality of a like-new connection each time you use the Open method and an assurance that the connection actually closes when you use the Close method. Conversely, if the application repeatedly makes and breaks connections, the connection string remains unchanged, and you don't use MARS or transactions, the .NET Data Provider creates a dedicated pool for your connection. This means that the application reuses your pooled connection whenever possible, and your application can (potentially) run faster because the connection isn't being reopened and closed repeatedly. Remember, connection pooling does not make much sense for many client/server applications.

Wading into the Web

Over the last decade, the overwhelming focus at Microsoft has been web and middle-tier architectures, in which a code snippet's lifetime is shorter than a rabbit's heartbeat and replicated just as quickly. These routines typically have to open a database connection, execute a query, and return a response very quicklybefore the customer loses interest. Between one and N instances of the component simultaneously execute a processdozens to thousands of times per secondwith all the code snippets doing pretty much the same thing and using the same connection string.

If the component (such as a Web Service) takes too long to finish using the connection, the SqlClient Data Provider might add more connections to the component's private pool. Also remember that each .NET assembly gets a unique PID. This means that each new assembly you run gets its own poolregardless of the similarity of function or the fact that their ConnectionString values are identical.

Managing Lots of Swimmers

In any architecture, you need to carefully monitor three factors: the number of pools, the number of connections each pool manages, and the load on the server demanded by each connection. Fortunately, you can control all of these with code.

When your web site is under heavy use, a good strategy for making sure you have enough connections is to open each connection just before you need it and close it as soon as you canthe just-in-time (JIT) strategy. You can shorten the system time consumed by the connection by optimizing the query being executed and limiting the size of the returned rowset. ADO.NET can help because it opens and closes connections automatically (and quickly) when you use the DataSet Fill or Update methods. If you're using a SqlDataReader, you have to open and close the connection yourself. Even if you use the CommandBehavior.CloseConnection option, you still have to close the SqlDataReader to get the associated connection to close.

Unlike VB 6.0, the .NET languages can't guarantee your connection will be closed when a SqlConnection object falls out of scope. If the connection is still open, you (and the connection pool) can't reuse it. If the SqlConnection object is still open when it falls out of scope, it's probably lost forever. For example, the code segment shown in Figure 9.20 "leaks" a connection each time the GetData procedure executes because the SqlConnection isn't closed before the function that creates the SqlConnection object is completed. This leak happens because the code can no longer reference the SqlConnection object (which owns the pooled connection) after it falls from scope, and the .NET garbage collector won't necessarily clean up these orphaned objects (at least, not in time). The result? Connection timeout exceptions.

Figure 9.20. How not to handle DataReaders.

Let's try to fix some of the problems with this code. First, reconsider use of the SqlDataReader. If the application intends to bind to the rowset and provide post-fetch sorting, filtering, and finding, you probably should have created a DataTable instead. This is easier than ever in ADO.NET, as you can now load a DataTable object from a SqlDataReader.

But this solves only one problemclosing the SqlDataReader once rowset population is complete. Unless you set the CommandBehavior.CloseConnection option when the SqlDataReader.ExecuteReader method is executed, closing the SqlDataReader won't close the SqlConnection for us.

That solves two problems, but what if there is a problem when the query is executed? In this case, I need to close the SqlConnection to return it to the pool. Let's look at the modified code (shown in Figure 9.21) to see how it addresses these issues.

Figure 9.21. A better approach when passing DataReaders between routines.

When All Connections Are Gone

How your application behaves when all available connections are in use is crucial to the effective performance (and success) of your web site or application. A helpful tip is to remember that you might be able to make a temporary fix by increasing the value of the "Connection Timeout" connection-string argument or the SqlConnection object's ConnectionTimeout property. You use one or both of these values to set the length of time ADO.NET waits for a connection to be released from the pool and made available to your code. If you set the value too high, the browser (or client) might time out and throw a System.InvalidOperationException before you get connected. If you set the value too low, your error handler will have to know how to retry the connectionpossibly after asking the customer to be patient while your application deals with other requests. But increasing the ConnectionTimeout property is like moving the red line on a gauge that measures maximum performance to keep you from burning out a bearingit simply masks the problemright up to the point your server or engine melts down.

Activating, Tuning, and Disabling the Connection Pool

As I explained earlier in this chapter, when you're using the SqlClient .NET Data Provider to access SQL Server 7.0 and later, you need to create a new SqlConnection object and set its ConnectionString propertyit's the only way to change the SqlClient .NET Data Provider's behavior and control how it interacts with the connection pool. By using the appropriate ConnectionString keywords, you can turn off pooling, change the size of the pool, and tune the pool's operations.

Understanding Connection-Pool Options

Table 9.2 defines the SqlClient.SqlConnection.ConnectionString keywords that determine how ADO.NET^[15] manages the connection pool for the specific connection you're opening. You can set these keywords on a connection-by-connection basis, which means that some Connection objects will be pooled and others won't be.

^[15] No, other .NET Data Providers don't all have similar keywords to manage their connection pools.

Turning Off Connection Pooling: When you want more hands-on control of the server-side connection state or you want to debug a connection-pooling problem, using an unpooled connection makes sense, so it's nice to know that you can easily turn off connection pooling. If you're sure you want to turn off pooling, simply set the "Pooling" argument in the ConnectionString to False, and the SqlClient .NET Data Provider won't pool connections. In this case, the Open method establishes a new connection to the server (if one is available), and when you use the Close method, ADO.NET closes the server connection immediately. Be prepared for SQL Server to deny a connection-open request once you've exhausted the number of connections the SQL Server administrator allows or you exceed the number of connection licenses registered for the server. If your application is unable to connect to the server for some reason, your code needs to display a message to tell your user to come back later or simply wait and retry.
Setting the Maximum Pool Size: Use the "Max Pool Size" keyword to determine how many connections the pool manages. By default, the SqlClient .NET Data Provider permits 100 connections to accumulate in each pool. This doesn't mean that you start with 100 SqlConnection objects or 100 connections, only that you can't create more than 100 connections per pool. If you try to exceed the maximum limit you've set, your application waits until you wait for ConnectionTimeout seconds or until a connection becomes available in the pool. If time runs out, the pooler raises an exception. "Max Pool Size" doesn't limit the number of pools you can createthat limit is determined by the number of SQL Server connections that your SA setting or license makes available (about 32,768 per cluster).
Setting the Minimum Pool Size: In some cases, you might want to leave one or more connected SqlConnection objects in a pool for an indefinite length of time. If so, set the "Min Pool Size" keyword to an appropriate value. Setting this value can help connection performance for an application that doesn't run often but can't wait for the .NET Data Provider to construct a new SqlConnection object and establish a connection. You can also use this technique as a way to reserve connections. For example, you might set a "Min Pool Size" value if you want your Internet application to be able to deal with a flood of queries that grab all the connections from the pool but still ensure that you have a private connection. Or suppose you're hosting several applications on the same server and one of those applications is getting hammered with queries; by using this technique, you can reserve some connections for each application.
Cleaning Up the Pool: After you close a connection in code, ADO.NET notifies the connection pooler to release the connection back to the pool, and after a time (between 4 and 8 minutes), a connection-pool routine closes the database connection. No, the "release time" is not configurable. The "Connection Lifetime" ConnectionString keyword isn't what it appears to be. Microsoft included this option to handle a very special circumstancewhen you're using clustered servers. "Connection Lifetime" helps release connections more quickly when several SQL Servers are working in a cluster. In non-clustered systems, the argument has no bearing on how long the connection remains in the pool. After you close the ADO.NET SqlConnection object, connections remain open for the randomly selected period of time after your application releases them. The .NET Data Provider calculates the pool lifetime starting from the time the pooled connection was created. ADO.NET can also free the pools using new ClearPool methods available in the 2.0 version.
What About Other Providers? The way that ADO.NET interfaces with the connection-pool mechanism is newbut connection pools have been around for over a decade by now, so you should be fairly comfortable with how they work. However, if you're not using ADO.NET and the SqlClient .NET Data Provider, it's tougher to manage connection poolsmost providers don't have ConnectionString keywords to configure. If you don't understand the pooling mechanism, there's a good chance you'll create applications that work for a while but fail under stress. Or you might create applications that exhibit substandard performance, but you can't determine why. With the information in this chapter, you can explore how best to manage your connections by using the built-in pooling mechanism.

Monitoring the Connection Pool

So, you've opened and closed a connection, and now you want to know whether the connection is still in place. Let's look at some ways to determine how many connections are open and what they're doing, how many pools have been created, as well as how many connections are in each pool.

Run sp_who or sp_who2: These system stored procedures return a snapshot of information from the sysprocesses system table that shows the status of and information about all working processes. Generally, you'll see one server process ID (SPID) per connection. If you named your connection by using the Application Name argument in the connection string, your working connections are easy to find. Remember, SQL Server runs a number of SPIDs of its own, as does Visual Studio itself.
Use SQL Server Profiler: You can use the SQLProfiler TSQL_Replay template to trace open connections. If you're familiar with Profiler, this method is easier than polling by using sp_who.
Use the Windows Performance Monitor to monitor the pools and connections. I discuss this technique in a moment and walk through a set of examples in Appendix II, "Reinstalling the DACW and Other Missing Functionality in Visual Studio."
Monitor performance counters in code: You can monitor the health of your connection pool and the number of established connections by using hand-coded routines to extract performance counters exposed by the SqlClient provider. To make this easier, you can use .NET PerformanceCounter controls. The SqlClient performance counters in the 1.1 Framework were pretty buggyhopefully, the 2.0 versions (entirely new counters) won't be as problematic.

I show how to monitor SQL Server using the Profiler, the Performance tool, and Performance counters in Appendix III, "Monitoring SQL Server."

Keeping the Connection Pool from Overflowing

As you can tell from the preceding discussion, the connection pool keywords can dramatically impact the way applications (especially ASP.NET applications) perform and scale. Consider that simple (and well-written) Windows Forms applications do not materially benefit from the connection pool because they generally use only one or two connections at any point in time. However, in ASP.NET, applications in the connection pool permit ASP^[16] page instances to open a connection quickly and return it to the pool for other instances of the page to reuse. Typically, in even a heavily used ASP application, the number of pooled connections is usually fairly low. Unless your server is pretty powerful, it might not be able to support more than a few dozen connections at once. The default setting of 100 pooled connections should be more than enough. If you know your ASP application is likely to build up a number of connections before it stabilizes, it's a great idea to set the "Min Pool Size" keyword to that number (or a few more). This means that when the first connection is opened, the pooler automatically opens N 1 more connections and leaves them open for the life of your application domain.

^[16] By "ASP," I mean IIS-based applications that include ASP or ASP.NET architectures.

Sometimes, you can gain a bit of performance by turning off connection pooling in client/server applications, but this isn't advised for ASP appseven those that find the pool overflowing. As I discussed in earlier chapters, solving pool overflow problems is usually a matter of understanding how the pooler works. Remember, there are several typical scenarios that lead to pool overflow:

Unclosed connections: Make sure that any connection you open is closed. No, it's not necessary to use Dispose on the Connection object, but it doesn't hurt. Never depend on the CLR garbage collector (GC) to close your orphaned connections for you. While it might, it can take quite a while before the GC runs. Watch out for code that passes SqlDataReader to other classes or functions within your application. If not programmed correctly, this can easily lead to orphaned connections.
Exception handlers that bypass closing the connection. When something goes wrong as you process a query, the exception handler must be cognizant of the SqlConnection.State property. If the SqlConnection is left open for a common exception, the system will leak connections quicklyand slowly if the exception is irregularly encountered.
Connections closed by the server: When a stored procedure or other query terminates with a severity greater than 19, SQL Server closes the connection from its end. This can damage the pooled connection and make it unusable or orphan a connection if your code does not properly trap the exceptionand this might not be easy to do. Be sure to trap and deal with this (and all) SQL Server exception correctlyand close the connection to release the pooled connection. All data access code should be wrapped in a Try/Catch exception handler. In some situations, your only clue that the connection was lost is the fact that it's no longer usable.
Queries that overwhelm the server: If your application executes a query that takes too long to complete before another instance of your application starts a query that needs a connection, the pooler must create a new connectionit has no choice. If this trend continues, the pool will (perhaps slowly) overflow. If this is a concern, you need to monitor the pool using performance counters or the Performance Monitor application, as discussed in Appendix III.