State Management: What's the Big Deal?

HTTP by its very nature is a stateless protocol. This doesn't mean it disregards geographic boundaries; it means that it is not connection oriented, in the sense that no connection or memory of the client persists from one request to the next. No request to the Web server can rely on data supplied by some other request. To understand this concept, let's look at an example of how a browser requests a Web page.

When a user types in the address of a Web site, www.deeptraining.com/default.aspx, for example, the Web browser performs a number of steps prior to displaying the page. First, the Web browser converts the hostname, in this case www.deeptraining.com, to an IP address. It does this by querying a DNS server and asking for the IP address. In our sample, this brings back 192.168.1.200. Next, the Web browser opens a TCP socket to the Web server using port 80. After the connection is made, the Web browser sends a GET /default.aspx command. The Web server streams the HTML contents of the page back to the browser. The Web server then closes the TCP socket connection to the Web browser.

NOTE

HTTP 1.1 allows more than one command to be sent without closing the socket connection. This is called Keep-Alive. However, each command stands on its own and should not rely on any state from previous commands.

This series of events is visibly demonstrated by using Telnet instead of a Web browser to communicate with a Web server. Listing 4.1 shows what this would look like.

Listing 4.1 A Sample HTTP Request Using Telnet

    GET /DEFAULT.ASPX
    HTTP/1.1 200 OK
    Server: Microsoft-IIS/5.0
    Date: Wed, 28 Mar 2001 00:38:29 GMT
    Set-Cookie: ASP.NET_SessionId=sfdaa145jb0mdv55nnhgic55; path=/
    Cache-Control: private
    Content-Type: text/html; charset=iso-8859-1
    Content-Length: 130

    <html>
    <head>
    <title>Welcome to ASP.NET</title>
    <body>
    Welcome to ASP.NET!
    <img src="/image1.jpg"><img src="/image2.jpg">
    </body>
    </html>
    Connection to host lost.

To replicate this example in Windows 2000, open up a command prompt and type the following:

TELNET localhost 80

This will open up the Telnet connection to your Web server. Now you need to request a page. If you have the default Internet Information Server installation, you can do a GET /localstart.asp to retrieve the Start page. You must type the command exactly; while in Telnet, the Backspace key doesn't work to correct mistakes.
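The same conversation can be driven from code instead of Telnet. The following is a minimal sketch, not a production HTTP client: the class name RawHttpDemo and its two methods are made up for illustration, and it assumes the server closes the connection after responding (which the Connection: close header requests).

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

public class RawHttpDemo
{
    // Builds the text a browser (or a Telnet user) sends for a simple GET.
    // The blank line at the end tells the server the request is complete.
    public static string BuildGetRequest(string host, string path)
    {
        return "GET " + path + " HTTP/1.1\r\n" +
               "Host: " + host + "\r\n" +
               "Connection: close\r\n" +
               "\r\n";
    }

    // Opens a TCP socket, sends one request, and returns everything the
    // server streams back: status line, headers, blank line, and body.
    public static string Fetch(string host, int port, string path)
    {
        using (TcpClient client = new TcpClient(host, port))
        using (NetworkStream stream = client.GetStream())
        {
            byte[] request = Encoding.ASCII.GetBytes(BuildGetRequest(host, path));
            stream.Write(request, 0, request.Length);

            // ReadToEnd returns once the server closes its end of the socket.
            using (StreamReader reader = new StreamReader(stream, Encoding.ASCII))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

With a Web server listening on port 80 of the local machine, calling RawHttpDemo.Fetch("localhost", 80, "/localstart.asp") should return text shaped like Listing 4.1: headers first, then a blank line, then the HTML.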

The Web browser receives the HTML, parses it, and is then ready to make additional requests. In this example, two more requests would need to be made: one for image1.jpg and another for image2.jpg. Each request causes another TCP socket connection to be made over port 80 and then a GET command, such as GET /image1.jpg, to be sent. The Web server streams back the image and then closes the connection. A diagram of this process is shown in Figure 4.1.

Figure 4.1. Steps in a standard browser request.


Note that after each request, the Web server terminates the socket connection. The connection does not exist across multiple requests; it is not connection oriented. Also note that each request must pass to the Web server all information required to satisfy that request. This is fine if all that the ASP code does is take the values from a single form and save them to a database. If the application needs to do something more difficult, such as maintaining a shopping cart, matters are more complicated.

What Are Cookies?

Early in the genesis of the Web, folks working on the HTTP spec realized that there had to be some way to persist state information across Web requests. Cookies were designed for this purpose. If you look again at the code in Listing 4.1, you will notice that a number of headers appear before the HTML. These headers are terminated by a double CRLF before the HTML starts. One of the headers is Set-Cookie. When an HTTP 1.0-compliant browser sees this header, it assumes responsibility for storing that cookie in whatever fashion it sees fit. Furthermore, the browser is expected to include that cookie value with all future requests it makes of the Web server.
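To make the browser's side of this contract concrete, here is a small sketch of what a user agent does with the response in Listing 4.1: find the Set-Cookie header in the header block, then keep only the name=value pair to send back later in a Cookie header. The class and method names are hypothetical, and real browsers also honor attributes such as path and expiration, which this sketch ignores.

```csharp
using System;

public class CookieEcho
{
    // Scans the header block of a raw HTTP response and returns the value
    // of the first Set-Cookie header, or null if there is none.
    public static string FindSetCookie(string rawResponse)
    {
        // Headers end at the first blank line (a double CRLF).
        int end = rawResponse.IndexOf("\r\n\r\n", StringComparison.Ordinal);
        string headerBlock = (end >= 0) ? rawResponse.Substring(0, end) : rawResponse;

        foreach (string rawLine in headerBlock.Split('\n'))
        {
            string line = rawLine.Trim();
            if (line.StartsWith("Set-Cookie:", StringComparison.OrdinalIgnoreCase))
            {
                return line.Substring("Set-Cookie:".Length).Trim();
            }
        }
        return null;
    }

    // Given a Set-Cookie value, returns the "name=value" pair the browser
    // sends back. Attributes such as path follow the first semicolon.
    public static string ToCookieHeaderValue(string setCookieValue)
    {
        int semicolon = setCookieValue.IndexOf(';');
        string pair = (semicolon >= 0)
            ? setCookieValue.Substring(0, semicolon)
            : setCookieValue;
        return pair.Trim();
    }
}
```

On the next request to the same server, the browser would add a header line of the form Cookie: followed by the value that ToCookieHeaderValue returns.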

NOTE

This is important. The browser returns the cookie value only to the server that set it, not to other servers. Therefore, a lot of the hype about cookies tracking users across multiple Web sites is nonsense.

User agents (that is, browsers) are expected to accept at least 300 cookies in total with a size of at least 4KB per cookie, but some browsers implement higher limits. Even though the minimum size of 4KB is quite large, cookies were never intended to be a place to cache large amounts of information. Instead, they are intended to be used as a key into state information that is maintained on the server.

For more information on cookies, read RFC 2109 at http://www.w3.org/Protocols/rfc2109/rfc2109.

Cookie-Based Session Identity

ASP.NET provides a way to maintain state using cookies that is very much in line with the recommendations in RFC 2109. On a user agent's first visit to a page that uses cookies, a cookie is sent to the user agent by the Set-Cookie header in the HTTP response. See Listing 4.2 for an example.

Listing 4.2 A Typical `Set-Cookie` Header

 Set-Cookie: ASP.NET_SessionId=sfdaa145jb0mdv55nnhgic55; path=/

This cookie persists for as long as the Web browser is open. This is called a session cookie. If the Web browser is closed, the cookie is lost. The SessionID is a 120-bit string containing URL-legal ASCII characters.

It is generated in such a way that it is unique and random. Uniqueness is obviously important. If an application is storing credit card numbers for a shopping cart in session state, it shouldn't accidentally connect a different user to that session! Randomness is a little harder to understand. The SessionID should be random so that someone can't calculate the session of someone else who is currently on the server and hijack that person's session. This situation would be just as bad as giving out nonunique SessionIDs.
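The arithmetic behind the sample IDs is worth a quick look: 120 bits is 15 bytes, and if each character carries 5 bits, the result is 24 characters, which matches the length of the IDs shown in this chapter. The sketch below generates an ID of that shape from a cryptographic random number generator. It is an illustration only, not ASP.NET's actual generator; the class name and the 32-character alphabet are assumptions (the alphabet was picked because it is consistent with the sample IDs).

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public class SessionIdSketch
{
    // 32 URL-legal characters, so each character carries 5 bits.
    // This particular alphabet is an assumption chosen for illustration.
    private const string Alphabet = "abcdefghijklmnopqrstuvwxyz012345";

    public static string NewId()
    {
        // 15 bytes * 8 = 120 random bits from a cryptographic generator,
        // so one ID cannot be predicted from another.
        byte[] bytes = new byte[15];
        using (RandomNumberGenerator rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }

        // Consume the bits 5 at a time: 120 / 5 = 24 characters.
        StringBuilder id = new StringBuilder(24);
        int buffer = 0;      // holds bits not yet emitted
        int bitCount = 0;    // how many bits are in the buffer
        foreach (byte b in bytes)
        {
            buffer = (buffer << 8) | b;
            bitCount += 8;
            while (bitCount >= 5)
            {
                bitCount -= 5;
                id.Append(Alphabet[(buffer >> bitCount) & 31]);
            }
            buffer &= (1 << bitCount) - 1;   // keep only the leftover bits
        }
        return id.ToString();
    }
}
```

A plain pseudorandom generator such as System.Random would satisfy uniqueness most of the time but not unpredictability, which is why the sketch reaches for the cryptographic generator.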

Cookieless Session Identity

Cookie-based session identity is a great idea and uses a feature of HTTP, cookies, that is intended for maintaining state information. Some people got it into their tiny minds that cookies were invented by the same people who track you through the streets using silent black helicopters. An alarmist cry went out and users insisted that cookies are an invasion of their privacy, a way for their viewing habits to be tracked across the Web. They were a way for aliens to read their thoughts. You and I both know better after reading RFC 2109, but people who believe in little green men don't always think rationally.

At about this time, the folks who write Web browsers (such as Internet Explorer 5.5) needed some new features to add to their feature checklists, so they added a feature to disable cookies! This seemed like a great idea, but the problem is that no great way exists for an application to check to see whether cookies are disabled. It can look at the user-agent string to see if the user agent is Internet Explorer 5.5, for example, and then make the assumption that cookies are enabled, but you know what they say about assumptions. If the user has disabled cookies and the application still thinks they are enabled, it is just going to end up creating a new session for the user on every request. Mighty inconvenient.

Because Microsoft writes one of the more popular Web browsers, Internet Explorer, and because the company added features to it to disable cookies, it only seems fitting that several years later Microsoft would upgrade ASP so that there is a way to maintain state without relying on cookies. Hence, cookieless session state was born.

Cookieless session state works by "munging," or modifying, the URL so that the SessionID is included as part of the URL. When a user accesses the first page of a site that uses cookieless session state, ASP.NET performs a 302 Redirect back to the same page with the SessionID included in the URL. If, for example, a user types in http://localhost/simplecounter.aspx, the user is redirected back to http://localhost/(adu2o155emcqlbme5gofcu45)/simplecounter.aspx. The goofy looking string between the () is the SessionID and varies for each session. The conversation is shown in Figure 4.2.

Figure 4.2. HTTP conversation establishing SessionID.


To maintain the session state, this SessionID must be included in all future requests to the Web server. ASP.NET does not alter the application's HTML code and add the SessionID to the anchor tags. Take a look at the sample page in Listing 4.3.

Listing 4.3 Sample Page with Relative and Absolute References: RelAbsolute.aspx

    <%
        if(Session["Count"] == null)
            Session["Count"] = 0;
        Session["Count"] = ((int)Session["Count"]) + 1;
    %>
    <html>
    <body>
        <b>Count=</b><% = Session["Count"] %>
        <br><a href="http://localhost/relabsolute.aspx">Absolute</a>
        <br><a href="relabsolute.aspx">Relative</a>
        <br><a href="/relabsolute.aspx">Mixed</a>
        <br><a href="~/relabsolute.aspx" runat="server">Using ~</a>
        <br>Logon Time: <% = Session["LogonTime"] %>
    </body>
    </html>

This code shows a page with four anchor tags. When clicked, the first, an absolute reference, loses the SessionID. When the absolute reference is requested from ASP.NET, a new session initiation conversation takes place and all session state is lost. In the second anchor tag, the relative reference maintains session state. The Web browser takes the root of the URL from the currently accessed page, http://localhost/(4ldjqh55tnnq124545dg50ix)/, and automatically prefixes the URL with it. The request then becomes http://localhost/(4ldjqh55tnnq124545dg50ix)/relabsolute.aspx. Because the SessionID is passed to the next request, session state is maintained. The third anchor tag in Listing 4.3 is relative to the site but not to the directory structure. In this case, the browser takes http://localhost/ as the prefix and requests http://localhost/relabsolute.aspx. This has the same effect as the absolute reference, causing the session initiation conversation to take place again and losing the session state.
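You can check this reasoning without a browser, because the rules a browser uses to resolve a link against the current page are the same ones System.Uri applies. The sketch below resolves the first three kinds of href from Listing 4.3 against a munged page URL and reports whether the SessionID segment survives. The class and method names are hypothetical, and the page URL is assumed to be the munged form of relabsolute.aspx.

```csharp
using System;

public class LinkResolution
{
    // Resolves an href the way a browser does: relative to the URL of the
    // page that contains the link.
    public static string Resolve(string currentPageUrl, string href)
    {
        Uri baseUri = new Uri(currentPageUrl);
        return new Uri(baseUri, href).AbsoluteUri;
    }

    // True if the resolved URL still carries the munged SessionID segment.
    public static bool KeepsSession(string resolvedUrl, string sessionId)
    {
        return resolvedUrl.IndexOf("/(" + sessionId + ")/", StringComparison.Ordinal) >= 0;
    }
}
```

Resolving "relabsolute.aspx" against http://localhost/(4ldjqh55tnnq124545dg50ix)/relabsolute.aspx keeps the SessionID segment, whereas resolving "/relabsolute.aspx" or the full absolute URL drops it, which is why those two links start a new session.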

The fourth tag uses an undocumented feature in ASP.NET. For any server-side control that uses an HREF, you can start the HREF with a ~ character. At runtime, the tilde will be replaced with the application root and the current directory. This works great for path-independent code, which may be reused in multiple pages in your application.

If an application needs to programmatically determine whether cookieless sessions are in use, it can use the IsCookieless property on the HttpSessionState object. HttpSessionState is the class that implements all of the session state functionality in ASP.NET. When you use the Page.Session property, it returns an instance of the HttpSessionState object.

Using the Session

When a page starts executing, an event is fired that the Session HTTP Module listens for. The Session module sinks this event and automatically populates the Session property if session state is required. Using the session object to store data is very simple. Listing 4.4 shows how to store a value in the Session collection. The key "Counter" can be any arbitrary value that the application specifies. The namespace is global within the ASP.NET application.

Listing 4.4 Setting a Session Value

 Session["Counter"] = 0;

If the application saved a value, it is probably because the application will need to use it later. Specifying the same key allows the application to retrieve the value, as shown in Listing 4.5.

Listing 4.5 Getting a Session Value

 Response.Write(Session["Counter"]);

That's it. That is the minimum work required to use session state.

Initializing a User's State

When a new session is set up, it might be nice to preload some information about the user into session state. This is easily done by handling the Session_Start event. Listing 4.6 shows how to place an event handler in global.asax and store a value into the Session object itself. The example grabs the date and time that the user started up the session and saves it into a Session value called LogonTime.

Listing 4.6 When a Session Starts Up, `Session_Start` Fires

    <script language="C#" runat=server>
    void Session_Start(object sender, EventArgs e)
    {
        Session["LogonTime"] = DateTime.Now;
    }
    </script>

The application can handle the Session_End event, which fires when the session has timed out. It can use this event to clean up things outside session state.

NOTE

In times of high load, Session_End might not fire.

Cleaning Up After Using Session State

So now you know how to create session state. However, if you are a programmer who obsesses over perfect indenting, you probably wonder what needs to be done to clean up after yourself when using session state. You just saved a bunch of data into process memory, thereby using a valuable resource. What needs to be done to clean this up? As it turns out, the session object already has the concept of a finite lifetime. If you don't clean up after yourself, it takes on the role of your mother and cleans up after you after a specified interval. This interval is controlled using the Timeout property of HttpSessionState. If a client doesn't access a page in the application for 20 minutes, by default, the session data associated with that SessionID is deleted.

If you still feel the need to clean up after yourself, the Remove methods are for you. Remove() enables you to remove a single item from the Session . RemoveAll() allows you to remove all items from the session.

Adding Session Values

Now that you understand the basics, let's create a little more interesting example: a page that allows you to add arbitrary sets of information to the session. Listing 4.7 shows a form that allows the user to type in a key and a value and then saves the pair in Session state. It then loops through the Session collection to return all the keys in the collection and display all the values.

Listing 4.7 Adding and Displaying Session Values: AddDisplay.aspx

    <%@ Page Language="C#" %>
    <script language="C#" runat=server>
        void btnAdd_Click(object sender, EventArgs e)
        {
            // Add the key:value pair
            Session[txtKey.Text] = txtValue.Text;
        }
        void btnClear_Click(object sender, EventArgs e)
        {
            Session.RemoveAll();
        }
    </script>
    <html>
    <body>
        <b>Manipulating Session</b><br>
        <form runat=server>
            <asp:label id="lblKey" text="Key:" runat=server />
            <asp:textbox id="txtKey" runat=server />
            <asp:label id="lblValue" text="Value:" runat=server />
            <asp:textbox id="txtValue" runat=server />
            <asp:button id="btnAdd" Text="Add" OnClick="btnAdd_Click"
                 runat=server />
            <BR>
            <HR>
                <table border="1">
                    <tr>
                        <th>Key</th>
                        <th>Value</th>
                    </tr>
    <%
                // Loop through the session keys
                foreach(string strItem in Session)
                {
                    Response.Write("<tr>");
                    // Output the key
                    Response.Write("<td>" + strItem + "</td>");
                    // Output the value
                    // Use ToString() to coerce possible objects in session to
                    // a string
                    Response.Write("<td>" + Session[strItem].ToString() + "</td>");
                    Response.Write("</tr>");
                }
    %>
                </table>
                <asp:button id="btnClear" Text="Clear All"
                     OnClick="btnClear_Click" runat=server />
        </form>
    </body>
    </html>

Beyond the Default Session Configuration

All the work we have done with session state so far uses the default configuration: in-process session state. This is the only option that was available in ASP 3.0. Now three options exist for storing session state in ASP.NET: in process, out of process, and SQL Server.

In-Process Session State

In-process session state works just as the name implies. The data structure that holds the session information is allocated from memory that belongs to the aspnet_wp.exe process. The advantage to this approach is that access to the data is very quick. It is only slightly different from looking up an item in a collection or array that might be in the program itself. When an object is stored using in-process session state, a reference to the object is actually what is stored.

The disadvantage to this approach is that the life of the session data mirrors the life of its host process. When aspnet_wp.exe shuts down, it behaves like any well-mannered program and cleans up all its data structures, releasing the memory back to the system. At this point the session data ceases to exist.

NOTE

Editing the global.asax file or Web.Config file and saving it will also clear all in-process session state.

The session data is also trapped inside this process. If an application needs to take advantage of a Web farm approach in order to scale, it may run into trouble. Figure 4.3 illustrates what happens in this case. As Web servers are added to the Web farm, each is going to be running its own copy of aspnet_wp.exe, so each will have its own copy of the current session state.

Figure 4.3. Web farm using in-process session state.


This means that a user, on first requesting a page, will have that session set up on a particular server, such as Web1. A subsequent request for a page from the same user is not guaranteed to return to Web1; in a Web farm, the request can be routed to a different server. If the subsequent request is directed to a new Web server, Web2, the session state that the first request set up will not be there.

In the figure, a user logs on to Web Server A. The login process saves the UserID in session state for later use. Everything works fine until the user gets transferred over to Web Server B. The UserID is no longer saved as part of session state. The developer must anticipate and solve this problem when using a Web farm.
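The failure is easy to reproduce in miniature. In the sketch below, each WebServerSketch instance stands in for one aspnet_wp.exe process with its own private session table; the class is a toy model invented for illustration and has nothing to do with how ASP.NET is actually implemented.

```csharp
using System;
using System.Collections;

// One Web server with its own private, in-process session table.
public class WebServerSketch
{
    // Maps a SessionID to that session's table of key/value pairs.
    private Hashtable sessions = new Hashtable();

    public void Save(string sessionId, string key, object value)
    {
        Hashtable state = (Hashtable)sessions[sessionId];
        if (state == null)
        {
            state = new Hashtable();
            sessions[sessionId] = state;
        }
        state[key] = value;
    }

    // Returns null when this server has never seen the session or the key.
    public object Load(string sessionId, string key)
    {
        Hashtable state = (Hashtable)sessions[sessionId];
        return (state == null) ? null : state[key];
    }
}
```

If the login request lands on one instance and saves "UserID" there, a later request with the same SessionID that lands on a second instance gets null back from Load, because the two tables live in different processes.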

Session State Using a State Server

To avoid the problems shown in Figure 4.3, the developer needs to come up with a way to move the session state data structures outside the aspnet_wp.exe process. In fact, to solve the Web farm scenario, the session state data structures must be moved entirely outside the Web server. The ASP.NET State Server provides a solution.

The ASP.NET State Server is a Windows service that runs on any machine where ASP.NET is installed. This service hosts the data structures that were in the aspnet_wp.exe process before. The advantage of this configuration is that now when aspnet_wp.exe shuts down, the session data is no longer in its process space but is on the state server instead, so the data survives the shutdown of the process. This configuration also solves the issue of state that arises as the application is scaled using a Web farm. Figure 4.4 illustrates how moving the session state out of the aspnet_wp.exe process allows multiple Web servers to connect to the state server, thereby maintaining a consistent session state across all servers in the farm. One downside of this approach is that storing an object requires you to serialize or "freeze dry" the object for transfer to the state server. When you later access it, this process must be reversed. This adds some overhead to persisting objects into out-of-process session state.

Figure 4.4. ASP.NET State Server allows multiple servers to share state.


If one of the Web servers in the Web farm gets rebooted, it comes up and immediately starts sharing in the current session state maintained by the ASP.NET State Server. This is so cool that you may wonder why this isn't the default setting for ASP.NET. Well, moving the data structures out of process comes with a cost. By moving the data off to another server, accessing it requires a little more work. Instead of just accessing process memory directly, the SessionState class must now connect to a remote server over TCP/IP to access the data. This connection is clearly slower than accessing a memory location.
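Part of that cost is the "freeze drying" itself, and it changes the semantics as well as the speed. The sketch below contrasts the two styles of storage: an in-process table that holds a reference to the live object, and an out-of-process style table that holds only bytes. The Cart class and its hand-written ToBytes and FromBytes methods are stand-ins invented for this example; ASP.NET performs its own serialization, which is not shown here.

```csharp
using System;
using System.Collections;
using System.IO;

public class Cart
{
    public int ItemCount;

    // "Freeze dry" the object into bytes that can leave the process.
    public byte[] ToBytes()
    {
        using (MemoryStream stream = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(stream))
        {
            writer.Write(ItemCount);
            writer.Flush();
            return stream.ToArray();
        }
    }

    // Reverse the process to rebuild an equivalent, but separate, object.
    public static Cart FromBytes(byte[] data)
    {
        using (MemoryStream stream = new MemoryStream(data))
        using (BinaryReader reader = new BinaryReader(stream))
        {
            Cart cart = new Cart();
            cart.ItemCount = reader.ReadInt32();
            return cart;
        }
    }
}

public class StoreComparison
{
    // In-process style: the table holds a reference to the live object.
    public static Hashtable InProcess = new Hashtable();

    // Out-of-process style: the table holds only serialized bytes.
    public static Hashtable OutOfProcess = new Hashtable();
}
```

Store the same Cart both ways and then change its ItemCount: the in-process entry reflects the change immediately, because it is the same object, whereas the bytes in the other table still describe the cart as it was when it was serialized.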

So how do you change the default for ASP.NET? Like many of the other settings, this one is stored in the Web.Config file in the application root.

NOTE

See Chapter 5, "Configuration and Deployment," for more information on Web.Config.

If you don't have a Web.Config yet, create it using the XML in Listing 4.8.

Listing 4.8 Sample Web.Config That Enables Out-of-Process Session State

    <configuration>
        <system.web>
            <sessionState
                mode="StateServer"
                stateConnectionString="tcpip=127.0.0.1:42424"
                sqlConnectionString="data source=127.0.0.1;user id=sa;password="
                cookieless="false"
                timeout="20"
            />
        </system.web>
    </configuration>

The mode attribute, by default, is set to InProc. Changing it to StateServer and saving Web.Config causes all new requests to be directed at the local instance of the state server. This won't work until you fire up the state server. Listing 4.9 shows the command you can use to start this server.

Listing 4.9 Command to Start State Server

 Net start "ASP.NET State"

If you want to move the state server off the local Web server, you need to change the stateConnectionString attribute in the Web.Config. It is of the form tcpip=IP address:port. Change the IP address from the loopback address of 127.0.0.1 to the IP address of your state server. Finally, fire up or restart the state server.

Important: The ASP.NET State Server performs no authentication. For security purposes you should make sure that you block the port it uses in a firewall in front of your server.

Storing Session State in SQL Server

By moving the session state out of process, the application is able to work effectively in a Web farm as well as protect itself against restarts of the process or the Web server. In this environment, however, there is now a single point of failure: the state server. In the case of a restart of the box where the state server resides, all the Web servers that rely on it get an instant lobotomy and lose their state information. For some design patterns this may be unacceptable, so a third option exists in ASP.NET: storing the state information in SQL Server.

NOTE

This option is called SQL Server for a reason. It works only with SQL Server and will not work with Oracle, Sybase, or any other database server.

By storing the information in SQL Server, a durable session store is achieved that will survive restarts of every machine in your Web farm, except the database server. With some manual changes to the setup you can even configure it to survive a restart of the SQL Server. Setting up ASP.NET to store session state in SQL Server requires more steps than the other two options. The hardest part really relates to the configuration of the SQL Server.

A script, InstallSqlState.sql, is installed with ASP.NET to do all the hard work. This script can be found in the Framework SDK directory on your machine. Under Windows 2000, this would be c:\winnt\Microsoft.NET\Framework\v1.0.3512 (where the last number is the build number that you have installed). The script creates a database called ASPState that contains only stored procedures. It also creates two tables inside tempdb, called ASPStateTempApplications and ASPStateTempSessions, that store the session state. See Figure 4.5 for the database schema.

Figure 4.5. Schema for SQL Server State.


All of the tables are placed into tempdb for a reason. Tempdb is the only database in SQL Server that will allow a query to return before data is fully committed to the hard drive. This is good for performance. During a restart of SQL Server, tempdb is cleared, which is not good for durability. You can move the tables out of tempdb into the ASPState database, but realize you are going to be trading performance for durability. If you do move the tables, you will need to modify most of the stored procedures in the ASPState database to point to the new location of the tables.

ASPStateTempApplications contains a row for each application root that is using the state service. The AppName field contains the ADSI path to the root. For the default Web root, this would be /LM/W3SVC/1/ROOT. The ASPStateTempSessions table is where all the work is done. One row is inserted for each SessionID that is associated with session state. This last point is important. Until the application attempts to save an item into the session state, no row is inserted. This delay is a small performance optimization so that pages that are not using session state don't take the hit of creating a row in the database. Also note that a row is not inserted for each session state value that is saved. Instead, a single row is inserted and a blob dropped into one of two columns. If the blob is smaller than 7,000 bytes, it is put into SessionItemShort, avoiding the need to allocate additional pages for an image field. If it is greater than 7,000 bytes, the extra work is done to allocate additional pages for storing the image data.

NOTE

This means that storing items of fewer than 7,000 bytes is more efficient from a performance perspective.

The last item InstallSqlState.sql creates is a job that is scheduled to run a stored procedure once a minute. This stored procedure, DeleteExpiredSessions, looks at the Expired field of ASPStateTempSessions and is responsible for removing expired session records. If the current timestamp is past the timestamp in the Expired field, the row is deleted. Each time the row is updated, the Expired field is updated to be the current timestamp plus the Session.Timeout value.
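This is a sliding expiration: every update pushes the deadline forward, so only a session that has been idle for the full timeout is removed. The rule can be sketched in a few lines of C#. The SessionRow class below is a hypothetical in-memory model of one row, with ExpiresAt playing the role of the Expired field; the real work is done by the stored procedures in the ASPState database, which are not reproduced here.

```csharp
using System;

public class SessionRow
{
    public string SessionId;

    // Plays the role of the Expired field described in the text.
    public DateTime ExpiresAt;

    // Called whenever the row is updated: push the expiry time forward
    // to the current time plus the session timeout.
    public void Touch(DateTime now, int timeoutMinutes)
    {
        ExpiresAt = now.AddMinutes(timeoutMinutes);
    }

    // The cleanup job deletes rows for which this returns true.
    public bool IsExpired(DateTime now)
    {
        return now > ExpiresAt;
    }
}
```

With the default 20-minute timeout, a row touched at minute 0 and again at minute 19 is still alive at minute 30 and becomes eligible for deletion only after minute 39.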

Which One to Use?

ASP.NET gives you three choices for session state. Which should you use? The answer is that it depends on what you need. Each option offers unique advantages. You might think at first that significant performance differences exist among the options. However, running the code in Listing 4.10 shows that the performance differences among the three options are relatively minor.

Listing 4.10 Code to Time Read and Write Operations on Session State: Timing.aspx

    <html>
    <head>
    <title>Session Usage Timing</title>
    <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
    <script language="C#" runat=server>
        void Page_Load(object sender, EventArgs e)
        {
            int iTemp;
            DateTime dStart;
            DateTime dEnd;
            string szTemp;
            Random rnd = new Random(DateTime.Now.Millisecond);
            dStart = DateTime.Now;
            for(int iCount = 1; iCount < 1000000;iCount++)
            {
                szTemp = rnd.Next(DateTime.Now.Millisecond).ToString();
                Session[iCount.ToString()] = szTemp;
            }
            dEnd = DateTime.Now;
            lblWriteStart.Text = dStart.ToString("T");
            lblWriteEnd.Text = dEnd.ToString("T");
            lblWriteElapsed.Text = dEnd.Subtract(dStart).ToString();
            dStart = DateTime.Now;
            for(int iCount = 1; iCount < 1000000;iCount++)
            {
                szTemp = rnd.Next(DateTime.Now.Millisecond).ToString();
                szTemp = (string) Session[iCount.ToString()];
            }
            dEnd = DateTime.Now;
            lblReadStart.Text = dStart.ToString("T");
            lblReadEnd.Text = dEnd.ToString("T");
            lblReadElapsed.Text = dEnd.Subtract(dStart).ToString();
        }
    </script>
    </head>
    <body bgcolor="#FFFFFF" text="#000000">
        <table border=1>
            <tr>
                <th>Operation</th>
                <th>Start</th>
                <th>End</th>
                <th>Elapsed</th>
            </tr>
            <tr>
                <td>Write</td>
                <td><asp:label id="lblWriteStart" runat=server /></td>
                <td><asp:label id="lblWriteEnd" runat=server /></td>
                <td><asp:label id="lblWriteElapsed" runat=server /></td>
            </tr>
            <tr>
                <td>Read</td>
                <td><asp:label id="lblReadStart" runat=server /></td>
                <td><asp:label id="lblReadEnd" runat=server /></td>
                <td><asp:label id="lblReadElapsed" runat=server /></td>
            </tr>
        </table>
    </body>
    </html>

Running this code for 1,000,000 iterations for each configuration with the state server and SQL Server residing on the same machine as the Web server, you will find that there are only slight differences in performance between the methods for simple object types.

This admittedly unscientific test shows that you can worry more about the problem you are trying to solve instead of worrying about the performance of the session state mode. Each of the three options has its advantages and disadvantages.

Mode	Advantages	Disadvantages
In Process	Default. No extra work to configure. Can store complex objects by reference.	Process restart kills session state. Web farms don't share session state.
Out of Process	Works in Web farms.	Single point of failure. Must serialize complex objects.
SQL Server	Survives restart of any Web server. With modifications, will survive restart of SQL Server.	Must administer SQL Server. Must serialize complex objects.

Most developers will likely start development using in-process session state because it is the default. As they start to scale things up, they are likely to choose either out-of-process or SQL Server session state to help their applications scale in a Web farm.
