State Management | Expert C# 2008 Business Objects

The Achilles' heel of web development is state management. The original design of web technology was merely for document viewing, not for the myriad purposes for which it's used today. Because of this, the issue of state management was never thought through in a methodical way. Instead, state management techniques have evolved over time in a relatively ad hoc manner.

Through this haphazard process, some workable solutions have evolved, though each requires trade-offs in terms of performance, scalability, and fault tolerance. The primary options at our disposal are as follows:

State is maintained on the web server.
State is transferred from server to client to server on each page request.
State is stored in temporary files or database tables.

Whether we use a DataSet, a data reader, or business objects to retrieve and update our data is actually immaterial here. Ultimately, we're left to choose one of these three state management strategies. Table 9-1 summarizes the strengths and weaknesses of each.

Table 9-1: State-Management Strategies
Approach	Strengths	Weaknesses
State stored on web server	Easy to code and use. Works well with business objects.	Use of global variables/data is poor programming practice. To get scalability and fault tolerance via a web farm, we must introduce complexity of infrastructure.
State transferred to/from client	Scalability and fault tolerance are easily achieved by implementing a web farm.	Hard to code; requires a lot of manual work to implement. Performance can be a problem over slow network links.
State stored in file/database	Scalability and fault tolerance are easily achieved by implementing a web farm. We can efficiently store a lot of state data or very complex data.	Increased load on database server since we retrieve/store state on each page hit. Requires manual coding to implement. Data cleanup must be implemented to deal with abandoned state data.

As you can see, all of these solutions have more drawbacks than benefits. Unfortunately, in the more than seven years that the Web has been a mainstream technology, no vendor or standards body has been able to provide a comprehensive solution to the issue of dealing with state data. All we can do is choose the solution that has the lowest negative impact on our particular application.

Let's go into some more detail on each of these techniques, in the context of using business objects behind our web forms.

State on the Web Server

First, we can choose to keep state on the web server. This is easily accomplished through the use of the ASP.NET Session object, which is a name-value collection of arbitrary data or objects. ASP.NET manages the Session object, ensuring that each user has a unique Session, and that the Session object is available to all our Web Forms code on any page request.

This is by far the easiest way to program web applications. The Session object acts as a global repository for almost any data that we need to keep from page to page. By storing state data on the web server, we enable the type of host-based computing that has been done on mainframes and minicomputers for decades.

Tip

In ASP and COM, there were a number of limitations on the Session object, the most significant of which was that we couldn't put VB 6 objects into Session without incurring serious performance penalties. The ASP.NET Session object and .NET objects don't have these issues, so it's perfectly acceptable to store .NET objects in Session in ASP.NET.

As we've already noted, however, there are drawbacks. Session is a global repository for our user, but as any experienced programmer knows, the use of global variables is very dangerous and can rapidly lead to code that's hard to maintain. If we choose to use Session to store state, we must be disciplined in its use to avoid these problems.
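One way to apply that discipline is to route every access to a given Session entry through a single, strongly typed helper, so that the key string and the cast appear in exactly one place. The following is only a sketch: the `Project` class and the `ProjectSession` helper are hypothetical names invented for illustration, and the code assumes a Web Forms project that references System.Web.

```csharp
using System;
using System.Web.SessionState;

// Hypothetical business object; stands in for any [Serializable] class.
[Serializable]
public class Project
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper: the only code that knows the Session key.
public static class ProjectSession
{
    private const string Key = "currentProject";

    // Returns the project stored for this user, or null if none is stored.
    public static Project Get(HttpSessionState session)
    {
        return session[Key] as Project;
    }

    // Stores (or replaces) the project for this user.
    public static void Set(HttpSessionState session, Project project)
    {
        session[Key] = project;
    }

    // Removes the entry once the page flow no longer needs it.
    public static void Clear(HttpSessionState session)
    {
        session.Remove(Key);
    }
}
```

From a page, a call such as `ProjectSession.Set(Session, project)` replaces direct indexing into Session, which keeps the "global variable" confined to one small class.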

The use of Session also has scalability and fault-tolerance ramifications.

Using a Web Farm in ASP.NET

To achieve scalability and fault tolerance, we typically implement a web farm: two or more web servers that are running exactly the same application. It doesn't matter which server handles each user page request, because all the servers run the same code. This effectively spreads the processing load across multiple machines, thus increasing scalability. We also gain fault tolerance, since if one machine goes down, the remaining server(s) will simply handle user requests.

What I just described is a fully load-balanced web farm. However, because state data is often maintained directly on each web server, the above scenario isn't possible. Instead, web farms are often configured using "sticky sessions." Once a user starts using a specific server, the user remains on that server because that's where their data is located. This provides some scalability, because our processing load is still spread across multiple servers, but it provides very limited fault tolerance. If a server goes down, all the users attached to that server also go down.

To enable a fully load-balanced web farm, no state can be maintained on any web server. As soon as user state is stored on a web server, our users become attached to that server to the extent that only that server can handle their web requests. By default, the ASP.NET Session object runs on our web server in our ASP.NET process. This provides optimal performance because the state data is stored in process with our code, but this approach doesn't allow us to implement a fully load-balanced web farm.

Instead, the Session object can be run in a separate process on our web server. This can help improve fault tolerance, since the ASP.NET process can restart and users won't lose their state data. However, this still doesn't help us to implement a fully load-balanced web farm, so it doesn't help with scalability. Also, there's a performance cost because the state data must be serialized and transferred from the state management process to the ASP.NET process (and back again) on every page request.

As a third option, ASP.NET allows us to run the Session object on a dedicated, separate server, rather than on any specific web server. This state server can maintain the state data for all users, making it equally accessible to all web servers in a web farm. This does mean that we can implement a fully load-balanced web farm, in which each user request is routed to the least loaded web server. As shown in Figure 9-4, no user is ever "stuck" on a specific web server.

Figure 9-4: Load-balanced web-server farm with centralized state server
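In configuration terms, the out-of-process options are selected through the sessionState element in web.config. The fragment below is a minimal sketch: the host name `stateserver01` is a hypothetical placeholder, and it assumes the ASP.NET State Service is running on that machine and listening on its default port, 42424.

```xml
<configuration>
  <system.web>
    <!-- "stateserver01" is a hypothetical host name.
         To keep the state process on the web server itself,
         use tcpip=127.0.0.1:42424 instead. -->
    <sessionState mode="StateServer"
                  stateConnectionString="tcpip=stateserver01:42424"
                  timeout="20" />
  </system.web>
</configuration>
```

Every web server in the farm would carry the same setting, so that each one reads and writes Session data in the same place.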

With this arrangement, we can lose a web server with minimal impact. Obviously, users in the middle of having a page processed on that particular server will be affected, but all other users should be redirected to the remaining live servers transparently. All the users' Session data will remain available.

As with the out-of-process option that we discussed previously, the Session object is serialized so that it can be transferred to the state server machine efficiently. This means that all objects referenced by Session are also serialized, which isn't a problem for our business objects, since they're marked as [Serializable()].

Note	When using this approach, all state must be maintained in `[Serializable()]` objects.

In this arrangement, our fault tolerance is significantly improved, but if the state server goes down, then all user state is lost. To help address this, we can put the Session objects into a SQL Server database (rather than just into memory on the state server), and then use clustering to make the SQL Server fault tolerant as well.
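Switching Session to SQL Server storage is again a web.config change. This fragment is a sketch under stated assumptions: `sqlcluster01` is a hypothetical server name, Windows authentication is assumed, and the session-state database is assumed to have been created beforehand with the aspnet_regsql tool.

```xml
<configuration>
  <system.web>
    <!-- "sqlcluster01" is a hypothetical server name.
         No database name is given: by default ASP.NET uses the
         session-state database that aspnet_regsql creates. -->
    <sessionState mode="SQLServer"
                  sqlConnectionString="Data Source=sqlcluster01;Integrated Security=SSPI"
                  timeout="20" />
  </system.web>
</configuration>
```

Our page code doesn't change at all; only the location of the serialized Session data does.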

Obviously, these solutions are becoming increasingly complex and costly, and they also worsen performance. By putting our state on a separate state server, we now incur network overhead on each page request, since the user's Session object must be retrieved from the state server by the web server so that our Web Forms code can use the Session data. Once our page is complete, the Session object is transferred back across the network to the state server for storage.

Table 9-2 summarizes our options.

Table 9-2: Session Object Storage Locations
Location of State Data	Performance, Scalability, and Fault Tolerance
`Session` in process	High performance; low scalability; low fault tolerance; web farms must use sticky sessions; fully load-balanced web farms not supported.
`Session` out of process	Decreased performance; low scalability; improved fault tolerance (ASP.NET process can reset without losing state data); web farms must use sticky sessions; fully load-balanced web farms not supported.
`Session` on state server	Decreased performance; high scalability; high fault tolerance.

In conclusion, while storing state data on the web server (or in a state server) provides the simplest programming model, we must make some obvious sacrifices with regard to complexity and performance in order to achieve scalability and fault tolerance.

Transferring State to or from the Client

The second option we're considering is to transfer all state from the server to the client, and back to the server again, on each page request. The idea here is that the web server never maintains any state data: it gets all state data along with the page request, works with the data, and then sends it back to the client as part of the resulting page.

This approach provides high scalability and fault tolerance with very little complexity in our infrastructure: Since the web servers never maintain state data, we can implement a fully load-balanced web farm without worrying about server-side state issues. On the other hand, there are some drawbacks.

First of all, we're now transferring all the state data over what is typically the slowest link in our system: the link between the user's browser and the web server. Moreover, we're transferring that state twice for each page: from the server to the browser, and then from the browser back to the server. Obviously, this can have serious performance implications over a slow network link (like a modem), and can even affect an organization's overall network performance due to the volume of data being transferred on each page request.

The other major drawback is the complexity of our code. There's no automatic mechanism that puts all our state data into each page; we must do that by hand. Often this means creating hidden fields on our pages in which we can store state data that's required, but which the user shouldn't see. Our pages can quickly become very complex as we add these extra fields.

This can also be a security problem. When we send state data to the client, that data becomes potentially available to the end user. In many cases, our state data will include internal information that's not intended for direct consumption by the user. Sometimes, this information may be sensitive, so sending it to the client could create a security loophole in our system. Although we could encrypt this data, that would incur extra processing overhead and could increase the size of our data, so performance would be decreased.

To avoid such difficulties, applications often minimize the amount of data stored in the page by re-retrieving it from the original database on each page request. All we need to keep in the page then is the key information to retrieve the data, and any data values that we've changed. Any other data values can always be reloaded from the database. This solution can dramatically increase the load on our database server, but continues to avoid keeping any state on the web server.
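As a rough illustration of keeping only the key in the page, the sketch below stores a project's ID in the page's ViewState collection, which ASP.NET itself round-trips in a hidden form field, rather than in a hand-written hidden input. The `Project` class, its `GetProject` factory, and the `id` query-string parameter are all hypothetical, and the database call is stubbed out.

```csharp
using System;
using System.Web.UI;

// Hypothetical business object with a hypothetical factory method.
public class Project
{
    public Guid Id { get; private set; }
    public string Name { get; set; }

    public static Project GetProject(Guid id)
    {
        // A real implementation would query the database here.
        return new Project { Id = id, Name = "Loaded from database" };
    }
}

public class ProjectPage : Page
{
    private Project _project;

    protected override void OnLoad(EventArgs e)
    {
        base.OnLoad(e);

        Guid id;
        if (!IsPostBack)
        {
            // First request: the key arrives in the query string
            // (assumed parameter name "id"; throws if it is missing).
            id = new Guid(Request.QueryString["id"]);

            // Only the key travels to the client and back.
            ViewState["ProjectId"] = id.ToString();
        }
        else
        {
            // Postback: recover the key that came back with the page.
            id = new Guid((string)ViewState["ProjectId"]);
        }

        // Every request reloads the object from the database.
        _project = Project.GetProject(id);
    }
}
```

The web server holds nothing between requests, but each request now costs a database read, which is exactly the trade-off described above. Note that view state is encoded rather than encrypted by default, so even a key should be something we're willing to expose to the client.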

In conclusion, while this solution offers good scalability and fault tolerance, it can be quite complex to program, and can often result in a lot of extra code to write and maintain. Additionally, it can have negative performance impact, especially if our users connect over low-speed lines.

State in a File or Database

The final solution to consider is the use of temporary files (or database tables of temporary data) in which we can store state data. Such a solution offers other alternatives, including the creation of data schemas in which we can store state data so that it can be retrieved in parts, reported against, and so forth. Typically, these activities aren't important for state data, but they can be important if we want to keep the state data for a long period of time.

Most state data just exists between page calls, or at most, for the period of time during which the user is actively interacting with our site. Some applications, however, keep state data for longer periods of time, thereby allowing the user's "session" to last for days, weeks, or months. Persistent shopping carts and wish lists are examples of long-term state data that's typically stored in a meaningful format in a database.

Whether we store our state as a single blob of data or in a schema, storing it in a file or a database provides good scalability and fault tolerance. It can also provide better performance than sending the state to and from the client workstation, since communicating with a database is typically faster than communicating with the client. In situations like these, the state data isn't kept on the client or the web server, so we can create fully load-balanced web farms as shown in Figure 9-5.

Figure 9-5: Load-balanced web farm with centralized state database

Tip	As I mentioned earlier, one way to accomplish this is to use the ASP.NET `Session` object, and configure it so that the data is stored in a SQL Server database. If we just want to store arbitrary state data as a single chunk of data in the database, then this is probably the best solution.

The first thing you'll notice is that this diagram is virtually identical to the state-server diagram that we discussed earlier, and it turns out the basic model and benefits are indeed consistent with that approach. We get scalability and fault tolerance because we can implement a web farm, whereby the web server that's handling each page request retrieves state from the central database. Once the page request is complete, the data is stored back in the central state database. Using clustering technology, we can make the database server itself fault tolerant, thereby minimizing it as a single point of failure.

In conclusion, though this approach offers a high degree of scalability and fault tolerance, if we implement the retrieval and storage of the state data by hand it increases the complexity of our code. There are also performance implications, since all our state data is transferred across a network and back for each page request, and then there's the cost of storing and retrieving the data in the database itself.
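To give a sense of what "by hand" involves, here is a minimal sketch of a blob-style state store built on ADO.NET. Everything specific in it is an assumption made for illustration: the `StateStore` class, the `UserState` table and its columns, and the use of varbinary(max), which requires SQL Server 2005 or later.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Hypothetical store for a table assumed to look like this:
//   CREATE TABLE UserState (
//     StateId    uniqueidentifier PRIMARY KEY,
//     Data       varbinary(max)   NOT NULL,
//     LastAccess datetime         NOT NULL)
public class StateStore
{
    private readonly string _connectionString;

    public StateStore(string connectionString)
    {
        _connectionString = connectionString;
    }

    // Updates the row for this state ID, or inserts it if it doesn't exist.
    public void Save(Guid stateId, byte[] data)
    {
        const string sql =
            "UPDATE UserState SET Data = @data, LastAccess = GETDATE() " +
            "WHERE StateId = @id; " +
            "IF @@ROWCOUNT = 0 " +
            "INSERT INTO UserState (StateId, Data, LastAccess) " +
            "VALUES (@id, @data, GETDATE());";

        using (SqlConnection cn = new SqlConnection(_connectionString))
        using (SqlCommand cmd = new SqlCommand(sql, cn))
        {
            cmd.Parameters.AddWithValue("@id", stateId);
            // Size -1 maps the parameter to varbinary(max).
            cmd.Parameters.Add("@data", SqlDbType.VarBinary, -1).Value = data;
            cn.Open();
            cmd.ExecuteNonQuery();
        }
    }

    // Returns the stored blob, or null if no row exists for this ID.
    public byte[] Load(Guid stateId)
    {
        const string sql = "SELECT Data FROM UserState WHERE StateId = @id";

        using (SqlConnection cn = new SqlConnection(_connectionString))
        using (SqlCommand cmd = new SqlCommand(sql, cn))
        {
            cmd.Parameters.AddWithValue("@id", stateId);
            cn.Open();
            return cmd.ExecuteScalar() as byte[];
        }
    }

    // Cleans up abandoned state; returns the number of rows deleted.
    public int DeleteOlderThan(int minutes)
    {
        const string sql =
            "DELETE FROM UserState " +
            "WHERE LastAccess < DATEADD(minute, -@minutes, GETDATE())";

        using (SqlConnection cn = new SqlConnection(_connectionString))
        using (SqlCommand cmd = new SqlCommand(sql, cn))
        {
            cmd.Parameters.AddWithValue("@minutes", minutes);
            cn.Open();
            return cmd.ExecuteNonQuery();
        }
    }
}
```

Even this small class shows the extra responsibilities we take on: an upsert on every page hit, a read on every page hit, and a cleanup routine that something must call on a schedule.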

In the final analysis, determining which of the three solutions to use depends on the specific requirements of your application and environment. For most applications, using the ASP.NET Session object to maintain state data will offer the easiest programming model and the most flexibility. We can get optimal performance by running it in process with our pages, or optimal scalability and fault tolerance by having the Session object stored in a SQL Server database on a clustered database server. There are shades of compromise in between.

The key is that our business objects are [Serializable()], so the Session object can serialize them as needed. Even if we choose to implement our own blob-based file or data-storage approach, the fact that our objects are [Serializable()] means that we too can easily convert our objects to a byte stream that can be stored as a blob. If our objects were not [Serializable()], our options would be severely limited.
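The conversion to a byte stream can be sketched in a few lines with the .NET Framework's BinaryFormatter. The `Project` class and the `StateBlob` helper are hypothetical names used only for illustration. Keep in mind that BinaryFormatter belongs to the .NET Framework era this text targets; it is deprecated in current .NET releases and should never be used to deserialize data from an untrusted source.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

// Hypothetical business object; the attribute is what makes it storable.
[Serializable]
public class Project
{
    public Guid Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical helper that converts an object graph to and from a blob.
public static class StateBlob
{
    // Serializes the object (and everything it references) to a byte array.
    public static byte[] ToBytes(object graph)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            new BinaryFormatter().Serialize(buffer, graph);
            return buffer.ToArray();
        }
    }

    // Rebuilds a new copy of the object graph from the byte array.
    public static object FromBytes(byte[] blob)
    {
        using (MemoryStream buffer = new MemoryStream(blob))
        {
            return new BinaryFormatter().Deserialize(buffer);
        }
    }
}
```

A round trip such as `(Project)StateBlob.FromBytes(StateBlob.ToBytes(project))` yields a separate copy of the object with the same field values. If any object in the graph lacks the [Serializable()] attribute, Serialize throws a SerializationException, which is why the attribute matters so much.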

For our sample application, we'll use the Session object to help manage our state data, but we'll use it sparingly, because overuse of global variables is a cardinal sin!