The following section of this chapter deals with the issue of creating websites that function in a web farm. A web farm is essentially a cluster of web servers, all running the same web application. That cluster allows increased user volume, load, and capacity on a website by providing additional hardware, memory, and processing power to sustain and scale the website.
In addition to providing increased scalability by allowing your application to support more concurrent users and higher volume, the use of a farm also can provide increased availability. What do you do if your main web server has a hard drive crash? What is your plan for surviving hardware failure, or server crashes due to other software problems? If you have only one server, you can be guaranteed that you won't survive any of those circumstances without your website being completely unavailable.
Some hardware devices (and some software as well) that create web farms also support the notion of failover. The concept is that when one of the servers in a farm becomes too slow, unresponsive, or completely crashes, other servers in the farm can temporarily handle the troubled server's load until it can be repaired or replaced. This provides a continuously available online presence, even when a server is down. In addition to giving you the ability to recover from hardware problems, a farm like this enables you to install new versions of your website one machine at a time, gradually transferring users from the old version to the new version as you upgrade each server. This type of upgrade can take place without any downtime experienced by any of your users.
Unfortunately, creating websites that not only take advantage of this scalability but function properly within such an environment is anything but trivial. There are quite a few things that a developer needs to do and be aware of to create applications that function in a web farm. The following section will prepare you for creating farm-friendly web applications in ASP.NET.
ViewState in a Web Farm
ViewState is a feature provided by ASP.NET that allows a web page to maintain state between requests. Because HTTP is a stateless protocol, developers often had to come up with their own methods for maintaining state between pages to support items such as dynamic controls and wizards, and to support progressive activities such as shopping carts and wish lists. The code to create those features by hand is fairly extensive. ViewState makes dealing with those situations much easier, but at a price: ViewState can be very large and adds quite a bit of weight to pages, because the page's state is serialized, encoded, and protected (by a validation hash and, optionally, encryption) inside a hidden form field that travels with every request and response.
That protection is where some of the problems show up with ViewState and web farms. The scenario that gives many of us painful headaches is this: the user receives a page containing protected ViewState from one server. The user then performs some action on that same page, sending the ViewState back, but the request ends up on a different server at the back end. By default, ViewState is validated with a MAC (message authentication code), a keyed hash that tells ASP.NET whether the ViewState has been tampered with. This is a very handy thing because it protects you from potential attacks involving modified ViewState contents. The downside is that, by default, the key behind the MAC is machine-specific. In other words, the MAC generated by one server will not match the MAC computed by another server, even if they are in the same farm, and the second server rejects the ViewState as invalid.
In the web.config file, you can set the enableViewStateMac attribute of the <pages> element to false to turn off this validation if you aren't worried about security and you want your ViewState to work properly within the farm. Doing so removes the tamper protection entirely.
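In web.config, this setting is expressed as the enableViewStateMac attribute on the <pages> element. The fragment below is a minimal sketch of where it sits; the surrounding elements are the standard configuration hierarchy, and the rest of your web.config is assumed to be unchanged.

```xml
<configuration>
  <system.web>
    <!-- Turns off ViewState MAC validation for every page in the application.
         ViewState posted back to any server is then accepted without a
         tamper check, so treat this as a last resort. -->
    <pages enableViewStateMac="false" />
  </system.web>
</configuration>
```

Because the setting applies to the whole application, it would need to be identical on every server in the farm.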
If you do need the security of tamper-proof ViewState (this also applies to session keys), you can provide a <machineKey> tag in the web.config file. This tag sets up a validation key and a decryption key. The validation key is used for generating the MAC that verifies the validity of state variables, and the decryption key is used wherever that data is encrypted. If every server in your farm has the same validation and decryption keys, they all know how to read and interpret ViewState generated by any other server in the farm.
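When specifying the validation key, Microsoft recommends that it be no smaller than 20 bytes (40 hexadecimal characters) and no greater than 64 bytes (128 hexadecimal characters). The decryption key is shorter and must match the encryption algorithm: for example, 8 bytes (16 hexadecimal characters) for DES or 24 bytes (48 hexadecimal characters) for Triple DES.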
The following is an example of an entry in a web.config file that, when set on each application server within a farm, will allow ViewState and session state keys to be handled by any server in the farm:
<machineKey validationKey="... random sequence of 128 hex characters ..." decryptionKey="... another random sequence of 48 hex characters ..." validation="SHA1" />
With code like this in place in your application's web.config file, you can rest easy that you're well on your way to being able to run within a farm. This is just the first step, however. There are still quite a few more things that you need to keep in mind in order to build a farm-friendly application.
Session State Maintenance in a Web Farm
If you have worked with ASP.NET for any length of time, you are probably already familiar with the concept of session state. A user's session begins when the user's browser makes its first request to the website. The session then continues until the server has not received any communication from the client within a certain timeout period. In this way, the stateless limitation of HTTP can be overcome to provide a rich and interactive experience for users. Session state is used to store everything from information on the current user to things such as shopping carts, wizard state, search results, cached data, and pretty much anything else that a developer feels should be easily accessible and divided on a per-session basis.
In its default configuration, ASP.NET session state is a large set of name-value pairs. Every time a piece of code accesses a session state variable, it does so by name. One detail that this model deliberately hides is that the session state dictionary is actually a nested dictionary. The top-level dictionary maps each unique session identifier to a dictionary containing all the state variables for that session.
Each time a page that has session state enabled is loaded, it retrieves from memory all the name/value pairs that are appropriate for the current session identifier.
Because this session state variable dictionary is stored in memory at the AppDomain scope, it doesn't work within a web farm. Each server would be maintaining its own session state. Used in this default form, users could lose all of their session information if they were transferred from one server to another.
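For reference, the in-process behavior described here corresponds to the InProc mode of the <sessionState> tag. The fragment below is a sketch of the equivalent explicit setting, assuming the framework defaults of cookie-based session IDs and a 20-minute timeout.

```xml
<configuration>
  <system.web>
    <!-- In-process session state: the dictionary lives in the memory of the
         web application itself and is never visible to other servers. -->
    <sessionState mode="InProc" cookieless="false" timeout="20" />
  </system.web>
</configuration>
```

Each server in a farm running in this mode keeps its own private copy of the dictionary.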
A way around this is to use an out-of-process session state provider. This essentially means that you can configure ASP.NET to use a shared session state provider. In this configuration, when pages are loaded, the session state variables are retrieved from the shared provider. During the page's execution, those values can be changed. Finally, when the page is done executing, changes to the current session are saved to the out-of-process session state provider.
There is one caveat, however, when writing ASP.NET code that utilizes background threads and out-of-process session state servers. When a background thread makes changes to the (out-of-process) session state, the foreground thread will not automatically be made aware of that change. Take care when creating multithreaded ASP.NET applications that use out-of-process session state.
Although you can create your own if you are feeling ambitious, two out-of-process session state servers ship with ASP.NET: the SQL Server session state provider and the ASP.NET State Server. Both are discussed in the following sections.
Session State Maintenance with SQL Server
SQL Server session state is, obviously enough, a session state provider that uses SQL Server as the persistence medium. The first time your ASP.NET page loads, a call is made to SQL Server to obtain the list of name-value pairs that are applicable to the current session ID. The session ID is retrieved from either a cookie or the URL of the current page, depending on how the site is configured.
Throughout the execution of the page, changes made to the session state are stored in a temporary, in-memory session state dictionary. When the page finishes its execution, the changes made to session state are sent to SQL Server to be saved.
As you might have guessed, this method of using SQL Server for a session state manager isn't as fast or as efficient as using the in-process provider that is the ASP.NET default. It does have its benefits, however. It is far more reliable than the default session state provider, and it can be used as a shared provider for multiple web servers in a farm. In fact, you can even use the same SQL Server instance to provide session state management services for multiple web farms if you play with configurations a little.
To set up a SQL Server instance to be used as a session state provider, you first need to run a script that will configure all the appropriate data structures. You actually have two choices of scripts to run: InstallSqlState.sql and InstallPersistSqlState.sql. These are both SQL text files that you can execute from within SQL Query Analyzer or by running a SQL command-line tool. The first script will configure SQL Server to act as a host for session state management, with the information being stored in the tempdb database. As you might know, SQL Server empties the contents of that database on startup.
If you want the information contained in a user's session to survive a reboot of the SQL Server machine, you can use the second script, InstallPersistSqlState.sql. The second script creates a persistent database called ASPState that will maintain user session data even after the server has been rebooted. In addition to creating the data structures, it creates a job that periodically empties out the database of expired session state data. For this feature to work properly, the SQL Agent service must be running on the state management server. As with all session state providers, the interface for accessing session state variables is the same without regard for provider or provider location.
Even though the interface looks the same, you cannot assume that it will work the same. For example, all out-of-process session state servers work via serialization; they do not pass object references like the in-process provider. This means that any object instance you store in the session must not only be serializable, but it must also be stable. Here stable means that if you serialize an object and then reconstruct its graph by deserializing it, the object should have the same state. To avoid potential problems with serialization, always make your session objects serializable, and always make sure that you don't make use of private fields or one-way (only a get or set accessor provided, not both) properties.
To configure your web application to use SQL Server session state, set your <sessionState> tag in web.config as follows (note that the mode attribute is case-sensitive):
<sessionState mode="SQLServer" sqlConnectionString="server=(server); user id=(user); password=(password);" cookieless="false" timeout="60" />
The sqlConnectionString attribute follows the usual rules for a SQL Server connection string, with one exception: it does not name a database. The provider uses the ASPState database created by the install scripts, and ASP.NET rejects a connection string that contains an initial catalog or database setting unless custom databases have been explicitly allowed (the allowCustomSqlDatabase attribute in ASP.NET 2.0 and later). You can either specify the user credentials in the connection string (generally considered a security risk) or indicate integrated security.
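As an illustration of the integrated security option, a minimal sketch might look like the following. It assumes that the Windows account the ASP.NET worker process runs under has been granted access to the session state database; (server) remains a placeholder for your own SQL Server machine.

```xml
<configuration>
  <system.web>
    <!-- No user id or password appears in the file; SQL Server authenticates
         the Windows identity of the ASP.NET worker process instead. -->
    <sessionState mode="SQLServer"
                  sqlConnectionString="data source=(server);Integrated Security=SSPI;"
                  cookieless="false"
                  timeout="60" />
  </system.web>
</configuration>
```

The trade-off is that credentials stay out of web.config, but every web server in the farm then needs to run under an identity that the SQL Server machine recognizes.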
Session State Maintenance with the ASP.NET State Service
If you don't work for a SQL Server shop, if you don't want or need the additional overhead of maintaining a separate SQL Server machine for state maintenance, or if you just have something against servers made from three-letter acronyms, there is a more lightweight alternative in the ASP.NET State Service. The ASP.NET State Service is a Windows service that you can activate to handle requests from an ASP.NET application for state management. It is installed by default in the (system)\Microsoft.NET\Framework\(version) directory as the file aspnet_state.exe.
To configure State Server in the web.config file, use the following <sessionState> tag format (again, the mode attribute is case-sensitive):
<sessionState mode="StateServer" stateConnectionString="tcpip=(host):(port)" cookieless="false" timeout="60" />
The advantages of State Server are that it is already installed on every machine with ASP.NET, it has a smaller memory and processor footprint than SQL Server, and you don't have to worry about licensing issues and database load. However, although you can configure SQL Server to maintain a persistent state database that survives reboots, the ASP.NET State Service loses all of its state information when it is shut off for any reason.
Application State in a Web Farm
Whereas session state refers to information that is specific to a particular session on a website, application state refers to information that persists throughout the lifetime of an application (the time between reboots or restarts of IIS).
Each ASP.NET application runs in its own AppDomain. Within that AppDomain is an area that is available to be used for maintaining state. In fact, the AppDomain object has a dictionary built into it that enables you to get and set name/value pair data via the GetData and SetData methods. Although this might seem handy, it can be troublesome in a web farm, because every server in the farm holds its own separate copy of that state.
You can't share application state between applications (or between copies of the same application on different servers) without writing your own custom code. You'll see some alternatives and options for sharing application state in the best practices and recommendations section later in the chapter. If you must use application state, you will have to come up with your own scheme for sharing or synchronizing application data, be it with a database or some other method.
Web Farm Configuration and Deployment
As you have seen so far, one of the most important issues when dealing with web farms and ASP.NET is in the configuration of the website. You've seen that multiple modifications have to be made to the web.config file to allow ViewState and session state keys to be properly interpreted. Changes must also be made to configuration sections for setting up out-of-process session state management.
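Pulling those pieces together, the farm-related portion of a web.config file might look roughly like the sketch below. It assumes the State Server option for session state; the key values, host, and port are placeholders in the same style as the earlier examples, not working values, and the same file would be deployed to every server in the farm.

```xml
<configuration>
  <system.web>
    <!-- Identical keys on every server, so that ViewState and session keys
         issued by one machine validate on all the others. -->
    <machineKey validationKey="... same validation key on every server ..."
                decryptionKey="... same decryption key on every server ..."
                validation="SHA1" />

    <!-- Out-of-process session state shared by the whole farm. -->
    <sessionState mode="StateServer"
                  stateConnectionString="tcpip=(host):(port)"
                  cookieless="false"
                  timeout="60" />
  </system.web>
</configuration>
```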
There are a few more elements that need to be taken care of in order to properly configure and deploy a web farm. The first and foremost item is the actual creation of a farm. The creation of server sharing, farming, and clustering is outside the scope of this book. It should suffice to know that there are devices that enable you to set up one IP address that forwards to any number of other IPs in the farm, based on a variety of factors such as load and availability. Users come into your site using the external, single IP address, and are then sent seamlessly to one of the machines within the farm. You always need to remember that in a true farm, you have no control over which machine your code is currently executing within and you need to set up your design and architecture with that fact in mind.
IIS Configuration in a Web Farm
After your application has been designed, coded, and configured to run properly within the context of a web farm, and the farm has been created and is running properly, there is one last step: You need to make sure that IIS is configured properly.
If you are using out-of-process session state management in your web farm, and IIS is configured incorrectly, you can actually see data inconsistencies or session state simply not being maintained. The reason for this is that the out-of-process session state is keyed to the application path within the IIS metabase. Each web application in IIS is given a path, such as \LM\W3SVC\1 or \LM\W3SVC\2. If this path doesn't match for every single server in your web farm, those servers will not be able to share session state within that farm. There are two tools that you can use to obtain the application path from the IIS metabase: AdsUtil (a script written in VBScript; available with the NT option pack) and MetaEdit, a tool that you can get from the NT option pack and Windows 2000 CDs.
Before you give the word that your farm-friendly application is live, in production, and ready to go, check the metabase on each machine and make sure that your application has the same application path on every web server in the farm.
Web Farm Best Practices, Recommendations, and Caveats
As was mentioned earlier in the chapter, ViewState can add a considerable amount of overhead to a page. It can take a relatively long period of time to decrypt and verify the ViewState form variable. In addition, depending on how much information is stored in ViewState, it can actually increase the download time of a web page. On many projects, pages store a complete DataGrid control in the ViewState. For large results, the ViewState form variable can actually consume several hundred kilobytes of space, sometimes reaching over 1MB in size. On a LAN, you might not complain too much about that, but over a slow or weak internet connection, users will certainly notice that page taking too long to load and render.
It is recommended that you disable ViewState unless you are sure that you absolutely need the features it provides. In fact, you should probably disable ViewState throughout your entire web site as a rule, and turn it on only when you have discovered that no other alternative will work. Programmers often store data in ViewState to avoid requerying that same data on subsequent loads of the page. Get out your stopwatch and measure it. Which takes longer: processing the download and upload of ViewState, or querying the database on the back end? Before you quickly turn to ViewState as a catch-all to make things easier, examine your options regarding your database and server-side caching to see whether you can avoid using it.
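One way to apply the "off by default" rule is the enableViewState attribute of the <pages> element in web.config. The fragment below is a minimal sketch of that approach and assumes nothing else in your configuration overrides it.

```xml
<configuration>
  <system.web>
    <!-- ViewState is disabled for every page in the application unless
         an individual page opts back in. -->
    <pages enableViewState="false" />
  </system.web>
</configuration>
```

A page that genuinely needs ViewState can then re-enable it for itself by setting EnableViewState="true" in its @ Page directive, which keeps the exceptions visible and easy to audit.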
If you are using out-of-process session state, you need to be aware of a crucial fact: Out-of-process session state performs two tasks that both cost performance. The first is a network connection to a server somewhere. Although it might be a fast connection, any off-machine connection will always be slower than an in-process data operation. The second is serialization: all out-of-process state management is accomplished by serializing session data. The more complex an object is, the larger its serialized representation and the longer it takes to serialize and deserialize on the network stream. In particular, DataSets (prior to .NET 2.0) serialize into extremely large XML representations that can potentially cause severe delays in state management. Also keep in mind that connections are made to the session state provider at both the beginning and the end of page rendering, so any large object that you have in session state can potentially slow down the pipeline twice per page view.
It has been mentioned before, but it's worth mentioning again: Any object that is stored in out-of-process session state has to be serializable. You must be able to restore the state from a serialized graph for that object to work properly with session state. Keep the data you store in the session small and simple. Sticking to base .NET Framework types will make things a lot easier (and faster).
Application state is an area that can easily be abused. Because the state data stored in the AppDomain object is globally scoped throughout the AppDomain, any large data there is a burden on the garbage collector. Large data will stay in memory for an extremely long time, even if you don't use it. Keep your use of this dictionary (many consider it a crutch to be avoided) to a minimum. If you have to use it, store primitive types or small classes that are easily (and quickly) serialized (no DataSets).
If you truly need shared application state, resist the urge to do something cool and fancy and rig up some kind of Remoting system. Something like that will probably generate a lot of development work and a maintenance headache, when you probably could have used a table in your application's database for application state information. Granted, there are situations in which Remoting or using web services to synchronize application data within a farm is necessary, but those situations are rare and usually not practical.
Many developers easily fall into the habit of assuming that their desktop is the same as the deployment environment. With a few extra configuration steps, and keeping a few tips in mind, you can seamlessly take your single-server application and make it work smoothly within a 2-server, 20-server, or 200-server web farm.