We covered SSL redirection in the previous chapter and will discuss firewall load balancing in the next, so this chapter will focus on one of the most common uses for application redirection: Web Cache Redirection (WCR). WCR is one of the most widely deployed services on the Internet today. As the Internet has grown and more content has become available, ISPs have seen the bandwidth demands on their infrastructure increase. Imagine an ISP with 300,000 subscribers, 10,000 of whom are online simultaneously. What if 10 percent of those online all wanted to look at the same Web page? Without the content being local, every user would have to be routed to the content source. This is a huge duplication of data, a massive drain on resources, and an excessive waste of bandwidth. So, why not provide a device that stores, or caches, this content locally? Caching is not a new concept and has been used in PC network devices for many years. It was only in the late 1990s, with the growth of the Internet, that caching appliances came to the forefront. Early software-based caches certainly existed, but as access speeds increased, dedicated caching appliances became an important part of internetworking design. Every major ISP in business today will have some form of caching in place to minimize bandwidth usage and increase user performance, as can be seen in Figure 8-2.

Figure 8-2. Cached content can be served quickly, while uncached content needs to traverse the network back to the origin server.
Before we rush off and discuss WCR, let's take a step back and spend some time understanding how caching works and the different methods involved in providing this service.

How Caching Works

In short, caching is the function of retrieving data, storing it locally, and then having it readily available for use when next requested. Ensuring that content is current, or fresh, is a key element that caching manufacturers take very seriously. Today, caching vendors have designed appliances to store and access data extremely quickly. The mechanisms used to do this are typically proprietary, and each has its advantages and disadvantages. It is safe to say that the caching of data for specific applications is what can differentiate vendors from one another. Some typical examples are:
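The retrieve-store-serve cycle described above can be sketched in a few lines. This is a minimal illustration of the concept, not any vendor's implementation; the class and object names are invented for the example.

```python
# Minimal sketch of the caching cycle: check the local store first,
# and go back to the origin server only on a miss.

class SimpleCache:
    def __init__(self, origin_fetch):
        self.store = {}                   # local object store (memory/disk in a real cache)
        self.origin_fetch = origin_fetch  # callable standing in for the origin server

    def get(self, url):
        if url in self.store:             # cache hit: serve locally, no upstream traffic
            return self.store[url], "HIT"
        body = self.origin_fetch(url)     # cache miss: retrieve from the origin
        self.store[url] = body            # keep a copy for the next requester
        return body, "MISS"

# The first request traverses to the "origin"; the second is served locally.
cache = SimpleCache(lambda url: f"<html>content of {url}</html>")
print(cache.get("/index.html")[1])  # MISS
print(cache.get("/index.html")[1])  # HIT
```

In a real appliance the store is bounded and objects are evicted and revalidated, but the hit/miss decision shown here is the heart of the service.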
With the advent of streaming media and faster access speeds, users are demanding faster and better services from their ISPs. In addition, as compression techniques improve and companies look to use media as another aspect of their business, streaming media services have become extremely popular. However, maintaining a steady, jitter-free stream of data from the origin server to the user is a challenge on today's overcrowded and oversubscribed backbones. It is here that caching comes to the fore, providing a solution that allows data to be retrieved and viewed without impact to the origin server or the backbone. This is true for all forms of data.

Cache Hits

Cache hits are used to describe how successful a cache is. Caches can use both memory and hard disk space to store data. Vendors design algorithms to ensure that the most frequently accessed objects are stored in memory, and some also allow the network administrator to force (pin) objects into memory. The reason for this is that a cache's function in life is to be hit as often as possible and to serve that data back to the requesting device. Monitoring cache hits indicates how successful a cache is. A cache hit occurs when the requested object is found in the cache store, either in memory or on the hard disk. Hit rates over 50 percent are seen as adequate, and obviously the more cache hits your cache achieves, the less your bandwidth or servers are being hit. We will see later that maximizing cache hits is a key design criterion, and simple things such as load balancing metrics can affect it.

Caching Fundamentals

There have been many studies carried out on how many objects make up a typical Web page, and results vary between 30 and 80 different objects. As Web design becomes more intricate and innovative, a typical Web page will only increase in size.
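The hit rate used to judge a cache is simple arithmetic, worked through here with hypothetical counts (the 50 percent threshold comes from the text; the traffic numbers are invented for illustration):

```python
# Hit rate = hits / (hits + misses). Over 50 percent is considered
# adequate; every hit is a request that never consumes upstream
# bandwidth or load on the origin servers.

def hit_rate(hits, misses):
    total = hits + misses
    return hits / total if total else 0.0

# Hypothetical day of traffic: 6,500 of 10,000 requests served locally.
rate = hit_rate(6500, 3500)
print(f"{rate:.0%}")  # 65% -> only 35% of requests go upstream
```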
If each object is seen as a single HTTP GET, then between 30 and 80 TCP sessions will have to be created in order to download a typical Web page. We should point out here that it is the use of HTTP/1.0 that causes this to happen. HTTP/1.1 was developed to overcome this limitation, as it allows a TCP session to remain open and multiple HTTP GETs to be sent down this single connection. This has the benefit of reducing TCP session setups and teardowns, which increases performance. Figure 8-3 illustrates the difference when using HTTP/1.0 and HTTP/1.1.

Figure 8-3. HTTP/1.0 vs. HTTP/1.1 shows how session setup is reduced when using HTTP/1.1.
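The HTTP/1.1 behavior described above can be demonstrated with the standard library: several GETs share one TCP session instead of opening one per object. A throwaway local server stands in for the origin here, and the object paths are illustrative.

```python
# Sketch of HTTP/1.1 persistent connections: three objects fetched
# over a single TCP session, where HTTP/1.0 would need three sessions.

import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"            # enable keep-alive on the server side
    def do_GET(self):
        body = b"object " + self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))  # required for reuse
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):            # silence per-request logging
        pass

server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
statuses = []
for path in ["/", "/logo.png", "/style.css"]:  # three objects, one TCP session
    conn.request("GET", path)                  # http.client speaks HTTP/1.1 by default
    resp = conn.getresponse()
    resp.read()                                # drain the body so the socket can be reused
    statuses.append(resp.status)
conn.close()
server.shutdown()
print(statuses)  # [200, 200, 200]
```

With HTTP/1.0, each of those three requests would have paid the full TCP setup and teardown cost.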
Of the objects that make up a Web page, some are dynamic and some are static. By dynamic, we mean that they change often (every second to every few minutes, or even hours). Typical examples of dynamic content would be news services, stock price tickers, sports scores, and so forth. Static content, on the other hand, can be around for hours before it changes. A classic example is the date on a Web page, which would obviously change only every 24 hours. Site names, banners, borders, company logos, and so forth would very rarely change. Therefore, if a site has been accessed before, why should another user coming from the same place have to consume bandwidth in order to download static content that could very easily be stored locally? It is here that caches make use of the HTTP/1.0 and HTTP/1.1 protocol header information to determine which objects need to be refreshed and which are still current. While developers also often try to force content to be refreshed (or not) by embedding messages within the HTML, this alone cannot guarantee content freshness. We need to make use of the HTTP headers. HTTP headers are discussed in detail in Chapter 2, Understanding Layer 2, 3, and 4 Protocols, but we will run through the most important headers used when performing caching.

HTTP/1.x

Let's look at what happens when a device accesses a Web page and understand the different HTTP headers used. There are many different types of HTTP headers, but the one at which we will look a little more closely is the General header. This header contains the Cache-Control header, and it is this that determines what the cache must do with a request. The cache cannot disobey this and must honor the directives specified in the Cache-Control header. The following are the three most common Cache-Control directives used in HTTP/1.1:
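The freshness decision that a cache derives from the Cache-Control header can be sketched as follows. This is a simplified illustration of common directive handling (no-store, no-cache, max-age), not a complete HTTP/1.1 implementation; the header strings are invented examples.

```python
# Hedged sketch: deciding whether a cached copy may be served as-is,
# driven by the Cache-Control directives the cache must honor.

def must_revalidate_now(cache_control, age_seconds):
    """Return True if the cached copy cannot be served without
    going back to the origin server."""
    directives = {}
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        directives[name.lower()] = value
    if "no-store" in directives or "no-cache" in directives:
        return True                       # never serve from cache without revalidating
    if "max-age" in directives:
        return age_seconds >= int(directives["max-age"])  # stale once older than max-age
    return False

print(must_revalidate_now("max-age=3600", 120))   # False: object is still fresh
print(must_revalidate_now("max-age=3600", 7200))  # True: stale, refetch from origin
print(must_revalidate_now("no-cache", 0))         # True: always revalidate
```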
Let's now look at the process that takes place when retrieving a Web page.
This may seem like a long-winded and time-consuming function, but it must be remembered that only very small requests are sent across the Internet, imposing minimal delay. The more a cache is used, the more effective it becomes, and the quicker the response will be. Understanding how caching operates is important when discussing WCR. Unfortunately, we will not go into a great deal of caching principles, as this is a subject in its own right, and there are many books and articles that cover only this topic. We have tried to give a brief overview to assist in understanding the fundamentals while reading this chapter.

HTTP Status Codes

Part of the HTTP protocol specification is to provide a mechanism for feedback to the application (and user) on the success or failure of a request. We will also see that content switches make use of this feature when performing advanced features such as firewall load balancing and global server load balancing. There are generally five different areas of status codes:
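The five areas referred to above correspond to the first digit of the status code. As a summary (this table paraphrases the standard status-code classes, not the book's own wording):

```python
# The five HTTP status-code classes, keyed by the first digit.

STATUS_CLASSES = {
    1: "Informational (100-199): request received, processing continues",
    2: "Success (200-299): request understood and accepted",
    3: "Redirection (300-399): further action needed, e.g. 301/302",
    4: "Client Error (400-499): problem with the request, e.g. 404",
    5: "Server Error (500-599): the server failed to fulfill it, e.g. 503",
}

def classify(status):
    return STATUS_CLASSES[status // 100]   # 404 -> class 4, 503 -> class 5

print(classify(404))  # Client Error (400-499): problem with the request, e.g. 404
print(classify(503))  # Server Error (500-599): the server failed to fulfill it, e.g. 503
```

A run of 5xx codes points upstream at the server, while 4xx codes point at the request itself, which is exactly the troubleshooting distinction the text draws.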
It is important to be aware of HTTP status codes when troubleshooting a network, as they can indicate what is happening upstream or downstream from your location. In addition, they can also point clearly to the fact that the server is not issuing content and that it is not the fault of the content switch.

Cache Types

There are many different modes in which to deploy a cache, with each having certain benefits over the others. The most common deployment modes are:
We will look at each in a bit more detail and understand their capabilities as they are intended to function in stand-alone mode. Caching vendors design their products to be able to operate without the intervention of a content switch. While this is possible, protocols such as WCCP and ICP (discussed later in the chapter) have been developed to overcome the lack of a content switch. These protocols provide some level of scalability and resilience, but fall short of the mark when competing against the functionality of a content switch. It would, however, be safe to say that using a content switch to provide the redirection, intelligence, and resilience greatly enhances a caching appliance's performance. Speed and intelligence are what differentiate caches from each other, so it makes sense not to overload a cache with tasks not directly associated with serving content quickly.

Forward Proxy

This is an extremely popular method of deployment, as the cache is seen as the device to which all data is sent. Users and servers configure themselves to point to the cache. When a packet arrives at the cache, the cache terminates that session, determines what content is required, and then requests the content from the origin server on the client's behalf. The client IP address is not passed to the origin server, and all requests appear to have come from the forward proxy. There are many benefits to using a forward proxy.
When using a forward proxy, the destination TCP port is usually set to 8080 or something similar. This allows the proxy server to listen on that specific port for all incoming traffic, knowing that this is the traffic it must proxy out onto the Internet. On receipt, the forward proxy opens a new session with the original destination address. The source IP is that of the forward proxy, and the destination TCP port is changed to 80. While forward proxies are an excellent method for accelerating network access and providing a certain level of control, they also have certain disadvantages:
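The rewrite described above can be summarized as a before-and-after view of the connection. This sketch only models the addressing, with invented host names and the port numbers from the text; a real proxy also terminates and re-originates the TCP session.

```python
# Sketch of the forward-proxy rewrite: the client talks to the proxy
# on port 8080; the proxy re-originates the request to the origin on
# port 80, substituting its own address as the source.

def rewrite_for_origin(client_ip, dest_host):
    inbound = {"src": client_ip, "dst": "proxy.example.net", "dport": 8080}
    outbound = {
        "src": "proxy.example.net",   # origin sees the proxy, not the client
        "dst": dest_host,
        "dport": 80,                  # re-targeted to the well-known HTTP port
    }
    return inbound, outbound

inbound, outbound = rewrite_for_origin("10.1.1.25", "www.example.com")
print(outbound["src"], outbound["dport"])  # proxy.example.net 80
```

This substitution of the source address is exactly why the origin server can never see which client actually made the request.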
Figure 8-4 illustrates a forward proxy cache. Figure 8-4. A forward proxy cache handles all traffic destined for the external network using itself as the intermediary.
It must be noted that there will always be pros and cons to any deployment, and it is up to the network architect to determine the best solution in order to complement and maximize the design. Forward proxy caching can accelerate your network and increase availability while providing a simple, manageable access point. Understanding the benefits and weighing them against the disadvantages will ensure a successful implementation of a caching service.

Transparent Proxy Caching

In contrast to a forward proxy cache, a transparent cache is exactly that: transparent. Typically, it sits in the data path and interrogates every packet to determine what is Web-based traffic and what is not. On receipt of a packet for which it is proxying, albeit transparently, the cache either retrieves the data from its local store or retrieves it from the origin server. Most transparent caches can also preserve the source IP when requesting data. This allows the origin server to see exactly who accessed the data, and it is a fundamental difference between the forward proxy and the transparent proxy; however, a transparent proxy is not obliged to perform this IP spoofing. It can function in exactly the same way as a forward proxy and request the data on the client's behalf. The key difference between the forward and transparent proxy is that the transparent proxy sits in the data path. Therefore, in effect it could be set up as the default gateway and receive all traffic destined for other networks. Transparent caching is a popular method of deployment, but is usually deployed in conjunction with an external device. Deployed on its own, it has the following disadvantages:
That said, transparent proxy caching is still an extremely compelling solution. Deploying it allows zero desktop administration, as requests are intercepted en route to the destination. This means that no browser changes are required. The advantages of using a transparent cache are almost the same as those of a forward proxy:
Figure 8-5 illustrates a transparent proxy cache. Figure 8-5. Transparent proxy cache.
Reverse Proxy Cache

Reverse proxy, server acceleration, Web server acceleration, and server-side caching are all names used to describe this method of caching. Whichever one you use, the bottom line is that this method increases site performance. While all the caching fundamentals are the same, this method is typically deployed by the content owner, who wants to increase site performance in order to better serve the customer. The advantages of this are as follows:
Like all deployment models, there will always be some disadvantages. However, reverse proxy caching has very few, as it benefits both the user and the content owner. The content owner who installs it benefits from increased site performance and user retention. These advantages are typically what the user requires anyway, so in most cases a reverse proxy cache is a "win-win" situation for both user and content owner. Figure 8-6 illustrates a reverse proxy cache.

Figure 8-6. Reverse proxy cache.
While server and network administrators look to enhance their networks, caching provides an excellent mechanism to do this and without doubt is an important part of any site that wants to accelerate and maximize its existing infrastructure. However, deploying caching on its own, without a load-balancing device, brings with it many challenges. Ensuring scalability and resilience is foremost in any network administrator's mind. There are protocols available to provide some form of communication between caches. The two most common today are:
While there are a few others, these two have evolved and emerged as probably the most widely deployed. We will look at each in a little more detail to understand its advantages and disadvantages.

ICP

ICP is a mechanism that allows caches to be configured in a hierarchical fashion. Caches understand which is the parent and which are their peers in this configuration. ICP allows a cache to communicate with its peers or parent if it does not have the content requested. This allows caches to make use of local or hierarchical content from other caches, rather than sending the request across the Internet. This maximizes a caching farm's hit rate and allows for sharing of content. A cache will not send out an ICP request to its neighbors if the content is local. If the content is not found locally, the cache will send an ICP request to its peers requesting a reply from them. If the content is still not found, the request is sent up the hierarchy until the content is found or retrieved from the origin server. It is then sent to the original cache, which serves it back to the user. It must be pointed out here that ICP adds an overhead to any request, as the caches first have to communicate using ICP and then, if required, initiate an HTTP connection in order to retrieve the data. This might be acceptable in a small cache deployment, but can complicate matters in large, dispersed cache deployments.

WCCP

WCCP was developed by Cisco Systems and is used within routers and caches. A WCCP-enabled cache will signal a WCCP-enabled router, informing it of its IP address. When the router receives HTTP requests, it will automatically select a cache based on a hashing algorithm and forward the packet. This process is transparent to the user. On receipt of the packet, the cache will either serve the content or request it from the origin server. Using WCCP allows network administrators to deploy WCR seamlessly into the network.
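The hash-based selection WCCP performs can be illustrated as follows. This is a generic consistent-mapping sketch, not Cisco's actual algorithm; the cache names and addresses are invented. The point is that the same destination always hashes to the same cache, which keeps each site's content concentrated on one appliance and maximizes hits.

```python
# Hedged sketch of hash-based cache selection: the router hashes the
# destination address and uses the result to pick one cache from the farm.

import hashlib

caches = ["cache-1", "cache-2", "cache-3"]   # illustrative cache farm

def select_cache(dest_ip):
    digest = hashlib.md5(dest_ip.encode()).digest()
    return caches[digest[0] % len(caches)]   # deterministic: same site, same cache

# Requests for the same destination consistently land on the same cache.
print(select_cache("203.0.113.10") == select_cache("203.0.113.10"))  # True
```

Note one weakness this exposes: if a cache is added or removed, a simple modulo mapping reshuffles destinations across the farm, which is part of why content switches with richer load-balancing metrics are attractive for large farms.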
While this is an open protocol and has had many enhancements, there are a few aspects that often deter the network designers at high-performance Web sites.
While both these protocols offer a form of scalability and cache meshing or clustering, nothing beats using a content switch to provide load balancing of large cache farms without impacting the performance of the caches or other key networking devices.