Edge Side Includes

Caching and content delivery have been among the few areas to experience strong customer acceptance in the last few years. Though growth has slowed recently, feedback from early adopters is encouraging and has validated the practice of storing frequently requested static Web pages closer to the end user.

Static content (Web pages that don't change over a long period of time) is easily cached for quick delivery to end users via a content delivery network. But Web pages based on dynamic, frequently updated content (such as sports scores) and personalized or customized content (such as localized sports scores) are in growing demand. These pages are usually created in a data center by a company's origin server, a database containing Active Server Pages (ASPs) or JavaServer Pages (JSPs), and a Web application server, which must work in real time, making proactive caching difficult.

This procedure entails a lot of processing at the origin server, which must format and deliver the data to the browser, hogging I/O, network, and memory capacity. The process becomes painful when repeated for every request, since the processing overhead expended to regenerate an entire page is very high. A bottleneck can occur as the origin server becomes overwhelmed, resulting in slow downloads or crashes, and ever-growing numbers of servers and load balancers must be deployed to right the balance.

The Proposed Solution

When in doubt, decentralize. Edge Side Includes (ESI) is a specification submitted to the World Wide Web Consortium (W3C, www.w3.org) that describes a means to push dynamic content from the origin server to multiple edge servers, closer to the end user. Offloading the origin server increases download speed and limits crashes because the origin server doesn't have to rebuild Web pages every time a page element is updated, a process that doesn't scale to handle large numbers of simultaneous users.

ESI is the result of a collaborative initiative between Akamai (www.akamai.com), Digital Island (now Exodus, www.exodus.net), Mirror Image (www.mirrorimage.com), IBM, BEA Systems (www.bea.com), Oracle, and a handful of other server and Content Delivery Network (CDN) vendors. While Akamai is the ESI leader with over a hundred EdgeSuite (which includes ESI) customers, other CDN providers, such as Speedera (www.speedera.com) and Digital Island, have announced ESI-type services.

What It Does

Essentially, the ESI specification is an Extensible Markup Language (XML)-derived language that tags and formats Web pages on origin servers, creating a template. The template is in turn composed of HTML fragments, which are specific pieces of text or customized content that can be treated separately from other Web page elements. A fragment can be a banner, a targeted ad, or a personalized welcome statement, for example.

Most pages have fragments that are cacheable (static) and uncacheable (dynamic). However, to complicate matters, some fragments that are uncacheable may contain sub-fragments that are less dynamic and so can be cached. ESI tags indicate which fragments of a Web page can be pushed from the origin server to multiple servers at the CDN edge. The origin server then distributes the dynamic but cacheable content to multiple edge servers, where subsequent requests for that same content are sent until that edge-cached data is no longer valid.
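
As a rough illustration of a tagged template, the Python sketch below holds a hypothetical ESI template as a string and lists the fragment URLs an edge server would have to fetch. The page layout and URLs are invented for this example; only the esi:include tag and its src attribute come from the ESI language specification.

```python
import re
from typing import List

# Hypothetical ESI template: static HTML plus two fragments marked for edge assembly.
TEMPLATE = """<html><body>
<h1>Sports Today</h1>
<esi:include src="/fragments/banner.html"/>
<p>Static article text that rarely changes.</p>
<esi:include src="/fragments/scores.html"/>
</body></html>"""

# Matches a self-closing esi:include tag and captures its src attribute.
INCLUDE_RE = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


def list_fragments(template: str) -> List[str]:
    """Return the fragment URLs referenced by the template, in document order."""
    return INCLUDE_RE.findall(template)


print(list_fragments(TEMPLATE))
```

Everything outside the include tags is ordinary HTML that the edge server can cache along with the template itself.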

ESI treats each fragment as a separate entity, each with its own cache and access profile. These parameters govern a fragment's Time to Live (TTL), that is, the period of time in which the information contained, such as a sports score, is valid. While that particular fragment is valid, Web page requests can be served from the edge servers. When that fragment's TTL expires, the origin server is called upon to refresh the edge cache with the latest information.
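
The per-fragment TTL behavior can be sketched as a toy edge cache in Python. This is a minimal model under stated assumptions, not how any vendor's edge server is built: the FragmentCache class, the origin() stand-in, the URLs, and the TTL values are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class FragmentCache:
    """Toy edge cache in which each fragment carries its own TTL (seconds)."""

    def __init__(self,
                 fetch_from_origin: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch_from_origin   # hypothetical: returns (body, ttl_seconds)
        self._clock = clock               # injectable so the demo can fake time
        self._store: Dict[str, Tuple[str, float]] = {}  # url -> (body, expires_at)
        self.origin_hits = 0

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now < entry[1]:
            return entry[0]               # still valid: serve from the edge
        body, ttl = self._fetch(url)      # missing or expired: refresh from origin
        self.origin_hits += 1
        self._store[url] = (body, now + ttl)
        return body


# Demo with a fake clock: scores live 30 seconds, everything else an hour.
fake_now = [0.0]


def origin(url: str) -> Tuple[str, float]:
    return (f"content of {url}", 30.0 if url == "/scores" else 3600.0)


cache = FragmentCache(origin, clock=lambda: fake_now[0])
cache.get("/scores")        # first request goes to the origin
cache.get("/scores")        # second request is served from the edge
fake_now[0] = 31.0          # the 30-second TTL has now expired
cache.get("/scores")        # so this request refreshes from the origin
print(cache.origin_hits)    # prints 2
```

Only two of the three requests reach the origin, which is the saving the TTL mechanism is meant to deliver.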

Thus, the end user is much closer to the desired Web page information, and the usual caching benefits apply, even with supposedly uncacheable content. In broad terms, ESI seeks to split the normally combined tasks of content assembly and content delivery into two separate processes that can be individually managed.

How It Works

The ESI specification stands on four legs:

First, the language specification tags the template so that reverse proxies can understand ESI. ESI's XML-based heritage also enables interoperability.

Second, the Content Invalidation Specification (CIS) defines the rules used to invalidate data stored on the edge server, so that the origin server can send out new content to overwrite the outdated data.

Third, an Architecture Specification provides methods for HTTP intermediaries to control content.

Finally, the Java ESI (JESI) Tag Library Specification provides a Java application-level interface that can communicate with ESI tags, as many dynamically assembled pages are Java-based.

These components are the building blocks for ESI's four essential functions:

  1. Inclusion, which is the ability to fetch and include files that make up a Web page. Each file has its own configuration and control, TTL, and revalidation rules.

    Figure: Pushing Content Delivery to the Edge. Here's a look at how Edge Side Includes (ESI) moves content from the database and Web application server in the data center core and pushes it towards the end user at the Content Delivery Network (CDN) edge.

  2. Conditional inclusion, which provides conditional processing based on Boolean comparisons or other programmable variables, so that rules such as how a template is processed can be modified as needed.

  3. Environmental variable support that allows a subset of standard Common Gateway Interface (CGI) variables, such as cookie information, to be used inside ESI statements or outside of ESI blocks. (CGI is a standard for external gateway programs to interface with information servers such as HTTP servers.)

  4. Exception and error handling, which allows developers to specify where to send the browser if an origin site or document isn't available. Alternative pages or default behavior can be set for every fragment that forms a particular Web page. This same logic also supports what Akamai calls "an explicit exception-handling statement set": if a major error occurs while processing an ESI-enabled Web page, the content returned to the end user can be specified in a failure-action configuration option associated with that ESI document.
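
Three of these functions (inclusion, variable support, and error handling) can be sketched together in a few lines of Python. This is a simplified model, not a conforming ESI processor: it handles only self-closing esi:include tags with the src, alt, and onerror attributes and the $(HTTP_COOKIE{name}) variable form, all of which appear in the ESI language specification, and it ignores conditional inclusion. The assemble() and fetch() names, the PAGES table, and the URLs are hypothetical.

```python
import re
from typing import Callable, Dict

INCLUDE_RE = re.compile(r'<esi:include\s+([^>]*?)/>')
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')
COOKIE_VAR_RE = re.compile(r'\$\(HTTP_COOKIE\{(\w+)\}\)')


def assemble(template: str,
             fetch: Callable[[str], str],
             cookies: Dict[str, str]) -> str:
    """Expand esi:include tags. fetch() must raise LookupError on failure."""

    def expand_vars(text: str) -> str:
        # Unknown cookies expand to an empty string in this sketch.
        return COOKIE_VAR_RE.sub(lambda m: cookies.get(m.group(1), ""), text)

    def replace_include(match: "re.Match[str]") -> str:
        attrs = dict(ATTR_RE.findall(match.group(1)))
        for key in ("src", "alt"):        # try src first, then the alt fallback
            if key in attrs:
                try:
                    return fetch(expand_vars(attrs[key]))
                except LookupError:
                    continue
        if attrs.get("onerror") == "continue":
            return ""                     # drop the fragment silently
        raise RuntimeError(f"could not include {attrs.get('src')!r}")

    return INCLUDE_RE.sub(replace_include, template)


# Hypothetical origin content; a missing key raises KeyError (a LookupError).
PAGES = {
    "/scores/uk.html": "<p>UK scores</p>",
    "/scores/default.html": "<p>All scores</p>",
}


def fetch(url: str) -> str:
    return PAGES[url]


page = (
    '<div><esi:include src="/scores/$(HTTP_COOKIE{region}).html" '
    'alt="/scores/default.html"/></div>'
    '<esi:include src="/ads/banner.html" onerror="continue"/>'
)

print(assemble(page, fetch, {"region": "uk"}))
print(assemble(page, fetch, {"region": "fr"}))
```

With the region cookie set to "uk" the localized fragment is included; with "fr" the src fetch fails and the alt page is used instead. In both cases the unavailable banner is dropped because its tag says onerror="continue".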

As mentioned above, the CIS is central to ESI. It instructs syndication servers, content management systems, databases, custom scripts, application logic, and so on to send HTTP-based invalidation commands to the edge server and delivery network with instructions to overwrite the old metadata with fresh content from the origin server. (The origin server connects with the ESI intelligence over an optimized link.)

This setup has two advantages. First, Web site managers need only update content at the origin server, not at each edge server. Second, the specification ensures that old content is purged from the Web page in use in real time, so that customers don't see outdated data (such as an out-of-stock item in an online purchase catalog) during the updating process, regardless of the number of edge servers that contain the old data. In other words, the revalidation process is distributed throughout the network. Administrators can integrate the CIS into their content management system using a database call, a script, or a number of other site administration methods.
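
The fan-out idea behind invalidation can be sketched in Python as follows. This in-memory model is an assumption made for illustration: the real CIS sends HTTP-based invalidation messages, whose format is not reproduced here, and the EdgeServer class, the broadcast_invalidation() function, and the server names are hypothetical.

```python
from typing import Dict, Iterable, List


class EdgeServer:
    """Toy edge server holding cached fragments keyed by URL."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.cache: Dict[str, str] = {}

    def invalidate(self, url: str) -> bool:
        """Drop the cached copy, if any. Returns True when something was purged."""
        return self.cache.pop(url, None) is not None


def broadcast_invalidation(edges: Iterable[EdgeServer], url: str) -> List[str]:
    """Send one invalidation to every edge; return the names that purged a copy."""
    return [edge.name for edge in edges if edge.invalidate(url)]


# Demo: two of three edges hold the stale catalog entry.
east, west, north = EdgeServer("east"), EdgeServer("west"), EdgeServer("north")
east.cache["/catalog/item42"] = "In stock"
west.cache["/catalog/item42"] = "In stock"

purged = broadcast_invalidation([east, west, north], "/catalog/item42")
print(purged)   # prints ['east', 'west']
```

The content owner issues a single command at the origin side, and each edge that held the stale copy fetches fresh content on its next request for that URL.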

This boils down to a simple way to enable Web content personalization at the network edge, whether on a private local server or a delivery network. More importantly, the caching rules involved in page assembly can be based on user agent and other header values, cookie values, the user's location or connection speed, or other programmable parameters. It's not clear, however, if these parameters can be customer-defined without a lot of vendor help.
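
One way to picture caching rules keyed on request attributes is a cache key built from the URL plus selected header and cookie values. The Python sketch below is an assumption about how such a rule might be modeled, not a description of any vendor's configuration; the cache_key() function and the choice of User-Agent and a "region" cookie are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def cache_key(url: str,
              headers: Dict[str, str],
              cookies: Dict[str, str],
              vary_headers: Sequence[str] = ("User-Agent",),
              vary_cookies: Sequence[str] = ("region",)) -> Tuple:
    """Build a cache key from the URL plus only the attributes the rule names."""
    header_part = tuple((h.lower(), headers.get(h, "")) for h in vary_headers)
    cookie_part = tuple((c, cookies.get(c, "")) for c in vary_cookies)
    return (url, header_part, cookie_part)


# Two users in the same region share a cached copy despite different sessions.
key_a = cache_key("/scores", {"User-Agent": "BrowserX"},
                  {"region": "uk", "session": "abc"})
key_b = cache_key("/scores", {"User-Agent": "BrowserX"},
                  {"region": "uk", "session": "xyz"})
key_c = cache_key("/scores", {"User-Agent": "BrowserX"},
                  {"region": "fr", "session": "abc"})

print(key_a == key_b)   # prints True
print(key_a == key_c)   # prints False
```

Leaving the session cookie out of the key is what lets personalized-by-region content still be shared among many users.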

Here are a few sundry points on ESI:

  • Single Web pages don't have to be entirely marked up in ESI. Cacheable and uncacheable content can coexist on the page without impacting the performance of those fragments that are dynamically forwarded to the edge. It makes no difference to the proxy servers involved.

  • ESI, as offered by Akamai, can run data compression on loads between the origin and edge server as long as the requesting browser supports this compression. If it doesn't, the edge server can decompress the new content and send it to the browser uncompressed.

  • Cached templates and fragments can be shared among multiple users so that many requests for popular, dynamic content can be fulfilled by using shared components delivered from the edge. For example, say a user requests a stock update for a particular company from a financial investments firm. The update is a dynamic, uncacheable fragment of a greater Web page that's cacheable. But once the fragment is updated, that revalidation is sent to each user's page that references the same fragment.

It's Always Something

Though ESI is promising, and most indications from early customer wins have been very positive, there's room for improvement.

ESI requires retagging the site with appropriate labels and commands, a job that calls for better tools to minimize the manual work. Akamai already offers this for static content with its "Akamaize" tools, which tag content for the vendor's content delivery system.

"ESI demands retagging the site, and it's not as good for ASP sites," says Peter Firstbrook, a senior research analyst with the META Group (www.metagroup.com). "For most users, ESI might involve at least some site redesign."

Firstbrook also says that "[ESI] can't handle occasions when the page itself, not just the content of the page, is truly dynamically generated. It's very limited in terms of cache revocation, where vendors such as SpiderCache [www.spidercache.com] and Chutney [www.chutney.com] have lots more capability."

Firstbrook is referring to the standard way cached content gets served until its TTL expires, which is the only way old data gets dumped.

"But suppose that content was a sports score?" he says. "With such a dynamically changing bit of content, you either have to give it a very short Time to Live or not cache it. It could be valid for two seconds or two hours, depending on the game. So you want to be able to eliminate it from the cache if the score changes. Revocation is the method by which you can tell the cache to dump the score and get a new one from the origin site. This is the biggest problem with dynamic data."

A less dramatic point to consider is the need for buy-in from the developer community, which is tasked with learning and implementing the technology. Developers also want clarification on how ESI works with other Web markup languages, such as Server Side Includes (SSI, an ESI-based Speedera service similar to Edge Side Includes). These languages should be complementary, but if several are processed by the same device, there must be a way to decide which markup language is paramount.

These hurdles are significant, but not deal-breakers for many businesses' e-needs. Look for further evolution in the ESI market, and increasing competition from other (non-Akamai) CDN providers as they ramp up their own ESI services this year.

Resources

For an overview of Edge Side Includes (ESI), start with the main ESI site, www.esi.org.

For code samples, user guides, and testing tools, click on http://developer.akamai.com.

"Emerging ESI: Lower Costs, Better Performance," by Lori MacVittie, details step-by-step template markups and explains the programming commands involved. Go to www.networkcomputing.com/1301/1301ws1.html.

This tutorial, number 169, by Doug Allen, was originally published in the August 2002 issue of Network Magazine.

 