3.8 Advertising

only for RuBoard - do not distribute or recompile

We all know that electronic commerce has been a driving force behind the evolution of the Web. Advertising, in particular, is important because it generates revenue for a large number of web sites. Advertising fees are often based on the number of views or impressions. That is, the advertiser pays the web site some amount for every person who sees their ad. But how do the site owners and the advertisers know how many people have seen a particular ad?

The simplest approach is to count the number of accesses logged by the site's HTTP server. As I'm sure you can guess, with caching in place, some of the requests for an advertisement never reach the origin server. Thus, the web site counts too few accesses and perhaps undercharges the advertiser. The advertiser might not mind being undercharged, but it is probably in everyone's best interest to have accurate access counts. Later, in Section 6.4.2, I suggest some techniques that content providers can use to increase ad counting accuracy while remaining cache-friendly.

Some people take issue with the notion of counting ad impressions and other page accesses. The fact that something requests a page or image does not mean a human being actually views it. Search engines and other web robots can generate a large number of requests. The User-Agent request header normally identifies the entity that issued the request. Thus, user requests can usually be differentiated from robot requests. Another tricky aspect of request counting is related to the Back button found on web browsers. When you follow a sequence of hypertext links and then work your way back up the chain, your browser might decide to request some pages or images again. Most people argue that the second request should not be recounted as an ad impression.
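The two adjustments described above, skipping robots and not recounting repeat requests, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout `(client, timestamp, url, user_agent)`, the `ROBOT_MARKERS` list, and the 30-minute window are all hypothetical choices, not something a particular server or log format prescribes.

```python
from datetime import datetime, timedelta

# Hypothetical substrings that mark a User-Agent as a robot. A real list
# would be longer and would need regular maintenance.
ROBOT_MARKERS = ("bot", "crawler", "spider")


def is_robot(user_agent):
    """Guess whether a User-Agent value belongs to a robot."""
    ua = user_agent.lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def count_impressions(records, window=timedelta(minutes=30)):
    """Count ad impressions per URL.

    records: iterable of (client, timestamp, url, user_agent) tuples,
    assumed to be sorted by timestamp.
    """
    last_counted = {}  # (client, url) -> time of the last counted request
    counts = {}
    for client, when, url, user_agent in records:
        if is_robot(user_agent):
            continue  # robot requests are not impressions
        key = (client, url)
        previous = last_counted.get(key)
        if previous is not None and when - previous < window:
            continue  # probably a Back-button re-request; do not recount
        last_counted[key] = when
        counts[url] = counts.get(url, 0) + 1
    return counts
```

A robot that sends a browser-like User-Agent slips through this filter, and several users behind one proxy address are merged into one client, so the result is still an estimate.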

Opponents of brute-force access counting suggest using statistical sampling techniques, much like those used to rate television programs. Certainly, the traditional broadcast media (television, radio) and to some extent print publications (newspapers, magazines) have similar needs to gauge readership. Indeed, a few companies now offer high-quality site measurement services. Among them are Media Metrix (http://www.mediametrix.com), Nielsen NetRatings (http://www.nielson-netratings.com), and PC Data Online (http://www.pcdata.com).

Apart from paying for measurement services, what can a web site with advertisements do to get more accurate access statistics? One way is to make every object uncachable, but this cache-busting is a bad idea for the reasons described in the previous section. A more sensible approach is to include a tiny, uncachable inline image in the pages to be counted. This allows the page itself and large inline images to be cached, yet still delivers a request to the origin server every time someone views the page.
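As a rough sketch of the tiny uncachable image approach, the following Python uses the standard `http.server` module to serve a 1x1 GIF with headers that ask caches not to store or reuse it. The `/count/<page>.gif` URL scheme, the `page_views` table, and the `serve` helper are assumptions made for illustration. The exact header set is also a judgment call, and a cache is not obliged to honor it.

```python
import base64
from http.server import BaseHTTPRequestHandler, HTTPServer

# A 1x1 transparent GIF, decoded once at import time.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)

# Headers asking browsers and proxy caches not to reuse the response.
NO_CACHE_HEADERS = [
    ("Cache-Control", "no-store, no-cache, must-revalidate"),
    ("Pragma", "no-cache"),                        # for HTTP/1.0 caches
    ("Expires", "Thu, 01 Jan 1970 00:00:00 GMT"),  # already expired
]

page_views = {}  # page name -> number of pixel requests seen


def record_view(path):
    """Count one view for a hypothetical URL such as /count/home.gif."""
    name = path.split("?", 1)[0].rsplit("/", 1)[-1]
    if name.endswith(".gif"):
        name = name[:-4]
    page_views[name] = page_views.get(name, 0) + 1
    return name


class PixelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if not self.path.startswith("/count/"):
            self.send_error(404)
            return
        record_view(self.path)
        self.send_response(200)
        self.send_header("Content-Type", "image/gif")
        self.send_header("Content-Length", str(len(PIXEL)))
        for header, value in NO_CACHE_HEADERS:
            self.send_header(header, value)
        self.end_headers()
        self.wfile.write(PIXEL)


def serve(port=8000):
    """Run the counting server until interrupted (not called on import)."""
    HTTPServer(("localhost", port), PixelHandler).serve_forever()
```

A page to be counted would then include something like `<img src="/count/home.gif" width="1" height="1">`, while the page itself and its larger images keep their normal, cache-friendly headers.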

Another clever counting technique uses JavaScript embedded in a web page. When the page is loaded into a browser, the browser executes the JavaScript, which can send a message back to the origin server to indicate that the page has been loaded. Furthermore, the JavaScript can select which advertisement to show to the user. For example, the script can output HTML for an inline image. The nice characteristic of this solution is that both the web page and the images can be cached, yet the origin server still knows every time the page is viewed and exactly which ad images appear on the page.
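To make the JavaScript technique concrete, here is a hedged sketch in Python that generates such a page and tallies the reports it sends back. The embedded script picks an ad, writes out HTML for the image, and then requests a reporting URL made unique with a timestamp so that caches cannot answer it. The names `render_page`, `record_report`, the `/report` URL, and its `page`, `ad`, and `t` parameters are all hypothetical.

```python
import html
import json
from urllib.parse import parse_qs

# The page is static apart from its title and ad list, so it can be cached.
PAGE_TEMPLATE = """<html><body>
<h1>%(title)s</h1>
<script>
// Runs in the browser each time the page is displayed, even from a cache.
var ads = %(ads)s;
var ad = ads[Math.floor(Math.random() * ads.length)];
// Output HTML for the chosen ad image, which may itself be cached.
document.write('<img src="' + ad + '">');
// Report the view; the timestamp makes every report URL unique.
new Image().src = %(report)s + "?page=" + encodeURIComponent(%(page)s) +
    "&ad=" + encodeURIComponent(ad) + "&t=" + new Date().getTime();
</script>
</body></html>
"""

report_counts = {}  # (page, ad) -> number of reports received


def render_page(title, page_id, ad_urls, report_url="/report"):
    """Build the HTML for a page that counts itself via JavaScript.

    Values are assumed to be trusted; they are serialized with json.dumps
    and must not contain the text "</script>".
    """
    return PAGE_TEMPLATE % {
        "title": html.escape(title),
        "ads": json.dumps(ad_urls),
        "report": json.dumps(report_url),
        "page": json.dumps(page_id),
    }


def record_report(query):
    """Tally one report given the query string sent by the script."""
    params = parse_qs(query)
    key = (params["page"][0], params["ad"][0])
    report_counts[key] = report_counts.get(key, 0) + 1
    return key
```

The origin server would call `record_report` from whatever handler answers the reporting URL. Browsers with JavaScript disabled never send a report, so this approach undercounts those visitors.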

Unfortunately, there is a fine line between counting visitors and tracking individuals. The Privacy Foundation (http://www.privacyfoundation.org) calls these hidden images "Web bugs." Though I think simply counting page requests is not so bad, the same technique allows content providers to closely track your browsing activities. This raises all of the privacy issues discussed earlier in this chapter. The Privacy Foundation suggests that Web bugs should be made visible, so users know when they are being tracked. My advice is to always assume that content providers collect as much information about you as they can.


Web Caching
ISBN: 156592536X
Year: 2001
Pages: 160
