Hack 1. Talk the Talk
Learning how to talk the web measurement talk is the first step in really taking advantage of the data, especially if you hope to someday become a professional "web data analyst." In web measurement, terminology is tremendously important. Because so few people have experience measuring activity on the Internet, it is important to explain the most important terms and how they're used. If you're technically inclined, this hack is designed to help you understand how the bits and bytes are translated into information about human activities. If you're more marketing oriented, this hack will help you understand where the information comes from.
Figure 1-1 illustrates the relationship between the basic terms. As you can see, as the volume of available data decreases, the value of that information increases. At the bottom of the pyramid and in greatest volume, we have "hits," and at the top, we have "unique visitors," the holy grail of "things that can be measured."
Figure 1-1. The pyramid model of web measurement data
Even if you already "talk the talk," recognize that many of these terms are loosely defined, and the strict definitions that follow serve as the foundation for the rest of this book.
1.2.1. Hits
The term hit is perhaps the most overused and misunderstood word in the entire web measurement vocabulary. People talk about "site hits," "page hits," and "hits from search engines" ad nauseam. The best definition of a hit is provided by WebTrends:
When you read the definition of a page view, you'll be struck by the similarity of the two definitions, but consider the words "or downloads a file." Files, in this context, include executable files; PDFs; sound files; JPEG, PNG, and GIF images; etc. The problem is that the "page" that appears in your web browser is technically the aggregate of potentially hundreds of "hits": every image and page element is counted as a hit. So if every page load records some number of hits, and that number varies with how many images are used to render the page, how can one reasonably expect to use hits in a business context? You can't. Don't try. The best you can do with a "hit" is to recognize that it's simply one of those words that people misuse and move on. Use words like "page views" and "referrals from search engines," and you'll be talking the talk. In web measurement, "hits" is an anachronism; the term's time has come and gone.
1.2.2. Page Views
The page view is the fundamental unit in web measurement, ideally recorded when a person sees a web page. Page views are the measurement of a visitor's interest in your site and the basis for a visitor's clickstream, the sequential list of pages a visitor sees during his visit. In its recent document Interactive Audience Measurement and Advertising Campaign Reporting and Audit Guidelines, the Interactive Advertising Bureau (IAB), a governing body for Internet advertising measurement standards, had the following to say about page views:
For the sake of this book, the definition of page view is:
While there are a number of problems associated with how page views are defined and used in the web measurement market, it's tremendously important to understand the general concept. Page views, in practical usage, provide an easy way to convey the popularity of a page or section of your site. While not as people-centric as visits and unique visitors, page view is a term you'll use frequently when talking the talk.
1.2.3. Visits
A visit, also referred to as a session or user session, is generally defined by the collection of pages viewed when someone browses a web site (the "clickstream"). It is defined by the IAB (in particularly droll language) as:
While this concept is not particularly complex, ambiguity arises when you consider how people browse web sites. Consider two examples:
Both are reasonable and common strategies for using the Internet. Unfortunately, while it is easy to know when Tammie's visit ends (when she has completed her specific task), the same determination is difficult to make for Tom. Because it is nearly impossible to determine the intent of a web visitor, certain assumptions are required. A fundamental assumption is that any visitor who fails to click for more than 30 minutes has mentally "moved on," and her visit should be considered ended. Why 30 minutes, you ask? An excellent question! Unfortunately, it is one without an answer; suffice it to say, 30 minutes for visit expiration is a widely used standard, something worth remembering when you want to talk the talk. The most useful definition of a visit is as follows:
You'll see that there is no upper limit on the length of a visit: one visitor can click around for as long as he pleases, as long as he clicks a measured link at least once every 29 minutes and 59 seconds. Visitors can visit a site multiple times a day; the ratio of visits to visitors is a great key performance indicator [Hack #94]. Visits are tied to referring sources like paid and natural search terms [Hacks #42 and #43] and banner ad campaigns [Hack #40]. Visits bridge the gap to truly meaningful information about real people.
1.2.4. Unique Visitors
In the field of web site measurement, people are called "unique visitors." Unique visitors are the top of the pyramid model of web measurement data (Figure 1-1) and exist in three forms: totally anonymous, mostly anonymous, and known [Hack #5]. The important thing to remember about unique visitors is that they are human beings, not nonhuman user agents [Hack #23]. In terms of a strict definition of unique visitor, the IAB has this to say:
Again, while using the least engaging language possible, the IAB has captured the essence of the unique visitor. Especially important is the concept of timeframe and the relationship between unique visitors and visits. I think the best definition of a unique visitor is as follows:
As long as you remember that unique visitors are people just like you and me, you'll be fine. If you remember that the uniqueness of visitors is associated with a specific timeframe (the day, the week, the month, or the football season), you're golden.
1.2.5. Referrers
Anything online that drives visitors to your web site is said to "refer" traffic to you, hence the term referrer. Referrers are generic web sites, search engines, banner ads, weblogs, email, and affiliates: basically, online sources that inspire unique visitors to visit your web site and generate page views. All that is required of a referrer is that it can be identified based on information contained in the HTTP request. The following logfile shows some example referrers:
216.219.177.29 - - [15/May/2000:23:03:36 -0800] "GET /index.htm HTTP/1.0" 200 956 "http://www.webanalyticsdemystified.com" "Mozilla/2.0 (compatible; MSIE4.0; SK; Windows 98)"
212.219.31.219 - - [15/May/2000:23:03:42 -0900] "GET /mail/email_marketing.htm HTTP/1.0" 200 956 "http://www.altavista.digital.com/cgi-bin/query-bin/query?pg=aq&text=yes&d0=1%2fnov%2f99&q=email+marketing%2a&stq=30" "Mozilla/4.05 [en] (Win 95; I)"
121.12.31.45 - - [15/May/2000:23:03:56 -0300] "GET /index.htm HTTP/1.0" 200 956 "http://www.oreilly.com/lists/links.php?link_list_id=134" "Mozilla/4.0 (compatible; MSIE4.01; Windows 98)"
The example shows that:
The best working definition of a referrer is as follows:
The second half of this definition was added in recognition that email is a very important component of Internet marketing efforts, but many email applications don't provide referring URLs. When analyzing a referring URL, you should examine the entire URL, that is, the http://www.oreilly.com/books/hacks/websitemeaurementhacks.html plus any information contained in the query string (the stuff after the ? in a dynamic URL), so you can reconstruct the exact and entire page that contained the original link. If you cannot, hopefully you're able to embed information into the requesting URL that describes the medium and message that contained the referring link. As you can see in Figure 1-2, while this visitor was referred to the Web Analytics Demystified web site from an email, we can determine that he came from the December 2004 campaign (campaign=Dec2004), he clicked on a "buy now" message (message=buy_now), the creative was an image (creative=image), and the link identifier was 54412 (id=54412). Any good web measurement application [Hack #3] will be able to leverage this information, usually using campaign and email tracking functionality [Hack #41].
Figure 1-2. Referring URL
1.2.6. Tying It All Together
At the end of the day, each term is part of the framework for web site measurement. Make sure you really understand the subtleties associated with each one; using "visits" when you mean "unique visitors" can have a profound effect on someone else's understanding. When you really get this (when you talk the talk, as it were), you're going to be saying things like:
You get the idea. Talk the talk, and everything else falls into place.
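If you're technically inclined, it can help to see the pyramid computed from raw requests. What follows is only a minimal sketch in Python, not how any particular measurement application works. It assumes each request has already been reduced to a (visitor ID, timestamp, path) tuple, that visitors can be told apart at all (by cookie, say), and that nonhuman user agents have been filtered out beforehand. The names `summarize`, `is_page_view`, `campaign_fields`, and `NON_PAGE_EXTENSIONS`, the sample data, and the www.example.com URL are all hypothetical; the 30-minute timeout and the campaign parameters come from the text above.

```python
from datetime import datetime, timedelta
from urllib.parse import parse_qs, urlsplit

# From the text: a visitor who does not click for 30 minutes has "moved on."
VISIT_TIMEOUT = timedelta(minutes=30)

# Assumption: requests for these file types are hits but not page views.
NON_PAGE_EXTENSIONS = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".pdf")


def is_page_view(path):
    """Treat any request that is not for an embedded or downloaded file as a page."""
    return not urlsplit(path).path.lower().endswith(NON_PAGE_EXTENSIONS)


def summarize(requests):
    """Count hits, page views, visits, and unique visitors for one timeframe.

    `requests` is an iterable of (visitor_id, timestamp, path) tuples that all
    fall inside the timeframe being reported (a day, a week, a month).
    """
    hits = page_views = visits = 0
    last_page_view = {}  # visitor_id -> time of that visitor's latest page view
    for visitor_id, timestamp, path in sorted(requests, key=lambda r: r[1]):
        hits += 1  # every request is a hit
        if not is_page_view(path):
            continue
        page_views += 1
        previous = last_page_view.get(visitor_id)
        # A first page view, or a gap of 30 minutes or more, starts a new visit.
        if previous is None or timestamp - previous >= VISIT_TIMEOUT:
            visits += 1
        last_page_view[visitor_id] = timestamp
    return {
        "hits": hits,
        "page_views": page_views,
        "visits": visits,
        # Only visitors who viewed at least one page are counted here.
        "unique_visitors": len(last_page_view),
    }


def campaign_fields(url):
    """Return the first value of each query-string parameter in a URL."""
    query = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in query.items()}


# Hypothetical sample data: two visitors, one of whom returns after 35 minutes.
start = datetime(2005, 1, 10, 9, 0)
sample = [
    ("visitor-a", start, "/index.htm"),
    ("visitor-a", start + timedelta(seconds=1), "/images/logo.gif"),
    ("visitor-b", start + timedelta(minutes=5), "/index.htm"),
    ("visitor-a", start + timedelta(minutes=10), "/products.htm"),
    ("visitor-a", start + timedelta(minutes=45), "/index.htm"),
]
summary = summarize(sample)

# Hypothetical landing URL carrying the campaign parameters described above.
landing_url = ("http://www.example.com/index.htm"
               "?campaign=Dec2004&message=buy_now&creative=image&id=54412")
fields = campaign_fields(landing_url)
```

For this sample, `summary` holds 5 hits, 4 page views, 3 visits, and 2 unique visitors, so each step up the pyramid is smaller than the one beneath it. `fields` maps `campaign` to `Dec2004`, `message` to `buy_now`, `creative` to `image`, and `id` to `54412`.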