Logs and Cookies


If only anonymous usage logs are collected, it's impossible to say anything particularly subtle about people's behavior. Because the logs record only which files were requested, all you can tabulate is aggregate statistics about the overall popularity of pages. To gain deeper insight into things such as the order in which people visit pages, how long they spend on each page, and how often they visit the site, it's necessary to isolate individuals in the log files and to track those individuals across visits.

Originally, individuals were tracked by noting unique IP addresses, but this method makes it difficult to tell different individuals apart, because several people can share one address and one person's address can change. Even if people are identified, it's almost impossible to track the same person over different sessions. So Web cookies were invented. A Web cookie is a token (generally, a tiny file) that's exchanged by a browser and a Web server: the server issues it, and every time the browser connects to that server again, it sends the cookie back. For most cookies, a unique identification code embedded in the cookie identifies it as belonging to a specific visitor's browser. Cookie identification is commonly used for personalization (Amazon, for example, uses cookies to identify you so that it can make recommendations) and order tracking, but cookies are also useful for tracking user behavior. The same cookie that lets Amazon give you a personalized experience can be used to see how often you visit the site, what you've looked at, and what you've bought.
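The exchange described above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not how any particular server implements it: the cookie name `visitor_id`, the helper names, and the choice of a random UUID as the identification code are all assumptions made for the example.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Optional, Tuple

COOKIE_NAME = "visitor_id"  # hypothetical cookie name, chosen for this sketch


def read_visitor_id(cookie_header: str) -> Optional[str]:
    """Extract the identification code from a request's Cookie header, if present."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(COOKIE_NAME)
    return morsel.value if morsel is not None else None


def handle_request(cookie_header: str) -> Tuple[str, Optional[str]]:
    """Return (visitor_id, Set-Cookie value to send, or None if nothing new is needed)."""
    visitor_id = read_visitor_id(cookie_header)
    if visitor_id is not None:
        # Returning browser: it sent the cookie back, so we recognize it.
        return visitor_id, None

    # New browser: mint a unique code and ask the browser to store it.
    visitor_id = uuid.uuid4().hex
    jar = SimpleCookie()
    jar[COOKIE_NAME] = visitor_id
    jar[COOKIE_NAME]["path"] = "/"
    return visitor_id, jar[COOKIE_NAME].OutputString()


# First visit: no cookie yet, so the server issues one.
visitor, set_cookie = handle_request("")
# Later visit: the browser sends the cookie back and is recognized.
same_visitor, nothing_to_set = handle_request(f"{COOKIE_NAME}={visitor}")
```

Writing the `visitor_id` value next to each request in the access log is what makes it possible to isolate one browser's requests later.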

Note

Cookie-based session tracking is built into both Apache and the Microsoft IIS server, and just needs to be turned on for cookie information to start appearing in your log files. See your server software documentation for instructions.

Information about the Apache user-tracking module (which uses cookies) is available at http://apache.org/docs/mod/mod_usertrack.html.

An article describing Microsoft IIS 4.0 user tracking is available at msdn.microsoft.com/library/default.asp?url/library/en-us/dnw2kmag01/html/EventLogging.asp.

Like their edible namesakes, cookies have expiration dates. These dates can range from the short term ("10 seconds from now") to several years. Although expiration is primarily a security feature, choosing expiration dates carefully can make cookies more useful as tools for understanding how people use the site. One trick that makes user behavior analysis easier is to use two different kinds of cookies with two different expiration times.

  • Session cookies, with expiration times between a couple of minutes and several hours, identify an individual session. If no page has been fetched at a search engine site in 10 minutes, for example, it's likely that the user has finished using it. Likewise, a newspaper site may want to give people 30-minute session cookies, whereas an online game might issue session cookies that last several hours.

  • Identity cookies have much longer expiration times, on the order of months or years. These cookies identify a single user and can be used to track that person's behavior over multiple sessions.
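As a rough sketch of the two-cookie approach, the following Python function decides which cookies to send with each response. The cookie names (`session_id`, `identity_id`), the 30-minute and one-year lifetimes, and the choice to renew the session cookie on every request are assumptions made for illustration; a real site would pick lifetimes that match how people actually use it.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Dict, List

SESSION_MAX_AGE = 30 * 60               # 30 minutes, in seconds (assumed)
IDENTITY_MAX_AGE = 365 * 24 * 60 * 60   # about one year, in seconds (assumed)


def build_tracking_cookies(existing: Dict[str, str]) -> List[str]:
    """Return the Set-Cookie values to attach to a response.

    `existing` maps cookie names to the values the browser sent.
    """
    jar = SimpleCookie()

    # Session cookie: reuse the current ID if there is one, and re-send it
    # on every request so the session expires only after a period of inactivity.
    jar["session_id"] = existing.get("session_id") or uuid.uuid4().hex
    jar["session_id"]["max-age"] = SESSION_MAX_AGE
    jar["session_id"]["path"] = "/"

    # Identity cookie: issued once and then left alone until it expires.
    if "identity_id" not in existing:
        jar["identity_id"] = uuid.uuid4().hex
        jar["identity_id"]["max-age"] = IDENTITY_MAX_AGE
        jar["identity_id"]["path"] = "/"

    return [morsel.OutputString() for morsel in jar.values()]


# A brand-new visitor receives both cookies.
first_visit = build_tracking_cookies({})
# A returning visitor with both cookies only has the session cookie renewed.
later_visit = build_tracking_cookies({"session_id": "abc", "identity_id": "xyz"})
```

If the browser comes back after the session cookie has expired but still holds the identity cookie, it gets a new `session_id` under the same `identity_id`, which is what separates one person's visits into distinct sessions.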

Using this technique, every user will have two cookies from a site at any given time: a short-term session cookie identifying the current session and a long-term identity cookie identifying the visitor's browser. Combining the session cookie with the list of pages in an access log allows you to extract the order of pages in a session to produce a clickstream. Clickstreams tell you the order of pages visited in a session, which specific pages were accessed, and how much time was spent on each page. They are the molecules of the user experience and the cornerstone for many of the arguments that claim the Web contains the potential to produce unprecedented experiences for its users (and unprecedented profit for the companies that know how to understand them). But more on clickstreams later.
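To make the idea concrete, here is a minimal Python sketch that turns log records into clickstreams. It assumes the access log has already been parsed into records carrying a session cookie value, a timestamp, and the requested page; the `Hit` type, the function name, and the sample data are hypothetical. Time on a page is estimated as the gap until the next request in the same session, so the last page of a session has no measurable duration.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, List, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    """One parsed access-log record (hypothetical structure)."""
    session_id: str
    timestamp: datetime
    path: str


def build_clickstreams(
    hits: Iterable[Hit],
) -> Dict[str, List[Tuple[str, Optional[float]]]]:
    """Group hits by session and return, per session, (page, seconds spent) in visit order."""
    by_session: Dict[str, List[Hit]] = defaultdict(list)
    for hit in hits:
        by_session[hit.session_id].append(hit)

    streams: Dict[str, List[Tuple[str, Optional[float]]]] = {}
    for session_id, session_hits in by_session.items():
        session_hits.sort(key=lambda h: h.timestamp)
        stream: List[Tuple[str, Optional[float]]] = []
        for i, current in enumerate(session_hits):
            if i + 1 < len(session_hits):
                # Time on this page = gap until the next request in the session.
                gap = session_hits[i + 1].timestamp - current.timestamp
                stream.append((current.path, gap.total_seconds()))
            else:
                # Last page of the session: no later request to measure against.
                stream.append((current.path, None))
        streams[session_id] = stream
    return streams


# Made-up sample data: two interleaved sessions.
sample_hits = [
    Hit("s1", datetime(2002, 5, 1, 10, 0, 0), "/"),
    Hit("s2", datetime(2002, 5, 1, 10, 0, 5), "/news"),
    Hit("s1", datetime(2002, 5, 1, 10, 0, 40), "/products"),
    Hit("s1", datetime(2002, 5, 1, 10, 2, 10), "/products/widget"),
]
clickstreams = build_clickstreams(sample_hits)
```

For the sample data, session `s1` yields the sequence `/`, `/products`, `/products/widget` with 40 and 90 seconds spent on the first two pages.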

start sidebar
Log Analysis Ethics

People get the willies when you start talking about tracking them. This is natural. No one likes being followed, much less by salespeople or the boss. When people feel they're being followed, you lose their trust, and when you've lost their trust, you've lost their business.

Thus, it's critical to create and follow a strict user privacy policy and to tell your users that you're doing so. Here are some guidelines.

  • Keep confidentiality. Never link people's identities to their behavior in a way that is not directly beneficial to them (and determine benefit through research, not just by assuming they're interested in your sales pitch).

  • Never sell or share behavior information that can reveal individual identities.

  • Protect behavior data. Personal information is top-secret information to your company, on par with its intellectual property, and it should be protected with the same rigor.

  • Keep all results anonymous. Never produce reports that can associate a user's identity with his or her behavior.

  • Maintain an analyst code of ethics. Publicize the code. Make everyone agree to abide by the code before granting him or her access to the data.

  • Tell your users which information is collected about them, and give them the option to refuse to supply it.

  • Join and follow the guidelines of a user privacy organization such as TRUSTe or the Electronic Frontier Foundation.

Also see the Interactive Marketing Research Organization's Code of Ethics, www.imro.org/code.htm, and TRUSTe's site, www.truste.org.

end sidebar




Observing the User Experience: A Practitioner's Guide for User Research
ISBN: 1558609237
Year: 2002
Pages: 144
