Background

Understanding the Problem

Server-side data collection mechanisms based on Active Server Pages (ASP) scripts or the like can prove useful in some research circumstances. Consider, for example, the situation where you want to investigate the impact of download time on user satisfaction. Using ASP scripts, a delay mechanism can be easily built into a Web page so that the server will delay serving the requested page to the client until some precise, predetermined time has passed. Different experimental treatment levels are accomplished by merely manipulating the delay time that is scripted into the Web page. Here, the experimental subject, using an ordinary browser, will have the perception that the page is slow to download because of the delay between when the page is requested (e.g., by clicking a hyperlink) and when the page is available in the browser. As another example, consider the situation where you want to study the end user's Web search strategy by recording which pages are accessed, along with the sequence of page access. In this case, we need to record the so-called "click-stream data." Again, ASP scripts in the Web pages could provide a simple data collection mechanism by logging each page request (page ID, server timestamp) in a server database. In both of these research scenarios, standard Web browser software such as Internet Explorer (IE) can be used in the experiment.

In considering the above research problems, it is obvious that client-side data collection mechanisms can be constructed just as easily. In both cases, Java applets, JavaScript, or VBScript can be embedded into the HTML pages to handle the required tasks, and, again, standard browser software can be used. The only difference in this client-side approach is that the data collection is being handled by the client rather than the server machine. Neither approach provides any obvious benefits over the other, although in the client-side approach the Web pages could be stored locally, and thus Web access, or even network access, is not required.

One flaw in all of these research scenarios lies in the fact that the experimental domain must be restricted to a limited set of Web pages that have been appropriately scripted for data collection. If the experimental subject is allowed to "wander" beyond this limited set of pages (an activity that is quite fundamental to the nature of the Web), then these user actions will not be recorded, and the validity of the experiment will be nullified. There will simply be no data collection scripts to execute. Related to this is the fact that all Web pages used in the experiment must be developed and maintained by the investigator, a task which can be quite labor intensive if a large number of pages are to be made available. Obviously, the experimental pages should usually be large in number and professional in appearance if external validity is to be maintained.

In some situations the research data can be collected without the use of client- or server-side scripting. Click-stream data, for example, can sometimes be gleaned through the use of standard network management software, or through so-called "network sniffers" that can be configured to monitor Internet requests and/or page downloads. In this case the experimental treatment can involve pages other than those created specifically for the research study, and, again, standard browser software can be used for the experiment. The problem here is in the precision or in the format of the data, as the software was not designed for this purpose. Pages containing multiple frames, for example, may be logged as individual (frame) downloads in some circumstances and as a single page download in others. Client requests that are satisfied through the local cache will not be logged at all.

A problem that underlies all of the data collection methodologies discussed thus far is that they suffer from a lack of experimental control. This lack of control stems from the fact that the instrument with which the experimental subject is interacting (a standard Web browser such as IE) is not designed to be used as a research tool.

Consider the situation where we wish to study Web use behavior through analyzing click-stream data. There are ways of gathering data on page requests or page downloads, as noted above. However, there is no means, short of direct observation, of recording how a particular page was requested. The page request could have come in the form of a click on a hyperlink, but the request could just as easily have been generated automatically, through a dynamic action on the page (e.g., meta refresh), or through the Back or Forward buttons in the browser interface. Normal click-stream data will not distinguish between these circumstances, so the precise behavior or intentions of the experimental subject cannot be determined.

Another problem relates to the occurrence of multiple windows. Many Web sites open hyperlinks in new browser windows, and the savvy experimental subject can even cause this to happen deliberately (Shift-click in IE). The problem here is that normal click-stream data cannot reflect which of the open windows is active when subsequent actions occur, or even that there are multiple windows in use. Again, the data either fails to capture or misrepresents the behavior in question; true "streams" cannot be traced.

Yet another problem relates to the browser cache. Beyond setting the size of the cache, the experimenter has little control over how or when the cache is used in responding to subjects' page requests. (Note that the cache in IE cannot be fully disabled.) In some circumstances this can introduce systematic error into the data and thus can have a negative impact on the data analysis.

Toward a Solution

When faced with these and other related problems in a Web-based study, this author set out to find a solution. It was determined that, for maximum flexibility and experimental control, the experimental manipulations (treatments) and the data collection mechanisms should be as close to the experimental subject as possible. That is, they should ideally be embedded in the browser itself. This led to the development of a custom IE-lookalike browser for use in Web-based experiments. As it turns out, this is not as complex an undertaking as it might first appear.

With custom browser software there is no need to depend on scripts or applets in experiment-specific Web pages to administer experimental treatments or to record user actions. Consequently, there is no need to restrict the experimental domain to a limited set of custom Web pages. With this approach, the experimental domain can include the entire Web. The custom software can be built with the ability to precisely record user activity and to preempt or modify actions that could be harmful or inappropriate for the experimental context. Experimental control and experimental manipulation can be integrated into the browser itself.

The software that we know as Internet Explorer (IE) is essentially a software interface surrounding a set of dynamic link libraries (DLLs) that provide the requisite Internet processing functionality. The main "guts" of IE is a DLL called SHDOCVW.DLL. This is supported by other DLLs such as MSHTML.DLL, which is responsible for the rendering of HTML documents in the browser window (Microsoft, 2002a). Microsoft, in its Visual Studio suite of software development products, provides a software object called the WebBrowser Control. The object is actually stored as the aforementioned SHDOCVW.DLL that governs the behavior of IE (Cornell & Jezak, 1998, p. 80). This control can be employed in Visual Basic (VB) or in C++ programs to add Web browsing functionality to software applications. The WebBrowser object works with the standard event-based model of Windows computing. The fundamentals of this model are described next.
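
To make this concrete, the following minimal Visual Basic sketch hosts a WebBrowser Control on a form and navigates to a starting page when the form loads. It assumes that the "Microsoft Internet Controls" component (SHDOCVW.DLL) has been added to the project, that the control placed on the form is named WebBrowser1, and that the starting URL is merely illustrative.

    ' Form1 hosts a single WebBrowser Control named WebBrowser1
    ' (added from the "Microsoft Internet Controls" component).
    Private Sub Form_Load()
        ' Size the browser area to fill the form's client area.
        WebBrowser1.Move 0, 0, Me.ScaleWidth, Me.ScaleHeight
        ' Navigate to an illustrative starting page for the session.
        WebBrowser1.Navigate2 "http://www.example.com/start.htm"
    End Sub

    Private Sub Form_Resize()
        ' Keep the browser area matched to the window size.
        If Me.WindowState <> vbMinimized Then
            WebBrowser1.Move 0, 0, Me.ScaleWidth, Me.ScaleHeight
        End If
    End Sub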

Windows software objects (e.g., the WebBrowser Control) communicate with their environment by sending out "messages." Messages, also known as events, are how an object tells the environment what is happening (including all relevant details), and when. The process of sending a message is called "firing an event." If a programmer wants her software to react to a certain event, then she can code an "event handler" for the event. An event handler is simply a special subroutine that is bound to the firing of a particular event. The specific details surrounding an event are manipulated as arguments of the event handler subroutine. If an event handler contains no code at all, then that event is, for all intents and purposes, ignored; the event still happens, but nobody cares. If, on the other hand, an event handler does contain program code, then the lines of code detail what will occur at the time of the event.
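
A small, hypothetical Visual Basic fragment illustrates the pattern. The first handler below is empty, so keystrokes in the text box Text1 simply pass through; the second reacts to the event and uses its argument (here, suppressing numeric keystrokes by setting KeyAscii to zero). The control names are, of course, only illustrative.

    ' An empty handler: the KeyPress event still fires, but nothing happens.
    Private Sub Text1_KeyPress(KeyAscii As Integer)
    End Sub

    ' A coded handler: the event's details arrive as arguments and can be acted on.
    Private Sub Text2_KeyPress(KeyAscii As Integer)
        If KeyAscii >= Asc("0") And KeyAscii <= Asc("9") Then
            KeyAscii = 0   ' Suppress numeric keystrokes in this text box.
        End If
    End Sub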

The WebBrowser Control fires events for all of the major occurrences in an Internet session. For example, events are fired when a request to navigate to a page is made, when a page has completed downloading, and when a request is made to open a new window. Essential details such as URL, Target Frame, and Page Title are available with the WebBrowser events. Coding event handler subroutines for these WebBrowser events is the key to building a customized IE-lookalike browser for Web-based research.
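
A sketch of two such handlers is shown below, again assuming a control named WebBrowser1. DocumentComplete fires when a page (or an individual frame) has finished downloading and supplies the URL as an argument; the page title is then available through the control's LocationName property.

    ' Fired when a page (or an individual frame) has finished downloading.
    Private Sub WebBrowser1_DocumentComplete(ByVal pDisp As Object, URL As Variant)
        Debug.Print "Download complete: " & CStr(URL)
        Debug.Print "Page title:        " & WebBrowser1.LocationName
    End Sub

    ' Fired when the control begins a download operation.
    Private Sub WebBrowser1_DownloadBegin()
        Debug.Print "Download started at " & Format(Timer, "0.00") & " sec."
    End Sub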

In some cases, Internet actions can be altered or preempted through a Cancel argument in the event handler. One important example of this is the BeforeNavigate2 event handler. This event fires after a navigation has been requested by the user, but before the request is fulfilled. This allows the custom software to inspect and evaluate the situation, and to possibly modify or cancel the request before it is allowed to proceed.
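
A simplified sketch of such a handler follows. The rule used here (confining the session to a single, illustrative domain) is not the author's actual policy; it merely shows where and how the Cancel argument is set.

    Private Sub WebBrowser1_BeforeNavigate2(ByVal pDisp As Object, _
            URL As Variant, Flags As Variant, TargetFrameName As Variant, _
            PostData As Variant, Headers As Variant, Cancel As Boolean)
        ' Inspect the pending request before it is fulfilled.
        Debug.Print "Navigation requested: " & CStr(URL) & _
                    "  (frame: " & TargetFrameName & ")"
        ' Illustrative rule: confine the session to one domain.
        If InStr(1, CStr(URL), "example.com", vbTextCompare) = 0 Then
            Cancel = True   ' Preempt the request; the page is never loaded.
        End If
    End Sub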

Properties and methods of the WebBrowser object can be used to dynamically emulate all of the visual and behavioral features of the IE interface such as the status bar, the browser window caption, and the standard buttons (Back, Forward, Stop, Refresh, Home, etc.). In short, an emulation of IE can be built with the inclusion of as few or as many features of the IE interface as are needed in the experimental context. IE features that might corrupt the experiment can be left out or can be monitored by the custom software.
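
The fragment below sketches how a few of these features might be wired up. The command buttons cmdBack, cmdForward, cmdStop, and cmdRefresh, and the label lblStatus serving as a status bar, are illustrative names; the error handling around GoBack and GoForward reflects the fact that these methods raise an error when no history is available.

    ' Emulated toolbar buttons (control names are illustrative).
    Private Sub cmdBack_Click()
        On Error Resume Next      ' GoBack raises an error if no history exists.
        WebBrowser1.GoBack
    End Sub

    Private Sub cmdForward_Click()
        On Error Resume Next
        WebBrowser1.GoForward
    End Sub

    Private Sub cmdStop_Click()
        WebBrowser1.Stop
    End Sub

    Private Sub cmdRefresh_Click()
        WebBrowser1.Refresh
    End Sub

    ' Enable the Back/Forward buttons only when IE itself would enable them.
    Private Sub WebBrowser1_CommandStateChange(ByVal Command As Long, ByVal Enable As Boolean)
        Const CSC_NAVIGATEFORWARD As Long = 1
        Const CSC_NAVIGATEBACK As Long = 2
        If Command = CSC_NAVIGATEBACK Then cmdBack.Enabled = Enable
        If Command = CSC_NAVIGATEFORWARD Then cmdForward.Enabled = Enable
    End Sub

    ' Emulate the IE title bar and status bar.
    Private Sub WebBrowser1_TitleChange(ByVal Text As String)
        Me.Caption = Text
    End Sub

    Private Sub WebBrowser1_StatusTextChange(ByVal Text As String)
        lblStatus.Caption = Text
    End Sub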

Figure 1 shows a snapshot of a custom browser that was created by the author for a recent research project. This browser is not IE, but rather is a Visual Basic software application hosting a WebBrowser Control object. The Title Bar Control at the top of the VB form programmatically displays the title of the current browser page (as does IE). Similarly, the Status Bar Control at the bottom of the VB form always shows the URL of the currently selected hyperlink. The Button Controls on the VB form (Back, Forward, Stop, etc.) are controlled by the software to emulate their IE counterparts unless a particular action would be detrimental to the experiment. Noticeably absent is the URL entry area through which the user could normally enter a URL target. This feature was left out as part of the experimental design.

Figure 1: Snapshot of a Custom Browser that was Created by the Author for a Recent Research Project

In this application the VB form is programmed to fill the entire screen, including the Windows Control Bar, at all times. The Windows Close Button, at the top right, is for show only. It is programmatically disabled so that only the researcher can close the application through a special key sequence. Through these features (and some special keyboard handling techniques), the experimental subject is effectively locked into this application for the duration of the experiment.
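
A rough sketch of such a lockdown is given below. The researcher key sequence (Ctrl+Shift+F12), the module-level flag, and the assumption that the form's KeyPreview property is set to True are all illustrative; as the text notes, additional keyboard handling is needed in practice, particularly when the WebBrowser control has the focus.

    ' Module-level flag: only the researcher's key sequence sets it.
    Private mAllowClose As Boolean

    Private Sub Form_Load()
        ' (Extends the Form_Load routine shown earlier.)
        mAllowClose = False
        ' Fill the entire screen; covering the Windows task bar also
        ' requires an appropriate border style and window settings.
        Me.Move 0, 0, Screen.Width, Screen.Height
    End Sub

    ' Illustrative researcher exit sequence: Ctrl+Shift+F12 (KeyPreview = True assumed).
    Private Sub Form_KeyDown(KeyCode As Integer, Shift As Integer)
        If KeyCode = vbKeyF12 And (Shift And vbCtrlMask) <> 0 _
                And (Shift And vbShiftMask) <> 0 Then
            mAllowClose = True
            Unload Me
        End If
    End Sub

    ' The Close button remains visible but has no effect for the subject.
    Private Sub Form_QueryUnload(Cancel As Integer, UnloadMode As Integer)
        If Not mAllowClose Then Cancel = True
    End Sub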

Figure 2 shows the record structure for one of the data files, named MAIN.DAT, from this same research project. This is provided to illustrate the rich, detailed nature of data that can be captured through this approach. This particular file contains one record per URL downloaded by the browser. The file captures the so-called click-streams of the user. Note that this data was captured from the general Web, rather than from a limited set of research-specific pages.

Figure 2: Record structure for a data file

The first two fields, Subject_ID and URL_Sequence, are provided for record sorting and identification purposes. The purpose of the final field, Full_URL, is self-evident. The Target_Frame field holds the name of the page frame (if any) in which the URL was displayed. This is an important aspect of a click-stream, since each URL download operation does not necessarily translate into a unique page as seen by the user. This information, which was captured through the BeforeNavigate2 event (above), is frequently missing from click-stream data. The URL_Start_Time field measures the time of the experimental subject's mouse click, within the context of the experimental session (rounded to the hundredth of a second). This item is easily captured since the WebBrowser Control fires a DownloadBegin event. The URL_Duration field is a measure of the time until the subsequent download.
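
The sketch below suggests how such a record might be written. It loosely follows the layout of Figure 2 but is not the author's actual logging code; the subject identifier, the sequence counter, and the user-action flag are held in module-level variables, and the frame name and URL are passed in from the BeforeNavigate2 handler.

    Private mSubjectID As String      ' Assigned by the researcher at start-up.
    Private mSessionStart As Single   ' Timer value when the session began.
    Private mSeq As Long              ' URL_Sequence counter.
    Private mUserAction As String     ' How the pending request was initiated.

    ' Append one MAIN.DAT-style record per requested URL.
    Private Sub LogURL(ByVal sURL As String, ByVal sFrame As String)
        Dim f As Integer
        mSeq = mSeq + 1
        f = FreeFile
        Open App.Path & "\MAIN.DAT" For Append As #f
        Write #f, mSubjectID, mSeq, sFrame, _
                  Format(Timer - mSessionStart, "0.00"), mUserAction, sURL
        Close #f
    End Sub

The URL_Duration field is omitted from this sketch; it can be computed, at run time or afterwards, as the difference between successive start times.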

The information in the User_Action field is difficult, if not impossible, to capture through alternative experimental approaches. Beyond recording what URL was downloaded, we are often concerned with how that download was initiated. That is, what were the intentions and motivations of the experimental subject? Intentionally navigating through previous pages via the Back button is clearly different from exploring new pages through hyperlinks. Some URLs are downloaded automatically through scripts or through a meta refresh on other pages. With a custom browser, the software surrounding the WebBrowser Control can analyze the mouse clicks and the button presses of the experimental subject to determine how each URL is requested and can record this information along with the click-stream data.
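
A simplified way to approximate this, revisiting the Back button and BeforeNavigate2 handlers sketched earlier (the cancellation logic is omitted for brevity), is to set a module-level flag whenever the custom interface itself initiates a navigation and to record that flag with each request. Distinguishing ordinary hyperlink clicks from automatic requests (scripts, meta refresh) requires the additional mouse-click analysis described above; the flag values used here are illustrative.

    Private Sub cmdBack_Click()
        mUserAction = "BACK"          ' Navigation initiated by the Back button.
        On Error Resume Next
        WebBrowser1.GoBack
    End Sub

    Private Sub WebBrowser1_BeforeNavigate2(ByVal pDisp As Object, _
            URL As Variant, Flags As Variant, TargetFrameName As Variant, _
            PostData As Variant, Headers As Variant, Cancel As Boolean)
        If mUserAction = "" Then mUserAction = "LINK/AUTO"
        LogURL CStr(URL), TargetFrameName & ""
        mUserAction = ""              ' Reset the flag for the next request.
    End Sub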

By developing a custom browser research instrument, the investigator is free to include (covertly) all of the requisite mechanisms of experimental control and data monitoring in the browser itself; no external scripting or network monitoring is needed. Timers to precisely control the duration of the experiment or the occurrence of experimental treatments can be easily embedded into the browser software. Experimental treatment randomization can also be built in.
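
For example, a Timer control (named tmrSession here) and VB's random number generator could be used roughly as follows. The twenty-minute session length and the two treatment levels are purely illustrative, and because a VB6 Timer interval cannot exceed roughly 65 seconds, longer durations are counted in one-second ticks. The mAllowClose flag is the one from the lockdown sketch above.

    Private mTreatment As Integer     ' Randomly assigned treatment level.
    Private mSecondsElapsed As Long

    ' Called once at the start of the experimental session (e.g., from Form_Load).
    Private Sub StartSession()
        Randomize                      ' Seed the random number generator.
        mTreatment = Int(Rnd * 2)      ' 0 or 1: two illustrative treatment levels.
        mSessionStart = Timer
        tmrSession.Interval = 1000     ' Tick once per second.
        tmrSession.Enabled = True
    End Sub

    Private Sub tmrSession_Timer()
        mSecondsElapsed = mSecondsElapsed + 1
        If mSecondsElapsed >= 20 * 60 Then   ' End the session after 20 minutes.
            tmrSession.Enabled = False
            mAllowClose = True
            Unload Me
        End If
    End Sub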

User activity down to the keystroke or mouse-click level can be monitored and recorded with millisecond accuracy if needed. Certain events can also be blocked or modified if necessary. For example, an attempt to open a page in a new window can be intercepted (the NewWindow2 event of the WebBrowser Control) and the page redirected to the initial window. With this approach, no special (i.e., scripted) Web pages are needed, but attempts to "wander" to irrelevant sites or inapposite protocols (e.g., "mailto:," "ftp:," etc.) can easily be halted if desired. The cache can be controlled programmatically through calls to the Windows API. Perhaps best of all, once the basic system is developed, modifications and new features are fairly simple to effect.
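
For instance, the new-window and protocol cases might be handled roughly as follows; the exact policy is, again, a matter of experimental design.

    ' Suppress attempts to open a page in a new browser window.
    Private Sub WebBrowser1_NewWindow2(ppDisp As Object, Cancel As Boolean)
        Cancel = True
        ' If desired, the request can instead be re-issued in this window
        ' once the target URL is known (e.g., from the preceding link click).
    End Sub

    ' Helper called from BeforeNavigate2: returns True for protocols that
    ' should be halted in the experimental context (an illustrative list).
    Private Function BlockedProtocol(ByVal sURL As String) As Boolean
        Dim s As String
        s = LCase$(sURL)
        BlockedProtocol = (Left$(s, 7) = "mailto:") Or (Left$(s, 4) = "ftp:")
    End Function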


