Major Issues and Techniques | Computing Information Technology: The Human Side

In this section, I will discuss some of the more important programming issues and techniques, and the key WebBrowser events that are involved in building a customized browser for Web-based experiments. Code snippets are shown in bold and use the syntax of VB. When the event handler subroutine formats are presented, only the relevant arguments are provided. The ellipsis symbol (…) is used to indicate that additional arguments exist, but are left out of this presentation.

Windows programming skills (VB, C++, J++, etc.) at the intermediate level are needed to actually build a satisfactory tool, but the reader who possesses such a skill set should be pushed well along the learning curve by this material. This author has already struggled through the too-sparse documentation and convoluted technical papers. It seems the reader should be spared the need to reinvent the wheel.

Events of the WebBrowser Control

With the release of Internet Explorer Version 6, the WebBrowser Control (i.e., SHDOCVW.DLL — the "guts" of IE) works with 37 unique events. Luckily, only a handful of these are vital to this application. These more important events are now discussed. The subset is broken down into categories for the purpose of the discussion. The categories relate to (1) navigation and downloading issues, (2) browser interface emulation, and (3) window and session control.

The first category of WebBrowser events has to do with navigation and downloading activities. Experience has shown that a significant portion of the processing for this application should appear in these event handler routines. Perhaps the most important event for this application is named BeforeNavigate2 (the '2' suffix is just a means of differentiating this event from its deprecated predecessor, BeforeNavigate). This event fires after navigation has been requested (e.g., when the user clicks a hyperlink), but before the navigation action is attempted by the browser software. The event handler subroutine has the following format: BeforeNavigate2 (URL as Variant, TargetFrameName as Variant, Cancel as Boolean, …). If the Cancel argument is set to TRUE before the event handler subroutine ends, the navigation request is completely ignored by the browser. The beauty of this event handler is that it gives the programmer the ability to analyze the request and to cancel, or even modify, the request before any action is taken. The URL and TargetFrameName arguments (these can be treated as string data) provide the details necessary to make this decision.

In the application considered here, the custom browser program could contain a list of inappropriate URL targets or a list of heuristic rules against which the request could be compared. For example, a few lines of code can parse the URL argument and identify the protocol portion. Requests containing irrelevant protocols such as "mailto:" or "ftp:" can be halted. The programmer even has the ability to cancel the original request (Cancel = TRUE), and to covertly send the browser to a totally different location. This action can be accomplished with the Navigate2 method of the WebBrowser Control. The code would look something like this: myBrowser.Navigate2(newUrlString).
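
The protocol check described above might be sketched as follows. This is only an illustration, assuming a form that hosts a WebBrowser Control named myBrowser; the helper GetProtocol and the list of blocked protocols are hypothetical names introduced here, and the full argument list is the one generated by VB for this event.

```vb
' Sketch only: assumes a form hosting a WebBrowser Control named myBrowser.
Private Sub myBrowser_BeforeNavigate2(ByVal pDisp As Object, URL As Variant, _
        Flags As Variant, TargetFrameName As Variant, PostData As Variant, _
        Headers As Variant, Cancel As Boolean)
    ' Halt requests whose protocol is irrelevant to the experiment.
    Select Case GetProtocol(CStr(URL))
        Case "mailto", "ftp"
            Cancel = True
    End Select
End Sub

' Hypothetical helper: returns the lower-cased text before the first colon,
' or an empty string when the URL contains no protocol portion.
Private Function GetProtocol(ByVal sUrl As String) As String
    Dim pos As Long
    pos = InStr(1, sUrl, ":")
    If pos > 1 Then
        GetProtocol = LCase$(Left$(sUrl, pos - 1))
    Else
        GetProtocol = ""
    End If
End Function
```

A list of inappropriate URL targets could be tested in the same Select Case block, or in a separate lookup routine called from this handler.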

Individual navigation activities can be monitored through the two WebBrowser Control events, DownloadBegin and DownloadComplete. These fire for every BeforeNavigate2 event that is not cancelled — even if a navigation error occurs. Beware that DownloadComplete is a somewhat misnamed event. A page is not really available for processing by the user or by your application until the occurrence of the DocumentComplete event. This latter event does not fire in the case of a navigation error. The DocumentComplete event subroutine is probably the best place to manage the timing and recording of click-stream data. The format is as follows: DocumentComplete(URL as Variant, …).

A few words of warning are warranted at this juncture. First, when frames are involved, the navigation-related events fire for each frame element. With the BeforeNavigate2 event handler, testing the TargetFrameName argument against the null string can determine whether the navigation is for a frame element. Second, the value of the URL argument in the BeforeNavigate2 event handler may differ from that of the subsequent events. This is because this first event uses the original URL specification that initiated the navigation process (e.g., the actual HREF attribute coded in an HTML Anchor element). Subsequent events use the canonicalized, fully qualified version of the URL.
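
A minimal sketch of click-stream recording in the DocumentComplete handler is given here, with the frame issue taken into account. The log file name is hypothetical. The comparison of pDisp (one of the arguments omitted from the format shown earlier) with the control's own object is a commonly used test for the top-level document; it is an assumption of this sketch rather than part of the technique described above.

```vb
' Sketch only: assumes a WebBrowser Control named myBrowser.
Private Sub myBrowser_DocumentComplete(ByVal pDisp As Object, URL As Variant)
    Dim f As Integer

    ' Each frame raises its own DocumentComplete event.
    ' Record only the event that belongs to the top-level document.
    If Not (pDisp Is myBrowser.Object) Then Exit Sub

    ' Append a timestamp and the canonicalized URL to a log file.
    ' "clickstream.log" is a hypothetical file name.
    f = FreeFile
    Open App.Path & "\clickstream.log" For Append As #f
    Print #f, Format$(Now, "yyyy-mm-dd hh:nn:ss") & vbTab & CStr(URL)
    Close #f
End Sub
```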

The second category of events to be discussed for this application has to do with emulating the behavior of the Internet Explorer interface. I will discuss how to emulate the IE Title Bar (at the top of the window), the IE Status Bar (at the bottom of the window), and the standard IE navigation buttons (Back, Forward, Stop, etc.). The Title Bar is handled quite easily through the TitleChange event handler of the WebBrowser Control. The format is TitleChange(Text as String). The event fires each time the title of an IE session would change. The single argument, Text, contains the title that IE would display. Emulation of the textual portion of an IE Status Bar is correspondingly simple. The relevant event handler is as follows: StatusTextChange(Text as String).
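
Both handlers can be very short. The sketch below assumes that the form's own caption serves as the Title Bar and that a Label control named lblStatus (a hypothetical name) serves as the Status Bar text area. The status handler uses the event name StatusTextChange, which is how the control's type library exposes it.

```vb
' Sketch only: assumes a WebBrowser Control named myBrowser and a
' Label named lblStatus (hypothetical) at the bottom of the form.
Private Sub myBrowser_TitleChange(ByVal Text As String)
    ' Show the page title where IE would show it.
    Me.Caption = Text
End Sub

Private Sub myBrowser_StatusTextChange(ByVal Text As String)
    ' Mirror the text that IE would place in its Status Bar.
    lblStatus.Caption = Text
End Sub
```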

The graphical download-progress animation of an IE Status Bar is easily emulated with the help of the ProgressChange event. This fires whenever the system updates the progress of a download process. Numeric arguments indicate the degree of progress achieved. The DownloadBegin and DownloadComplete events indicate, respectively, when an animated Progress Bar should be made visible or should be hidden.
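
One way to wire these three events together is sketched below. It assumes a ProgressBar control named prgDownload (a hypothetical name) that keeps its default range of 0 to 100. The guard in ProgressChange is a defensive assumption, since the control can report values that fall outside the expected range.

```vb
' Sketch only: assumes a WebBrowser Control named myBrowser and a
' ProgressBar named prgDownload (hypothetical) with Min = 0 and Max = 100.
Private Sub myBrowser_DownloadBegin()
    prgDownload.Value = 0
    prgDownload.Visible = True
End Sub

Private Sub myBrowser_DownloadComplete()
    prgDownload.Visible = False
End Sub

Private Sub myBrowser_ProgressChange(ByVal Progress As Long, _
        ByVal ProgressMax As Long)
    ' Ignore reports that cannot be expressed as a percentage.
    If ProgressMax > 0 And Progress >= 0 And Progress <= ProgressMax Then
        prgDownload.Value = CLng((Progress / ProgressMax) * 100)
    End If
End Sub
```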

The WebBrowser Control object maintains a URL history list that can be used to emulate standard Back and Forward buttons. Two methods of the control, GoBack and GoForward, invoke, respectively, navigation to the previous and to the subsequent URLs in the list. Managing the enabled state of these emulated buttons is a bit more complex, however. For example, the Back button in IE is enabled as long as there is at least one prior page in the history list; it is disabled otherwise. The IE Forward button is enabled if and only if there is at least one subsequent page in the list. This behavior can be handled with the CommandStateChange event. This event fires whenever the enabled state of either of these buttons should change. The two arguments of the event handler indicate (1) which button is affected, and (2) what the new state should be.
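
The button handling might look something like the following sketch. The CommandButton names cmdBack and cmdForward are hypothetical. The constants CSC_NAVIGATEBACK and CSC_NAVIGATEFORWARD come from the control's type library (Microsoft Internet Controls), which the project is assumed to reference.

```vb
' Sketch only: assumes a WebBrowser Control named myBrowser and two
' CommandButtons named cmdBack and cmdForward (hypothetical names).
Private Sub myBrowser_CommandStateChange(ByVal Command As Long, _
        ByVal Enable As Boolean)
    Select Case Command
        Case CSC_NAVIGATEBACK
            cmdBack.Enabled = Enable
        Case CSC_NAVIGATEFORWARD
            cmdForward.Enabled = Enable
    End Select
End Sub

Private Sub cmdBack_Click()
    ' GoBack raises an error when there is no prior page in the list.
    On Error Resume Next
    myBrowser.GoBack
End Sub

Private Sub cmdForward_Click()
    ' GoForward raises an error when there is no subsequent page.
    On Error Resume Next
    myBrowser.GoForward
End Sub
```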

Emulation of the remaining IE buttons is quite simple, indeed. The WebBrowser Control has methods named GoHome, GoSearch, Refresh, and Stop that can be tied to the click of button objects on the form container. The meanings of these actions are obvious. Note, however, that the GoHome and GoSearch methods are tied to the respective settings, Home Page URL and search configuration, in the Internet Explorer program (these are actually stored in the Windows Registry). In the research application considered here, you would most likely want to send the experimental subject to experiment-specific pages, so these two method calls may not be suited to this research context. Use appropriate Navigate2 method invocations instead.
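
As a brief sketch, a Home button can be pointed at an experiment-specific page instead of the IE Home Page setting. The button names and the URL constant below are hypothetical placeholders.

```vb
' Sketch only: assumes a WebBrowser Control named myBrowser and
' CommandButtons named cmdHome, cmdStop and cmdRefresh (hypothetical).

' Hypothetical start page for the experiment.
Private Const EXPERIMENT_HOME As String = _
    "http://www.example.com/experiment/start.html"

Private Sub cmdHome_Click()
    ' Use Navigate2 rather than GoHome, which follows the IE settings.
    myBrowser.Navigate2 EXPERIMENT_HOME
End Sub

Private Sub cmdStop_Click()
    myBrowser.Stop
End Sub

Private Sub cmdRefresh_Click()
    myBrowser.Refresh
End Sub
```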

The final set of WebBrowser events to be discussed deals with session control and window control. When experimental subjects are allowed to roam freely on the Web, there is no telling what type of scripts and software routines might be encountered. A particularly vexing problem in this research context has to do with multiple windows. Hyperlinks are often designed to display the navigation target within a new window, causing multiple windows to be open simultaneously. With click-stream data, this translates into a situation where simultaneous streams are being formed. The research problem stems from the fact that there is no way of determining which stream is the focus of the user's attention at any particular time. This author's solution is to preempt the new window creation and to redirect the navigation to the initial window. The NewWindow2 event is the key to this approach.

The NewWindow2 event of the WebBrowser Control precedes the creation of a new window (new browser instance). The handler for this event has the following format: NewWindow2(ppDisp as Object, Cancel as Boolean). The ppDisp argument can be thought of as a reference to a yet-to-be-created new browser object. By modifying this reference within the event handler routine, you can redirect the action to a known browser object as long as the Cancel argument remains FALSE. Redirecting this action directly to the initial WebBrowser can be problematic, so my solution redirects to a preexisting, yet hidden, buffer WebBrowser Control (e.g., Set ppDisp = Me.myBufferBrowser.object). The BeforeNavigate2 event handler of this buffer browser then cancels this navigation request and navigates back to the original browser (myBrowser.Navigate2 URL: Cancel = TRUE). The end result is that all Web navigations are presented in the initial browser window. For an alternative approach, see Microsoft Knowledge Base Article #Q185538 (Microsoft, 2001a).
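
Put together, the two handlers involved in this redirection might be sketched as follows. The control names myBrowser and myBufferBrowser are those used above; the buffer control is assumed to sit on the same form with its Visible property set to False.

```vb
' Sketch only: myBrowser is the visible control; myBufferBrowser is a
' second, hidden WebBrowser Control on the same form.
Private Sub myBrowser_NewWindow2(ppDisp As Object, Cancel As Boolean)
    ' Hand the request to the hidden buffer browser instead of letting
    ' a new window be created. Cancel is left as False.
    Set ppDisp = Me.myBufferBrowser.Object
End Sub

Private Sub myBufferBrowser_BeforeNavigate2(ByVal pDisp As Object, _
        URL As Variant, Flags As Variant, TargetFrameName As Variant, _
        PostData As Variant, Headers As Variant, Cancel As Boolean)
    ' Stop the buffer browser from loading anything itself...
    Cancel = True
    ' ...and send the request to the initial browser window.
    myBrowser.Navigate2 URL
End Sub
```

Because the redirected request passes through the BeforeNavigate2 handler of myBrowser, any screening logic placed there applies to it as well.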

Web pages, designed to be displayed in new windows, often contain "close" buttons that run scripts to close the (new) window (e.g., window.close in JavaScript). Since a new window was not created in our application, we cannot allow this to happen in our custom browser. The effect would be to close the single main window and to terminate the Internet session. The WindowClosing event of the WebBrowser Control provides a simple solution to this problem. This event fires when the WebBrowser Control is about to be closed through a script action. The event handler is WindowClosing(Cancel as Boolean, …). Just set Cancel = TRUE to circumvent the action. Optionally, you can invoke the GoBack method to simulate closing the window by returning to the prior page in the history list.
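
A sketch of this handler follows. The first argument, IsChildWindow, is the one elided from the format shown above. The error trap is an assumption of this sketch, added because GoBack fails when the history list holds no prior page.

```vb
' Sketch only: assumes a WebBrowser Control named myBrowser.
Private Sub myBrowser_WindowClosing(ByVal IsChildWindow As Boolean, _
        Cancel As Boolean)
    ' Refuse the script's request to close the single main window.
    Cancel = True

    ' Optionally simulate the close by returning to the prior page.
    On Error Resume Next
    myBrowser.GoBack
End Sub
```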

While automatic pop-up windows and banner ads are annoying to the normal Web user, they can really wreak havoc in a Web-based experiment, where control over the stimuli is needed. These can be extinguished by merely setting Cancel = TRUE in the NewWindow2 event handler discussed above. The problem here is that not all new browser windows should be cancelled; only the automatic pop-up ones should. My solution to this dilemma is to extinguish any new window that is generated automatically (i.e., not resulting from a button or hyperlink click action) and whose dimensions are below some critical threshold. The ClientToHostWindow event simplifies this decision. The event fires when a new window is opened through scripting. The event handler subroutine format is ClientToHostWindow(CX as Long, CY as Long). CX and CY are, respectively, the width and height of the window in pixels, as prescribed by the scripted invocation. In my experience, 600 pixels works well as a threshold for both height and width.
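
The decision rule itself can be isolated in a small helper, sketched below. The function name, the constant name, and the userInitiated flag are hypothetical; how the application learns that a click preceded the request is left open here. The sketch also assumes that a window counts as a pop-up only when both dimensions fall below the threshold, which is one reading of the rule stated above.

```vb
' Sketch only: all names here are hypothetical.
Private Const POPUP_THRESHOLD As Long = 600   ' pixels

' Returns True when a scripted window looks like an automatic pop-up:
' it was not requested by a user click, and both its width (CX) and its
' height (CY) are below the threshold.
Private Function IsLikelyPopup(ByVal CX As Long, ByVal CY As Long, _
        ByVal userInitiated As Boolean) As Boolean
    IsLikelyPopup = (Not userInitiated) _
        And (CX < POPUP_THRESHOLD) _
        And (CY < POPUP_THRESHOLD)
End Function
```

The helper could be called with the CX and CY values received in the ClientToHostWindow handler, with the result used to decide whether the new content is displayed or discarded.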

Other Special Techniques

In this section, I will discuss some of the remaining programming issues and techniques that are somewhat unique to this research context. Topics include (1) how to control browser-caching activity, (2) how to manage keyboard and mouse activity, and (3) how to extend the reach of the custom browser to include the minute details of the HTML pages.

Control over caching behavior can be critical in Web-based experiments. Page caching in the WebBrowser Control is managed according to the Internet Explorer settings stored in the Windows Registry. For consistency across sessions, I advise that caching be controlled programmatically, from within the custom browser application. Specifically, I suggest the cache be flushed with each URL downloaded. This action can be invoked within the handler for the DocumentComplete event. At first glance, programmatic cache flushing appears messy, involving a half dozen API function declarations. Luckily, Microsoft Knowledge Base article #Q262110 (Microsoft, 2001b) provides cut-and-paste code for this task that can be used with little modification.

My experience with this application indicates that effective control over keyboard behavior, and to a lesser degree over mouse behavior, poses one of the biggest programming challenges. The WebBrowser Control recognizes neither the KeyPress event nor the KeyUp/KeyDown events, so a custom keyboard handler must be built from scratch. Similarly, the MouseUp/MouseDown events do not fire, so a custom mouse handler is also necessary. These tasks are best handled through subclassing techniques and callbacks. The intricacies of these software techniques are beyond the scope of this discussion. Help can be found in Microsoft Knowledge Base articles #Q168795, #Q170570 and #Q177992 (Microsoft 2002b, 2001c, 2001d), and in Appleman (1999).

Internet Explorer responds to over 60 keyboard shortcuts (Microsoft, 2001e). By default, the WebBrowser Control responds similarly to these same keycodes. Many of these can potentially corrupt the integrity of your Web-based experiment, as can several standard Windows shortcuts. These harmful key sequences should therefore be ignored by this software application. To accomplish this, your keyboard handler should always trap and extinguish the Escape key, the Applications key, the left and right Windows keys, all function keys, and the Shift-Enter key combination. In addition, any key combinations containing the Control key or the Alt key should be killed. For general external validity, the following basic browsing keys should be retained in all cases: tab, shift-tab, home, end, enter, page up, page down, up arrow, and down arrow. A simple, conservative approach to keyboard handling is to extinguish all keys except for this final set.
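
The conservative rule just described (extinguish everything except the basic browsing keys) can be expressed as a small test function, sketched below. The function name is hypothetical, and the subclassing code that would call it is not shown. The KeyCode and Shift arguments are assumed to follow the conventions of VB's standard KeyDown event.

```vb
' Sketch only: IsAllowedKey is a hypothetical helper to be called from
' the custom keyboard handler. Returns True only for the retained keys.
Public Function IsAllowedKey(ByVal KeyCode As Integer, _
        ByVal Shift As Integer) As Boolean
    IsAllowedKey = False

    ' Kill any combination that contains the Control key or the Alt key.
    If (Shift And (vbCtrlMask Or vbAltMask)) <> 0 Then Exit Function

    ' With Shift held down, only Shift-Tab is retained.
    ' This also extinguishes Shift-Enter.
    If (Shift And vbShiftMask) <> 0 Then
        IsAllowedKey = (KeyCode = vbKeyTab)
        Exit Function
    End If

    ' Unmodified keys: retain the basic browsing keys only.
    Select Case KeyCode
        Case vbKeyTab, vbKeyHome, vbKeyEnd, vbKeyReturn, _
             vbKeyPageUp, vbKeyPageDown, vbKeyUp, vbKeyDown
            IsAllowedKey = True
    End Select
End Function
```

Under this rule the Escape key, the function keys, the Windows keys and the Applications key are extinguished automatically, since none of them appears in the retained set.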

With regard to mouse activity, IE, and thus the WebBrowser Control, responds to a right mouse click by displaying a context-sensitive pop-up menu. Pressing Shift in conjunction with the left mouse button requests that the hyperlink target be opened in a new window. Your custom mouse handler should therefore ignore these mouse actions. Only the simple left click is needed in this software application.
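
The corresponding mouse test is a one-line function. The name is hypothetical, and the Button and Shift arguments are assumed to follow the conventions of VB's standard MouseDown event.

```vb
' Sketch only: IsAllowedClick is a hypothetical helper to be called from
' the custom mouse handler. Only an unmodified left click is accepted,
' so right clicks and Shift-clicks are both ignored.
Public Function IsAllowedClick(ByVal Button As Integer, _
        ByVal Shift As Integer) As Boolean
    IsAllowedClick = (Button = vbLeftButton) And (Shift = 0)
End Function
```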

The WebBrowser Control has yet another capability that holds great potential in this research context. Your custom browser can gain access to, and can even interact with, all aspects of any current HTML page. Access is provided through the Document property (e.g., myBrowser.Document), which exposes the full content of the HTML Document Object Model (DOM). The DOM is a fairly complex structure, and it will not be presented here (see http://www.w3.org/DOM/ for full details). As an illustrative example, though, your application can access the cookie object of the current HTML page with the following simple reference: myBrowser.Document.cookie. As another example, consider that last-modified timestamp information for each visited page can be captured with myBrowser.Document.lastModified. You can even unobtrusively analyze the entire set of unvisited links on a page by processing the myBrowser.Document.links() array. This feature can indeed add great power and flexibility to your custom research instrument.
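
The three references mentioned above can be combined in a short routine such as the following sketch. The routine name is hypothetical, the output simply goes to the Immediate window, and late binding is used so that no additional type library is required. The routine is assumed to be called once an HTML page has finished loading, for example from the DocumentComplete handler.

```vb
' Sketch only: assumes a WebBrowser Control named myBrowser that is
' currently displaying an HTML page. LogPageDetails is a hypothetical name.
Private Sub LogPageDetails()
    Dim doc As Object
    Dim i As Long

    Set doc = myBrowser.Document
    If doc Is Nothing Then Exit Sub   ' nothing has been loaded yet

    Debug.Print "Cookie: " & doc.cookie
    Debug.Print "Last modified: " & doc.lastModified

    ' Walk the links collection and list each hyperlink target.
    For i = 0 To doc.links.length - 1
        Debug.Print "Link " & i & ": " & doc.links.Item(i).href
    Next i
End Sub
```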