only for RuBoard - do not distribute or recompile |
Before looking at developing Browser Extensions for IE 5.0, we'll look at developing BHOs.
Browser helper objects are simple components, really. Once you know what you are doing, you could probably build one in about five minutes (minus any functionality, of course).
When Explorer first loads, it examines the following key for any browser helper objects that might be registered:
HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Explorer\Browser Helper Objects
Chances are, if you look in your registry right now, this key is not there, or if it does exist, it is empty. That's because Internet Explorer does not install a default set of browser helpers.
When BHOs are loaded, Explorer passes to the component what is known as a site pointer, which is actually an IUnknown interface pointer that the BHO can use to communicate with Explorer. Beyond this point, everything a BHO does is application-dependent. Typically, though, the BHO will be used to gain access to the current instance of Internet Explorer by querying the site pointer for IWebBrowser2. Once it has this interface, it has access to all of IE's events, as well as full access to Microsoft's Dynamic HTML Object Model. This means that whatever is being displayed in the browser is fully accessible to the BHO. So, as you might guess, BHOs can evolve from very simple components into complex entities rather quickly.
Browser helper objects are only required to implement one interface: IObjectWithSite. IObjectWithSite consists of two methods, SetSite and GetSite, as Table 12.1 shows.
Method | Description |
---|---|
GetSite | Returns the last site set with SetSite. |
SetSite | Provides the IUnknown site pointer of Explorer. |
This function returns the last site set with SetSite. Its syntax is:
HRESULT GetSite(REFIID riid, void** ppvSite);
with the following parameters:
riid: [in] The interface identifier whose pointer should be returned in ppvSite.
ppvSite: [in, out] Address of the interface pointer described by riid.
The job of GetSite is simply to return the site pointer passed in by Explorer via SetSite. It is provided as a means for additional objects to gain access to the site pointer; it is a hooking mechanism.
The interesting thing about this method is that its arguments look very much like those of QueryInterface. This is no coincidence. The only thing this method needs to do is forward the riid and ppvSite parameters to a QueryInterface call and return the result.
When an instance of Explorer is fired up, the BHO is loaded and Explorer calls SetSite, passing in an IUnknown interface pointer. This is known as a site pointer. The syntax of the SetSite method is:
HRESULT SetSite(IUnknown* pUnkSite);
Its single parameter is:
pUnkSite: [in] Site pointer.
A BHO typically will query the site pointer for the IWebBrowser2 interface. When you declare a variable as type InternetExplorer, this is really a reference to IWebBrowser2 (but we'll talk about that soon enough). This gives the browser helper access to the current instance of the web browser, including IE's event sink.
The IDL to define the IObjectWithSite interface is shown in Example 12.1. Notice that the IUnknown pointer has been replaced with IUnknownVB in the IDL definition. Here, this is done "just in case" for flexibility. Later, you might want to call QueryInterface, AddRef, or even Release on the site pointer. Who knows? Here, though, it's not really necessary. When we implement SetSite, we'll cache a copy of the site pointer in a private member that will be declared as IUnknownVB. Our private member will give us access to all of IUnknown's methods (which is necessary to properly implement GetSite).
[
    uuid(FC4801A3-2BA9-11CF-A229-00AA003D7352),
    helpstring("IObjectWithSite Interface"),
    odl
]
interface IObjectWithSite : IUnknown {
    HRESULT SetSite([in] IUnknownVB* pSite);
    HRESULT GetSite([in] REFIID priid, [in, out] VOID * ppvObj);
}
We begin by creating a new ActiveX DLL project called BHO and adding three references to our project:
Our type library
Microsoft Internet Controls
Microsoft HTML Object Library
The Microsoft HTML Object Library will most likely have to be added manually. It is in the system directory in a DLL named mshtml.dll.
We will call our class clsInetSpeak for reasons that will be apparent later, and the first thing we will do is implement IObjectWithSite:
'clsInetSpeak
Implements IObjectWithSite
Next we implement SetSite. First, we add two private member variables to the class. The first is of type IUnknownVB. This is used to save the site pointer passed into us by Explorer. The second is of type InternetExplorer. This is used to manipulate the current running instance of Explorer. This is demonstrated in Example 12.2.
'clsInetSpeak
Implements IObjectWithSite

Private m_pUnkSite As IUnknownVB
Private WithEvents m_ie As InternetExplorer

Private Sub IObjectWithSite_SetSite( _
    ByVal pSite As VBShellLib.IUnknownVB)

    If ObjPtr(pSite) = 0 Then
        'Explorer is closing: zero the reference without releasing it.
        'CopyMemory is assumed to be available to the project already.
        CopyMemory m_ie, 0&, 4
        Exit Sub
    End If

    Set m_pUnkSite = pSite  'Save the site pointer for GetSite
    Set m_ie = pSite        'QueryInterface for IWebBrowser2

End Sub
SetSite is also called when Explorer is closed, and the value of the pSite argument, instead of containing the site pointer, contains a null pointer. When this happens, we need to overwrite the address of m_ie with 0 and exit the sub. We do not set m_ie equal to Nothing. This is speculation, but it appears that setting m_ie equal to Nothing here does not immediately release the component. Explorer thinks that it still has a valid reference and crashes. Overwriting the address with 0 prevents this from happening.
Notice in Example 12.2 that m_ie is declared WithEvents. This gives us access to a wide variety of events that are fired by Explorer. We'll talk about that shortly, but let's implement GetSite first. It's one line of code, so let's get it out of the way. Example 12.3 contains the code.
Private Sub IObjectWithSite_GetSite( _
    ByVal priid As VBShellLib.REFIID, _
    ppvObj As VBShellLib.VOID)

    'Forward the request to the cached site pointer
    m_pUnkSite.QueryInterface priid, ppvObj

End Sub
Can life get simpler than this? All we need to do here is call QueryInterface on the site pointer with the parameters passed in by the shell. We are done.
If we stop right here, the code we have so far is the minimal skeleton required to implement any BHO. You might want to save this somewhere as a template for creating future BHOs.
Registering browser helper objects is quite easy. Figure 12.1 shows the appropriate entry. {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx} represents the CLSID of the BHO.
Example 12.4 shows the registry script for the BHO that will be created in this chapter. It only contains one entry. You can modify this script easily for your own helper objects.
REGEDIT4

[HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Explorer\Browser Helper Objects\{D6862A22-1DD6-11D3-BB7C-444553540000}]
We are actually going to do something with this BHO we are building, but first we need to discuss this Internet Explorer reference we have and all the things we can do with it. This should give you a better idea of the types of projects you can create with BHOs.
That reference we're holding to Internet Explorer is actually an IWebBrowser2 interface. Why is it important that you know this? Well, for one thing, the documentation for IWebBrowser2 is much better than for the Microsoft Internet Control, and the methods of the IWebBrowser2 interface correspond to the methods and properties of Internet Explorer. This is important because the interfaces we will be working with contain so many methods (over 70 of them) that it would be impractical to list them all here. But since the documentation for this interface is good, you should be able to figure out what most of the methods are for. These interfaces are all documented in the Platform SDK and are listed as IHTMLxxxx. You can access them and view their type information in the Object Browser by adding a reference to the Microsoft HTML Object Library (MSHTML.TLB) to your project.
Most of the methods of IWebBrowser2 deal with explicit settings of the Explorer program itself: status bar text, menu visibility, view mode, current URL, etc. But there is one property that is especially important to us: Document. This property returns a reference to an IHTMLDocument2 interface. From this interface, we can navigate the entire Dynamic HTML object model, which is depicted in Figure 12.2. This means that our BHO has access to any element on a given web page. If we want to change the href property of an anchor tag that is nested four frames deep inside the third form on the page, no problem. If we want to execute JavaScript against the currently loaded page, we can do that too. In fact, we can programmatically change any element we want or strip information from any page we desire.
Let's take some of the confusion out of navigating the object model. We start by retrieving the value of the Document property of IWebBrowser2, which gives us an IHTMLDocument2 interface. From here, we can get the parent window of the document, the collection of frames inside of a document, or an object that is a part of the current document itself. Think about it in terms of a web page. A web page contains a main window. This window contains the document. The document can contain a collection of frames. Each one of these frames contains a window. Taking this frame window as a Window object brings us back to the top of the hierarchy.
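The window/document/frames cycle described above can be sketched as a small recursive routine. This is only an illustration under stated assumptions: DumpFrames and nDepth are hypothetical names that do not appear in the chapter, Debug.Print produces output only when running inside the VB IDE, and a frame that hosts a page from another domain may raise an "access denied" runtime error that this sketch does not handle.

```vb
'Hypothetical helper (not part of the chapter's BHO): prints the URL of
'a document and then repeats the process for every frame it contains.
Private Sub DumpFrames(ByVal pDoc As IHTMLDocument2, ByVal nDepth As Long)

    Dim pFrames As IHTMLFramesCollection2
    Dim pFrameWnd As IHTMLWindow2
    Dim vIndex As Variant
    Dim i As Long

    'Indent by nesting depth so the hierarchy is visible
    Debug.Print String$(nDepth * 2, " ") & pDoc.URL

    'The document exposes its frames as a collection of windows
    Set pFrames = pDoc.frames
    For i = 0 To pFrames.length - 1
        vIndex = i
        'Each frame is itself a window...
        Set pFrameWnd = pFrames.Item(vIndex)
        '...whose document puts us back at the top of the hierarchy
        DumpFrames pFrameWnd.document, nDepth + 1
    Next i

End Sub
```

From an event handler it could be started with something like DumpFrames m_ie.Document, 0.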
Once we have an IHTMLDocument2 interface, we can get to any element (<BODY>, <A>, <TABLE>, etc.) on a page. Each of these elements has a corresponding interface that is in the format IHTMLxxxx. As Figure 12.2 illustrates, there are several collections available to us. Let's take the "all" collection for example. The all property returns an IHTMLElementCollection interface that will allow us to iterate through every element on a web page. Once we have a specific element, we can then get any number of the IHTMLxxxx interfaces. The code fragment in Example 12.5 demonstrates this.
Private WithEvents m_ie As InternetExplorer
. . .
Dim i As Long
Dim pDocument As IHTMLDocument2
Dim pElements As IHTMLElementCollection
Dim pElement As IHTMLElement

Set pDocument = m_ie.Document
Set pElements = pDocument.All

'Loop through each element
For i = 0 To pElements.length - 1
    Set pElement = pElements.Item(i)
    'IE reports HTML tag names in uppercase; UCase$ makes the test explicit
    If UCase$(pElement.tagName) = "A" Then
        Dim pAnchor As IHTMLAnchorElement
        Set pAnchor = pElement   'QueryInterface for IHTMLAnchorElement
        pAnchor.href = "http://www.oreilly.com"
    End If
Next i
. . .
As we loop through each element, we check the tag name. Internet Explorer reports HTML tag names in uppercase, so the comparison is made against "A" rather than "a". If the tag matches, we can QueryInterface for IHTMLAnchorElement. The code fragment in Example 12.5 would actually change the href of each anchor it came across to http://www.oreilly.com. You would be redirected to this URL if you were to then click on the link.
There are over 100 IHTMLxxxx interfaces documented in the Platform SDK, and each has methods specific to its function. It is well beyond the scope of this book to document them all. A good place to start is IHTMLWindow2 and IHTMLDocument2. You can navigate the entire Dynamic HTML Object Library from these two interfaces. Oddly enough, if you are running IE 4.0, these two interfaces are marked [hidden] in the library. If you have installed IE 5.0, all of the interfaces are visible.
The Microsoft Internet Control provides 18 events for which we can add our code. Table 12.2 contains a complete list of the events provided by Internet Explorer. These events seemingly cover just about every situation you might envision. The events typically used in a browser extension are marked with an asterisk.
Event | Fired When . . . |
---|---|
StatusTextChange | The status bar text has changed. |
ProgressChange | Information on the progress of a download is updated. |
CommandStateChange | The enabled state of a menu command changes. |
DownloadBegin * | A navigation operation is starting, shortly after the BeforeNavigate2 event, unless the navigation is canceled. |
DownloadComplete * | A navigation operation finishes, is stopped, or has failed. |
TitleChange | The title of a document in the Web Browser control becomes available or has changed. |
PropertyChange | The IWebBrowser2::PutProperty method changes the value of a property. |
BeforeNavigate2 * | The Web Browser control is about to navigate to a new URL. |
NewWindow2 | A new window is created for displaying a resource. |
NavigateComplete2 * | The browser has completed navigation to a new location. |
DocumentComplete * | The document being navigated to has finished loading. |
OnQuit | The Internet Explorer application is ready to quit. |
OnVisible | The window for the WebBrowser should be shown or hidden. |
OnToolBar | The ToolBar property has changed. |
OnMenuBar | The MenuBar property has changed. |
OnStatusBar | The StatusBar property has changed. |
OnFullScreen | The FullScreen property has changed. |
OnTheaterMode | The TheaterMode property has changed. |
Okay, let's add some code to our BHO skeleton. This is going to be temporary code that we will rip out later. The purpose is just to show you some object model navigation techniques. This exercise will also demonstrate how one task can be accomplished several different ways.
If you have IE 5.0, undoubtedly you have seen the new feature that remembers values that have been entered previously. Say you go to log in to your Hotmail account. IE 5.0 will ask you if you would like to save the password information. If you say "yes," the next time you come to Hotmail, your previously entered information is available, and then you can log in automatically.
We're not going to do anything that fancy, but this example could be the foundation for such a component. All this component will do is fill in our login name when we go to Hotmail. Everything is hardcoded. This is just a beta, after all!
The first thing we need to do is to add a private Boolean member variable to our class, called m_bGoingToHotmail. Then we add some code to the BeforeNavigate2 event, as Example 12.6 shows.
Private Sub m_ie_BeforeNavigate2(ByVal pDisp As Object, _
                                 URL As Variant, _
                                 Flags As Variant, _
                                 TargetFrameName As Variant, _
                                 PostData As Variant, _
                                 Headers As Variant, _
                                 Cancel As Boolean)

    'Remember whether this navigation is headed for Hotmail
    If URL = "http://www.hotmail.com/" Then
        m_bGoingToHotmail = True
    Else
        m_bGoingToHotmail = False
    End If

End Sub
The reason we are testing for this URL in the BeforeNavigate2 event is that at the time of this writing, when you go to Hotmail, you are immediately redirected to another URL. BeforeNavigate2 will allow us to capture the URL http://www.hotmail.com before the redirection, so we'll know we're at the right place. The result of this event procedure is that when you navigate to Hotmail, a flag indicating this fact is set. Hotmail can redirect us to wherever, and we still know we are heading there.
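The exact string comparison in Example 12.6 matches only one spelling of the address. As a minimal sketch, assuming you want the test to tolerate differences in case and a missing trailing slash, the check could be moved into a helper. IsHotmailHome is a hypothetical name introduced here for illustration; it is not part of the chapter's code.

```vb
'Hypothetical helper: returns True when the URL handed to BeforeNavigate2
'is the Hotmail home page, ignoring case and one optional trailing slash.
Private Function IsHotmailHome(ByVal URL As Variant) As Boolean

    Dim strUrl As String
    strUrl = LCase$(Trim$(CStr(URL)))

    'Drop a single trailing slash so both spellings compare equal
    If Right$(strUrl, 1) = "/" Then
        strUrl = Left$(strUrl, Len(strUrl) - 1)
    End If

    IsHotmailHome = (strUrl = "http://www.hotmail.com")

End Function
```

With this in place, the body of the event procedure reduces to m_bGoingToHotmail = IsHotmailHome(URL).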
Now we just have to implement one more event, DocumentComplete. This event is called when the page has finished loading and the document is available. The HTML source for the Hotmail login page indicates that there is one form on the page, and that the login text box we are interested in is on the form. Example 12.7 shows one possible way of navigating to the text box and setting its value.
Private Sub m_ie_DocumentComplete(ByVal pDisp As Object, _
                                  URL As Variant)

    If (m_bGoingToHotmail = True) Then

        Dim i As Long

        'Current document
        Dim pDoc As IHTMLDocument2
        Set pDoc = m_ie.Document

        'All forms on the page
        Dim pForms As IHTMLElementCollection
        Set pForms = pDoc.Forms

        'The page has a single form
        Dim pForm As IHTMLFormElement
        Set pForm = pForms.Item(0)

        'elements returns IDispatch, so use Object
        Dim pElements As Object
        Set pElements = pForm.elements

        Dim pElement As IHTMLElement
        Set pElement = pElements.Item("login")

        'QueryInterface for the text box interface
        Dim pInput As IHTMLInputTextElement
        Set pInput = pElement
        pInput.Value = "oreilly"

    End If

End Sub
Wow, that's a bunch of code for one simple task! Don't worry, we'll trim it down to size in a little while, but for now let's walk through it.
First, we get the current document by calling the Document property of m_ie and assigning the resulting object reference to pDoc . Then we get the collection of all the forms on the page and assign it to pForms . Since we have looked at the source to the page, we know there is only one form, so we can get the form directly without looping through the collection by calling:
Set pForm = pForms.Item(0)
Now we get to an inconsistency in the model. We want to grab a collection of all the elements on the form, so we call the elements property of IHTMLFormElement. You might expect this to return an IHTMLElementCollection reference, but it does not. It returns an IDispatch interface.
Once we have all the elements on the form, we can get the element we are looking for by name with the call:
Set pElement = pElements.Item("login")
We can then query pElement for IHTMLInputTextElement and set the value of the text box with the call:
Dim pInput As IHTMLInputTextElement
Set pInput = pElement
pInput.Value = "oreilly"
Wait, don't run away. That was the convoluted, horribly inefficient way to achieve this task. Achieving this goal is really much easier than Example 12.7 shows.
Okay, delete the DocumentComplete code, and we'll try this again. Let's look at Example 12.8.
Private Sub m_ie_DocumentComplete(ByVal pDisp As Object, _
                                  URL As Variant)

    If (m_bGoingToHotmail = True) Then

        Dim pDoc As IHTMLDocument2
        Set pDoc = m_ie.Document

        'Every element on the page
        Dim pElements As IHTMLElementCollection
        Set pElements = pDoc.All

        'Fetch the text box directly by name
        Dim pInput As IHTMLInputTextElement
        Set pInput = pElements.Item("login")
        pInput.Value = "oreilly"

    End If

End Sub
Once again, we get the current document by calling the Document property. But this time, instead of getting the elements of a form, we just grab the whole page by calling All. Once we have all the elements, we can get the element we are interested in directly by name.
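The by-name lookup assumes that an element named "login" exists and that it is a text box. If the page changes, the Set statement or the assignment that follows can fail at runtime. A more defensive variant might look like the following sketch; the variable pItem is a hypothetical name added for illustration, and pElements is the collection retrieved from the All property as described above.

```vb
'Sketch only: pElements is the IHTMLElementCollection returned by All.
Dim pItem As Object
Dim pInput As IHTMLInputTextElement

'Item returns Nothing when no element has the requested name
Set pItem = pElements.Item("login")

If Not pItem Is Nothing Then
    'TypeOf...Is performs a QueryInterface behind the scenes, so this
    'is True only when the object really is a text input element
    If TypeOf pItem Is IHTMLInputTextElement Then
        Set pInput = pItem
        pInput.Value = "oreilly"
    End If
End If
```

The TypeOf test also skips the case where several elements share the name and Item hands back a collection rather than a single element.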
We can also use JavaScript to achieve our goal. Example 12.9 illustrates this point.
Private Sub m_ie_DocumentComplete(ByVal pDisp As Object, _
                                  URL As Variant)

    If (m_bGoingToHotmail = True) Then

        Dim pDoc As IHTMLDocument2
        Set pDoc = m_ie.Document

        'Window that hosts the document
        Dim pWnd As IHTMLWindow2
        Set pWnd = pDoc.parentWindow

        'Run script against the loaded page
        Dim strJava As String
        strJava = "document.passwordform.login.value = 'oreilly';"
        pWnd.execScript strJava

    End If

End Sub
That's kind of slick, isn't it? We get the current document as before. Then we get the current window by calling parentWindow. This returns an IHTMLWindow2 interface. IHTMLWindow2 is [hidden], by the way, if you're still running IE 4.0. Once we have that, we can execute JavaScript directly against the page by calling execScript. execScript takes a string that contains either JavaScript or VBScript, which can be specified by a second, optional parameter to the method.
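To illustrate the optional second parameter, here is a short sketch that runs the same assignment once as JavaScript and once as VBScript. It assumes pWnd is the IHTMLWindow2 reference obtained from parentWindow and that the page contains the passwordform form used in Example 12.9.

```vb
'Sketch only: pWnd is the IHTMLWindow2 reference from parentWindow.

'Name the script engine explicitly as JavaScript
pWnd.execScript "document.passwordform.login.value = 'oreilly';", "JavaScript"

'The same assignment written in VBScript. The doubled quotes are how
'VB embeds a quotation mark inside a string literal.
pWnd.execScript "document.passwordform.login.value = ""oreilly""", "VBScript"
```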
You can execute two whole pages of JavaScript if you want. As long as each statement is delimited by a semicolon, the whole two pages could be shoved into a string and sent to execScript. But that's a little unwieldy. The easy way would be to chop up the string into manageable chunks and execute them one at a time:
strJava = "document.passwordform.login.value = 'oreilly';"
pWnd.execScript strJava
strJava = "document.passwordform.submit();"
pWnd.execScript strJava
. . .
Before we continue, delete the code for BeforeNavigate2 and DocumentComplete, so we can have a fresh start. Ahhh, that's better.