Recipe 17.3. Getting All Links from a Web Page


Problem

You want to build a list of the hyperlinks included in a specific web page.

Solution

Sample code folder: Chapter 17\ListWebLinks

Use the Managed HTML DOM to traverse the list of web page links as objects.

Discussion

This recipe's sample code builds a list of links from a web page. Create a new Windows Forms application, and add the following controls to Form1:

  • A TextBox control named WebAddress.

  • A Button control named ActGo. Set its Text property to Go.

  • A WebBrowser control named WebContent.

  • A ListBox control named WebLinks.

Add informational labels if desired, and arrange the controls to look like Figure 17-4.

Figure 17-4. Controls for the listing web links sample


Next, add the following source code to the form's class template:

 Private Class LinkDetail
    Public LinkURL As String
    Public LinkText As String

    Public Overrides Function ToString() As String
       ' ----- The ListBox displays this text for each item.
       Return LinkText
    End Function
 End Class

 Private Sub ActGo_Click(ByVal sender As System.Object, _
       ByVal e As System.EventArgs) Handles ActGo.Click
    ' ----- Jump to a new web page.
    If (Trim(WebAddress.Text) <> "") Then
       WebLinks.Items.Clear()
       WebContent.Navigate(WebAddress.Text)
    End If
 End Sub

 Private Sub WebContent_DocumentCompleted( _
       ByVal sender As Object, _
       ByVal e As System.Windows.Forms.WebBrowserDocumentCompletedEventArgs) _
       Handles WebContent.DocumentCompleted
    ' ----- Build the list of links.
    Dim oneLink As HtmlElement
    Dim newLink As LinkDetail

    ' ----- Scan through all the links.
    For Each oneLink In WebContent.Document.Links
       ' ----- Build a new link entry.
       newLink = New LinkDetail
       If (oneLink.InnerText = "") Then
          newLink.LinkText = "[Image or Unknown]"
       Else
          newLink.LinkText = oneLink.InnerText
       End If
       newLink.LinkURL = oneLink.GetAttribute("href")

       ' ----- Add the link to the list.
       WebLinks.Items.Add(newLink)
    Next oneLink
 End Sub

 Private Sub WebLinks_DoubleClick(ByVal sender As Object, _
       ByVal e As System.EventArgs) Handles WebLinks.DoubleClick
    ' ----- Show the detail of a web link.
    Dim linkContent As LinkDetail

    If (WebLinks.SelectedIndex = -1) Then Return
    linkContent = CType(WebLinks.SelectedItem, LinkDetail)
    MsgBox("Display = " & linkContent.LinkText & vbCrLf & _
       "URL = " & linkContent.LinkURL)
 End Sub

Run the program, enter an address in the TextBox control, and click the Go button. The web page appears, as does the list of its links. Double-click a link to display its target URL, as shown in Figure 17-5.

Figure 17-5. Displaying the URL for a parsed web link
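
If you need only the link targets rather than a populated ListBox, the same traversal can be pulled into a small reusable function. The following is a minimal sketch, not part of the recipe's sample code: the LinkHelpers module and the GetLinkTargets function are hypothetical names introduced here for illustration. It relies only on the HtmlDocument.Links collection and the HtmlElement.GetAttribute method already used in this recipe, and it assumes you want to skip anchors that have no href value.

```vb
Imports System.Collections.Generic
Imports System.Windows.Forms

' ----- Hypothetical helper module (an assumption, not part
'       of the recipe's sample code).
Public Module LinkHelpers
   Public Function GetLinkTargets( _
         ByVal doc As HtmlDocument) As List(Of String)
      ' ----- Return the href value of every link in the document.
      Dim targets As New List(Of String)

      ' ----- No document loaded yet: return an empty list.
      If (doc Is Nothing) Then Return targets

      For Each oneLink As HtmlElement In doc.Links
         Dim target As String = oneLink.GetAttribute("href")

         ' ----- Skip anchors that lack an href value.
         If (target <> "") Then targets.Add(target)
      Next oneLink

      Return targets
   End Function
End Module
```

You would typically call GetLinkTargets(WebContent.Document) from the DocumentCompleted event handler, once the page has finished loading. Returning a plain List(Of String) keeps the traversal separate from the user interface, so the same function could feed a ListBox, a log file, or any other consumer.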


See Also

Recipe 17.2 discusses the general use of the Managed HTML Document Object Model.




Visual Basic 2005 Cookbook: Solutions for VB 2005 Programmers (Cookbooks (OReilly))
ISBN: 0596101775
Year: 2006
Pages: 400
