Drive Acrobat using VB or Microsoft Word's Visual Basic for Applications (VBA).

Adobe Acrobat's OLE interface enables you to access or manipulate PDFs from a freestanding Visual Basic script or from another application, such as Word. You can also use Acrobat's OLE interface to render a PDF inside your own program's window. The Acrobat SDK [Hack #98] comes with a number of Visual Basic examples under the InterAppCommunicationSupport directory. The SDK also includes OLE interface documentation; look for IACOverview.pdf and IACReference.pdf. These OLE features do not work with the free Reader; you must own Acrobat.
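Before running a longer macro, it can help to confirm that Acrobat is reachable over OLE at all. The following is a minimal sketch, not part of the SDK samples. The sub name CheckAcrobatConnection is hypothetical, and the GetTitle call on the active document is assumed from the IAC reference rather than taken from this hack.

```vba
' Hypothetical helper: reports whether Acrobat can be reached over OLE.
Sub CheckAcrobatConnection()
    Dim app As Object

    ' CreateObject raises a run-time error when the AcroExch.App class
    ' is not registered (for example, when Acrobat is not installed),
    ' so trap the error and test the result instead.
    On Error Resume Next
    Set app = CreateObject("AcroExch.App")
    On Error GoTo 0

    If app Is Nothing Then
        MsgBox "Could not reach Acrobat over OLE."
        Exit Sub
    End If

    If app.GetNumAVDocs > 0 Then
        ' Assumption: AVDoc exposes GetTitle, as listed in the IAC reference.
        MsgBox "Active PDF: " & app.GetActiveDoc.GetTitle
    Else
        MsgBox "Acrobat is reachable, but no PDF is open."
    End If
End Sub
```

The variable is declared As Object (late binding), so the macro needs no reference to the Acrobat type library in the VBA project.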
The following example shows how easily you can work with PDFs using Acrobat OLE. It is a Word macro that scans the currently open PDF document for readers' annotations (e.g., sticky notes). It creates a new Word document and then builds a summary of these annotation comments.

7.3.1 The Code

To add this macro to Word, select Tools

Example 7-1. VBA code for summarizing comments

    Sub SummarizeComments()

        Dim app As Object
        Set app = CreateObject("AcroExch.App")
        If (0 < app.GetNumAVDocs) Then ' a PDF is open in Acrobat

            ' create a new Word doc to hold the summary
            Dim NewDoc As Document
            Dim NewDocRange As Range
            Set NewDoc = Documents.Add(DocumentType:=wdNewBlankDocument)
            Set NewDocRange = NewDoc.Range
            Dim found_notes_b As Boolean
            found_notes_b = False

            ' get the active doc and drill down to its PDDoc
            Dim avdoc, pddoc As Object
            Set avdoc = app.GetActiveDoc
            Set pddoc = avdoc.GetPDDoc

            ' iterate over pages
            Dim num_pages As Long
            num_pages = pddoc.GetNumPages
            For ii = 0 To num_pages - 1

                Dim pdpage As Object
                Set pdpage = pddoc.AcquirePage(ii)
                If (Not pdpage Is Nothing) Then

                    ' iterate over annotations (e.g., sticky notes)
                    Dim page_head_b As Boolean
                    page_head_b = False
                    Dim num_annots As Long
                    num_annots = pdpage.GetNumAnnots
                    For jj = 0 To num_annots - 1

                        Dim annot As Object
                        Set annot = pdpage.GetAnnot(jj)

                        ' Popup annots give us duplicate contents
                        If (annot.GetContents <> "" And _
                            annot.GetSubtype <> "Popup") Then

                            If (page_head_b = False) Then
                                ' output the page number
                                NewDocRange.Collapse wdCollapseEnd
                                NewDocRange.Text = "Page: " & (ii + 1) & vbCr
                                NewDocRange.Bold = True
                                NewDocRange.ParagraphFormat.LineUnitBefore = 1
                                page_head_b = True
                            End If

                            ' output the annotation title and format it a little
                            NewDocRange.Collapse wdCollapseEnd
                            NewDocRange.Text = annot.GetTitle & vbCr
                            NewDocRange.Italic = True
                            NewDocRange.Font.Size = NewDocRange.Font.Size - 1
                            NewDocRange.ParagraphFormat.LineUnitBefore = 0.6

                            ' output the note text and format it a little
                            NewDocRange.Collapse wdCollapseEnd
                            NewDocRange.Text = annot.GetContents & vbCr
                            NewDocRange.Font.Size = NewDocRange.Font.Size - 2

                            found_notes_b = True
                        End If
                    Next jj
                End If
            Next ii

            If (Not found_notes_b) Then
                NewDocRange.Collapse wdCollapseEnd
                NewDocRange.Text = "No Notes Found in PDF" & vbCr
                NewDocRange.Bold = True
            End If
        End If
    End Sub

7.3.2 Running the Code

Open a PDF in Acrobat, as shown in Figure 7-6. In Word, run the macro by selecting Tools

Figure 7-6. PDF Comments displayed in Acrobat

Figure 7-7. The PDF Comments in Word after extraction via SummarizeComments

7.3.3 Hacking the Hack

This script demonstrates the typical process of drilling down through layers of PDF objects to find desired information. Here is a simplified sketch of the layers, using the variable names from Example 7-1:

    app                  CreateObject("AcroExch.App")
      avdoc              app.GetActiveDoc
        pddoc            avdoc.GetPDDoc
          pdpage         pddoc.AcquirePage(ii)
            annot        pdpage.GetAnnot(jj)
These OLE objects closely resemble the objects exposed by the Acrobat API [Hack #97]. The API gives you much more power, however.
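The drill-down itself can be reduced to a short skeleton that is easy to adapt for other tasks. The sketch below reuses only the OLE calls that appear in Example 7-1 and lists each annotation's subtype in the VBA editor's Immediate window. The sub name ListAnnotationSubtypes is hypothetical, and the macro assumes a PDF is already open in Acrobat.

```vba
' Hypothetical skeleton: walks app -> avdoc -> pddoc -> pdpage -> annot
' and prints one line per annotation to the Immediate window.
Sub ListAnnotationSubtypes()
    Dim app As Object
    Set app = CreateObject("AcroExch.App")

    If app.GetNumAVDocs = 0 Then
        Debug.Print "No PDF is open in Acrobat"
        Exit Sub
    End If

    ' drill down from the active document to its PDDoc
    Dim avdoc As Object, pddoc As Object
    Set avdoc = app.GetActiveDoc
    Set pddoc = avdoc.GetPDDoc

    Dim ii As Long, jj As Long
    Dim pdpage As Object, annot As Object

    For ii = 0 To pddoc.GetNumPages - 1
        Set pdpage = pddoc.AcquirePage(ii)
        If Not pdpage Is Nothing Then
            For jj = 0 To pdpage.GetNumAnnots - 1
                Set annot = pdpage.GetAnnot(jj)
                ' page numbers are zero-based in OLE, so add 1 for display
                Debug.Print "Page " & (ii + 1) & ": " & annot.GetSubtype
            Next jj
        End If
    Next ii
End Sub
```

Replacing the Debug.Print line with other work on annot, pdpage, or pddoc turns the skeleton into a different report without changing the traversal.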