Hack 94 Script Acrobat Using Visual Basic on Windows



Drive Acrobat using VB or Microsoft Word's Visual Basic for Applications (VBA).

Adobe Acrobat's OLE interface enables you to access or manipulate PDFs from a freestanding Visual Basic script or from another application, such as Word. You can also use Acrobat's OLE interface to render a PDF inside your own program's window. The Acrobat SDK [Hack #98] comes with a number of Visual Basic examples under the InterAppCommunicationSupport directory. The SDK also includes the OLE interface documentation; look for IACOverview.pdf and IACReference.pdf. These OLE features do not work with the free Reader; you must own Acrobat.
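
As a minimal sketch of the freestanding case, the following macro opens a PDF through the AcroExch.PDDoc OLE object, without displaying it in the viewer, and reports its page count. The macro name ReportPageCount and the file path are hypothetical placeholders, and the sketch assumes the full Acrobat product is installed so that the AcroExch.PDDoc object can be created.

```vba
Sub ReportPageCount()
    ' Hypothetical path: replace with a PDF that exists on your machine
    Const PDF_PATH As String = "C:\temp\example.pdf"

    ' Create a PDDoc directly; no viewer window is needed for this
    Dim pddoc As Object
    Set pddoc = CreateObject("AcroExch.PDDoc")

    ' Open returns True when the file was opened successfully
    If pddoc.Open(PDF_PATH) Then
        MsgBox "Pages: " & pddoc.GetNumPages
        pddoc.Close
    Else
        MsgBox "Could not open " & PDF_PATH
    End If
End Sub
```

Because the script uses late binding (CreateObject with Object variables), it should not need a project reference to the Acrobat type library.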

Acrobat Distiller also has an OLE interface. It is documented in DistillerAPIReference.pdf, which comes with the full Acrobat SDK.
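
A rough sketch of driving Distiller the same way might look like the following. It assumes the PdfDistiller.PdfDistiller.1 object and its FileToPDF method as described in the Distiller API documentation; the macro name and both file paths are hypothetical, so check DistillerAPIReference.pdf for your Acrobat version before relying on it.

```vba
Sub DistillPostScript()
    ' Hypothetical paths: replace with real input and output locations
    Const PS_PATH As String = "C:\temp\input.ps"
    Const PDF_PATH As String = "C:\temp\output.pdf"

    ' Assumption: this is the ProgID registered by Acrobat Distiller
    Dim distiller As Object
    Set distiller = CreateObject("PdfDistiller.PdfDistiller.1")

    ' An empty third argument asks Distiller to use its current job options
    distiller.FileToPDF PS_PATH, PDF_PATH, ""
End Sub
```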


The following example shows how easily you can work with PDFs using Acrobat OLE. It is a Word macro that scans the currently open PDF document for readers' annotations (e.g., sticky notes). It creates a new Word document and then builds a summary of these annotation comments.

7.3.1 The Code

To add this macro to Word, select Tools → Macro → Macros..., type in the macro name SummarizeComments, and click Create. Word will open a text editor where you can enter the code shown in Example 7-1. Save, and then test. You can download this code from http://www.pdfhacks.com/summarize.

Example 7-1. VBA code for summarizing comments
Sub SummarizeComments()

Dim app As Object
Set app = CreateObject("AcroExch.App")

If (0 < app.GetNumAVDocs) Then
  ' a PDF is open in Acrobat

  ' create a new Word doc to hold the summary
  Dim NewDoc As Document
  Dim NewDocRange As Range
  Set NewDoc = Documents.Add(DocumentType:=wdNewBlankDocument)
  Set NewDocRange = NewDoc.Range

  Dim found_notes_b As Boolean
  found_notes_b = False

  ' get the active doc and drill down to its PDDoc
  ' (each variable needs its own type, or it defaults to Variant)
  Dim avdoc As Object, pddoc As Object
  Set avdoc = app.GetActiveDoc
  Set pddoc = avdoc.GetPDDoc

  ' loop counters for pages and annotations
  Dim ii As Long, jj As Long

  ' iterate over pages
  Dim num_pages As Long
  num_pages = pddoc.GetNumPages
  For ii = 0 To num_pages - 1

    Dim pdpage As Object
    Set pdpage = pddoc.AcquirePage(ii)
    If (Not pdpage Is Nothing) Then

      ' iterate over annotations (e.g., sticky notes)
      Dim page_head_b As Boolean
      page_head_b = False
      Dim num_annots As Long
      num_annots = pdpage.GetNumAnnots
      For jj = 0 To num_annots - 1

        Dim annot As Object
        Set annot = pdpage.GetAnnot(jj)
        ' Popup annots give us duplicate contents
        If (annot.GetContents <> "" And _
            annot.GetSubtype <> "Popup") Then

          If (page_head_b = False) Then ' output the page number
            NewDocRange.Collapse wdCollapseEnd
            NewDocRange.Text = "Page: " & (ii + 1) & vbCr
            NewDocRange.Bold = True
            NewDocRange.ParagraphFormat.LineUnitBefore = 1
            page_head_b = True
          End If

          ' output the annotation title and format it a little
          NewDocRange.Collapse wdCollapseEnd
          NewDocRange.Text = annot.GetTitle & vbCr
          NewDocRange.Italic = True
          NewDocRange.Font.Size = NewDocRange.Font.Size - 1
          NewDocRange.ParagraphFormat.LineUnitBefore = 0.6

          ' output the note text and format it a little
          NewDocRange.Collapse wdCollapseEnd
          NewDocRange.Text = annot.GetContents & vbCr
          NewDocRange.Font.Size = NewDocRange.Font.Size - 2

          found_notes_b = True
        End If
      Next jj
    End If
  Next ii

  If (Not found_notes_b) Then
    NewDocRange.Collapse wdCollapseEnd
    NewDocRange.Text = "No Notes Found in PDF" & vbCr
    NewDocRange.Bold = True
  End If
End If

End Sub

7.3.2 Running the Code

Open a PDF in Acrobat, as shown in Figure 7-6. In Word, run the macro by selecting Tools → Macro → Macros..., choosing SummarizeComments, and then clicking Run. After a few seconds, a new Word document will appear, as shown in Figure 7-7. It will list all the comments that readers have added to each page of the currently visible PDF.

Figure 7-6. PDF Comments displayed in Acrobat
figs/pdfh_0706.gif

Figure 7-7. The PDF Comments in Word after extraction via SummarizeComments
figs/pdfh_0707.gif

7.3.3 Hacking the Hack

This script demonstrates the typical process of drilling down through layers of PDF objects to find desired information. Here is a simplified sketch of the layers:


app

The currently running Acrobat program. Use the app to alter the user interface or Acrobat's preferences.


avdoc

The PDF currently displayed in Acrobat. Use the avdoc to change how the PDF appears in the viewer or to print pages.


pddoc

Represents the underlying PDF document. Use the pddoc to access or manipulate the PDF's pages or metadata.


pdpage

Represents the underlying PDF page. Use the pdpage to access or manipulate a page's annotations, its rotation, or its cropping.
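
The layers above can be walked in a few lines. The following is a minimal sketch, not part of the original hack: the macro name ShowLayers is hypothetical, and it assumes a PDF is already open in Acrobat. It reads one piece of information from each layer and shows the results in a message box.

```vba
Sub ShowLayers()
    ' app: the running Acrobat program
    Dim app As Object
    Set app = CreateObject("AcroExch.App")
    If app.GetNumAVDocs = 0 Then
        MsgBox "Open a PDF in Acrobat first."
        Exit Sub
    End If

    ' avdoc: the PDF as displayed; pddoc: the underlying document;
    ' pdpage: the underlying first page (pages are numbered from 0)
    Dim avdoc As Object, pddoc As Object, pdpage As Object
    Set avdoc = app.GetActiveDoc
    Set pddoc = avdoc.GetPDDoc
    Set pdpage = pddoc.AcquirePage(0)

    Dim report As String
    report = "Window title: " & avdoc.GetTitle & vbCr
    report = report & "Pages: " & pddoc.GetNumPages & vbCr
    report = report & "Title metadata: " & pddoc.GetInfo("Title") & vbCr
    If Not pdpage Is Nothing Then
        report = report & "First page rotation: " & pdpage.GetRotate & vbCr
        report = report & "First page annotations: " & pdpage.GetNumAnnots
    End If
    MsgBox report
End Sub
```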

These OLE objects closely resemble the objects exposed by the Acrobat API [Hack #97]. The API gives you much more power, however.



PDF Hacks: 100 Industrial-Strength Tips & Tools
ISBN: 0596006551
Pages: 158
Authors: Sid Steward
