Recipe 2.24. Counting Pages of PDF Documents on Mac OS X

Credit: Dinu Gherman, Dan Wolfe

Problem

You're running on a reasonably recent version of Mac OS X (version 10.3 "Panther" or later), and you need to know the number of pages in a PDF document.

Solution

The PDF format and Python are both natively integrated with Mac OS X (10.3 or later), and this allows a rather simple solution:

    #!/usr/bin/python
    import CoreGraphics

    def pageCount(pdfPath):
        "Return the number of pages for the PDF document at the given path."
        # Wrap the file in a Quartz data provider, then open it as a PDF document.
        pdf = CoreGraphics.CGPDFDocumentCreateWithProvider(
                  CoreGraphics.CGDataProviderCreateWithFilename(pdfPath)
              )
        return pdf.getNumberOfPages()

    if __name__ == '__main__':
        import sys
        # Print one page count per path given on the command line.
        for path in sys.argv[1:]:
            print pageCount(path)

Discussion

A reasonable alternative to this recipe might be to use the PyObjC Python extension, which (among other wonders) lets Python code reuse all the power in the Foundation and AppKit frameworks that come with Mac OS X. Such a choice would let you write a Python script that also runs on older versions of Mac OS X, such as 10.2 Jaguar. However, relying on Mac OS X 10.3 or later ensures we can use the Python installation that is integrated as a part of the operating system, as well as such goodies as the CoreGraphics Python extension module (also part of Mac OS X "Panther") that lets your Python code reuse Apple's excellent Quartz graphics engine directly.

See Also

PyObjC is at http://pyobjc.sourceforge.net/; information on the CoreGraphics module is at http://www.macdevcenter.com/pub/a/mac/2004/03/19/core_graphics.html.
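
A note on the Solution's command-line loop: it prints bare numbers, which can be hard to match to files when several paths are given. The following is only a minimal sketch of one way to label the output and skip paths that are not regular files. The names report_page_counts and count_pages are hypothetical and not part of the recipe; the counting function is passed in as a parameter (an assumption made so the sketch runs even where the CoreGraphics module is unavailable).

```python
import os

def report_page_counts(paths, count_pages):
    """Return 'path: N' lines, one per path, using the supplied counter.

    count_pages is any callable that takes a path and returns an integer,
    for example the pageCount function from the Solution. Passing it in
    (rather than importing CoreGraphics here) is an assumption of this sketch.
    """
    lines = []
    for path in paths:
        if not os.path.isfile(path):
            # Do not call the counter on paths that are not regular files.
            lines.append("%s: not a file" % path)
            continue
        lines.append("%s: %d" % (path, count_pages(path)))
    return lines
```

In the Solution's script, the main block could then loop over report_page_counts(sys.argv[1:], pageCount) and print each returned line instead of printing the raw counts.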