Create a PDF template that you can populate as it is served.

Sometimes a PDF needs to include dynamic information. For example, you could fashion the cover of your personalized PDF sales brochure [Hack #89] to include the customer's name: "Created for Mary Jane Doe on March 15, 2004." To do this, let's use what we know about modifying PDF text in a plain-text editor [Hack #80] to create a PDF template. Then we'll fill in this template using a web server script.

The overall process resembles [Hack #83]. Instead of PDF links, you will add placeholders to the PDF's page streams. As it is served, these placeholders can be replaced with your data.

6.12.1 Create the PDF

Design the document using your favorite authoring application. Add placeholder text where you want the dynamic data to appear. Placeholders should share a common prefix, such as the textbeg in textbeg_customer. Style this text to taste, but align it to the left (not the center): the replacement text starts where the placeholder starts and usually has a different width, so centered text would end up off center.

Before creating a PDF, be careful with the placeholder fonts to avoid results such as the one in Figure 6-14.

Figure 6-14. Acrobat displaying parentheses around "Jane" as empty rectangles, because we omitted them from our alphabet soup

Whichever font you choose for your placeholder, you must make sure the font gets adequately embedded into the PDF [Hack #43]. An embedded font is often subset, which means it includes only the characters that are used in your document. If your placeholder text uses a Type 1 font, you can configure Distiller to not subset this font [Hack #43]. If your placeholder text uses a TrueType or OpenType font, you must be sure that every character you might need occurs in your document. To be safe, create a separate page that includes every letter in the alphabet, every number, and every punctuation mark you'll need. Set this alphabet soup to the font of your placeholder. Print to PDF, and then delete this alphabet page.
6.12.2 Convert the PDF into a Template

Prepare the PDF for text editing with pdftk [Hack #79] like this (if you use gVim and our plug-in [Hack #82] to edit PDF, this step isn't necessary):

    pdftk mydoc.pdf output mydoc.plain.pdf uncompress

Open the results in your editor and search for your placeholder text. If you can't find it, search on its page number (e.g., pageNum 5) and then dig down [Hack #81] to find the page stream that has your placeholder. Distiller probably split it into pieces; e.g., textbeg_customer might end up as [(text)5(b)-1.7(eg_cust)5(o)-1.7(mer)].
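A kerned array like the one above hides the placeholder from a plain text search because the name is interrupted by spacing numbers. As a rough illustration of what the manual repair amounts to, here is a minimal Python sketch that joins the string fragments and drops the numbers. The function name join_tj_fragments is hypothetical, and the sketch assumes the fragments contain no escaped or nested parentheses, which holds for simple placeholder names but not for arbitrary PDF text. Dropping the kerning values also changes the letter spacing slightly.

```python
import re

def join_tj_fragments(tj_array: str) -> str:
    """Collapse a kerned TJ array into an array holding one string.

    Assumption: no fragment contains parentheses or backslashes,
    so a simple regular expression can pick the fragments out.
    """
    # Capture the text between each pair of parentheses; the kerning
    # numbers that sit between the fragments are simply ignored.
    fragments = re.findall(r"\(([^()\\]*)\)", tj_array)
    return "[(" + "".join(fragments) + ")]"

kerned = "[(text)5(b)-1.7(eg_cust)5(o)-1.7(mer)]"
print(join_tj_fragments(kerned))  # [(textbeg_customer)]
```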
Make a few changes to this page stream. First, repair your placeholder text so that grep can find it. So:

    [(text)5(b)-1.7(eg_cust)5(o)-1.7(mer)]TJ

becomes:

    [(textbeg_customer)]TJ

Or, if your string ends in Tj, such as this:

    (Created for textbeg_customer on textbeg_date)Tj

rewrite it like this, adding square brackets and changing the Tj at the end to TJ:

    [(Created for textbeg_customer on textbeg_date)]TJ

Next, isolate each placeholder on its own line, if necessary. So, the previous example becomes:

    [(Created for )
    (textbeg_customer)
    ( on )
    (textbeg_date)]TJ

Finally, pad the placeholders with asterisks (*). Add enough asterisks so that the placeholder is longer than any possible data you might write there. Padding the previous example would look like this:

    [(Created for )
    (textbeg_customer***********************************)
    ( on )
    (textbeg_date**********************)]TJ

Save and close your altered PDF.
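If you would rather compute the padding than count asterisks by eye, the rule "longer than any possible data" can be written down directly. The following Python sketch is only an illustration: pad_placeholder and its max_data_len parameter are hypothetical names, and max_data_len should be the length of the longest value after any PDF string escaping has been applied, since escaped characters take more room.

```python
def pad_placeholder(placeholder: str, max_data_len: int, pad_char: str = "*") -> str:
    """Pad a placeholder so it is strictly longer than max_data_len."""
    # One character more than the longest data, but never shorter
    # than the placeholder itself.
    target_len = max(len(placeholder), max_data_len + 1)
    return placeholder.ljust(target_len, pad_char)

padded = pad_placeholder("textbeg_customer", 40)
print(padded)
print(len(padded))  # 41
```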
6.12.3 Add Placeholder Offsets to the PDF

If you used gVim and our plug-in to edit the PDF, now you must uncompress the PDF. If you did not use gVim, now you must repair the PDF's XREF table and stream lengths. One command accomplishes both tasks:

    pdftk mydoc.plain.pdf output mydoc.pdfsrc uncompress

From this point on, you should not treat the file like a PDF, and this pdfsrc extension will remind you.

Find the byte offsets to your placeholders with grep (Windows users can visit http://gnuwin32.sf.net/packages/grep.htm or install MSYS [Hack #97] to get grep):

    ssteward@armand:~$ grep -ab textbeg mydoc.pdfsrc
    9202:(textbeg_customer***************************)
    9247:(textbeg_date***************************)]TJ
    11793:(textbeg_customer***************************)

In your text editor, add one line for each offset to the beginning of your pdfsrc file. Each line should look like this:

    #-dataname-dataoffset

The dataname is used in the following script code to identify the data to be written into the PDF. In this example, customer will be replaced with the customer's name. The script splits each line on hyphens, so do not put spaces around them or use a hyphen inside a dataname. For example, here is how the preceding grep output would appear at the beginning of a pdfsrc file:

    #-customer-9202
    #-date-9247
    #-customer-11793
    %PDF-1.3...

After adding these lines, do not modify the file with pdftk, gVim, or Acrobat. Altering it could invalidate these byte offsets.

6.12.4 The Code

This example PHP script, alter_pdf_text_example.php, opens a pdfsrc file, reads the offset data we added, and then serves the PDF. As it serves the PDF, it replaces the placeholders with the given text. Note how the replacement text is escaped using escape_pdf_string.
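To make the escaping rule concrete before reading the listing: parentheses delimit PDF strings and the backslash introduces escapes, so those three characters get a backslash in front of them, and bytes outside the printable ASCII range are written as three-digit octal codes. The Python sketch below mirrors that rule on a sample value; it is an illustrative port for experimenting, not part of the hack's served code. Note that the escaped value is longer than the input, and it is the escaped length that has to fit inside the padded placeholder.

```python
def escape_pdf_string(data: bytes) -> bytes:
    """Escape bytes for use inside a PDF literal string, (like this)."""
    out = bytearray()
    for byte in data:
        if byte in (0x28, 0x29, 0x5C):        # ( ) and backslash
            out += b"\\" + bytes([byte])      # prefix with a backslash
        elif byte < 32 or byte > 126:         # outside printable ASCII
            out += ("\\%03o" % byte).encode("ascii")  # octal code
        else:
            out.append(byte)
    return bytes(out)

name = b"Doe (Jr.)"
escaped = escape_pdf_string(name)
print(escaped.decode("ascii"))    # Doe \(Jr.\)
print(len(name), len(escaped))    # 9 11
```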
<?php
// alter_pdf_text_example.php, version 1.0
// http://www.pdfhacks.com/dynamic_text/

// the filename of the source PDF file, which
// contains placeholders for our dynamic text
$pdfsrc_fn= './cover.pdfsrc';

// the data we will place into the PDF text
$customer_text= "Mary Jane Doe";
$date_text= "March 15, 2004";

function escape_pdf_string( $ss )
{
  $ss_esc= '';
  $ss_len= strlen( $ss );
  for( $ii= 0; $ii< $ss_len; ++$ii ) {
    $code= ord( $ss[$ii] );
    if( $code== 0x28 ||   // open paren
        $code== 0x29 ||   // close paren
        $code== 0x5c )    // backslash
    {
      $ss_esc.= chr(0x5c).$ss[$ii]; // escape the character w/ backslash
    }
    else if( $code< 32 || 126< $code ) {
      $ss_esc.= sprintf( '\\%03o', $code ); // use an octal code
    }
    else {
      $ss_esc.= $ss[$ii];
    }
  }
  return $ss_esc;
}

// return the current offset/text pair and advance the array pointer;
// this stands in for each(), which was removed in PHP 8
function next_text_pair( &$text_offsets )
{
  $pair= array( key($text_offsets), (string)current($text_offsets) );
  next( $text_offsets );
  return $pair;
}

// open the source PDF file, which contains placeholders;
// binary mode keeps the byte offsets exact
$fp= @fopen( $pdfsrc_fn, 'rb' );
if( $fp ) {
  if( !empty($_GET['debug']) ) {
    header('Content-Type: text/plain'); // debug
  }
  else {
    header('Content-Type: application/pdf');
  }

  $pdf_offset= 0;
  $text_offsets= array( );

  // iterate over first lines of pdfsrc file to load $text_offsets
  while( ($cc= fgets($fp, 1024))!== false ) {
    if( $cc[0]== '#' ) { // one of our comments
      list($comment, $name, $offset)= explode( '-', $cc );
      if( $name== 'customer' ) {
        $text_offsets[(int)$offset]= escape_pdf_string( $customer_text );
      }
      else if( $name== 'date' ) {
        $text_offsets[(int)$offset]= escape_pdf_string( $date_text );
      }
      else { // default
        $text_offsets[(int)$offset]= escape_pdf_string( '[ERROR]' );
      }
    }
    else { // finished with our comments
      echo $cc;
      $pdf_offset= strlen($cc)+ 1;
      break;
    }
  }

  // sort by increasing offsets
  ksort( $text_offsets, SORT_NUMERIC );
  reset( $text_offsets );

  $output_text_line_b= false;
  $output_text_b= false;
  $closed_string_b= false;

  list( $offset, $text )= next_text_pair( $text_offsets );
  $text_ii= 0;
  $text_len= strlen($text);

  // iterate over rest of file
  while( ($cc= fgetc($fp))!== false ) {
    if( $output_text_line_b && $cc== '(' ) {
      // we have reached the beginning of our TEXT
      $output_text_line_b= false;
      $output_text_b= true;
      echo '(';
    }
    else if( $output_text_b ) {
      if( $cc== ')' ) { // finished with this TEXT
        if( $closed_string_b ) {
          // string has already been capped; pad
          echo ' ';
        }
        else {
          echo ')';
        }

        // get next offset/TEXT pair
        list( $offset, $text )= next_text_pair( $text_offsets );
        $text_ii= 0;
        $text_len= strlen($text);

        // reset
        $output_text_b= false;
        $closed_string_b= false;
      }
      else if( $text_ii< $text_len ) {
        // output one character of $text
        echo $text[$text_ii++];
      }
      else if( $text_ii== $text_len ) {
        // done with $text, so cap this string
        echo ')';
        $closed_string_b= true;
        $text_ii++;
      }
      else {
        echo ' '; // replace padding with space
      }
    }
    else {
      // output this character
      echo $cc;
      if( $offset=== $pdf_offset ) {
        // we have reached a line in pdfsrc where
        // our TEXT should be; begin a lookout for '('
        $output_text_line_b= true;
      }
    }
    ++$pdf_offset;
  }
  fclose( $fp );
}
else { // file open failure
  echo 'Error: failed to open: '.$pdfsrc_fn;
}
?>

6.12.5 Running the Hack

IndigoPerl users (see Section 6.2.2 in [Hack #74]) can copy alter_pdf_text_example.php into C:\indigoperl\apache\htdocs\pdf_hacks along with a PDF template named cover.pdfsrc. Point your browser to http://localhost/pdf_hacks/alter_pdf_text_example.php, and a PDF should appear. All instances of textbeg_customer should be replaced with "Mary Jane Doe," and all instances of textbeg_date should be replaced with "March 15, 2004." To inspect the output as plain text instead, add a debug parameter to the URL (for example, ?debug=1). Naturally, you will need to adapt this script to your own purposes.
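One adaptation worth considering for templates with many placeholders is to generate the offset comment lines of Section 6.12.3 instead of typing them. The Python sketch below turns grep -ab output into #-dataname-dataoffset lines. It is a minimal illustration under stated assumptions: the function name offset_lines is hypothetical, placeholders are assumed to use the textbeg_ prefix followed by letters, digits, or underscores, and the grep output is assumed to come from the pdfsrc file before any comment lines were added, as in the text.

```python
import re

def offset_lines(grep_output: str, prefix: str = "textbeg_") -> list:
    """Convert `grep -ab` output into '#-dataname-dataoffset' lines."""
    # Each grep row looks like '9202:(textbeg_customer*****)'.
    pattern = re.compile(
        r"^(\d+):.*?" + re.escape(prefix) + r"([A-Za-z0-9_]+)"
    )
    lines = []
    for row in grep_output.splitlines():
        match = pattern.match(row)
        if match:  # skip rows that do not contain a placeholder
            offset, name = match.groups()
            lines.append(f"#-{name}-{offset}")
    return lines

sample = """9202:(textbeg_customer***************************)
9247:(textbeg_date***************************)]TJ
11793:(textbeg_customer***************************)"""

print("\n".join(offset_lines(sample)))
# #-customer-9202
# #-date-9247
# #-customer-11793
```

The printed lines can then be pasted above the %PDF line of the pdfsrc file.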