Hack 80. Profile Your Program Size | Perl Hacks: Tips & Tools for Programming, Debugging, and Surviving

Find out how much memory your program takes, and then trim it!

The difference between a Perl program and a natively compiled binary is far more than just program convenience. Although the Perl program can do far more with less source code, Perl's data structures and bookkeeping can take up more memory than you might think. Size matters sometimes. Even if you have plenty of memory (if you're not trying to optimize for shared memory in a child-forking application, for example), a program with good algorithms that isn't tied to IO or incoming requests can still run faster if it has fewer operations to perform.

One of the best optimizations for a Perl program is trimming the number of operations it has to perform. The less work it has to do, the better.

This isn't an argument for obfuscated or golfed code, just good profiling to find and trim the few fat spots left in a program.

The Hack

When Perl compiles a program, it builds an internal representation called the optree. This represents every single discrete operation in a program. Thus knowing how many opcodes there are in a program (or module) and the size of each opcode is necessary to know where to start optimizing.
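To make the optree less abstract, here is a minimal sketch that prints the ops of one small subroutine as an indented tree. It uses only the core B module, so it reports no sizes. The names collect_ops( ), optree_lines( ), and add( ) are invented for this illustration, and the exact ops you see vary between Perl versions.

```perl
use strict;
use warnings;
use B ();

# Append one line per op to @$lines, indented by depth in the optree.
sub collect_ops {
    my ($op, $depth, $lines) = @_;

    push @$lines, sprintf '%s%-6s %s', '    ' x $depth, B::class($op), $op->name;

    # Only ops flagged with OPf_KIDS have children to descend into.
    return unless $op->flags & B::OPf_KIDS();

    # $$kid is false once sibling() runs off the end of the chain.
    for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
        collect_ops($kid, $depth + 1, $lines);
    }
    return;
}

# Return the optree of a code reference as a list of text lines.
sub optree_lines {
    my ($code) = @_;
    my @lines;
    collect_ops(B::svref_2object($code)->ROOT, 0, \@lines);
    return @lines;
}

# Hypothetical subroutine to inspect.
sub add { my ($x, $y) = @_; return $x + $y }

print "$_\n" for optree_lines(\&add);
```

Even a one-line subroutine expands into a dozen or more ops, which is why counting them is a reasonable first measure of a program's weight.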

The B::TerseSize module is useful in this case.^[15] It adds a size( ) method to all ops. More importantly, it gives you detailed information about the size of all symbols in a package if you call package_size( ).

^[15] It's more useful when used with mod_perl and Apache::Status; see http://modperlbook.org/html/ch09_04.html.

To find the largest subroutine in a package and report on its opcodes, use code something like:

    use B::TerseSize;

    sub report_largest_sub {
        my $package                  = shift;
        my ($symbols, $count, $size) = B::TerseSize::package_size( $package );

        # only subroutines have a count key; sort them largest first
        my ($largest)                =
            sort { $symbols->{$b}{size} <=> $symbols->{$a}{size} }
            grep { exists $symbols->{$_}{count} }
            keys %$symbols;

        print "Total size for $package is $size in $count ops.\n";
        print "Reporting $largest.\n";

        # dump every op of the largest subroutine
        B::TerseSize::CV_walk( 'root', $package . '::' . $largest );
    }

package_size( ) returns three items: a reference to a hash where the key is the name of a symbol and the value is a hash reference with the count of opcodes for that symbol and the total size of the symbol, the total count of opcodes for the package, and the total size of the package.

report_largest_sub( ) takes the name of a loaded package, finds the largest subroutine in that package (where the heuristic is that only subroutines have a count key in the second-level hash of the symbol information), prints some summary information about the package, and then calls CV_walk( ) which prints a lot of information about the selected subroutine.
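The selection step is easier to follow on data you can see. The sketch below runs the same sort and grep chain over a hand-written hash shaped like the first return value of package_size( ) as described above. The symbol names and all of the numbers are made up for illustration, and largest_sub( ) is a hypothetical helper, not part of B::TerseSize.

```perl
use strict;
use warnings;

# Made-up data in the shape described for package_size( ):
# symbol name => { size => bytes, count => ops }, where only
# subroutines carry a count key.
my %symbols = (
    'find_list'   => { count => 246, size => 11_500 },
    'format_line' => { count => 120, size => 5_600  },
    'import'      => { count => 85,  size => 3_900  },
    '%tags'       => { size  => 20_000 },    # not a subroutine: no count
);

# Return the name of the largest symbol that looks like a subroutine.
sub largest_sub {
    my ($symbols) = @_;

    my ($largest) =
        sort { $symbols->{$b}{size} <=> $symbols->{$a}{size} }
        grep { exists $symbols->{$_}{count} }
        keys %$symbols;

    return $largest;
}

print largest_sub(\%symbols), "\n";
```

Without the grep, the largest entry would be the hash named %tags; the count heuristic keeps the report focused on code.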

Running the Hack

The real meat of the hack is in interpreting the output. B::TerseSize displays statistics for every significant line of code in a subroutine. Thus, calling report_largest_sub( ) on Text::WikiFormat will print pages of output for find_list( ):

    Total size for Text::WikiFormat is 92078 in 1970 ops.
    Reporting find_list.
    UNOP   leavesub      0x10291e88 {28 bytes} [targ 1 - $line]
        LISTOP lineseq       0x10290050 {32 bytes}
    ------------------------------------------------------------
            COP    nextstate     0x10290010 {24 bytes}
            BINOP  aassign       0x1028ffe8 {32 bytes} [targ 6 - undef]
                UNOP   null          0x1028fd38 {28 bytes} [list]
                    OP     pushmark      0x1028ffc8 {24 bytes}
                    UNOP   rv2av         0x1028ffa8 {28 bytes} [targ 5 - undef]
                        SVOP   gv            0x1028ff88 {96 bytes}  GV *_
                UNOP   null          0x1028d660 {28 bytes} [list]
                    OP     pushmark      0x1028fec0 {24 bytes}
                    OP     padsv         0x1028fe68 {24 bytes} [targ 1 - $line]
                    OP     padsv         0x1028fea0 {24 bytes} [targ 2 - $list_types]
                    OP     padsv         0x1028fee0 {24 bytes} [targ 3 - $tags]
                    OP     padsv         0x1028ff10 {24 bytes} [targ 4 - $opts]
    [line 317 size: 380 bytes]
    ------------------------------------------------------------
    (snip 234 more lines)

The final line gives the key to interpreting the output; it represents line 317 of the file defining this package:

    315: sub find_list
    316: {
    317:     my ( $line, $list_types, $tags, $opts ) = @_;
    318:
    319:     for my $list (@$list_types)

This single line costs twelve opcodes and around 380 bytes^[16] of memory. If this were worth optimizing, perhaps removing an unused variable would help.

^[16] Give or take; B::TerseSize can only guess sometimes.

The previous lines list each op on this line in tree order. That is, the root of the branch is the nextstate control op. It has a sibling, the aassign binary op, whose children appear indented beneath it. You can ignore the memory address, but the size of the op in curly braces can be useful. Finally, some ops have additional information in square brackets, especially those referring to lexical variables.

The real use of this information is when you can compare two different implementations of an algorithm to each other to optimize for memory usage or number of ops. Sometimes the code with the fewest number of lines really isn't slimmer.
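As a rough illustration of such a comparison, the following sketch counts the ops in two hypothetical ways of building a comma-separated string. It relies only on the core B module, so it compares op counts rather than bytes; with B::TerseSize loaded you could sum the size( ) of each op instead. All subroutine names here are invented for the example, and the counts include ops that Perl has optimized to null.

```perl
use strict;
use warnings;
use B ();

# Count an op and, recursively, all of its children.
sub count_ops {
    my ($op) = @_;
    my $count = 1;

    if ($op->flags & B::OPf_KIDS()) {
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            $count += count_ops($kid);
        }
    }
    return $count;
}

# Total number of ops in the optree of a code reference.
sub op_count_for {
    my ($code) = @_;
    return count_ops(B::svref_2object($code)->ROOT);
}

# Two hypothetical implementations of roughly the same task.
sub join_builtin { return join ',', @_ }

sub join_by_hand {
    my ($out, @rest) = @_;
    $out = '' unless defined $out;
    $out .= ",$_" for @rest;
    return $out;
}

printf "%-12s %d ops\n", $_->[0], op_count_for($_->[1])
    for [ join_builtin => \&join_builtin ], [ join_by_hand => \&join_by_hand ];
```

Here the built-in wins on both line count and op count, but the two measures do not always agree, which is the reason to measure rather than guess.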

Hacking the Hack

Do you absolutely hate the output from CV_walk( )? Write your own callback and use B::walkoptree_slow( ) or B::walkoptree_exec( ) to call it. Don't forget to use B::TerseSize to make the size( ) method available on ops. You can get package and line number information from nextstate ops.
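A custom walker might start out like the sketch below, which tallies ops per source line instead of printing each one. It assumes only the core B module, so it counts ops and leaves sizes out; if B::TerseSize is loaded, the callback is the place where you would call size( ) on each op. The method name hack_tally_by_line( ) and the helper ops_by_line( ) are hypothetical.

```perl
use strict;
use warnings;
use B ();

my %ops_on_line;
my $current_line = 0;    # ops seen before the first nextstate land on line 0

# B::walkoptree_slow( ) calls this method, by name, on every op it visits.
sub B::OP::hack_tally_by_line {
    my ($op) = @_;

    # nextstate ops (dbstate under the debugger) know their line number
    $current_line = $op->line if $op->name =~ /\A(?:next|db)state\z/;

    $ops_on_line{$current_line}++;
    return;
}

# Return a hash of line number => op count for a code reference.
sub ops_by_line {
    my ($code) = @_;

    %ops_on_line  = ();
    $current_line = 0;
    B::walkoptree_slow( B::svref_2object($code)->ROOT, 'hack_tally_by_line' );

    return %ops_on_line;
}

# Hypothetical subroutine to profile.
sub sample {
    my ($first, @rest) = @_;
    return $first + @rest;
}

my %report = ops_by_line(\&sample);
printf "line %3d: %d ops\n", $_, $report{$_}
    for sort { $a <=> $b } keys %report;
```

Because the walk visits ops in tree order, each nextstate arrives before the ops of the statement it introduces, so a single "current line" variable is enough to attribute them.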

Unfortunately, doing this effectively probably means stealing the code from B::TerseSize. At least it's reasonably small and self-contained. Look for the methods declared in the B:: namespace.