Hack 78. Introspect Your Subroutines | Perl Hacks: Tips & Tools for Programming, Debugging, and Surviving

Trace any subroutine to its source.

You can name anonymous subroutines [Hack #57] and deparse them [Hack #56]. You can even peek at their closed-over lexical variables [Hack #76]. There are still more wonders in the world.

Someday you'll have to debug a running program and figure out exactly where package A picked up subroutine B. One option is to trace all import( ) calls, but that's even less fun than it sounds. Another option is to pull out the scariest and most powerful toolkit in the Perl hacker's toolbox: the B::* modules.

The Hack

Finding a misbehaving function means you need to know two of three things:

The original package of the function
The name of the file containing the function
The line number in the file corresponding to the function

From there, your debugging should be somewhat easier. Perl stores all of this information for every CV^[11] it compiles. You just need a way to get to it.

^[11] The internal representation of all subroutines and methods.

The usual entry point is through the B module and its svref_2object( ) function, which takes a normal Perl data structure, grabs the underlying C representation, and wraps it in hairy-scary objects that allow you to peek (though not usually poke) at its guts.
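As a quick illustration of what svref_2object( ) hands back, the following sketch prints the class of the wrapper object for a few kinds of references. The helper name b_class( ) is invented for this example; only svref_2object( ) itself comes from the B module.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: return the B:: class that wraps whatever $ref points to.
sub b_class {
    my ($ref) = @_;
    return ref B::svref_2object($ref);
}

print b_class( sub { } ), "\n";    # a code reference is wrapped in a B::CV
print b_class( [ ] ),     "\n";    # an array reference in a B::AV
print b_class( { } ),     "\n";    # a hash reference in a B::HV
```

The class of the returned object decides which methods are available, so only code references give you the B::CV methods used in the rest of this hack.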

It's surprisingly easy to report a subroutine's vital information:

    use B;

    sub introspect_sub
    {
        my $sub      = shift;
        my $cv       = B::svref_2object( $sub );

        return join( ':',
            $cv->STASH->NAME( ), $cv->FILE( ), $cv->GV->LINE( ) . "\n"
        );
    }

introspect_sub( ) takes one argument, a reference to a subroutine. Passing that reference to svref_2object( ) returns a B::CV object. The STASH( ) method returns the symbol table hash (the stash) representing the package's namespace; calling NAME( ) on this returns the package name. The FILE( ) method returns the name of the file containing this subroutine. The GV( ) method returns the particular symbol table entry for this subroutine, on which the LINE( ) method returns the line of the file corresponding to the start of this subroutine.
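To see each of those method calls separately, here is a minimal sketch that returns the pieces as a list of key/value pairs instead of one joined string. The helper name sub_origin( ), the hash keys, and the Foo::bar subroutine are assumptions made for the example; the B methods are the same ones used above, plus NAME( ) on the glob.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: return the parts of the report individually.
sub sub_origin {
    my ($sub) = @_;
    my $cv = B::svref_2object($sub);
    my $gv = $cv->GV;

    return (
        pkg  => $cv->STASH->NAME,    # package the sub was compiled in
        name => $gv->NAME,           # name of its symbol table entry
        file => $cv->FILE,           # file the sub was compiled from
        line => $gv->LINE,           # line recorded for that entry
    );
}

{
    package Foo;
    sub bar { }
}

my %origin = sub_origin( \&Foo::bar );
print "$origin{pkg}::$origin{name} at $origin{file} line $origin{line}\n";
```

For an anonymous subroutine the name normally comes back as __ANON__, which is one reason naming anonymous subroutines [Hack #57] pays off when debugging.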

Okay, using Devel::Peek::CvGV on a subroutine reference is easier.

    use Devel::Peek 'CvGV';

    sub Foo::bar { }

    print CvGV( \&Foo::bar );

Of course, that prints the name of the glob containing the subroutine...but it's a quick way to find even that much information. Now you know two ways to do it!
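If the glob name is all you need, a small wrapper can turn it into a plain qualified name. This is only a sketch: qualified_name( ) is a made-up helper, and it assumes the glob stringifies in the usual *Package::name form.

```perl
use strict;
use warnings;
use Devel::Peek 'CvGV';

{
    package Foo;
    sub bar { }
}

# Hypothetical helper: stringify the glob returned by CvGV() and drop the
# leading '*' to get a "Package::name" string.
sub qualified_name {
    my ($sub) = @_;
    my $glob = CvGV($sub);
    ( my $name = "$glob" ) =~ s/^\*//;
    return $name;
}

print qualified_name( \&Foo::bar ), "\n";
```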

Running the Hack

Pass in any subroutine reference and print the result somehow to see all of this wonderful data:

    use Data::Dumper;

    package Foo;

    sub foo { }

    package Bar;

    sub bar { }

    *foo = \&Foo::foo;

    package main;

    warn introspect_sub( \&Foo::foo );
    warn introspect_sub( \&Bar::bar );
    warn introspect_sub( \&Bar::foo );
    warn introspect_sub( \&Dumper );

    # introspect_sub( ) as before...

Run the file as normal:

    $ perl introspect.pl
    Foo:examples/introspect.pl:14
    Bar:examples/introspect.pl:18
    Foo:examples/introspect.pl:14
    Data::Dumper:/usr/lib/perl5/site_perl/5.8.7/powerpc-linux/Data/Dumper.pm:495

As you can see, aliasing Bar::foo( ) to Foo::foo( ) didn't fool the introspector, nor did importing Dumper( ) from Data::Dumper.

Hacking the Hack

That's not all, though. You can also see any lexical variables declared within a subroutine. Every CV holds a special array^[12] called a padlist. This padlist itself contains two arrays, one holding the names of the lexical variables and the other containing an array of arrays holding the values for subsequent recursive invocations of the subroutine.^[13]

^[12] In an AV data structure that represents arrays.

^[13] At least, it's something like that; it gets complex quickly.
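To get a feel for that layout, this sketch lists the top-level elements of a subroutine's padlist along with the class of each wrapper object. The helper padlist_parts( ) and the sample subroutine counter( ) are invented for illustration, and the class names printed vary between Perl versions, so no particular output is shown.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: return the top-level elements of a sub's padlist.
sub padlist_parts {
    my ($sub) = @_;
    my $cv = B::svref_2object($sub);
    return $cv->PADLIST->ARRAY;
}

sub counter {
    my ( $start, $step ) = @_;
    return $start + $step;
}

my @parts = padlist_parts( \&counter );

# The first element holds the names; the remaining ones hold values,
# one per recursion depth that has been set up so far.
printf "padlist has %d elements\n", scalar @parts;
printf "  element %d is a %s\n", $_, ref $parts[$_] for 0 .. $#parts;
```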

Grabbing a list of all lexical variables declared in that scope is as simple as walking the appropriate array in the padlist:

    sub introspect_sub
    {
        my $sub      = shift;
        my $cv       = B::svref_2object( $sub );
        my ($names)  = $cv->PADLIST->ARRAY( );

        my $report   = join( ':',
            $cv->STASH->NAME( ), $cv->FILE( ), $cv->GV->LINE( ) . "\n"
        );

        my @lexicals = map { $_->can( 'PV' ) ? $_->PV( ) : ( ) } $names->ARRAY( );

        return $report unless @lexicals;

        $report .= "\t(" . join( ', ', @lexicals ) . ")\n";

        return $report;
    }

There's one trick: the array containing the names of the lexicals doesn't contain only names. However, the B objects that do hold a name always have a PV( ) method that returns the name as a string, so the code filters out everything that lacks one. It works nicely, too:

    use Data::Dumper;

    package Foo;

    sub foo
    {
        my ($foo, $bar, $baz) = @_;
    }

    package Bar;

    sub bar { }

    *foo = \&Foo::foo;

    package main;

    warn introspect_sub( \&Foo::foo );
    warn introspect_sub( \&Bar::bar );
    warn introspect_sub( \&Bar::foo );
    warn introspect_sub( \&Dumper );

    # introspect_sub( ) as modified...

This outputs:

    $ perl introspect_lexicals.pl
    Foo:examples/introspect.pl:14
        ($foo, $bar, $baz)
    Bar:examples/introspect.pl:18
    Foo:examples/introspect.pl:14
        ($foo, $bar, $baz)
    Data::Dumper:/usr/lib/perl5/site_perl/5.8.7/powerpc-linux/Data/Dumper.pm:495
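The padlist internals have changed across Perl releases, so a more defensive walk over the names can be worth having. The sketch below is one possible variant, not the hack's own code: lexical_names( ) is a hypothetical helper, and it assumes that some entries may answer PV( ) with an undefined value and that subroutines without a Perl body (XS functions, for instance) may have no walkable padlist at all.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list the named lexical variables in a sub's pad.
sub lexical_names {
    my ($sub) = @_;
    my $cv      = B::svref_2object($sub);
    my $padlist = $cv->PADLIST;

    # Assumption: subs without a Perl body give back an object with no ARRAY().
    return unless $padlist->can('ARRAY');

    my ($names) = $padlist->ARRAY;
    return unless $names && $names->can('ARRAY');

    my @found;
    for my $entry ( $names->ARRAY ) {
        next unless $entry->can('PV');     # skip placeholder objects
        my $name = $entry->PV;
        next unless defined $name;         # skip unnamed pad slots
        next unless $name =~ /^[\$\@%]./;  # keep only $scalar, @array, %hash
        push @found, $name;
    }

    return @found;
}

{
    package Foo;
    sub foo { my ( $foo, $bar, $baz ) = @_; }
}

print join( ', ', lexical_names( \&Foo::foo ) ), "\n";
```

Filtering on the sigil is a design choice: it drops anything in the pad that is not a plain variable, at the cost of hiding entries you might occasionally want to see.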

Easy...at least once you've trawled through perldoc B and perhaps the Perl source code (cv.h and pad.c, if you really need details).