Hack 77. Find All Global Variables


Track down global variables so you can replace them.

Perl 5's roots in Perl 1 show through sometimes. This is especially evident in the fact that variables are global by default and lexical only by declaration. The strict pragma helps, but adding that to a large program that's only grown over time (in the sense that kudzu grows) can make programs difficult to manage.

One problem of refactoring such a program is that it's difficult to tell by reading whether a particular variable is global or lexical, especially when any declaration may have come hundreds or thousands of lines earlier. Your friends and co-workers may claim that you can't run a program to analyze your program and find these global variables, but you can!
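To make the distinction concrete, here is a minimal sketch (not part of the original hack) showing that a global variable is reachable through the package symbol table, while a lexical is not. The helper name symbol_table_has is hypothetical, and the sketch assumes the code runs in package main.

```perl
use strict;
use warnings;

our $frog  = 'green';    # package (global) variable: stored in a typeglob in %main::
my  $bunny = 'white';    # lexical variable: stored in a pad, not in the symbol table

# Hypothetical helper: report whether package main has a symbol of this name.
sub symbol_table_has {
    my ($name) = @_;
    return exists $main::{$name} ? 1 : 0;
}

printf "frog in symbol table:  %d\n", symbol_table_has('frog');
printf "bunny in symbol table: %d\n", symbol_table_has('bunny');
```

This kind of check only works on names you already know. It does not tell you where in the source a global is used, which is the problem the rest of this hack addresses.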

The Hack

Perl 5 has several core modules in the B::* namespace referred to as the backend compiler collection. These modules let you work with the internal form of a program as Perl has compiled and is running it. To see a representation of a program as Perl sees it, use the B::Concise module. Here's a short program that uses both lexical and global variables:

use vars qw( $frog $toad );

sub wear_bunny_costume {
    my $bunny = shift;
    $frog     = $bunny;
    print "\$bunny is $bunny\n\$frog is $frog\n\$toad is $toad";
}

$frog and $toad are global variables.[9] $bunny is a lexical variable. Unless you notice the my or use vars lines, it's not obvious to the reader which is which. Perl knows, though:

[9] They're also friends.

$ perl -MO=Concise,wear_bunny_costume friendly_animals.pl
examples/friendly_animals.pl syntax OK
main::wear_bunny_costume:
n  <1> leavesub[1 ref] K/REFC,1 ->(end)
-     <@> lineseq KP ->n
1        <;> nextstate(main 35 friendly_animals.pl:5) v ->2
6        <2> sassign vKS/2 ->7
4           <1> shift sK/1 ->5
3              <1> rv2av[t2] sKRM/1 ->4
2                 <$> gv(*_) s ->3
5           <0> padsv[$bunny:35,36] sRM*/LVINTRO ->6
7        <;> nextstate(main 36 friendly_animals.pl:6) v ->8
a        <2> sassign vKS/2 ->b
8           <0> padsv[$bunny:35,36] s ->9
-           <1> ex-rv2sv sKRM*/1 ->a
9              <$> gvsv(*frog) s ->a
b        <;> nextstate(main 36 friendly_animals.pl:7) v ->c
m        <@> print sK ->n
c           <0> pushmark s ->d
-           <1> ex-stringify sK/1 ->m
-              <0> ex-pushmark s ->d
l              <2> concat[t6] sKS/2 ->m
j                 <2> concat[t5] sKS/2 ->k
h                    <2> concat[t4] sKS/2 ->i
f                       <2> concat[t3] sK/2 ->g
d                          <$> const(PV "$bunny is ") s ->e
e                          <0> padsv[$bunny:35,36] s ->f
g                       <$> const(PV "\n$frog is ") s ->h
-                    <1> ex-rv2sv sK/1 ->j
i                       <$> gvsv(*frog) s ->j
k                 <$> const(PV "\n") s ->l

That's a lot of potentially confusing output, but it's reasonably straightforward. This is a textual representation of the optree for the wear_bunny_costume( ) subroutine. The padsv and gvsv lines represent variable accesses. As you can see, two different opcodes fetch values from variables: padsv fetches the value of a named lexical from a lexical pad, while gvsv fetches the value of a scalar from a typeglob.
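The same distinction is visible from plain Perl using only the core B module. The following is a rough sketch, not the hack's own method: it walks a subroutine's optree by hand and records every gvsv op along with the line of the most recent nextstate op. The names walk_ops and find_globals are hypothetical, and the pad lookup for threaded perls follows what B::Concise does internally, so treat that branch as an assumption about your perl build.

```perl
use strict;
use warnings;
use B qw( svref_2object class OPf_KIDS );

our ( $frog, $toad );

sub wear_bunny_costume {
    my $bunny = shift;
    $frog = $bunny;
    return "$bunny / $frog / $toad";
}

# Visit $op and all of its children, depth first.
sub walk_ops {
    my ( $op, $callback ) = @_;
    return unless $$op;                     # B::NULL marks the end of a chain
    $callback->($op);
    return unless $op->flags & OPf_KIDS;
    for ( my $kid = $op->first; $$kid; $kid = $kid->sibling ) {
        walk_ops( $kid, $callback );
    }
}

# Return a list of { name, line } hashes, one per gvsv op in the subroutine.
sub find_globals {
    my ($code) = @_;
    my $cv = svref_2object($code);
    my ( $line, @found ) = (0);

    walk_ops( $cv->ROOT, sub {
        my ($op) = @_;
        my $name = $op->name;

        # Remember the line number of the statement being entered.
        if ( $name eq 'nextstate' || $name eq 'dbstate' ) {
            $line = $op->line;
            return;
        }
        return unless $name eq 'gvsv';

        # Threaded perls keep the glob in the pad (PADOP);
        # unthreaded perls attach it to the op itself (SVOP).
        my $gv = class($op) eq 'PADOP'
            ? ( ( $cv->PADLIST->ARRAY )[1]->ARRAY )[ $op->padix ]
            : $op->gv;

        push @found, { name => $gv->NAME, line => $line };
    } );

    return @found;
}

printf "Global \$%s used near line %d\n", $_->{name}, $_->{line}
    for find_globals( \&wear_bunny_costume );
```

Lexicals such as $bunny never appear in the result because they compile to padsv ops, not gvsv ops. This sketch only covers scalar globals; arrays and hashes compile to different opcodes.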

Running the Hack

Knowing this, you can search for all gvsv ops within a compiled program and find the global variables! B::XPath, available from the CPAN, is a backend module that allows you to search a given tree with XPath expressions. To look for a gvsv node in the optree, use the XPath expression //gvsv:

use B::XPath;

my $node = B::XPath->fetch_root( \&wear_bunny_costume );

for my $global ( $node->match( '//gvsv' ) ) {
    my $location = $global->find_nextstate( );
    printf( "Global %s found at %s:%d\n",
        $global->NAME( ), $location->file( ), $location->line( ) );
}

fetch_root( ) gets the root opcode for a given subroutine. To search the entire program, use B::XPath::fetch_main_root( ). match( ) applies an XPath expression to the optree starting at the given $node, returning a list of matching nodes.

As each node returned should be a gvsv op (blessed into B::XPath::SVOP), the NAME( ) method retrieves the name of the glob. The find_nextstate( ) method finds the nearest parent control op (or COP) which contains the name of the file and the line number on which the variable appeared.[10] The results are:

[10] It uses a heuristic, so it may not always be exact.

$ perl friendly_animals.pl
Global frog found at friendly_animals.pl:8
Global frog found at friendly_animals.pl:9

Hacking the Hack

If you want to find only globals named $toad, change the XPath expression and parameterize it by a node attribute:

$node->match( '//gvsv[@NAME="toad"]' )

There's no limit to the types of opcodes you can search for in a program beyond what B::XPath supports and the XPath expressions you can write. As long as you can dump a snippet of code into an optree list, you can eventually turn that into an XPath expression. From there, just grab the node information you need and you're on your way.

See also the built-in B::Xref module. It produces a cross reference of variables and subroutines in your code.



Perl Hacks
Perl Hacks: Tips & Tools for Programming, Debugging, and Surviving
ISBN: 0596526741
