3.9 How AUTOLOAD Works


The AUTOLOAD mechanism, built into the definition of Perl packages, is simple to use. If a subroutine named AUTOLOAD is declared within a package, it is called whenever an undefined subroutine is called within the package. AUTOLOAD is a special name that Perl recognizes only when it is capitalized as shown. Don't use the subroutine name AUTOLOAD (or DESTROY) for any other purpose, or you'll suffer unintended consequences.

Without an AUTOLOAD subroutine defined in a package, an attempt to call some undefined subroutine simply produces an error when the program runs. But if an AUTOLOAD subroutine is defined, it is called instead and is passed the arguments of the undefined subroutine. At the same time, the $AUTOLOAD variable is set to the name of the undefined subroutine.

Here's an example of a short Perl program that tries to call an undefined function:

 #!/usr/bin/perl

 use strict;
 use warnings;

 print "I started the program\n";

 report_protein_function("one", "two");

 print "I got to the end of the program\n";

It gives the following output:

 I started the program
 Undefined subroutine &main::report_protein_function called at jk.pl line 8.

Here's what happens when an AUTOLOAD subroutine is defined in the package:

 #!/usr/bin/perl

 use strict;
 use warnings;
 use vars '$AUTOLOAD';

 print "I started the program\n";

 report_protein_function("one", "two");

 print "I got to the end of the program\n";

 sub AUTOLOAD {
     print "AUTOLOAD is set to $AUTOLOAD\n";
     print "with arguments ", "@_\n";
 }

It gives the following output:

 I started the program
 AUTOLOAD is set to main::report_protein_function
 with arguments one two
 I got to the end of the program

3.9.1 Defining Global Variables

Recall that when you start programs with such statements as:

 use strict; 

you have to declare all variables, normally as lexically scoped variables using my. However, there are times when your program needs to use global variables that aren't lexically scoped. To use AUTOLOAD, you need access to the predefined $AUTOLOAD global variable.

To enable access to the package global $AUTOLOAD, you must specifically exempt it from the use strict injunction. This can be accomplished with the use vars statement:

 use vars '$AUTOLOAD'; 

Other globals can be declared in this way as well, but globals should be used sparingly, and preferably not at all.

Newer versions of Perl (Version 5.6.0 and later) have a cleaner way to declare global variables even when use strict is in effect:

 our $AUTOLOAD; 

This makes the variable $AUTOLOAD a legal global within the scope in which it is declared; in Gene3.pm, the scope is the entire class.

Without our $AUTOLOAD or use vars '$AUTOLOAD', the program won't run; instead, it fails at compile time with the complaint:

 Global symbol "$AUTOLOAD" requires explicit package name 
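As a minimal sketch of the our form, assuming Perl 5.6.0 or later, the following program compiles under use strict and catches a call to an undefined subroutine. The name lookup_gene is hypothetical and deliberately left undefined; it is not part of the Gene modules.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Declare the package global so that "use strict" accepts the
# unqualified name $AUTOLOAD (assumes Perl 5.6.0 or later).
our $AUTOLOAD;

sub AUTOLOAD {
    # $AUTOLOAD holds the fully qualified name, e.g. main::lookup_gene
    return "caught $AUTOLOAD";
}

# lookup_gene is a hypothetical, deliberately undefined subroutine.
my $message = lookup_gene('BRCA1');
print "$message\n";
```

Removing the our $AUTOLOAD line should reproduce the "Global symbol" complaint shown above.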

3.9.2 AUTOLOAD Simplifies Writing Methods

Having the AUTOLOAD mechanism available can greatly simplify the writing of class methods. Many classes require methods to examine and to change the values of attributes, as did the two previous versions, Gene1.pm and Gene2.pm.

If an object has many attributes, you have to write an accessor method and a mutator method for each attribute. This is repetitive; it requires defining more methods every time the list of attributes changes, and, in general, it's hard to maintain such code.

The new version Gene3.pm uses AUTOLOAD to automate the handling of methods for accessors and mutators. All you need do is write the one AUTOLOAD subroutine, and all these similar, basic methods are handled in the same fashion by the one bit of code.

3.9.2.1 Bypassing use strict

The AUTOLOAD subroutine starts by relaxing the use strict directive. Just as the $AUTOLOAD global variable has to be exempted from use strict, so the magic AUTOLOAD speedup (described in the next section) requires an exemption from use strict at a specific place within the AUTOLOAD subroutine. Thus, the statement:

 no strict "refs";

turns off the strict check on symbolic references where required. This enables lines (to be explained later) such as:

 *{$AUTOLOAD} = sub { return $_[0]->{$attribute} }; 

to bypass the otherwise desirable use strict instruction.
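The exemption can be kept as narrow as possible by placing no strict "refs" inside a block, since the pragma is lexically scoped. The following is a sketch of that idea only; install_sub and greet are hypothetical names that do not appear in Gene3.pm.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# install_sub is a hypothetical helper, not part of Gene3.pm.
sub install_sub {
    my ($name, $code) = @_;
    {
        # Symbolic references are allowed only inside this block.
        no strict 'refs';
        *{$name} = $code;
    }
    # strict 'refs' is back in force from here on.
    return;
}

# Install a subroutine under the fully qualified name main::greet.
install_sub('main::greet', sub { return "hello, $_[0]" });

my $greeting = greet('world');
print "$greeting\n";
```

Everything outside the inner block is still checked by use strict, so an accidental symbolic reference elsewhere in the program is still caught.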

3.9.2.2 AUTOLOAD arguments

Recall that AUTOLOAD is automatically used when 1) it has been defined, and 2) an undefined subroutine is called. When this happens, AUTOLOAD is simply passed the arguments that would have gone to the undefined subroutine.

For example, say you call an undefined method fold on an object $peptide:

 $peptide->fold(-style => 'prion') 

If you define an AUTOLOAD method in the class, it's called and passed the calling object or class name, as usual, plus the arguments -style => 'prion' you were trying to pass to the nonexistent fold method. The global scalar variable $AUTOLOAD is also set to the name of the nonexistent fold method.

The version of AUTOLOAD in Gene3.pm expects at most one explicit argument. It therefore captures two values: the object, which the arrow notation passes automatically as the first argument, and the explicit argument, if any. This line in the AUTOLOAD subroutine:

 my ($self, $newvalue) = @_; 

assigns the reference to the object to the new variable $self and the value to be set, if any, to the new variable $newvalue.
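To see what AUTOLOAD receives for a method call, here is a small sketch. The Peptide class is hypothetical and exists only for this illustration; the check for DESTROY is an addition that keeps object destruction from being reported as a missing method.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Peptide is a hypothetical class used only for this illustration.
package Peptide;

our $AUTOLOAD;

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

sub AUTOLOAD {
    # The object arrives first, followed by the explicit arguments.
    my ($self, @args) = @_;
    return if $AUTOLOAD =~ /::DESTROY$/;    # ignore object destruction
    return "$AUTOLOAD called on a " . ref($self) . " object with (@args)";
}

package main;

my $peptide = Peptide->new;
my $report  = $peptide->fold(-style => 'prion');
print "$report\n";
```

This prints "Peptide::fold called on a Peptide object with (-style prion)", which shows both the fully qualified name in $AUTOLOAD and the order of the arguments in @_.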

3.9.2.3 Using naming conventions to write code: get_ and set_

The various versions of the Gene module have named attributes with beginning underscores, for example, _name for the gene name. The accessors and mutators for attributes have been assigned names that prepend get and set to the beginning of the attribute name, for example, get_name and set_name.

In Gene3.pm, the AUTOLOAD subroutine elevates this convention to an enforced discipline, by recognizing only method names and attribute names that conform to this convention. It first examines the name of the called subroutine as stored in the $AUTOLOAD global variable, checks if the subroutine name is in the expected form, and if so, extracts the attribute name from the subroutine name with a regular expression. The AUTOLOAD subroutine then checks that the requested attribute exists, and fetches or sets the value of that attribute.

The first part of the AUTOLOAD subroutine does some checking to see if the subroutine name is in the expected form, and if so, it extracts the attribute name, and the requested operation (get or set). This first test:

 my ($operation, $attribute) = ($AUTOLOAD =~ /(get|set)(_\w+)$/);

 # Is this a legal method name?
 unless($operation && $attribute) {
     croak "Method name $AUTOLOAD is not in the recognized form (get|set)_attribute\n";
 }
 unless(exists $self->{$attribute}) {
     croak "No such attribute '$attribute' exists in the class ", ref($self);
 }

uses a regular expression to see if the $AUTOLOAD variable is storing a method name made up of the desired operation (get or set) followed by an attribute name (complete with leading underscore), and then checks that the attribute is defined for objects of this class. The regular expression:

 (get|set)(_\w+)$

looks for a name that, after get or set, is composed of an underscore followed by one or more legal word characters (as described in the perlre manpage on regular expressions):

 _\w+ 

Here, the underscore matches an underscore, and the \w matches any legal word character, and the + matches one or more such word characters. These are remembered and captured in the $operation and $attribute variables by surrounding with parentheses the parts of the regular expression that match the operation and the attribute name:

 (get|set)(_\w+)

This attribute name is assigned to the variable $attribute (for obvious mnemonic reasons) to use in the rest of the subroutine. Similarly, the operation get or set is assigned to the $operation variable.
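The behavior of this pattern can be tried out on a few names. The sketch below wraps the same regular expression in a hypothetical helper, parse_method_name, which is not part of Gene3.pm; the method names in the loop are invented for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# parse_method_name is a hypothetical helper wrapping the same pattern.
sub parse_method_name {
    my ($name) = @_;
    # In list context a failed match returns the empty list,
    # which leaves both variables undefined.
    my ($operation, $attribute) = ($name =~ /(get|set)(_\w+)$/);
    return ($operation, $attribute);
}

for my $name ('Gene3::get_name', 'Gene3::set_organism', 'Gene3::fold') {
    my ($operation, $attribute) = parse_method_name($name);
    if ($operation && $attribute) {
        print "$name: operation=$operation attribute=$attribute\n";
    }
    else {
        print "$name: not in the form (get|set)_attribute\n";
    }
}
```

Note that the pattern is anchored only at the end of the name, so a name such as Gene3::forget_name would also match, with get as the operation. A stricter variant could require the package separator :: immediately before get or set.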

The second part of the test checks to see if such an attribute name exists in the hash that represents the class object:

 unless(exists $self->{$attribute}) {
     croak "No such attribute '$attribute' exists in the class ", ref($self);
 }

The Perl exists function checks to see if a hash key exists; the value for the key may be undefined, but the key must be present. $self is the reference to the class object, so the following:

 exists $self->{$attribute} 

checks to see if any such attribute actually exists in the object.
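The difference between a key that exists and a value that is defined can be seen in a short sketch. The hash below is a hypothetical object with made-up attributes, and attribute_status is a helper invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# attribute_status is a hypothetical helper, not part of Gene3.pm.
sub attribute_status {
    my ($object, $attribute) = @_;
    my $exists  = exists $object->{$attribute}  ? 'yes' : 'no';
    my $defined = defined $object->{$attribute} ? 'yes' : 'no';
    return "exists=$exists defined=$defined";
}

# A hypothetical object hash: _citation is present but has no value yet.
my $gene = { _name => 'Aging', _citation => undef };

for my $attribute ('_name', '_citation', '_length') {
    print "$attribute: ", attribute_status($gene, $attribute), "\n";
}
```

Only _length fails the exists test, so only a call such as get_length would be rejected; _citation passes even though its value is undefined.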

If the method name passed to AUTOLOAD begins with get or set, ends with a name including a leading underscore, and if that name is an existing key in the hash that is the class object, the tests will succeed. If they fail, the program will croak at this point.

3.9.2.4 AUTOLOAD accessors

The next bit of AUTOLOAD code handles the calls to class accessors:

 # AUTOLOAD accessors
 if($operation eq 'get') {
     # define subroutine
     *{$AUTOLOAD} = sub { shift->{$attribute} };
 }

The code first determines that a get accessor was wanted. Then the undefined accessor method (whose name has been saved in the variable $AUTOLOAD) is defined. The subroutine definition is placed in the program's symbol table with *{$AUTOLOAD}. The new subroutine gets the object from the arguments by the call to shift. The object is a hash, and the value in the hash for the attribute is returned from the subroutine. So this method is a simple accessor that returns the value of one attribute. This accessor isn't actually used here; it's just defined in the symbol table.

3.9.2.5 AUTOLOAD mutators

The next bit of AUTOLOAD code handles the calls to class mutators:

 # AUTOLOAD mutators
 }elsif($operation eq 'set') {
     # define subroutine
     *{$AUTOLOAD} = sub { shift->{$attribute} = shift; };
     # set the new attribute value
     $self->{$attribute} = $newvalue;
 }

Here, after determining that a set mutator method was called, the undefined mutator method (whose name has been saved in the variable $AUTOLOAD) is defined. The new subroutine gets the object from the arguments by the first call to shift and sets the attribute of the object to the new value, which it gets from the arguments by the second call to shift. After defining the new mutator method, the code actually sets the attribute key to the $newvalue that was passed in as an argument.

Finally, the AUTOLOAD subroutine, after defining the new accessor or mutator method, as the case may be, and setting the new value of the attribute if a mutator method has been defined, returns the value of the attribute:

 # return the attribute value
 return $self->{$attribute};

So the AUTOLOAD method both defines the accessor or mutator methods and behaves just like the defined accessor or mutator method by returning the attribute value (if it's a mutator, it first resets the attribute).
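Putting the pieces of this section together gives something like the following sketch. It is not the Gene3.pm source: GeneSketch is a hypothetical class reduced to two attributes, and the empty DESTROY method is an addition that keeps object destruction from being routed through AUTOLOAD.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# GeneSketch is a hypothetical stand-in for Gene3.pm with two attributes.
package GeneSketch;

use Carp;
our $AUTOLOAD;

sub new {
    my ($class, %arg) = @_;
    return bless { _name => $arg{name}, _organism => $arg{organism} }, $class;
}

# Added for this sketch: an explicit (empty) DESTROY keeps object
# destruction from being handled by AUTOLOAD.
sub DESTROY { }

sub AUTOLOAD {
    my ($self, $newvalue) = @_;

    my ($operation, $attribute) = ($AUTOLOAD =~ /(get|set)(_\w+)$/);

    # Is this a legal method name?
    unless ($operation && $attribute) {
        croak "Method name $AUTOLOAD is not in the recognized form (get|set)_attribute\n";
    }
    unless (exists $self->{$attribute}) {
        croak "No such attribute '$attribute' exists in the class ", ref($self);
    }

    # Symbolic references are needed to install the new method.
    no strict 'refs';

    if ($operation eq 'get') {
        # define the accessor
        *{$AUTOLOAD} = sub { shift->{$attribute} };
    }
    elsif ($operation eq 'set') {
        # define the mutator, then set the new attribute value
        *{$AUTOLOAD} = sub { shift->{$attribute} = shift; };
        $self->{$attribute} = $newvalue;
    }

    # return the attribute value
    return $self->{$attribute};
}

package main;

my $gene = GeneSketch->new(name => 'Aging', organism => 'Homo sapiens');
print "Name: ", $gene->get_name, "\n";
$gene->set_name('Longevity');
print "Name: ", $gene->get_name, "\n";
```

The two print statements show "Name: Aging" and then "Name: Longevity". A call such as $gene->get_length croaks, because _length is not a key in the object hash.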

3.9.2.6 AUTOLOAD speedup

The so-called "magic" lines in the accessor and mutator code that I've referred to:

 *{$AUTOLOAD} = sub { shift->{$attribute} }; 

and:

 *{$AUTOLOAD} = sub { shift->{$attribute} = shift; }; 

are there purely in order to speed up the code.

AUTOLOAD performs its tasks a bit on the slow side. For a large program that does a lot of getting and setting of attributes, the slowdown is noticeable. What is saved in programming time by having AUTOLOAD handle all these accessors and mutators is lost in runtime. The slowdown comes from the work done on every call: the program has to figure out what is wanted by the undefined method and use regular expressions to parse the method name.

The magic lines actually define the new methods in the symbol table, on the fly, when they don't already exist. (The * gives access to the symbol table, but I'll omit the details of how the symbol table is defined and manipulated and stick to practicalities here.) After they are called once, and the AUTOLOAD overhead is incurred, the methods are thenceforth defined in the symbol table of the running program. So, for instance, the second time that the accessor method get_name is called, the program finds the definition in the symbol table, and AUTOLOAD isn't called. This results in a considerable speedup for the program overall.

I'll not delve too deeply into how this works. Briefly, the $AUTOLOAD variable contains the fully qualified name of the desired method, say, Gene3::get_name. The star * in *{$AUTOLOAD} refers to the symbol table entry for that name. Assigning the expression to the right of the assignment sign (=), an anonymous subroutine definition, to this symbol table entry makes it the definition of the method.

The symbol table is thus manipulated directly from your program, and the missing accessor and mutator definitions are installed in the symbol table the first time AUTOLOAD is called to handle them. After this first call that invokes AUTOLOAD, the program can find the method definitions in the symbol table and uses those definitions, bypassing AUTOLOAD. For more details, see O'Reilly's Programming Perl.
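The effect of installing the method can be observed by counting how often AUTOLOAD runs. In this sketch the Counter class and the $autoload_calls variable are hypothetical and exist only to make the calls visible; the can method, which every class inherits, reports whether a method is currently defined.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Counter is a hypothetical class; $autoload_calls exists only to
# make the number of AUTOLOAD invocations visible.
package Counter;

our $AUTOLOAD;
our $autoload_calls = 0;

sub new {
    my ($class) = @_;
    return bless { _name => 'Aging' }, $class;
}

sub DESTROY { }    # keep object destruction away from AUTOLOAD

sub AUTOLOAD {
    my ($self) = @_;
    $autoload_calls++;
    my ($attribute) = ($AUTOLOAD =~ /get(_\w+)$/)
        or die "Unrecognized method $AUTOLOAD\n";
    no strict 'refs';
    *{$AUTOLOAD} = sub { shift->{$attribute} };    # install the accessor
    return $self->{$attribute};
}

package main;

my $obj = Counter->new;
print "before: ", (Counter->can('get_name') ? 'defined' : 'undefined'), "\n";
$obj->get_name for 1 .. 3;
print "after: ", (Counter->can('get_name') ? 'defined' : 'undefined'), "\n";
print "AUTOLOAD ran $Counter::autoload_calls time(s)\n";
```

Although get_name is called three times, AUTOLOAD runs only once: the first call installs the accessor, and the later calls find it in the symbol table.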



Mastering Perl for Bioinformatics
ISBN: 0596003072
EAN: 2147483647
Year: 2003
Pages: 156
