Modules | Writing Perl Modules for CPAN

Chapter 2 - Perl Module Basics

by Sam Tregar

Apress © 2002

Modular programming requires two facilities in a programming language: encapsulation and interfaces. Packages provide encapsulation by furnishing separate namespaces for subroutines and variables. Modules are Perl's facility for providing packages with interfaces. In actuality, Perl's support for interfaces is a set of conventions around the use of packages and module file naming. There is no module keyword^[6] and no extra syntax to learn.

You tell Perl your module's name in two ways. First, you name the module file. The filename is derived from the module name by replacing the :: marks with file-system separators and appending a .pm extension. For example, here are some module names and their associated filenames on UNIX and Windows systems:

Table 2-1: Examples of Module Names Converted to Filenames on UNIX and Windows Systems

Module Name          UNIX Filename         Windows Filename
CGI                  CGI.pm                CGI.pm
HTML::Template       HTML/Template.pm      HTML\Template.pm
Scalar::List::Utils  Scalar/List/Utils.pm  Scalar\List\Utils.pm

Secondly, at the top of your module file you declare the name in a package line:

 package Hello;

 sub greet {
     my $name = shift;
     print "Hello, $name!\n";
 }

 1;

Tip

Perl modules must end with a true statement, usually a bare 1;. This tells Perl that your module file loaded successfully. If you leave off the true statement, loading the module fails with an error such as "Hello.pm did not return a true value".

If you place the preceding code in a file called Hello.pm, then you can use the module in a script placed in the same directory as Hello.pm:

 #!/usr/bin/perl
 use lib '.';
 use Hello;
 Hello::greet("World");

This produces the following output:

 Hello, World!

Most of the code example should look familiar, but the use lib line might be new. I'll explain that in the next section.

Module Names

A module's name is its introduction. If you choose good names for your modules, you'll rarely have to answer the question "What does it do?" For example, programmers rarely ask me what HTML::Template does, but HTML::Pager draws inquiries on every mention.

Perl modules must have unique names. Having two modules with the same name will cause difficulties. This is similar to machines on the Internet: if there were two Web sites called http://www.cpan.org, how would a browser know where to send you?^[7]

An easy solution to the problem of finding a unique name is to use a multipart name. Module names can be composed of parts delimited by double colons (::). Most modules have names with two parts and some have three: Inline::C, Scalar::List::Utils, Parse::RecDescent, CGI::Application. Following this practice is a good idea because it keeps your module names short enough to remember easily. If you do use a long name, be careful to choose a name within a hierarchy that will make it easy for others to find.

Many organizations use a common prefix for all their internal modules. For example, Vanguard Media (http://www.vm.com) creates its internal modules under the name "Dynexus": Dynexus::Template::File, Dynexus::Class::Base, and so on. This keeps the internal module names from conflicting with the names of externally produced modules. If you are creating private modules, you should consider a similar naming convention.

This is similar to the system used by Java, where class names are preceded by the reversed domain name of their creators. For example, Java classes written by Sun have names beginning with "com.sun". The intent is the same, namely that module names never accidentally conflict, but the Perl system is considerably simpler and results in shorter names. Of course, if you'd like to create a module called Com::Acme::AutomaticDogCatcher, you can.

How Perl Finds Modules

Let's take a brief detour into Perl mechanics. You need to know how Perl finds modules before you can start writing your own. When Perl encounters a use statement during compilation, it turns the module name into a filename as described earlier in this chapter. For example, Scalar::List::Utils becomes Scalar/List/Utils.pm. Next, Perl uses the global array @INC^[8] to find a list of candidate directories to look for Scalar/List/Utils.pm. When Perl finds a module file, it is immediately compiled. You can find out what your Perl's default @INC is with this command:

 perl -e 'print join("\n", @INC) . "\n";'

One way to use modules is to put them into one of the listed directories, usually one with site_perl in the name. This is what happens when you install a module from CPAN. Another way is to modify @INC before Perl starts looking for modules, so that it includes a different directory where you store your modules. An easy way to do that is through the use lib statement shown earlier. A use lib statement prepends a directory onto @INC at compile time.
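The lookup described above can be sketched in a few lines of Perl. This is only an approximation of what Perl does internally (the real search also handles hooks placed in @INC and other details), and the helper names module_to_path() and find_module() are made up for this illustration:

```perl
use strict;
use warnings;

# Convert a module name such as "File::Find" into the relative
# path Perl looks for, such as "File/Find.pm".
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

# Walk @INC in order and return the first matching file,
# or nothing if no directory contains the module.
sub find_module {
    my ($module) = @_;
    my $relative = module_to_path($module);
    for my $dir (@INC) {
        next if ref $dir;    # skip hook references that may live in @INC
        my $candidate = "$dir/$relative";
        return $candidate if -f $candidate;
    }
    return;
}

my $found = find_module('File::Find');
print defined $found
    ? "File::Find would be loaded from $found\n"
    : "File::Find was not found in \@INC\n";
```

Because the loop stops at the first match, a directory that appears earlier in @INC shadows later ones. That is why use lib, which prepends, lets a private copy of a module win over an installed one.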

For example, if you have a private modules directory in your home directory^[9] called modules, you could start your programs with the following:

 use lib '/home/sam/modules';

You can see the effect of this command on @INC by printing it out after a use lib:

 use lib '/home/sam/modules';
 print join("\n", @INC) . "\n";

Of course, this code will only work if your home directory is called "/home/sam". You can use the following to pull the home directory out of the environment:

 use lib "$ENV{HOME}/modules";

But this won't work:

 $home_dir = $ENV{HOME};
 use lib "$home_dir/modules";

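If you do something like this, $home_dir is still empty when the use lib line is processed, so Perl adds /modules instead of the directory you wanted. With the variable on its own (use lib $home_dir;), you'll receive the following warning: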

 Empty compile time value given to use lib

The problem is that Perl processes use statements at compile time, but the variable assignment to $home_dir happens at runtime. Perl needs to know where to look for modules at compile time so that it can find the modules to compile; runtime is much too late. One way to solve this problem is to ask Perl for a little runtime before compile time is over with BEGIN:

 BEGIN { $home_dir = $ENV{HOME}; }
 use lib $home_dir;
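To make the ordering concrete, here is a small sketch that records when each piece of code runs. The @order array is just a name chosen for this demonstration:

```perl
use strict;
use warnings;

our @order;    # package variable, so the BEGIN block can see it

push @order, 'first runtime statement';

# Although it appears second in the file, this block runs as soon as
# Perl has compiled it, before any runtime statement executes.
BEGIN { push @order, 'BEGIN block' }

push @order, 'second runtime statement';

print "$_\n" for @order;
```

The BEGIN entry comes out first even though it is written after the first push. A use statement behaves the same way, which is why anything it depends on must also be set up at compile time.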

Of course, you can also modify @INC directly, which also needs to be in a BEGIN block to be useful:

 BEGIN { unshift(@INC, "/home/sam/modules"); }

The preceding line is equivalent to use lib "/home/sam/modules". In general, use lib is the preferred method of adding a custom library path to your programs.

Once Perl has loaded a module, it creates an entry in the global hash %INC. The keys of this hash are module filenames (for example, File/Find.pm), and the values are the full paths to the files loaded for the modules (for example, /usr/local/lib/perl5/5.6.1/File/Find.pm). You can use this hash to get a list of loaded modules and where they were loaded from:

 print map { "$_ => $INC{$_}\n" } keys %INC;

This can be very useful as a debugging aid when you're not sure Perl is picking up the right version of a module. Perl uses %INC to avoid loading a module file more than once.
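Here's a minimal sketch that shows %INC preventing a second load. It writes a throwaway module to a temporary directory; the module name LoadOnce and its loads() subroutine are invented for the demonstration:

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a small module file into a temporary directory.
my $dir = tempdir(CLEANUP => 1);
open(my $fh, '>', "$dir/LoadOnce.pm") or die "Unable to write module : $!";
print $fh <<'END_MODULE';
package LoadOnce;
our $loads;
$loads++;                      # counts how often this file is executed
sub loads { return $loads }
1;
END_MODULE
close($fh) or die "Unable to close module file : $!";

# Changing @INC at runtime is fine here because require also runs at runtime.
unshift @INC, $dir;

require LoadOnce;
require LoadOnce;              # already listed in %INC, so nothing happens

print "module file executed ", LoadOnce::loads(), " time(s)\n";
print "loaded from $INC{'LoadOnce.pm'}\n";
```

The counter stays at one because the second require finds LoadOnce.pm in %INC and returns without touching the file again.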

Functional Modules

The most obvious way to build a module is to place subroutines in the module and document them as the module's interface. For example, here's a module that provides a logging facility for a fictional application called BOA:

 package BOA::Logger;

 $LOG_LEVEL = 1; # default log level is 1

 # open log file
 sub open_log {
     my $filename = shift;
     open(LOG_FILE, ">>$filename") or die "Unable to open $filename : $!";
     print LOG_FILE "BOA log started: " . localtime(time) . "\n";
 }

 # set logging level
 sub log_level { $LOG_LEVEL = shift; }

 # write a log message if level is set high enough
 sub write_log {
     my ($level, $message) = @_;
     print LOG_FILE "$message\n" if $level <= $LOG_LEVEL;
 }

 1;

Caution

A real logging module would use flock() to prevent file corruption, but that would make these examples twice as long! The code in this chapter is kept as simple as possible; real production code would need significant enhancement.

The concept for the module is simple: BOA::Logger will provide logging at varying levels of detail known as log levels. The module's interface consists of three subroutines: open_log(), log_level(), and write_log(). The application must call open_log() before the first call to write_log(). When a piece of code calls write_log(), it provides two arguments, $level and the message itself, $message. If $level is less than or equal to the currently set log level, the message is printed to the log. The log level defaults to 1, and the application can change the value using the log_level() subroutine.

Notice how the package variable $LOG_LEVEL is used to maintain state between calls to log_level() and write_log(). By state I mean that the module contains variables that store the value of past operations between calls to the interface. Thus the state of the module changes over time as the interface is used.

Here's a possible usage of the module, which would go in a separate script file:

 # use the module
 use BOA::Logger;

 # open the log file
 BOA::Logger::open_log("logs/boa.log");

 # set the log level higher
 BOA::Logger::log_level(10);

 # write a log entry at level 5
 BOA::Logger::write_log(5, "Hello log reader.");

 # write a log entry at level 15 - this won't be printed to the log
 BOA::Logger::write_log(15, "Debugging data here.");

Exporting

BOA::Logger is useful enough, but it could be improved. For one thing, using the module takes too much typing. To solve this problem, you can use the Exporter. The Exporter enables you to export symbols from your module into the package of the calling code. Exporting makes an entry in the calling package's symbol table that points to the symbol in the exporting package. To export the three subroutines in BOA::Logger, you would change the top of the module source file, BOA/Logger.pm, to read as follows:

 package BOA::Logger;
 require Exporter;
 @ISA = qw(Exporter);
 @EXPORT = qw(open_log log_level write_log);

The second line loads the Exporter module; require is commonly used here, but use also works. The third line accesses Perl's inheritance mechanism. I'll describe inheritance in more detail in the "Object-Oriented Modules" section, but for now you can just treat it as magic code that makes the Exporter work. Finally, the @EXPORT array is initialized with the list of symbols to export.

Now code that uses BOA::Logger can dispense with the package name:

 use BOA::Logger;
 open_log("logs/boa.log");
 log_level(10);
 write_log(5, "Hello log reader.");
 write_log(15, "Debugging data here.");

Of course, the full package specification would still work. You can always refer to BOA::Logger::write_log().
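The following single-file sketch shows what exporting does to the caller's symbol table. The package Demo::Exporting and its shout() subroutine are hypothetical stand-ins for BOA::Logger, and the explicit import() call does by hand what a use statement would do after loading the module file:

```perl
use strict;
use warnings;

package Demo::Exporting;
require Exporter;
our @ISA    = qw(Exporter);
our @EXPORT = qw(shout);

sub shout { return uc(shift) . "!" }

package main;

# "use Demo::Exporting;" would call this automatically at compile time.
Demo::Exporting->import;

# The short name now works in package main...
print shout("hello"), "\n";

# ...because main::shout refers to the very same subroutine.
print "main::shout and Demo::Exporting::shout are the same subroutine\n"
    if \&main::shout == \&Demo::Exporting::shout;
```

Nothing is copied: the exported name is simply a second entry pointing at the one subroutine, which is why the fully qualified name keeps working alongside the short one.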

Now, BOA is a big application. In fact, BOA stands for big ol' application, so many other modules will be using BOA::Logger. Most of these modules will only be calling write_log(). The only code that will call open_log() and log_level() is the startup code. Fortunately, users of the module can choose which symbols they want exported by providing a list of symbols to use:

 use BOA::Logger qw(write_log);

Without this addition, a use BOA::Logger will import all exported symbols. To import nothing from a module that exports symbols by default, use an empty list:

 use BOA::Logger ();

Subroutines aren't the only thing you can export. Variables can also be exported. For example, BOA::Logger could omit the log_level() subroutine and just export $LOG_LEVEL directly:

 @EXPORT = qw(open_log $LOG_LEVEL write_log);

Now code that wants to set the logging level can import the $LOG_LEVEL variable and manipulate it directly:

 use BOA::Logger qw($LOG_LEVEL write_log);
 $LOG_LEVEL = 10;
 write_log(10, "Log level set to 10.");

I'll return to the Exporter to provide more details in the next chapter.

BEGIN

Another problem with the BOA::Logger module is that other modules have to wait for open_log() to get called before they can use write_log(). This makes it difficult for modules to log their compilation and initialization. To solve this problem, the module could be changed to automatically open the log file as soon as possible, during compile time. To cause code to be run at compile time, move the code from open_log() into a BEGIN block:

 BEGIN {
     open(LOG_FILE, ">>logs/boa.log") or die "Unable to open log : $!";
     print LOG_FILE "BOA log started: " . localtime(time) . "\n";
 }

Now the log file is opened as soon as the BOA::Logger module is compiled. The downside here is that the location of the log file is hard-coded into BOA::Logger.

END

It is often useful to know when an application exited. BOA::Logger can provide this by registering an action to take place when the application exits. This is done with an END block, the opposite of the BEGIN block described earlier.

 END {
     print LOG_FILE "BOA log exited: " . localtime(time) . "\n";
     close LOG_FILE or die "Unable to close logs/boa.log : $!";
 }

As an added bonus I get to feel like a good citizen by closing the LOG_FILE file handle instead of letting global destruction do it for me. Global destruction refers to the phase in a Perl program's life when the interpreter is shutting down and will automatically free all resources held by the program. END blocks are often used to clean up resources obtained during BEGIN blocks.

Error Reporting

BOA::Logger is a careful module: it always checks to make sure system calls like open() and close() succeed. When they don't, BOA::Logger calls die(), which will cause the program to exit if not caught by an eval.^[10] This is all well and good, but unfortunately the error messages generated aren't very helpful; they make it look as though there's a problem in BOA::Logger. For example, if you call open_log() on a file that can't be opened, you'll receive the following error message:

 Unable to open /path/to/log : No such file or directory at BOA/Logger.pm line 8.

When my fellow BOA developers see this message, they'll likely jump to the conclusion that there's something wrong with BOA::Logger. They'll send me angry e-mails and I'll be forced to sign them up for spam.^[11] Nobody wants that, and thankfully the situation can be avoided. The Carp module, which comes with Perl, can be used to place the blame where it belongs. Here's a new version of the module header and open_log() using Carp:

 package BOA::Logger;

 use Carp qw(croak);

 sub open_log {
     my $filename = shift;
     open(LOG_FILE, ">>$filename") or croak("Unable to open $filename : $!");
     print LOG_FILE "BOA log started: " . localtime(time) . "\n";
 }

Now the blame is properly placed and the resulting error is

 Unable to open /path/to/log : No such file or directory at caller.pl line 5

The croak() routine provides a die() replacement that assigns blame to the caller of the subroutine. The Carp module also provides a warn() replacement called carp(), as well as routines to generate full back traces. See the Carp documentation for more details; you can access it with the command perldoc Carp.
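You can watch croak() shift the blame with a short self-contained script. The package Demo::Checker and its must_be_positive() subroutine are invented for this sketch; the point is only where the reported line number comes from:

```perl
use strict;
use warnings;

package Demo::Checker;
use Carp qw(croak);

# Dies with a message that points at the caller, not at this line.
sub must_be_positive {
    my ($number) = @_;
    croak("Argument must be positive, got $number") unless $number > 0;
    return $number;
}

package main;

my $call_line = __LINE__ + 1;    # line number of the call on the next line
my $ok = eval { Demo::Checker::must_be_positive(-1); 1 };
my $error = $ok ? '' : $@;

print "caught: $error";
print "the message blames line $call_line, where the call was made\n"
    if $error =~ /line $call_line\.$/;
```

Had must_be_positive() used die() instead, the message would name the line inside Demo::Checker, which is exactly the misleading report described above.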

Object-Oriented Modules

As previously mentioned, BOA is a big ol' application. In fact, it's so big that just one log file will not be enough. There are several subsystems (GUI, Network, Database, and so on) that each need their own log files with independent log levels. One way to address this would be to create a new package for each log file and copy and paste the code from BOA::Logger into each one, creating BOA::Logger::GUI, BOA::Logger::Network, and so on. This approach has some obvious drawbacks. The code becomes harder to maintain, since a change in one place has to be carefully replicated in each copy. Also, it would be difficult to use multiple BOA::Logger clones at the same time: they all want to export write_log(), so you'd have to forgo exporting and type the whole package name for every call.

There is an easier way. Instead of creating a new package just to hold some state information, you'll create an object-oriented module that provides an object for each log file. These objects will contain the state necessary to support a single log file as well as the functions needed to operate on this state data. This is the basic definition of an object: state data and functions to operate on that state wrapped up in one data structure. The benefits of object orientation are increased flexibility and improved potential for code reuse.

References: A Brief Refresher

Perl supports object-oriented programming through references. It's possible to do a lot of useful things with Perl without using a single reference. As a result you may be ready to learn object-oriented Perl without having ever used a single reference. I'll give you a quick refresher on the topic, but if you're left with questions, I suggest you head for a good introductory book on Perl for details.

A reference is simply a variable that points to another variable. By points to, I mean that you can follow the link from a reference to the variable it references. This action of following a reference is known as dereferencing.

Here's a simple example that prints "Hello, New World" using a reference to a scalar:

 $message = "Hello, New World.\n";
 $ref = \$message;
 print $$ref;

This example shows two important operations on references. First, a reference is created using the \ operator:

 $ref = \$message;

After this line, $ref points to $message. You can see this in action by changing $message and observing that the new value is visible through $ref:

 $message = "Goodbye, dear friend.";
 print $$ref; # prints "Goodbye, dear friend."

Second, the reference is dereferenced using a second $ in front of the reference:

 print $$ref;

You can create a reference to other types of variables, but the result is always stored in a scalar. For instance, this code prints "Hello, New World" using a reference to an array:

 @array = ("Hello,", "New", "World");
 $ref = \@array;
 print join(" ", @$ref) . "\n";

This example works similarly to the earlier example and uses an @ to dereference the reference to @array. This works fine for access to the whole array, but more often you'll want to pick out a single value:

 print $ref->[0] . "\n"; # prints "Hello,"

This syntax is known as arrow notation. You can use arrow notation with hashes as well. For example, here's another way to print "Hello, New World", this time using a reference to a hash:

 %hash = ( message => "Hello, New World" );
 $ref = \%hash;
 print $ref->{message} . "\n";

Finally, Perl contains operators to create array and hash references without requiring an intermediate variable. These are known as anonymous arrays and anonymous hashes, respectively. For example, the preceding example could be rewritten to use an anonymous hash:

 $ref = { message => "Hello, New World" };
 print $ref->{message} . "\n";

The curly braces ({}) produce a reference to an anonymous hash. Similarly, square brackets ([]) produce a reference to an anonymous array:

 $ref = [ "Hello", "New", "World" ];
 print join(" ", @$ref) . "\n";

References are often used to implement call-by-reference in subroutines. Call-by-reference means that the subroutine takes a reference as a parameter and acts on the data pointed to by the reference. For example, here's a function that takes a reference to an array of words and uppercases them:

 sub upper {
     my $words = shift;
     $words->[$_] = uc($words->[$_]) for (0 .. $#$words);
 }

Notice that this subroutine doesn't return anything useful; it works by modifying the array pointed to by the reference passed as the first parameter. Here's an example of how upper() would be used:

 my @words = ("Hello,", "New", "World");
 upper(\@words);
 print join(" ", @words) . "\n"; # prints "HELLO, NEW WORLD"

Object Vocabulary

Object-oriented (OO) programming has a language all its own. Fortunately for us, Perl provides a simple translation from the OO lexicon to everyday Perl.^[12] See Table 2-2 for a cheat sheet. Don't worry if this vocabulary isn't immediately clear; I'll provide more explanation as we go.

Table 2-2: OO Vocabulary Cheat Sheet

OO             Perl
Class          Package
Object         A reference blessed into a package
Method         A subroutine in a class
Object method  A method that expects to be called using an object
Class method   A method designed to be called using a class
Constructor    A class method that returns a new object

Using OO Modules

Before I show you the details of creating an OO module, it helps to know how to use one. Here's an example using IO::File, an OO wrapper around Perl's file operators (open, print, seek, and so on) included with Perl:

 use IO::File;

 # create a new IO::File object for writing "file.txt"
 my $filehandle = IO::File->new(">file.txt");

 # print to the file
 $filehandle->print("This line goes into file.txt\n");

 # close the file
 $filehandle->close();

The three subroutine calls (new(), print(), and close()) are examples of method calls. Method calls are the bread and butter of object-oriented programming, and in typical Perl fashion, there's more than one way to do it. The preceding example uses the arrow operator, ->. The left-hand side of the arrow operator must be either a package name (such as IO::File) or an object (such as $filehandle). The right-hand side is the name of a subroutine to call.

Methods automatically receive an extra initial parameter: the value on the left-hand side of the arrow operator. You can imagine that the call to new() is translated into the following:

 my $filehandle = IO::File::new("IO::File", ">file.txt");

But you shouldn't write it that way, because using method call syntax enables Perl's inheritance to work. I'll describe inheritance in more detail later.
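A tiny sketch makes the extra parameter visible. The package Demo::Echo and its args() subroutine are hypothetical; args() just hands back whatever it was called with:

```perl
use strict;
use warnings;

package Demo::Echo;

# Returns its argument list unchanged so the caller can inspect it.
sub args { return @_ }

package main;

# Method call: Perl supplies the package name as the first argument.
my @class_call = Demo::Echo->args("x");

# Plain subroutine call: only the arguments you wrote are passed.
my @plain_call = Demo::Echo::args("x");

print "method call received: @class_call\n";
print "plain call received:  @plain_call\n";
```

The method call receives two values, the string "Demo::Echo" followed by "x", while the plain call receives only "x".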

Perl offers another method call syntax known as indirect object syntax. Here's the code from the example rewritten to use indirect object method calls:

 my $filehandle = new IO::File ">file.txt";
 print $filehandle "This line goes into file.txt\n";
 close $filehandle;

In this style, the method name comes first, followed by either a package name or an object. Both calling styles result in the same method invocation; the extra initial argument is supplied to the method subroutine in both cases. Choosing which one to use is largely a matter of preference, although many Perl programmers prefer the arrow notation since it is less visually ambiguous. Furthermore, Perl itself occasionally has trouble parsing indirect object syntax. For these reasons, I'll be using arrow notation in my examples from this point forward.

Caution

C++ programmers take note: there is nothing special about methods named new(). It is only by convention that constructors are often named new().

A method that is called using a package name is a class method. A method called with an object is an object method. Class methods are used to provide services that are not specific to any one object; object construction is the most common example but I'll explore others in the next sections.

The Class

A class in Perl is nothing more than a package that happens to have subroutines meant to be used as methods. Here's an example of BOA::Logger transformed into a class.

 package BOA::Logger;

 use Carp qw(croak);
 use IO::File;

 # constructor - returns new BOA::Logger objects
 sub new {
     my ($pkg, $filename) = @_;

     # initialize $self as a reference to an empty hash
     my $self = {};

     # open the log file and store IO::File object in $self->{fh}
     my $filehandle = IO::File->new(">>$filename");
     croak("Unable to open $filename : $!") unless $filehandle;

     # print startup line
     $filehandle->print("BOA log started: " . localtime(time) . "\n");

     # store the filehandle in $self
     $self->{fh} = $filehandle;

     # set default log_level of one
     $self->{level} = 1;

     # bless $self as an object in $pkg and return it
     bless($self, $pkg);
     return $self;
 }

 # level method - changes log level for this log object
 sub level {
     my ($self, $level) = @_;
     $self->{level} = $level;
 }

 # write method - writes a line to the log file if log-level is high enough
 sub write {
     my ($self, $level, $message) = @_;
     $self->{fh}->print($message) if $level <= $self->{level};
 }

 1;

The module begins by using two modules you've met before: Carp and IO::File. Next, the first subroutine, new(), is defined. This is the constructor, a class method that returns new objects. new() receives two arguments: the name of the package and the filename to open.

The object itself is just a hash underneath. Most objects in Perl are really hashes, but it's possible to create objects based on anything you can make a reference to. Hashes are used so often because of their inherent flexibility. In this case, the hash contains two keys, "fh" and "level". The "fh" key contains an open IO::File object for the log file. The "level" key is set to the default log level of 1. Data elements kept in an object are known as the object's attributes.

So far so good, but what about that last section:

 bless($self, $pkg);
 return $self;

The call to bless()^[13] tells Perl that $self is an object in the package named $pkg. This is how a reference becomes an object. After this point, methods can be called using the object, and they will result in subroutine calls in $pkg, which is BOA::Logger in this case. A call to ref($self) will return $pkg ("BOA::Logger") after blessing. Finally, since this is a constructor, the new object is returned to the caller.
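Here's a minimal sketch of that transformation, using a made-up class called Demo::Tag so that no log file is needed. It prints what ref() reports before and after the call to bless():

```perl
use strict;
use warnings;

package Demo::Tag;

sub new {
    my ($pkg, $label) = @_;
    my $self = { label => $label };

    print "before bless: ", ref($self), "\n";    # a plain hash reference
    bless($self, $pkg);
    print "after bless:  ", ref($self), "\n";    # now an object in $pkg

    return $self;
}

# object method - only callable once the reference has been blessed
sub label {
    my $self = shift;
    return $self->{label};
}

package main;

my $tag = Demo::Tag->new("urgent");
print "label is ", $tag->label(), "\n";
```

Before blessing, ref() reports HASH; afterward it reports Demo::Tag, and the arrow operator knows which package to search for label().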

Methods all share a common structure. They receive their $self object as an automatic first argument and any additional arguments after that. The two methods here, level() and write(), work with the data stored in the $self hash. The contents of the $self hash are known as instance data. Instance data is different for each instance (a fancy word for object) of this class.

Here's an example of using the module, which would be placed in a separate script file:

 use BOA::Logger;

 my $logger = BOA::Logger->new('logs/boa.log');
 $logger->level(10);
 $logger->write(10, "Hello world!");

One thing to notice is that making the module object oriented allows you to simplify the names of the subroutines in BOA::Logger. This is because object-oriented modules should never export their methods. Thus there's no need to worry about confusion with other subroutines called level() and write(). Another advantage of the object-oriented BOA::Logger is that you can have multiple loggers active at the same time with different log files and different log levels.

Accessors and Mutators

The level() method shown earlier is called a mutator: it is used to change, or mutate, the value of the level attribute. It is not an accessor, since it doesn't allow the user to query the current value of the level attribute. An accessor for the value of level could potentially be useful, since a user of the module could avoid executing costly debugging code if the log level is set too low to show the results. Here's a new level() method that functions as both an accessor and a mutator:

 sub level {
     my ($self, $level) = @_;
     $self->{level} = $level if @_ == 2;
     return $self->{level};
 }

Now it's possible to call the level() method with no arguments to receive the current value of the level attribute. For example, this checks the log level before calling write():

 if ($logger->level() >= 5) {
     $logger->write(5, "Here's the full state of the system: " . dump_state());
 }

This way you can avoid calling dump_state() if the result will never be printed.

Writing accessor-mutators for each attribute in your object enables you to perform checks on the value being set. For example, it might be useful to verify that the level value is a nonnegative integer. One way to do this is to check it with a regular expression that only matches digits:

 sub level {
     my ($self, $level) = @_;
     if (@_ == 2) {
         croak("Argument to level() must be a non-negative integer!")
             unless $level =~ /^\d+$/;
         $self->{level} = $level;
     }
     return $self->{level};
 }

It might seem convenient to allow users to simply access the hash keys directly:

 $logger->{level} = 100; # works, but not a good idea

The problem with this is that it breaks the encapsulation of your class. You are no longer free to change the implementation of BOA::Logger: you can't change the class to use an array underneath or change the keys of the hash. Also, you can't perform any checking of the value set for an attribute. As a general rule, all access to an object-oriented class should be through methods, either class methods or object methods.

Destructors

The non-OO version of BOA::Logger had a useful feature that this version lacks: it prints a message when the program exits. You can provide this by setting up a destructor for the class. Destructors are the opposite of constructors; they are called when an object is no longer being used.^[14] They can perform cleanup actions, such as closing file handles. To create a destructor, simply define a method called DESTROY.

 sub DESTROY {
     my $self = shift;
     $self->write($self->{level}, "BOA log exited: " . localtime(time) . "\n");
     $self->{fh}->close() or die "Unable to close log file : $!";
 }
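To see when a destructor actually fires, here's a small sketch with a made-up class, Demo::Noisy, that records events in a list instead of writing to a log file:

```perl
use strict;
use warnings;

package Demo::Noisy;

my @events;    # shared record of what happened, in order

sub new {
    my ($pkg, $name) = @_;
    return bless({ name => $name }, $pkg);
}

# called automatically when the last reference to an object goes away
sub DESTROY {
    my $self = shift;
    push @events, "destroyed $self->{name}";
}

sub note   { push @events, $_[0] }
sub events { return @events }

package main;

Demo::Noisy::note("entering block");
{
    my $object = Demo::Noisy->new("short-lived");
    Demo::Noisy::note("object in use");
}    # $object goes out of scope here, so DESTROY runs immediately
Demo::Noisy::note("left block");

print "$_\n" for Demo::Noisy::events();
```

The "destroyed short-lived" entry lands between "object in use" and "left block": Perl runs DESTROY as soon as the last reference disappears rather than waiting for the program to end.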

Class Data

By now you know that BOA is a big ol' application. As such, there are many modules that will want to write to the same log file. With the example OO implementation, this means that each client module will create its own BOA::Logger object. This will have a number of unpleasant side effects. First, when each BOA::Logger object is destroyed, it will write its own "BOA log exited" message. Second, each BOA::Logger object will consume a file handle. Many systems limit the number of open file handles a process can have, so it's best not to use more than necessary.

We can solve this problem using, you guessed it, class data. Class data is data that is stored at the class level and is not associated with any specific object. In Perl, class data is supported through package-scoped variables. It can be used to maintain state separate from each object's own state. Common uses of class data include keeping track of the number of objects created or the number of objects still alive. In this case, you'll use a hash called %CACHE to maintain a cache of BOA::Logger objects:

 # constructor - returns new BOA::Logger objects
 sub new {
     my ($pkg, $filename) = @_;

     # lookup $filename in %BOA::Logger::CACHE - if an entry exists, return it
     return $CACHE{$filename} if $CACHE{$filename};

     # initialize $self as a reference to an empty hash
     my $self = {};

     # store in %CACHE
     $CACHE{$filename} = $self;

     # ... same as previous example ...
 }

When new() is called, it first checks the cache to see whether it already has a BOA::Logger object for the filename. If it does, the existing object is immediately returned. If not, the new object is stored in the cache for future lookups.

This works, but it causes a subtle change in BOA::Logger's behavior. After adding the cache, DESTROY is only called at program exit, rather than when the last reference to a BOA::Logger object goes out of scope. This is because objects aren't destroyed until the last reference to them goes out of scope; %CACHE maintains a reference to every object created by new() and as a package variable it never goes out of scope. This might be acceptable behavior, but if it's not, you could fix it by using the WeakRef module.^[15] WeakRef provides weaken(), which enables you to create references that don't prevent objects from being destroyed. This version will allow the BOA::Logger objects to be destroyed as soon as possible:

 use WeakRef qw(weaken);

 # constructor - returns new BOA::Logger objects
 sub new {
     my ($pkg, $filename) = @_;

     # lookup $filename in %BOA::Logger::CACHE - if an entry exists, return it
     return $CACHE{$filename} if $CACHE{$filename};

     # initialize $self as a reference to an empty hash
     my $self = {};

     # store in %CACHE
     $CACHE{$filename} = $self;
     weaken($CACHE{$filename});

     # ... same as previous example ...
 }
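The effect of weaken() is easy to check in isolation. The sketch below makes one substitution you should be aware of: it takes weaken() from Scalar::Util, which ships with current versions of Perl, rather than from the WeakRef module used in the text. The fetch() subroutine and %cache hash are stand-ins for new() and %CACHE:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);    # assumption: used here in place of WeakRef

my %cache;

# Returns the cached item for $key, creating it on first use.
sub fetch {
    my ($key) = @_;
    return $cache{$key} if $cache{$key};

    my $item = { key => $key };
    $cache{$key} = $item;
    weaken($cache{$key});       # the cache no longer keeps the item alive
    return $item;
}

my $first  = fetch("logs/boa.log");
my $second = fetch("logs/boa.log");
print "both calls returned the same item\n" if $first == $second;

# Drop the only strong references; the item is freed right away.
undef $first;
undef $second;

print defined $cache{"logs/boa.log"}
    ? "cache still holds the item\n"
    : "cache entry was cleared automatically\n";
```

Once the last strong reference is gone, Perl sets the weakened cache entry to undef, so the existing "if $cache{$key}" test simply falls through and builds a fresh item on the next request.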

Inheritance

BOA::Logger is a simple module, but simple doesn't last. As more BOA developers start using BOA::Logger, requests for new features will certainly start piling up. Satisfying these requests by adding new features to the module might be possible, but the effect on performance might be severe. One solution would be to create a new module called BOA::Logger::Enhanced that supported some enhanced features and just copy the code in from BOA::Logger to get started. This has an unpleasant consequence: The code would be harder to maintain since bugs would need to be fixed in two places at once.

There is a better way. Object-oriented classes can be enhanced using inheritance. Inheritance enables one module to be based on one (or more) classes known as parent or base classes. The new derived class that inherits from the parent class is known as the child class. Here's an example module, called BOA::Logger::Enhanced, that inherits from BOA::Logger:

 package BOA::Logger::Enhanced; use BOA::Logger; @ISA = qw(BOA::Logger);

By assigning "BOA::Logger" to the package variable @ISA, the module tells Perl that it is inheriting from the BOA::Logger class. This variable is pronounced "is a" and refers to the fact that a BOA::Logger::Enhanced object "is a" BOA::Logger object. Inheritance relationships are known as "is a" relationships.

To provide the advertised enhanced functionality, the class will override the write() method. Overriding is when a child class replaces a parent class's method. Here's a new write() method that puts a timestamp on every log line. This code would be placed in BOA/Logger/Enhanced.pm:

 sub write {    my ($self, $level, $message) = @_;    $message = localtime(time) . " : " . $message;    $self->{fh}->print($message) if $level <= $self->{level}; }

The method modifies the $message parameter to contain a timestamp and then prints out the line in the same way as the original BOA::Logger::write(). Here's an example using the new module:

 use BOA::Logger::Enhanced; my $logger = BOA::Logger::Enhanced->new("logs/boa.log"); $logger->level(10); $logger->write(10, "The log level is at least 10!");

When BOA::Logger::Enhanced->new() is called, Perl first looks in the BOA::Logger::Enhanced package to see if a subroutine called new() is defined. When it finds that there is no BOA::Logger::Enhanced::new(), Perl checks to see if @ISA is defined and proceeds to check each package name listed in @ISA for the required method. When it finds BOA::Logger::new(), it calls the subroutine with two arguments, BOA::Logger::Enhanced and logs/boa.log. BOA::Logger::Enhanced gets assigned to $pkg in BOA::Logger::new() and used in the call to bless():

 bless($self, $pkg);

The result is that BOA::Logger::new() returns an object in the BOA::Logger::Enhanced class without needing to know anything about BOA::Logger::Enhanced! Isn't Perl great?

Caution

Don't be fooled by the similar class names-no automatic inheritance is happening between BOA::Logger and BOA::Logger::Enhanced. Inheritance must be explicitly declared through @ISA to be used.
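The lookup through @ISA and the two-argument bless() can be seen working together in a minimal, self-contained sketch. The Demo::Logger classes below are hypothetical stand-ins that collect messages in memory rather than writing to a file.

```perl
use strict;
use warnings;

package Demo::Logger;    # hypothetical parent class

sub new {
    my ($pkg, $filename) = @_;
    my $self = { filename => $filename, lines => [] };

    # $pkg is whatever class new() was called on, possibly a child class
    return bless($self, $pkg);
}

sub write {
    my ($self, $message) = @_;
    push @{ $self->{lines} }, $message;
}

package Demo::Logger::Enhanced;    # hypothetical child class
our @ISA = ('Demo::Logger');

# override write() to tag every message
sub write {
    my ($self, $message) = @_;
    push @{ $self->{lines} }, "[enhanced] $message";
}

package main;

# new() is not defined in the child, so Perl finds it through @ISA
my $logger = Demo::Logger::Enhanced->new("logs/demo.log");
$logger->write("hello");

print ref($logger), "\n";           # Demo::Logger::Enhanced
print $logger->{lines}[0], "\n";    # [enhanced] hello
```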

UNIVERSAL

All classes in Perl implicitly inherit from a common base class-UNIVERSAL. The UNIVERSAL class provides three methods that can be used on all objects-isa(), can(), and VERSION().

The isa() method can be used to determine if an object belongs to a particular class or any child of that class. This is preferable to using ref() to check the class of an object since it works with inheritance. For example, the following code prints "Ok":

 my $logger = BOA::Logger::Enhanced->new("logs/boa.log"); print "Ok" if $logger->isa('BOA::Logger');

but this similar code does not:

 my $logger = BOA::Logger::Enhanced->new("logs/boa.log"); print "Ok" if ref($logger) eq 'BOA::Logger';

This is because ref() returns the name of the class that the object belongs to which is BOA::Logger::Enhanced. Even though BOA::Logger::Enhanced inherits from BOA::Logger, that won't make eq return true when comparing them as strings. The moral here is simple: Don't use ref() to check the class of objects, use isa() instead.

To check if an object supports a method call, use can(). You can use can() to provide support for older versions of modules while still taking advantage of the newer features. For example, imagine that at some point in the future BOA::Logger::Enhanced adds a method set_color() that sets the color for the next line in the log file. This code checks for the availability of the set_color() method and calls it if it is available:

    if ($logger->can('set_color')) {
       $logger->set_color('blue');
    }
    $logger->write(1, "This might be blue, or it might not!");

Another way to query the features provided by a module is to use the VERSION() method. With no arguments, this method looks at the $VERSION variable defined in the class package and returns its value. If you pass an argument to VERSION(), then the method will check if the class's $VERSION is greater than or equal to the argument and die() if it isn't. This form is used by use when use is passed a version number. For example, this statement calls BOA::Logger->VERSION(1.1) after BOA::Logger is compiled and exits with an error message if BOA::Logger's $VERSION is lower than 1.1:

 use BOA::Logger 1.1;

To support this usage, BOA::Logger would need to be modified to initialize a $VERSION package variable:

 package BOA::Logger; $VERSION = 1.1;

Since these features are provided as method calls in a parent class, child classes can override them and provide their own implementations. This enables classes to lie to the rest of the world about their inheritance, capabilities, and even version. In Perl, things are not always as they appear.
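A short sketch, using hypothetical Demo::Base and Demo::Child classes, shows the three UNIVERSAL methods side by side. The eval around VERSION(2.0) is only there to catch the die() so the script can keep running.

```perl
use strict;
use warnings;

package Demo::Base;     # hypothetical classes, for illustration only
our $VERSION = '1.1';

sub new { my $pkg = shift; return bless({}, $pkg) }

package Demo::Child;
our @ISA = ('Demo::Base');

sub set_color {
    my ($self, $color) = @_;
    return $self->{color} = $color;
}

package main;

my $obj = Demo::Child->new;

# isa() follows inheritance; ref() reports only the object's own class
print "isa Demo::Base\n"     if $obj->isa('Demo::Base');
print "ref is Demo::Child\n" if ref($obj) eq 'Demo::Child';

# can() returns a code reference when the method exists, false otherwise
print "can set_color\n"   if $obj->can('set_color');
print "cannot set_font\n" unless $obj->can('set_font');

# VERSION() with no argument returns $VERSION; with an argument it dies
# when the class's version is lower than the one requested
print "version: ", Demo::Base->VERSION, "\n";    # version: 1.1
my $ok = eval { Demo::Base->VERSION(2.0); 1 };
print "version 2.0 not available\n" unless $ok;
```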

Overloaded Modules

Object-oriented programming can be cumbersome. Everything is a method call, and sooner or later all your method calls start to look the same. Overloading your modules provides a way to simplify code that uses your module. It allows you to express code like the following:

 $foo->add(10); print "My favorite cafe is " . $cafe->name() . "\n";

in a more natural way:

 $foo += 10; print "My favorite cafe is $cafe\n";

Overloading enables your objects to work with Perl's existing math and string operators. When a Perl operator is used with an object of an overloaded class, a method is called. You specify which operators you are overloading and which methods to call using the overload pragma.^[16]

 package My::Adder; use overload '+' => "add",              '-' => \&subtract;

The overload pragma takes a list of key-value pairs as an argument. The keys are symbols representing the various operators available for overloading. The values specify the method to call when the operator is used; this can be expressed as a string or as a reference to a subroutine. The string form is preferred since it allows for a child class to override an overloaded method. Table 2-3 lists the overloadable operations.

Table 2-3: Overloadable Operations
Operation Type	Symbols

Conversion	`"" 0+ bool`

Arithmetic	`+ += - -= * *= / /= % %= ** **= ++ --`

String	`x x= . .=`

Numeric comparison	`< <= > >= == != <=>`

String comparison	`lt le gt ge eq ne cmp`

Bitwise	`<< >> <<= >>= & ^ | neg ~`

Logical	`!`

Transcendental	`atan2 cos sin exp abs log sqrt int`

Iteration	`<>`

Dereferencing	`${} @{} %{} &{} *{}`

Special	`nomethod fallback =`

Each overload method is called with three parameters: the object itself, the value on the opposite side of the operator, and information about the operator call, including the order of the arguments.

Note

Overloading in Perl has little in common with overloading in other languages. For example, in C++ "overloading" refers to the ability to have two functions with the same name and different parameter types. Currently Perl does not have this ability, but rumor has it Perl 6 will change that.

Overloading Conversion

Overloading's most useful feature is not its ability to overload math operators. I'll be covering that in a moment, but unless you're inventing new mathematical types, it's not likely you'll be overloading addition in your modules. On the other hand, overloading conversion is quite common. An overloaded conversion operator is called when Perl wants to use your object in a particular context- string, numeric, or Boolean.

Overloading string conversion enables you to provide a method that Perl will call when it wants to turn your object into a string. Here are a few examples of places where a string conversion operator is used:

 $string = "$object"; $string = "I feel like a " . $fly . " with its wings dipped in honey."; print "Say hello to my little ", $friend, ".\n";

Without an overloaded string conversion operator, objects are converted to highly esoteric strings such as "IO::File=GLOB(0x8103ee4)"-just next door to useless. By providing a string conversion operator, a class can furnish a more useful string representation. This can enhance debugging and provide a simpler interface for some modules.

For example, one of my fellow BOA programmers is an exceptionally lazy individual. He's responsible for the networking code in BOA::Network. Each network connection is represented as an object in the BOA::Network class. Since he's such a lazy guy, he'd like to be able to use the BOA::Logger class with the absolute minimum work:

 $logger->write(1, $connection);

His initial suggestion was that I modify write() to check for BOA::Network objects and pull out the relevant status information for logging. That would work, but sooner or later you'd have an if() for every module in BOA. Because BOA is a big ol' application, this wouldn't be a good idea. Instead, BOA::Network can overload string conversion:

    package BOA::Network;
    use overload '""' => "stringify";

    sub stringify {
       my $self = shift;
       return ref($self) . " => read $self->{read_bytes} bytes, " .
              "wrote $self->{wrote_bytes} bytes at " .
              "$self->{kps} kps";
    }

Now when BOA::Network calls

 $logger->write(1, $connection);

the log file will contain a line like the following:

 BOA::Network => read 1024 bytes, wrote 58 bytes at 10 kps

Nothing in BOA::Logger changes-the overloaded string conversion provides the method call to BOA::Network::stringify() automatically when BOA::Logger::write() prints its second argument.
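The same mechanism can be tried out in isolation. The following is a minimal sketch in which Demo::Connection is a hypothetical stand-in for BOA::Network and the byte counts are made-up sample values.

```perl
use strict;
use warnings;

package Demo::Connection;    # hypothetical stand-in for BOA::Network
use overload '""' => "stringify";

sub new {
    my ($pkg, %args) = @_;
    return bless({ %args }, $pkg);
}

sub stringify {
    my $self = shift;
    return ref($self) . " => read $self->{read_bytes} bytes, "
         . "wrote $self->{wrote_bytes} bytes";
}

package main;

my $connection = Demo::Connection->new(read_bytes => 1024, wrote_bytes => 58);

# interpolation and concatenation both end up calling stringify()
print "$connection\n";
my $line = "status: " . $connection;
print "$line\n";
```

The first print produces "Demo::Connection => read 1024 bytes, wrote 58 bytes". Code that receives the object, such as a logger's print call, needs no knowledge of the class.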

Overloading numification, through the '0+' key, works similarly. Numification is the name for the process that converts a variable to a number when the variable is used in a numeric context. This happens when an object is used with a math operator, as an array index, or in a range operator (..). For example, the variable $foo is numified in the second line in order to increment it:

    $foo = "10"; # foo contains the string value "10"
    $foo++;      # foo is numified and then incremented to contain 11

Overloading numification gives you control over how your variable is represented as a number.

Finally, a Boolean conversion method, using the bool overload key, is employed when the object is used in Boolean context. This happens inside an if() and with logical operations such as && and ||.
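Numeric and Boolean conversion can be sketched together. Demo::Temperature below is a hypothetical class. The fallback => 1 setting is an assumption made for this example so that ordinary math operators, which the class does not overload, fall back to Perl's normal behavior and pick up the numified value.

```perl
use strict;
use warnings;

package Demo::Temperature;    # hypothetical class
use overload
    '0+'     => "numify",     # called in numeric context
    'bool'   => "is_set",     # called in Boolean context
    fallback => 1;            # let ordinary math operators use numify()

sub new {
    my ($pkg, $degrees) = @_;
    return bless({ degrees => $degrees }, $pkg);
}

sub numify { my $self = shift; return $self->{degrees} }
sub is_set { my $self = shift; return defined $self->{degrees} }

package main;

my $temp = Demo::Temperature->new(0);

my $sum = $temp + 5;      # numify() supplies 0
print "sum: $sum\n";      # sum: 5

# is_set() decides truth here, so an object holding 0 is still true
print "set\n" if $temp;
```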

Unary Operators

A unary operator is one that applies to only one argument. It's very simple to provide an overloaded unary operator: there are no arguments to deal with and you only need to worry about implementing the required semantics. See Listing 2-1 for a module that overloads ++ and -- so that the object always contains an even number.

Listing 2-1: Overloading Unary Operations in the Even Class

    package Even;
    use Carp;
    use overload
       '++' => "incr",
       '--' => "decr",
       '0+' => "numify";

    sub new {
       my ($pkg, $num) = @_;
       croak("Even requires an even number to start with!") if $num % 2;
       return bless(\$num, $pkg);
    }

    sub incr {
       my $self = shift;
       $$self += 2;
       return $self;
    }

    sub decr {
       my $self = shift;
       $$self -= 2;
       return $self;
    }

    sub numify {
       my $self = shift;
       return $$self;
    }

    1;

This module also serves as a demonstration of an idea briefly discussed earlier: Objects need not be based on hashes but can be based on any reference. In this case objects in the Even class are implemented using a scalar as the underlying type. When new() creates an object, it simply blesses a reference to a scalar:

 return bless(\$num, $pkg);

Then when object methods need access to the underlying scalar, they simply use a scalar dereference:

 $$self -= 2;
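A trimmed, self-contained variant of the same idea shows the overloaded operators in use. Demo::Even is a hypothetical copy made for this sketch, with the evenness check left out for brevity. The printf with %d is one way to force numeric context so that numify() is called.

```perl
use strict;
use warnings;

package Demo::Even;    # trimmed, hypothetical variant of the Even class
use overload
    '++' => "incr",
    '--' => "decr",
    '0+' => "numify";

# the object is a blessed reference to a plain scalar
sub new    { my ($pkg, $num) = @_; return bless(\$num, $pkg) }
sub incr   { my $self = shift; $$self += 2; return $self }
sub decr   { my $self = shift; $$self -= 2; return $self }
sub numify { my $self = shift; return $$self }

package main;

my $n = Demo::Even->new(4);
++$n;    # calls incr(): 6
++$n;    # calls incr(): 8
--$n;    # calls decr(): 6

printf "%d\n", $n;    # numify() supplies the value: 6
```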

Binary Operators

Most of the overloadable operators are binary operators. To implement a binary operator, you provide a method that takes three arguments. The first argument is always the overloaded object. The second is the other argument to the operator; it might be another object or a plain scalar. The third argument gives you information about the order of the arguments. When the third argument is false, the arguments are in the same order as in the operation that generated the call; when it is true, the arguments are reversed.

Why do you need this third argument? Consider implementing subtraction. Both of these lines call the same method, differing only in the third argument, if $number is an object in a class that overloads subtraction with a method called subtract():

    $result = $number - 7; # actually $number->subtract(7, 0);
    $result = 7 - $number; # actually $number->subtract(7, 1);

By examining the third argument, the implementation for subtract can do the right thing:

 sub subtract {    my ($self, $other, $reversed) = @_;    if ($reversed) {      return $other - $$self;    } else {      return $$self - $other;    } }

Of course, this assumes that your module will obey the normal rules of arithmetic; it doesn't have to!

Your binary operators will need to be a little more complicated than the preceding simple example. Since the second argument to the operator could be a normal variable or another object, there needs to be logic to handle both cases. Also, a binary operator should function as a constructor so that the results of the operation are also members of the class. Here's an implementation of addition for the Even module that always produces even numbers, rounding up where necessary:

    use overload '+' => "add";

    sub add {
       my ($self, $other, $reversed) = @_;
       my $result;
       if (ref($other) and $other->isa('Even')) {
          # another Even object will always be even, so the addition
          # can always be done
          $result = $$self + $$other;
       } else {
          # make sure it's even
          $other += 1 if $other % 2;
          $result = $$self + $other;
       }

       # return a new object in the same class as $self
       return ref($self)->new($result);
    }

This method will work with other objects that are either of the Even class or inherit from it. It also uses an inheritance-safe method for creating new objects by calling the new() method (implemented earlier in this section) on the result of calling ref() on the object. This means that the new() method will be called on whatever class $self belongs to, even if that's a child class of the class where add() is implemented.
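Putting addition and subtraction together, a compact sketch might look like the following. Demo::Even and its _value() helper are hypothetical. Scalar::Util's blessed() is used here, as a substitution for the plain ref() test, so that an unblessed reference is not mistaken for an object. String conversion is overloaded only so that results can be printed.

```perl
use strict;
use warnings;

package Demo::Even;    # compact, hypothetical variant of the Even class
use Carp;
use Scalar::Util qw(blessed);
use overload
    '+'  => "add",
    '-'  => "subtract",
    '""' => "to_string";

sub new {
    my ($pkg, $num) = @_;
    croak("Even requires an even number to start with!") if $num % 2;
    return bless(\$num, $pkg);
}

# plain number behind either an Even object or an ordinary scalar,
# rounded up to the next even number where necessary
sub _value {
    my $other = shift;
    return $$other if blessed($other) and $other->isa('Demo::Even');
    return $other % 2 ? $other + 1 : $other;
}

sub add {
    my ($self, $other) = @_;    # operand order does not matter for addition
    return ref($self)->new($$self + _value($other));
}

sub subtract {
    my ($self, $other, $reversed) = @_;
    my $value  = _value($other);
    my $result = $reversed ? $value - $$self : $$self - $value;
    return ref($self)->new($result);
}

sub to_string { my $self = shift; return ${$self} }

package main;

my $six = Demo::Even->new(6);

my $sum   = $six + 3;     # 3 is rounded up to 4
my $diff  = $six - 2;     # subtract() called with a false third argument
my $rdiff = 10 - $six;    # subtract() called with a true third argument

print "$sum $diff $rdiff\n";    # 10 4 4
```

Each result is itself a Demo::Even object, because both methods construct their return value through ref($self)->new().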

Auto-Generation

As you can see from the preceding example, it takes a lot of work to write a safe overload method. Fortunately, it's usually not necessary to create methods for all the possible overloadable operations. This is because the overload module can auto-generate many overload methods from existing overloaded methods. Table 2-4 contains the rules for method auto-generation.

Table 2-4: Method Auto-Generation
Method(s)	Auto-Generation Description

Assignment forms of math operators	Can be auto-generated from the nonassignment forms (+= can be auto-generated from +, for example)

Conversion operators	Any conversion operator can be auto-generated from any other

++, --	Auto-generated from += and -=

`abs()`	Auto-generated from < and binary subtraction

Unary -	Auto-generated from subtraction

Negation	Auto-generated from Boolean conversion

Concatenation	Auto-generated from string conversion

Using method auto-generation generally requires that your module follow the normal rules of arithmetic. For example, if abs() is to be successfully generated by < and subtraction, your module will have to have fairly standard semantics.

You can provide your own auto-generation rules by overloading the special nomethod key. The method will receive four arguments-the normal three associated with binary operators (whether the called operator is binary or not), and a fourth argument containing the operator actually called.

Finally, you can turn off auto-generation altogether by setting the special overload key fallback to 0 (although nomethod will still be tried if it exists). Alternatively, you can set it to 1 so that an unavailable overload method is simply ignored: Perl continues with whatever behavior it would have had if overloading had not been used at all. With the default setting for fallback, undef, Perl attempts auto-generation and produces an error if the overloaded operation still cannot be found.
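The four-argument nomethod call can be observed with a small sketch. Demo::Tracer is a hypothetical class that overloads nothing except nomethod and simply records which operators reach its catch-all method.

```perl
use strict;
use warnings;

package Demo::Tracer;    # hypothetical class
use overload nomethod => "catch_all";

our @SEEN;    # operators that reached catch_all(), for demonstration only

sub new { my $pkg = shift; return bless({}, $pkg) }

# receives the usual three arguments plus the operator symbol
sub catch_all {
    my ($self, $other, $reversed, $operator) = @_;
    push @SEEN, $operator;
    return 0;
}

package main;

my $tracer  = Demo::Tracer->new;
my $sum     = $tracer + 1;    # no '+' method, so catch_all() is called
my $product = 2 * $tracer;    # same for '*', with the operands swapped

print "operators seen: @Demo::Tracer::SEEN\n";    # operators seen: + *
```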

Overloading "="-It's Not What You Think

Overloading = does not overload assignment. What overloading = does do is provide a copy constructor. For example, consider a class called Fraction that uses an array^[17] of two elements to represent a fraction internally and provides all the normal overloaded math operators. Imagine it also provides an overloaded copy constructor with the copy() method. Here's an example showing how the copy constructor is used:

    $x = Fraction->new(1, 2); # create a new Fraction containing one half (1/2).
    $y = $x;                  # assign the reference in $x to $y. At this point
                              # both $x and $y reference the same object.
    $x *= 4;                  # first implicitly calls the copy
                              # constructor: $x = $x->copy()
                              # then multiplies $x by 4 yielding 4/2
    print "x = $x\n";         # prints x = 4/2
    print "y = $y\n";         # prints y = 1/2

As you can see, the copy constructor is called implicitly before a mutating operator is used, not when the assignment operator is used. If Fraction did not provide an overloaded copy constructor, then this code would generate an error:

 Operation '=': no method found, argument in overloaded package Fraction

Implementing a copy constructor is usually a simple matter of pulling out the values needed for initialization and calling new() to create a new object. In this case the object stores the numerator and denominator in the first and second positions of the object array so the copy constructor is written as follows:

 package Fraction; use overload '=' => "copy"; sub copy {    my $self = shift;    return ref($self)->new($self->[0], $self->[1]); }

The copy constructor is only activated for the mutating operators: ++, --, +=, and so on. If your object can have its value changed in other ways (through a method call or by a nonmutating operator, for example) then the user will need to call the copy constructor directly. In practice, this restriction makes the copy constructor suitable only for rather normal mathematical packages. If your module is playing outside the bounds of Perl's normal math, then it's probably not going to mesh well with an overloaded =.
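The timing of the copy constructor call can be observed with a small sketch. Demo::Fraction is a hypothetical, cut-down class rather than the Fraction class described above: it overloads only *=, = and string conversion, and its $COPIES counter exists purely to show when copy() runs.

```perl
use strict;
use warnings;

package Demo::Fraction;    # hypothetical, cut-down class
use overload
    '*=' => "multiply_by",
    '='  => "copy",
    '""' => "to_string";

our $COPIES = 0;    # counts copy() calls, for demonstration only

sub new {
    my ($pkg, $numerator, $denominator) = @_;
    return bless([ $numerator, $denominator ], $pkg);
}

# mutating operator: changes the object in place and returns it
sub multiply_by {
    my ($self, $factor) = @_;
    $self->[0] *= $factor;
    return $self;
}

sub copy {
    my $self = shift;
    $COPIES++;
    return ref($self)->new($self->[0], $self->[1]);
}

sub to_string { my $self = shift; return "$self->[0]/$self->[1]" }

package main;

my $x = Demo::Fraction->new(1, 2);
my $y = $x;    # plain assignment: no copy yet, both share one object
print "copies after assignment: $Demo::Fraction::COPIES\n";    # 0

$x *= 4;       # copy() runs first because $y shares the object
print "x = $x\n";                                               # x = 4/2
print "y = $y\n";                                               # y = 1/2
print "copies after *=: $Demo::Fraction::COPIES\n";             # 1
```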

Just to drive home the point that you're not overloading assignment, note that the copy constructor is never called if the object is the recipient of an assignment:

 $object = 6;

After this assignment, $object contains the scalar "6" and is no longer a reference to an object, overloaded or not! If you really want to overload assignment, then what you need is a tied module. The next section will describe tied modules in all their glory.

Tied Modules

Tying enables a class to provide the implementation for a normal Perl variable. When a variable is tied to a class, all accesses to that variable are handled by methods in the class. This is similar to an overloaded class, but instead of returning a reference to a specially prepared object, tying enables a variable to be magically associated with a hidden object. This may sound complicated, but the implementation is quite simple-all the hard stuff is handled by Perl.

Tying Scalars

Sometimes a class is so simple that its entire interface can be represented by a tied scalar. For example, imagine that the BOA::Thermostat module implements an interface to a thermometer and a heater for the BOA spacecraft. If this class provided a tied scalar interface, then reads could correspond to checking the current temperature and writes could correspond to opening or closing the heating vents. Here's some example code that keeps compartment 10 of the spacecraft between 20 and 30 degrees:

    use BOA::Thermostat;

    # tie $thermo to temperature controls for compartment 10, the captain's quarters
    tie $thermo, 'BOA::Thermostat', compartment => 10;

    # enter infinite loop
    while (1) {
       # check temperature
       if ($thermo <= 20) {        # too cool?
          $thermo = 1;             # open the vents
       } elsif ($thermo >= 30) {   # too hot?
          $thermo = 0;             # close the vents
       }
       sleep(30);                  # pause for 30 seconds
    }

The code starts by using the BOA::Thermostat module. Next, a call to the tie function is made. This call tells Perl that the $thermo variable's implementation will be provided by the BOA::Thermostat class. Whenever the program accesses the $thermo variable, a method in the BOA::Thermostat class is called. The call also passes the compartment number to the BOA::Thermostat class using a named parameter style. The program then enters an infinite loop, checking the temperature and opening or closing the vents as appropriate.

This example highlights a key difference between tying and overloading-the ability to handle assignment. An overloaded class could not provide this interface because after the first assignment the $thermo variable would no longer contain a reference to an overloaded object. Tied variables provide magic containers, whereas overloaded objects provide magic values that can be assigned to variables. The difference is subtle but important to understand.

To implement a tied scalar class, you need to provide three methods-TIESCALAR(), FETCH(), and STORE(). The first, TIESCALAR(), is the constructor for the tied scalar class. It works just like the new() methods you've seen previously-it takes some parameters and returns a bless()'d reference. Here's a possible implementation for BOA::Thermostat::TIESCALAR():

    package BOA::Thermostat;
    use Carp;

    sub TIESCALAR {
       my $pkg = shift;
       my $self = { @_ }; # retrieve named options into $self hash-ref

       # check for required 'compartment' option
       croak("Missing compartment number!") unless exists $self->{compartment};

       # the vent is initially closed
       $self->{vent_state} = 0;

       # bless $self and return
       return bless($self, $pkg);
    }

This should look very familiar by now-it's just a simple variation on a normal constructor. The only difference is the name and the way it will be called-by tie() instead of directly. Notice that even though this code is emulating a scalar, there's no need to use a scalar underneath-the object itself is a hash in this example.

The remaining two methods are FETCH() and STORE(), which get called when the tied variable is read and written, respectively. Here's the FETCH() implementation for the BOA::Thermostat class:

    # method called when scalar is read
    sub FETCH {
       my $self = shift;
       return get_temp($self->{compartment});
    }

FETCH() receives only the object as an argument and returns the results of calling the helper routine get_temp() with the compartment number as a parameter. This routine will check the temperature in the given compartment and return it. I'll leave implementing this routine as an exercise for budding rocket scientists in the audience.

The STORE() method is almost as simple:

    # method called when scalar is written to
    sub STORE {
       my ($self, $val) = @_;

       # return if the vent is already in the requested state
       return $val if $val == $self->{vent_state};

       # open or close vent
       if ($val) {
          open_vent($self->{compartment});
       } else {
          close_vent($self->{compartment});
       }

       # store and return current vent state
       return $self->{vent_state} = $val;
    }

STORE() receives two arguments, the object and the new value. The code checks to see if it can return right away, which it can if the vent is already in the requested position. It then calls the helper routines close_vent() or open_vent() as necessary. Finally, it returns the set value. STORE() methods should return their new value so that chained assignment works as expected:

 $foo = $thermo = 1;   # $foo == 1

It's possible to call object methods using a tied variable. There are two ways to get access to the underlying object. First, it's returned by tie(). Second, it can be retrieved from a tied variable using the tied() routine. For example, say you added a method to the BOA::Thermostat class called fan() to turn on and off a fan inside the vent. Client code could call this method as follows:

    $thermo_obj = tie $thermo, 'BOA::Thermostat', compartment => 10;
    $thermo_obj->fan(1); # turn on the fan

This will also work:

    tie $thermo, 'BOA::Thermostat', compartment => 10;
    tied($thermo)->fan(1); # turn on the fan

Using additional object methods, tied modules can provide enhanced functionality without giving up the simplicity of a tied variable interface.
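The pieces of a tied scalar can be exercised without any spacecraft hardware. In the sketch below, Demo::Thermostat is a hypothetical class whose FETCH() returns a fixed, simulated temperature of 18 degrees, and vent_state() is an invented accessor used to inspect the hidden object.

```perl
use strict;
use warnings;

package Demo::Thermostat;    # simulated, hypothetical stand-in for BOA::Thermostat
use Carp;

sub TIESCALAR {
    my $pkg  = shift;
    my $self = { @_ };
    croak("Missing compartment number!") unless exists $self->{compartment};
    $self->{vent_state}  = 0;     # the vent is initially closed
    $self->{temperature} = 18;    # simulated reading, not real hardware
    return bless($self, $pkg);
}

# reading the tied scalar returns the simulated temperature
sub FETCH {
    my $self = shift;
    return $self->{temperature};
}

# writing to the tied scalar opens (true) or closes (false) the vent
sub STORE {
    my ($self, $val) = @_;
    return $self->{vent_state} = $val ? 1 : 0;
}

# an ordinary object method, reachable through tied()
sub vent_state { my $self = shift; return $self->{vent_state} }

package main;

tie my $thermo, 'Demo::Thermostat', compartment => 10;

print "temperature: $thermo\n";    # calls FETCH(): 18
$thermo = 1 if $thermo <= 20;      # calls FETCH(), then STORE()
print "vent open: ", tied($thermo)->vent_state, "\n";    # 1
print "still reads: $thermo\n";    # the assignment did not replace the tie: 18
```

The last line is the point of the exercise: after the assignment, $thermo is still tied and still reports the temperature, which an overloaded object stored in an ordinary variable could not do.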

Tying Hashes

By far the most commonly used tied interface is the tied hash. Hashes are so inherently flexible that they lend themselves well to representing an interface to a variety of data types. In fact, Perl's support for tied variables evolved from support for tying hashes to database files using dbmopen() into the general mechanism it is today.

One common use for tied hashes is to provide lazy computation and caching for some large dataset. For example, BOA::Network::DNS provides a tied hash interface for network name-to-IP mappings. Here's an example using the module:

    use BOA::Network::DNS;

    # tie hash to BOA::Network::DNS - provide nameserver as argument to constructor
    tie %dns, 'BOA::Network::DNS', nameserver => '10.0.0.1';

    # lookup IP address for www.perl.com
    print "www.perl.com : ", $dns{'www.perl.com'} || "not found!", "\n";

    # do a reverse lookup for the DNS server
    print "The name for the DNS server is: ",
          $dns{'10.0.0.1'} || "not found!", "\n";

Obviously it would be impossible to prepopulate a hash with all the possible names and addresses on the Internet, but a tied hash allows you to pretend that you have. Also, as you'll see, the hash can very easily hold onto the results of past lookups to improve performance.

To implement a tied hash interface, you must provide eight methods: TIEHASH(), FETCH(), STORE(), DELETE(), EXISTS(), CLEAR(), FIRSTKEY(), and NEXTKEY(). Here's TIEHASH(), the constructor:

    package BOA::Network::DNS;
    use Carp;

    sub TIEHASH {
       my $pkg = shift;
       my $self = { @_ }; # retrieve named options into $self hash-ref

       # check for required 'nameserver' option
       croak("Missing nameserver address!") unless exists $self->{nameserver};

       # initialize cache to an empty hash
       $self->{cache} = {};

       # bless $self and return
       return bless($self, $pkg);
    }

This should definitely look familiar by now-it's the same basic constructor pattern you've seen earlier in the chapter.

The rest of the methods are more interesting. FETCH() is similar to the methods in a tied scalar, but it receives an extra parameter-the key that's being requested. The implementation here is very simple:

    # method called when an entry is read from the hash
    sub FETCH {
       my ($self, $key) = @_;

       # check cache and return if found
       return $self->{cache}{$key} if exists $self->{cache}{$key};

       # make lookup using nameserver provided to TIEHASH
       my $result = _do_dns_lookup($self->{nameserver}, $key);

       # cache result and, if the lookup succeeded, the reverse mapping
       $self->{cache}{$key} = $result;
       $self->{cache}{$result} = $key if defined $result;

       # return result
       return $result;
    }

It's debatable whether BOA::Network::DNS should even provide a STORE() method, since DNS entries are generally considered to be read-only! However, for the sake of completeness, let's provide one. STORE() takes two parameters in addition to the object, the key and the value to be set for that key:

    # called when an entry is written to the hash
    sub STORE {
       my ($self, $key, $value) = @_;

       # store the value in the cache, forward and reverse
       $self->{cache}{$key} = $value;
       $self->{cache}{$value} = $key;

       # return the value stored so that chained assignment works
       return $value;
    }

Perl's hashes distinguish between an entry containing undef and an entry that doesn't exist at all. The defined() operator simply calls FETCH() on tied hashes, but exists() needs special support from the tied implementation in the form of EXISTS(). To complete the picture, DELETE() must be provided to remove a key from the hash, after which it is expected that EXISTS() will return false for that key. It is often difficult to decide what behavior to provide for these calls on a tied hash. In this case, you'd want to do the simple thing and just examine the underlying cache:

    # method called when exists() is called on the hash
    sub EXISTS {
       my ($self, $key) = @_;
       return exists $self->{cache}{$key};
    }

    # method called when delete() is called on the hash
    sub DELETE {
       my ($self, $key) = @_;

       # delete both forward and reverse lookups if the key exists
       my $value;
       if (exists $self->{cache}{$key}) {
          $value = $self->{cache}{$key};
          delete $self->{cache}{$value};
          delete $self->{cache}{$key};
       }

       # return deleted value, just like the normal delete()
       return $value;
    }

Perl provides a hook for a special case of delete() when the entire hash is being cleared. This is triggered by assigning an empty list to a hash:

 %dns = ();

It's possible to implement this by looping over the keys and calling DELETE(), but there's usually a more efficient implementation. In this case you can just clear the cache:

 sub CLEAR {    my $self = shift;    %{$self->{cache}} = (); }

Finally, you must provide an iterator by implementing FIRSTKEY() and NEXTKEY(). The iterator functions are used when several Perl operators are called-keys(), values(), and each(). The utility of allowing users to iterate over DNS lookups in the cache is questionable, but here's a possible implementation:

    sub FIRSTKEY {
       my $self = shift;

       # reset iterator for the cache
       scalar keys %{$self->{cache}};

       # return the first key from the cache
       return scalar each %{$self->{cache}};
    }

    sub NEXTKEY {
       my ($self, $lastkey) = @_;
       return scalar each %{$self->{cache}};
    }

This implementation just proxies the call to each() on the underlying cache. As a result, it doesn't use the second parameter to NEXTKEY(), the last key returned. That parameter can be useful if the underlying data store isn't a hash but rather something that maintains an order.
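All eight methods can be seen working together in a self-contained sketch that needs no network access. Demo::Squares is a hypothetical class in which squaring a number stands in for the slow DNS lookup, and the $COMPUTED counter is there only to show that the cache is being used.

```perl
use strict;
use warnings;

package Demo::Squares;    # hypothetical class: lazily computed, cached squares

our $COMPUTED = 0;        # counts real computations, for demonstration only

sub TIEHASH { my $pkg = shift; return bless({ cache => {} }, $pkg) }

sub FETCH {
    my ($self, $key) = @_;
    return $self->{cache}{$key} if exists $self->{cache}{$key};
    $COMPUTED++;          # stands in for a slow lookup
    return $self->{cache}{$key} = $key * $key;
}

sub STORE  { my ($self, $key, $value) = @_; return $self->{cache}{$key} = $value }
sub EXISTS { my ($self, $key) = @_; return exists $self->{cache}{$key} }
sub DELETE { my ($self, $key) = @_; return delete $self->{cache}{$key} }
sub CLEAR  { my $self = shift; %{ $self->{cache} } = (); return }

sub FIRSTKEY {
    my $self  = shift;
    my $reset = keys %{ $self->{cache} };    # resets the each() iterator
    return scalar each %{ $self->{cache} };
}

sub NEXTKEY { my $self = shift; return scalar each %{ $self->{cache} } }

package main;

tie my %square, 'Demo::Squares';

print "7 squared: $square{7}\n";    # computed on first access: 49
print "7 squared: $square{7}\n";    # served from the cache: 49
print "computed $Demo::Squares::COMPUTED time(s)\n";          # 1
print "cached keys: ", join(",", sort keys %square), "\n";    # 7
```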

Other Ties

In addition to scalars and hashes, you can also tie arrays and file handles. Once you've grokked implementing tied scalars and hashes, it's just a matter of learning the specific method names and interfaces. You can get this information from Perl's documentation with the command perldoc perltie.
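As a taste of what a tied array looks like, here is a minimal sketch. It assumes the Tie::StdArray base class from the standard Tie::Array module, which supplies default versions of all the array methods, so the hypothetical Demo::UpperArray class only has to override STORE().

```perl
use strict;
use warnings;

package Demo::UpperArray;    # hypothetical class: stores values in upper case
use Tie::Array;              # provides the Tie::StdArray base class
our @ISA = ('Tie::StdArray');

# called when an element is assigned by index; the object is an array reference
sub STORE {
    my ($self, $index, $value) = @_;
    return $self->[$index] = uc $value;
}

package main;

tie my @words, 'Demo::UpperArray';

$words[0] = "hello";    # calls STORE(0, "hello")
$words[1] = "world";    # calls STORE(1, "world")

print "@words\n";       # HELLO WORLD
```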

Tying and Overloading

You might imagine that you could combine tying and overloading to form the ultimate magic Perl module. Unfortunately, due to a bug in the implementation of overloading in the version of Perl I'm using (5.6.1) and older versions, this isn't easily done. You can find the rather complicated details in the overload documentation, but suffice it to say that for the time being you should choose tying or overloading, not both.

^[6]At least not yet! Early Perl 6 designs include mention of a new keyword for modules and classes separate from normal packages.

^[7]Ok, bad example-round-robin DNS works using this technique, but you get my point.

^[8]The name @INC refers to its use as an "include" path, although using a module is rarely referred to as "including" the module.

^[9]This is a UNIX-specific example since Windows (and other single-user operating systems) don't provide a "home directory." However, use lib works just as well on Windows as it does on UNIX, so the techniques should be easily adaptable.

^[10]This is Perl's poor-man exception handling. For a more evolved system, see the Exception module on CPAN.

^[11]I recommend Oprah's book club mailing list.

^[12]Which is not the case for all those C programmers learning C++-they don't have a leg to stand on!

^[13]There is also a single-argument form of bless that blesses into the current package. This should be avoided because it doesn't allow for inheritance. Since there's no drawback to using the two-argument form, it should be used in all cases.

^[14]When the last variable holding a reference to the object goes out of scope, or at program exit-whichever comes first

^[15]Written by Tuomas J. Lukka and available on CPAN

^[16]A pragma is loosely defined as a module that functions as a compiler directive; it changes the way Perl compiles the code that follows. The pragmas that come with Perl all have lowercase names.

^[17]It's important that this class be implemented using something other than a scalar because overload will actually auto-generate a copy constructor for scalars.