Section 15.3. Using Simple Modules | Learning Perl, 5th Edition

15.3. Using Simple Modules

Suppose that you've got a long filename like /usr/local/bin/perl in your program, and you need to determine the basename. That's easy enough since the basename is everything after the last slash (in this case, perl):

     my $name = "/usr/local/bin/perl";
     (my $basename = $name) =~ s#.*/##;  # Oops!

As you saw earlier, first Perl does the assignment inside the parentheses, and then it does the substitution. The substitution is supposed to replace any string ending with a slash (that is, the directory name portion) with an empty string, leaving the basename.

If you try this, it will seem to work. Well, it will seem to, but there are three problems.

First, a Unix file or directory name could contain a newline character. (It's not something that's likely to happen by accident, but it's permitted.) Since the regular expression dot (.) can't match a newline, a filename like the string "/home/fred/flintstone\n/brontosaurus" won't work right because that code will think the basename is "flintstone\n/brontosaurus". You could fix that with the /s option to the pattern (if you remembered about this subtle and infrequent case), making the substitution look like this: s#.*/##s.

The second problem is that this is Unix-specific. It assumes the forward slash will be the directory separator, as it is on Unix, and not the backslash or colon that some systems use.
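To see the newline problem concretely, here is a small sketch that runs the substitution with and without the /s option on the filename from the paragraph above. The variable names $without_s and $with_s are ours, chosen only for this illustration.

```perl
use strict;
use warnings;

# A legal (if unusual) Unix path with a newline inside a directory name
my $name = "/home/fred/flintstone\n/brontosaurus";

# Without /s, the dot stops at the newline, so only "/home/fred/" is removed
(my $without_s = $name) =~ s#.*/##;

# With /s, the dot matches newlines too, so everything up to the last
# slash is removed
(my $with_s = $name) =~ s#.*/##s;

printf "without /s: %d characters left\n", length($without_s);
printf "with /s:    '%s'\n", $with_s;
```

Even the /s version still hardcodes the forward slash, so it fixes only the first of the three problems.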

The third (and biggest) problem with this is we're trying to solve a problem someone else has solved. Perl comes with a number of modules, which are smart extensions to Perl that add to its functionality. If those aren't enough, many other useful modules are available on CPAN, with new ones being added every week. You (or, better yet, your system administrator) can install them if you need their functionality.

In the rest of this section, we'll show you how to use the features of some of the modules that come with Perl. (There's more that these modules can do. This is just an overview to illustrate the general principles of how to use a module.)

We can't show you everything you'd need to know about using modules since you'd have to understand advanced topics like references and objects to use some modules.^[†] Those topics, including how to create a module, will be covered in detail in the Alpaca. Further information on some interesting and useful modules is included in Appendix B.

^[†] As we'll see in the next few pages, you may be able to use a module that uses objects and references without having to understand those advanced topics.

15.3.1. The File::Basename Module

In the previous example, we found the basename of a filename in a way that's not portable. We showed that something that seemed straightforward was susceptible to subtle, mistaken assumptions. (The assumption was that newlines would never appear in file or directory names.) And we were re-inventing the wheel, solving a problem that others have solved (and debugged) many times before us.

Here's a better way to extract the basename of a filename. Perl comes with a module called File::Basename. With the command perldoc File::Basename, or with your system's documentation system, you can read about what it does, which is the first step when using a new module. (It's often the third and fifth step, as well.)

When you're ready to use it, declare it with a use directive near the top of your program:^[*]

^[*] It's traditional to declare modules near the top of the file since that makes it easy for the maintenance programmer to see which modules you'll be using. That greatly simplifies matters when it's time to install your program on a new machine, for example.

     use File::Basename;

During compilation, Perl sees that line and loads up the module. Now it's as if Perl has some new functions you can use in the remainder of your program.^[†] The one we wanted in the earlier example is the basename function:

^[†] You guessed it: there's more to the story, having to do with packages and fully qualified names. When your programs are growing beyond a few hundred lines in the main program (not counting code in modules), which is quite large in Perl, you should probably investigate these advanced features. Start with the perlmod manpage.

     my $name = "/usr/local/bin/perl";
     my $basename = basename $name;  # gives 'perl'

Well, that worked for Unix. What if our program were running on MacPerl, Windows, or VMS, to name a few? There's no problem because this module can tell which kind of machine you're using, and it uses that machine's filename rules by default. (In that case, of course, you'd have that machine's kind of filename string in $name.)

This module provides other related functions. One is the dirname function, which pulls the directory name from a full filename. The module also lets you separate a filename from its extension or change the default set of filename rules.^[†]

^[†] You might need to change the filename rules if you're working with a Unix machine's filenames from a Windows machine, perhaps while sending commands over an FTP connection, for example.
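The following sketch pulls these functions together. It assumes a Unix-style sample path for the first part; the fileparse and fileparse_set_fstype functions are the ones File::Basename documents for splitting off an extension and for changing the filename rules, and the sample filenames are made up for illustration.

```perl
use strict;
use warnings;
use File::Basename;   # default import list includes basename, dirname, fileparse

my $name = "/home/rootbeer/ice-2.1.txt";   # illustrative path only

my $basename = basename($name);
my $dirname  = dirname($name);

# fileparse splits a path into three pieces; the optional pattern says
# what counts as a suffix (here: the last dot and whatever follows it)
my ($stem, $dir, $suffix) = fileparse($name, qr/\.[^.]*/);

print "basename: $basename\n";
print "dirname:  $dirname\n";
print "stem:     $stem, suffix: $suffix\n";

# Temporarily switch to Windows rules to parse a Windows-style name,
# then restore whatever rules were in effect before
my $old_type = fileparse_set_fstype("MSWin32");
my $win_base = basename("C:\\temp\\report.txt");
fileparse_set_fstype($old_type);

print "Windows basename: $win_base\n";
```

Restoring the previous rules right away keeps the rest of the program on the defaults for the machine it is running on.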

15.3.2. Using Only Some Functions from a Module

Suppose that when you went to add the File::Basename module to your existing program, you discovered that the program already has a subroutine called &dirname. That is, you have a subroutine with the same name as one of the module's functions.^[*] The trouble is that the module's dirname is also implemented as a Perl subroutine (inside the module), so the two names would collide. What do you do?

^[*] Well, it's not likely you would have a &dirname subroutine you use for another purpose, but this is an example. Some modules offer hundreds of new functions, making name collisions more frequent.

Give File::Basename, in your use declaration, an import list showing which function names it should give you, and it'll supply those and no others. Here, we'll get nothing but basename:

     use File::Basename qw/ basename /;

Here, we'll ask for no new functions at all:

     use File::Basename qw/ /;

This is frequently written as:

     use File::Basename (  );

Why would you want to do that? Well, this directive tells Perl to load File::Basename as before but not to import any function names. Importing lets us use the short, simple function names such as basename and dirname. However, if we don't import those names, we can still use the functions. When they're not imported, we have to call them by their full names:

     use File::Basename qw/ /;                     # import no function names
     my $betty = &dirname($wilma);                 # uses our own subroutine &dirname
                                                   # (not shown)
     my $name = "/usr/local/bin/perl";
     my $dirname = File::Basename::dirname $name;  # dirname from the module

The full name of the dirname function from the module is File::Basename::dirname. We can always use the function's full name, once we've loaded the module, whether we've imported the short name dirname or not.
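Here is a runnable sketch of that situation. Our own dirname is a hypothetical stand-in (it merely upper-cases its argument) so that you can tell the two subroutines apart; only basename is imported from the module.

```perl
use strict;
use warnings;
use File::Basename qw(basename);   # import basename only; dirname stays ours

# Hypothetical subroutine of our own that happens to be called dirname
sub dirname {
    my ($text) = @_;
    return uc $text;
}

my $name = "/usr/local/bin/perl";

my $ours   = dirname($name);                  # our own subroutine
my $theirs = File::Basename::dirname($name);  # the module's, by its full name
my $base   = basename($name);                 # imported short name

print "ours:   $ours\n";
print "theirs: $theirs\n";
print "base:   $base\n";
```

Because dirname was left off the import list, Perl never tries to install the module's version over ours.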

Most of the time, you'll want to use a module's default import list. But you can override that with a list of your own if you want to leave out some of the default items. Another reason to supply your own list would be if you wanted to import some function not on the default list since most modules include some (infrequently needed) functions not on the default import list.

Some modules will, by default, import more symbols than others. Each module's documentation should make it clear which symbols it imports, if any, but you are always free to override the default import list by specifying one of your own as we did with File::Basename. Supplying an empty list imports no symbols.
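As an illustration of asking for names that are not on the default list, here is a sketch using List::Util, another module that comes with Perl. It isn't discussed in this section; we pick it only because its default import list is empty, so every function has to be requested by name. The @scores data is made up.

```perl
use strict;
use warnings;
use List::Util qw(max sum);   # nothing is imported unless we ask for it

my @scores = (195, 201, 30);

my $highest = max(@scores);
my $total   = sum(@scores);

# A function we did not import is still reachable by its full name
my $lowest = List::Util::min(@scores);

print "highest: $highest, lowest: $lowest, total: $total\n";
```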

15.3.3. The File::Spec Module

Now you can find out a file's basename. That's useful, but you'll often want to put that together with a directory name to get a full filename. For example, we want to take a filename like /home/rootbeer/ice-2.1.txt and add a prefix to the basename:

     use File::Basename;
     print "Please enter a filename: ";
     chomp(my $old_name = <STDIN>);
     my $dirname = dirname $old_name;
     my $basename = basename $old_name;
     $basename =~ s/^/not/;  # Add a prefix to the basename
     my $new_name = "$dirname/$basename";
     rename($old_name, $new_name)
       or warn "Can't rename '$old_name' to '$new_name': $!";

Do you see the problem here? Once again, we're making the assumption that filenames will follow the Unix conventions and use a forward slash between the directory name and the basename. Fortunately, Perl comes with a module to help with this problem, too.

The File::Spec module is used for manipulating file specifications, which are the names of files, directories, and the other things that are stored on filesystems. Like File::Basename, it understands what kind of system it's running on, and it chooses the right set of rules every time. Unlike File::Basename, File::Spec is an object-oriented (often abbreviated "OO") module.

If you've never caught the fever of OO, don't worry. If you understand objects, that's great; you can use this OO module. If you don't understand objects, that's okay, too. Type the symbols as we show you, and it will work as if you knew what you were doing.

In this case, we learn from reading the documentation for File::Spec that we want to use a method called catfile. What's a method? It's just a different kind of function, as far as we're concerned here. The difference is that you'll always call the methods from File::Spec with their full names, like this:

     use File::Spec;
     .
     .  # Get the values for $dirname and $basename as above
     .
     my $new_name = File::Spec->catfile($dirname, $basename);
     rename($old_name, $new_name)
       or warn "Can't rename '$old_name' to '$new_name': $!";

The full name of a method is the name of the module (called a class, here), a small arrow (->), and the short name of the method. Use the small arrow rather than the double-colon that we used with File::Basename.

Since we're calling the method by its full name, what symbols does the module import? None of them. That's normal for OO modules. You don't have to worry about having a subroutine with the same name as one of the many methods of File::Spec.
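The sketch below builds the new name without touching any files, so it is safe to run as is. It also calls catfile on File::Spec::Unix and File::Spec::Win32, the platform-specific classes that ship with File::Spec, to show how the result depends on the rules in use. The sample filenames are made up, and calling those classes directly is for demonstration only; ordinary programs should stick with plain File::Spec.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);
use File::Spec;
use File::Spec::Unix;    # loaded explicitly only for the comparison below
use File::Spec::Win32;

my $old_name = "/home/rootbeer/ice-2.1.txt";   # illustrative; nothing is renamed

# Add the prefix to the basename, then let File::Spec join the pieces
my $new_basename = "not" . basename($old_name);
my $new_name     = File::Spec->catfile(dirname($old_name), $new_basename);
print "new name: $new_name\n";

# The same method call under two different sets of rules
my $unix_style = File::Spec::Unix->catfile("docs", "notes.txt");
my $win_style  = File::Spec::Win32->catfile("docs", "notes.txt");
print "Unix rules:    $unix_style\n";
print "Windows rules: $win_style\n";
```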

Should you bother using modules like these? If you're sure your program will never be run anywhere but on a Unix machine and you're sure you completely understand the rules for filenames on Unix, then you may prefer to hardcode your assumptions into your programs. But these modules give you an easy way to make your programs more robust in less time and more portable at no extra charge.

15.3.4. CGI.pm

If you need to create CGI programs (which we don't cover in this book), use the CGI.pm module. You don't need to handle the interface and input-parsing portion of the script, which gets so many other people into trouble. The CGI.pm author, Lincoln Stein, spent a lot of time ensuring the module would work with most servers and operating systems. Use the module and focus on the interesting parts of your script.

The CGI module has two flavors: the plain old functional interface and the OO interface. We'll use the first one. As before, you can follow the examples in the CGI.pm documentation. Our simple CGI script parses the CGI input and displays the input names and values as a plain text document. In the import list, we use :all, which is an export tag that specifies a group of functions rather than a single function as you saw in the previous modules.^[*]

^[*] The module has several other export tags to select different groups of functions. For instance, if you want the ones that deal with the CGI, you can use :cgi, or if you want the HTML generation functions, you can use :html4. See the CGI.pm documentation for more details.

     #!/usr/bin/perl
     use CGI qw(:all);
     print header("text/plain");
     foreach my $param ( param(  ) )
             {
             print "$param: " . param($param) . "\n";
             }

We can get fancier because we want to output HTML, and CGI.pm has many convenience functions to do that. It handles the CGI header, the beginning parts of HTML with start_html( ), and many HTML tags with functions of the same name, such as h1( ) for the <H1> tag.

     #!/usr/bin/perl
     use CGI qw(:all);
     print header(  ),
             start_html("This is the page title"),
             h1( "Input parameters" );
     my $list_items;
     foreach my $param ( param(  ) )
             {
             $list_items .= li( "$param: " . param($param) );
             }
     print ul( $list_items );
     print end_html(  );

Wasn't that easy? You don't have to know how CGI.pm is doing all this stuff: you just have to trust that it does it correctly. Once you let CGI.pm do all the hard work, you get to focus on the interesting parts of your program.

The CGI.pm module does a lot more, such as handling cookies, redirection, and multi-page forms. You will learn more from the examples in the module documentation.

15.3.5. Databases and DBI

The DBI (database interface) module doesn't come with Perl, but it's one of the most popular modules since most people have to connect to a database of some sort. The beauty of DBI is it allows you to use the same interface for almost any common database, from comma-separated value files to big database servers like Oracle. It has ODBC drivers, and some of its drivers are supported by vendors. To get the full details, get Programming the Perl DBI (O'Reilly). You can check out the DBI web site, http://dbi.perl.org/.

Once you install DBI, you also have to install a DBD (database driver). You can get a long list of DBDs from CPAN Search. Install the right one for your database server, and ensure you get the version that goes with the version of your server.

The DBI is an OO module, but you don't have to know everything about OO programming to use it; just follow the examples in the documentation. To connect to a database, you use the DBI module and call its connect method.

     use DBI;
     $dbh = DBI->connect($data_source, $username, $password);

The $data_source contains information particular to the DBD you want to use, so you'll get its format from that DBD's documentation. For PostgreSQL, the driver is DBD::Pg, and the $data_source is something like:

     my $data_source = "dbi:Pg:dbname=name_of_database";

Once you connect to the database, you will go through a cycle of preparing a query, executing it, and reading its results.

     $sth = $dbh->prepare("SELECT * FROM foo WHERE bla");
     $sth->execute(  );
     @row_ary  = $sth->fetchrow_array;
     $sth->finish;

When you are finished, you disconnect from the database.

     $dbh->disconnect(  );

See DBI's documentation for more details.