Section 5.7. Opening a Filehandle | Learning Perl, 5th Edition

5.7. Opening a Filehandle

You've seen that Perl provides three filehandles (STDIN, STDOUT, and STDERR) that are automatically open to files or devices established by the program's parent process (probably the shell). When you need other filehandles, use the open operator to tell Perl to ask the operating system to open the connection between your program and the outside world. Here are some examples:

     open CONFIG, "dino";
     open CONFIG, "<dino";
     open BEDROCK, ">fred";
     open LOG, ">>logfile";

The first one opens a filehandle called CONFIG to a file called dino. That is, the (existing) file dino will be opened and whatever it holds will come into our program through the filehandle named CONFIG. This is similar to the way that data from a file could come in through STDIN if the command line had a shell redirection like <dino. The second example uses the same sequence; it does the same as the first, but the less-than sign explicitly says "use this filename for input," even though that's the default.^[*]

^[*] This may be important for security reasons. As we'll see in a moment (and in further detail in Chapter 14), a number of magical characters may be used in filenames. If $name holds a user-chosen filename, opening $name will allow any of these magical characters to come into play. This could be a convenience to the user, or it could be a security hole. But opening "< $name" is much safer since it explicitly says to open the given name for input. Still, this doesn't prevent all possible mischief. For more information on different ways of opening files, especially when security may be a concern, see the perlopentut manpage.

Though you don't have to use the less-than sign to open a file for input, we include that because, as you can see in the third example, a greater-than sign means to create a new file for output. This opens the filehandle BEDROCK for output to the new file fred. Just as when the greater-than sign is used in shell redirection, we're sending the output to a new file called fred. If a file has that name, we'll wipe it out and replace it with this new one.

The fourth example shows how two greater-than signs may be used (again, as the shell does) to open a file for appending. That is, if the file exists, we will add new data at the end. If it doesn't exist, it will be created in much the same way as if we had used one greater-than sign. This is handy for log files; your program could write a few lines to the end of a log file each time it's run. That's why the fourth example names the filehandle LOG and the file logfile.
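As a rough sketch of the three modes working together, the following script creates a file, appends to it, and reads it back. The temporary directory (from the core File::Temp module) and the `or die` checks are additions of this sketch, not part of the examples above; they only keep the demonstration from touching real files or failing silently.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory so no real files are touched.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/logfile";    # same name as in the text, but in a temp dir

# ">" creates the file (or wipes an existing one) and opens it for output.
open LOG, ">$file" or die "Cannot create $file: $!";
print LOG "first run\n";
close LOG;

# ">>" opens for appending, so the existing line is kept.
open LOG, ">>$file" or die "Cannot append to $file: $!";
print LOG "second run\n";
close LOG;

# "<" opens the existing file for input.
open LOG, "<$file" or die "Cannot read $file: $!";
my @lines = <LOG>;
close LOG;

print @lines;
```

Had the second open used a single greater-than sign, the first line would have been wiped out before the second was written.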

You can use any scalar expression in place of the filename specifier, though typically you'll want to be explicit about the direction specification:

     my $selected_output = "my_output";
     open LOG, "> $selected_output";

Note the space after the greater-than sign. Perl ignores this,^[†] but it keeps unexpected things from happening if $selected_output were ">passwd", for example: without the space, the string would become ">>passwd", and the open would append instead of write.

^[†] Yes, this means that if your filename were to have leading whitespace, Perl would ignore that, too. See perlfunc and perlopentut if you're worried about this.
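To make the ">passwd" case concrete, here is a small sketch that tries both spellings inside a temporary directory. It assumes a Unix-like system, where a greater-than sign is a legal character in a filename; the directory handling (Cwd and File::Temp, both core modules) and the `or die` checks are scaffolding added for this sketch.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work inside a throwaway directory (assumes a Unix-like system,
# where ">" is a legal character in a filename).
my $start_dir = getcwd();
my $dir       = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";

my $selected_output = ">passwd";    # a careless or hostile "filename"

# No space: the string is ">>passwd", so Perl appends to "passwd".
open LOG, ">$selected_output" or die "open failed: $!";
close LOG;
my $appended_to_passwd = (-e "passwd") ? 1 : 0;

# With the space: the string is "> >passwd". The first ">" is the mode,
# the space is skipped, and the file is literally named ">passwd".
open LOG, "> $selected_output" or die "open failed: $!";
close LOG;
my $made_literal_name = (-e ">passwd") ? 1 : 0;

chdir $start_dir or die "Cannot chdir back to $start_dir: $!";

print "no space:   touched 'passwd'\n"  if $appended_to_passwd;
print "with space: created '>passwd'\n" if $made_literal_name;
```

Neither outcome is what the user probably wanted, but the second one at least leaves the file named passwd alone.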

In modern versions of Perl (starting with Perl 5.6), you can use a "three-argument" open:

     open CONFIG, "<", "dino";
     open BEDROCK, ">", $file_name;
     open LOG, ">>", &logfile_name();

The advantage here is that Perl never confuses the mode (the second argument) with some part of the filename (the third argument), which has nice advantages for security.^[*] However, if you need your Perl to be backward compatible to older Perl versions (such as when you are contributing to the CPAN), avoid these forms or mark your Perl sources as being compatible only with newer Perls.^[‡]

^[‡] Via use 5.006, for example. (Write it that way; Perl reads a plain 5.6 as version 5.600.)
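One way to see the difference is with a filename that starts with a space. In this sketch (the name " fred" and the temporary-directory scaffolding are assumptions made for the demonstration), the two-argument form silently drops the space, while the three-argument form uses the name exactly as given.

```perl
use 5.006;    # three-argument open needs Perl 5.6 or later
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work inside a throwaway directory so no real files are touched.
my $start_dir = getcwd();
my $dir       = tempdir(CLEANUP => 1);
chdir $dir or die "Cannot chdir to $dir: $!";

my $file_name = " fred";    # hypothetical name with a leading space

# Two-argument form: the whitespace is skipped, so this creates "fred".
open BEDROCK, ">$file_name" or die "Cannot create file: $!";
close BEDROCK;
my $two_arg_made_fred = (-e "fred") ? 1 : 0;

# Three-argument form: the name is used exactly as given, so this
# creates " fred", leading space and all.
open BEDROCK, ">", $file_name or die "Cannot create file: $!";
close BEDROCK;
my $three_arg_kept_space = (-e " fred") ? 1 : 0;

chdir $start_dir or die "Cannot chdir back to $start_dir: $!";

print "two-argument open created 'fred'\n"    if $two_arg_made_fred;
print "three-argument open created ' fred'\n" if $three_arg_kept_space;
```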

We'll see how to use these filehandles later in this chapter.

5.7.1. Bad Filehandles

Perl can't open a file all by itself. Like any other programming language, Perl merely asks the operating system to open a file. Of course, the operating system may refuse because of permission settings, an incorrect filename, or other reasons.

If you try to read from a bad filehandle (that is, a filehandle that isn't properly open), you'll see an immediate end-of-file. (With the I/O methods we'll see in this chapter, end-of-file will be indicated by undef in a scalar context or an empty list in a list context.) If you try to write to a bad filehandle, the data is silently discarded.

Fortunately, these dire consequences are avoidable. First of all, if we ask for warnings with -w or the warnings pragma, Perl will generally be able to tell us with a warning when it sees that we're using a bad filehandle. But even without that, open always tells us if it succeeded or failed by returning true for success or false for failure. You could write code like this:

     my $success = open LOG, ">>logfile";  # capture the return value
     if ( ! $success) {
       # The open failed
       . . .
     }

You could do it like that, but there's another way that we'll see in the next section.
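Here is a minimal sketch of that check in action, using a file that is known not to exist. The path is hypothetical, and printing the special variable $! (the operating system's reason for the failure) goes slightly beyond what the text has covered so far. The read inside the if block shows the immediate end-of-file described above; the warning Perl would normally print for it is switched off there to keep the output clean.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# A path inside a fresh, empty temp directory, so the file cannot exist.
my $dir     = tempdir(CLEANUP => 1);
my $missing = "$dir/no_such_file";    # hypothetical name

my $success = open CONFIG, "<$missing";    # capture the return value
my $line;

if ( ! $success) {
    # The open failed; $! holds the operating system's complaint.
    print "open failed: $!\n";

    # Reading from the bad filehandle gives an immediate end-of-file.
    # With warnings on, Perl would also report the bad filehandle here,
    # so the I/O warnings are silenced for this one statement.
    {
        no warnings 'io';
        $line = <CONFIG>;
    }
    print "read returned undef\n" unless defined $line;
}
```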

5.7.2. Closing a Filehandle

When you are finished with a filehandle, you may close it with the close operator like this:

     close BEDROCK;

Closing a filehandle tells Perl to inform the operating system that we're all done with the given data stream, so any last output data should be written to disk in case someone is waiting for it.^[*] Perl will automatically close a filehandle if you reopen it (that is, if you reuse the filehandle name in a new open) or if you exit the program.^[†]

^[†] Any exit from the program will close all filehandles, but if Perl breaks, pending output buffers won't get flushed. That is to say, if you accidentally crash your program by dividing by zero, for example, Perl will still run and ensure that data you've written will get output. But if Perl can't run (because you ran out of memory or caught an unexpected signal), the last few pieces of output may not be written to disk. Usually, this isn't a big issue.

Because of this, many Perl programs don't bother with close. But it's there if you want to be tidy, with one close for every open. In general, it's best to close each filehandle soon after you're done with it, though the end of the program often arrives soon enough.^[‡]

^[‡] Closing a filehandle will flush any output buffers and release any locks on the file. Since someone else may be waiting for those things, a long-running program should close each filehandle as soon as possible. But many of our programs will take only one or two seconds to run to completion, so this may not matter. Closing a filehandle also releases possibly limited resources, so it's more than being tidy.
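The flushing behavior can be observed with a small experiment, sketched here under the assumption that Perl buffers output to ordinary files (which it normally does). The script checks the file's size on disk before and after the close; the temporary file and the `or die` checks are scaffolding for the sketch. The size seen before the close depends on buffering, so treat that number as typical rather than guaranteed.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a throwaway directory so no real files are touched.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/fred";

open BEDROCK, ">$file" or die "Cannot create $file: $!";
print BEDROCK "yabba dabba doo\n";

# A small write usually sits in Perl's output buffer, so the file on
# disk is typically still empty at this point.
my $size_before = (-s $file) || 0;

# close flushes the buffer and reports whether that worked.
close BEDROCK or die "Cannot close $file: $!";

my $size_after = (-s $file) || 0;

print "before close: $size_before bytes\n";
print "after close:  $size_after bytes\n";
```

Checking the return value of close, as done here, is optional, but it is the only point at which a failed final write would be reported.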