Item 11: Consider different ways of reading from a stream. | Effective Perl Programming: Writing Better Programs with Perl


The line input operator `<filehandle>` can be used to read either a single line from a stream in a scalar context, or the entire contents of a stream in a list context. Which method you should use depends on your need for efficiency, access to the lines read, and other factors like syntactic convenience.
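As a small illustration of the two contexts, here is a sketch that reads from an in-memory filehandle. The sample text and the variable names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data, read through an in-memory filehandle.
my $text = "alpha\nbeta\ngamma\n";
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

my $first = <$fh>;    # scalar context: reads exactly one line
my @rest  = <$fh>;    # list context: reads every remaining line
close $fh;

print "first line: $first";
print "remaining lines: ", scalar(@rest), "\n";
```

The same operator does two different jobs here; only the context of the assignment changes.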

The line-at-a-time method is the most efficient in terms of memory, and is as fast as "ordinary" alternatives. The implicit while (<>) form is equivalent in speed to the corresponding explicit code:

    while (<FH>) {
        # do something with $_
    }

The usual implicit line-at-a-time loop using `<FH>` inside `while`.

    while (defined($line = <FH>)) {
        # do something with $line
    }

Explicit version, similar logic.

Note the use of the defined operator. This prevents the loop from missing a line if the very last line of a file is the single character "0" with no terminating newline, a string that Perl treats as false. This is not a likely occurrence, but it can't hurt to be careful.
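To see why the test matters, the following sketch reads the same hypothetical data twice, once stopping on a plain truth test and once stopping on `defined`. The helper names `count_by_truth` and `count_by_defined` are invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical input: the final line is "0" with no trailing newline.
my $data = "first\nsecond\n0";

# Count lines, stopping as soon as a line tests false.
sub count_by_truth {
    open my $fh, '<', \$data or die "cannot open in-memory file: $!";
    my $n = 0;
    for (;;) {
        my $line = <$fh>;
        last unless $line;            # "0" is false, so the last line is lost
        $n++;
    }
    close $fh;
    return $n;
}

# Count lines, stopping only when the read returns undef (end of file).
sub count_by_defined {
    open my $fh, '<', \$data or die "cannot open in-memory file: $!";
    my $n = 0;
    for (;;) {
        my $line = <$fh>;
        last unless defined $line;    # only undef ends the loop
        $n++;
    }
    close $fh;
    return $n;
}

printf "truth test: %d lines, defined test: %d lines\n",
    count_by_truth(), count_by_defined();
```

The truth test stops one line early on this input, while the `defined` test sees all three lines.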

You can use a similar syntax with a foreach loop to read the entire file into memory in a single operation:

    foreach (<FH>) {
        # do something with $_
    }

Read the whole file into memory, then step through it.
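One way to observe the difference between the two loops is to watch the line counter `$.`. This sketch uses made-up data and array names: the `while` loop sees the counter advance with each line, but the `foreach` loop has already read everything before its body runs.

```perl
use strict;
use warnings;

my $text = "a\nb\nc\n";    # hypothetical three-line input

# Line at a time: $. advances as each line is read.
open my $in1, '<', \$text or die "open failed: $!";
my @while_nums;
while (<$in1>) {
    push @while_nums, $.;
}
close $in1;

# All at once: every line is read before the loop body first runs,
# so $. already holds the final line count.
open my $in2, '<', \$text or die "open failed: $!";
my @foreach_nums;
foreach (<$in2>) {
    push @foreach_nums, $.;
}
close $in2;

print "while:   @while_nums\n";
print "foreach: @foreach_nums\n";
```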

The all-at-once method uses more memory than the line-at-a-time method, but it is potentially faster. If all you want to do is step through the lines in a short file, it probably won't matter which method you use. All-at-once has its advantages when combined with operations like sorting:

 print sort <FH>;

Print a file with its lines sorted "ASCIIbetically."
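"ASCIIbetical" order puts every uppercase letter ahead of every lowercase one, which is not always what you want. The sketch below uses invented sample data to compare the default sort with a case-insensitive comparison block. The comparison block is an addition for illustration, not part of the one-liner above.

```perl
use strict;
use warnings;

my $text = "pear\nApple\nbanana\nCherry\n";    # hypothetical input
open my $fh, '<', \$text or die "open failed: $!";
my @lines = <$fh>;                              # read all lines at once
close $fh;

my @by_code   = sort @lines;                         # default string order
my @by_letter = sort { lc($a) cmp lc($b) } @lines;   # ignore case

print "default order:\n", @by_code;
print "case-insensitive order:\n", @by_letter;
```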

All-at-once may be appropriate if you need access to more than one line at a time:

Read in a file all at once to manipulate more than one line at a time.

    @f = <FH>;                           # read in the whole file
    foreach (0 .. $#f) {                 # look at a "window" of lines
        if ($f[$_] =~ /\bShazam\b/) {    # looking for Shazam
            $lo = ($_ > 0)   ? $_ - 1 : $_;
            $hi = ($_ < $#f) ? $_ + 1 : $_;
            # print 3 adjacent lines with line numbers
            print map { "$_: $f[$_]" } $lo .. $hi;
        }
    }

Many of these situations can still be handled with line-at-a-time input, although the code is definitely more complex:

Use a queue to manipulate more than one line at a time.

    @f[0..2] = ("\n") x 3;
    for (;;) {
        @f[0..2] = (@f[1, 2], scalar(<FH>));
        last if not defined $f[1];
        if ($f[1] =~ /\bShazam\b/) {
            print map { ($_ + $. - 1) . ": $f[$_]" } 0..2;
        }
    }

Initialize the queue.

Queue with a slice assignment.

Looking for `Shazam`.

Print 3 adjacent lines with line numbers, again.

Maintaining a queue of lines of text with slice assignments makes this slower than the equivalent all-at-once code, but this technique works for arbitrarily large input. The queue could also be implemented with an index variable rather than a slice assignment, which would result in more complex but faster-running code.
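One possible shape for the index-variable version is a three-slot ring buffer, where line number `n` is stored in slot `n % 3` and nothing is ever shifted. This is only a sketch under some assumptions: the function name `windows_around` is invented, line numbers are 1-based, and the results are returned as a list rather than printed.

```perl
use strict;
use warnings;

# Return "N: line" strings for each matching line and its neighbors.
sub windows_around {
    my ($fh, $pattern) = @_;
    my @ring;       # three slots, reused in rotation
    my @out;
    my $n = 0;      # number of lines read so far
    for (;;) {
        my $line = <$fh>;
        my $mid  = $n;                 # number of the line read last time
        if (defined $line) {
            $n++;
            $ring[$n % 3] = $line;     # overwrite the oldest slot
        }
        # The previous line now has its successor (or end of file) in place.
        if ($mid >= 1 && $ring[$mid % 3] =~ $pattern) {
            for my $k ($mid - 1 .. $mid + 1) {
                next if $k < 1 || $k > $n;    # no neighbor at file edges
                push @out, "$k: $ring[$k % 3]";
            }
        }
        last unless defined $line;
    }
    return @out;
}

# Hypothetical input for demonstration.
my $text = "one\nShazam here\nthree\nfour\nShazam\n";
open my $fh, '<', \$text or die "open failed: $!";
my @got = windows_around($fh, qr/\bShazam\b/);
close $fh;
print @got;
```

Each line is stored exactly once, so the per-line cost is a single assignment plus the pattern match, at the price of the modular index arithmetic.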

If your goal is simply to read a file into memory as quickly as possible, you might consider undefining the input record separator variable `$/` and reading the entire file as a single string. This will read the contents of a file or stream much faster than either of the alternatives above:

    {
        local $/;
        $the_file = <FH>;
    }

No input separator.

Slurp! Entire file in `$the_file`.
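The `local` matters because `$/` is global: without it, every later read in the program would also slurp. A compact variant of the same idea, sketched here with made-up data, wraps the read in a `do` block and then checks that the separator is back to its default afterward.

```perl
use strict;
use warnings;

my $text = "line one\nline two\nline three\n";    # hypothetical input
open my $fh, '<', \$text or die "open failed: $!";

# $/ is undefined only inside the do block; the read returns one string.
my $whole = do { local $/; <$fh> };
close $fh;

my $newlines = ($whole =~ tr/\n//);    # count newlines in the slurped text
print "read ", length($whole), " characters containing $newlines newlines\n";
print "separator restored\n" if defined $/ && $/ eq "\n";
```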

Finally, the read and sysread operators are useful for quickly scanning a file if line boundaries are of no importance:

Use read or sysread for maximum speed.

Compare files by reading blocks from each with `sysread` .
    open FH1, $file1 or die;               # open two files
    open FH2, $file2 or die;
    my $chunk = 4096;                      # block size to read
    my ($bytes, $buf1, $buf2, $diff);      # set up buffers, etc.

    CHUNK: while ($bytes = sysread FH1, $buf1, $chunk) {    # read a chunk from FH1
        sysread FH2, $buf2, $chunk;                         # read a chunk from FH2
        $diff++, last CHUNK if $buf1 ne $buf2;              # compare chunks
    }
    # FH1 is exhausted; any data still left in FH2 is also a difference.
    $diff++ if !$diff and sysread FH2, $buf2, $chunk;
    print "$file1 and $file2 differ" if $diff;
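The comparison above uses `sysread`; the buffered `read` function follows the same block-at-a-time pattern. As a sketch, the hypothetical helper below (`count_newlines` is an invented name) counts newline characters by scanning fixed-size blocks, never splitting the input into lines.

```perl
use strict;
use warnings;

# Count newline characters by reading fixed-size blocks.
sub count_newlines {
    my ($fh, $chunk) = @_;
    my ($buf, $count) = ('', 0);
    for (;;) {
        my $got = read($fh, $buf, $chunk);     # undef on error, 0 at end of file
        die "read failed: $!" unless defined $got;
        last if $got == 0;
        $count += ($buf =~ tr/\n//);           # tr counts without modifying $buf
    }
    return $count;
}

# Hypothetical input and a deliberately tiny block size for demonstration.
my $text = "a\nbb\nccc\n";
open my $fh, '<', \$text or die "open failed: $!";
my $total = count_newlines($fh, 4);
close $fh;
print "newlines: $total\n";
```

A block size of 4 is used here only to force several reads on a short string; for real files, a size such as the 4096 used in the comparison is more typical.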