Recipe 8.8 Reading a Particular Line in a File

8.8.1 Problem

You want to extract a single line from a file.

8.8.2 Solution

The simplest solution is to read the lines until you get to the one you want:

  # looking for line number $DESIRED_LINE_NUMBER
  $. = 0;
  do { $LINE = <HANDLE> } until $. == $DESIRED_LINE_NUMBER || eof;

If you are going to be doing this a lot and the file fits into memory, read the file into an array:

  @lines = <HANDLE>;
  $LINE = $lines[$DESIRED_LINE_NUMBER - 1];    # array indices start at 0

The standard (as of v5.8) Tie::File ties an array to a file, one line per array element:

  use Tie::File;
  use Fcntl;
  tie(@lines, 'Tie::File', $FILE, mode => O_RDWR)
    or die "Cannot tie file $FILE: $!\n";
  $line = $lines[$sought - 1];

If you have the DB_File module, its DB_RECNO access method ties an array to a file, one line per array element:

  use DB_File;
  use Fcntl;
  $tie = tie(@lines, 'DB_File', $FILE, O_RDWR, 0666, $DB_RECNO)
    or die "Cannot open file $FILE: $!\n";
  # extract it
  $line = $lines[$sought - 1];

8.8.3 Discussion

Each strategy has different features, useful in different circumstances. The linear access approach is easy to write and best for short files. The Tie::File module gives good performance, regardless of the size of the file or which line you're reading (and is pure Perl, so doesn't require any external libraries). The DB_File mechanism has some initial overhead, but later accesses are faster than with linear access, so use it for long files that are accessed more than once and are accessed out of order.

It is important to know whether you're counting lines from 0 or 1. The $. variable is 1 after the first line is read, so count from 1 when using linear access. The index mechanism (the byte-address index mentioned at the end of this Discussion) works with offsets, so count from 0. Tie::File and DB_File treat the file's records as an array indexed from 0, so line N of the file is element N-1 of the array; that is why the Solution subscripts with $sought - 1.
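The difference is easy to see side by side. The following is a minimal sketch, not part of the original examples: the sample text and the variable names ($text, $wanted, $by_scan, $by_index) are made up for illustration, and the data lives in an in-memory filehandle so the script needs no external file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, kept in memory so the sketch is self-contained.
my $text   = "alpha\nbeta\ngamma\n";
my $wanted = 2;                      # the second line, counting from 1

# Linear access: $. is 1 after the first line is read, so compare directly.
open(my $fh, "<", \$text) or die "Can't open in-memory file: $!\n";
my $by_scan;
while (defined(my $line = <$fh>)) {
    if ($. == $wanted) {
        $by_scan = $line;
        last;
    }
}
close($fh);

# Array access: element 0 holds line 1, so subtract 1 from the line number.
my @lines    = split /^/, $text;     # split into lines, keeping the newlines
my $by_index = $lines[$wanted - 1];

print "scan:  $by_scan";
print "index: $by_index";
```

Both print statements show the same line, because the array subscript has been shifted down by one to match the 1-based value that $. reports.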

Here are three different implementations of the same program, print_line. The program takes two arguments: a filename and a line number to extract.

The version in Example 8-1 simply reads lines until it finds the one it's looking for.

Example 8-1. print_line-v1
  #!/usr/bin/perl -w
  # print_line-v1 - linear style

  @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";

  ($filename, $line_number) = @ARGV;
  open(INFILE, "<", $filename)
    or die "Can't open $filename for reading: $!\n";
  while (<INFILE>) {
      $line = $_;
      last if $. == $line_number;
  }
  if ($. != $line_number) {
      die "Didn't find line $line_number in $filename\n";
  }
  print;

The Tie::File version is shown in Example 8-2.

Example 8-2. print_line-v2
  #!/usr/bin/perl -w
  # print_line-v2 - Tie::File style
  use Tie::File;
  use Fcntl;

  @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";
  ($filename, $line_number) = @ARGV;
  tie @lines, 'Tie::File', $filename, mode => O_RDWR
      or die "Can't open $filename for reading: $!\n";
  # the file is too short if it has fewer lines than the one requested
  if (@lines < $line_number) {
      die "Didn't find line $line_number in $filename\n";
  }
  # Tie::File strips the line terminator, so add it back when printing
  print "$lines[$line_number-1]\n";

The DB_File version in Example 8-3 follows the same logic as Tie::File.

Example 8-3. print_line-v3
  #!/usr/bin/perl -w
  # print_line-v3 - DB_File style
  use DB_File;
  use Fcntl;

  @ARGV == 2 or die "usage: print_line FILENAME LINE_NUMBER\n";
  ($filename, $line_number) = @ARGV;
  $tie = tie(@lines, 'DB_File', $filename, O_RDWR, 0666, $DB_RECNO)
      or die "Cannot open file $filename: $!\n";

  # the last valid line number equals the record count
  unless ($line_number <= $tie->length) {
      die "Didn't find line $line_number in $filename\n"
  }

  # DB_RECNO records come back without the line terminator
  print "$lines[$line_number-1]\n";                    # easy, eh?

If you will be retrieving lines by number often and the file doesn't fit into memory, build a byte-address index to let you seek directly to the start of the line using the techniques in Recipe 8.27.
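The idea behind such an index can be sketched in a few lines. This is an illustration under stated assumptions, not the code from Recipe 8.27: the helper names build_offsets and line_at are hypothetical, the sample data is written to a temporary file, and the offsets are kept in an in-memory array for brevity (for a very large file you would more likely store them on disk).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl qw(:seek);
use File::Temp qw(tempfile);

# build_offsets (hypothetical helper): return a reference to an array
# holding the byte offset at which each line of the file starts.
sub build_offsets {
    my ($fh) = @_;
    seek($fh, 0, SEEK_SET) or die "Can't rewind: $!\n";
    my @offsets = (0);                  # line 1 starts at byte 0
    while (defined(my $line = <$fh>)) {
        push @offsets, tell($fh);       # where the following line starts
    }
    pop @offsets;                       # final entry is end-of-file, not a line
    return \@offsets;
}

# line_at (hypothetical helper): fetch line $n, counting from 1.
# Returns undef if the file has no such line.
sub line_at {
    my ($fh, $offsets, $n) = @_;
    return unless $n >= 1 && $n <= @$offsets;
    seek($fh, $offsets->[$n - 1], SEEK_SET) or die "Can't seek: $!\n";
    my $line = <$fh>;
    return $line;
}

# Demonstration on a small temporary file with made-up contents.
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "first\nsecond\nthird\n";

my $offsets = build_offsets($fh);
my $line    = line_at($fh, $offsets, 3);
print $line if defined $line;
```

Building the index costs one pass over the file; after that, each lookup is a single seek and one read, no matter which line is requested.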

8.8.4 See Also

The documentation for the standard Tie::File and DB_File modules (also in Chapter 32 of Programming Perl); the tie function in perlfunc(1) and in Chapter 29 of Programming Perl; the entry on $. in perlvar(1) and in Chapter 28 of Programming Perl; Recipe 8.27



Perl Cookbook
Perl Cookbook, Second Edition
ISBN: 0596003137
Year: 2003
Pages: 501
