2.3 Matrices


Perl matrices are built from simpler data structures using references. Recall that a matrix is a set of values that can be uniquely referenced by indexes. If only one index is required, the matrix is one-dimensional (this is exactly how an array works in Perl). If n indexes are required, the matrix is n-dimensional.

2.3.1 Two-Dimensional Matrices

A two-dimensional matrix is one of the simplest complex data structures. It can be conceptualized as a table of rows and columns, in which each element of the table is uniquely identified by its particular row and column.

There are several ways to build matrices in Perl. We'll look at some of the most useful.

Because there is no built-in matrix data structure, you have to build a matrix from other data structures. The most straightforward way to do this is with an array of arrays:

 @probes = (
     [1, 3, 2, 9],
     [2, 0, 8, 1],
     [5, 4, 6, 7],
     [1, 9, 2, 8]
 );
 print "The probe at row 1, column 2 has value ", $probes[1][2], "\n";

This prints out:

 The probe at row 1, column 2 has value 8 

Recall that in Perl the first element of an array is indexed 0; so row 1 in this program is actually the second row, and column 2 is actually the third column. Sometimes you may want to refer to the 0th row as row 1; you have to adjust your code and your interactions with the user accordingly.
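One way to make that adjustment is to keep the matrix 0-based internally and translate only where the user's numbers come in. The following is a minimal sketch, not a prescribed method: probe_at is a hypothetical helper name introduced here, and it assumes the user counts rows and columns starting from 1.

```perl
use strict;
use warnings;

my @probes = (
    [1, 3, 2, 9],
    [2, 0, 8, 1],
    [5, 4, 6, 7],
    [1, 9, 2, 8],
);

# Hypothetical helper: takes 1-based row and column numbers
# and translates them to Perl's 0-based indexes.
sub probe_at {
    my ($row, $col) = @_;
    # Reject 0 and negative numbers before subtracting 1
    die "row and column must be 1 or greater\n" if $row < 1 or $col < 1;
    return $probes[$row - 1][$col - 1];
}

print "Counting from 1, row 2, column 3 has value ", probe_at(2, 3), "\n";
```

The guard is there because Perl treats a negative index as counting from the end of the array, so without it a request for row 0 would quietly return a value from the last row.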

This matrix is implemented as an array (in parentheses), each element of which is a reference to an anonymous array [in square brackets], which itself is a list of integers.
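To see that structure for yourself, you can ask Perl what each element of the outer array holds. The sketch below is illustrative only; the variable names $row_type, @second_row, $n_rows, and $n_cols are chosen here and do not come from the text.

```perl
use strict;
use warnings;

my @probes = (
    [1, 3, 2, 9],
    [2, 0, 8, 1],
    [5, 4, 6, 7],
    [1, 9, 2, 8],
);

# Each element of @probes is a reference to an anonymous array
my $row_type = ref $probes[1];

# Dereference one row to get an ordinary list of its values
my @second_row = @{ $probes[1] };

# Count the rows, and the columns in the first row
my $n_rows = scalar @probes;
my $n_cols = scalar @{ $probes[0] };

print "Row 1 is a reference of type $row_type\n";
print "Row 1 holds: @second_row\n";
print "The matrix has $n_rows rows and $n_cols columns\n";
```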

Another good way to build a matrix is to declare a reference to an anonymous array. In the following example, I declare an empty anonymous array and then populate it as desired. This is, in effect, an anonymous array of anonymous arrays:

 # Declare reference to (empty) anonymous array
 $array = [  ];

 # Initialize the array
 for ($i = 0; $i < 4; ++$i) {
     for ($j = 0; $j < 4; ++$j) {
         $array->[$i][$j] = $i * $j;
     }
 }

 # Reset one of the elements of the array
 $array->[3][2] = 99;

 # Print the array
 for ($i = 0; $i < 4; ++$i) {
     for ($j = 0; $j < 4; ++$j) {
         printf("%3d ", $array->[$i][$j]);
     }
     print "\n";
 }

Note the use of printf to format the output nicely. For a refresher on this Perl function, consult the Perl documentation, by typing:

 perldoc -f printf 

and

 perldoc -f sprintf 

at a shell prompt or check out http://www.perldoc.com.

This program produces the following output:

   0   0   0   0
   0   1   2   3
   0   2   4   6
   0   3  99   9

Alternatively, if the values are known, I can declare this as an anonymous array of anonymous arrays by saying:

 $array = [
     [0, 0, 0, 0],
     [0, 1, 2, 3],
     [0, 2, 4, 6],
     [0, 3, 99, 9]
 ];

I can also declare an array of anonymous arrays, by saying:

 @array = (
     [0, 0, 0, 0],
     [0, 1, 2, 3],
     [0, 2, 4, 6],
     [0, 3, 99, 9]
 );

Notice the slight syntactical difference between an array of anonymous arrays:

 @array = ( [  ], [  ], ... ); 

and an anonymous array of anonymous arrays:

 $array = [ [  ], [  ], ... ]; 
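The two forms hold the same kind of data and differ only in how you get hold of it. As a rough sketch, the code below passes both to one routine: format_matrix is a hypothetical helper written for this illustration, and @matrix_array and $matrix_ref are names chosen here to keep the two forms apart.

```perl
use strict;
use warnings;

# An array of anonymous arrays (named array, parentheses)
my @matrix_array = (
    [0, 0, 0, 0],
    [0, 1, 2, 3],
);

# An anonymous array of anonymous arrays (reference, square brackets)
my $matrix_ref = [
    [0, 2, 4, 6],
    [0, 3, 99, 9],
];

# Hypothetical helper: format a matrix, passed as a reference,
# as text with one row per line.
sub format_matrix {
    my ($matrix) = @_;
    my $text = '';
    foreach my $row (@$matrix) {
        $text .= join(' ', map { sprintf('%3d', $_) } @$row) . "\n";
    }
    return $text;
}

print format_matrix(\@matrix_array);   # take a reference to the named array
print format_matrix($matrix_ref);      # this one is already a reference
```

Because the routine loops over whatever rows and columns it finds, it does not need the matrix dimensions to be hard-coded.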

Note that Perl also allows you to say:

 $$array[$i][$j] 

as a synonym for:

 $array->[$i][$j] 

But beware confusing:

 $array->[$i][$j] 

with:

 $array[$i][$j] 

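They are not the same thing: the first follows the reference stored in the scalar $array, while the second indexes the array @array, which is a separate variable. They won't refer to the same array if you intermix them!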
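A small sketch makes the difference concrete. It deliberately declares two unrelated variables that share the name "array"; the values and the names $from_ref, $same_thing, and $from_named are made up for this illustration.

```perl
use strict;
use warnings;

# Two unrelated variables that happen to share the name "array"
my @array = ( [1, 2], [3, 4] );        # an array of anonymous arrays
my $array = [ [10, 20], [30, 40] ];    # a reference to an anonymous array

my $from_ref   = $array->[1][0];   # follows the reference in $array
my $same_thing = $$array[1][0];    # synonym for the line above
my $from_named = $array[1][0];     # indexes @array, a different structure

print "\$array->[1][0] is $from_ref\n";
print "\$\$array[1][0]  is $same_thing\n";
print "\$array[1][0]   is $from_named\n";
```

If only $array had been declared, the last lookup would fail to compile under use strict, because it refers to an undeclared @array. That is one practical reason to enable strict when working with references.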

Very often you read data in from a file that has the elements of a matrix displayed one row per line, and you have to store the data from that file in an array in your Perl program. Say you have the following data:

   0   0   0   0
   0   1   2   3
   0   2   4   6
   0   3  99   9

You can read the data into a Perl array with the following loop:

 while (<>) {
     @row = split;
     push(@array, [ @row ]);
 }

This assumes that you've named the file on the command line as an argument to the program. Note that each incoming line is assigned to the special variable $_ on each iteration through the while loop. The split function uses this line stored in $_ by default. Each incoming line is split into an array of its whitespace-separated elements, and then an anonymous array [ @row ] containing those elements is pushed onto the @array array.
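The square brackets in [ @row ] matter: they copy the current row into a fresh anonymous array. The sketch below contrasts that with pushing \@row, a reference to the one variable that is reused on every pass. It is an illustration under stated assumptions: the $data string stands in for the input file, and @copied and @shared are names introduced here.

```perl
use strict;
use warnings;

# Stand-in for the contents of the input file
my $data = "0   0   0   0\n0   1   2   3\n0   2   4   6\n0   3  99   9\n";

my @row;                  # declared once and reused, like @row in the loop above
my (@copied, @shared);

foreach my $line (split /\n/, $data) {
    @row = split ' ', $line;     # whitespace-separated elements of this line
    push @copied, [ @row ];      # a new anonymous array holding a copy
    push @shared, \@row;         # the same reference on every pass
}

print "copied row 1: @{ $copied[1] }\n";
print "shared row 1: @{ $shared[1] }\n";
```

Every element of @shared points at the same @row, so after the loop each "row" shows the last line read, while @copied keeps the four rows distinct.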

For more details on arrays of arrays, see the perllol manpage; type perldoc perllol at your command prompt or visit the Perl documentation web site at http://www.perldoc.com.

2.3.2 Higher-Dimensional Matrices

To use a higher-dimensional matrix, simply add another dimension:

 # Populate a 3-dimensional array
 $array = [  ];

 # Initialize the array
 for ($i = 0; $i < 4; ++$i) {
     for ($j = 0; $j < 4; ++$j) {
         for ($k = 0; $k < 4; ++$k) {
             $array->[$i][$j][$k] = $i * $j * $k;
         }
     }
 }

The sharp-witted reader may have noticed that we seem to be omitting arrow operators between array subscripts. (After all, these are anonymous arrays of anonymous arrays of anonymous arrays, etc., so shouldn't they be written $array->[$i]->[$j]->[$k]?) Perl allows this; only the arrow operator between the variable name and the first array subscript is required. It makes things easier on the eyes and helps avoid carpal tunnel syndrome. On the other hand, you may prefer to keep the dereferencing arrows in place, to make it clear you are dealing with references. Your choice.
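The equivalence is easy to check. In this sketch, which reuses the three-dimensional structure with lexical variables, the names $short, $long, and $sigil are chosen here only to label the three spellings.

```perl
use strict;
use warnings;

# Build the same 4 x 4 x 4 structure as above
my $array = [];
for (my $i = 0; $i < 4; ++$i) {
    for (my $j = 0; $j < 4; ++$j) {
        for (my $k = 0; $k < 4; ++$k) {
            $array->[$i][$j][$k] = $i * $j * $k;
        }
    }
}

my $short = $array->[3][2][1];        # arrows between subscripts omitted
my $long  = $array->[3]->[2]->[1];    # every arrow written out
my $sigil = $$array[3][2][1];         # the $$ form of the first dereference

print "short: $short, long: $long, sigil: $sigil\n";
```

All three expressions reach the same element, whose value is 3 * 2 * 1.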

There's no need to stop at three-dimensional arrays. If higher-dimensional arrays are hard to imagine, just don't think of "dimension" as tied to space. For instance, four-dimensional arrays have points that are uniquely identified by four indices; five-dimensional arrays have points that are uniquely identified by five indices, etc. In fact, some theories in physics describe the universe itself as having as many as eleven dimensions.

2.3.3 Sparse Arrays

Some programs need arrays, but only a small number of the array elements are ever used. Such arrays are called sparse arrays.

It would be inefficient to declare, for instance, a 1,000-by-1,000 element array, 1 million elements in all, if only 100 elements are ever actually used. For such sparse two-dimensional arrays, it's best to implement the array as a hash of hashes:

 $array = {  };
 $array->{4}{83} = 'set';
 $array->{34}{9} = 'set';
 print $array->{4}{83}, "\n";
 print $array->{34}{9}, "\n";

This prints out:

 set
 set

Perl creates only the table elements referenced, which makes an efficient implementation for a sparse matrix. However, because merely looking inside a nested location (to see if there's anything there) can create an entry in the hash as a side effect, you have to use the Perl exists function to keep your hashes sparse when looking at them. The exists function reports on whether a particular key (or array element) has been created, without actually creating it. [4] So to explore the sparse matrix just shown, you can say:

[4] The function defined is related but different; when used on a hash element, it checks whether the value is defined (that is, not undef), not whether the key exists.

 $array = {  };
 $array->{4}{83} = 'set';
 $array->{34}{9} = 'set';

 for (my $i = 0; $i < 100; ++$i) {
     for (my $j = 0; $j < 100; ++$j) {
         if ( exists($array->{$i}) and exists($array->{$i}{$j}) ) {
             print "Array element row $i column $j is $array->{$i}{$j}\n";
         }
     }
 }

This reports, without increasing the size of the array, that:

 Array element row 4 column 83 is set
 Array element row 34 column 9 is set
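Scanning every possible row and column works, but it visits 10,000 locations to find two entries. A different approach, sketched here as an alternative rather than as the method used above, is to walk only the keys that are actually stored. The @report array is a name introduced for this illustration.

```perl
use strict;
use warnings;

my $array = {};
$array->{4}{83} = 'set';
$array->{34}{9} = 'set';

# Visit only the rows and columns that exist, in numeric order
my @report;
foreach my $i (sort { $a <=> $b } keys %$array) {
    foreach my $j (sort { $a <=> $b } keys %{ $array->{$i} }) {
        push @report, "Array element row $i column $j is $array->{$i}{$j}";
    }
}

print "$_\n" for @report;
```

Since keys returns only entries that are already present, this loop cannot add anything to the hash, and its running time depends on the number of stored elements rather than on the nominal size of the matrix.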

Question: why did you need two exists tests? (Hint: it's a two-dimensional array.) Another question: is $array a hash or a reference to an anonymous hash? Can you implement it the other way? See the exercises for this chapter.
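One way to explore the first question is to count the top-level keys before and after each kind of test. The following is a sketch for experimentation, not the book's solution to the exercise; the row numbers 7 and 8 and the variable names are arbitrary choices made here.

```perl
use strict;
use warnings;

my $array = {};
$array->{4}{83} = 'set';
$array->{34}{9} = 'set';

my $rows_before = scalar keys %$array;

# Guarded test: if row 7 is missing, the second exists is never evaluated
my $found_guarded = (exists($array->{7}) && exists($array->{7}{1})) ? 1 : 0;
my $rows_after_guarded = scalar keys %$array;

# Unguarded test: reaching through row 8 creates an empty hash there
my $found_unguarded = exists($array->{8}{1}) ? 1 : 0;
my $rows_after_unguarded = scalar keys %$array;

print "Rows stored at the start:         $rows_before\n";
print "Rows after the guarded test:      $rows_after_guarded\n";
print "Rows after the unguarded test:    $rows_after_unguarded\n";
```

Neither test finds an element, but the unguarded one leaves an extra, empty row behind in the hash.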



Mastering Perl for Bioinformatics
ISBN: 0596003072
Year: 2003
Pages: 156
