Hack78.Ease Data Manipulation with Visitors


Hack 78. Ease Data Manipulation with Visitors

Use the Visitor pattern to separate data traversal from data handling.

Early in my career as a programmer, I did a lot of scientific programming with data acquisition systems. These were systems that recorded data at a sampling interval of 3 microsecondsin other words, 333,333 samples per second. That came out to 38 megabytes for every minute of information! For long recording sessions, a file could easily get into gigabytes. Needless to say, we had problems recording and storing that much information without any hiccups.

Another problem had to do with analyzing this data. How do you analyze a multigigabyte file when the machine doing the work has only 128 MB of memory? The answer is to chunk the data. Chunking means reading in the file by sections, swapping out the data you don't need and swapping in what you do need.

That said, you'd think that these scientific algorithms were tough enough without worrying about swapping in and out big chunks of data. To solve these problemsand do it elegantlywe used the Visitor pattern. One object would handle getting the data in and out of memory, and another object would handle processing the data when it was in memory.

Figure 7-12 shows a RecordList object that contains a list of Records. It has an iterate( ) function that, when given another function, calls the passed-in function on each record.

With this approach, the data processing functionpassed in to iterate()doesn't have to understand how records are managed in memory. All the function has to do is handle the data it's given.

7.13.1. The Code

Save the code in Example 7-17 as visitor1.php.

Figure 7-12. A RecordList with an iterate method


Example 7-17. A Visitor pattern moving over database records
 <?php class Record {   public $name;   public $age;   public $salary;   public function Record( $name, $age, $salary )   {     $this->name = $name; $this->age = $age; $this->salary = $salary;   } } class RecordList {   private $records = array();   public function RecordList()   {     $this->records []= new Record( "Larry", 22, 35000 ); $this->records []= new Record( "Harry", 25, 37000 ); $this->records []= new Record( "Mary", 42, 65000 ); $this->records []= new Record( "Sally", 45, 80000 );      }   public function iterate( $func )   {     foreach( $this->records as $r ) {   call_user_func( $func, $r ); }   } } $min = 100000; function find_min_salary( $rec ) {   global $min;   if( $rec->salary < $min ) { $min = $rec->salary; } } $rl = new RecordList(); $rl->iterate( "find_min_salary", $min ); echo( $min."\n" ); ?> 

7.13.2. Running the Hack

You run this hack on the command line like this:

 % php visitor1.php 35000 

This particular algorithm finds the lowest salary of all of the records it sees.

The code in this script is fairly simple. The Record class holds the data for each record. The RecordList class then preloads itself with some dummy data (in a real system, this data might be read from a database or filesystem). In the iterate( ) method, the foreach loop walks through the list of records. call_user_func( ) calls the passed-in data processing function on each record. In this example, that function is find_min_salary( ); it inspects each record to find the lowest salary value.

7.13.3. Hacking the Hack

I find the function-based version of this pattern a little clunky. I would rather specify a visitor object that will receive each record. That way, the object can hold the minimum value that can be accessed later.

Figure 7-13 shows a variant where the iterate( ) method takes an object (of type RecordVisitor) rather than a function.

The updated code is shown in Example 7-18.

Figure 7-13. A variant where the visitor is an object


Example 7-18. An updated version of the visitor
 <?php class Record {   public $name;   public $age;   public $salary;   public function Record( $name, $age, $salary )   {     $this->name = $name; $this->age = $age; $this->salary = $salary;   } } abstract class RecordVisitor {   abstract function visitRecord( $rec ); } class RecordList {   private $records = array();   public function RecordList()   { $this->records []= new Record( "Larry", 22, 35000 ); $this->records []= new Record( "Harry", 25, 37000 ); $this->records []= new Record( "Mary", 42, 65000 ); $this->records []= new Record( "Sally", 45, 80000 );   }   public function iterate( $vis )   {     foreach( $this->records as $r ) {   call_user_func( array( $vis, "visitRecord" ), $r ); }   } } class MinSalaryFinder extends RecordVisitor {   public $min = 1000000;   public function visitRecord( $rec )   {     if( $rec->salary < $this->min ) { $this->min = $rec->salary; }   } } $rl = new RecordList(); $msl = new MinSalaryFinder(); $rl->iterate( $msl ); echo( $msl->min."\n" ); ?> 

I have added the RecordVisitor abstract class, and implemented it with the MinSalaryFinder class; this implementation stores a minimum value. The test code now creates a RecordList, then creates a MinSalaryFinder, and applies it to the list through the iterate( ) method. Finally, the minimum value is printed.

A couple of thoughts about this hack are in order before leaving it. First, PHP is not very good at calling functions dynamically. Specifying a function by name is weak and prone to error. Python, Perl, Ruby, Java, and C# (and almost every other language) all have the ability to assign a function pointer to a value. Then it's easy to invoke a method through that function pointer (or reference, depending on the language). I like PHP just like everyone else, but I think this should be cleaned up in the next version of the language.



PHP Hacks
PHP Hacks: Tips & Tools For Creating Dynamic Websites
ISBN: 0596101392
EAN: 2147483647
Year: 2006
Pages: 163

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net