Hack 100. Overload Your Operators | Perl Hacks: Tips & Tools for Programming, Debugging, and Surviving

Make your objects look like numbers, strings, and booleans sensibly.

Few people realize that Perl is an operator-oriented language, where the behavior of data depends on the operations you perform on it. You've probably had the experience of inadvertently stringifying an object or reference and wondering where and why you suddenly see memory addresses.

Fortunately, you can control what happens to your objects in various contexts.

Consider the Number::Intervals module from "Track Your Approximations" [Hack #99]. It's useful, but as shown there it has a few drawbacks.

The effect of the import( ) subroutine is that any code that declares:

use Number::Intervals;

will thereafter have every floating-point constant replaced by a Number::Intervals object that encodes upper and lower bounds on the original constant. That impressive achievement (utterly impossible in most other programming languages) will, sadly, be somewhat undermined when you then write:

use Number::Intervals;

my $avogadro    = 6.02214199e23;   # standard physical constant
my $atomic_mass = 55.847;          # atomic mass of iron
my $mass        = 100;             # mass in grams

my $count       = int( $mass * $avogadro/$atomic_mass );

print "Number of atoms in $mass grams of iron = $count\n";

The unfortunate result is:

$ perl count_atoms.pl
Number of atoms in 100 grams of iron = 99

Iron atoms are heavy, but they're not that heavy. The correct answer is just a little over 1 million billion billion, so converting to intervals appears to have made the calculation noticeably less accurate.

The problem is that the import( ) code you implemented to reinterpret Perl's floating-point constants did just that. It converted those constants into interval objects; that is, into references to blessed arrays. When you multiply and divide those interval objects, Perl converts the corresponding array references to integer addresses, which it then multiplies and divides. The calculation:

$mass * $avogadro / $atomic_mass

becomes something like:

100 * 0x1808248 / 0x182dc10

which is:

100 * 25199176 / 25353232

which is where the spurious 99 came from.
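The effect is easy to reproduce without Number::Intervals at all. The following is a minimal sketch, assuming a hypothetical class name My::Interval that has no overloading; the addresses (and therefore the ratio) will differ on every run.

```perl
use strict;
use warnings;
use Scalar::Util qw( refaddr );

# Stand-ins for interval objects: blessed arrays of [lower, upper].
# The class name My::Interval is hypothetical and defines no overloading.
my $x = bless [ 55.8,   55.9   ], 'My::Interval';
my $y = bless [ 6.0e23, 6.1e23 ], 'My::Interval';

# In numeric context, an un-overloaded reference yields its memory address.
my $as_number = $x + 0;
printf "numeric value: %d, address: %d\n", $as_number, refaddr($x);

# So "arithmetic" on two such objects is really arithmetic on two addresses.
my $ratio = $y / $x;
print "ratio of addresses: $ratio\n";
```

The contents of the arrays never take part in the calculation, which is why the result has nothing to do with Avogadro's number.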

Somehow, you need to teach Perl not only how to convert floating-point numbers to interval objects, but also how to compute sensibly with those objects.

The Hack

The trick, of course, is to overload the arithmetic operators that will apply to Number::Intervals objects by using the overload pragma:

# Overload operators for Number::Intervals objects...
use overload (
    # Add two intervals by independently adding minima and maxima...
    q{+} => sub
    {
        my ($x, $y) = _check_args(@_);
        return _interval($x->[0] + $y->[0], $x->[1] + $y->[1]);
    },

    # Subtract intervals by subtracting maxima from minima and vice versa...
    q{-} => sub
    {
        my ($x, $y) = _check_args(@_);
        return _interval($x->[0] - $y->[1], $x->[1] - $y->[0]);
    },

    # Multiply intervals by taking least and greatest products...
    q{*} => sub
    {
        my ($x, $y) = _check_args(@_);
        return _interval($x->[0] * $y->[0], $x->[1] * $y->[0],
                         $x->[1] * $y->[1], $x->[0] * $y->[1],
                        );
    },

    # Divide intervals by taking least and greatest quotients...
    q{/} => sub
    {
        my ($x, $y) = _check_args(@_);
        return _interval($x->[0] / $y->[0], $x->[1] / $y->[0],
                         $x->[1] / $y->[1], $x->[0] / $y->[1],
                        );
    },

    # Exponentiate intervals by taking least and greatest powers...
    q{**} => sub
    {
        my ($x, $y) = _check_args(@_);
        return _interval($x->[0] ** $y->[0], $x->[1] ** $y->[0],
                         $x->[1] ** $y->[1], $x->[0] ** $y->[1],
                        );
    },

    # Integer value of an interval is integer value of bounds...
    q{int} => sub
    {
        my ($x) = @_;
        return _interval(int $x->[0], int $x->[1]);
    },

    # Square root of interval is square roots of bounds...
    q{sqrt} => sub
    {
        my ($x) = @_;
        return _interval(sqrt $x->[0], sqrt $x->[1]);
    },

    # Unary minus: negate bounds and swap upper/lower:
    q{neg} => sub
    {
        my ($x) = @_;
        return _interval(-$x->[1], -$x->[0]);
    },

    # etc. etc. for the other arithmetic operators...
);
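To see why the multiplication handler passes four candidate products, here is a small worked sketch that uses plain two-element arrays rather than Number::Intervals objects. The bounds( ) helper is hypothetical; it assumes, as the handlers' comments suggest, that the resulting interval spans the least and greatest of the candidates.

```perl
use strict;
use warnings;
use List::Util qw( min max );

# Hypothetical helper: the smallest and largest of a list of candidates.
sub bounds { return ( min(@_), max(@_) ) }

my @x = ( -2, 3 );    # the interval [-2, 3]
my @y = (  4, 5 );    # the interval [ 4, 5]

# Candidate products from every pairing of bounds, as in the q{*} handler.
my @product = bounds( $x[0] * $y[0], $x[1] * $y[0],
                      $x[1] * $y[1], $x[0] * $y[1] );

# Subtraction pairs each minimum with the opposite maximum.
my @difference = ( $x[0] - $y[1], $x[1] - $y[0] );

print "product:    [@product]\n";       # prints [-10 15]
print "difference: [@difference]\n";    # prints [-7 -1]
```

Because one interval here spans zero, the least product (-10) comes from pairing a lower bound with an upper bound, which is why multiplying only minimum by minimum and maximum by maximum would not be enough.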

The overload module expects a list of key/value pairs, where each key is the name of an operator and each value is a subroutine that implements that operator. Once they're installed, each of the implementation subroutines will be called whenever an object of the class is an argument to the corresponding operator.

Unary operators (including int, neg, and sqrt) only need their first argument, the operand object (Perl fills the other two positions with undef and a false flag). Binary operators (like +, *, and **) use all three arguments: their two operands and an extra flag indicating whether the operands appear in reversed order (because the first operand wasn't an object). Binary operators therefore need to check, and sometimes unreverse, their arguments, which the _check_args( ) subroutine does for them:
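The calling convention is easier to trust once you have watched it in action. This is a minimal sketch, assuming a hypothetical My::Probe class whose only job is to record what its subtraction handler receives.

```perl
use strict;
use warnings;

{
    # Hypothetical class, used only to record a handler's arguments.
    package My::Probe;

    our @calls;

    use overload
        q{-} => sub {
            my ($x, $y, $reversed) = @_;
            push @calls, [ ref $x, $y, $reversed ? 'reversed' : 'in order' ];
            return 0;
        };

    sub new { return bless {}, shift }
}

my $obj = My::Probe->new;

my $r1 = $obj - 10;    # object on the left:  handler gets ($obj, 10, false)
my $r2 = 10 - $obj;    # object on the right: handler gets ($obj, 10, true)

# Show the class of the first argument, the second argument, and the flag.
print join( ', ', @$_ ), "\n" for @My::Probe::calls;
```

In both calls the object arrives first. Only the flag reveals that 10 - $obj was written the other way around, which matters for any operator that is not commutative.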

# Flip args if necessary, converting to an interval if not already...
sub _check_args
{
    my ($x, $y, $reversed) = @_;

    return $reversed              ?  ( _interval($y), $x            )
         : ref $y ne __PACKAGE__  ?  ( $x,            _interval($y) )
         :                           ( $x,            $y            );
}

Note that this utility subroutine also converts any non-interval arguments (integers, for example) to interval ranges. This means that, after calling _check_args( ), all of the binary handlers can be certain that their operands are in the correct order and that both operands are proper interval objects. This greatly simplifies the implementation of the overloaded operators. In particular, they don't need to implement three separate sets of logic for handling interval/number, number/interval, and interval/interval interactions.
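Here is a self-contained sketch of all three interactions passing through a single subtraction handler. The My::Interval package and its simplified _interval( ) constructor are assumptions made for illustration (the real constructor belongs to Hack #99 and is not reproduced here); _check_args( ) is copied from above.

```perl
use strict;
use warnings;

{
    package My::Interval;    # hypothetical stand-in for Number::Intervals
    use List::Util qw( min max );

    # Simplified constructor: an assumption, not the Hack #99 version.
    sub _interval { return bless [ min(@_), max(@_) ], __PACKAGE__ }

    # Flip args if necessary, converting to an interval if not already...
    sub _check_args {
        my ($x, $y, $reversed) = @_;
        return $reversed             ? ( _interval($y), $x            )
             : ref $y ne __PACKAGE__ ? ( $x,            _interval($y) )
             :                         ( $x,            $y            );
    }

    use overload
        q{-} => sub {
            my ($x, $y) = _check_args(@_);
            return _interval($x->[0] - $y->[1], $x->[1] - $y->[0]);
        };
}

my $i = My::Interval::_interval(2, 3);    # the interval [2, 3]

my $d1 = $i - 1;     # interval/number:   [2,3]   - [1,1] = [1, 2]
my $d2 = 10 - $i;    # number/interval:   [10,10] - [2,3] = [7, 8]
my $d3 = $i - $i;    # interval/interval: [2,3]   - [2,3] = [-1, 1]

# Dereference each result to show its lower and upper bounds.
print "[@$_]\n" for $d1, $d2, $d3;
```

The handler body is the same single line in every case; the ordering and promotion logic lives entirely in _check_args( ).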

Saying what you mean

Reimplementing the necessary operators enables you to add, subtract, multiply, divide, and so on, interval representations correctly. However, even with the overloading in place, the results of counting the atoms are still more ironic than ferric:

$ perl count_atoms_v2.pl
Number of atoms = Number::Intervals=ARRAY(0x182f89c)

The problem is that, although Perl now knows how to do operations on interval objects, it still has no idea how to convert those interval objects back to simple numbers, or to strings. When you try to print a floating-point interval object, it prints the string representation of the object reference, rather than the string representation of the value that the object represents.

Fortunately, it's easy to tell the interpreter how to convert intervals back to sensible numbers and strings. Just give the Number::Intervals class two extra handlers for stringification and numerification, like this:

use Carp;    # provides carp( ), used by the numerification handler

use overload (
    # Stringify intervals as: VALUE (UNCERTAINTY)...
    q{""} => sub
    {
        my ($self) = @_;

        my $uncert = ($self->[1] - $self->[0]) / 2;

        use charnames qw( :full );
        return $self->[0]+$uncert . " (\N{PLUS-MINUS SIGN}$uncert)";
    },

    # Numerify intervals by averaging their bounds (with warning)...
    q{0+} => sub
    {
        my ($self) = @_;

        carp "Approximating interval by a single (averaged) number";
        return ($self->[0] + $self->[1]) /2;
    },
);
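The two conversions can be tried in isolation on a small interval. This is a rough sketch, assuming a hypothetical My::Interval class that defines only these two handlers and omits the carp( ) warning. The fallback => 1 setting is an addition that does not appear in the hack; it is there only because this stripped-down class has no + handler of its own, so addition has to fall back on the numeric conversion.

```perl
use strict;
use warnings;
use charnames qw( :full );

{
    package My::Interval;    # hypothetical stand-in for Number::Intervals

    use overload
        # String context: midpoint followed by half the interval's width.
        q{""} => sub {
            my ($self) = @_;
            my $uncert = ($self->[1] - $self->[0]) / 2;
            return $self->[0] + $uncert . " (\N{PLUS-MINUS SIGN}$uncert)";
        },
        # Numeric context: the midpoint alone.
        q{0+} => sub {
            my ($self) = @_;
            return ($self->[0] + $self->[1]) / 2;
        },
        # Assumption: operators without handlers fall back on conversions.
        fallback => 1;
}

my $length = bless [ 99, 101 ], 'My::Interval';    # the interval [99, 101]

binmode STDOUT, ':encoding(UTF-8)';
print  "length   = $length\n";               # interpolation calls q{""}
printf "midpoint = %.1f\n", $length + 0;     # addition uses q{0+}
```

The first line prints the interval as 100 followed by its uncertainty of 1 in parentheses, and the second prints the bare midpoint, 100.0.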

With that back-translation in place, the floating-point calculations can finally proceed correctly, with their accuracy being automatically tracked and reported as well:

$ perl count_atoms_v3.pl
Number of atoms = 1.07832864612244e+24 (±805306368)