13.13. Customizing Classes with Special Methods

We covered two important aspects of methods in preceding sections of this chapter: first, that methods must be bound (to an instance of their corresponding class) before they can be invoked; and second, that there are two special methods which provide the functionality of constructors and destructors, namely __init__() and __del__() respectively.

In fact, __init__() and __del__() are part of a set of special methods which can be implemented. Some have the predefined default behavior of inaction while others do not and should be implemented where needed. These special methods allow for a powerful form of extending classes in Python. In particular, they allow for:

Emulating standard types
Overloading operators

Special methods enable classes to emulate standard types by overloading standard operators such as +, *, and even the slicing subscript and mapping operator [ ]. As with most other special reserved identifiers, these methods begin and end with a double underscore ( __ ). Table 13.4 presents a list of all special methods and their descriptions.

Table 13.4. Special Methods for Customizing Classes
Special Method	Description
Basic Customization
`C.__init__`(`self`[, `arg1, ...`] )	Constructor (with any optional arguments)
`C.__new__`(`cls`[`, arg1, ...`] )^[a]	Constructor (with any optional arguments); usually used for setting up subclassing of immutable data types
`C.__del__`(`self`)	Destructor
`C.__str__`(`self`)	Printable string representation; `str()` built-in and `print` statement
`C.__repr__`(`self`)	Evaluatable string representation; `repr()` built-in and the backquote operator
`C.__unicode__`(`self`)^[b]	Unicode string representation; `unicode()` built-in
`C.__call__`(`self, *args`)	Denote callable instances
`C.__nonzero__`(`self`)	Define `False` value for object; `bool()` built-in (as of 2.2)
`C.__len__`(`self`)	"Length" (appropriate for class); `len()` built-in
Object (Value) Comparison^[c]
`C.__cmp__`(`self, obj`)	Object comparison; `cmp()` built-in
`C.__lt__`(`self, obj`) and `C.__le__`(`self, obj`)	Less than/less than or equal to; `<` and `<=` operators
`C.__gt__`(`self, obj`) and `C.__ge__`(`self, obj`)	Greater than/greater than or equal to; `>` and `>=` operators
`C.__eq__`(`self, obj`) and `C.__ne__`(`self, obj`)	Equal/not equal to; `==`, `!=`, and `<>` operators
Attributes
`C.__getattr__`(`self, attr`)	Get attribute; `getattr()` built-in; called only if attributes not found
`C.__setattr__`(`self, attr, val`)	Set attribute
`C.__delattr__`(`self, attr`)	Delete attribute
`C.__getattribute__`(`self, attr`)^[a]	Get attribute; `getattr()` built-in; always called
`C.__get__`(`self, obj, typ`)^[a]	(descriptor) Get attribute
`C.__set__`(`self, obj, val`)^[a]	(descriptor) Set attribute
`C.__delete__`(`self, obj`)^[a]	(descriptor) Delete attribute
Customizing Classes / Emulating Types
Numeric Types: Binary Operators^[d]
`C.__*add__`(`self, obj`)	Addition; `+` operator
`C.__*sub__`(`self, obj`)	Subtraction; `-` operator
`C.__*mul__`(`self, obj`)	Multiplication; `*` operator
`C.__*div__`(`self, obj`)	Division; `/` operator
`C.__*truediv__`(`self, obj`)^[e]	True division; `/` operator
`C.__*floordiv__`(`self, obj`)^[e]	Floor division; `//` operator
`C.__*mod__`(`self, obj`)	Modulo/remainder; `%` operator
`C.__*divmod__`(`self, obj`)	Division and modulo; `divmod()` built-in
`C.__*pow__`(`self, obj`[, `mod`])	Exponentiation; `pow()` built-in; `**` operator
`C.__*lshift__`(`self, obj`)	Left shift; `<<` operator
`C.__*rshift__`(`self, obj`)	Right shift; `>>` operator
`C.__*and__`(`self, obj`)	Bitwise AND; `&` operator
`C.__*or__`(`self, obj`)	Bitwise OR; `\|` operator
`C.__*xor__`(`self, obj`)	Bitwise XOR; `^` operator
Numeric Types: Unary Operators
`C.__neg__`(`self`)	Unary negation
`C.__pos__`(`self`)	Unary no-change
`C.__abs__`(`self`)	Absolute value; `abs()` built-in
`C.__invert__`(`self`)	Bit inversion; `~` operator
Numeric Types: Numeric Conversion
`C.__complex__`(`self`)	Convert to complex; `complex()` built-in
`C.__int__`(`self`)	Convert to int; `int()` built-in
`C.__long__`(`self`)	Convert to long; `long()` built-in
`C.__float__`(`self`)	Convert to float; `float()` built-in
Numeric Types: Base Representation (String)
`C.__oct__`(`self`)	Octal representation; `oct()` built-in
`C.__hex__`(`self`)	Hexadecimal representation; `hex()` built-in
Numeric Types: Numeric Coercion
`C.__coerce__`(`self, num`)	Coerce to same numeric type; `coerce()` built-in
`C.__index__`(`self`)^[g]	Coerce alternate numeric type to integer if/when necessary (e.g., for slice indexes, etc.)
Sequence Types^[e]
`C.__len__`(`self`)	Number of items in sequence
`C.__getitem__`(`self, ind`)	Get single sequence element
`C.__setitem__`(`self, ind, val`)	Set single sequence element
`C.__delitem__`(`self, ind`)	Delete single sequence element
`C.__getslice__`(`self, ind1, ind2`)	Get sequence slice
`C.__setslice__`(`self, i1, i2, val`)	Set sequence slice
`C.__delslice__`(`self, ind1, ind2`)	Delete sequence slice
`C.__contains__`(`self, val`)^[f]	Test sequence membership; `in` keyword
`C.__*add__`(`self, obj`)	Concatenation; `+` operator
`C.__*mul__`(`self, obj`)	Repetition; `*` operator
`C.__iter__`(`self`)^[e]	Create iterator class; `iter()` built-in
Mapping Types
`C.__len__`(`self`)	Number of items in mapping
`C.__hash__`(`self`)	Hash function value
`C.__getitem__`(`self, key`)	Get value with given `key`
`C.__setitem__`(`self, key, val`)	Set value with given `key`
`C.__delitem__`(`self, key`)	Delete value with given `key`
`C.__missing__`(`self, key`)^[g]	Provides default value when dictionary does not have given `key`

^[a] New in Python 2.2; for use with new-style classes only.

^[b] New in Python 2.3.

^[c] All except cmp() new in Python 2.1.

^[d] "*" either nothing (self OP obj), "r" (obj OP self), or "i" for in-place operation (new in Python 2.0), i.e., __add__, __radd__, or __iadd__.

^[e] New in Python 2.2.

^[f] "*" either nothing (self OP obj), "r" (obj OP self), or "i" for in-place operation (new in Python 1.6), i.e., __add__, __radd__, or __iadd__.

^[g] New in Python 2.5.

The Basic Customization and Object (Value) Comparison special methods can be implemented for most classes and are not tied to emulation of any specific types. The latter set, also known as Rich Comparisons, was added in Python 2.1.

The Attributes group helps manage instance attributes of your class. This is also independent of emulation. There is also one more, __getattribute__(), which applies to new-style classes only, so we will describe it in an upcoming section.

The Numeric Types set of special methods can be used to emulate various numeric operations, including those of the standard (unary and binary) operators, conversion, base representation, and coercion. There are also special methods to emulate sequence and mapping types. Implementation of some of these special methods will overload operators so that they work with instances of your class type.

The additional division operators __*truediv__() and __*floordiv__() were added in Python 2.2 to support the pending change to the Python division operator (also see Section 5.5.3). If the interpreter has the new division enabled, either via a switch when starting up Python or via the import of division from __future__, the single slash division operator ( / ) represents true division: it always returns a floating point value, regardless of whether floats or integers make up the operands (complex division stays the same). The double slash division operator ( // ) provides the floor division familiar to most engineers who come from standard compiled languages like C/C++ and Java. For instances of your own classes, these operators work only if the class implements the corresponding methods, and / is routed to __truediv__() only when new division is enabled.
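As a rough sketch of the two division hooks, the hypothetical Ratio class below implements both. To avoid depending on whether new division is switched on, the sketch calls operator.truediv() and operator.floordiv() from the standard library, which invoke the same special methods that / (under new division) and // would:

```python
import operator


class Ratio(object):
    """Hypothetical wrapper around a single number."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        # Hook for "/" when new division is in effect.
        return Ratio(float(self.value) / other.value)

    def __floordiv__(self, other):
        # Hook for "//".
        return Ratio(self.value // other.value)


a, b = Ratio(7), Ratio(2)
true_result = operator.truediv(a, b).value     # 3.5
floor_result = operator.floordiv(a, b).value   # 3
```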

Numeric binary operators in the table annotated with a wildcard asterisk in their names are so denoted to indicate that there are multiple versions of those methods with slight differences in their name. The asterisk either symbolizes no additional character in the string, or a single "r" to indicate a right-hand operation. Without the "r," the operation occurs for cases that are of the format self OP obj; the presence of the "r" indicates the format obj OP self. For example, __add__(self, obj) is called for self + obj, and __radd__(self, obj) would be invoked for obj + self.

Augmented assignment, new in Python 2.0, introduces the notion of "in-place" operations. An "i" in place of the asterisk implies a combination left-hand operation plus an assignment, as in self = self OP obj. For example, __iadd__(self, obj) is called for self += obj, which behaves like self = self + obj except that the object may be modified in place.
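The three variants can be seen side by side in a short sketch. Tally is a hypothetical class made up for this illustration; its last attribute simply records which special method produced the object:

```python
class Tally(object):
    """Hypothetical counter that records which special method ran."""

    def __init__(self, count):
        self.count = count
        self.last = None

    def __int__(self):
        return self.count

    def __add__(self, obj):            # self + obj
        result = Tally(self.count + int(obj))
        result.last = '__add__'
        return result

    def __radd__(self, obj):           # obj + self, when obj cannot handle it
        result = Tally(int(obj) + self.count)
        result.last = '__radd__'
        return result

    def __iadd__(self, obj):           # self += obj
        self.count += int(obj)
        self.last = '__iadd__'
        return self                    # the result is rebound to the target name


t = Tally(5)
left = t + 1        # Tally.__add__(t, 1)
right = 1 + t       # int cannot add a Tally, so Tally.__radd__(t, 1) runs
same = t
t += 10             # Tally.__iadd__(t, 10); t is modified in place
```

Because __iadd__() returns self, the names t and same still refer to one object after the augmented assignment, whereas left and right are newly created instances.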

With the arrival of new-style classes in Python 2.2, several more special methods have been added for overriding. However, as we mentioned at the beginning of the chapter, we are now focusing only on the core portion of material applicable to both classic classes as well as new-style classes, and then later on in the chapter, we address the advanced features of new-style classes.

13.13.1. Simple Customization (`RoundFloat2`)

Our first example is totally trivial. It is based to some extent on the RoundFloat class we saw earlier in the section on subclassing Python types. This example is simpler. In fact, we are not even going to subclass anything (except object of course)... we do not want to "take advantage" of all the "goodies" that come with floats. No, this time, we want to create a barebones example so that you have a better idea of how class customization works. The premise of this class is still the same as the other one: we just want a class to save a floating point number rounded to two decimal places.

    class RoundFloatManual(object):
        def __init__(self, val):
            assert isinstance(val, float), \
                "Value must be a float!"
            self.value = round(val, 2)

This class takes a single floating point value (it asserts that the type must be a float as it is passed to the constructor) and saves it as the instance attribute value. Let us try to execute it and create an instance of this class:

    >>> rfm = RoundFloatManual(42)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "roundFloat2.py", line 5, in __init__
        assert isinstance(val, float), \
    AssertionError: Value must be a float!
    >>> rfm = RoundFloatManual(4.2)
    >>> rfm
    <roundFloat2.RoundFloatManual object at 0x63030>
    >>> print rfm
    <roundFloat2.RoundFloatManual object at 0x63030>

As you can see, it chokes on invalid input, but provides no output if input was valid. But look what happens when we try to dump the object in the interactive interpreter. We get some information, but this is not what we were looking for. (We wanted to see the numeric value, right?) And calling print does not apparently help, either.

Unfortunately, neither print (using str()) nor the actual object's string representation (using repr()) reveals much about our object. One good idea would be to implement either __str__() or __repr__(), or both so that we can "see" what our object looks like. In other words, when you want to display your object, you actually want to see something meaningful rather than the generic Python object string (<object object at id>). Let us add a __str__() method, overriding the default behavior:

    def __str__(self):
        return str(self.value)

Now we get the following:

    >>> rfm = RoundFloatManual(5.590464)
    >>> rfm
    <roundFloat2.RoundFloatManual object at 0x5eff0>
    >>> print rfm
    5.59
    >>> rfm = RoundFloatManual(5.5964)
    >>> print rfm
    5.6

We still have a few problems ... one is that just dumping the object in the interpreter still shows the default object notation, but that is not so bad. If we wanted to fix it, we would just override __repr__(). Since our string representation is also a Python object, we can make the output of __repr__() the same as __str__().

To accomplish this, we could just copy the code from __str__() to __repr__(). This is a simple example, so it cannot really hurt us, but as a programmer, you know that is not the best thing to do. If a bug existed in __str__(), we would copy that bug to __repr__().

The best solution is to recall that the code represented by __str__() is an object too, and like all objects, references can be made to them, so let us just make __repr__() an alias to __str__():

    __repr__ = __str__

In the second example with 5.5964, we see that it rounds the value correctly to 5.6, but we still wanted two decimal places to be displayed. One more tweak, and we should be done. Here is the fix:

    def __str__(self):
        return '%.2f' % self.value

And here is the resulting output with both str() and repr() output:

    >>> rfm = RoundFloatManual(5.5964)
    >>> rfm
    5.60
    >>> print rfm
    5.60

In our original RoundFloat example at the beginning of this chapter, we did not have to worry about all the fine-grained object display stuff; the reason is that __str__() and __repr__() have already been defined for us as part of the float class. All we did was inherit them. Our more "manual" version required additional work from us. Do you see how useful derivation is? You do not even need to know how far up the inheritance tree the interpreter needs to go to find a declared method that you are using without guilt. We present the full code of this class in Example 13.2.

Example 13.2. Basic Customization (`roundFloat2.py`)

    #!/usr/bin/env python

    class RoundFloatManual(object):
        def __init__(self, val):
            assert isinstance(val, float), \
                "Value must be a float!"
            self.value = round(val, 2)

        def __str__(self):
            return '%.2f' % self.value

        __repr__ = __str__

Now let us try a slightly more complex example.

13.13.2. Numeric Customization (`Time60`)

For our first realistic example, let us say we wanted to create a simple application that manipulated time as measured in hours and minutes. The class we are going to create can be used to track the time worked by an employee, the amount of time spent online by an ISP (Internet service provider) subscriber, the amount of total uptime for a database (not inclusive of downtime for backups and upgrades), the total amount of time played in a poker tournament, etc.

For our Time60 class, we will take integers as hours and minutes as input to our constructor.

    class Time60(object):            # ordered pair
        def __init__(self, hr, min): # constructor
            self.hr = hr             # assign hours
            self.min = min           # assign minutes

Display

Also, as seen in the previous example, we want meaningful output if we display our instances, so we need to override __str__() (and __repr__() if so desired). As humans, we are used to seeing hours and minutes in colon-delimited format, e.g. "4:30," representing four and a half hours (four hours and thirty minutes):

    def __str__(self):
        return '%d:%d' % (self.hr, self.min)

Using this class, we can instantiate some objects. In the example below, we are starting a timesheet to track the number of billable hours for a contractor:

    >>> mon = Time60(10, 30)
    >>> tue = Time60(11, 15)
    >>>
    >>> print mon, tue
    10:30 11:15

The output is very nice, exactly what we wanted to see. What is the next step? Let us say we want our objects to interact. In particular, for our timesheet application, it is a necessity to be able to add Time60 instances together and have our objects do all meaningful operations. We would love to see something like this:

    >>> mon + tue
    21:45

Addition

With Python, overloading operators is simple. For the plus sign ( + ), we need to overload the __add__() special method, and perhaps __radd__() and __iadd__(), if applicable. More on those in a little while. Implementing __add__() does not sound too difficult: we just add the hours together, followed by the minutes. Most of the complexity lies in what we do with the new totals. If we want to see "21:45," we have to realize that that is another Time60 object. We are not modifying mon or tue, so our method would have to create another object and fill it in with the sums we calculated.

We implement the __add__() special method in such a way that we calculate the individual sums first, then call the class constructor to return a new object:

    def __add__(self, other):
        return self.__class__(self.hr + other.hr,
            self.min + other.min)

The new object is created by invoking the class as in any normal situation. The only difference is that from within the class, you typically would not invoke the class name directly. Rather, you take the __class__ attribute of self, which is the class from which self was instantiated, and invoke that. Because self.__class__ is the same as Time60, calling self.__class__() is the same as calling Time60().

This is the more object-oriented approach anyway. The other reason is that if we used the real class name everywhere we create a new object and later on decided to change the class name to something else, we would have to perform very careful global search-and-replace. By using self.__class__, we do not have to do anything other than change the name in the class directive.

With our plus sign overloading, we can now "add" Time60 objects:

    >>> mon = Time60(10, 30)
    >>> tue = Time60(11, 15)
    >>> mon + tue
    <time60.Time60 object at 0x62190>
    >>> print mon + tue
    21:45

Oops, we forgot to add an __repr__ alias to __str__, which is easily fixable.

One question you may have is, "What happens when I try to use an operator in an overload situation where I do not have the appropriate special methods defined?" The answer is a TypeError exception:

    >>> mon - tue
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    TypeError: unsupported operand type(s) for -: 'Time60' and 'Time60'

In-Place Addition

With augmented assignment (introduced back in Python 2.0), we may also wish to override the "in-place" operators, for example, __iadd__(). This is for supporting an operation like mon += tue and having the correct result placed in mon. The only trick with overriding an __i*__() method is that it has to return self. Let us add the following bits of code to our example, fixing our repr() issue above as well as supporting augmented assignment:

    __repr__ = __str__

    def __iadd__(self, other):
        self.hr += other.hr
        self.min += other.min
        return self

Here is our resulting output:

    >>> mon = Time60(10, 30)
    >>> tue = Time60(11, 15)
    >>> mon
    10:30
    >>> id(mon)
    401872
    >>> mon += tue
    >>> id(mon)
    401872
    >>> mon
    21:45

Note the use of the id() built-in function to confirm that before and after the in-place addition we are indeed modifying the same object and not creating a new one. This is a great start at a class that has a lot of potential. The complete class definition for Time60 is given in Example 13.3.

Example 13.3. Intermediate Customization (`time60.py`)

    #!/usr/bin/env python

    class Time60(object):
        'Time60 - track hours and minutes'

        def __init__(self, hr, min):
            'Time60 constructor - takes hours and minutes'
            self.hr = hr
            self.min = min

        def __str__(self):
            'Time60 - string representation'
            return '%d:%d' % (self.hr, self.min)

        __repr__ = __str__

        def __add__(self, other):
            'Time60 - overloading the addition operator'
            return self.__class__(self.hr + other.hr,
                self.min + other.min)

        def __iadd__(self, other):
            'Time60 - overloading in-place addition'
            self.hr += other.hr
            self.min += other.min
            return self

Example 13.4. Random Sequence Iterator (`randSeq.py`)

    #!/usr/bin/env python

    from random import choice

    class RandSeq(object):
        def __init__(self, seq):
            self.data = seq

        def __iter__(self):
            return self

        def next(self):
            return choice(self.data)

Further Refinements

We will leave it here, but there are plenty of optimizations and significant improvements that can be made to this class. For example, wouldn't it be nice if we could just feed a 2-tuple (10, 30) into our constructor rather than having to pass in two separate arguments? What about a string like "10:30"?

The answer is yes, you can, and it is easy to do in Python but not by overloading the constructor as the case may be with other object-oriented programming languages. Python does not allow overloading callables with multiple signatures, so the only way to make it happen is with a single constructor and performing self-introspection with the isinstance() and (perhaps) type() built-in functions.
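One possible shape for such a constructor is sketched below. It is only an illustration, not the solution to the exercise mentioned later: the default values hr=0 and min=0 are an assumption added here so that a single tuple or string argument can be passed, and no validation of the input is attempted.

```python
class Time60(object):
    """Sketch of a single constructor that accepts several input forms."""

    def __init__(self, hr=0, min=0):     # defaults are an assumption of this sketch
        if isinstance(hr, tuple):        # Time60((10, 30))
            hr, min = hr
        elif isinstance(hr, str):        # Time60('10:30')
            hr, min = [int(part) for part in hr.split(':')]
        self.hr = hr                     # otherwise: Time60(10, 30)
        self.min = min

    def __str__(self):
        return '%d:%d' % (self.hr, self.min)


as_ints = str(Time60(10, 30))
as_tuple = str(Time60((10, 30)))
as_string = str(Time60('10:30'))
```

All three calls yield an object that displays as 10:30. The isinstance() checks play the role that overloaded signatures would in other languages.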

Supporting multiple forms of input makes our application more robust and flexible. The same is true for the ability to perform other operations like subtraction. Of course these are optional and serve as icing on the cake, but what we should be worried about first are two moderate flaws: undesired formatting when there are fewer than ten minutes and the lack of support of sexagesimal^[1] (base 60) operations:

^[1] Latin-originated name for base 60; sometimes hexagesimal is used, a hybrid combining the Greek root "hexe" with the Latin "gesimal."

    >>> wed = Time60(12, 5)
    >>> wed
    12:5
    >>> thu = Time60(10, 30)
    >>> fri = Time60(8, 45)
    >>> thu + fri
    18:75

Displaying wed should have resulted in "12:05," and summing thu and fri should have given an output of "19:15." The fixes for these flaws and the improvements suggested just above are great practice building your class customization skills. You can get a more complete description of these upgrades in Exercise 13-20 at the end of the chapter.
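A minimal sketch of one way to address both flaws follows; it is not the full set of upgrades from the exercise. It assumes that carrying excess minutes into the hours inside the constructor is acceptable, and it uses the '%02d' format code to zero-pad the minutes:

```python
class Time60(object):
    """Sketch: zero-padded display and base 60 carry."""

    def __init__(self, hr, min):
        # Carry every full 60 minutes over into the hours.
        extra_hr, self.min = divmod(min, 60)
        self.hr = hr + extra_hr

    def __str__(self):
        return '%d:%02d' % (self.hr, self.min)   # always two digits for minutes

    __repr__ = __str__

    def __add__(self, other):
        # The constructor normalizes the summed minutes for us.
        return self.__class__(self.hr + other.hr,
                              self.min + other.min)


wed = Time60(12, 5)
total = Time60(10, 30) + Time60(8, 45)
```

Here str(wed) gives '12:05' and str(total) gives '19:15'. An in-place __iadd__() would need the same carry step, since it updates the attributes directly rather than going through the constructor.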

Hopefully, you now have a better understanding of operator overloading, why you would want to do it, and how you can implement special methods to accomplish that task. Let's look at more complex customizations, continuing with the optional section that follows.

13.13.3. Iterators (`RandSeq` and `AnyIter`)

RandSeq

We were introduced to iterators formally in Chapter 8 but we have been using them throughout this text. They are simply a mechanism to go through items of a sequence (or sequence-like object) one at a time. In Chapter 8 we described how implementing the __iter__() and next() methods of a class can be used to create an iterator. We will demonstrate that with two examples here.

The first example is a RandSeq (short for RANDom SEQuence). We feed an initial sequence to our class, then let the user iterate (infinitely) through it via next().

The __init__() method does the aforementioned assignment. The __iter__() just returns self, which is how you declare an object is an iterator, and finally, next() is called to get successive values of iteration. The only catch with this iterator is that it never ends.

This example demonstrates some unusual things we can do with custom class iterations. One is infinite iteration. Because we read the sequence nondestructively, we never run out of elements. Each time the user calls next(), it gets the next value, but our object never raises StopIteration. If we run it, we will get output similar to the following:

    >>> from randseq import RandSeq
    >>> for eachItem in RandSeq(
    ...         ('rock', 'paper', 'scissors')):
    ...     print eachItem
    ...
    scissors
    scissors
    rock
    paper
    paper
    scissors
      :
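Because the loop above never terminates on its own, it can help to cap the number of values taken. The sketch below does so with itertools.islice() from the standard library. The __next__ alias is an addition not present in the original example; it is included only so that the sketch also runs on Python 3, where the iterator method is named __next__().

```python
from itertools import islice
from random import choice


class RandSeq(object):
    def __init__(self, seq):
        self.data = seq

    def __iter__(self):
        return self

    def next(self):              # Python 2 iterator protocol
        return choice(self.data)

    __next__ = next              # assumption: alias for the Python 3 protocol name


# Take exactly five random picks from the otherwise endless iterator.
picks = list(islice(RandSeq(('rock', 'paper', 'scissors')), 5))
```

The contents of picks vary from run to run, but its length is always five.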

Example 13.5. Any Number of Items Iterator (`anyIter.py`)

    1  #!/usr/bin/env python
    2
    3  class AnyIter(object):
    4      def __init__(self, data, safe=False):
    5          self.safe = safe
    6          self.iter = iter(data)
    7
    8      def __iter__(self):
    9          return self
    10
    11     def next(self, howmany=1):
    12         retval = []
    13         for eachItem in range(howmany):
    14             try:
    15                 retval.append(self.iter.next())
    16             except StopIteration:
    17                 if self.safe:
    18                     break
    19                 else:
    20                     raise
    21         return retval

`AnyIter`

In the second example, we do create an iterator object, but rather than iterating through one item at a time, we give the next() method an argument telling how many items to return. The code for AnyIter (ANY number of items ITERator) is given in Example 13.5.

Like RandSeq, the AnyIter class should be fairly simple to figure out. We described the basic operation above... it works just like any other iterator except that users can request the next N items of the iterable instead of only one.

We create the object by being given an iterable and a safe flag. If the flag is True, we will return any items retrieved before exhausting the iterable, but if the flag is False, we will reraise the exception if the user asked for too many items. The core of any complexity lies in next(), specifically how it quits (lines 14-21).

In the last part of next(), we create a list of items to return and call the object's next() for each item. If we exhaust the list and get a StopIteration exception, we check the safe flag. If unsafe, we throw the exception back to the caller (raise); otherwise, we return whatever items we have saved up (break and return).

    >>> a = AnyIter(range(10))
    >>> i = iter(a)
    >>> for j in range(1,5):
    ...     print j, ':', i.next(j)
    ...
    1 : [0]
    2 : [1, 2]
    3 : [3, 4, 5]
    4 : [6, 7, 8, 9]

The execution above ran fine because the iteration fit the number of items perfectly. What happens when things go awry? Let us try "unsafe" mode first, which is how we created our iterator to begin with from above:

    >>> i = iter(a)
    >>> i.next(14)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "anyIter.py", line 15, in next
        retval.append(self.iter.next())
    StopIteration

The StopIteration exception was raised because we exceeded our supply of items, and that exception was reraised back to the caller (line 20). If we were to recreate the iterator in "safe" mode and run it with the same example, we get back whatever the iterator could get us before running out of items:

    >>> a = AnyIter(range(10), True)
    >>> i = iter(a)
    >>> i.next(14)
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

13.13.4. *Multi-type Customization (`NumStr`)

Let us create another new class, NumStr, consisting of a number-string ordered pair, called n and s, respectively, using integers as our number type. Although the "proper" notation of an ordered pair is (n, s), we choose to represent our pair as [n :: s] just to be different. Regardless of the notation, these two data elements are inseparable as far as our model is concerned. We want to set up our new class, called NumStr, with the following characteristics:

Initialization

The class should be initialized with both the number and string; if either (or both) is missing, then 0 and the empty string should be used, i.e., n=0 and s='', as defaults.

Addition

We define the addition operator functionality as adding the numbers together and concatenating the strings; the tricky part is that the strings must be concatenated in the correct order. For example, let NumStr1 = [n1 :: s1] and NumStr2 = [n2 :: s2]. Then NumStr1 + NumStr2 is performed as [n1 + n2 :: s1 + s2] where + represents addition for numbers and concatenation for strings.

Multiplication

Similarly, we define the multiplication operator functionality as multiplying the number by an integer n and repeating the string n times, i.e., NumStr1 * n = [n1 * n :: s1 * n].

False Value

This entity has a false value when the number has a numeric value of zero and the string is empty, i.e., when NumStr = [0 :: ''].

Comparisons

Comparing a pair of NumStr objects, i.e., [n1 :: s1] vs. [n2 :: s2], we find nine different combinations (i.e., n1 > n2 and s1 < s2, n1 == n2 and s1 > s2, etc.). We use the normal numeric and lexicographic compares for numbers and strings, respectively, i.e., the ordinary comparison of cmp(obj1, obj2) will return an integer less than zero if obj1 < obj2, greater than zero if obj1 > obj2, or equal to zero if the objects have the same value.

The solution for our class is to add both of these values and return the result. The interesting thing is that cmp() does not always return -1, 0, or 1 for us. It is, as described above, an integer less than, equal to, or greater than zero.

In order to correctly compare our objects, we need __cmp__() to return a positive value if (n1 > n2) and (s1 > s2), a negative value if (n1 < n2) and (s1 < s2), and 0 if both sets of numbers and strings are the same, or if the comparisons offset each other, i.e., (n1 < n2) and (s1 > s2), or vice versa.
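The "normalize, then add" idea can be tried out in isolation with a small sketch. The helper names cmp_sign() and numstr_cmp() are hypothetical; cmp_sign() stands in for the built-in cmp() followed by normalization, so that the sketch also runs on interpreters where cmp() is not available.

```python
def cmp_sign(x, y):
    """Stand-in for a normalized cmp(): -1, 0, or 1."""
    return (x > y) - (x < y)


def numstr_cmp(n1, s1, n2, s2):
    """Add the normalized number and string comparisons."""
    return cmp_sign(n1, n2) + cmp_sign(s1, s2)


both_greater = numstr_cmp(3, 'goo', 2, 'foo')   # number and string both larger
offsetting = numstr_cmp(3, 'foo', 2, 'goo')     # larger number, smaller string
string_only = numstr_cmp(3, 'foo', 3, 'goo')    # equal numbers, smaller string
```

The raw sum lies between -2 and 2; only its sign matters to a caller, so both_greater is positive, offsetting is zero, and string_only is negative.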

Example 13.6. Multi-Type Class Customization (`numstr.py`)

    1  #!/usr/bin/env python
    2
    3  class NumStr(object):
    4
    5      def __init__(self, num=0, string=''):
    6          self.__num = num
    7          self.__string = string
    8
    9      def __str__(self):        # define for str()
    10         return '[%d :: %r]' % \
    11             (self.__num, self.__string)
    12     __repr__ = __str__
    13
    14     def __add__(self, other):     # define for s+o
    15         if isinstance(other, NumStr):
    16             return self.__class__(self.__num + \
    17                 other.__num, \
    18                 self.__string + other.__string)
    19         else:
    20             raise TypeError, \
    21                 'Illegal argument type for built-in operation'
    22
    23     def __mul__(self, num):       # define for o*n
    24         if isinstance(num, int):
    25             return self.__class__(self.__num * num,
    26                 self.__string * num)
    27         else:
    28             raise TypeError, \
    29                 'Illegal argument type for built-in operation'
    30
    31     def __nonzero__(self):        # False if both are false
    32         return self.__num or len(self.__string)
    33
    34     def __norm_cval(self, cmpres):# normalize cmp()
    35         return cmp(cmpres, 0)
    36
    37     def __cmp__(self, other):     # define for cmp()
    38         return self.__norm_cval(
    39                 cmp(self.__num, other.__num)) + \
    40             self.__norm_cval(
    41                 cmp(self.__string, other.__string))

Given the above criteria, we present the code for numstr.py in Example 13.6, followed here by some sample execution:

    >>> a = NumStr(3, 'foo')
    >>> b = NumStr(3, 'goo')
    >>> c = NumStr(2, 'foo')
    >>> d = NumStr()
    >>> e = NumStr(string='boo')
    >>> f = NumStr(1)
    >>> a
    [3 :: 'foo']
    >>> b
    [3 :: 'goo']
    >>> c
    [2 :: 'foo']
    >>> d
    [0 :: '']
    >>> e
    [0 :: 'boo']
    >>> f
    [1 :: '']
    >>> a < b
    True
    >>> b < c
    False
    >>> a == a
    True
    >>> b * 2
    [6 :: 'googoo']
    >>> a * 3
    [9 :: 'foofoofoo']
    >>> b + e
    [3 :: 'gooboo']
    >>> e + b
    [3 :: 'boogoo']
    >>> if d: 'not false'     # also bool(d)
    ...
    >>> if e: 'not false'     # also bool(e)
    ...
    'not false'
    >>> cmp(a,b)
    -1
    >>> cmp(a,c)
    1
    >>> cmp(a,a)
    0

Line-by-Line Explanation

Lines 1-7

The top of our script features the constructor __init__() setting up our instance initializing itself with the values passed via the class instantiator call NumStr(). If either value is missing, the attribute takes on the default false value of either zero or the empty string, depending on the argument.

One significant oddity is the use of double underscores to name our attributes. As we will find out in the next section, this is used to enforce a level, albeit elementary, of privacy. Programmers importing our module will not have straightforward access to our data elements. We are attempting to enforce one of the encapsulation properties of OO design by permitting access only through accessor functionality. If this syntax appears odd or uncomfortable to you, you can remove all double underscores from the instance attributes, and the examples will still work in the exact same manner.

All attributes that begin with a double underscore ( __ ) are "mangled" so that these names are not as easily accessible during runtime. They are not, however, mangled in such a way that they cannot be easily reverse-engineered. In fact, the mangling pattern is fairly well known and easy to spot. The main point is to prevent the name from being accidentally used when it is imported by an external module where conflicts may arise. The name is changed to a new identifier name containing the class name to ensure that it does not get "stepped on" unintentionally. For more information, check out Section 13.14 on privacy.
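To see the mangling for ourselves, here is a minimal sketch. The class Pair and its accessor get_num() are hypothetical stand-ins invented for illustration; they are not part of numStr.py. The sketch only inspects where the attribute ends up.

```python
class Pair(object):
    """Hypothetical class used only to illustrate name mangling."""

    def __init__(self, num=0):
        self.__num = num          # actually stored as _Pair__num

    def get_num(self):            # inside the class, self.__num just works
        return self.__num


p = Pair(3)

# The instance dictionary holds only the mangled name.
mangled_names = sorted(vars(p))

# From outside the class, the unmangled name is not found...
direct_access_fails = not hasattr(p, '__num')

# ...but the pattern _ClassName__attr is easy to spell out by hand.
via_mangled_name = p._Pair__num
```

Here mangled_names contains the single entry '_Pair__num', which is why the text calls this privacy "elementary": the data is hidden from casual or accidental use, not from a determined programmer.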

Lines 9-12

We choose the string representation of our ordered pair to be "[num :: 'str']", so it is up to __str__() to provide that representation whenever str() is applied to our instance and when the instance appears in a print statement. Because we want to emphasize that the second element is a string, it is more visually convincing if users see the string surrounded by quotation marks. To that end, we use the "repr()" representation format conversion code "%r" instead of "%s". It is equivalent to calling repr() or using the single back quotation marks to give the evaluatable version of a string, which does have quotation marks:

    >>> print a
    [3 :: 'foo']

Not calling repr() on self.__string (leaving the backquotes off or using "%s") would result in the string quotations being absent:

 return '[%d :: %s]' % (self.__num, self.__string)

Now calling print again on an instance results in:

    >>> print a
    [3 :: foo]

How does that look without the quotations? Not as convincing that "foo" is a string, is it? It looks more like a variable. The author is not as convinced either. (We quickly and quietly back out of that change and pretend we never even touched it.)

The first line of code after the __str__() function is the assignment of that function to another special method name, __repr__. We made a decision that an evaluatable string representation of our instance should be the same as the printable string representation. Rather than defining an entirely new function that is a duplicate of __str__(), we just create an alias, copying the reference.

When you implement __str__(), it is the code that is called by the interpreter if you ever apply the str() built-in function using that object as an argument. The same goes for __repr__() and repr().

How would our execution differ if we chose not to implement __repr__()? If the assignment is removed, only the print statement that calls str() will show us the contents of our object. The evaluatable string representation defaults to the Python standard of <...some_object_information...>.

    >>> print a          # calls str(a)
    [3 :: 'foo']
    >>> a                # calls repr(a)
    <NumStr.NumStr instance at 122640>
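The difference between the two choices can be sketched side by side. The classes StrOnly and StrAndRepr below are hypothetical, use public attributes for brevity, and are not part of numStr.py. The exact text of the default representation varies by interpreter, so the sketch relies only on the fact that it begins with an angle bracket.

```python
class StrOnly(object):
    """Hypothetical pair that defines __str__() but not __repr__()."""

    def __init__(self, num=0, string=''):
        self.num = num
        self.string = string

    def __str__(self):
        return '[%d :: %r]' % (self.num, self.string)


class StrAndRepr(object):
    """Hypothetical pair that aliases __repr__ to __str__."""

    def __init__(self, num=0, string=''):
        self.num = num
        self.string = string

    def __str__(self):
        return '[%d :: %r]' % (self.num, self.string)
    __repr__ = __str__            # copy the reference, not the code


a = StrOnly(3, 'foo')
b = StrAndRepr(3, 'foo')

str_of_a = str(a)                 # our format
repr_of_a = repr(a)               # interpreter default: <... object at ...>
repr_of_b = repr(b)               # our format again, thanks to the alias
```

With the alias in place, str(b) and repr(b) return the same string, so the instance displays identically in a print statement and at the interactive prompt.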

Lines 14-21

One feature we would like to add to our class is the addition operation, which we described earlier. One of Python's features for customizing classes is that we can overload operators to make these types of customizations more "realistic." Invoking a function such as "add(obj1, obj2)" to "add" objects obj1 and obj2 may seem like addition, but is it not more compelling to be able to invoke that same operation using the plus sign ( + ) like this?

    obj1 + obj2

Overloading the plus sign requires the implementation of __add__() for self (SELF) and the other operand (OTHER). The __add__() function takes care of the Self + Other case, but we do not need to define __radd__() to handle Other + Self because that is taken care of by the __add__() for Other. Operand order makes no difference to the numeric addition, but it does matter for the string concatenation.

The addition operation adds each of the two components, with the pair of results forming a new object, which is created by passing the results to self.__class__() for instantiation (as explained above). Any object other than a like type should result in a TypeError exception, which we raise in such cases.

Lines 23-29

We also overload the asterisk [by implementing __mul__()] so that both numeric multiplication and string repetition are performed, resulting in a new object, again created via instantiation. Since repetition allows only an integer to the right of the operator, we must enforce this restriction as well. We also do not define __rmul__() for the same reason.
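The addition and multiplication behavior described above can be tried in isolation with a minimal sketch. The class Pair is a hypothetical stand-in for NumStr with public attributes, and the exception is raised with the call form raise TypeError(...), which we assume here so that the sketch also runs on interpreters that reject the comma form used in the listing.

```python
class Pair(object):
    """Hypothetical stand-in for NumStr with public attributes."""

    MESSAGE = 'Illegal argument type for built-in operation'

    def __init__(self, num=0, string=''):
        self.num = num
        self.string = string

    def __add__(self, other):     # handles self + other
        if not isinstance(other, Pair):
            raise TypeError(self.MESSAGE)
        return self.__class__(self.num + other.num,
                              self.string + other.string)

    def __mul__(self, num):       # handles self * num
        if not isinstance(num, int):
            raise TypeError(self.MESSAGE)
        return self.__class__(self.num * num, self.string * num)


def raises_type_error(operation):
    """Return True if calling operation() raises TypeError."""
    try:
        operation()
    except TypeError:
        return True
    return False


b = Pair(3, 'goo')
e = Pair(0, 'boo')

left = b + e                      # Pair.__add__(b, e)
right = e + b                     # Pair.__add__(e, b): no __radd__ needed
doubled = b * 2

unlike_type_rejected = raises_type_error(lambda: b + 1)
int_on_left_rejected = raises_type_error(lambda: 2 * b)   # no __rmul__
```

Both b + e and e + b are handled by __add__() because the left operand is a Pair in each case; only the order of the concatenated strings differs. The last line shows the consequence of leaving out __rmul__(): the integer must appear to the right of the operator.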

Lines 31-32

Python objects have a Boolean value at any time. For the standard types, objects have a false value when they are either a numeric equivalent of zero or an empty sequence or mapping. For our class, we have chosen both that its numeric value must be zero and that the string be empty in order for any such instance to have a false value. We override the __nonzero__() method for this purpose. Other objects such as those that strictly emulate sequence or mapping types use a length of zero as a false value. In those cases, you would implement the __len__() method to effect that functionality.
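The two routes to a false value can be sketched as follows. Both classes are hypothetical illustrations, not part of numStr.py. One assumption to flag: Python 3 renamed the __nonzero__() hook to __bool__() and requires it to return a bool, so the sketch wraps the result in bool() and aliases the two names to run under either version.

```python
class Pair(object):
    """Hypothetical stand-in for NumStr: false only if both parts are."""

    def __init__(self, num=0, string=''):
        self.num = num
        self.string = string

    def __nonzero__(self):        # truth hook in Python 2
        return bool(self.num or len(self.string))
    __bool__ = __nonzero__        # same hook under its Python 3 name


class Bag(object):
    """Hypothetical container that relies on __len__() for its truth value."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):            # a length of zero means false
        return len(self.items)


truth_values = [
    bool(Pair()),                 # zero and empty string
    bool(Pair(string='boo')),     # non-empty string
    bool(Pair(1)),                # non-zero number
    bool(Bag()),                  # empty container
    bool(Bag([0])),               # one item, even though the item is false
]
```

Only the first and fourth entries of truth_values are False. Note that Bag([0]) is true: with __len__(), it is the number of items that counts, not their values.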

Lines 34-41

__norm_cval() (short for "normalize cmp() value") is not a special method. Rather, it is a helper function to our overriding of __cmp__(); its sole purpose is to convert all positive return values of cmp() to 1, and all negative values to -1. cmp() normally returns arbitrary positive or negative values (or zero) based on the result of the comparison, but for our purposes, we need to restrict the return values to only -1, 0, and 1. Calling cmp() with integers and comparing to zero will give us the result we need, being equivalent to the following snippet of code:

    def __norm_cval(self, cmpres):
        if cmpres < 0:
            return -1
        elif cmpres > 0:
            return 1
        else:
            return 0

The actual comparison of two like objects consists of comparing the numbers and the strings, and returning the sum of the comparisons.
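To make the sum of comparisons concrete, here is a minimal sketch that works on plain (num, string) tuples instead of NumStr instances. The helper cmp_() is a hypothetical stand-in for the cmp() built-in, which we assume may be unavailable (it was removed in Python 3); norm_cval() and compare_pairs() are likewise illustrative names.

```python
def cmp_(x, y):
    """Hypothetical stand-in for cmp(): returns -1, 0, or 1."""
    return (x > y) - (x < y)


def norm_cval(cmpres):
    """Normalize any comparison result to -1, 0, or 1."""
    return cmp_(cmpres, 0)


def compare_pairs(a, b):
    """Sum the normalized comparisons of the numbers and the strings."""
    return (norm_cval(cmp_(a[0], b[0])) +
            norm_cval(cmp_(a[1], b[1])))


a = (3, 'foo')
b = (3, 'goo')
c = (2, 'foo')

results = [
    compare_pairs(a, b),          # numbers equal, 'foo' < 'goo'
    compare_pairs(a, c),          # 3 > 2, strings equal
    compare_pairs(a, a),          # identical
    compare_pairs(b, c),          # both components larger
]

# One component larger, the other smaller: the two results cancel.
mixed = compare_pairs((3, 'foo'), (2, 'goo'))
```

The first three results match the cmp() calls in the sample session (-1, 1, and 0), and the fourth is 2 because both components compare greater. One consequence of summing is worth keeping in mind: in the mixed case the sum is 0, so two unequal pairs compare as equal under this scheme.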
