Section 20.4. Subclassing Built-In Types


20.4. Subclassing Built-In Types

There is one more twist in the stack and set story, before we move on to some more classical data structures. In recent Python releases, it is also possible to subclass built-in datatypes such as lists and dictionaries, in order to extend them. That is, because datatypes now look and feel just like customizable classes, we can code unique datatypes that are extensions of built-ins, with subclasses that inherit built-in tool sets. For instance, here are our stack and set objects coded in the prior sections, revised as customized lists (the set union method has been simplified slightly here to remove a redundant loop):

 class Stack(list):     "a list with extra methods"     def top(self):         return self[-1]     def push(self, item):         list.append(self, item)     def pop(self):         if not self:             return None                 # avoid exception         else:             return list.pop(self) class Set(list):     " a list with extra methods and operators"     def _ _init_ _(self, value=[]):      # on object creation         list._ _init_ _(self)         self.concat(value)     def intersect(self, other):         # other is any sequence type         res = []                        # self is the instance subject         for x in self:             if x in other:                 res.append(x)         return Set(res)                 # return a new Set     def union(self, other):         res = Set(self)                 # new set with a copy of my list         res.concat(other)               # insert uniques from other         return res     def concat(self, value):            # value: a list, string, Set...         for x in value:                 # filters out duplicates            if not x in self:                 self.append(x)     # len, getitem, iter inherited, use list repr     def _ _and_ _(self, other):   return self.intersect(other)     def _ _or_ _(self, other):    return self.union(other)     def _ _str_ _(self):          return '<Set:' + repr(self) + '>' class FastSet(dict):     pass    # this doesn't simplify much 

The stack and set implemented in this code are essentially like those we saw earlier, but instead of embedding and managing a list, these objects really are customized lists. They add a few additional methods, but they inherit all of the list object's functionality.

This can reduce the amount of wrapper code required, but it can also expose functionality that might not be appropriate in some cases. In the following self-test code, for example, we're able to sort and insert into stacks and reverse a set, because we've inherited these methods from the list object. In most cases, such operations don't make sense for the data structures in question, and the wrapper class approach of the prior sections may still be preferred:

 def selfTest( ):     # normal use cases     stk = Stack( )     print stk     for c in 'spam': stk.push(c)     print stk, stk.top( )     while stk: print stk, stk.pop( )     print stk, stk.pop( )     print     set = Set('spam')     print set, 'p' in set     print set & Set('slim')     print set | 'slim'     print Set('slim') | Set('spam')     # downside? these work too     print     stk = Stack('spam')     print stk     stk.insert(1, 'X')     # should only access top     print stk     stk.sort( )             # stack not usually sorted     print stk     set = Set('spam')     set.reverse( )          # order should not matter     print set, set[1] if _ _name_ _ == '_ _main_ _':     selfTest( ) 

When run, this module produces the following results on standard output; we're able to treat the stack and set objects like lists, whether we should or not:

 [] ['s', 'p', 'a', 'm'] m ['s', 'p', 'a', 'm'] m ['s', 'p', 'a'] a ['s', 'p'] p ['s'] s [] None <Set:['s', 'p', 'a', 'm']> True <Set:['s', 'm']> <Set:['s', 'p', 'a', 'm', 'l', 'i']> <Set:['s', 'l', 'i', 'm', 'p', 'a']> ['s', 'p', 'a', 'm'] ['s', 'X', 'p', 'a', 'm'] ['X', 'a', 'm', 'p', 's'] <Set:['m', 'a', 'p', 's']> a 

Subclassing built-in types has other applications, which may be more useful than those demonstrated by the preceding code. Consider a queue, or ordered dictionary, for example. The queue could take the form of a list subclass with get and put methods to insert on one end and delete from the other; the dictionary could be coded as a dictionary subclass with an extra list of keys that is sorted on insertion or request. Similarly, the built-in Python bool Boolean datatype is implemented as a subclass that customizes the integer int with a specialized display format (true is like integer 1, but it prints itself as the word True).

You can also use type subclassing to alter the way built-in types behave; a list subclass could map indexes 1..N to built-in indexes 0..N-1, for instance:

 # map 1..N to 0..N-1, by calling back to built-in version class MyList(list):     def _ _getitem_ _(self, offset):         print '(indexing %s at %s)' % (self, offset)         return list._ _getitem_ _(self, offset - 1) if _ _name_ _ == '_ _main_ _':     print list('abc')     x = MyList('abc')               # _ _init_ _ inherited from list     print x                         # _ _repr_ _ inherited from list     print x[1]                      # MyList._ _getitem_ _     print x[3]                      # customizes list superclass method     x.append('spam'); print x       # attributes from list superclass     x.reverse( );      print x % python typesubclass.py ['a', 'b', 'c'] ['a', 'b', 'c'] (indexing ['a', 'b', 'c'] at 1) a (indexing ['a', 'b', 'c'] at 3) c ['a', 'b', 'c', 'spam'] ['spam', 'c', 'b', 'a'] 

This works, but it is probably not a good idea in general. It would likely confuse its usersthey will expect Python's standard 0..N-1 indexing, unless they are familiar with the custom class. As a rule of thumb, type subclasses should probably adhere to the interface of the built-in types they customize.

The New Built-In Set Datatype

Python has a way of conspiring to make my book examples obsolete over time. Beginning in version 2.4, the language sprouted a new built-in set datatype, which implements much of the set functionality that we coded in the set examples of this chapter (and more). It is implemented with some of the same algorithms we used, but it is available on all Pythons today.

Built-in set usage is straightforward: set objects are created by calling the new built-in set function, passing in an iterable/sequence for the initial components of the set (sets are also available in 2.3, but the set creation call must be imported from a module):

 >>> x = set('abcde') >>> y = set('bdxyz') >>> x set(['a', 'c', 'b', 'e', 'd']) >>> 'e' in x                            # membership True >>> x - y                               # difference set(['a', 'c', 'e']) >>> x | y                               # union set(['a', 'c', 'b', 'e', 'd', 'y', 'x', 'z']) >>> x & y                               # intersection set(['b', 'd']) 

Interestingly, just like the dictionary-based optimized set we coded, built-in sets are unordered and require that all set components be hashable (immutable). In fact, their current implementation is based on wrapped dictionaries. Making a set with a dictionary of items works, but only because set uses the dictionary iterator, which returns the next key on each iteration (it ignores key values):

 >>> x = set(['spam', 'ham', 'eggs'])   # list of immutables >>> x set(['eggs', 'ham', 'spam']) >>> x = set([['spam', 'ham'], ['eggs']]) Traceback (most recent call last):   File "<pyshell#30>", line 1, in -toplevel-     x = set([['spam', 'ham'], ['eggs']]) TypeError: list objects are unhashable >>> x = set({'spam':[1, 1], 'ham': [2, 2], 'eggs':[3, 3]}) >>> x set(['eggs', 'ham', 'spam']) 

Built-in sets also support operations such as superset testing, and they come in two flavors: mutable and frozen (and thus hashable, for sets of sets). For more details, see the set type in the built-in types section of the Python library manual.

The set examples in this chapter are still useful as demonstrations of general data structure coding techniques, but they are not strictly required for set functionality in Python today. In fact, this is how Python tends to evolve over time: operations that are coded manually often enough wind up becoming built-in tools. I can't predict Python evolution, of course, but with any luck at all, the 10th edition of this book might be just a pamphlet.





Programming Python
Programming Python
ISBN: 0596009259
EAN: 2147483647
Year: 2004
Pages: 270
Authors: Mark Lutz

Similar book on Amazon

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net