20.10. Data Structures Versus Python Built-Ins

Now that I've shown you all of these complicated algorithms, I need to also tell you that at least in some cases, they may not be an optimal approach. Built-in types such as lists and dictionaries are often a simpler and more efficient way to represent data. For instance:


Binary trees

These may be useful in many applications, but Python dictionaries already provide a highly optimized, C-coded, search table tool. Indexing a dictionary by key is likely to be faster than searching a Python-coded tree structure:

 >>> x = {}
 >>> for i in [3,1,9,2,7]: x[i] = None                 # insert
 >>> for i in range(10): print (i, x.has_key(i)),      # lookup
 (0, 0) (1, 1) (2, 1) (3, 1) (4, 0) (5, 0) (6, 0) (7, 1) (8, 0) (9, 1)

Because dictionaries are built into the language, they are always available and will usually be faster than Python-based data structure implementations.
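On a current interpreter, the same insert-and-lookup idea might be sketched as follows. This assumes Python 3, where `has_key` is no longer available and the `in` operator tests membership instead; the names `table` and `found` are illustrative only.

```python
# Assumes Python 3: dict.has_key() is gone, so membership uses `in`.
table = {}
for key in [3, 1, 9, 2, 7]:
    table[key] = None                    # insert

# lookup: pair each candidate key with whether it is present
found = [(i, i in table) for i in range(10)]
print(found)
```

The dictionary does the searching here; no tree code of our own is involved.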


Graph algorithms

These serve many purposes, but a purely Python-coded implementation of a very large graph might be less efficient than you want in some applications. Graph programs tend to require peak performance; using dictionaries rather than class instances to represent graphs may boost performance some, but using linked-in compiled extensions may as well.
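As a rough illustration of the dictionary-based representation mentioned above, the sketch below stores a small graph as a dictionary of adjacency lists and finds every node reachable from a starting point. The graph contents and the `reachable` function are hypothetical, and the code assumes Python 3; it is one possible arrangement, not a tuned implementation.

```python
from collections import deque

# Hypothetical graph for illustration: each key is a node, each value
# is the list of nodes one outgoing arc away.
graph = {
    'A': ['B', 'E'],
    'B': ['C'],
    'C': ['D', 'E'],
    'D': [],
    'E': ['D'],
}

def reachable(graph, start):
    """Return the set of nodes reachable from start (breadth-first)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:     # skip nodes already visited
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

print(sorted(reachable(graph, 'C')))
```

Every step leans on built-in dictionary and set operations, which is where any speedup over class-instance nodes would come from.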


Sorting algorithms

These are an important part of many programs too, but Python's built-in list sort method is so fast that you would be hard-pressed to beat it in Python in most scenarios. In fact, it's generally better to convert sequences to lists first just so that you can use the built-in:[*]

[*] Recent news: in Python 2.4, the sort list method also accepts a Boolean reverse flag to reverse the result (there is no need to manually reverse after the sort), and there is a new sorted built-in function, which returns its result list and works on any iterable, not just on lists (there is no need to convert to a list to sort). Python makes lives easier over time. The underlying sort routine in Python is very good, by the way. In fact, its documentation claims that it has "supernatural performance", which is not bad for a sorter.

 temp = list(sequence)
 temp.sort()
 ...use items in temp...

For custom sorts, simply pass in a comparison function of your own:

 >>> L = [{'n':3}, {'n':20}, {'n':0}, {'n':9}]
 >>> L.sort( lambda x, y: cmp(x['n'], y['n']) )
 >>> L
 [{'n': 0}, {'n': 3}, {'n': 9}, {'n': 20}]
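For comparison, here is how the same custom sort might look on a newer interpreter. This is a sketch that assumes Python 3, where `cmp` and comparison-function arguments to `sort` were removed in favor of a `key` function; it also uses the `reverse` flag and the `sorted` built-in described in the footnote. The names `records` and `by_n` are illustrative.

```python
# Assumes Python 3: sort() takes a key function, not a comparison function.
records = [{'n': 3}, {'n': 20}, {'n': 0}, {'n': 9}]

def by_n(record):
    """Extract the value each record is ordered by."""
    return record['n']

ascending = sorted(records, key=by_n)                  # new list, any iterable
descending = sorted(records, key=by_n, reverse=True)   # no manual reverse step

records.sort(key=by_n)                                 # in place, lists only
print(ascending)
print(descending)
```

A key function is called once per item rather than once per comparison, which tends to suit the built-in sort well.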


Reversal algorithms

These are generally superfluous by the same token: because Python lists provide a fast reverse method, you may be better off converting a nonlist to a list first, just so that you can run the built-in list method.
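The convert-then-reverse idiom can be sketched in a few lines. The tuple used as input is hypothetical, and the code assumes Python 3; the `reversed` built-in shown at the end is an alternative to the list method, not something the text above relies on.

```python
# Hypothetical nonlist input: a tuple has no reverse() method.
sequence = ('a', 'b', 'c', 'd')

temp = list(sequence)      # convert to a list first
temp.reverse()             # in-place reversal; the method returns None
print(temp)

# Alternative: reversed() accepts any sequence and yields items backward.
also_reversed = list(reversed(sequence))
```

Either way, the original sequence is left unchanged and no hand-coded reversal routine is needed.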

Don't misunderstand: sometimes you really do need objects that add functionality to built-in types or do something more custom. The set classes we met, for instance, add tools not directly supported by Python today, and the tuple-tree stack implementation was actually faster than one based on built-in lists for common usage patterns. Permutations are something you need to add on your own too.

Moreover, class encapsulations make it possible to change and extend object internals without impacting the rest of your system. They also support reuse much better than built-in types: types are not classes today, and they cannot be specialized directly without wrapper class logic.
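To make the encapsulation point concrete, here is a minimal sketch of a wrapper class around a built-in list. The `Stack` class and its method names are hypothetical, chosen only for illustration, and the code assumes Python 3.

```python
class Stack:
    """Hypothetical stack that hides its storage behind methods."""

    def __init__(self, items=()):
        self._items = list(items)     # internal detail, free to change later

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()      # raises IndexError when empty

    def top(self):
        return self._items[-1]

    def empty(self):
        return not self._items

    def __len__(self):
        return len(self._items)


stack = Stack('ab')
stack.push('c')
print(stack.pop(), len(stack))
```

Because callers use only `push`, `pop`, `top`, and `empty`, the list inside could later be swapped for another representation without touching the code that uses the stack.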

Yet because Python comes with a set of built-in, flexible, and optimized datatypes, data structure implementations are often not as important in Python as they are in lesser-equipped languages such as C and C++. Before you code that new datatype, be sure to ask yourself whether a built-in type or call might be more in line with the Python way of thinking.




Programming Python
ISBN: 0596009259
Year: 2004
Pages: 270
Authors: Mark Lutz