1.3 Other Modules in the Standard Library

If your application performs other types of tasks besides text processing, a skim of this module list can suggest where to look for relevant functionality. As well, readers who find themselves maintaining code written by other developers may find that unfamiliar modules are imported by the existing code. If an imported module is not summarized in the list below, nor documented elsewhere, it is probably an in-house or third-party module. For standard library modules, the summaries here will at least give you a sense of the general purpose of a given module.

__builtin__

Access to built-in functions, exceptions, and other objects. Python does a great job of exposing its own internals, but "normal" developers do not need to worry about this.

1.3.1 Serializing and Storing Python Objects

In object-oriented programming (OOP) languages like Python, compound data and structured data are frequently represented at runtime as native objects. At times these objects belong to basic datatypes (lists, tuples, and dictionaries), but more often, once you reach a certain degree of complexity, hierarchies of instances containing attributes become more likely.

For simple objects, especially sequences, serialization and storage is rather straightforward. For example, lists can easily be represented in delimited or fixed-length strings. Lists-of-lists can be saved in line-oriented files, each line containing delimited fields, or in rows of RDBMS tables. But once the dimension of nested sequences goes past two, and even more so for heterogeneous data structures, traditional table-oriented storage is a less-obvious fit.
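As a minimal sketch of the line-oriented approach, the functions below write a list-of-lists as tab-delimited lines and read it back. The names dump_rows() and load_rows() are illustrative only, not library functions, and the code assumes a Python 3 interpreter and fields that contain no tabs or newlines.

```python
def dump_rows(rows, delim="\t"):
    """Serialize a list-of-lists as one delimited line per inner list."""
    return "\n".join(delim.join(str(field) for field in row) for row in rows)

def load_rows(text, delim="\t"):
    """Rebuild the list-of-lists; every field comes back as a string."""
    return [line.split(delim) for line in text.split("\n")]

rows = [["alice", 38, "NY"], ["bob", 27, "CA"]]
text = dump_rows(rows)        # two lines, three fields each
restored = load_rows(text)    # numbers are now the strings "38" and "27"
```

Note that type information is lost in the round trip, which is one reason the more general serializers discussed in this section exist.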

While it is possible to create "object/relational adaptors" that write OOP instances to flat tables, that usually requires custom programming. A number of more general solutions exist, both in the Python standard library and in third-party tools. There are actually two separate issues involved in storing Python objects. The first issue is how to convert them into strings in the first place; the second issue is how to create a general persistence mechanism for such serialized objects. At a minimal level, of course, it is simple enough to store (and retrieve) a serialization string the same way you would any other string: to a file, a database, and so on. The various *dbm modules create a "dictionary on disk," while the shelve module automatically utilizes cPickle serialization to write arbitrary objects as values (keys are still strings).

Several third-party modules support object serialization with special features. If you need an XML dialect for your object representation, the modules gnosis.xml.pickle and xmlrpclib are useful. The YAML format is both human-readable/editable and has support libraries for Python, Perl, Ruby, and Java; using these various libraries, you can exchange objects between these several programming languages.

SEE ALSO: gnosis.xml.pickle 410; yaml 415; xmlrpclib 407;

DBM Interfaces to dbm-style databases

A dbm-style database is a "dictionary on disk." Using a database of this sort allows you to store a set of key/val pairs to a file, or files, on the local filesystem, and to access and set them as if they were an in-memory dictionary. A dbm-style database, unlike a standard dictionary, always maps strings to strings. If you need to store other types of objects, you will need to convert them to strings (or use the shelve module as a wrapper).

Depending on your platform, and on which external libraries are installed, different dbm modules might be available. The performance characteristics of the various modules vary significantly. As well, some DBM modules support some special functionality. Most of the time, however, your best approach is to access the locally supported DBM module using the wrapper module anydbm. Calls to this module will select the best available DBM for the current environment without a programmer or user having to worry about the underlying support mechanism.

Functions and methods are documented using the nonspecific capitalized form DBM. In real usage, you would use the name of a specific module. Most of the time, you will get or set DBM values using standard named indexing; for example, db["key"]. A few methods characteristic of dictionaries are also supported, as well as a few methods special to DBM databases.
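The "dictionary on disk" idea can be sketched as follows. This example assumes Python 3, where the wrapper module anydbm described here is named dbm; the file name example_db is arbitrary. In Python 3 the stored keys and values come back as bytes rather than strings.

```python
import dbm
import os
import tempfile

# Work in a scratch directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example_db")

    # 'c' creates the database if it does not exist yet.
    db = dbm.open(path, "c")
    db["key"] = "value"          # set with named indexing
    db["other"] = "another"
    stored = db["key"]           # read with named indexing (bytes in Python 3)
    keys = sorted(db.keys())     # keys also come back as bytes
    db.close()

    # Reopen read-only; the data persisted on disk.
    db = dbm.open(path, "r")
    reread = db["key"]
    db.close()
```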

SEE ALSO: shelve 98; dict 24; UserDict 24;

FUNCTIONS
DBM.open(fname [,flag="r" [,mode=0666]])

Open the filename fname for dbm access. The optional argument flag specifies how the database is accessed. A value of r is for read-only access (on an existing dbm file); w opens an already existing file for read/write access; c will create a database or use an existing one, with read/write access; the option n will always create a new database, erasing the one named in fname if it already existed. The optional mode argument specifies the Unix mode of the file(s) created.
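A short sketch of how the flag values behave, again assuming Python 3 (module dbm in place of anydbm) and an arbitrary file name:

```python
import dbm
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "flags_db")

    # 'r' requires an existing database, so this attempt fails.
    try:
        dbm.open(path, "r")
        missing_raises = False
    except dbm.error:
        missing_raises = True

    # 'c' creates the file(s) when needed, with read/write access.
    db = dbm.open(path, "c")
    db["kept"] = "yes"
    db.close()

    # 'w' reopens the existing database for read/write access.
    db = dbm.open(path, "w")
    db["added"] = "later"
    count_after_w = len(db.keys())
    db.close()

    # 'n' always starts over with an empty database.
    db = dbm.open(path, "n")
    count_after_n = len(db.keys())
    db.close()
```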

METHODS
DBM.close()

Close the database and flush any pending writes.

DBM.first()

Return the first key/val pair in the DBM. The order is arbitrary but stable. You may use the DBM.first() method, combined with repeated calls to DBM.next(), to process every item in the dictionary.

In Python 2.2+, you can implement an items() function to emulate the behavior of the .items() method of dictionaries for DBMs:

    >>> from __future__ import generators
    >>> def items(db):
    ...     try:
    ...         yield db.first()
    ...         while 1:
    ...             yield db.next()
    ...     except KeyError:
    ...         raise StopIteration
    ...
    >>> for k,v in items(d):   # typical usage
    ...     print k,v
DBM.has_key(key)

Return a true value if the DBM has the key key.

DBM.keys()

Return a list of string keys in the DBM.

DBM.last()

Return the last key/val pair in the DBM. The order is arbitrary but stable. You may use the DBM.last() method, combined with repeated calls to DBM.previous(), to process every item in the dictionary in reverse order.

DBM.next()

Return the next key/val pair in the DBM. A pointer to the current position is always maintained, so the methods DBM.next() and DBM.previous() can be used to access relative items.

DBM.previous()

Return the previous key/val pair in the DBM. A pointer to the current position is always maintained, so the methods DBM.next() and DBM.previous() can be used to access relative items.

DBM.sync()

Force any pending data to be written to disk.

SEE ALSO: FILE.flush() 16;

MODULES
anydbm

Generic interface to underlying DBM support. Calls to this module use the functionality of the "best available" DBM module. If you open an existing database file, its type is guessed and used assuming the current machine supports that style.

SEE ALSO: whichdb 93;

bsddb

Interface to the Berkeley DB library.

dbhash

Interface to the BSD DB library.

dbm

Interface to the Unix (n)dbm library.

dumbdbm

Interface to slow, but portable pure Python DBM.

gdbm

Interface to the GNU DBM (GDBM) library.

whichdb

Guess which db package to use to open a db file. This module contains the single function whichdb.whichdb(). If you open an existing DBM file with anydbm, this function is called automatically behind the scenes.
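The guessing step can be illustrated with a small sketch. It assumes Python 3, where the function lives at dbm.whichdb() and the pure-Python backend dumbdbm is named dbm.dumb; the file name is arbitrary.

```python
import dbm
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "portable_db")

    # Create a database with the pure-Python backend explicitly.
    db = dbm.dumb.open(path, "c")
    db["key"] = "value"
    db.close()

    # Ask which backend is needed to open the file(s) again.
    guessed = dbm.whichdb(path)

    # The generic open() performs the same guess behind the scenes.
    db = dbm.open(path, "r")
    value = db["key"]
    db.close()
```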

SEE ALSO: shelve 98;

cPickle Fast Python object serialization

pickle Standard Python object serialization

The module cPickle is a comparatively fast C implementation of the pure Python pickle module. The streams produced and read by cPickle and pickle are interchangeable. The only time you should prefer pickle is in the uncommon case where you wish to subclass the pickling base class; cPickle is many times faster to use. The class pickle.Pickler is not documented here.

The cPickle and pickle modules support both a binary and an ASCII format. Neither is designed for human readability, but it is not hugely difficult to read an ASCII pickle. Nonetheless, if readability is a goal, yaml or gnosis.xml.pickle are better choices. Binary format produces smaller pickles that are faster to write or load.
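The size difference between the two formats can be checked with a sketch like the one below. It assumes Python 3, where cPickle has been folded into pickle and the format is chosen with a protocol argument rather than the bin flag documented here (protocol 0 is the ASCII format; higher protocols are binary).

```python
import pickle

data = {"numbers": list(range(1000)), "name": "example"}

# Protocol 0 is the original ASCII format.
ascii_pickle = pickle.dumps(data, protocol=0)
# The highest available protocol is a binary format.
binary_pickle = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)

# Either stream restores an equal object.
from_ascii = pickle.loads(ascii_pickle)
from_binary = pickle.loads(binary_pickle)

sizes = (len(ascii_pickle), len(binary_pickle))
```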

It is possible to fine-tune the pickling behavior of objects by defining the methods .__getstate__(), .__setstate__(), and .__getinitargs__(). The particular black magic invocations involved in defining these methods, however, are not addressed in this book and are rarely necessary for "normal" objects (i.e., those that represent data structures).

Use of the cPickle or pickle module is quite simple:

    >>> import cPickle
    >>> from somewhere import my_complex_object
    >>> s = cPickle.dumps(my_complex_object)
    >>> new_obj = cPickle.loads(s)
FUNCTIONS
pickle.dump(o, file [,bin=0])
cPickle.dump(o, file [,bin=0])

Write a serialized form of the object o to the file-like object file. If the optional argument bin is given a true value, use binary format.

pickle.dumps(o [,bin=0])
cPickle.dumps(o [,bin=0])

Return a serialized form of the object o as a string. If the optional argument bin is given a true value, use binary format.

pickle.load(file)
cPickle.load(file)

Return an object that was serialized as the contents of the file-like object file.

pickle.loads(s)
cPickle.loads(s)

Return an object that was serialized in the string s.
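The file-oriented functions work with any file-like object. The sketch below assumes Python 3 (pickle only, no cPickle) and uses io.BytesIO as a stand-in for a file opened in binary mode.

```python
import io
import pickle

record = {"id": 7, "tags": ["a", "b"], "point": (1.5, -2.0)}

# Any object with a write() method that accepts bytes will do.
buffer = io.BytesIO()
pickle.dump(record, buffer)

buffer.seek(0)                   # rewind before reading back
restored = pickle.load(buffer)   # an equal, but distinct, object
```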

SEE ALSO: gnosis.xml.pickle 410; yaml 415;

marshal

Internal Python object serialization. For more general object serialization, use pickle, cPickle, or gnosis.xml.pickle, or the YAML tools at <http://yaml.org>; marshal is a limited-purpose serialization to the pseudo-compiled byte-code format used by Python .pyc files.

pprint Pretty-print basic datatypes

The module pprint is similar to the built-in function repr() and the module repr. The purpose of pprint is to represent objects of basic datatypes in a more readable fashion, especially in cases where collection types nest inside each other. In simple cases pprint.pformat() and repr() produce the same result; for more complex objects, pprint uses newlines and indentation to illustrate the structure of a collection. Where possible, the string representation produced by pprint functions can be used to re-create objects with the built-in eval().

I find the module pprint somewhat limited in that it does not produce a particularly helpful representation of objects of custom types, which might themselves represent compound data. Instance attributes are very frequently used in a manner similar to dictionary keys. For example:

    >>> import pprint
    >>> dct = {1.7:2.5, ('t','u','p'):['l','i','s','t']}
    >>> dct2 = {'this':'that', 'num':38, 'dct':dct}
    >>> class Container: pass
    ...
    >>> inst = Container()
    >>> inst.this, inst.num, inst.dct = 'that', 38, dct
    >>> pprint.pprint(dct2)
    {'dct': {('t', 'u', 'p'): ['l', 'i', 's', 't'], 1.7: 2.5},
     'num': 38,
     'this': 'that'}
    >>> pprint.pprint(inst)
    <__main__.Container instance at 0x415770>

In the example, dct2 and inst have the same structure, and either might plausibly be chosen in an application as a data container. But the latter pprint representation only tells us the barest information about what an object is, not what data it contains. The mini-module below enhances pretty-printing:

pprint2.py
    from pprint import pformat
    import string, sys
    def pformat2(o):
        # Instances get a header line plus one line per attribute
        if hasattr(o,'__dict__'):
            lines = []
            klass = o.__class__.__name__
            module = o.__module__
            desc = '<%s.%s instance at 0x%x>' % (module, klass, id(o))
            lines.append(desc)
            for k,v in o.__dict__.items():
                lines.append('instance.%s=%s' % (k, pformat(v)))
            return string.join(lines,'\n')
        else:
            # Everything else falls back to the standard formatter
            return pformat(o)
    def pprint2(o, stream=sys.stdout):
        stream.write(pformat2(o)+'\n')

Continuing the session above, we get a more useful report:

    >>> import pprint2
    >>> pprint2.pprint2(inst)
    <__main__.Container instance at 0x415770>
    instance.this='that'
    instance.dct={('t', 'u', 'p'): ['l', 'i', 's', 't'], 1.7: 2.5}
    instance.num=38
FUNCTIONS
pprint.isreadable(o)

Return a true value if the equality below holds:

 o == eval(pprint.pformat(o)) 
pprint.isrecursive(o)

Return a true value if the object o contains recursive containers. Objects that contain themselves at any nested level cannot be restored with eval().

pprint.pformat(o)

Return a formatted string representation of the object o.

pprint.pprint(o [,stream=sys.stdout])

Print the formatted representation of the object o to the file-like object stream.
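The module-level functions can be exercised together in a brief sketch, written here for Python 3 (the function names are unchanged from those documented above):

```python
import pprint

nested = {"num": 38, "dct": {("t", "u", "p"): ["l", "i", "s", "t"], 1.7: 2.5}}

text = pprint.pformat(nested)            # formatted string, nothing printed
readable = pprint.isreadable(nested)     # can eval() rebuild the object?
rebuilt = eval(text)                     # safe here: we produced the text ourselves

# A list that contains itself is recursive and cannot be restored.
loop = [1, 2]
loop.append(loop)
recursive = pprint.isrecursive(loop)
loop_readable = pprint.isreadable(loop)
```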

CLASSES
pprint.PrettyPrinter(width=80, depth=..., indent=1, stream=sys.stdout)

Return a pretty-printing object that will format using a width of width, will limit recursion to depth depth, and will indent each new level by indent spaces. The method pprint.PrettyPrinter.pprint() will write to the file-like object stream.

    >>> pp = pprint.PrettyPrinter(width=30)
    >>> pp.pprint(dct2)
    {'dct': {1.7: 2.5,
             ('t', 'u', 'p'): ['l',
                               'i',
                               's',
                               't']},
     'num': 38,
     'this': 'that'}
METHODS

The class pprint.PrettyPrinter has the same methods as the module level functions. The only difference is that the stream used for pprint.PrettyPrinter.pprint() is configured when an instance is initialized rather than passed as an optional argument.
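As a sketch of configuring the stream and depth at initialization, the Python 3 example below writes to an in-memory io.StringIO object instead of sys.stdout. The sample data is invented for illustration.

```python
import io
import pprint

dct2 = {"this": "that", "num": 38,
        "dct": {1.7: 2.5, "deep": {"deeper": [1, 2]}}}

# The stream is fixed when the instance is created.
out = io.StringIO()
pp = pprint.PrettyPrinter(width=30, depth=2, indent=1, stream=out)
pp.pprint(dct2)
report = out.getvalue()

# The pformat() method returns the same text without writing it anywhere.
same_text = pp.pformat(dct2) + "\n"
```

Containers nested more deeply than depth are abbreviated with an ellipsis in the output.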

SEE ALSO: gnosis.xml.pickle 410; yaml 415;

repr Alternative object representation

The module repr contains code for customizing the string representation of objects. In its default behavior the function repr.repr() provides a length-limited string representation of objects; in the case of large collections, displaying the entire collection can be unwieldy, and unnecessary for merely distinguishing objects. For example:

    >>> dct = dict([(n,str(n)) for n in range(6)])
    >>> repr(dct)     # much worse for, e.g., 1000 item dict
    "{0: '0', 1: '1', 2: '2', 3: '3', 4: '4', 5: '5'}"
    >>> from repr import repr
    >>> repr(dct)
    "{0: '0', 1: '1', 2: '2', 3: '3', ...}"
    >>> `dct`
    "{0: '0', 1: '1', 2: '2', 3: '3', 4: '4', 5: '5'}"

The back-tick operator does not change behavior if the built-in repr() function is replaced.

You can change the behavior of repr.repr() by modifying attributes of the instance object repr.aRepr.

    >>> dct = dict([(n,str(n)) for n in range(6)])
    >>> repr(dct)
    "{0: '0', 1: '1', 2: '2', 3: '3', 4: '4', 5: '5'}"
    >>> import repr
    >>> repr.repr(dct)
    "{0: '0', 1: '1', 2: '2', 3: '3', ...}"
    >>> repr.aRepr.maxdict = 5
    >>> repr.repr(dct)
    "{0: '0', 1: '1', 2: '2', 3: '3', 4: '4', ...}"

In my opinion, the choice of the name for this module is unfortunate, since it is identical to that of the built-in function. You can avoid some of the collision by using the as form of importing, as in:

    >>> import repr as _repr
    >>> from repr import repr as newrepr

For fine-tuned control of object representation, you may subclass the class repr.Repr. Potentially, you could use substitutable repr() functions to change the behavior of application output, but if you anticipate such a need, it is better practice to give a name that indicates this; for example, overridable_repr().

CLASSES
repr.Repr()

Base for customized object representations. The instance repr.aRepr automatically exists in the module namespace, so this class is useful primarily as a parent class. To change an attribute, it is simplest just to set it in an instance.
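Setting attributes on an instance can be sketched as follows. The example assumes Python 3, where this module has been renamed reprlib (which also removes the name collision with the built-in). It uses a private Repr instance so that the shared aRepr object is left untouched.

```python
import reprlib

r = reprlib.Repr()                   # private instance; reprlib.aRepr is not modified

dct = {n: str(n) for n in range(6)}
default_dict = r.repr(dct)           # maxdict defaults to 4

r.maxdict = 5
wider_dict = r.repr(dct)             # one more item before the ellipsis

long_string = r.repr("x" * 100)      # maxstring defaults to 30 characters

r.maxlist = 2
short_list = r.repr(list(range(10)))
```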

ATTRIBUTES
repr.maxlevel

Depth of recursive objects to follow.

repr.maxdict
repr.maxlist
repr.maxtuple

Number of items in a collection of the indicated type to include in the representation. Sequences default to 6, dicts to 4.

repr.maxlong

Number of digits of a long integer to stringify. Default is 40.

repr.maxstring

Length of string representation (e.g., s[:N]). Default is 30.

repr.maxother

"Catch-all" maximum length of other representations.

FUNCTIONS
repr.repr(o)

Behaves like built-in repr(), but potentially with a different string representation created.

repr.repr_TYPE(o, level)

Represent an object of the type TYPE, where the names used are the standard type names. The argument level indicates the level of recursion when this method is called (you might want to decide what to print based on how deep within the representation the object is). The Python Library Reference gives the example:

    import repr, sys
    class MyRepr(repr.Repr):
        def repr_file(self, obj, level):
            # Standard streams are shown by name alone
            if obj.name in ['<stdin>', '<stdout>', '<stderr>']:
                return obj.name
            else:
                return `obj`
    aRepr = MyRepr()
    print aRepr.repr(sys.stdin)          # prints '<stdin>'
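The same repr_TYPE technique can be sketched for Python 3, where the module is named reprlib. The subclass name ShortFloatRepr and its two-decimal formatting are invented for illustration; the method name repr_float is what ties the method to the float type.

```python
import reprlib

class ShortFloatRepr(reprlib.Repr):
    """Hypothetical subclass that abbreviates floats to two decimals."""

    def repr_float(self, obj, level):
        # Called for every float, including those nested in containers.
        return "%.2f" % obj

short = ShortFloatRepr()
flat = short.repr(3.14159)
nested = short.repr([3.14159, 2.71828, "text"])
```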

shelve General persistent dictionary

The module shelve builds on the capabilities of the DBM modules, but takes things a step forward. Unlike with the DBM modules, you may write arbitrary Python objects as values in a shelve database. The keys in shelve databases, however, must still be strings.

The methods of shelve databases are generally the same as those for their underlying DBMs. However, shelves do not have the .first(), .last(), .next(), or .previous() methods; nor do they have the .items() method that actual dictionaries do. Most of the time you will simply use name-indexed assignment and access. But from time to time, the available shelve.get(), shelve.keys(), shelve.sync(), shelve.has_key(), and shelve.close() methods are useful.

Usage of a shelve consists of a few simple steps like the ones below:

    >>> import shelve
    >>> sh = shelve.open('test_shelve')
    >>> sh.keys()
    ['this']
    >>> sh['new_key'] = {1:2, 3:4, ('t','u','p'):['l','i','s','t']}
    >>> sh.keys()
    ['this', 'new_key']
    >>> sh['new_key']
    {1: 2, 3: 4, ('t', 'u', 'p'): ['l', 'i', 's', 't']}
    >>> del sh['this']
    >>> sh.keys()
    ['new_key']
    >>> sh.close()

In the example, I opened an existing shelve, and the previously existing key/value pair was available. Deleting a key/value pair is the same as doing so from a standard dictionary. Opening a new shelve automatically creates the necessary file(s).
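A self-contained version of these steps, including creation of the shelve, might look like the sketch below. It assumes Python 3.4 or later, where a shelve can be used in a with statement, and it works in a temporary directory so that nothing is left on disk.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "test_shelve")

    # Opening a new shelve creates the underlying file(s).
    with shelve.open(path) as sh:
        sh["this"] = "that"
        sh["new_key"] = {1: 2, 3: 4, ("t", "u", "p"): ["l", "i", "s", "t"]}

    # On reopening, the previously stored pairs are available again.
    with shelve.open(path) as sh:
        before = sorted(sh.keys())
        value = sh["new_key"]      # unpickled back into a dictionary
        del sh["this"]             # same as deleting from a dictionary
        after = sorted(sh.keys())
```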

Although shelve only allows strings to be used as keys, in a pinch it is not difficult to generate strings that characterize other types of immutable objects. For the same reasons that you do not generally want to use mutable objects as dictionary keys, it is also a bad idea to use mutable objects as shelve keys. Using the built-in hash() function is a good way to generate strings, but keep in mind that this technique does not strictly guarantee uniqueness, so it is possible (but unlikely) to accidentally overwrite entries using this hack:

    >>> '%x' % hash((1,2,3,4,5))
    '866123f4'
    >>> '%x' % hash(3.1415)
    '6aad0902'
    >>> '%x' % hash(38)
    '26'
    >>> '%x' % hash('38')
    '92bb58e3'

Integers, notice, are their own hash, and strings of digits are common. Therefore, if you adopted this approach, you would want to hash strings as well, before using them as keys. There is no real problem with doing so, merely an extra indirection step that you need to remember to use consistently:

    >>> sh['%x' % hash('another_key')] = 'another value'
    >>> sh.keys()
    ['new_key', '8f9ef0ca']
    >>> sh['%x' % hash('another_key')]
    'another value'
    >>> sh['another_key']
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
      File "/sw/lib/python2.2/shelve.py", line 70, in __getitem__
        f = StringIO(self.dict[key])
    KeyError: another_key
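One caution if you try this on Python 3: string hashes are salted per interpreter run by default, so keys derived from the built-in hash() of a string will generally not match between sessions. The sketch below swaps in a different technique, a SHA-1 digest of repr(obj) from the hashlib module, which is stable across runs for objects whose repr() is stable. The helper name key_for() is invented, and a plain dictionary stands in for an open shelve.

```python
import hashlib

def key_for(obj):
    """Hypothetical helper: derive a string key from an immutable object."""
    return hashlib.sha1(repr(obj).encode("utf-8")).hexdigest()

store = {}                                   # stands in for an open shelve
store[key_for((1, 2, 3, 4, 5))] = "a tuple"
store[key_for(38)] = "an integer"
store[key_for("38")] = "a string of digits"  # distinct from the integer 38

found = store[key_for(38)]
```

As with hash(), the indirection must be applied consistently on both storage and lookup.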

If you want to go beyond the capabilities of shelve in several ways, you might want to investigate the third-party library Zope Object Database (ZODB). ZODB allows arbitrary objects to be persistent, not only dictionary-like objects. Moreover, ZODB lets you store data in ways other than in local files, and also has adaptors for multiuser simultaneous access. Look for details at:

<http://www.zope.org/Wikis/ZODB/StandaloneZODB>

SEE ALSO: DBM 90; dict 24;

The rest of the listed modules are comparatively unlikely to be needed in text processing applications. Some modules are specific to a particular platform; if so, this is indicated parenthetically. Recent distributions of Python have taken a "batteries included" approach: much more is included in a base Python distribution than is with other free programming languages (but other popular languages still have a range of existing libraries that can be downloaded separately).

1.3.2 Platform-Specific Operations

_winreg

Access to the Windows registry (Windows).

AE

AppleEvents (Macintosh; replaced by Carbon.AE).

aepack

Conversion between Python variables and AppleEvent data containers (Macintosh).

aetypes

AppleEvent objects (Macintosh).

applesingle

Rudimentary decoder for AppleSingle format files (Macintosh).

buildtools

Build MacOS applets (Macintosh).

calendar

Print calendars, much like the Unix cal utility. A variety of functions allow you to print or stringify calendars for various time frames. For example,

    >>> import calendar
    >>> print calendar.month(2002,11)
        November 2002
    Mo Tu We Th Fr Sa Su
                 1  2  3
     4  5  6  7  8  9 10
    11 12 13 14 15 16 17
    18 19 20 21 22 23 24
    25 26 27 28 29 30
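Besides the printable form, the same month is available as data. The sketch below, written for Python 3, uses calendar.month() for the string and calendar.monthcalendar() for a list of weeks, in which days outside the month appear as zeros.

```python
import calendar

text = calendar.month(2002, 11)           # multi-line string, Monday first by default
lines = text.splitlines()

weeks = calendar.monthcalendar(2002, 11)  # one list of seven day numbers per week
first_week = weeks[0]                     # November 1, 2002 fell on a Friday
```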
Carbon.AE, Carbon.App, Carbon.CF, Carbon.Cm, Carbon.Ctl, Carbon.Dlg, Carbon.Evt, Carbon.Fm, Carbon.Help, Carbon.List, Carbon.Menu, Carbon.Mlte, Carbon.Qd, Carbon.Qdoffs, Carbon.Qt, Carbon.Res, Carbon.Scrap, Carbon.Snd, Carbon.TE, Carbon.Win

Interfaces to Carbon API (Macintosh).

cd

CD-ROM access on SGI systems (IRIX).

cfmfile

Code Fragment Resource module (Macintosh).

ColorPicker

Interface to the standard color selection dialog (Macintosh).

ctb

Interface to the Communications Tool Box (Macintosh).

dl

Call C functions in shared objects (Unix).

EasyDialogs

Basic Macintosh dialogs (Macintosh).

fcntl

Access to Unix fcntl() and ioctl() system functions (Unix).

findertools

AppleEvents interface to MacOS finder (Macintosh).

fl, FL, flp

Functions and constants for working with the FORMS library (IRIX).

fm, FM

Functions and constants for working with the Font Manager library (IRIX).

fpectl

Floating point exception control (Unix).

FrameWork, MiniAEFrame

Structured development of MacOS applications (Macintosh).

gettext

The module gettext eases the development of multilingual applications. While actual translations must be performed manually, this module aids in identifying strings for translation and runtime substitutions of language-specific strings.

grp

Information on Unix groups (Unix).

locale

Control the language and regional settings for an application. The locale setting affects the behavior of several functions, such as time.strftime() and string.lower(). The locale module is also useful for creating strings such as numbers with grouped digits and currency strings for specific nations.

mac, macerrors, macpath

Macintosh implementation of os module functionality. It is generally better to use os directly and let it call mac where needed (Macintosh).

macfs, macfsn, macostools

Filesystem services (Macintosh).

MacOS

Access to MacOS Python interpreter (Macintosh).

macresource

Locate script resources (Macintosh).

macspeech

Interface to Speech Manager (Macintosh).

mactty

Easy access to serial line connections (Macintosh).

mkcwproject

Create CodeWarrior projects (Macintosh).

msvcrt

Miscellaneous Windows-specific functions provided in Microsoft's Visual C++ Runtime libraries (Windows).

Nav

Interface to Navigation Services (Macintosh).

nis

Access to Sun's NIS Yellow Pages (Unix).

pipes

Manage pipes at a finer level than done by os.popen() and its relatives. Reliability varies between platforms (Unix).

PixMapWrapper

Wrap PixMap objects (Macintosh).

posix, posixfile

Access to operating system functionality under Unix. The os module provides a more portable version of the same functionality and should be used instead (Unix).

preferences

Application preferences manager (Macintosh).

pty

Pseudo terminal utilities (IRIX, Linux).

pwd

Access to Unix password database (Unix).

pythonprefs

Preferences manager for Python (Macintosh).

py_resource

Helper to create PYC resources for compiled applications (Macintosh).

quietconsole

Buffered, nonvisible STDOUT output (Macintosh).

resource

Examine resource usage (Unix).

syslog

Interface to Unix syslog library (Unix).

tty, termios, TERMIOS

POSIX tty control (Unix).

W

Widgets for the Mac (Macintosh).

waste

Interface to the WorldScript-Aware Styled Text Engine (Macintosh).

winsound

Interface to audio hardware under Windows (Windows).

xdrlib

Implements (a subset of) Sun eXternal Data Representation (XDR). In concept, xdrlib is similar to the struct module, but the format is less widely used.

1.3.3 Working with Multimedia Formats

aifc

Read and write AIFC and AIFF audio files. The interface to aifc is the same as for the sunau and wave modules.

al, AL

Audio functions for SGI (IRIX).

audioop

Manipulate raw audio data.

chunk

Read chunks of IFF audio data.

colorsys

Convert between RGB color model and YIQ, HLS, and HSV color spaces.

gl, DEVICE, GL

Functions and constants for working with Silicon Graphics' Graphics Library (IRIX).

imageop

Manipulate image data stored as Python strings. For most operations on image files, the third-party Python Imaging Library (usually called "PIL"; see <http://www.pythonware.com/products/pil/>) is a versatile and powerful tool.

imgfile

Support for imglib files (IRIX).

jpeg

Read and write JPEG files on SGI (IRIX). The Python Imaging Library (<http://www.pythonware.com/products/pil/>) provides a cross-platform means of working with a large number of image formats and is preferable for most purposes.

rgbimg

Read and write SGI RGB files (IRIX).

sunau

Read and write Sun AU audio files. The interface to sunau is the same as for the aifc and wave modules.

sunaudiodev, SUNAUDIODEV

Interface to Sun audio hardware (SunOS/Solaris).

videoreader

Read QuickTime movies frame by frame (Macintosh).

wave

Read and write WAV audio files. The interface to wave is the same as for the aifc and sunau modules.

1.3.4 Miscellaneous Other Modules

array

Typed arrays of numeric values. More efficient than standard Python lists, where applicable.

atexit

Exit handlers. Same functionality as sys.exitfunc, but different interface.

BaseHTTPServer, SimpleHTTPServer, SimpleXMLRPCServer, CGIHTTPServer

HTTP server classes. BaseHTTPServer should usually be treated as an abstract class. The other modules provide sufficient customization for usage in the specific context indicated by their names. All may be customized for your application's needs.

Bastion

Restricted object access. Used in conjunction with rexec.

bisect

List insertion maintaining sort order.
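A minimal Python 3 sketch of what "maintaining sort order" means in practice; the score values are invented for illustration.

```python
import bisect

scores = [10, 20, 40]                       # must already be sorted

bisect.insort(scores, 30)                   # inserted between 20 and 40
bisect.insort(scores, 5)                    # inserted at the front

position = bisect.bisect_left(scores, 30)   # index where 30 is found
```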

cmath

Mathematical functions over complex numbers.

cmd

Build line-oriented command interpreters.

code

Utilities to emulate Python's interactive interpreter.

codeop

Compile possibly incomplete Python source code.

compileall

Module/script to compile .py files to cached byte-code files.

compiler, compiler.ast, compiler.visitor

Analyze Python source code and generate Python byte-codes.

copy_reg

Helper to provide extensibility for pickle/cPickle.

curses, curses.ascii, curses.panel, curses.textpad, curses.wrapper

Full-screen terminal handling with the (n)curses library.

dircache

Cached directory listing. This module enhances the functionality of os.listdir().

dis

Disassembler of Python byte-code into mnemonics.

distutils

Build and install Python modules and packages. distutils provides a standard mechanism for creating distribution packages of Python tools and libraries, and also for installing them on target machines. Although distutils is likely to be useful for text processing applications that are distributed to users, a discussion of the details of working with distutils is outside the scope of this book. Useful information can be found in the Python standard documentation, especially Greg Ward's Distributing Python Modules and Installing Python Modules.

doctest

Check the accuracy of __doc__ strings.

errno

Standard errno system symbols.

fpformat

General floating point formatting functions. Duplicates string interpolation functionality.

gc

Control Python's (optional) cyclic garbage collection.

getpass

Utilities to collect a password without echoing to screen.

imp

Access the internals of the import statement.

inspect

Get useful information from live Python objects for Python 2.1+.

keyword

Check whether string is a Python keyword.

math

Various trigonometric and algebraic functions and constants. These functions generally operate on floating point numbers; use cmath for calculations on complex numbers.
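The division of labor between the two modules shows up in a short Python 3 sketch: math refuses a result that is not a real number, while cmath returns the complex value.

```python
import cmath
import math

hyp = math.hypot(3.0, 4.0)          # length of the hypotenuse

# math works only with real results, so this raises ValueError.
try:
    math.sqrt(-1.0)
    real_sqrt_failed = False
except ValueError:
    real_sqrt_failed = True

root = cmath.sqrt(-1.0)             # the imaginary unit
```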

mutex

Work with mutual exclusion locks, typically for threaded applications.

new

Create special Python objects in customizable ways. For example, Python hackers can create a module object without using a file of the same name or create an instance while bypassing the normal .__init__() call. "Normal" techniques generally suffice for text processing applications.

pdb

A Python debugger.

popen2

Functions to spawn commands with pipes to STDIN, STDOUT, and optionally STDERR. In Python 2.0+, this functionality is copied to the os module in slightly improved form. Generally you should use the os module (unless you are running Python 1.5.2 or earlier).

profile

Profile the performance characteristics of Python code. If speed becomes an issue in your application, your first step in solving the problem should be profiling the code. But details of using profile are outside the scope of this book. Moreover, it is usually a bad idea to assume speed is a problem until it is actually found to be so.

pstats

Print reports on profiled Python code.

pyclbr

Python class browser; useful for implementing code development environments for editing Python.

pydoc

Extremely useful script and module for examining Python documentation. pydoc is included with Python 2.1+, but is compatible with earlier versions if downloaded. pydoc can provide help similar to Unix man pages, help in the interactive shell, and also a Web browser interface to documentation. This tool is worth using frequently while developing Python applications, but its details are outside the scope of this book.

py_compile

"Compile" a .py file to a .pyc (or .pyo) file.

Queue

A multiproducer, multiconsumer queue, especially for threaded programming.

readline, rlcompleter

Interface to GNU readline (Unix).

rexec

Restricted execution facilities.

sched

General event scheduler.

signal

Handlers for asynchronous events.

site, user

Customizable startup module that can be modified to change the behavior of the local Python installation.

statcache

Maintain a cache of os.stat() information on files. Deprecated in Python 2.2+.

statvfs

Constants for interpreting the results of os.statvfs() and os.fstatvfs().

thread, threading

Create multithreaded applications with Python. Although text processing applications (like other applications) might use a threaded approach, this topic is outside the scope of this book. Most, but not all, Python platforms support threaded applications.

Tkinter, ScrolledText, Tix, turtle

Python interface to TCL/TK and higher-level widgets for TK. Supported on many platforms, but not on all Python installations.

traceback

Extract, format, and print information about Python stack traces. Useful for debugging applications.

unittest

Unit testing framework. Like a number of other documenting, testing, and debugging modules, unittest is a useful facility and its usage is recommended for Python applications in general. But this module is not specific enough to text processing applications to be addressed in this book.

warnings

Python 2.1 added a set of warning messages for conditions a user should be aware of, but that fall below the threshold for raising exceptions. By default, such messages are printed to STDERR, but the warnings module can be used to modify the behavior of warning messages.

weakref

Create references to objects that do not limit garbage collection. At first brush, weak references seem strange, and the strangeness does not really go away quickly. If you do not know why you would want to use these, do not worry about it; you do not need to.
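For the curious, a small Python 3 sketch of the idea: a weak reference gives access to an object while it lives, but does not keep it alive. The class Document is invented for illustration.

```python
import gc
import weakref

class Document:
    """Hypothetical class; instances of ordinary classes can be weakly referenced."""
    def __init__(self, text):
        self.text = text

doc = Document("some large text")
ref = weakref.ref(doc)       # does not count as a reference that keeps doc alive
alive = ref() is doc         # calling the reference returns the object

del doc                      # drop the only strong reference
gc.collect()                 # make collection explicit for interpreters that defer it
gone = ref() is None         # the weak reference now returns None
```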

whrandom

Wichmann-Hill random number generator. Deprecated since Python 2.1, and not necessary to use directly before that; use the module random to create pseudorandom values.



Text Processing in Python
ISBN: 0321112547
EAN: 2147483647
Year: 2005
Pages: 59
Authors: David Mertz
