A C Extension Type String Stack

To implement multiple-instance objects in C, you need to code a C extension type, not a module. Like Python classes, C types generate multiple-instance objects and can overload (i.e., intercept and implement) Python expression operators and type operations. Unlike classes, though, types do not support attribute inheritance by themselves -- attributes are fetched from a flat names table, not a namespace objects tree. That makes sense if you realize that Pythons built-in types are simply precoded C extension types; when you ask for the list append method, for instance, inheritance never enters the picture. We can add inheritance for types by coding "wrapper" classes, but it is a manual process (more on this later).

One of the biggest drawbacks of types, though, is their size -- to implement a realistically equipped C type, you need to code lots of not-very-pretty C code, and fill out type descriptor tables with pointers to link up operation handlers. In fact, C extension types are so complex that Im going to cut some details here. To give you a feel for the overall structure, Example 19-15 presents a C string stack type implementation, but with the bodies of all its functions stripped out. For the complete implementation, see this file on the books CD (see http://examples.oreilly.com/python2).

This C type roughly implements the same interface as the stack classes we met earlier in Chapter 17, but imposes a few limits on the stack itself and does not support specialization by subclassing (its a type, not a class). The stripped parts use the same algorithms as the C module in Example 19-14, but operate on the passed-in self object, which now refers to the particular type instance object being processed, just as the first argument does in class methods. In types, self is a pointer to an allocated C struct that represents a type instance object.

Example 19-15. PP2EIntegrateExtendStacksstacktyp.c
/****************************************************
 * stacktyp.c: a character-string stack data-type;
 * a C extension type, for use in Python programs;
 * stacktype module clients can make multiple stacks;
 * similar to stackmod, but self is the instance,
 * and we can overload sequence operators here;
 ****************************************************/

#include "Python.h"

static PyObject *ErrorObject; /* local exception */
#define onError(message) 
 { PyErr_SetString(ErrorObject, message); return NULL; }

/*****************************************************************************
 * STACK-TYPE INFORMATION
 *****************************************************************************/

#define MAXCHARS 2048
#define MAXSTACK MAXCHARS

typedef struct { /* stack instance object format */
 PyObject_HEAD /* python header: ref-count + &typeobject */
 int top, len; 
 char *stack[MAXSTACK]; /* per-instance state info */
 char strings[MAXCHARS]; /* same as stackmod, but multiple copies */
} stackobject;

/*****************************************************************************
 * INSTANCE METHODS
 *****************************************************************************/

static PyObject * /* on "instance.push(arg)" */
stack_push(self, args) /* self is the stack instance object */
 stackobject *self; /* args are args passed to self.push method */
 PyObject *args; 
{ ...
}
static PyObject *
stack_pop(self, args)
 stackobject *self;
 PyObject *args; /* on "instance.pop( )" */
{ ...
}
static PyObject *
stack_top(self, args)
 stackobject *self;
 PyObject *args;
{ ...
}
static PyObject *
stack_empty(self, args)
 stackobject *self;
 PyObject *args;
{ ...
}
static struct PyMethodDef stack_methods[] = { /* instance methods */
 {"push", stack_push, 1}, /* name/address table */
 {"pop", stack_pop, 1}, /* like list append,sort */
 {"top", stack_top, 1}, 
 {"empty", stack_empty, 1}, /* extra ops besides optrs */
 {NULL, NULL} /* end, for getattr here */
};

/*****************************************************************************
 * BASIC TYPE-OPERATIONS
 *****************************************************************************/

static stackobject * /* on "x = stacktype.Stack( )" */
newstackobject( ) /* instance constructor function */ 
{ ... /* these don	 get an args input */
}
static void /* instance destructor function */
stack_dealloc(self) /* when reference-count reaches zero */
 stackobject *self;
{ ... /* do cleanup activity */
}
static int
stack_print(self, fp, flags)
 stackobject *self;
 FILE *fp;
 int flags; /* print self to file */
{ ...
}
static PyObject *
stack_getattr(self, name) /* on "instance.attr" reference */
 stackobject *self; /* make a bound-method or member */
 char *name; 
{ ...
}
static int
stack_compare(v, w) /* on all comparisons */
 stackobject *v, *w;
{ ...
}

/*****************************************************************************
 * SEQUENCE TYPE-OPERATIONS
 *****************************************************************************/

static int
stack_length(self)
 stackobject *self; /* called on "len(instance)" */
{ ...
}
static PyObject *
stack_concat(self, other)
 stackobject *self; /* on "instance + other" */
 PyObject *other; /* self is the instance */
{ ...
}
static PyObject *
stack_repeat(self, n) /* on "instance * N" */
 stackobject *self; /* new stack = repeat self n times */
 int n;
{ ...
}
static PyObject *
stack_item(self, index) /* on "instance[offset]", "in/for" */
 stackobject *self; /* return the i-th item of self */
 int index; /* negative index pre-adjusted */
{ ...
}
static PyObject *
stack_slice(self, ilow, ihigh)
 stackobject *self; /* on "instance[ilow:ihigh]" */
 int ilow, ihigh; /* negative-adjusted, not scaled */
{ ...
}

/*****************************************************************************
 * TYPE DESCRIPTORS
 *****************************************************************************/

static PySequenceMethods stack_as_sequence = { /* sequence supplement */
 (inquiry) stack_length, /* sq_length "len(x)" */
 (binaryfunc) stack_concat, /* sq_concat "x + y" */
 (intargfunc) stack_repeat, /* sq_repeat "x * n" */
 (intargfunc) stack_item, /* sq_item "x[i], in" */
 (intintargfunc) stack_slice, /* sq_slice "x[i:j]" */
 (intobjargproc) 0, /* sq_ass_item "x[i] = v" */
 (intintobjargproc) 0, /* sq_ass_slice "x[i:j]=v" */
};

static PyTypeObject Stacktype = { /* main python type-descriptor */
 /* type header */ /* shared by all instances */
 PyObject_HEAD_INIT(&PyType_Type) 
 0, /* ob_size */
 "stack", /* tp_name */
 sizeof(stackobject), /* tp_basicsize */
 0, /* tp_itemsize */

 /* standard methods */
 (destructor) stack_dealloc, /* tp_dealloc ref-count==0 */
 (printfunc) stack_print, /* tp_print "print x" */
 (getattrfunc) stack_getattr, /* tp_getattr "x.attr" */
 (setattrfunc) 0, /* tp_setattr "x.attr=v" */
 (cmpfunc) stack_compare, /* tp_compare "x > y" */
 (reprfunc) 0, /* tp_repr `x`, print x */

 /* type categories */
 0, /* tp_as_number +,-,*,/,%,&,>>,...*/
 &stack_as_sequence, /* tp_as_sequence +,[i],[i:j],len, ...*/
 0, /* tp_as_mapping [key], len, ...*/

 /* more methods */
 (hashfunc) 0, /* tp_hash "dict[x]" */
 (ternaryfunc) 0, /* tp_call "x( )" */
 (reprfunc) 0, /* tp_str "str(x)" */

}; /* plus others: see Include/object.h */

/*****************************************************************************
 * MODULE LOGIC 
 *****************************************************************************/

static PyObject *
stacktype_new(self, args) /* on "x = stacktype.Stack( )" */
 PyObject *self; /* self not used */
 PyObject *args; /* constructor args */
{
 if (!PyArg_ParseTuple(args, "")) /* Module-method function */
 return NULL;
 return (PyObject *)newstackobject( ); /* make a new type-instance object */
} /* the hook from module to type... */

static struct PyMethodDef stacktype_methods[] = {
 {"Stack", stacktype_new, 1}, /* one function: make a stack */ 
 {NULL, NULL} /* end marker, for initmodule */
};

void
initstacktype( ) /* on first "import stacktype" */
{
 PyObject *m, *d;
 m = Py_InitModule("stacktype", stacktype_methods); /* make the module, */
 d = PyModule_GetDict(m); /* with Stack func */
 ErrorObject = Py_BuildValue("s", "stacktype.error");
 PyDict_SetItemString(d, "error", ErrorObject); /* export exception */
 if (PyErr_Occurred( ))
 Py_FatalError("can	 initialize module stacktype");

}

.7.1 Anatomy of a C Extension Type

Although most of file stacktyp.c is missing, there is enough here to illustrate the global structure common to C type implementations:

Instance struct

The file starts off by defining a C struct called stackobject that will be used to hold per-instance state information -- each generated instance object gets a newly mallocd copy of the struct. It serves the same function as class instance attribute dictionaries, and contains data that was saved in global variables by the C stack module.

Instance methods

As in the module, a set of instance methods follows next; they implement method calls such as push and pop. But here, method functions process the implied instance object, passed in to the self argument. This is similar in spirit to class methods. Type instance methods are looked up in the registration table of the code listing (Example 19-15) when accessed.

Basic type operations

Next, the file defines functions to handle basic operations common to all types: creation, printing, qualification, and so on. These functions have more specific type signatures than instance method handlers. The object creation handler allocates a new stack struct, and initializes its header fields: the reference count is set to 1, and its type object pointer is set to the Stacktype type descriptor that appears later in the file.

Sequence operations

Functions for handling sequence type operations come next. Stacks respond to most sequence operators: len, +, *, and [i]. Much like the __getitem__ class method, the stack_item indexing handler performs indexing, but also in membership tests and for iterator loops. These latter two work by indexing an object until an IndexError exception is caught by Python.

Type descriptors

The type descriptor tables (really, structs) that appear near the end of the file are the crux of the matter for types -- Python uses these tables to dispatch an operation performed on an instance object to the corresponding C handler function in this file. In fact, everything is routed through these tables; even method attribute lookups start by running a C stack_getattr function listed in the table (which in turn looks up the attribute name in a name/function-pointer table). The main Stacktype table includes a link to the supplemental stack_as_sequence table where sequence operation handlers are registered; types can provide such tables to register handlers for mapping, number, and sequence operation sets. See Pythons integer and dictionary objects source code for number and mapping examples; they are analogous to the sequence type here, but their operation tables vary.[5]

[5] Note that type descriptor layouts, like most C API tools, are prone to change over time, and you should always consult Include/object.h in the Python distribution for an up-to-date list of fields. Some new Python releases may also require that types written to work with earlier releases be recompiled to pick up descriptor changes. As always, see Pythons extension manuals and its full source code distribution for more information and examples.

Constructor module

Besides defining a C type, this file also creates a simple C module at the end that exports a stacktype.Stack constructor function, which Python scripts call to generate new stack instance objects. The initialization function for this module is the only C name in this file that is not static (local to the file); everything else is reached by following pointers -- from instance, to type descriptor, to C handler function.

Again, see the book CD (see http://examples.oreilly.com/python2) for the full C stack type implementation. But to give you the general flavor of C type methods, here is what the C types pop function looks like; compare this with the C modules pop function to see how the self argument is used to access per-instance information in types:

static PyObject *
stack_pop(self, args)
 stackobject *self;
 PyObject *args; /* on "instance.pop()" */
{
 PyObject *pstr;
 if (!PyArg_ParseTuple(args, "")) /* verify no args passed */
 return NULL;
 if (self->top == 0)
 onError("stack underflow") /* return NULL = raise */
 else {
 pstr = Py_BuildValue("s", self->stack[--self->top]);
 self->len -= (strlen(self->stack[self->top]) + 1);
 return pstr;
 }
}

.7.2 Compiling and Running

This C extension file is compiled and dynamically or statically linked like previous examples; file makefile.stack on the CD (see http://examples.oreilly.com/python2) handles the build like this:

stacktype.so: stacktyp.c
	gcc stacktyp.c -g -I$(PY)/Include -I$(PY) -fpic -shared -o stacktype.so

Once compiled, you can import the C module and make and use instances of the C type it defines much as if it were a Python class (but without inheritance). You would normally do this from a Python script, but the interactive prompt is a convenient place to test the basics:

[mark@toy ~/.../PP2E/Integrate/Extend/Stacks]$ python
>>> import stacktype # import C constructor module
>>> x = stacktype.Stack( ) # make C type instance object
>>> x.push(
ew)  # call C type methods
>>> x # call C type print handler
[Stack:
0: 
ew
]

>>> x[0] # call C type index handler

ew
>>> y = stacktype.Stack( ) # make another type instance
>>> for c in SPAM: y.push(c) # a distinct stack object
...
>>> y
[Stack:
3: M
2: A
1: P
0: S
]

>>> z = x + y # call C type concat handler
>>> z
[Stack:
4: M
3: A
2: P
1: S
0: 
ew
]

>>> y.pop( )
M
>>> len(z), z[0], z[-1] # for loops work too (indexing)
(5, 
ew, M)

.7.3 Timing the C Implementations

So how did we do on the optimization front this time? Lets resurrect that timer module we wrote back in Example 17-6 to compare the C stack module and type to the Python stack module and classes we coded in Chapter 17. Example 19-16 calculates the system time in seconds that it takes to run tests on all of this books stack implementations.

Example 19-16. PP2EIntegrateExtendStacksexttime.py
#!/usr/local/bin/python
# time the C stack module and type extensions
# versus the object chapters Python stack implementations

from PP2E.Dstruct.Basic.timer import test # second count function
from PP2E.Dstruct.Basic import stack1 # python stack module 
from PP2E.Dstruct.Basic import stack2 # python stack class: +/slice
from PP2E.Dstruct.Basic import stack3 # python stack class: tuples
from PP2E.Dstruct.Basic import stack4 # python stack class: append/pop
import stackmod, stacktype # c extension type, module

from sys import argv
rept, pushes, pops, items = 200, 200, 200, 200 # default: 200 * (600 ops)
try:
 [rept, pushes, pops, items] = map(int, argv[1:])
except: pass
print 
eps=%d * [push=%d+pop=%d+fetch=%d] % (rept, pushes, pops, items)

def moduleops(mod):
 for i in range(pushes): mod.push(hello) # strings only for C
 for i in range(items): t = mod.item(i)
 for i in range(pops): mod.pop( )

def objectops(Maker): # type has no init args
 x = Maker( ) # type or class instance
 for i in range(pushes): x.push(hello) # strings only for C
 for i in range(items): t = x[i]
 for i in range(pops): x.pop( )

# test modules: python/c
print "Python module:", test(rept, moduleops, stack1)
print "C ext module: ", test(rept, moduleops, stackmod), \n

# test objects: class/type
print "Python simple Stack:", test(rept, objectops, stack2.Stack) 
print "Python tuple Stack:", test(rept, objectops, stack3.Stack)
print "Python append Stack:", test(rept, objectops, stack4.Stack)
print "C ext type Stack: ", test(rept, objectops, stacktype.Stack) 

Running this script on Linux produces the following results. As we saw before, the Python tuple stack is slightly better than the Python in-place append stack in typical use (when the stack is only pushed and popped), but it is slower when indexed. The first test here runs 200 repetitions of 200 stack pushes and pops, or 80,000 stack operations (200 x 400); times listed are test duration seconds:

[mark@toy ~/.../PP2E/Integrate/Extend/Stacks]$ python exttim.py 200 200 200 0
reps=200 * [push=200+pop=200+fetch=0]
Python module: 2.09
C ext module: 0.68

Python simple Stack: 2.15
Python tuple Stack: 0.68
Python append Stack: 1.16
C ext type Stack: 0.5

[mark@toy ~/.../PP2E/Integrate/Extend/Stacks]$ python exttim.py 100 300 300 0
reps=100 * [push=300+pop=300+fetch=0]
Python module: 1.86
C ext module: 0.52

Python simple Stack: 1.91
Python tuple Stack: 0.51
Python append Stack: 0.87
C ext type Stack: 0.38

At least when there are no indexing operations on the stack as in these two tests (just pushes and pops), the C type is only slightly faster than the best Python stack (tuples). In fact, its almost a draw -- in these first two tests, the C type reports only a tenth of a second speedup after 200 stacks and 80,000 stack operations. Its not exactly the kind of performance difference that would generate a bug report.[6]

[6] Interestingly, Python has gotten much faster since this books first edition, relative to C. Back then, the C type was still almost three times faster than the best Python stack (tuples) when no indexing was performed. Today, its almost a draw. One might infer from this that C migrations have become a third as important as they once were.

The C module comes in at roughly three times faster than the Python module, but these results are flawed. The stack1 Python module tested here uses the same slow stack implementation as the Python "simple" stack (stack2). If it was recoded to use the tuple stack representation used in Chapter 17, its speed would be similar to the "tuple" figures listed here, and almost identical to the speed of the C module in the first two tests:

[mark@toy ~/.../PP2E/Integrate/Extend/Stacks]$ python exttim.py 200 200 200 50
reps=200 * [push=200+pop=200+fetch=50]
Python module: 2.17
C ext module: 0.79

Python simple Stack: 2.24
Python tuple Stack: 1.94
Python append Stack: 1.25
C ext type Stack: 0.52

[mark@toy ~/.../PP2E/Integrate/Extend/Stacks]$ python exttim.py
reps=200 * [push=200+pop=200+fetch=200]
Python module: 2.42
C ext module: 1.1

Python simple Stack: 2.54
Python tuple Stack: 19.09
Python append Stack: 1.54
C ext type Stack: 0.63

But under the different usage patterns simulated in these two tests, the C type wins the race. It is about twice as fast as the best Python stack (append) when indexing is added to the test mix, as illustrated by two of the preceding test runs that ran with a nonzero fetch count. Similarly, the C module would be twice as fast as the best Python module coding in this case as well.

In other words, the fastest Python stacks are as good as the C stacks if you stick to pushes and pops, but the C stacks are roughly twice as fast if any indexing is performed. Moreover, since you have to pick one representation, if indexing is possible at all you would likely pick the Python append stack; assuming they represent the best case, C stacks would always be twice as fast.

Of course, the measured time differences are so small that in many applications you won care. Further, the C stacks are much more difficult to program, and achieve their speed by imposing substantial functional limits; in many ways, this is not quite an apples-to-apples comparison. But as a rule of thumb, C extensions can not only integrate existing components for use in Python scripts, they can also optimize time-critical components of pure Python programs. In other scenarios, migration to C might yield an even larger speedup.

On the other hand, C extensions should generally be used only as a last resort. As we learned earlier, algorithms and data structures are often bigger influences on program performance than implementation language. The fact that Python-coded tuple stacks are just as fast as the C stacks under common usage patterns speaks volumes about the importance of data structure representation.

.7.4 Wrapping C Types in Classes

In the current Python implementation, to add inheritance to C types you must have a class somewhere. The most common way to support type customization is to introduce a wrapper class -- a Python class that does little but keep a reference to a type object and pass all operations off to the type. Because such a wrapper adds a class interface on top of the type, though, it allows the underlying type to be subclassed and extended as though the type was a class. This is illustrated in Example 19-17.

Example 19-17. PP2EIntegrateExtendStacksoopstack.py
import stacktype # get the C type/module
class Stack:
 def __init__(self, start=None): # make/wrap a C type-instance
 self._base = start or stacktype.Stack( ) # deleted when class-instance is
 def __getattr__(self, name):
 return getattr(self._base, name) # methods/members: type-instance
 def __cmp__(self, other):
 return cmp(self._base, other)
 def __repr__(self): # print is not really repr
 print self._base,; return \
 def __add__(self, other): # operators: special methods
 return Stack(self._base + other._base) # operators are not attributes
 def __mul__(self, n): 
 return Stack(self._base * n) # wrap result in a new Stack
 def __getitem__(self, i): 
 return self._base[i] # item: index, in, for
 def __len__(self):
 return len(self._base)

This wrapper class can be used the same as the C type, because it delegates all operations to the type instance stored away in the class instances self._base:

[mark@toy ~/.../PP2E/Integrate/Extend/Stacks]$ python
>>> import oopstack
>>> x = oopstack.Stack( )
>>> y = oopstack.Stack( )
>>> x.push(class)
>>> for c in "SPAM": y.push(c)
...
>>> x
[Stack:
0: class
]

>>> y[2]
A
>>> z = x + y
>>> for s in z: print s,
...
class S P A M

>>> z.__methods__, z.__members__, z.pop( )
([empty, pop, push, 	op], [len], M)
>>> type(z), type(z._base)
(, )

The point of coding such a wrapper is to better support extensions in Python. Subclasses really subclass the wrapper class, but because the wrapper is just a thin interface to the type, its like subclassing the type itself, as in Example 19-18.

Example 19-18. PP2EIntegrateExtendStackssubstack.py
from oopstack import Stack # get the stub class (C-type wrapper)

class Substack(Stack):
 def __init__(self, start=[]): # extend the 
ew operation
 Stack.__init__(self) # initialize stack from any sequence
 for str in start: # start can be another stack too
 self.push(str)
 def morestuff(self): # add a new method
 print more stack stuff
 def __getitem__(self, i): # extend item to trace accesses
 print accessing cell, i
 return Stack.__getitem__(self, i)

This subclass extends the type (wrapper) to support an initial value at construction time, prints trace messages when indexed, and introduces a brand new morestuff method. This subclass is limited (e.g., the result of a + is a Stack, not a Substack), but proves the point -- wrappers let you apply inheritance and composition techniques weve met in this book to new types coded in C:

>>> import substack
>>> a = substack.Substack(x + y)
>>> a
[Stack:
4: M
3: A
2: P
1: S
0: class
]

>>> a[3]
accessing cell 3
A
>>> a.morestuff( )
more stack stuff
>>> b = substack.Substack("C" + "++")
>>> b.pop(), b.pop( )
(+, +)
>>> c = b + substack.Substack([-, -])
>>> for s in c: print s,
...
C - -


.7.5 But Don Do That Either -- SWIG

You can code C types manually like this, but you don necessarily have to. Because SWIG knows how to generate glue code for C++ classes, you can instead automatically generate all the C extension and wrapper class code required to integrate such a stack object, simply by running SWIG over an appropriate class declaration. The next section shows how.


Introducing Python

Part I: System Interfaces

System Tools

Parallel System Tools

Larger System Examples I

Larger System Examples II

Part II: GUI Programming

Graphical User Interfaces

A Tkinter Tour, Part 1

A Tkinter Tour, Part 2

Larger GUI Examples

Part III: Internet Scripting

Network Scripting

Client-Side Scripting

Server-Side Scripting

Larger Web Site Examples I

Larger Web Site Examples II

Advanced Internet Topics

Part IV: Assorted Topics

Databases and Persistence

Data Structures

Text and Language

Part V: Integration

Extending Python

Embedding Python

VI: The End

Conclusion Python and the Development Cycle



Programming Python
Python Programming for the Absolute Beginner, 3rd Edition
ISBN: 1435455002
EAN: 2147483647
Year: 2000
Pages: 245

Flylib.com © 2008-2020.
If you may any questions please contact us: flylib@qtcs.net