Recipe 17.6. Translating a Python Sequence into a C Array with the PySequence_Fast Protocol

Credit: Luther Blissett

Problem

You have an existing C function that takes as an argument a C array of C-level values (e.g., doubles), and you want to wrap it into a Python-callable C extension that takes as an argument a Python sequence or iterator.

Solution

The easiest way to accept an arbitrary Python sequence (or any other iterable object) in the Python C API is with the PySequence_Fast function. It builds and returns a tuple when needed but returns only its argument (with the reference count incremented) when the argument is already a list or tuple:

#include <Python.h>

/* a preexisting C-level function you want to expose, e.g: */
static double total(double* data, int len)
{
    double total = 0.0;
    int i;
    for(i=0; i<len; ++i)
        total += data[i];
    return total;
}

/* here is how you expose it to Python code: */
static PyObject *totalDoubles(PyObject *self, PyObject *args)
{
    PyObject* seq;
    double *dbar;
    double result;
    int seqlen;
    int i;

    /* get one argument as a sequence */
    if(!PyArg_ParseTuple(args, "O", &seq))
        return 0;
    seq = PySequence_Fast(seq, "argument must be iterable");
    if(!seq)
        return 0;

    /* prepare data as an array of doubles */
    seqlen = PySequence_Fast_GET_SIZE(seq);
    dbar = malloc(seqlen*sizeof(double));
    if(!dbar) {
        Py_DECREF(seq);
        return PyErr_NoMemory();
    }
    for(i=0; i < seqlen; i++) {
        PyObject *fitem;
        PyObject *item = PySequence_Fast_GET_ITEM(seq, i);
        if(!item) {
            Py_DECREF(seq);
            free(dbar);
            return 0;
        }
        fitem = PyNumber_Float(item);
        if(!fitem) {
            Py_DECREF(seq);
            free(dbar);
            PyErr_SetString(PyExc_TypeError, "all items must be numbers");
            return 0;
        }
        dbar[i] = PyFloat_AS_DOUBLE(fitem);
        Py_DECREF(fitem);
    }

    /* clean up, compute, and return result */
    Py_DECREF(seq);
    result = total(dbar, seqlen);
    free(dbar);
    return Py_BuildValue("d", result);
}

static PyMethodDef totalMethods[] = {
    {"total", totalDoubles, METH_VARARGS, "Sum a sequence of numbers."},
    {0} /* sentinel */
};

void inittotal(void)
{
    (void) Py_InitModule("total", totalMethods);
}

Discussion

The two best ways for your C-coded, Python-callable extension functions to accept generic Python sequences as arguments are PySequence_Fast and PyObject_GetIter. The latter, which I cover in the next recipe, can often save some memory, but it is appropriate only when it's OK for the rest of your C code to get the items one at a time, without knowing beforehand how many items there will be in total. You often have preexisting C functions from an existing library that you want to expose to Python code, and such functions may require C arrays as their input arguments. Thus, this recipe shows how to build a C array (in this case, an array of double) from a generic Python sequence (or other iterable) argument, so that you can pass the array (and the integer that gives the array's length) to your existing C function (represented here, purely as an example, by the total function at the start of the recipe). (In the real world, you would use Python's built-in function sum for this specific functionality rather than exposing any existing C function, but this is meant to be just an example!)
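When testing an extension like this one, it can help to have a Python-level model of the same steps to compare results against. The following is only a sketch: total_doubles is a hypothetical name, not part of the recipe, and it assumes that array('d') is an acceptable stand-in for the malloc'ed C array of doubles.

```python
from array import array


def total_doubles(seq):
    """Hypothetical pure-Python model of the recipe's totalDoubles."""
    # Step 1 (PySequence_Fast): reuse an exact list or tuple as is,
    # otherwise materialize the iterable into a new container.
    if type(seq) in (list, tuple):
        fast = seq
    else:
        try:
            iterator = iter(seq)
        except TypeError:
            raise TypeError("argument must be iterable") from None
        fast = tuple(iterator)

    # Step 2 (malloc plus the conversion loop): fill a typed array of
    # C doubles, converting each item much as PyNumber_Float would.
    dbar = array("d")
    for item in fast:
        try:
            dbar.append(float(item))
        except Exception:
            # The recipe replaces any conversion error with a TypeError.
            raise TypeError("all items must be numbers") from None

    # Step 3 (the wrapped C function): plain left-to-right accumulation.
    result = 0.0
    for value in dbar:
        result += value
    return result


print(total_doubles([1, 2, 3.5]))
print(total_doubles(x * 2 for x in range(4)))
```

The explicit accumulation loop is deliberate: it mirrors the C function's left-to-right addition, which the built-in sum is not guaranteed to reproduce bit for bit.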

PySequence_Fast takes two arguments: a Python iterable object to be presented as a sequence, and a string to use as the error message in case the Python object cannot be presented as a sequence, in which case PySequence_Fast returns 0 (the C null pointer, NULL, an error indicator). If the Python object is already a list or tuple, PySequence_Fast returns the same object with the reference count increased by one. If the Python object is any other kind of sequence (or any iterator, or other iterable), PySequence_Fast builds and returns a new tuple with all items already in place. In any case, PySequence_Fast returns an object on which you can call PySequence_Fast_GET_SIZE to obtain the sequence length (as we do in the recipe, in order to malloc the appropriate amount of storage for the C array) and PySequence_Fast_GET_ITEM to get an item given a valid index (an int between 0, included, and the sequence length, excluded).
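You can watch this behavior from Python itself, without compiling anything, by calling PySequence_Fast through ctypes.pythonapi. This is a minimal sketch that assumes CPython; sequence_fast is a hypothetical helper name, and the sketch deliberately does not assume which container type comes back for a non-list, non-tuple iterable.

```python
import ctypes

# ctypes.pythonapi exposes the C API of the running CPython interpreter.
_fast = ctypes.pythonapi.PySequence_Fast
_fast.argtypes = [ctypes.py_object, ctypes.c_char_p]
# With py_object as the result type, ctypes manages the new reference.
_fast.restype = ctypes.py_object


def sequence_fast(obj, message=b"argument must be iterable"):
    """Hypothetical helper: call PySequence_Fast(obj, message)."""
    return _fast(obj, message)


data = [1.0, 2.0, 3.0]
pair = (4.0, 5.0)

# A list or tuple comes back as the very same object.
print(sequence_fast(data) is data)
print(sequence_fast(pair) is pair)

# Any other iterable is materialized into a new container.
built = sequence_fast(x * 0.5 for x in range(4))
print(type(built).__name__, list(built))

# A non-iterable argument raises TypeError with the message we supplied.
try:
    sequence_fast(42)
except TypeError as exc:
    message_seen = str(exc)
    print("TypeError:", message_seen)
```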

The recipe requires quite a bit of care (as is typical of all C-coded Python extensions, and more generally of any C code) to deal properly with memory issues and error conditions. For C-coded Python extensions, in particular, it's imperative that you know which Python C API functions return new references (which you must Py_DECREF when you are done with them) and which ones return borrowed references (which you must not Py_DECREF when you're done with them; on the contrary, you must Py_INCREF such a reference if you want to keep a copy for a longer time). In this specific case, you have to know the following (by reading the Python documentation):

  • PyArg_ParseTuple produces borrowed references.

  • PySequence_Fast returns a new reference.

  • PySequence_Fast_GET_ITEM returns a borrowed reference.

  • PyNumber_Float returns a new reference.

There is method to this madness, even though, as you start your career as a coder of C API Python extensions, you'll no doubt have to double-check each case carefully. Python's C API strives to return borrowed references (for the sake of the modest performance increase that they afford, by avoiding needless incrementing and decrementing of reference counts), when it knows it can always do so safely (i.e., it knows that the reference it is returning necessarily refers to an already existing object). However, Python's C API has to return a new reference when it's possible (or certain) that a new object may have to be created.

For example, in the preceding list, PyNumber_Float and PySequence_Fast may be able to return the same object they were given as an argument, but it's also quite possible that they may have to create a new object for this purpose, to ensure that the returned object has the correct type. Therefore, these two functions are specified as always returning new references. PyArg_ParseTuple and PySequence_Fast_GET_ITEM, on the other hand, always return references to objects that already exist elsewhere (as items in the arguments' tuple, or as items in the fast-sequence container, respectively). Therefore, these two functions can afford to return borrowed references and are thus specified as doing so.
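The difference between new and borrowed references can be observed from Python by asking ctypes for raw pointers, so that nothing manages the reference counts on our behalf. This is a sketch under a few assumptions: it is specific to CPython, the function names are hypothetical, and because PySequence_Fast_GET_ITEM is a macro that ctypes cannot call, PyList_GetItem stands in for it as a function that also returns a borrowed reference.

```python
import ctypes
import sys

api = ctypes.pythonapi

# Raw-pointer result types: ctypes will not touch the reference counts.
api.PySequence_Fast.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PySequence_Fast.restype = ctypes.c_void_p
api.PyList_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PyList_GetItem.restype = ctypes.c_void_p
api.Py_DecRef.argtypes = [ctypes.py_object]
api.Py_DecRef.restype = None


def new_reference_demo():
    """Return (same object?, count change after call, change after release)."""
    data = [1.0, 2.0]
    before = sys.getrefcount(data)
    raw = api.PySequence_Fast(data, b"argument must be iterable")
    after = sys.getrefcount(data)
    # We own the new reference, so releasing it is our job.
    api.Py_DecRef(data)
    released = sys.getrefcount(data)
    return raw == id(data), after - before, released - before


def borrowed_reference_demo():
    """Return (same object?, count change after call)."""
    item = object()
    data = [item]
    before = sys.getrefcount(item)
    raw = api.PyList_GetItem(data, 0)
    after = sys.getrefcount(item)
    # Nothing to release: the list still owns the only extra reference.
    return raw == id(item), after - before


print(new_reference_demo())
print(borrowed_reference_demo())
```

On CPython, the first call should report a count that rises by one and returns to its starting value after the explicit release, while the second should report no change at all.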

One last note: in this recipe, as soon as we obtain an item from the fast-sequence container, we immediately try to transform it into a Python float object, and thus we have to deal with the possibility that the transformation will fail (e.g., if we're passed a sequence containing a non-numeric string, a complex number, etc.). It is most often futile to first attempt a check (with PyNumber_Check) because the check might succeed, and the later transformation attempt might fail anyway (e.g., with a complex-number item). Therefore, it's better to attempt the transformation and deal with the resulting error, if any. This approach is yet another case of the common situation in which it's easier to get forgiveness than permission!
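The mismatch between the check and the conversion is easy to see by calling both C API functions through ctypes.pythonapi. This is a small sketch assuming CPython 3; check_then_convert is a hypothetical helper name.

```python
import ctypes

api = ctypes.pythonapi
api.PyNumber_Check.argtypes = [ctypes.py_object]
api.PyNumber_Check.restype = ctypes.c_int
api.PyNumber_Float.argtypes = [ctypes.py_object]
api.PyNumber_Float.restype = ctypes.py_object


def check_then_convert(value):
    """Return (did PyNumber_Check pass?, float result or exception name)."""
    passed_check = bool(api.PyNumber_Check(value))
    try:
        converted = api.PyNumber_Float(value)
    except (TypeError, ValueError) as exc:
        converted = type(exc).__name__
    return passed_check, converted


# A complex number passes the check, yet the conversion fails.
# The string "3.5" fails the check, yet the conversion succeeds.
for sample in (3, 2.5, 1j, "abc", "3.5"):
    print(repr(sample), check_then_convert(sample))
```

In both directions the check tells you little about what the conversion will do, which is why the recipe simply attempts the conversion.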

As usual, the best way to build this extension (assuming, e.g., that you've saved the extension's source code as a file named total.c) is with the distutils package. Place a file named setup.py in the same directory as the C source:

from distutils.core import setup, Extension

setup(name="total",
      maintainer="Luther Blissett",
      maintainer_email="situ@tioni.st",
      ext_modules=[Extension('total', sources=['total.c'])]
)

then build and install by running:

$ python setup.py install

An appealing aspect of this approach is that it works on any platform, assuming that you have access to the same C compiler used to build your version of Python, and permission to write to the site-packages directory where the resulting dynamically loaded library gets installed.

See Also

The Extending and Embedding manual is available as part of the standard Python documentation set at http://www.python.org/doc/current/ext/ext.html; documentation on the Python C API is at http://www.python.org/doc/current/api/api.html; the section "Distributing Python Modules" in the standard Python documentation set is still incomplete, but it's a good source of information on the distutils package; Python in a Nutshell covers the essentials of extending and embedding, of the Python C API, and of the distutils package.



Python Cookbook
ISBN: 0596007973
Year: 2004
Pages: 420