To build and install a C-coded Python extension module, it's simplest and most productive to use the distribution utilities, distutils, covered in Chapter 26. In the same directory as x.c, place a file named setup.py that contains at least the following statements:
from distutils.core import setup, Extension
setup(name='x', ext_modules=[Extension('x', sources=['x.c'])])
Then, at a shell prompt, run python setup.py install to build the module and install it so that it becomes usable in your Python installation. The distutils perform all needed compilation and linking steps, with the right compiler and linker commands and flags, and copy the resulting dynamic library into an appropriate directory, dependent on your Python installation. Your Python code can then access the resulting module with the statement import x.
More details are covered in Section 24.1.4 later in this chapter.
x_methods is an array of PyMethodDef structs. Each PyMethodDef struct in the x_methods array describes a C function that your module x makes available to Python code that imports x. Each such C function has the following overall structure:
How C-coded functions access arguments passed by Python code is covered in Section 24.1.6 later in this chapter. How such functions build Python objects is covered in Section 24.1.7, and how they raise or propagate exceptions back to the Python code that called them is covered in Section 24.1.8. When your module defines new Python types (as well as or instead of Python-callable functions), your C code defines one or more instances of struct PyTypeObject. This subject is covered in Section 24.1.12 later in this chapter.
A simple example that makes use of all these concepts is shown in Section 24.1.11 later in this chapter. A toy-level "Hello World" example could be as simple as:
PyObject* Py_InitModule3(char* name, PyMethodDef* methods, char* doc)
name is the C string name of the module you are initializing (e.g., "name"). methods is an array of PyMethodDef structures, covered next in this chapter. doc is the C string that becomes the docstring of the module. Py_InitModule3 returns a PyObject* that is a borrowed reference to the new module object, as covered in Section 24.1.5 later in this chapter. In practice, this means that you can ignore the return value if you need to perform no more initialization operations on this module. Otherwise, assign the return value to a C variable of type PyObject* and continue initialization.
Py_InitModule3 initializes the module object to contain the functions described in table methods. Further initialization, if any, may add other module attributes, and is generally best performed with calls to the following convenience functions.
int PyModule_AddIntConstant(PyObject* module, char* name, int value)
Adds to module module an attribute named name with integer value value.
int PyModule_AddObject(PyObject* module, char* name, PyObject* value)
Adds to module module an attribute named name with value value and steals a reference to value, as covered in Section 24.1.5.
int PyModule_AddStringConstant(PyObject* module, char* name, char* value)
Adds to module module an attribute named name with string value value.
Some module initialization operations may be conveniently performed by executing Python code with PyRun_String, covered later in Section 24.3.4, with the module's dictionary as both the globals and locals argument. If you find yourself using PyRun_String extensively, rather than just as an occasional convenience, consider the possibility of splitting your extension module in two: a C-coded extension module offering raw, fast functionality, and a Python module wrapping the C-coded extension to provide further convenience and handy utilities.
When you do need to get a module's dictionary, use the PyModule_GetDict function.
PyObject* PyModule_GetDict(PyObject* module)
Returns a borrowed reference to the dictionary of module module. You should not use PyModule_GetDict for the specific tasks supported by the PyModule_Add functions covered earlier in this section; I suggest using PyModule_GetDict only for such purposes as supporting the use of PyRun_String.
If you need to access another module, you can import it by calling the PyImport_Import function.
PyObject* PyImport_Import(PyObject* name)
Imports the module named in Python string object name and returns a new reference to the module object, like Python's __import__(name). PyImport_Import is the highest-level, simplest, and most often used way to import a module.
Beware, in particular, of using function PyImport_ImportModule, which may often look more convenient because it accepts a char* argument. PyImport_ImportModule operates on a lower level, bypassing any import hooks that may be in force, so extensions that use it will be far harder to incorporate in packages such as those built by tools py2exe and Installer, covered in Chapter 26. Therefore, always do your importing by calling PyImport_Import, unless you have very specific needs and know exactly what you're doing.
To add functions to a module (or non-special methods to new types, as covered later in Section 24.1.12), you must describe the functions or methods in an array of PyMethodDef structures, and terminate the array with a sentinel (i.e., a structure whose fields are all 0 or NULL). PyMethodDef is defined as follows:
typedef struct {
    char* ml_name;        /* Python name of function or method */
    PyCFunction ml_meth;  /* pointer to C function impl */
    int ml_flags;         /* flag describing how to pass arguments */
    char* ml_doc;         /* docstring for the function or method */
} PyMethodDef;
You must cast the second field to (PyCFunction) unless the C function's signature is exactly PyObject* function(PyObject* self, PyObject* args), which is the typedef for PyCFunction. This signature is correct when ml_flags is METH_O, meaning a function that accepts a single argument, or METH_VARARGS, meaning a function that accepts positional arguments. For METH_O, args is the only argument. For METH_VARARGS, args is a tuple of all arguments, to be parsed with the C API function PyArg_ParseTuple. However, ml_flags can also be METH_NOARGS, meaning a function that accepts no arguments, or METH_KEYWORDS, meaning a function that accepts both positional and named arguments. For METH_NOARGS, the signature is PyObject* function(PyObject* self), without arguments. For METH_KEYWORDS, the signature is:
PyObject* function(PyObject* self, PyObject* args, PyObject* kwds)
args is the tuple of positional arguments, and kwds the dictionary of named arguments. args and kwds are parsed together with the C API function PyArg_ParseTupleAndKeywords.
When a C-coded function implements a module's function, the self parameter of the C function is always NULL for any value of the ml_flags field. When a C-coded function implements a non-special method of an extension type, the self parameter points to the instance on which the method is being called.
24.1.5 Reference Counting
Python objects live on the heap, and C code sees them via PyObject*. Each PyObject counts how many references to itself are outstanding, and destroys itself when the number of references goes down to 0. To make this possible, your code must use Python-supplied macros: Py_INCREF to add a reference to a Python object, and Py_DECREF to abandon a reference to a Python object. The Py_XINCREF and Py_XDECREF macros are like Py_INCREF and Py_DECREF, but you may also use them innocuously on a null pointer. The test for a non-null pointer is implicitly performed inside the Py_XINCREF and Py_XDECREF macros, which saves you from needing to write out that test explicitly.
A PyObject* p, which your code receives by calling or being called by other functions, is known as a new reference if the code that supplies p has already called Py_INCREF on your behalf. Otherwise, it is called a borrowed reference. Your code is said to own new references it holds, but not borrowed ones. You can call Py_INCREF on a borrowed reference to make it into a reference that you own; you must do this if you need to use the reference across calls to code that might cause the count of the reference you borrowed to be decremented. You must always call Py_DECREF before abandoning or overwriting references that you own, but never on references you don't own. Therefore, understanding which interactions transfer reference ownership and which ones rely on reference borrowing is absolutely crucial. For most functions in the C API, and for all functions that you write and Python calls, the following general rules apply:
- PyObject* arguments are borrowed references
- A PyObject* returned as the function's result transfers ownership
For each of the two rules, there are occasional exceptions. PyList_SetItem and PyTuple_SetItem steal a reference to the item they are setting (but not to the list or tuple object into which they're setting it). So do the faster versions of these two functions that exist as C preprocessor macros, PyList_SET_ITEM and PyTuple_SET_ITEM. So does PyModule_AddObject, covered earlier in this chapter. There are no other exceptions to the first rule. The rationale for these exceptions, which may help you remember them, is that the object you're setting is most often one you created for the purpose, so the reference-stealing semantics save you from having to call Py_DECREF immediately afterward.
The second rule has more exceptions than the first one: there are several cases in which the returned PyObject* is a borrowed reference rather than a new reference. The abstract functions, whose names begin with PyObject_, PySequence_, PyMapping_, and PyNumber_, return new references. This is because you can call them on objects of many types, and there might not be any other reference to the resulting object that they return (i.e., the returned object might be created on the fly). The concrete functions, whose names begin with PyList_, PyTuple_, PyDict_, and so on, return a borrowed reference when the semantics of the object they return ensure that there must be some other reference to the returned object somewhere.
In this chapter, I indicate all cases of exceptions to these rules (i.e., the return of borrowed references and the rare cases of reference stealing from arguments) regarding all functions that I cover. When I don't explicitly mention a function as being an exception, it means that the function follows the rules: its PyObject* arguments, if any, are borrowed references, and its PyObject* result, if any, is a new reference.
24.1.6 Accessing Arguments
A function that has ml_flags in its PyMethodDef set to METH_NOARGS is called from Python with no arguments. The corresponding C function has a signature with only one argument, self. When ml_flags is METH_O, Python code must call the function with one argument. The C function's second argument is a borrowed reference to the object that the Python caller passes as the argument's value.
When ml_flags is METH_VARARGS, Python code can call the function with any number of positional arguments, which are collected as a tuple. The C function's second argument is a borrowed reference to the tuple. Your C code can then call the PyArg_ParseTuple function.
int PyArg_ParseTuple(PyObject* tuple, char* format, ...)
Returns 0 for errors, a value not equal to 0 for success. tuple is the PyObject* that was the C function's second argument. format is a C string that describes mandatory and optional arguments. The following arguments of PyArg_ParseTuple are the addresses of the C variables in which to put the values extracted from the tuple. Any PyObject* variables among the C variables are borrowed references. Table 24-1 lists the commonly used code strings, of which zero or more are joined to form string format.
Table 24-1. Format codes for PyArg_ParseTuple
Code | C type | Meaning
c | char | A Python string of length 1 becomes a C char
d | double | A Python float becomes a C double
D | Py_Complex | A Python complex becomes a C Py_Complex
f | float | A Python float becomes a C float
i | int | A Python int becomes a C int
l | long | A Python int becomes a C long
L | long long | A Python int becomes a C long long (or __int64 on Windows)
O | PyObject* | Gets non-NULL borrowed reference to a Python argument
O! | type + PyObject* | Like code O, plus type checking or TypeError (see below)
O& | convert + void* | Arbitrary conversion (see below)
s | char* | Python string without embedded nulls to C char*
s# | char* + int | Any Python string to C address and length
t# | char* + int | Read-only single-segment buffer to C address and length
u | Py_UNICODE* | Python Unicode without embedded nulls to C (UTF-16)
u# | Py_UNICODE* + int | Any Python Unicode to C (UTF-16) address and length
w# | char* + int | Read-write single-segment buffer to C address and length
z | char* | Like code s, also accepts None (sets C's char* to NULL)
z# | char* + int | Like code s#, also accepts None (sets C's char* to NULL)
(...) | as per ... | A Python sequence is treated as one argument per item
"|" | (none) | The following arguments are optional
: | (none) | Format finished, followed by function name for error messages
; | (none) | Format finished, followed by entire error message text
Code formats d to L accept numeric arguments from Python. Python coerces the corresponding values. For example, a code of i can correspond to a Python float; the fractional part gets truncated, as if built-in function int had been called. Py_Complex is a C struct with two fields named real and imag, both of type double.
O is the most general format code and accepts any argument, which you can later check and/or convert as needed. Variant O! corresponds to two arguments in the variable arguments: first the address of a Python type object, then the address of a PyObject*. O! checks that the corresponding value belongs to the given type (or any subtype of that type) before setting the PyObject* to point to the value. Variant O& also corresponds to two arguments in the variable arguments: first the address of a converter function you coded, then a void* (i.e., any address at all). The converter function must have signature int convert(PyObject*, void*). Python calls your conversion function with the value passed from Python as the first argument and the void* from the variable arguments as the second argument. The conversion function must either return 0 and raise an exception (as covered in Section 24.1.8 later in this chapter) to indicate an error, or return 1 and store whatever is appropriate via the void* it gets.
Code format s accepts a string from Python and the address of a char* (i.e., a char**) among the variable arguments. It changes the char* to point at the string's buffer, which your C code must then treat as a read-only, null-terminated array of chars (i.e., a typical C string; however, your code must not modify it). The Python string must contain no embedded null characters. s# is similar, but corresponds to two arguments among the variable arguments: first the address of a char*, then the address of an int to set to the string's length. The Python string can contain embedded nulls, and therefore so can the buffer to which the char* is set to point. u and u# are similar, but accept any Unicode string, and the C-side pointers must be Py_UNICODE* rather than char*. Py_UNICODE is a macro defined in Python.h, and corresponds to the type of a Python Unicode character in the implementation (this is often, but not always, the same as a wchar_t in C).
t# and w# are similar to s#, but the corresponding Python argument can be any object of a type that respects the buffer protocol, respectively read-only and read-write. Strings are a typical example of read-only buffers. mmap and array instances are typical examples of read-write buffers, and they are also acceptable where a read-only buffer is required (i.e., for a t#).
When one of the arguments is a Python sequence of known length, you can use format codes for each of its items, and corresponding C addresses among the variable arguments, by grouping the format codes in parentheses. For example, code (ii) corresponds to a Python sequence of two numbers, and, among the remaining arguments, corresponds to two addresses of ints.
The format string may include a vertical bar (|) to indicate that all following arguments are optional. You must initialize the C variables, whose addresses you pass among the variable arguments for later arguments, to suitable default values before you call PyArg_ParseTuple. PyArg_ParseTuple does not change the C variables corresponding to optional arguments that were not passed in a given call from Python to your C-coded function.
The format string may optionally end with :name to indicate that name must be used as the function name if any error messages are needed. Alternatively, the format string may end with ;text to indicate that text must be used as the entire error message if PyArg_ParseTuple detects errors (this is rarely used).
A function that has ml_flags in its PyMethodDef set to METH_KEYWORDS accepts positional and keyword arguments. Python code calls the function with any number of positional arguments, which get collected as a tuple, and keyword arguments, which get collected as a dictionary. The C function's second argument is a borrowed reference to the tuple, and the third one is a borrowed reference to the dictionary. Your C code then calls the PyArg_ParseTupleAndKeywords function.
int PyArg_ParseTupleAndKeywords(PyObject* tuple, PyObject* dict, char* format, char** kwlist, ...)
Returns 0 for errors, a value not equal to 0 for success. tuple is the PyObject* that was the C function's second argument. dict is the PyObject* that was the C function's third argument. format is like for PyArg_ParseTuple, except that it cannot include the (...) format code to parse nested sequences. kwlist is an array of char* terminated by a NULL sentinel, with the names of the parameters, one after the other. For example, the following C code:
static PyObject*
func_c(PyObject* self, PyObject* args, PyObject* kwds)
{
    static char* argnames[] = {"x", "y", "z", NULL};
    double x, y=0.0, z=0.0;
    /* the | marks y and z as optional, matching their defaults */
    if(!PyArg_ParseTupleAndKeywords(
        args,kwds,"d|dd",argnames,&x,&y,&z))
        return NULL;
    /* rest of function snipped */
is roughly equivalent to this Python code:
def func_py(x, y=0.0, z=0.0):
    x, y, z = map(float, (x,y,z))
    # rest of function snipped
24.1.7 Creating Python Values
C functions that communicate with Python must often build Python values, both to return as their PyObject* result and for other purposes, such as setting items and attributes. The simplest and handiest way to build a Python value is most often with the Py_BuildValue function.
PyObject* Py_BuildValue(char* format, ...)
format is a C string that describes the Python object to build. The following arguments of Py_BuildValue are C values from which the result is built. The PyObject* result is a new reference. Table 24-2 lists the commonly used code strings, of which zero or more are joined into string format. Py_BuildValue builds and returns a tuple if format contains two or more format codes, or if format begins with ( and ends with ). Otherwise, the result is not a tuple. When you pass buffers, as for example in the case of format code s#, Py_BuildValue copies the data. You can therefore modify, abandon, or free( ) your original copy of the data after Py_BuildValue returns. Py_BuildValue always returns a new reference (except for format code N). Called with an empty format, Py_BuildValue("") returns a new reference to None.
Table 24-2. Format codes for Py_BuildValue
Code | C type | Meaning
c | char | A C char becomes a Python string of length 1
d | double | A C double becomes a Python float
D | Py_Complex | A C Py_Complex becomes a Python complex
i | int | A C int becomes a Python int
l | long | A C long becomes a Python int
N | PyObject* | Passes a Python object and steals a reference
O | PyObject* | Passes a Python object and INCREFs it as per normal rules
O& | convert + void* | Arbitrary conversion (see below)
s | char* | C null-terminated char* to Python string, or NULL to None
s# | char* + int | C char* and length to Python string, or NULL to None
u | Py_UNICODE* | C wide (UCS-2) null-terminated string to Python Unicode, or NULL to None
u# | Py_UNICODE* + int | C wide (UCS-2) string and length to Python Unicode, or NULL to None
(...) | as per ... | Build Python tuple from C values
[...] | as per ... | Build Python list from C values
{...} | as per ... | Build Python dictionary from C values, alternating keys and values (must be an even number of C values)
Code O& corresponds to two arguments among the variable arguments: first the address of a converter function you code, then a void* (i.e., any address at all). The converter function must have signature PyObject* convert(void*). Python calls the conversion function with the void* from the variable arguments as the only argument. The conversion function must either return NULL and raise an exception (as covered in Section 24.1.8 later in this chapter) to indicate an error, or return a new reference PyObject* built from the data in the void*.
Code {...} builds dictionaries from an even number of C values, alternately keys and values. For example, Py_BuildValue("{issi}",23,"zig","zag",42) returns a dictionary like Python's {23:'zig','zag':42}.
Note the important difference between codes N and O. N steals a reference from the corresponding PyObject* value among the variable arguments, so it's convenient when you're building an object including a reference you own that you would otherwise have to Py_DECREF. O does no reference stealing, so it's appropriate when you're building an object including a reference you don't own, or a reference you must also keep elsewhere.
24.1.8 Exceptions
To propagate exceptions raised from other functions you call, return NULL as the PyObject* result from your C function. To raise your own exceptions, set the current-exception indicator and return NULL. Python's built-in exception classes (covered in Chapter 6) are globally available, with names starting with PyExc_, such as PyExc_AttributeError, PyExc_KeyError, and so on. Your extension module can also supply and use its own exception classes. The most commonly used C API functions related to raising exceptions are the following.
PyObject* PyErr_Format(PyObject* type, char* format, ...)
Raises an exception of class type, a built-in such as PyExc_IndexError, or an exception class created with PyErr_NewException. Builds the associated value from format string format, which has syntax similar to printf's, and the following C values indicated as variable arguments above. Returns NULL, so your code can just call:
return PyErr_Format(PyExc_KeyError,
    "Unknown key name (%s)", thekeystring);
PyObject* PyErr_NewException(char* name, PyObject* base, PyObject* dict)
Subclasses exception class base, with extra class attributes and methods from dictionary dict (normally NULL, meaning no extra class attributes or methods), creating a new exception class named name (string name must be of the form "modulename.classname") and returning a new reference to the new class object. When base is NULL, uses PyExc_Exception as the base class. You normally call this function during initialization of a module object module. For example:
PyModule_AddObject(module, "error",
    PyErr_NewException("mymod.error", NULL, NULL));
PyObject* PyErr_NoMemory( )
Raises an out-of-memory error and returns NULL, so your code can just call:
return PyErr_NoMemory( );
void PyErr_SetObject(PyObject* type, PyObject* value)
Raises an exception of class type, a built-in such as PyExc_KeyError, or an exception class created with PyErr_NewException, with value as the associated value (a borrowed reference). PyErr_SetObject is a void function (i.e., returns no value).
PyObject* PyErr_SetFromErrno(PyObject* type)
Raises an exception of class type, a built-in such as PyExc_OSError, or an exception class created with PyErr_NewException. Takes all details from global variable errno, which C library functions and system calls set for many error cases, and the standard C library function strerror. Returns NULL, so your code can just call:
return PyErr_SetFromErrno(PyExc_IOError);
PyObject* PyErr_SetFromErrnoWithFilename(PyObject* type, char* filename)
Like PyErr_SetFromErrno, but also provides string filename as part of the exception's value. When filename is NULL, works like PyErr_SetFromErrno.
Your C code may want to deal with an exception and continue, as a try/except statement would let you do in Python code. The most commonly used C API functions related to catching exceptions are the following.
void PyErr_Clear( )
Clears the error indicator. Innocuous if no error is pending.
int PyErr_ExceptionMatches(PyObject* type)
Call only when an error is pending, or the whole program might crash. Returns a value not equal to 0 when the pending exception is an instance of the given type or any subclass of type, or 0 when the pending exception is not such an instance.
PyObject* PyErr_Occurred( )
Returns NULL if no error is pending, otherwise a borrowed reference to the type of the pending exception. (Don't use the returned value; call PyErr_ExceptionMatches instead, in order to catch exceptions of subclasses as well, as is normal and expected.)
void PyErr_Print( )
Call only when an error is pending, or the whole program might crash. Outputs a standard traceback to sys.stderr, then clears the error indicator.
If you need to process errors in highly sophisticated ways, study other error-related functions of the C API, such as PyErr_Fetch, PyErr_NormalizeException, PyErr_GivenExceptionMatches, and PyErr_Restore. However, I do not cover such advanced and rarely needed possibilities in this book.
24.1.9 Abstract Layer Functions
The code for a C extension typically needs to use some Python functionality. For example, your code may need to examine or set attributes and items of Python objects, call Python-coded and built-in functions and methods, and so on. In most cases, the best approach is for your code to call functions from the abstract layer of Python's C API. These are functions that you can call on any Python object (functions whose names start with PyObject_), or any object within a wide category, such as mappings, numbers, or sequences (with names respectively starting with PyMapping_, PyNumber_, and PySequence_).
Some of the functions callable on objects within these categories duplicate functionality that is also available from PyObject_ functions; in these cases, you should use the PyObject_ function instead. I don't cover such redundant functions in this book.
Functions in the abstract layer raise Python exceptions if you call them on objects to which they are not applicable. All of these functions accept borrowed references for PyObject* arguments, and return a new reference (NULL for an exception) if they return a PyObject* result.
The most frequently used abstract layer functions are the following.
int PyCallable_Check(PyObject* x)
True if x is callable, like Python's callable(x).
PyObject* PyEval_CallObject(PyObject* x, PyObject* args)
Calls callable Python object x with the positional arguments held in tuple args. Returns the call's result, like Python's return x(*args).
PyObject* PyEval_CallObjectWithKeywords(PyObject* x, PyObject* args, PyObject* kwds)
Calls callable Python object x with the positional arguments held in tuple args and the named arguments held in dictionary kwds. Returns the call's result, like Python's return x(*args,**kwds).
int PyIter_Check(PyObject* x)
True if x supports the iterator protocol (i.e., if x is an iterator).
PyObject* PyIter_Next(PyObject* x)
Returns the next item from iterator x. Returns NULL without raising any exception if x's iteration is finished (i.e., when Python's x.next( ) raises StopIteration).
int PyNumber_Check(PyObject* x)
True if x supports the number protocol (i.e., if x is a number).
PyObject* PyObject_CallFunction(PyObject* x, char* format, ...)
Calls the callable Python object x with positional arguments described by format string format, using the same format codes as Py_BuildValue, covered earlier. When format is NULL, calls x with no arguments. Returns the call's result.
PyObject* PyObject_CallMethod(PyObject* x, char* method, char* format, ...)
Calls the method named method of Python object x with positional arguments described by format string format, using the same format codes as Py_BuildValue. When format is NULL, calls the method with no arguments. Returns the call's result.
int PyObject_Cmp(PyObject* x1, PyObject* x2, int* result)
Compares objects x1 and x2 and places the result (-1, 0, or 1) in *result, like Python's result=cmp(x1,x2).
int PyObject_DelAttrString(PyObject* x, char* name)
Deletes x's attribute named name, like Python's del x.name.
int PyObject_DelItem(PyObject* x, PyObject* key)
Deletes x's item with key (or index) key, like Python's del x[key].
int PyObject_DelItemString(PyObject* x, char* key)
Deletes x's item with key key, like Python's del x[key].
PyObject* PyObject_GetAttrString(PyObject* x, char* name)
Returns x's attribute named name, like Python's x.name.
PyObject* PyObject_GetItem(PyObject* x, PyObject* key)
Returns x's item with key (or index) key, like Python's x[key].
PyObject* PyObject_GetItemString(PyObject* x, char* key)
Returns x's item with key key, like Python's x[key].
PyObject* PyObject_GetIter(PyObject* x)
Returns an iterator on x, like Python's iter(x).
int PyObject_HasAttrString(PyObject* x, char* name)
True if x has an attribute named name, like Python's hasattr(x,name).
int PyObject_IsTrue(PyObject* x)
True if x is true for Python, like Python's bool(x).
int PyObject_Length(PyObject* x)
Returns x's length, like Python's len(x).
PyObject* PyObject_Repr(PyObject* x)
Returns x's detailed string representation, like Python's repr(x).
PyObject* PyObject_RichCompare(PyObject* x, PyObject* y, int op)
Performs the comparison indicated by op between x and y, and returns the result as a Python object. op can be Py_EQ, Py_NE, Py_LT, Py_LE, Py_GT, or Py_GE, corresponding to Python comparisons x==y, x!=y, x<y, x<=y, x>y, or x>=y, respectively.
int PyObject_RichCompareBool(PyObject* x, PyObject* y, int op)
Like PyObject_RichCompare, but returns 0 for false, 1 for true.
int PyObject_SetAttrString(PyObject* x, char* name, PyObject* v)
Sets x's attribute named name to v, like Python's x.name=v.
int PyObject_SetItem(PyObject* x, PyObject* key, PyObject* v)
Sets x's item with key (or index) key to v, like Python's x[key]=v.
int PyObject_SetItemString(PyObject* x, char* key, PyObject* v)
Sets x's item with key key to v, like Python's x[key]=v.
PyObject* PyObject_Str(PyObject* x)
Returns x's readable string form, like Python's str(x).
PyObject* PyObject_Type(PyObject* x)
Returns x's type object, like Python's type(x).
PyObject* PyObject_Unicode(PyObject* x)
Returns x's Unicode string form, like Python's unicode(x).
int PySequence_Contains(PyObject* x, PyObject* v)
True if v is an item in x, like Python's v in x.
int PySequence_DelSlice(PyObject* x, int start, int stop)
Deletes x's slice from start to stop, like Python's del x[start:stop].
PyObject* PySequence_Fast(PyObject*
x
)
|
|
Returns a new reference to a tuple with the same items as
x
, unless
x
is a list, in which case returns a new reference to
x
. When you need to get many items of an arbitrary sequence
x
, it's
fastest
to call
t
=PySequence_Fast(
x
)
once, then call
PySequence_Fast_GET_ITEM(
t
,
i
)
as many times as needed, and finally call
Py_DECREF(
t
)
.
PyObject* PySequence_Fast_GET_ITEM(PyObject*
x
,int
i
)
|
|
Returns the item at index
i
of
x
, where
x
must be the result of
PySequence_Fast
,
x
!=NULL
, and
0<=i<PySequence_Fast_GET_SIZE(
x
)
. Violating these conditions can cause program crashes: this approach is optimized for speed, not for safety.
int PySequence_Fast_GET_SIZE(PyObject*
x
)
|
|
Returns the length of
x
.
x
must be the result of
PySequence_Fast
,
x
!=NULL
.
PyObject* PySequence_GetSlice(PyObject*
x
,int
start
,int
stop
)
|
|
Returns
x
's slice from
start
to
stop
, like Python's
x
[
start
:
stop
]
.
PyObject* PySequence_List(PyObject*
x
)
|
|
Returns a new list object with the same items as
x
, like Python's
list(
x
)
.
int PySequence_SetSlice(PyObject*
x
,int
start
,int
stop
,PyObject*
v
)
|
|
Sets
x
's slice from
start
to
stop
to
v
, like Python's
x
[
start
:
stop
]=
v
. Just as in the equivalent Python statement,
v
must be a sequence of the same type as
x
.
PyObject* PySequence_Tuple(PyObject*
x
)
|
|
Returns a new reference to a tuple with the same items as
x
, like Python's
tuple(
x
)
.
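The PySequence_Fast idiom described earlier in this list (materialize the sequence once, then index it cheaply) can be pictured with a pure-Python model. This is a sketch under stated assumptions, not the C implementation: sequence_fast and total are hypothetical names, and the model follows the description given above (a list is returned as is, anything else is copied into a tuple).

```python
def sequence_fast(x):
    """Hypothetical model of PySequence_Fast, per the description above."""
    if isinstance(x, list):
        return x            # a list is handed back unchanged
    return tuple(x)         # any other sequence is copied into a tuple

def total(x):
    """Sum the items of x, fetching each one by index."""
    t = sequence_fast(x)            # done once, like t=PySequence_Fast(x)
    result = 0
    for i in range(len(t)):         # len(t) stands in for PySequence_Fast_GET_SIZE
        result += t[i]              # t[i] stands in for PySequence_Fast_GET_ITEM
    return result
```

In C, the final step of the idiom is Py_DECREF(t); the Python model has no counterpart for it because the interpreter manages the reference automatically.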
The functions whose names start with
PyNumber_
allow you to perform numeric operations. Unary
PyNumber
functions, which take one argument
PyObject*
x
and return a
PyObject*
, are listed in Table 24-3 with their Python equivalents.
Table 24-3. Unary PyNumber functions
Function             Python equivalent
PyNumber_Absolute    abs(x)
PyNumber_Float       float(x)
PyNumber_Int         int(x)
PyNumber_Invert      ~x
PyNumber_Long        long(x)
PyNumber_Negative    -x
PyNumber_Positive    +x
Binary
PyNumber
functions, which take two
PyObject*
arguments
x
and
y
and return a
PyObject*
, are similarly listed in Table 24-4.
Table 24-4. Binary PyNumber functions
Function               Python equivalent
PyNumber_Add           x + y
PyNumber_And           x & y
PyNumber_Divide        x / y
PyNumber_Divmod        divmod(x, y)
PyNumber_FloorDivide   x // y
PyNumber_Lshift        x << y
PyNumber_Multiply      x * y
PyNumber_Or            x | y
PyNumber_Remainder     x % y
PyNumber_Rshift        x >> y
PyNumber_Subtract      x - y
PyNumber_TrueDivide    x / y (non-truncating)
PyNumber_Xor           x ^ y
All the binary
PyNumber
functions have in-place equivalents whose names start with
PyNumber_InPlace
, such as
PyNumber_InPlaceAdd
and so on. The in-place versions try to modify the first argument in-place, if possible, and in any case return a new reference to the result, be it the first argument (modified) or a new object. Python's built-in numbers are immutable; therefore, when the first argument is a number of a built-in type, the in-place versions work just the same as the ordinary versions. Function
PyNumber_Divmod
returns a tuple with two items (the
quotient
and the remainder) and has no in-place equivalent.
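The difference between mutable and immutable first arguments is easy to observe from Python. The sketch below uses operator.iadd, which the standard library documents as equivalent to the += statement; it is meant only to illustrate the semantics described above, not to show how the C functions are called.

```python
import operator

# Mutable first argument: the list is modified in place, and the
# result is the very same object.
items = [1, 2]
result = operator.iadd(items, [3])      # same effect as: items += [3]

# Immutable first argument: the original number is untouched, and the
# result is a different object holding the sum.
n = 5
m = operator.iadd(n, 1)                 # same effect as: m = n; m += 1
```

After this code runs, result and items are the same list, while n still holds its original value and m holds the sum.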
There is one ternary
PyNumber
function,
PyNumber_Power
.
PyObject* PyNumber_Power(PyObject*
x
,PyObject*
y
,PyObject*
z
)
|
|
When
z
is
Py_None
, returns
x
raised to the
y
power, like Python's
x
**
y
or equivalently
pow(
x,y
)
. Otherwise, returns
x
**
y
%
z
, like Python's
pow(
x,y,z
)
. The in-place version is named
PyNumber_InPlacePower
.
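The two behaviors of PyNumber_Power can be summarized by a small pure-Python model. The name number_power is hypothetical, and the default value for z exists only in this sketch: in C the third argument is not optional, and you pass Py_None explicitly when you want the two-argument behavior.

```python
def number_power(x, y, z=None):
    """Hypothetical model of PyNumber_Power(x, y, z)."""
    if z is None:
        return x ** y        # like pow(x, y)
    return pow(x, y, z)      # like pow(x, y, z), i.e. x ** y % z
```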
24.1.10 Concrete Layer Functions
Each specific type of Python built-in object supplies concrete functions to
operate
on instances of that type, with names starting with
Py
type
_
(e.g.,
PyInt_
for functions related to Python
int
s). Most such functions duplicate the functionality of abstract-layer functions or auxiliary functions covered earlier in this chapter, such as
Py_BuildValue
, which can generate objects of many types. In this section, I cover some frequently used functions from the concrete layer that provide unique functionality or substantial convenience or speed. For most types, you can check if an object belongs to the type by calling
Py
type
_Check
, which also accepts instances of
subtypes
, or
Py
type
_CheckExact
, which accepts only instances of
type
, not of subtypes. Signatures are the same as for function
PyIter_Check
, covered earlier in this chapter.
PyObject* PyDict_GetItem(PyObject*
x
,PyObject*
key
)
|
|
Returns a borrowed reference to the item with key
key
of dictionary
x
.
PyObject* PyDict_GetItemString(PyObject*
x
,char*
key
)
|
|
Returns a borrowed reference to the item with key
key
of dictionary
x
.
int PyDict_Next(PyObject*
x
,int*
pos
,PyObject**
k
,PyObject**
v
)
|
|
Iterates over items in dictionary
x
. You must initialize
*
pos
to
0
at the start of the iteration:
PyDict_Next
uses and updates
*
pos
to keep track of its place. For each successful iteration step, returns
1
; when there are no more items, returns
0
. Updates
*
k
and
*
v
to point to the next key and value respectively (borrowed references) at each step that returns
1
. You can pass either
k
or
v
as
NULL
if you are not interested in the key or value. During an iteration, you must not change in any way the set of
x
's keys, but you can change
x
's values as long as the set of keys remains identical.
int PyDict_Merge(PyObject*
x
,PyObject*
y
,int
override
)
|
|
Updates dictionary
x
by merging the items of dictionary
y
into
x
.
override
determines what happens when a key
k
is present in both
x
and
y
: if
override
is
0
, then
x
[
k
]
remains the same; otherwise
x
[
k
]
is
replaced
by the value
y
[
k
]
.
int PyDict_MergeFromSeq2(PyObject*
x
,PyObject*
y
,int
override
)
|
|
Like
PyDict_Merge
, except that
y
is not a dictionary but a sequence of sequences, where each
subsequence
has length 2 and is used as a
(
key
,
value
)
pair.
double PyFloat_AS_DOUBLE(PyObject*
x
)
|
|
Returns the C
double
value of Python
float
x
, very fast, without error checking.
PyObject* PyList_New(int
length
)
|
|
Returns a new,
uninitialized
list of the given
length
. You must then initialize the list, typically by calling
PyList_SET_ITEM
length
times.
PyObject* PyList_GET_ITEM(PyObject*
x
,int
pos
)
|
|
Returns the item at index
pos
of list
x
, without error checking.
int PyList_SET_ITEM(PyObject*
x
,int
pos
,PyObject*
v
)
|
|
Sets the item at index
pos
of list
x
to
v
, without error checking. Steals a reference to
v
. Use only immediately after creating a new list
x
with
PyList_New
.
char* PyString_AS_STRING(PyObject*
x
)
|
|
Returns a pointer to the internal buffer of string
x
, very fast, without error checking. You must not modify the buffer in any way, unless you just allocated it by calling
PyString_FromStringAndSize(NULL
,
size
)
.
int PyString_AsStringAndSize(PyObject*
x
,char**
buffer
,int*
length
)
|
|
Puts a pointer to the internal buffer of string
x
in
*
buffer
, and
x
's length in
*
length
. You must not modify the buffer in any way, unless you just allocated it by calling
PyString_FromStringAndSize(NULL
,
size
)
.
PyObject* PyString_FromFormat(char*
format
,...)
|
|
Returns a Python string built from format string
format
, which has syntax similar to
printf
's, followed by the C values to format, passed as the variable arguments.
PyObject* PyString_FromStringAndSize(char*
data
,int
size
)
|
|
Returns a Python string of length
size
, copying
size
bytes from
data
. When
data
is
NULL
, the Python string is uninitialized, and you must initialize it. You can get the pointer to the string's internal buffer by calling
PyString_AS_STRING
.
PyObject* PyTuple_New(int
length
)
|
|
Returns a new, uninitialized tuple of the given
length
. You must then initialize the tuple, typically by calling
PyTuple_SET_ITEM
length
times.
PyObject* PyTuple_GET_ITEM(PyObject*
x
,int
pos
)
|
|
Returns the item at index
pos
of tuple
x
, without error checking.
int PyTuple_SET_ITEM(PyObject*
x
,int
pos
,PyObject*
v
)
|
|
Sets the item at index
pos
of tuple
x
to
v
, without error checking. Steals a reference to
v
. Use only immediately after creating a new tuple
x
with
PyTuple_New
.
24.1.11 A Simple Extension Example
Example 24-1 exposes the functionality of Python C API functions
PyDict_Merge
and
PyDict_MergeFromSeq2
for Python use. The
update
method of dictionaries works like
PyDict_Merge
with
override
=1
, but Example 24-1 is more general.
Example 24-1. A simple Python extension module merge.c
#include <Python.h>

/* merge(x, y, override=0): merge the items of y into dict x, in place */
static PyObject*
merge(PyObject* self, PyObject* args, PyObject* kwds)
{
    static char* argnames[] = {"x","y","override",NULL};
    PyObject *x, *y;
    int override = 0;    /* default used when override is not passed */

    if(!PyArg_ParseTupleAndKeywords(args, kwds, "O!O|i", argnames,
                                    &PyDict_Type, &x, &y, &override))
        return NULL;
    if(-1 == PyDict_Merge(x, y, override)) {
        /* only a TypeError justifies retrying with y as a sequence */
        if(!PyErr_ExceptionMatches(PyExc_TypeError))
            return NULL;
        PyErr_Clear();
        if(-1 == PyDict_MergeFromSeq2(x, y, override))
            return NULL;
    }
    return Py_BuildValue("");    /* i.e., return None */
}

static char merge_docs[] = "\
merge(x,y,override=False): merge into dict x the items of dict y (or the pairs\n\
that are the items of y, if y is a sequence), with optional override.\n\
Alters dict x directly, returns None.\n\
";

/* mergenew(x, y, override=0): like merge, but works on a copy of x */
static PyObject*
mergenew(PyObject* self, PyObject* args, PyObject* kwds)
{
    static char* argnames[] = {"x","y","override",NULL};
    PyObject *x, *y, *result;
    int override = 0;

    if(!PyArg_ParseTupleAndKeywords(args, kwds, "O!O|i", argnames,
                                    &PyDict_Type, &x, &y, &override))
        return NULL;
    result = PyObject_CallMethod(x, "copy", "");
    if(!result)
        return NULL;
    if(-1 == PyDict_Merge(result, y, override)) {
        if(!PyErr_ExceptionMatches(PyExc_TypeError)) {
            Py_DECREF(result);    /* don't leak the copy on failure */
            return NULL;
        }
        PyErr_Clear();
        if(-1 == PyDict_MergeFromSeq2(result, y, override)) {
            Py_DECREF(result);
            return NULL;
        }
    }
    return result;
}

static char mergenew_docs[] = "\
mergenew(x,y,override=False): merge into dict x the items of dict y (or\n\
the pairs that are the items of y, if y is a sequence), with optional\n\
override. Does NOT alter x, but rather returns the modified copy as\n\
the function's result.\n\
";

static PyMethodDef funcs[] = {
    {"merge", (PyCFunction)merge, METH_KEYWORDS, merge_docs},
    {"mergenew", (PyCFunction)mergenew, METH_KEYWORDS, mergenew_docs},
    {NULL}
};

void
initmerge(void)
{
    Py_InitModule3("merge", funcs, "Example extension module");
}
This example declares as
static
every function and global variable in the C source file, except
initmerge
, which must be visible from the outside to let Python call it. Since the functions and variables are exposed to Python via the
PyMethodDef
structures, Python does not need to see their names directly. Therefore, declaring them
static
is best: this ensures that names don't
accidentally
end up in the whole program's global namespace, as might otherwise happen on some platforms, possibly
causing
conflicts and errors.
The format string "
O!O|i
" passed to
PyArg_ParseTupleAndKeywords
indicates that function
merge
accepts three arguments from Python: an object with a type constraint, a generic object, and an optional integer. At the same time, the format string indicates that the variable part of
PyArg_ParseTupleAndKeywords
's arguments must contain four addresses: in order, the address of a Python type object, then two addresses of
PyObject*
variables, and finally the address of an
int
variable. The
int
variable must have been previously
initialized
to its intended default value, since the corresponding Python argument is optional.
And indeed, after the
argnames
argument, the code passes
&PyDict_Type
(i.e., the address of the dictionary type object). Then it passes the addresses of the two
PyObject*
variables. Finally, it passes the address of variable
override
, an
int
that was previously initialized to
0
, since the default, when the
override
argument isn't explicitly passed from Python, should be no overriding. If the return value of
PyArg_ParseTupleAndKeywords
is
0
, the code immediately returns
NULL
to propagate the exception; this automatically diagnoses most cases where Python code passes wrong arguments to our new function
merge
.
When the arguments appear to be okay, the code
tries
PyDict_Merge
, which succeeds if
y
is a dictionary. When
PyDict_Merge
raises a
TypeError
, indicating that
y
is not a dictionary, the code clears the error and tries again, this time with
PyDict_MergeFromSeq2
, which succeeds when
y
is a sequence of pairs. If that also fails, it returns
NULL
to propagate the exception. Otherwise, it returns
None
in the simplest way (i.e., with
return
Py_BuildValue("")
) to indicate success.
Function
mergenew
basically duplicates
merge
's functionality; however,
mergenew
does not alter its arguments, but rather builds and returns a new dictionary as the function's result. The C API function
PyObject_CallMethod
lets
mergenew
call the
copy
method of its first Python-passed argument, a dictionary object, and obtain a new dictionary object that it then alters (with exactly the same logic as function
merge
). It then returns the
altered
dictionary as the function result (thus, no need to call
Py_BuildValue
in this case).
The code of Example 24-1 must reside in a source file named
merge.c
. In the same directory, create the following script named
setup.py
:
from distutils.core import setup, Extension
setup(name='merge', ext_modules=[ Extension('merge',sources=['merge.c']) ])
Now, run
python setup.py install
at a shell prompt in this directory. This command builds the dynamically loaded library for the
merge
extension module, and copies it to the appropriate directory, depending on your Python installation. Now your Python code can use the module. For example:
import merge
x = {'a':1,'b':2 }
merge.merge(x,[['b',3],['c',4]])
print x # prints: {'a':1, 'b':2, 'c':4 }
print merge.mergenew(x,{'a':5,'d':6},override=1)
# prints: {'a':5, 'b':2, 'c':4, 'd':6 }
print x # prints: {'a':1, 'b':2, 'c':4 }
This example shows the difference between
merge
(which alters its first argument) and
mergenew
(which returns a new object and does not alter its argument). It also shows that the second argument can be either a dictionary or a sequence of two-item subsequences. Further, it
demonstrates
default operation (where keys that are already in the first argument are left alone) as well as the
override
option (where keys coming from the second argument take precedence, as in Python dictionaries'
update
method).
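When testing an extension like this one, it can help to have a pure-Python reference with the same observable behavior. The following is a rough sketch, not the module's implementation: it assumes that any object with a keys method should be treated as a mapping and anything else as a sequence of pairs, which only approximates the try-then-fall-back logic of the C code.

```python
def merge(x, y, override=False):
    """Merge the items of y into dict x; alters x and returns None."""
    if not isinstance(x, dict):
        raise TypeError("x must be a dictionary")
    if hasattr(y, "keys"):                     # assumed test for "is a mapping"
        pairs = [(k, y[k]) for k in y.keys()]
    else:                                      # otherwise, a sequence of pairs
        pairs = [tuple(item) for item in y]
    for pair in pairs:
        if len(pair) != 2:
            raise ValueError("each item of y must have length 2")
        key, value = pair
        if override or key not in x:           # existing keys win unless override
            x[key] = value

def mergenew(x, y, override=False):
    """Like merge, but leaves x alone and returns the merged copy."""
    if not isinstance(x, dict):
        raise TypeError("x must be a dictionary")
    result = x.copy()
    merge(result, y, override)
    return result
```

Running the same calls against both the C module and this model, and comparing the resulting dictionaries, is one simple way to check the extension.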
24.1.12 Defining New Types
In your extension modules, you often want to define new types and make them available to Python. A type's definition is held in a large struct named
PyTypeObject
. Most of the fields of
PyTypeObject
are pointers to functions. Some fields point to other structs, which in
turn
are blocks of pointers to functions.
PyTypeObject
also includes a few fields giving the type's name, size, and behavior details (option flags). You can leave almost all fields of
PyTypeObject
set to
NULL
if you do not supply the related functionality. You can point some fields to functions in the Python C API in order to supply certain aspects of fundamental object functionality in standard ways.
The best way to implement a type is to copy from the Python sources the file
Modules/xxsubtype.c
, which Python supplies for exactly such didactic purposes, and edit it. It's a complete module with two types, subclassing from
list
and
dict
respectively. Another example in the Python sources,
Objects/xxobject.c
, is not a complete module, and the type in this file is minimal and old-fashioned, not using modern recommended approaches. See http://www.python.org/dev/doc/devel/api/type-structs.html for detailed documentation on
PyTypeObject
and other related structs. File
Include/object.h
in the Python sources contains the declarations of these types, as well as several important comments that you would do well to study.
24.1.12.1 Per-instance data
To represent each instance of your type, declare a C struct that starts, right after the opening
brace
, with macro
PyObject_HEAD
. The macro expands into the data fields that your struct must begin with in order to be a Python object. Those fields include the reference count and a pointer to the instance's type. Any pointer to your structure can be correctly cast to a
PyObject*
.
The
PyTypeObject
struct that defines your type's characteristics and behavior must contain the size of your per-instance struct, as well as pointers to the C functions you write to operate on your structure. Therefore, you normally place the
PyTypeObject
toward the end of your code, after the per-instance struct and all the functions that operate on instances of the per-instance struct. Each
x
that points to a structure starting with
PyObject_HEAD
, and in particular each
PyObject*
x
, has a field
x
->ob_type
that is the address of the
PyTypeObject
structure that is
x
's Python type object.
24.1.12.2 The PyTypeObject definition
Given a per-instance struct such as:
typedef struct {
    PyObject_HEAD
    /* other data needed by instances of this type, omitted */
} mytype;
the corresponding
PyTypeObject
struct almost invariably begins in a way similar to:
static PyTypeObject t_mytype = {
/* tp_head */       PyObject_HEAD_INIT(NULL)    /* use NULL, for MSVC++ */
/* tp_internal */   0,                          /* must be 0 */
/* tp_name */       "mymodule.mytype",          /* type name with module */
/* tp_basicsize */  sizeof(mytype),
/* tp_itemsize */   0,                          /* 0 except variable-size type */
/* tp_dealloc */    (destructor)mytype_dealloc,
/* tp_print */      0,                          /* usually 0, use str instead */
/* tp_getattr */    0,                          /* usually 0 (see getattro) */
/* tp_setattr */    0,                          /* usually 0 (see setattro) */
/* tp_compare */    0,                          /* see also richcompare */
/* tp_repr */       (reprfunc)mytype_str,       /* like Python's __repr__ */
/* rest of struct omitted */
For portability to Microsoft Visual C++, the
PyObject_HEAD_INIT
macro at the start of the
PyTypeObject
must have an argument of
NULL
. During module initialization, you must call
PyType_Ready(&t_mytype)
, which, among other tasks,
inserts
in
t_mytype
the address of its type (the type of a type is also known as a metatype), normally
&PyType_Type
. Another slot in
PyTypeObject
that points to another type object is
tp_base
, later in the structure. In the structure definition itself, you must have a
tp_base
of
NULL
, again for compatibility with Microsoft Visual C++. However, before you invoke
PyType_Ready(&t_mytype)
, you can optionally set
t_mytype.tp_base
to the address of another type object. When you do so, your type inherits from the other type, just like a class coded in Python 2.2 can optionally inherit from a built-in type. For a Python type coded in C, inheriting means that for most fields in the
PyTypeObject
, if you set the field to
NULL
,
PyType_Ready
copies the corresponding field from the base type. A type must
specifically
assert in its field
tp_flags
that it is usable as a base type, otherwise no other type can inherit from it.
The
tp_itemsize
field is of interest only for types that, like tuples, have instances of different sizes, and can determine instance size once and forever at creation time. Most types just set
tp_itemsize
to
0
. Fields such as
tp_getattr
and
tp_setattr
are generally set to
NULL
because they exist only for backward compatibility: modern types use fields
tp_getattro
and
tp_setattro
instead. Field
tp_repr
is typical of most of the following fields, which are omitted here: the field holds the address of a function, which corresponds directly to a Python special method (here,
__repr__
). You can set the field to
NULL
, indicating that your type does not supply the special method, or else set the field to point to a function with the needed functionality. If you set the field to
NULL
, but also point to a base type from the
tp_base
slot, you inherit the special method, if any, from your base type. You often need to cast your functions to the specific
typedef
type that a field needs (here, type
reprfunc
for field
tp_repr
) because the
typedef
has a first argument
PyObject*
self
, while your functions, being specific to your type, normally use more specific pointers. For example:
static PyObject* mytype_str(mytype* self) { ... /* rest omitted */
Alternatively, you can declare
mytype_str
with a
PyObject*
self
, then use a cast
(mytype*)self
in the function's body. Either alternative is acceptable, but it's more common to locate the casts in the
PyTypeObject
declaration.
24.1.12.3 Instance initialization and finalization
The task of finalizing your instances is split between two functions. The
tp_dealloc
slot must never be
NULL
, except for immortal types (i.e., types whose instances are never deallocated). Python calls
x
->ob_type->tp_dealloc(
x
)
on each instance
x
whose reference count decreases to
0
, and the function thus called must release any resource held by object
x
, including
x
's memory. When an instance of
mytype
holds no other resources that must be released (in particular, no owned references to other Python objects that you would have to
DECREF
),
mytype
's destructor can be extremely simple:
static void mytype_dealloc(PyObject *x)
{
    x->ob_type->tp_free((PyObject*)x);
}
The function in the
tp_free
slot has the specific task of freeing
x
's memory. In Python 2.2, the function has signature
void
name
(PyObject*)
. In Python 2.3, the signature has changed to
void
name
(void*)
. One way to ensure your sources compile under both versions of Python is to put in slot
tp_free
the C API function
_PyObject_Del
, which has the right signature in each version.
The task of initializing your instances is split among three functions. To allocate memory for new instances of your type, put in slot
tp_alloc
the C API function
PyType_GenericAlloc
, which does absolutely minimal initialization, clearing the newly allocated memory bytes to
0
except for the type pointer and reference count. Similarly, you can often set field
tp_new
to the C API function
PyType_GenericNew
. In this case, you can perform all per-instance initialization in the function you put in slot
tp_init
, which has the signature:
int
init_name
(PyObject *
self
,PyObject *
args
,PyObject *
kwds
)
The positional and named arguments to the function in slot
tp_init
are those passed when calling the type to create the new instance, just like, in Python, the positional and named arguments to
__init__
are those passed when calling the class object. Again, as for types (classes) defined in Python, the general rule is to do as little initialization as possible in
tp_new
and as much as possible in
tp_init
. Using
PyType_GenericNew
for
tp_new
accomplishes this. However, you can choose to define your own
tp_new
for special types, such as ones that have immutable instances, where initialization must happen earlier. The signature is:
PyObject*
new_name
(PyObject *
subtype
,PyObject *
args
,PyObject *
kwds
)
The function in
tp_new
must return the newly created instance, normally an instance of
subtype
(which may be a type that inherits from yours). The function in
tp_init
, on the other hand, must return
0
for success, or
-1
to indicate an exception.
If your type is subclassable, it's important that any instance invariants be established before the function in
tp_new
returns. For example, if it must be
guaranteed
that a certain field of the instance is never
NULL
, that field must be set to a non-
NULL
value by the function in
tp_new
. Subtypes of your type might fail to call your
tp_init
function; therefore such
indispensable
initializations should be in
tp_new
for subclassable types.
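The same concern exists for classes coded in Python, which makes it easy to illustrate. In the sketch below (class names Base and Careless are hypothetical), __new__ plays the role of tp_new and __init__ the role of tp_init: because the invariant "items is always a list" is established in __new__, it survives a subclass that forgets to call the base initializer.

```python
class Base(object):
    def __new__(cls, *args, **kwds):
        self = super(Base, cls).__new__(cls)
        self.items = []            # invariant: items is never missing
        return self

    def __init__(self, *items):
        self.items.extend(items)   # ordinary, optional initialization

class Careless(Base):
    def __init__(self, *items):
        pass                       # never calls Base.__init__
```

An instance of Careless still has a usable, empty items list. Had Base created the list in __init__ instead, any method relying on self.items would fail on Careless instances.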
24.1.12.4 Attribute access
Access to attributes of your instances, including methods (as covered in Chapter 5), is mediated by the functions you put in slots
tp_getattro
and
tp_setattro
of your
PyTypeObject
struct. Normally, you put there the standard C API functions
PyObject_GenericGetAttr
and
PyObject_GenericSetAttr
, which implement standard semantics. Specifically, these API functions access your type's methods via the slot
tp_methods
, pointing to a sentinel-terminated array of
PyMethodDef
structs, and your instances'
members
via the slot
tp_members
, a similar sentinel-terminated array of
PyMemberDef
structs:
typedef struct {
    char* name;      /* Python-visible name of the member */
    int type;        /* code defining the data-type of the member */
    int offset;      /* offset of the member in the per-instance struct */
    int flags;       /* READONLY for a read-only member */
    char* doc;       /* docstring for the member */
} PyMemberDef;
As an exception to the general rule that including
Python.h
gets you all the declarations you need, you have to include
structmember.h
explicitly in order to have your C source see the declaration of
PyMemberDef
.
type
is generally
T_OBJECT
for members that are
PyObject*
, but many other type codes are defined in
Include/structmember.h
for members that your instances hold as C-native data (e.g.,
T_DOUBLE
for
double
or
T_STRING
for
char*
). For example, if your per-instance struct is something like:
typedef struct {
    PyObject_HEAD
    double datum;
    char* name;
} mytype;
to expose to Python per-instance attributes
datum
(read/write) and
name
(read-only), you can define the following array and point your
PyTypeObject
's
tp_members
to it:
static PyMemberDef mytype_members[] = {
    {"datum", T_DOUBLE, offsetof(mytype, datum), 0, "The current datum"},
    {"name", T_STRING, offsetof(mytype, name), READONLY,
     "Name of the datum"},
    {NULL}
};
Using
PyObject_GenericGetAttr
and
PyObject_GenericSetAttr
for
tp_getattro
and
tp_setattro
also provides further possibilities, which I will not cover in detail in this book. Field
tp_getset
points to a sentinel-terminated array of
PyGetSetDef
structs, the equivalent of having
property
instances in a Python-coded class. If your
PyTypeObject
's field
tp_dictoffset
is not equal to
0
, the field's value must be the offset, within the per-instance struct, of a
PyObject*
that points to a Python dictionary. In this case, the generic attribute access API functions use that dictionary to allow Python code to set arbitrary attributes on your type's instances, just like for instances of Python-coded classes.
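Both ideas have direct counterparts in Python-coded classes, which the following sketch illustrates. The class and function names are hypothetical, and the analogy is loose: a read-only property stands in for a getter-only tp_getset entry, and a class with __slots__ (which has no per-instance dictionary) stands in for a type whose tp_dictoffset is 0.

```python
class Point(object):
    """Has a per-instance dictionary, so arbitrary attributes can be set."""
    def __init__(self, x):
        self._x = x

    @property
    def x(self):               # getter only: the attribute is read-only
        return self._x

class SlottedPoint(object):
    """No per-instance dictionary: only the listed attributes exist."""
    __slots__ = ("_x",)

    def __init__(self, x):
        self._x = x

def can_set(obj, name, value):
    """Return True if setting the attribute succeeds, False otherwise."""
    try:
        setattr(obj, name, value)
    except AttributeError:
        return False
    return True
```

With these definitions, a new attribute such as label can be set on a Point but not on a SlottedPoint, and assigning to the property x fails on a Point.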
Another dictionary is per-type, not per-instance: the
PyObject*
for the per-type dictionary is slot
tp_dict
of your
PyTypeObject
struct. You can set slot
tp_dict
to
NULL
, and then
PyType_Ready
initializes the dictionary appropriately. Alternatively, you can set
tp_dict
to a dictionary of type attributes, and then
PyType_Ready
adds other entries to that same dictionary, in addition to the type attributes you set. It's generally easier to start with
tp_dict
set to
NULL
, call
PyType_Ready
to create and initialize the per-type dictionary, and then, if need be, add any further entries to the dictionary.
Field
tp_flags
is a
long
whose bits determine your type struct's exact layout, mostly for backward compatibility. Normally, set this field to
Py_TPFLAGS_DEFAULT
to indicate that you are defining a normal, modern type. You should set
tp_flags
to
Py_TPFLAGS_DEFAULT|Py_TPFLAGS_HAVE_GC
if your type supports cyclic garbage collection. Your type should support cyclic garbage collection if instances of the type contain
PyObject*
fields that might point to arbitrary objects and form part of a reference loop. However, to support cyclic garbage collection, it's not enough to add
Py_TPFLAGS_HAVE_GC
to field
tp_flags
; you also have to supply appropriate functions, indicated by slots
tp_traverse
and
tp_clear
, and register and unregister your instances appropriately with the cyclic garbage collector. Supporting cyclic garbage collection is an advanced subject, and I do not cover it further in this book. Similarly, I do not cover the advanced subject of supporting weak references.
Field
tp_doc
, a
char*
, is a null-terminated character string that is your type's docstring. Other fields point to structs (whose fields point to functions); you can set each such field to
NULL
to indicate that you support none of the functions of that kind. The fields pointing to such blocks of functions are
tp_as_number
, for special methods typically supplied by numbers;
tp_as_sequence
, for special methods typically supplied by sequences;
tp_as_mapping
, for special methods typically supplied by mappings; and
tp_as_buffer
, for the special methods of the buffer protocol.
For example, objects that are not sequences can still support one or a few of the methods listed in the block to which
tp_as_sequence
points, and in that case the
PyTypeObject
must have a non-
NULL
field
tp_as_sequence
, even if the block of function pointers it points to is in turn mostly full of
NULL
s. For example, dictionaries supply a
__contains__
special method so that you can check if
x
in
d
when
d
is a dictionary. At the C code level, the method is a function pointed to by field
sq_contains
, which is part of the
PySequenceMethods
struct to which field
tp_as_sequence
points. Therefore, the
PyTypeObject
struct for the
dict
type, named
PyDict_Type
, has a non-
NULL
value for
tp_as_sequence
, even though a dictionary supplies no other field in
PySequenceMethods
except
sq_contains
, and therefore all other fields in
*(PyDict_Type.tp_as_sequence)
are
NULL
.
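The Python-level view of the same situation is a class that supports the in operator without being a sequence. In this sketch (the names Evens and supports_len are hypothetical), the class supplies only __contains__, much as the dictionary type fills in only sq_contains and leaves the other sequence slots empty.

```python
class Evens(object):
    """Not a sequence, yet usable on the right-hand side of 'in'."""
    def __contains__(self, n):
        return n % 2 == 0

def supports_len(obj):
    """Return True if len(obj) works, False if it raises TypeError."""
    try:
        len(obj)
    except TypeError:
        return False
    return True

evens = Evens()
```

Membership tests work on evens, while other sequence operations, such as len, are not supported.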
24.1.12.5 Type definition example
Example 24-2 is a complete Python extension module that defines the very simple type
intpair
, each instance of which holds two integers named
first
and
second
.
Example 24-2. Defining a new intpair type
#include "Python.h"
#include "structmember.h"

/* per-instance data structure */
typedef struct {
    PyObject_HEAD
    int first, second;
} intpair;

static int
intpair_init(PyObject *self, PyObject *args, PyObject *kwds)
{
    static char* nams[] = {"first","second",NULL};
    int first, second;
    if(!PyArg_ParseTupleAndKeywords(args, kwds, "ii", nams, &first, &second))
        return -1;
    ((intpair*)self)->first = first;
    ((intpair*)self)->second = second;
    return 0;
}

static void
intpair_dealloc(PyObject *self)
{
    self->ob_type->tp_free(self);
}

static PyObject*
intpair_str(PyObject* self)
{
    return PyString_FromFormat("intpair(%d,%d)",
        ((intpair*)self)->first, ((intpair*)self)->second);
}

static PyMemberDef intpair_members[] = {
    {"first", T_INT, offsetof(intpair, first), 0, "first item" },
    {"second", T_INT, offsetof(intpair, second), 0, "second item" },
    {NULL}
};

static PyTypeObject t_intpair = {
    PyObject_HEAD_INIT(0)               /* tp_head */
    0,                                  /* tp_internal */
    "intpair.intpair",                  /* tp_name */
    sizeof(intpair),                    /* tp_basicsize */
    0,                                  /* tp_itemsize */
    intpair_dealloc,                    /* tp_dealloc */
    0,                                  /* tp_print */
    0,                                  /* tp_getattr */
    0,                                  /* tp_setattr */
    0,                                  /* tp_compare */
    intpair_str,                        /* tp_repr */
    0,                                  /* tp_as_number */
    0,                                  /* tp_as_sequence */
    0,                                  /* tp_as_mapping */
    0,                                  /* tp_hash */
    0,                                  /* tp_call */
    0,                                  /* tp_str */
    PyObject_GenericGetAttr,            /* tp_getattro */
    PyObject_GenericSetAttr,            /* tp_setattro */
    0,                                  /* tp_as_buffer */
    Py_TPFLAGS_DEFAULT,                 /* tp_flags */
    "two ints (first,second)",          /* tp_doc */
    0,                                  /* tp_traverse */
    0,                                  /* tp_clear */
    0,                                  /* tp_richcompare */
    0,                                  /* tp_weaklistoffset */
    0,                                  /* tp_iter */
    0,                                  /* tp_iternext */
    0,                                  /* tp_methods */
    intpair_members,                    /* tp_members */
    0,                                  /* tp_getset */
    0,                                  /* tp_base */
    0,                                  /* tp_dict */
    0,                                  /* tp_descr_get */
    0,                                  /* tp_descr_set */
    0,                                  /* tp_dictoffset */
    intpair_init,                       /* tp_init */
    PyType_GenericAlloc,                /* tp_alloc */
    PyType_GenericNew,                  /* tp_new */
    _PyObject_Del,                      /* tp_free */
};

void
initintpair(void)
{
    static PyMethodDef no_methods[] = { {NULL} };
    PyObject* this_module = Py_InitModule("intpair", no_methods);
    /* give up if the module or the type could not be set up */
    if(this_module == NULL || PyType_Ready(&t_intpair) < 0)
        return;
    PyObject_SetAttrString(this_module, "intpair", (PyObject*)&t_intpair);
}
The
intpair
type defined in Example 24-2 gives just about no substantial benefits when compared to an equivalent definition in Python, such as:
class intpair(object):
    __slots__ = 'first', 'second'
    def __init__(self, first, second):
        self.first = first
        self.second = second
    def __repr__(self):
        return 'intpair(%s,%s)' % (self.first, self.second)
The C-coded version does ensure the two attributes are integers, truncating float arguments as needed. For example:
import intpair
x=intpair.intpair(1.2,3.4) # x is: intpair(1,3)
Each instance of the C-coded version of
intpair
occupies somewhat less memory than an instance of the Python version in the above example. However, the purpose of Example 24-2 is purely didactic: to present a C-coded Python extension that defines a new type.