Python Programming on Win32: Appendix C - The Python Database API Version 2.0

This appendix is a direct reproduction of Version 2.0 of the Python Database API. The same information can be found at http://www.python.org/topics/database/DatabaseAPI-2.0.html.

Footnotes are collected as endnotes at the end of the chapter, as in the online specification.

Python Database API Specification 2.0

This API has been defined to encourage similarity between the Python modules that access databases. By doing this, we hope to achieve a consistency leading to more easily understood modules, code that is generally more portable across databases, and a broader reach of database connectivity from Python.

The interface specification consists of several sections:

Module interface

Connection objects

Cursor objects

Type objects and constructors

Implementation hints

Major changes from 1.0 to 2.0

Comments and questions about this specification may be directed to the SIG for Database Interfacing with Python.

For more information on database interfacing with Python and available packages see the Database Topics Guide on www.python.org.

This document describes the Python Database API Specification 2.0. The previous version, 1.0, is still available as a reference. Package writers are encouraged to use this version of the specification as the basis for new interfaces.

Module Interface

Access to the database is made available through connection objects. The module must provide the following constructor for these:

connect(parameters): Constructor for creating a connection to the database. Returns a Connection object. It takes a number of parameters that are database dependent.¹
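As a minimal sketch of the constructor in use, the following assumes the standard library's sqlite3 module, which is not mentioned in this specification but follows it. The ":memory:" argument is specific to SQLite; other modules take different, database-dependent parameters.

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

# connect() takes database-dependent parameters. SQLite takes a file name,
# and ":memory:" requests a temporary in-memory database.
connection = sqlite3.connect(":memory:")

# The returned Connection object hands out cursors for running operations.
cursor = connection.cursor()
cursor.execute("SELECT 1")
first_row = cursor.fetchone()

connection.close()
```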

A reference to the operation is retained by the cursor. If the same operation object is passed in again, the cursor can optimize its behavior. This is most effective for algorithms where the same operation is used, but different parameters are bound to it (many times).

For maximum efficiency when reusing an operation, it's best to use the setinputsizes() method to specify the parameter types and sizes ahead of time. It's legal for a parameter to not match the predefined information; the implementation should compensate, possibly with a loss of efficiency.

The parameters may also be specified as a list of tuples to, for example, insert multiple rows in a single operation, but this kind of use is deprecated: executemany() should be used instead. Return values are not defined.
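The reuse of one operation with different bound parameters might look like the following sketch. It assumes the standard library's sqlite3 module and its "?" placeholder notation; the table and column names are invented for illustration, and whether the module actually optimizes the repeated operation is implementation dependent.

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE people (name TEXT, age INTEGER)")

# The same operation string is passed on every call; only the bound
# parameters change, which gives the module a chance to reuse its work.
insert_person = "INSERT INTO people (name, age) VALUES (?, ?)"
for person in [("Ada", 36), ("Grace", 45), ("Guido", 43)]:
    cursor.execute(insert_person, person)

cursor.execute("SELECT COUNT(*) FROM people")
row_count = cursor.fetchone()[0]
connection.close()
```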

executemany(operation, seq_of_parameters): Prepares a database operation (query or command) and then executes it against all parameter sequences or mappings found in the sequence seq_of_parameters.

Modules are free to implement this method using multiple calls to the execute() method or by using array operations to have the database process the sequence as a whole in one call. The same comments as for execute() also apply accordingly to this method. Return values aren't defined.
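A short sketch of executemany(), again assuming sqlite3 and an invented table, inserts several rows with one call:

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE points (x INTEGER, y INTEGER)")

# Each inner tuple is one parameter sequence for the operation.
rows = [(1, 2), (3, 4), (5, 6)]
cursor.executemany("INSERT INTO points (x, y) VALUES (?, ?)", rows)

cursor.execute("SELECT x, y FROM points ORDER BY x")
stored = cursor.fetchall()
connection.close()
```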

fetchone(): Fetches the next row of a query result set, returning a single sequence, or None when no more data is available.⁶

An Error (or subclass) exception is raised if the previous call to executeXXX() doesn't produce any result set or no call was issued.
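The usual way to consume a result set with fetchone() is to loop until it returns None. The sketch below assumes sqlite3 and an invented one-column table:

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE t (n INTEGER)")
cursor.executemany("INSERT INTO t (n) VALUES (?)", [(1,), (2,), (3,)])
cursor.execute("SELECT n FROM t ORDER BY n")

values = []
while True:
    row = cursor.fetchone()
    if row is None:  # no more data in this result set
        break
    values.append(row[0])
connection.close()
```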

fetchmany([size=cursor.arraysize]): Fetches the next set of rows of a query result, returning a sequence of sequences (e.g., a list of tuples). An empty sequence is returned when no more rows are available.

The number of rows to fetch per call is specified by the parameter. If it isn't given, the cursor's arraysize determines the number of rows to be fetched. The method should try to fetch as many rows as indicated by the size parameter. If this isn't possible due to the specified number of rows not being available, fewer rows may be returned.

An Error (or subclass) exception is raised if the previous call to executeXXX() doesn't produce any result set or no call was issued.

Note there are performance considerations involved with the size parameter. For optimal performance, it's usually best to use the arraysize attribute. If the size parameter is used, then it's best for it to retain the same value from one fetchmany() call to the next.
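Batch-wise fetching can be sketched as follows, assuming sqlite3 and an invented table of seven rows. The size stays the same from call to call, as recommended above, and the final batch is shorter because fewer rows remain.

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE t (n INTEGER)")
cursor.executemany("INSERT INTO t (n) VALUES (?)", [(n,) for n in range(7)])
cursor.execute("SELECT n FROM t ORDER BY n")

batch_sizes = []
while True:
    batch = cursor.fetchmany(3)
    if not batch:  # an empty sequence signals the end of the result set
        break
    batch_sizes.append(len(batch))
connection.close()
```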

fetchall(): Fetches all (remaining) rows of a query result, returning them as a sequence of sequences (e.g., a list of tuples). The cursor's arraysize attribute can affect the performance of this operation.

An Error (or subclass) exception is raised if the previous call to executeXXX() doesn't produce any result set or no call was issued.

nextset(): This method is optional since not all databases support multiple result sets. It makes the cursor skip to the next available set, discarding any remaining rows from the current set. If there are no more sets, the method returns None. Otherwise, it returns a true value and subsequent calls to the fetch methods return rows from the next result set.

An Error (or subclass) exception is raised if the previous call to executeXXX() doesn't produce any result set or no call was issued.
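Because nextset() is optional, portable code may want to test for it before use. The helper below is a sketch only: iter_result_sets and FakeCursor are hypothetical names invented here, and FakeCursor merely imitates a cursor whose database returned two result sets.

```python
def iter_result_sets(cursor):
    """Yield each result set of an already-executed cursor as a list of rows."""
    while True:
        yield cursor.fetchall()
        # nextset() is optional, so check for it before calling it.
        # It returns None when there are no more sets.
        if not hasattr(cursor, "nextset") or not cursor.nextset():
            break


class FakeCursor:
    """Hypothetical stand-in for a cursor holding several result sets."""

    def __init__(self, result_sets):
        self._sets = list(result_sets)

    def fetchall(self):
        # Return the rows of the current set and mark it as consumed.
        rows, self._sets[0] = self._sets[0], []
        return rows

    def nextset(self):
        # Discard the current set; report whether another one follows.
        self._sets.pop(0)
        return True if self._sets else None


all_sets = list(iter_result_sets(FakeCursor([[(1,), (2,)], [("a",)]])))
```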

arraysize: This read/write attribute specifies the number of rows at a time to fetch with fetchmany(). It defaults to 1, meaning to fetch a single row at a time.

Implementations must observe this value with respect to the fetchmany() method but are free to interact with the database a single row at a time. It may also be used in the implementation of executemany().
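The interaction between arraysize and fetchmany() can be sketched as follows, assuming sqlite3 and an invented table:

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE t (n INTEGER)")
cursor.executemany("INSERT INTO t (n) VALUES (?)", [(n,) for n in range(10)])

default_arraysize = cursor.arraysize  # 1 unless changed

# With no size argument, fetchmany() falls back to the cursor's arraysize.
cursor.arraysize = 4
cursor.execute("SELECT n FROM t ORDER BY n")
first_batch = cursor.fetchmany()
connection.close()
```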

setinputsizes(sizes): This can be used before a call to executeXXX() to predefine memory areas for the operation's parameters.

sizes is specified as a sequence: one item for each input parameter. The item should be a Type object that corresponds to the input used, or it should be an integer specifying the maximum length of a string parameter. If the item is None, no predefined memory area is reserved for that column (this is useful to avoid predefined areas for large inputs).

This method is used before the executeXXX() method is invoked. Implementations are free to have this method do nothing, and users are free to not use it.

setoutputsize(size[,column]): Sets a column buffer size for fetches of large columns (e.g., LONGs, BLOBs, and so on). The column is specified as an index into the result sequence. Not specifying the column sets the default size for all large columns in the cursor.

This method is used before the executeXXX() method is invoked. Implementations are free to have this method do nothing, and users are free to not use it.
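Both sizing methods are hints, so calling them should never change results. The sketch below assumes sqlite3, where both calls are accepted and do nothing; the table and the sizes are invented for illustration.

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE docs (title TEXT, body BLOB)")

# Hint: the title is a string of at most 80 characters; None means
# "reserve nothing" for the potentially large body parameter.
cursor.setinputsizes([80, None])
cursor.execute(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    ("notes", b"\x00" * 1024),
)

# Hint: allow up to 64 KB for column index 1 (body) when fetching.
cursor.setoutputsize(65536, 1)
cursor.execute("SELECT title, body FROM docs")
title, body = cursor.fetchone()
connection.close()
```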

Type Objects and Constructors

Many databases need to have the input in a particular format in order to bind to an operation's input parameters. For example, if an input is destined for a DATE column, it must be bound to the database in a particular string format. Similar problems exist for "Row ID" columns or large binary items (e.g., BLOBs or RAW columns). This presents problems for Python since the parameters to the executeXXX() method are untyped. When the database module sees a Python string object, it doesn't know if it should be bound as a simple CHAR column, as a raw BINARY item, or as a DATE.

To overcome this problem, a module must provide the constructors defined here to create objects that can hold special values. When passed to the cursor methods, the module can then detect the proper type of the input parameter and bind it accordingly.

A cursor object's description attribute returns information about each of the result columns of a query. The type_code must compare equal to one of the type objects defined here. Type objects may be equal to more than one type code (e.g., DATETIME could be equal to the type codes for date, time, and timestamp columns; see the implementation hints later in this appendix for details).

The module exports the following constructors and singletons:

Date(year,month,day): This function constructs an object holding a date value.

Time(hour,minute,second): This function constructs an object holding a time value.

Timestamp(year,month,day,hour,minute,second): This function constructs an object holding a timestamp value.

DateFromTicks(ticks): This function constructs an object holding a date value from the given ticks value (number of seconds since the epoch; see the documentation of the standard Python time module for details).

TimeFromTicks(ticks): This function constructs an object holding a time value from the given ticks value (number of seconds since the epoch; see the documentation of the standard Python time module for details).

TimestampFromTicks(ticks): This function constructs an object holding a timestamp value from the given ticks value (number of seconds since the epoch; see the documentation of the standard Python time module for details).

Binary(string): This function constructs an object capable of holding a binary (long) string value.

STRING: This type object describes columns in a database that are string-based (e.g., CHAR).

BINARY: This type object describes (long) binary columns in a database (e.g., LONG, RAW, BLOBs).

NUMBER: This type object describes numeric columns in a database.

DATETIME: This type object describes date/time columns in a database.

ROWID: This type object describes the "Row ID" column in a database.

SQL NULL values are represented by the Python None singleton on input and output.

Note that using Unix ticks for database interfacing can cause trouble because of the limited date range they cover.
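The constructors and the None convention might be used as in the following sketch. It assumes sqlite3, whose Binary and Date constructors follow this section; the table is invented, and the date is only constructed, not bound, to keep the example independent of how a given module stores dates.

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE files (data BLOB, comment TEXT)")

# Binary() marks the value as binary data; None stands for SQL NULL.
cursor.execute(
    "INSERT INTO files (data, comment) VALUES (?, ?)",
    (sqlite3.Binary(b"\x00\x01\x02"), None),
)
cursor.execute("SELECT data, comment FROM files")
data, comment = cursor.fetchone()

# The date/time constructors build values without touching the database.
some_day = sqlite3.Date(1999, 4, 7)
connection.close()
```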

Implementation Hints

The preferred object types for the date/time objects are those defined in the mxDateTime package. It provides all necessary constructors and methods both at Python and C level.

The preferred object types for binary objects are the buffer types available in standard Python starting with Version 1.5.2. Please see the Python documentation for details. For information about the C interface, have a look at Include/bufferobject.h and Objects/bufferobject.c in the Python source distribution.

Here is a sample implementation of the Unix ticks based constructors for date/time delegating work to the generic constructors:

    import time

    def DateFromTicks(ticks):
        return apply(Date,time.localtime(ticks)[:3])

    def TimeFromTicks(ticks):
        return apply(Time,time.localtime(ticks)[3:6])

    def TimestampFromTicks(ticks):
        return apply(Timestamp,time.localtime(ticks)[:6])
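The sample above relies on apply(), which is not available in Python 3. A rough Python 3 equivalent uses argument unpacking instead. In this sketch the generic constructors are assumed to be the standard datetime classes, which is a choice made here for illustration and not something the specification requires.

```python
import datetime
import time

# Assumption: the module's generic constructors are the datetime classes.
Date = datetime.date
Time = datetime.time
Timestamp = datetime.datetime


def DateFromTicks(ticks):
    # Year, month, and day are the first three fields of the time tuple.
    return Date(*time.localtime(ticks)[:3])


def TimeFromTicks(ticks):
    # Hour, minute, and second are fields 3 to 5.
    return Time(*time.localtime(ticks)[3:6])


def TimestampFromTicks(ticks):
    return Timestamp(*time.localtime(ticks)[:6])


ticks = 86400 * 365  # roughly one year after the epoch
```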

This Python class allows implementing the above type objects even though the description type code field yields multiple values for one type object:

    class DBAPITypeObject:
        def __init__(self,*values):
            self.values = values
        def __cmp__(self,other):
            if other in self.values:
                return 0
            if other < self.values:
                return 1
            else:
                return -1

The resulting type object compares equal to all values passed to the constructor.
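Python 3 no longer calls __cmp__, so the same idea has to be expressed with __eq__ there. The following is a sketch under that assumption; the numeric type codes are hypothetical values invented for the example.

```python
class DBAPITypeObject:
    """Type object that compares equal to every type code it was given."""

    def __init__(self, *values):
        self.values = values

    def __eq__(self, other):
        if isinstance(other, DBAPITypeObject):
            return self.values == other.values
        return other in self.values

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one.
        return hash(self.values)


# Hypothetical type codes, invented for this sketch.
DATE_CODE, TIME_CODE, TIMESTAMP_CODE, VARCHAR_CODE = 10, 11, 12, 1

DATETIME = DBAPITypeObject(DATE_CODE, TIME_CODE, TIMESTAMP_CODE)
STRING = DBAPITypeObject(VARCHAR_CODE)
```

A type code taken from a cursor's description can then be tested with an expression such as type_code == DATETIME, whichever of the three date/time codes the column carries.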

Here is a snippet of Python code that implements the exception hierarchy defined previously:

    import exceptions

    class Error(exceptions.StandardError):
        pass

    class Warning(exceptions.StandardError):
        pass

    class InterfaceError(Error):
        pass

    class DatabaseError(Error):
        pass

    class InternalError(DatabaseError):
        pass

    class OperationalError(DatabaseError):
        pass

    class ProgrammingError(DatabaseError):
        pass

    class IntegrityError(DatabaseError):
        pass

    class DataError(DatabaseError):
        pass

    class NotSupportedError(DatabaseError):
        pass

In C you can use the PyErr_NewException (fullname, base, NULL) API to create the exception objects.
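The Python snippet for the exception hierarchy depends on the exceptions module and on StandardError, neither of which exists in Python 3. A sketch of the same layout for Python 3, assuming Exception as the common base, could look like this:

```python
class Error(Exception):
    pass


class Warning(Exception):  # shadows the built-in Warning within this module
    pass


class InterfaceError(Error):
    pass


class DatabaseError(Error):
    pass


class InternalError(DatabaseError):
    pass


class OperationalError(DatabaseError):
    pass


class ProgrammingError(DatabaseError):
    pass


class IntegrityError(DatabaseError):
    pass


class DataError(DatabaseError):
    pass


class NotSupportedError(DatabaseError):
    pass


# A handler for a base class also catches its subclasses.
try:
    raise IntegrityError("duplicate key")
except DatabaseError as exc:
    caught = type(exc).__name__
```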

Major Changes from Version 1.0 to Version 2.0

The Python Database API 2.0 introduces a few major changes compared to the 1.0 version. Because some of these changes will cause existing DB API 1.0-based scripts to break, the major version number was adjusted to reflect this change. These are the most important changes from 1.0 to 2.0:

The need for a separate dbi module was dropped, and the functionality merged into the module interface itself.

New constructors and type objects were added for date/time values, and the RAW type object was renamed to BINARY. The resulting set should cover all basic data types commonly found in modern SQL databases.

New constants (apilevel, threadsafety, paramstyle) and methods (executemany, nextset) were added to provide better database bindings.

The semantics of .callproc() needed to call stored procedures are now clearly defined.

The definition of the .execute() return value changed. Previously, the return value was based on the SQL statement type (which was hard to correctly implement); it's undefined now. Use the more flexible .rowcount attribute instead. Modules are free to return the old-style return values, but these are no longer mandated by the specification and should be considered database interface dependent.

Class-based exceptions were incorporated into the specification. Module implementors are free to extend the exception layout defined in this specification by subclassing the defined exception classes.
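The change to the .execute() return value means that portable code reads .rowcount instead of inspecting what .execute() returns. A sketch, assuming sqlite3 and an invented table:

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE t (n INTEGER)")
cursor.executemany("INSERT INTO t (n) VALUES (?)", [(1,), (2,), (3,)])

# Ignore whatever execute() returns; ask the cursor how many rows changed.
cursor.execute("UPDATE t SET n = n + 10 WHERE n >= ?", (2,))
updated = cursor.rowcount
connection.close()
```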

Open Issues

Although the Version 2.0 specification clarifies a lot of questions that were left open in the 1.0 version, there are still some remaining issues:

Define a useful return value for .nextset() for the case where a new result set is available.

Create a fixed point numeric type for use as loss-less monetary and decimal interchange format.

Endnotes

1. As a guideline, the connection constructor parameters should be implemented as keyword parameters for more intuitive use and follow this order of parameters:

dsn = Data source name as string

user = User name as string (optional)

password = Password as string (optional)

host = Hostname (optional)

database = Database name (optional)

For example, a connect could look like this:

		`connect(dsn='myhost:MYDB',user='guido',password='234$')`

2. Module implementors should prefer numeric, named, or pyformat over the other formats because these offer more clarity and flexibility.


3. If the database doesn't support the functionality required by the method, the interface should throw an exception in case the method is used. The preferred approach is to not implement the method and thus have Python generate an `AttributeError` in case the method is requested. This allows the programmer to check for database capabilities using the standard `hasattr()` function. For some dynamically configured interfaces, it may not be appropriate to require dynamically making the method available. These interfaces should then raise a `NotSupportedError` to indicate the inability to perform the rollback when the method is invoked.

4. A database interface may choose to support named cursors by allowing a string argument to the method. This feature is not part of the specification, since it complicates semantics of the .fetchXXX() methods.


5. The module uses the `__getitem__` method of the `parameters` object to map either positions (integers) or names (strings) to parameter values. This allows for both sequences and mappings to be used as input. The term "bound" refers to the process of binding an input value to a database execution buffer. In practical terms, this means that the input value is used directly as a value in the operation. The client should not be required to "escape" the value so that it can be used; the value should be equal to the actual database value.

6. Note that the interface may implement row fetching using arrays and other optimizations. It's not guaranteed that a call to this method will move only the associated cursor forward by one row.

7. The rowcount attribute may be coded in a way that updates its value dynamically. This can be useful for databases that return usable rowcount values only after the first call to a .fetchXXX() method.
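Endnote 5 can be illustrated with a sketch that binds one operation from a sequence and another from a mapping. It assumes sqlite3, which accepts both "?" and ":name" placeholders; the table and values are invented. The second name contains a quote character and is passed unescaped, since binding uses the value directly.

```python
import sqlite3  # assumption: stands in for any DB-API 2.0 module

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE users (name TEXT, city TEXT)")

# A sequence: positions map to the "?" placeholders.
cursor.execute(
    "INSERT INTO users (name, city) VALUES (?, ?)",
    ("guido", "Reston"),
)

# A mapping: names map to the ":name" placeholders.
cursor.execute(
    "INSERT INTO users (name, city) VALUES (:name, :city)",
    {"name": "O'Brien", "city": "Dublin"},
)

cursor.execute("SELECT name FROM users ORDER BY name")
names = [row[0] for row in cursor.fetchall()]
connection.close()
```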
