

21.2. Python Database Application Programmer's Interface (DB-API)

Where can one find the interfaces necessary to talk to a database? Simple. Just go to the database topics section at the main Python Web site. There you will find links to the full and current DB-API (version 2.0), existing database modules, documentation, the special interest group, etc. Since its inception, the DB-API has been moved into PEP 249. (This PEP obsoletes the old DB-API 1.0 specification which is PEP 248.) What is the DB-API?

The API is a specification that states a set of required objects and database access mechanisms to provide consistent access across the various database adapters and underlying database systems. Like most community-based efforts, the API was driven by strong need.

In the "old days," there were many databases and many people implementing their own database adapters. It was a wheel being reinvented over and over again. These databases and adapters were implemented at different times by different people without any consistency of functionality. Unfortunately, this meant that application code written against such an interface had to be customized to whichever database module was chosen, and any change to that interface meant corresponding updates to the application code.

A special interest group (SIG) for Python database connectivity was formed, and eventually, an API was born ... the DB-API version 1.0. The API provides for a consistent interface to a variety of relational databases, and porting code between different databases is much simpler, usually only requiring tweaking several lines of code. You will see an example of this later on in this chapter.

21.2.1. Module Attributes

The DB-API specification mandates that the features and attributes listed below must be supplied. A DB-API-compliant module must define the global attributes as shown in Table 21.1.

Table 21.1. DB-API Module Attributes

  Attribute               Description
  ----------------------  --------------------------------------------------------
  apilevel                Version of the DB-API with which the module is compliant
  threadsafety            Level of thread safety of this module
  paramstyle              SQL statement parameter style of this module
  connect()               Function to connect to the database
  (Various exceptions)    (See Table 21.4)


Data Attributes

apilevel

This string (not float) indicates the highest version of the DB-API the module is compliant with, i.e., "1.0", "2.0", etc. If absent, "1.0" should be assumed as the default value.
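For instance, you can read this attribute defensively, applying the mandated "1.0" default when it is absent; the standard library's sqlite3 module serves here only as a convenient DB-API-compliant example:

```python
import sqlite3

# apilevel is a string such as '1.0' or '2.0'; per the spec,
# assume '1.0' when a module does not define it at all.
level = getattr(sqlite3, 'apilevel', '1.0')
print(level)   # '2.0' for the standard library's sqlite3
```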

threadsafety

This is an integer with these possible values:

  • 0: Not threadsafe, so threads should not share the module at all

  • 1: Minimally threadsafe: threads can share the module but not connections

  • 2: Moderately threadsafe: threads can share the module and connections but not cursors

  • 3: Fully threadsafe: threads can share the module, connections, and cursors

If a resource is shared, a synchronization primitive such as a spin lock or semaphore is required for atomic-locking purposes. Disk files and global variables are not reliable for this purpose and may interfere with standard mutex operation. See the threading module or the chapter on multithreaded programming (Chapter 16) on how to use a lock.
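As a hedged sketch of this advice, the code below checks threadsafety on the standard library's sqlite3 module and serializes cursor work through a threading.Lock rather than a disk file or global variable:

```python
import sqlite3
import threading

print(sqlite3.threadsafety)    # an integer from 0 to 3; varies by build

# Serialize database work through a real synchronization primitive.
db_lock = threading.Lock()

def guarded_query(cxn, sql):
    with db_lock:              # only one thread runs a query at a time
        cur = cxn.cursor()
        try:
            cur.execute(sql)
            return cur.fetchall()
        finally:
            cur.close()

cxn = sqlite3.connect(':memory:')
print(guarded_query(cxn, 'SELECT 1'))   # [(1,)]
```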

paramstyle

The API supports a variety of ways to indicate how parameters should be integrated into an SQL statement that is eventually sent to the server for execution. This argument is just a string that specifies the form of string substitution you will use when building rows for a query or command (see Table 21.2).

Table 21.2. paramstyle Database Parameter Styles

  Parameter Style   Description                                     Example
  ----------------  ----------------------------------------------  -------------------
  numeric           Numeric positional style                        WHERE name=:1
  named             Named style                                     WHERE name=:name
  pyformat          Python dictionary printf() format conversion    WHERE name=%(name)s
  qmark             Question mark style                             WHERE name=?
  format            ANSI C printf() format conversion               WHERE name=%s
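To make the table concrete: the standard library's sqlite3 module uses the qmark style, so a parameterized query there looks like this (the equivalent format-style call, used by adapters such as MySQLdb, is shown only as a comment):

```python
import sqlite3

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE users (login VARCHAR(8), uid INTEGER)')

# qmark style: placeholders are literal question marks
cur.execute('INSERT INTO users VALUES (?, ?)', ('john', 7000))
cur.execute('SELECT uid FROM users WHERE login = ?', ('john',))
print(cur.fetchone())   # (7000,)

# format style (e.g., MySQLdb) would phrase the same query as:
#   cur.execute('SELECT uid FROM users WHERE login = %s', ('john',))
```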


Function Attribute(s)

connect()

Function access to the database is made available through Connection objects. A compliant module must implement a connect() function, which creates and returns a Connection object. Table 21.3 shows the arguments to connect().

Table 21.3. connect() Function Attributes

  Parameter   Description
  ----------  -----------------
  user        Username
  password    Password
  host        Hostname
  database    Database name
  dsn         Data source name


You can pass database connection information as a single string with multiple parameters (a DSN), as individual parameters passed positionally (if you know the exact order), or, more likely, as keyword arguments. Here is an example of using connect() from PEP 249:

  connect(dsn='myhost:MYDB',user='guido',password='234$')


The use of DSN versus individual parameters is based primarily on the system you are connecting to. For example, if you are using an API like ODBC or JDBC, you would likely be using a DSN, whereas if you are working directly with a database, then you are more likely to issue separate login parameters. Another reason for this is that most database adapters have not implemented support for DSN. Below are some examples of non-DSN connect() calls. Note that not all adapters have implemented the specification exactly, e.g., MySQLdb uses db instead of database.

  • MySQLdb.connect(host='dbserv', db='inv', user='smith')

  • PgSQL.connect(database='sales')

  • psycopg.connect(database='template1', user='pgsql')

  • gadfly.dbapi20.connect('csrDB', '/usr/local/database')

  • sqlite3.connect('marketing/test')


Exceptions

Exceptions that should also be included in the compliant module as globals are shown in Table 21.4.

Table 21.4. DB-API Exception Classes

  Exception          Description
  -----------------  ------------------------------------------
  Warning            Root warning exception class
  Error              Root error exception class
  InterfaceError     Database interface (not database) error
  DatabaseError      Database error
  DataError          Problems with the processed data
  OperationalError   Error during database operation execution
  IntegrityError     Database relational integrity error
  InternalError      Error that occurs within the database
  ProgrammingError   SQL command failed
  NotSupportedError  Unsupported operation occurred


21.2.2. Connection Objects

Connections are how your application gets to talk to the database. They represent the fundamental communication mechanism by which commands are sent to the server and results returned. Once a connection has been established (or a pool of connections), you create cursors to send requests to and receive replies from the database.

Methods

Connection objects are not required to have any data attributes but should define the methods shown in Table 21.5.

Table 21.5. Connection Object Methods

  Method Name                             Description
  --------------------------------------  -------------------------------------------
  close()                                 Close the database connection
  commit()                                Commit the current transaction
  rollback()                              Cancel (roll back) the current transaction
  cursor()                                Create (and return) a cursor or cursor-like object using this connection
  errorhandler(cxn, cur, errcls, errval)  Serves as a handler for the given connection's cursor


Once close() has been called, the same connection cannot be used again without raising an exception.

The commit() method is irrelevant if the database does not support transactions or if it has an auto-commit feature that has been enabled. You can implement separate methods to turn auto-commit off or on if you wish. Since this method is required as part of the API, databases that do not have the concept of transactions should just implement "pass" for this method.

Like commit(), rollback() only makes sense if transactions are supported in the database. After execution, rollback() should leave the database in the same state as it was when the transaction began. According to PEP 249, "Closing a connection without committing the changes first will cause an implicit rollback to be performed."

If the RDBMS does not support cursors, cursor() should still return an object that faithfully emulates or imitates a real cursor object. These are just the minimum requirements. Each individual adapter developer can always add special attributes specifically for their interface or database.

It is also recommended, but not required, for adapter writers to make all database module exceptions (see above) available via a connection. If they are not, it is assumed that Connection objects will throw the corresponding module-level exception. Once you have finished with your connection and closed its cursors, you should commit() any outstanding operations and close() your connection.
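A minimal sketch of that lifecycle with sqlite3, using try/finally so the connection is closed even if a statement fails:

```python
import sqlite3

cxn = sqlite3.connect(':memory:')
try:
    cur = cxn.cursor()
    cur.execute('CREATE TABLE t (n INTEGER)')
    cur.execute('INSERT INTO t VALUES (1)')
    cur.close()
    cxn.commit()   # commit before closing; per PEP 249, closing with
                   # uncommitted changes implies a rollback
finally:
    cxn.close()
print('connection closed cleanly')
```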

21.2.3. Cursor Objects

Once you have a connection, you can start talking to the database. As we mentioned above in the introductory section, a cursor lets a user issue database commands and retrieve rows resulting from queries. A Python DB-API cursor object functions as a cursor for you, even if cursors are not supported in the database. In this case, the database adapter creator must implement cursor objects so that they act like real cursors. This keeps your Python code consistent when you switch between database systems that do or do not have cursor support.

Once you have created a cursor, you can execute a query or command (or multiple queries and commands) and retrieve one or more rows from the results set. Table 21.6 shows data attributes and methods that cursor objects have.

Table 21.6. Cursor Object Attributes

  Object Attribute                    Description
  ----------------------------------  -----------------------------------------------
  arraysize                           Number of rows to fetch at a time with fetchmany(); defaults to 1
  connection                          Connection that created this cursor (optional)
  description                         Describes cursor activity as 7-item tuples: (name, type_code, display_size, internal_size, precision, scale, null_ok); only name and type_code are required
  lastrowid                           Row ID of last modified row (optional; if row IDs are not supported, defaults to None)
  rowcount                            Number of rows that the last execute*() produced or affected
  callproc(func[, args])              Call a stored procedure
  close()                             Close cursor
  execute(op[, args])                 Execute a database query or command
  executemany(op, args)               Like execute() and map() combined; prepare and execute a database query or command over the given arguments
  fetchone()                          Fetch next row of query result
  fetchmany([size=cursor.arraysize])  Fetch next size rows of query result
  fetchall()                          Fetch all (remaining) rows of a query result
  __iter__()                          Create iterator object from this cursor (optional; also see next())
  messages                            List of messages (set of tuples) received from the database for cursor execution (optional)
  next()                              Used by iterator to fetch next row of query result (optional; like fetchone(), also see __iter__())
  nextset()                           Move to next result set (if supported)
  rownumber                           Index of cursor (by row, 0-based) in current result set (optional)
  setinputsizes(sizes)                Set maximum input size allowed (required, but implementation optional)
  setoutputsize(size[, col])          Set maximum buffer size for large column fetches (required, but implementation optional)


The most critical attributes of cursor objects are the execute*() and fetch*() methods: all the service requests to the database are performed by these. The arraysize data attribute is useful for setting a default size for fetchmany(). Of course, closing the cursor when done is a good thing, and if your database supports stored procedures, you will be using callproc().
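For example, setting arraysize lets fetchmany() stream a large result set in fixed-size batches (sketched here with sqlite3):

```python
import sqlite3

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE nums (n INTEGER)')
cur.executemany('INSERT INTO nums VALUES (?)',
                [(i,) for i in range(7)])

cur.arraysize = 3   # default batch size for fetchmany()
cur.execute('SELECT n FROM nums ORDER BY n')
while True:
    batch = cur.fetchmany()        # fetches cur.arraysize rows
    if not batch:
        break
    print([n for (n,) in batch])   # [0, 1, 2], then [3, 4, 5], then [6]
```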

21.2.4. Type Objects and Constructors

Oftentimes, the interface between two different systems is the most fragile part. This is seen when converting Python objects to C types and vice versa. Similarly, there is also a fine line between Python objects and native database objects. As a programmer writing to Python's DB-API, the parameters you send to a database are given as strings, but the database may need to convert them to a variety of different supported data types that are correct for any particular query.

For example, should the Python string be converted to a VARCHAR, a TEXT, a BLOB, or a raw BINARY object, or perhaps a DATE or TIME object if that is what the string represents? Care must be taken to provide database input in the expected format, so another requirement of the DB-API is constructors that build special objects which can easily be converted to the appropriate database objects. Table 21.7 describes the classes that can be used for this purpose. SQL NULL values are mapped to and from Python's None.

Table 21.7. Type Objects and Constructors

  Type Object                          Description
  -----------------------------------  -----------------------------------------------------
  Date(yr, mo, dy)                     Object for a date value
  Time(hr, min, sec)                   Object for a time value
  Timestamp(yr, mo, dy, hr, min, sec)  Object for a timestamp value
  DateFromTicks(ticks)                 Date object given number of seconds since the epoch
  TimeFromTicks(ticks)                 Time object given number of seconds since the epoch
  TimestampFromTicks(ticks)            Timestamp object given number of seconds since the epoch
  Binary(string)                       Object for a binary (long) string value
  STRING                               Object describing string-based columns, e.g., VARCHAR
  BINARY                               Object describing (long) binary columns, e.g., RAW, BLOB
  NUMBER                               Object describing numeric columns
  DATETIME                             Object describing date/time columns
  ROWID                                Object describing "row ID" columns
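The sqlite3 module supplies several of these constructors; here, Binary() wraps raw bytes so they are stored as a BLOB rather than as text (a minimal sketch):

```python
import sqlite3

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE files (name VARCHAR(16), data BLOB)')

# Binary() marks the payload as a binary value for the adapter.
payload = sqlite3.Binary(b'\x00\x01\x02')
cur.execute('INSERT INTO files VALUES (?, ?)', ('raw', payload))

cur.execute('SELECT data FROM files WHERE name = ?', ('raw',))
print(bytes(cur.fetchone()[0]))   # b'\x00\x01\x02'
```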


Changes to API Between Versions

Several important changes were made when the DB-API was revised from version 1.0 (1996) to 2.0 (1999):

  • Required dbi module removed from API

  • Type objects were updated

  • New attributes added to provide better database bindings

  • callproc() semantics and return value of execute() redefined

  • Conversion to class-based exceptions

Since version 2.0 was published, some additional optional DB-API extensions, which you read about above, were added in 2002. There have been no other significant changes to the API since it was published. Discussion of the API continues on the DB-SIG mailing list. Topics brought up over the last several years include possibilities for the next version of the DB-API, tentatively named DB-API 3.0, such as the following:

  • Better return value for nextset() when there is a new result set

  • Switch from float to Decimal

  • Improved flexibility and support for parameter styles

  • Prepared statements or statement caching

  • Refine the transaction model

  • State the role of API with respect to portability

  • Add unit testing

If you have strong feelings about the API, feel free to participate and join in the discussion. Here are some references you may find handy.

  • http://python.org/topics/database

  • http://www.linuxjournal.com/article/2605

  • http://wiki.python.org/moin/DbApi3

21.2.5. Relational Databases

So, you are now ready to go. A burning question must be, "Interfaces to which database systems are available to me in Python?" That inquiry is similar to, "Which platforms is Python available for?" The answer is, "Pretty much all of them." Following is a list that is fairly comprehensive, though not exhaustive:

Commercial RDBMSs

  • Informix

  • Sybase

  • Oracle

  • MS SQL Server

  • DB/2

  • SAP

  • Interbase

  • Ingres

Open Source RDBMSs

  • MySQL

  • PostgreSQL

  • SQLite

  • Gadfly

Database APIs

  • JDBC

  • ODBC

To find a current list of what databases are supported, check out:

http://python.org/topics/database/modules.html

21.2.6. Databases and Python: Adapters

For each of the databases supported, there exists one or more adapters that let you connect to the target database system from Python. Some databases, such as Sybase, SAP, Oracle, and SQL Server, have more than one adapter available. The best thing to do is to find out which one fits your needs best. Questions to ask about each candidate: How good is its performance? How useful are its documentation and/or Web site? Does it have an active community? What is the overall quality and stability of the driver? Keep in mind that most adapters provide just the basic necessities to get you connected to the database; it is the extras that you may be looking for. Also keep in mind that you are responsible for higher-level code such as threading and thread management, as well as management of database connection pools, etc.

If you want a less hands-on approach (for example, if you wish to do as little SQL or database administration as possible), then you may wish to consider object-relational mappers, covered later in this chapter.

Let us now look at some examples of how to use an adapter module to talk to a relational database. The real secret is in setting up the connection. Once you have this and use the DB-API objects, attributes, and object methods, your core code should be pretty much the same regardless of which adapter and RDBMS you use.

21.2.7. Examples of Using Database Adapters

First, let us look at some sample code, from creating a database to creating a table and using it. We present examples using MySQL, PostgreSQL, and SQLite.

MySQL

We will use MySQL as the example here, along with the only MySQL Python adapter: MySQLdb, aka MySQL-python. In the various bits of code, we will also show you (deliberately) examples of error situations so that you have an idea of what to expect, and what you may wish to create handlers for.

We first log in as an administrator to create a database and grant permissions, then log back in as a normal client.

  >>> import MySQLdb
  >>> cxn = MySQLdb.connect(user='root')
  >>> cxn.query('DROP DATABASE test')
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  _mysql_exceptions.OperationalError: (1008, "Can't drop
  database 'test'; database doesn't exist")
  >>> cxn.query('CREATE DATABASE test')
  >>> cxn.query("GRANT ALL ON test.* to ''@'localhost'")
  >>> cxn.commit()
  >>> cxn.close()


In the code above, we did not use a cursor. Some adapters' Connection objects can execute SQL queries directly with a query() method, but not all of them can. We recommend either not using it or checking your adapter to make sure it is available.

The commit() was optional for us as auto-commit is turned on by default in MySQL. We then connect back to the new database as a regular user, create a table, and perform the usual queries and commands using SQL to get our job done via Python. This time we use cursors and their execute() method.

The next set of interactions shows us creating a table. An attempt to create it again (without first dropping it) results in an error.

  >>> cxn = MySQLdb.connect(db='test')
  >>> cur = cxn.cursor()
  >>> cur.execute('CREATE TABLE users(login VARCHAR(8), uid INT)')
  0L


Now we will insert a few rows into the database and query them out.

  >>> cur.execute("INSERT INTO users VALUES('john', 7000)")
  1L
  >>> cur.execute("INSERT INTO users VALUES('jane', 7001)")
  1L
  >>> cur.execute("INSERT INTO users VALUES('bob', 7200)")
  1L
  >>> cur.execute("SELECT * FROM users WHERE login LIKE 'j%'")
  2L
  >>> for data in cur.fetchall():
  ...     print '%s\t%s' % data
  ...
  john    7000
  jane    7001


The last bit features updating the table, either updating or deleting rows.

  >>> cur.execute("UPDATE users SET uid=7100 WHERE uid=7001")
  1L
  >>> cur.execute("SELECT * FROM users")
  3L
  >>> for data in cur.fetchall():
  ...     print '%s\t%s' % data
  ...
  john    7000
  jane    7100
  bob     7200
  >>> cur.execute('DELETE FROM users WHERE login="bob"')
  1L
  >>> cur.execute('DROP TABLE users')
  0L
  >>> cur.close()
  >>> cxn.commit()
  >>> cxn.close()


MySQL is one of the most popular open source databases in the world, and it is no surprise that a Python adapter is available for it. Keep in mind that, aside from sqlite3 (introduced in Python 2.5 and covered below), no database adapters are available in the Python standard library; adapters are third-party packages that must be downloaded and installed separately from Python. Please see the References section toward the end of the chapter to find out how to download it.

PostgreSQL

Another popular open source database is PostgreSQL. Unlike MySQL, there are no less than three current Python adapters available for Postgres: psycopg, PyPgSQL, and PyGreSQL. A fourth, PoPy, is now defunct, having contributed its project to combine with that of PyGreSQL back in 2003. Each of the three remaining adapters has its own characteristics, strengths, and weaknesses, so it would be a good idea to practice due diligence to determine which is right for you.

The good news is that the interfaces are similar enough that you can create an application that, say, measures the performance between all three (if that is a metric that is important to you). Here we show you the setup code to get a Connection object for each:

psycopg

  >>> import psycopg
  >>> cxn = psycopg.connect(user='pgsql')


PyPgSQL

  >>> from pyPgSQL import PgSQL
  >>> cxn = PgSQL.connect(user='pgsql')


PyGreSQL

  >>> import pgdb
  >>> cxn = pgdb.connect(user='pgsql')


Now comes some generic code that will work for all three adapters.

  >>> cur = cxn.cursor()
  >>> cur.execute('SELECT * FROM pg_database')
  >>> rows = cur.fetchall()
  >>> for i in rows:
  ...     print i
  >>> cur.close()
  >>> cxn.commit()
  >>> cxn.close()


Finally, you can see how their outputs are slightly different from one another.

PyPgSQL

  sales
  template1
  template0


psycopg

  ('sales', 1, 0, 0, 1, 17140, '140626', '3221366099', '', None, None)
  ('template1', 1, 0, 1, 1, 17140, '462', '462', '', None, '{pgsql=C*T*/pgsql}')
  ('template0', 1, 0, 1, 0, 17140, '462', '462', '', None, '{pgsql=C*T*/pgsql}')


PyGreSQL

  ['sales', 1, 0, False, True, 17140L, '140626', '3221366099', '', None, None]
  ['template1', 1, 0, True, True, 17140L, '462', '462', '', None, '{pgsql=C*T*/pgsql}']
  ['template0', 1, 0, True, False, 17140L, '462', '462', '', None, '{pgsql=C*T*/pgsql}']


SQLite

For extremely simple applications, using files for persistent storage usually suffices, but the most complex and data-driven applications demand a full relational database. SQLite targets the intermediate systems and indeed is a hybrid of the two. It is extremely lightweight and fast, plus it is serverless and requires little or no administration.

SQLite has seen rapid growth in popularity, and it is available on many platforms. The introduction of the pysqlite database adapter into Python 2.5 as the sqlite3 module marks the first time the Python standard library has featured a database adapter in any release.

It was bundled with Python not because it was favored over other databases and adapters, but because it is simple, uses files (or memory) as its backend store like the DBM modules do, does not require a server, and does not have licensing issues. It is simply an alternative to the similar persistent storage solutions already included with Python, one that happens to have a SQL interface.

Having a module like this in the standard library allows users to develop rapidly in Python using SQLite, then migrate to a more powerful RDBMS such as MySQL, PostgreSQL, Oracle, or SQL Server for production purposes if this is their intention. Otherwise, it makes a great solution to stay with for those who do not need all that horsepower.
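Because sqlite3 can also use memory as its backend store, you can prototype with no file (and no server) at all by passing the special ':memory:' database name:

```python
import sqlite3

# ':memory:' gives a throwaway database that lives only as long as
# the connection does; handy for prototyping and quick tests.
cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE users (login VARCHAR(8), uid INTEGER)')
cur.execute("INSERT INTO users VALUES ('john', 100)")
cur.execute('SELECT * FROM users')
rows = cur.fetchall()
print(rows)   # [('john', 100)]
cxn.close()
```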

Although the database adapter is now provided in the standard library, you still have to download the actual database software yourself. However, once you have installed it, all you need to do is start up Python (and import the adapter) to gain immediate access:

  >>> import sqlite3
  >>> cxn = sqlite3.connect('sqlite_test/test')
  >>> cur = cxn.cursor()
  >>> cur.execute('CREATE TABLE users(login VARCHAR(8), uid INTEGER)')
  >>> cur.execute('INSERT INTO users VALUES("john", 100)')
  >>> cur.execute('INSERT INTO users VALUES("jane", 110)')
  >>> cur.execute('SELECT * FROM users')
  >>> for eachUser in cur.fetchall():
  ...     print eachUser
  ...
  (u'john', 100)
  (u'jane', 110)
  >>> cur.execute('DROP TABLE users')
  <sqlite3.Cursor object at 0x3d4320>
  >>> cur.close()
  >>> cxn.commit()
  >>> cxn.close()


Okay, enough of the small examples. Next, we look at an application similar to our earlier example with MySQL, but which does a few more things:

  • Creates a database (if necessary)

  • Creates a table

  • Inserts rows into the table

  • Updates rows in the table

  • Deletes rows from the table

  • Drops the table

For this example, we will use two other open source databases. SQLite has become quite popular of late. It is very small, lightweight, and extremely fast for all the most common database functions. Another database involved in this example is Gadfly, a mostly SQL-compliant RDBMS written entirely in Python. (Some of the key data structures have a C module available, but Gadfly can run without it [slower, of course].)

Some notes before we get to the code. Both SQLite and Gadfly require the user to give the location to store database files (while MySQL has a default area and does not require this information from the user). The most current incarnation of Gadfly is not yet fully DB-API 2.0 compliant and, as a result, is missing some functionality, most notably the cursor attribute rowcount used in our example.

Database Adapter Example Application

In the example below, we want to demonstrate how to use Python to access a database. In fact, for variety, we added support for three different database systems: Gadfly, SQLite, and MySQL. We are going to create a database (if one does not already exist), then run through various database operations such as creating and dropping tables, and inserting, updating, and deleting rows. Example 21.1 will be duplicated for the upcoming section on ORMs as well.

Example 21.1. Database Adapter Example (ushuffle_db.py)

This script performs some basic operations using a variety of databases (MySQL, SQLite, Gadfly) and a corresponding Python database adapter.

1   #!/usr/bin/env python
2
3   import os
4   from random import randrange as rrange
5
6   COLSIZ = 10
7   RDBMSs = {'s': 'sqlite', 'm': 'mysql', 'g': 'gadfly'}
8   DB_EXC = None
9
10  def setup():
11      return RDBMSs[raw_input('''
12  Choose a database system:
13
14  (M)ySQL
15  (G)adfly
16  (S)QLite
17
18  Enter choice: ''').strip().lower()[0]]
19
20  def connect(db, dbName):
21      global DB_EXC
22      dbDir = '%s_%s' % (db, dbName)
23
24      if db == 'sqlite':
25          try:
26              import sqlite3
27          except ImportError, e:
28              try:
29                  from pysqlite2 import dbapi2 as sqlite3
30              except ImportError, e:
31                  return None
32
33          DB_EXC = sqlite3
34          if not os.path.isdir(dbDir):
35              os.mkdir(dbDir)
36          cxn = sqlite3.connect(os.path.join(dbDir, dbName))
37
38      elif db == 'mysql':
39          try:
40              import MySQLdb
41              import _mysql_exceptions as DB_EXC
42          except ImportError, e:
43              return None
44
45          try:
46              cxn = MySQLdb.connect(db=dbName)
47          except DB_EXC.OperationalError, e:
48              cxn = MySQLdb.connect(user='root')
49              try:
50                  cxn.query('DROP DATABASE %s' % dbName)
51              except DB_EXC.OperationalError, e:
52                  pass
53              cxn.query('CREATE DATABASE %s' % dbName)
54              cxn.query("GRANT ALL ON %s.* to ''@'localhost'" % dbName)
55              cxn.commit()
56              cxn.close()
57              cxn = MySQLdb.connect(db=dbName)
58
59      elif db == 'gadfly':
60          try:
61              from gadfly import gadfly
62              DB_EXC = gadfly
63          except ImportError, e:
64              return None
65
66          try:
67              cxn = gadfly(dbName, dbDir)
68          except IOError, e:
69              cxn = gadfly()
70              if not os.path.isdir(dbDir):
71                  os.mkdir(dbDir)
72              cxn.startup(dbName, dbDir)
73      else:
74          return None
75      return cxn
76
77  def create(cur):
78      try:
79          cur.execute('''
80            CREATE TABLE users (
81              login VARCHAR(8),
82              uid INTEGER,
83              prid INTEGER)
84          ''')
85      except DB_EXC.OperationalError, e:
86          drop(cur)
87          create(cur)
88
89  drop = lambda cur: cur.execute('DROP TABLE users')
90
91  NAMES = (
92      ('aaron', 8312), ('angela', 7603), ('dave', 7306),
93      ('davina', 7902), ('elliot', 7911), ('ernie', 7410),
94      ('jess', 7912), ('jim', 7512), ('larry', 7311),
95      ('leslie', 7808), ('melissa', 8602), ('pat', 7711),
96      ('serena', 7003), ('stan', 7607), ('faye', 6812),
97      ('amy', 7209),
98  )
99
100 def randName():
101     pick = list(NAMES)
102     while len(pick) > 0:
103         yield pick.pop(rrange(len(pick)))
104
105 def insert(cur, db):
106     if db == 'sqlite':
107         cur.executemany("INSERT INTO users VALUES(?, ?, ?)",
108             [(who, uid, rrange(1, 5)) for who, uid in randName()])
109     elif db == 'gadfly':
110         for who, uid in randName():
111             cur.execute("INSERT INTO users VALUES(?, ?, ?)",
112                 (who, uid, rrange(1, 5)))
113     elif db == 'mysql':
114         cur.executemany("INSERT INTO users VALUES(%s, %s, %s)",
115             [(who, uid, rrange(1, 5)) for who, uid in randName()])
116
117 getRC = lambda cur: cur.rowcount if hasattr(cur, 'rowcount') else -1
118
119 def update(cur):
120     fr = rrange(1, 5)
121     to = rrange(1, 5)
122     cur.execute(
123         "UPDATE users SET prid=%d WHERE prid=%d" % (to, fr))
124     return fr, to, getRC(cur)
125
126 def delete(cur):
127     rm = rrange(1, 5)
128     cur.execute('DELETE FROM users WHERE prid=%d' % rm)
129     return rm, getRC(cur)
130
131 def dbDump(cur):
132     cur.execute('SELECT * FROM users')
133     print '\n%s%s%s' % ('LOGIN'.ljust(COLSIZ),
134         'USERID'.ljust(COLSIZ), 'PROJ#'.ljust(COLSIZ))
135     for data in cur.fetchall():
136         print '%s%s%s' % tuple([str(s).title().ljust(COLSIZ) \
137             for s in data])
138
139 def main():
140     db = setup()
141     print '*** Connecting to %r database' % db
142     cxn = connect(db, 'test')
143     if not cxn:
144         print 'ERROR: %r not supported, exiting' % db
145         return
146     cur = cxn.cursor()
147
148     print '\n*** Creating users table'
149     create(cur)
150
151     print '\n*** Inserting names into table'
152     insert(cur, db)
153     dbDump(cur)
154
155     print '\n*** Randomly moving folks',
156     fr, to, num = update(cur)
157     print 'from one group (%d) to another (%d)' % (fr, to)
158     print '\t(%d users moved)' % num
159     dbDump(cur)
160
161     print '\n*** Randomly choosing group',
162     rm, num = delete(cur)
163     print '(%d) to delete' % rm
164     print '\t(%d users removed)' % num
165     dbDump(cur)
166
167     print '\n*** Dropping users table'
168     drop(cur)
169     cur.close()
170     cxn.commit()
171     cxn.close()
172
173 if __name__ == '__main__':
174     main()

Line-by-Line Explanation
Lines 1-18

The first part of this script imports the necessary modules, creates some global "constants" (the column size for display and the set of databases we are supporting), and features the setup() function, which prompts the user to select the RDBMS to use for any particular execution of this script.

The most notable constant here is DB_EXC, which stands for DataBase EXCeption. This variable will eventually be assigned the database exception module for the specific database system that the user chooses to run this application with. In other words, if the user chooses MySQL, DB_EXC will be _mysql_exceptions, etc. If we developed this application in more of an object-oriented fashion, this would simply be an instance attribute, i.e., self.db_exc_module or something like that.

Lines 20-75

The guts of consistent database access happens here in the connect() function. At the beginning of each section, we attempt to load the requested database modules. If a suitable one is not found, None is returned to indicate that the database system is not supported.

Once a connection is made, all other code is database- and adapter-independent and should work across all connections. (The only exception in our script is insert().) In all three subsections of this block of code, you will notice that a valid connection should be passed back as cxn.

If SQLite is chosen (lines 24-36), we attempt to load a database adapter. We first try to load the standard library's sqlite3 module (Python 2.5+). If that fails, we look for the third-party pysqlite2 package. This is to support 2.4.x and older systems with the pysqlite adapter installed. If a suitable adapter is found, we then check to ensure that the directory exists because the database is file based. (You may also choose to create an in-memory database.) When the connect() call is made to SQLite, it will either use one that already exists or make a new one using that path if it does not.
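The fallback import described above can be sketched as follows (a minimal version, using an in-memory database to sidestep the directory check):

```python
try:
    import sqlite3                      # standard library, Python 2.5+
except ImportError:
    try:
        # Older systems with the third-party pysqlite adapter installed.
        from pysqlite2 import dbapi2 as sqlite3
    except ImportError:
        sqlite3 = None                  # no adapter: caller returns None

if sqlite3 is not None:
    # ':memory:' creates an in-memory database, so no path is needed;
    # with a file path, connect() opens an existing file or creates it.
    cxn = sqlite3.connect(':memory:')
```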

MySQL (lines 38-57) uses a default area for its database files and does not require this to come from the user. Our code attempts to connect to the specified database. If an error occurs, it could mean either that the database does not exist or that it does exist but we do not have permission to see it. Since this is just a test application, we elect to drop the database altogether (ignoring any error if the database does not exist), and re-create it, granting all permissions after that.

The last database supported by our application is Gadfly (lines 59-75). (At the time of writing, this database is mostly but not fully DB-API-compliant, and you will see this in this application.) It uses a startup mechanism similar to that of SQLite: it starts up with the directory where the database files should be. If it is there, fine, but if not, you have to take a roundabout way to start up a new database. (Why this is, we are not sure. We believe that the startup() functionality should be merged into that of the constructor gadfly.gadfly().)

Lines 77-89

The create() function creates a new users table in our database. If there is an error, it is almost always because the table already exists. If this is the case, we drop the table and re-create it by calling this function recursively. This code is dangerous: if the re-creation of the table still fails, you will recurse infinitely until your application runs out of memory. You will fix this problem in one of the exercises at the end of the chapter.

The table is dropped from the database with the one-liner drop().
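One way to avoid the recursion hazard entirely, sketched here against sqlite3 with the schema implied by the text (login, uid, prid), is to drop before creating so a second failure simply propagates:

```python
import sqlite3

def create(cur):
    # Dropping first makes create() safe to call repeatedly, without
    # the recursive retry (and its potential infinite loop).
    cur.execute('DROP TABLE IF EXISTS users')
    cur.execute(
        'CREATE TABLE users (login VARCHAR(8), uid INTEGER, prid INTEGER)')

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
create(cur)
create(cur)    # no error the second time
```

This is only an illustrative alternative, not the book's solution; the exercise asks you to devise your own fix.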

Lines 91-103

This is probably the most interesting part of the code outside of the database activity. It consists of a constant set of names and user IDs followed by the generator randName(), whose code can be found in Chapter 11 (Functions), Section 11.10. The NAMES constant is a tuple that must be converted to a list for use with randName() because the generator alters it, randomly removing one name at a time until the list is exhausted. If NAMES were a list, we could use it only once. Instead, we make it a tuple and copy it to a fresh list, to be consumed each time the generator is used.
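The tuple-to-throwaway-list pattern can be sketched like this (the names below are hypothetical placeholders, not the book's actual NAMES data):

```python
from random import randrange

# Hypothetical stand-in for the book's NAMES constant.
NAMES = ('aaron', 'angela', 'dave', 'davina', 'elliot')

def randName():
    # Work on a throwaway copy so the original tuple survives and can
    # seed the generator again on the next use.
    pick = list(NAMES)
    while pick:
        # Remove one randomly chosen name per iteration until exhausted.
        yield pick.pop(randrange(len(pick)))
```

Each call to randName() yields every name exactly once, in random order, and the generator can be re-created as many times as needed.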

Lines 105-115

The insert() function is the only other place where database-dependent code lives, and the reason is that each database is slightly different in one way or another. For example, both the adapters for SQLite and MySQL are DB-API-compliant, so both of their cursor objects have an executemany() function, whereas Gadfly does not, so rows have to be inserted one at a time.

Another quirk is that both SQLite and Gadfly use the qmark parameter style while MySQL uses format. Because of this, the format strings are different. If you look carefully, however, you will see that the arguments themselves are created in a very similar fashion.

What the code does is this: for each name-userID pair, it assigns that individual to a project group (given by its project ID or prid). The project ID is chosen randomly out of four different groups (randrange(1,5)).
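Both quirks, executemany() availability and the qmark-versus-format parameter styles, can be sketched in one function (a simplified stand-in for the book's insert(), exercised here against sqlite3; the sample users are hypothetical):

```python
import sqlite3
from random import randrange

def insert(cur, db, users):
    # users: iterable of (login, uid) pairs; each is assigned a random
    # project group 1-4, as described above.
    rows = [(login, uid, randrange(1, 5)) for login, uid in users]
    if db == 'mysql':                   # 'format' paramstyle
        cur.executemany('INSERT INTO users VALUES(%s, %s, %s)', rows)
    elif db == 'sqlite':                # 'qmark' paramstyle
        cur.executemany('INSERT INTO users VALUES(?, ?, ?)', rows)
    else:                               # gadfly: no executemany()
        for row in rows:
            cur.execute('INSERT INTO users VALUES(?, ?, ?)', row)

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE users (login VARCHAR(8), uid INTEGER, prid INTEGER)')
insert(cur, 'sqlite', [('aaron', 8312), ('angela', 8255)])
```

Note how the row data is built identically in every branch; only the SQL placeholder syntax and the bulk-versus-single execution differ.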

Line 117

This single line is a conditional expression (read: the Python ternary operator) that returns the rowcount of the last operation (in terms of rows altered); or, if the cursor object does not support this attribute (meaning it is not DB-API-compliant), it returns -1.
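For reference, the conditional-expression form reads like this (the two cursor classes are hypothetical stand-ins for illustration):

```python
# The Python 2.5+ conditional-expression (ternary) form of line 117.
getRC = lambda cur: cur.rowcount if hasattr(cur, 'rowcount') else -1

class Compliant:        # hypothetical cursor exposing rowcount
    rowcount = 3

class NonCompliant:     # hypothetical cursor lacking rowcount
    pass
```

getRC(Compliant()) yields the rowcount, while getRC(NonCompliant()) falls back to -1.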

Conditional expressions were added in Python 2.5, so if you are using 2.4.x or older, you will need to convert it back to the "old-style" way of doing it:

  getRC = lambda cur: (hasattr(cur, 'rowcount') \
      and [cur.rowcount] or [-1])[0]


If you are confused by this line of code, don't worry about it. Check the FAQ to see why this is, and get a taste of why conditional expressions were finally added to Python in 2.5. If you are able to figure it out, then you have developed a solid understanding of Python objects and their Boolean values.

Lines 119-129

The update() and delete() functions randomly choose folks from one group. If the operation is update, move them from their current group to another (also randomly chosen); if it is delete, remove them altogether.
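A minimal sketch of both operations, assuming the qmark paramstyle and the (login, uid, prid) schema implied by the text (the seeded rows are hypothetical):

```python
import sqlite3
from random import randrange

def update(cur):
    fr = randrange(1, 5)        # source group, chosen at random
    to = randrange(1, 5)        # destination group (may equal fr)
    cur.execute('UPDATE users SET prid=? WHERE prid=?', (to, fr))
    return fr, to, cur.rowcount

def delete(cur):
    rm = randrange(1, 5)        # group whose members are removed
    cur.execute('DELETE FROM users WHERE prid=?', (rm,))
    return rm, cur.rowcount

cxn = sqlite3.connect(':memory:')
cur = cxn.cursor()
cur.execute('CREATE TABLE users (login VARCHAR(8), uid INTEGER, prid INTEGER)')
cur.executemany('INSERT INTO users VALUES(?, ?, ?)',
                [('aaron', 8312, 1), ('angela', 8255, 2)])
fr, to, moved = update(cur)
rm, removed = delete(cur)
```

In both functions, cur.rowcount reports how many rows the statement affected, which is exactly what main() prints.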

Lines 131-137

The dbDump() function pulls all rows from the database, formats them for printing, and displays them to the user. The print statement to display each user is the most obfuscated, so let us take it apart.

First, you should see that the data were extracted after the SELECT by the fetchall() method. As we iterate over each user row, we take the three columns (login, uid, prid), convert them to strings (if they are not already), titlecase them, and format each to be COLSIZ columns wide, left-justified (padded with spaces on the right). Since the code generating these three strings is a list comprehension, we need to convert the result to a tuple for the string format operator (%).
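The formatting step can be tried in isolation on a single hypothetical row, with no database involved (a generator expression inside tuple() works the same as the book's list comprehension):

```python
COLSIZ = 10
row = ('john', 4311, 2)     # hypothetical (login, uid, prid) row
line = '%s%s%s' % tuple(str(s).title().ljust(COLSIZ) for s in row)
```

Each of the three fields occupies exactly COLSIZ columns, so stacked rows line up under the LOGIN/USERID/PROJ# header.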

Lines 139-174

The director of this movie is main(). It makes the individual calls to each of the functions described above, defining how this script works (assuming it does not exit early, due either to not finding a database adapter or to not being able to obtain a connection [lines 143-145]). The bulk of it should be fairly self-explanatory given the proximity of the print statements. The last bits of main() close the cursor, then commit and close the connection. The final lines of the script are the usual idiom for starting it.



Core Python Programming (2nd Edition)
ISBN: 0132269937
Year: 2004
Pages: 334
Authors: Wesley J Chun