Python/PostgreSQL Interface Architecture | Using PostgreSQL with Python

Table of contents:

Client 3Query Processing

Client 3Query Processing

To execute a SQL command with Python's DB-API, you must first create a cursor. Don't confuse this cursor with a cursor created by PostgreSQL's DECLARE CURSOR command; they have some similarities, but they are certainly not the same thing, as you will see in this section.

You create a cursor object by calling a connection's cursor() function[1]. For example, if you have a connection named connect, you would create a cursor like this:

[1] It is possible, but extremely unlikely, that a call to connect.cursor() can throw a pgOperationalError exception. In fact, the only way that can happen is if somebody is messing around with the internals of a connection object; and we would never do that, would we?

cur = connect.cursor()

Notice that the cursor() function expects no arguments. You can create multiple cursor objects from the same connection; they operate independently, except that a commit() or rollback() executed on the connection will affect all cursors open on that connection.

The next client application (client3.py) shows the steps required to create a cursor, execute a command, and print the results (see Listing 17.3).

Listing 17.3. client3.pymain()

 1 #!/usr/bin/python
 2 #
 3 # File: client3.py
 4
 5 import pgdb
 6 import string
 7
 8 ############################################################
 9 def main( ):
 10 try:
 11 connection = pgdb.connect( database = "movies",
 12 user = "bruce",
 13 password = "cows" )
14 
 15 except Exception, e:
 16 print str( e )
 17 exit
 18 
 19 cur = connection.cursor()
 20 
 21 try:
 22 cur.execute( "SELECT * FROM customers" )
 23 process_results( cur )
 24 
 25 except StandardError, e:
 26 print str( e )
 27 
 28 cur.close()
 29 connection.close()
 30 exit

Listing 17.3 shows the main() procedure from client3.py. It starts by calling pgdb.connect() to establish a connection to the movies database. Lines 15 through 17 take care of any exceptions thrown by pgdb.connect(). I'll take a shortcut here by defining a single exception handler that can catch proper DB-API exceptions as well as the (apparently) erroneous exception thrown by the PostgreSQL interface.

At line 19, main() creates a new cursor object by calling connection.cursor(). It is very unlikely that this call to cursor() will fail, so we won't bother catching any exceptions. If cursor() does fail, Python will print a stack trace and an error message and abort the application.

Next, main() uses the cursor.execute() function to execute a simple SELECT command. If something goes wrong with this command, execute() will throw an exception. The text of the error message is encapsulated in the exception parameter (specifically, e.args). If the command completes without error, main() calls the process_result() function (see Listing 17.4) to display the result set.

After you have finished with the cursor object, you should close it by calling cur.close(). This is not strictly required because Python closes this object for you during garbage collection, but it's usually a good idea.

You should also close the connection object when you are done with it. Even though you can ignore the cursor.close() function, you should get into the habit of closing connection objects. In fact, before you call connection.close(), you should call connection.commit(). Why? Because the PostgreSQL DB-API interface does not run in "auto-commit" mode. When you first call pgdb.connect() to establish a connection, the connect() function silently executes a BEGIN command for you. That means that all commands that you execute belong to a single multistatement transaction until you either connection.commit() or connection.rollback(). If you fail to commit before you close a connection, any changes made in the most recent transaction are rolled back. Watch out for this-it will bite you if you aren't careful.

Now, let's look at the process_results() function (see Listing 17.4). This function is responsible for formatting and displaying the result of the SELECT command. We don't actually do any of the grunt work in process_results(); instead, I've factored the details into three helper functions.

Listing 17.4. client3.pyprocess_results()

32 ############################################################
33 def process_results( cur ):
34 
35 widths = []
36 rows = cur.fetchall()
37 cols = cur.description
38 
39 compute_widths( cur, widths, rows, cols )
40 print_headers( cur, widths, cols )
41 print_values( cur, widths, rows )

process_results() starts by defining an (empty) array that holds the display width for each column in the result set. I'll pass this array to the helper functions, so I'll define it here.

Next, process_results() calls the cursor.fetchall() function to retrieve all rows from the result set. The cursor.fetchall() function returns a sequence of sequences[2]. Each member of this sequence represents a single row. So, to get to the second column in the third row, you would use the following:

[2] If you're not familiar with Python, think of a "sequence of sequences" as "an array of arrays" or maybe as a "list of lists." They are not completely analogous, but close enough to understand that fetchall() returns a collection of collections.

print rows[2][1] # sequence indexes start at 0, not 1

Besides cursor.fetchall(), there are two other functions that return all or part of a result set. The cursor.fetchone() function fetches the next row in a result set. fetchone() returns a sequence or returns None if you have exhausted the result set. The cursor.fetchmany( [size=n] ) function returns the next n rows in the result set. If you omit the size parameter, fetchmany() will assume that n=5. If there are fewer than n rows remaining in the result set, fetchmany() will return all remaining rows. If the result set has been exhausted, fetchmany() will return None. Like fetchall(), fetchmany() returns a sequence of one or more sequences.

Notice that there is no way to go backward in the result set. You can't re-fetch a row after you have gone past it, nor can you "rewind" the result set to the beginning. If you need to move around in the result set, use fetchall() or declare a PostgreSQL cursor (not a DB-API cursor) and execute the FETCH commands yourself.

After you have retrieved all the rows in the result set, process_results() nabs the column metadata from cursor.description. Notice that cursor.description is a public data member, not a function. cursor.description is a list of seven-element lists. Table 17.2 shows the meaning of each sublist.

Table 17.2. `cursor.description` Metadata Values
Element	Meaning
0	Column name
1	Data type
2	Maximum display size
3	Server size (in bytes)
4	Precision (not used)
5	Scale (not used)
6	Null allowed? (not used)

Currently, the PyGreSQL DB-API implementation does not use the last three elements in the table (precision, scale, and null allowed?); they are always set to None. The data type member does not conform to the DB-API specification, but it's probably more useful that way. Data types are reported by their PostgreSQL names (char, oid, float4, and so on). The display size and server size elements are set to -1 for any variable-sized columns.

We will be using the column names a little later, so we store them in the local variable cols.

Now that you have access to the data (rows) and the metadata (cols), process_results() calls each of the helper functions in the right order. compute_widths() computes the width of each column name, storing the result in the widths array (see Listing 17.5). Next, print_headers() prints column headings. Finally, print_values() prints the entire result set.

Listing 17.5. client3.pycompute_widths()

43 ############################################################
44 def compute_widths( cur, widths, rows, cols ):
45
46 c = 0 
47 
48 for col in cols:
49 widths.append( len( col[0] ))
50 c = c + 1
51
52 r = 0
53 
54 for row in rows:
55 c = 0
56
57 for col in row:
58 if( len( str( col )) > widths[c] ):
59 widths[c] = len( str( col ))
60 c = c + 1
61 r = r + 1

The compute_widths() function computes the width of each column in the result set.

It starts by walking through the list of column names and appending the length of each name to the widths[] array. Remember, the caller (process_results()) gave you a complete metadata array in the cols parameter. Element 0 of each metadata list is the column name.

Next, compute_widths() has to find the widest value in each column of the result set. The caller gave it a list of all the rows in the result set in the rows parameter. As it processes each column in each row of the result set, compute_widths() increases the corresponding element in the widths[] array to its maximum required width.

Notice (in lines 58 and 59) that this function converts each data value into string form before calling the len() function. The result set can contain integer values, string values, float values, and so on. You can't invoke the len() function on a numeric value so you must convert them into string form first.

You can view the actual Python data types using the type() function:

>>> cur.execute( "SELECT * FROM pg_class" )
>>> c = 0
>>> for col in cur.fetchone():
... print cur.description[c][0], '	', col, '	', type(col)
... c = c+1
... 
relname pg_type 
reltype 71L 
relowner 1 
relam 0L 
relfilenode 1247L 
relpages 2 
reltuples 143.0 
reltoastrelid 0L 
reltoastidxid 0L 
relhasindex 1 
relisshared 0 
relkind r 
relnatts 17 
relchecks 0 
reltriggers 0 
relukeys 0 
relfkeys 0 
relrefs 0 
relhasoids 1 
relhaspkey 0 
relhasrules 0 
relhassubclass 0 
relacl None

Listing 17.6 shows the print_headers() function.

Listing 17.6. client3.pyprint_headers()

63 ############################################################
64 def print_headers( cur, widths, cols ):
65
66 c = 0;
67 
68 for col in cols:
69 print string.center( col[0], widths[c] ), 
70 c = c + 1
71 print
72
73 c = 0;
74 
75 for col in cur.description:
76 print '-' * widths[c], 
77 c = c + 1
78 print

print_headers() centers each column name within the width calculated by compute_widths(). You may have noticed that I have a dangling comma at the end of line 69 (and again at the end of line 76). Those aren't typosa dangling comma suppresses the new-line character that print would otherwise emit. I want all the column names to appear on the same line, so I'll suppress all new-lines until you get to line 71 (or 78 in the case of the second loop).

Following the column names, print_headers() prints a line of separator characters (hyphens). When you apply the multiply operator (*) to a string, as in line 76, the result is a string of repeated characters. I'll create the separator strings my "multiplying" a dash by the width of each column.

Listing 17.7 shows the remaining code in client3.py. The print_values() function loops through each row and column in the result set (rows). At line 89, convert each value to string form, left-justify it within the proper column, and print it.

Listing 17.7. client3.pyprint_values() and mainline

80 ############################################################
81 def print_values( cur, widths, rows ):
82
83 r = 0
84
85 for row in rows:
86 c = 0
87
88 for col in row:
89 print string.ljust( str(col), widths[c] ),
90 c = c + 1
91 r = r + 1
92 print
93
94 
95 ############################################################
96
97 main()

The mainline code (that is, the entry point for your client application) is at line 97we just call the main() function and exit when main() returns.

Now, run this application:

$ chmod a+x client3.py
$ ./client3.py 
id customer_name phone birth_date
-- -------------------- -------- ----------
1 Jones, Henry 555-1212 1970-10-10
2 Rubin, William 555-2211 1972-07-10
3 Panky, Henry 555-1221 1968-01-21
4 Wonderland, Alice N. 555-1122 1969-03-05
7 Grumby, Jonas None 1984-02-21

At this point, you know how to connect to a PostgreSQL server from Python, how to intercept errors, and how to process SELECT commands. In the next section, we'll develop an interactive command processor using Python and the Tkinter GUI module.

Part I: General PostgreSQL Use

Introduction to PostgreSQL and SQL

Working with Data in PostgreSQL

PostgreSQL SQL Syntax and Use

Performance

Part II: Programming with PostgreSQL

Introduction to PostgreSQL Programming

Extending PostgreSQL

PL/pgSQL

The PostgreSQL C APIlibpq

A Simpler C APIlibpgeasy

The New PostgreSQL C++ APIlibpqxx

Embedding SQL Commands in C Programsecpg

Using PostgreSQL from an ODBC Client Application

Using PostgreSQL from a Java Client Application

Using PostgreSQL with Perl

Using PostgreSQL with PHP

Using PostgreSQL with Tcl and Tcl/Tk

Using PostgreSQL with Python

Using PostgreSQL with Python
Python/PostgreSQL Interface Architecture
Prerequisites
Client 1Connecting to the Server
Client 2Adding Error Checking
Client 3Query Processing
Client 4An Interactive Command Processor
Summary

Npgsql: The .NET Data Provider

Other Useful Programming Tools

Part III: PostgreSQL Administration

Introduction to PostgreSQL Administration

PostgreSQL Administration

Internationalization and Localization

Security

Replicating PostgreSQL Data with Slony

Contributed Modules

Index