Python

Python is another language that can easily be combined with PostgreSQL. Python's interface to PostgreSQL can easily be handled. In this section, you will learn how powerful applications can be built with the help of PostgreSQL.

A Short Overview of Python

In the past few years , Python has become a very widespread language. In this section, we'll try to explain the basic concepts of Python and take a brief look at this powerful, open -source programming language.

Python was born in December 1989. Guido van Rossum, the father of Python, was looking for a hobby to work on during the week around Christmas. Guido decided to write an interpreter that would be similar to ABC but would be more suitable for a UNIX/C hacker. The name Python was chosen because Guido is a real fan of Monty Python's Flying Circus. I guess back in 1989, nobody expected Python to become the programming language it is today.

The idea behind Python was to create a language that was not bound to UNIX and had some significant improvements over ABC. Some of ABC's features were aimed at novice programmers but were not suitable for the intended audience of experienced UNIX programmers. One point was also that the integrated editor of ABC was not liked by many people. Python, however, would rely on the infrastructure and conventions provided by a UNIX system. This made Python more accepted by advanced programmers.

But when designing Python, Guido was not only influenced by ABC. Some components were taken from C and Modula 3, an elegant programming language that many of you might have seen back in the early days.

Python uses indentation for statement grouping. This feature is inherited from ABC and is today one of the most contentious features of Python because not many other languages use indentation. In general, it is a religious question whether it is a good feature. According to Guido, indentation for statement grouping has two important advantages:

It reduces visual clutter and makes programs shorter because no lines are wasted for brackets. There is also no discussion about which is the best way to use brackets in the source code as there is for languages like C/C++.
Programmers have less freedom when formatting the code. At first glance, this sounds like a disadvantage , but it makes the code of various programmers look the same way. This makes programs easier to read and easier to understand.

It's up to you to decide whether you like "Python-style" programs. Guido von Rossum once said, "Rich syntax is more of a burden than a help." There should be only one simple way to achieve a target. Many programming languages, such as C, C++, and Perl (I guess Perl is in this case even " worse "), offer many ways to solve a problem. Depending on the knowledge of the programmer who has to read the code, some of these ways can be very hard to understand. The target of Python is to provide a programming platform that focuses on the reusability of code and the production of easier to understand programs. The first point is an especially important issue.

Python is an object-orientated programming language, so all components are clustered in objects. Object-orientation is one of the core features of Python because it is the best way to allow programmers to share their code. The way objects are accessed in the code does not differ from other programming languages, because the syntax used by other languages should be familiar to almost all programmers, and switching to Python should be an easy task.

Compared to most compiled languages, Python is a rather slow language. The Python community tries to improve the speed, but the main focus is not on increasing the speed of the language itself but decreasing the time it takes a programmer to write a program. Today, the main costs for establishing and running a program rise with the time it takes to write the program. The speed of hardware is constantly increasing, so programming time gains more and more importance. You will soon recognize that Python is a perfect language for prototyping and developing applications very fast.

Currently, you can choose between the pg module and the newer pgdb interface that is compliant with the DB-API 2.0 specification.

Now let's look at Python's interface to PostgreSQL.

Connecting

We will first look closer at how a connection can be established to the database, and how we can check whether the authentification process has failed.

In the following example, you can see how easy a connection to the database can be opened using Python:

 #!/usr/bin/python # importing the PostgreSQL module import pg # connecting to the database try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established" except:         print "Connection to the database could not be established"

We first import the Python module for PostgreSQL. In the next step, we try to connect to the database. The various connection parameters are passed to the function in brackets. You can see that the names of the parameters are directly derived from the C interface of PostgreSQL.

If an error occurs, a message is displayed. Let's execute the script:

 [hs@duron python]$  make  python script.py The connection has successfully been established

The connection has successfully been established. Python offers a second way to connect to PostgreSQL:

 #!/usr/bin/python # importing the PostgreSQL module import pg # connecting to the database try:         conn = pg.connect('mypython', 'localhost', 5432, None,                 None, 'hs', None)         print "The connection has successfully been established" except: cx         print "Connection to the database could not be established"

In the first example, we passed the name and the value of a parameter to the function. In this example, we use a fixed parameter list to connect . You can see which parameter has to be in which position in the following:

 connect([dbname], [host], [port], [opt], [tty], [user], [passwd])

If a parameter has to keep the default setting, we simply use None in the parameter list of the connect function.

To overwrite the default parameters, Python offers some very simple functions. In the next example, you an see how easy it is to set a default value of a variable:

 #!/usr/bin/python # importing the PostgreSQL module import pg pg.set_defbase('mypython') pg.set_defhost('localhost') pg.set_defport(5432) pg.set_defopt('-i') pg.set_deftty('tty3') try:         conn = pg.connect(user='hs')         print "The connection has successfully been established" except:         print "Connection to the database could not be established"

When running the script, the connection can easily be established, although the connect string only contains the username:

 [hs@duron python]$  make  python script.py The connection has successfully been established

In addition to setting default values, it is also possible to retrieve the value of the variables . Python offers functions that have names similar to those we used for setting the values (simply change the set in the name of the functions to get ). The following script sets the default variables to a certain value, connects to the database, and tells us the default values:

 #!/usr/bin/python # importing the PostgreSQL module import pg pg.set_defbase('mypython') pg.set_defhost('localhost') pg.set_defport(5432) pg.set_defopt('-i') pg.set_deftty('tty3') try:         conn = pg.connect(user='hs')         print "The connection has successfully been established"         print "default database: ", pg.get_defbase()         print "default host: ", pg.get_defhost()         print "default port: ", pg.get_defport()         print "default options: ", pg.get_defopt()         print "default tty: ", pg.get_deftty() except:         print "Connection to the database could not be established"

Let's execute the script:

 [hs@duron python]$  make  python script.py The connection has successfully been established default database:  mypython default host:  localhost default port:  5432L default options:  -i default tty:  tty3

Note

The functions just shown do not check environment variables; the values have to be set using the Python functions to be retrieved.

To close a connection explicitly, use the close() method.

pgobject and pgqueryobjectpgobject handles a connection to the database. The object contains several important variables we may need in our scripts:

 #!/usr/bin/python # importing the PostgreSQL module import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         print "current database: ", conn.db         print "current host: ", conn.host         print "current port: ", conn.port         print "current options: ", conn.options         print "current tty: ", conn.tty         print "current user: ", conn.user         print "status: ", conn.status         print "errors: ", conn.error except:         print "Connection to the database could not be established"

As you can see in the source code, pgobject allows us to access the parameters of the connection:

 [hs@duron python]$  make  python script.py The connection has successfully been established current database:  mypython current host:  localhost current port:  5432 current options: current tty: current user:  Deprecated facility status:  1 errors:

The only variable that can't be accessed in the previous example is the username on the database system. The status of the connection is 1, which means that everything is up and running. No errors have occurred.

After establishing a connection to the database, we will perform some simple SQL commands. First, we insert some sample data into the database:

 CREATE TABLE "persons" (         "name" text,         "birth" date,         "gender" character(1), PRIMARY KEY(name) ); CREATE TABLE "income" (         "name" text,         "year" integer,         "income" integer ); COPY "persons"  FROM stdin; Albert        1970-01-01        m John        1973-04-04        m Carla        1963-10-21        f \ . COPY "income"  FROM stdin; Albert        1998        28000 Albert        1999        30000 Jon        1998        20000 Jon        1999        40000 Carla        1998        30000 Carla        1999        32000 \ . CREATE  INDEX "income_name_key" on "income" using btree ( "name" "text_ops" );

When using Python, interacting with a PostgreSQL database is truly an easy task. In the next example, we will select all records in the income table and display them onscreen:

 #!/usr/bin/python # importing the PostgreSQL module import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         res = conn.query("SELECT name, year, income FROM income")         print res except:         print "An error with the database has occurred."

We send a query to the server with the help of the query method, which is defined for connections. If we want to display the complete result onscreen, we use a simple print command to do the job:

 [hs@duron python]$  make  python script.py The connection has successfully been established name  yearincome ------+----+------ Albert1998 28000 Albert1999 30000 Jon   1998 20000 Jon   1999 40000 Carla 1998 30000 Carla 1999 32000 (6 rows)

You can see that we do not even have to define a cursor or a loop to display the result; everything is done internally by Python. I would say that this is a very comfortable feature and makes writing small applications very easy indeed.

In many cases, displaying the entire result without processing the data returned, won't be enough. The more comfortable way is to process the query line-by-line . Every line returned should be a dictionary consisting of the fields selected by the SQL command:

 #!/usr/bin/python # importing the PostgreSQL module import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         rowindex = 0         sqlcommand = "SELECT * FROM income WHERE year=1998"         for res in  conn.query(sqlcommand).dictresult():                 rowindex = rowindex + 1                 print rowindex, res except:         print "An error with the database has occurred."

This time, we process the data using a simple loop. With the help of the dictresult method, we make the database return the data in a dictionary ”a very efficient data structure provided by Python. If we execute the script now, we will see what a dictionary looks like if it is displayed without being indexed:

 [hs@duron python]$  make  python script.py The connection has successfully been established 1 { 'year': 1998, 'income': 28000, 'name': 'Albert'} 2 { 'year': 1998, 'income': 20000, 'name': 'Jon'} 3 { 'year': 1998, 'income': 30000, 'name': 'Carla'}

In the previous listing, you can see that the names of the columns retrieved from the query are used as keys. Because we want the output to look more beautiful, we will now display the output using fixed fields with fixed length. This can be done by using print :

 #!/usr/bin/python import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         rowindex = 0         sqlcommand = "SELECT * FROM income WHERE year=1998"         for res in  conn.query(sqlcommand).dictresult():                 rowindex = rowindex + 1                 print '%(name)15s %(year)15s %(income)15d' % res except:         print "An error with the database has occurred."

Every field except the last one is displayed as a 15 character string; the last field will be displayed as decimal value:

 [hs@duron python]$  make  python script.py The connection has successfully been established          Albert            1998           28000             Jon            1998           20000           Carla            1998           30000

But the result does not have to be returned as a dictionary. Sometimes it is easier to process a list instead of a dictionary. In this case, getresult() has to be used. In the next listing, you can see how this can be done:

 #!/usr/bin/python import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         rowindex = 0         sqlcommand = "SELECT * FROM income WHERE year=1998"         for res in  conn.query(sqlcommand).getresult():                 rowindex = rowindex + 1                 print res                 print res[0], res[1], res[2], "\ n" except:         print "An error with the database has occurred."

We want the result to be printed line-by-line, we first display the whole list onscreen. Then we want the various elements of the list to be displayed:

 [hs@duron python]$  make  python script.py The connection has successfully been established ('Albert', 1998, 28000) Albert 1998 28000 ('Jon', 1998, 20000) Jon 1998 20000 ('Carla', 1998, 30000) Carla 1998 30000

Sometimes it is necessary to find out which fields are returned by the database. It might be useful for you to retrieve a list of all columns, especially if you are working with SELECT * statements. Python offers the listfields() method to do the job:

 #!/usr/bin/python import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         sqlcommand = "SELECT * FROM income WHERE year=1998"         res = conn.query(sqlcommand)         print "fields: ", res.listfields() except:         print "An error with the database has occurred."

The method returns the names of the three columns returned as a list:

 [hs@duron python]$  make  python script.py The connection has successfully been established fields:  ('name', 'year', 'income')

Of course, finding the names is not the only thing that can be done. In the next example, we present two additional examples you can use in your scripts:

 #!/usr/bin/python import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         sqlcommand = "SELECT * FROM income WHERE year=1998"         res = conn.query(sqlcommand)         for x in res.listfields():                 print res.fieldnum(x), " - ", x         print "\ nfield #2: ", res.fieldname(2), "\ n"         print res.dictresult() except:         print "An error with the database has occurred."

The first new function used in this script is called fieldnum() and returns the position of a field in the result. But the operation can also be done the other way around ” fieldname() returns the name of a field related to a given index.

Both functions can be used before extracting the result of the query. The output of the script is as follows :

 [hs@duron python]$  make  python script.py The connection has successfully been established 0  -  name 1  -  year 2  -  income field #2:  income  [{ 'year': 1998, 'income': 28000, 'name': 'Albert'} , { 'year': 1998, 'income': 20000, 'name': 'Jon'} , { 'year': 1998, 'income': 30000, 'name': 'Carla'} ]

The most interesting part of the output is the last line. We extract the whole result of the query with just one command, and we receive a data structure that contains all lines returned as a dictionary. Use Python's onboard functions to process the object returned.

Some scripts have to find the number of tuples returned by the query. The Python interface provides the ntuples function we will need in the next script:

 #!/usr/bin/python import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         sqlcommand = "SELECT * FROM income WHERE year=1998"         res = conn.query(sqlcommand)         print "tuples: ", res.ntuples() except:         print "An error with the database has occurred."

Our query returns three tuples; that is what we have the ntuples function expected to return:

 [hs@duron python]$  make  python script.py The connection has successfully been established tuples:  3

Sometimes you may find it useful to reset the connection to your database. Resetting means that the current connection is closed and opened with the same parameters as before:

 #!/usr/bin/python import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         conn.reset();         print "reset ..." except:         print "An error with the database has occurred."

We simply apply the reset() method on the connection handle to reset the connection and run the application again:

 [hs@duron python]$  make  python script.py The connection has successfully been established reset ...

Retrieving notifications from the backend is as easy as performing a reset:

 #!/usr/bin/python import pg try:         conn = pg.connect(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         notif = conn.getnotify()         print "notification: ", notif except:         print "An error with the database has occurred."

Because nothing happened , no notifications are returned by the backend:

 [hs@duron python]$  make  python script.py The connection has successfully been established notification:  None

The Database Wrapper Class `DB`

The DB class is a very powerful part of the pg module and contains many extremely useful functions. We will take a closer look at these functions in this section.

The first example shows you how we can retrieve a list of all databases installed on the system:

 #!/usr/bin/python import pg try:         conn = pg.DB(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         print conn.get_databases() except:         print "An error with the database has occurred."

Python returns the names of the databases in a list that can easily be used for further processing. get_databases() is an extremely useful function, especially for administration purposes. Imagine a situation where you have to work with dynamically created databases. With the help of simple Python scripts, checking these databases should be an easy task (in many cases, it is much easier than using a shell script).

 [hs@duron python]$  make  python script.py The connection has successfully been established ['myodbc', 'template1', 'template0', 'myjava', 'mypython', 'lebenslauf', 'db']

You can see that we have seven databases up and running on the test system.

Another useful method is pkey() . It is used to retrieve the name of the primary key defined on a table. If no primary key is defined or if the table does not exist in the database, an exception is caused.

The following script checks two tables to see whether a primary key is defined:

 #!/usr/bin/python import pg try:         conn = pg.DB(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         try:                 print "primary key (persons): ", conn.pkey('persons')         except:                 print "no primary key on table persons defined"         try:                 print "primary key (income): ", conn.pkey('income')         except:                 print "no primary key on table persons defined" except:         print "An error with the database has occurred."

name is the column used as primary key for the first table. The second table has no primary key, so the exception is caught and a message is displayed:

 [hs@duron python]$  make  python script.py The connection has successfully been established primary key (persons):  name primary key (income):  no primary key on table persons defined

Before looking for the primary key of a table, it might be useful to find out which tables can be found in a database. We can use the get_tables() method to do this:

 #!/usr/bin/python import pg try:         conn = pg.DB(dbname='mypython', host='localhost', user='hs')         print "The connection has successfully been established"         try:                 print "tables: ", conn.get_tables()         except:                 print "an error has occurred ..." except:         print "An error with the database has occurred."

Two tables can be found in the database to which we are currently connected:

 [hs@duron python]$  make  python script.py The connection has successfully been established tables:  ['persons', 'income']

The next function we want to present in this section is called get_attnames() and is used to find the attribute names of a table. The result is returned as a list. Look at the source code of the sample script:

 #!/usr/bin/python import pg try:         conn = pg.DB(dbname='mypython', host='localhost', user='hs')         print conn.get_attnames('income') except:         print "An error with the database has occurred."

The list generated by the function contains four fields. The table has only three columns, but four columns are returned. The mysterious fourth column we cannot see in the description of the table is the object ID of the rows of the table. The data type of this column is Oid :

 [hs@duron python]$  make  python script.py { 'income': 'int', 'name': 'text', 'oid': 'int', 'year': 'int'}

get() is a function you can use to get a tuple from a database table. In the following example, you can see how things work:

 #!/usr/bin/python import pg conn = pg.DB(dbname='mypython', host='localhost', user='hs') x = conn.get('income', '1998', 'year') print x

One record is retrieved from the table:

 [hs@duron python]$  make  python script.py { 'income': 28000, 'name': 'Albert', 'oid': 28963, 'year': 1998, 'oid_income': 28963}

The module also offers a very easy method to insert data into a table. Imagine that we want to insert the salary of Hans into the database. One way to insert the data is to send a simple SQL statement to the server. Another way (and I would consider this the more elegant way) is to use the insert() function:

 #!/usr/bin/python import pg try:         conn = pg.DB(dbname='mypython', host='localhost', user='hs')         ret=conn.insert('income', { 'income': 88000, 'name': 'Hans', 'year': 2002} )         res=conn.query("SELECT * FROM income WHERE name='Hans'").dictresult()         print "result: ", res except:         print "an error has occurred ..."

We pass a dictionary to the function and Python does the rest for us. To see if the data has been inserted successfully, we retrieve the new record from the table.

Let's see what is displayed by the script:

 [hs@duron python]$  make  python script.py result:  [{ 'year': 2002, 'income': 88000, 'name': 'Hans'} , { 'year': 2002, [ic:ccc]'income': 88000, 'name': 'Hans'} ]

Updating works pretty much the same way as selecting data. The parameters passed to the function are the name of the table that has to be updated and an object. The object should contain the object ID so that the value can be found in the database.

In the next example, we select the affected data from the table first and display it onscreen. Then we add 1000 to the income of Hans, perform the update operation, and query the table again to see if the variables have changed to the correct values:

 #!/usr/bin/python import pg try:         conn = pg.DB(dbname='mypython', host='localhost', user='hs')         res = conn.query("SELECT * FROM income WHERE name='Hans'").dictresult()         print "result before: ", res                 val = conn.get('income', '2002', 'year')         print "\ nval: ", val, "\ n"         val["income"] = val["income"] + 1000         oid = conn.update('income', val);         res = conn.query("SELECT * FROM income WHERE name='Hans'").dictresult()         print "result after: ", res except:         print "an error has occurred ..."

We execute the script to see what has happened:

 [hs@duron python]$  make  python script.py result before:  [{ 'year': 2002, 'income': 88000, 'name': 'Hans'} ] val:  { 'income': 88000, 'name': 'Hans', 'oid': 28974, 'year': 2002, 'oid_income': 28974} result after:  [{ 'year': 2002, 'income': 89000, 'name': 'Hans'} ]

The income of Hans is now 1000 units higher than before.

Other important functions supported by the DB class are

clear(table, [a]) ( [a] is a dictionary of values) is used to clear a database table, and delete(table, [a]) ( [a] is a dictionary of values) is used to delete the row from a table.

A Short Overview of Python

Connecting

The Database Wrapper Class DB

The Database Wrapper Class `DB`