Client 4 - An Interactive Query Processor

Client 4 ”An Interactive Query Processor

At this point, you should have a pretty good understanding of how to use many of the libpq functions. There are two other issues I want to explore in this chapter: processing multiple result sets and asynchronous operations. Before we get to those, let's convert the previous client application ( client3d ) into an interactive query processor. After you've done that, you will have a good example of why you need to consider multiple result sets and asynchronous processing.

The next client that we want to build connects to a database and prompts you for a SQL command. You send the command to the server and display the results. You repeat this cycle (prompt, execute, display) until you enter the command quit .

You've already seen most of the code in this application; you are building on the client3d application. The important difference between client3d and client4 is that you use the GNU readline library to prompt the user for multiple commands (in client3d , the command text was hard-coded).

 1 /*  2 ** File:  client4.c  3 */  4  5 #include <stdlib.h>  6 #include <string.h>  7 #include <libpq-fe.h>  8 #include <readline/readline.h>  9 #include <readline/history.h> 10 11 typedef enum { FALSE, TRUE } bool;

Notice the two extra header files in this application. The readline/readline.h header file defines the interface to the GNU readline library. You may not be familiar with the name of the readline library; but if you are a Linux (or bash ) user, you probably know the user interface that it provides. When you use the readline library in your application, your users can enter SQL commands and correct their typing errors. I don't know about you, but I type faster backward than I do forward ”I hate using tools that don't let me correct typing mistakes.

The readline/history.h header file defines the interface to the GNU history library. readline and history work well together. The history library gives you an easy way to record SQL commands and recall them later.

I'll show you how to use readline and history a bit later.

 13 #define MAX_PRINT_LEN       40  14  15 static char separator[MAX_PRINT_LEN+1];  16  17 void print_result_set( PGresult * result )  18 {  19   int          col;  20   int          row;  21   int            * sizes;  22  23 /*  24 ** Compute the size for each column  25 */  26   sizes = (int *)calloc( PQnfields( result ), sizeof( int ));  27  28   for( col = 0; col < PQnfields( result ); col++ )  29   {  30     int     len = 0;  31  32     for( row = 0; row < PQntuples( result ); row++ )  33     {  34       if( PQgetisnull( result, row, col ))  35         len = 0;  36       else  37         len = PQgetlength( result, row, col );  38  39       if( len > sizes[col] )  40         sizes[col] = len;  41     }  42  43     if(( len = strlen( PQfname( result, col ))) > sizes[col] )  44       sizes[col] = len;  45  46     if( sizes[col] > MAX_PRINT_LEN )  47       sizes[col] = MAX_PRINT_LEN;  48   }  49  50 /*  51 ** Print the field names.  52 */  53   for( col = 0; col < PQnfields( result ); col++ )  54   {  55     printf( "%-*s ", sizes[col], PQfname( result, col ));  56   }  57  58   printf( "\n" );  59  60 /*  61 ** Print the separator line  62 */  63   memset( separator, '-', MAX_PRINT_LEN );  64  65   for( col = 0; col < PQnfields( result ); col++ )  66   {  67     printf( "%*.*s ", sizes[col], sizes[col], separator );  68   }  69  70   printf( "\n" );  71  72 /*  73 ** Now loop through each of the tuples returned by  74 ** our query and print the results.  75 */  76   for( row = 0; row < PQntuples( result ); row++ )  77   {  78     for( col = 0; col < PQnfields( result ); col++ )  79     {  80       if( PQgetisnull( result, row, col ))  81         printf( "%*s", sizes[col], "" );  82       else  83         printf( "%*s ", sizes[col], PQgetvalue( result, row, col ));  84     }  85  86     printf( "\n" );  87  88   }  89   printf( "(%d rows)\n\n", PQntuples( result ));  90   free( sizes );  91 }  92  93 void process_query( PGconn * connection, const char * query_text )  94 {  95   PGresult  *       result;  96  97   if(( result = PQexec( connection, query_text )) == NULL )  98   {  99     printf( "%s\n", PQerrorMessage( connection )); 100     return; 101   } 102 103   if( PQresultStatus( result ) == PGRES_TUPLES_OK ) 104   { 105     print_result_set( result ); 106   } 107   else if( PQresultStatus( result ) == PGRES_COMMAND_OK ) 108   { 109     printf( "%s", PQcmdStatus( result )); 110 111     if( strlen( PQcmdTuples( result ))) 112       printf( " - %s rows\n\n", PQcmdTuples( result )); 113     else 114       printf( "\n\n" ); 115   } 116   else 117   { 118       printf( "%s\n\n", PQresultErrorMessage( result )); 119   } 120 121   PQclear( result ); 122 }

The print_result_set() and process_query() functions in client4 are identical to those used in client3d . If you need a refresher on how these functions operate , look back to the previous example.

 124 int main( int argc, char * argv[] ) 125 { 126   PGconn  * connection; 127   char    * buf; 128 129 130   connection = PQconnectdb( argc > 1 ? argv[1] : "" ); 131 132   if( connection == NULL ) 133   { 134     printf( "Fatal error - unable to allocate connection\n" ); 135     exit( EXIT_FAILURE ); 136   } 137 138   if( PQstatus( connection ) == CONNECTION_OK ) 139   { 140 141     using_history(); 142     read_history( ".pg_history" ); 143 144     while(( buf = readline( "-->" )) != NULL ) 145     { 146       if( strncmp( buf, "quit", sizeof( "quit" ) - 1  ) == 0 ) 147       { 148         break; 149       } 150       else 151       { 152         if(strlen( buf ) != 0 ) 153         { 154           add_history( buf ); 155           process_query( connection, buf ); 156         } 157         free( buf ); 158       } 159     } 160 161     err = write_history( ".pg_history" ); 162 163   } 164   else 165   { 166     printf( "%s\n", PQerrorMessage( connection )); 167   } 168 169   PQfinish( connection ); 170 171   exit( EXIT_SUCCESS ); 172 }

The main() function differs significantly from client3d. The first change you might notice is how we handle command-line arguments. In previous examples, you were required to enter a connection string on the command-line. Now we are trying to be a bit more user-friendly, so the command-line argument is optional. If you provide a command-line argument, we assume that it is a connection string. If you don't, you pass an empty string to PQconnectdb() (see line 130) to indicate that you want to connect using default connection attributes.

The most significant change is the processing loop starting at line 141 and continuing through line 158. At line 141, you call a function named using_history() that initializes the GNU history library.

Just before exiting this application, you will call the write_history() function to write your command history to the $PWD/.pg_history file. The call to read_history() reads in any history records from previous invocations. Using write_history() and read_history() , you can maintain a command history across multiple invocations of client4. The read_history() and write_history() functions are part of the GNU history library.

At line 144, you prompt the user for a command using the readline() function. readline() is the primary function in the GNU readline library (no big surprise there). This function prints the prompt that provided ( --> ) and waits for you to enter a complete command. You can use the normal editing keys ( backspace , left and right arrows, and so on) to correct typing errors. You can also use the up- and down-arrow keys to scroll through command history. (See the readline man page for a complete list of editing options.) readline() returns a pointer to the null- terminated command string entered by the user. readline() will return a NULL pointer if the user presses the end-of-file key (usually Ctrl-D ).

Check for the quit command at line 146 and break out of the command-processing loop when you see it.

If you enter a non-blank command, you add the command to the history list and call process_query() to execute and display the results. You free() the buffer returned by readline() after you have finished processing the command.

At line 161, you write the history list to the .pg_history file. The next time you run this application, you will read the .pg_history file at startup.

This function finishes up by handling connection errors (at line 166), disconnecting from the server (line 166), and exiting.

You have to make a couple of minor changes to the makefile before you can build this application:

 ## ##  File:  Makefile ## ##         Rules for building libpq sample applications ## INCLUDES  += -I/usr/local/pg721/include CPPFLAGS  += $(INCLUDES) CFLAGS    +=  -g LDLIBS    += -L/usr/local/pgsql/lib -lpq LDFLAGS   += -g client1:        client1.o get_dflts:      get_dflts.o client4:        LDLIBS += -lreadline -ltermcap client4:        client4.o

The last two lines tell make that you need to link client4.o against the readline (and termcap ) libraries to build the client4 application ( termcap is required by the readline library).

Now let's build client4 and test it:

 $ make client4 cc -g -I/usr/local/pg721/include -c -o client4.o client4.c cc -g client4.o -L/usr/local/pgsql/lib -lpq -lreadline -ltermcap -o client4 $ ./client4 -->SELECT * FROM rentals; tape_id  rental_date customer_id -------- ----------- ----------- AB-12345  2002-07-01           1 AB-67472  2002-07-01           3 OW-41221  2002-07-01           1 (3 rows) -->quit $

Go ahead and play around with client4 a little. Try the editing keys; use the up-arrow key and down-arrow key to scroll through your history list. Notice that when you quit and reinvoke client4 , you can recall the commands entered in the previous session ^[4] .

^[4] If you find that your command history is not saved between sessions, it is probably because you don't have the permissions required to create the .pg_history file in your current directory.

Processing Multiple Result Sets

Now try an experiment. Run client4 and enter two commands on the same line, terminating the first command with a semicolon:

 $ client4 "dbname=movies" -->SELECT * FROM tapes; SELECT * FROM rentals tape_id  rental_date customer_id -------- ----------- ----------- AB-12345  2002-07-03           1 AB-67472  2002-07-03           3 OW-41221  2002-07-03           1 (3 rows) -->

Hmmm, there's a problem here. We executed two SELECT commands, but we only see the results of the last command.

This demonstrates a problem with the PQexec() function. PQexec() discards all result sets except for the last one.

Fortunately, it's not too difficult to fix this problem. Here is a replacement for the process_query() function that will correctly handle multiple result sets (this function appears in client4b.c in the sample source code):

 1 void process_query( PGconn * connection, const char * query_text )  2 {  3   PGresult  *       result;  4  5   if( PQsendQuery( connection, query_text ) == 0 )  6   {  7     printf( "%s\n", PQerrorMessage( connection ));  8     return;  9   } 10 11   while(( result = PQgetResult( connection )) != NULL ) 12   { 13     if( PQresultStatus( result ) == PGRES_TUPLES_OK ) 14     { 15       print_result_set( result ); 16     } 17     else if( PQresultStatus( result ) == PGRES_COMMAND_OK ) 18     { 19       printf( "%s", PQcmdStatus( result )); 20 21       if( strlen( PQcmdTuples( result ))) 22         printf( " - %s rows\n", PQcmdTuples( result )); 23       else 24         printf( "\n" ); 25     } 26     else 27     { 28       printf( "%s\n", PQresultErrorMessage( result )); 29     } 30 31     PQclear( result ); 32   } 33 }

In this version of process_query() , you split the command-processing effort into two steps. First, you send the command string to the server using the PQsendQuery() function. PQsendQuery() returns immediately after queuing the command ”it will not wait for results from the server. If PQsendQuery() cannot send the command string, it will return and you can find the error message by calling PQerrorMessage() .

The second step starts at line 11. You call PQgetResult() to obtain a result set from the server. Notice that you invoke PQgetResult() within a loop. PQgetResult() returns one result set for each command in the command string and returns NULL when there are no more result sets to process. The PQgetResult() function returns a pointer to a PGresult object ”we already know how to work with a Pgresult , so the rest of this function remains unchanged.

Now let's try to run this version ( client4b ):

 $ client4b "dbname=movies" -->SELECT * FROM tapes; SELECT * FROM rentals tape_id  title -------- ------------- AB-12345 The Godfather AB-67472 The Godfather MC-68873    Casablanca OW-41221  Citizen Kane AH-54706   Rear Window (5 rows) tape_id  rental_date customer_id -------- ----------- ----------- AB-12345  2002-07-03           1 AB-67472  2002-07-03           3 OW-41221  2002-07-03           1 (3 rows) -->quit $

This time, you get the results that you would expect: one result set for each command.

Asynchronous Processing

In the previous section, I mentioned that the PQsendQuery() function will not wait for a result set to be returned by the server. That can be an important feature for certain applications, particularly graphical (GUI) applications. In a GUI application, your code must remain responsive to the user even if you are waiting for results from a long-running SQL command. If you use PQexec() in a GUI application, you will find that the screen will not repaint while waiting for server results. The PQexec() function (and in fact most of the libpq functions) is synchronous ”the function will not return until the work has been completed.

In a GUI application, you need asynchronous functions, like PQsendQuery() . Things get a little more complex when you use asynchronous functions. Simply using PQsendQuery() is not enough to make your application responsive while waiting for results. Without doing some extra work, your application will still pause when you call the PQgetResult() function.

Here is a revised version of the process_query() function:

 1 void process_query( PGconn * connection, const char * query_text )  2 {  3   PGresult  *       result;  4  5   if( PQsendQuery( connection, query_text ) == 0 )  6   {  7     printf( "%s\n", PQerrorMessage( connection ));  8     return;  9   } 10 11   do 12   { 13     while( is_result_ready( connection ) == FALSE ) 14     { 15       putchar( '.' ); 16       fflush( stdout ); 17     } 18     printf( "\n" ); 19 20     if(( result = PQgetResult( connection )) != NULL ) 21     { 22       if( PQresultStatus( result ) == PGRES_TUPLES_OK ) 23       { 24         print_result_set( result ); 25       } 26       else if( PQresultStatus( result ) == PGRES_COMMAND_OK ) 27       { 28         printf( "%s", PQcmdStatus( result )); 29 30         if( strlen( PQcmdTuples( result ))) 31           printf( " - %s rows\n", PQcmdTuples( result )); 32         else 33           printf( "\n" ); 34       } 35       else 36       { 37         printf( "%s\n", PQresultErrorMessage( result )); 38       } 39       PQclear( result ); 40     } 41   } while( result != NULL ); 42 }

The important change to this version of process_query() starts at line 13. After sending the command to the server, you enter a loop that calls is_result_ready() . The is_result_ready() function waits for a result set to appear from the server. is_result_ready() will wait no longer than one second ”if a result set is not ready within one second, is_result_ready() will return FALSE . You simulate normal GUI processing here by printing a " . " for every second that we wait. (Okay, that's a pretty cheap imitation of a GUI don't you think?)

Now let's look at the is_result_ready() function:

 44 bool is_result_ready( PGconn * connection ) 45 { 46     int             my_socket; 47     struct timeval  timer; 48     fd_set          read_mask; 49 50     if( PQisBusy( connection ) == 0 ) 51       return( TRUE ); 52 53     my_socket = PQsocket( connection ); 54 55     timer.tv_sec  = (time_t)1; 56     timer.tv_usec = 0; 57 58     FD_ZERO( &read_mask ); 59     FD_SET( my_socket, &read_mask ); 60 61     if( select(my_socket + 1, &read_mask, NULL, NULL, &timer) == 0) 62     { 63       return( FALSE ); 64     } 65     else if( FD_ISSET( my_socket, &read_mask )) 66     { 67       PQconsumeInput( connection ); 68 69       if( PQisBusy( connection ) == 0 ) 70         return( TRUE ); 71       else 72         return( FALSE ); 73     } 74     else 75     { 76       return( FALSE ); 77     } 78 }

This is one of the most complex functions that we've seen in this chapter. You start (at line 50) by calling a the PQisBusy() function. PQisBusy() returns if a result set is ready for processing, and 1 if not.

If you find that a result set is not ready, you have more work to do. It might help to understand the details to come if you have a quick overview of where you are heading.

When you connect to a PostgreSQL server, the connection is represented by a PGconn object. You know that a PGconn object is opaque ”you can't look at the internals of the object to see what's inside. libpq provides one function that enables you to peek under the covers: PQsocket() . The PQsocket() returns the network socket that libpq uses to communicate with the server. We will use that socket to determine when data from the server becomes available.

Although server data is available it does not mean that a result set is ready. This is an important point. You may find that a single byte has been received from the server, but the result set is many megabytes in size. Once you know that some data is available, you have to let libpq peek at it. The PQconsumeInput() function (from libpq) reads all available server data and assembles it into a partial result set. After libpq has processed the available data, you can ask if an entire result set is ready for you.

That's the overview, now the details.

At line 53, you retrieve the client/server socket by calling PQsocket() . Remember, this is the socket that libpq uses to communicate with the server.

Next, you prepare to wait for data to become available from the server. At lines 55 and 56, set up a timer structure. You want to wait, at most, one second for data to become available from the server so you initialize the timer structure to indicate one second and zero microseconds. This is an arbitrary value ”if you want to be a bit more responsive, you can choose a shorter interval. If you want to consume fewer CPU cycles, you can choose a longer interval.

At lines 58 and 59, you initialize an fd_set structure. An fd_set is a data structure that represents a set of file (or, in our case, socket) descriptors. When you call select() , you must tell it which file descriptors (or socket descriptors) you are interested in. You want to listen for data on the socket you retrieved from PQsocket() , so you turn on the corresponding entry in the fd_set ^[5] .

^[5] This description might sound a bit mysterious . We programmers aren't supposed to know how an fd_set is implemented. The developers of the socket library want to hide the implementation so they can change it without our permission. We are only supposed to use a prescribed set of macros and functions to manipulate an fd_set . Think of an fd_set as a big set of bits. Each bit corresponds to a file/socket descriptor. When you call PQsocket() , it gives you back a number ”you want to turn on the bit corresponding to that number to tell select() that you are interested in activity on that socket. The FD_SET() macro turns on one bit. FD_ZERO() turns off all the bits. Now, if anyone asks, pretend that you don't know any of this stuff.

At line 61, you call the select() function. This function waits until any of the following occurs:

Data is ready on one of the file descriptors indicated in the read_mask .
The timer expires (that is, 1 second elapses).
A Unix signal is intercepted.

In other words, the select() function returns after waiting one second for data to become available on the my_socket socket. If data is ready before the timer expires, select() will return immediately.

When select() finally returns, you have to figure out which of the three previously-mentioned events actually occurred.

If select() returns zero, it's telling you that the timer expired without any activity on my_socket . In that case, you know that a result set can't possibly be ready so you return FALSE to your caller.

If select() returns something other than zero, you know that one of the file descriptors specified in read_mask has some data available. We'll be good little programmers here and use the FD_ISSET() macro to make sure that data is available on the my_socket socket. Practically speaking, there is only one descriptor enabled in read_mask , so you know that if any of the descriptors has data, it must be your descriptor.

At line 66, you know that some data is available from the server, but don't know if an entire result set is ready so you call PQconsumeInput() . PQconsumeInput() reads all data available from the server and stuffs that data into the result set that is being accumulated .

After that's done, you can call PQisBusy() again. PQisBusy() tells you whether a complete result set has been assembled . If PQisBusy() returns (meaning, no, the connection is not busy), you tell the caller that a result set is ready for processing. Otherwise, you return FALSE to indicate that more data is needed.

Lines 74 through 77 handle the case where a Unix signal interrupted the call to select() . There really isn't much to do in this case, so you just tell the caller that a result set is not ready for processing.

If you want to try this code, you will find it in the client4c.c source file. Here is a sample session:

 $ ./client4c dbname=movies -->SELECT COUNT(*) FROM pg_class, pg_attribute; ......... count ----- 96690 (1 rows) -->

Notice that it took nine seconds to execute this query (nine dots printed while we were waiting for the result set to come back from the server).

Besides the asynchronous command processing functions, libpq offers a way to make asynchronous connection attempts. I find that the asynchronous connection functions are overly complex for the limited benefits that they offer. In general, database connections are established in such a short period of time that I am willing to wait for the attempt to complete. If you find that a connection attempt is taking an excessive amount of time, you probably have a name server problem and I would rather fix that problem. If you do find that you need to make asynchronous connection attempts, see the PostgreSQL Programmer's Guide.