Embedded SQL C Preprocessor (ECPG) | PostgreSQL Developer's Handbook2001

Just like Oracle, PostgreSQL provides a powerful preprocessor C. The idea is to have an easy and efficient interface for generating optimized C code. In this section, you will learn to use this interface efficiently .

The Basic Concepts of ECPG

One of the most powerful and comfortable programming interfaces is Embedded SQL C Preprocessor (ECPG). ECPG is a preprocessor that does nothing else but generate C code. Files containing ECPG code are converted to C code and can be compiled and linked like any other C program. The advantage of writing ECPG code over writing normal C code is that it is much easier to write ECPG code than to write a C program.

All those hardcore C hackers among you will think that this is not correct, but I guess simplicity was the reason why ECPG was introduced.

The package was initially written by Linus Tolke, who also maintained the package up to version 0.2. Currently, Michael Meskes develops and maintains ECPG. The software can be copied and distributed under the same terms as the rest of PostgreSQL.

Many database systems support embedded SQL, and there is also a description of an ANSI standard available. As you might have guessed, the developers of ECPG try to make ECPG as close to the standard as possible.

The advantage of taking the standard into consideration is that an application using embedded SQL can easily be ported from one database system to the other.

A First Example

In general, examples are much easier to understand than any theoretical description of a topic. We have included a very simple example that does nothing but compute the result of one plus one. The result is not displayed. In the first step, we just want to show you how you can connect to the database and execute a simple command. We create a file called first.pgc :

 #include <stdio.h> EXEC SQL BEGIN DECLARE SECTION; int result; EXEC SQL END DECLARE SECTION; int main() {         EXEC SQL CONNECT TO myecpg;         EXEC SQL SELECT (1+1) INTO :result;         return 0; }

You can see that ECPG code is a mixture of SQL and C. From my point of view, the code is easy to read and much easier to understand than a native C program.

Let's go through the code.

First, we include stdio.h ( stdio.h is the standard library for I/O operations). In the next step, we tell the preprocessor that the section for defining variables starts. In this section, we define an integer variable called result . The section for defining the variables ends, and we start with the main function, just as we would do in a "normal" C program. In the next step, we connect to the database called myecpg and execute the SQL command. The result is assigned to the result , the variable we have defined as integer before. Finally, the program returns and quits.

Now we have to run the preprocessor and convert the source code to C code. The output of the preprocessor is as follows :

 /* Processed by ecpg (2.8.0) */ /* These three include files are added by the preprocessor */ #include <ecpgtype.h> #include <ecpglib.h> #include <ecpgerrno.h> #line 1 "first.pgc" /* exec sql begin declare section */ #line 1 "first.pgc"    int  result   ; /* exec sql end declare section */ #line 2 "first.pgc" {  ECPGdo(__LINE__, NULL, "select  1 + 1     ", ECPGt_EOIT,         ECPGt_int,&(result),1L,1L,sizeof(int),         ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);} #line 4 "first.pgc" int main() {         return 0; }

We could have written the code ourselves , using the preprocessor is much easier than coding real C code. We use a Makefile to compile the program:

 INC=    /usr/include/pgsql LIB=    /usr/share/pgsql FILE=   first.pgc TEMP=   file.c x       :       $(FILE) Makefile         ecpg -o $(TEMP) $(FILE)         gcc -g -I $(INC) -o prog $(TEMP) -L $(LIB) -lecpg -lpq

In the first four lines of the Makefile, we define the variables we will use next.

$FILE contains the input file. $TEMP contains the name of the C file you want to generate by using the preprocessor.

$INC and $LIB contain the directory the linker will use to look for the files that are needed for linking.

If we execute the script now, nothing will be displayed. Let's write the result of the query to standard output:

 #include <stdio.h> EXEC SQL BEGIN DECLARE SECTION; int result; EXEC SQL END DECLARE SECTION; int main() {         EXEC SQL CONNECT TO myecpg;         EXEC SQL SELECT (1+1) INTO :result;         printf("result: %i\n", result);         return 0; }

We simply add printf that displays the result as an integer value to the program; we use normal C code for printing the output. We compile the code using make and execute the program again:

 [hs@duron ecpg]$  ./prog  result: 2

2 is the result most of us have expected.

Error Handling

Error handling is one of the most important things when writing a program. Imagine when we change the name of the database we want to connect to from myecpg to nodb . The database nodb does not exist on the system. If we execute the script, the result will be instead of 2 , and no error will be displayed. The script behaves the same way as it would if we had a working database called nodb . This is very dangerous indeed, and some sort of error and exception handling has to be implemented. If you want to include error and exception handling in your application, you have to add the line

 EXEC SQL INCLUDE sqlca;

to the application. The line makes sure that a struct and a variable called sqlca will be defined. We will look at the data structure of the sqlca struct later in this section.

Now we try to connect to a database that does not exist on the system:

 #include <stdio.h> EXEC SQL INCLUDE sqlca; EXEC SQL BEGIN DECLARE SECTION; int result; EXEC SQL END DECLARE SECTION; int main() {         EXEC SQL CONNECT TO ondb;         if      (sqlca.sqlcode)         {                 printf("Error connecting to database server.\n");                 printf("%i\n", sqlca.sqlcode);                 exit(0);         }         EXEC SQL SELECT (1+1) INTO :result;         printf("result: %i\n", result);         return 0; }

If an error occurs, sqlca.sqlcode will contain a non-zero value. In this case, we want the application to quit and print an error message onscreen. If we can connect to the database successfully, the program will work just like the one we saw in the previous section.

We have included the C code of the previous program here because it is a good chance to take a look at the definition of the sqlca struct:

 /* Processed by ecpg (2.8.0) */ /* These three include files are added by the preprocessor */ #include <ecpgtype.h> #include <ecpglib.h> #include <ecpgerrno.h> #line 1 "first.pgc" #include <stdio.h> #line 1 "/usr/include/pgsql/sqlca.h" #ifndef POSTGRES_SQLCA_H #define POSTGRES_SQLCA_H #define SQLERRMC_LEN        70 #ifdef __cplusplus extern                "C" { #endif         struct sqlca         {                 char                sqlcaid[8];                 long                sqlabc;                 long                sqlcode;                 struct                 {                         int                        sqlerrml;                         char                sqlerrmc[SQLERRMC_LEN];                 }                         sqlerrm;                 char                sqlerrp[8];                 long                sqlerrd[6];                 /* Element 0: empty                                                */                 /* 1: OID of processed tuple if applicable                        */                 /* 2: number of rows processed                                */                 /* after an INSERT, UPDATE or                                */                 /* DELETE statement                                        */                 /* 3: empty                                                */                 /* 4: empty                                                */                 /* 5: empty                                                */                 char                sqlwarn[8];                 /* Element 0: set to 'W' if at least one other is 'W'        */                 /* 1: if 'W' at least one character string                */                 /* value was truncated when it was                        */                 /* stored into a host variable.                                */                 /*                  * 2: if 'W' a (hopefully) non-fatal notice occurred                  *//* 3: empty */                 /* 4: empty                                                */                 /* 5: empty                                                */                 /* 6: empty                                                */                 /* 7: empty                                                */                 char                sqlext[8];         } ;         extern struct sqlca sqlca; #ifdef __cplusplus } #endif #endif #line 2 "first.pgc" /* exec sql begin declare section */ #line 5 "first.pgc"    int  result   ; /* exec sql end declare section */ #line 6 "first.pgc" int main() {         {  ECPGconnect(__LINE__, "ondb" , NULL,NULL , NULL, 0); } #line 10 "first.pgc"         if        (sqlca.sqlcode)         {                 printf("Error connecting to database server.\n");                 printf("%i\n", sqlca.sqlcode);                 exit(0);         }         {  ECPGdo(__LINE__, NULL, "select  ( 1 + 1 )     ", ECPGt_EOIT,         ECPGt_int,&(result),1L,1L,sizeof(int),         ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EORT);} #line 17 "first.pgc"         printf("result: %i\n", result);         return 0; }

You can see that the C code we have written is not affected by the preprocessor; only embedded SQL is processed and converted to C code. Although the ECPG code is still very brief, the C code is already quite long. But let's look at sqlca now.

One of the most important fields in the structure is sqlca.errm.sqlerrmc . It contains the error message as a string and not as a number. We recommend that your applications display both the ID and the description of error, because your applications can debugged more easily.

After running the preprocessor and compiling the C code, we can execute the software and, what a surprise, an error will be displayed:

 [hs@duron ecpg]$  ./prog  Error connecting to database server. -402

-402 means that the application was not able to open the database. A huge number of different error messages are provided by PostgreSQL. We have compiled a complete list of all error messages available in the Table 9.1.

Table 9.1. Error messages and Error Codes

Error Code and Error String	Comment
`-12` , Out of memory in line `%d` .	If that error occurs, all memory available on the system has been allocated.
`-200` , Unsupported type %s on line `%d` .	The preprocessor has most likely created something that does not work with the library used by the application. Check whether the versions of preprocessor and software are compatible or not.
`-201` , Too many arguments line `%d` .	The database returns more values than your application can handle. Check if you have defined enough host variables.
`-202` , Too few arguments line `%d` .	This error is the opposite of error `"-201` ." PostgreSQL does not return enough values for your application.
`-203` , Too many matches line `%d` .	The query returned too many rows. Maybe you want to `SELECselecTt` many rows into one variable that is not an array.
`-204` , Not correctly formatted `int` type: `%s` line `%d` .	The field returned by PostgreSQL is not an integer and cannot be converted to integer.
`-205` , Not correctly formatted unsigned type: `%s` line `%d` .	The variable returned cannot be used as unsigned value. The type of variable returned by PostgreSQL does not match the type you use in the application.
`-206` , Not correctly formatted floating point type: `%s` line `%d` .	The field returned cannot be interpreted as `float` .
`-207` , Unable to convert `%s` to `bool` on line `%d` .	The field in your PostgreSQL does not contain `"t` " or `"f` ", so it and can therefore not be converted to a `bool` .
`-208` , Empty query line `%d` .	`PGRES_EMPTY_QUERY` was returned.
`-220` , No such connection `%s` in line `%d` .	Your application tries to access a connection that does not exist anymore.
`-221` , Not connected in line `%d` .	Your application tries to access an existing connection that is closed.
`-230` , Invalid statement name `%s` in line `%d` .	The statement your application tries to execute has not been prepared.
`-400` , Postgres error: `%s` line `%d` .	This is an error message from the backend.
`-401` , Error in transaction processing line `%d` .	PostgreSQL tells us that we cannot start, commit, or rollback a transaction.
`-402` , connect: could not open database `%s` .	Connecting to your database does not work.
`100` , Data not found line `%d` .	It tells you that what you are querying cannot be found or that you have gone through the cursor.

All types of errors listed in Table 9.1 can be accessed with the help of the sqlca struct.

Connecting

Connecting to a PostgreSQL database by using ECPG can be done in more than just one way. You may wonder why we have not included this part at the beginning of this section, but some basics about error handling is essential for dealing with connections.

In the previous examples, we connected to the database simply by using a command such as:

 EXEC SQL CONNECT TO myecpg;

If the database is on a remote host or if the port the database is listening to is not the default one, it is necessary to use a different CONNECT string.

Two possibilities are provided to connect to the database:

 dbname[@server][:port]

 <tcpunix>:postgresql://server[:port][/dbname][?options]

In this section, we will present some practical examples that show how to connect to the database.

In the following example, we connect to the database called myecpg on localhost port number 5432:

 #include <stdio.h> EXEC SQL INCLUDE sqlca; int main() {         EXEC SQL CONNECT TO myecpg@localhost:5432;         if      (sqlca.sqlcode) {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0); }         return 0; }

The example works perfectly well if the postmaster has been started with the -i flag enabled. Otherwise , the connection can't be established because PostgreSQL is not allowed to listen to remote hosts . Every host that is defined explicitly in the code is considered to be a remote host.

The port does not have to be explicitly defined; it also works if we use the default settings for the port.

Depending on whether we want to connect via TCP or UNIX domain sockets, we can use two different methods for connecting. We have included full examples that can be directly compiled.

For connecting via TCP, use the following:

 #include <stdio.h> EXEC SQL INCLUDE sqlca; int main() {         EXEC SQL CONNECT TO tcp:postgresql://localhost:5432/myecpg;         if      (sqlca.sqlcode) {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0); }         return 0; }

If you want to use UNIX domain sockets instead, you can use the following:

 #include <stdio.h> EXEC SQL INCLUDE sqlca; int main() {         EXEC SQL CONNECT TO unix:postgresql://localhost:5432/myecpg";         if      (sqlca.sqlcode) {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0); }         return 0; }

Up to now, we have connected to the database as the default user. In most cases, this is not what you want to do. Defining a certain user and a password is a very important thing. In the following example, you can see how you can add the username to the CONNECT string. Let's try to connect to the database as user hs :

 #include <stdio.h> EXEC SQL INCLUDE sqlca; int main() {         EXEC SQL CONNECT TO myecpg user hs;         if      (sqlca.sqlcode) {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0); }         return 0; }

The password can be defined as follows:

 #include <stdio.h> EXEC SQL INCLUDE sqlca; int main() {         EXEC SQL CONNECT TO myecpg@localhost USER hs/apassword;         if      (sqlca.sqlcode) {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0); }         return 0; }

Make sure that the user who wants to connect to the database is existing and that he or she has enough rights.

Simple Queries and Retrieving Data

Now we want to write some small applications that retrieve data from the following table:

 CREATE TABLE os(id int4, name text); COPY "os"  FROM stdin; 1       Open Bsd 2       RedHat Linux 3       Windows ME 4       Solaris 5       QNX \.

The first application shows how you can retrieve all data and display it onscreen:

 #include <stdio.h> EXEC SQL INCLUDE sqlca; EXEC SQL BEGIN DECLARE SECTION; int id; char name[20]; EXEC SQL END DECLARE SECTION; int main() {         EXEC SQL CONNECT TO myecpg;         if        (sqlca.sqlcode)         {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0);         }         EXEC SQL PREPARE mystat FROM "SELECT id, name FROM os";         EXEC SQL DECLARE mycurs CURSOR FOR mystat;         EXEC SQL OPEN mycurs;         EXEC SQL WHENEVER NOT FOUND DO BREAK;         while        (1)         {                 EXEC SQL FETCH IN mycurs INTO :id, :name;                 printf("%i - %s\n", id, name);         }         EXEC SQL CLOSE mycurs;                EXEC SQL DISCONNECT;         return 0; }

Actually the code looks more difficult than it is. In the DECLARE section, we define two variables we use to store the result of the query. id will contain the first column, while name will be used for the second column of the result. In the next step, we establish a connection to the database and check whether the connection has been established successfully. If an error has occurred, the application quits. Then we prepare a statement called mystat that contains the query we want to execute. Because we expect the result of the query to have more than just one line, we have to declare a so-called cursor. Cursors are used to move inside the result of a query. In general, you go from the first line to the last line of the result step-by-step. In this case, the cursor can be processed using a simple WHILE loop.

If no result is found the application stops.

EXEC SQL FETCH IN mycurs INTO :id, :name; fetches one row of data. The first column of the result is assigned to id , the second column to name . We have to add a colon before the variables to tell the preprocessor that the C variables have to be used to assign the values. We display the values with a simple printf command. The first column is displayed as an integer, while the second column has to be displayed as a string. After executing the WHILE loop, we close the cursor and disconnect from the database. After running the preprocessor and compiling the C code, we try to execute the program:

 [hs@duron ecpg]$  ./prog  1 - Open Bsd 2 - RedHat Linux 3 - Windows ME 4 - Solaris 5 - QNX

All values have been displayed correctly.

Some data types supported by PostgreSQL are not native data types of the C programming language. In this case, it is necessary to perform simple cast operations. We do not want to go too deep into detail, but we want to include a short example that shows how a numeric value can be converted to double :

 #include <stdio.h> #include <stdlib.h> EXEC SQL INCLUDE sqlca; EXEC SQL BEGIN DECLARE SECTION; char id[20]; char * pEnd; double result; EXEC SQL END DECLARE SECTION; int main() {         EXEC SQL CONNECT TO myecpg;         if      (sqlca.sqlcode)         {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0);         }         EXEC SQL SELECT '1.22'::numeric(6,2) INTO :id;         printf("numeric as string: %s\n", id);         result = strtod (id ,&pEnd);         printf ("numeric as double: %lf\n",result);         return 0; }

We simply include stdlib.h and perform the cast by converting the text returned by the database to double . This works fine for values that are in the range of a valid double . If the numeric values stored in the database exceed the range of double , the whole thing will become a little bit more difficult.

If we execute the script, we can see that the conversion works without any problems:

 [hs@duron ecpg]$  ./prog  numeric as string: 1.22 numeric as double: 1.220000

In the previous example, we can simply use SELECT INTO to assign the result of the query to id because the query returns only one row. If we had a query returning more than just one row, we would have to define a cursor and process the result line-by-line .

A Final Example

Let's get back to the table containing the list of operating systems. To avoid some unnecessary page turning, we have included the data in the table again:

 CREATE TABLE os(id int4, name text); COPY "os"  FROM stdin; 1       Open Bsd 2       RedHat Linux 3       Windows ME 4       Solaris 5       QNX \.

We want to write a simple application where the user can select a record from the table. If the record does not exist, the user is asked to insert the record:

 #include <stdio.h> EXEC SQL INCLUDE sqlca; EXEC SQL BEGIN DECLARE SECTION; int id; char name[256]; char intext[10];                        // will contain the id of the                                         //  line we are looking for char inos[256];                                // name of the new OS char myquery[256];                        // will contain the query for searching char insql[256];                        // used to insert new OS int  flag=0;                                // if records found ... EXEC SQL END DECLARE SECTION; EXEC SQL WHENEVER SQLERROR sqlprint; int main() {         // connecting ...         EXEC SQL CONNECT TO myecpg;         if        (sqlca.sqlcode)         {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0);         }         // reading input data ...         printf("Insert an id: ");                scanf("%2s", intext);         sprintf(myquery, "SELECT id, name FROM os WHERE id='%s' ", intext);         printf("%s\n\n", myquery);         EXEC SQL PREPARE mystatement FROM :myquery;         if        (sqlca.sqlcode)         {                 printf("%s\n", sqlca.sqlerrm.sqlerrmc);                 exit(0);         }         EXEC SQL DECLARE mycursor CURSOR FOR mystatement;         EXEC SQL OPEN mycursor;         EXEC SQL WHENEVER NOT FOUND DO BREAK;         while        (1)         {                 EXEC SQL FETCH IN mycursor INTO :id, :name;                 flag=1;                 printf("records found: %i, %s\n\n", id, name);         }         EXEC SQL CLOSE mycursor;         EXEC SQL WHENEVER NOT FOUND CONTINUE;         if        (flag == 0)         {                 printf("Insert the name of the OS: ");                        scanf("%256s", inos);                 EXEC SQL INSERT INTO os (id, name) VALUES ( :intext, :inos);                 EXEC SQL COMMIT;                 if        (sqlca.sqlcode)                 {                         printf("%s\n", sqlca.sqlerrm.sqlerrmc);                         exit(0);                 }                 else                 {                         printf("Record successfully imported\n");                 }         }         EXEC SQL DISCONNECT;         return 0; }

We first declare all the variables we will need in the script. After connecting to the database, the program prompts for an ID. scanf inserts the value inserted by the user into the variable called intext . We declare a cursor and perform a SELECT statement to retrieve all records containing the ID the user has just entered. If records are found, flag is set to 1 so that the script does not ask for the name of the operating system anymore because the record already exists. The records are displayed, and the cursor is closed.

If no records are found, the program prompts for input again. This time, we have to insert the name of the new operating system.

The new record is inserted into the database, and the application quits.

Note

After inserting the record into the table, it is necessary to perform a COMMIT ; otherwise, the record will never be inserted into the database.

Finally, we disconnect from the database, and is returned by the application.

To compile the program, we can write a simple Makefile:

 INC=    /usr/include/pgsql LIB=    /usr/share/pgsql FILE=   retrieve.pgc TEMP=   file.c x       :       $(FILE) Makefile         ecpg $(FILE) -o $(TEMP)         gcc -g -I $(INC) -o prog $(TEMP) -L $(LIB) -lecpg -lpq

If we want to find what has to be done by make , we can use make -n :

 [hs@duron ecpg]$  make -n  ecpg retrieve.pgc -o file.c gcc -g -I /usr/include/pgsql -o prog file.c -L /usr/share/pgsql -lecpg -lpq

You can see that some additional compiler flags for linking are required. Keep this in mind when writing your own ECPG applications.

Let's execute the program:

 [hs@duron ecpg]$  ./prog  Insert an id: 3 SELECT id, name FROM os WHERE id='3' records found: 3, Windows ME

The ID 3 can already be found in the database, so the result is displayed.

 [hs@duron ecpg]$  ./prog  Insert an id: 93 SELECT id, name FROM os WHERE id='93' Insert the name of the OS: myOS Record successfully imported

98 cannot be found in the database, so the software prompts for some input, and the new record is inserted in the database:

 myecpg=#  SELECT * FROM os WHERE id=93;  id  name ----+------  93  myOS (1 row)