9.2 PLpgSQL | PHP and PostgreSQL. Advanced Web Programming2002

9.2 PL/pgSQL

In this section you will learn to implement simple functions based on PL/pgSQL. PL/pgSQL is an embedded language provided by PostgreSQL for writing extensions to PostgreSQL. Before you can start working with PL/pgSQL and PostgreSQL, you must add the language to the database explicitly by using a command-line tool:

 bash-2.04$ createlang plpgsql phpbook

If no error occurred, the database called phpbook now supports PL/pgSQL.

9.2.1 The Structure of PL/pgSQL Functions

Usually a PL/pgSQL function consists of three components as shown in the next listing:

 [<<label>>] [DECLARE     declarations] BEGIN     statements END;

First, a label is defined and after that a section for declaring variables can be added. Finally, the block containing the instructions of the function must be added. This block starts with BEGIN and stops with END;. The label and the DECLARE block of the function are optional.

9.2.2 A Simple Example

After this theoretical overview about PL/pgSQL, let's take a look at an example. The goal is to write a simple function for computing the average of two numeric values:

 CREATE OR REPLACE FUNCTION average (numeric, numeric) RETURNS numeric AS '         DECLARE                 result numeric;         BEGIN                 result := ($1 + $2) / 2;                 RETURN result;         END; ' LANGUAGE 'plpgsql';

After the function has been added to the database, it can be executed:

 phpbook=# SELECT average(1, 2);    average --------------  1.5000000000 (1 row)

Let's take a closer look at the code of the function. First, a variable is defined within a DECLARE block. In the next step the two values passed to the function are averaged. The result is returned to the calling function by using a RETURN statement.

9.2.3 Loops and Conditions

In many cases a block of code has to be executed more than once. In this situation, you can use loops. Loops are strongly related to condition. The code within a loop is executed as long as certain conditions are fulfilled; otherwise, it will be terminated.

One type of loop that is commonly used in many programming languages is called a FOR loop. Almost all programming languages support some kind of FOR loops, and so does PL/pgSQL. A FOR loop is executed as long as a variable is within a certain range of values.

The next example shows a function for generating the geometric mean of a range of values:

 CREATE OR REPLACE FUNCTION geomean (numeric, numeric) RETURNS numeric AS '         DECLARE                 i       int4;                 result  numeric;         BEGIN                 result := 0;                 FOR i IN $1..$2 LOOP                         result := i * i + result;                 END LOOP;                 RETURN sqrt(result);         END; ' LANGUAGE 'plpgsql';

First, two variables are defined in the DECLARE section. In the next step result is set to zero. The FOR loop is executed as long as i is within the range defined by the first and second parameter passed to the function.

After you have inserted the function into the database, it can be accessed by any SQL statement:

 phpbook=# SELECT geomean(2, 5);    geomean --------------  7.3484692283 (1 row)

The geometric mean of the numbers from 2 to 5 will be the result of the function. To check whether the result is correct, you can send a SQL statement to the server:

 phpbook=# SELECT sqrt(4+9+16+25::numeric);      sqrt --------------  7.3484692283 (1 row)

As you can see, the result is equal to the one generated by the geomean function.

What happens if the parameters are passed to the function in a different order? In the next example you can see what comes out in this case:

 phpbook=# SELECT geomean(5, 2);    geomean --------------  0.0000000000 (1 row)

The result is zero because the loop is never entered by PL/pgSQL. To get around this problem, it is necessary to add a check to the function that swaps the values if the second parameter is lower than the first parameter passed to the function.

Let's try to swap the content of $1 and $2:

 CREATE OR REPLACE FUNCTION geomean (numeric, numeric) RETURNS numeric AS '         DECLARE                 i       int4;                 result  numeric;                 tmp     numeric;         BEGIN                 IF      $1 > $2 THEN                         tmp := $1;                         $1 := $2;                         tmp := $2;                 END IF;                 result := 0;                 FOR i IN $1..$2 LOOP                         result := i * i + result;                 END LOOP;                 RETURN sqrt(result);         END; ' LANGUAGE 'plpgsql';

The function cannot be executed successfully because $1 and $2 are constant values.

Unlike many other languages, there is no way to modify the input parameters of a PL/pgSQL function, as you can see in the next listing:

 phpbook=# SELECT geomean(5, 2); ERROR:  $1 is declared CONSTANT NOTICE:  plpgsql: ERROR during compile of geomean near line 8 NOTICE:  plpgsql: ERROR during compile of geomean near line 8 ERROR:  $1 is declared CONSTANT

To make the function work properly, it is necessary to assign the content of $1 and $2 to temporary variables. This will lead to a minor slowdown but it is the only way to make the function work correctly:

 CREATE OR REPLACE FUNCTION geomean (numeric, numeric) RETURNS numeric AS '         DECLARE                 i       int4;                 result  numeric;                 first   numeric;                 second  numeric;                 tmp     numeric;         BEGIN                 first := $1;                 second := $2;                 IF      first > second THEN                         tmp := first;                         first := second;                         second := tmp;                 END IF;                 result := 0;                 FOR i IN first..second LOOP                         result := i * i + result;                 END LOOP;                 RETURN sqrt(result);         END; ' LANGUAGE 'plpgsql';

After inserting the function again, the result will be computed correctly:

 phpbook=# SELECT geomean(5, 2);    geomean --------------  7.3484692283 (1 row)

As you have seen in the function, it is necessary to add an IF statement to the function. In PL/pgSQL, IF works pretty much the same way as it does in PHP. A condition is checked and if it is fulfilled, the IF block will be entered. The main difference between PHP's and PL/pgSQL's IF function is that in PL/pgSQL, no parentheses are needed because PL/pgSQL is a block-oriented language. This is also similar to Python, which also does not force the user to use parentheses.

WHILE loops are another type of loops provided by PL/pgSQL. Let's try to implement the same function with the help of a WHILE loop:

 CREATE OR REPLACE FUNCTION geomean (numeric, numeric) RETURNS numeric AS '         DECLARE                 result  numeric;                 i       numeric;         BEGIN                 i := $1;                 result := 0;                 WHILE   i <= $2 LOOP                         result := i * i + result;                         i := i + 1;                 END LOOP;                 RETURN sqrt(result);         END; ' LANGUAGE 'plpgsql';

This time the value of the first parameter is assigned to i. In the next step a WHILE loop is processed until i is equal to $2. The most important part in the code of the function is the line where i is incremented. Without this line the loop would never stop processing because the condition used to stop the loop would always be true. Endless loops are a danger for your programs, so make sure that there is always a way out of the loops in your applications.

Let's test the function you have just seen:

 phpbook=# SELECT geomean(2, 5);    geomean --------------  7.3484692283 (1 row)

The result is correct.

In some cases it might be necessary to quit the execution of a block within a function. Therefore you can use EXIT. If the condition passed to the EXIT function is fulfilled, the loop is terminated. Let's take a look at an example:

 CREATE OR REPLACE FUNCTION myexit (numeric, numeric) RETURNS numeric AS '         DECLARE                 i       int4;         BEGIN                 FOR i IN $1..$2 LOOP                         EXIT WHEN i = 2;                         RETURN i;                 END LOOP;                 RETURN 99;         END; ' LANGUAGE 'plpgsql';

If i equals 2, the loop will be terminated and PL/pgSQL will continue executing the code after the loop, which is in this case a simple RETURN statement:

 phpbook=# SELECT myexit(2, 5);  myexit --------      99 (1 row)

9.2.4 Comments

Comments can be used to make your source code clearer. The more documentation you add to your programs, the easier it will be for a programmer to understand your code. In addition, you can use comments to turn off parts of your applications for debugging purposes.

In PL/pgSQL two kinds of comments can be used. Two dashes tell PostgreSQL to ignore the current line. In addition, C-style comments can be used:

 CREATE OR REPLACE FUNCTION mycomment () RETURNS numeric AS '         BEGIN                 /* RETURN 0; */                 -- RETURN 1;                 RETURN 2;         END; ' LANGUAGE 'plpgsql';

Just execute the function to see which lines are ignored by PostgreSQL:

 phpbook=# SELECT mycomment();  mycomment -----------          2 (1 row)

Two is returned, which means that only the last line will be executed.

9.2.5 Exception Handling

Let's get back to the function for generating the geometric mean of a range of values. So far, you have seen how to get around the problem concerning the loop and how to compute the right result even if the first parameter is higher than the second one. This time a message should be displayed if the first value passed to the function is higher than the second one. Therefore PostgreSQL provides a function called RAISE NOTICE:

 CREATE OR REPLACE FUNCTION geomean (numeric, numeric) RETURNS numeric AS '         DECLARE                 i       int4;                 result  numeric;                 first   numeric;                 second  numeric;         BEGIN                 first := $1;                 second := $2;                 IF      first > second THEN                         RAISE NOTICE ''wrong order of parameters'';                 END IF;                 result := 0;                 FOR i IN first..second LOOP                         result := i * i + result;                 END LOOP;                 RETURN sqrt(result);         END; ' LANGUAGE 'plpgsql';

If the first value passed to the function is higher than the second parameter, a warning is displayed:

 phpbook=# SELECT geomean(20, 2); NOTICE:  wrong order of parameters    geomean --------------  0.0000000000 (1 row)

Warnings are an important component of the programming language and can be used as an onboard debugger for PL/pgSQL. Because no other way of debugging is available, using RAISE NOTICE is the most comfortable way to find tricky bugs in your application.

If you want your application to terminate in case of an error, RAISE EXCEPTION must be used:

 RAISE EXCEPTION ''wrong order of parameters'';

When using RAISE EXCEPTION instead of RAISE NOTICE, the function will be terminated:

 phpbook=# SELECT geomean(20, 2); ERROR:  wrong order of parameters

Let's have a closer look at the two ways of making PL/pgSQL complain about something: The string you want to display on the screen has to be passed to PostgreSQL using two single quotes. This is very important for one particular reason: The entire code of the function is defined under single quotes and if a string has to be processed, one pair of single quotes must be escaped by using two single quotes. If you don't escape the single quotes needed by RAISE NOTICE and RAISE EXCEPTION, PL/pgSQL will report an error because it is no longer possible to find the end of the function passed to the database.

9.2.6 Aliases

Aliases are a core feature of PL/pgSQL. With the help of aliases it is possible to assign more than just one name to a variable. In some cases this will make your code more flexible and easier to understand. Let's take a look at a function for computing the sum of two numeric values:

 CREATE OR REPLACE FUNCTION mysum (numeric, numeric) RETURNS numeric AS '         DECLARE                 first   ALIAS FOR $1;                 second  ALIAS FOR $2;         BEGIN                 RETURN (first + second);         END; ' LANGUAGE 'plpgsql';

first and second are aliases for $1 and $2, so these names can be used just like the two original values passed to the function.

When executing the function, you will see that the result will be computed as if you had not used aliases:

 phpbook=# SELECT mysum(23, 45);  mysum -------     68 (1 row)

Aliases are just additional names of a certain variable they cannot be used to modify constant variables:

 CREATE OR REPLACE FUNCTION mysum (numeric, numeric) RETURNS numeric AS '         DECLARE                 first   ALIAS FOR $1;                 second  ALIAS FOR $2;         BEGIN                 first = first + 1;                 RETURN (first + second);         END; ' LANGUAGE 'plpgsql';

Recall that values passed to a function are constant values, so it is not possible to modify the content of those variables not even when using aliases:

 phpbook=# SELECT mysum(23, 45); ERROR:  $1 is declared CONSTANT NOTICE:  plpgsql: ERROR during compile of mysum near line 5 NOTICE:  plpgsql: ERROR during compile of mysum near line 5 ERROR:  $1 is declared CONSTANT

9.2.7 SQL and Executing Functions

Executing SQL codeinside a PL/pgSQL function is as easy as executing a SQL command using the interactive shell. The target of the next example is to write a simple, database-driven logging function. Therefore you have to create a table for storing the logging information:

 phpbook=# CREATE TABLE logtable(id serial, tstamp timestamp, message text, mestype int4); NOTICE:  CREATE TABLE will create implicit sequence 'logtable_id_seq' for SERIAL column 'logtable.id' NOTICE:  CREATE TABLE/UNIQUE will create implicit index 'logtable_id_key' for table 'logtable' NOTICE:  CREATE TABLE will create implicit sequence 'logtable_id_seq' for SERIAL column 'logtable.id' NOTICE:  CREATE TABLE/UNIQUE will create implicit index 'logtable_id_key' for table 'logtable' CREATE

In this example the table consists of a column storing a serial, a time stamp, the logging message, and the type of the message. Therefore a PL/pgSQL function can be implemented. In this example the function returns a Boolean value:

 CREATE OR REPLACE FUNCTION dblog (text, int4) RETURNS bool AS '         BEGIN                 INSERT INTO logtable(tstamp, message, mestype)                         VALUES (now(), $1, $2);                 RETURN ''t'';         END; ' LANGUAGE 'plpgsql';

As you can see the SQL statement can be used just like any other PL/pgSQL command. Let's execute the function:

 phpbook=# SELECT dblog('an error message', 0);  dblog -------  t (1 row)

true has been returned and one record can be found in the table:

 phpbook=# SELECT * FROM logtable;  id |            tstamp             |     message      | mestype ----+-------------------------------+------------------+---------   1 | 2001-11-12 13:38:19.875317+01 | an error message |       0 (1 row)

In some cases you might want to execute a SQL command without bothering about the result. Therefore you can use a function called EXECUTE.

 CREATE OR REPLACE FUNCTION checkcontent () RETURNS bool AS '         BEGIN                 EXECUTE ''CREATE TABLE tmptab (id int4, data text)'';                 RETURN ''t'';         END; ' LANGUAGE 'plpgsql';

The function creates a new table and returns true:

 phpbook=# SELECT checkcontent();  checkcontent --------------  t (1 row)

If the function has been executed successfully, a table called tmptab can be found in the database now:

 phpbook=# \d tmptab          Table "tmptab"  Attribute |  Type   | Modifier -----------+---------+----------  id        | integer |  data      | text    |

The return value of the CREATE TABLE statement has not been checked, but it has been silently omitted.

Another important issue is that PL/pgSQL is often used for is dynamic SQL. In many applications the SQL code that must be executed by the server is not predefined, so the correct parameters of the command have to be computed at runtime. With the help of PL/pgSQL, it is an easy task to build dynamic SQL code and to execute it using a function. The next example shows a function for changing the error level of a logging entry:

 CREATE OR REPLACE FUNCTION changecontent (int4, int4) RETURNS bool AS '         BEGIN                 EXECUTE ''UPDATE logtable SET mestype=''|| $1 || '' WHERE ''                         || ''id='' || $2;                 RETURN ''t'';         END; ' LANGUAGE 'plpgsql';

The SQL command is sent to the server by using the EXECUTE function. The parameters of the SQL command are extracted from the input parameters of the function and may vary. If the function is executed by the user, true will be returned and the values in the table will be updated:

 phpbook=# SELECT changecontent(1, 1);  changecontent ---------------  t (1 row)

As you can see, the record in the table has been updated to the desired value:

 phpbook=# SELECT * FROM logtable;  id |            tstamp             |     message      | mestype ----+-------------------------------+------------------+---------   1 | 2001-11-12 13:38:19.875317+01 | an error message |       1 (1 row)

Depending on the kind of application you are working on, the complexity of your PL/pgSQL function can differ significantly. However, if the computations you have to perform become more complex, you should consider using other embedded languages such as PL/Perl or PL/Tcl.

9.2.8 Working with Queries

In many PL/pgSQL functions, the result of a query must be processed by the function. Therefore it is necessary to retrieve the result of the query inside a PL/pgSQL function. In the next example you will see how easily this can be done by using a loop and a data type called RECORD:

 CREATE OR REPLACE FUNCTION makebool () RETURNS bool AS '         DECLARE                 row     RECORD;         BEGIN                 FOR row IN SELECT * FROM logtable LOOP                         RAISE NOTICE ''notice: %, %'', row.tstamp, row.message;                 END LOOP;                 RETURN ''t'';         END; ' LANGUAGE 'plpgsql';

First, a variable is defined. This variable is a so-called record, which means that it can be used to store exactly one line of data. One line of data consists of all fields stored in the table, and each field can easily be accessed inside the record. In this example you can see how some fields in the table are displayed on the screen using the NOTICE function. You can see that a field can easily be accessed by a point and the name of the field you want to access.

Before executing the function, recall the content of logtable:

 phpbook=# SELECT * FROM logtable;  id |            tstamp             |     message      | mestype ----+-------------------------------+------------------+---------   1 | 2001-11-12 13:38:19.875317+01 | an error message |       1 (1 row)

Let's execute the function and see how the data is displayed:

 phpbook=# SELECT makebool(); NOTICE:  notice: 2001-11-12 13:38:19.875317+01, an error message  makebool ----------  t (1 row)

The result is no surprise. The various fields displayed are listed one after the other and the content is displayed in the same format as would be done by using a SQL statement.

The goal of the next example is to reuse the data retrieved by the query. To make the example a little bit more spectacular, a second record can be added to the table called logtable by using an INSERT statement:

 phpbook=# INSERT INTO logtable(tstamp, message, mestype) VALUES (now(), 'the second error', 2); INSERT 16625 1

Let's modify the function and add an UPDATE statement to it that uses the data generated by the SELECT statement:

 CREATE OR REPLACE FUNCTION makebool () RETURNS bool AS '         DECLARE                 row     RECORD;         BEGIN                 FOR row IN SELECT * FROM logtable LOOP                         UPDATE logtable SET tstamp=now(), mestype=3                                 WHERE id=row.id;                 END LOOP;                 RETURN ''t'';         END; ' LANGUAGE 'plpgsql';

This might not be the most efficient way of updating the data in a table, but it shows how data can be processed inside a PL/pgSQL function.

When the function executes, true will be returned:

 phpbook=# SELECT makebool();  makebool ----------  t (1 row)

After the function is called, the data in the table contains some new values:

 phpbook=# SELECT * FROM logtable;  id |            tstamp             |     message      | mestype ----+-------------------------------+------------------+---------   1 | 2001-11-12 19:08:19.431142+01 | an error message |       3   2 | 2001-11-12 19:08:19.431142+01 | the second error |       3 (2 rows)

9.2.9 PL/pgSQL and Transactions

As you can see in table logtable, all records contain the same timestamp. Although it takes some time to process the data, the timestamp is the same. This behavior of PostgreSQL has to do with transactions. Every PL/pgSQL function is executed just as a standalone SQL statement. This means that the components of a function are all executed within the same transaction. Because now() returns the current transaction time and not the time the function was called, all records contain the same result.

To show PostgreSQL's behavior in combination with PL/pgSQL, we have included an example where transactions are started and committed explicitly.

First, a transaction is started using the BEGIN command:

 phpbook=# BEGIN; BEGIN

Inside the transaction, makebool is started and true is returned by the function:

 phpbook=# SELECT makebool();  makebool ----------  t (1 row)

To check the content of the table, you can write a SQL statement that selects all data from the table, as shown in the next listing:

 phpbook=# SELECT * FROM logtable;  id |            tstamp             |     message      | mestype ----+-------------------------------+------------------+---------   1 | 2001-11-12 19:20:55.652559+01 | an error message |       3   2 | 2001-11-12 19:20:55.652559+01 | the second error |       3 (2 rows)

After that a ROLLBACK is performed to see what can be found in the table after undoing all operations:

 phpbook=# ROLLBACK; ROLLBACK

The table contains the same data as before the makebool function was called. This shows that PL/pgSQL does not commit transactions implicitly:

 phpbook=# SELECT * FROM logtable;  id |            tstamp             |     message      | mestype ----+-------------------------------+------------------+---------   1 | 2001-11-12 19:08:19.431142+01 | an error message |       3   2 | 2001-11-12 19:08:19.431142+01 | the second error |       3 (2 rows)

Both records still have the same value that they had before starting the transaction.

Another important thing is that transactions cannot be started and committed inside a PL/pgSQL function:

 CREATE OR REPLACE FUNCTION transtest () RETURNS bool AS '         BEGIN                 BEGIN TRANSACTION;                 DELETE FROM logtable;                 COMMIT WORK;                 RETURN ''t'';         END; ' LANGUAGE 'plpgsql';

Adding transaction code to a function will lead to syntax errors as well as to execution errors, as shown in the next listing:

 phpbook=# SELECT transtest(); NOTICE:  plpgsql: ERROR during compile of transtest near line 7 ERROR:  parse error at or near ""

9.2.10 Writing Data Type Independent Functions

So far, you have learned to write functions that can only operate with a fixed data type. In many cases this will not be flexible enough because if a data type in a certain column changes, you have to change your function as well. Therefore PostgreSQL supports two keywords: %TYPE and %ROWTYPE.

With the help of %TYPE it is possible to declare a variable using the data type of a certain column. To make this clearer, we have included an example.

First, a table for storing incomes is created. The goal of the example is to write a function for changing the income of a person:

 phpbook=# CREATE TABLE income(name text, salary int4); CREATE

In the next step, one record is added to the table:

 phpbook=# INSERT INTO income VALUES ('John', 3400); INSERT 21635 1

If no error occurred, the table will contain one record now:

 INSERT 21635 1 phpbook=# SELECT * FROM income;  name | salary ------+--------  John |   3400 (1 row)

After creating a table and adding some data to it, you can start writing the function:

 CREATE OR REPLACE FUNCTION changesal (text, int4, int4) RETURNS int4 AS '         DECLARE                 sal income.salary%TYPE;                 inc_table income%ROWTYPE;                 pname ALIAS FOR $1;                 inc ALIAS FOR $2;                 increase ALIAS FOR $3;         BEGIN                 SELECT DISTINCT INTO inc_table * FROM income                         WHERE salary=inc AND name=pname;                 RAISE NOTICE ''inc_table.name: %'', inc_table.name;                 RAISE NOTICE ''inc_table.salary: %'', inc_table.salary;                 RAISE NOTICE ''increase: %'', increase;                 sal := sal + increase;                 RAISE NOTICE ''sal: %'', sal;                 UPDATE income SET salary = sal                         WHERE salary = inc AND name=pname;                 RETURN 0;         END; ' LANGUAGE 'plpgsql';

In the DECLARE section, three aliases for the parameters passed to the function are defined. In addition, two other variables are defined and it is worthwhile to take a closer look at them: The data type of sal is the same as the data type of the column salary in table income. This way the function does not have to be changed when the data type of salary is changed. inc_table is defined as ROWTYPE. Using ROWTYPE makes accessing various components of the record easy and modifications can easily be done. After the DECLARE section a query is performed and the result of the query, which is exactly one record, is assigned to inc_table. There are no problems with the data type when assigning the value because PostgreSQL will automatically choose the correct data type for every column because ROWTYPE has been used in the DECLARE section. After displaying the content of the record, the salary of a person is increased and an UPDATE query is performed to set the salary to the new value.

Let's execute the function and see what is displayed on the screen:

 phpbook=# SELECT changesal('John', 3400, 10); NOTICE:  inc_table.name: John NOTICE:  inc_table.salary: 3400 NOTICE:  increase: 10 NOTICE:  sal: 3410  changesal -----------          0 (1 row)

After the function executes, the value in the table has changed:

 phpbook=# SELECT * FROM income;  name | salary ------+--------  John |   3410 (1 row)

As you can see, it is an easy task to work with TYPE and ROWTYPE. If you want to implement flexible functions, these two keywords will be essential for you.