8.7 Constraints

Constraints are attributes of an object that are used to restrict certain events from happening. Using constraints will help you to prevent things from happening. Especially when you want to protect the integrity of your data, constraints will be essential. The constraints provided by PostgreSQL have to a large extent been defined in the ANSI standard, so using constraints should not affect the portability of your data structures significantly as long as the database you are porting your data structures to supports ANSI SQL-92 as well.

8.7.1 The `CREATE TABLE` Command and Constraints

In most cases, constraints are defined when creating tables. Therefore the CREATE TABLE command provides many options to define the right set of constraints. Recall the syntax overview of the CREATE TABLE command. We have included it here again to avoid some unnecessary page turning:

 phpbook=# \h CREATE TABLE Command:     CREATE TABLE Description: define a new table Syntax: CREATE [ [ LOCAL ] { TEMPORARY | TEMP } ] TABLE table_name (     { column_name data_type [ DEFAULT default_expr ] [ column_constraint [, ... ] ]     | table_constraint }  [, ... ] ) [ INHERITS ( parent_table [, ... ] ) ] [ WITH OIDS | WITHOUT OIDS ]

where column_constraint is:

 [ CONSTRAINT constraint_name ] { NOT NULL | NULL | UNIQUE | PRIMARY KEY |   CHECK (expression) |   REFERENCES reftable [ ( refcolumn ) ] [ MATCH FULL | MATCH PARTIAL ]     [ ON DELETE action ] [ ON UPDATE action ] } [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]

and table_constraint is:

 [ CONSTRAINT constraint_name ] { UNIQUE ( column_name [, ... ] ) |   PRIMARY KEY ( column_name [, ... ] ) |   CHECK ( expression ) |   FOREIGN KEY ( column_name [, ... ] ) REFERENCES reftable [ ( refcolumn [, ... ] ) ]     [ MATCH FULL | MATCH PARTIAL ] [ ON DELETE action ] [ ON UPDATE action ] } [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY IMMEDIATE ]

In this section you will take a closer look at most of the features of the CREATE TABLE command and you will learn about the power of this command.

8.7.1.1 `CHECK` Constraints

One of the most widespread constraints is the CHECK constraint. It can be used to check whether the input satisfies certain conditions and helps you to protect your data by omitting data that is obviously wrong. With the help of CHECK constraints, a lot of work that is often done on the application level is passed to the database, and in many cases the database can deal with things like that much more efficiently.

Imagine a situation where you want to implement a table for storing a list of products you have sold. Some columns in the table must not have invalid data, so CHECK constraints can be used:

 CREATE TABLE sales (         id int4,         tstamp timestamp        CHECK (tstamp < now()),         product_id int4         CHECK (product_id > 0),         product_name text       CHECK (length(product_name) > 3),         units int4              CHECK (units > 0 AND units < 1000),         price numeric(9,2)      CHECK (price > 0) );

The data in id is not restricted by a constraint. The second column may only contain values where the date the sale took place was before the time the record is inserted into sales. This will help you to find bugs in your application because future sales cannot be inserted into the database, so one source of trouble can be avoided. In column number 3 you make sure that the product id must not be a negative value. In the next line of code you can see that the name of a product must be at least three characters long. The column called units has to satisfy two conditions: The first condition is that a product must be sold more than zero times. In addition, a product cannot be sold more than one thousand times. Take a look at the definition of the table:

 phpbook=# \d sales                      Table "sales"     Column    |            Type             | Modifiers --------------+-----------------------------+-----------  id           | integer                     |  tstamp       | timestamp(6) with time zone |  product_id   | integer                     |  product_name | text                        |  units        | integer                     |  price        | numeric(9,2)                | Check constraints: "sales_tstamp" (tstamp < now())                    "sales_product_id" (product_id > 0)                    "sales_product_name" (length(product_name) > 3)                    "sales_units" ((units > 0) AND (units < 1000))                    "sales_price" (price > '0'::"numeric")

As you can see, the definition of the table contains information about CHECK constraints you have added to the table. This is comfortable because you don't have to look this information up in one of the system tables manually \d does the job for you.

After you have defined the table, it is time to insert some data into it:

 phpbook=# INSERT INTO sales VALUES (1, '2001/1/1', 3243243, 'cheeseburger', 2, '3.16'); INSERT 25494 1 phpbook=# INSERT INTO sales VALUES (2, '2001/1/1', 3243242, 'hamburger', -200, '3.16'); ERROR:  ExecAppend: rejected due to CHECK constraint sales_units phpbook=# INSERT INTO sales VALUES (3, '2001/1/1', 3243244, 'do', 1, '4.95'); ERROR:  ExecAppend: rejected due to CHECK constraint sales_product_name

The first record can be added to the table without any problems because none of the columns violates the condition of a CHECK constraint. In case of the second record the amount of products bought by the customer is negative. In this example this is not allowed because you do not allow orders to be canceled. The third INSERT statement tries to insert a product into the table where the name is too short, so the INSERT statement is not performed.

As you can see, a lot of things can be prevented from happening when working with CHECK constraints and this way your data can be protected.

8.7.1.2 `DEFAULT` Values

In order to tell PostgreSQL which data to insert into a column if an INSERT statement does not pass the value to the database, you can use default values. Before you can re-create sales, you have to drop the old sales table by using DROP TABLE:

 phpbook=# DROP TABLE sales; DROP

In the next step the table can be generated and DEFAULT constraints can be used:

 CREATE TABLE sales (         id int4,         tstamp timestamp        DEFAULT now(),         product_id int4 ,         units int4              DEFAULT 1 CHECK (units > 0 AND units < 1000),         price numeric(9,2) );

In this example the timestamp is set to the current time. If no value for this column is specified, PostgreSQL will use the current transaction time as the default value. More than just one constraint is assigned to the units column. First a DEFAULT constraint is defined and in addition, CHECK constraints have been defined. As you can see, a list of constraints can be defined on one column, which makes the entire process very flexible.

Take a look at the data structure of the table:

 phpbook=# \d sales                       Table "sales"    Column   |            Type             |   Modifiers ------------+-----------------------------+---------------  id         | integer                     |  tstamp     | timestamp(6) with time zone | default now()  product_id | integer                     |  units      | integer                     | default 1  price      | numeric(9,2)                | Check constraints: "sales_units" ((units > 0) AND (units < 1000))

The DEFAULT values as well as the CHECK constraints are listed in the output. Now it is time to insert some data into the table:

 phpbook=# INSERT INTO sales (id, product_id, price) VALUES ('1', '2343240', '12.9'); INSERT 25497 1 phpbook=# INSERT INTO sales (id, product_id, units, price) VALUES ('2', '2343240', -1, '12.9'); ERROR:  ExecAppend: rejected due to CHECK constraint sales_units

The first record can be added to the table without any problems. However, the second INSERT statement violates the CHECK constraint and won't be performed successfully. Therefore the table contains exactly one record:

 phpbook=# SELECT * FROM sales;  id |            tstamp             | product_id | units | price ----+-------------------------------+------------+-------+-------   1 | 2001-11-30 11:16:29.199441+01 |    2343240 |     1 | 12.90 (1 row)

The values of columns that have not been in the INSERT statement have been assigned to the DEFAULT value.

8.7.1.3 Column Constraints

Some additional constraints can be defined that add certain attributes to a column. The UNIQUE constraint guarantees that no duplicated values can be inserted into a column. NOT NULL makes sure that a column can never contain a NULL value, and the NULL constraint sets the default value to NULL. Let's drop and create the table sales again and use these constraints:

 CREATE TABLE sales (         id int4                 UNIQUE,         tstamp timestamp        NOT NULL,         product_id int4 ,         units int4              NULL,         price numeric(9,2) );

To insert the data structure into the table, we suggest that you use psql:

 [postgres@athlon const]$ psql phpbook < constraint.sql NOTICE:  CREATE TABLE/UNIQUE will create implicit index 'sales_id_key' for table 'sales' CREATE

If no errors occurred, PostgreSQL has created a table and an index has been defined automatically. This index has been defined as a unique index; this way, PostgreSQL makes sure that no duplicated values can be added to the column. It seems obvious that an index is created automatically because otherwise PostgreSQL would have to read the entire table every time a record is added to the table in order to find out if a certain value already exists.

 phpbook=# \d sales                     Table "sales"    Column   |            Type             | Modifiers ------------+-----------------------------+-----------  id         | integer                     |  tstamp     | timestamp(6) with time zone | not null  product_id | integer                     |  units      | integer                     |  price      | numeric(9,2)                | Unique keys: sales_id_key

Let's insert some data into the relation:

 phpbook=# INSERT INTO sales VALUES (1, now(), 2323411, 4, '23.1'); INSERT 25501 1

No errors occurred. In the next example you will see what happens when you try to insert a NULL value in the second column:

 phpbook=# INSERT INTO sales VALUES (2, NULL, 2323411, 4, '23.1'); ERROR:  ExecAppend: Fail to add null value in not null attribute tstamp

The INSERT statement will not succeed because the column is not allowed to contain NULL values because of the NOT NULL constraint.

In the next step two records with the same id will be added to the table:

 phpbook=# INSERT INTO sales VALUES (2, now(), 2323412, 3, '29.4'); INSERT 25502 1 phpbook=# INSERT INTO sales VALUES (2, now(), 2323412, 3, '29.4'); ERROR:  Cannot insert a duplicate key into unique index sales_id_key

The second INSERT statement has failed because the first column is defined to be unique. Therefore duplicated values are not allowed.

8.7.1.4 Primary Keys and Object Ids

Primary keys are a fundamental component of the relation database model. A primary key of a table is a unique identifier of a row in a table. Therefore a column defined as a primary key must not contain duplicated values.

Let's try an example:

 phpbook=# CREATE TABLE book (isbn text PRIMARY KEY, title text, author text); NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'book_pkey' for table 'book' CREATE

A table for storing books has been created and the first column has been defined as a primary key. Using the ISBN number of a book as the primary key is a good choice because this number is unique.

After creating the table, you can take a look at the definition of the table:

 phpbook=# \d book        Table "book"  Column | Type | Modifiers --------+------+-----------  isbn   | text | not null  title  | text |  author | text | Primary key: book_pkey

PostgreSQL mentions in the listing that the primary key is book_pkey.

8.7.1.5 Integrity Constraints

Up to now you have dealt with a set of independent tables. In real-world applications this is more often an exception than the normal status. Data stored in various tables is in many cases not independent. Imagine a situation where one table stores the salary of a person and the second table stores the attributes of the person. If a person is deleted from the table storing the attributes, it might be useful to remove the person from the table storing the salaries as well. For this purpose, you can use integrity constraints. PostgreSQL provides a rich set of integrity constraints, and you will learn about these in this section.

Foreign key constraints are fundamental when working with data integrity on the database level. A column defined as a foreign key must always be related to a column in another table. In many cases, a foreign key column is related to a primary key of another table as shown in the next listing:

 CREATE TABLE company (id int4 PRIMARY KEY,                 name    text    NOT NULL,                 city    text    NOT NULL ); CREATE TABLE employee (company_id int4 REFERENCES company,                 name    text    NOT NULL,                 salary  numeric(9,2) );

The table called company stores information about companies and employee stores information about the workers employed at the company. The column called company_id is a foreign key and it is related to the primary key of the table called company. To create the table we recommend using an ASCII file containing the SQL code:

 [postgres@athlon postgres]$ psql phpbook < comp.sql NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index 'company_pkey' for table 'company' CREATE NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s) CREATE

When creating the tables, PostgreSQL will automatically create triggers to guarantee data integrity. Triggers are used to call functions automatically and that's exactly what has to be done. Every time a record is changed, PostgreSQL will fire a trigger to see if data integrity is violated.

Take a look at the definition of the tables you have just created:

 phpbook=# \d company        Table "company"  Column |  Type   | Modifiers --------+---------+-----------  id     | integer | not null  name   | text    | not null  city   | text    | not null Primary key: company_pkey Triggers: RI_ConstraintTrigger_25546,           RI_ConstraintTrigger_25548 phpbook=# \d employee            Table "employee"    Column   |     Type     | Modifiers ------------+--------------+-----------  company_id | integer      |  name       | text         | not null  salary     | numeric(9,2) | Triggers: RI_ConstraintTrigger_25544

As you can see, the triggers used by PostgreSQL are listed in the description.

After creating the tables, it is time to insert some values. First a record is added to the table used to store the companies:

 phpbook=# INSERT INTO company VALUES (1, 'Cybertec', 'Vienna'); INSERT 25575 1

Cybertec has been assigned to the id 1. In the next step two employees are added. Both persons are employees of Cybertec, so the id of company is used as a foreign key:

 phpbook=# INSERT INTO employee VALUES (1, 'Susi', '40000'); INSERT 25576 1 phpbook=# INSERT INTO employee VALUES (1, 'Horst', '38000'); INSERT 25577 1

The records have been added successfully, but what happens if the first column is defined as NULL?

 phpbook=# INSERT INTO employee VALUES(NULL, 'Epi', 52000); INSERT 25578 1

No problem occurred because columns that have been defined as foreign key may contain NULL values. In the next step you can try to insert a person who is employed at a company having the id 2:

 phpbook=# INSERT INTO employee VALUES (2, 'Pauline', '39500'); ERROR:  <unnamed> referential integrity violation - key referenced from employee not found in company

No data can be added to the table because there is no appropriate company to add the person to. The foreign key makes sure that no invalid persons can be added to the table.

If a company has employees, it cannot be deleted:

 phpbook=# DELETE FROM company; ERROR:  <unnamed> referential integrity violation - key in company still referenced from employee

To remove Cybertec, all employees must be removed, and if the id of Cybertec is no longer needed as a foreign key, the company can be removed.

In this example you have seen that the id is used as a foreign key. Because the id is the primary key of company, no problems have occurred, but things can also be done differently. Before you get to the next example, the two tables you have created must be removed by using DROP TABLE:

 phpbook=# DROP TABLE employee; NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "company" NOTICE:  DROP TABLE implicitly drops referential integrity trigger from table "company" DROP phpbook=# DROP TABLE company; DROP

When deleting the first table, PostgreSQL mentions that the integrity constraints have been removed from the table as well:

 CREATE TABLE company (id int4 PRIMARY KEY,                 name    text    NOT NULL UNIQUE,                 city    text    NOT NULL ); CREATE TABLE employee (company_name text REFERENCES company(name),                 name    text    NOT NULL,                 salary  numeric(9,2) );

In this example the column called name is referenced and taken as the primary key.

Let's insert one record into each table:

 phpbook=# INSERT INTO company VALUES (1, 'Cybertec', 'Vienna'); INSERT 25784 1 phpbook=# INSERT INTO employee VALUES('Cybertec', 'Epi', 52000); INSERT 25786 1

The target of the next example is to see what happens when the primary key has to be updated. Imagine that the name of the company changes:

 phpbook=# UPDATE company SET name='Cybertec Geschwinde and Schoenig OEG'; ERROR:  <unnamed> referential integrity violation - key in company still referenced from employee

The UPDATE operation cannot be performed because referential integrity would be violated.

This behavior is not a good one because you cannot change the data used as foreign key. To get around the problem, PostgreSQL offers additional constraints. Therefore the SQL commands for generating the tables have to be changed. Before inserting the SQL commands in the next listing, don't forget to delete the old tables using DROP TABLE:

 CREATE TABLE company (id int4 PRIMARY KEY,                 name    text    NOT NULL UNIQUE,                 city    text    NOT NULL ); CREATE TABLE employee (company_name text                         REFERENCES company(name)                         INITIALLY DEFERRED,                 name    text    NOT NULL,                 salary  numeric(9,2) );

As you can see, an additional constraint has been added to employee. This command makes sure that the integrity is only changed at the end of a transaction and not after every statement. The advantage of this is that the name of the company can be changed in both tables before committing the transaction. This way data can easily be changed. The next listing shows how this can be done:

 INSERT INTO company VALUES (1, 'Cybertec', 'Vienna'); INSERT INTO employee VALUES('Cybertec', 'Epi', 52000); BEGIN; UPDATE company SET name='Cybertec Geschwinde and Schoenig OEG' WHERE name='Cybertec'; UPDATE employee SET company_name='Cybertec Geschwinde and Schoenig OEG'         WHERE company_name='Cybertec'; COMMIT;

First, two records have been added. In the next step a transaction is started and the name is changed in both tables. During the transaction PostgreSQL will not check data integrity and therefore both operations can easily be performed.

The opposite of the INITIALLY DEFERRED constraint is the INITIALLY IMMEDIATE constraint, which checks data integrity after every SQL statement.

Working with INITIALLY DEFERRED will in some cases help you to solve problems concerning data integrity, but sometimes other constraints are more comfortable. Take a look at the next example:

 CREATE TABLE company (id int4 PRIMARY KEY,                 name    text    NOT NULL UNIQUE,                 city    text    NOT NULL ); CREATE TABLE employee (company_id int4 REFERENCES company(id)                         ON UPDATE CASCADE,                 name    text    NOT NULL,                 salary  numeric(9,2) );

In the second table the column called company_id has been set to ON UPDATE CASCADE. This means that every time the primary key of the first table is changed, PostgreSQL will automatically change all columns referring to it. Let's see how this works:

 phpbook=# INSERT INTO company VALUES (1, 'Cybertec', 'Vienna'); INSERT 25757 1 phpbook=# INSERT INTO employee VALUES(1, 'Epi', 52000); INSERT 25758 1

First, two records have been added to the database. In the next step an UPDATE operation is performed to change the id of the company stored in the first table:

 phpbook=# UPDATE company SET id=2 WHERE id=1; UPDATE 1

PostgreSQL does not complain that something has gone wrong and has changed the value in company:

 phpbook=# SELECT * FROM company;  id |   name   |  city ----+----------+--------   2 | Cybertec | Vienna (1 row)

In addition, the column company_id has been changed:

 phpbook=# SELECT * FROM employee;  company_id | name |  salary ------------+------+----------           2 | Epi  | 52000.00 (1 row)

This wayonly one operation has to be performed to change all values referring to company. This is comfortable and will save a lot of time as well as a lot of overhead.

In general ON can be used in combination with two SQL commands: UPDATE and DELETE. Both commands support a list of possible settings. Let's go through the DELETE command's settings:

NO ACTION and RESTRICT An error is displayed if the foreign key is violated; this is the default setting.
CASCADE All rows referenced by the deleted row are deleted as well.
SET NULL sets the referencing columns to NULL.
SET DEFAULT Sets the referencing column to the default value.

Here is a list of what can be done with the options provided by UPDATE:

NO ACTION and RESTRICT An error is displayed if the foreign key is violated; this is the default setting.
CASCADE Update the values in the columns referencing a key.
SET NULL Sets the referencing columns to NULL.
SET DEFAULT Sets the referencing column to the default value.
[NOT] DEFERRABLE DEFERRABLE makes sure that data integrity is checked at the end of the transaction. NOT DEFERRABLE is the default setting and makes PostgreSQL check data integrity after every operation.

As you can see, PostgreSQL provides many ways to make sure that data integrity is not violated. Because PostgreSQL is able to take care of data integrity, it is not necessary to do it on the application level.

8.7.1 The CREATE TABLE Command and Constraints