8.1 Advanced Data Types

So far you have dealt with the basic set of PostgreSQL's data types such as int4, numeric, and text. These basic data types are provided by almost all common databases. In PostgreSQL an impressive list of additional data types is provided. These data types can be used to store specific pieces of information more efficiently than could be done using commonly known data types.

In addition to a broad range of data types, PostgreSQL provides an easy C interface for defining your own data types. Only a few functions need to be implemented in order to create new data types. When working with your own data types, you need not worry about accessing your data using PHP because data is always returned as text, so new data types do not affect your PHP application.

In this section, you will be guided through some data types provided by PostgreSQL and you will learn to work with these data types.

8.1.1 Working with Geometric Data Types

Geometric data types are a core feature of PostgreSQL and they are often used for building highly reliable and sophisticated systems that use Geographic Information Systems (GIS). GIS systems are software products used for analyzing and storing geographic data with the help of databases. For these purposes, PostgreSQL is a good choice because it provides an efficient index structure for accessing geometric data extremely fast. This index structure is called R-tree and it allows spatial searching as well as numerous other operations.

Table 8.1 lists the geometric data types provided by PostgreSQL.

Table 8.1. PostgreSQL's Geometric Data Types
Data Type	Description
`box`	A geometric box, rectangle
`circle`	A circle defined by a point and a radius
`line`	A line defined by a starting point and an end point
`lseg`	A geometric line segment
`path`	A sequence of lines defined by points
`point`	A point defined by two coordinates
`polygon`	A polygon defined by a list of points

As you can see, PostgreSQL provides data types for storing all kinds of geometric information. Of course, it is possible to store geometric data differently, but using PostgreSQL's built-in data types has some significant advantages because of efficiency.

8.1.1.1 Points

Points are the key data structure of PostgreSQL's implementation of geometric data types. Most other geometric data types are built on points, so knowing how to deal with points is an extremely important task.

Let's create a table containing one column for storing points:

 phpbook=# CREATE TABLE store_point(mypoint point); CREATE

Creating the table is an easy task, but let's try to insert some data into it:

 phpbook=# INSERT INTO store_point VALUES ('23, 17'); INSERT 24836 1

A point consists of two values, or coordinates. The first coordinate defines the x-coordinate and the second value contains the y-coordinate of the point. Currently PostgreSQL supports only two-dimensional points. At the moment it is not possible to work with multidimensional points without implementing a separate data type.

In the next step you can retrieve the point you just added to the table:

 phpbook=# SELECT * FROM store_point;  mypoint ---------  (23,17) (1 row)

8.1.1.2 Boxes

The data type box is used to store rectangles. A rectangle can be defined by two opposite points, and that's the way rectangles are stored internally.

Let's create a table:

 phpbook=# CREATE TABLE store_box(data box); CREATE

After creating the table, you can try to insert some values into it. The next examples all lead to the same result:

 phpbook=# INSERT INTO store_box VALUES ('(1,1), (3,3)'); INSERT 24839 1 phpbook=# INSERT INTO store_box VALUES ('1,1, 3,3'); INSERT 24840 1 phpbook=# INSERT INTO store_box VALUES ('(1,1, 3,3)'); INSERT 24841 1 phpbook=# INSERT INTO store_box VALUES ('((1,1), (3,3))'); INSERT 24842 1

The way parentheses are used does not affect the way PostgreSQL treats the data. The first two values are used to define one edge of the rectangle, and the third and fourth value contain information about the second point defining the rectangle. Take a look at the content of the table:

 phpbook=# SELECT * FROM store_box;     data -------------  (3,3),(1,1)  (3,3),(1,1)  (3,3),(1,1)  (3,3),(1,1) (4 rows)

You can easily look for a specific box in the table by using the = operator:

 phpbook=# SELECT * FROM store_box WHERE data = '(3,3),(1,1)' LIMIT 1;     data -------------  (3,3),(1,1) (1 row)

8.1.1.3 Lines

Lines consist of two points as well. One point defines the beginning and one point defines the end of the line. At the time this book was written, lines have not yet been fully implemented in PostgreSQL.

However, tables containing lines can be created:

 phpbook=# CREATE TABLE store_line(data line); CREATE

Creating a table containing lines is already possible, but it is not yet possible to insert and select records:

 phpbook=# SELECT '(1,1), (2,2)'::line; ERROR:  line not yet implemented

We hope that full support for lines will be implemented in future versions.

8.1.1.4 Line Segments

Line segments are represented by a pair of points. You can create a table that contains line segments just as you have seen for the other data types:

 phpbook=# CREATE TABLE store_line (data line); CREATE

PostgreSQL provides a sophisticated parsing function for line segments, so the input format is flexible:

 phpbook=# INSERT INTO store_lseg VALUES ('(1,1), (2,2)'); INSERT 24848 1 phpbook=# INSERT INTO store_lseg VALUES ('(1,1, 2,2)'); INSERT 24849 1 phpbook=# INSERT INTO store_lseg VALUES ('1,1, 2,2'); INSERT 24850 1

All INSERT statements are equal and will lead to the same result, as you can see in the table:

 phpbook=# SELECT * FROM store_lseg;      data ---------------  [(1,1),(2,2)]  [(1,1),(2,2)]  [(1,1),(2,2)] (3 rows)

The definition of a line segment looks similar to boxes, but internally things are treated differently. Different data types are defined, so things are only similar but not equal.

8.1.1.5 Paths

Paths consist of a set of points. In general two kinds of paths are available. Open paths are paths where the first and the last point do not match. In closed paths the beginning of a path is defined by the same point as the end of the path. PostgreSQL provides easy methods to find out whether a path is closed or not.

To show you how you can work with paths, we create a table called store_path:

 phpbook=# CREATE TABLE store_path(data path); CREATE

Let's insert an open path into the table:

 phpbook=# INSERT INTO store_path VALUES ('[(1,1), (2,2), (-3,2)]'); INSERT 24856 1

The square brackets at the beginning of the string tell PostgreSQL that the path has to be treated as a closed path. To insert a closed path into the table, parentheses must be used:

 phpbook=# INSERT INTO store_path VALUES ('((1,1), (2,2), (-3,2))'); INSERT 24857 1

You used the same values inside the path. However, because one path is closed and one is open, the two values you have just inserted into the database differ:

 phpbook=# SELECT '((1,1), (2,2), (-3,2))' = '[(1,1), (2,2), (-3,2)]';  ?column? ----------  f (1 row)

False is returned, which means that the two records are not the same.

Let's see what is in the table:

 phpbook=# SELECT * FROM store_path;          data ----------------------  [(1,1),(2,2),(-3,2)]  ((1,1),(2,2),(-3,2)) (2 rows)

As you can see, PostgreSQL displays the correct brackets for every record. This way it is easy to find out if a path is open or not. However, when building applications, it won't be useful to see things on first sight because things must be checked by the database. Therefore PostgreSQL provides two functions called isopen and isclosed. As you can easily imagine, isopen returns true if a path is open:

 phpbook=# SELECT data, isopen(data) FROM store_path;          data         | isopen ----------------------+--------  [(1,1),(2,2),(-3,2)] | t  ((1,1),(2,2),(-3,2)) | f (2 rows)

The first path is open and the second path in the table is closed. The isclosed function is the exact opposite of the isopen function:

 phpbook=# SELECT data, isclosed(data) FROM store_path;          data         | isclosed ----------------------+----------  [(1,1),(2,2),(-3,2)] | f  ((1,1),(2,2),(-3,2)) | t (2 rows)

8.1.1.6 Polygons

Polygons are similar to closed paths. However, internally PostgreSQL has a separate set of functions and operators in order to handle polygons efficiently.

The syntax and the way polygons are treated are equal to what you have just seen in the section called "Paths." Tables with polygons can be created in the same way as tables with paths:

 phpbook=# CREATE TABLE store_polygon(data polygon); CREATE

8.1.1.7 Circles

Circles consist of a point that marks the center of the circle and a radius. Let's see how circles can be inserted into the database:

 phpbook=# CREATE TABLE store_circle(data circle); CREATE

The table has been created successfully. Here is a list of the most important ways circles can be inserted into a table:

 phpbook=# INSERT INTO store_circle VALUES ('1,0, 12'); INSERT 24860 1 phpbook=# INSERT INTO store_circle VALUES ('((1,0), 12)'); INSERT 24861 1 phpbook=# INSERT INTO store_circle VALUES ('<(1,0), 12>'); INSERT 24862 1

The table will contain three values that are all the same:

 phpbook=# SELECT * FROM store_circle;     data ------------  <(1,0),12>  <(1,0),12>  <(1,0),12> (3 rows)

To retrieve records from the table, every syntax shown in the previous examples can be used. PostgreSQL will parse the data internally, so it does not matter in which format the data is actually passed to the database:

 phpbook=# SELECT * FROM store_circle WHERE data = '1,0,12' LIMIT 1;     data ------------  <(1,0),12> (1 row)

No matter what the WHERE clause looks like, PostgreSQL will always display the data in the same way.

8.1.1.8 Geometric Data Types and Operators

As we promised, PostgreSQL provides a rich set of operators and functions that can easily be used in combination with geometric data types. Together with R-trees, which are the standard index structure for indexing geometric data types, these operators are a strong team for a broad range of applications.

In this section, you will learn about the most important and most useful operators defined for geometric data types. Because the list of operators provided by the database is nearly endless, only the most important and most frequently used operators will be covered and discussed in detail.

Take a look at the list of geometric operators in Table 8.2. Keep in mind that not all of these operators are defined for every data type.

Table 8.2. Geometric Operators
Operator	Description
`+`	Translation
`-`	Translation
`*`	Scaling/rotation
`/`	Scaling/rotation
`#`	Intersection or number of points in a polygon
`##`	Point of closest proximity
`&&`	Overlaps?
`&<`	Overlaps to left?
`&>`	Overlaps to right?
`<->`	Distance between
`<<`	Left of?
`<^`	Is below?
`>>`	Is right of?
`>^`	Is above?
`?#`	Intersects or overlaps
`?-`	Is horizontal?
`?-\|`	Is perpendicular?
`@-@`	Length or circumference
`?\|`	Is vertical?
`?\|\|`	Is parallel?
`@`	Contained or on
`@@`	Center of
`~=`	Same as

After taking a look at the list of operators, it is time to do some practical examples.

To add the coordinates of two objects, use the + operator. In the next example you will use it for adding two points:

 phpbook=# SELECT '1,1'::point + '3,4'::point;  ?column? ----------  (4,5) (1 row)

It is also possible to add a point and a box. To cast the string to the right data type, use the :: operator. The string '1,1' is cast to point in order to tell PostgreSQL what to do with it. In this example every point defining the box will be increased, as shown in the next code segment:

 phpbook=# SELECT '(1,1), (2,2)'::box + '3,4'::point;   ?column? -------------  (5,6),(4,5) (1 row)

As you can see, this way it is easily possible to work with vector operations on a database level. This makes building applications simple because you don't have to implement these things on the application level.

It is also possible to add circles to each other:

 phpbook=# SELECT '1,1, 2'::circle + '1,1, 1'::circle;  ?column? -----------  <(2,2),2> (1 row)

The radius of the first circle is not modified the two radiuses have not been added. Let's swap the order of the circles passed to the database:

 phpbook=# SELECT '1,1, 1'::circle + '1,1, 2'::circle;  ?column? -----------  <(2,2),1> (1 row)

The result is not the same as you saw in the previous example. In circles, the + operator is not commutative, which means that it makes a difference if you use A+B or B+A. This is extremely dangerous because you have to know exactly what you are doing and how certain operations are defined. I want to point out that this is not a weak spot of PostgreSQL it is a matter of how things are defined internally. We strongly recommend trying out a few examples before making heavy use of PostgreSQL's operators.

The operators -, *, and / work pretty much the same way as the + operator does.

To compute the intersection of two geometric objects, PostgreSQL provides the # operator. In the next example you see how the intersection of two boxes can be computed:

 phpbook=# SELECT '(4,4), (6,6)'::box # '(6,6), (7,7)'::box;   ?column? -------------  (6,6),(6,6) (1 row)

The result seems obvious because the two boxes have only one point in common. Here's an additional example:

 phpbook=# SELECT '(4,4), (6,6)'::box # '6,6, 7'::circle;   ?column? -------------  (6,6),(4,4) (1 row)

In this example the intersection of the two objects is the entire first object. As you can see, it is also possible to compute the intersection of objects that do not have the same data type.

Recall the structure of store_polygon:

 phpbook=# \d store_polygon     Table "store_polygon"  Column |  Type   | Modifiers --------+---------+-----------  data   | polygon |

The table contains one column for storing polygons. Let's insert one value into the table:

 phpbook=# INSERT INTO store_polygon VALUES('(4,4), (6,6), (-3,1), (2,9)'); INSERT 24868 1

The # operator can also be used to compute the number of points a polygon contains:

 phpbook=# SELECT data, #data FROM store_polygon;             data            | ?column? ----------------------------+----------  ((4,4),(6,6),(-3,1),(2,9)) |        4 (1 row)

The polygon you have inserted into the table consists of four points.

To find the point of closest proximity, the ## has been implemented:

 phpbook=# SELECT '<(0,0), 4>'::circle ## '<(7,8), 3>'::circle;               ?column? -------------------------------------  (4.87867965644036,5.87867965644036) (1 row)

Sometimes it is necessary to find out if two objects overlap. You can use the && operator:

 phpbook=# SELECT '(1,1), (2,2)'::box && '(1,1) ,(3,3)'::box;  ?column? ----------  t (1 row)

Take a look at three additional examples:

 phpbook=# SELECT '(1,1), (2,2)'::box && '(-1,-1) ,(0,0)'::box;  ?column? ----------  f (1 row) phpbook=# SELECT '(1,1), (2,2)'::box &< '(-1,-1) ,(0,0)'::box;  ?column? ----------  f (1 row) phpbook=# SELECT '(1,1), (2,2)'::box &> '(-1,-1) ,(0,0)'::box;  ?column? ----------  t (1 row)

As you can see, in the first example, the two boxes do not overlap. In the second example, you want to find out if the left box overlaps to the left, whereas the third example computes whether they overlap to the right. In the third example true is returned.

One of the most commonly used operators when working with geometric data types is the <-> operator:

 phpbook=# SELECT '1,1'::point <-> '2,2'::point;     ?column? -----------------  1.4142135623731 (1 row)

The distance between the two points is about 1.41, which is the square root of 2.

To find out if an object contains a second object, PostgreSQL provides the @ operator:

 phpbook=# SELECT '(-3,-3), (4,5)'::box @ '(2,2), (3,3)'::box;  ?column? ----------  f (1 row)

In this example the first object does not contain the second one. Just like the + operator, the @ operator is not commutative. If the order of the two objects in the query is changed, the result will be true:

 phpbook=# SELECT '(2,2), (3,3)'::box @ '(-3,-3), (4,5)'::box;  ?column? ----------  t (1 row)

The next example shows how to compute the center of an object:

 phpbook=# SELECT @@ '(2,2), (3,3)'::box;  ?column? -----------  (2.5,2.5) (1 row)

With boxes, the center of the object is the average of the two points. The same operation also works for polygons:

 phpbook=# SELECT @@ '(2,2), (3,3), (-3,2), (0,3)'::polygon;  ?column? -----------  (0.5,2.5) (1 row)

As you can see, PostgreSQL provides a rich set of operators for working with geometric data types. It is not possible to cover all operators in detail because this would be beyond the scope of this book.

8.1.1.9 Functions for Geometric Data Types

PostgreSQL supports not only a rich set of operators, but a number of functions that you can use to perform all basic operations concerning geometric data types. In this section, you will find out about the most important of these functions and see how these functions can be used.

One of these functions is called area and can be used to compute the area of an object:

 phpbook=# SELECT area('<(1,1), 5>'::circle);     area -------------  78.53981634 (1 row)

The area of a circle can be computed by calculating the square of the radius and multiplying it by Pi as shown in the next example:

 phpbook=# SELECT (5^2)*pi();      ?column? ------------------  78.5398163397448 (1 row)

The calculation shown in the SQL statement is performed by PostgreSQL internally. The function called pi returned the value of Pi:

 phpbook=# SELECT pi();         pi ------------------  3.14159265358979 (1 row)

Center retrieves the center of an object. You have already seen the @@ operator, which is equal to the center function. Let's use the center function to find the center of a circle:

 phpbook=# SELECT center('<(1,1), 5>'::circle);  center --------  (1,1) (1 row)

The result is not surprising because the center of the circle is already in the string, but what is this function good for when working with circles? The advantage of the function is that you don't have to write a function for parsing the string provided by PostgreSQL.

For computing the diameter, PostgreSQL provides a function called diameter:

 phpbook=# SELECT diameter('<(1,1), 5>'::circle);  diameter ----------        10 (1 row)

In circles, the diameter is twice the value of the radius.

The height of an object can be computed as shown in the next example:

 phpbook=# SELECT height('(1,1), (3,3)'::box);  height --------       2 (1 row)

The result is not surprising because all PostgreSQL does is to compute the difference between the two y-coordinates.

The counterpart of the height function is the width function:

 phpbook=# SELECT width('(1,1), (3,4)'::box);  width -------      2 (1 row)

Internally PostgreSQL computes the difference of the x-coordinates of the box.

Another important attribute of an object is its length:

 phpbook=# SELECT length('(2,2), (3,3), (-3,2), (0,3)'::polygon);       length ------------------  12.8953217303395 (1 row)

The length function can be used to retrieve the length of an object. In this example you have computed the length of a polygon.

8.1.2 Working with Network Data Types

Now that you have dealt with geometric data types in detail, it is time to have a closer look at the data types related to network data types. PostgreSQL provides three data types for storing network information, as shown in Table 8.3.

Table 8.3. Network Data Types in PostgreSQL
Data Type	Internal Size	Description
`cidr`	12 bytes	Stores valid Ipv4 network addresses
`inet`	12 bytes	Stores IP addresses or network addresses
`macaddr`	6 bytes	Stores MAC addresses

The advantage of these data types is that data can be stored much more efficiently than when working with data types such as text. Just think of a MAC address that is usually 12 characters long. With the help of the data type macaddr, it will only take 6 bytes to store one record. Another advantage of network data types is that PostgreSQL does not allow invalid data to be inserted into the database. The parser makes sure that no IP address contains values lower than 0 or higher than 255. Things like that will save a lot of overhead and protect your data because no syntactically incorrect IP addresses can be stored in the database.

Let's take a detailed look at the various data types.

8.1.2.1 `inet`

Fields defined as inet can hold an IP host address and optionally the identity of the subnet it is in. The IP address and the subnet are stored in one field, which guarantees that the subnet and the IP are stored as one item of information.

Let's look at an example:

 phpbook=# CREATE TABLE store_inet(data inet); CREATE

After creating the table, you can try to insert some data. The next SQL statement shows how to insert an IP address into the table:

 phpbook=# INSERT INTO store_inet VALUES ('192.168.1.1'); INSERT 24870 1

Sometimes you might want to add a netmask to the IP address. You can do this by adding a slash and the number of bit sets in the netmask to the string. In the next example you want the column to contain a 24-bit netmask (255.255.255.0):

 phpbook=# INSERT INTO store_inet VALUES ('192.168.1.1/24'); INSERT 24871 1

If the IP address should be in a net consisting of only one machine, you have to use 32 bits instead of 24 bits:

 phpbook=# INSERT INTO store_inet VALUES ('192.168.1.1/32'); INSERT 24872 1

What happens when an incorrect netmask is passed to the database?

 phpbook=# INSERT INTO store_inet VALUES ('192.168.1.1/33'); ERROR:  invalid INET value '192.168.1.1/33'

Because the netmask is not valid, PostgreSQL will display an error as shown in the listing. Let's see what is in the table:

 phpbook=# SELECT * FROM store_inet;       data ----------------  192.168.1.1/24  192.168.1.1  192.168.1.1 (3 rows)

Three records have been inserted into the table successfully. The first record contains an IP address and the netmask and the remaining two records contain only the IP address. In the case of a 32-bit netmask, the netmask will be omitted because it is redundant to mention it.

8.1.2.2 `cidr`

cidr is used to store IPv4 network addresses. Here's an example:

 phpbook=# CREATE TABLE store_cidr(data cidr); CREATE

Let's insert a new record into the table. The next example shows how a netmask can be inserted into the table:

 phpbook=# INSERT INTO store_cidr VALUES('192/8'); INSERT 24876 1

The zeros in the netmask don't have to be displayed 192 is equal to 192.0.0.0, which is an entire subnet.

The next three examples show how smaller nets can be added to the table:

 phpbook=# INSERT INTO store_cidr VALUES('192.168/16'); INSERT 24877 1 phpbook=# INSERT INTO store_cidr VALUES('192.168.1/24'); INSERT 24878 1 phpbook=# INSERT INTO store_cidr VALUES('192.168.1.1/32'); INSERT 24879 1

Four records have been inserted into the table:

 phpbook=# SELECT * FROM store_cidr;       data ----------------  192.0.0.0/8  192.168.0.0/16  192.168.1.0/24  192.168.1.1 (4 rows)

As you can see, the IP addresses are displayed, including the zeros at the end.

8.1.2.3 `macaddr`

Every piece of hardware in a network, such as network cards, print servers, switches, routers, and so forth have a so-called MAC address. A MAC address is a unique identification number for a network device and it is essential for keeping a network up and running. PostgreSQL provides a separate data type called macaddr for storing MAC addresses in a column.

To show you how this works, we have included an example:

 phpbook=# CREATE TABLE store_macaddr (data macaddr); CREATE

In the next step a MAC address will be added to the table in four different ways:

 phpbook=# INSERT INTO store_macaddr VALUES ('123456-7890ab'); INSERT 24882 1 phpbook=# INSERT INTO store_macaddr VALUES ('12:34:56:78:90:ab'); INSERT 24883 1 phpbook=# INSERT INTO store_macaddr VALUES ('1234.5678.90ab'); INSERT 24884 1 phpbook=# INSERT INTO store_macaddr VALUES ('12-34-56-78-90-ab'); INSERT 24885 1

All values are the same and will be displayed using the format shown in the next example:

 phpbook=# SELECT * FROM store_macaddr LIMIT 1;        data -------------------  12:34:56:78:90:ab (1 row)