22.1 A User-Defined Data Type


Let's get started and see how you can add a user-defined data type to PostgreSQL. The standard distribution of PostgreSQL offers a set of predefined data types. In some cases these data types might not offer enough efficiency for your applications. If you have to store special information that cannot easily be modeled using standard data types, it can be useful to implement your own data type. Because this can easily be done, we have decided to provide a brief example.

The goal of this section is to implement a data type for storing information about spheres. Several files will be needed, as shown in Table 22.1.

Table 22.1. The Files Used in This Section
Filename Description
Makefile This file is used to run the compiler and to generate the SQL needed for adding the data type to the database.
Sphere.c Sphere.c contains the C code of the data type.
Sphere.o The object that comes out when compiling sphere.c.
Libsphere.so The shared object that will be accessed by PostgreSQL.
Sphere.tpl A SQL template.
Sphere.sql Sphere.sql is generated by transforming sphere.tpl. It already contains the full path to the binaries of our data type.
Test.sql A test suite for the data type.

Let's start with sphere.c. This file is the core of the data type sphere.c. Let's take a look at the code:

 #include <postgres.h> #include <stdio.h> #include <math.h> #ifndef M_PI #define M_PI    3.14159265358979323846264338327950288 #endif typedef struct  sphere {         float4 x;       /* x-coordinate */         float4 y;       /* y-coordinate */         float4 z;       /* z-coordinate */         float4 radius;  /* radius of sphere */ } sphere; /* function for parsing the input */ sphere *sphere_in(char *str) {         float x, y, z, radius;         sphere *mysphere;       /* creating a new object */         mysphere = (sphere *) palloc(sizeof(sphere));         if      (sscanf(str, " %f, %f, %f, %f ",                 &x, &y, &z, &radius) != 4)         {                 elog(ERROR, "sphere_in: parse error in %s", str);                 return NULL;         }         /* checking for valid data */         if      (x >= 0 && y >= 0 && z >= 0 && radius >= 0)         {                 mysphere->x = x;                 mysphere->y = y;                 mysphere->z = z;                 mysphere->radius = radius;         }         else         {                 elog(ERROR, "sphere_in: input not valid in %s", str);                 return NULL;         }         return mysphere; } char *sphere_out(sphere *mysphere) {         char *result;         /* checking for NULL values */         if      (mysphere == NULL)         {                 return NULL;         }         /* allocate memory for storing result */         result = (char * ) palloc(100);            /* dangerous (buffer overflow), use snprintf if available,               %g */         sprintf(result, "%f, %f, %f, %f",                 mysphere->x, mysphere->y,                 mysphere->z, mysphere->radius);         return result; } /* comparison function */ bool sphere_eq(sphere *one, sphere *two) {         if      (one->x == two->x &&                 one->y == two->y &&                 one->z == two->z &&                 one->radius == two->radius)         {                 return TRUE;         }         return FALSE; } /* lower operator */ bool sphere_lt(sphere *one, sphere *two) {         return (one->radius < two->radius) ? TRUE : FALSE; } /* <= operator */ bool sphere_le(sphere *one, sphere *two) {         return (one->radius <= two->radius) ? TRUE : FALSE; } /* > operator */ bool sphere_gt(sphere *one, sphere *two) {         return (one->radius > two->radius) ? TRUE : FALSE; } /* >= operator */ bool sphere_ge(sphere *one, sphere *two) {         return (one->radius >= two->radius) ? TRUE : FALSE; } /* computing volume */ float8 *sphere_volume(sphere *one) {         static float8 retvalue;         retvalue = 4./3. * M_PI * (one->radius*one->radius*one->radius);         return &retvalue; } 

First a set of libraries is included. postgres.h contains the header information for all functions related to PostgreSQL. math.h is included for working with Pi (we will need this to compute the column of a sphere). Then a structure called sphere is defined. This structure contains all components of the data type.

The next step is to implement a function for reading values and one for displaying values. The function for parsing the input is called sphere_in. The goal of this function is to initialize a sphere and retrieve all values from the input. In this example sscanf is used to assign the components of a sphere to the internal structure. Now memory is allocated and the input data is checked. Notice that memory is allocated using palloc. This way PostgreSQL makes sure that no memory leaks can occur. In this scenario, the coordinates of the sphere must be positive. We have done this just for presentation purposes so that you can see how the input can be restricted.

Sphere_out is even more simple. The function is used for formatting a string that can be displayed by the back end.

In the next step a list of functions is implemented. These functions will be accessed by the operators related to the data type. The sphere_eq function, for instance, will be assigned to the = operator. As you can see, a comparison is performed that returns TRUE or FALSE.

The last function in the file can be used to compute the volume of a sphere. sphere_volume returns a float variable to the calling function. The return value of this function is float8. We recommend using float8 instead of float because what comes out when compiling the code depends on the compiler you are using.

Now that you have gone through the C file, it is time to take a look at the file you can use to create the operators and functions needed by the data type. The template for generating sphere.sql is called sphere.tpl, as shown in the next listing:

 -- Input Function CREATE OR REPLACE FUNCTION sphere_in(opaque)         RETURNS sphere         AS '%LIB%'         LANGUAGE 'C'; -- Output Function CREATE OR REPLACE FUNCTION sphere_out(opaque)         RETURNS opaque         AS '%LIB%'         LANGUAGE 'C'; -- Creating new data type CREATE TYPE sphere(         internallength = 16,         input = sphere_in,         output = sphere_out); /* Creating functions for operators */ CREATE OR REPLACE FUNCTION sphere_eq(sphere, sphere)         RETURNS bool         AS '%LIB%'         LANGUAGE 'C'; CREATE OR REPLACE FUNCTION sphere_lt(sphere, sphere)         RETURNS bool         AS '%LIB%'         LANGUAGE 'C'; CREATE OR REPLACE FUNCTION sphere_le(sphere, sphere)         RETURNS bool         AS '%LIB%'         LANGUAGE 'C'; CREATE OR REPLACE FUNCTION sphere_gt(sphere, sphere)         RETURNS bool         AS '%LIB%'         LANGUAGE 'C'; CREATE OR REPLACE FUNCTION sphere_ge(sphere, sphere)         RETURNS bool         AS '%LIB%'         LANGUAGE 'C'; CREATE OR REPLACE FUNCTION sphere_volume(sphere)         RETURNS float8         AS '%LIB%'         LANGUAGE 'C'; /* Creating operators */ CREATE OPERATOR = (         leftarg = sphere,         rightarg = sphere,         procedure = sphere_eq); CREATE OPERATOR < (         leftarg = sphere,         rightarg = sphere,         commutator = >,         procedure = sphere_lt);          CREATE OPERATOR <= (         leftarg = sphere,         rightarg = sphere,         commutator = >=,         procedure = sphere_le); CREATE OPERATOR > (         leftarg = sphere,         rightarg = sphere,         procedure = sphere_gt); CREATE OPERATOR >= (         leftarg = sphere,         rightarg = sphere,         procedure = sphere_ge); 

In this file %LIB will be substituted for the absolute path to the C library, which will be generated using sphere.c. After adding sphere_in and sphere_out to the database, the data type can be created. The internal length is 16 bytes (four float variables). In addition, the functions used for parsing and displaying the result are defined. This is all the information PostgreSQL needs to process a rudimentary data type. All additional operators can be seen as features of this rudimentary data type.

After defining the data type, a set of functions is created. These functions will be needed for defining the operators. The first operator defined in this file is the = operator. As you can see, it has a right and a left argument. In other words, spheres can be compared with spheres as soon as this operator has been defined. The function used to compare two spheres is called sphere_eq. You have seen this function twice already. The first time it occurred in the C file, where you saw how it was implemented. The second time was in this file when a function was created. Now this function is used for defining the operator. Let's take a look at the next operator. In addition to the left operator and the right operator, the function the operator is based on has been defined. The commutator of the < operator is the > operator because it is the exact opposite of the < operator. Internally a commutator is used to help the optimizer to find the best way through the query. In addition to commutators, negators can be defined. The negator of the = operator would be the != operator, which means not equal instead of equal.

After all functions and operators have been defined, it is time to compile the code. Therefore a makefile should be used. The following listing contains a makefile you can use to compile your sources and to set up a testing environment:

 DB=     phpbook #       Rules INC=    -I/usr/src/postgresql-7.2rc2/src/include # INC=  -I/usr/include/postgresql CFLAGS= -Wall -fPIC -O2 $(INC) CC=     gcc #       Libraries OBJ=    sphere.o LIB=    /home/hs/datatype/datatype/libsphere.so all::   $(LIB) clean::         rm -f $(LIB) $(OBJ) $(LIB): $(OBJ)         gcc -shared -Wl,-soname,$@ -o $@ $(OBJ) #       SQL-File all:: sphere.sql clean::         rm -f sphere.sql #       Testing stamp_db:         if test -f $@; then dropdb $(DB); fi         createdb $(DB)         touch $@ clean::         -dropdb $(DB)         rm -f stamp_db run: all stamp_db sphere.sql test.sql         psql phpbook < sphere.sql         psql phpbook < test.sql 

Just run make to compile the sources. The next listing shows what happens if make is used:

 [hs@duron datatype]$ make gcc -Wall -fPIC -O2 -I/usr/src/postgresql-7.2rc2/src/include -c -o sphere.o sphere.c gcc -shared -Wl,-soname,/home/hs/datatype/datatype/libsphere.so -o  /home/hs/datatype/datatype/libsphere.so sphere.o sed -e 's|%LIB%|/home/hs/datatype/datatype/libsphere.so|' sphere.tpl > sphere.sql 

The sources are compiled and a shared object called libsphere.so is generated. In addition, the SQL file is generated based on the template file sphere.sql. As you can see, sed is called to substitute %LIB for the absolute path to the shared object. This way the path need not be hard-coded in a header file or a C file. Now that sphere.sql has been generated, it is worth taking a brief look at it:

 -- Input Function CREATE OR REPLACE FUNCTION sphere_in(opaque)         RETURNS sphere         AS '/home/hs/datatype/datatype/libsphere.so'         LANGUAGE 'C'; 

As you can see, the SQL file contains the correct settings. Now you can add the data type to the database by running make run or psql as shown in the next listing. If you decide to use make, make sure that make clean has been used before running make run:

 psql phpbook < sphere.sql 

make clean makes sure that trash is removed:

 [hs@duron datatype]$ make clean -n rm -f /home/hs/datatype/datatype/libsphere.so sphere.o rm -f sphere.sql dropdb phpbook rm -f stamp_db 

If no errors occurred during compilation, it is time to test the data type. Therefore test.sql has been implemented:

 CREATE TABLE mysphere (data sphere); INSERT INTO mysphere VALUES ('1, 1, 1, 3'); INSERT INTO mysphere VALUES ('1, 1, 1, 8'); INSERT INTO mysphere VALUES ('3, 9, 0, 7'); SELECT data FROM mysphere WHERE data = '3, 9, 0, 7'; SELECT data FROM mysphere WHERE data > '3, 9, 0, 7'; SELECT data FROM mysphere WHERE data <= '3, 9, 0, 7'; SELECT data, sphere_volume(data) FROM mysphere; 

First a table is created and several lines are added to the table. Then three queries are executed and the volume of all records in the table is computed. Let's send the file to psql and see what comes out:

 [hs@duron datatype]$ psql phpbook < test.sql CREATE INSERT 24944 1 INSERT 24945 1 INSERT 24946 1                   data ----------------------------------------  3.000000, 9.000000, 0.000000, 7.000000 (1 row)                   data ----------------------------------------  1.000000, 1.000000, 1.000000, 8.000000 (1 row)                   data ----------------------------------------  1.000000, 1.000000, 1.000000, 3.000000  3.000000, 9.000000, 0.000000, 7.000000 (2 rows)                   data                  |  sphere_volume ----------------------------------------+------------------  1.000000, 1.000000, 1.000000, 3.000000 | 113.097335529233  1.000000, 1.000000, 1.000000, 8.000000 | 2144.66058485063  3.000000, 9.000000, 0.000000, 7.000000 | 1436.75504024173 (3 rows) 

As you can see, no errors occurred, so the data type can be used safely.

So far you have seen how to build a rudimentary data type. It is not possible to work with aggregations and index structures yet. As you can see in the next example, SUM does not work yet:

 phpbook=# SELECT SUM(data) FROM mysphere; ERROR:  Unable to select an aggregate function sum(sphere) 

In addition, it is not possible to use indexes because it is necessary to add some entries to PostgreSQL's system tables. This is not as easy as implementing the C code a data type is based on. This is a book about Web development, so for our purposes, it is enough to be able to implement simple data types because you can easily build workarounds.



PHP and PostgreSQL. Advanced Web Programming2002
PHP and PostgreSQL. Advanced Web Programming2002
ISBN: N/A
EAN: N/A
Year: 2004
Pages: 201

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net