Adding Datatypes


PostgreSQL provides an easy interface for adding your own datatypes. In many applications, self-defined datatypes are much more efficient than PostgreSQL's onboard datatypes. Your own datatypes not only speed up your database significantly; they also allow the user to define functions that are not included in PostgreSQL's core distribution. With the help of highly optimized, self-defined datatypes, PostgreSQL can be used in environments that it was not explicitly designed for. There is no need to invent the wheel once again ”simply take a PostgreSQL database and add the desired functions to it.

The most efficient way to add new datatypes is to use the C programming language, because C is one of the fastest and most efficient programming languages available.

In this section, we define a simple datatype for PostgreSQL that can be used to store colors. You will see that this is not a difficult task.

The most important components of a datatype are the in and out functions. The in function is used to insert values into the database. The out function is called when a value using the datatype has to be displayed. Besides some SQL code creating the datatype and the function, this is all we need for creating a primitive datatype.

The following lists the content of color .c , which shows the two functions we need:

 #include <stdio.h> #include <pgsql/postgres.h> /* defining the datastructure "color" */ typedef      struct color {      short   r;      short   g;      short   b; }  color; /* function to insert color into database */ color *color_in(char *str) {      short r, g, b;      color *mycolor;      /* reading the input */      if      (sscanf(str, " ( %hi , %hi , %hi )", &r, &g, &b) != 3)      {              elog(ERROR, "color_in: parse error in %s", str);              return NULL;      }      /* allocating the memory used to store the result */      mycolor = (color *)palloc(sizeof(color));      /* setting the minimum value to zero */      mycolor->r = (0 > r) ? 0 : r;      mycolor->g = (0 > g) ? 0 : g;      mycolor->b = (0 > b) ? 0 : b;      /* setting the maximum value to 255 */      mycolor->r = (255 < r) ? 255 : mycolor->r;      mycolor->g = (255 < g) ? 255 : mycolor->g;      mycolor->b = (255 < b) ? 255 : mycolor->b;      return (mycolor); } /* function to print colors */ char *color_out(color *mycolor) {      char *result;      /* checking for NULL values */      if      (mycolor == NULL)      {              return (NULL);      }      /* allocating the memory used to store the result */      result = (char *) palloc(37);      /* assigning the required value to "result" */      sprintf(result, "(%i, %i, %i)", mycolor->r, mycolor->g, mycolor->b);      return result; } 

Note

Errors in one of the functions related to a datatype can lead to backend crashes.


First we define the data structure of color . This is done by defining a simple struct called color , which contains all components our datatype must have. In our example, we combine three components. The first variable ( r ) defines the red component of the color. The second component ( g ) sets the green value, and the third component ( b ) tells us the amount of blue our color contains.

With the help of these three components, it's an easy task to generate every possible color.

Note

rgb colors are usually used on computers. Printing machines sometime use "ryc" (red, magenta , cyan) colors.


In the next step, we read the input of the function by using sscanf , which helps us extract the three components from the input data. The definition of the syntax looks like this: ( %hi , %hi , %hi ) .

We allow only one syntax for our datatype, because this is the easiest way and that's enough for our example (we want the code to be as simple as possible). If you want your datatype to support richer syntax, your in function has to be more complex and therefore ”in most cases ”also slower.

If the string passed to the function by the user does not satisfy our demands, an error is displayed using elog . In the example, we want an error to be displayed, which causes the function to quit. If we don't want our function to quit, we would have to use NOTICE instead.

In the next step, we allocate the required amount of memory to store the structure. Before we return the result of the function, we make sure that the components of our datatype can contain only values from zero to 255 .

The out function is even easier than the in function. We check for a NULL value; if the input does not contain NULL , we allocate the memory that we need to display the result and assign the content of the datatype to that memory. Finally, the result is returned as a string.

We compile the code shown as a shared object. As we did in nearly all our examples, we use a simple Makefile:

 libfoo.so    :      color.c      rm lib* *o -f      gcc -Wall -fPIC -c *.c      gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 *.o      ln -s libfoo.so.1.0 libfoo.so.1      ln -s libfoo.so.1 libfoo.so      psql mydata < setup.sql 

The second line does a simple cleanup. In the third line, we compile all C files, and the fourth line makes a shared object of the result. After creating two softlinks, we insert setup.sql into the database.

setup.sql contains a few lines of SQL code that generate the datatype and drop datatypes having the same datatype as the one we want to insert:

 DROP TYPE color; DROP FUNCTION color_in(opaque); DROP FUNCTION color_out(opaque); CREATE FUNCTION color_in(opaque)      RETURNS color AS '/usr/local/postgres_extpack/libfoo.so'      LANGUAGE 'c'; CREATE FUNCTION color_out(opaque)      RETURNS opaque AS '/usr/local/postgres_extpack/libfoo.so'      LANGUAGE 'c'; CREATE TYPE color (              internallength=6,              input=color_in,              output=color_out ); DROP TABLE mycolor; CREATE TABLE mycolor(cfield color); 

You can see that the functions needed for the datatype can be created just like any other C function. The most critical part of the whole process is the CREATE TYPE function:

The actual creation of the datatype is done by CREATE TYPE . Several parameters are passed to the function. internallength defines ”as you might expect ”the internal amount of storage needed for a variable using our datatype. input defines the function PostgreSQL should use as the in function of our datatype. output tells the database which out function has to be used. The CREATE TYPE command supports a variety of other parameters; we will deal with some of these later in this chapter.

Let's start the Makefile and see what happens:

 [hs@duron postgres_extpack]$  make  rm lib* *o -f gcc -Wall -fPIC -c *.c gcc -shared -Wl,-soname,libfoo.so.1 -o libfoo.so.1.0 *.o ln -s libfoo.so.1.0 libfoo.so.1 ln -s libfoo.so.1 libfoo.so psql mydata < setup.sql DROP DROP DROP NOTICE:  ProcedureCreate: type 'color' is not yet defined CREATE CREATE CREATE DROP CREATE 

All CREATE statements have been executed successfully, so it looks as if everything works fine. Let's try it:

 mydata=#  \   d   mycolor  Table "mycolor"  Attribute  Type   Modifier -----------+-------+----------  cfield     color 

The Makefile (it was actually done by setup.sql ) has already created a table successfully. Now we want to insert some data:

 mydata=#  INSERT INTO mycolor VALUES ('(0, 20, 0)');  INSERT 119310 1 mydata=#  INSERT INTO mycolor VALUES ('(0, 20, 0)'::color);  INSERT 119311 1 

It seems as if everything is still okay and no errors have occurred. Let's select the data that we have just inserted into the table by using a simple SELECT statement:

 mydata=#  SELECT * FROM mycolor;  cfield ------------  (0, 20, 0)  (0, 20, 0) (2 rows) 

Up to now, all records that we have inserted into the database contained valid components, which means that no number has been smaller than zero and higher than 255. In the in function, these values have to be set to a correct value; let's see what happens:

 mydata=#  INSERT INTO mycolor VALUES ('(-10, 20, 300)'::color);  INSERT 119312 1 

The record has successfully been inserted into the database; now we query the table again to see what has happened to the record:

 mydata=#  SELECT * FROM mycolor;  cfield --------------  (0, 20, 0)  (0, 20, 0)  (0, 20, 255) (3 rows) 

The negative value has been changed to , and 300 has been changed to 255 , which is the maximum value we allow for a component of a color.

The previous datatype shows impressively how simple it is to extend PostgreSQL's set of datatypes by implementing a simple C function. Some C gurus among you might say that the next code is no efficient implementation of a datatype, because three bytes (instead of six bytes) of storage would be enough to store information about a color. This is certainly true. The reason for our wasteful way of programming is that we want the code to be as simple as possible so that even C beginners can easily understand how PostgreSQL can be extended.



PostgreSQL Developer's Handbook2001
PostgreSQL Developer's Handbook2001
ISBN: N/A
EAN: N/A
Year: 2004
Pages: 125

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net