7.7 Inserting Huge Amounts of Data


Until now you have only dealt with small amounts of data. Inserting only a few records can easily be done by using INSERT statements. With huge amounts of data, using INSERT would be far too slow due to a lot of parsing and transaction overhead.

To insert data more efficiently, PostgreSQL provides a command called COPY, which can also be used to store tables in external files. Let's look at the syntax overview of the COPY command:

 phpbook=# \h COPY Command:     COPY Description: copy data between files and tables Syntax: COPY [ BINARY ] table [ WITH OIDS ]     FROM { 'filename' | stdin }     [ [USING] DELIMITERS 'delimiter' ]     [ WITH NULL AS 'null string' ] COPY [ BINARY ] table [ WITH OIDS ]     TO { 'filename' | stdout }     [ [USING] DELIMITERS 'delimiter' ]     [ WITH NULL AS 'null string' ] 

As you can see, the COPY command is a powerful feature of PostgreSQL. To insert some data into the table called student, you can use PostgreSQL's interactive shell:

 phpbook=# COPY student FROM stdin; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> 4    Paul >> 5    Henry >> 6    Sheila >> 7    Shelley >> 8    Pat >> 9    Pauline >> \. 

After starting the COPY command, the data has to be passed to the database using tabs as separators. To tell PostgreSQL that no more data will be inserted, \. has to be used. If no error occurred, the data has successfully been inserted into the table:

 phpbook=# SELECT * FROM student;  id |  name ----+---------   1 | Epi   2 | Hans   3 | Heinz   4 | Paul   5 | Henry   6 | Sheila   7 | Shelley   8 | Pat   9 | Pauline (9 rows) 

In many cases the data that has to be imported does not contain tabs for separating the various fields. In this case it is necessary to tell COPY which symbol has to be used as delimiter. In addition, it is possible to import the data from files as shown in the next example:

The file called data.sql contains two lines of data:

 COPY student FROM stdin USING DELIMITERS ';'; 10;Olaf 11;Andrea \. 

To import the data you can send the data in data.sql to psql:

 psql phpbook < data.sql 

The table contains two additional values now:

 phpbook=# SELECT * FROM student WHERE id > 9;  id |  name ----+--------  10 | Olaf  11 | Andrea (2 rows) 

Another important thing is to tell PostgreSQL what to treat as NULL values. If NULL is passed to the COPY command, PostgreSQL will treat it just like a string, which is not the desired behavior. If a blank value is passed to COPY, it will be treated as an empty string but not as blank.

Take a look at an example where the string null should be treated as NULL:

 phpbook=# COPY student FROM stdin WITH NULL AS 'null'; Enter data to be copied followed by a newline. End with a backslash and a period on a line by itself. >> 12   Carlos >> 13   null >> null James >> \. 

Two records contain NULL values. The next query returns a record if either id or name contains a NULL value.

 phpbook=# SELECT * FROM student WHERE id is NULL OR name IS NULL;  id | name ----+-------  13 |     | James (2 rows) 

In this example two records contain NULL values.

PostgreSQL is capable of handling extremely huge amounts of data, and with the help of the COPY command it is possible to insert data fast and reliably.

As you have seen in the syntax overview, it is also possible to work with binary data. However, we recommend using binary data only when you have exported the data in binary format and you want to import it into the database again. When you are only importing the data, it is easier to use ASCII files because using ASCII will make your applications easier to understand.



PHP and PostgreSQL. Advanced Web Programming2002
PHP and PostgreSQL. Advanced Web Programming2002
ISBN: N/A
EAN: N/A
Year: 2004
Pages: 201

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net