Working with Large Amounts of Data


When you build a database containing huge amounts of data, it pays to speed up the loading process as much as possible. One important question is whether indexes on certain columns should be created before or after inserting the data.

In general, it is faster to create an index after inserting the data, because the index then has to be built only once instead of being updated after every single record inserted into the database.
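A minimal sketch of this order of operations could look like the following; the table sales, its columns, and the data file are made up purely for illustration:

CREATE TABLE sales (
    id          serial,
    customer_id integer,
    amount      numeric
);

-- load the data first (for example with COPY or a series of INSERTs) ...
COPY sales (customer_id, amount) FROM '/tmp/sales.dat';

-- ... and build the index afterwards, so it is constructed only once
CREATE INDEX idx_sales_customer ON sales (customer_id);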

Another important issue when working with large amounts of data is the choice between COPY and INSERT. In general, inserting data with COPY is much faster than with individual INSERT commands, because every INSERT statement has to be parsed separately and carries its own transaction overhead.
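As a rough comparison, again using the hypothetical sales table and a made-up data file, the two approaches look like this:

-- one INSERT per record: every statement is parsed and, unless wrapped
-- in an explicit transaction, committed separately
INSERT INTO sales (customer_id, amount) VALUES (1, 10.50);
INSERT INTO sales (customer_id, amount) VALUES (2, 99.90);

-- COPY loads the whole file in a single statement and a single transaction;
-- the default text format expects one tab-separated record per line
COPY sales (customer_id, amount) FROM '/tmp/sales.dat';

If the data file resides on the client machine rather than on the database server, the client-side \copy command in psql can be used instead.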

PostgreSQL is capable of handling enormous amounts of data and remains perfectly stable under high load. The test database we use in this chapter consists of 10 million records, but most operations can still be performed quickly when indexes are used. You will most likely not face any problems with PostgreSQL, even if the database grows to dozens of gigabytes. Some operating systems and file systems, such as older 32-bit Linux installations, do not support files larger than 2GB. PostgreSQL gets around this limit by splitting the storage of every table into many separate files (1GB segments by default), so the user does not have to care about operating system or file system specifics.


