After a database has been set up, the administrator's work is not over: a number of tasks have to be performed to keep the system up and running. Compared to many other databases, administering a PostgreSQL database takes very little effort because the entire system has been implemented flexibly and reliably. In addition, PostgreSQL is capable of handling huge amounts of data. Nevertheless, some basic administration tasks remain, and you will learn about them in this section. It is also necessary to understand some basic internals of the database; you will need this knowledge to configure PostgreSQL to your needs.

10.2.1 VACUUM and ANALYZE

VACUUM is the most important maintenance command when working with PostgreSQL. It serves several purposes and should be run periodically. Here is a list of what VACUUM is used for:

- Reclaiming the disk space occupied by rows that have been updated or deleted
- Rebuilding the statistics used by the query optimizer (in combination with ANALYZE)
Now that you have seen what VACUUM can be used for, let's take a look at a practical example. Create an empty table called data:

```
phpbook=# CREATE TABLE data (data int4);
CREATE
```

Currently the database cluster we are working with is about 41.7MB in size. Go to the directory where your database cluster can be found and run du:

```
[postgres@athlon pgdb]$ du
1716    ./base/1
1716    ./base/16555
3576    ./base/25110
1716    ./base/26076
8728    ./base
140     ./global
32812   ./pg_xlog
12      ./pg_clog
41732   .
```

In the next step you can write a PHP script that generates a file containing a series of INSERT and DELETE operations:

```php
<?php
$fp = fopen("/tmp/data.sql", "w+");
if (!$fp) {
    echo "cannot open file\n";
    exit;
}
for ($i = 0; $i < 10000; $i++) {
    $sql  = "INSERT INTO data VALUES ($i);\n";
    $sql .= "DELETE FROM data;\n";
    fwrite($fp, $sql);
}
fclose($fp);
?>
```

To generate the file called data.sql, use the PHP command-line interpreter:

```
[postgres@athlon php]$ php makedata.php
X-Powered-By: PHP/4.0.6
Content-type: text/html
```

If no error occurred, the file has been generated successfully and contains a set of INSERT and DELETE operations, as shown in the next listing:

```
[postgres@athlon tmp]$ head data.sql
INSERT INTO data VALUES (0);
DELETE FROM data;
INSERT INTO data VALUES (1);
DELETE FROM data;
INSERT INTO data VALUES (2);
DELETE FROM data;
INSERT INTO data VALUES (3);
DELETE FROM data;
INSERT INTO data VALUES (4);
DELETE FROM data;
```

This SQL code can easily be sent to the database:

```
[postgres@athlon pgdb]$ psql phpbook < /tmp/data.sql
```

Although not a single additional record is stored in the table, the amount of disk space needed has grown:

```
[postgres@athlon pgdb]$ du
1716    ./base/1
1716    ./base/16555
3984    ./base/25110
1716    ./base/26076
9136    ./base
140     ./global
32812   ./pg_xlog
12      ./pg_clog
42140   .
```

To solve the problem and reduce the space allocated on disk, VACUUM can be run.
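Incidentally, the PHP script is not the only way to build such a workload. As a minimal sketch, a plain shell loop produces an equivalent data.sql (this assumes a POSIX shell with seq available; the psql call at the end is commented out because it requires a running server):

```shell
# Generate 10,000 INSERT/DELETE pairs, mirroring makedata.php
for i in $(seq 0 9999); do
    printf 'INSERT INTO data VALUES (%d);\nDELETE FROM data;\n' "$i"
done > /tmp/data.sql

# Inspect the first lines of the generated file
head -4 /tmp/data.sql

# Send the statements to the database (requires a running PostgreSQL server):
# psql phpbook < /tmp/data.sql
```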
As you have already seen, VACUUM can be run from PostgreSQL's interactive shell, but in some cases it is more convenient to use a shell program called vacuumdb to perform the job:

```
[postgres@athlon pgdb]$ vacuumdb --help
vacuumdb cleans and analyzes a PostgreSQL database.

Usage:
  vacuumdb [options] [dbname]

Options:
  -h, --host=HOSTNAME             Database server host
  -p, --port=PORT                 Database server port
  -U, --username=USERNAME         Username to connect as
  -W, --password                  Prompt for password
  -d, --dbname=DBNAME             Database to vacuum
  -a, --all                       Vacuum all databases
  -t, --table='TABLE[(columns)]'  Vacuum specific table only
  -f, --full                      Do full vacuuming
  -v, --verbose                   Write a lot of output
  -z, --analyze                   Update optimizer hints
  -e, --echo                      Show the command being sent to the backend
  -q, --quiet                     Don't write any output

Read the description of the SQL command VACUUM for details.
Report bugs to <pgsql-bugs@postgresql.org>.
```

As you can see, it is possible to define quite a few parameters, most of which are needed for vacuuming a remote database. To vacuum all your databases, you can use the following command:

```
[postgres@athlon pgdb]$ vacuumdb -e -z -a
Vacuuming template1
VACUUM ANALYZE
VACUUM
Vacuuming phpbook
VACUUM ANALYZE
VACUUM
```

Because the -e flag has been used, vacuumdb lists the SQL commands sent to the back end on the screen. After that you can see how much disk space is still allocated:

```
[postgres@athlon pgdb]$ du
1744    ./base/1
1716    ./base/16555
3576    ./base/25110
1744    ./base/26076
8784    ./base
140     ./global
32812   ./pg_xlog
12      ./pg_clog
41788   .
```

The amount of space needed has decreased. If you are running many UPDATE and DELETE operations, it is recommended that you run VACUUM from a cron job at times when the load on your database is low. To rebuild the statistics needed by the optimizer, ANALYZE has to be used. Whenever the data in a table changes significantly, you should run ANALYZE to provide the optimizer with correct and reliable information.
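The cron job suggested above can be sketched as a single crontab entry for the postgres user. The schedule and the path to the binary are assumptions and have to be adapted to your installation; the flags are the long forms of the -z, -a, and -q options shown in the help output:

```
# Run vacuumdb on all databases every night at 3:30 a.m.
# (hypothetical installation path -- adjust to your system)
30 3 * * * /usr/local/pgsql/bin/vacuumdb --analyze --all --quiet
```

Install the entry with crontab -e while logged in as the user PostgreSQL runs as.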
10.2.2 WAL

Write Ahead Logs (WAL) are a sophisticated feature of PostgreSQL. The main idea of WAL is that data files are written to disk only after the changes have been recorded in a log. This has significant advantages: the number of disk accesses is reduced, so a lot of speed can be gained, and in case of disaster it is easy to restore the data because the changes have already been logged. With the help of the log, the database can safely be repaired after a crash. The concept is in some ways similar to the journaling used by some filesystems. In addition to speed, WAL allows PostgreSQL to guarantee that the data in the database is consistent and ready for action after a crash. UNDO operations are not yet implemented, but they are planned for future releases of PostgreSQL. With the help of UNDO, it will be possible to implement savepoints, which will be another important step in the history of PostgreSQL.

10.2.3 Maintaining Log Files

PostgreSQL can be configured to produce a lot of logging information. If the database runs for a long time, the logfile will grow and allocate a lot of disk space, so old information should be removed from time to time. PostgreSQL does not remove old logging information by itself, so it is necessary to use external software such as logrotate or the rotatelogs tool shipped with Apache.
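With logrotate, the cleanup described above can be sketched as a short configuration file. The logfile path is an assumption; it depends on where you redirect the postmaster's output:

```
# /etc/logrotate.d/postgresql -- hypothetical logfile path
/var/log/postgresql.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}
```

The copytruncate directive copies the logfile and truncates the original in place, so the postmaster can keep writing to the same file descriptor without being restarted.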