Securing the PostgreSQL Data Files

Table of contents:

Unix File Permissions and Ownership

The first step in securing a PostgreSQL installation is to secure the actual data files that comprise each database. PostgreSQL is typically installed in the /usr/local/pgsql directory. Executables (such as psql, initdb, and the postmaster) are often installed in the /usr/local/pgsql/bin directory. If you have a typical installation, you can expect to find data files: databases, configuration, and security information in /usr/local/pgsql/data. I'll refer to this last directory as $PGDATA. PostgreSQL uses the $PGDATA environment variable to find its data files.

Let's start by looking at the directory structure of a PostgreSQL installation (see Figure 23.1):

Figure 23.1. The directory structure of a PostgreSQL installation.

The data directory contains six subdirectories: base, global, pg_xlog, pg_clog, pg_subtrans, and pg_tblspc[1].

[1] You may see more files and subdirectories if you are running a different version of PostgreSQL. This snapshot shows a typical installation of PostgreSQL release 7.1.3.

The data/base directory is where your databases live. Notice that I have three subdirectories underneath the base directorythat's because I have three databases. If you are curious about the directory naming scheme, the numbers correspond to the OIDs (object ids) of the corresponding rows in the pg_database table. You can see the correspondence by executing the following query:

psql> select oid, datname from pg_database;

 oid | datname
------+-----------
18721 | movies
 1 | template1
18719 | template0

The data/global directory contains information that spans all databases; in other words, the information in the global directory is independent of any particular databaseit contains information that's shared by each database in the cluster. The global directory contains the following files (in version 8.0): 1213, 1260, 1261, 1262, pg_control, pg_pwd, and a number of indexes. If you're running a different version of PostgreSQL, you'll see a different set of files in the global directoryyou may also see a pg_pwd file, pgstat.stat, and pg_fsm.cache in some releases.

Like the data/base directory, the data/global directory contains a few files whose names are actually OID values. Table 23.1 shows how the OID values translate into table names.

Table 23.1. OID to Table Mapping in the `global` Directory
Filename/OID	Corresponding Table
`1213`	`pg_tablespace`
`1260`	`pg_shadow`
`1261`	`pg_group`
`1262`	`pg_database`

Each of these files is explained in Chapter 21, "PostgreSQL Administration," so I won't cover that information here.

The data/pg_xlog directory contains the write-ahead transaction log (also described in Chapter 19). data/pg_clog contains "commit" logs (which, when used together with data/pg_xlog, can provide point-in-time recovery). PostgreSQL keeps track of nested transactions in the data/pg_subtrans directory. When you create a new tablespace, PostgreSQL creates a symbolic link to the new location in the data/pg_tblspc directory (see Chapter 3, "PostgreSQL Syntax and Use," for more details).

Unix File Permissions and Ownership

In a Unix environment, there are three aspects to filesystem security. Each file (or directory) has an owner, a group, and a set of permissions. You can see all three of these attributes using the ls -l command. Here is an example:

[View full width]

total 40 drwx------ 5 postgres postgresgrp 4096 Oct 22 17:40 base drwx------ 2 postgres postgresgrp 4096 Jan 15 18:58 global -rw------- 1 postgres postgresgrp 7482 Jan 15 19:26 pg_hba.conf -rw------- 1 postgres postgresgrp 1118 Oct 22 17:35 pg_ident.conf -rw------- 1 postgres postgresgrp 4 Oct 22 17:35 PG_VERSION drwx------ 2 postgres postgresgrp 4096 Oct 22 17:35 pg_xlog -rw------- 1 postgres postgresgrp 3137 Oct 22 17:35

postgresql.conf -rw------- 1 postgres postgresgrp 49 Jan 10 14:18

postmaster.opts -rw------- 1 postgres postgresgrp 47 Jan 10 14:18

postmaster.pid

Each line of output can be divided into seven columns. Starting at the rightmost column, you see the file (or directory) name. Working to the left, you'll see the modification date, file size (in bytes), group name, username, link count, and file permissions.

The file permissions column can be interpreted as follows:

drwxrw-r--

The first character is a file type indicator and contains a "d" for directories and a "-" for normal files (other values are possiblerefer to your OS documentation for more information).

Following the type indicator are three groups of access permissions, and each group contains three characters. The first group (rwx in this example) specifies access permissions for the owner of the file. rwx means that the owner can read, write, and execute the file. The next three characters (rw-) specify access permissions for members of the group. rw- means that members of the group can read and write this file, but cannot execute it. The last three characters in the permissions column control access by other users (you are considered an "other" user if you are not the owner and you are not in the file's group). r-- means that other users can read the file, but cannot write or execute it.

Permissions mean something a little different for directories. If you have read permissions for a directory, you can list the contents of that directory (using ls, for example). If you have write permissions for a directory, you can create files in, and remove files from, that directory. If you have execute permission, you can access the files in a directory (read permission allows you to list the contents of a directory; execute permission allows you to work with the contents of the files in that directory).

When you install PostgreSQL from a standard distribution, such as an RPM package, the installation procedure will automatically apply the correct ownership and permissions to all PostgreSQL components. In rare circumstances, you may find that you need to reset ownerships and permissions back to their correct states. Why? You may find that your system has been "hacked." You may need to recover from an error in a backup/restore procedure. You may have executed a recursive chown, chmod, or chgrp starting in the wrong directoryyou're not an experienced system administrator until you have made (and recovered from) this mistake. It's a good idea to understand what the correct ownerships and permissions are, just in case you ever need to put things back the way they are supposed to be.

The entire directory tree (starting at and including the $PGDATA directory) should be owned by the PostgreSQL administrative user (this user is typically named :postgres"). It's easy to correct the file ownerships using the chown command:

$ chown -R postgres $PGDATA

You can use the following commands to find any files that are not owned by user postgres:

$ cd $PGDATA
$ find . -not -user postgres -ls

The $PGDATA directory tree should be readable and writable by the PostgreSQL administrative user, and should provide no access to the group and other categories. Again, setting the file permissions is easy:

$ cd $PGDATA
$ find . -type d -exec chmod 700 '{}' ';'
$ find . -type f -exec chmod 600 '{}' ';'

The first find command modifies the directories, and the second modifies the normal files. The numbers (700 and 600) are a portable way to specify access permissions. 700 is equivalent to u=rwx,g=,o=, meaning that the owner of the directory should have read, write, and execute permissions; other users have no rights. 600 is equivalent to u=rw,g=,o= meaning that the owner of the file should have read and write permissions and other users should have no access rights. You can use whichever form you prefer. The numeric form is more succinct and more portable. I prefer the symbolic form, probably because I can't do octal arithmetic in my head.

It's a good idea to verify file and directory permissions occasionally for the reasons I mentioned earlier: You may have an intruder on your system, or you might need to recover from a user mistake. You can also use the find command to find any files or directories with incorrect permissions:

$ cd $PGDATA
$ find . -type d -not -perm 700 -print
$ find . -type f -not -perm 600 -print

There is one more file that you should consider securing besides the files in the $PGDATA directory tree. When local users (meaning users who are logged in to the system that hosts your PostgreSQL database) connect to the postmaster, they generally use a Unix-domain socket. (A Unix-domain socket is a network interface that doesn't actually use a network. Instead, a Unix-domain socket is implemented entirely within a single Unix operating system.) When you start the postmaster process, it creates a Unix-domain socket, usually in the /tmp directory. If you have a postmaster running on your system, look in the /tmp directory and you will see the socket that your postmaster uses to listen for connection requests:

$ ls -la /tmp
total 8095
drwxrwxrwt 12 root root 1024 Jan 25 18:04 .
drwxr-xr-x 21 root root 4096 Jan 25 16:23 ..
drwxr-xr-x 2 root root 1024 Jan 10 10:37 lost+found
srwxrwxrwx 1 postgres postgresgrp 0 Jan 25 18:01 .s.PGSQL.5432
-r--r--r-- 1 root root 11 Jan 24 19:18 .X0-lock

(You will likely find other files in the /tmp directory.) The postmaster's socket is named s.PGSQL.5432. You can tell that this is a socket because of the s in the leftmost column. Because the name of the socket starts with a., I had to use the -a flag on the ls command. Files whose names begin with a period (.) are normally hidden from the ls command.

Notice that the permissions on this socket are rwxrwxrwx. This means that any user (the owner, members of the group, or others) can connect to this socket. You might consider restricting access to this socket. For example, if you change the permissions to rwxrwx---, only user postgres and members of the postgresgrp group could connect.

Unlike normal files, you don't set the socket permissions using the chmod command (the postmaster's socket is created each time the postmaster starts). Instead, you use the UNIX_SOCKET_PERMISSION runtime-configuration option (Chapter 19 discusses runtime-configuration options in more detail).

Note that just because you can connect to the socket does not mean that the postmaster will allow you to access a databasethe next section describes how to secure the postmaster.

Securing PostgreSQL Data Files in Windows

In a typical Windows installation, the PostgreSQL installer creates a new account (named postgres) that owns the PostgreSQL service, the executables (psql.exe, postmaster.exe, initdb.exe, and so on), PostgreSQL configuration files, and all PostgreSQL data files. The installer explicitly defines permissions for the $PGDATA directory (and all subdirectories) and the binaries directory (the directory that contains the executables).

For the $PGDATA directory, the installer grants CHANGE privileges to the postgres account. For the binaries directory, the installer first denies all privileges for the postgres account, and then grants READ privileges. Unfortunately, the installer leaves all other (inherited) privileges in place. If you install PostgreSQL in the default location (Program Files), your PostgreSQL server inherits a number of privileges from that directory. For example, on a fresh Win2000 (Professional) system, the Program Files directory grants READ and EXECUTE privileges to all to members of the Users group, CHANGE privileges to the Power Users group, and FULL privileges to Administrators.

If you want to tighten up the security of your PostgreSQL installation, I'd recommend a different strategy: delete all of the ACL entries for the PostgreSQL directory tree and then add back only those privileges that you really want to grant. You can do that with a few simple commands. I'll walk you through the process, assuming that you've installed PostgreSQL in the default location (Program Files) and the PostgreSQL service is owned by the default user (postgres).

First, open a command-prompt window and navigate to the PostgreSQL install directory:

CD "Program FilesPostgreSQL8.0"

To start, delete all ACL entries assigned to the bin and data directory trees:

CACLS bin /T /G Administrators:F
CACLS data /T /G Administrators:F

The first command deletes any existing ACL entries and grants FULL privileges to the Administrators group. The /T flag tells CACLS to operate on the entire directory tree. The /G flag grants a privilege (in this case, it grants the F(ull) privilege to the Administrators group). Because you are not using the /E flag, CACLS deletes all existing ACL entries before granting new ones (/E stands for editif you include /E, CACLS edits existing ACLs; if you don't, CACLS replaces any existing ACLs).

The PostgreSQL executables reside in the bin directory so the postgres user must hold READ (and therefore, execute) privileges for that directory. Grant postgres READ privileges for the bin directory tree:

CACLS bin /E /T /G postgres:R

That command also grants READ (and execute) privileges to every file in the bin directory, not just the directory itself.

When you create a new database, table, or index, PostgreSQL creates a new file somewhere in the data directory tree. When you UPDATE, DELETE, or INSERT data, the PostgreSQL server modifies files in the data directory tree. Accordingly, you must grant the postgres account the right to CHANGE (read, write, and create) files in the data directory tree

CACLS data /E /T /G postgres:C

Your bin and data directories are now protected against tampering. A member of the Administrators group can still mess around with your database, but you shouldn't convey administrator rights to people that you don't trust. You can start the PostgreSQL service at this point and it should be ready to service client requests.

However, most of the PostgreSQL client applications reside in the bin directory and you've just clamped down that directory so tight that a mere user can't see inside of it. To fix that problem, grant READ privileges to the Users group:

CACLS bin /E /G Users:R

Notice that I didn't include the /T (tree) flag in that command. Without the /T flag, CACLS only grants READ privileges to the bin directory itself, not to the files within the directory. That means that member of the Users group can see the files in the bin directory (with the DIR command, for example), but he can't execute any of those programs. Look through the bin directory and decide which programs you want your Users to run. In most cases, you don't want Users to run administrative commands such as createuser or dropdb. In fact, the only program needed by most Users is the PostgreSQL command-line client: psql.exe.

To allow your Users to execute the psql.exe command, execute the following:

CACLS binpsql.exe /E /G Users:R

If you find other programs that you want to expose to your Users, just grant READ privileges for those programs.

You may want to secure other directories in the PostgreSQL tree as well, but protecting the bin and data directories will ensure that a nefarious intruder can't tamper with your PostgreSQL server or data.

Part I: General PostgreSQL Use

Introduction to PostgreSQL and SQL

Working with Data in PostgreSQL

PostgreSQL SQL Syntax and Use

Performance

Part II: Programming with PostgreSQL

Introduction to PostgreSQL Programming

Extending PostgreSQL

PL/pgSQL

The PostgreSQL C APIlibpq

A Simpler C APIlibpgeasy

The New PostgreSQL C++ APIlibpqxx

Embedding SQL Commands in C Programsecpg

Using PostgreSQL from an ODBC Client Application

Using PostgreSQL from a Java Client Application

Using PostgreSQL with Perl

Using PostgreSQL with PHP

Using PostgreSQL with Tcl and Tcl/Tk

Using PostgreSQL with Python

Npgsql: The .NET Data Provider

Other Useful Programming Tools

Part III: PostgreSQL Administration

Introduction to PostgreSQL Administration

PostgreSQL Administration

Internationalization and Localization

Security

Replicating PostgreSQL Data with Slony

Contributed Modules

Index