Object IDs (OID) | Working with Data in PostgreSQL

An OID is a 32-bit, positive whole number. Every row[5] in a PostgreSQL database contains a unique identifier[6]the object ID (or OID). Normally, the OID column is hidden. You can see the OID for a row by including the OID column in the target list of a SELECT statement:

[5] By default, all tables are created such that every row contains an OID. You can omit the object IDs using the WITHOUT OIDS clause of the CREATE TABLE command.

[6] The PostgreSQL documentation warns that object IDs are currently unique within a database cluster; but in a future release, an OID may be unique only within a single table.

movies=# SELECT OID, * FROM customers;

 oid | customer_id | customer_name | phone | birth_date | balance
-------+-------------+----------------------+----------+------------+---------
 33876 | 3 | Panky, Henry | 555-1221 | 1968-01-21 | 0.00
 33877 | 1 | Jones, Henry | 555-1212 | 1970-10-10 | 0.00
 33878 | 4 | Wonderland, Alice N. | 555-1122 | 1969-03-05 | 3.00
 33879 | 2 | Rubin, William | 555-2211 | 1972-07-10 | 15.00
 33889 | 5 | Funkmaster, Freddy | 555-FUNK | | 0.00
 33890 | 7 | Gull, Jonathon LC | 555-1111 | 1984-02-05 | 0.00
 33891 | 8 | Grumby, Jonas | 555-2222 | 1984-02-21 | 0.00

You can create a column of type OID if you want to explicitly refer to another object (usually a row in another table). Think back to the rentals table that you developed in Chapter 1. Each row in the rentals table contains a tape_id, a customer_id, and a rental date. The rentals table currently looks like this:

movies=# d rentals
 Table "public.rentals"
 Attribute | Type | Modifier
-------------+--------------+----------
 tape_id | character(8) | not null
 rental_date | date | not null
 customer_id | integer | not null

movies=# SELECT * FROM rentals;
 tape_id | rental_date | customer_id
----------+-------------+-------------
 AB-12345 | 2001-11-25 | 1
 AB-67472 | 2001-11-25 | 3
 OW-41221 | 2001-11-25 | 1
 MC-68873 | 2001-11-20 | 3
 KJ-03335 | 2001-11-26 | 8
(5 rows)

Each value in the tape_id column refers to a row in the tapes table. Each value in the customer_id column refers to a row in the customers table. Rather than storing the tape_id and customer_id in the rentals table, you could store OID s for the corresponding rows. The following CREATE TABLE ... AS command creates a new table, rentals2, that is equivalent to the original rentals table:

movies=# CREATE TABLE rentals2 AS
movies-# SELECT
movies-# t.oid AS tape_oid, c.oid AS customer_oid, r.rental_date
movies-# FROM
movies-# tapes t, customers c, rentals r
movies-# WHERE
movies-# t.tape_id = r.tape_id
movies-# AND
movies-# c.id = r.customer_id;

This statement (conceptually) works as follows. First, you retrieve a row from the rentals table. Next, you use the rentals.customer_id column to retrieve the matching customers row and the rentals.tape_id column to retrieve the matching tapes row. Finally, you store the OID of the customers row and the OID of the tapes row (and the rental_date) in a new rentals2 row.

Now, when you SELECT from the rentals2 table, you will see the object IDs for the customers row and the tapes row:

movies=# SELECT * FROM rentals2;
 tape_oid | customer_oid | rental_date
----------+--------------+-------------
 38337 | 38333 | 2001-11-25
 38338 | 38335 | 2001-11-25
 38394 | 38393 | 2001-11-26
 38339 | 38335 | 2001-11-20
 38340 | 38333 | 2001-11-25

You can re-create the data in the original table by joining the corresponding customers and tapes records, based on their respective OID s:

movies=# SELECT t.tape_id, r.rental_date, c.id
movies-# FROM
movies-# tapes t, rentals2 r, customers c
movies-# WHERE
movies-# t.oid = r.tape_oid AND
movies-# c.oid = r.customer_oid
movies-# ORDER BY t.tape_id;

tape_id | rental_date | id
----------+-------------+----
 AB-12345 | 2001-11-25 | 1
 AB-67472 | 2001-11-25 | 3
 KJ-03335 | 2001-11-26 | 8
 MC-68873 | 2001-11-20 | 3
 OW-41221 | 2001-11-25 | 1
(5 rows)

Here are a couple of warnings about using OID s in your own tables.

The first concern has to do with backups. The standard tool for performing a backup of a PostgreSQL database is pg_dump. By default, pg_dump will not archive OID s. This means that if you back up a table that contains an OID column (referring to another object) and then restore that table from the archive, the relationships between objects will be lost, unless you remembered to tell pg_dump to archive OID s. This happens because when you restore a row from the archive, it might be assigned a different OID.

The second thing you should consider when using OID s is that they offer no real performance advantages. If you are coming from an Oracle or Sybase environment, you might be thinking that an OID sounds an awful lot like a ROWID. It's true that an OID and a ROWID provide a unique identifier for a row, but that is where the similarity ends. In an Oracle environment, you can use a ROWID as the fastest possible way to get to a specific row. A ROWID encodes the location (on disk) of the row that it belongs towhen you retrieve a row by ROWID, you can bypass any index7 searches and go straight to the data. An OID is just a 32-bit numberyou can create an index on the OID column, but you could also create an index on any other (unique) column to achieve the same results. In fact, the only time that it might make sense to use an OID to identify a row is when the primary key[7] for a table is very long.

[7] Don't be too concerned if you aren't familiar with the concept of indexes or primary keys. I'll cover each of those topics a bit later.

Finally, I should point out that OID s can wrap. In an active database cluster, it's certainly possible that 4 billion objects can be created. That doesn't mean that all 4 billion objects have to exist at the same time, just that 4 billion OID s have been created since the cluster was created. When the OID generator wraps, you end up with duplicate values. This may sound a little far-fetched, but it does happen and it is not easy to recover from. There really is no good reason to use an OID as a primary keyuse SERIAL (or BIGSERIAL) instead.

Syntax for Literal Values

The format in which you enter literal OID values is the same that you would use for unsigned INTEGER values. An OID literal is simply a sequence of decimal digits.

Size and Valid Values

As I mentioned earlier, an OID is an unsigned 32-bit (4-byte) integer. An OID column can hold values between 0 and 4294967295. The value 0 represents an invalid OID.

Supported Operators

You can compare two OID values, and you can compare an OID value against an INTEGER value. Table 2.26 shows which operators you can use with the OID data type.

Table 2.26. `OID` Operators
Data Types	Valid Operators
`OID` q `OID`	`< <= <> = >= >`
`OID` q `INT4`	`< <= <> = >= >`
`INT4` q `OID`	`< <= <> = >= >`

Part I: General PostgreSQL Use

Introduction to PostgreSQL and SQL

Working with Data in PostgreSQL

PostgreSQL SQL Syntax and Use

Performance

Part II: Programming with PostgreSQL

Introduction to PostgreSQL Programming

Extending PostgreSQL

PL/pgSQL

The PostgreSQL C APIlibpq

A Simpler C APIlibpgeasy

The New PostgreSQL C++ APIlibpqxx

Embedding SQL Commands in C Programsecpg

Using PostgreSQL from an ODBC Client Application

Using PostgreSQL from a Java Client Application

Using PostgreSQL with Perl

Using PostgreSQL with PHP

Using PostgreSQL with Tcl and Tcl/Tk

Using PostgreSQL with Python

Npgsql: The .NET Data Provider

Other Useful Programming Tools

Part III: PostgreSQL Administration

Introduction to PostgreSQL Administration

PostgreSQL Administration

Internationalization and Localization

Security

Replicating PostgreSQL Data with Slony

Contributed Modules

Index