Gathering Performance Information

   

With release 7.2, the PostgreSQL developers introduced a new collection of performance-related system views. These views return two distinct kinds of information. The pg_stat views characterize the frequency and type of access for each table in a database. The pg_statio views tell you how much physical I/O is performed on behalf of each table.

Let's look at each set of performance-related views in more detail.

The pg_stat_all_tables view contains one row for each table in your database. Here is the layout of pg_stat_all_tables:

 perf=# \d pg_stat_all_tables
      View "pg_stat_all_tables"
     Column     |  Type   | Modifiers
 ---------------+---------+-----------
  relid         | oid     |
  relname       | name    |
  seq_scan      | bigint  |
  seq_tup_read  | bigint  |
  idx_scan      | numeric |
  idx_tup_fetch | numeric |
  n_tup_ins     | bigint  |
  n_tup_upd     | bigint  |
  n_tup_del     | bigint  |

The seq_scan column tells you how many sequential (that is, table) scans have been performed for a given table, and seq_tup_read tells you how many rows were processed through table scans. The idx_scan and idx_tup_fetch columns tell you how many index scans have been performed for a table and how many rows were processed by index scans. The n_tup_ins , n_tup_upd , and n_tup_del columns tell you how many rows were inserted, updated, and deleted, respectively.

Query Execution

If you're not familiar with the terms "table scan" or "index scan," don't worry; I'll cover query execution later in this chapter (see "Understanding How PostgreSQL Executes a Query").

The real value in pg_stat_all_tables is that you can find out which tables in your database are most heavily used. This view does not tell you how much disk I/O is performed against each table, nor does it tell you how much time it took to perform the operations.

The following query finds the top 10 tables in terms of number of rows read:

 SELECT relname, idx_tup_fetch + seq_tup_read AS total
   FROM pg_stat_all_tables
   WHERE idx_tup_fetch + seq_tup_read != 0
   ORDER BY total DESC
   LIMIT 10;

Here's an example that shows the result of this query in a newly created database:

 perf=# SELECT relname, idx_tup_fetch + seq_tup_read AS total
 perf-#   FROM pg_stat_all_tables
 perf-#   WHERE idx_tup_fetch + seq_tup_read != 0
 perf-#   ORDER BY total DESC
 perf-#   LIMIT 10;
    relname    | total
 --------------+-------
  recalls      | 78482
  pg_class     | 57425
  pg_index     | 20901
  pg_attribute |  5965
  pg_proc      |  1391

It's easy to see that the recalls table is heavily used; you have read 78,482 tuples from that table.

There are two variations on the pg_stat_all_tables view. The pg_stat_sys_tables view is identical to pg_stat_all_tables, except that it is restricted to showing system tables. Similarly, the pg_stat_user_tables view is restricted to showing only user-created tables.

You can also see how heavily each index is being used: the pg_stat_all_indexes, pg_stat_user_indexes, and pg_stat_sys_indexes views expose index information.

Although the pg_stat views tell you how heavily each table is used, they don't provide any information about how much physical I/O is performed on behalf of each table. The second set of performance-related views provides that information.

The pg_statio_all_tables view contains one row for each table in a database. Here is the layout of pg_statio_all_tables :

 perf=# \d pg_statio_all_tables
      View "pg_statio_all_tables"
      Column      |  Type   | Modifiers
 -----------------+---------+-----------
  relid           | oid     |
  relname         | name    |
  heap_blks_read  | bigint  |
  heap_blks_hit   | bigint  |
  idx_blks_read   | numeric |
  idx_blks_hit    | numeric |
  toast_blks_read | bigint  |
  toast_blks_hit  | bigint  |
  tidx_blks_read  | bigint  |
  tidx_blks_hit   | bigint  |

This view provides information about heap blocks (heap_blks_read, heap_blks_hit), index blocks (idx_blks_read, idx_blks_hit), TOAST blocks (toast_blks_read, toast_blks_hit), and TOAST index blocks (tidx_blks_read, tidx_blks_hit). For each of these block types, pg_statio_all_tables exposes two values: the number of blocks read from disk and the number of blocks that were found in PostgreSQL's cache. For example, the heap_blks_read column contains the number of heap blocks read from disk for a given table, and heap_blks_hit tells you how many heap-block requests were satisfied from the cache.
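Given any read/hit pair from pg_statio_all_tables, the cache hit ratio is just hits divided by total block requests. Here is a small sketch of that arithmetic (the column names come from the view above; the sample counter values are hypothetical):

```python
def cache_hit_ratio(blks_read: int, blks_hit: int) -> float:
    """Fraction of block requests satisfied from PostgreSQL's buffer cache.

    blks_read counts blocks fetched from disk; blks_hit counts blocks
    found already in the cache, so total requests = blks_read + blks_hit.
    """
    total = blks_read + blks_hit
    if total == 0:
        return 0.0              # table never touched; avoid dividing by zero
    return blks_hit / total

# Hypothetical counters for one table:
print(round(cache_hit_ratio(4412, 0), 3))      # cold cache -> 0.0
print(round(cache_hit_ratio(4423, 34818), 3))  # warm cache -> 0.887
```

The same formula applies to each pair of columns: heap, index, TOAST, and TOAST index blocks.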

PostgreSQL exposes I/O information for each index in the pg_statio_all_indexes, pg_statio_user_indexes, and pg_statio_sys_indexes views.

Let's try a few examples and see how you can use the information exposed by pg_statio_all_tables .

I've written a simple utility (called timer) that makes it a little easier to see the statistical results of a given query. This utility takes a snapshot of pg_stat_all_tables and pg_statio_all_tables , executes a given query, and finally compares the new values in pg_stat_all_tables and pg_statio_all_tables . Using this utility, you can see how much I/O was performed on behalf of the given query. Of course, the database must be idle except for the query under test.

Execute this simple query and see what kind of I/O results you get:

 $ timer "SELECT * FROM recalls;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     1 |  39241 |      4412 |    0 |     0 |      0 |        0 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

This query retrieved 39,241 rows in a single table scan. The scan read 4,412 heap blocks from disk and found none in the cache. Normally, you would hope to see a cache ratio much better than 4,412 to 0! In this particular case, I had just started the postmaster, so there were few pages in the cache and none were devoted to the recalls table. Now, try this experiment again to see if the cache ratio gets any better:

 $ timer "SELECT * FROM recalls;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     1 |  39241 |      4412 |    0 |     0 |      0 |        0 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

You get exactly the same results: no cache hits. Why not? We did not include an ORDER BY clause in this query, so PostgreSQL returned the rows in (approximately) the order of insertion. When we execute the same query a second time, PostgreSQL starts reading at the beginning of the page file and continues until it has read the entire file. Because my cache is only 64 blocks in size, the first 64 blocks have been forced out of the cache by the time the scan reaches the end of the table. The next time I execute the same query, the final 64 blocks are in the cache, but I am looking for the leading blocks. The end result is no cache hits.
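The eviction behavior described above is easy to reproduce with a toy buffer cache. This sketch assumes a simple least-recently-used policy, which is only an approximation of PostgreSQL's actual replacement strategy, and uses made-up block numbers:

```python
from collections import OrderedDict

def scan(cache: OrderedDict, capacity: int, blocks) -> int:
    """Request each block in order; return the number of cache hits."""
    hits = 0
    for b in blocks:
        if b in cache:
            hits += 1
            cache.move_to_end(b)            # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict least recently used
            cache[b] = True
    return hits

cache = OrderedDict()
table = range(4412)              # heap blocks 0..4411; the cache holds 64
print(scan(cache, 64, table))    # first scan: 0 hits (cold cache)
print(scan(cache, 64, table))    # second scan: still 0 hits
```

By the time the second scan asks for block 0 again, blocks 4,348 through 4,411 occupy the cache, so every request misses; that is exactly the pattern the timer output shows.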

Just as an experiment, try to increase the size of the cache to see if you can force some caching to take place.

The PostgreSQL cache is kept in a segment of memory shared by all backend processes. You can see this using the ipcs -m command[7]:

[7] In case you are curious, the key value uniquely identifies a shared memory segment. The key is determined by multiplying the postmaster's port number by 1,000 and then incrementing until a free segment is found. The shmid value is generated by the operating system (the key is generated by PostgreSQL). The nattch column tells you how many processes are currently attached to the segment.

 $ ipcs -m
 ------ Shared Memory Segments --------
 key        shmid     owner     perms     bytes     nattch    status
 0x0052e2c1 1409024   postgres  600       1417216   3

The shared memory segment contains more than just the buffer cache: PostgreSQL also keeps some bookkeeping information in shared memory. With 64 pages in the buffer cache and an 8K block size, you see a shared memory segment that is 1,417,216 bytes long. Let's increase the buffer cache to 65 pages and see what effect that has on the size of the shared memory segment. There are two ways that you can adjust the size of the cache. You could change PostgreSQL's configuration file ( $PGDATA/postgresql.conf ), changing the shared_buffers variable from 64 to 65. Or, you can override the shared_buffers configuration variable when you start the postmaster :

 $ pg_ctl stop
 waiting for postmaster to shut down......done
 postmaster successfully shut down
 $ #
 $ # Note:  specifying -o "-B 65" is equivalent
 $ #        to setting shared_buffers = 65 in
 $ #        the $PGDATA/postgresql.conf file
 $ #
 $ pg_start -o "-B 65" -l /tmp/pg.log
 postmaster successfully started

Now you can use the ipcs -m command to see the change in the size of the shared memory segment:

 $ ipcs -m
 ------ Shared Memory Segments --------
 key        shmid     owner     perms     bytes     nattch    status
 0x0052e2c1 1409024   postgres  600       1425408   3

The shared memory segment increased from 1,417,216 bytes to 1,425,408 bytes. That's a difference of 8,192 bytes, which happens to be the size of a block. Now, let's increase our buffer count to 128 (twice the default):

 $ pg_ctl stop
 waiting for postmaster to shut down......done
 postmaster successfully shut down
 $ pg_start -o "-B 128" -l /tmp/pg.log
 postmaster successfully started
 $ ipcs -m
 ------ Shared Memory Segments --------
 key        shmid     owner     perms     bytes     nattch    status
 0x0052e2c1 1409024   postgres  600       1949696   3

If you do the math, you'll see that the difference in size of the shared memory segment between 64 buffers and 128 buffers is greater than 64 × 8,192. The overhead in the shared memory segment is not fixed; it varies with the number of buffers.
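You can verify that claim with the segment sizes reported by ipcs. The byte counts below are the ones shown above; they are specific to this particular (release 7.2-era) build:

```python
BLOCK_SIZE = 8192     # PostgreSQL's default block size, in bytes

# Shared memory segment sizes reported by ipcs -m:
size_64  = 1_417_216  # shared_buffers = 64
size_65  = 1_425_408  # shared_buffers = 65
size_128 = 1_949_696  # shared_buffers = 128

# Adding one buffer grew the segment by exactly one block:
print(size_65 - size_64)                        # 8192

# Adding 64 buffers grew it by more than 64 blocks, so the
# bookkeeping overhead must scale with the number of buffers:
print(size_128 - size_64)                       # 532480
print(size_128 - size_64 - 64 * BLOCK_SIZE)     # 8192 extra bytes of overhead
```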

Now, let's get back to the problem at hand. We want to find out if doubling the buffer count will result in more cache hits and therefore fewer I/O operations. Remember, a table scan on the recalls table resulted in 4,412 heap blocks read and 0 cache hits. Let's try the same query again and check the results:

 $ timer "SELECT * FROM recalls;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     1 |  39241 |      4412 |    0 |     0 |      0 |        0 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

You have to run this query twice because you shut down and restarted the postmaster to adjust the cache size. When you shut down the postmaster, the cache is destroyed (you can use the ipcs -m command to verify this).

 $ timer "SELECT * FROM recalls;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     1 |  39241 |      4412 |    0 |     0 |      0 |        0 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

Still the same results as before: PostgreSQL does not seem to buffer any of the data blocks read from the recalls table. Actually, each block is buffered as soon as it is read from disk; but as before, the blocks read at the beginning of the table scan are pushed out by the blocks read at the end of the scan. When you execute the same query a second time, you start at the beginning of the table and find that the blocks you need are not in the cache.

You could increase the cache size to be large enough to hold the entire table (somewhere around 4412 + 64 blocks should do it), but that's a large shared memory segment, and if you don't have enough physical memory, your system will start to thrash.

Let's try a different approach. You currently have room for 128 pages in your buffer cache, and the entire recalls table consumes 4,412 pages. If you use the LIMIT clause to select a small subset of the recalls table, you should see some caching. I'm going to lower the cache size back to its default of 64 pages before we start; my development system is memory-starved at the moment:

 $ pg_ctl stop
 waiting for postmaster to shut down......done
 postmaster successfully shut down
 $ pg_start -o "-B 64" -l /tmp/pg.log
 postmaster successfully started

You know that it takes 4,412 pages to hold the 39,241 rows in recalls, which gives you an average of about 9 rows per page. We have 64 pages in the cache; let's assume that PostgreSQL needs half (32) of them for its own bookkeeping. So, you should ask for 9 × 32 (or 288) rows:

 $ timer "SELECT * FROM recalls LIMIT 288;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     1 |    289 |        40 |    0 |     0 |      0 |        0 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

PostgreSQL read 40 heap blocks. If everything worked, those pages should still be in the cache. Let's run the query again:

 $ timer "SELECT * FROM recalls LIMIT 288;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     1 |    289 |        40 |   40 |     0 |      0 |        0 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

Now you're getting somewhere. You read 40 heap blocks and found all 40 of them in the cache.
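The page arithmetic behind the choice of LIMIT 288 is worth spelling out (the assumption that bookkeeping claims half of the cache is a deliberately conservative guess, not a documented figure):

```python
rows  = 39_241   # reltuples for recalls
pages = 4_412    # relpages for recalls

rows_per_page = rows / pages
print(round(rows_per_page))        # about 9 rows fit on each 8K page

# Assume (conservatively) that half of the 64-page cache is unavailable,
# leaving 32 pages for table data; request only as many rows as fit there:
usable_pages = 64 // 2
print(round(rows_per_page) * usable_pages)   # 288 -> LIMIT 288
```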

Dead Tuples

Now let's look at another factor that affects performance. Make a simple update to the recalls table:

 perf=# UPDATE recalls SET potaff = potaff + 1;
 UPDATE

This command increments the potaff column of each row in the recalls table. (Don't read too much into this particular UPDATE . I chose potaff simply because I needed an easy way to update every row.) Now, after restarting the database, go back and SELECT all rows again:

 $ timer "SELECT * FROM recalls"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     1 |  39241 |      8825 |    0 |     0 |      0 |        0 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

That's an interesting result: you still retrieved 39,241 rows, but this time you had to read 8,825 pages to find them. What happened? Let's see if the pg_class table gives any clues:

 perf=# SELECT relname, reltuples, relpages
 perf-#   FROM pg_class
 perf-#   WHERE relname = 'recalls';
  relname | reltuples | relpages
 ---------+-----------+----------
  recalls |     39241 |     4412

No clues there; pg_class still thinks you have 4,412 heap blocks in this table. Let's try counting the individual rows:

 perf=# SELECT count(*) FROM recalls;
  count
 -------
  39241

At least that gives you a consistent answer. But why does a simple update cause you to read twice as many heap blocks as before?

When you UPDATE a row, PostgreSQL performs the following operations:

  1. The new row values are written to the table.

  2. The old row is marked as deleted.

  3. The deleted row physically remains in the table, but is no longer visible to new transactions.

This means that when you executed the statement "UPDATE recalls SET potaff = potaff + 1", PostgreSQL inserted 39,241 new rows and marked 39,241 old rows as deleted. The table now holds 78,482 tuples, half of which are inaccessible.

Why does PostgreSQL carry out an UPDATE command this way? The answer lies in PostgreSQL's MVCC ( multiversion concurrency control) feature. Consider the following commands:

 perf=# BEGIN WORK;
 BEGIN
 perf=# UPDATE recalls SET potaff = potaff + 1;
 UPDATE

Notice that you have started a new transaction, but you have not yet completed it. If another user were to SELECT rows from the recalls table at this point, he must see the old values; after all, you might roll back this transaction. In some other database systems (such as DB2, Sybase, and SQL Server), the other user would have to wait until you either committed or rolled back your transaction before his query could complete. PostgreSQL, on the other hand, keeps the old rows in the table, and other users see the original values until you commit your transaction. If you roll back your changes, PostgreSQL simply hides your modifications from all transactions.

When you DELETE rows from a table, PostgreSQL follows a similar set of rules. The deleted row remains in the table, but is hidden. If you roll back a DELETE command, PostgreSQL will simply make the rows visible again.

Now you also know the difference between a tuple and a row. A tuple is some version of a row.

You can see that these hidden tuples can dramatically affect performance: updating every row in a table doubles the number of heap blocks required to read the entire table.
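The update-as-insert behavior is easy to model. This toy sketch keeps one list of tuple versions and ignores everything MVCC actually tracks (transaction IDs, visibility rules); it only illustrates why the heap doubles in size:

```python
from dataclasses import dataclass

@dataclass
class HeapTuple:
    """One version of a row; dead versions linger until they are vacuumed."""
    potaff: int
    dead: bool = False

heap = [HeapTuple(potaff=i) for i in range(5)]   # a 5-row toy table

# UPDATE ... SET potaff = potaff + 1:
# write a new version of each row, then mark the old version dead.
for t in list(heap):
    if not t.dead:
        heap.append(HeapTuple(potaff=t.potaff + 1))
        t.dead = True

live = [t for t in heap if not t.dead]
print(len(heap))   # 10 tuples on disk...
print(len(live))   # ...but only 5 visible rows
```

A scan must still step over the dead versions to find the live ones, which is why the table read doubled from 4,412 to 8,825 heap blocks.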

There are at least three ways to remove dead tuples from a database. One way is to export all (visible) rows and then import them again using pg_dump and pg_restore . Another method is to use CREATE TABLE ... AS to make a new copy of the table, drop the original table, and rename the copy. The preferred way is to use the VACUUM command. I'll show you how to use the VACUUM command a little later (see the section "Table Statistics").

Index Performance

You've seen how PostgreSQL batches all disk I/O into 8K blocks, and you've seen how PostgreSQL maintains a buffer cache to reduce disk I/O. Let's find out what happens when you throw an index into the mix. After restarting the postmaster (to clear the cache), execute the following query:

 $ timer "SELECT * FROM recalls ORDER BY record_id;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     0 |      0 |     26398 |12843 |     1 |  39241 |      146 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

You can see that PostgreSQL chose to execute this query using an index scan (remember, you have an index defined on the record_id column). This query read 146 index blocks and found none of them in the buffer cache. You also made 39,241 heap-block requests: 26,398 were read from disk and 12,843 were found in the cache. The buffer cache helped performance a bit, but you still read over 26,000 heap blocks from disk, and only 4,412 blocks are needed to hold the entire recalls table. Why did you need to read each heap block (approximately) six times? Think of how the recalls table is stored on disk (see Figure 4.2).

Figure 4.2. The recalls table on disk.


Notice that the rows are not stored in record_id order. In fact, they are stored in order of insertion. When you create an index on the record_id column, you end up with a structure like that shown in Figure 4.3.

Figure 4.3. The recalls table structure after creating an index.


Consider how PostgreSQL uses the record_id index to satisfy the query. After the first block of the record_id index is read into the buffer cache, PostgreSQL starts scanning through the index entries. The first index entry points to a recalls row on heap block 2, so that heap block is read into the buffer cache. Now, PostgreSQL moves on to the second index entry; this one points to a row in heap block 1. PostgreSQL reads heap block 1 into the buffer cache, throwing out some other page if there is no room in the cache. Figure 4.2 shows only a partial view of the recalls table: remember that 4,412 heap blocks and 146 index blocks are actually needed to satisfy this query. It's the random ordering of the rows within the recalls table that kills the cache hit ratio.
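You can watch the cache hit ratio collapse under randomly ordered rows with a small simulation. It assumes an LRU cache (only an approximation of PostgreSQL's replacement policy) and roughly 9 rows per heap block, as computed earlier:

```python
import random
from collections import OrderedDict

def lru_hits(blocks, capacity=64):
    """Count cache hits over a sequence of heap-block requests (LRU cache)."""
    cache, hits = OrderedDict(), 0
    for b in blocks:
        if b in cache:
            hits += 1
            cache.move_to_end(b)
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)
            cache[b] = True
    return hits

random.seed(4)
# An index scan requests one heap block per row; ~9 rows live in each block.
requests = [blk for blk in range(4412) for _ in range(9)]

scattered = requests[:]
random.shuffle(scattered)        # rows stored in insertion (random) order

print(lru_hits(scattered))       # few hits: each block is revisited too late
print(lru_hits(requests))        # 35296 hits: 8 of every 9 requests hit
```

When the rows are stored in index order, the 9 requests for each block arrive back to back, so only the first request for each block misses.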

Let's try reordering the recalls table so that rows are stored in record_id order. First, create a work table that holds the same data as recalls, sorted by record_id:

 perf=# CREATE TABLE work_recalls AS
 perf-#   SELECT * FROM recalls ORDER BY record_id;
 SELECT

Then, drop the original table, rename the work table, and re-create the index:

 perf=# DROP TABLE recalls;
 DROP
 perf=# ALTER TABLE work_recalls RENAME TO recalls;
 ALTER
 perf=# CREATE INDEX recalls_record_id ON recalls( record_id );
 CREATE

At this point, you have the same data as before, consuming roughly the same amount of space:

 perf=# SELECT relname, relpages, reltuples FROM pg_class
 perf-#   WHERE relname IN ('recalls', 'recalls_record_id');
       relname      | relpages | reltuples
 -------------------+----------+-----------
  recalls_record_id |      146 |     39241
  recalls           |     4422 |     39241
 (2 rows)

After restarting the postmaster (again, this clears out the buffer cache so you get consistent results), let's re-execute the previous query:

 $ timer "SELECT * FROM recalls ORDER BY record_id;"
 +-----------------------------------+----------------------------------+
 |           SEQUENTIAL I/O          |            INDEXED I/O           |
 | scans | tuples | heap_blks |cached| scans | tuples | idx_blks |cached|
 +-------+--------+-----------+------+-------+--------+----------+------+
 |     0 |      0 |      4423 |34818 |     1 |  39241 |      146 |    0 |
 +-------+--------+-----------+------+-------+--------+----------+------+

That made quite a difference. Before reordering, 26,398 of the 39,241 heap-block requests went to disk and only 12,843 were satisfied from the cache, a cache hit ratio of about 33%. After physically reordering the rows to match the index, only 4,423 requests went to disk and 34,818 were satisfied from the cache, a hit ratio of nearly 89%. This makes a huge performance difference. Now, as you read through each index page, the heap records appear next to each other; you won't be thrashing heap pages in and out of the cache. Figure 4.4 shows how the recalls table looks after reordering.
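The before-and-after hit ratios fall straight out of the timer numbers, taking hits / (hits + reads) as the definition of the cache hit ratio:

```python
# Heap-block counters from the two timer runs:
before_read, before_hit = 26_398, 12_843   # insertion order
after_read,  after_hit  = 4_423, 34_818    # record_id order

total_before = before_read + before_hit
total_after  = after_read + after_hit
print(total_before, total_after)           # 39241 39241: one request per row

print(round(before_hit / total_before, 2)) # 0.33 hit ratio before reordering
print(round(after_hit / total_after, 2))   # 0.89 hit ratio after reordering
```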

Figure 4.4. The recalls table on disk after reordering.


We reordered the recalls table by creating a copy of the table (in the desired order), dropping the original table, and then renaming the copy back to the original name. You can also use the CLUSTER command; it does essentially the same thing.

   


PostgreSQL
PostgreSQL (2nd Edition)
ISBN: 0672327562
EAN: 2147483647
Year: 2005
Pages: 220
Authors: Korry Douglas
