PostgreSQL Vacuums

In the previous sections we’ve learned how Postgres handles inserts, updates, and deletes in its data files. In order to maintain ACID compliance, it doesn’t actually remove any data at the time of the query. Instead, it marks updated and deleted tuples as “dead” and appends the newly updated tuple like it would an insert.

On a busy table with a lot of updates and deletes, this file will get really big. Beyond the obvious drawback of using a lot more hard drive space than is needed, it will also slow down future queries that have to scan this file looking for the correct tuples to return.

Table Statistics

Fortunately, PG collects a lot of stats on each table such as how many times the table has been scanned since the last time the table has been analyzed. It also shows things like the number of live vs. dead tuples. Calling a tuple dead or dirty is the same thing.

To see these stats1, run the query select * from pg_stat_user_tables; or you can look at specific metrics such as with the query select relname, n_live_tup, n_dead_tup from pg_stat_user_tables;. Earlier in the book, we talked about how a “relation” in Relational Algebra is a table. This is why PG uses the column “relname” in this view for the name of the table.

Running the query above, gives us this result:

millis=# select relname, n_live_tup, n_dead_tup from pg_stat_user_tables;
 relname | n_live_tup | n_dead_tup
---------+------------+------------
 matter  |          2 |          2
(1 row)

In the previous 2 sections, we ran 3 inserts, 1 update, and 1 delete. This left 2 live tuples and 2 are now considered dead.

Vacuums

A vacuum is a separate built in process that sets dead tuples as available for future reuse. How often it’s run is controlled in the postgresql.conf file. For example, autovacuum_vacuum_threshold tells PG to run a vacuum automatically when a specified number of dead tuples are reached.2

Before we go further, the existing “matter” table’s data file looks like this:

00000000  00 00 00 00 f8 46 b5 01  00 00 00 00 28 00 00 1f  |.....F......(...|
00000010  00 20 04 20 73 10 00 00  b0 9f 98 00 78 9f 66 00  |. . s.......x.f.|
00000020  38 9f 78 00 00 9f 70 00  00 00 00 00 00 00 00 00  |8.x...p.........|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001f00  73 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |s...............|
00001f10  04 00 04 80 02 28 18 00  02 00 00 00 0f 52 6f 62  |.....(.......Rob|
00001f20  65 72 74 11 43 68 69 63  61 67 6f 1b 6d 6f 76 65  |ert.Chicago.move|
00001f30  64 20 63 69 74 69 65 73  72 10 00 00 00 00 00 00  |d citiesr.......|
00001f40  00 00 00 00 00 00 00 00  03 00 04 00 02 08 18 00  |................|
00001f50  03 00 00 00 0d 4a 61 6d  65 73 15 42 6f 75 6e 74  |.....James.Bount|
00001f60  69 66 75 6c 21 69 6d 70  6f 72 74 61 6e 74 20 73  |iful!important s|
00001f70  74 75 66 66 00 00 00 00  71 10 00 00 73 10 00 00  |tuff....q...s...|
00001f80  00 00 00 00 00 00 00 00  04 00 04 40 02 01 18 00  |...........@....|
00001f90  02 00 00 00 0f 52 6f 62  65 72 74 0d 50 61 72 69  |.....Robert.Pari|
00001fa0  73 15 73 6f 6d 65 20 69  6e 66 6f 00 00 00 00 00  |s.some info.....|
00001fb0  70 10 00 00 74 10 00 00  00 00 00 00 00 00 00 00  |p...t...........|
00001fc0  01 00 04 20 02 01 18 00  01 00 00 00 0d 42 72 69  |... .........Bri|
00001fd0  61 6e 1f 53 61 6c 74 20  4c 61 6b 65 20 43 69 74  |an.Salt Lake Cit|
00001fe0  79 37 61 62 63 64 65 66  67 68 69 6a 6b 6c 6d 6e  |y7abcdefghijklmn|
00001ff0  6f 70 71 72 73 74 75 76  77 78 79 7a 00 00 00 00  |opqrstuvwxyz....|
00002000

In addition to waiting for PG to run a vacuum automatically, we can run it explicitly. To do this for the “matter” table, run the query VACUUM matter; and CHECKPOINT;

Running stats again now shows:

millis=# select relname, n_live_tup, n_dead_tup from pg_stat_user_tables;
 relname | n_live_tup | n_dead_tup
---------+------------+------------
 matter  |          2 |          0
(1 row)

And running hexdump -C 24648 now shows:

00000000  00 00 00 00 d0 49 b5 01  00 00 05 00 28 00 88 1f  |.....I......(...|
00000010  00 20 04 20 00 00 00 00  00 00 00 00 04 00 01 00  |. . ............|
00000020  c0 9f 78 00 88 9f 70 00  00 00 00 00 00 00 00 00  |..x...p.........|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001f00  73 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |s...............|
00001f10  04 00 04 80 02 29 18 00  02 00 00 00 0f 52 6f 62  |.....).......Rob|
00001f20  65 72 74 11 43 68 69 63  61 67 6f 1b 6d 6f 76 65  |ert.Chicago.move|
00001f30  64 20 63 69 74 69 65 73  72 10 00 00 00 00 00 00  |d citiesr.......|
00001f40  00 00 00 00 00 00 00 00  03 00 04 00 02 09 18 00  |................|
00001f50  03 00 00 00 0d 4a 61 6d  65 73 15 42 6f 75 6e 74  |.....James.Bount|
00001f60  69 66 75 6c 21 69 6d 70  6f 72 74 61 6e 74 20 73  |iful!important s|
00001f70  74 75 66 66 00 00 00 00  71 10 00 00 73 10 00 00  |tuff....q...s...|
00001f80  00 00 00 00 00 00 00 00  73 10 00 00 00 00 00 00  |........s.......|
00001f90  00 00 00 00 00 00 00 00  04 00 04 80 02 29 18 00  |.............)..|
00001fa0  02 00 00 00 0f 52 6f 62  65 72 74 11 43 68 69 63  |.....Robert.Chic|
00001fb0  61 67 6f 1b 6d 6f 76 65  64 20 63 69 74 69 65 73  |ago.moved cities|
00001fc0  72 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |r...............|
00001fd0  03 00 04 00 02 09 18 00  03 00 00 00 0d 4a 61 6d  |.............Jam|
00001fe0  65 73 15 42 6f 75 6e 74  69 66 75 6c 21 69 6d 70  |es.Bountiful!imp|
00001ff0  6f 72 74 61 6e 74 20 73  74 75 66 66 00 00 00 00  |ortant stuff....|
00002000

The page has been altered. The dead tuples are no longer there and the live tuples have been moved; however, the bytes that were previously used still contain some unused data. Let’s see what happens when we insert another record.

INSERT INTO matter (name, city, info) VALUES ('William', 'San Francisco', 'really important data'); and CHECKPOINT;

00000000  00 00 00 00 c8 88 b5 01  00 00 01 00 28 00 40 1f  |............(.@.|
00000010  00 20 04 20 00 00 00 00  40 9f 90 00 04 00 01 00  |. . ....@.......|
00000020  c0 9f 78 00 88 9f 70 00  00 00 00 00 00 00 00 00  |..x...p.........|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001f00  73 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |s...............|
00001f10  04 00 04 80 02 29 18 00  02 00 00 00 0f 52 6f 62  |.....).......Rob|
00001f20  65 72 74 11 43 68 69 63  61 67 6f 1b 6d 6f 76 65  |ert.Chicago.move|
00001f30  64 20 63 69 74 69 65 73  72 10 00 00 00 00 00 00  |d citiesr.......|
00001f40  75 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |u...............|
00001f50  01 00 04 00 02 08 18 00  04 00 00 00 11 57 69 6c  |.............Wil|
00001f60  6c 69 61 6d 1d 53 61 6e  20 46 72 61 6e 63 69 73  |liam.San Francis|
00001f70  63 6f 2d 72 65 61 6c 6c  79 20 69 6d 70 6f 72 74  |co-really import|
00001f80  61 6e 74 20 64 61 74 61  73 10 00 00 00 00 00 00  |ant datas.......|
00001f90  00 00 00 00 00 00 00 00  04 00 04 80 02 29 18 00  |.............)..|
00001fa0  02 00 00 00 0f 52 6f 62  65 72 74 11 43 68 69 63  |.....Robert.Chic|
00001fb0  61 67 6f 1b 6d 6f 76 65  64 20 63 69 74 69 65 73  |ago.moved cities|
00001fc0  72 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |r...............|
00001fd0  03 00 04 00 02 09 18 00  03 00 00 00 0d 4a 61 6d  |.............Jam|
00001fe0  65 73 15 42 6f 75 6e 74  69 66 75 6c 21 69 6d 70  |es.Bountiful!imp|
00001ff0  6f 72 74 61 6e 74 20 73  74 75 66 66 00 00 00 00  |ortant stuff....|
00002000

PG looks for a reusable space big enough to contain the new tuple and inserts it there. This way, dead tuples are reclaimed by new writes.

Vacuum Full

So far, we’ve been using the non-blocking safe vacuum process. This is natural and happens all the time on a well functioning PG server; however, because new writes are constantly going over the top of obsolete data, this can cause fragmentation. Many small bits of unclaimed space that is too small for another tuple to fit may collect between live tuples. This wastes space on the hard drive and makes table scans take longer.

The solution to this is running VACUUM FULL on the table. This completely writes the file from scratch removing any fragmentation. The table is locked during this time preventing other updates from happening. On a large table, this can take minutes or hours so it’s best to run during scheduled maintenance time.

Let’s see how this affects our example. Run VACUUM FULL matter; and CHECKPOINT; Then let’s check out the file.

hexdump -C 24658
hexdump: 24658: No such file or directory
hexdump: 24658: Bad file descriptor

Wait? Where did that file go? PG literally wrote a new file from scratch and deleted the existing one so we need to find out what the new file name is.

> oid2name -d millis -t matter

From database "millis":
  Filenode  Table Name
----------------------
     24666      matter
> hexdump -C 24666

00000000  00 00 00 00 20 0d b7 01  00 00 00 00 24 00 40 1f  |.... .......$.@.|
00000010  00 20 04 20 00 00 00 00  b8 9f 90 00 78 9f 78 00  |. . ........x.x.|
00000020  40 9f 70 00 00 00 00 00  00 00 00 00 00 00 00 00  |@.p.............|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001f40  73 10 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |s...............|
00001f50  03 00 04 00 02 2b 18 00  02 00 00 00 0f 52 6f 62  |.....+.......Rob|
00001f60  65 72 74 11 43 68 69 63  61 67 6f 1b 6d 6f 76 65  |ert.Chicago.move|
00001f70  64 20 63 69 74 69 65 73  72 10 00 00 00 00 00 00  |d citiesr.......|
00001f80  00 00 00 00 00 00 00 00  02 00 04 00 02 0b 18 00  |................|
00001f90  03 00 00 00 0d 4a 61 6d  65 73 15 42 6f 75 6e 74  |.....James.Bount|
00001fa0  69 66 75 6c 21 69 6d 70  6f 72 74 61 6e 74 20 73  |iful!important s|
00001fb0  74 75 66 66 00 00 00 00  75 10 00 00 00 00 00 00  |tuff....u.......|
00001fc0  00 00 00 00 00 00 00 00  01 00 04 00 02 0b 18 00  |................|
00001fd0  04 00 00 00 11 57 69 6c  6c 69 61 6d 1d 53 61 6e  |.....William.San|
00001fe0  20 46 72 61 6e 63 69 73  63 6f 2d 72 65 61 6c 6c  | Francisco-reall|
00001ff0  79 20 69 6d 70 6f 72 74  61 6e 74 20 64 61 74 61  |y important data|
00002000

The file now only contains live data.


  1. Monitoring Stats 

  2. Runtime Config Autovacuum