WittCode💻

Where Does Postgres Store Its Data?

By

Lets demistify where Postgres stores its data on disk. We will also learn some helpful postgres and Unix utility commands.

Table of Contents 📖

Where Postgres Stores its Data

At its core, Postgres data is simply stored as a set of files on disk. This means that if we wanted to backup a database, we would just need to copy all the database data files that are on disk. We can see the exact location of the data files by running the following command:

SHOW data_directory;
      data_directory      
--------------------------
 /var/lib/postgresql/data
(1 row)

The SHOW data_directory; command in PostgreSQL displays the file system path of the data directory used by the currently running PostgreSQL instance. The data directory contains all sorts of files including configuration files, logs, and database files.

The base Directory

If we list out the contents of this directory we will see the following:

ls
PG_VERSION            pg_ident.conf         pg_snapshots          pg_wal
base                  pg_logical            pg_stat               pg_xact
global                pg_multixact          pg_stat_tmp           postgresql.auto.conf
pg_commit_ts          pg_notify             pg_subtrans           postgresql.conf
pg_dynshmem           pg_replslot           pg_tblspc             postmaster.opts
pg_hba.conf           pg_serial             pg_twophase           postmaster.pid

An interesting directory in here is the base directory. This is where all the database data is held. Specifically, it contains the physical storage for individual databases. Each database in Postgres is represented by a separate subdirectory inside the base directory and has a unique identifier called a database OID. We can get the name of the databases and their corresponding OIDs by running the following command:

SELECT datname, oid FROM pg_database;
  datname   |  oid  
------------+-------
 postgres   |     5
 extensions | 16384
 template1  |     1
 template0  |     4
(4 rows)

This corresponds with the folder IDs in the base directory.

ls /var/lib/postgresql/data/base
1      16384  4      5

Modifying files in the base directory directly is not recommended unless under specific instructions, as it can lead to corruption. Always use PostgreSQL's SQL commands or tools for interacting with the database.

Where is the Data?

If we want to find the exact location of the Postgres data, we can use the pg_relation_filepath function. This function returns a relative path string (from PostgreSQL's data directory) that indicates where the physical storage for the relation resides on disk. Here I will get the location of the extension table.

SELECT pg_relation_filepath('extension');
 pg_relation_filepath 
----------------------
 base/16384/16386
(1 row)

16384 is the OID of the database containing the relation and 16386 is the OID of the extension table. Now I would just need to perform a hexdump on this location to see the data as it is written to disk in the file.

hexdump -C /var/lib/postgresql/data/base/16384/16386
00000000  00 00 00 00 08 eb 96 01  00 00 00 00 28 00 f8 1c  |............(...|
00000010  00 20 04 20 00 00 00 00  40 9f 80 01 80 9e 80 01  |. . ....@.......|
00000020  c0 9d 80 01 f8 9c 90 01  00 00 00 00 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001cf0  00 00 00 00 00 00 00 00  ec 02 00 00 ed 02 00 00  |................|
00001d00  00 00 00 00 00 00 00 00  04 00 08 00 92 01 18 00  |................|
00001d10  04 00 00 00 43 6b 6a 67  66 6b 63 61 6f 62 61 66  |....Ckjgfkcaobaf|
00001d20  65 6e 69 67 66 62 6b 61  65 61 66 6e 6c 6c 69 64  |enigfbkaeafnllid|
00001d30  6d 68 70 6c 61 1f 43 6f  6e 73 6f 6c 65 4c 6f 67  |mhpla.ConsoleLog|
00001d40  2d 44 45 56 0b 00 80 d0  07 63 54 75 72 6e 20 61  |-DEV.....cTurn a|
00001d50  6c 6c 20 79 6f 75 72 20  64 65 76 65 6c 6f 70 65  |ll your develope|
00001d60  72 20 63 6f 6e 73 6f 6c  65 20 6c 6f 67 73 20 69  |r console logs i|
00001d70  6e 74 6f 20 61 6c 65 72  74 73 63 68 74 74 70 73  |nto alertschttps|

INFO: hexdump is a Unix utility that outputs the raw contents of a file in hexidecimal format. The -C flag provides a formatted output with both hexadecimal and ASCII representation side by side.

Where Does Postgres Store Its Data?