Where Does Postgres Store Its Data?
Lets demistify where Postgres stores its data on disk. We will also learn some helpful postgres and Unix utility commands.
Table of Contents 📖
Where Postgres Stores its Data
At its core, Postgres data is simply stored as a set of files on disk. This means that if we wanted to backup a database, we would just need to copy all the database data files that are on disk. We can see the exact location of the data files by running the following command:
SHOW data_directory;
data_directory
--------------------------
/var/lib/postgresql/data
(1 row)
The SHOW data_directory; command in PostgreSQL displays the file system path of the data directory used by the currently running PostgreSQL instance. The data directory contains all sorts of files including configuration files, logs, and database files.
The base Directory
If we list out the contents of this directory we will see the following:
ls
PG_VERSION pg_ident.conf pg_snapshots pg_wal
base pg_logical pg_stat pg_xact
global pg_multixact pg_stat_tmp postgresql.auto.conf
pg_commit_ts pg_notify pg_subtrans postgresql.conf
pg_dynshmem pg_replslot pg_tblspc postmaster.opts
pg_hba.conf pg_serial pg_twophase postmaster.pid
An interesting directory in here is the base directory. This is where all the database data is held. Specifically, it contains the physical storage for individual databases. Each database in Postgres is represented by a separate subdirectory inside the base directory and has a unique identifier called a database OID. We can get the name of the databases and their corresponding OIDs by running the following command:
SELECT datname, oid FROM pg_database;
datname | oid
------------+-------
postgres | 5
extensions | 16384
template1 | 1
template0 | 4
(4 rows)
This corresponds with the folder IDs in the base directory.
ls /var/lib/postgresql/data/base
1 16384 4 5
Modifying files in the base directory directly is not recommended unless under specific instructions, as it can lead to corruption. Always use PostgreSQL's SQL commands or tools for interacting with the database.
Where is the Data?
If we want to find the exact location of the Postgres data, we can use the pg_relation_filepath function. This function returns a relative path string (from PostgreSQL's data directory) that indicates where the physical storage for the relation resides on disk. Here I will get the location of the extension table.
SELECT pg_relation_filepath('extension');
pg_relation_filepath
----------------------
base/16384/16386
(1 row)
16384 is the OID of the database containing the relation and 16386 is the OID of the extension table. Now I would just need to perform a hexdump on this location to see the data as it is written to disk in the file.
hexdump -C /var/lib/postgresql/data/base/16384/16386
00000000 00 00 00 00 08 eb 96 01 00 00 00 00 28 00 f8 1c |............(...|
00000010 00 20 04 20 00 00 00 00 40 9f 80 01 80 9e 80 01 |. . ....@.......|
00000020 c0 9d 80 01 f8 9c 90 01 00 00 00 00 00 00 00 00 |................|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001cf0 00 00 00 00 00 00 00 00 ec 02 00 00 ed 02 00 00 |................|
00001d00 00 00 00 00 00 00 00 00 04 00 08 00 92 01 18 00 |................|
00001d10 04 00 00 00 43 6b 6a 67 66 6b 63 61 6f 62 61 66 |....Ckjgfkcaobaf|
00001d20 65 6e 69 67 66 62 6b 61 65 61 66 6e 6c 6c 69 64 |enigfbkaeafnllid|
00001d30 6d 68 70 6c 61 1f 43 6f 6e 73 6f 6c 65 4c 6f 67 |mhpla.ConsoleLog|
00001d40 2d 44 45 56 0b 00 80 d0 07 63 54 75 72 6e 20 61 |-DEV.....cTurn a|
00001d50 6c 6c 20 79 6f 75 72 20 64 65 76 65 6c 6f 70 65 |ll your develope|
00001d60 72 20 63 6f 6e 73 6f 6c 65 20 6c 6f 67 73 20 69 |r console logs i|
00001d70 6e 74 6f 20 61 6c 65 72 74 73 63 68 74 74 70 73 |nto alertschttps|
INFO: hexdump is a Unix utility that outputs the raw contents of a file in hexidecimal format. The -C flag provides a formatted output with both hexadecimal and ASCII representation side by side.