PostgreSQL physical storage of rows

Data is money, data is business. After reading this you will know how and where PostgreSQL stores your important data.

Directory structure

PostgreSQL holds all of its data and configuration in one directory called PGDATA which you can find on most systems under /var/lib/pgsql/data/. There are three directories where tables will be stored:

  • base: Databases in pg_default tablespace
  • global: Cluster wide tables which don’t belong to a database e.g. pg_authid (tablespace pg_global)
  • pg_tblspc: Symbolic links to other tablespaces, which can reside outside PGDATA

Every database has its own directory named after the OID (Object Identifier) which you can query with an SQL statement:

For this example we will have a look on table pgbench_accounts of database djboris.

Tables store its data in files usually named after tables OID which grow until 1 GB. Afterwards new files are created with an numeric suffix: .1, .2 and so on. Because there are cases where the filename isn’t the OID but a relfilenode identifier, PostgreSQL offers the helpful function pg_relation_filenode() which shows the relative path to the basefile.

Now we have the base path of the table. Every file within this path is somehow related to our table.

You can ignore the 16426_fsm and 16426_vm file, which are the “Free Space Map” and “Visibility Map” and are not interesting in this case.

Page layout

Every table file consists of pages, which are 8 kB large and are divided as follows:

It’s important to mention, that the ItemsIdData array grows from top to down where the Items go bottom up. ItemIdData fields are used as pointers to the Items which in our case are table rows.

On the following picture you can see the hex dump of the first part (and first page of course) of base/16385/16426. Make sure to run VACUUM FULL before dumping it to ensure, that your first page isn’t empty when analysing it on your own.

The red and blue coloured parts are the PageHeaderData which is 24 bytes long. I’ve coloured the pd_upper and pd_lower fields blue, these describe where the ItemIdData struct array (green coloured) stops and the free space begins. The pd_upper points at the end of the free space, so it’s the beginning of the Item area.

You can find the following definition of type ItemIdData in itemid.h:

Getting Items

As explained, the green ones act as pointers to the Items, thus we will take the first one as an example and calculate the fields. This results in:

Finally we found the Item at end of the page. But we don’t know the columns yet:

Now it’s time to disassemble our table structure by reading pg_attribute. You can ignore the virtual fields (attnum<1) which aren’t part of the Item body.

Back to the hex-dump we have a header again (yellow). The last byte (Hex: 18) shows us the offset to data which is the 24st byte. There begin our columns in order of the seen attnum value ascending:

So you see the value 2 in our purple block, value 1 in the green, value 0 in the red and the varlena (attlen=-1) blue block. Did you look at the virtual fields from pg_attribute? There are by default not shown if you don’t specify them explicitly. CTID is a helpful one, because it’s value is (pagenum, itemnum) which means (page number index, ItemIdData index + 1).

We were in the first page and followed the first ItemIdData which gives us the reference (0, 1). Note that pagenum starts with zero, itemnum with one. This way, we can identify in which page and where an actual row is stored.

Where to continue

If you’ve wondered, how data which would span multiple pages can be stored, you have to study TOAST as it provides a solution for it. There are also other interesting structures like Indexes which are not covered. Visibility Maps and Free Space Maps are also important for this database. MVCC stuff is also not covered so you don’t know if you are consistent. PostgreSQL has a good documentation and readable, well documented source in which you can find everything you need to dive deeper.

Information

At time of writing PostgreSQL 9.6 was used on a Linux x86_64. The purpose of this article is to wake up your interest and give you a general overview. This is far from complete. Many thanks to the PostgreSQL developers for this wonderful piece of software.

Many thanks karras for reviewing this article.

Linux, Go, Container, PostgreSQL, Bitcoin, Infosec

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store