The Postgres Write Ahead Log (WAL) is a functional component to the database.
WAL makes a lot of key functionality possible, like
Point-in-Time-Recovery
backups,
recovering from an event,
streaming replication,
and more. From time to time, those deep inside the database will need to work
directly with WAL files to diagnose or
recover.
Recently in working with one of Crunchy Data’s customers, I came across a
situation where understanding the names and sequence numbers was important. In
working with several of my colleauges that commit to the Postgres project, I
collected notes on some of the details inside WAL. The goal today to look at the
LSN and naming convention for WAL to help users understand WAL files a little
better.
Log Sequence Number
Transactions in PostgreSQL create WAL records which are ultimately appended to
the WAL log (file). The position where the insert occurs is known as the Log
Sequence Number (LSN). The values of LSN (of type pg_lsn
) can be compared to
determine the amount of WAL generated between two different offsets (in bytes).
When using in this manner, it is important to know the calculation assumes the
full WAL segment was used (16MB) if multiple WAL logs are used. A similar
calculation to the one used here is often used to determine latency of a
replica.
The LSN is a 64-bit integer, representing a position in the write-ahead log
stream. This 64-bit integer is split into two segments (high 32 bits and low 32
bits). It is printed as two hexadecimal numbers separated by a slash
(XXXXXXXX/YYZZZZZZ). The ‘X’ represents the high 32-bits of the LSN and ‘Y’ is
the high 8 bits of the lower 32-bits section. The ‘Z’ represents the offset
position in the file. Each element is a hexadecimal number. The ‘X’ and ‘Y’
values are used in the second part of the WAL file on a default PostgreSQL
deployment.
WAL File
The WAL file name is in the format TTTTTTTTXXXXXXXXYYYYYYYY. Here ‘T’ is the
timeline, ‘X’ is the high 32-bits from the LSN, and ‘Y’ is the low 32-bits of
the LSN.
Start by looking at the current WAL LSN and insert LSN. The pg_current_wal_lsn
is the location of the last write. The pg_current_wal_insert_lsn
is the
logical location and reflects data in the buffer that has not been written to
disk. There is also a flush value that shows what has been written to durable
storage.
[postgres] # select pg_current_wal_lsn(), pg_current_wal_insert_lsn();
pg_current_wal_lsn | pg_current_wal_insert_lsn
--------------------+---------------------------
76/7D000000 | 76/7D000028
(1 row)
Although you can guess the name of the WAL file based on the above output, it is
best to use the pg_walfile_name
function.
[postgres] # select pg_walfile_name('76/7D000028');
pg_walfile_name
--------------------------
00000001000000760000007D
(1 row)
Looking at the file system we see that indeed segment 00000001000000760000007D&