DEV Community

Pawan Kukreja
Pawan Kukreja

Posted on • Edited on

[Summary] Chapter#01 'The Internals of PostgreSQL' Book (Part-02).

Internal Layout of a Heap Table File

PostgreSQL divides data files into fixed-length blocks, with a default of 8192 byte (8 KB). Block numbers are numbered sequentially from 0. If the file is filled, PostgreSQL adds a new empty page to increase the file size.

Image description

Page within tables contains three kinds of data described as follows:
Heap tuple: It records data itself and is stacked in order from the bottom of the page. Tuple describes Concurrency Control and WAL in PostgreSQL is required.

Line pointer: A 4 byte long line pointer holds a pointer to each tuple which is also called item pointer. Line pointers form a simple array and each index numbered sequentially from 1 and called offset number. New line pointer is pushed when a new tuple is added.

Header data: Header contains general information about the page and it is 24 byte long, and it is located at the beginning of the page.
Major variables of structures are:

  • Pd_lsn: It is an 8-byte unsigned integer related to the WAL mechanism, It stores last changes in this page.
  • pd_checksum: This variable contains checksum values.
  • pd_lower and pd_upper: pd_lower points the end of line pointer and pd_upper to the beginning of the newest heap tuple
  • pd_special: This variable for indexes and In the page within tables it points to the end of the page.

The Methods of writing and reading Tuples

*Writing Heap Tuple: *

When a table with one page and the lower pointer points to the first line pointer, while the line pointer and pd_upper point to the first heap tuple. And when inserting a second tuple the second line pointer is pushed onto first pointing to the second tuple, and pd_lower and pd_upper change to point to the second line pointer.

Reading Heap Tuple:
There are two methods, Sequential Scan and B-tree index scan

Sequential Scan:
All tuples in all pages are sequentially read by scanning all the pointer in each page

B-tree index scan:
PostgreSQL can read the desired heap tuple without unnecessary scanning in the pages by using the Index key and TID pointer. If the index tuple with the key that you are looking for has been found PostgreSQL reads the desired heap tuple using the obtained TID value.

Reference

Top comments (0)