This blog aims to assist you in understanding the concepts of Chapter:3 [Query Processing] from the book The Internals of PostgreSQL.
Note: Ensure that you have a thorough understanding of Chapter 1 and Chapter 2 before we proceed to Chapter 3, as it forms the foundation for our exploration.
So, Let's Start:
An Example of Query Processing is depicted in figure below:
Parser
The parser generates a parse tree that can be read by subsequent subsystems from an SQL statement in plain text.
A parse tree is a tree whose root node is the SelectStmt structure defined in parsenodes.h.
An example of a parse tree is depicted in figure below:
- The SELECT query and the parse tree have corresponding elements with the same numbering.
- The parser's main role is to check the syntax of the query and generate a parse tree.
- It does not perform semantic checks on the input query.
- Syntax errors are reported by the parser, while semantic checks are handled by the analyzer/analyser.
- The parser does not verify the existence of table names or other semantic aspects of the query.
- Its focus is on the structure and grammar of the query rather than its meaning or validity.
Analyzer/Analyser
- The analyzer/analyser runs a semantic analysis of a parse tree generated by the parser and generates a query tree.
An example of a query tree is depicted in figure below:
The join tree stores the FROM clause and the WHERE clauses.
The targetlist is a list of columns that are the result of this query.
The range table is a list of relations that are used in this query.
The sort clause is a list of SortGroupClause.
Rewriter
Views in PostgreSQL are implemented by using the rule system. When a view is defined by the CREATE VIEW command, the corresponding rule is automatically generated and stored in the catalog.
At this stage, the rewriter processes the range table node to a parse tree of the subquery, which is the corresponding view, stored in pg_rules.
An example of the rewriter stage is depicted in figure below:
Planner and Executor
The planner receives a query tree from the rewriter and generates a (query) plan tree that can be processed by the executor most effectively.
The planner in PostgreSQL is based on pure cost-based optimization.
A simple plan tree and the relationship between the plan tree and the result of the EXPLAIN command is depicted in figure below:
A plan tree is composed of elements called plan nodes, and it is connected to the plantree list of the PlannedStmt structure.
Each plan node has information that the executor requires for processing, and the executor processes from the end of the plan tree to the root in the case of a single-table query.
The executor reads and writes tables and indexes in the database cluster via the buffer manager.
The relationship among the executor, buffer manager and temporary files is depicted in figure below:
I hope, this blog has helped you in understanding the concepts of query processing in PostreSQL.
Check out summary of Chapter : 4
If you want to understand PostgreSQL In-Depth.
Top comments (0)