
Chidera Stella Onumajuru


Internals of Postgres: Query Processing in Postgres


When a client such as a web application connects to a PostgreSQL database cluster, the server starts a backend process dedicated to that connection. The backend process handles every query the connected client issues, and each query passes through five stages in turn: the parser, the analyzer, the rewriter, the planner, and the executor. I outline each stage below.

Parser
When a client issues an SQL statement such as SELECT id, data FROM tbl_a WHERE id < 300 ORDER BY data; the parser generates a parse tree from the text of the statement. A parse tree is a structured representation of the statement. It is made up of nodes, where a node is a data structure that represents a specific element of the SQL being parsed. The parser checks only syntax: it reports an error for a malformed statement, but it does not check whether the tables and columns named in the statement actually exist.

Each node in the parse tree corresponds to a specific component of the SQL statement. For a SELECT statement, the root node has fields for the list of output columns, the FROM clause, the WHERE clause, the ORDER BY clause, and so on. The parse tree thus breaks the statement down into its constituent parts, such as column names, table names, and operators, so that later stages can process them.
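
To make the idea of a parse tree concrete, here is a minimal Python sketch of what the tree for the example statement might look like. The field names targetList, fromClause, whereClause, and sortClause follow PostgreSQL's SelectStmt node, but the node classes themselves are simplified stand-ins that I made up for illustration, not the real C structures.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ColumnRef:
    """Simplified stand-in for a node that names a column."""
    name: str


@dataclass
class Const:
    """Simplified stand-in for a literal value."""
    value: int


@dataclass
class OpExpr:
    """Simplified stand-in for an operator applied to two operands."""
    op: str
    left: object
    right: object


@dataclass
class SelectStmt:
    """Root node: one field per clause of the SELECT statement."""
    targetList: List[ColumnRef]
    fromClause: List[str]
    whereClause: Optional[OpExpr] = None
    sortClause: List[ColumnRef] = field(default_factory=list)


# Parse tree for: SELECT id, data FROM tbl_a WHERE id < 300 ORDER BY data;
parse_tree = SelectStmt(
    targetList=[ColumnRef("id"), ColumnRef("data")],
    fromClause=["tbl_a"],
    whereClause=OpExpr("<", ColumnRef("id"), Const(300)),
    sortClause=[ColumnRef("data")],
)

print(parse_tree)
```

Note that nothing in this tree says whether tbl_a or its columns exist. At this stage the names are just strings.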

Analyzer
The analyzer examines the parse tree and performs semantic analysis. The parser has already checked the syntax, so the analyzer's job is to resolve references to tables, columns, functions, operators, and other database objects against the system catalogs, report an error when a referenced object does not exist, and work out the data type of each expression. It then generates a query tree. Like the parse tree, the query tree breaks the statement down into its constituent parts, but it also carries the catalog information the analyzer looked up. The query tree serves as an intermediate representation of the query before it is further processed for query planning and execution.
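
The name resolution step can be sketched in a few lines of Python. This is a toy model under stated assumptions: CATALOG, SemanticError, and analyze are hypothetical names, and the catalog contents are invented for the example. The real analyzer consults PostgreSQL's system catalogs and builds a much richer structure.

```python
# Toy system catalog: table name -> {column name: type}. Hypothetical contents.
CATALOG = {"tbl_a": {"id": "integer", "data": "integer"}}


class SemanticError(Exception):
    """Raised when a name in the query cannot be resolved."""


def analyze(table, columns):
    """Resolve a table and its columns against CATALOG, or raise SemanticError."""
    if table not in CATALOG:
        raise SemanticError(f'relation "{table}" does not exist')
    schema = CATALOG[table]
    names_in_order = list(schema)
    targets = []
    for col in columns:
        if col not in schema:
            raise SemanticError(f'column "{col}" does not exist')
        # Record the 1-based column position and the type found in the catalog.
        targets.append(
            {"name": col, "attno": names_in_order.index(col) + 1, "type": schema[col]}
        )
    return {"relation": table, "targets": targets}


print(analyze("tbl_a", ["id", "data"]))
```

Calling analyze("tbl_a", ["missing"]) raises SemanticError, which mirrors the way a syntactically valid query can still fail at this stage.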

Rewriter
The rewriter implements PostgreSQL's rule system. If necessary, it transforms the query tree according to the rules stored in the pg_rewrite system catalog (pg_rules is a system view over that catalog). The rewriter takes the query tree produced by the analyzer and applies any rules that match the tables the query refers to.

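The rewriter's purpose is not optimization, which is the planner's job. Its best-known use is views: PostgreSQL implements views through the rule system, so a query that reads from a view is rewritten into a query against the view's underlying tables.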
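
Here is a rough Python sketch of the view case. It is deliberately simplified: VIEWS, rewrite, and the small_ids view are hypothetical, queries are plain dictionaries, and the view's filter is merged straight into the outer query. PostgreSQL itself substitutes the view's definition into the query tree as a subquery and leaves any flattening to the planner.

```python
# Hypothetical view definitions: view name -> the query the view stands for.
VIEWS = {
    "small_ids": {"from": "tbl_a", "where": ["id < 300"]},
}


def rewrite(query):
    """Replace a reference to a view with the view's underlying table and filter."""
    source = query["from"]
    if source not in VIEWS:
        return query  # plain table: nothing to rewrite
    view = VIEWS[source]
    merged = {
        "select": query["select"],
        "from": view["from"],
        # Keep the view's own conditions and add the outer query's conditions.
        "where": view["where"] + query.get("where", []),
    }
    return rewrite(merged)  # a view may itself be defined on another view


query = {"select": ["id"], "from": "small_ids", "where": ["data > 10"]}
print(rewrite(query))
```

The result still means the same thing as the original query. The rewriter changes what the query refers to, not how fast it runs.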

Planner

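The planner receives the query tree from the rewriter and generates a plan tree, which is the execution plan the executor will run. For a given query there are usually several ways to produce the same result, such as reading the whole table sequentially or going through an index. The planner estimates the cost of each candidate and picks the one with the lowest estimated cost, without changing what the query means.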
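
The sketch below shows the flavor of cost-based selection in Python. The three constants are PostgreSQL's default values for seq_page_cost, cpu_tuple_cost, and cpu_operator_cost, and seq_scan_cost is a simplified version of the sequential scan estimate for a scan with a single filter condition. Everything else is an assumption for illustration: the table size (45 pages, 10000 rows), the cheapest helper, and especially the index scan figure, which is a made-up number rather than a computed estimate.

```python
# PostgreSQL's default planner cost constants.
SEQ_PAGE_COST = 1.0
CPU_TUPLE_COST = 0.01
CPU_OPERATOR_COST = 0.0025


def seq_scan_cost(pages, tuples):
    """Simplified cost of reading every page and testing one filter per row."""
    io_cost = SEQ_PAGE_COST * pages
    cpu_cost = (CPU_TUPLE_COST + CPU_OPERATOR_COST) * tuples
    return io_cost + cpu_cost


def cheapest(paths):
    """Pick the candidate path with the lowest estimated cost."""
    return min(paths, key=lambda p: p["cost"])


paths = [
    {"name": "Seq Scan", "cost": seq_scan_cost(pages=45, tuples=10000)},
    {"name": "Index Scan", "cost": 13.5},  # hypothetical figure for illustration
]

best = cheapest(paths)
print(best["name"], best["cost"])
```

With these inputs the sequential scan comes out at roughly 170 cost units, so the hypothetical index scan wins. Costs are in arbitrary units and only matter relative to each other.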

Executor
The executor runs the plan tree and produces the result set. It carries out the plan's operations, such as scans, sorts, and joins, in the way the plan specifies, and returns the resulting rows to the client. Along the way it works closely with other components of the PostgreSQL system, such as the storage manager, buffer manager, and transaction manager, to read and write data and to keep database operations consistent.
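
PostgreSQL's executor works as a demand-pull pipeline: each plan node asks its child for the next row only when it needs one. Python generators give a compact way to sketch that idea. The node functions and the sample rows below are hypothetical and only loosely modeled on a plan for the example query (scan, filter, sort, then output the selected columns).

```python
def seq_scan(rows):
    """Leaf node: hand out the table's rows one at a time."""
    for row in rows:
        yield row


def filter_rows(child, predicate):
    """Pass along only the rows that satisfy the predicate."""
    for row in child:
        if predicate(row):
            yield row


def sort_rows(child, key):
    """Sort must read all of its input before it can emit the first row."""
    yield from sorted(child, key=key)


def project(child, columns):
    """Keep only the requested columns, in the requested order."""
    for row in child:
        yield tuple(row[c] for c in columns)


# Hypothetical contents of tbl_a.
tbl_a = [{"id": 310, "data": 5}, {"id": 20, "data": 9}, {"id": 7, "data": 3}]

# SELECT id, data FROM tbl_a WHERE id < 300 ORDER BY data;
plan = project(
    sort_rows(
        filter_rows(seq_scan(tbl_a), lambda r: r["id"] < 300),
        key=lambda r: r["data"],
    ),
    ["id", "data"],
)

result = list(plan)
print(result)  # [(7, 3), (20, 9)]
```

Nothing runs until list(plan) starts pulling rows from the top node, which is the same top-down, on-demand flow the executor uses.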

Conclusion
These five stages run in sequence inside the backend process to handle each query issued by a connected client: the statement is parsed, analyzed, rewritten, planned, and finally executed within the PostgreSQL database cluster.

References
https://www.interdb.jp/pg/pgsql03.html
