This folder contains the source tree of the core of DuckDB. Below is a short overview of the different components.
The parser can be found in the parser folder. This is the entry point for any query that enters DuckDB. DuckDB uses the parser of Postgres (libpg_query). After parsing the query using that parser, the tokens are transformed into a custom parse tree representation that is based on SQLStatements, Expressions and TableRefs.
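To make the shape of such a parse tree concrete, here is a minimal sketch. The struct layouts and the `MakeExampleTree` helper are hypothetical simplifications written for this overview; they are not DuckDB's actual class definitions, which are considerably richer.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-ins for the parse tree node kinds.
struct Expression {
    std::string column_name; // a column reference such as "a"
};

struct TableRef {
    std::string table_name; // a base table reference such as "t"
};

struct SQLStatement {
    std::vector<std::unique_ptr<Expression>> select_list;
    std::unique_ptr<TableRef> from_table;
};

// Builds by hand the tree a parser might produce for: SELECT a, b FROM t
SQLStatement MakeExampleTree() {
    SQLStatement stmt;
    stmt.select_list.push_back(std::make_unique<Expression>(Expression{"a"}));
    stmt.select_list.push_back(std::make_unique<Expression>(Expression{"b"}));
    stmt.from_table = std::make_unique<TableRef>(TableRef{"t"});
    return stmt;
}
```

The point of the sketch is only that a statement owns its expressions and table references as a tree, which later phases can walk.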
The planner can be found in the planner folder. The planner is responsible for taking the parse tree produced from the query string by the Parser, and converting it into a Logical Query Plan. The plan is represented as a tree with nodes of type LogicalOperator.
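As an illustration of a plan represented as a tree of operators, the sketch below builds a three-node plan by hand. The `LogicalOperator` struct, the `LogicalOperatorType` enum values and the helper functions are assumptions made for this example, not the real DuckDB definitions.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical operator kinds; the real set is much larger.
enum class LogicalOperatorType { GET, FILTER, PROJECTION };

// Simplified plan node: a type, a free-form description, and child nodes.
struct LogicalOperator {
    LogicalOperatorType type;
    std::string detail;
    std::vector<std::unique_ptr<LogicalOperator>> children;
};

std::unique_ptr<LogicalOperator> MakeNode(LogicalOperatorType type, std::string detail) {
    auto node = std::make_unique<LogicalOperator>();
    node->type = type;
    node->detail = std::move(detail);
    return node;
}

// Builds a plan for: SELECT a FROM t WHERE a > 5
// PROJECTION(a) -> FILTER(a > 5) -> GET(t)
std::unique_ptr<LogicalOperator> BuildExamplePlan() {
    auto get = MakeNode(LogicalOperatorType::GET, "t");
    auto filter = MakeNode(LogicalOperatorType::FILTER, "a > 5");
    filter->children.push_back(std::move(get));
    auto projection = MakeNode(LogicalOperatorType::PROJECTION, "a");
    projection->children.push_back(std::move(filter));
    return projection;
}
```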
The optimizer can be found in the optimizer folder. The optimizer takes the Logical Query Plan generated by the Planner and translates it into a logically equivalent but (hopefully) faster to execute Logical Query Plan. Examples of optimizations that are performed are predicate pushdown, expression rewriting and join ordering. Both cost-based and rule-based optimizations are performed.
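Predicate pushdown, one of the optimizations named above, can be sketched on a toy plan as follows. This is a rule-based rewrite under strong assumptions: the `Op` struct is hypothetical, a filter is assumed to reference exactly one table, and only the pattern "filter directly above a join" is handled. It is not how DuckDB's optimizer is implemented.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class OpType { SCAN, FILTER, JOIN };

// Hypothetical plan node. For SCAN, `table` is the scanned table; for FILTER,
// it is the single table the predicate refers to (a simplifying assumption).
struct Op {
    OpType type;
    std::string table;
    std::string predicate;
    std::vector<std::unique_ptr<Op>> children;
};

std::unique_ptr<Op> MakeOp(OpType type, std::string table, std::string predicate) {
    auto op = std::make_unique<Op>();
    op->type = type;
    op->table = std::move(table);
    op->predicate = std::move(predicate);
    return op;
}

// Returns true if the subtree rooted at `op` scans `table`.
bool Scans(const Op &op, const std::string &table) {
    if (op.type == OpType::SCAN) {
        return op.table == table;
    }
    for (const auto &child : op.children) {
        if (Scans(*child, table)) {
            return true;
        }
    }
    return false;
}

// Rewrites FILTER(JOIN(l, r)) into JOIN(FILTER(l), r) or JOIN(l, FILTER(r)),
// depending on which input scans the filtered table. Other plans are returned unchanged.
std::unique_ptr<Op> PushDownFilter(std::unique_ptr<Op> root) {
    if (root->type != OpType::FILTER || root->children.empty() ||
        root->children[0]->type != OpType::JOIN) {
        return root;
    }
    std::unique_ptr<Op> join = std::move(root->children[0]);
    for (auto &side : join->children) {
        if (Scans(*side, root->table)) {
            root->children[0] = std::move(side); // the filter now sits on that input
            side = std::move(root);              // and the join reads from the filter
            return join;
        }
    }
    // No input scans the table: restore the original shape.
    root->children[0] = std::move(join);
    return root;
}
```

The rewritten plan is logically equivalent, but the join now processes fewer rows because filtering happens before it.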
The execution layer can be found in the execution folder. The execution layer first takes the Logical Query Plan resulting from the Optimizer and converts it into a Physical Query Plan consisting of PhysicalOperators. The PhysicalOperators are then executed using a push-based execution model.
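In a push-based model, a source hands chunks of data to the next operator instead of the consumer asking for them. The sketch below illustrates only that idea; `Chunk`, `Consumer`, `MakeFilter`, `Scan` and `SumGreaterThan` are names invented for this example and do not correspond to DuckDB's PhysicalOperator interface.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

using Chunk = std::vector<int>;
using Consumer = std::function<void(const Chunk &)>;

// Filter operator: receives a chunk, keeps values above `threshold`,
// and pushes the surviving values on to `next`.
Consumer MakeFilter(int threshold, Consumer next) {
    return [threshold, next](const Chunk &input) {
        Chunk output;
        for (int value : input) {
            if (value > threshold) {
                output.push_back(value);
            }
        }
        if (!output.empty()) {
            next(output);
        }
    };
}

// Source operator: drives the pipeline by pushing the table chunk by chunk.
void Scan(const std::vector<int> &table, std::size_t chunk_size, const Consumer &next) {
    for (std::size_t offset = 0; offset < table.size(); offset += chunk_size) {
        std::size_t end = std::min(offset + chunk_size, table.size());
        next(Chunk(table.begin() + offset, table.begin() + end));
    }
}

// Pipeline for: SELECT SUM(v) FROM table WHERE v > threshold
int SumGreaterThan(const std::vector<int> &table, int threshold) {
    int total = 0;
    // Sink operator: accumulates whatever is pushed into it.
    Consumer sink = [&total](const Chunk &chunk) {
        for (int value : chunk) {
            total += value;
        }
    };
    Scan(table, 2, MakeFilter(threshold, sink));
    return total;
}
```

Here control flow starts at the scan and moves towards the sink, which is the opposite direction from a pull-based (iterator) model.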
The catalog management can be found in the catalog folder. The catalog keeps track of the tables, schemas and functions that are contained in the database. The Catalog is used by the Binder in the planning phase to resolve symbols (e.g. "table_name") to the actual tables and columns that exist in the database.
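The symbol resolution described above can be pictured roughly as follows. The `Catalog` class, `TableEntry`, `BoundColumn` and `BindColumn` here are hypothetical and reduced to tables and column names only; the real Catalog and Binder also handle schemas, functions, types and much more.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical catalog entry: a table name and its column names.
struct TableEntry {
    std::string name;
    std::vector<std::string> columns;
};

// Toy catalog that only tracks tables.
class Catalog {
public:
    void CreateTable(const std::string &name, std::vector<std::string> columns) {
        tables_[name] = TableEntry{name, std::move(columns)};
    }

    // Returns nullptr if no table with that name exists.
    const TableEntry *GetTable(const std::string &name) const {
        auto it = tables_.find(name);
        return it == tables_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, TableEntry> tables_;
};

// Result of binding: the resolved table and the column's position in it.
struct BoundColumn {
    const TableEntry *table;
    std::size_t column_index;
};

// Resolves the symbols `table_name` and `column_name` against the catalog.
BoundColumn BindColumn(const Catalog &catalog, const std::string &table_name,
                       const std::string &column_name) {
    const TableEntry *table = catalog.GetTable(table_name);
    if (table == nullptr) {
        throw std::runtime_error("Table not found: " + table_name);
    }
    for (std::size_t i = 0; i < table->columns.size(); i++) {
        if (table->columns[i] == column_name) {
            return BoundColumn{table, i};
        }
    }
    throw std::runtime_error("Column not found: " + column_name);
}
```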
The database storage component can be found in the storage folder. The storage component is responsible for managing the actual physical data, both as it resides in memory and as it resides on disk. The execution layer uses the storage component whenever it needs to access the base table data (e.g. perform a base table scan) or when it needs to update the information stored in the database (as part of e.g. an INSERT or UPDATE command).
The transaction management can be found in the transaction folder. The transaction manager manages all currently open transactions, and is in charge of handling COMMIT or ROLLBACK commands.
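The bookkeeping role of a transaction manager can be sketched with a toy class. Everything here is an assumption made for illustration: `ToyTransactionManager` buffers writes per transaction and applies them on commit, which is only one simple way to get commit and rollback semantics and should not be read as a description of DuckDB's actual mechanism.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy: tracks open transactions and their uncommitted writes.
class ToyTransactionManager {
public:
    using TransactionId = std::uint64_t;

    // Starts a new transaction and registers it as open.
    TransactionId Begin() {
        TransactionId id = next_id_++;
        pending_.emplace(id, std::map<std::string, int>{});
        return id;
    }

    // Buffers a write; it is not visible in committed state yet.
    void Write(TransactionId id, const std::string &key, int value) {
        pending_.at(id)[key] = value;
    }

    // COMMIT: applies the buffered writes and closes the transaction.
    void Commit(TransactionId id) {
        for (const auto &entry : pending_.at(id)) {
            committed_[entry.first] = entry.second;
        }
        pending_.erase(id);
    }

    // ROLLBACK: discards the buffered writes and closes the transaction.
    void Rollback(TransactionId id) {
        pending_.erase(id);
    }

    std::size_t OpenTransactionCount() const {
        return pending_.size();
    }

    bool IsCommitted(const std::string &key) const {
        return committed_.count(key) > 0;
    }

private:
    TransactionId next_id_ = 1;
    std::map<TransactionId, std::map<std::string, int>> pending_;
    std::map<std::string, int> committed_;
};
```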