Name		Name	Last commit message	Last commit date
Latest commit History 304 Commits
clusters		clusters
config		config
docs		docs
src		src
tests		tests
.dockerignore		.dockerignore
.drone.yml		.drone.yml
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
rustfmt.toml		rustfmt.toml

Repository files navigation

toyDB

Distributed SQL database in Rust, written as a learning project. Most components have been built from scratch, including:

Raft-based distributed consensus engine for linearizable state machine replication.
ACID-compliant transaction engine with MVCC-based snapshot isolation.
Iterator-based query engine with heuristic optimization and time-travel support.
SQL interface including projections, filters, joins, and aggregates.

For details, see the documentation.

Usage

A local five-node cluster can be started on localhost ports 9601 to 9605 by running:

# Local processes
$ (cd clusters/local && ./run.sh)

# Docker containers 
$ (cd clusters/docker && docker-compose up --build)

A command-line REPL client can be built and used with the node on localhost port 9605 by running:

$ cargo run --bin toysql
Connected to node "toydb-e" (version 0.1.0). Enter !help for instructions.
toydb>

A basic subset of SQL has been partially implemented, e.g.:

toydb> CREATE TABLE movies (id INTEGER PRIMARY KEY, title VARCHAR NOT NULL)
toydb> INSERT INTO movies VALUES (1, 'Sicario'), (2, 'Stalker'), (3, 'Her')
toydb> SELECT * FROM movies
1|Sicario
2|Stalker
3|Her

ACID transactions are supported via snapshot isolation:

toydb> BEGIN
Began transaction 3
toydb:3> INSERT INTO movies VALUES (4, 'Alien vs. Predator')
toydb:3> ROLLBACK
Rolled back transaction 3

toydb> BEGIN
Began transaction 4
toydb:4> INSERT INTO movies VALUES (4, 'Alien'), (5, 'Predator')
toydb:4> COMMIT
Committed transaction 4

toydb> SELECT * FROM movies
1|Sicario
2|Stalker
3|Her
4|Alien
5|Predator

Time-travel queries are also supported:

toydb> BEGIN TRANSACTION READ ONLY AS OF SYSTEM TIME 2
Began read-only transaction in snapshot of version 2
toydb@1> SELECT * FROM movies
1|Sicario
2|Stalker
3|Her

Known Issues

The primary goal is to build a minimally functional yet correct distributed database. Performance, security, reliability, and convenience are non-goals. Below is an incomplete list of known issues preventing this from being a "real" database.

Networking

No security: all network traffic is unauthenticated and in plaintext.

Raft

Cluster reconfiguration: the Raft cluster must consist of a static set of nodes available via static IP addresses. It is not possible to resize the cluster without a full cluster restart.
Single-threaded state: all state machine operations are processed by a single thread on a single node, preventing horizontal scalability.
Log replication: only the simplest form of Raft log replication is implemented, without snapshots or rapid log replay.

Storage

Recovery: there is no WAL or other form of crash recovery - unexpected termination is likely to corrupt data.
Garbage collection: old Raft log entries or MVCC versions are never removed, giving unbounded disk usage but also unbounded data history.

Transactions

Abortion: client disconnects or system crashes do not abort transactions, which may cause serialization failures for other transactions due to uncommitted writes.
Transient reads: all statements, including trivial SELECTs, will be wrapped in a transaction with a globally allocated, persistent transaction ID.
Serializability: transactions use MVCC without serializable snapshot isolation, and are thus vulnerable to write skew anomalies.
ID overflow: transaction IDs are stored as unsigned 64-bit integers without wraparound, thus no new transactions can be started beyond transaction number 2⁶⁴-1.

Schema

Single database: only a single, unnamed database is supported per toyDB cluster.
Schema changes: schema changes other than creating or dropping tables is not supported.

Query Engine

Type checking: query type checking (e.g. SELECT a + b must receive two numbers) is done at query evaluation time, not at query compile time.
Streamed evaluation: it's not possible to operate on fields that were not output by a preceding plan node - e.g. it's only possible to order on SELECT output columns.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

toyDB

Usage

Known Issues

Networking

Raft

Storage

Transactions

Schema

Query Engine

About

Releases

Packages

Used by 289

Contributors 8

Languages

License

erikgrinaker/toydb

Folders and files

Latest commit

History

Repository files navigation

toyDB

Usage

Known Issues

Networking

Raft

Storage

Transactions

Schema

Query Engine

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Used by 289

Contributors 8

Languages

Packages