Starred repositories
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
An immutable SQL database for application development, time-travel reporting and data compliance. Developed by @juxt
A curated list of awesome Online Analytical Processing databases, frameworks, resources and other awesomeness.
evitaDB is a specialized database with an easy-to-use API for e-commerce systems. It is a low-latency NoSQL in-memory engine that handles all the complex tasks that e-commerce systems have to deal …
DuckDB is an analytical in-process SQL database management system
Master programming by recreating your favorite technologies from scratch.
Distributed, MVCC SQLite that runs on FoundationDB.
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
Learn database internals by implementing them from scratch.
An open-source time-series SQL database optimized for fast ingest and complex queries. Packaged as a PostgreSQL extension.
A cross-platform way to express data transformation, relational algebra, standardized record expression and plans.
A flexible distributed key-value datastore that supports both caching and beyond caching workloads.
Garnet is a remote cache-store from Microsoft Research that offers strong performance (throughput and latency), scalability, storage, recovery, cluster sharding, key migration, and replication feat…
A curated list of analytics frameworks, software and other tools.
A list of learning materials to understand database internals
1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java
A curated list of engineering blogs
A better compressed bitset in Java: used by Apache Spark, Netflix Atlas, Apache Pinot, Tablesaw, and many others
Essential Spark extensions and helper methods ✨😲