Stars
Video+code lecture on building nanoGPT from scratch
Cassovary is a simple big graph processing library for the JVM
Common Crawl support library to access 2008-2012 crawl archives (ARC files)
Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more
Hadoop library for large-scale data processing, now an Apache Incubator project
A distributed publish/subscribe messaging service
Netty project - an event-driven asynchronous network application framework
S4 is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous unbounded streams of da…
A fault-tolerant, protocol-agnostic RPC system
[Archived] A flexible sharding framework for creating eventually-consistent distributed datastores
Lightning-fast cluster computing in Java, Scala and Python.
Vowpal Wabbit is a machine learning system which pushes the frontier of machine learning with techniques such as online, hashing, allreduce, reductions, learning2search, active, and interactive lea…