Stars
Robust recipes to align language models with human and AI preferences
JunoDB is PayPal's home-grown, secure, consistent, and highly available key-value store providing low, single-digit-millisecond latency at any scale.
Extensible Rules Engine for custom DataFrame/Dataset validation
Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Essential Spark extensions and helper methods ✨😲
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…
Examples for High Performance Spark
Apache Spark - A unified analytics engine for large-scale data processing