- Barcelona
Starred repositories
CMAK is a tool for managing Apache Kafka clusters
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
A machine learning package built for humans.
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
A Spark UI and Spark History Server alternative with CPU and Memory metrics! Delight is free, cross-platform, and open-source.
A library you can include in your Spark job to validate the counters and perform operations on success. The goal is Scala/Java/Python support.
Performance optimization for Spark running on Kubernetes
Mirrors a Kinesis stream to Amazon S3 using the KCL
Stores Snowplow enriched events in Redshift, Snowflake and Databricks
Solutions to Project Euler problems in Scala.
Load testing for event analytics platforms (Snowplow, more coming soon)