Lists (27)
airflow
airflowAirflow / Other Orchestrators
API Service Building
Blogazo God
dashboards / dataviz / superset
Data Platform Thought & Tooling
dbt
delta
distributedsystems - k8s -devops
emr
Fink
flinkFull Blown DE Projects
Game Development
GD GDGame Development Resources
INT PREP
Kafka
Leetcode
Marketing - Product Building
Pinot - Clickhouse
POSium
PySpark
Reference Components
ScalaSpark
Spark Generic
SGRSpark God
Streaming - Data Engineering
Stream,meisterWebDev
Stars
Primary repository for NYC DCP's Data Engineering team
This repository hosts materials for the Docker for Data Engineers workshop, offering hands-on exercises and resources tailored for data engineering professionals.
escobar-west / polars-cookbook
Forked from jvns/pandas-cookbook
Recipes for using Python's polars library
This is the Rust course used by the Android team at Google. It provides you the material to quickly teach Rust.
Google's Engineering Practices documentation
All the resources you need to get to Senior Engineer and beyond
Compare pyarrow to duckdb to polars for writing data pipelines.
Awesome lists about Project Management interesting and useful topics.
A curated list of awesome resources, related to game production process: books, articles, tools, project management stuff etc.
The Open-Source Enterprise Data Platform in a single Portal
Apache Camel is an open source integration framework that empowers you to quickly and easily integrate various systems consuming or producing data.
Learn Domain-Driven Design, software architecture, design patterns, best practices. Code examples included
A curated list of Domain-Driven Design (DDD), Command Query Responsibility Segregation (CQRS), Event Sourcing, and Event Storming resources
The Data Contract Specification Repository
The Score Specification provides a developer-centric and platform-agnostic Workload specification to improve developer productivity and experience. It eliminates configuration inconsistencies betwe…
Productionalizing Data Pipelines with Apache Airflow
SuperSonic is the next-generation BI+AI platform that integrates Chat BI (powered by LLM) and Headless BI (powered by semantic layer) paradigms.
Synmetrix – production-ready open source semantic layer on Cube
📊 Cube — The Semantic Layer for Building Data Applications
A guide to running Airflow on Kubernetes
How to deploy airflow on Kubernetes
Notes talking about the design and implementation of Apache Spark
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.