Python framework for building efficient data pipelines. It promotes modularity and collaboration, enabling the creation of complex pipelines from simple, reusable components.
Updated Oct 30, 2024 - Python
Amazon SageMaker Local Mode Examples
The Lakehouse Engine is a configuration-driven Spark framework, written in Python, serving as a scalable and distributed engine for several lakehouse algorithms, data flows and utilities for Data Products.
Sample project to demonstrate data engineering best practices
This repository exemplifies a simple ELT process using Delta Lake to perform upserts and to remove data files that are no longer in the latest state of the table's transaction log.
Module 1 exercises - Bootcamp EDC - IGTI 2021
A Delta Lake reader for Dask
Creation of a data lakehouse and an ELT pipeline to enable the efficient analysis and use of data
Spark data pipeline that processes movie ratings data.
Implementation of an ETL process for real-time sentiment analysis of tweets with Docker, Apache Kafka, Spark Streaming, MongoDB and Delta Lake
Spark Structured Streaming data pipeline that processes movie ratings data in real-time.
Free High-Quality Financial Data in Azure
UI to run SQL on Delta Lake tables and visualize the variations of the result among tables versions
End-to-end data platform: A PoC Data Platform project using a modern data stack (Spark, Airflow, DBT, Trino, Lightdash, Hive metastore, Minio, Postgres)
Introducing Delta-Buddy: Your ultimate Delta Lake companion! 🚀 Streamline your data journey with an AI-powered chatbot. Ask Delta-Buddy anything about your Delta Lake.
A quick example for Delta Lake running on AWS EMR Serverless Spark
Streaming ETL job cases in AWS Glue that integrate Delta Lake and create an in-place updatable data lake on Amazon S3
Automated provisioning of an industry Lakehouse with enterprise data model