alanchn31 / Data-Engineering-Projects (Star 784). Personal Data Engineering Projects. Topics: postgres, airflow, spark, cassandra, mongodb, data-warehouse, data-engineering, data-lake, scrapy, data-modeling, aws-redshift, star-schema, ingest-data, data-engineering-nanodegree. Updated Feb 8, 2023. Jupyter Notebook.
uber / marmaray (Star 475). Generic Data Ingestion & Dispersal Library for Hadoop. Topics: spark, hadoop, data-lake, avro-schema, ingest-data, schema-format. Updated Mar 19, 2023. Java.
aws-samples / aws-dbs-refarch-datalake (Star 75). Reference Architectures for Datalakes on AWS. Topics: glue, amazon-emr, data-transformation, data-lake, data-catalog, data-analytics, hive-metastore, emr-cluster, ingest-data. Updated May 13, 2020. HTML.
18F / data-federation-project (Star 28). A project focused on tools and best practices to support federated data collection efforts. Topics: opendata, ingest-data, 10x-data-federation, data-federation. Updated May 5, 2020.
romnn / mongoimport (Star 3). CLI and Go library for importing data from CSV, JSON or XML files into MongoDB. Topics: cli, golang, json, csv, mongodb, pipeline, importer, xml, loader, import, ingest-data. Updated Feb 3, 2021. Go.
IndeemaSoftware / QPredix (Star 3). This is a Qt/C++ SDK for the Predix GE services API (https://www.predix.io/), developed by Indeema Software Inc. Topics: cpp, industrial, qpm, predix, predix-uaa, predix-cloud, ingest-data, timeseries-service. Updated Sep 11, 2018. C++.
Yan-Luo-AU / SnowFlake_Data_Warehouse_Project_ETL_Process (Star 1). Star Schema design and Data Ingestion. Topics: snowflake, s3-bucket, notification, stage, dimension-tables, star-schema, ingest-data, videotitle. Updated Jan 26, 2021.