This project demonstrates an end-to-end solution for processing and analyzing real-time conversations data from a JSON file using GCP services and infrastructure automation, showcasing data storage…

Python 8 1 Updated Apr 29, 2024

josephmachado / python-v-sql-for-data-transform

Python or SQL for data transformation

Python 8 Updated Jul 4, 2024

PaulJuliusMartinez / jless

jless is a command-line JSON viewer designed for reading, exploring, and searching through JSON data.

Rust 4,769 91 Updated Sep 7, 2024

josephmachado / spark_submit_airflow

Simple repo to demonstrate how to submit a spark job to EMR from Airflow

Python 32 23 Updated Oct 18, 2020

josephmachado / data_engineering_project_template

A template repository to create a data project with IAC, CI/CD, Data migrations, & testing

HTML 235 101 Updated Jul 11, 2024

josephmachado / e2e_datapipeline_test

Example repo to create end to end tests for data pipeline.

Python 21 4 Updated Jun 14, 2024

josephmachado / change_data_capture

Repo for CDC with debezium blog post

Python 26 12 Updated Sep 15, 2024

josephmachado / online_store

End to end data engineering project

Python 49 17 Updated Oct 27, 2022

josephmachado / sde_de101_josephmachado

Sample repo for startdataengineering DE 101 free course

35 23 Updated Jun 24, 2024

josephmachado / python_essentials_for_data_engineers

Code for blog at https://www.startdataengineering.com/post/python-for-de/

Python 54 60 Updated Jun 7, 2024

josephmachado / cost_effective_data_pipelines

Cost Efficient Data Pipelines with DuckDB

C 44 65 Updated Jul 31, 2024

josephmachado / simple_dbt_project

Code for dbt tutorial

142 73 Updated May 31, 2024

josephmachado / socialetl

Project for "Data pipeline design patterns" blog.

Python 41 6 Updated Aug 6, 2024

Start Data Engineering josephmachado

Stars

Interana / eventsim

josephmachado / de-workshop-prereq

josephmachado / iceberg-features

josephmachado / de_project

cartershanklin / pyspark-cheatsheet

kevinschaich / pyspark-cheatsheet

jupyterlab / jupyterlab

josephmachado / etl-dashboard

AnswerDotAI / fasthtml

josephmachado / data-engineering-interview-series

josephmachado / recipes

NYCPlanning / data-engineering

evildmp / diataxis-documentation-framework

josephmachado / analytical_dp_with_sql

josephmachado / how-to-slash-dbt-cost-w-duckdb

josephmachado / data-quality-w-greatexpectations

josephmachado / adv_data_transformation_in_sql

janaom / gcp-de-project-streaming-pubsub-beam-dataflow