ihalytskyi

1 follower · 2 following

Stars

spring-guides / gs-accessing-data-jpa

Accessing Data with JPA :: Learn how to work with JPA data persistence using Spring Data JPA.

Java 236 346 Updated Aug 2, 2024

airtai / faststream

FastStream is a powerful and easy-to-use Python framework for building asynchronous services interacting with event streams such as Apache Kafka, RabbitMQ, NATS and Redis.

Python 2,346 118 Updated Sep 8, 2024

rockthejvm / udemy-scala-beginners

Scala 285 344 Updated May 7, 2024

MartinThoma / flake8-simplify

❄ A flake8 plugin that helps you to simplify code

Python 183 19 Updated Dec 25, 2023

amundsen-io / amundsen

Amundsen is a metadata driven application for improving the productivity of data analysts, data scientists and engineers when interacting with data.

Python 4,379 953 Updated Sep 3, 2024

tokern / piicatcher

Scan databases and data warehouses for PII data. Tag tables and columns in data catalogs like Amundsen and Datahub

Python 270 91 Updated Jan 5, 2024

tokern / dbcat

Data Catalog for Databases and Data Warehouses

Python 31 9 Updated Jan 15, 2024

NaimKabir / jinja-sql-demo

A proof of concept for how to set up a codebase for an analytics org.

Python 13 7 Updated Aug 15, 2021

donnemartin / system-design-primer

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Python 268,953 45,449 Updated Aug 7, 2024

sqlalchemy / sqlalchemy

The Database Toolkit for Python

Python 9,387 1,408 Updated Sep 8, 2024

pypa / sampleproject

A sample project that exists for PyPUG's "Tutorial on Packaging and Distributing Projects"

Python 5,077 1,717 Updated Aug 6, 2024

miguelgrinberg / python-socketio

Python Socket.IO server and client

Python 3,939 583 Updated Sep 2, 2024

dbt-labs / dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Python 9,591 1,589 Updated Sep 8, 2024

treeverse / lakeFS

lakeFS - Data version control for your data lake | Git for data

Go 4,346 343 Updated Sep 8, 2024

mara / mara-pipelines

A lightweight opinionated ETL framework, halfway between plain scripts and Apache Airflow

Python 2,068 100 Updated Dec 15, 2023

JohnMcCambridge / flenser

Flenser is a simple, minimal, automated exploratory data analysis tool.

Python 78 6 Updated Apr 4, 2021

prometheus / statsd_exporter

StatsD to Prometheus metrics exporter

Go 913 230 Updated Sep 1, 2024

fluentpython / example-code

Example code for the book Fluent Python, 1st Edition (O'Reilly, 2015)

Python 5,544 2,175 Updated Dec 2, 2021

AllenDowney / ThinkStats2

Text and supporting code for Think Stats, 2nd Edition

Jupyter Notebook 4,020 11,282 Updated Jul 1, 2024

gabfr / data-engineering-nanodegree

Notebooks produced throughout Udacity's Data Engineering Nanodegree course

Jupyter Notebook 72 59 Updated Oct 3, 2020

capitalone / datacompy

Pandas, Polars, and Spark DataFrame comparison for humans and more!

Python 463 123 Updated Aug 21, 2024

oxnr / awesome-bigdata

A curated list of awesome big data frameworks, resources and other awesomeness.

13,119 2,549 Updated May 7, 2024

Pushkr / Apache-Spark-Hands-On

Educational notes and hands-on problems with solutions for the Hadoop ecosystem

Python 86 76 Updated Jan 22, 2019

aws / aws-sdk-pandas

pandas on AWS - Easy integration with Athena, Glue, Redshift, Timestream, Neptune, OpenSearch, QuickSight, Chime, CloudWatchLogs, DynamoDB, EMR, SecretManager, PostgreSQL, MySQL, SQLServer and S3 (…

Python 3,888 688 Updated Sep 5, 2024

kjam / data-cleaning-101

Data Cleaning Libraries with Python

Jupyter Notebook 280 174 Updated Sep 15, 2023

maxis42 / Big-Data-Engineering-Coursera-Yandex

Big Data for Data Engineers Coursera Specialization from Yandex

Jupyter Notebook 102 75 Updated Mar 15, 2023

udacity / nd027-c3-data-lakes-with-spark

Python 142 224 Updated May 23, 2023

jukkakansanaho / udacity-dend-project-5

Udacity Data Engineer Nano Degree - Project-5 (Data Pipelines)

Python 5 3 Updated Jul 19, 2019

cluster-apps-on-docker / spark-standalone-cluster-on-docker

Learn Apache Spark in Scala, Python (PySpark) and R (SparkR) by building your own cluster with a JupyterLab interface on Docker. ⚡

Jupyter Notebook 443 190 Updated Dec 24, 2022

apache / spark

Apache Spark - A unified analytics engine for large-scale data processing

Scala 39,267 28,183 Updated Sep 8, 2024
