Stars
A data engineering project with Kafka, Spark Streaming, dbt, Docker, Airflow, Terraform, GCP and much more!
Learn ML engineering for free in 4 months!
Jupyter Notebooks as Markdown Documents, Julia, Python or R scripts
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
An orchestration platform for the development, production, and observation of data assets.
Modin: Scale your Pandas workflows by changing a single line of code
Apache Superset is a Data Visualization and Data Exploration Platform
Python Fire is a library for automatically generating command line interfaces (CLIs) from absolutely any Python object.
Solutions for various coding/algorithmic problems and many useful resources for learning algorithms and data structures
Visualize and compare datasets, target values and associations, with one line of code.
Free Data Engineering course!
Official Python library for the DeepL language translation API.
Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
Algorithms for outlier, adversarial and drift detection
Papers & tech blogs by companies sharing their work on data science & machine learning in production.
C++ `std::unique_ptr` that represents each object as an NFT on the Ethereum blockchain
60+ Implementations/tutorials of deep learning papers with side-by-side notes; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Monitor deep learning model training and hardware usage from your mobile phone
A GitHub action for autopep8, a tool that automatically formats Python code to conform to the PEP 8 style guide.
A curated list of data engineering tools for software developers
Dataframes powered by a multithreaded, vectorized query engine, written in Rust
Create full-fledged APIs for slowly moving datasets without writing a single line of code.
AngularJS - HTML enhanced for web apps!