- Oak Park, MI
- frankcash.github.io
Stars
Open source platform for the machine learning lifecycle
One-line data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Lab assignments for Introduction to Data-Centric AI, MIT IAP 2024 👩🏽‍💻
(Legacy) Command Line Interface for Databricks
Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
A highly efficient daemon for streaming data from Kafka into Delta Lake
NeuralProphet: A simple forecasting package
A cluster computing framework for processing large-scale geospatial data
A curated list of awesome Apache Spark packages and resources.
Materials for a 2-day instructor-led course on applying machine learning
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports comp…
dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
dbt-redshift contains all of the code enabling dbt to work with Amazon Redshift
The resources of the preparation course for Databricks Data Engineer Associate certification exam
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow
CLI tool which enables you to login and retrieve AWS temporary credentials using a SAML IDP
Snowflake Connector for Python
Astronomer Starship can send your Airflow workloads to new places!
An orchestration platform for the development, production, and observation of data assets.
An Open Standard for lineage metadata collection
Reads key-value pairs from a .env file and can set them as environment variables. It helps in developing applications following the 12-factor principles.
Apache Airflow - OpenApi Client for Python
Work with remote image registries: retrieving information and images, and signing content
A modular SQL linter and auto-formatter with support for multiple dialects and templated code.