LETSQL provides a unified interface to multi-engine data pipelines. It is focused on composability with first-class support for UDFs 🚗🛠.

Python 34 1 Updated Sep 5, 2024

A repository with my config files and setup instructions for my terminal prompt on the Mac.

Shell 9 1 Updated Aug 4, 2023

A lightweight CLI tool for versioning data alongside source code and building data pipelines.

Go 181 7 Updated Sep 2, 2024

🦉 ML Experiments and Data Management with Git

Python 13,566 1,173 Updated Sep 5, 2024

The mission of Project Drawdown is to help the world reach “Drawdown”, the point in the future when levels of greenhouse gases in the atmosphere stop climbing and start to steadily decline, thereby…

Python 212 91 Updated Oct 10, 2022

Backend Code for Massenergize Portal. This provides the API to the backend database, and is shared by the various front-end portal projects.

Python 5 8 Updated Sep 5, 2024

Python 54 14 Updated Jul 26, 2017

User-friendly Teradata client for Python

Python 108 35 Updated Nov 17, 2021

Ansible examples using Vagrant to deploy to local VMs.

2,082 707 Updated Nov 17, 2023

Scalable Machine Learning in Scalding

Java 361 61 Updated Feb 16, 2018

Programming MapReduce with Scalding

Scala 81 46 Updated Dec 5, 2015

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

Python 36,139 14,026 Updated Sep 5, 2024

Ferry lets you define, run, and deploy big data applications on AWS, OpenStack, and your local machine using Docker

Python 252 25 Updated May 30, 2015

[PROJECT IS NO LONGER MAINTAINED] Code examples that show how to integrate Apache Kafka 0.8+ with Apache Storm 0.9+ and Apache Spark Streaming 1.1+, while using Apache Avro as the data serialization fo…

Scala 725 330 Updated Mar 22, 2022

Dockerfiles and scripts for Spark and Shark Docker images

Shell 261 102 Updated Jun 19, 2014

Hadoop docker image

Dockerfile 1,212 561 Updated Jun 25, 2020

Luigi is a Python module that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization etc. It also comes with Hadoop support built in.

Python 17,672 2,390 Updated Sep 5, 2024

scikit-learn: machine learning in Python

Python 59,339 25,233 Updated Sep 5, 2024