aws-emr-clusters

Here are 39 public repositories matching this topic...

RubensZimbres / Repo-2019

BERT, AWS RDS, AWS Forecast, EMR Spark Cluster, Hive, Serverless, Google Assistant + Raspberry Pi, Infrared, Google Cloud Platform Natural Language, Anomaly detection, TensorFlow, Mathematics

sql-server tensorflow mathematica pyspark aws-rds bert raspberry-pi-3 keras-tensorflow wolfram-mathematica anomaly-detection hiveql emr-cluster googleassistant aws-emr-clusters googlespeech mathe bert-model

Updated Aug 6, 2021
Jupyter Notebook

terraform-aws-modules / terraform-aws-emr

Terraform module to create AWS EMR resources 🇺🇦

terraform aws-emr terraform-module aws-emr-clusters aws-emr-serverless

Updated May 4, 2024
HCL

AWS-Big-Data-Projects / Run-a-Spark-job-within-Amazon-EMR

Run a Spark job within Amazon EMR

spark apache-spark aws-s3 aws-emr aws-emr-clusters

Updated Sep 12, 2020
Java

suvayu / emr-scripts

Shell scripts for AWS EMR clusters

spark cluster aws-cli aws-emr-clusters

Updated Jan 25, 2018
Shell

felipeazucares / Airflow-EMR-Redshift

EMR + Hadoop to Redshift ELT workflow using the Spark steps API and orchestrated by Apache Airflow. It ingests disparate datasets centered on 7 GB of I94 arrivals information and produces a simple star schema in Redshift.

apache-spark aws-s3 aws-emr sas7bdat-datasets apache-airflow aws-emr-clusters i94

Updated Feb 25, 2021
Python

khushal2405 / Daily-Incremental-load-ETL-pipeline-for-Ecommerce-company-using-AWS-Lambda-and-Apache-airflow

Daily incremental-load ETL pipeline for an e-commerce company using AWS Lambda and an AWS EMR cluster, deployed using Apache Airflow in a Docker container.

Updated Mar 17, 2023
Python

abhibalani / emr_lambda

Lambda function to start an EMR cluster and run a MapReduce job

aws aws-lambda aws-emr hadoop-mapreduce aws-emr-clusters mapreduce-python

Updated Aug 16, 2019
Python

anuragkr29 / TightCommunityDetection

Detect tight communities in a social network

aws scala spark graphx amazon-s3 kerbosch aws-emr-clusters graphloader

Updated Aug 7, 2019
Scala

rigganni / AWS-Spark-Million-Song-ETL

Load data from the Million Song Dataset into a final dimensional model stored in S3.

apache-spark etl aws-emr parquet parquet-files dimensional-model aws-emr-clusters

Updated May 17, 2020
Python

nikhilsu / Product-review-analysis-Spark-MongoDB

Various product review analyses on an Amazon dataset using Apache Spark and MongoDB

spark apache-spark mongodb aws-s3 spark-clusters spark-sql big-data-analytics aws-emr-clusters

Updated Oct 17, 2018
Java

rshinde03 / Default-Credit-Data-Analysis-and-Prediction-Using-Big-Data

Credit defaults cause large losses for banks and other lenders, and the success of the banking industry depends on the ability to understand risk. This project uses big data technologies such as MapReduce and HDFS, along with PySpark and AWS, to analyze credit history and predict defaults.

aws big-data spark hive aws-s3 pyspark aws-ec2 hadoop-mapreduce cloudera-manager hadoop-hdfs aws-emr-clusters

Updated May 5, 2021
Jupyter Notebook

dvu4 / udacity-data-engineering

Data Engineering Projects including Data Modeling, Data Warehouse, Data Lake Development

Updated May 25, 2020
Jupyter Notebook

silviomori / covid19-datalake

python emr docker aws data-science airflow spark docker-container aws-s3 ecs python3 aws-emr data-engineering data-lake aws-ecs boto3 aws-emr-clusters aws-ecs-cluster

Updated Jun 19, 2020
Python

UCloudM / Steam_Analysis_For_Gamers

Analysis of data from the Steam platform using Apache Spark and cloud services such as Amazon Web Services.

python aws data-science big-data apache-spark aws-ec2 aws-emr-clusters

Updated Dec 11, 2019
Python

johnnyiller / cluster_funk

An opinionated framework for running big data jobs

aws big-data spark aws-emr pyspark aws-emr-clusters

Updated Nov 11, 2022
Python

Adith-Rai / Reddit-Stock-Sentiment-Analyzer

A cloud-based Reddit stock sentiment analyzer that computes the overall sentiment for each stock from a configurable selection of stock subreddits. The architecture uses AWS MSK (Kafka), AWS EMR (PySpark), and AWS Lambda (Python 3) for scalability, and the OpenAI API for sentiment analysis through prompt engineering.

aws-lambda reddit-api python3 pyspark aws-ec2 aws-emr-clusters aws-msk openai-api

Updated Jan 30, 2024
Python

nihil21 / DocxAnonymizer-spark

Stand-alone Scala & Java tool to anonymize OOXML Documents (DOCX)

java scala spark parallel-computing parallelization anonymisation parallel-programming aws-emr-clusters

Updated Nov 27, 2021
Java

SRVivek1 / pyspark-rdd-dataframe-examples

PySpark RDD and DataFrame Examples

python aws aws-lambda aws-s3 python-script pyspark aws-ec2 rdd aws-redshift aws-db-instance python-lambda aws-emr-clusters

Updated Feb 18, 2024
Python

sagardua297 / udacity-data-engineering-nd

Data Pipeline Analytics Platform is an end-to-end, generic big data pipeline. It uses the following tech stack: AWS S3, AWS Redshift, AWS EMR Cluster, Apache Spark, and Apache Airflow.

python airflow spark cassandra aws-s3 data-warehouse data-engineering data-lake data-modeling airflow-plugin aws-redshift etl-pipeline aws-emr-clusters postrgresql airflow-dags airflow-operators

Updated Feb 13, 2021
Python

kacperstyslo / most-wanted-programming-skills-finder

With this app, you can see what programming skills are most in-demand in the current job market.

javascript css docker scraper django docker-compose aws-s3 postgresql pandas pyspark serverless-framework shell-scripting aws-lambda-python aws-emr-clusters terraform-aws python38 airflow-dags

Updated Dec 20, 2021
Python
