DSCI 525-group8 Rainfall Predictive Model Repository

UBC-MDS/525-group8

Rainfall Predictive Model

DSCI 525 - Web and Cloud Computing

Group 8 project for DSCI 525 (Web and Cloud Computing), part of the Master of Data Science program at UBC.

The goal of the project is to build and deploy ensemble machine learning models that predict daily rainfall in Australia. In line with the course objectives, we also examine the limitations of working with large datasets on our own computers and the advantages of moving that work to the cloud.

The project is organized around the following milestones:

Milestone 1 (Week 1) - Get the data from the web and become familiar with advanced file formats

Milestone 2 (Week 3) - Set up an S3 bucket, an EC2 instance, and TLJH (The Littlest JupyterHub)

Milestone 3 (Week 4) - Set up an EMR Spark instance and rewrite the ML model from the previous milestone in Spark

Milestone 4 (Week 5) - Deploy the ML model using Flask
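To give a sense of what Milestone 4 involves, here is a minimal sketch of a Flask prediction endpoint. It is not the deployed application from this repository: the `/predict` route, the `data` key in the JSON payload, and the `predict_rainfall` helper are all assumptions made for illustration, and the helper simply averages its inputs instead of calling a trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def predict_rainfall(features):
    """Hypothetical stand-in for a trained model.

    A real deployment would load a fitted model and call its predict
    method here; this placeholder just averages the inputs.
    """
    return sum(features) / len(features)


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body such as {"data": [1.2, 0.0, 3.4]} (assumed format).
    payload = request.get_json(silent=True) or {}
    features = payload.get("data")
    if not isinstance(features, list) or not features:
        return jsonify({"error": "'data' must be a non-empty list"}), 400
    try:
        values = [float(x) for x in features]
    except (TypeError, ValueError):
        return jsonify({"error": "'data' must contain only numbers"}), 400
    return jsonify({"prediction": predict_rainfall(values)})

# To serve the app from a script, call app.run(host="0.0.0.0", port=8080).
```

Keeping the prediction logic in its own function means the route only validates input and formats output, so the placeholder can later be swapped for a real model without touching the HTTP layer.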

Report

The project report is available as a notebook here.

Data

This project uses a very large rainfall dataset (more than 6 GB) hosted on figshare.
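One way to locate the dataset's files programmatically is through the public figshare API. The sketch below uses only the Python standard library. The article ID 14096681 comes from the figshare URL in the References section; the helper names are hypothetical, and the `files`, `name`, and `download_url` fields are assumed to be present in the API response.

```python
import json
import urllib.request

FIGSHARE_API = "https://api.figshare.com/v2/articles/{article_id}"
ARTICLE_ID = 14096681  # taken from the figshare URL cited in References


def article_endpoint(article_id):
    """Build the API URL that returns metadata for one figshare article."""
    return FIGSHARE_API.format(article_id=article_id)


def fetch_article_metadata(article_id, timeout=30):
    """Request the article metadata and parse it as JSON (needs network access)."""
    with urllib.request.urlopen(article_endpoint(article_id), timeout=timeout) as resp:
        return json.load(resp)


def find_download_url(metadata, filename):
    """Return the download URL of the file called `filename`.

    Assumes the metadata has a "files" list whose entries carry
    "name" and "download_url" keys.
    """
    for entry in metadata.get("files", []):
        if entry.get("name") == filename:
            return entry["download_url"]
    raise KeyError(f"{filename!r} not found in article files")


def download(url, destination):
    """Save the file at `url` to the local path `destination`."""
    urllib.request.urlretrieve(url, destination)
```

A typical session would call `fetch_article_metadata(ARTICLE_ID)`, inspect the file names it lists, and then pass the chosen name to `find_download_url` before calling `download`. Because the archive is several gigabytes, the download step is better run once and cached than repeated in a notebook.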

The features are outputs of different climate models and the target is the actual observed rainfall. This dataset contains observations from 1889 to 2014.
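Because every feature is itself a rainfall estimate from a climate model, a natural baseline is to average the model outputs for each day and compare that average with the observed value. The sketch below illustrates this idea with the standard library only. It is not necessarily the approach taken in the project, and the numbers are made-up toy values, not rows from the dataset.

```python
import math


def ensemble_mean(model_outputs):
    """Average the rainfall estimates from several climate models for one day."""
    return sum(model_outputs) / len(model_outputs)


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy data (illustrative only): each row holds the outputs of two
# hypothetical climate models and the observed rainfall for that day.
rows = [
    ([1.0, 3.0], 2.0),
    ([0.0, 4.0], 5.0),
]

baseline_predictions = [ensemble_mean(models) for models, _ in rows]
observations = [obs for _, obs in rows]
baseline_rmse = rmse(baseline_predictions, observations)
```

Any trained ensemble model should at least beat this baseline's error on held-out data, which makes it a useful sanity check before scaling up to the cloud.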

Contributors

Contributor | GitHub handle
Rachel Wong | @rachelywong
Santiago Rugeles | @ansarusc
Rui Wang | @wang-rui
Daniel Ortiz | @danielon-5

References

Beuzen, T. (2021). Daily rainfall over NSW, Australia. figshare. https://figshare.com/articles/dataset/Daily_rainfall_over_NSW_Australia/14096681/3