Group-38

COMP90024 Cluster & Cloud Computing Assignment 2

The demo can be seen at https://172.26.38.49:3000/ (make sure the Unimelb VPN is connected).

Prerequisites

We have built a simple real-time cloud project which runs on the Nectar Cloud. The project uses the following frameworks and technologies:

  1. Ansible: Used as a configuration management tool to automate cloud deployment.
  2. Python: Used for tweet harvesting and profanity analysis.
  3. Node.js and Express.js: Used for the web app.
  4. CouchDB: Used as the database for storing raw tweets and analyzed results (see the sketch after this list).
  5. AURIN: AURIN datasets are used for result validation.
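
To make the data flow concrete, the sketch below shows how a harvested tweet could end up in CouchDB. This is a minimal illustration, assuming the Python `couchdb` client; the server URL, credentials, database name and document fields are placeholders, not the project's actual configuration.

```python
# Minimal sketch: saving one harvested tweet into CouchDB.
# Assumes the python `couchdb` client; the URL, credentials and the
# "tweets" database name are illustrative placeholders.
import couchdb

server = couchdb.Server("http://admin:COUCHDB_PASSWORD@localhost:5984/")
db = server["tweets"] if "tweets" in server else server.create("tweets")

tweet = {
    "_id": "1124916258441408512",  # tweet id as _id prevents duplicates
    "text": "example tweet text",
    "coordinates": [144.9631, -37.8136],
    "created_at": "2019-05-05T10:00:00Z",
}
db.save(tweet)
```

Using the tweet id as the document `_id` means a re-harvested tweet raises a conflict instead of creating a duplicate document.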

Deployment and User Guide

Ansible and Setup

Our application uses Ansible to automatically create instances, mount volumes, install software dependencies, configure a CouchDB cluster across all nodes, and start the Twitter harvesters and the web application. After ensuring that Ansible is installed on the system, execute the following steps to set up the application:

  1. Download your openrc.sh file and save it inside the /Ansible folder.
  2. Open /Ansible/host_vars/nectar.yaml and change the COUCHDB_PASSWORD variable to your preferred database password.
  3. Open a terminal and run ./run-nectar.sh from the /Ansible/ folder.

Accessing individual nodes

The IPs of the different nodes and their roles (harvester, webserver) are stored in the hosts file inside the /Ansible/ folder. To access an individual node, acquire the SSH key (contact the repository administrator) and store it in the /Ansible/ folder. Then open a terminal and enter:

ssh -i <ssh-key> <node-IP>

Checking the status of the harvester

To check the status of the TwitterCrawler, run the following command in a terminal on the harvester node:

ps ax | grep TwitterCrawler.py

The output log of the TwitterCrawler can be viewed (for debugging) by running cat output.log in the /home/ubuntu/deploy/Twitter_Crawler/ folder on the harvester node.
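
The crawler itself lives in /home/ubuntu/deploy/Twitter_Crawler/TwitterCrawler.py and is not reproduced here. As a rough illustration of the shape of such a harvester, the sketch below streams geotagged tweets into CouchDB, assuming tweepy 3.x and the python `couchdb` client; the API keys, bounding box and database name are placeholders.

```python
# Hedged sketch of a streaming harvester (not the repository's actual
# TwitterCrawler.py). Assumes tweepy 3.x and the python `couchdb` client.
import couchdb
import tweepy

db = couchdb.Server("http://admin:COUCHDB_PASSWORD@localhost:5984/")["tweets"]

class TweetListener(tweepy.StreamListener):
    def on_status(self, status):
        try:
            # Use the tweet id as _id so re-harvested tweets are rejected.
            db.save({"_id": status.id_str, "text": status.text,
                     "created_at": str(status.created_at)})
        except couchdb.http.ResourceConflict:
            pass  # already stored

    def on_error(self, status_code):
        return status_code != 420  # stop on rate-limit disconnects

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
tweepy.Stream(auth, TweetListener()).filter(
    locations=[144.5, -38.5, 145.5, -37.5])  # rough Melbourne bounding box
```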

Checking the status of the webserver

To check the status of the Node.js webserver, run the following command from /home/ubuntu/deploy/webapp/ on the webserver node:

pm2 list

The webserver node also runs a periodic function which re-runs MapReduce on the tweets database every 60 seconds and pushes the updated result view to the analyze_results database in the CouchDB cluster. The webserver uses these results to display its visualizations; a sketch of this step is shown below.
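
As a minimal Python sketch of that update step (the actual logic lives in send_updated_result.sh): the view name profanity/by_suburb and the "latest" document id below are hypothetical placeholders, while the tweets and analyze_results databases come from this README.

```python
# Hedged sketch of the 60-second update step. The "profanity/by_suburb"
# view and the "latest" result document id are hypothetical placeholders.
import couchdb

server = couchdb.Server("http://admin:COUCHDB_PASSWORD@localhost:5984/")
tweets, results = server["tweets"], server["analyze_results"]

# Query the (hypothetical) MapReduce view, grouped by key.
rows = {row.key: row.value
        for row in tweets.view("profanity/by_suburb", group=True)}

# Upsert the single result document that the webapp reads for visualizations.
doc = results.get("latest") or {"_id": "latest"}
doc["rows"] = rows
results.save(doc)
```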

To check the status of this function:

ps ax | grep send_updated_result.sh

To check the output log of this function (for debugging), go to /home/ubuntu/deploy/Data/ and run:

cat output.log

Links

GitHub repository:

https://github.com/parth97/Group-38

Deployment demo:

< link to deployment video >

Website demo:

< link to webapp video >

Team Contributions

Parth Trehan [git link]: Front-end development/Visualization, Ansible deployment

Kumar Utkarsh [git link]: Ansible deployment, Architecture, Debugging and Testing

Kushagra [git link]: Ansible deployment, Architecture

Smith John Colaco [git link]: Architecture, CouchDB

Rohan Kirpekar [git link]: Tweet Harvesting, CouchDB, MapReduce
