This repository contains proof-of-concept projects for Apache Spark technologies.
It includes Spark-Core, Spark SQL, and Spark Streaming.
Spark-Core is covered with Python and Scala.
Spark SQL is covered with Scala.
The Spark-Core with Python project uses the Uttar Pradesh Assembly Elections 2017 dataset.
Its objectives are:
- To determine the candidate with maximum votes.
- To determine the candidate with minimum votes.
- To determine the total number of candidates fielded by each party.
- To determine the number of Congress (INC) candidates fielded in each district.
- To determine the number of Congress (INC) candidates fielded in each Assembly Constituency.
- To determine the total number of candidates fielded by each party in Saharanpur district.
- To determine the total number of candidates in each phase.
- To determine the candidate with maximum votes in BJP+.
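The objectives above reduce to a few aggregations: a maximum or minimum by vote count, and counts grouped by party or district. The following is a minimal plain-Python sketch of that logic, not the repository's Spark code. The record layout `(candidate, party, district, votes)`, the function names, and the sample rows are all assumptions made for illustration. In PySpark, roughly the same steps could be expressed with RDD operations such as `filter`, `map`, `max(key=...)`, and `countByValue()`.

```python
from collections import Counter

# Hypothetical sample rows: (candidate, party, district, votes).
# Names and numbers are invented for illustration, not taken from the dataset.
rows = [
    ("Candidate A", "INC", "Saharanpur", 1200),
    ("Candidate B", "BJP+", "Saharanpur", 3400),
    ("Candidate C", "INC", "Agra", 800),
    ("Candidate D", "BSP", "Agra", 2100),
]


def max_votes(records):
    """Return the record with the highest vote count."""
    return max(records, key=lambda r: r[3])


def min_votes(records):
    """Return the record with the lowest vote count."""
    return min(records, key=lambda r: r[3])


def candidates_per_party(records):
    """Count how many candidates each party fielded."""
    return Counter(r[1] for r in records)


def party_candidates_per_district(records, party):
    """Count one party's candidates in each district."""
    return Counter(r[2] for r in records if r[1] == party)


if __name__ == "__main__":
    print(max_votes(rows))
    print(min_votes(rows))
    print(candidates_per_party(rows))
    print(party_candidates_per_district(rows, "INC"))
```

Restricting the input to one party before calling `max_votes` gives the "maximum votes in BJP+" case, and restricting it to one district gives the Saharanpur case.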
The Spark-Core with Scala project uses the IPL dataset.
Its objectives are:
- To determine total number of matches played in every season.
- To determine number of matches played in a particular stadium.
- To determine the toss decision, i.e. how many times batting and fielding were chosen by the toss winner from season 1 to season 9.
- To determine total number of matches played by every team.
- To determine total number of matches won by every team.
- To determine total number of matches won by winType (i.e. by runs, by wickets, tie, no result) at different stadiums.
- To determine total number of matches won by batting first at different stadiums.
- To determine total number of matches won by bowling first at different stadiums.
- To determine winning percentage by batting first at different stadiums.
- To determine winning percentage by bowling first at different stadiums.
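The winning-percentage objectives above can be sketched as a per-stadium ratio. This sketch relies on the cricket convention that a win "by runs" goes to the side that batted first and a win "by wickets" goes to the side that bowled first. It is written in plain Python rather than Scala to show only the arithmetic. The record layout `(stadium, win_type)`, the stadium names, and the choice to divide by all matches at a stadium (ties and no results included) are assumptions, not details taken from the repository.

```python
from collections import defaultdict

# Hypothetical sample rows: (stadium, win_type).
# "runs" means the side batting first won, "wickets" means the side bowling first won.
matches = [
    ("Stadium X", "runs"),
    ("Stadium X", "wickets"),
    ("Stadium X", "runs"),
    ("Stadium Y", "wickets"),
    ("Stadium Y", "tie"),
]


def win_percentage(records, win_type):
    """Return {stadium: percentage of matches there decided by win_type}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for stadium, result in records:
        totals[stadium] += 1  # every match counts toward the denominator
        if result == win_type:
            wins[stadium] += 1
    return {stadium: 100.0 * wins[stadium] / totals[stadium] for stadium in totals}


if __name__ == "__main__":
    print(win_percentage(matches, "runs"))     # batting first
    print(win_percentage(matches, "wickets"))  # bowling first
```

If ties and no results should be excluded from the denominator, filter those rows out before calling the function.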
The Spark SQL with Scala project uses the IMDB dataset.
Its objectives are:
- To determine movies with maximum budget.
- To determine the movies with maximum Facebook likes.
- To determine the top 5 movies by IMDB rating.
- To determine total number of movies released in different years.
- To determine the popular movies with respect to actor-1.
- To determine the popular movies with respect to actor-2.
- To determine the popular movies with respect to actor-3.
- To determine the popular movies with respect to the director.
- To determine the net profit of movies.
- To determine the worst movies according to critic reviews.
- To determine the best movies according to critic reviews.
- To determine movies with longest runtime (duration).
- To determine movies with shortest runtime (duration).
- To determine the best movies according to user reviews.
- To determine the worst movies according to user reviews.
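Most of the objectives above are either a derived column or a sort followed by a limit. The following plain-Python sketch illustrates both, assuming that net profit means gross minus budget. The field names (`title`, `budget`, `gross`, `imdb_score`) and the sample values are hypothetical and may not match the dataset's actual columns. In Spark SQL, the ranking part would typically be an `ORDER BY ... DESC` combined with `LIMIT`.

```python
# Hypothetical sample rows; titles and figures are invented for illustration.
movies = [
    {"title": "Movie A", "budget": 100, "gross": 250, "imdb_score": 7.1},
    {"title": "Movie B", "budget": 300, "gross": 200, "imdb_score": 8.4},
    {"title": "Movie C", "budget": 50, "gross": 90, "imdb_score": 6.0},
]


def net_profit(movie):
    """Net profit, assumed here to be gross minus budget."""
    return movie["gross"] - movie["budget"]


def top_n(records, field, n=5, descending=True):
    """Return up to n records ordered by the given field."""
    return sorted(records, key=lambda m: m[field], reverse=descending)[:n]


if __name__ == "__main__":
    for movie in movies:
        print(movie["title"], net_profit(movie))
    print([m["title"] for m in top_n(movies, "imdb_score")])
    print([m["title"] for m in top_n(movies, "budget", n=1)])
```

Passing `descending=False` gives the "worst" or "shortest" variants of the same query.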
The Spark Streaming project performs Twitter analysis with a Twitter app.
Its objective is to determine:
- The popular topics in the last 60 seconds.
- The popular topics in the last 10 seconds.
- The username and the user's tweets.
- The time at which the user tweeted.
- The FriendsCount of the user.
- The number of tweets and the score of the user.
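The "popular topics in the last N seconds" objectives are windowed counts. The following plain-Python sketch shows the idea on a fixed list of tweets, assuming that a topic is a hashtag. The `(timestamp_seconds, text)` layout, the function name, and the sample tweets are hypothetical. In Spark Streaming, a similar result is usually obtained with a windowed operation such as `reduceByKeyAndWindow` over a live stream.

```python
from collections import Counter

# Hypothetical sample tweets: (timestamp_seconds, text), invented for illustration.
tweets = [
    (5, "learning #spark today"),
    (45, "#spark and #scala"),
    (55, "#python for data"),
    (58, "more #spark"),
]


def popular_topics(records, now, window_seconds, top=3):
    """Count hashtags in tweets with now - window_seconds < timestamp <= now."""
    counts = Counter()
    for timestamp, text in records:
        if now - window_seconds < timestamp <= now:
            # Treat every whitespace-separated token starting with '#' as a topic.
            counts.update(w.lower() for w in text.split() if w.startswith("#"))
    return counts.most_common(top)


if __name__ == "__main__":
    print(popular_topics(tweets, now=60, window_seconds=60))
    print(popular_topics(tweets, now=60, window_seconds=10))
```

The same tweets produce different rankings for the 60-second and 10-second windows because the shorter window only sees the most recent records.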