To Run: Create the virtual environment:

```shell
python3 -m venv env
```

Activate the virtual environment:

```shell
source env/bin/activate
```

Install the requirements:

```shell
pip install -r requirements.txt
# or
pip3 install -r requirements.txt
```
To configure the mocks, edit the `.env` file. By default the mocks are set to the following:

```
MOCK_AWARD_CATEGORIES=False
MOCK_AWARD_PRESENTERS=False
MOCK_AWARD_WINNERS=False
MOCK_AWARD_NOMINEES=False
MOCK_HOSTS=False
MOCK_RED_CARPET=False
MOCK_SENTIMENT=False
```
Turning any of these on (setting it to `True`) enables the corresponding mock, which uses the data from `gg_apifake.py`.
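One plausible way these `MOCK_*` flags could be read is sketched below. This is an illustrative example only: the helper `mock_enabled` is hypothetical, and the project's actual `.env` loading code may differ.

```python
import os

def mock_enabled(name: str) -> bool:
    """Hypothetical helper: check whether a MOCK_* flag is turned on.

    Values in a .env file are strings, so "True"/"False" must be
    parsed explicitly rather than treated as Python booleans.
    """
    return os.environ.get(name, "False").strip().lower() == "true"

# Example: simulate a .env file that enables the hosts mock
os.environ["MOCK_HOSTS"] = "True"
print(mock_enabled("MOCK_HOSTS"))      # True
print(mock_enabled("MOCK_SENTIMENT"))  # False (unset flags default to off)
```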
To run the program: supplying no arguments runs it with the default values. You need to supply `--output_results` in order to get console output. Adding `--save_json` saves the JSON files to `gg_{year}_generated_answers.json` in the format expected by the autograder.
```shell
# This will print the results and save the JSON files
python Runner.py --output_results --year 2013 --save_json

# This will save the JSON files
python Runner.py --year 2013 --save_json

# This will print the results
python Runner.py --output_results --year 2013
```
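The flag behavior described above could be wired up with `argparse` roughly as follows. This is a sketch of the interface, not the actual contents of `Runner.py`, whose argument parsing may be implemented differently.

```python
import argparse

# Illustrative sketch of the CLI described above (not Runner.py itself)
parser = argparse.ArgumentParser(description="Golden Globes tweet analysis")
parser.add_argument("--year", type=int, default=2013,
                    help="award show year to analyze")
parser.add_argument("--output_results", action="store_true",
                    help="print the results to the console")
parser.add_argument("--save_json", action="store_true",
                    help="save gg_{year}_generated_answers.json for the autograder")

# Parse an example invocation instead of real sys.argv
args = parser.parse_args(["--output_results", "--year", "2013"])
print(args.year, args.output_results, args.save_json)  # 2013 True False
```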
To scrape our data from IMDb, we wrote the file `scraper.py`. The requirements for the scraper are installed with the `requirements.txt` file.
- Hosts
- Award Categories
- Presenters (not 100% accurate)
- Nominees (not 100% accurate)
- Winners
- Red Carpet
  - Best Dressed
  - Worst Dressed
  - Most Controversial
  - Three Most Discussed
- Sentiment Analysis
  - Sentiments regarding hosts
  - Most positive winner
  - Least positive winner
- Our code groups similar awards together under one award name with a set of aliases. These can be found in `award_aliases.json`. Because of this, the code may not work with all the subparts when mocking Award Categories.
- The files `TimeToJson.py` and `IntervalTester.py` create plots in the folder `test_tweets_time/` that visualize where tweets about certain awards fall on a chronological scale (we used these to identify presenters and nominees).
- The files in `saved_jsons/` are for internal use by the award category recognition function.
- The main Runner takes on average 4.5 minutes to run on a MacBook Pro with an M1 chip (with video and other programs running in the background). The extra sections add around 30 seconds in total.
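The alias-grouping idea behind `award_aliases.json` can be sketched as a lookup from alias strings to a canonical award name. The data below is invented for illustration, and the real file's structure and the project's matching logic may differ.

```python
# Hypothetical alias table; award_aliases.json's real contents may differ.
award_aliases = {
    "best motion picture - drama": {
        "best drama", "best picture drama", "best motion picture drama",
    },
    "best director - motion picture": {
        "best director", "best director motion picture",
    },
}

def canonical_award(name: str):
    """Map an alias (or the canonical name itself) to its canonical award name."""
    cleaned = name.strip().lower()
    for canonical, aliases in award_aliases.items():
        if cleaned == canonical or cleaned in aliases:
            return canonical
    return None  # no known alias matched

print(canonical_award("Best Drama"))  # best motion picture - drama
```

Grouping by canonical name this way lets tweets that mention any variant of an award name be counted toward the same award.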