You can find the project at SoMe
| Amin Azad | Jacob Padgett | Lawrence Kimsey |
|---|---|---|
| Andrew Lowe | Sarah Xu | Jud Taylor |
So-Me is a social media management tool for small businesses and tech professionals. Users of So-Me will be able to post to any of their company's major social media platforms (LinkedIn, Instagram, Facebook, Twitter) from the app, supported by a simple drag-and-drop design. Our app provides users with optimal-time recommendations for posting, keywords users can use to increase engagement, and feedback on drafted posts based on their followers' engagement data.
Languages: Python
Frameworks: FastAPI
Services: AWS, Docker, Jupyter Notebooks, Postman, TablesPlus
Below is an annotated breakdown of the cloud architecture for So-Me.
Topic modeling is a technique for extracting the hidden topics from large volumes of text. Our team used a Latent Dirichlet Allocation (LDA) model from the Gensim Python package to generate the most important words drawing engagement from a user's followers. One of our main challenges was extracting good-quality topics that are clear, segregated, and meaningful. This depends heavily on the quality of text preprocessing and on the strategy for finding the optimal number of topics. To improve the quality of the text we received from the Twitter API, we did extensive data wrangling: cleaning tweets of emoji and HTML markup, combining the Spacy, Gensim, and Wordcloud stop word libraries into one library, adding our own custom stop words, and lemmatizing all the text. After generating topics, we used the pyLDAvis package to visualize them, computed coherence scores, and then iterated toward the optimal number of topics.
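As an illustration of the cleaning step, here is a minimal, stdlib-only sketch of the kind of preprocessing described above. The real pipeline merges the Spacy, Gensim, and Wordcloud stop word lists and lemmatizes with Spacy; the tiny stop word set and helper name below are hypothetical stand-ins, not the project's actual code.

```python
import html
import re

# Stand-in for the merged Spacy/Gensim/Wordcloud stop word library.
STOP_WORDS = {"the", "a", "an", "and", "to", "of", "rt"}

# Common emoji code-point ranges (not exhaustive).
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\U00002600-\U000027BF]")

def clean_tweet(text: str) -> list[str]:
    """Strip HTML marks, URLs, and emoji; lowercase; drop stop words."""
    text = html.unescape(text)                 # decode entities like &amp;
    text = re.sub(r"<[^>]+>", " ", text)       # drop leftover HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = EMOJI_RE.sub(" ", text)             # drop emoji
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]
```

The resulting token lists are what a Gensim dictionary/corpus (and ultimately the LDA model) would be built from.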
- The times followers engage with posts
- Follower engagement data
- Tweets followers engaged with the most
| route | description |
|---|---|
| GET: / | Verifies the API is deployed, and links to the docs. |
| POST: /recommend | With Twitter handle input, returns optimal post time. |
| POST: /topic_model/schedule | With Twitter handle input, returns topic modeling processing time. |
| POST: /topic_model/status | Returns status of topic modeling process. |
| POST: /topic_model/get_topics | Returns a dictionary of all topics and a list of keywords. |
| POST: /engagement | Returns a dictionary of calculated engagement values from a user's tweets over 30 days. |
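The exact engagement calculation lives in the API code; as a rough, hypothetical illustration of the kind of value `/engagement` computes, here is a per-tweet interaction rate normalized by follower count (the formula and `Tweet` type below are assumptions, not the production implementation):

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    likes: int
    retweets: int
    replies: int

def engagement_rate(tweets: list[Tweet], followers: int) -> float:
    """Average interactions per tweet, normalized by follower count."""
    if not tweets or followers == 0:
        return 0.0
    interactions = sum(t.likes + t.retweets + t.replies for t in tweets)
    return interactions / (len(tweets) * followers)
```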
Go to https://api.so-me.net/docs for more information and to test these endpoints.
API Request URL: `https://api.so-me.net/topic_model/schedule`

API Request Body:

```json
{
  "twitter_handle": "dutchbros",
  "num_followers_to_scan": 500,
  "max_age_of_tweet": 7,
  "words_to_ignore": ["shooting", "violence"]
}
```

API Response:

```json
{ "success": true }
```
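A client for the schedule endpoint can be sketched with the standard library alone. The helper names below are hypothetical; the URL and body shape come from the example above.

```python
import json
from urllib import request

API_BASE = "https://api.so-me.net"

def build_schedule_payload(handle, followers=500, max_age=7, ignore=()):
    """Assemble the request body for POST /topic_model/schedule."""
    return {
        "twitter_handle": handle,
        "num_followers_to_scan": followers,
        "max_age_of_tweet": max_age,
        "words_to_ignore": list(ignore),
    }

def schedule_topic_model(handle, **kwargs):
    """POST the payload and return the decoded JSON response."""
    body = json.dumps(build_schedule_payload(handle, **kwargs)).encode()
    req = request.Request(
        f"{API_BASE}/topic_model/schedule",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```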
API Request URL: `https://api.so-me.net/topic_model/status`

API Request Body:

```json
{ "twitter_handle": "dutchbros" }
```

API Response:

```json
{
  "success": true,
  "queued": false,
  "processing": true,
  "model_ready": true
}
```
API Request URL: `https://api.so-me.net/topic_model/get_topics`

API Request Body:

```json
{ "twitter_handle": "dutchbros" }
```

API Response:

```json
{
  "topics": {
    "1": ["...", "...", "...", "...", "...", "...", "...", "...", "...", "..."],
    "2": ["...", "...", "...", "...", "...", "...", "...", "...", "...", "..."],
    "3": ["...", "...", "...", "...", "...", "...", "...", "...", "...", "..."],
    "4": ["...", "...", "...", "...", "...", "...", "...", "...", "...", "..."],
    "5": ["...", "...", "...", "...", "...", "...", "...", "...", "...", "..."]
  },
  "success": true
}
```
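Because `/topic_model/schedule` kicks off background work, a client typically polls `/topic_model/status` until the model is ready before calling `/topic_model/get_topics`. A minimal sketch of that loop (hypothetical helper; the status fetch is injected as a callable so the HTTP layer can be swapped out or faked):

```python
import time

def wait_for_model(fetch_status, poll_seconds=5, max_polls=60):
    """Poll until the status JSON reports model_ready, then return it.

    fetch_status is any zero-argument callable returning the decoded
    /topic_model/status response as a dict.
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("model_ready"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("topic model not ready after polling window")
```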
This package uses environment variables, stored in a `.env` file, to hold secrets. Examples of the variables used:
```
# Twitter credentials
TWITTER_API_KEY="KEY_HERE"
TWITTER_API_SECRET="SECRET_HERE"

# Credentials for AWS database
DB_NAME="database_name_here"
DB_USER="database_login_here"
DB_PASSWORD="database_password_here"
DB_HOST="database_url_here"
DB_PORT="database_port_here"
```
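Once the `.env` file has been loaded into the environment (for example via a loader like python-dotenv), the database settings can be collected in one place. A small stdlib sketch (the helper name is hypothetical, not the project's actual code):

```python
import os

def load_db_config(env=os.environ):
    """Collect the database settings listed above, failing fast if any are absent."""
    keys = ("DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT")
    missing = [k for k in keys if k not in env]
    if missing:
        raise RuntimeError(f"missing environment variables: {missing}")
    return {k: env[k] for k in keys}
```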
This app is designed to be deployed to AWS Elastic Beanstalk as a Docker container. Please read the `commands.md` file for a list of relevant commands on how to do this.
For deployment to work, the following needs to be done in addition to cloning this repo locally:
- The `.env` file needs to be downloaded and added to the `app` directory (the same directory as `main.py`). See above for the required variables in this file. Contact your team lead or previous team members for this information if you have trouble finding it.
- The file `config.yml` needs to be added to the `.elasticbeanstalk` directory. This can be done with the command `eb init -p docker So-Me-DS-API`, as listed in `commands.md`. You will need to connect to your AWS account to do this.
Additional Pipenv-related files are included in the repo, but Pipenv is NOT used during deployment; it is included only for development and testing purposes.
We are documenting outstanding issues on the issues page of this repo: https://github.com/Lambda-School-Labs/social-media-strategy-ds/issues
For future teams that contribute to this project, there are a number of different directions to take. Here are some ideas that have been floated:
- The newly created "analytics" page on the So-Me site is a perfect place to put any future machine learning features.
- The front end does not currently incorporate the additional inputs for the 'scheduling' endpoint. Work with the front end team to incorporate this feature more fully, including variables like custom stop words, tweet age, and number of followers to scan.
- A function already exists that builds a corpus out of posts that a Twitter user's followers engage with. This corpus could be used for things other than the existing topic modeling feature.
- Currently, the only thing being returned is the topic modeling results. However, returning things like the most common words, #hashtags, and @mentions BEFORE topic modeling might be useful information that could easily be added to the topic modeling process.
- A number of changes could be made to the architecture of the project:
  - The Twitter scanning and machine learning model building could be offloaded from the FastAPI app to a single background worker app, running something like Celery. This app should be given a separate Twitter API key.
  - Security features could be added to the API, preventing unauthorized users from accessing it. The same could be done for the database, which currently uses a very open security group.
  - Models could be pickled and stored on AWS S3 for future use. Currently, the results of the model in JSON format are the only thing being stored.
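The hashtag/@mention idea above could start from a small tally like this (a hypothetical helper, not existing project code; it counts over the same cleaned tweet corpus the topic model uses):

```python
import re
from collections import Counter

def count_tags_and_mentions(tweets):
    """Tally #hashtags and @mentions across a corpus of tweet texts."""
    hashtags, mentions = Counter(), Counter()
    for text in tweets:
        hashtags.update(h.lower() for h in re.findall(r"#(\w+)", text))
        mentions.update(m.lower() for m in re.findall(r"@(\w+)", text))
    return hashtags, mentions
```

The most common entries (`hashtags.most_common(10)`) could then be returned alongside, or ahead of, the topic modeling results.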
If you are having an issue with the existing project code, please submit a bug report under the following guidelines:
- Check first to see if your issue has already been reported.
- Check to see if the issue has recently been fixed by attempting to reproduce the issue using the latest master branch in the repository.
- Create a live example of the problem.
- Submit a detailed bug report including your environment & browser, steps to reproduce the issue, actual and expected outcomes, where you believe the issue is originating from, and any potential solutions you have considered.
We would love to hear from you about new features which would improve this app and further the aims of our project. Please provide as much detail and information as possible to show us why you think your new feature should be implemented.
If you have developed a patch, bug fix, or new feature that would improve this app, please submit a pull request. It is best to communicate your ideas with the developers first before investing a great deal of time into a pull request to ensure that it will mesh smoothly with the project.
Remember that this project is licensed under the MIT license, and by submitting a pull request, you agree that your work will be, too.
- Ensure any install or build dependencies are removed before the end of the layer when doing a build.
- Update the README.md with details of changes to the interface, including new environment variables, exposed ports, useful file locations, and container parameters.
- Ensure that your code conforms to our existing code conventions and test coverage.
- Include the relevant issue number, if applicable.
- You may merge the Pull Request in once you have the sign-off of two other developers, or if you do not have permission to do that, you may request the second reviewer to merge it for you.
These contribution guidelines have been adapted from this good-Contributing.md-template.
See Backend Documentation for details on the backend of our project.
See Front End Documentation for details on the front end of our project.