This project was first introduced at CruzHacks 2023 and is now growing into a more complete solution. We created a question-answering chatbot for YouTube videos using semantic search, large language models, and a REST API. The name is quite self-explanatory.
The project is live at https://reainvent.com/
This solution promotes efficient learning by quickly converting extensive video content to text. It's especially handy for revisiting parts of long lectures, hunting down specific details, or looking up certain policies. Furthermore, it's applicable to any YouTube video with quality transcription and can speed up the process of obtaining key information.
This bot uses pytube and youtube_transcript_api to extract the transcript from a given YouTube URL. A semantic search is then run over the transcript, using OpenAI's embedding models, to pinpoint the most relevant parts of the video. GPT-DaVinci, OpenAI's largest LLM, then forms accurate and comprehensible responses based on this context.
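The semantic search step can be sketched as ranking transcript segments by cosine similarity to the question's embedding. This is a minimal illustration, not the project's actual code: the function names, the segment layout (text, start time in seconds, embedding), and the three-dimensional toy vectors are all assumptions. In practice the embeddings would come from an OpenAI embedding model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_segments(question_vec, segments, k=2):
    """Return the k segments whose embeddings are most similar to the question."""
    ranked = sorted(
        segments,
        key=lambda seg: cosine_similarity(question_vec, seg["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: hypothetical transcript segments with made-up embeddings.
segments = [
    {"text": "Welcome to the lecture.", "start": 0.0, "embedding": [1.0, 0.0, 0.0]},
    {"text": "Late homework loses ten percent per day.", "start": 754.2, "embedding": [0.0, 1.0, 0.1]},
    {"text": "The final exam is cumulative.", "start": 1320.5, "embedding": [0.0, 0.2, 1.0]},
]
question_vec = [0.0, 1.0, 0.0]  # stands in for the embedded user question

best = top_segments(question_vec, segments, k=1)
print(best[0]["text"], best[0]["start"])
```

Keeping the start time alongside each segment is what makes it possible to report timestamps with the answer later on.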
Prompt engineering is used to avoid misinformation, and relevant timestamps are provided for easy reference.
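One way to picture the prompt engineering step is a template that restricts the model to the retrieved excerpts and labels each excerpt with its timestamp. The wording of the prompt and the helper names below are hypothetical and only illustrate the idea; the project's real prompt is not shown in this README.

```python
def format_timestamp(seconds):
    """Convert a start time in seconds to M:SS or H:MM:SS."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def build_prompt(question, segments):
    """Build a prompt that limits the answer to the given transcript excerpts."""
    context = "\n".join(
        f"[{format_timestamp(seg['start'])}] {seg['text']}" for seg in segments
    )
    return (
        "Answer the question using only the transcript excerpts below. "
        "If the answer is not in the excerpts, say that you don't know.\n\n"
        f"Transcript excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Hypothetical retrieved segment, in the same shape as a transcript entry.
retrieved = [{"text": "Late homework loses ten percent per day.", "start": 754.2}]
prompt = build_prompt("What is the late homework policy?", retrieved)
print(prompt)
```

Telling the model to admit when the excerpts do not contain the answer is a common way to reduce made-up responses, and the bracketed timestamps let the reply point back to the relevant moment in the video.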
- Navigate to the project directory
- Create the virtual environment
python3 -m venv ./venv
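- Activate the virtual environment (Windows command shown; on macOS or Linux, run source ./venv/bin/activate instead)
venv\Scripts\activate.bat
- Install the dependencies
pip install -r requirements.txt
- Start the backend server
python3 ./backend/server.py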
Note: An OpenAI API key is required and must be stored in a .env file within the backend directory. OpenAI APIs are not free to use beyond the free trial. The .env file should contain the following line, where {API_KEY} is your key: openai_key = "{API_KEY}"
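As a rough illustration of how the backend might read this key, the sketch below parses a .env file using only the standard library. The helper name read_env_value is hypothetical, and the placeholder key is not a real credential. The project's actual loading code is not shown here; a library such as python-dotenv is a common alternative for this job.

```python
import tempfile
from pathlib import Path


def read_env_value(path, name):
    """Return the value assigned to `name` in a simple KEY = "value" file, or None."""
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip('"').strip("'")
    return None


# Demonstration with a temporary file standing in for backend/.env.
with tempfile.TemporaryDirectory() as tmp_dir:
    env_path = Path(tmp_dir) / ".env"
    env_path.write_text('openai_key = "sk-placeholder"\n')
    key = read_env_value(env_path, "openai_key")

print(key)
```

Keeping the key in .env rather than in the source keeps it out of version control, provided .env is listed in .gitignore.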
When running in development mode, set the following line at the top of App.js so the site can communicate with the API (by default, it is set to "/api"):
const API_ENDPOINT = "";
Then, start the frontend:
cd frontend
npm start
Open http://localhost:3000 in a browser to access the site.
These instructions are for development mode. The API and frontend can be built for production in several possible ways. Currently, AbleReInvigorated is hosted on a Google Cloud VM, using Gunicorn for the backend and Nginx for the frontend.