Refactoring readme
Aadesh Kulkarni authored and Aadesh Kulkarni committed Dec 14, 2023
1 parent 4f16884 commit 3f5a2bd
Showing 1 changed file with 11 additions and 4 deletions.
15 changes: 11 additions & 4 deletions readme.MD
@@ -1,4 +1,12 @@
## Setup
# Sanchay AI
- Sanchay is a Sanskrit word that means "collection" or "accumulation."
- We think it is a fitting name for this project because it captures what the project does: it takes a video and collects its key elements (transcription, subtitles, and video chapters) in an organized and easily accessible form.
- It's like creating a collection of information from a single source, making it more useful and convenient to work with. The name Sanchay also has a nice ring to it.
- Internally, Sanchay AI uses Whisper and the OpenAI API to do this.

## Project Setup

> Prerequisites: Python
### Install openai-whisper
pip install -U openai-whisper
@@ -14,19 +22,18 @@

#### Note:
- Ensure that your video is stored in the videos directory.
- Edit Env in ./run.sh, most importantly, the INPUT_FILE_PATH property.
- Edit Env in ./run.sh, most importantly the INPUT_FILE_PATH and OPENAI_API_KEY properties.
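
The two notes above can be checked before a run. The following is a minimal sketch and not part of the project: `check_env` is a hypothetical helper, and it assumes the directory is literally named `videos`. It only inspects the values and does not call Whisper or OpenAI.

```python
import os
from pathlib import Path


def check_env(env, videos_dir="videos"):
    """Return a list of problems with the run configuration (hypothetical helper)."""
    problems = []
    input_path = env.get("INPUT_FILE_PATH", "")
    if not input_path:
        problems.append("INPUT_FILE_PATH is not set")
    elif videos_dir not in Path(input_path).parent.parts:
        # The README asks for the video to live in the videos directory.
        problems.append(f"INPUT_FILE_PATH should point inside {videos_dir}/")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    # Report every problem found in the current shell environment.
    for problem in check_env(os.environ):
        print("config problem:", problem)
```

An empty list means both properties look usable; anything else is worth fixing in run.sh before starting the script.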

### Run the script
chmod +x run.sh
./run.sh



#### Env

- INPUT_FILE_PATH - Video file path
- OPENAI_API_KEY - https://platform.openai.com/api-keys
- OPENAI_MODEL - https://platform.openai.com/docs/models (the script works only with chat-based models)
- MAX_TOKENS - Used to split the subtitles into chunks of MAX_TOKENS tokens. Set it to 4/5 of your OPENAI_MODEL's maximum token limit.
- OUTPUT_TYPE - One of [txt, json, vtt, srt, tsv, all]; the logic currently supports vtt files only.
- OUTPUT_DIR - Directory where output will be stored
- OUTPUT_DIR - Directory where output will be stored
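
To make the MAX_TOKENS rule concrete, here is a rough sketch of how the 4/5 budget and the chunking could work. It is an illustration under stated assumptions, not the project's actual logic: the function names are hypothetical, the 4096-token limit in the usage note is only an example value, and `estimate_tokens` counts whitespace-separated words, which is not how OpenAI models count tokens.

```python
def max_tokens_for(model_limit):
    """Apply the 4/5 rule: keep a fifth of the model's limit as headroom."""
    return model_limit * 4 // 5


def estimate_tokens(text):
    """Crude stand-in for a real tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def split_into_chunks(lines, max_tokens):
    """Greedily pack subtitle lines into chunks of at most max_tokens estimated tokens.

    A single line that exceeds the budget on its own still gets its own chunk.
    """
    chunks, current, used = [], [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if current and used + cost > max_tokens:
            # Adding this line would overflow the budget, so start a new chunk.
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

For example, a model with a 4096-token limit gives `max_tokens_for(4096) == 3276`, which would be the value to put in MAX_TOKENS. Leaving headroom below the model's limit presumably makes room for the prompt and the reply alongside each subtitle chunk.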
