update: Better ReadMe
localhostd3veloper committed Dec 16, 2023
1 parent 06459b4 commit 70818dd
Showing 1 changed file with 15 additions and 13 deletions.
28 changes: 15 additions & 13 deletions readme.MD
@@ -1,12 +1,12 @@
# Sanchay AI
- Sanchay is a Sanskrit word that means "collection" or "accumulation."
- We think that It is a fitting name for this project because it captures the essence of what it does - it takes a video and collects its key elements (transcription, subtitles, and video chapters) in an organized and easily accessible manner.
- Sanchay is a Sanskrit word that means `collection` or `accumulation`.
- We think it is a fitting name for this project because it captures the `essence` of what it does: it takes a video and collects its key elements (`transcription, subtitles, and video chapters`) in an organized and easily accessible manner.
- It's like creating a collection of information from a single source, making it more useful and convenient to work with. The name Sanchaya also has a nice ring to it.
- Sanchay-AI internally uses Whisper and OpenAI to achieve this feat.
- Sanchay-AI internally uses `Whisper` and `OpenAI` to achieve this feat.

## Project Setup

> Pre-requisiites: Python
> Pre-requisite: Python
### Install openai-Whisper
pip install -U openai-whisper
@@ -21,19 +21,21 @@
pip install --upgrade tiktoken

#### Note:
- Ensure that your video is stored in videos directory.
- Edit Env in ./run.sh, most importantly, the INPUT_FILE_PATH property & the OPENAI_API_KEY property
- Ensure that your video is stored in the `videos` directory.
- Edit Env in `./run.sh`, most importantly, the `INPUT_FILE_PATH` property & the `OPENAI_API_KEY` property
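
The notes above single out `INPUT_FILE_PATH` and `OPENAI_API_KEY` as the settings that matter most. As a rough illustration, a pre-flight check along the following lines could catch a missing value before the script starts. This is a minimal sketch, not part of the repository: the helper name `missing_settings` is hypothetical, and treating exactly these two variables as required is an assumption drawn from the note.

```python
import os

# Variable names come from the README; treating these two as required
# is an assumption based on the note above.
REQUIRED_VARS = ("INPUT_FILE_PATH", "OPENAI_API_KEY")


def missing_settings(env):
    """Return a list of human-readable problems found in an environment mapping."""
    # Flag every required variable that is absent or empty.
    problems = [f"{name} is not set" for name in REQUIRED_VARS if not env.get(name)]
    # If a path was given, check that it points to an existing file.
    path = env.get("INPUT_FILE_PATH")
    if path and not os.path.isfile(path):
        problems.append(f"INPUT_FILE_PATH does not point to a file: {path}")
    return problems
```

Calling `missing_settings(os.environ)` returns an empty list when both values are present and the video file exists.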

### Run the script
chmod +x run.sh
./run.sh


#### Env
### Environment Variables

- INPUT_FILE_PATH - Video file path
- OPENAI_API_KEY - https://platform.openai.com/api-keys
- OPENAI_MODEL - https://platform.openai.com/docs/models (Can use only chat based models for the script to work)
- MAX_TOKENS - This is used to split the subtitles into chunks of MAX_TOKENS. MAX_TOKENS should be 4/5th of your OPENAI_MODEL's max tokens.
- OUTPUT_TYPE - options [txt, json, vtt, srt, tsv, all], logic supports vtt files only for now
- OUTPUT_DIR - Directory where output will be stored
| Key | Description |
| --- | --- |
| INPUT_FILE_PATH | Video file path |
| OPENAI_API_KEY | https://platform.openai.com/api-keys |
| OPENAI_MODEL | https://platform.openai.com/docs/models |
| MAX_TOKENS | This is used to split the subtitles into chunks of MAX_TOKENS. MAX_TOKENS should be 4/5th of your OPENAI_MODEL's max tokens |
| OUTPUT_TYPE | options [txt, json, vtt, srt, tsv, all], logic supports vtt files only for now |
| OUTPUT_DIR | Directory where output will be stored |
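
The `MAX_TOKENS` row describes two things: a 4/5 rule for choosing the value, and splitting subtitles into chunks of that size. The sketch below shows one way this might look. It is an illustration under assumptions, not the project's actual code: the function names are hypothetical, and `count_tokens` is a whitespace stand-in for a real tokenizer such as `tiktoken`.

```python
def max_tokens_for(context_window):
    """Apply the README's 4/5 rule to a model's context window (integer result)."""
    return context_window * 4 // 5


def count_tokens(text):
    # Stand-in tokenizer (assumption): one token per whitespace-separated word.
    # Swap in a real tokenizer for accurate counts.
    return len(text.split())


def chunk_lines(lines, max_tokens):
    """Greedily pack subtitle lines into chunks of at most max_tokens tokens.

    A single line that exceeds max_tokens on its own still becomes its own chunk.
    """
    chunks, current, used = [], [], 0
    for line in lines:
        cost = count_tokens(line)
        # Start a new chunk when adding this line would exceed the budget.
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append(current)
    return chunks
```

For example, a model with a 4096-token context window gives `max_tokens_for(4096) == 3276`.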
