Whisper YouTube Transcriber

Transcribes an audio source to text using the OpenAI's Whisper API. The result is saved in the timestamps/ folder as a Markdown file. The input can be URL to a video source, it will use yt-dlp to download the video and proceed with transcribing.

Result Folders

audios/ folder contains extracted audio files from YouTube videos when the input is a YouTube URL. Large audio files are split into smaller chunks are also saved here.
jsons/ folder contains responses from Whisper API transcribing/translating the audio files. The data follows the schema in transcription.schema.json. Some non-speech audio segments may be falsely transcribed as speech by the Whisper model. These segments are filtered out and saved into json/*-no_speech.json and json/*-speech.json files.
timestamps/ folder contains the resulting Markdown files, that includes transcription and their timestamps. The timestamps link to the YouTube video at the matching times when the input when the input is a YouTube URL or the input file has the source YouTube URL embedded to the metadata under format:tags:comment. The *-no_speech.json and *-speech.json files are also timestamped and saved here as timestamps/*-no_speech.md and timestamps/*-speech.md.

Usage

python main.py [-h] [-p PROMPT] [-l LANGUAGE] [-t] input

positional arguments:
  input                 Path or URL to the audio source to transcribe, in one of these formats: mp3, mp4, mpeg, mpga,
                        m4a, wav, or webm.

options:
  -h, --help            show this help message and exit
  -p PROMPT, --prompt PROMPT
                        A list of correct word spellings for problematic words.
  -l LANGUAGE, --language LANGUAGE
                        The language of the input audio. Supplying the input language in ISO-639-1 format will improve
                        accuracy and latency.
  -t, --translate       Translate the audio file to English.

Prompting

See OpenAI's documantation.

Name		Name	Last commit message	Last commit date
Latest commit History 61 Commits
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
audio_processing.py		audio_processing.py
logging_config.py		logging_config.py
main.py		main.py
tagger.py		tagger.py
transcribe.py		transcribe.py
transcription.schema.json		transcription.schema.json
transcription_processing.ipynb		transcription_processing.ipynb
transcription_processing.py		transcription_processing.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Whisper YouTube Transcriber

Result Folders

Usage

Prompting

About

Languages

License

tunaflsh/whisper-youtube-transcriber

Folders and files

Latest commit

History

Repository files navigation

Whisper YouTube Transcriber

Result Folders

Usage

Prompting

About

Resources

License

Stars

Watchers

Forks

Languages