whisper-transcriber-telegram-bot

A local Whisper AI transcriber bot for Telegram, utilizing GPU or CPU for processing.

Runs on Python 3.10+.

About

This program is a Whisper AI-based transcriber bot for Telegram, running on Python (v3.10+). It transcribes audio from the media sources supported by yt-dlp, from Telegram voice and audio messages, and from uploaded audio files (mp3, wav).

The bot supports a broad range of media sites via yt-dlp (listed here) and uses a locally run OpenAI Whisper model to process the audio, returning the transcription in multiple formats.
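The bot's own modules aren't reproduced here, but the core flow described above — fetching audio with yt-dlp and running it through a locally loaded Whisper model — looks roughly like the sketch below. The function names, options, and example URL are illustrative assumptions, not the bot's actual code.

    import yt_dlp    # pip install yt-dlp
    import whisper   # pip install openai-whisper (needs ffmpeg on PATH)

    def download_audio(url: str, out_stem: str = "audio") -> str:
        """Fetch a media URL with yt-dlp and extract its audio track as mp3."""
        ydl_opts = {
            "format": "bestaudio/best",
            "outtmpl": out_stem + ".%(ext)s",
            "postprocessors": [{"key": "FFmpegExtractAudio", "preferredcodec": "mp3"}],
        }
        with yt_dlp.YoutubeDL(ydl_opts) as ydl:
            ydl.download([url])
        return out_stem + ".mp3"

    def transcribe(audio_path: str, model_name: str = "medium") -> dict:
        """Load a local Whisper model and transcribe the audio file."""
        model = whisper.load_model(model_name)   # downloads weights on first use
        return model.transcribe(audio_path)

    # result["text"] holds the full transcript; result["segments"] has per-segment timestamps.
    # result = transcribe(download_audio("https://www.youtube.com/watch?v=example"))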

Features

  • 🎥 Downloads and processes media URLs from any source supported by yt-dlp.
  • 📲 Can receive Telegram audio messages as well as .mp3 and .wav files for transcription.
  • 🤖 Uses a local Whisper model from the openai-whisper package for transcription (no API required!).
  • 🖥️ Automatically uses GPUtil to pick the best available CUDA-enabled local GPU, i.e. the one with the most free VRAM (see the sketch after this list).
  • 📝 Transcribes audio with OpenAI's Whisper model; the model can be changed with /model.
  • 📄 Returns transcription in text, SRT, and VTT formats.
  • 🔄 Handles concurrent transcription requests with async processing and an automatic task queue.
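The GPU mapping mentioned above amounts to asking GPUtil for the CUDA device with the most free VRAM and falling back to the CPU when none is found. A minimal sketch of that idea (an illustration, not the bot's exact code):

    import GPUtil   # pip install gputil
    import torch
    import whisper

    def pick_device() -> torch.device:
        """Return the CUDA GPU with the most free VRAM, or the CPU if no GPU is usable."""
        gpus = GPUtil.getGPUs()
        if torch.cuda.is_available() and gpus:
            best = max(gpus, key=lambda gpu: gpu.memoryFree)   # memoryFree is reported in MB
            return torch.device(f"cuda:{best.id}")
        return torch.device("cpu")

    model = whisper.load_model("medium", device=pick_device())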

Installation

To set up the Whisper Transcriber Telegram Bot, follow these steps:

  1. Clone the repository:

    git clone https://github.com/FlyingFathead/whisper-transcriber-telegram-bot.git
    cd whisper-transcriber-telegram-bot
  2. Install the required Python packages:

    pip install -r requirements.txt
  3. Set up your Telegram bot token, either in config/bot_token.txt or as the environment variable TELEGRAM_BOT_TOKEN (see the sketch after these steps).

  4. Run the bot:

    python src/main.py
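For step 3, a typical lookup checks the environment variable first and then the file; the order shown here is an assumption for illustration, not necessarily the bot's exact behavior:

    import os
    from pathlib import Path

    def get_bot_token() -> str:
        """Read the Telegram bot token from TELEGRAM_BOT_TOKEN or config/bot_token.txt."""
        token = os.environ.get("TELEGRAM_BOT_TOKEN", "").strip()
        if token:
            return token
        token_file = Path("config/bot_token.txt")
        if token_file.is_file():
            return token_file.read_text().strip()
        raise RuntimeError("Set TELEGRAM_BOT_TOKEN or create config/bot_token.txt")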

Usage

After launching the bot, you can interact with it via Telegram:

  1. Send a video URL, voice message or audio file (in WAV/MP3 format) to the bot.
  2. The bot will acknowledge the request and begin processing.
  3. Once processing is complete, the bot will send the transcription files to you.
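The files correspond to the formats listed under Features: plain text, SRT, and VTT. As a rough illustration (not the bot's actual writer code), an SRT file can be assembled from the timestamped segments that Whisper's transcribe() returns:

    def to_srt(result: dict) -> str:
        """Build SRT subtitle text from a Whisper transcription result."""
        def ts(seconds: float) -> str:
            ms = int(round(seconds * 1000))
            h, ms = divmod(ms, 3_600_000)
            m, ms = divmod(ms, 60_000)
            s, ms = divmod(ms, 1_000)
            return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

        blocks = []
        for i, seg in enumerate(result["segments"], start=1):
            blocks.append(f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n{seg['text'].strip()}\n")
        return "\n".join(blocks)

    # The plain-text variant is simply result["text"]; VTT uses a "WEBVTT" header
    # and '.' instead of ',' in the timestamps.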

Commands:

  • /info - view current settings, uptime, GPU info, and queue status
  • /help and /about - get usage help, the version number, and lists of available models and commands
  • /model - view the model in use or change to another available model.
  • /language - set the model's transcription language (auto = autodetect); if you know the language spoken in the audio, setting the transcription language manually with this command may improve both transcription speed and accuracy.
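Internally, /model and /language correspond to standard openai-whisper parameters; roughly, and with a hypothetical file name clip.mp3 (a hedged sketch, not the bot's code):

    import whisper

    model = whisper.load_model("medium")                 # /model picks the checkpoint (tiny ... large)

    auto = model.transcribe("clip.mp3")                  # language=None -> Whisper autodetects
    fixed = model.transcribe("clip.mp3", language="fi")  # a pinned language skips detection
    print(fixed["text"])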

Changes

  • v0.14.4 - added the /info command for viewing current settings & queue status
  • v0.14.3 - Whisper model language selection via /language command
  • v0.14.2 - display duration & estimates
  • v0.14.1 - small fixes to the file handler; more detailed exception catching
  • v0.14 - now handles both Telegram audio messages and audio files (.wav, .mp3)
  • v0.13 - added GPUtil GPU mapping to figure out the best available CUDA GPU instance to use
    • (by default, uses a CUDA-enabled GPU on the system with the most free VRAM available)
  • v0.12 - async handling & user model change fixes, improved error handling
  • v0.11.1 - bot logic + layout changes, model list with /model (also in config.ini)
  • v0.11 - bugfixes & rate limits for /model command changes for users
  • v0.10 - /help & /about commands added for further assistance
    • config.ini now has a list of supported models that can be changed as needed
  • v0.09 - users can now change the Whisper model with the /model command
  • v0.08 - auto-retry TG connection on start-up connection failure
    • can be set in config.ini with RestartOnConnectionFailure
  • v0.07.7 - Whisper's output is now routed to the logging module
  • v0.07.6 - update interval for logging yt-dlp downloads now configurable from config.ini
  • v0.07.5 - 10-second interval update for yt-dlp logging
  • v0.07.4 - fixes for non-YouTube URLs
  • v0.07.2 - job queues fine-tuned to be more informative
  • v0.07.1 - job queues introduced
  • v0.07 - transcript queuing, more precise transcript time estimates
  • v0.06 - better handling of details for all video sources, transcription time estimates
  • v0.05 - universal video description parsing (platform-agnostic)
  • v0.04.1 - version number printouts and added utils
  • v0.04 - expanded support for various media sources via yt-dlp, supported sites listed here
  • v0.03 - better logging to console, Whisper model + keep audio y/n can now be set in config.ini
  • v0.02 - add video information to the transcript text file
    • (see: config.ini => IncludeHeaderInTranscription = True)
  • v0.01 - initial commit

Contributing

Contributions are welcome! If you have suggestions for improvements or bug fixes, please open an issue or submit a pull request.

Credits

  • FlyingFathead - Project creator
  • ChaosWhisperer - Contributions to the Whisper integration and documentation