Port of OpenAI's Whisper model in C/C++
Updated Jul 27, 2024 - C++
DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers.
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter
🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.
The official repository of the Eesen project
Whisper Dart is a cross-platform library for Dart and Flutter that allows converting audio to text / speech to text / inference from OpenAI models
Speech Note Linux app. Note taking, reading and translating with offline Speech to Text, Text to Speech and Machine translation.
OBS plugin for local speech recognition and captioning using AI
Cross-platform, real-time, offline speech recognition plugin for Unreal Engine. Based on OpenAI's Whisper technology via whisper.cpp.
This plugin integrates Azure Speech Cognitive Services in Unreal Engine.
Welcome to the Microsoft Voice Assistant samples repository! Here you will find samples to help you get started building a client application for your bot or Custom Command service. You will also be able to easily deploy a working Custom Command-based Voice Assistant to your own Azure subscription.
Server framework for Kaldi ASR Toolkit
CleanStream is an OBS plugin that uses AI to clean live audio streams from unwanted words and utterances
A speech recognition plugin for Unreal Engine 5. This is essentially a port of Pocketsphinx, to be used within an Unreal Engine project.
Speech-to-Text based on silero-vad + whisper.cpp (GGML STT) for ROS 2
📖 LanMIT: A Toolkit for Improving Language Models in Low-resourced Speech Recognition based on Kaldi.
Fast C++ implementation of ESOLA using KFRLib, can be used for online time-stretch augmentation during SpeechToText training.
Continuous Dictation Speech Recognition and Speech Synthesis in Win32
VTuber application which only requires your voice and microphone, no need for a webcam or other tracking nonsense.
CMake-based Kaldi + Vosk + microphone speech recognition example