Updated Aug 29, 2019 - HTML
speech-corpus
Here are 18 public repositories matching this topic...
This project is devoted to the dialects of the Siberian Tatars. About 100,000 people speak these dialects. The language of the Siberian Tatars consists of three dialects: Tobolo-Irtysh, Tom and Baraba.
Updated Jun 15, 2022
Deliverables relating to the Speech Technology University Unit (notes courtesy of Dr. Andrea De Marco)
Updated Jan 3, 2024 - Jupyter Notebook
Utilities for preprocessing the Switchboard and WSJ corpora in Python 3
Updated Jul 31, 2020 - Python
ESO speech dataset: an English-language speech corpus in the oncology domain for ASR training and benchmarking and for MT benchmarking.
Updated Apr 15, 2024
Code for Dialogs Re-enacted Across Languages (DRAL)
Updated Nov 4, 2023 - Python
A 1300-hour English speech and text corpus of parliamentary debates for streaming ASR training and benchmarking, speech data filtering and speech data verbatimization.
Updated Mar 30, 2024
The AsoSoft Speech Corpus can be used for spoken language processing tasks in Central Kurdish, such as speech recognition, speaker recognition, gender identification, and phonetic analysis.
Updated Mar 8, 2022
This project is devoted to the Siberian Ingrian Finnish language. Siberian Ingrian Finnish is a language (dialect) used by the descendants of settlers who spoke Lower Luga Ingrian Finnish varieties and Lower Luga Ingrian (Izhorian), and who have been living in Omsk oblast (previously they also lived in other regions of Siberia) for more tha…
Updated Apr 11, 2024 - C#
Voice activity detection and speaker gender segmentation audiovisual corpus
Updated Jun 6, 2024 - Jupyter Notebook
Split ELAN annotation files and the corresponding speech files into a corpus format for common ASR systems and forced aligners
Updated Oct 15, 2018 - C++
Downloader for the VoxForge corpus
Updated May 10, 2018 - Python
An open-access corpus of conversational bilingual speech in Cantonese and English
Updated Apr 28, 2022 - JavaScript
SpeCT - Speech Corpus Toolkit for Praat. Documentation: https://lennes.github.io/spect/
Updated Aug 11, 2023 - HTML
Tools for ASR Corpus Generation from Online Video
Updated Feb 10, 2019 - Python
ClovaCall dataset and PyTorch LAS baseline code (Interspeech 2020)
Updated Apr 5, 2022 - Python