Sogang Univ. IIP Lab
- Mapo, Seoul
Starred repositories
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
A Python wrapper for the high-quality vocoder "World"
A list of Korean speech recognition (STT) APIs, with performance benchmarks for each.
An OpenAI Gym environment to evaluate the ability of LLMs (e.g., GPT-4, Claude) in long-horizon reasoning and task planning in dynamic multi-agent settings.
CjangCjengh / vits
Forked from jaywalnut310/vits
VITS implementation of Japanese, Chinese, Korean, Sanskrit and Thai
ouor / vits
Forked from CjangCjengh/vits
VITS implementation of Japanese, Chinese, Korean, Sanskrit and Thai
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal proce…
Awesome speech/audio LLMs, representation learning, and codec models
👦 👧 Technical-Interview guidelines written for those who started studying programming. I wish you all the best. 👾
Google AI 2018 BERT pytorch implementation
This repository contains the official implementation of GhostFaceNets, State-Of-The-Art lightweight face recognition models.
Large Scale Chinese Corpus for NLP
Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)
Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation
Implementation of ViViT: A Video Vision Transformer
A Python package for the Whisper normalizer
Pipeline to generate the Standardized Project Gutenberg Corpus
A tool for extracting plain text from Wikipedia dumps
Dual-path multi-channel network for speech separation
Convert Wikipedia database dumps into plaintext files
This repository surveys papers focusing on prompting and adapters for speech processing.