- Shanghai Jiao Tong University
- Beijing
- https://yfyeung.github.io/
Starred repositories
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Kaldi-compatible online fbank extractor without external dependencies
SuperCLUE: A comprehensive benchmark for general-purpose foundation models in Chinese
A Native-PyTorch Library for LLM Fine-tuning
VQVAEs, GumbelSoftmaxes and friends
Fast and memory-efficient exact attention
Chinese Mandarin Grapheme-to-Phoneme Converter. Converts Chinese text to Zhuyin or Pinyin (INTERSPEECH 2022)
Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
Multilingual Voice Understanding Model
A feature-rich command-line audio/video downloader
Code for the SpeechTokenizer presented in "SpeechTokenizer: Unified Speech Tokenizer for Speech Language Models". Samples are presented on
SpeechGPT Series: Speech Large Language Models
A generative speech model for daily dialogue.
"Hung-yi Lee Deep Learning Tutorial" (recommended by Prof. Hung-yi Lee 👍). PDF download: https://github.com/datawhalechina/leedl-tutorial/releases
An evolving, large-scale and multi-domain ASR corpus for low-resource languages with automated crawling, transcription and refinement
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it's combining the best of RNN and transformer - great performance, fast inference,…
Massive open Japanese speech corpus
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A must-read paper for speech separation based on neural networks
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
A fast parallel implementation of RNN Transducer.
Vector (and Scalar) Quantization, in PyTorch
Some fast-ish algorithms for batch text search in moderate-sized collections, intended for data cleanup
Universal Romanizer that can convert any Unicode script to Roman (Latin) script