Computer of Science and Technology Beijing
Stars
VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)
Reference-aware automatic speech evaluation toolkit
Multilingual Voice Understanding Model
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer language models at scale, including: BERT & GPT-2
AcademiCodec: An Open Source Audio Codec Model for Academic Research
Ongoing research training transformer models at scale
Audio Codec Speech processing Universal PERformance Benchmark
A generative speech model for daily dialogue.
Speech, Language, Audio, Music Processing with Large Language Model
Finetune VITS and MMS using HuggingFace's tools
FaRL for Facial Representation Learning [Official, CVPR 2022]
NeMo text processing for ASR and TTS
A curated list of papers in Test-time Adaptation, Test-time Training and Source-free Domain Adaptation
MaTe3D: Mask-guided Text-based 3D-aware Portrait Editing
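Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to deeply understanding, discussing, and implementing the technologies, principles, and applications related to large models.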
Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
A collection of AWESOME things about mixture-of-experts
BLSP: Bootstrapping Language-Speech Pre-training via Behavior Alignment of Continuation Writing