Northwestern Polytechnical University
- Suzhou
Stars
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
A high-throughput and memory-efficient inference and serving engine for LLMs
An Open-Sourced LLM-empowered Foundation TTS System
Qwen2.5 is the large language model series developed by Qwen team, Alibaba Cloud.
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models
Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
ChatGLM-6B: An Open Bilingual Dialogue Language Model
The open source code for SimpleSpeech series
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers( OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge manageme…
open-source multimodal large language model that can hear, talk while thinking. Featuring real-time end-to-end speech input and streaming audio output conversational capabilities.
Automatically Update Text-to-Speech (TTS) Papers Daily Using GitHub Actions (Updates Every 12 Hours)
Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation
High-Resolution Image Synthesis with Latent Diffusion Models
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
Official Pytorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"
PitchVC: Pitch Conditioned Any-to-Many Voice Conversion
FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion
LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning