Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,718 519 Updated Sep 19, 2024

wdndev / llm_interview_note

Mainly records knowledge and interview questions relevant to large language model (LLM) algorithm (application) engineers

HTML 3,036 360 Updated Aug 19, 2024

pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 5,591 510 Updated Oct 4, 2024

bytedance / decoupleQ

A quantization algorithm for LLM

Cuda 99 5 Updated Jun 21, 2024

THUDM / ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

Python 40,489 5,196 Updated Jun 27, 2024

yangdongchao / SimpleSpeech

The open source code for SimpleSpeech series

Python 92 6 Updated Oct 8, 2024

supertone-inc / super-monotonic-align

Python 118 9 Updated Sep 19, 2024

ictnlp / LLaMA-Omni

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,329 144 Updated Sep 24, 2024

xingchensong / S3Tokenizer

Reverse Engineering of Supervised Semantic Speech Tokenizer (S3Tokenizer) proposed in CosyVoice

Python 107 9 Updated Oct 11, 2024

lobehub / lobe-chat

🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports Multi AI Providers (OpenAI / Claude 3 / Gemini / Ollama / Azure / DeepSeek), Knowledge Base (file upload / knowledge manageme…

TypeScript 42,339 9,570 Updated Oct 11, 2024

3loi / NaturalVoices

Jupyter Notebook 42 3 Updated Oct 11, 2024

gpt-omni / mini-omni

Open-source multimodal large language model that can hear and talk while thinking, featuring real-time end-to-end speech input and streaming audio output conversational capabilities.

Python 2,798 255 Updated Sep 25, 2024

thunlp / duplex-model

TypeScript 18 4 Updated Aug 17, 2024

liutaocode / TTS-arxiv-daily

Automatically Update Text-to-Speech (TTS) Papers Daily Using GitHub Actions (Update Every 12 Hours)

Python 240 20 Updated Oct 11, 2024

RBenita / DIFFAR

Denoising Diffusion Autoregressive Model for Raw Speech Waveform Generation

Python 23 2 Updated Mar 8, 2024

CompVis / latent-diffusion

High-Resolution Image Synthesis with Latent Diffusion Models

Jupyter Notebook 11,650 1,518 Updated Feb 29, 2024

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,110 541 Updated May 31, 2024

OpenMOSS / AnyGPT

Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"

Python 757 60 Updated Aug 27, 2024

SalesforceAIResearch / DiffusionDPO

Code for "Diffusion Model Alignment Using Direct Preference Optimization"

Python 242 22 Updated Dec 28, 2023

yangdongchao / LLM-Codec

The open source code for LLM-Codec

Python 110 4 Updated Aug 18, 2024

hayeong0 / Diff-HierVC

Official PyTorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"

Python 193 18 Updated Jul 3, 2024

OlaWod / PitchVC

PitchVC: Pitch Conditioned Any-to-Many Voice Conversion

Python 34 4 Updated Jun 6, 2024

OlaWod / FreeVC

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Python 592 109 Updated Mar 23, 2024

line / LibriTTS-P

LibriTTS-P: A Corpus with Speaking Style and Speaker Identity Prompts for Text-to-Speech and Style Captioning

111 2 Updated Jun 13, 2024

Shengqiang Li Shengqiang-Li

Lists (9)

ASR

ASV

Codec

Corpus

Inference

LLM

RL

TTS

VC

Stars