yinyxl

Yin Xinlei yinyxl

2 followers · 2 following

Stars

THUDM / VisualGLM-6B

Chinese and English multimodal conversational language model | 多模态中英双语对话语言模型

Python 4,080 416 Updated Aug 23, 2024

cdjkim / audiocaps

🔊 Repository for our NAACL-HLT 2019 paper: AudioCaps

Python 139 17 Updated Apr 23, 2024

mlfoundations / open_flamingo

An open-source framework for training large multimodal models.

Python 3,697 281 Updated Aug 31, 2024

sarulab-speech / UTMOS22

UT-Sarulab MOS prediction system using SSL models

Python 177 13 Updated Apr 11, 2024

sigsep / sigsep-mus-db

Python parser and tools for MUSDB18 Music Separation Dataset

Python 161 34 Updated Nov 24, 2023

hzwer / WritingAIPaper

Writing AI Conference Papers: A Handbook for Beginners

1,152 38 Updated Sep 26, 2024

declare-lab / tango

A family of diffusion models for text-to-audio generation.

Python 1,010 79 Updated Jul 3, 2024

haoxiangsnr / llm-tse

Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)

JavaScript 32 2 Updated Oct 13, 2023

akoepke / audio-retrieval-benchmark

Implementation of "Audio Retrieval with Natural Language Queries: A Benchmark Study".

Python 46 2 Updated Jul 22, 2022

haoheliu / versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

Python 1,125 108 Updated May 10, 2024

NVIDIA / audio-flamingo

PyTorch implementation of Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities.

Python 180 10 Updated Oct 2, 2024

karpathy / LLM101n

LLM101n: Let's build a Storyteller

29,368 1,608 Updated Aug 1, 2024

HilaManor / AudioEditingCode

Python 134 22 Updated Oct 13, 2024

Graph-COM / GESS

Code for GeSS: Benchmarking Geometric Deep Learning under Scientific Applications with Distribution Shifts

Python 13 1 Updated Jun 2, 2024

voidful / Codec-SUPERB

Audio Codec Speech processing Universal PERformance Benchmark

Python 208 22 Updated Sep 28, 2024

yangdongchao / AcademiCodec

AcademiCodec: An Open Source Audio Codec Model for Academic Research

Python 578 80 Updated Dec 27, 2023

MTG / mtg-jamendo-dataset

Metadata, scripts and baselines for the MTG-Jamendo dataset

Python 267 38 Updated Jul 9, 2024

facebookresearch / DiT

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 6,144 542 Updated May 31, 2024

Audio-AGI / AudioSep

Official implementation of "Separate Anything You Describe"

Python 1,598 116 Updated Mar 31, 2024

jaeyeonkim99 / EnCLAP

Official Implementation of EnCLAP (ICASSP 2024)

Python 88 5 Updated Jun 2, 2024

hche11 / VGGSound

VGGSound: A Large-scale Audio-Visual Dataset

Python 287 32 Updated Sep 13, 2021

RetroCirce / Zero_Shot_Audio_Source_Separation

The official code repo for "Zero-shot Audio Source Separation through Query-based Learning from Weakly-labeled Data", in AAAI 2022

Python 186 31 Updated Jul 14, 2022

huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.

Python 25,563 5,289 Updated Oct 16, 2024

ustctug / ustcthesis

LaTeX template for USTC thesis

TeX 1,616 399 Updated Oct 15, 2024

XinhaoMei / ACT

Source code for the paper 'Audio Captioning Transformer'

Jupyter Notebook 48 3 Updated Jan 18, 2022

haoheliu / AudioLDM-training-finetuning

AudioLDM training, finetuning, evaluation and inference.

Python 200 39 Updated Jun 2, 2024

microsoft / unilm

Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities

Python 19,866 2,532 Updated Oct 10, 2024

descriptinc / melgan-neurips

GAN-based Mel-Spectrogram Inversion Network for Text-to-Speech Synthesis

Python 966 214 Updated Aug 28, 2023

JinhuaLiang / WavCraft

Official repo for WavCraft, an AI agent for audio creation and editing

Python 650 96 Updated Sep 13, 2024

google / visqol

Perceptual Quality Estimator for speech and audio

C++ 687 124 Updated Aug 2, 2024
