Stars
Out of time: automated lip sync in the wild
[ACMMM 2022] Weakly-Supervised Temporal Action Alignment Driven by Unbalanced Spectral Fused Gromov-Wasserstein Distance
Open source annotation tool for machine learning practitioners.
Code for the ICASSP-2021 paper: Continuous Speech Separation with Conformer.
🐱 Easy-to-follow tutorials for beginners on using Shadowsocks to bypass internet restrictions.
Chat or caption with an image as context. LLaVA architecture. Tuned on GPT4V-only accurate and diverse image-text datasets. Good for practical usage. Open source.
[CVPR'23 Highlight] AutoAD: Movie Description in Context.
📋 Collection of evaluation code for natural language generation.
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
Taming Transformers for High-Resolution Image Synthesis
✨✨Latest Advances on Multimodal Large Language Models
Semi-Offline Reinforcement Learning for Optimized Text Generation
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
Train transformer language models with reinforcement learning.
DAMO-ConvAI: The official repository which contains the codebase for Alibaba DAMO Conversational AI.
Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
GLIDE: a diffusion-based text-conditional image synthesis model
Official PyTorch implementation of Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models
A latent text-to-image diffusion model
Fine-grained Post-training for Improving Retrieval-based Dialogue Systems - NAACL 2021