- Tsinghua University
- https://yushuiwx.github.io/
Stars
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
Official PyTorch Implementation of Our CVPR 2023 Paper: "Towards Accurate Image Coding: Improved Autoregressive Image Generation with Dynamic Vector Quantization"
🔥🔥🔥 A curated list of papers on LLMs-based multimodal generation (image, video, 3D and audio).
Awesome-LLM-3D: a curated list of resources on Multi-modal Large Language Models in the 3D world
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
A repository for research on medium-sized language models.
PyTorch implementation of the models RT-1-X and RT-2-X from the paper: "Open X-Embodiment: Robotic Learning Datasets and RT-X Models"
Official Repository for "Eureka: Human-Level Reward Design via Coding Large Language Models" (ICLR 2024)
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in PyTorch
StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text
Code of the paper: Finetuning Text-to-Image Diffusion Models for Fairness
Mora: More like Sora for Generalist Video Generation
Open-Sora: Democratizing Efficient Video Production for All
An open-source implementation of "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning", an all-new multimodal AI that uses just a decoder to generate both text and images
📚 A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its applications
A curated list of reinforcement learning with human feedback resources (continually updated)
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
[Arxiv] A Survey on Video Diffusion Models
A comprehensive list of papers using large language/multi-modal models for Robotics/RL, including papers, codes, and related websites
Implementation of AudioLM, a SOTA Language Modeling Approach to Audio Generation out of Google Research, in PyTorch
FunCodec is a research-oriented toolkit for audio quantization and downstream applications, such as text-to-speech synthesis, music generation, etc.
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Flax.
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
Instruct-tune LLaMA on consumer hardware