- HKUST, Guangzhou
- https://owen718.github.io
- https://scholar.google.com/citations?user=1sGXZ-wAAAAJ&hl=en
Stars
Score identity Distillation with Long and Short Guidance for One-Step Text-to-Image Generation
[train + eval + deploy] Aurora Series: A more efficient multimodal large language model series for video.
[ECCV'24] Teaching Tailored to Talent: Adverse Weather Restoration via Prompt Pool and Depth-Anything Constraint
🐫 CAMEL: Finding the Scaling Law of Agents. A multi-agent framework. https://www.camel-ai.org
OLMoE: Open Mixture-of-Experts Language Models
[ICLR 2024] Confidence-aware Reward Optimization for Fine-tuning Text-to-Image Models
A list of works on evaluation of visual generation models, including evaluation metrics, models, and systems
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
⛏💎 STEVE in Minecraft is for See and Think: Embodied Agent in Virtual Environment
Official Implementation of Rectified Flow (ICLR2023 Spotlight)
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.
Lumina-T2X is a unified framework for Text to Any Modality Generation
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
Repo for the code of our research paper on micro-budget training of large-scale diffusion models.
Live2Diff: A Pipeline that processes Live video streams by a uni-directional video Diffusion model.
Official implementation of MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
"SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow", Yuanzhi Zhu, Xingchao Liu, Qiang Liu
PeRFlow: Piecewise Rectified Flow as Universal Plug-and-Play Accelerator (NeurIPS 2024)
Implementation of UltraPixel: Advancing Ultra-High-Resolution Image Synthesis to New Peaks
This repo provides a YOLOv8 model, finely trained for detecting human heads in complex crowd scenes, with the CrowdHuman dataset serving as training data. To boost accessibility and compatibility, …