- CASIA
- Beijing, China
- https://wangguanan.github.io/
Starred repositories
🔊 Text-Prompted Generative Audio Model
[ECCV 2024] DragAnything: Motion Control for Anything using Entity Representation
An open source implementation of CLIP.
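🇨🇳 GitHub Chinese-language rankings, with separate "Software | Resources" lists for each language, to pinpoint good Chinese projects. Take what you need and learn efficiently.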
Flexible and powerful tensor operations for readable and reliable code (for pytorch, jax, TF and others)
PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis
[CVPR2024] The code for "Osprey: Pixel Understanding with Visual Instruction Tuning"
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
[ICLR 2023] Official implementation of the paper "DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection"
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
DeepStream SDK Python bindings and sample applications
A renderer that generates realistic images of the Earth from outer space.
Drop in a screenshot and convert it to clean code (HTML/Tailwind/React/Vue)
Awesome Monocular 3D detection
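"Hello Algo" (《Hello 算法》): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is ongoing.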
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
[ICCV 2023 Oral] Text-to-Image Diffusion Models are Zero-Shot Video Generators
[IJCV2024] Exploiting Diffusion Prior for Real-World Image Super-Resolution
No-code in the front, Python in the back. An open-source framework for creating data apps.
Implementation of DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing
A high-throughput and memory-efficient inference and serving engine for LLMs
[ECCV2022] MOTR: End-to-End Multiple-Object Tracking with TRansformer
PyTorch code for the paper "FIERY: Future Instance Segmentation in Bird's-Eye view from Surround Monocular Cameras"