Stars
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion
[CVPR 2024 Oral] MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation.
Accelerating the development of large multimodal models (LMMs) with lmms-eval
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
[BSQ-ViT] Image and Video Tokenization with Binary Spherical Quantization
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
[CVPR 2024] Official implementation of "VRP-SAM: SAM with Visual Reference Prompt"
Tokenize Anything via Prompting
Easily turn large sets of image URLs into an image dataset. Can download, resize and package 100M URLs in 20h on one machine.
Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting
[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
Inpaint anything using Segment Anything and inpainting models.
Open weights LLM from Google DeepMind.
Official JAX implementation of MAGVIT: Masked Generative Video Transformer
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal chat model approaching GPT-4V performance.
Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
We release the DaTaSeg Objects365 Instance Segmentation Dataset introduced in the DaTaSeg paper, which can be used as an evaluation benchmark for weakly supervised or semi-supervised segmentation.