
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 788 41 Updated Jun 28, 2024

Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) with Stable Diffusion

Jupyter Notebook 7,503 790 Updated Dec 8, 2022
Python 950 51 Updated Jun 27, 2024

[CVPR 2024 Oral] MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation.

Python 86 8 Updated Jun 13, 2024
Python 1,676 51 Updated Jun 28, 2024

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Python 1,022 53 Updated Jun 28, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 17,781 1,916 Updated Jun 28, 2024

[BSQ-ViT] Image and Video Tokenization with Binary Spherical Quantization

Python 51 Updated Jun 12, 2024

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 862 31 Updated Jun 28, 2024
Python 286 20 Updated Jun 6, 2024

[CVPR 2024] Official implementation of "VRP-SAM: SAM with Visual Reference Prompt"

Python 48 5 Updated Apr 2, 2024

Tokenize Anything via Prompting

Jupyter Notebook 460 18 Updated May 28, 2024

Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.

Python 3,418 320 Updated Jun 16, 2024

The official Meta Llama 3 GitHub site

Python 22,581 2,356 Updated Jun 25, 2024

Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Python 12,035 776 Updated Jun 26, 2024

[ICCV 2023] ProPainter: Improving Propagation and Transformer for Video Inpainting

Python 4,869 585 Updated Apr 17, 2024

[GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly …

Python 3,753 283 Updated Apr 30, 2024

[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Python 961 77 Updated May 8, 2024

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 5,822 484 Updated Feb 29, 2024

A family of lightweight multimodal models.

Python 770 56 Updated Jun 25, 2024

Open weights LLM from Google DeepMind.

Jupyter Notebook 2,193 265 Updated May 18, 2024

Official JAX implementation of MAGVIT: Masked Generative Video Transformer

Python 893 43 Updated Jan 17, 2024

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 6,319 480 Updated Jun 20, 2024

OMG-LLaVA and OMG-Seg codebase

Python 835 39 Updated Jun 28, 2024

[ECCV 2022] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

Python 1,645 182 Updated Mar 15, 2024

A simple, performant and scalable Jax LLM!

Python 1,349 236 Updated Jun 28, 2024

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 1,956 133 Updated Jun 21, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source multimodal dialogue model approaching GPT-4V performance.

Python 3,767 290 Updated Jun 20, 2024

Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"

Python 2,035 102 Updated Mar 21, 2024

We release the DaTaSeg Objects365 Instance Segmentation Dataset introduced in the DaTaSeg paper, which can be used as an evaluation benchmark for weakly supervised or semi-supervised segmentation.

Jupyter Notebook 14 1 Updated Dec 9, 2023