The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 7,860 420 Updated Aug 2, 2024

tangli-udel / DEAL

The PyTorch implementation of "DEAL: Disentangle and Localize Concept-level Explanations for VLMs" (ECCV 2024)

Jupyter Notebook 4 1 Updated Jul 5, 2024

LTH14 / mar

PyTorch implementation of MAR+DiffLoss https://arxiv.org/abs/2406.11838

Python 351 15 Updated Jul 31, 2024

hao-ai-lab / LookaheadDecoding

Python 1,058 62 Updated Feb 14, 2024

google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.

Jupyter Notebook 2,080 142 Updated Jul 12, 2024

openai / point-e

Point cloud diffusion for 3D model synthesis

Python 6,435 747 Updated Jul 4, 2024

GAIR-NLP / anole

Anole: An Open, Autoregressive and Native Multimodal Models for Interleaved Image-Text Generation

Python 570 29 Updated Jul 15, 2024

facebookresearch / chameleon

Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.

Python 1,619 101 Updated Jul 29, 2024

AILab-CVC / SEED-X

Multimodal Models in Real World

Jupyter Notebook 347 17 Updated Jul 12, 2024

TencentARC / SEED-Story

SEED-Story: Multimodal Long Story Generation with Large Language Model

Python 626 47 Updated Jul 29, 2024

cambrian-mllm / cambrian

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,621 99 Updated Jul 26, 2024

csyxwei / ELITE

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation (ICCV 2023, Oral)

Python 495 28 Updated Jan 8, 2024

haoosz / ConceptExpress

[ECCV 2024] ConceptExpress: Harnessing Diffusion Models for Single-image Unsupervised Concept Extraction

Python 24 3 Updated Jul 25, 2024

microsoft / Phi-3CookBook

This is a Phi-3 book for getting started with Phi-3. Phi-3 is a family of open AI models developed by Microsoft. Phi-3 models are the most capable and cost-effective small language models (SLMs) avai…

Jupyter Notebook 1,450 137 Updated Aug 2, 2024

ali-vilab / Ranni

Python 199 14 Updated Apr 10, 2024

kamwoh / partcraft

PartCraft: Crafting Creative Objects by Parts (ECCV 2024)

Python 69 1 Updated Jul 13, 2024

ruocwang / dpo-diffusion

[ICML 2024] On Discrete Prompt Optimization for Diffusion Models - Google

12 Updated Jul 24, 2024

tianweiy / DMD2

Python 368 21 Updated Jul 10, 2024

CaraJ7 / CoMat

Official code for 💫CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

Python 115 3 Updated Apr 30, 2024

TencentARC / Open-MAGVIT2

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 351 10 Updated Jul 10, 2024

Understanding-Visual-Datasets / VisDiff

Official implementation of "Describing Differences in Image Sets with Natural Language" (CVPR 2024 Oral)

Jupyter Notebook 90 11 Updated Apr 8, 2024

LLaVA-VL / LLaVA-NeXT

Python 1,421 79 Updated Jul 29, 2024

FoundationVision / LlamaGen

Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation

Python 1,104 39 Updated Jul 14, 2024

baaivision / EVA

EVA Series: Visual Representation Fantasies from BAAI

Python 2,150 157 Updated Aug 1, 2024

RockeyCoss / SPO

Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step

Python 120 2 Updated Jul 10, 2024

AIGText / Glyph-ByT5

[ECCV 2024] This is the official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Mu…

Jupyter Notebook 455 19 Updated Jul 13, 2024

lllyasviel / Omost

Your image is almost there!

Python 7,017 411 Updated Jul 26, 2024

anyoptimization / pymoo

NSGA2, NSGA3, R-NSGA3, MOEAD, Genetic Algorithms (GA), Differential Evolution (DE), CMAES, PSO

Python 2,150 378 Updated Aug 1, 2024

gnobitab / MultiObjectiveSampling

Python 16 3 Updated Dec 6, 2021