Stars
Accelerating the development of large multimodal models (LMMs) with lmms-eval
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
The official code of "CaLa: Complementary Association Learning for Augmenting Composed Image Retrieval"
LEViT: Locally Enhanced Vision Transformer for Efficient Object Re-identification
GPU Accelerated t-SNE for CUDA with Python bindings
A script to label images with a sentence
A quickstart and benchmark for PyTorch distributed training.
The official repo for the paper "MoEPose: Mixture of Experts Learning for Occluded Human Pose Estimation"
Concept Sliders for Precise Control of Diffusion Models
Progressive Text-to-3D Generation for Automatic 3D Prototyping
A unified framework for 3D content generation.
(ෆ`꒳´ෆ) A Survey on Text-to-Image Generation/Synthesis.
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
👗3D Magic Mirror: Clothing Reconstruction from a Single Image via a Causal Perspective👗 Single-View 3D Reconstruction
Official Implementation for "TEXTure: Text-Guided Texturing of 3D Shapes"
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Code for "A Simple Episodic Linear Probe Improves Visual Recognition in the Wild"