Stars
[CVPR 2022] DenseCLIP: Language-Guided Dense Prediction with Context-Aware Prompting
Real-time face swap and one-click video deepfakes with only a single image
Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
Official implementation of the CVPR paper Open-TransMind: A New Baseline and Benchmark for 1st Foundation Model Challenge of Intelligent Transportation
(ICLR 2022 Spotlight) Official PyTorch implementation of "How Do Vision Transformers Work?"
Python scripts to convert between COCO and BDD100K annotations for object detection. Also includes a folder 'mAP' containing the script main.py, which can calculate the mAP score given the ground truth and the detections.
EVA Series: Visual Representation Fantasies from BAAI
[ICLR'22 Oral] Implementation of "CycleMLP: A MLP-like Architecture for Dense Prediction"
[NeurIPS 2022] Official implementation of the paper "Expediting Large-Scale Vision Transformer for Dense Prediction without Fine-tuning", applied to SAM.
PyTorch implementation of multi-task learning architectures, incl. MTI-Net (ECCV2020).
[ICLR'23 Oral] Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
[CVPR 2023] Explicit Visual Prompting for Low-Level Structure Segmentations
PyTorch codes for "LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning"
A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities
Adapting Segment Anything Model for Medical Image Segmentation
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"
Tracking and collecting papers/projects/others related to Segment Anything.
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model (a minimal usage sketch appears after this list).
Prefix-Tuning: Optimizing Continuous Prompts for Generation
An optimized deep prompt tuning strategy comparable to fine-tuning across scales and tasks
PyTorch code for "VL-Adapter: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks" (CVPR 2022)
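
The Segment Anything entry above describes inference from downloaded checkpoints and example notebooks; below is a minimal sketch following that repository's documented predictor API. The checkpoint filename, image path, and prompt point are illustrative placeholders, not values from the source.

```python
# Minimal SAM inference sketch. Assumes the segment-anything package is
# installed and a ViT-H checkpoint has been downloaded locally; the checkpoint
# filename, image path, and prompt point are placeholders.
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load the model from a checkpoint and wrap it in a predictor.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SAM expects an HxWx3 uint8 RGB image.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)

# Prompt with a single foreground point; returns candidate masks with scores.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
print(masks.shape, scores)  # boolean masks of shape (3, H, W) and their quality scores
```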