Stars
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Code and models for the Medical Image Analysis (MIA) 2023 paper "Segment Anything Model for Medical Images?"
SAM-Med2D: Bridging the Gap between Natural Image Segmentation and Medical Image Segmentation
Adapting Segment Anything Model for Medical Image Segmentation
A list of referring video object segmentation papers
Evaluation Framework for DAVIS 2017 Semi-supervised and Unsupervised used in the DAVIS Challenges
MMSegmentation with LaRS configs and dataloaders
[AAAI 2024] Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
[CVPR 2024 Highlight] Official implementation of the paper: Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation
The official repo for "Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation", ECCV 2024
Official code for WACV 2024 paper, "Annotation-free Audio-Visual Segmentation"
Official repository of "Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer", AAAI 2024
Official code for "A Closer Look at Audio-Visual Segmentation"
Combining "segment-anything" with MOT to create the era of "MOTS"
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Code for "Simplifying Source-Free Domain Adaptation for Object Detection: Effective Self-Training Strategies and Performance Insights" (ECCV 2024)
[IGARSS2024] Code for "CLIP-Guided Source-Free Object Detection in Aerial Images"
An open-source implementation for training LLaVA-NeXT.
Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States
Learning multi-modal representations by watching hundreds of surgical video lectures
A repository for a surgical action triplet dataset. The data are videos of laparoscopic cholecystectomy that have been annotated with <instrument, verb, target> labels for every surgical fine-grained act…
CVPR 2023 Accepted Paper HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models
Utilities for the human-object interaction detection dataset HICO-DET