Stars
MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
Official code for the paper "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter".
LAVIS - A One-stop Library for Language-Vision Intelligence
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
An open source implementation of CLIP (a minimal usage sketch follows this list).
A PyTorch knowledge distillation library for benchmarking and extending works in the domains of knowledge distillation, pruning, and quantization.
A curated list of papers presenting interesting empirical studies and insights on deep learning. Continually updated...
[TPAMI 2023] Low Dimensional Landscape Hypothesis is True: DNNs can be Trained in Tiny Subspaces
[ICLR 2023] Trainable Weight Averaging: Efficient Training by Optimizing Historical Solutions
The PyTorch implementation of several representative action recognition approaches, including I3D, S3D, TSN, and TAM.
PyTorch code and models for V-JEPA self-supervised learning from video.
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
A collection of parameter-efficient transfer learning papers focusing on computer vision and multimodal domains.
The official repository for the ICLR 2024 paper "FROSTER: Frozen CLIP is a Strong Teacher for Open-Vocabulary Action Recognition"
This repo holds the official code and data for "Beyond Literal Descriptions: Understanding and Locating Open-World Objects Aligned with Human Intentions", accepted to ACL 2024 (Findings).
[ECCV 2024] The official code of paper "Open-Vocabulary SAM".
Official implementation of the NeurIPS 2021 paper "Referring Transformer: A One-step Approach to Multi-task Visual Grounding"
[ICCV2023] Segment Every Reference Object in Spatial and Temporal Spaces
A detection/segmentation dataset with labels characterized by intricate and flexible expressions. "Described Object Detection: Liberating Object Detection with Flexible Expressions" (NeurIPS 2023).
A curated list of papers and resources related to Described Object Detection, Open-Vocabulary/Open-World Object Detection and Referring Expression Comprehension. Updated frequently and pull request…
[NeurIPS 2022] Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
[ICCV 2023] MeViS: A Large-scale Benchmark for Video Segmentation with Motion Expressions
A curated list of foundation models for vision and language tasks
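Several of the starred projects above build on CLIP-style image-text alignment (OpenCLIP, FROSTER, Open-Vocabulary SAM). As a minimal sketch of what that looks like in practice, the snippet below runs zero-shot classification with OpenCLIP; the model name, pretrained tag, prompts, and image path are illustrative choices, not anything prescribed by the entries above.

```python
# Minimal zero-shot classification sketch with OpenCLIP (open_clip_torch).
# Model name, pretrained tag, prompts, and image path are illustrative.
import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
model.eval()
tokenizer = open_clip.get_tokenizer("ViT-B-32")

image = preprocess(Image.open("example.jpg")).unsqueeze(0)  # placeholder image path
text = tokenizer(["a diagram", "a dog", "a cat"])            # placeholder prompts

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings, then turn scaled cosine similarities into
    # a probability distribution over the text prompts.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probabilities:", text_probs)
```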