Stars
Reaching LLaMA2 Performance with 0.1M Dollars
Official code repository for the publication "SSG2: A New Modelling Paradigm for Semantic Segmentation"
Perceptual Grouping in Contrastive Vision-Language Models (ICCV'23)
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering" (compositing formula sketched after this list)
[ICCV-2023] The official repository of our paper "TMA: Temporal Motion Aggregation for Event-based Optical Flow".
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
[NeurIPS 2023] This repo contains the code for our paper Convolutions Die Hard: Open-Vocabulary Segmentation with Single Frozen Convolutional CLIP
[ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design
Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.
This is an official implementation of "Scale-Aware Modulation Meet Transformer".
LightGlue: Local Feature Matching at Light Speed (ICCV 2023); usage sketch after this list
[TPAMI 2024 & CVPR 2022] Attention Concatenation Volume for Accurate and Efficient Stereo Matching
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
Code for ACL 2023 Oral Paper: ManagerTower: Aggregating the Insights of Uni-Modal Experts for Vision-Language Representation Learning
Neighborhood Attention Extension. Bringing attention to a neighborhood near you!
RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, … (toy recurrence sketched after this list)
[ICLR 2023] "Dilated convolution with learnable spacings" Ismail Khalfaoui Hassani, Thomas Pellegrini and Timothée Masquelier
QLoRA: Efficient Finetuning of Quantized LLMs (setup sketch after this list)
This is the official implementation of the paper - GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training
This is a collection of our NAS and Vision Transformer work.
ImageBind: One Embedding Space to Bind Them All (usage sketch after this list)
Code for VPGTrans: Transfer Visual Prompt Generator across LLMs. VL-LLaMA, VL-Vicuna.
LAVIS - A One-stop Library for Language-Vision Intelligence (loader sketch after this list)
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
PyTorch code and models for the DINOv2 self-supervised learning method (feature-extraction sketch after this list)
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model (prompted-inference sketch after this list).
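
For the 3D Gaussian Splatting entry: the heart of the renderer is standard front-to-back alpha blending over depth-sorted, projected Gaussians. The compositing formula the paper builds on (reproduced here from the standard point-based rendering form) is

```latex
C = \sum_{i \in \mathcal{N}} c_i \,\alpha_i \prod_{j=1}^{i-1} \left(1 - \alpha_j\right)
```

where c_i is the view-dependent color of the i-th depth-sorted Gaussian and α_i its opacity after projection to 2D.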
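
LightGlue pairs a local-feature extractor with a learned matcher. A minimal sketch following the API style in its README (image paths are placeholders):

```python
# Match two images with SuperPoint features + LightGlue.
import torch
from lightglue import LightGlue, SuperPoint
from lightglue.utils import load_image, rbd

device = "cuda" if torch.cuda.is_available() else "cpu"
extractor = SuperPoint(max_num_keypoints=2048).eval().to(device)
matcher = LightGlue(features="superpoint").eval().to(device)

image0 = load_image("assets/im0.jpg").to(device)  # placeholder paths
image1 = load_image("assets/im1.jpg").to(device)

feats0 = extractor.extract(image0)
feats1 = extractor.extract(image1)
matches01 = matcher({"image0": feats0, "image1": feats1})
feats0, feats1, matches01 = [rbd(x) for x in (feats0, feats1, matches01)]  # drop batch dim

matches = matches01["matches"]              # (K, 2) indices into each keypoint set
points0 = feats0["keypoints"][matches[:, 0]]
points1 = feats1["keypoints"][matches[:, 1]]
```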
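
For RWKV, the key idea is that attention is replaced by a per-channel, exponentially decayed running average that has an O(T) recurrent form for inference and a parallel form for GPT-style training. A toy sketch of the RWKV-4 WKV recurrence for one channel, ignoring the numerical-stability rewrite the real kernels use (names and parameterization here are illustrative):

```python
import numpy as np

def wkv(w, u, k, v):
    """Toy RWKV-4 WKV recurrence for one channel.
    w: decay (w > 0), u: 'bonus' for the current token,
    k, v: key/value sequences of length T."""
    T = k.shape[0]
    a, b = 0.0, 0.0               # running numerator / denominator
    out = np.empty(T)
    for t in range(T):
        e = np.exp(u + k[t])      # current token gets the bonus u
        out[t] = (a + e * v[t]) / (b + e)
        a = np.exp(-w) * a + np.exp(k[t]) * v[t]   # decay past, add present
        b = np.exp(-w) * b + np.exp(k[t])
    return out
```

The loop form is what makes inference fast and memory-light; the equivalent summation form is what allows parallel training.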
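
QLoRA freezes a 4-bit NF4-quantized base model and trains low-rank adapters on top. A hedged setup sketch using Hugging Face transformers/peft/bitsandbytes (model name and hyperparameters are examples, not the paper's exact recipe):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit frozen base weights
    bnb_4bit_quant_type="nf4",               # NormalFloat4 from the paper
    bnb_4bit_use_double_quant=True,          # double quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # compute in bf16
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",              # example base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)
lora = LoraConfig(r=64, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)          # only LoRA adapters are trainable
```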
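
ImageBind embeds multiple modalities into one joint space, so cross-modal similarity is just a dot product. A minimal sketch in the style of the repo's README (file paths are placeholders):

```python
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda" if torch.cuda.is_available() else "cpu"
model = imagebind_model.imagebind_huge(pretrained=True).eval().to(device)

inputs = {
    ModalityType.TEXT: data.load_and_transform_text(["a dog"], device),
    ModalityType.VISION: data.load_and_transform_vision_data(["dog.jpg"], device),
}
with torch.no_grad():
    embeddings = model(inputs)   # dict: modality -> joint-space embeddings

# Text/image similarity in the shared space.
score = embeddings[ModalityType.VISION] @ embeddings[ModalityType.TEXT].T
```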
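
LAVIS exposes most of its models through a single loader. A sketch based on its documented `load_model_and_preprocess` entry point (image path is a placeholder):

```python
import torch
from PIL import Image
from lavis.models import load_model_and_preprocess

device = "cuda" if torch.cuda.is_available() else "cpu"
# One call returns the model plus matching input processors.
model, vis_processors, _ = load_model_and_preprocess(
    name="blip_caption", model_type="base_coco", is_eval=True, device=device
)
raw_image = Image.open("photo.jpg").convert("RGB")          # placeholder path
image = vis_processors["eval"](raw_image).unsqueeze(0).to(device)
captions = model.generate({"image": image})                 # list of strings
```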
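
DINOv2 backbones are published via `torch.hub`, so feature extraction needs no local checkout. A minimal sketch (the 384-dim output assumes the ViT-S/14 variant; image sides must be multiples of the 14-pixel patch size):

```python
import torch

model = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14")
model.eval()

x = torch.randn(1, 3, 224, 224)   # 224 = 16 * 14, so it tiles cleanly
with torch.no_grad():
    feats = model(x)              # (1, 384) global image embedding
```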
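
SAM is promptable: you pass clicks or boxes and get candidate masks back. A sketch of the repo's `SamPredictor` flow (checkpoint filename refers to the released ViT-H weights; the image here is a stand-in for a real RGB array):

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for an RGB image
predictor.set_image(image)                       # run the image encoder once
masks, scores, logits = predictor.predict(
    point_coords=np.array([[320, 240]]),         # one foreground click (x, y)
    point_labels=np.array([1]),                  # 1 = foreground
    multimask_output=True,                       # return 3 candidate masks
)
```

Encoding the image once and reusing it across prompts is what makes interactive use cheap.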