Stars
A Robust and Versatile Monocular Visual-Inertial State Estimator
An OpenCV based implementation of Monocular Visual Odometry
A reimplementation of LiteFlowNet in PyTorch that matches the official Caffe version
Deep learning for image processing, including classification, object detection, etc.
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
The first challenge on short-form video quality assessment
A Deep Learning based No-reference Quality Assessment Model for UGC Videos
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
Chinese NLP solutions (large models, data, models, training, inference)
This is a collection of our NAS and Vision Transformer work.
This repository contains the official implementation of the research paper "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" (CVPR 2024)
[ICLR 2024] Official PyTorch implementation of FasterViT: Fast Vision Transformers with Hierarchical Attention
The first international standard for image aesthetics assessment metadata.
End-to-end learning of deep visual representations for image retrieval
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
A PyTorch implementation of our method from "An Integrated System for Spatio-Temporal Summarization of 360-degrees Videos", Proc. MMM 2024
Get hundreds of millions of image+URL pairs from the crawling at home dataset and preprocess them
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
The ChatGPT Retrieval Plugin lets you easily find personal or work documents by asking questions in natural language.
Easily compute clip embeddings and build a clip retrieval system with them
Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.
[NAACL 2022] Mobile Text-to-Image search powered by multimodal semantic representation models (e.g., OpenAI's CLIP)
Authors' official PyTorch implementation of "ViSiL: Fine-grained Spatio-Temporal Video Similarity Learning" [ICCV 2019]
Official PyTorch Implementation of Correlation Verification for Image Retrieval, CVPR 2022 (Oral Presentation)
[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
[CVPR 2023] DepGraph: Towards Any Structural Pruning
CNN Image Retrieval in PyTorch: Training and evaluating CNNs for Image Retrieval in PyTorch
LightGlue: Local Feature Matching at Light Speed (ICCV 2023)
Code for "LoFTR: Detector-Free Local Feature Matching with Transformers", CVPR 2021, T-PAMI 2022