Stars
This is the official code for the MobileSAM project, which makes SAM lightweight for mobile applications and beyond!
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
[ICCV 2023] MI-GAN: A Simple Baseline for Image Inpainting on Mobile Devices
Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything
The open-source tool for building high-quality datasets and computer vision models
This repository contains the official implementation of the research paper, "MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training" CVPR 2024
This repository contains the official implementation of the research paper, "FastViT: A Fast Hybrid Vision Transformer using Structural Reparameterization" ICCV 2023
An Improved One millisecond Mobile Backbone
This repository contains the official implementation of the research paper, "An Improved One millisecond Mobile Backbone".
74.3% MobileNetV3-Large and 67.2% MobileNetV3-Small models on ImageNet
Simple implementation of OpenAI CLIP model in PyTorch.
Official code for our paper "Enhancing Novel Object Detection via Cooperative Foundational Models"
The official repository for Mobile AR Depth Estimation: Challenges & Prospects (HotMobile24)
TF2 implementation of knowledge distillation using the "function matching" hypothesis from https://arxiv.org/abs/2106.05237.
API for Grounding DINO 1.5: IDEA Research's Most Capable Open-World Object Detection Model Series
A PyTorch implementation of MobileNet V2 architecture and pretrained model.
An unofficial implementation of MobileNetV4 in PyTorch
[ICLR'24 Spotlight] Uni3D: 3D Visual Representation from BAAI
This is a repository for MobileNetV4-Pytorch-model, which can be used to train on your image datasets for vision tasks.
[SIGGRAPH'24] 2D Gaussian Splatting for Geometrically Accurate Radiance Fields
PyTorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset
The repo for "Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image" and "Metric3Dv2: A Versatile Monocular Geometric Foundation Model..."
Official PyTorch code for SM4Depth: Seamless Monocular Metric Depth Estimation across Multiple Cameras and Scenes by One Model
An Android demo of depth_anything_v1 and depth_anything_v2