Stars
Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on an RPI Zero 2 (or in 298MB of RAM) but also Mistral 7B on desktops and servers. ARM, x86, WASM, RI…
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Implementation of YOLOv10 in C++17 using OpenCV and ONNX Runtime
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Stable Diffusion and Flux in pure C/C++
This repository contains a pure C++ ONNX implementation of multiple offline AI models, such as StableDiffusion (1.5 and XL), ControlNet, Midas, HED and OpenPose.
🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
An MIT-licensed, deployable starter kit for building and customizing your own version of AI town - a virtual town where AI characters live, chat and socialize.
CoreNet: A library for training deep neural networks
Vision Benchmark for Maritime Search and Rescue
A high-altitude infrared thermal dataset for Unmanned Aerial Vehicle-based object detection
The state-of-the-art image restoration model without nonlinear activation functions.
A comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
Effortless data labeling with AI support from Segment Anything and other awesome models.
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
Visualize the low-level outputs of YOLOv8 to analyze and understand the areas where our model focuses. Specifically, illustrate which anchor points are activated to predict bounding boxes.
Wanna know what your model sees? Here's a package for applying EigenCAM on the new YOLO V8 model
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral
Lightning fast C++/CUDA neural network framework
[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
[ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design
Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
NVIDIA GPU exporter for Prometheus, using the nvidia-smi binary