Stars
akanametov / yolo-face
Forked from ultralytics/ultralytics
YOLOv8 Face 🚀 in PyTorch > ONNX > CoreML > TFLite
Official implementation of the pupillometry system called PupilSense proposed in the article "PupilSense: Detection of Depressive Episodes Through Pupillary Response in the Wild".
The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…
Baselines for MCD-rPPG dataset. Medical parameter prediction from facial videos and rPPG reconstruction.
Efficient face emotion recognition in photos and videos
rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
Official implementation of the paper "Hausdorff Distance Matching with Adaptive Query Denoising for Rotated Detection Transformer"
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, sparsity, distillation, etc. It compresses deep learning models for downstream …
This is the official repository for our recent work: PIDNet
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
Geometric Computer Vision Library for Spatial AI
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
Docker scripts for building ONNX Runtime with TensorRT and OpenVINO in manylinux environment
[CVPR2023] Towards Robust Tampered Text Detection in Document Image: New Dataset and New Solution
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Fast and customizable framework for automatic ML model creation (AutoML)
[CVPR2024] StableVITON: Learning Semantic Correspondence with Latent Diffusion Model for Virtual Try-On
Equivariant Steerable CNNs Library for PyTorch https://quva-lab.github.io/escnn/
Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch
Inpaint anything using Segment Anything and inpainting models.
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.