Stars
End-to-end SFM framework based on GTSAM
Making Structure-from-Motion (COLMAP) more robust to symmetries and duplicated structures
🚪✊ Knock Knock: Get notified when your training ends with only two additional lines of code
Code for "FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent" by Cameron Smith*, David Charatan*, Ayush Tewari, and Vincent Sitzmann
A user-friendly and high-performance implementation of neural radiance fields (NeRF)
Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
Official implementation of "Contrastive Audio-Language Learning for Music" (ISMIR 2022)
Easily compute CLIP embeddings from video frames
[ECCV'22 Oral] Towards Grand Unification of Object Tracking
Pen and paper exercises in machine learning
[CVPR 2023] Official implementation of the paper "Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation"
Official code for "Decoupling Zero-Shot Semantic Segmentation"
PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation (CVPR2022)
Real-time Object Detection for Streaming Perception, CVPR 2022
[CVPR 2022 (oral)] Bongard-HOI for benchmarking few-shot visual reasoning
[TPAMI 2023] Multimodal Image Synthesis and Editing: The Generative AI Era
Code for "OnePose: One-Shot Object Pose Estimation without CAD Models", CVPR 2022
🧮 A collection of resources to learn mathematics for machine learning
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
[CVPR 2022] Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization
Benchmark your model on out-of-distribution datasets with carefully collected human comparison data (NeurIPS 2021 Oral)
CLIPort: What and Where Pathways for Robotic Manipulation
The official implementation for Pseudo Numerical Methods for Diffusion Models on Manifolds (PNDM, PLMS | ICLR2022)
Transformer Tracking with Cyclic Shifting Window Attention (CSWinTT)
Language Models Can See: Plugging Visual Controls in Text Generation
Research code for CVPR 2022 paper "SwinBERT: End-to-End Transformers with Sparse Attention for Video Captioning"
A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modelings. This project grows with the research community, aiming to achieve the ulti…
[CVPR 2022] Thin-Plate Spline Motion Model for Image Animation.