Stars
A video reader for extracting motion vectors and residuals from encoded H.264 videos.
[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
[ICCV 2023] Simple Baselines for Interactive Video Retrieval with Questions and Answers
Official PyTorch repository for "QD-DETR: Query-Dependent Video Representation for Moment Retrieval and Highlight Detection" (CVPR 2023 Paper)
DocBank: A Benchmark Dataset for Document Layout Analysis
(CVPR 2023) Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry
Code for "Neural 3D Reconstruction in the Wild", SIGGRAPH 2022 (Conference Proceedings)
S3D Text-Video model trained on HowTo100M using MIL-NCE
An implementation of a small TCP/IP protocol stack for learning.
Implementation of "Frustratingly Easy Edit-based Linguistic Steganography with a Masked Language Model"
Using LLMs and pre-trained caption models for super-human performance on image captioning.
DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis
An official implementation (base) of recipe generation from unsegmented cooking videos [Nishimura+ arXiv22]. Joint learning approach of event selector and sentence generator.
mlvlab / VT-TWINS
Forked from KoDohwan/VT-TWINS
Video-Text Representation Learning via Differentiable Weak Temporal Alignment (CVPR 2022)
ChangeIt dataset with more than 2600 hours of video with state-changing actions published at CVPR 2022
Official implementation of state-aware video procedural captioning (ACM MM 2021)
A Lisp interpreter implemented in Conway's Game of Life
Improved Sentence Alignment in Linear Time and Space
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk