deepsworld

💻

Never underestimate the power of more data

Deep Patel deepsworld

11 followers · 7 following

Stars

Ucas-HaoranWei / GOT-OCR2.0

Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Python 5,115 416 Updated Oct 2, 2024

lixirui142 / VidToMe

Official PyTorch Implementation for "VidToMe: Video Token Merging for Zero-Shot Video Editing" (CVPR 2024)

Python 172 10 Updated Apr 1, 2024

TonyLianLong / LLM-groundedVideoDiffusion

[ICLR 2024] LLM-grounded Video Diffusion Models (LVD): official implementation for the LVD paper

Python 122 7 Updated May 7, 2024

gorkaydemir / SOLV

Official implementation of the NeurIPS 2023 paper "Self-supervised Object-Centric Learning for Videos"

Python 21 Updated Feb 6, 2024

evelinehong / 3D-CLR-Official

Forked from zsh2000/3D-CLR

[CVPR 2023] Code for "3D Concept Learning and Reasoning from Multi-View Images"

Python 73 3 Updated Jan 20, 2024

gaomingqi / Track-Anything

Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.

Python 6,432 479 Updated May 31, 2024

elicassion / 3DTRL

Code for NeurIPS 2022 paper "Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space"

Python 18 Updated Apr 20, 2023

DepthAnything / Depth-Anything-V2

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 3,463 286 Updated Aug 14, 2024

OpenGVLab / EgoExoLearn

[CVPR 2024] Data and benchmark code for the EgoExoLearn dataset

Python 45 Updated Sep 3, 2024

brown-palm / AntGPT

Official code implementation of the paper "AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?"

Python 19 2 Updated Sep 23, 2024

FoundationVision / VAR

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 4,052 304 Updated Oct 6, 2024

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 26,549 3,001 Updated Aug 12, 2024

chengche6230 / ReST

[ICCV 2023] ReST: A Reconfigurable Spatial-Temporal Graph Model for Multi-Camera Multi-Object Tracking

Python 137 15 Updated Mar 27, 2024

ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++

C 34,911 3,559 Updated Oct 8, 2024

princeton-nlp / SWE-agent

[NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challen…

Python 13,422 1,344 Updated Oct 9, 2024