A PyTorch implementation of MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis

Python 523 25 Updated Mar 10, 2023

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

11 1 Updated Oct 15, 2024

A bibliography and survey of the papers surrounding o1

TeX 515 21 Updated Nov 2, 2024

Emerging Pixel Grounding in Large Multimodal Models Without Grounding Supervision

Python 18 Updated Oct 21, 2024

O1 Replication Journey: A Strategic Progress Report – Part I

1,154 28 Updated Oct 28, 2024

A paper list of recent work on token compression for ViT and VLM

118 2 Updated Oct 30, 2024

[NeurIPS 2024] Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

Python 39 4 Updated Oct 31, 2024

✨✨ MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?

Python 77 5 Updated Sep 29, 2024
1 Updated Aug 13, 2024

Textureless Underwater Real Time Localization and Mapping

C++ 55 4 Updated Oct 1, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 36,852 4,540 Updated Nov 2, 2024

[CVPR 2024] Situational Awareness Matters in 3D Vision Language Reasoning

Python 21 2 Updated Nov 1, 2024

LLM101n: Let's build a Storyteller

29,583 1,620 Updated Aug 1, 2024

We propose CRKD to bridge the performance gap between LC and CR detectors with a novel cross-modality knowledge distillation (KD) framework.

Python 25 4 Updated Jul 3, 2024

Open-MAGVIT2: Democratizing Autoregressive Visual Generation

Python 675 28 Updated Sep 27, 2024

Official repository of the CVPR 2024 paper RMem: Restricted Memory Banks Improve Video Object Segmentation

Python 32 2 Updated Aug 22, 2024

Taming Transformers for High-Resolution Image Synthesis

Jupyter Notebook 5,776 1,143 Updated Jul 30, 2024

Your image is almost there!

Python 7,304 418 Updated Jul 26, 2024

Official repository for "AM-RADIO: Reduce All Domains Into One"

Python 774 32 Updated Oct 25, 2024

[ICCV 2023] Multi3DRefer: Grounding Text Description to Multiple 3D Objects

Python 75 3 Updated Jan 26, 2024

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Python 1,823 142 Updated Nov 2, 2024

[COLM-2024] List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs

Python 123 3 Updated Aug 23, 2024

Code for PhysDreamer

Python 489 25 Updated Sep 15, 2024

Reaching LLaMA2 Performance with 0.1M Dollars

Python 961 79 Updated Jul 23, 2024

✨✨Latest Advances on Multimodal Large Language Models

12,444 795 Updated Oct 29, 2024

DUSt3R: Geometric 3D Vision Made Easy

Python 5,311 580 Updated Sep 20, 2024

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Python 2,235 143 Updated Oct 21, 2024

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Python 4,202 310 Updated Oct 6, 2024

Efficient vision foundation models for high-resolution generation and perception.

Python 2,246 182 Updated Oct 29, 2024

[NeurIPS 2024] SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challen…

Python 13,614 1,377 Updated Oct 31, 2024