Starred repositories

[ICLR 2024] Analyzing and Mitigating Object Hallucination in Large Vision-Language Models

Python 124 5 Updated Apr 30, 2024

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

315 9 Updated Aug 20, 2024

Accelerating the development of large multimodal models (LMMs) with lmms-eval

Python 1,252 79 Updated Aug 20, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o.

Python 5,137 403 Updated Aug 20, 2024

Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"

Python 3,145 277 Updated May 4, 2024

【ICLR 2024🔥】 Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Python 659 48 Updated Mar 25, 2024
Python 19 Updated Oct 10, 2023

LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images

Python 286 14 Updated Aug 18, 2024

Recent LLM-based CV and related works. Welcome to comment/contribute!

815 33 Updated Jun 5, 2024

A general representation model across vision, audio, language modalities. Paper: ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities

Python 917 56 Updated Jun 27, 2024

🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.

1,144 64 Updated Aug 21, 2024

Mora: More like Sora for Generalist Video Generation

Python 1,464 91 Updated Jun 21, 2024

Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information

Python 8,738 1,349 Updated Aug 9, 2024

mPLUG-DocOwl: Modularized Multimodal Large Language Model for Document Understanding

Python 1,207 75 Updated Jul 16, 2024

OpenMMLab YOLO series toolbox and benchmark. Implemented RTMDet, RTMDet-Rotated, YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOX, PPYOLOE, etc.

Python 2,891 526 Updated Jul 14, 2024

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/

Python 9,261 2,179 Updated Jul 30, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 4,167 405 Updated Jul 30, 2024

Referring Expression Datasets API

Jupyter Notebook 442 79 Updated Apr 13, 2021

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,656 118 Updated Aug 15, 2024

[CVPR2024] Generative Region-Language Pretraining for Open-Ended Object Detection

Python 122 6 Updated Mar 25, 2024

Grok open release

Python 49,355 8,325 Updated Aug 7, 2024

This project aims to share the technical principles behind large models, along with hands-on experience.

HTML 8,661 843 Updated Aug 11, 2024

[ECCV 2024] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python 185 9 Updated Aug 12, 2024

Official GitHub repository for the paper "LingoQA: Video Question Answering for Autonomous Driving"

Python 99 3 Updated Mar 26, 2024

[ECCV 2024] Embodied Understanding of Driving Scenarios

Python 125 8 Updated May 9, 2024

CLIP+MLP Aesthetic Score Predictor

Python 842 87 Updated Jul 1, 2024
Jupyter Notebook 143 9 Updated Jul 5, 2024

[CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Python 477 19 Updated Jun 26, 2024

[CVPR2024] ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts

Python 268 20 Updated Jul 17, 2024

Official repository of NEFTune: Noisy Embeddings Improve Instruction Finetuning

Python 358 18 Updated May 17, 2024