fyting, Beihang University

OMG-LLaVA and OMG-Seg codebase

Python 1,173 45 Updated Jul 29, 2024

A curated list for Efficient Large Language Models

Python 1,005 74 Updated Jul 31, 2024

(ECCV 2024) Empowering Multimodal Large Language Model as a Powerful Data Generator

Python 59 Updated May 28, 2024

[ECCV 2024] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python 177 8 Updated Jul 4, 2024

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

Python 503 55 Updated Jun 7, 2024

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

Python 7 1 Updated Mar 6, 2024

[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"

Python 504 25 Updated Jul 25, 2024

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

Python 823 117 Updated Apr 12, 2024

VisionLLM Series

Python 776 16 Updated Jul 2, 2024

[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding

Python 146 8 Updated Jul 16, 2024

Qwen2 is the large language model series developed by Qwen team, Alibaba Cloud.

Shell 6,601 384 Updated Jul 29, 2024
Jupyter Notebook 2 Updated Jun 2, 2024

[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.

Python 703 36 Updated Jun 2, 2024

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 1,687 90 Updated Jul 31, 2024

ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)

Python 2,645 238 Updated Jul 31, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model with performance approaching GPT-4o.

Python 4,512 350 Updated Jul 31, 2024
Python 1,401 78 Updated Jul 29, 2024

A flexible and efficient codebase for training visually-conditioned language models (VLMs)

Python 369 134 Updated Jul 4, 2024

A family of lightweight multimodal models.

Python 826 64 Updated Jul 31, 2024

The official Meta Llama 3 GitHub site

Python 25,000 2,739 Updated Jul 28, 2024

An Open-source Toolkit for LLM Development

Python 2,642 168 Updated May 24, 2024

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception

Python 466 28 Updated May 8, 2024

A Framework of Small-scale Large Multimodal Models

Python 524 48 Updated Jul 30, 2024

PyTorch implementation for the paper "Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving"

Python 377 33 Updated Jun 12, 2024

A collection of deep learning based RGB-T-Fusion methods, codes, and datasets. The main directions involved are Multispectral Pedestrian Detection, RGB-T Aerial Object Detection, RGB-T Semantic Seg…

269 35 Updated Jul 31, 2024

Official Code for Paper "GDRNet: Towards Generalizable Diabetic Retinopathy Grading in Unseen Domains"

Jupyter Notebook 24 1 Updated Apr 2, 2024

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 18,450 2,025 Updated Jul 14, 2024

Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.

Python 178 9 Updated Jul 3, 2024

A collection of CVPR 2023 papers and open-source projects

1 Updated Jun 29, 2023