Stars
A curated list for Efficient Large Language Models
[ECCV 2024] Empowering Multimodal Large Language Model as a Powerful Data Generator
[ECCV 2024] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
[ECCV 2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
[CVPR 2024 Highlight] Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
Qwen2 is the large language model series developed by the Qwen team at Alibaba Cloud.
[CVPR 2024 🔥] Grounding Large Multimodal Model (GLaMM), the first-of-its-kind model capable of generating natural language responses that are seamlessly integrated with object segmentation masks.
GPT4V-level open-source multi-modal model based on Llama3-8B
ms-swift: Use PEFT or Full-parameter to finetune 300+ LLMs or 50+ MLLMs. (Qwen2, GLM4v, Internlm2.5, Yi, Llama3.1, Llava-Video, Internvl2, MiniCPM-V, Deepseek, Baichuan2, Gemma2, Phi3-Vision, ...)
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
An Open-source Toolkit for LLM Development
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
A Framework of Small-scale Large Multimodal Models
PyTorch implementation for the paper "Driving with LLMs: Fusing Object-Level Vector Modality for Explainable Autonomous Driving"
A collection of deep learning based RGB-T-Fusion methods, codes, and datasets. The main directions involved are Multispectral Pedestrian Detection, RGB-T Aerial Object Detection, RGB-T Semantic Seg…
Official Code for Paper "GDRNet: Towards Generalizable Diabetic Retinopathy Grading in Unseen Domains"
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
Official implementation of the CVPR 2024 paper ViT-CoMer: Vision Transformer with Convolutional Multi-scale Feature Interaction for Dense Predictions.
A collection of CVPR 2023 papers and open-source projects