- Alibaba Group
- Beijing, Chaoyang
- https://scholar.google.com/citations?hl=en&user=VfovrnEAAAAJ
Stars
A collection of visual instruction tuning datasets.
MobiLlama: Small Language Model tailored for edge devices
CoreNet: A library for training deep neural networks
Harnessing 1.4M GPT4V-synthesized Data for A Lite Vision-Language Model
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".
Official implementation of project Honeybee (CVPR 2024)
MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment
SEED-Story: Multimodal Long Story Generation with Large Language Model
Efficient Multi-modal Models via Stage-wise Visual Context Compression
When do we not need larger vision models?
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model
Anole: An Open, Autoregressive, and Native Multimodal Model for Interleaved Image-Text Generation
LaVIT: Empower the Large Language Model to Understand and Generate Visual Content
Code for "AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling"
Repository for Meta Chameleon, a mixed-modal early-fusion foundation model from FAIR.
Code for "Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models"
[ECCV 2024] Accelerating Online Mapping and Behavior Prediction via Direct BEV Feature Attention
Learning 1D Causal Visual Representation with De-focus Attention Networks
Official implementation of SEED-LLaMA (ICLR 2024).
LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
[ICLR'24 spotlight] Chinese and English Multimodal Large Model Series (Chat and Paint) | Chinese-English bilingual multimodal large models built on the CPM foundation model
🔥 [ECCV2024] Official Implementation of "GiT: Towards Generalist Vision Transformer through Universal Language Interface"
[ICLR 2024 (Spotlight)] "Frozen Transformers in Language Models are Effective Visual Encoder Layers"