forrestbing · Hangzhou, Zhejiang

Showing results

The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".

Python 110 4 Updated Jul 26, 2024

EVE: Encoder-Free Vision-Language Models from BAAI

Python 176 3 Updated Jul 20, 2024

[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy

Python 2,035 123 Updated Jun 25, 2024

Odyssey: Empowering Agents with Open-World Skills

Python 154 5 Updated Jul 29, 2024

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

Python 3,527 242 Updated Mar 5, 2024

A state-of-the-art open visual language model (multimodal pretraining model).

Python 5,716 393 Updated May 29, 2024

LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models

Python 28 2 Updated Jul 11, 2024

PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding. Accepted to CVPR 2024.

Python 157 4 Updated Jun 3, 2024

Analysis of Chinese and English document layouts.

Python 56 6 Updated Jul 19, 2024

Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train Dataset for table understanding and develop a generalist tab…

Python 73 2 Updated Jul 19, 2024

Align Anything: Training Any Modality Model with Feedback

Python 64 17 Updated Jul 28, 2024

A family of compressed models obtained via pruning and knowledge distillation

70 5 Updated Jul 26, 2024

Prompt engineering, automated.

Jupyter Notebook 116 7 Updated Jul 28, 2024

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

JavaScript 926 85 Updated Jan 31, 2024

Official repository for the paper "MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning" (https://arxiv.org/abs/2406.17770).

Python 123 2 Updated Jul 19, 2024

Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future

74 3 Updated Jul 24, 2024

4M: Massively Multimodal Masked Modeling

Python 1,448 83 Updated Jul 17, 2024

A minimal codebase for finetuning large multimodal models, supporting llava-1.5, qwen-vl, llava-interleave, llava-next-video, phi3-v etc.

Python 75 2 Updated Jul 28, 2024

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Python 1,606 99 Updated Jul 26, 2024

An Open-source Toolkit for LLM Development

Python 2,642 168 Updated May 24, 2024

Official code for Paper "Mantis: Multi-Image Instruction Tuning"

Python 127 9 Updated Jul 25, 2024

Text-Guided Generation of Full-Body Image with Preserved Reference Face for Customized Animation

Python 22 3 Updated Jun 24, 2024

[CVPR 2024] Code release for "Unsupervised Universal Image Segmentation"

Python 160 5 Updated May 7, 2024

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Python 1,907 180 Updated Apr 24, 2024

The official repo of Qwen-VL (通义千问-VL), the chat and pretrained large vision-language model proposed by Alibaba Cloud.

Python 4,447 335 Updated May 28, 2024

GPT4V-level open-source multi-modal model based on Llama3-8B

Python 1,670 90 Updated Jul 25, 2024