Stars
The code for "TokenPacker: Efficient Visual Projector for Multimodal LLM".
EVE: Encoder-Free Vision-Language Models from BAAI
[ECCV2024] API code for T-Rex2: Towards Generic Object Detection via Text-Visual Prompt Synergy
Odyssey: Empowering Agents with Open-World Skills
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
A state-of-the-art open visual language model | multimodal pre-trained model
LoRA-Composer: Leveraging Low-Rank Adaptation for Multi-Concept Customization in Training-Free Diffusion Models
PixelLM is an effective and efficient LMM for pixel-level reasoning and understanding. PixelLM was accepted to CVPR 2024.
Layout analysis for Chinese and English documents
Dataset and Code for our ACL 2024 paper: "Multimodal Table Understanding". We propose the first large-scale Multimodal IFT and Pre-Train Dataset for table understanding and develop a generalist tab…
Align Anything: Training Any Modality Model with Feedback
A family of compressed models obtained via pruning and knowledge distillation
AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI
Official repository for the paper "MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning" (https://arxiv.org/abs/2406.17770).
Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
A minimal codebase for finetuning large multimodal models, supporting llava-1.5, qwen-vl, llava-interleave, llava-next-video, phi3-v etc.
Cambrian-1 is a family of multimodal LLMs with a vision-centric design.
An Open-source Toolkit for LLM Development
Official code for Paper "Mantis: Multi-Image Instruction Tuning"
Text-Guided Generation of Full-Body Image with Preserved Reference Face for Customized Animation
[CVPR 2024] Code release for "Unsupervised Universal Image Segmentation"
DeepSeek-VL: Towards Real-World Vision-Language Understanding
The official repo of Qwen-VL (通义千问-VL), a chat and pretrained large vision-language model proposed by Alibaba Cloud.
GPT4V-level open-source multi-modal model based on Llama3-8B