Xiangtai Li lxtGH

💬

At home

Work in Computer Vision, Deep Learning and Multi-Modal Models.

574 followers · 254 following

Stars

viiika / Meissonic

Inference Code of Meissonic

Python · 94 stars · 1 fork · Updated Oct 16, 2024

HarborYuan / ovsam

[ECCV 2024] The official code for the paper "Open-Vocabulary SAM".

Python · 925 stars · 27 forks · Updated Jul 31, 2024

chongzhou96 / EdgeSAM

Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"

Jupyter Notebook · 919 stars · 41 forks · Updated Aug 12, 2024

yyyujintang / PredFormer

Official PyTorch Code for Paper: PredFormer: Transformers Are Effective Spatial-Temporal Predictive Learners

30 stars · 1 fork · Updated Oct 7, 2024

MIMAFace2024 / MIMAFace

Python · 12 stars · Updated Sep 23, 2024

VectorSpaceLab / OmniGen

701 stars · 13 forks · Updated Sep 18, 2024

apple / ml-mgie

Python · 3,847 stars · 252 forks · Updated Mar 15, 2024

hkchengrex / vos-benchmark

Fast and general video object segmentation evaluation.

Python · 27 stars · 4 forks · Updated Jan 30, 2024

baaivision / Emu3

Next-Token Prediction is All You Need

Python · 972 stars · 26 forks · Updated Oct 8, 2024

open-compass / VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks

Python · 1,182 stars · 165 forks · Updated Oct 16, 2024

QwenLM / Qwen2-VL

Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.

Python · 2,637 stars · 150 forks · Updated Oct 4, 2024

yangjiheng / 3DGS_and_Beyond_Docs

A collective repository for all 3DGS-related progress in research and industry

342 stars · 15 forks · Updated Oct 2, 2024

hithqd / PointRWKV

Python · 46 stars · 4 forks · Updated Sep 11, 2024

Yaziwel / Awesome-RWKV-in-Vision

A curated list of papers on the applications of RWKV in computer vision.

103 stars · 3 forks · Updated Oct 2, 2024

THUDM / GLM-4

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python · 4,900 stars · 405 forks · Updated Oct 16, 2024

lxtGH / OMG-Seg

OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python · 1,260 stars · 47 forks · Updated Oct 2, 2024

Jiahao000 / MosaicFusion

[IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation

Python · 112 stars · 3 forks · Updated Oct 8, 2024

zhang-tao-whu / DVIS_Plus

Python · 90 stars · 6 forks · Updated Jul 4, 2024

FaceAdapter / Face-Adapter

Python · 320 stars · 22 forks · Updated May 27, 2024

jianzongwu / MotionBooth

[NeurIPS 2024 Spotlight] The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"

Python · 102 stars · 7 forks · Updated Oct 8, 2024

showlab / Awesome-MLLM-Hallucination

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

407 stars · 11 forks · Updated Oct 10, 2024

NVIDIA / Megatron-Energon

Megatron's multi-modal data loader

Python · 110 stars · 7 forks · Updated Oct 12, 2024

OpenBMB / MiniCPM-V

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python · 12,264 stars · 858 forks · Updated Oct 16, 2024

epfLLM / Megatron-LLM

distributed trainer for LLMs

Python · 533 stars · 76 forks · Updated May 20, 2024

THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python · 8,065 stars · 757 forks · Updated Oct 14, 2024

NVIDIA / Megatron-LM

Ongoing research training transformer models at scale

Python · 10,287 stars · 2,309 forks · Updated Oct 16, 2024

Stability-AI / stable-fast-3d

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

Python · 1,095 stars · 105 forks · Updated Oct 10, 2024

black-forest-labs / flux

Official inference repo for FLUX.1 models

Python · 14,960 stars · 1,077 forks · Updated Oct 8, 2024

hughplay / Visual-Reasoning-Papers

📄 A curated list of visual reasoning papers.

TeX · 20 stars · 2 forks · Updated Oct 1, 2024

xushilin1 / RAP-SAM

Python · 208 stars · 10 forks · Updated Jun 28, 2024

Lists (3)

🔮 Future ideas

✨ Inspiration

🚀 My stack