lxtGH
Inference Code of Meissonic

Python 94 1 Updated Oct 16, 2024

[ECCV 2024] Official code for the paper "Open-Vocabulary SAM".

Python 925 27 Updated Jul 31, 2024

Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"

Jupyter Notebook 919 41 Updated Aug 12, 2024

Official PyTorch Code for Paper: PredFormer: Transformers Are Effective Spatial-Temporal Predictive Learners

30 1 Updated Oct 7, 2024
Python 12 Updated Sep 23, 2024
Python 3,847 252 Updated Mar 15, 2024

Fast and general video object segmentation evaluation.

Python 27 4 Updated Jan 30, 2024

Next-Token Prediction is All You Need

Python 972 26 Updated Oct 8, 2024

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks

Python 1,182 165 Updated Oct 16, 2024

Qwen2-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.

Python 2,637 150 Updated Oct 4, 2024

A collective repository for all 3DGS-related progress in research and industry

342 15 Updated Oct 2, 2024
Python 46 4 Updated Sep 11, 2024

A curated list of papers on the applications of RWKV in computer vision.

103 3 Updated Oct 2, 2024

GLM-4 series: Open Multilingual Multimodal Chat LMs

Python 4,900 405 Updated Oct 16, 2024

OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,260 47 Updated Oct 2, 2024

[IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation

Python 112 3 Updated Oct 8, 2024
Python 90 6 Updated Jul 4, 2024
Python 320 22 Updated May 27, 2024

[NeurIPS 2024 Spotlight] The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"

Python 102 7 Updated Oct 8, 2024

📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

407 11 Updated Oct 10, 2024

Megatron's multi-modal data loader

Python 110 7 Updated Oct 12, 2024

MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

Python 12,264 858 Updated Oct 16, 2024

Distributed trainer for LLMs

Python 533 76 Updated May 20, 2024

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 8,065 757 Updated Oct 14, 2024

Ongoing research training transformer models at scale

Python 10,287 2,309 Updated Oct 16, 2024

SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement

Python 1,095 105 Updated Oct 10, 2024

Official inference repo for FLUX.1 models

Python 14,960 1,077 Updated Oct 8, 2024

📄 A curated list of visual reasoning papers.

TeX 20 2 Updated Oct 1, 2024
Python 208 10 Updated Jun 28, 2024