- Singapore
- https://lxtgh.github.io/
- @xtl994
Stars
[ECCV 2024] The official code for the paper "Open-Vocabulary SAM".
Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"
Official PyTorch Code for Paper: PredFormer: Transformers Are Effective Spatial-Temporal Predictive Learners
Fast and general video object segmentation evaluation.
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting ~100 VLMs and 40+ benchmarks
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A collective repository for all 3DGS-related progress in research and industry
A curated list of papers on the applications of RWKV in computer vision.
GLM-4 series: Open Multilingual Multimodal Chat LMs
OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]
[IJCV 2024] MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation
[NeurIPS 2024 Spotlight] The official implementation of the research paper "MotionBooth: Motion-Aware Customized Text-to-Video Generation"
📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Ongoing research training transformer models at scale
SF3D: Stable Fast 3D Mesh Reconstruction with UV-unwrapping and Illumination Disentanglement
Official inference repo for FLUX.1 models
📄 A curated list of visual reasoning papers.