LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation

Tang, Jiaxiang; Chen, Zhaoxi; Chen, Xiaokang; Wang, Tengfei; Zeng, Gang; Liu, Ziwei

Computer Science > Computer Vision and Pattern Recognition

arXiv:2402.05054 (cs)

[Submitted on 7 Feb 2024]

Abstract: 3D content creation has achieved significant progress in terms of both quality and speed. Although current feed-forward models can produce 3D objects in seconds, their resolution is constrained by the intensive computation required during training. In this paper, we introduce Large Multi-View Gaussian Model (LGM), a novel framework designed to generate high-resolution 3D models from text prompts or single-view images. Our key insights are two-fold: 1) 3D Representation: We propose multi-view Gaussian features as an efficient yet powerful representation, which can then be fused together for differentiable rendering. 2) 3D Backbone: We present an asymmetric U-Net as a high-throughput backbone operating on multi-view images, which can be produced from text or single-view image input by leveraging multi-view diffusion models. Extensive experiments demonstrate the high fidelity and efficiency of our approach. Notably, we maintain the fast speed to generate 3D objects within 5 seconds while boosting the training resolution to 512, thereby achieving high-resolution 3D content generation.
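To make the "multi-view Gaussian features ... fused together" idea concrete, the following is a minimal sketch of one way such a fusion could work, not the authors' implementation. It assumes each view yields a per-pixel feature vector that decodes into one 3D Gaussian, and that fusion is a plain concatenation across views. The 14-channel layout, the activation functions, and the names `Gaussian`, `decode_pixel`, and `fuse_views` are all illustrative assumptions; the abstract does not specify any of them.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumed channel layout (not given in the abstract):
# 3 position + 1 opacity + 3 scale + 4 rotation quaternion + 3 RGB.
CHANNELS = 14


@dataclass
class Gaussian:
    position: Tuple[float, ...]
    opacity: float
    scale: Tuple[float, ...]
    rotation: Tuple[float, ...]  # unit quaternion
    rgb: Tuple[float, ...]


def _sigmoid(x: float) -> float:
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_pixel(feat: Sequence[float]) -> Gaussian:
    """Turn one raw 14-float feature vector into Gaussian parameters."""
    if len(feat) != CHANNELS:
        raise ValueError(f"expected {CHANNELS} channels, got {len(feat)}")
    quat = feat[7:11]
    # Fall back to 1.0 so an all-zero quaternion does not divide by zero.
    norm = math.sqrt(sum(q * q for q in quat)) or 1.0
    return Gaussian(
        position=tuple(feat[0:3]),
        opacity=_sigmoid(feat[3]),                    # assumed: squash to (0, 1)
        scale=tuple(math.exp(s) for s in feat[4:7]),  # assumed: keep positive
        rotation=tuple(q / norm for q in quat),       # normalize to unit length
        rgb=tuple(_sigmoid(c) for c in feat[11:14]),  # assumed: squash to (0, 1)
    )


def fuse_views(views: Sequence[Sequence[Sequence[Sequence[float]]]]) -> List[Gaussian]:
    """Concatenate per-pixel Gaussians from all views into one flat list.

    views[v][row][col] is the feature vector for one pixel of view v.
    """
    return [decode_pixel(px) for view in views for row in view for px in row]


# Usage: 4 views of 2x2 pixels each give 4 * 2 * 2 = 16 Gaussians.
pixel = [0.0] * CHANNELS
views = [[[pixel for _ in range(2)] for _ in range(2)] for _ in range(4)]
gaussians = fuse_views(views)
print(len(gaussians))
```

Under these assumptions the number of Gaussians grows with the number of views times the feature-map resolution, and the fused list is what a differentiable Gaussian renderer would consume.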

Comments:	Project page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2402.05054 [cs.CV]
	(or arXiv:2402.05054v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2402.05054

Submission history

From: Jiaxiang Tang [view email]
[v1] Wed, 7 Feb 2024 17:57:03 UTC (5,268 KB)
