Official Pytorch & Diffusers implementation of the paper:
Text-Guided Texturing by Synchronized Multi-View Diffusion
SyncMVD can generate texture for a 3D object from a text prompt using a Synchronized Multi-View Diffusion approach. The method shares the denoised content among different views in each denoising step to ensure texture consistency and avoid seams and fragmentation (fig a).
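The sharing step can be pictured with a toy sketch (this illustrates only the general idea of blending per-view content in a shared UV texture; it is not the paper's actual synchronization operator): each view writes its denoised content into the shared texture through its UV mapping, and overlapping texels are averaged so every view's next denoising step starts from mutually consistent content.

```python
import numpy as np

# Toy illustration of view synchronization (not the paper's exact method):
# two "views" write into a shared UV texture through overlapping UV regions,
# and overlapping texels are blended by averaging.
texture = np.zeros((8, 8))
weight = np.zeros((8, 8))

views = [np.ones((4, 4)), np.full((4, 4), 3.0)]  # per-view denoised content
uv_maps = [(slice(0, 4), slice(0, 4)), (slice(2, 6), slice(2, 6))]  # overlap in [2:4, 2:4]

for view, uv in zip(views, uv_maps):
    texture[uv] += view
    weight[uv] += 1.0

# Divide accumulated content by the number of contributing views per texel
texture = np.where(weight > 0, texture / np.maximum(weight, 1.0), 0.0)

print(texture[0, 0])  # 1.0 -> seen only by view 0
print(texture[3, 3])  # 2.0 -> overlap texel: average of 1.0 and 3.0
```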
The program is developed and tested on Linux systems with an Nvidia GPU. If you encounter compatibility issues on Windows, consider using WSL.
To install, first clone the repository and install the basic dependencies:

```shell
git clone https://github.com/LIU-Yuxin/SyncMVD.git
cd SyncMVD
conda create -n syncmvd python=3.8
conda activate syncmvd
pip install -r requirements.txt
```
Then install PyTorch3D through the following URL (change the respective Python, CUDA and PyTorch versions in the link to match the binary compatible with your setup), or install according to the official installation guide:

```shell
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu117_pyt200/download.html
```
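If you are unsure which wheel tag matches your setup, the index URL can be assembled from your version strings (`pytorch3d_wheel_url` is a hypothetical convenience helper, not part of the repo or of PyTorch3D):

```python
# Hypothetical helper to build the PyTorch3D wheel index URL from your
# Python, CUDA and PyTorch versions (not part of SyncMVD itself).
def pytorch3d_wheel_url(py: str, cuda: str, torch_version: str) -> str:
    # tags look like py38_cu117_pyt200: dots are simply dropped
    tag = (
        f"py{py.replace('.', '')}"
        f"_cu{cuda.replace('.', '')}"
        f"_pyt{torch_version.replace('.', '')}"
    )
    return f"https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/{tag}/download.html"

# The setup used above: Python 3.8, CUDA 11.7, PyTorch 2.0.0
print(pytorch3d_wheel_url("3.8", "11.7", "2.0.0"))
```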
The pretrained models will be downloaded automatically on demand, including:
- runwayml/stable-diffusion-v1-5
- lllyasviel/control_v11f1p_sd15_depth
- lllyasviel/control_v11p_sd15_normalbae
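If you prefer to fetch the models ahead of time (for example on a machine that will later run offline), you can pre-populate the local Hugging Face cache; the sketch below uses `huggingface_hub.snapshot_download`, with the download itself commented out to keep the snippet network-free:

```python
# Model repos that SyncMVD downloads on first run (as listed above)
PRETRAINED_REPOS = [
    "runwayml/stable-diffusion-v1-5",
    "lllyasviel/control_v11f1p_sd15_depth",
    "lllyasviel/control_v11p_sd15_normalbae",
]

# To pre-fetch them into the local Hugging Face cache, uncomment:
# from huggingface_hub import snapshot_download
# for repo in PRETRAINED_REPOS:
#     snapshot_download(repo)

print(len(PRETRAINED_REPOS))  # 3
```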
The program, based on the PyTorch3D library, requires an input .obj mesh with a .mtl material and related textures in order to read the object's original UV mapping, which may require manual cleaning. Alternatively, the program supports automatic unwrapping based on XAtlas to load meshes that do not meet the above requirements. The program can also load .glb meshes, but this may not be stable as it is an experimental PyTorch3D feature.
To avoid unexpected artifacts, the mesh being textured should be free of flipped face normals and overlapping UVs, and should keep the number of triangle faces to around 40,000 or fewer. You can use Blender for manual mesh cleaning and processing, or its Python scripting API for automation.
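As a quick sanity check before texturing, you can count the faces in an .obj file directly (a minimal sketch; it assumes the mesh is already triangulated and counts every `f` line as one face):

```python
# Minimal sketch: count faces in Wavefront .obj text by counting "f " lines.
# Assumes the mesh is triangulated; quads/n-gons would each count as one face
# here but should be triangulated (e.g. in Blender) before texturing.
def count_faces(obj_text: str) -> int:
    return sum(1 for line in obj_text.splitlines() if line.startswith("f "))

sample_obj = """\
v 0 0 0
v 1 0 0
v 0 1 0
f 1 2 3
"""
n = count_faces(sample_obj)
print(n)  # 1
print(n <= 40_000)  # True -> within the suggested face budget
```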
You can try out the method with the following pre-processed meshes and configs:
- Face - "Portrait photo of Kratos, god of war." (by 2on)
- Sneaker - "A photo of a camouflage military boot." (by gianpego)
```shell
python run_experiment.py --config {your config}.yaml
```
Refer to config.py for the list of arguments and settings you can adjust. You can change these settings by including them in a .yaml config file or by passing the corresponding arguments on the command line; values specified on the command line override those in the config file.
When no output path is specified, the generated result will be placed in the same folder as the config file by default.
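The precedence rule can be sketched as follows (`merge_settings` and the setting names are illustrative; SyncMVD's actual config handling lives in config.py and may differ):

```python
# Illustrative sketch of the precedence described above; SyncMVD's real
# config loading is in config.py and may work differently.
def merge_settings(yaml_cfg: dict, cli_args: dict) -> dict:
    # start from the YAML config, then let command-line values override it;
    # None marks a CLI argument the user did not supply
    merged = dict(yaml_cfg)
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged

yaml_cfg = {"prompt": "A photo of a camouflage military boot.", "steps": 30}
cli_args = {"prompt": None, "steps": 50}
print(merge_settings(yaml_cfg, cli_args))
# {'prompt': 'A photo of a camouflage military boot.', 'steps': 50}
```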
The program is licensed under the MIT License.
```bibtex
@article{liu2023text,
  title={Text-Guided Texturing by Synchronized Multi-View Diffusion},
  author={Liu, Yuxin and Xie, Minshan and Liu, Hanyuan and Wong, Tien-Tsin},
  journal={arXiv preprint arXiv:2311.12891},
  year={2023}
}
```