Code and data release for the paper MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes (3DV 2022, Best Student Paper Award :3).
Our code is tested on python==3.8.2 with pytorch==1.7.1, open3d==0.11.2 and trimesh==3.9.20.
Other fundamental packages: numpy, scipy, opencv, matplotlib, sklearn, tqdm, json and pickle (the last two ship with the Python standard library).
Task-specific packages: smplx, detectron2, pyembree (optional; speeds up raycasting by ~50x) and ChamferDistancePytorch.
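For reference, a minimal requirements.txt sketch of such an environment; the PyPI names are assumptions (pytorch ships as torch, opencv as opencv-python, sklearn as scikit-learn), and detectron2, pyembree and ChamferDistancePytorch are best installed from their own repositories or conda channels:

torch==1.7.1
open3d==0.11.2
trimesh==3.9.20
numpy
scipy
opencv-python
matplotlib
scikit-learn
tqdm
smplx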
Details can be found here; please download the files as instructed to run the experiments.
MoCapDeform utilises several existing methods such as smplify-x, PROX, POSA and PointRend.
We provide all necessary models in our models folder; please download them as instructed here.
All the models are provided by the original repositories.
We utilise smplify-x for initialisation.
The results are stored at dataset/subject/RGB.
As the smplify-x optimisation does not always converge, we store the converged frame numbers in dataset/subject/avail_frms.npy.
To generate the smplify-x results, we first obtain OpenPose 2D keypoint detections, which are stored at dataset/subject/keypoints.
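A minimal sketch for iterating over the converged frames, assuming avail_frms.npy holds an integer array of frame numbers (the RGB file naming below is hypothetical):

import numpy as np

# Frame numbers for which the smplify-x optimisation converged.
avail_frms = np.load('dataset/subject/avail_frms.npy')
for frm in avail_frms:
    rgb_path = f'dataset/subject/RGB/{int(frm):06d}.jpg'  # hypothetical naming scheme
    # ... load the frame, its keypoints and the smplify-x initialisation here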
For PROX and POSA, please refer to their code.
First generate body-centric human contacts with POSA:
python gen_contacts.py --config posa_contacts/contact.yaml
Then generate tight human masks with PointRend:
python gen_mask.py
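gen_mask.py wraps PointRend; a stand-alone sketch of such instance-mask inference with detectron2 looks roughly as below, where the config and checkpoint paths are assumptions (use the PointRend model referenced in our models folder):

import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.projects import point_rend

cfg = get_cfg()
point_rend.add_pointrend_config(cfg)
# Paths below are assumptions; substitute the PointRend config/checkpoint you downloaded.
cfg.merge_from_file('configs/pointrend_rcnn_R_50_FPN_3x_coco.yaml')
cfg.MODEL.WEIGHTS = 'models/pointrend/model_final.pkl'
predictor = DefaultPredictor(cfg)

img = cv2.imread('dataset/subject/RGB/000001.jpg')  # hypothetical frame name
instances = predictor(img)['instances']
person = instances[instances.pred_classes == 0]  # COCO class 0 = person
masks = person.pred_masks.cpu().numpy()          # one binary mask per detected person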
Next, get scene contacts by raycasting:
python gen_raycast.py
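The raycasting can be reproduced with trimesh (with pyembree installed, trimesh picks it up automatically and the queries run much faster); a minimal sketch, where the scene path, camera placement and contact-vertex file are assumptions:

import numpy as np
import trimesh

scene = trimesh.load('dataset/subject/scene.obj', force='mesh')  # hypothetical scene mesh path
cam_origin = np.zeros(3)  # assumption: scene given in camera coordinates
contact_verts = np.load('dataset/subject/posa_contacts.npy')  # hypothetical (N, 3) POSA contact vertices
dirs = contact_verts - cam_origin
dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
origins = np.tile(cam_origin, (len(dirs), 1))
# Intersect one ray per contact vertex with the scene mesh.
locations, ray_ids, tri_ids = scene.ray.intersects_location(origins, dirs)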
Finally, run the stage 2 optimisation:
python optimise-stage2.py
Optimised human poses are then stored at dataset/subject/stage2.npy.
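A quick way to inspect the stored result; the file layout (plain parameter array vs. pickled dictionary) is an assumption, hence allow_pickle=True:

import numpy as np

stage2 = np.load('dataset/subject/stage2.npy', allow_pickle=True)
print(stage2.dtype, stage2.shape)  # check the layout before consuming it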
For the stage 3 optimisation, simply run:
python optimise-stage3.py
The optimised human pose and scene deformation are stored at dataset/subject/stage3.npy.
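For a quick sanity check, the scene can be visualised with open3d; the mesh path and the layout of stage3.npy are assumptions:

import numpy as np
import open3d as o3d

result = np.load('dataset/subject/stage3.npy', allow_pickle=True)  # pose + deformation (layout assumed)
scene = o3d.io.read_triangle_mesh('dataset/subject/scene.obj')     # hypothetical scene mesh path
# If the deformation is stored as per-vertex offsets (an assumption), it could be applied via:
# scene.vertices = o3d.utility.Vector3dVector(np.asarray(scene.vertices) + offsets)
scene.compute_vertex_normals()
o3d.visualization.draw_geometries([scene])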
Note that the MoCapDeform dataset assumes all furniture in the scenes is deformable. To run on other datasets such as PROX, or on your own data, you may want to perform 3D semantic segmentation first to determine rigidity flags; in the experiments in our paper we adopt the trained VMNet for this segmentation.
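A minimal sketch of turning per-vertex semantic labels into rigidity flags; the class IDs, names and the deformable set below are all hypothetical, since VMNet's label space depends on its training data:

import numpy as np

# Hypothetical mapping from class IDs to names, and a hypothetical deformable set.
id_to_name = {0: 'wall', 1: 'floor', 2: 'sofa', 3: 'bed', 4: 'chair'}
DEFORMABLE_CLASSES = {'sofa', 'bed', 'chair'}
labels = np.load('dataset/subject/scene_semantics.npy')  # hypothetical per-vertex class IDs
rigid = np.array([id_to_name[int(l)] not in DEFORMABLE_CLASSES for l in labels])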
If you find our work useful, please kindly cite:
@inproceedings{Li_3DV2022,
  title = {MoCapDeform: Monocular 3D Human Motion Capture in Deformable Scenes},
  author = {Zhi Li and Soshi Shimada and Bernt Schiele and Christian Theobalt and Vladislav Golyanik},
  booktitle = {International Conference on 3D Vision (3DV)},
  year = {2022}
}
For questions or clarifications, please get in touch with:
Zhi Li [email protected]
Soshi Shimada [email protected]
Vladislav Golyanik [email protected]