add documentations
xiexh20 committed Mar 28, 2023
1 parent d2191c8 commit fb1011c
Showing 8 changed files with 286 additions and 8 deletions.
1 change: 1 addition & 0 deletions PATHS.yml
@@ -3,6 +3,7 @@
# Cite: CHORE: Contact, Human and Object REconstruction from a single RGB image, ECCV'2022

CODE: "/BS/xxie-4/work/VisTracker" # path to your project main folder
BEHAVE_ROOT: "/BS/xxie-5/static00/behave_release" # path to BEHAVE root
BEHAVE_PATH: "/BS/xxie-5/static00/behave_release/sequences" # path to the original behave path
EXTENDED_BEHAVE_PATH: '/BS/xxie-4/static00/test-seq' # root path where 30fps BEHAVE sequences are stored
GT_PACKED: "/scratch/inf0/user/xxie/behave-packed" # path where packed GT data is saved
105 changes: 102 additions & 3 deletions README.md
@@ -1,13 +1,112 @@
# VisTracker
Official implementation for the CVPR'23 paper
# VisTracker (CVPR'23)
#### Official implementation for the CVPR 2023 paper: Visibility Aware Human-Object Interaction Tracking from Single RGB Camera

Train a model:
[[ArXiv]](https://arxiv.org/abs/2204.02445) [[Project Page]](https://virtualhumans.mpi-inf.mpg.de/chore)

<p align="left">
<img src="https://datasets.d2.mpi-inf.mpg.de/cvpr23vistracker/teaser.png" alt="teaser" width="512"/>
</p>

## Contents
1. [Dependencies](#dependencies)
2. [Dataset preparation](#dataset-preparation)
3. [Run demo](#run-demo)
4. [Training](#training)
5. [Evaluation](#evaluation)
6. [Citation](#citation)
7. [License](#license)

## Dependencies
The code is tested with `torch 1.6`, `CUDA 10.1`, and `Debian 11`. The environment setup is the same as for CHORE (ECCV'22); please follow the instructions [here](https://github.com/xiexh20/CHORE#dependencies). A minimal setup sketch is shown below.
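
A minimal setup sketch (the environment name and exact package pins are illustrative assumptions; follow the CHORE instructions above for the authoritative list):
```shell
# assumed environment name and package versions, matching torch 1.6 + CUDA 10.1
conda create -n vistracker python=3.7 -y
conda activate vistracker
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
# install the remaining dependencies as listed in the CHORE repository
```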


## Dataset preparation
We work on the extended BEHAVE dataset. To have the dataset ready, you need to download several files and run a few processing scripts. All files are provided on [this webpage](https://virtualhumans.mpi-inf.mpg.de/behave/license.html). A consolidated example of the steps below is sketched after the list.

1. Download the video files: [color videos of test sequences](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/video/date03_color.tar), [frame time information](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/video/date03_time.tar).
2. Extract RGB images: follow [this script](https://github.com/xiexh20/behave-dataset#generate-images-from-raw-videos) from the BEHAVE dataset repo to extract RGB images. Please add the `-nodepth` flag to extract RGB images only. Example: `python tools/video2images.py /BS/xxie-3/static00/rawvideo/Date03/Date03_Sub03_chairwood_hand.0.color.mp4 /BS/xxie-4/static00/behave-fps30/ -nodepth`
3. Download human and object masks: [masks for all test sequences](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/masks/masks-date03.tar). Download and unzip them into one folder.
4. Rename the mask files to follow the BEHAVE dataset structure: `python tools/rename_masks.py -s SEQ_FOLDER -m MASK_ROOT` Example: `python tools/rename_masks.py -s /BS/xxie-4/static00/behave-fps30/Date03_Sub03_chairwood_hand -m /BS/xxie-5/static00/behave_release/30fps-masks-new/`
5. Download [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) and [FrankMocap](https://github.com/facebookresearch/frankmocap) detections: [packed data for test sequences](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/behave-packed-test-seqs.zip)
6. Process the packed data to BEHAVE dataset format: `python tools/pack2separate.py -s SEQ_FOLDER -p PACKED_ROOT`. Example: `python tools/pack2separate.py -s /BS/xxie-4/static00/behave-fps30/Date03_Sub03_chairwood_hand -p /scratch/inf0/user/xxie/behave-packed`
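
Putting the steps together, a minimal end-to-end sketch for one test sequence, using the example paths from the list above (replace them with your own):
```shell
# step 2: extract RGB frames from the color video (RGB only)
python tools/video2images.py /BS/xxie-3/static00/rawvideo/Date03/Date03_Sub03_chairwood_hand.0.color.mp4 \
    /BS/xxie-4/static00/behave-fps30/ -nodepth
# step 4: rename the downloaded masks to the BEHAVE structure
python tools/rename_masks.py -s /BS/xxie-4/static00/behave-fps30/Date03_Sub03_chairwood_hand \
    -m /BS/xxie-5/static00/behave_release/30fps-masks-new/
# step 6: unpack openpose and FrankMocap detections into per-frame files
python tools/pack2separate.py -s /BS/xxie-4/static00/behave-fps30/Date03_Sub03_chairwood_hand \
    -p /scratch/inf0/user/xxie/behave-packed
```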

## Run demo
You can find all the commands of the pipeline in `scripts/demo.sh`. To run it, you need to download the pretrained models from [here](https://datasets.d2.mpi-inf.mpg.de/cvpr23vistracker/models.zip) and unzip them in the folder `experiments`.

Also, the dataset files should be prepared as described above.

Once done, you can run the demo for one sequence simply by:
```shell
bash scripts/vistracker_pipeline.sh SEQ_FOLDER
```
Example: `bash scripts/vistracker_pipeline.sh /BS/xxie-4/static00/test-seq/Date03_Sub03_chairwood_hand`

It takes around 6-8 hours to process a sequence of 1500 frames (50 s).

Tip: the runtime bottlenecks are the SMPL-T pre-fitting (steps 1-2) and the joint optimization (step 6) in `scripts/demo.sh`. If you have a cluster with multiple GPU machines, you can run several jobs in parallel by specifying the `--start` and `--end` options for these commands. This splits one long sequence into several chunks, and each job optimizes only the chunk delimited by the start and end frames; a sketch is shown below.
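
A minimal sketch of such chunked jobs (the chunk size and the script name `demo_step.py` are placeholders; use the actual pre-fitting/optimization commands from `scripts/demo.sh`):
```shell
SEQ=/BS/xxie-4/static00/test-seq/Date03_Sub03_chairwood_hand
CHUNK=500  # assumed chunk size in frames
for START in 0 500 1000; do
    END=$((START + CHUNK))
    # each chunk can be submitted as a separate GPU job
    python demo_step.py -s "$SEQ" --start "$START" --end "$END" &
done
wait
```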

## Training
Train a SIF-Net model:
```shell
python -m torch.distributed.launch --nproc_per_node=NUM_GPU --master_port 6789 --use_env train_launch.py -en tri-vis-l2
```
Note that to train this model, you also need to prepare the GT registrations (meshes) in order to run online boundary sampling during training. We provide an example script that saves SMPL and object meshes from the packed parameters:
`python tools/pack2separate_params.py -s SEQ_FOLDER -p PACKED_PATH`, similar to `tools/pack2separate.py`; a loop over all training sequences is sketched below. The packed training data can be downloaded from [here (part1)](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/behave-packed-train-seqs-p1.zip) and [here (part2)](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/behave-packed-train-seqs-p2.zip).
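
A hedged sketch for exporting GT meshes of all training sequences (the sequence glob and root paths are assumptions; adapt them to where your extended BEHAVE sequences and packed files are stored):
```shell
PACKED=/scratch/inf0/user/xxie/behave-packed            # example path to the packed GT parameters
for SEQ in /BS/xxie-4/static00/behave-fps30/Date0*; do  # assumed location of the training sequences
    python tools/pack2separate_params.py -s "$SEQ" -p "$PACKED"
done
```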


Train the motion infilling model:
```shell
python -m torch.distributed.launch --nproc_per_node=NUM_GPU --master_port 6787 --use_env train_mfiller.py -en cmf-k4-lrot
```
For this, you need to specify the path to all packed GT files (see `GT_PACKED` in `PATHS.yml`).


## Evaluation
```shell
python recon/eval/evalvideo_packed.py -split splits/behave-test-30fps.json -sn RECON_NAME -m ours -w WINDOW_SIZE
```
where `RECON_NAME` is the save name of your reconstruction and `WINDOW_SIZE` is the alignment window size (main paper, Sec. 4). `WINDOW_SIZE=1` is equivalent to the evaluation used by CHORE; an example invocation is shown below.
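
For example, to evaluate a reconstruction saved under a hypothetical name `vistracker-test` with a 300-frame alignment window (both values are placeholders):
```shell
python recon/eval/evalvideo_packed.py -split splits/behave-test-30fps.json -sn vistracker-test -m ours -w 300
```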

## Citation
If you use our code, please cite:
```bibtex
@inproceedings{xie2023vistracker,
    title = {Visibility Aware Human-Object Interaction Tracking from Single RGB Camera},
    author = {Xie, Xianghui and Bhatnagar, Bharat Lal and Pons-Moll, Gerard},
    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2023}
}
```
If you use BEHAVE dataset, please also cite:
```bibtex
@inproceedings{bhatnagar22behave,
    title = {BEHAVE: Dataset and Method for Tracking Human Object Interactions},
    author = {Bhatnagar, Bharat Lal and Xie, Xianghui and Petrov, Ilya and Sminchisescu, Cristian and Theobalt, Christian and Pons-Moll, Gerard},
    booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {jun},
    organization = {{IEEE}},
    year = {2022},
}
```

## License
Copyright (c) 2023 Xianghui Xie, Max-Planck-Gesellschaft

Please read carefully the following terms and conditions and any accompanying documentation before you download and/or use this software and associated documentation files (the "Software").

The authors hereby grant you a non-exclusive, non-transferable, free of charge right to copy, modify, merge, publish, distribute, and sublicense the Software for the sole purpose of performing non-commercial scientific research, non-commercial education, or non-commercial artistic projects.

Any other use, in particular any use for commercial purposes, is prohibited. This includes, without limitation, incorporation in a commercial product, use in a commercial service, or production of other artefacts for commercial purposes.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

You understand and agree that the authors are under no obligation to provide either maintenance services, update services, notices of latent defects, or corrections of defects with regard to the Software. The authors nevertheless reserve the right to update, modify, or discontinue the Software at any time.

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. You agree to cite the **Visibility Aware Human-Object Interaction Tracking from Single RGB Camera** paper in documents and papers that report on research using this Software.






8 changes: 6 additions & 2 deletions behave/utils.py
@@ -9,6 +9,10 @@
import numpy as np
import os.path as osp
from psbody.mesh import Mesh
import yaml, sys
with open("PATHS.yml", 'r') as stream:
    paths = yaml.safe_load(stream)
BEHAVE_ROOT = paths['BEHAVE_ROOT']


def rotate_yaxis(R, t):
@@ -182,9 +186,9 @@ def load_scan_centered(scan_path, cent=True):
    return scan


def load_template(obj_name, cent=True, high_reso=False):
def load_template(obj_name, cent=True, high_reso=False, behave_path=BEHAVE_ROOT):
"load object template mesh given object name"
temp_path = get_template_path("/BS/xxie-5/static00/behave_release/objects", obj_name)
temp_path = get_template_path(behave_path+"/objects", obj_name)
if high_reso:
assert obj_name in _mesh_template.keys(), f'does not support high reso template for {obj_name}'
lowreso_temp = Mesh(filename=temp_path)
4 changes: 2 additions & 2 deletions lib_smpl/__init__.py
@@ -1,8 +1,8 @@
from .smplpytorch import SMPL_Layer
from .smpl_generator import SMPLHGenerator
from .wrapper_pytorch import SMPL_MODEL_ROOT


def get_smpl(gender, hands, model_root='/BS/xxie2020/static00/mysmpl/smplh'):
def get_smpl(gender, hands, model_root=SMPL_MODEL_ROOT):
"simple wrapper to get SMPL model"
return SMPL_Layer(model_root=model_root,
gender=gender, hands=hands)
2 changes: 1 addition & 1 deletion smoothnet/smooth_objrot.py
@@ -14,7 +14,7 @@
from smoothnet.utils.geometry_utils import rot6d_to_rotmat, rot6D_to_axis
from smoothnet.smooth_base import SmootherBase
import smoothnet.utils.geometry_utils as geom_utils
from behave.utils import load_template, load_configs_all
from behave.utils import load_template
from recon.pca_util import PCAUtil

import yaml
54 changes: 54 additions & 0 deletions tools/pack2separate.py
@@ -0,0 +1,54 @@
"""
packed openpose and mocap results to separate files in each frame in BEHAVE dataset format
mocap: only save the parameters, not meshes
keywords: pose, betas
openpose: save as json files
keywords: body_joints, face_joints, left_hand_joints, right_hand_joints
"""
import os, sys
sys.path.append(os.getcwd())
import joblib
import os.path as osp
from tqdm import tqdm
import json
from behave.frame_data import FrameDataReader


def pack2separate(args):
    reader = FrameDataReader(args.seq_folder)
    seq_name = reader.seq_name
    packed_data = joblib.load(osp.join(args.packed_path, f'{seq_name}_GT-packed.pkl'))
    assert len(packed_data['frames']) == len(reader), f'Warning: number of frames does not match for seq {seq_name}!'

    # save as separate openpose and mocap files
    for idx in tqdm(range(len(reader))):
        for kid in reader.kids:
            outfile = osp.join(reader.get_frame_folder(idx), f'k{kid}.mocap.json')
            if not osp.isfile(outfile):
                json.dump(
                    {
                        "pose": packed_data['mocap_poses'][idx, kid].tolist(),
                        "betas": packed_data['mocap_betas'][idx, kid].tolist(),
                    },
                    open(outfile, 'w')
                )
            outfile = osp.join(reader.get_frame_folder(idx), f'k{kid}.color.json')
            if not osp.isfile(outfile):
                json.dump(
                    {
                        "body_joints": packed_data["joints2d"][idx, kid].tolist(),
                    },
                    open(outfile, 'w')
                )
    print("all done")


if __name__ == '__main__':
    from argparse import ArgumentParser
    parser = ArgumentParser()
    parser.add_argument('-s', '--seq_folder')
    parser.add_argument('-p', '--packed_path',
                        default="/scratch/inf0/user/xxie/behave-packed")  # root path to all packed files

    args = parser.parse_args()

    pack2separate(args)
70 changes: 70 additions & 0 deletions tools/pack2separate_params.py
@@ -0,0 +1,70 @@
"""
load SMPL and object parameters from packed pkl file (extended BEHAVE data)
save SMPL and object registration (meshes) in BEHAVE data format
"""
import os, sys

import numpy as np
import torch

sys.path.append(os.getcwd())
import joblib
import os.path as osp
from tqdm import tqdm
import json
from psbody.mesh import Mesh
from scipy.spatial.transform import Rotation

from behave.frame_data import FrameDataReader
from lib_smpl import get_smpl
from behave.utils import load_template


def pack2separate_params(args):
    reader = FrameDataReader(args.seq_folder)
    seq_name = reader.seq_name
    smpl_name, obj_name = "fit03", 'fit01-smooth'
    obj_cat = reader.seq_info.get_obj_name(True)

    packed_data = joblib.load(osp.join(args.packed_path, f'{seq_name}_GT-packed.pkl'))
    assert len(packed_data['frames']) == len(reader), f'Warning: number of frames does not match for seq {seq_name}!'

    temp = load_template(reader.seq_info.get_obj_name())
    smplh_layer = get_smpl(reader.seq_info.get_gender(), True)
    faces = smplh_layer.faces.copy()
    for idx in tqdm(range(0, len(reader), args.interval)):
        # object
        outfile = osp.join(reader.get_frame_folder(idx), obj_cat, obj_name, f'{obj_cat}_fit.ply')
        if not osp.isfile(outfile):
            os.makedirs(osp.dirname(outfile), exist_ok=True)
            angle, trans = packed_data['obj_angles'][idx], packed_data['obj_trans'][idx]
            rot = Rotation.from_rotvec(angle).as_matrix()
            obj_fit = np.matmul(temp.v, rot.T) + trans
            Mesh(obj_fit, temp.f).write_ply(outfile)

        # SMPL
        outfile = osp.join(reader.get_frame_folder(idx), 'person', smpl_name, 'person_fit.ply')
        if not osp.isfile(outfile):
            os.makedirs(osp.dirname(outfile), exist_ok=True)
            verts, _, _, _ = smplh_layer(torch.from_numpy(packed_data['poses'][idx:idx+1]),
                                         torch.from_numpy(packed_data['betas'][idx:idx+1]),
                                         torch.from_numpy(packed_data['trans'][idx:idx+1]))
            verts = verts[0].cpu().numpy()
            Mesh(verts, faces).write_ply(outfile)
    print("all done")


if __name__ == '__main__':
    from argparse import ArgumentParser

    parser = ArgumentParser()
    parser.add_argument('-s', '--seq_folder')
    parser.add_argument('-p', '--packed_path',
                        default="/scratch/inf0/user/xxie/behave-packed")  # root path to all packed files
    parser.add_argument('-i', '--interval', default=30, type=int,
                        help="interval between two saved frames, if set to 1, save for all frames")

    args = parser.parse_args()

    pack2separate_params(args)
50 changes: 50 additions & 0 deletions tools/rename_masks.py
@@ -0,0 +1,50 @@
"""
rename and move the human and object masks to BEHAVE dataset structure
"""

import os, sys
sys.path.append(os.getcwd())
import joblib
import os.path as osp
from tqdm import tqdm
import json
from glob import glob
from behave.frame_data import FrameDataReader


def rename_masks(args):
    reader = FrameDataReader(args.seq_folder)
    seq_name = reader.seq_name
    mask_path = osp.join(args.mask_path, seq_name)
    ps_files = glob(osp.join(mask_path, 't*k1.person_mask.png'))
    obj_files = glob(osp.join(mask_path, 't*k1.obj_rend_mask.png'))
    assert len(ps_files) == len(obj_files), 'the number of mask files does not match!'
    assert len(ps_files) == len(reader), 'the number of frames between mask and RGB images does not match!'

    files_all = glob(osp.join(mask_path, 't*.png'))
    for file in tqdm(files_all):
        fname = osp.join(args.seq_folder, *osp.basename(file).split("-"))
        if osp.isfile(fname):
            continue
        cmd = f'mv {file} {fname}'
        os.system(cmd)
    print("all done!")



if __name__ == '__main__':
    from argparse import ArgumentParser

    parser = ArgumentParser()
    parser.add_argument('-s', '--seq_folder')
    parser.add_argument('-m', '--mask_path',
                        default="/scratch/inf0/user/xxie/behave-packed")  # root path to all mask files

    args = parser.parse_args()

    rename_masks(args)
