add documentations

xiexh20 · Mar 28, 2023 · commit fb1011c
1 parent d2191c8

Showing 8 changed files with 286 additions and 8 deletions.
diff --git a/PATHS.yml b/PATHS.yml
@@ -3,6 +3,7 @@
 # Cite: CHORE: Contact, Human and Object REconstruction from a single RGB image, ECCV'2022
 
 CODE: "/BS/xxie-4/work/VisTracker" # path to your project main folder
+BEHAVE_ROOT: "/BS/xxie-5/static00/behave_release" # path to BEHAVE root
 BEHAVE_PATH: "/BS/xxie-5/static00/behave_release/sequences" # path to the original behave path
 EXTENDED_BEHAVE_PATH: '/BS/xxie-4/static00/test-seq' # root path where 30fps BEHAVE sequences are stored
 GT_PACKED: "/scratch/inf0/user/xxie/behave-packed" # path where packed GT data is saved

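Every entry in `PATHS.yml` is machine-specific, so a quick check before running any script can catch stale paths. The sketch below is a minimal, stdlib-only illustration: it parses only the flat `KEY: "value" # comment` layout shown in this hunk (it is not a general YAML parser), and `parse_flat_paths` and `missing_paths` are hypothetical helper names that are not part of the repository.

```python
import os
import shlex


def parse_flat_paths(text):
    """Parse flat `KEY: "value"  # comment` lines; NOT a general YAML parser."""
    paths = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        key, _, rest = stripped.partition(":")
        # shlex handles both quote styles and drops the trailing `# comment`.
        tokens = shlex.split(rest, comments=True)
        if tokens:
            paths[key.strip()] = tokens[0]
    return paths


def missing_paths(paths):
    """Return the keys whose configured path does not exist on this machine."""
    return sorted(k for k, p in paths.items() if not os.path.exists(p))


example = '''
# Cite: ...
CODE: "/BS/xxie-4/work/VisTracker" # path to your project main folder
BEHAVE_ROOT: "/BS/xxie-5/static00/behave_release" # path to BEHAVE root
EXTENDED_BEHAVE_PATH: '/BS/xxie-4/static00/test-seq' # root path
'''
parsed = parse_flat_paths(example)
print(parsed["BEHAVE_ROOT"])
print(missing_paths(parsed))
```

Any key reported by `missing_paths` would need to be edited to a location that exists on the local machine.
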
diff --git a/README.md b/README.md
@@ -1,13 +1,112 @@
-# VisTracker
-Official implementation for the CVPR'23 paper 
+# VisTracker (CVPR'23)
+#### Official implementation for the CVPR 2023 paper: Visibility Aware Human-Object Interaction Tracking from Single RGB Camera
 
-Train a model:
+[[ArXiv]](https://arxiv.org/abs/2204.02445) [[Project Page]](https://virtualhumans.mpi-inf.mpg.de/chore)
+
+<p align="left">
+<img src="https://datasets.d2.mpi-inf.mpg.de/cvpr23vistracker/teaser.png" alt="teaser" width="512"/>
+</p>
+
+## Contents 
+1. [Dependencies](#dependencies)
+2. [Dataset preparation](#dataset-preparation)
+3. [Run demo](#run-demo)
+4. [Training](#training)
+5. [Evaluation](#evaluation)
+6. [Citation](#citation)
+7. [License](#license)
+
+## Dependencies
+The code is tested with `torch 1.6, cuda10.1, debian 11`. The environment setup is the same as CHORE, ECCV'22. Please follow the instructions [here](https://github.com/xiexh20/CHORE#dependencies). 
+
+
+## Dataset preparation
+We work on the extended BEHAVE dataset. To prepare it, you need to download several files and run a few processing scripts. All files are provided on [this webpage](https://virtualhumans.mpi-inf.mpg.de/behave/license.html).
+
+1. Download the video files: [color videos of test sequences](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/video/date03_color.tar), [frame time information](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/video/date03_time.tar). 
+2. Extract RGB images: follow [this script](https://github.com/xiexh20/behave-dataset#generate-images-from-raw-videos) from the BEHAVE dataset repo. Pass the `-nodepth` flag to extract RGB images only. Example: `python tools/video2images.py /BS/xxie-3/static00/rawvideo/Date03/Date03_Sub03_chairwood_hand.0.color.mp4 /BS/xxie-4/static00/behave-fps30/ -nodepth`
+3. Download human and object masks: [masks for all test sequences](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/masks/masks-date03.tar). Download and unzip them into one folder. 
+4. Rename the mask files to follow the BEHAVE dataset structure: `python tools/rename_masks.py -s SEQ_FOLDER -m MASK_ROOT`. Example: `python tools/rename_masks.py -s /BS/xxie-4/static00/behave-fps30/Date03_Sub03_chairwood_hand -m /BS/xxie-5/static00/behave_release/30fps-masks-new/`
+5. Download [openpose](https://github.com/CMU-Perceptual-Computing-Lab/openpose) and [FrankMocap](https://github.com/facebookresearch/frankmocap) detections: [packed data for test sequences](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/behave-packed-test-seqs.zip).
+6. Convert the packed data to the BEHAVE dataset format: `python tools/pack2separate.py -s SEQ_FOLDER -p PACKED_ROOT`. Example: `python tools/pack2separate.py -s /BS/xxie-4/static00/behave-fps30/Date03_Sub03_chairwood_hand -p /scratch/inf0/user/xxie/behave-packed`
+
+## Run demo 
+You can find all the commands of the pipeline in `scripts/demo.sh`. To run it, you need to download the pretrained models from [here](https://datasets.d2.mpi-inf.mpg.de/cvpr23vistracker/models.zip) and unzip them in the folder `experiments`. 
+
+Also, the dataset files should be prepared as described above. 
+
+Once done, you can run the demo for one sequence simply by:
+```shell
+bash scripts/vistracker_pipeline.sh SEQ_FOLDER 
+```
+example: `bash scripts/vistracker_pipeline.sh /BS/xxie-4/static00/test-seq/Date03_Sub03_chairwood_hand`
+
+It takes around 6 to 8 hours to process a sequence of 1500 frames (50 s of video at 30 fps).
+
+Tip: the runtime bottlenecks are the SMPL-T pre-fitting (steps 1-2) and the joint optimization (step 6) in `scripts/demo.sh`. If you have a cluster with multiple GPU machines, you can parallelize these steps by specifying the `--start` and `--end` options for these commands. This splits one long sequence into several chunks, and each job optimizes only the chunk between its start and end frames.
+
+## Training 
+Train a SIF-Net model:
 ```shell
 python -m torch.distributed.launch --nproc_per_node=NUM_GPU --master_port 6789 --use_env train_launch.py -en tri-vis-l2
 ```
+Note that to train this model, you also need to prepare the GT registrations (meshes) in order to run online boundary sampling during training. We provide an example script to save SMPL and object meshes from packed parameters:
+`python tools/pack2separate_params.py -s SEQ_FOLDER -p PACKED_PATH`, which works similarly to `tools/pack2separate.py`. The packed training data for this can be downloaded from [here (part1)](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/behave-packed-train-seqs-p1.zip) and [here (part2)](https://datasets.d2.mpi-inf.mpg.de/cvpr22behave/behave-packed-train-seqs-p2.zip).
+
 
 Train  motion infill model:
 ```shell
 python -m torch.distributed.launch --nproc_per_node=NUM_GPU --master_port 6787 --use_env train_mfiller.py -en cmf-k4-lrot
 ```
+For this, you need to specify the path to all packed GT files. 
+
+
+## Evaluation
+```shell
+python recon/eval/evalvideo_packed.py -split splits/behave-test-30fps.json -sn RECON_NAME -m ours -w WINDOW_SIZE
+```
+where `RECON_NAME` is your own save name for the reconstruction, and `WINDOW_SIZE` is the alignment window size (main paper Sec. 4). `WINDOW_SIZE=1` is equivalent to the evaluation used by CHORE. 
+
+## Citation
+If you use our code, please cite:
+```bibtex
+@inproceedings{xie2023vistracker,
+    title = {Visibility Aware Human-Object Interaction Tracking from Single RGB Camera},
+    author = {Xie, Xianghui and Bhatnagar, Bharat Lal and Pons-Moll, Gerard},
+    booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
+    month = {June},
+    year = {2023}
+}
+```
+If you use the BEHAVE dataset, please also cite:
+```bibtex
+@inproceedings{bhatnagar22behave,
+    title = {BEHAVE: Dataset and Method for Tracking Human Object Interactions},
+    author={Bhatnagar, Bharat Lal and Xie, Xianghui and Petrov, Ilya and Sminchisescu, Cristian and Theobalt, Christian and Pons-Moll, Gerard},
+    booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition (CVPR)},
+    month = {jun},
+    organization = {{IEEE}},
+    year = {2022},
+    }
+```
+
+## License
+Copyright (c) 2023 Xianghui Xie, Max-Planck-Gesellschaft
+
+Please read carefully the following terms and conditions and any accompanying documentation before you download and/or use this software and associated documentation files (the "Software").
+
+The authors hereby grant you a non-exclusive, non-transferable, free of charge right to copy, modify, merge, publish, distribute, and sublicense the Software for the sole purpose of performing non-commercial scientific research, non-commercial education, or non-commercial artistic projects.
+
+Any other use, in particular any use for commercial purposes, is prohibited. This includes, without limitation, incorporation in a commercial product, use in a commercial service, or production of other artefacts for commercial purposes.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
+
+You understand and agree that the authors are under no obligation to provide either maintenance services, update services, notices of latent defects, or corrections of defects with regard to the Software. The authors nevertheless reserve the right to update, modify, or discontinue the Software at any time.
+
+The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. You agree to cite the **Visibility Aware Human-Object Interaction Tracking from Single RGB Camera** paper in documents and papers that report on research using this Software.
+
+
+
+
+
 
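The README's tip about `--start` and `--end` amounts to cutting the frame range of one sequence into contiguous chunks, one per job. A minimal sketch of such a split follows. `chunk_ranges` is a hypothetical helper, and treating `--end` as exclusive is an assumption here that should be checked against the actual argument handling in the scripts.

```python
def chunk_ranges(num_frames, num_jobs):
    """Split [0, num_frames) into num_jobs contiguous (start, end) chunks.

    `end` is exclusive here, which is an assumption about the scripts.
    """
    if num_frames <= 0 or num_jobs <= 0:
        raise ValueError("num_frames and num_jobs must be positive")
    num_jobs = min(num_jobs, num_frames)
    base, extra = divmod(num_frames, num_jobs)
    ranges, start = [], 0
    for job in range(num_jobs):
        # The first `extra` jobs take one additional frame each.
        end = start + base + (1 if job < extra else 0)
        ranges.append((start, end))
        start = end
    return ranges


for start, end in chunk_ranges(1500, 4):
    print(f"--start {start} --end {end}")
```

Each printed pair would be appended to one job's command line, so that the jobs together cover the whole sequence without overlap.
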
diff --git a/behave/utils.py b/behave/utils.py
@@ -9,6 +9,10 @@
 import numpy as np
 import os.path as osp
 from psbody.mesh import Mesh
+import yaml, sys
+with open("PATHS.yml", 'r') as stream:
+    paths = yaml.safe_load(stream)
+BEHAVE_ROOT = paths['BEHAVE_ROOT']
 
 
 def rotate_yaxis(R, t):
@@ -182,9 +186,9 @@ def load_scan_centered(scan_path, cent=True):
     return scan
 
 
-def load_template(obj_name, cent=True, high_reso=False):
+def load_template(obj_name, cent=True, high_reso=False, behave_path=BEHAVE_ROOT):
     "load object template mesh given object name"
-    temp_path = get_template_path("/BS/xxie-5/static00/behave_release/objects", obj_name)
+    temp_path = get_template_path(behave_path+"/objects", obj_name)
     if high_reso:
         assert obj_name in _mesh_template.keys(), f'does not support high reso template for {obj_name}'
         lowreso_temp = Mesh(filename=temp_path)

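Note that `open("PATHS.yml", 'r')` resolves against the current working directory, so importing `behave.utils` works only when scripts are launched from the project root. One possible alternative, sketched here with the standard library only, is to search upward from a starting directory. `find_paths_file` is a hypothetical helper and not part of this commit.

```python
from pathlib import Path


def find_paths_file(start, name="PATHS.yml"):
    """Return the nearest `name` in `start` or any of its parent directories."""
    start = Path(start).resolve()
    for folder in [start, *start.parents]:
        candidate = folder / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name} not found in {start} or its parents")
```

Inside `behave/utils.py`, this could be called as `find_paths_file(Path(__file__).parent)` and the result passed to `open` before `yaml.safe_load`.
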
diff --git a/lib_smpl/__init__.py b/lib_smpl/__init__.py
@@ -1,8 +1,8 @@
 from .smplpytorch import SMPL_Layer
 from .smpl_generator import SMPLHGenerator
+from .wrapper_pytorch import SMPL_MODEL_ROOT
 
-
-def get_smpl(gender, hands, model_root='/BS/xxie2020/static00/mysmpl/smplh'):
+def get_smpl(gender, hands, model_root=SMPL_MODEL_ROOT):
     "simple wrapper to get SMPL model"
     return SMPL_Layer(model_root=model_root,
                gender=gender, hands=hands)
diff --git a/smoothnet/smooth_objrot.py b/smoothnet/smooth_objrot.py
@@ -14,7 +14,7 @@
 from smoothnet.utils.geometry_utils import rot6d_to_rotmat, rot6D_to_axis
 from smoothnet.smooth_base import SmootherBase
 import smoothnet.utils.geometry_utils as geom_utils
-from behave.utils import load_template, load_configs_all
+from behave.utils import load_template
 from recon.pca_util import PCAUtil
 
 import yaml

diff --git a/tools/pack2separate.py b/tools/pack2separate.py
@@ -0,0 +1,54 @@
+"""
+unpack packed openpose and mocap results into separate per-frame files in BEHAVE dataset format
+mocap: only save the parameters, not meshes
+    keywords: pose, betas
+openpose: save as json files
+    keywords: body_joints (this script writes only the 2D body joints)
+"""
+import os, sys
+sys.path.append(os.getcwd())
+import joblib
+import os.path as osp
+from tqdm import tqdm
+import json
+from behave.frame_data import FrameDataReader
+
+
+def pack2separate(args):
+    reader = FrameDataReader(args.seq_folder)
+    seq_name = reader.seq_name
+    packed_data = joblib.load(osp.join(args.packed_path, f'{seq_name}_GT-packed.pkl'))
+    assert len(packed_data['frames']) == len(reader), f'Warning: number of frames does not match for seq {seq_name}!'
+
+    # save as separate openpose and mocap files
+    for idx in tqdm(range(len(reader))):
+        for kid in reader.kids:
+            outfile = osp.join(reader.get_frame_folder(idx), f'k{kid}.mocap.json')
+            if not osp.isfile(outfile):
+                with open(outfile, 'w') as f:
+                    json.dump(
+                        {
+                            "pose": packed_data['mocap_poses'][idx, kid].tolist(),
+                            "betas": packed_data['mocap_betas'][idx, kid].tolist(),
+                        }, f
+                    )
+            outfile = osp.join(reader.get_frame_folder(idx), f'k{kid}.color.json')
+            if not osp.isfile(outfile):
+                with open(outfile, 'w') as f:
+                    json.dump(
+                        {
+                            "body_joints": packed_data["joints2d"][idx, kid].tolist(),
+                        }, f
+                    )
+    print("all done")
+
+
+if __name__ == '__main__':
+    from argparse import ArgumentParser
+    parser = ArgumentParser()
+    parser.add_argument('-s', '--seq_folder')
+    parser.add_argument('-p', '--packed_path', default="/scratch/inf0/user/xxie/behave-packed")# root path to all packed files 
+
+    args = parser.parse_args()
+
+    pack2separate(args)
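
To make the output layout concrete, the following sketch mimics the unpacking on toy data using only the standard library. The temporary frame folder, the toy array sizes, and the `unpack_frame` helper are hypothetical stand-ins. Only the file names `k{kid}.mocap.json` and `k{kid}.color.json` and the JSON keys are taken from the script above.

```python
import json
import os
import tempfile


def unpack_frame(frame_folder, kids, poses, betas, joints2d):
    """Write one mocap and one openpose JSON file per camera, skipping existing files."""
    written = []
    for kid in kids:
        outputs = {
            f"k{kid}.mocap.json": {"pose": poses[kid], "betas": betas[kid]},
            f"k{kid}.color.json": {"body_joints": joints2d[kid]},
        }
        for name, payload in outputs.items():
            outfile = os.path.join(frame_folder, name)
            if os.path.isfile(outfile):
                continue  # same skip-if-present behaviour as the script
            with open(outfile, "w") as f:
                json.dump(payload, f)
            written.append(name)
    return written


with tempfile.TemporaryDirectory() as frame_folder:
    kids = [0, 1]
    poses = {k: [0.0] * 3 for k in kids}       # toy sizes, not the real dimensions
    betas = {k: [0.0] * 2 for k in kids}
    joints2d = {k: [[0.0, 0.0, 1.0]] for k in kids}
    print(unpack_frame(frame_folder, kids, poses, betas, joints2d))
    print(unpack_frame(frame_folder, kids, poses, betas, joints2d))  # second run writes nothing
```

Because existing files are skipped, re-running the conversion after an interruption only fills in the frames that are still missing.
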
diff --git a/tools/pack2separate_params.py b/tools/pack2separate_params.py
@@ -0,0 +1,70 @@
+"""
+load SMPL and object parameters from packed pkl file (extended BEHAVE data)
+save SMPL and object registration (meshes) in BEHAVE data format
+"""
+import os, sys
+
+import numpy as np
+import torch
+
+sys.path.append(os.getcwd())
+import joblib
+import os.path as osp
+from tqdm import tqdm
+import json
+from psbody.mesh import Mesh
+from scipy.spatial.transform import Rotation
+
+from behave.frame_data import FrameDataReader
+from lib_smpl import get_smpl
+from behave.utils import load_template
+
+
+def pack2separate_params(args):
+    reader = FrameDataReader(args.seq_folder)
+    seq_name = reader.seq_name
+    smpl_name, obj_name = "fit03", 'fit01-smooth'
+    obj_cat = reader.seq_info.get_obj_name(True)
+
+    packed_data = joblib.load(osp.join(args.packed_path, f'{seq_name}_GT-packed.pkl'))
+    assert len(packed_data['frames']) == len(reader), f'Warning: number of frames does not match for seq {seq_name}!'
+
+    temp = load_template(reader.seq_info.get_obj_name())
+    smplh_layer = get_smpl(reader.seq_info.get_gender(), True)
+    faces = smplh_layer.faces.copy()
+    for idx in tqdm(range(0, len(reader), args.interval)):
+        # object
+        outfile = osp.join(reader.get_frame_folder(idx), obj_cat, obj_name, f'{obj_cat}_fit.ply')
+        if not osp.isfile(outfile):
+            os.makedirs(osp.dirname(outfile), exist_ok=True)
+            angle, trans = packed_data['obj_angles'][idx], packed_data['obj_trans'][idx]
+            rot = Rotation.from_rotvec(angle).as_matrix()
+            obj_fit = np.matmul(temp.v, rot.T) + trans
+            Mesh(obj_fit, temp.f).write_ply(outfile)
+
+        # SMPL
+        outfile = osp.join(reader.get_frame_folder(idx), 'person', smpl_name, 'person_fit.ply')
+        if not osp.isfile(outfile):
+            os.makedirs(osp.dirname(outfile), exist_ok=True)
+            verts, _, _, _ = smplh_layer(torch.from_numpy(packed_data['poses'][idx:idx+1]),
+                                         torch.from_numpy(packed_data['betas'][idx:idx+1]),
+                                         torch.from_numpy(packed_data['trans'][idx:idx+1]))
+            verts = verts[0].cpu().numpy()
+            Mesh(verts, faces).write_ply(outfile)
+    print("all done")
+
+
+if __name__ == '__main__':
+    from argparse import ArgumentParser
+
+    parser = ArgumentParser()
+    parser.add_argument('-s', '--seq_folder')
+    parser.add_argument('-p', '--packed_path',
+                        default="/scratch/inf0/user/xxie/behave-packed")  # root path to all packed files
+    parser.add_argument('-i', '--interval', default=30, type=int,
+                        help="interval between two saved frames, if set to 1, save for all frames")
+
+
+    args = parser.parse_args()
+
+    pack2separate_params(args)
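
The object branch of this script applies a rigid transform to the template vertices. Written out, with $V \in \mathbb{R}^{N \times 3}$ holding one vertex per row (`temp.v`), $\boldsymbol{\omega}$ the axis-angle vector (`angle`), and $\mathbf{t}$ the translation (`trans`), the line `np.matmul(temp.v, rot.T) + trans` computes the expression below. The symbols are introduced here for illustration only. The rotation matrix follows Rodrigues' formula, which is the standard conversion from a rotation vector to a matrix, with $R = I$ when $\theta = 0$.

```latex
\theta = \lVert \boldsymbol{\omega} \rVert, \qquad
\mathbf{k} = \boldsymbol{\omega} / \theta, \qquad
R = I + \sin\theta\,[\mathbf{k}]_\times + (1-\cos\theta)\,[\mathbf{k}]_\times^{2}

V' = V R^{\top} + \mathbf{1}\,\mathbf{t}^{\top}
\quad\Longleftrightarrow\quad
\mathbf{v}'_i = R\,\mathbf{v}_i + \mathbf{t}, \quad i = 1,\dots,N
```

The transpose appears because the vertices are stored as rows, so multiplying by $R^{\top}$ on the right is the same as applying $R$ to each vertex as a column vector.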
diff --git a/tools/rename_masks.py b/tools/rename_masks.py
@@ -0,0 +1,50 @@
+"""
+rename and move the human and object masks to BEHAVE dataset structure
+"""
+
+import os, sys
+sys.path.append(os.getcwd())
+import shutil
+import os.path as osp
+from tqdm import tqdm
+import json
+from glob import glob
+from behave.frame_data import FrameDataReader
+
+
+def rename_masks(args):
+    reader = FrameDataReader(args.seq_folder)
+    seq_name = reader.seq_name
+    mask_path = osp.join(args.mask_path, seq_name)
+    ps_files = glob(osp.join(mask_path, 't*k1.person_mask.png'))
+    obj_files = glob(osp.join(mask_path, 't*k1.obj_rend_mask.png'))
+    assert len(ps_files) == len(obj_files), 'the number of mask files does not match!'
+    assert len(ps_files) == len(reader), 'the number of frames between mask and RGB images does not match!'
+
+    files_all = glob(osp.join(mask_path, 't*.png'))
+    count = 0
+    for file in tqdm(files_all):
+        fname = osp.join(args.seq_folder, *osp.basename(file).split("-"))
+        if osp.isfile(fname):
+            continue
+        # shutil.move avoids a shell call, so paths with spaces are handled safely
+        # print(file, '->', fname)
+        shutil.move(file, fname)
+        # count += 1
+        # if count == 10:
+        #     break
+    print("all done!")
+
+
+
+if __name__ == '__main__':
+    from argparse import ArgumentParser
+
+    parser = ArgumentParser()
+    parser.add_argument('-s', '--seq_folder')
+    parser.add_argument('-m', '--mask_path',
+                        default="/scratch/inf0/user/xxie/behave-packed")  # root path to all mask files
+
+    args = parser.parse_args()
+
+    rename_masks(args)
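
The destination path in `rename_masks` comes from splitting each mask's basename on `-`, so the part before the dash becomes the frame folder and the rest becomes the file name. A small dry-run sketch of that mapping is shown below. The file name `t0003.000-k1.person_mask.png` is a hypothetical example chosen to match the glob patterns, and `plan_moves` is not part of the repository.

```python
import os.path as osp


def plan_moves(mask_files, seq_folder):
    """Map each mask file to its destination without touching the filesystem."""
    plan = []
    for src in mask_files:
        # "frame-file.png" -> seq_folder/frame/file.png
        dst = osp.join(seq_folder, *osp.basename(src).split("-"))
        plan.append((src, dst))
    return plan


moves = plan_moves(["/masks/seq/t0003.000-k1.person_mask.png"], "/data/seq")
for src, dst in moves:
    print(src, "->", dst)
```

Printing the plan first makes it easy to confirm that every destination lands inside an existing frame folder before any file is moved.
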