PVN3D

This is the official source code for PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation, CVPR 2020 (PDF, Bilibili video, YouTube video).

News

We optimized and applied PVN3D in the OCRTOC robotic manipulation contest (IROS 2020: Open Cloud Robot Table Organization Challenge) and took 2nd place! The model was trained on synthetic data generated with a rendering engine plus only a few frames (about 100) of real data, yet it generalized to real scenes at inference time, demonstrating its cross-domain generalization capability.

Installation

  • The following setup targets PyTorch 1.0.1. For PyTorch 1.5 & CUDA 10, switch to the pytorch-1.5 branch.
  • Install CUDA 9.0.
  • Set up the Python environment from requirement.txt:
    pip3 install -r requirement.txt
  • Install tkinter with sudo apt install python3-tk.
  • Install python-pcl.
  • Install PointNet++ (adapted from Pointnet2_PyTorch):
    python3 setup.py build_ext
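
After these steps, a quick sanity check (a minimal sketch, assuming PyTorch is already installed) confirms that your PyTorch/CUDA pairing matches the branch you are on:

  import torch

  # The default branch targets PyTorch 1.0.1 + CUDA 9.0;
  # the pytorch-1.5 branch targets PyTorch 1.5 + CUDA 10.
  print(torch.__version__)           # e.g. 1.0.1
  print(torch.version.cuda)          # e.g. 9.0.176
  print(torch.cuda.is_available())   # must be True to build/run the CUDA extensions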

Datasets

  • LineMOD: Download the preprocessed LineMOD dataset from here (as provided by DenseFusion). Unzip it and link the unzipped Linemod_preprocessed/ to pvn3d/datasets/linemod/Linemod_preprocessed:

    ln -s path_to_unzipped_Linemod_preprocessed pvn3d/datasets/linemod/
  • YCB-Video: Download the YCB-Video Dataset from PoseCNN. Unzip it and link the unzipped YCB_Video_Dataset to pvn3d/datasets/ycb/YCB_Video_Dataset:

    ln -s path_to_unzipped_YCB_Video_Dataset pvn3d/datasets/ycb/
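
A small check (a sketch using the two target paths from the commands above) verifies that both symlinks resolve before training:

  import os

  # Expected dataset locations after the symlinks above are created.
  for path in ("pvn3d/datasets/linemod/Linemod_preprocessed",
               "pvn3d/datasets/ycb/YCB_Video_Dataset"):
      status = "OK" if os.path.isdir(path) else "MISSING"
      print(f"{path}: {status}")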

Training and evaluating

Training on the LineMOD Dataset

  • First, generate synthetic data for each object using the scripts from raster triangle.
  • Train the model for the target object, taking ape as an example (a convenience loop over all objects is sketched after this list):
    cd pvn3d
    python3 -m train.train_linemod_pvn3d --cls ape
    The trained checkpoints are stored in train_log/linemod/checkpoints/{cls}/ (train_log/linemod/checkpoints/ape/ in this example).
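
Each object gets its own model, so covering all of LineMOD means one training run per class. A convenience loop (a sketch; the class names below follow the standard Linemod_preprocessed naming, and 13 sequential runs take a long time):

  import subprocess

  # Standard object names in Linemod_preprocessed.
  CLASSES = ["ape", "benchvise", "cam", "can", "cat", "driller", "duck",
             "eggbox", "glue", "holepuncher", "iron", "lamp", "phone"]

  for cls in CLASSES:
      # Equivalent to: python3 -m train.train_linemod_pvn3d --cls <cls>
      subprocess.run(["python3", "-m", "train.train_linemod_pvn3d", "--cls", cls],
                     check=True)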

Evaluating on the LineMOD Dataset

  • Start evaluation with:
    # commands in eval_linemod.sh
    cls='ape'
    tst_mdl=train_log/linemod/checkpoints/${cls}/${cls}_pvn3d_best.pth.tar
    python3 -m train.train_linemod_pvn3d -checkpoint $tst_mdl -eval_net --test --cls $cls
    You can evaluate a different checkpoint by pointing tst_mdl to the path of your target model.
  • We provide pre-trained models for each object at the onedrive link and baiduyun link (access code: 8kmp). Download them and move each to its corresponding folder, e.g., move ape_pvn3d_best.pth.tar to train_log/linemod/checkpoints/ape/. Then set tst_mdl=train_log/linemod/checkpoints/ape/ape_pvn3d_best.pth.tar for testing.

Demo/visualization on the LineMOD Dataset

  • After training your models or downloading the pre-trained models, you can start the demo by:
    # commands in demo_linemod.sh
    cls='ape'
    tst_mdl=train_log/linemod/checkpoints/${cls}/${cls}_pvn3d_best.pth.tar
    python3 -m demo -dataset linemod -checkpoint $tst_mdl -cls $cls
    The visualization results will be stored in train_log/linemod/eval_results/{cls}/pose_vis

Training on the YCB-Video Dataset

  • Preprocess the validation set to speed up training:
    cd pvn3d
    python3 -m datasets.ycb.preprocess_testset
  • Start training on the YCB-Video Dataset by:
    python3 -m train.train_ycb_pvn3d
    The trained model checkpoints are stored in train_log/ycb/checkpoints/

Evaluating on the YCB-Video Dataset

  • Start evaluation with:
    # commands in eval_ycb.sh
    tst_mdl=train_log/ycb/checkpoints/pvn3d_best.pth.tar
    python3 -m train.train_ycb_pvn3d -checkpoint $tst_mdl -eval_net --test
    You can evaluate a different checkpoint by pointing tst_mdl to the path of your target model.
  • We provide our pre-trained model at the onedrive link and baiduyun link (access code: h2i5). Download the YCB pre-trained model, move it to train_log/ycb/checkpoints/, and modify tst_mdl for testing.

Demo/visualization on the YCB-Video Dataset

  • After training your model or downloading the pre-trained model, you can start the demo by:
    # commands in demo_ycb.sh
    tst_mdl=train_log/ycb/checkpoints/pvn3d_best.pth.tar
    python3 -m demo -checkpoint $tst_mdl -dataset ycb
    The visualization results will be stored in train_log/ycb/eval_results/pose_vis
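
Both demos write rendered pose overlays to the pose_vis directories listed above. A minimal viewer sketch (assuming the results are saved as standard image files and opencv-python is installed):

  import glob

  import cv2  # pip3 install opencv-python

  # Adjust the glob for LineMOD: train_log/linemod/eval_results/<cls>/pose_vis/*
  for path in sorted(glob.glob("train_log/ycb/eval_results/pose_vis/*")):
      img = cv2.imread(path)
      if img is None:        # skip anything that is not an image
          continue
      cv2.imshow("pose_vis", img)
      if cv2.waitKey(0) & 0xFF == ord("q"):  # 'q' quits, any other key advances
          break
  cv2.destroyAllWindows()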

Results

  • Evaluation results on the LineMOD dataset: [figure res_lm]
  • Evaluation results on the YCB-Video dataset: [figure res_ycb]
  • Visualization of some predicted poses on the YCB-Video dataset: [figure vis_ycb]
  • Joint training for distinguishing objects with similar appearance but different sizes: [figure seg]

Adaptation to New Dataset

  • Compile the FPS (farthest point sampling) scripts:
    cd lib/utils/dataset_tools/fps/
    python3 setup.py build_ext --inplace
  • Generate object information (e.g., radius, 3D keypoints) for your dataset with the gen_obj_info.py script:
    cd ../
    python3 gen_obj_info.py --help
  • Add the info of your new dataset in PVN3D/pvn3d/common.py.
  • Write your dataset preprocessing script following PVN3D/pvn3d/datasets/ycb/ycb_dataset.py (for multiple objects per scene) or PVN3D/pvn3d/datasets/linemod/linemod_dataset.py (for a single object per scene). Note that you should modify or call the functions that fetch your model info (3D keypoints, center points, and radius) properly.
  • (Important!) Visualize and check that you process the data properly, e.g., the projected keypoints and center point, and the semantic label of each point; see the projection sketch after this list.
  • For inference, make sure that you load the 3D keypoints, center point, and radius of your objects in the object coordinate system properly in PVN3D/pvn3d/lib/utils/pvn3d_eval_utils.py.
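
For the visualization check above, the core operation is projecting the object-frame keypoints into the image with the ground-truth pose and camera intrinsics. A minimal sketch (the names kps_3d, K, and RT are illustrative, not the repository's API):

  import numpy as np

  def project_p3d(p3d, K, RT):
      """Project (N, 3) object-frame points to (N, 2) pixel coordinates.

      p3d: 3D keypoints / center point in the object coordinate system
      K:   (3, 3) camera intrinsic matrix
      RT:  (3, 4) ground-truth object pose [R | t] in the camera frame
      """
      p_cam = p3d @ RT[:, :3].T + RT[:, 3]  # transform into the camera frame
      uv = p_cam @ K.T                      # apply intrinsics
      return (uv[:, :2] / uv[:, 2:3]).astype(np.int32)  # perspective divide

  # Illustrative check: the projected keypoints should land on the object
  # in the RGB frame; a large offset usually means a wrong pose convention
  # or mismatched intrinsics.
  # import cv2
  # for u, v in project_p3d(kps_3d, K, RT):
  #     cv2.circle(rgb, (int(u), int(v)), 3, (0, 255, 0), -1)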

Citations

Please cite PVN3D if you use this repository in your publications:

@InProceedings{He_2020_CVPR,
  author    = {He, Yisheng and Sun, Wei and Huang, Haibin and Liu, Jianran and Fan, Haoqiang and Sun, Jian},
  title     = {PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2020}
}
