
To get cam_intrinsics and cam_extrinsics from .npz files #300

Open · MoyGcc opened this issue Jul 7, 2022 · 12 comments

@MoyGcc

MoyGcc commented Jul 7, 2022

Hi Yu, thanks for your work and such an organized repo!

I'm now using ROMP to get SMPL poses and would like to visualize the meshes via a perspective camera. I usually use an approach similar to chungyiweng/humannerf#1 to convert s, t_x, and t_y, along with a human bbox, into pinhole camera parameters, and it does work on VIBE output. However, it seems I cannot easily get these parameters from ROMP's .npz outputs (I can get a rough bbox from pj2d_org). I also found that the scaling factor s is quite different between the VIBE and ROMP estimates for the same input image (~1.14 in VIBE vs. ~0.58 in ROMP). Could you please point out how I can quickly obtain (estimate) the camera intrinsics and extrinsics? Thanks!
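
For context, a minimal sketch of the weak-perspective-to-translation conversion referenced above (one common variant, as used in SPIN/VIBE-style code; weak_to_full_translation is a hypothetical helper name, not part of either repo):

    import numpy as np

    # One common weak-perspective -> perspective conversion (SPIN/VIBE-style):
    # the predicted camera (s, t_x, t_y) becomes a full 3D translation whose
    # depth t_z follows from the focal length, the crop size, and the scale s.
    def weak_to_full_translation(s, tx, ty, focal_length, crop_size):
        tz = 2. * focal_length / (crop_size * s + 1e-9)  # depth implied by scale
        return np.array([tx, ty, tz])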

@Arthur151
Owner

Hi, @MoyGcc
Thanks for your kind words!
You can use this function to achieve that:

def estimate_translation_cv2(joints_3d, joints_2d, focal_length=600, img_size=np.array([512.,512.]), proj_mat=None, cam_dist=None):

It takes the estimated 3D joints, the 2D joints pj2d_org, the image size, and the focal length, and estimates the corresponding 3D translation in the camera space defined by these intrinsic parameters.
In BEV, we calculate the focal length like this: BEV takes a square 512 x 512 input, and we assume FOV = 60 degrees, so
focal_length = H/2 * 1/tan(FOV/2) = 512/2. * 1./np.tan(np.radians(30)) = 443.4
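
A minimal sketch of that calculation and the pinhole intrinsics it implies (assuming BEV's square 512 x 512 input and FOV = 60 degrees, per the comment above):

    import numpy as np

    # Focal length for BEV's square 512 x 512 input, assuming FOV = 60 degrees
    H = 512.
    fov_deg = 60.
    focal_length = H / 2. / np.tan(np.radians(fov_deg / 2.))  # ~443.4

    # Pinhole intrinsics implied by this focal length and image size,
    # with the principal point at the image center
    K = np.array([[focal_length, 0., H / 2.],
                  [0., focal_length, H / 2.],
                  [0., 0., 1.]])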

@hongsiyu

hongsiyu commented Jul 8, 2022

> I'm now using ROMP to get SMPL poses and would like to visualize the meshes via a perspective camera. […] Could you please point out how I can quickly obtain (estimate) the camera intrinsics and extrinsics?

Have you solved this problem? I met the same issue.

@MoyGcc
Author

MoyGcc commented Jul 8, 2022

Hi Yu @Arthur151,
Thanks so much for the quick reply and for pointing out the correct way to do this. In the end, I followed the way that you applied for evaluation on AGORA:

def save_agora_predictions_v6(image_path, outputs, save_dir):

and now the projected SMPL mesh aligns well with the image. There is still a slight difference in the projection (below, the mesh with the normal color is my projected result), but I think it's okay. @hongsiyu, you could also refer to the evaluation code for the AGORA dataset for doing this.

[two images: the input frame with the projected SMPL mesh overlaid, comparing the normal-colored projection against the reference render]
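
For reference, a rough sketch of the projection check described above (project_points is a hypothetical helper; verts and cam_trans stand in for ROMP's per-person vertices and the estimated translation):

    import numpy as np

    def project_points(points_3d, K):
        # Perspective-project (N, 3) camera-space points with 3x3 intrinsics K
        proj = points_3d @ K.T
        return proj[:, :2] / proj[:, 2:3]  # divide by depth -> (N, 2) pixels

    # verts: (N, 3) SMPL vertices, cam_trans: (3,) translation from the PnP solve
    # pixels = project_points(verts + cam_trans, K)
    # Overlaying `pixels` on the image gives the alignment check shown above.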

@Arthur151
Owner

That's clever. Glad to hear that.

@Andyen512

So the intrinsics are [[443.4, 0, 512//2], [0, 443.4, 512//2], [0, 0, 1]], and extrinsics[:3, 3] = cam_trans, right? @MoyGcc

@hongsiyu

hongsiyu commented Jul 8, 2022

> In the end, I followed the way that you applied for evaluation on AGORA […] and now the projected SMPL mesh aligns well with the image.

I followed the way you mentioned with my own video, but the progress images in humannerf don't seem correct. Did you succeed in training humannerf with the AGORA dataset?

@Arthur151
Owner

Arthur151 commented Jul 8, 2022

@Andyen512
No, the image size should be the original size of the input image, not the size of BEV's resized input map.
It is fine to directly use humannerf's camera intrinsics when calculating the 3D translation with estimate_translation:

"cam_intrinsics": [
            [23043.9, 0.0,940.19],
            [0.0, 23043.9, 539.23],
            [0.0, 0.0, 1.0]
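
As a sketch of what that translation estimate can look like with OpenCV (this mirrors the idea of estimate_translation_cv2; the exact signature and solver flags in the repo may differ):

    import cv2
    import numpy as np

    def estimate_translation(joints_3d, joints_2d, K):
        # Solve for the camera-space translation of a person whose global
        # rotation is already baked into joints_3d, with fixed intrinsics K
        ok, rvec, tvec = cv2.solvePnP(
            joints_3d.astype(np.float64),   # (N, 3) estimated 3D joints
            joints_2d.astype(np.float64),   # (N, 2) pixel coords, e.g. pj2d_org
            K.astype(np.float64),           # 3x3 camera intrinsics
            None,                           # assume no lens distortion
            flags=cv2.SOLVEPNP_EPNP)
        return tvec.reshape(3)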

@hongsiyu

hongsiyu commented Jul 8, 2022

> It is fine to directly use humannerf's camera intrinsics when calculating the 3D translation with estimate_translation […]

Thank you very much; with this focal length I succeeded in training humannerf.

@Andyen512

Andyen512 commented Jul 8, 2022

@Arthur151 Sorry, why use humannerf's cam_intrinsics? I was using romp --mode=video --calc_smpl --render_mesh -i=/path/to/video.mp4 -o=/path/to/output/folder/results.mp4 --save_video to run inference on my own video, and I see that args.focal_length in

V6_group.add_argument('--focal_length', type=float, default=443.4, help='Default focal length, adopted from JTA dataset')

defaults to 443.4. Also, the original size of the input image is 1920*1080, so why not cam_intrinsics[0][2] = 960 and cam_intrinsics[1][2] = 540?
I'm confused.

@Arthur151
Owner

@Andyen512
That focal length (23043.9) and those image-center coordinates (940.19, 539.23) are just the values humannerf used for training, taken from their camera intrinsics.

To run inference on your own video, you can re-calculate the focal length:
when FOV = 60 deg, focal_length = W/2 * 1/tan(FOV/2) = 1920/2. * 1./np.tan(np.radians(30)) = 1662.768
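
Putting that together for a 1920x1080 video, a minimal sketch (assuming identity rotation for the extrinsics, since the global orientation is already in the SMPL pose, and cam_trans as the per-person translation estimated above):

    import numpy as np

    W, H = 1920., 1080.
    fov_deg = 60.
    focal_length = W / 2. / np.tan(np.radians(fov_deg / 2.))  # ~1662.768

    # 3x3 intrinsics with the principal point at the image center
    cam_intrinsics = np.array([[focal_length, 0., W / 2.],
                               [0., focal_length, H / 2.],
                               [0., 0., 1.]])

    # 4x4 extrinsics: identity rotation (global orient lives in the SMPL pose),
    # translation filled in from the estimated cam_trans
    cam_extrinsics = np.eye(4)
    # cam_extrinsics[:3, 3] = cam_trans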

@Andyen512

OK, thanks, I'll try.

@mch0dmin


Hi @hongsiyu, can you tell me how to use ROMP to obtain the 3x3 cam_intrinsics and the 4x4 cam_extrinsics? Thanks.
