Why is the focal length different in your rendering and img_choy2016? #64

Closed
BostonLobster opened this issue Jun 30, 2021 · 4 comments


@BostonLobster

I downloaded the "ShapeNet for 2.5D supervised models" dataset and found that it contains two cameras.npz files: one in the obj_ID folder, the other in the img_choy2016 folder.

In the paper, you wrote "While we use the renderings from Choy et al. [13] as input, we additionally render 24 images of resolution 256^2 with depth maps and object masks per object which we use for supervision." So I guess one cameras.npz is for your renderings and the other is for Choy's.

But the focal lengths in the two cameras.npz files are different.
In yours, the camera matrix is

array([[2.1875, 0.    , 0.    , 0.    ],
       [0.    , 2.1875, 0.    , 0.    ],
       [0.    , 0.    , 1.    , 0.    ],
       [0.    , 0.    , 0.    , 1.    ]])

but in Choy's, the camera matrix is

array([[149.84375,   0.     ,  68.5    ],
       [  0.     , 149.84375,  68.5    ],
       [  0.     ,   0.     ,   1.     ]])

I think the focal length should be the same, because you only changed the camera pose for the additional renderings, right?

@m-niemeyer
Collaborator

Hi @BostonLobster , thanks for your question!

Yes, that is correct: the focal length and the principal point are different. If you check our code (e.g. the arange_pixels function), you will see that we assume the image plane to be in [-1, 1] with the center at 0. The format used by Choy et al. is [0, H-1] x [0, W-1] with the center point at (H/2, W/2). (As a side note: we use the Choy et al. renderings only as input for the encoder, so we never need their camera intrinsics / extrinsics in our repo.)
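
To make the two conventions concrete, here is a minimal sketch of the mapping from pixel indices to the [-1, 1] image plane. The helper `pixels_to_normalized` is hypothetical and is not the repo's arange_pixels function; the 137 x 137 image size is an assumption, chosen because it is consistent with the principal point 68.5 = 137 / 2 in Choy's matrix.

```python
import numpy as np


def pixels_to_normalized(u, v, height, width):
    """Map pixel indices in [0, W-1] x [0, H-1] to [-1, 1] x [-1, 1].

    Hypothetical helper for illustration only.
    """
    x = 2.0 * np.asarray(u, dtype=float) / (width - 1) - 1.0
    y = 2.0 * np.asarray(v, dtype=float) / (height - 1) - 1.0
    return x, y


# Assumed image size of 137 x 137: first, middle and last pixel index.
x, y = pixels_to_normalized([0, 68, 136], [0, 68, 136], height=137, width=137)
print(x, y)
```

With this mapping, the first pixel lands on -1, the middle pixel on 0 and the last pixel on 1 along each axis.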

@BostonLobster
Author

@m-niemeyer Thanks for your reply! I understand the difference now. An additional question: how do I convert the format of Choy et al. to yours, where the image plane resides in [-1, 1]? I'm wondering whether I have to modify the camera intrinsics if I use the Choy et al. renderings for both input and supervision.

@m-niemeyer
Collaborator

I would suggest doing either of the following:

  1. Find the field of view of Choy et al.; for this, you need the focal length and the sensor size (= image size). If you have these two values, you can calculate the FoV. With this, you can then calculate the new focal length for a sensor of size 2 (as our image should be in [-1, 1]), and with this you can create your new camera matrix.
  2. You can multiply the Choy et al. camera matrix with another matrix S from the left (K_new = S @ K_choy). This matrix needs to map the pixels from [0, H-1] x [0, W-1] to [-1, 1]. If I am not mistaken, this should be:
S = [   [s, 0, -1],
        [0, s, -1],
        [0, 0, 1]]

where s = 2 / (H - 1); if H and W are different, you need two different values, but this is not the case for Choy et al. (as you have square images).
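
Both suggestions can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the image size of 137 x 137 is assumed (it matches the principal point 68.5 = 137 / 2), the function names `rescale_via_fov` and `rescale_via_matrix` are hypothetical, option 1 assumes the principal point sits at the image center, and sign or axis conventions of the repo are not checked here.

```python
import numpy as np

SIZE = 137  # assumed H = W for the Choy et al. renderings

# Intrinsics as printed in the issue.
K_choy = np.array([[149.84375, 0.0, 68.5],
                   [0.0, 149.84375, 68.5],
                   [0.0, 0.0, 1.0]])


def rescale_via_fov(K, size):
    """Option 1: keep the field of view, change the sensor size to 2."""
    f = K[0, 0]
    fov = 2.0 * np.arctan(size / (2.0 * f))  # full field of view in radians
    f_new = 1.0 / np.tan(fov / 2.0)          # half sensor size is 1
    K_new = np.eye(4)                        # 4x4, principal point at 0
    K_new[0, 0] = K_new[1, 1] = f_new
    return K_new


def rescale_via_matrix(K, size):
    """Option 2: K_new = S @ K with s = 2 / (size - 1)."""
    s = 2.0 / (size - 1)
    S = np.array([[s, 0.0, -1.0],
                  [0.0, s, -1.0],
                  [0.0, 0.0, 1.0]])
    return S @ K


K_fov = rescale_via_fov(K_choy, SIZE)
K_mat = rescale_via_matrix(K_choy, SIZE)
print(K_fov[0, 0], K_mat[0, 0], K_mat[0, 2])
```

Under these assumptions, option 1 gives a focal length of 2 * 149.84375 / 137 = 2.1875, the same value as in the 4x4 matrix quoted at the top of the thread. Option 2 with s = 2 / (H - 1) gives a slightly larger focal length (about 2.2036) and a principal point of about 0.0074 instead of exactly 0, because 68.5 is H / 2 rather than (H - 1) / 2.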

@BostonLobster
Author

I'll try your suggestions! Many thanks!!
