
Multi-frame Fusion #53

Open · zw615 opened this issue Sep 2, 2022 · 6 comments

Comments

zw615 commented Sep 2, 2022

Hi, I have been reading this code for some time, but I can't find any multi-frame fusion config. All I have found is num_sweeps=1:

Does this mean the code doesn't support multi-frame fusion yet? I also wonder whether BEVDepth used multi-frame fusion for the comparison with SOTA results in the paper, on the nuScenes val set (0.351 mAP and 0.475 NDS for R50) and test set (0.503 mAP and 0.600 NDS, with which backbone?), or on the leaderboard? If so, could you please give me some advice on how to reproduce the paper results or the leaderboard results?

@yinchimaoliang (Collaborator)

Hi! We have provided multi-frame exps; please refer to https://github.com/Megvii-BaseDetection/BEVDepth/blob/main/exps/bev_depth_lss_r50_256x704_128x128_20e_cbgs_2key_da_ema.py. To reproduce the results in the paper, you need to change the backbone to VoVNet and the image resolution to 640x1600.
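
For readers skimming this thread: the difference between the single-frame and multi-frame exps is essentially which frame offsets the data pipeline is told to load. A minimal sketch of that idea, with placeholder class and attribute names (the linked 2key exp file is the authoritative reference; its actual attribute names and offset values may differ):

```python
# Placeholder sketch only: class and attribute names are illustrative,
# not the actual BEVDepth exp API; see the linked 2key exp file for the
# real names and offset values.

class SingleFrameExp:
    def __init__(self):
        # Only the current key frame is used (no temporal fusion).
        self.key_idxes = []


class TwoKeyFrameExp(SingleFrameExp):
    def __init__(self):
        super().__init__()
        # Additionally load one earlier key frame; negative offsets
        # index key frames relative to the current one.
        self.key_idxes = [-1]
```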

zw615 commented Sep 3, 2022

After a more thorough read-through of the previous issues, I found a very similar one (#8). Like that issue, I hadn't noticed the key_idxes used for key-frame fusion. I thought BEVDepth selected previous sweep frames for fusion, but clearly I was wrong.

About reproducing the results: I think R50 and R101 are used on the nuScenes val set, and I assume the VoVNet backbone and 640x1600 resolution you describe here are used on the nuScenes test set? Also, what config did you use to reproduce the result on the nuScenes leaderboard?

@yinchimaoliang (Collaborator)

Yes, on the leaderboard we use VoVNet and 640x1600 resolution.

zw615 commented Sep 5, 2022

Ok, that helps a lot, thanks!

zw615 commented Oct 4, 2022

Hi, I just found this answer about reproducing the leaderboard results. It says ConvNeXt is used as the backbone, not VoVNet. I wonder which one is correct? Thanks!

> Yes, on the leaderboard we use VoVNet and 640x1600 resolution.

Sadwy commented Mar 22, 2023

I can't locate the code that implements multi-frame fusion. Could you show me where to find the module? Specifically, I'm interested in the code that aligns two frames using a coordinate transformation matrix. Thanks a lot!
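
In case a concrete sketch helps while waiting for a pointer to the exact module: one common way to do this kind of alignment (not necessarily how BEVDepth implements it; the function name, arguments, and BEV extent below are assumptions for illustration) is to map each cell of the current BEV grid into the previous ego frame using the 4x4 ego-pose transform and then resample the previous frame's BEV features with grid_sample:

```python
# Hedged sketch, not the BEVDepth implementation. `curr_from_prev` is a
# (B, 4, 4) ego-pose transform taking points from the previous ego frame
# to the current one; the BEV extent (+-51.2 m) is illustrative.

import torch
import torch.nn.functional as F


def align_prev_bev(prev_bev, curr_from_prev, bev_range=51.2):
    """Resample a previous frame's BEV features (B, C, H, W) into the
    current frame's BEV grid."""
    B, _, H, W = prev_bev.shape
    device = prev_bev.device

    # Metric (x, y) coordinates of every cell of the *current* BEV grid.
    ys, xs = torch.meshgrid(
        torch.linspace(-bev_range, bev_range, H, device=device),
        torch.linspace(-bev_range, bev_range, W, device=device),
        indexing='ij')
    pts = torch.stack(
        [xs, ys, torch.zeros_like(xs), torch.ones_like(xs)],
        dim=-1).view(-1, 4)                              # (H*W, 4)

    # Express those current-frame cells in the previous ego frame by
    # applying the inverse transform (prev_from_curr).
    prev_from_curr = torch.inverse(curr_from_prev)       # (B, 4, 4)
    pts_prev = pts @ prev_from_curr.transpose(1, 2)      # (B, H*W, 4)

    # Normalize to [-1, 1] for grid_sample (index 0 -> x/width, 1 -> y/height).
    grid = (pts_prev[..., :2] / bev_range).reshape(B, H, W, 2)

    return F.grid_sample(prev_bev, grid, align_corners=True)
```

The same idea also works at the point level (transform BEV cell centers, then gather), but a grid_sample-style warp keeps everything differentiable and batched.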
