Multi-frame Fusion #53
Comments
Hi! We have provided multi-frame experiments; please refer to https://github.com/Megvii-BaseDetection/BEVDepth/blob/main/exps/bev_depth_lss_r50_256x704_128x128_20e_cbgs_2key_da_ema.py. To reproduce the results in the paper, you need to change the backbone to VoVNet and the image resolution to 640 x 1600.
After a more thorough read through the previous issues, I found a very similar one (#8). As in that issue, I had not noticed the key_idxes used in key-frame fusion. I thought BEVDepth selected previous sweep frames for fusion, but clearly I was wrong. As for reproducing the results, I think R50 and R101 are used on the nuScenes val set. I assume the VoVNet backbone and 640x1600 resolution you describe here are used on the nuScenes test set? Which config did you use to reproduce the result on the nuScenes leaderboard?
Yes, on the leaderboard we use VoVNet and 640x1600 resolution.
Ok, that helps a lot, thanks!
Hi, I just found this answer about reproducing the leaderboard results. It says ConvNeXt is used as the backbone, not VoVNet. Which one is correct? Thanks!
I can't locate the code that implements multi-frame fusion. Could you show me where to find the module? Specifically, I'm interested in the code that aligns two frames using a coordinate transformation matrix. Thanks a lot!
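For readers with the same question, here is a minimal, generic sketch of what "aligning two frames with a coordinate transformation matrix" usually means. It is not taken from the BEVDepth repository, and every name in it (`pose`, `prev_to_cur`, `transform_point`) is hypothetical. It assumes each frame comes with a 4x4 ego-to-global pose and composes the two poses into a single matrix that maps points from the previous key frame's ego coordinates into the current one.

```python
import math


def rigid_inverse(T):
    """Invert a 4x4 rigid transform [R | t] as [R^T | -R^T t]."""
    R_t = [[T[j][i] for j in range(3)] for i in range(3)]
    t = [T[i][3] for i in range(3)]
    new_t = [-sum(R_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [R_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def pose(yaw, x, y):
    """Hypothetical helper: planar ego-to-global pose (yaw in radians)."""
    c, s = math.cos(yaw), math.sin(yaw)
    return [[c, -s, 0.0, x],
            [s, c, 0.0, y],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def prev_to_cur(prev_ego2global, cur_ego2global):
    """Matrix taking previous-frame ego coordinates to current-frame ego coordinates."""
    return matmul(rigid_inverse(cur_ego2global), prev_ego2global)


def transform_point(T, p):
    """Apply a 4x4 transform to a 3D point (x, y, z)."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3]
                 for i in range(3))


# Usage: the ego vehicle moved 2 m forward between the two key frames,
# so a point 5 m ahead in the previous frame is 3 m ahead in the current one.
T = prev_to_cur(pose(0.0, 0.0, 0.0), pose(0.0, 2.0, 0.0))
aligned = transform_point(T, (5.0, 0.0, 0.0))
```

In a BEV detector the same matrix would typically be applied to the frustum or BEV grid coordinates of the previous frame before its features are concatenated with the current frame's features; how this repository does it is best confirmed in the multi-frame experiment file linked in this thread.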
Hi, I have been reading this code for some time, but I can't find any multi-frame fusion config. All I have found is num_sweeps=1:
BEVDepth/exps/bev_depth_lss_r50_256x704_128x128_24e.py
Line 222 in 16d7854
Does this mean the code doesn't support multi-frame fusion yet? In the paper's comparison with SOTA results on the nuScenes val set (0.351 mAP and 0.475 NDS for R50) and test set (0.503 mAP and 0.600 NDS, for which backbone?), did BEVDepth use multi-frame fusion? What about on the leaderboard? If so, could you please give me some advice on how to reproduce the paper results or the leaderboard results?