
[FIXED] Bug: properly handling batched data in the ego motion loss and Chamfer distance computation #5

Closed
tpzou opened this issue Apr 11, 2021 · 7 comments
Labels
bug Something isn't working

Comments

tpzou commented Apr 11, 2021

Hi @zgojcic. Is there a mistake at loss.py line 83? The first column of p_s_temp[mask_temp, :3] is the batch id... Why should it be transformed?

zgojcic commented Apr 11, 2021

Hi @tpzou,

Thanks for catching this, it is indeed a bug resulting from a legacy issue (earlier versions of ME used the last column to store the batch indices).

This bug means that our ego motion loss was applied to points that always lie on a specific line (the line is different for each element in the batch because the batch index was treated as a coordinate). Theoretically, our formulation of the ego loss would allow us to use random 3D points (since the points are transformed with both the GT and the estimated transformation parameters). Therefore, this bug should not have a significant effect on the performance and, if anything, fixing it should actually improve it.

I have uploaded an updated loss.py but will leave this issue open and will update it when I have the results of further analysis (retraining the model).
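For clarity, here is a minimal sketch of the slicing issue (illustrative tensors only, not the actual loss.py code; the variable names p_s_temp and mask_temp are taken from the line discussed above):

```python
import torch

# MinkowskiEngine-style batched coordinates are stored as [batch_idx, x, y, z],
# so the batch index sits in column 0 of the coordinate tensor.
p_s_temp = torch.tensor([[0., 1., 2., 3.],    # point from batch element 0
                         [1., 4., 5., 6.]])   # point from batch element 1
mask_temp = torch.ones(p_s_temp.shape[0], dtype=torch.bool)

# Buggy slice: picks [batch_idx, x, y] and treats the batch index as a coordinate.
pts_buggy = p_s_temp[mask_temp, :3]

# Fixed slice: skips the batch column and keeps the actual xyz coordinates.
pts_fixed = p_s_temp[mask_temp, 1:4]
```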

Thank you again,

Best,

Zan

zgojcic reopened this Apr 11, 2021
zgojcic commented Apr 20, 2021

Hi @tpzou,

The model is still training, but after fixing this bug the ego_loss as well as the total loss are lower on both the training and validation data. I will update the results once the model has fully converged...

tpzou commented Apr 21, 2021

Hi @zgojcic,
do you have a plan to support distributed multi-GPU training?
I tried to use spawn and DistributedDataParallel, but something seems to be wrong with my code.
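For reference, a minimal sketch of the spawn + DistributedDataParallel pattern in question (generic PyTorch boilerplate, not code from this repository; the model, address/port, and backend are placeholders):

```python
import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.nn.parallel import DistributedDataParallel as DDP

def worker(rank, world_size):
    # One process per GPU; the address/port and the NCCL backend are assumptions.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(3, 3).cuda(rank)   # placeholder for the scene flow model
    model = DDP(model, device_ids=[rank])

    # ... wrap the dataset in a DistributedSampler and run the training loop here ...

    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)
```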

zgojcic commented Apr 21, 2021

Hi @tpzou,

We have never tried that, but I can have a quick look in the coming days/weeks. Otherwise, you can simply increase the accumulation iterations parameter if you would like a larger effective batch size.
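As an illustration, a minimal sketch of gradient accumulation to emulate a larger batch size (toy model and data, not the repository's trainer; the exact parameter name in the config may differ):

```python
import torch

model = torch.nn.Linear(3, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
accumulation_iters = 4                        # emulates a 4x larger effective batch size

optimizer.zero_grad()
for it in range(16):
    x, y = torch.randn(8, 3), torch.randn(8, 1)           # stand-in mini-batch
    loss = torch.nn.functional.mse_loss(model(x), y)
    (loss / accumulation_iters).backward()    # scale so gradients average over the virtual batch
    if (it + 1) % accumulation_iters == 0:
        optimizer.step()                      # update only every accumulation_iters iterations
        optimizer.zero_grad()
```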

tpzou commented Apr 23, 2021

Hi @zgojcic,
I am building my code on top of yours and have another question: since the chamfer_distance of the foreground loss concatenates the foreground points of all batch elements, could this lead to wrong matches across the batch?

zgojcic commented Nov 18, 2021

Sorry for the extremely late response, I somehow missed this post. Indeed, this will affect some cases, and one should iterate over the batch instead. I think the difference will be small, but I will update the model and post the results here.
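A minimal sketch of what iterating over the batch looks like (illustrative helpers, not the repository's chamfer_distance implementation; it assumes each point carries a batch index, as in the ME sparse tensors):

```python
import torch

def chamfer_distance(src, tgt):
    # Symmetric nearest-neighbour Chamfer distance between (N, 3) and (M, 3) point sets.
    d = torch.cdist(src, tgt)                             # (N, M) pairwise distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

def batched_chamfer(points_s, points_t, batch_idx_s, batch_idx_t):
    # Iterate over batch elements so foreground points from different samples
    # are never matched against each other.
    losses = [chamfer_distance(points_s[batch_idx_s == b], points_t[batch_idx_t == b])
              for b in torch.unique(batch_idx_s)]
    return torch.stack(losses).mean()
```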

zgojcic commented Nov 28, 2021

Hi,

We have fixed the bug in the chamfer_distance computation so that it properly handles batched data.

I have retrained the model after the fix, but the performance is basically the same (see the loss curves below), so to avoid confusion we will keep the old model as the default one (should you require the newly trained model, simply let me know).

Thank you again for spotting this bug.

Orange is the old run and blue is the run after fixing the chamfer_distance bug:
[two loss-curve screenshots]

zgojcic closed this as completed Nov 28, 2021
zgojcic changed the title from "Loss" to "[FIXED] Bug: properly handling batched data in the ego motion loss and Chamfer distance computation" Nov 28, 2021
zgojcic added the bug label Nov 28, 2021
zgojcic pinned this issue Nov 28, 2021