Can't reproduce Robotcar results #45

Open
RuotongWANG opened this issue Nov 21, 2021 · 11 comments

@RuotongWANG

Hi, I tried to reproduce your result on the RobotCar Seasons V2 test set by submitting to the challenge submission server. I used the released performance-focused model pre-trained on the MSLS dataset, but I got this incorrect result:
[screenshot of benchmark results]
I also tried the model pre-trained on Pitts30k; those results are not correct either.
[screenshot of benchmark results]
Besides, the results on other datasets are normal. Is the model version I used wrong? Could you possibly release the model state that achieves the RobotCar results shown in the paper, or provide the test-set results split by condition, as in Supplementary Table 1? Thank you so much.

Best regards,

@Tobias-Fischer
Contributor

Hi,

Could you please let us know the complete process you used to obtain these results? In particular, how do you map the best match to a pose?

Best, Tobias

@RuotongWANG
Author

RuotongWANG commented Nov 22, 2021

I directly used the pose of the best-matched reference image as the estimated pose of the query. I also evaluated the SuperGlue method with the same procedure and got a normal result:
[screenshot of benchmark results]
So I think there might be something wrong with the configuration or the model state that I used.
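For reference, here is a minimal sketch of that pose-transfer procedure (the `transfer_poses` helper, file names, and pose-tuple layout are hypothetical, not from this repo):

```python
# Sketch: assign each query the absolute pose of its top-1 retrieved reference.
# The pose layout (qw qx qy qz tx ty tz) and all names here are assumptions.

def transfer_poses(top1_matches, ref_poses):
    """For each query, copy the absolute pose of its best-matched reference."""
    return {query: ref_poses[ref] for query, ref in top1_matches.items()}

# Hypothetical example with one reference pose and one retrieved match:
ref_poses = {"rear/0001.jpg": (1.0, 0.0, 0.0, 0.0, 10.0, 2.0, 0.5)}
top1 = {"query/0042.jpg": "rear/0001.jpg"}
print(transfer_poses(top1, ref_poses))
```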

@Tobias-Fischer
Contributor

Ok - @StephenHausler - let's sit together at some point to find where the culprit lies.

@marialeyvallina

Hi @StephenHausler, @Tobias-Fischer, some days ago I ran the Pittsburgh_WPCA4096 and MSLS_WPCA4096 models on RobotCar Seasons and obtained the following results with the NetVLAD retrieval:

Pittsburgh_WPCA4096:
day-all: 7.3 / 29.2 / 91.3, night-all: 0.9 / 2.6 / 2.4 -> overall 5.9 / 23.3 / 73.9
In the paper you report 7.0 / 24.9 / 76.6 for NetVLAD.

MSLS_WPCA4096:
day-all: 6.2 / 23.1 / 83.5, night-all: 0 / 0.5 / 4.2 -> overall 5.0 / 18.58 / 67.8

I calculate the overall as the weighted mean of the day and night numbers, weighted by the number of images taken at day and at night: overall = (day * 9300 + night * 2634) / (9300 + 2634)

For the Pittsburgh model the difference from the reported numbers seems reasonable to me (comparable to the variation between two different training runs), so I think the model is probably fine and the problem lies in the Patch-NetVLAD feature extraction part. I hope this info helps with the issue.
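The weighted-mean arithmetic as a quick sketch, with the query counts hard-coded from this comment (note the correction later in this thread that these are the v1 counts):

```python
def overall(day, night, n_day=9300, n_night=2634):
    """Recall averaged over day/night, weighted by the number of queries."""
    return (day * n_day + night * n_night) / (n_day + n_night)

print(round(overall(7.3, 0.9), 1))   # 5.9  (Pittsburgh_WPCA4096, tightest threshold)
print(round(overall(29.2, 2.6), 1))  # 23.3 (middle threshold)
```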

@Tobias-Fischer changed the title from "Can't reproduce Robotcar redults" to "Can't reproduce Robotcar results" on Dec 2, 2021
@HeartbreakSurvivor

> Hi @StephenHausler, @Tobias-Fischer, some days ago I ran the Pittsburgh_WPCA4096 and MSLS_WPCA4096 models on RobotCar Seasons and obtained the following results with the NetVLAD retrieval: […]

Hi, could you please tell me whether the dataset you ran the Pittsburgh_WPCA4096 model on is RobotCar Seasons V1 or V2?

@marialeyvallina

Hi @HeartbreakSurvivor, I ran RobotCar Seasons V2.

@HeartbreakSurvivor

> Hi @HeartbreakSurvivor, I ran RobotCar Seasons V2.

Hi, the thing is that RobotCar Seasons v1 has 9300 + 2634 = 11934 query images while v2 has 1872 query images, yet you said you ran RobotCar Seasons V2 and still calculated the overall with:

overall = (day * 9300 + night * 2634) / (9300 + 2634)

I don't know why, but it doesn't matter much.

What I really wonder is how you got these results. Did you just follow the QuickStart in the README.md file?
I also ran the Pittsburgh_WPCA4096 model on RobotCar Seasons V2 but got wrong results and don't know why. I just ran feature_extract.py and feature_match.py to get 'PatchNetVLAD_predictions.txt', took the pose of the best-matched database image as the estimated pose for each query image, and submitted the result to the benchmark website, but got wrong answers.
So I hope you could tell me how you obtained your results, which seem reasonable. Thanks.
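For concreteness, this is roughly how I pick the top-1 match from the predictions file. I am assuming each non-comment line has the form 'query_path reference_path' with matches listed best-first, so please verify this against the actual file layout:

```python
def top1_from_predictions(path):
    """Keep only the first (best-ranked) reference listed for each query.
    Assumes '<query_path> <reference_path>' lines, best match first, and
    '#'-prefixed comment lines -- check this against the real file."""
    top1 = {}
    with open(path) as f:
        for line in f:
            if not line.strip() or line.startswith('#'):
                continue
            query, ref = line.split()[:2]
            top1.setdefault(query, ref)  # only the first occurrence per query
    return top1

top1 = top1_from_predictions('PatchNetVLAD_predictions.txt')
```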

@marialeyvallina

Hi again @HeartbreakSurvivor

> Hi, the thing is that RobotCar Seasons v1 has 9300 + 2634 = 11934 query images while v2 has 1872 query images, yet you said you ran RobotCar Seasons V2 and still calculated the overall with: […]

Thank you very much for pointing this out; it seems I indeed mixed up the two versions. The overall should instead be calculated as:
overall = (day * 1443 + night * 429) / (1443 + 429)
The day/night distribution is very similar between v1 and v2, so the results do not change much:
Pittsburgh_WPCA4096: day-all: 7.3 / 29.2 / 91.3, night-all: 0.9 / 2.6 / 2.4 -> overall 5.8 | 23.1 | 73.2
MSLS_WPCA4096: day-all: 6.2 / 23.1 / 83.5, night-all: 0 / 0.5 / 4.2 -> overall 4.8 | 17.9 | 65.3

I do indeed use feature_extract.py and feature_match.py, and then use the NetVLAD_predictions.txt file (I have not evaluated Patch-NetVLAD yet, only NetVLAD). You have to be careful with the format of the poses, as explained in the dataset readme, but the retrieval itself should be fine.
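On the pose format: as far as I know, the benchmark expects each submission line as 'name qw qx qy qz tx ty tz' for the world-to-camera transform, so camera-to-world reference poses must be inverted first. A hedged sketch of that conversion (treat the dataset readme as the authoritative convention, not this snippet):

```python
import numpy as np
from scipy.spatial.transform import Rotation

def submission_line(name, T_cam_to_world):
    """Format one submission line, assuming the benchmark wants the
    world-to-camera pose as 'name qw qx qy qz tx ty tz' -- double-check
    against the dataset readme before submitting."""
    T = np.linalg.inv(T_cam_to_world)  # camera-to-world -> world-to-camera
    qx, qy, qz, qw = Rotation.from_matrix(T[:3, :3]).as_quat()  # scipy returns (x, y, z, w)
    tx, ty, tz = T[:3, 3]
    return f"{name} {qw} {qx} {qy} {qz} {tx} {ty} {tz}"

print(submission_line("rear/0042.jpg", np.eye(4)))  # identity-pose example
```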

@HeartbreakSurvivor

> Thank you very much for pointing this out; it seems I indeed mixed up the two versions. […] You have to be careful with the format of the poses, as explained in the dataset readme, but the retrieval itself should be fine.

Thank you very much for the reply, I will check my code.

@HeartbreakSurvivor

Hi again @marialeyvallina, it seems I have the same problem as you. I ran the Pittsburgh_WPCA4096 model on RobotCar Seasons V1 and obtained the following results with the NetVLAD retrieval:

day-all: 6.3 / 25.4 / 87.6, night-all: 0.8 / 2.5 / 16.5

which seem reasonable to me.
But when I use the Patch-NetVLAD retrieval, the results seem wrong:

day-all: 2.1 / 8.3 / 36.7, night-all: 0.1 / 1.3 / 13.9

I have tested Pittsburgh_WPCA4096 on RobotCar Seasons V1 twice just in case, but got the same result, shown below:
[screenshot of benchmark results]

So I agree with your point; the problem may lie in the Patch-NetVLAD feature extraction or feature matching part.
Hi @Tobias-Fischer, any hints about this issue? Did you test on the RobotCar Seasons V1 dataset? If so, could you please provide the test results?

@Tobias-Fischer
Contributor

Hi, @StephenHausler and I will be looking at this. However, the holiday season is coming up and we're tied up with other projects.

We haven't ever checked V1 as far as I remember.

I'm assuming you are aware that lower scores are better for NetVLAD (distances), but higher scores are better for Patch-NetVLAD (number of inliers)? So you need an argmax instead of an argmin to get the top-1 match.
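In code terms, given the scores of all references against one query (a minimal sketch with made-up numbers):

```python
import numpy as np

netvlad_distances = np.array([0.31, 0.12, 0.58])  # lower is better (descriptor distance)
patch_inlier_counts = np.array([140, 512, 87])    # higher is better (spatial inliers)

best_netvlad = int(np.argmin(netvlad_distances))  # index 1
best_patch = int(np.argmax(patch_inlier_counts))  # index 1
```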
