
Validate on the test set #11

Closed
swagshaw opened this issue May 8, 2024 · 2 comments

Comments

swagshaw commented May 8, 2024

valid_dataset = torch.utils.data.ConcatDataset([weak_val, synth_val, strong_val, test_dataset])

I am reproducing the results of ATST-SED. While working on stage 2, I noticed that the test set is leaked into the validation set.
Is this intended? I cannot find an explanation of this in the paper. Did you also use this train/valid/test split for the baseline BEATs model? Otherwise I doubt whether the improvement comes from it.

SaoYear (Member) commented May 8, 2024

Hi, thanks for noticing that.

There is no need to worry about data leakage. As you can see in the trainer file, the definition of the validation dataset (a torch.utils.data.Dataset) does not determine which data are used in the validation step. Three masks control the data used for validation: mask_weak, mask_synth and mask_real.
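
For illustration only, here is a minimal sketch of how such masks can keep the test-set samples out of the validation metrics even though they sit in the same concatenated batch. The batch layout and subset sizes below are assumptions for the sketch, not the actual trainer code:

```python
import torch

# A hypothetical concatenated validation batch laid out as
# [weak | synthetic | strong real | test]; sizes are made up for illustration.
n_weak, n_synth, n_real, n_test = 4, 4, 4, 4
batch_size = n_weak + n_synth + n_real + n_test

# Boolean masks selecting each labelled subset inside the batch.
idx = torch.arange(batch_size)
mask_weak = idx < n_weak
mask_synth = (idx >= n_weak) & (idx < n_weak + n_synth)
mask_real = (idx >= n_weak + n_synth) & (idx < n_weak + n_synth + n_real)

# Pretend these are model outputs for the whole concatenated batch.
logits = torch.randn(batch_size, 10)

# Validation metrics only ever see the masked subsets; the trailing
# test-set portion of the batch is not selected by any of the masks.
weak_outputs = logits[mask_weak]
synth_outputs = logits[mask_synth]
real_outputs = logits[mask_real]
```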

The reason the test_dataset appears in valid_dataset is that the validation results of ATST-SED are very good and keep increasing on the weak data, the strong real data and the strong synthetic data. We therefore wanted to make sure that these continuous improvements are solid, i.e. that the model's performance indeed increases on data unseen during training. So we did evaluate the model on the test set after each epoch, but we did NOT use it for model selection (model selection is determined by the val/obj_metric defined here: https://github.com/Audio-WestlakeU/ATST-SED/blob/main/train/local/ultra_sed_trainer.py#L474).
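
To make the distinction concrete, here is a rough sketch of the idea, assuming hypothetical metric names and a simple sum for obj_metric; the actual definition is at the link above:

```python
# Hypothetical per-subset scores computed from the masked validation data.
weak_f1 = 0.55      # weak subset (mask_weak)
synth_psds = 0.42   # synthetic subset (mask_synth)
test_psds = 0.47    # test set, evaluated after each epoch

# Checkpoint selection depends only on the masked validation subsets;
# the test-set score is logged for monitoring and never enters obj_metric.
obj_metric = weak_f1 + synth_psds
print(f"val/obj_metric = {obj_metric:.3f} (test score {test_psds:.3f} is monitoring only)")
```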

There should be no doubt about the improvements because:

  1. We found that the current validation method actually did not pick the best model on the development dataset.
  2. The model improvement on the PublicEval dataset is also significant.

Anyway, this line of code is indeed suspicious; I will fix it and leave a notification on the home page.

Many thanks for mentioning that!

SaoYear added a commit that referenced this issue May 8, 2024
swagshaw (Author) commented May 8, 2024

I see. The obj_metric is independent of the test_dataset here because of the masks. Thank you for the quick explanation.

swagshaw closed this as completed May 8, 2024