AssertionError when calculating BLEU score #15

thechargedneutron · 2023-06-30T20:21:47Z

Thanks for the code and documentation. I am running the captioning finetuning experiment on MSRVTT. During the evaluation stage, the code stops with an AssertionError here. It seems the hypo variable contains the same sentence repeated multiple times. Could you please tell me whether I missed a step or, if not, why this error occurs and how to solve it?

Here is the generated hypo variable and the ref variable output for video9894:

['in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera']
['a boy is talking to his roomnates who are in different room', 'a man asks another man to help him with chores', 'a man avoids helping his roommates', 'a man doesn t help his friends with anything', 'a man drinking some beer', 'a man walking in an apartment', 'a person communicating with other person', 'he was in the kitchen', 'man asking another man to do the dishes', 'man refusing to help his roommate', 'roommate continues to say no each time he is asked to help with something', 'the boys meet courier boy', 'the man asks for help', 'the youtube nigahiga doesn t want to help anyone', 'two friends are having fun', 'two men are talking in a kitchen', 'two young men talking to each other about doing dishes', 'a man walking in an apartment', 'a man drinking some beer', 'roommate continues to say no each time he is asked to help with something']

Thanks!


TXH-mercury · 2023-07-04T02:49:51Z

Hi @thechargedneutron, this happens because VALOR's data loading code automatically picks another random sample as a replacement whenever loading a sample fails. This should only happen during training, but the code does not restrict it. So when a broken sample is encountered at test time, another sample is randomly chosen from the test set, which causes duplicate validation of the chosen samples and results in the captioning bug.
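To illustrate the failure mode described above, here is a minimal sketch, not VALOR's actual code. `ToyDataset`, `collect_hypotheses`, and the video ids are hypothetical names used only for illustration. It assumes captions are grouped per video id before scoring, so a replaced sample ends up with more than one hypothesis.

```python
import random
from collections import defaultdict


class ToyDataset:
    """Hypothetical stand-in for a test dataset (not VALOR's class)."""

    def __init__(self, ids, broken):
        self.ids = list(ids)
        self.broken = set(broken)  # ids whose video/audio fails to load

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, i):
        video_id = self.ids[i]
        # Unguarded resampling: a broken sample is silently replaced by
        # a random one, even at test time. (Assumes at least one id loads.)
        while video_id in self.broken:
            video_id = random.choice(self.ids)
        return video_id


def collect_hypotheses(dataset):
    """Group one generated caption per returned sample by video id."""
    hypos = defaultdict(list)
    for i in range(len(dataset)):
        video_id = dataset[i]
        hypos[video_id].append(f"caption for {video_id}")
    return hypos


dataset = ToyDataset(["video0", "video1", "video2"], broken=["video1"])
hypos = collect_hypotheses(dataset)
# "video1" never appears, and some other id received an extra caption.
duplicated = [vid for vid, caps in hypos.items() if len(caps) > 1]
print(duplicated)
```

A scorer that expects exactly one hypothesis per video would then fail on whichever id is listed in `duplicated`.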

Solution:

VALOR/data/data.py

Line 369 in 7a047df

if video_pixels is None: ###wrong img/video and needs to resample

Change 'if video_pixels is None:' to 'if video_pixels is None and self.training:'

VALOR/data/data.py

Line 377 in 7a047df

if audio_spectrograms is None: ### wrong audio and needs to resample

Change 'if audio_spectrograms is None:' to 'if audio_spectrograms is None and self.training:'
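The effect of the two guards can be sketched as follows. This is a toy illustration under stated assumptions, not the code in VALOR/data/data.py: `GuardedToyDataset` and its `samples` mapping are hypothetical, and the explicit `RuntimeError` is an assumption added here to make the test-time failure visible.

```python
import random


class GuardedToyDataset:
    """Hypothetical dataset that resamples only during training."""

    def __init__(self, samples, training):
        self.samples = samples        # id -> pixels, or None if broken
        self.ids = list(samples)
        self.training = training

    def __getitem__(self, i):
        video_id = self.ids[i]
        video_pixels = self.samples[video_id]
        if video_pixels is None and self.training:
            # Training only: replace the broken sample with a loadable one.
            good = [k for k in self.ids if self.samples[k] is not None]
            video_id = random.choice(good)
            video_pixels = self.samples[video_id]
        if video_pixels is None:
            # Test time: fail loudly (assumed behaviour for this sketch).
            raise RuntimeError(f"failed to load {video_id}")
        return video_id, video_pixels


samples = {"video0": [0.1, 0.2], "video1": None}
print(GuardedToyDataset(samples, training=True)[1][0])   # replaced sample
try:
    GuardedToyDataset(samples, training=False)[1]
except RuntimeError as err:
    print(err)
```

With the guard in place, every test id is either evaluated exactly once or reported as broken, so no video collects duplicate hypotheses.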

With this change, broken samples at test time raise an error instead of being silently replaced by another sample, and you can fix the video/audio data according to the error message. To see the real error, you can comment out the 'try except' at

VALOR/data/data.py

Line 179 in d616e97

try:

I will fix this in the latest code. Thanks for pointing it out.

thechargedneutron mentioned this issue Jul 1, 2023

Plan to release finetuned models? #11

Closed

TXH-mercury closed this as completed May 27, 2024
