AssertionError when calculating BLEU score #15

Closed
thechargedneutron opened this issue Jun 30, 2023 · 1 comment

@thechargedneutron

Thanks for the code and documentation. I am running the captioning finetuning experiment on MSRVTT. During the evaluation stage, the code stops with an AssertionError here. It seems the hypo variable contains the same sentence repeated multiple times. Can you please tell me whether I missed a step and, if not, why this error occurs and how to solve it?

Here is the generated hypo variable and the ref variable output for video9894:

['in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera', 'in the room a man in red was talking to the camera']
['a boy is talking to his roomnates who are in different room', 'a man asks another man to help him with chores', 'a man avoids helping his roommates', 'a man doesn t help his friends with anything', 'a man drinking some beer', 'a man walking in an apartment', 'a person communicating with other person', 'he was in the kitchen', 'man asking another man to do the dishes', 'man refusing to help his roommate', 'roommate continues to say no each time he is asked to help with something', 'the boys meet courier boy', 'the man asks for help', 'the youtube nigahiga doesn t want to help anyone', 'two friends are having fun', 'two men are talking in a kitchen', 'two young men talking to each other about doing dishes', 'a man walking in an apartment', 'a man drinking some beer', 'roommate continues to say no each time he is asked to help with something']

Thanks!

@TXH-mercury
Owner

TXH-mercury commented Jul 4, 2023

Hi @thechargedneutron, this happens because VALOR's code, when loading a data sample fails, automatically picks another sample at random as a replacement. This should only happen during training, but in the code I don't restrict it. So when a bad sample is encountered during testing, another sample is randomly chosen from the test set, which causes the chosen samples to be validated more than once and results in the captioning bug.
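To make the mechanism concrete, here is a minimal, self-contained sketch. It is not VALOR's actual dataset class: `ToyCaptionDataset`, `_load`, and `duplicated_ids` are hypothetical names used only for illustration. It shows how unconditional resampling makes one video id appear more than once in a single evaluation pass, which is the kind of duplication seen in the hypo list above.

```python
import random
from collections import Counter


class ToyCaptionDataset:
    """Hypothetical stand-in for a video-caption dataset (not VALOR's class)."""

    def __init__(self, ids, broken_ids, seed=0):
        self.ids = list(ids)
        self.broken = set(broken_ids)
        self.rng = random.Random(seed)

    def _load(self, video_id):
        # Returning None mimics a video that fails to decode.
        return None if video_id in self.broken else f"pixels:{video_id}"

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, i):
        video_id = self.ids[i]
        pixels = self._load(video_id)
        while pixels is None:
            # Unconditional resampling: silently swap in another sample.
            video_id = self.rng.choice(self.ids)
            pixels = self._load(video_id)
        return video_id, pixels


def duplicated_ids(dataset):
    """Return the ids that were yielded more than once in a single pass."""
    counts = Counter(dataset[i][0] for i in range(len(dataset)))
    return {vid: n for vid, n in counts.items() if n > 1}


ds = ToyCaptionDataset(["video1", "video2", "video3"], broken_ids=["video2"])
dups = duplicated_ids(ds)
print(dups)  # one healthy id appears twice; "video2" never appears
```

Running a check like `duplicated_ids` over the test set before scoring is one way to spot which samples were substituted.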

Solution:

if video_pixels is None: ###wrong img/video and needs to resample

Change 'if video_pixels is None:' to 'if video_pixels is None and self.training:'.

if audio_spectrograms is None: ### wrong audio and needs to resample

Change 'if audio_spectrograms is None:' to 'if audio_spectrograms is None and self.training:'.
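The guard described above can be sketched as follows. This is an illustrative toy under stated assumptions, not the repository's code: `GuardedDataset` and `_load` are hypothetical, and the explicit `RuntimeError` is an assumption added here so that the failure is visible (the actual code may fail differently once a `None` sample is passed along).

```python
import random


class GuardedDataset:
    """Hypothetical dataset that resamples bad items only while training."""

    def __init__(self, ids, broken_ids, training, seed=0):
        self.ids = list(ids)
        self.broken = set(broken_ids)
        self.training = training
        self.rng = random.Random(seed)

    def _load(self, video_id):
        # Returning None mimics a video that fails to decode.
        return None if video_id in self.broken else f"pixels:{video_id}"

    def __len__(self):
        return len(self.ids)

    def __getitem__(self, i):
        video_id = self.ids[i]
        video_pixels = self._load(video_id)
        if video_pixels is None and self.training:
            # Training only: replace the bad sample with a random good one.
            while video_pixels is None:
                video_id = self.rng.choice(self.ids)
                video_pixels = self._load(video_id)
        if video_pixels is None:
            # Assumption: fail loudly at test time instead of substituting.
            raise RuntimeError(f"could not load sample {video_id!r}")
        return video_id, video_pixels


train_ds = GuardedDataset(["a", "b"], broken_ids=["b"], training=True)
eval_ds = GuardedDataset(["a", "b"], broken_ids=["b"], training=False)
print(train_ds[1])  # ('a', 'pixels:a'): the bad sample was replaced
```

With `training=False`, indexing the bad sample (`eval_ds[1]`) raises instead of returning a substitute, so every test id is scored at most once.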

With this change, a bad sample at test time raises an error instead of searching for a replacement, and you can fix the (video/audio) data according to the error message. To see the real error information, you can comment out the 'try except' at

try:

I will fix this in the latest code; thanks for pointing it out.
