Question about how to get the result in table4 in the paper #4
Comments
Hi, thanks for your interest! The training configuration in this work is indeed complicated, which makes reproduction difficult. I would like to answer your question step by step:
Hope these suggestions help!
Thanks for your detailed reply. I will try to follow your instructions. Thank you again for your remarkable work and your kind response. 👍
Thanks for your code. It's remarkable work!
I'm trying to reproduce the results in the paper "FINE-TUNE THE PRETRAINED ATST MODEL FOR SOUND EVENT DETECTION". Using the same settings as this repository, I get nearly the same results as shown in Table 2. But when I try to reproduce the result in Table 4 (i.e. the SOTA result in the paper, psds1=0.583, psds2=0.810), I only get psds1=0.5695, psds2=0.7997, even though I changed the median filter lengths to {3, 28, 7, 4, 7, 22, 48, 19, 10, 50}, following "RCT: Random consistency training for semi-supervised sound event detection".
Is the difference caused by a wrong median filter length setting? Could it come from the PyTorch Lightning training mode (I set it to DP, because some bugs occur when running in DDP mode with multiple GPUs)? Or is it just the randomness of the results?
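For anyone comparing post-processing settings, here is a minimal sketch of class-wise median filtering applied to frame-level decisions. It is not the repository's implementation: the names `median_filter_1d` and `classwise_median_filter` are hypothetical, the edge handling (repeating the first and last frame) is an assumption, and the thread does not say which class each window length belongs to. Only the ten window lengths come from the question.

```python
from statistics import median_low

# Class-wise window lengths (in frames) quoted in the question.
# The mapping from position to class label is NOT given in this thread.
FILTER_LENGTHS = [3, 28, 7, 4, 7, 22, 48, 19, 10, 50]


def median_filter_1d(values, size):
    """Median-filter a sequence with a sliding window of `size` frames.

    Edges are padded by repeating the first/last value. This is an
    assumption; libraries differ in their default edge handling.
    """
    if size <= 1 or not values:
        return list(values)
    left = size // 2
    right = size - 1 - left
    padded = [values[0]] * left + list(values) + [values[-1]] * right
    # median_low keeps binary inputs binary even for even-sized windows.
    return [median_low(padded[i:i + size]) for i in range(len(values))]


def classwise_median_filter(decisions, lengths):
    """Apply a different window length to each class.

    `decisions` holds one frame-level sequence per class.
    """
    if len(decisions) != len(lengths):
        raise ValueError("need exactly one filter length per class")
    return [median_filter_1d(seq, n) for seq, n in zip(decisions, lengths)]


if __name__ == "__main__":
    # An isolated one-frame detection is removed by a 3-frame window.
    print(median_filter_1d([0, 1, 0, 0, 0], 3))
    # A one-frame gap inside an event is filled.
    print(median_filter_1d([1, 1, 0, 1, 1], 3))
```

Longer windows smooth more aggressively, so a different length per class changes which short events survive, and that can shift the PSDS scores on its own.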