Cannot achieve paper's RawNet3 results using an official recipe #160

Open
happyjin opened this issue Oct 11, 2022 · 4 comments

Comments

@happyjin

Dear author,

I cannot reproduce the paper's results with the RawNet3 script, which should reach an EER of 0.8932. I am wondering whether the paper's result is wrong. Could you please upload a recipe so that we can reproduce the result reported in the paper?

@Jungjee
Collaborator

Jungjee commented Oct 18, 2022

Hi, can you share your results?

I cannot share the exact training recipe because it includes internal code, which is one of the reasons I shared the trained weights instead.
However, the model architecture is exactly the same, so you should get similar results.

@JunLi0514

Hi, we're encountering a similar issue. The pre-trained RawNet3 achieves an EER of 0.9809% with full-length enrollment and test utterances. But when we train RawNet3 on the VoxCeleb1 and VoxCeleb2 dev sets with noise and reverberation augmentation, the EER increases to 1.20% after 40 epochs. In addition, the EER rises to 1.4% after applying speed perturbation to the VoxCeleb2 dev set.

We use the mixedprec and distributed arguments to accelerate training.

Could you please provide some advice on how to address this? Thank you!
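
For readers comparing the EER figures in this thread, here is a minimal sketch of how an equal error rate can be computed from verification scores. It is not the scoring code used by the paper or by this repo; the function name `compute_eer` and the convention that a higher score means "same speaker" are assumptions made for illustration, and tied scores are not handled specially.

```python
import numpy as np


def compute_eer(scores, labels):
    """Return the equal error rate in percent.

    scores: similarity scores, higher means more likely the same speaker (assumed).
    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Sort trials from the highest score to the lowest.
    order = np.argsort(-scores, kind="mergesort")
    labels = labels[order]

    n_target = labels.sum()
    n_nontarget = len(labels) - n_target

    # Entry k describes the operating point that accepts the top-k trials.
    false_accepts = np.concatenate(([0], np.cumsum(1 - labels)))
    true_accepts = np.concatenate(([0], np.cumsum(labels)))

    far = false_accepts / n_nontarget        # false acceptance rate
    frr = 1.0 - true_accepts / n_target      # false rejection rate

    # The EER is taken where the two error rates are closest to each other.
    idx = np.argmin(np.abs(far - frr))
    return 100.0 * (far[idx] + frr[idx]) / 2.0
```

Because the result depends on which trials are scored and how each score is produced, two runs are only comparable when they use the same trial list and the same test-time protocol.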

@Jungjee
Collaborator

Jungjee commented Dec 13, 2023

@JunLi0514, hi, thanks for reporting your status. Regarding the EER of 0.98%, did you follow the same setup, segmenting each utterance into ten 4-second segments? If you input the full-length utterance, that differs from what we did, and hence the result might be affected.
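
The ten 4-second segment setup mentioned above can be sketched as follows. This is only an illustration under stated assumptions, not the exact evaluation code: the function name `split_into_segments`, the 16 kHz sample rate, the evenly spaced start offsets, and the wrap-around padding for short utterances are all assumptions that the thread does not confirm.

```python
import numpy as np


def split_into_segments(audio, sample_rate=16000, seg_seconds=4.0, num_segments=10):
    """Cut a 1-D waveform into `num_segments` fixed-length segments.

    Segment start offsets are spaced evenly from the beginning of the
    utterance to the last position where a full segment still fits
    (assumed protocol; segments overlap when the utterance is short).
    """
    audio = np.asarray(audio)
    seg_len = int(round(seg_seconds * sample_rate))

    # Utterances shorter than one segment are extended by repeating the signal.
    if audio.shape[0] < seg_len:
        audio = np.pad(audio, (0, seg_len - audio.shape[0]), mode="wrap")

    starts = np.linspace(0, audio.shape[0] - seg_len, num=num_segments)
    return np.stack([audio[int(s): int(s) + seg_len] for s in starts])
```

Each segment would then be passed through the model separately, and the per-segment embeddings or scores combined into a single trial score; how they are combined is not specified in this thread.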

Note that I trained RawNet3 using another codebase. Only the model architecture has been added to this repo (VoxCeleb_trainer).

FYI, due to my change of affiliation, I recently developed a reproducible RawNet3 recipe in ESPnet2, where I achieved an EER of 0.73% with RawNet3, and the result was reproduced several times when tested.

@JunLi0514

Hi @Jungjee, thank you for your quick reply! The work is impressive and the training recipe described in the paper is detailed. Following your kind advice, we'll test the pretrained model with 4-second segments and try training with ESPnet2 ^ ^
