Use on own audio stream #12
Hi! Thanks for your interest, and sorry for the late response! The CRNN model for SED is causal, while ATST-SED is not. ATST-SED is built on a pretrained ATST model, which is Transformer-based, and the self-attention modules in ATST are NOT the "masked" (causal) version, so each frame attends to future frames as well as past ones. ATST therefore cannot be used in an online manner. As for obtaining the audio cue, if you want to capture it offline, you can refer to this issue. Hope this helps, and please contact me if you have any other problems. Nian
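To illustrate why unmasked self-attention rules out online use, here is a minimal toy sketch in plain Python. It is not the ATST code: the function `self_attention`, the helper `first_frames_unchanged`, and the example frames are all made up for illustration, and the attention has no learned weights. It only shows that without a causal mask, the outputs for frames already processed change as soon as a new frame arrives.

```python
import math

def self_attention(frames, causal):
    """Toy single-head dot-product self-attention (hypothetical, no learned weights).

    Queries, keys, and values are the input frames themselves.
    """
    n = len(frames)
    outputs = []
    for t in range(n):
        # With a causal mask, frame t may only attend to frames 0..t.
        visible = range(t + 1) if causal else range(n)
        dim = len(frames[t])
        scale = math.sqrt(dim)
        scores = [
            sum(q * k for q, k in zip(frames[t], frames[j])) / scale
            for j in visible
        ]
        # Numerically stable softmax over the visible frames.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * frames[j][d] for w, j in zip(weights, visible)) / total
            for d in range(dim)
        ])
    return outputs

# Made-up 2-dimensional "audio frames".
past = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
future = [[2.0, -1.0]]

def first_frames_unchanged(causal):
    """Check whether outputs for the past frames survive the arrival of a new frame."""
    before = self_attention(past, causal)
    after = self_attention(past + future, causal)[:len(past)]
    return all(
        math.isclose(a, b, abs_tol=1e-12)
        for x, y in zip(before, after)
        for a, b in zip(x, y)
    )

masked_is_stable = first_frames_unchanged(causal=True)
unmasked_is_stable = first_frames_unchanged(causal=False)
print("masked outputs stable:  ", masked_is_stable)
print("unmasked outputs stable:", unmasked_is_stable)
```

In this sketch the masked variant leaves earlier outputs untouched when a frame is appended, while the unmasked variant does not, which is the property that prevents frame-by-frame streaming inference.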
Thank you for the clarification.
Which model are you referring to? Is it in a different repo?
Sorry, by CRNN I mean the baseline model of the DCASE challenge 2021-2023; you can find it here. This CRNN is also included in the ATST-SED architecture, which is why I mentioned it.
@SaoYear
Thank you for your work!
I want to be able to detect a unique audio cue in the audio captured by my video stream. Assuming that I have trained a fine-tuned model, how can I go about using it in a causal manner?