Use on own audio stream #12

HeChengHui · 2024-06-07T08:58:15Z

@SaoYear
Thank you for your work!
I want to detect a unique audio cue in a video stream that also captures audio. Assuming I have trained a fine-tuned model, how can I use it in a causal manner?

SaoYear · 2024-06-12T06:31:34Z

Hi! Thanks for the interest and sorry for the late response!

The CRNN model for SED is causal, while ATST-SED is not. ATST-SED is composed of a pretrained ATST model, which is based on the Transformer, and the self-attention modules in ATST are NOT the "masked" version, so ATST cannot be used in an online manner.
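To illustrate why unmasked self-attention rules out online use, here is a minimal sketch in plain Python. It is not the ATST code: the `attention` function is a hypothetical toy with no learned weights, where each scalar frame serves as its own query, key, and value. It only shows that without a causal mask, changing a future frame changes the output at an earlier frame.

```python
import math

def attention(x, causal):
    """Toy single-head self-attention over a list of scalar frames.

    Hypothetical illustration only: queries, keys and values are the
    frames themselves, which is enough to show how information flows
    across time.
    """
    n = len(x)
    out = []
    for t in range(n):
        # With a causal mask, frame t may only attend to frames 0..t.
        last = t + 1 if causal else n
        scores = [x[t] * x[s] for s in range(last)]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * x[s] for s, w in enumerate(weights)) / total)
    return out

frames = [0.5, -1.0, 2.0, 0.3]
changed = frames[:-1] + [5.0]  # only the last (future) frame differs

full_a = attention(frames, causal=False)
full_b = attention(changed, causal=False)
mask_a = attention(frames, causal=True)
mask_b = attention(changed, causal=True)

# Unmasked: the output at frame 0 depends on the future frame.
print(full_a[0] != full_b[0])
# Masked: the outputs at frames 0..2 ignore the changed future frame.
print(mask_a[:3] == mask_b[:3])
```

In the unmasked case every output needs the whole clip, so nothing can be emitted until the clip ends. Note that simply adding a mask at inference time to a model trained without one would change its behavior; the sketch only demonstrates the dependency on future frames.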

As for obtaining the audio cue: if you want to capture it offline, you can refer to this issue.

Hope this helps, and please contact me if you have any other problems.

Nian

HeChengHui · 2024-06-12T07:03:21Z

@SaoYear

Thank you for the clarification.

> The CRNN model for SED is causal

Which model are you referring to? Is it in a different repo?

SaoYear · 2024-06-13T05:00:19Z

Sorry, by CRNN I mean the baseline model of the DCASE Challenge 2021-2023; you can find it here.

This CRNN is also included in the ATST-SED architecture, which is why I mentioned it.
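As a rough sketch of what using a causal recurrent model on a stream could look like, the toy example below processes audio frames chunk by chunk while carrying the recurrent state forward. It is not the DCASE baseline: `recurrent_step` and `run` are hypothetical stand-ins for a recurrent cell, and the sketch assumes the recurrence is unidirectional (a bidirectional layer would need future frames and could not be run this way).

```python
def recurrent_step(state, frame, decay=0.9):
    """One step of a toy unidirectional recurrence (stand-in for an RNN cell)."""
    state = decay * state + (1.0 - decay) * frame
    return state, state  # (new state, per-frame output)

def run(frames, state=0.0):
    """Process frames in order, returning per-frame outputs and the final state."""
    outputs = []
    for frame in frames:
        state, y = recurrent_step(state, frame)
        outputs.append(y)
    return outputs, state

# Hypothetical per-frame features arriving from a stream.
stream = [0.0, 0.2, 1.0, 0.9, 0.1, 0.0, 0.8, 0.7]

# Offline: the whole clip is available at once.
offline, _ = run(stream)

# Online: frames arrive in chunks of 3; the state is carried between chunks.
online, state = [], 0.0
for start in range(0, len(stream), 3):
    chunk = stream[start:start + 3]
    out, state = run(chunk, state)
    online.extend(out)

# The chunked outputs match the offline outputs frame for frame.
print(online == offline)
```

The design point is that a causal model's output at frame t depends only on frames up to t, so keeping the recurrent state between chunks is enough to reproduce the offline result on a live stream.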

HeChengHui closed this as completed Jun 13, 2024
