ESP32 Example(s) #49

Open
jorgie0 opened this issue Mar 26, 2024 · 4 comments
Comments

@jorgie0

jorgie0 commented Mar 26, 2024

Hi,
The olaf_fp_ref_mem.h included in the zip file from https://0110.be/files/attachments/475/ESP32-Olaf.zip has a number of NUL characters on line 7.
[image]

In addition, its content differs from the same file in the Arduino sketch folder for esp32_inmp441_olaf.

Should the NULs be there?
What song does this file represent?
What song does this file represent?

Second set of questions:
When trying to compile the Arduino sketch esp32_inmp441_olaf.ino, I found I had to copy the content of the files referenced in the esp32_inmp441_olaf folder. Despite that, I have been unable to link the code successfully.

Are you able to provide guidance on how to build the original hardware and ESP32 code that you used here:
https://0110.be/posts/Olaf_-_Acoustic_fingerprinting_on_the_ESP32_and_in_the_Browser
For example, what ESP32 board did you use, and which pins was the microphone connected to?

I believe that this is a very interesting project and would like to replicate it.

Sean

@JorenSix
Owner

Hi,

Thanks for the interest. I would advise against using the code from the blog post. The GitHub version has 4 years of additional development behind it and is much cleaner.

Reading a MEMS microphone takes a bit of wiring, and the pinout very much depends on which ESP32 you are working with. I believe I worked with a WROOM Devkit V1 dev board and an ESP32 Thing by SparkFun.

The INMP441 Arduino patch should work when the files are copied/linked in the folder of the patch. Note that I had to rename .c files to .cpp to get things working. See https://github.com/JorenSix/Olaf/blob/master/ESP32/setup_arduino_patch.rb which sets up an Arduino folder with linked files.
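
As an illustration of the copy-and-rename step described above, here is a minimal Python sketch. It is not the Ruby script linked above and does not reproduce its behaviour: it copies files instead of linking them, and the function name `populate_sketch_folder` and the demo file names are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def populate_sketch_folder(src_dir, sketch_dir):
    """Copy C sources and headers into a sketch folder, renaming .c to .cpp.

    Returns the sorted names of the files written to sketch_dir.
    """
    src_dir, sketch_dir = Path(src_dir), Path(sketch_dir)
    sketch_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for src in sorted(src_dir.iterdir()):
        if not src.is_file():
            continue
        if src.suffix == ".c":
            # Rename .c to .cpp, as described in the comment above.
            target = sketch_dir / (src.stem + ".cpp")
        elif src.suffix == ".h":
            target = sketch_dir / src.name  # headers keep their name
        else:
            continue  # ignore anything that is not a C source or header
        shutil.copyfile(src, target)
        written.append(target.name)
    return sorted(written)


if __name__ == "__main__":
    # Self-contained demo on throwaway directories with made-up file names.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        src.mkdir()
        for name in ("demo_a.c", "demo_a.h", "notes.txt"):
            (src / name).write_text("// placeholder\n")
        print(populate_sketch_folder(src, Path(tmp) / "sketch"))
```

Copies go stale when the upstream sources change, which is presumably why the linked script uses links; the sketch trades that for portability.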

Perhaps I should try to get Olaf working on an RP2040 with built-in microphone to get an easy to use demo...

Good luck with your project!

Joren

@jorgie0
Author

jorgie0 commented Apr 11, 2024

Hi Joren,
I have managed to get it to compile and link. I'm using the INMP441 and an ESP32 WROOM.
I've had mixed results when using it. Sometimes, without any music playing, it reports matches for a few seconds.
As a consequence I'm wondering what I'm doing wrong, as I'm losing confidence that I have got it working correctly.
I have used your instructions and created a new memory file using AC/DC TNT.
I've run the code quite a few times; sometimes I get a match of 1 minute 18 seconds and sometimes none at all.
Here is some of the output after the song has been playing for around 1 min 30 seconds:
15:48:20.808 -> 0, 0.00, 0.00, , 0, 0.00, 0.00
15:48:24.504 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)
15:48:24.504 -> 4, 82.70, 92.77, TNT, 666, 8.02, 18.08
15:48:28.329 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)
15:48:28.329 -> 4, 82.70, 92.77, TNT, 666, 8.02, 18.08
15:48:31.762 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)
15:48:31.762 -> 4, 82.70, 92.77, TNT, 666, 8.02, 18.08
15:48:36.022 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)
15:48:36.022 -> 4, 82.70, 92.77, TNT, 666, 8.02, 18.08
15:48:39.427 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)
15:48:39.427 -> 0, 0.00, 0.00, , 0, 0.00, 0.00
15:48:41.461 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)
15:48:44.066 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)

Any suggestions?
Regards
Sean
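
Serial output like the log in this comment is easier to judge once header rows, empty results and repeated reports are filtered out. The following is a minimal sketch, assuming the ` -> ` timestamp prefix and the seven comma-separated columns named in the log's header row. The names `Match`, `parse_line` and `distinct_matches` are hypothetical and not part of Olaf.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    match_count: int
    q_start: float
    q_stop: float
    ref_path: str
    ref_id: int
    ref_start: float
    ref_stop: float


def parse_line(line):
    """Return a Match for a result row; None for headers, empty results, junk."""
    _timestamp, sep, payload = line.partition(" -> ")
    if not sep:
        return None
    fields = [f.strip() for f in payload.split(",")]
    if len(fields) != 7:
        return None
    try:
        match = Match(int(fields[0]), float(fields[1]), float(fields[2]),
                      fields[3], int(fields[4]),
                      float(fields[5]), float(fields[6]))
    except ValueError:
        return None  # header row: the fields are column names, not numbers
    return match if match.match_count > 0 else None


def distinct_matches(lines):
    """Collapse repeated reports of the same match, keeping first-seen order."""
    seen = []
    for line in lines:
        match = parse_line(line)
        if match is not None and match not in seen:
            seen.append(match)
    return seen


if __name__ == "__main__":
    log = [
        "15:48:20.808 -> 0, 0.00, 0.00, , 0, 0.00, 0.00",
        "15:48:24.504 -> match count (#), q start (s) , q stop (s), ref path, ref ID, ref start (s), ref stop (s)",
        "15:48:24.504 -> 4, 82.70, 92.77, TNT, 666, 8.02, 18.08",
        "15:48:28.329 -> 4, 82.70, 92.77, TNT, 666, 8.02, 18.08",
    ]
    for m in distinct_matches(log):
        # Offset between query time and reference time for this match.
        print(m.ref_path, m.match_count, round(m.q_start - m.ref_start, 2))
```

Applied to the log above, the four non-empty result rows carry identical values, so they collapse to a single distinct match rather than four independent detections.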

@jorgie0
Author

jorgie0 commented Apr 13, 2024

Hi Joren,
One issue was that I had not grounded the L/R pin of the microphone.
Having corrected that, I'm still getting results that are not quite what I expect. I'm getting 4 or 5 results where match count = 4 when there is little to no sound around. Is this normal?

At other times, when I'm playing the song I expect to be identified, it either does not identify the song at all or takes over a minute. Is this also normal?

I've also tried playing the same track from Spotify, and it does not identify the song at all. I have changed the gain setting and this has helped a little.

My reason for asking is to ensure I'm not expecting too much of the software which you have so kindly shared.

Regards
Sean

@JorenSix
Owner

Hi Sean,

I would expect it to work more reliably. A few tips to improve the over-the-air queries:

  • Check the ESP32 microphone (an ungrounded L/R pin is only one of many possible problems). In my experience these I2S/I2C/PWM microphones are very picky about sample rates, sample formats and gain. Check the WiFiMicrophone example, which lets you listen to the microphone on your PC. Make sure the microphone does what you think it does.
  • Test the 'mem' version of Olaf on your PC with the music you want to recognise and with a microphone to set a baseline. Perhaps create a small dataset of reference and recorded query samples to see how well the algorithm can work. The ESP32 and computer versions can be configured in the same way. Also check e.g. the Spotify version.
  • Olaf only recognises the same recorded audio. A live version of the same song will not be recognised!
  • Create the reference fingerprints on the microcontroller and not on the PC. This takes into account microphone characteristics, gain, environment, ... and perhaps FFT bin shift?
  • Check the algorithm parameters and perhaps extract more fingerprints (test the settings on the PC version).
  • The algorithm does not work that well for very noisy signals (white noise), for sparse signals (semi-silence), or for signals with only vertical elements in the spectrogram (extremely percussive music) or only horizontal ones (a sequence of pure sine waves). AC/DC should be reasonable, but I imagine a noisy spectrum, so not the best fit.
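
One way to act on the first tip is to check a block of captured samples numerically. The sketch below is a minimal example of that idea, not something taken from Olaf. It assumes the samples are already available as signed 16-bit integers; how they are captured and converted is outside its scope, and the name `pcm16_stats` is hypothetical.

```python
import math


def pcm16_stats(samples):
    """Summarise a block of signed 16-bit PCM samples.

    Returns the DC offset (mean), the RMS level in dBFS after removing the
    DC offset, and the fraction of samples at the 16-bit limits.
    """
    if not samples:
        raise ValueError("no samples")
    n = len(samples)
    mean = sum(samples) / n
    rms = math.sqrt(sum((s - mean) ** 2 for s in samples) / n)
    clipped = sum(1 for s in samples if s <= -32768 or s >= 32767)
    return {
        "dc_offset": mean,
        "rms_dbfs": 20 * math.log10(rms / 32768) if rms > 0 else float("-inf"),
        "clipped_ratio": clipped / n,
    }


if __name__ == "__main__":
    # Synthetic test tone: 1 kHz sine at half of full scale, 16 kHz sample rate.
    tone = [round(16384 * math.sin(2 * math.pi * 1000 * i / 16000))
            for i in range(1600)]
    print(pcm16_stats(tone))
```

A large DC offset, a level that does not change when music plays, or a high clipped ratio would each suggest a sample format or gain problem rather than a fingerprinting problem.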

Good luck with your project!

Joren
