Releases: kahrendt/microWakeWord

New Beta V2 Models with Extra Key

10 Jul 18:23
d0ef708
Pre-release

Adds a "trained_languages" key that describes the primary language/pronunciation of the training samples.

These models require a new version of the micro_wake_word component in ESPHome, as they use a 10 ms step size instead of the original 20 ms. It should be available in ESPHome's 2024.7 release. These models are faster and more accurate. Up to 3 models can run on a regular ESP32 at the same time (e.g., VAD, "okay nabu", and "hey mycroft"). The ESP32-S3 can run all 4 concurrently. I am still working on training a new "hey jarvis" model.

The default settings for each of these models are benchmarked so that they have at most 0.16 false accepts per hour on the DipCo set and fewer than 0.1 false accepts per hour on the PicoVoice benchmark. The false rejection rates at these default settings are lower than those of the corresponding v1 models at their default settings. Note that since these are new models, you have to re-tune any custom probability cutoffs.

If you want to test these out now, you must be on the dev branch of ESPHome and use an external component. The YAML syntax may change without notice, so be aware that this configuration may break in the future! The implementation is backwards compatible, so you can still use the old models. However, you cannot use the older models at the same time as the new models; it is one or the other.

external_components:
  - source:
      type: git
      url: https://github.com/kahrendt/esphome
      ref: mww-v2-external-library
    refresh: 0s
    components: [ micro_wake_word ]

micro_wake_word:
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
  vad:
    model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/vad.json
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/okay_nabu.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/alexa.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_jarvis.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_mycroft.json
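
Since custom probability cutoffs have to be re-tuned for the new models, a per-model override might look like the fragment below. This is only a sketch: it assumes the component accepts a `probability_cutoff` option under each entry of `models`, which may not match the syntax on the dev branch, and the value shown is purely illustrative rather than a benchmarked default.

```yaml
micro_wake_word:
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/okay_nabu.json
      # Assumption: per-model option name; check the component docs for your version.
      # 0.9 is an illustrative placeholder, not a recommended or benchmarked cutoff.
      probability_cutoff: 0.9
```

A higher cutoff generally trades fewer false accepts for more false rejections, so any custom value should be checked against your own environment.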

If you want to run 3 models at once on an ESP32 device, you need to set the CPU frequency to its maximum (240 MHz). The following YAML works for an ATOM Echo:

esp32:
  board: m5stack-atom
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      CONFIG_ESP32_DEFAULT_CPU_FREQ_240: "y"

The CONFIG_ESP32_DEFAULT_CPU_FREQ_240: "y" line is the necessary part.
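
Putting the two fragments together, a 3-model setup on an ATOM Echo (VAD, "okay nabu", and "hey mycroft", as in the example above) could be sketched as follows. It reuses only the model URLs and options already shown in these notes and leaves out the external_components entry and the rest of the device configuration (microphone, voice_assistant, and so on), which are still required.

```yaml
esp32:
  board: m5stack-atom
  framework:
    type: esp-idf
    version: recommended
    sdkconfig_options:
      # Run the CPU at 240 MHz so three models fit in the processing budget.
      CONFIG_ESP32_DEFAULT_CPU_FREQ_240: "y"

micro_wake_word:
  on_wake_word_detected:
    - voice_assistant.start:
        wake_word: !lambda return wake_word;
  # VAD counts as one of the three models on a regular ESP32.
  vad:
    model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/vad.json
  models:
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/okay_nabu.json
    - model: https://github.com/kahrendt/microWakeWord/releases/download/v2.1_models/hey_mycroft.json
```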

Hey Mycroft model v0.2

08 May 19:11
d0ef708
Pre-release

This is a beta model for "Hey Mycroft." It's the best performing model to date while also being the fastest! I'm working on generating an ROC curve. It works quite well on an ATOM Echo, though you still need to remove the Improv BLE component to fit it all in memory.

VAD Model v0.21

28 Apr 20:28
332e1b0
Pre-release

Beta testing a new voice activity detection (VAD) model for use with a future version of the ESPHome micro_wake_word component.