
Add support of speech recognition for microphone node #52

Merged
7 commits merged into node-red:master on Dec 28, 2020

Conversation

HiroyasuNishiyama (Member)

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Proposed changes

This PR adds a speech recognition feature to the ui_microphone node using the SpeechRecognition interface of the Web Speech API.

Checklist

  • I have read the contribution guidelines
  • For non-bugfix PRs, I have discussed this change on the forum/slack team.
  • I have run grunt to verify the unit tests pass
  • I have added suitable unit tests to cover the new/changed functionality

@dceejay (Member) left a comment:
Thank you very much for adding this mode and adding all the translations.

$("#node-input-mode").change(function (e) {
var mode = $("#node-input-mode").val();
if (mode === "recog") {
$("#div-audio").hide();
dceejay (Member):

Do you really want to hide the audio options? They affect the length of the recording and whether it's press-to-talk or click-to-start-then-click-to-stop, both of which would also make sense for speech-to-text, I think.

HiroyasuNishiyama (Member Author):

OK. I'll add support for these options in speech recognition mode as well.

dceejay (Member):

Thanks. As it was, you could set it either way in non-reco mode and then switch to reco mode; the button behaviour would be as you had set it, but it couldn't be changed.

HiroyasuNishiyama (Member Author):

... but couldn't be changed.

Does this mean that there is some issue on your side?

dceejay (Member):

Not now that the option is available in reco mode.

$("#div-audio").show();
}
});
}
if (this.press === undefined) {
dceejay (Member):

You also need an `if mode is undefined` check for existing nodes that don't have this set, or the select box will start out blank.
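For illustration only, a defaulting check along those lines might look like this (a sketch, not the code added in this PR; the default mode value is an assumption, and only the "recog" value and the #node-input-mode selector are taken from the snippet above):

    // Sketch: default the mode for nodes created before this option existed,
    // so the mode select box does not start out blank in the edit dialog.
    if (this.mode === undefined) {
        this.mode = "waveform";               // assumed default mode value (illustrative)
        $("#node-input-mode").val(this.mode); // keep the select box in sync
    }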

HiroyasuNishiyama (Member Author):

Thank you. I will add the check as you suggested.

In speech recognition mode, this node works in a more restricted set of browsers because it uses the SpeechRecognition API:

- IE, Safari: not supported
- Firefox : SpeechRecognition needs to be enabled (about:config -> `media.webspeech.recognition.enable`).
dceejay (Member):

Is there a simple way (without having to test every user agent, etc.) to detect whether SpeechRecognition is allowed/enabled, so that we could hide the option if it isn't? (Though I guess for Firefox hiding it would also hide the hint that it could be enabled if wanted...) Maybe we want to repeat this tip in the edit config?

HiroyasuNishiyama (Member Author):

The current implementation checks for the existence of the SpeechRecognition interface (though on the editor side) and disables speech recognition mode if it is not available.
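As an aside, a minimal sketch of that kind of feature detection (not the exact code in this PR) is to probe for the constructor, including Chrome's webkit-prefixed variant, and disable the select option if it is missing; the option value "recog" comes from the diff above, while the rest is illustrative:

    // Sketch: detect whether the browser exposes the SpeechRecognition interface.
    var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    if (!SpeechRecognition) {
        // Disable the speech recognition entry in the mode select box.
        $("#node-input-mode option[value='recog']").prop("disabled", true);
    }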

dceejay (Member):

Interesting thought... I suppose in practice the runtime browser may well be different from the browser used for editing. I have no good solution for that, apart from ensuring the developer is aware of the possibility.

HiroyasuNishiyama (Member Author):

I think the ui_audio node has a similar implementation to create its voice list.

dceejay (Member):

Yes, my next question was going to be about languages... is the language selected from the browser language or from user configuration?

HiroyasuNishiyama (Member Author):

The current implementation uses the default browser language.
The SpeechRecognition interface has a lang property that accepts the language to be recognized.
However, it does not provide a list of the languages that can be recognized.
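For reference, a minimal sketch of how the lang property is used (illustrative only; the language tag shown is an example, not this node's configuration):

    // Sketch: create a recognizer and pin it to a specific language.
    var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    var recognition = new SpeechRecognition();
    recognition.lang = "ja-JP";  // any BCP 47 tag; if unset, the browser default language is used
    recognition.start();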

@dceejay (Member) commented Dec 22, 2020:

Is it also worth looking at adding a flag (under speech mode only) for interim-results operation (https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/interimResults)? Or maybe that should possibly be the default when in click-to-start-then-click-to-stop mode, i.e. so you get streamed results while in that mode?

Thoughts ?

@HiroyasuNishiyama (Member Author) commented:

Is it also worth looking at adding a flag (under speech mode only) for interim-results operation (https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognition/interimResults)? Or maybe that should possibly be the default when in click-to-start-then-click-to-stop mode, i.e. so you get streamed results while in that mode?

In our original implementation, specifying the interim results option would display the intermediate recognition results on the dashboard.
I think it can be used for such purposes.
However, if the node outputs interim results, a way to distinguish between the final and intermediate results is needed.
I would like to suggest that the node output interim results on a second output port if the interim results option is specified.
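For reference, the Web Speech API already marks each result with an isFinal flag, so a sketch of distinguishing interim from final results (illustrative only, not this node's code) could look like:

    // Sketch: stream interim results and tell them apart from the final transcript.
    var SpeechRecognition = window.SpeechRecognition || window.webkitSpeechRecognition;
    var recognition = new SpeechRecognition();
    recognition.interimResults = true;
    recognition.onresult = function (event) {
        for (var i = event.resultIndex; i < event.results.length; i++) {
            var result = event.results[i];
            var transcript = result[0].transcript;
            console.log((result.isFinal ? "final: " : "interim: ") + transcript);
        }
    };
    recognition.start();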

@HiroyasuNishiyama (Member Author) commented:

Hi. I have updated speech recognition mode to use the same button modes as audio input, with the fixes you pointed out.
I have also added example flows for both modes.

@dceejay (Member) commented Dec 26, 2020:

Yes, I think showing intermediate results on the dashboard is one use case, but that could be done via the backend, i.e. not implemented in the node itself; if the user wants it they could feed the info back to another widget. I think there are other use cases where you may want ongoing data fed back. I don't think the node / reco engine will handle both modes at once, so I don't know how a second output could work. Though once the reco has stopped, presumably we can send a done, so you know the microphone is no longer expected to send more.

@HiroyasuNishiyama (Member Author) commented:

Though once the reco has stopped, presumably we can send a done, so you know the microphone is no longer expected to send more.

Changed it to set the done property to true instead of sending a message to a second port when the interim flag is set.
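As an illustration of how that could be consumed downstream (an assumption about the message shape, not code from this PR), a function node could route interim and final results by checking that property:

    // Sketch (Node-RED function node with two outputs): route final vs. interim results,
    // assuming the microphone node sets msg.done = true on the final message.
    if (msg.done === true) {
        return [msg, null];   // final transcript on output 1
    }
    return [null, msg];       // interim transcript on output 2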

@dceejay (Member) commented Dec 28, 2020:

Looks good now. Using done works nicely, thank you. The only wrinkle I can see is that if I drag on a new node and set it to reco mode, the default time is 0 seconds. It should default to some useful value (maybe 5, to be consistent with audio mode?).
I think leaving it as 0 doesn't make sense unless you have intermediate results on by default... which we don't want (I think).

@dceejay (Member) commented Dec 28, 2020:

I'm going to merge this as-is and we can work on it from there in master. I won't publish it just yet.

dceejay merged commit 31d45a1 into node-red:master on Dec 28, 2020
@HiroyasuNishiyama (Member Author) commented:

I think leaving it as 0 doesn't make sense unless you have intermediate results on by default... which we don't want (I think).

The SpeechRecognition interface outputs a final result for each phrase (at least for Japanese recognition). An interim result can change before it is fixed as a final result.
For this reason, I intentionally set the default value to 0 in speech recognition mode.
