playing tts/audio on VTO #177

luzik · 2022-03-16T17:11:56Z

It would be awesome to be able to send TTS or audio via the VTO speaker.

My personal use case is to connect face recognition with voice messages. Something like "Hello MyName"

If there is no direct command for that, my VTO has a place where I can store mp3 audio for various events. Maybe rroller/dahua could generate an mp3, upload it to the VTO, and trigger an action for that?

Saiyajin53 · 2022-03-21T07:51:51Z

You can replace the original voice with your own mp3, but there is a limit of only 20 KB :/

itkfilelor · 2022-03-22T02:01:10Z

I may have found the API command for the Amcrest AD110 doorbell. In theory it would be the same for the Dahua ones. Doing some tests and will report back.

UPDATE: OK, so apparently "we" have already known about the endpoint for some time. From what I have found, it is really sketchy with files: they need to be rather short and lower quality, or else the device gets overwhelmed. I plan to work on some premade TTS recordings and see where it leads.
MORE: I found this
So I took a Google TTS file I made in HA and converted it like they showed in the thread:
sox -v 0.8 audio_test.mp3 -r 8k -c 1 audio_test.al
Then I sent:
sleep 45 && curl -vvv --limit-rate 8K -F "file=@audio_test.al;type=Audio/G.711A" -H "Content-Type: Audio/G.711A" "http://admin:password@<ip>/cgi-bin/audio.cgi?action=postAudio&httptype=singlepart&channel=1"
I set a timer on my phone, ran my fat arse upstairs, and waited. I heard the TTS on my doorbell within 1.5s of the timer expiring. There was a little garbage at the beginning and end, but the voice came over clear.
When I have a chance I will see about making it a media_player entity.

luzik · 2022-03-24T07:12:22Z

On my VTO:

curl -vvv --user "admin:pass" --limit-rate 8K -F "file=@audio_test.al;type=Audio/G.711A" -H "Content-Type: Audio/G.711A" "http://192.168.1.30/cgi-bin/audio.cgi?action=postAudio&httptype=singlepart&channel=1"
*   Trying 192.168.1.30:80...
* Connected to 192.168.124.30 (192.168.124.30) port 80 (#0)
* Server auth using Basic with user 'admin'
> POST /cgi-bin/audio.cgi?action=postAudio&httptype=singlepart&channel=1 HTTP/1.1
> Host: 192.168.124.30
> Authorization: Basic XXXXX
> User-Agent: curl/7.74.0
> Accept: */*
> Content-Length: 11138
> Content-Type: Audio/G.711A; boundary=------------------------a90a8721f68274a4
>
* We are completely uploaded and fine

...and then it hangs.

luzik · 2022-03-24T07:31:55Z

But it actually plays nicely on my VTO!!

The device just never sends a response to end the session.

luzik · 2022-03-24T08:26:41Z

With the VTO2211G I need neither --limit-rate nor auth?!
To get the connection to close, I added --speed-limit 1 --speed-time 1, which aborts the transfer once the speed stays below 1 byte/sec for 1 second.

Can the Dahua device be exposed as an HA MediaPlayer class device? Or is that the wrong idea?
It would be awesome to include automatic audio conversion and a play function in https://github.com/rroller/dahua

itkfilelor · 2022-03-24T13:52:14Z

Yeah, I had the hang as well. I've never messed with any form of media streaming in Python, so I don't know how to handle that with the requests module that we are using here. In fact, most of my HTTP GET/POST experience in Python was with simple endpoints that closed automatically. This endpoint appears to be the one the app uses to open the stream, but the docs don't show how it ends. I'll have to dive into the requests module and see how it closes persistent connections.

luzik · 2022-03-24T14:07:23Z

Maybe this ?

r = requests.get('https://github.com', timeout=(3.05, 5))

https://docs.python-requests.org/en/latest/user/advanced/#timeouts

3.05 - connection timeout
5 - read timeout
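
Applied to the audio upload, the timeout idea could look roughly like the sketch below. It is only a sketch under several assumptions: it uses the standard library's http.client instead of requests so that it runs without extra packages, it assumes the device needs no authentication (as reported for the VTO2211G above), and it assumes the device accepts the raw A-law bytes as the request body rather than the multipart form that curl -F sends. The name post_audio and its parameters are made up for illustration. A read timeout after the upload is treated as a normal outcome, because the device plays the audio but never answers.

```python
import http.client
import socket


def post_audio(host, payload, port=80, channel=1, timeout=5.0):
    """POST A-law bytes to the postAudio endpoint (hypothetical helper).

    Returns the HTTP status if the device answers, or None if the read
    times out after the upload, which is the behaviour seen in this thread.
    """
    path = (
        "/cgi-bin/audio.cgi?action=postAudio"
        f"&httptype=singlepart&channel={channel}"
    )
    # The same timeout applies to connecting and to waiting for a reply.
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(
            "POST",
            path,
            body=payload,
            headers={"Content-Type": "Audio/G.711A"},
        )
        try:
            return conn.getresponse().status
        except socket.timeout:
            # Upload finished, device stayed silent: give up waiting.
            return None
    finally:
        # Always close, so the device is not left with an open session.
        conn.close()
```

With requests, the equivalent would be passing timeout=(connect, read) as in the line above and catching requests.exceptions.ReadTimeout.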

itkfilelor · 2022-03-24T14:30:13Z

😂 😅
Never looked into it before; this is likely the way. When I have a moment to work on it, I'll submit a new PR. Can you confirm that the endpoint on the Dahua device is the same as on my Amcrest doorbell?

luzik · 2022-03-24T18:19:24Z

Yes, it is.
Please also consider using FFmpeg instead of sox. The default home-assistant Docker image contains only ffmpeg.

ffmpeg -i audio_test.mp3 -c:a pcm_alaw -ac 1 -ar 8000 -sample_fmt s16 audio_test.al is working for me. Later on I will test it with AAC (it should be supported in hardware, take less space, and be faster).
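
If the conversion were to move into Python, one possible shape is a small wrapper around the same ffmpeg call. This is a sketch, not code from the integration: alaw_command and convert_to_alaw are hypothetical names, ffmpeg is assumed to be on the PATH, and the flags are the ones from the command above plus -y (overwrite the output without asking), which is an addition.

```python
import subprocess


def alaw_command(src, dst, ffmpeg="ffmpeg"):
    """Build the argument list for an 8 kHz mono A-law conversion."""
    return [
        ffmpeg,
        "-y",                 # addition: overwrite dst if it already exists
        "-i", src,
        "-c:a", "pcm_alaw",   # G.711 A-law, the format that worked on the VTO
        "-ac", "1",           # mono
        "-ar", "8000",        # 8 kHz sample rate
        "-sample_fmt", "s16",
        dst,
    ]


def convert_to_alaw(src, dst, ffmpeg="ffmpeg"):
    """Run ffmpeg and raise CalledProcessError if the conversion fails."""
    subprocess.run(alaw_command(src, dst, ffmpeg), check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.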

itkfilelor · 2022-03-24T19:11:33Z

Got it, have some free time coming up, I'll look into it.

luzik · 2022-03-25T08:03:16Z

I failed to play AAC audio on my VTO. pcm_alaw is the way to go.

calisro · 2022-06-06T12:47:10Z

I've been playing around with this. The issue I am having, though, is that after sending a few streams of audio (which work very well with pcm_alaw, btw) the device refuses any more. It's almost like it needs an 'end conversation' message to close the existing connections. I am at a loss, tbh.

What I have noticed, though, is that it sends perfectly the first time and then fails the second. I believe the 'mic' needs to be turned off somehow. In the Amcrest app, you turn the mic on, speak, then turn it off.

If I test once and then go into the app and toggle the mic, it works again. I need to figure out how to 'turn off the mic' after sending. Any ideas?

EDIT:
Timeouts/keepalive fixed it.
#181 (comment)

NickM-27 · 2022-06-19T12:29:18Z

Would love to see this as a media player!

luzik · 2022-09-07T16:25:47Z

A media player would be a great feature, but it will probably take some time to implement. In the meantime, did someone figure out how to automate/script this in HA?

calisro · 2022-09-07T21:53:02Z

Well, for any camera that supports ONVIF Profile T, you can now do 2-way audio with go2rtc. I'm using it with an AD410 perfectly.

luzik · 2022-09-08T16:48:15Z

Yeah, I am very thrilled to be running go2rtc in 2-way mode, just struggling with SSL via Traefik under "network_mode: host".

Meanwhile I wrote an automation for playing TTS over the VTO. This is the main part:

shell_command:
   play_tts_on_vto: >-
     /bin/bash -c "name={{states('input_text.person_at_door')}} ; x=`/usr/bin/curl -X POST -H \"Authorization: Bearer TOKEN\" -H \"Content-Type: application/json\" -d '{\"messa  ge\": \"Hi '$name'\", \"platform\": \"google_translate\"}' http://localhost:8123/api/tts_get_url |jq -r .url` && /usr/bin/curl $x -o /tmp/audio_vto.mp3 && /usr/bin/ffmpeg -i   /tmp/audio_vto.mp3 -c:a pcm_alaw -ac 1 -ar 8000 -sample_fmt s16 /tmp/audio_vto.al && /usr/bin/curl -vvv -F \"file=@/tmp/audio_vto.al;type=Audio/G.711A\" -H \"Content-Type: Aud  io/G.711A\" \"http://VTO_IP/cgi-bin/audio.cgi?action=postAudio&httptype=singlepart&channel=1\" --speed-limit 1 --speed-time 1; rm /tmp/audio_vto.al /tmp/audio_vto.mp3"

GaryOkie · 2022-11-17T17:03:03Z

For what it's worth - the techniques described here also work on the Amcrest AD110/AD410 doorbells to send custom sounds, including sirens.

morpheus8888 · 2022-12-24T09:18:40Z

Would love to see this as a media player!

Any news? I'm very interested in this.

can you explain the procedure better for a newbie like me? thank you @luzik

Pveska · 2023-06-04T12:08:47Z

So, any progress with that issue?

Pveska · 2023-06-04T12:35:15Z

It would be nice if you could explain luzik's shell_command code above for us.

baudneo · 2023-12-25T23:21:52Z

luzik's shell_command launches bash and sets 2 local variables:

VAR 1: 'name' = {{states('input_text.person_at_door')}} (a Jinja template for HASS to process)
- input_text.person_at_door is out of scope here, but I am assuming that there is an external automation that runs face detection and recognition and sets input_text.person_at_door to a name like "George", or possibly "Unknown" for faces that aren't recognized.
VAR 2: 'x' = /usr/bin/curl -X POST -H \"Authorization: Bearer TOKEN\" -H \"Content-Type: application/json\" -d '{\"message\": \"Hi '$name'\", \"platform\": \"google_translate\"}' http://localhost:8123/api/tts_get_url | jq -r .url
- VAR 2 'x' is set by a curl command that creates a TTS audio file from the text "Hi $name". It queries the HASS TTS endpoint to create a sound file, and 'x' is then set to the URL that the jq binary parses out of the response. This gives a URL that you can HTTP GET to obtain the TTS audio file (in .mp3 format, I assume).
&& /usr/bin/curl $x -o /tmp/audio_vto.mp3 - if the 'name' and 'x' vars are set correctly (&& will not execute if the previous command fails), it then runs curl and saves the HASS-generated TTS file to a temporary .mp3 file at /tmp/audio_vto.mp3
&& /usr/bin/ffmpeg -i /tmp/audio_vto.mp3 -c:a pcm_alaw -ac 1 -ar 8000 -sample_fmt s16 /tmp/audio_vto.al - converts the .mp3 to pcm_alaw with the proper flags and saves it to /tmp/audio_vto.al
&& /usr/bin/curl -vvv -F \"file=@/tmp/audio_vto.al;type=Audio/G.711A\" -H \"Content-Type: Audio/G.711A\" \"http://VTO_IP/cgi-bin/audio.cgi?action=postAudio&httptype=singlepart&channel=1\" --speed-limit 1 --speed-time 1; rm /tmp/audio_vto.al /tmp/audio_vto.mp3 - issues the final command to send the pcm_alaw file to the VTO device for playback, and then deletes the 2 temp audio files (mp3 and alaw).
- Change VTO_IP to the actual IP of your VTO device.

The original command has 2 stray spaces in the last command's header: -H \"Content-Type: Aud  io/G.711A\"

Here is a reformatted command with the whitespace removed:

shell_command:
   play_tts_on_vto: >-
     /bin/bash -c "name={{states('input_text.person_at_door')}} ; x=`/usr/bin/curl -X POST -H \"Authorization: Bearer TOKEN\" -H \"Content-Type: application/json\" -d '{\"message\": \"Hi '$name'\", \"platform\": \"google_translate\"}' http://localhost:8123/api/tts_get_url | jq -r .url` && /usr/bin/curl $x -o /tmp/audio_vto.mp3 && /usr/bin/ffmpeg -i /tmp/audio_vto.mp3 -c:a pcm_alaw -ac 1 -ar 8000 -sample_fmt s16 /tmp/audio_vto.al && /usr/bin/curl -vvv -F \"file=@/tmp/audio_vto.al;type=Audio/G.711A\" -H \"Content-Type: Audio/G.711A\" \"http://VTO_IP/cgi-bin/audio.cgi?action=postAudio&httptype=singlepart&channel=1\" --speed-limit 1 --speed-time 1; rm /tmp/audio_vto.al /tmp/audio_vto.mp3"
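
For anyone who finds the nested shell quoting hard to follow, the first step of the pipeline (asking Home Assistant for a TTS URL) can be sketched in Python. This is a sketch under assumptions, not part of the integration: build_tts_request and tts_url_from_response are hypothetical names, and the endpoint, payload fields and Bearer header are taken from the shell command above. The response is assumed to be JSON with a "url" field, as the jq filter .url implies.

```python
import json
import urllib.request


def build_tts_request(name, token, base_url="http://localhost:8123"):
    """Prepare the POST to /api/tts_get_url without sending it."""
    payload = {"message": f"Hi {name}", "platform": "google_translate"}
    return urllib.request.Request(
        f"{base_url}/api/tts_get_url",
        # json.dumps escapes quotes in the name, which the shell version does not
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def tts_url_from_response(body):
    """Extract the audio URL, the equivalent of `jq -r .url`."""
    return json.loads(body)["url"]
```

The request could then be sent with urllib.request.urlopen(req, timeout=10), and the returned URL downloaded and converted as in the remaining steps.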
