crackling sound on voice output using VoiceAssistant
vaski666 opened this issue · comments
Hi,
I have successfully set up the voice assistant on a CoreS3 using the Yaml below.
https://github.com/m5stack/M5CoreS3-Esphome/blob/main/voice-assistant/m5stack-cores3.yaml
and I'm very happy to have found this repository 👍
The default settings give a lot of "crackling sounds" on top of the voice when the response is given (response produced by OpenAI Conversation).
Are there any possibilities to adjust the sound output?
I have a ATOM Echo set up as a voice assistant too and the output on this one does not produce these noises.
+1
Hi, I'll check.
Hello
Is there any news here? I have the same problem. I cannot understand the voice output at all. It would be great if you could get it to work. With all the (additional) sensors that M5 offers, it would be a great voice base! Probably even the best!
Maybe if I can get the assistant to work I'll be able to check. See #13.
Wanted to try it out. However, the M5 does not appear as a media player for me. What does your configuration look like?
Same problem here
Any news on this?
Hi, I'll check.
Any news? Can I help?
I chopped down the code dramatically, replaced speaker with media_player, and provided a name so it would appear within Home Assistant. However, all attempts to get something to play from the speaker have resulted in silence.
If you would like to try my code, you will need to add the following secrets to ESPHome. Ignore the "HA not found" warning on the display. I wanted to ensure the display and backlight are on while testing the speaker, a lesson learned from issues with the M5StickC+ display backlight and the SPK2.
cores3_address:
cores3_encryption:
cores3_ota:
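For anyone unsure what the secrets file should contain, here is a minimal sketch of a `secrets.yaml`. Every value is a placeholder I made up for illustration, not taken from the config. Note that the config also references `wifi_ssid` and `wifi_password`, so those need to exist as well.

```yaml
# secrets.yaml: placeholder values only, substitute your own
wifi_ssid: "YourNetworkName"
wifi_password: "YourWifiPassword"
# Address ESPHome uses to reach the device (hypothetical IP shown)
cores3_address: "192.168.1.50"
# API encryption key: a base64-encoded 32-byte key
cores3_encryption: "REPLACE_WITH_YOUR_BASE64_KEY"
# OTA update password
cores3_ota: "YourOtaPassword"
```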
substitutions:
  name: m5cores3
  friendly_name: M5CoreS3
  loading_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/loading_320_240.png
  idle_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/idle_320_240.png
  listening_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/listening_320_240.png
  thinking_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/thinking_320_240.png
  replying_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/replying_320_240.png
  error_illustration_file: https://github.com/esphome/firmware/raw/main/voice-assistant/casita/error_320_240.png
  loading_illustration_background_color: '000000'
  idle_illustration_background_color: '000000'
  listening_illustration_background_color: 'FFFFFF'
  thinking_illustration_background_color: 'FFFFFF'
  replying_illustration_background_color: 'FFFFFF'
  error_illustration_background_color: '000000'
  voice_assist_idle_phase_id: '1'
  voice_assist_listening_phase_id: '2'
  voice_assist_thinking_phase_id: '3'
  voice_assist_replying_phase_id: '4'
  voice_assist_not_ready_phase_id: '10'
  voice_assist_error_phase_id: '11'
  voice_assist_muted_phase_id: '12'
esphome:
  name: m5core-s3
  friendly_name: m5core-s3
  project:
    name: m5stack.cores3-voice-assistant
    version: "1.0"
  platformio_options:
    board_build.f_cpu: 240000000L
  libraries:
    - m5stack/M5GFX@^0.1.11
    - m5stack/M5Unified@^0.1.11
  on_boot:
    priority: 600
    then:
      - script.execute: draw_display
      - delay: 30s
      - if:
          condition:
            lambda: return id(init_in_progress);
          then:
            - lambda: id(init_in_progress) = false;
            - script.execute: draw_display
esp32:
  board: esp32-s3-devkitc-1
  flash_size: 16MB
  framework:
    type: arduino
psram:
  mode: octal
  speed: 80MHz
external_components:
  - source:
      type: git
      url: https://github.com/m5stack/M5CoreS3-Esphome
    components: [ board_m5cores3, m5cores3_audio, m5cores3_display ]
    refresh: 0s
# Enable logging
logger:
# Enable Home Assistant API
api:
  encryption:
    key: !secret cores3_encryption
  on_client_connected:
    - script.execute: draw_display
  on_client_disconnected:
    - script.execute: draw_display
ota:
  password: !secret cores3_ota
wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password
  use_address: !secret cores3_address
  # Enable fallback hotspot (captive portal) in case wifi connection fails
  ap:
  on_connect:
    - script.execute: draw_display
    - delay: 5s # Gives time for improv results to be transmitted
  on_disconnect:
    - script.execute: draw_display
captive_portal:
#
# Globals
#
globals:
  - id: init_in_progress
    type: bool
    restore_value: no
    initial_value: 'true'
  - id: voice_assistant_phase
    type: int
    restore_value: no
    initial_value: ${voice_assist_not_ready_phase_id}
#
# Display
#
script:
  - id: draw_display
    then:
      - if:
          condition:
            lambda: return !id(init_in_progress);
          then:
            - if:
                condition:
                  wifi.connected:
                then:
                  - if:
                      condition:
                        api.connected:
                      then:
                        - lambda: |
                            switch(id(voice_assistant_phase)) {
                              case ${voice_assist_listening_phase_id}:
                                id(m5cores3_lcd).show_page(listening_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_thinking_phase_id}:
                                id(m5cores3_lcd).show_page(thinking_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_replying_phase_id}:
                                id(m5cores3_lcd).show_page(replying_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_error_phase_id}:
                                id(m5cores3_lcd).show_page(error_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_muted_phase_id}:
                                id(m5cores3_lcd).show_page(muted_page);
                                id(m5cores3_lcd).update();
                                break;
                              case ${voice_assist_not_ready_phase_id}:
                                id(m5cores3_lcd).show_page(no_ha_page);
                                id(m5cores3_lcd).update();
                                break;
                              default:
                                id(m5cores3_lcd).show_page(idle_page);
                                id(m5cores3_lcd).update();
                            }
                      else:
                        - display.page.show: no_ha_page
                        - component.update: m5cores3_lcd
                else:
                  - display.page.show: no_wifi_page
                  - component.update: m5cores3_lcd
          else:
            - display.page.show: initializing_page
            - component.update: m5cores3_lcd
image:
  - file: ${error_illustration_file}
    id: casita_error
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${idle_illustration_file}
    id: casita_idle
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${listening_illustration_file}
    id: casita_listening
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${thinking_illustration_file}
    id: casita_thinking
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${replying_illustration_file}
    id: casita_replying
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: ${loading_illustration_file}
    id: casita_initializing
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: https://github.com/esphome/firmware/raw/main/voice-assistant/error_box_illustrations/error-no-wifi.png
    id: error_no_wifi
    resize: 320x240
    type: RGB24
    use_transparency: true
  - file: https://github.com/esphome/firmware/raw/main/voice-assistant/error_box_illustrations/error-no-ha.png
    id: error_no_ha
    resize: 320x240
    type: RGB24
    use_transparency: true
color:
  - id: idle_color
    hex: ${idle_illustration_background_color}
  - id: listening_color
    hex: ${listening_illustration_background_color}
  - id: thinking_color
    hex: ${thinking_illustration_background_color}
  - id: replying_color
    hex: ${replying_illustration_background_color}
  - id: loading_color
    hex: ${loading_illustration_background_color}
  - id: error_color
    hex: ${error_illustration_background_color}
display:
  - platform: m5cores3_display
    model: ILI9342
    dc_pin: 35
    update_interval: never
    id: m5cores3_lcd
    pages:
      - id: idle_page
        lambda: |-
          it.fill(id(idle_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_idle), ImageAlign::CENTER);
      - id: listening_page
        lambda: |-
          it.fill(id(listening_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_listening), ImageAlign::CENTER);
      - id: thinking_page
        lambda: |-
          it.fill(id(thinking_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_thinking), ImageAlign::CENTER);
      - id: replying_page
        lambda: |-
          it.fill(id(replying_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_replying), ImageAlign::CENTER);
      - id: error_page
        lambda: |-
          it.fill(id(error_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_error), ImageAlign::CENTER);
      - id: no_ha_page
        lambda: |-
          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_ha), ImageAlign::CENTER);
      - id: no_wifi_page
        lambda: |-
          it.image((it.get_width() / 2), (it.get_height() / 2), id(error_no_wifi), ImageAlign::CENTER);
      - id: initializing_page
        lambda: |-
          it.fill(id(loading_color));
          it.image((it.get_width() / 2), (it.get_height() / 2), id(casita_initializing), ImageAlign::CENTER);
      - id: muted_page
        lambda: |-
          it.fill(Color::BLACK);
#
# Audio
#
board_m5cores3:
m5cores3_audio:
  id: m5cores3_audio_1
microphone:
  - platform: m5cores3_audio
    m5cores3_audio_id: m5cores3_audio_1
    id: m5cores3_mic
    adc_type: external
    i2s_din_pin: 14
    pdm: false
media_player:
  - platform: m5cores3_audio
    m5cores3_audio_id: m5cores3_audio_1
    id: media_out
    name: ${friendly_name}
    dac_type: external
    i2s_dout_pin: 13
    mode: mono
Either voice assistants can't be media players or something else is going on, but no, it won't show up. You can use other speakers, though; you just have to define them in the YAML. See the link below. I found this out while creating announcements, as I could never get them to go to my Korvo-1, which has annoying audio issues. In fact, I think everything but the S3 boxes does.
Here you go. Was the current YAML ever using microWakeWord instead of openWakeWord? It doesn't appear to. I was looking into this device because my Espressif Korvo-1 makes annoying popping noises from the 3.5mm output jack. Also, has anyone tried the RCA module? I imagine code or YAML would need to be added in order to get its full potential, and I assume the same goes for the Ethernet module.
I imagine using something like this for audio out would require adding it to ESPHome, or at least defining the pins at a minimum, but I could be wrong; I'm not a developer.
https://shop.m5stack.com/products/rca-audio-video-composite-module-13-2
The above worked for me. It still played on my Korvo, but I simply unplugged the 3.5mm audio output, as he mentions in the video. I'm not sure whether you can just turn the M5 volume all the way down to accomplish the same thing. You have to click on ESPHome itself; if you click under it, you won't have the option below (true for all integrations).
Also, you have to go to Devices, ESPHome, then click Configure on your voice assistant and check the checkbox that lets it make Home Assistant service calls. It won't work unless you do (it's covered in the video). You can then also call any Home Assistant service from ESPHome for each device you do this for.
on_tts_end:
  - homeassistant.service:
      service: media_player.play_media
      data:
        entity_id: media_player.vlc_telnet
        media_content_id: !lambda 'return x;'
        media_content_type: music
        announce: "true"
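For context, here is a rough sketch of where that trigger sits in a full config. It assumes the microphone id `m5cores3_mic` from the YAML posted earlier in this thread and uses `media_player.vlc_telnet` only as an example entity. The chopped-down config above has no `voice_assistant:` block, so treat this as an untested illustration rather than a confirmed fix.

```yaml
voice_assistant:
  # Microphone id taken from the config earlier in this thread (assumption: you kept that id)
  microphone: m5cores3_mic
  # No local speaker is set here, so the reply is only sent to the external player below
  on_tts_end:
    - homeassistant.service:
        service: media_player.play_media
        data:
          # Example entity; replace with a media player that exists in your Home Assistant
          entity_id: media_player.vlc_telnet
          # x is the URL of the generated TTS audio passed to on_tts_end
          media_content_id: !lambda 'return x;'
          media_content_type: music
          announce: "true"
```

The device must be allowed to make Home Assistant service calls in the ESPHome integration settings, otherwise the call will not go through.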