Entire Modbus hub hangs if one of slave devices stops responding
FrenkK opened this issue · comments
The problem
If a single slave device on a Modbus hub stops working, all Modbus sensors for that hub stop updating until the offending slave device starts working again. If there are multiple hubs, the other ones still seem to work.
It seems the same problem can prevent Home Assistant startup from finishing as well, but I did not research that further.
This is a new problem; the same setup (except for the new data types) worked before the latest batch of Modbus changes.
What version of Home Assistant Core has the issue?
2022.5.5, at least since 2022.4.7
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Modbus
Link to integration documentation on our website
https://www.home-assistant.io/integrations/modbus/
Diagnostics information
home-assistant.log
This is the log with debug logging enabled.
This is a redacted log: I deleted everything up to the point where Home Assistant startup completed (there were no errors before that). I also deleted the entries for upnp, ssdp and some other unrelated functions, since the original log file was over 2 megabytes in size.
Example YAML snippet
# Loads default set of integrations. Do not remove.
default_config:

# Text to speech
#tts:
# - platform: google_translate

automation: !include automations.yaml
script: !include scripts.yaml
scene: !include scenes.yaml

logger:
  default: debug
  logs:
    homeassistant.components.modbus: debug
    pymodbus.*: debug

modbus:
  - name: lopa
    type: tcp
    host: 192.168.1.7
    port: 502
    timeout: 14
    delay: 1
    close_comm_on_error: false
    retries: 10
    retry_on_empty: true
    sensors:
      - name: Faktor moci iz omrezja
        slave: 2
        input_type: input
        address: 62
        count: 2
        data_type: float32
        unit_of_measurement: /1
        precision: 2
      - name: Faktor moci v hiso
        slave: 3
        input_type: input
        address: 62
        count: 2
        data_type: float32
        unit_of_measurement: /1
        precision: 2
      - name: Napetost niza 1
        slave: 1
        input_type: holding
        address: 234
        count: 1
        data_type: int16
        scale: 0.1
        unit_of_measurement: V
        precision: 1
Anything in the logs that might be useful for us?
To put the log in context:
* Until 11:57:30 the system is working normally.
* At 11:57:30 slave device 2 was turned off, and it did not respond until 12:02:30. The connected sensor is named "Faktor moci iz omrezja". During this time, the other two sensors (from slaves 1 and 3) stopped updating; this is the problem I'm trying to solve.
* At 12:02:30 the slave device was turned on again and all the sensors started working again.
Additional information
All the info, config and logs are for a freshly installed system; the YAML snippet is the whole config.yaml file.
Hey there @adamchengtkc, @janiversen, @vzahradnik, mind taking a look at this issue as it has been labeled with an integration (modbus) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
Sounds like a problem, I will try to reproduce it with the test suite.
@janiversen, did you have any luck with this?
It's summer and I often lose power due to thunderstorms. This bug prevents me from reliably detecting the outages and from keeping the battery from draining deeply ...
So I found a partial solution.
The problem seems to be connected with some old settings that used to be needed for reliability but seem to cause problems with the new implementation.
Specifically, I deleted the following lines from my modbus hub config (it's likely that not all of them were causing a problem):
timeout: 14
delay: 1
close_comm_on_error: false
retries: 10
retry_on_empty: true
My theory is that the retries were preventing the whole thing from reading out the data.
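For reference, a sketch of roughly what the hub section looks like with those lines removed. Only the first sensor is shown; the other two are unchanged. The arithmetic in the comments is an assumption, not something confirmed from the integration source: it supposes that every retry waits out the full timeout and that requests to the slaves on one hub are handled one after another.

```yaml
modbus:
  - name: lopa
    type: tcp
    host: 192.168.1.7
    port: 502
    # Removed: timeout, delay, close_comm_on_error, retries, retry_on_empty.
    # The integration's defaults apply to all of them now.
    #
    # Assumption: with the old values (timeout: 14, retries: 10), a single
    # request to a dead slave could hold the hub for roughly
    # 14 s * (1 + 10 attempts) = 154 s before the next slave is polled.
    sensors:
      - name: Faktor moci iz omrezja
        slave: 2
        input_type: input
        address: 62
        count: 2
        data_type: float32
        unit_of_measurement: /1
        precision: 2
      # The sensors for slave 3 and slave 1 follow here, unchanged.
```

If that assumption holds, it would explain why the sensors on slaves 1 and 3 appeared frozen during the five-minute outage of slave 2.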
The setup seems to work acceptably now, but the startup of HA is still really slow if one of the Modbus devices is not working, so I think there is still a problem there that needs fixing.