smartrent / grizzly

Elixir Z-Wave Library

Home Page: https://hex.pm/packages/grizzly

Inclusion fails with status: :failed

djantea opened this issue · comments

Hi,

I am experimenting with z-wave and Grizzly and I have the following issue.

I am using the latest versions of Nerves (1.7.0) and zipgateway (v7.15.4, built on Raspberry Pi OS).

I have created just a bare Nerves project, to which I have added the Grizzly library (0.18.3). The first attempt to run the zipgateway executable on Nerves failed with a message about missing the libusb-1 library. So I built a custom Nerves System based on the standard nerves_system_rpi3, to which the only modification was to add the libusb library ("BR2_PACKAGE_LIBUSB=y" added to the nerves_defconfig, by using make menuconfig).

After that I have managed to start Grizzly by using the following:

iex(5)> Grizzly.Supervisor.start_link(serial_port: "/dev/ttyUSB0", eeprom_file: nil)
{:ok, #PID<0.1600.0>}
# some of the more interesting messages, in my opinion:
...
17:08:57.612 [debug] /usr/sbin/zipgateway: 111797 Comming up
17:08:57.629 [debug] /usr/sbin/zipgateway: 111813 Unexpected tunnel script status=0, if killed by signal: 0, if stopped by signal: 0
17:08:57.642 [debug] /usr/sbin/zipgateway: Failed adding tunnel routes. Attempt 1
...
17:09:11.762 [debug] /usr/sbin/zipgateway: 125945 DTLS input is handshake:
17:09:11.762 [debug] /usr/sbin/zipgateway: 125945 DTLS handshake Hello verify request
17:09:11.763 [debug] /usr/sbin/zipgateway: 125946 DTLS: DTLS read failed: %d
...
17:09:11.762 [debug] /usr/sbin/zipgateway: 125945 DTLS input is handshake:
17:09:11.762 [debug] /usr/sbin/zipgateway: 125945 DTLS handshake Hello verify request
17:09:11.763 [debug] /usr/sbin/zipgateway: 125946 DTLS: DTLS read failed: %d
17:09:11.764 [debug] /usr/sbin/zipgateway: 125946 SSL_ERROR_WANT_READ: Processing was not completed successfully because there was no data available for reading,
...
17:09:11.782 [debug] [GRIZZLY] Unsolicited Message for node 1: %Grizzly.ZWave.Command{command_byte: 2, command_class: Grizzly.ZWave.CommandClasses.NetworkManagementProxy, impl: Grizzly.ZWave.Commands.NodeListReport, name: :node_list_report, params: [seq_number: 206, controller_id: 1, status: :latest, node_ids: [1]]}
...
17:10:11.786 [error] [CLOSED]: {:ssl_closed, {:sslsocket, {:gen_udp, {#PID<0.1621.0>, {{{64768, 43690, 0, 0, 0, 0, 0, 1}, 41230}, #Port<0.146>}}, :dtls_gen_connection}, [#PID<0.1633.0>]}}
...

(note that without eeprom_file: nil it fails with an explicit message).

Although the result of Grizzly.Supervisor.start_link is {:ok, #PID<...>}, the following message is displayed several times (usually four) on the terminal of the Pi (not in the ssh session where I ran start_link):

tap_dev: tapdev_send: write: Input/output error

Then, I try to add a node:

iex(15)> Grizzly.Inclusions.add_node()
:ok
# some interesting messages:
...
17:17:18.867 [debug] /usr/sbin/zipgateway: 613053 DTLS: DTLS read failed: %d
17:17:18.868 [debug] /usr/sbin/zipgateway: 613053 SSL_ERROR_WANT_READ: Processing was not completed successfully because there was no data available for reading,
17:17:18.868 [debug] /usr/sbin/zipgateway: 613053 DTLS: Client handshake done
...
17:17:18.873 [debug] /usr/sbin/zipgateway: 613059 nm_fsm_post_event event: NM_EV_FRAME_RECEIVED state: NM_IDLE
17:17:18.880 [debug] /usr/sbin/zipgateway: 613067 Cmd class 0x34 command 0x01 identified as version 4
17:17:18.880 [debug] /usr/sbin/zipgateway: 613067 NetworkManagementCommandHandler 34 1
17:17:18.888 [debug] /usr/sbin/zipgateway: 613074 nm_fsm_post_event event: NM_EV_NODE_ADD_S2 state: NM_IDLE
17:17:18.898 [debug] /usr/sbin/zipgateway: 613085 nm_fsm_post_event event: NM_EV_ADD_LEARN_READY state: NM_WAITING_FOR_ADD
...
17:18:18.894 [debug] /usr/sbin/zipgateway: 673081 nm_fsm_post_event event: NM_EV_TIMEOUT state: NM_WAITING_FOR_ADD
...
17:18:18.907 [debug] /usr/sbin/zipgateway: 673094 DTLS: DTLS read failed: %d
17:18:18.911 [debug] /usr/sbin/zipgateway: 673094 SSL_ERROR_ZERO_RETURN: The remote application shut down the SSL connection normally.673094 DTLS: Closing DTLS connection
...

and I also put the Z-Wave device into inclusion mode (e.g. by pressing the Fibaro wall plug button three times).

After some time, when I check the inclusion result:

iex(23)> flush
{:grizzly, :report,
 %Grizzly.Report{
   command: %Grizzly.ZWave.Command{
     command_byte: 2,
     command_class: Grizzly.ZWave.CommandClasses.NetworkManagementInclusion,
     impl: Grizzly.ZWave.Commands.NodeAddStatus,
     name: :node_add_status,
     params: [
       status: :failed,
       seq_number: 77,
       node_id: 0,
       listening?: false,
       basic_device_class: :unknown,
       generic_device_class: :unknown,
       specific_device_class: :unknown,
       command_classes: []
     ]
   },
   command_ref: #Reference<0.1224014667.806092801.9371>,
   node_id: 1,
   queued: false,
   queued_delay: 0,
   status: :complete,
   transmission_stats: [],
   type: :command
 }}
:ok
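The report above can also be inspected in code rather than read off the flush output. The following is a minimal sketch, assuming only the message shape shown above ({:grizzly, :report, report} with a :node_add_status command). The InclusionResult module and its return values are hypothetical and not part of Grizzly.

```elixir
defmodule InclusionResult do
  # Hypothetical helper, not part of Grizzly. The pattern uses plain map keys,
  # so it matches a %Grizzly.Report{} struct as well as an ordinary map.
  def summarize({:grizzly, :report, %{command: %{name: :node_add_status, params: params}}}) do
    status = Keyword.fetch!(params, :status)
    node_id = Keyword.fetch!(params, :node_id)

    case status do
      # :failed is the status seen in the report above.
      :failed -> {:error, :inclusion_failed}
      # Any other status is passed through for the caller to interpret.
      other -> {:status, other, node_id}
    end
  end

  # Anything that is not a node_add_status report is ignored.
  def summarize(_other), do: :ignore
end

# A message shaped like the one above, built from plain maps for illustration.
message =
  {:grizzly, :report,
   %{command: %{name: :node_add_status, params: [status: :failed, seq_number: 77, node_id: 0]}}}

InclusionResult.summarize(message)
```

On the device, the same function could be applied to a message taken from the process mailbox with receive instead of flush.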

Do you have any advice on what I can do to investigate further and determine the root cause of this?

It is not clear to me if the cause is related to the tapdev_send: write: Input/output error message, or if this message is unrelated to the cause of the error.

Thanks,
Dan

The message "Failed adding tunnel routes. Attempt $i" displayed by the zipgateway.tun script appears in the log three times, for i in 1..3. So it seems that on the fourth attempt the ip -6 route del and ip -6 route add commands succeed.

Hey @djantea!

In regards to the eeprom file, the reason for us forcing the nil is due to some upgrade paths from earlier zipgateway versions. I think there is probably a better way to allow those who haven't had to upgrade from earlier versions to jump in and get started, so once I have some time I will look into making the API nicer.

There are tons of logs from various pieces of software during zipgateway initialization and 95% of them are safe to ignore as there are some non-deterministic characteristics around the tap interface and getting everything set up. The tapdev_send error is one of those along with the Failed adding tunnel routes. message. I think documentation improvements can really help ease some confusion here. I have on my to-do list to do a pass of the main docs as they are slightly out of date especially the information about zipgateway since Silicon Labs has updated their software.

Usually, when I want to do a quick test to ensure the networking side of things is running as expected I run Grizzly.Network.get_node_ids(). If everything seems to work there, then that means you should be good to go in regards to using Grizzly.

The device you are trying to include, can you send me a link to the exact plug you are using so I can look at its user manual? Also, I am curious as to what Z-Wave controller you are using can you share that information also?

I have been busy on some other projects recently but I am hoping to get some time with Grizzly soon to improve some things. Do you have a repo for your system? I can always pull that and debug it as well once I get a chance.

Thank you!

Hi @mattludwigs,

First, thank you for your response, really appreciated!

In regards to the eeprom file, the reason for us forcing the nil is due to some upgrade paths from earlier zipgateway versions.

OK, no problem. I tried passing eeprom_file: nil when I saw the following messages displayed after running Grizzly.Supervisor.start_link(serial_port: "/dev/ttyUSB0"):

Configuration key "Eepromfile", found in zipgateway.cfg, is no lonker supported. Please remove this line and use "ZipGwDatabase" instead

I think documentation improvements can really help ease some confusion here.

Let me know how I can help.

Usually, when I want to do a quick test to ensure the networking side of things is running as expected I run Grizzly.Network.get_node_ids(). If everything seems to work there, then that means you should be good to go in regards to using Grizzly.

Grizzly.Network.get_node_ids() seems to work, but the issue is that it returns just one id - of the controller - so I don't think that there is a real network communication - so far I have not managed to add any other device:

{:ok,
 %Grizzly.Report{
   command: %Grizzly.ZWave.Command{
     command_byte: 2,
     command_class: Grizzly.ZWave.CommandClasses.NetworkManagementProxy,
     impl: Grizzly.ZWave.Commands.NodeListReport,
     name: :node_list_report,
     params: [seq_number: 73, controller_id: 1, status: :latest, node_ids: [1]]
   },
   command_ref: #Reference<0.1400771232.537395203.25926>,
   node_id: :gateway,
   queued: false,
   queued_delay: 0,
   status: :complete,
   transmission_stats: [],
   type: :command
 }}
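The check described above (does the node list contain anything besides the controller?) can be written as a small function. This is a sketch under the assumption that the report has the shape printed above. NodeListCheck is a hypothetical name and not part of Grizzly.

```elixir
defmodule NodeListCheck do
  # Hypothetical helper, not part of Grizzly. It pattern matches on plain map
  # keys, so it accepts a %Grizzly.Report{} as well as an ordinary map.
  def included_nodes(%{command: %{params: params}}) do
    controller_id = Keyword.fetch!(params, :controller_id)
    node_ids = Keyword.fetch!(params, :node_ids)

    # Everything except the controller itself counts as an included device.
    Enum.reject(node_ids, &(&1 == controller_id))
  end
end

# Report shaped like the one above: only the controller (node 1) is listed.
report = %{
  command: %{params: [seq_number: 73, controller_id: 1, status: :latest, node_ids: [1]]}
}

NodeListCheck.included_nodes(report)
# => []
```

On the device, the report would come from {:ok, report} = Grizzly.Network.get_node_ids(). An empty result means that no device has been included yet.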

The device you are trying to include, can you send me a link to the exact plug you are using so I can look at its user manual?

It is a Fibaro Wall Plug, model number FGWPF-102 ZW5 v3.3 - the manual can be found here. It says to quickly press three times on the button after the controller is put in inclusion mode. I have tried that several times: first run Grizzly.Inclusions.add_node() and then quickly press three times the button - with no success - the Grizzly report message that comes back after some time (read with flush) returns status: :failed as above.

Also, I am curious as to what Z-Wave controller you are using can you share that information also?

It is Silicon Labs' Z-WAVE 700 UZB-7 USB STICK.

Do you have a repo for your system?

I will upload it on github and send you the link. As I said, it is just a bare Nerves project, compiled for Raspberry PI 3B+, to which the only changes are:

  1. Add {:grizzly, "~> 0.18.3"} to deps in mix.exs
  2. Use a custom rpi3 nerves system, also referenced in the mix.exs deps:
{:custom_nerves_system_rpi3,
       path: "../custom_rpi3",
       runtime: false,
       targets: :custom_nerves_system_rpi3,
       nerves: [compile: true]}

The ../custom_rpi3 is a bare rpi3 nerves system built as per Nerves documentation, to which the only change is this:

[dan@dxps custom_rpi3 ((v1.15.0))]$ git --no-pager diff --minimal
diff --git a/mix.exs b/mix.exs
index fe6dd7d..53e2cb9 100644
--- a/mix.exs
+++ b/mix.exs
@@ -1,8 +1,8 @@
-defmodule NervesSystemRpi3.MixProject do
+defmodule CustomNervesSystemRpi3.MixProject do
   use Mix.Project
 
-  @github_organization "nerves-project"
-  @app :nerves_system_rpi3
+  @github_organization "djantea"
+  @app :custom_nerves_system_rpi3
   @source_url "https://github.com/#{@github_organization}/#{@app}"
   @version Path.join(__DIR__, "VERSION")
            |> File.read!()
diff --git a/nerves_defconfig b/nerves_defconfig
index 5516c38..e5eb6a5 100644
--- a/nerves_defconfig
+++ b/nerves_defconfig
@@ -55,6 +55,7 @@ BR2_PACKAGE_RPI_USERLAND=y
 # BR2_PACKAGE_ALSA_LIB_OLD_SYMBOLS is not set
 BR2_PACKAGE_LIBP11=y
 BR2_PACKAGE_UNIXODBC=y
+BR2_PACKAGE_LIBUSB=y
 BR2_PACKAGE_LIBMNL=y
 BR2_PACKAGE_WIRELESS_REGDB=y
 BR2_PACKAGE_WPA_SUPPLICANT=y

The relevant change is the addition of the libusb package. Without it, starting /usr/sbin/zipgateway by hand complained about missing libusb library.

Thank you,
Dan

Hi @mattludwigs,

I will upload it on github and send you the link.

As promised, here is the link: https://github.com/djantea/zwave-experiment.

Best regards,
Dan

@djantea thank you for posting that system! When I get a chance I can dig into this a little more.

However, one thing that came to mind when I was thinking about this was the RF region. Can you point your mix dep to this branch of Grizzly: https://github.com/smartrent/grizzly/tree/config-rf-region

Then when you go start Grizzly you should be able to pass :rf_region into the options with one of these values:

@type rf_region() ::

If you attach RingLogger before running Grizzly.start_link/1 you should see the correct RF region printed out by zipgateway during initialization, if your zipgateway supports this configuration option - which 7.15.4 should. If that appears to work maybe try re-including the device?

Hi @mattludwigs,

Changed to use this dep in mix.exs:

  {:grizzly, git: "https://github.com/smartrent/grizzly.git", branch: "config-rf-region"}

and added rf_region in target.exs:

config :grizzly,
  serial_port: "/dev/ttyUSB0",
  rf_region: :eu

I still see this in the log:

[debug] /usr/sbin/zipgateway: 109939 RF Region US = 1

I also tried passing the param in start_link directly, without success:

Grizzly.Supervisor.start_link(serial_port: "/dev/ttyUSB0", eeprom_file: nil, rf_region: :eu)

On the other hand, I had some success with another gateway: UZB3 BRIDGE CTRLR W/SAW FILTER.
Judging by the "E" from the name, this is Europe-only, so probably you are right that the former gateway does not work due to RF region being US instead of EU.

With the latter gateway, I was able to connect the plug and switch it on and off with Grizzly.send_command, even though I see this in the log after start_link:

[debug] /usr/sbin/zipgateway: 64954 The RF Region value is not valid: 254

P.S. For this second gateway I had to use serial_port: "/dev/ttyACM0" instead of serial_port: "/dev/ttyUSB0"

Regards,
Dan

Hi @mattludwigs,

So now after some more experimenting, there are two issues:

  1. The rf_region: :eu does not seem to be taken into account when using the "config-rf-region" branch.
    • With the SLUSB001A controller, the log still displays RF region US, and the inclusion continues to return the status: :failed report.
    • With the ACC-UZB3-E-BRG controller, the log shows The RF Region value is not valid: 254 both with and without rf_region: :eu

If I can help with debugging this issue, please let me know.

  2. The success with the ACC-UZB3-E-BRG controller was temporary. After factory-resetting the controller and the devices, re-adding the devices fails most of the time (the same status: :failed report keeps coming back). When it does work (I succeeded only one more time after the first success, out of many retries), the switch-on / switch-off commands (Grizzly.send_command(device_id, :switch_binary_set, target_value: :on)) no longer work. So the behavior seems unpredictable.

I will try again with the ACC-UZB3-E-BRG controller and the main branch and let you know how it worked.

Thanks,
Dan

Hi @mattludwigs,

After I added ZWRFRegion=0x00 (for Europe) to zipgateway.cfg, it worked with the SLUSB001A controller as well. This confirms that my original problem (failure to include any node) with this controller was that it uses the US RF region by default.

Please note that for the moment I did it quick and dirty (hardcoded) just to validate:

defmodule Grizzly.ZIPGateway.Config do
  ...
  def to_string(cfg) do
    """
    ZipCaCert=#{cfg.ca_cert}
    ...
    ZipPSK=#{cfg.psk}
    ZWRFRegion=0x00  # added this
    """
    ...
  end
  ...
end

I suppose that this should happen when passing the rf_region: :eu in the "config-rf-region" branch, but in my case it didn't work. The result of maybe_put_config_item(config_string, cfg, :rf_region, cfg_name) is adding ZWRFRegion=0 to the config file. Maybe the problem is that defp rf_region(:eu) returns 0x00 - an integer, and when used in string interpolation it produces "0" instead of "0x00". Although, as per the zipgateway manpage:

The region must be given in decimal, or in hex preceding 0x, ie. 0x02.

So it should work with either "0" or "0x00".
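The interpolation behavior described above is easy to reproduce in plain Elixir. The sketch below shows that an integer literal written as 0x00 interpolates as "0", and how a two-digit hex string could be produced instead. RFRegionFormat is a hypothetical module for illustration, not Grizzly's actual code.

```elixir
defmodule RFRegionFormat do
  # Hypothetical helper, not Grizzly's implementation.
  # Formats a non-negative integer as a zero-padded hex string such as "0x00".
  def hex(value) when is_integer(value) and value >= 0 do
    "0x" <> (value |> Integer.to_string(16) |> String.pad_leading(2, "0"))
  end

  # Builds the config line using the hex form.
  def config_line(value), do: "ZWRFRegion=#{hex(value)}"
end

# Interpolating the integer directly drops the hex notation:
"ZWRFRegion=#{0x00}"
# => "ZWRFRegion=0"

# Formatting explicitly keeps it:
RFRegionFormat.config_line(0x00)
# => "ZWRFRegion=0x00"
```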

I will try again later with 0 and let you know.

On the second issue, the unpredictable behavior, I have not made any progress. I managed to pair all my devices again after many retries. I am just curious: does the same happen to you?

Thanks,
Dan

@djantea thank you for trying that out and getting back to me!

I wonder why the decimal isn't working. I pushed an update to the config-rf-region branch, but it still might not work depending on how picky they are being on the hex format. If that does not seem to take I will probably just return the string hex from the rf_region function because the function is only meant to build the config anyways.

Also if you want to see the config your zipgateway is running you can always:

iex> cat "/tmp/zipgateway.cfg"

while on your device.

In regards to the ACC-UZB3-E-BRG, I am not sure if I have the working US counterpart (I have bricked a lot of USBs over the course of the years 😂), although I think that device is an older Z-Wave series (maybe 500?) and I am not sure how well 7.15.x zipgateway firmware will work with it. You can try updating the controller to the latest Z-Wave firmware and seeing if that helps. Grizzly supports this via Grizzly.FirmwareUpdates. Although I haven't tried it on a USB yet. I have only used Simplicity Studio programs to try to flash a new firmware to USBs without much luck.

Also, I noticed that you are using a binary switch. Grizzly v0.19.0 was just released and added the Grizzly.SwitchBinary module to make using binary switches a little nicer to work with.

Hi @mattludwigs,

I can confirm that the update to the config-rf-region branch is working: the log displays [debug] /usr/sbin/zipgateway: 192054 RF Region EU = 0 and I can control the devices through Grizzly commands.

One thing remains unclear to me: the /tmp/zipgateway.cfg does not contain the ZWRFRegion parameter. But from the log message and from the fact that the gateway works, I can conclude that the rf_region: :eu config works.

Grizzly supports this via Grizzly.FirmwareUpdates. Although I haven't tried it on a USB yet.

I am curious: have you tried Grizzly.FirmwareUpdates on zwave devices other than zwave USB sticks? I was also wondering how one can update the firmware of either a zwave gateway or zwave devices. I could try running Grizzly.FirmwareUpdates on the ACC-UZB3-E-BRG, but I am a bit worried that I might brick it.

Grizzly v0.19.0 was just released and added the Grizzly.SwitchBinary

I already saw that, and it is awesome. I have just one small comment: it would be good if the Grizzly.SwitchBinary functions accepted opts \\ [], so that we can pass options like timeout. In my case, the first time I send a command to a device after some inactivity, it responds more slowly and Grizzly.SwitchBinary times out:

iex(52)> Grizzly.SwitchBinary.get(46)
{:error, :timeout}
iex(53)> Grizzly.send_command(46, :switch_binary_get, [], timeout: 10_000)
{:ok,
 %Grizzly.Report{
   command: %Grizzly.ZWave.Command{
     command_byte: 3,
     command_class: Grizzly.ZWave.CommandClasses.SwitchBinary,
     impl: Grizzly.ZWave.Commands.SwitchBinaryReport,
     name: :switch_binary_report,
     params: [target_value: :off]
   },
   command_ref: #Reference<0.1753601047.537395202.57140>,
   node_id: 46,
   queued: false,
   queued_delay: 0,
   status: :complete,
   transmission_stats: [],
   type: :command
 }}
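Until the helpers accept options, the slow first response shown above could be handled by retrying with a longer timeout. The following is a minimal sketch, assuming only the return shapes shown above ({:error, :timeout} or {:ok, report}). SlowNode and try_timeouts are hypothetical names, and the timeout values are arbitrary.

```elixir
defmodule SlowNode do
  # Hypothetical helper, not part of Grizzly.
  # Calls `send_fun` with each timeout in turn and returns the first result
  # that is not {:error, :timeout}, or the last result if all attempts time out.
  def try_timeouts(send_fun, [timeout | rest]) when is_function(send_fun, 1) do
    case send_fun.(timeout) do
      {:error, :timeout} when rest != [] -> try_timeouts(send_fun, rest)
      result -> result
    end
  end
end

# Stand-in for a device that only answers when given at least 10 seconds.
fake_send = fn timeout ->
  if timeout >= 10_000, do: {:ok, :report}, else: {:error, :timeout}
end

SlowNode.try_timeouts(fake_send, [5_000, 10_000])
# => {:ok, :report}
```

On the device, the function argument would wrap the call from the transcript above, for example fn t -> Grizzly.send_command(46, :switch_binary_get, [], timeout: t) end.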

I think that this issue can be closed. It would be good if the config-rf-region branch were merged into main and released at some point, since it enables Grizzly to be used in regions other than the US.

P. S. If I will have some more subjects to discuss with you about Grizzly, how can I do that without "polluting" the github repo with issues?

Thanks,
Dan

Awesome! I will update the tests and debug to make sure the configuration is being written correctly, then I will make a PR to get that in. I agree about being able to pass the send options to the SwitchBinary module helpers. Once those two things are in I will make a release.

have you tried Grizzly.FirmwareUpdates on zwave devices other than zwave USB sticks?

Yes, we have a custom device at SmartRent that has the 700 series module on the board. I mostly just use that board for testing and development. I rarely use the USBs mostly because Grizzly's main use case is SmartRent's gateway, so testing on the device we use has been ideal for us.

I just bought a handful of the SLUSB001A USBs, so I can start testing the user experience when not using the SmartRent hardware, as I am sure there are some extra pain points there. I bought a handful because I assume I am probably going to brick a few in the process 😂.

Regarding the ACC-UZB3-E-BRG, I think it uses a 500 series chip, and Silabs is now moving towards 700 series chips, unless you know otherwise. I can look closer into that as well to be sure. However, if it is indeed a 500 series device, then unless you have hard requirements to meet, I would recommend only using the SLUSB001A USB, since Silabs now prefers the 700 series. This should save you from having to support both the 500 and 700 series, if that is possible for you.

If I will have some more subjects to discuss with you about Grizzly, how can I do that without "polluting" the github repo with issues?

Are you on the Elixir lang slack? My username is mwigs there. If that does not work let me know and we can figure something out as I would like to collaborate more with you.

I recently joined Slack, but I must admit I am not familiar with how to use it. I joined Elixir Lang and sent you a message with @mwigs.

I did not see a message from you in the Elixir lang slack, but I think I was able to find you and send a message.... or another Dan got a random message from me 😂!

I made the changes to have the ZWRFRegion populate in the zipgateway.cfg and opened a PR: #497. That should be merged soon!

With #497 merged I am closing this issue. Thank you for working with me to get that resolved and tested.