eclipse / paho.mqtt.embedded-c

Paho MQTT C client library for embedded systems. Paho is an Eclipse IoT project (https://iot.eclipse.org/)

Home Page: https://eclipse.org/paho

QoS 2 publishing fails + synchronous publish always forces watchdog reset on NodeMCU

pirlite2 opened this issue

Hi
I have been using Paho C++ via MQTTClient within PlatformIO to build a NodeMCU-based system that communicates synchronously with a mosquitto broker on a Linux Mint PC. However, I have encountered two, possibly inter-related, issues.

I can publish from the PC to the NodeMCU at QoS2 no problem. However, I am unable to publish from the NodeMCU board to the PC (mosquitto_sub) at QoS2. Inspecting the mosquitto log, I can see that I am getting a PUBLISH message and the broker is responding with PUBREC. But the PUBREL and PUBCOMP parts of the QoS2 transaction are always absent. It seems like Paho is not producing the PUBREL response and so the transaction dies. Unsurprisingly, no message gets delivered! Publishing from the NodeMCU at QoS0 or QoS1 works – the messages do appear on the PC.
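For reference, the sender side of the QoS2 exchange described above (PUBLISH, then PUBREC from the broker, then PUBREL, then PUBCOMP) can be sketched as follows. This is a minimal illustration, not Paho's code: the packet type values are those of the MQTT 3.1.1 specification, while `NONE`, `sender_reply_qos2` and `qos2_complete` are hypothetical names invented for the example.

```cpp
#include <cstdint>

// Control packet types with their MQTT 3.1.1 values.
// NONE is a hypothetical placeholder meaning "nothing to send".
enum class PacketType : std::uint8_t {
    NONE = 0,
    PUBLISH = 3,
    PUBACK = 4,
    PUBREC = 5,
    PUBREL = 6,
    PUBCOMP = 7
};

// Hypothetical helper: the packet a publishing client owes the broker
// after receiving `received` during an outbound QoS2 publish.
PacketType sender_reply_qos2(PacketType received) {
    // The broker's PUBREC must be answered with PUBREL.
    return received == PacketType::PUBREC ? PacketType::PUBREL
                                          : PacketType::NONE;
}

// Hypothetical helper: the exchange is finished once PUBCOMP arrives.
bool qos2_complete(PacketType received) {
    return received == PacketType::PUBCOMP;
}
```

In the log described above, the exchange stops after the broker's PUBREC, which is the step where `sender_reply_qos2` would return PUBREL.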

Second issue: The NodeMCU soft watchdog triggers after 20 (and always 20) outbound messages for all QoS levels, 0 to 2. I have liberally scattered the code with dog feeding statements and the issue seems to be the blocking publish method not returning (soon enough?) on the 20th invocation. This sounds like a full buffer problem? To repeat: this happens for all QoSs so even for messages that are getting delivered.
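One way to picture the "blocking call not returning soon enough" suspicion is a wait loop that is both bounded and feeds the watchdog on every pass. The sketch below is plain C++ and entirely hypothetical: `wait_with_watchdog`, `done` and `feed` are made-up names, not part of Paho, and on the NodeMCU the `feed` callback would presumably wrap something like the Arduino `yield()` call.

```cpp
#include <functional>

// Hypothetical helper, not part of Paho: poll `done` up to `max_polls`
// times, calling `feed` on every pass so the watchdog is serviced
// while waiting. Returns true if `done` reported completion in time.
bool wait_with_watchdog(const std::function<bool()>& done,
                        const std::function<void()>& feed,
                        int max_polls) {
    for (int i = 0; i < max_polls; ++i) {
        feed();              // service the watchdog before each poll
        if (done()) {
            return true;     // the awaited event arrived
        }
    }
    return false;            // gave up: caller can report a timeout
}
```

A loop shaped like this returns a failure to the caller instead of spinning until the watchdog fires, which would at least make the 20th call fail visibly.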

One thing I did notice in MQTTClient.h is that unless MQTTCLIENT_QOS2 is defined (as 2?) then MQTTCLIENT_QOS2 is set to zero. In other words, QoS2 downgrades to QoS0; so it appears that the subsequent code that generates the (missing) PUBREL message is never executed?

Anybody any comments/suggestions on this?

Peter

Hi, Peter. I would suggest you split this in two reports.
Is this the C++ client or the C client? Because this is the C (not C++) repo.
If it is the C client, I can help with tracking the networking part of the QoS2 (first) issue. Would you please install Wireshark and capture the traffic so we can have a clear understanding of what is going on at the protocol level? It looks like the proper messages aren't being sent, but I personally would like to see it.

Hi, Peter. I would suggest you split this in two reports.

OK. Will do. On reflection, I am not quite sure why I thought these two things were linked. Of the two problems, the perpetual reset one is the more important to me; I can live with QoS0 for the time being.

Is this the C++ client or the C client?

The file that PlatformIO presents is MQTTClient.h, as I stated in my initial post. This is an older version of the MQTTClient.h file in this repo. A minor update: since posting, I have downloaded the latest version and built it locally. It makes no difference; both problems persist with the latest version.

Cause this is the C (not C++) repo.

Bit puzzled by this since the repo's README.md states "There are three sub-projects: ... 2) MQTTClient - high(er) level C++ client, plus ..." which, AFAIAA, is a thin wrapper around the C lib. This is what I have included since it is the interface that PlatformIO provides. That said, I am tempted to move to a local build of the pure C client since I am not sure the wrapping will help with running down the problem.

I can help with tracking the networking part in the QoS2 (first) issue. Would you please install Wireshark and capture the traffic so we can have a clear understanding of what is going on at the protocol level? Looks like proper msgs aren't being sent but I personally would like to see it.

Ummmm... Isn't there already evidence from the mosquitto log that the Paho client is not responding to the broker's PUBREC with a PUBREL? What more will Wireshark tell us? I can try it but I have never used Wireshark.

BTW: I forgot to say in the original post, but Client::publish always returns zero (i.e. no error), even when the message is disappearing under QoS2. So it is failing silently :(

Will now switch to two separate posts.

Peter

On Sergio's suggestion, split into #201 and #202, and closed.