absmach / magistrala

Industrial IoT Messaging and Device Management Platform

Home Page: https://www.abstractmachines.fr/magistrala.html

MQTT over WS - Client cannot subscribe successfully

kostasbalampekos opened this issue

BUG REPORT

Hi all and thank you for this great project. We are facing the following issue during our upgrade procedure from Mainflux v0.12.1 to Magistrala v0.14.0.

  1. What were you trying to achieve?
    We are trying to use MQTT over websocket.

  2. What are the expected results?
    We use a JS client to connect and subscribe to an MQTT topic over websocket. The expected result is that the client correctly connects and subscribes to the topic and receives messages over the websocket.

  3. What are the received results?
    From the client's perspective, the connection is successful but the subscription to the topic fails. The logs in the mqtt-adapter service are the following:

{"time":"2024-03-29T14:52:35.820944112Z","level":"ERROR","msg":"failed to publish connect event : context canceled"}
{"time":"2024-03-29T14:52:35.821087883Z","level":"INFO","msg":"connected with client_id <client-id>"}
{"time":"2024-03-29T14:52:35.825325252Z","level":"ERROR","msg":"disconnected client_id <client-id> and username <username>"}
{"time":"2024-03-29T14:52:35.825458365Z","level":"WARN","msg":"Broken connection for client","error":"failed to proxy from MQTT broker to client with id <client-id> with error: unexpected gRPC status: Canceled (status code:Canceled) : context canceled"}

The first two log lines are printed during the connect procedure, while the last two are printed during the subscribe procedure.

  4. What are the steps to reproduce the issue?
  • Deploy Magistrala v0.14.0, which uses mProxy v0.4.2
  • Create the required entities (domain, user, thing and channel) with the appropriate permissions
  • Use a client to connect and subscribe to a topic using MQTT over WS (we are using MQTT.js but also tried Eclipse Paho with the same results)
  • Subscription fails and the above logs are shown in the mqtt-adapter service.

  5. In what environment did you encounter the issue?
    Magistrala v0.14.0, mProxy v0.4.2.

  6. Additional information you deem important:
    After investigating the issue, it seems that the connection is dropped in the mqtt-adapter service (mProxy v0.4.2). We traced the problem to the handle() function in pkg/mqtt/websocket/websocket.go:

func (p Proxy) handle() http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		cconn, err := upgrader.Upgrade(w, r, nil)
		if err != nil {
			p.logger.Error("Error upgrading connection", slog.Any("error", err))
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}

		go p.pass(r.Context(), cconn)
	})
}

It seems that the request context is passed to the goroutine, but the handler function returned by handle() returns right after starting it, and net/http cancels a request's context once its handler returns. Therefore, the connect event is not sent to the ES and subsequent requests (e.g., subscription) fail. We tried replacing it with a separately created cancellable context, like this:

ctx, cancel := context.WithCancel(context.Background())
go func() {
	defer cancel() // Cancel the context when the goroutine finishes
	p.pass(ctx, cconn)
}()

This approach seems to work fine, as the client connects and subscribes to the topic successfully.

Following a discussion with @dborovcanin in Gitter, it was suggested that we test

go p.pass(context.WithoutCancel(r.Context()), cconn)

This also works fine, so the question is which is the best approach and how we can make sure that the context is properly cleaned up when the websocket is closed.

Thank you in advance for your time.

Thank you for bringing this issue to our attention. Upon thorough testing, it appears that the bug persists within Magistrala when using MQTT over WebSocket without mProxy. We are currently conducting additional tests to confirm and address this matter accordingly.

@nyagamunene Any updates on this?

Indeed, upon conducting additional testing, it has been observed that integrating the suggested solution effectively resolves the issue on Magistrala.

@kostasbalampekos The bug has been fixed on the mProxy side. We are working on merging the fix into Magistrala and expect to resolve the bug there very soon.

@dborovcanin, @nyagamunene thank you both for the feedback and the fix.

@kostasbalampekos You're welcome, anytime.