awslabs / aws-c-mqtt

C99 implementation of the MQTT 3.1.1 specification.

version 0.8.7 same issue 250 - still having trouble running aws-c-mqtt unit tests

thomas-roos opened this issue · comments

commented

This is a follow-up to #250.
Running this script:
#!/bin/sh
cd tests

MQTT_TESTS="
mqtt_connection_publish_QoS1_timeout
mqtt_connection_unsub_timeout
mqtt_connection_publish_QoS1_timeout_connection_lost_reset_time
mqtt_connect_disconnect
mqtt_connect_set_will_login
mqtt_connection_interrupted
mqtt_connection_any_publish
mqtt_connection_timeout
mqtt_connection_connack_timeout
mqtt_connect_subscribe
mqtt_connect_subscribe_fail_from_broker
mqtt_connect_subscribe_multi
mqtt_connect_unsubscribe
mqtt_connect_resubscribe
mqtt_connect_publish
mqtt_connect_publish_payload
mqtt_connection_offline_publish
mqtt_connection_disconnect_while_reconnecting
mqtt_connection_closes_while_making_requests
mqtt_connection_resend_packets
mqtt_connection_consistent_retry_policy
mqtt_connection_not_resend_packets_on_healthy_connection
mqtt_connection_destory_pending_requests
mqtt_clean_session_not_retry
mqtt_clean_session_discard_previous
mqtt_clean_session_keep_next_session
"

for TEST in $MQTT_TESTS
do
    ./aws-c-mqtt-tests "$TEST" >> tests.log
done

produces this log:
aws-c-mqtt_0.8.7_error.log

This now happens every time, not just sometimes, and the run doesn't finish within 12 hours.
If this is expected, I'm okay with it.

Is this still CodeBuild? What's the image?

commented

It's inside a QEMU x86-64 machine used for testing Yocto recipes, running on an EC2 instance.
It's easy to reproduce if you have an Ubuntu machine and some time.

Well I'm oncall at the moment, so the answer is yes, I have ubuntu and I have time.

Also, I looked through the log. What seems to be happening is that the test is just failing. But because you're running the test directly (not through ctest) there's no timeout-failure mechanism wrapped around it.
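
For anyone who wants to keep running the test binary directly, here is a minimal sketch of wrapping each invocation in a time limit so that a hung test fails instead of blocking the whole run. It assumes the GNU coreutils `timeout` command is available in the image (BusyBox variants may take different arguments). `run_with_timeout` and `TEST_TIMEOUT` are hypothetical names made up for this illustration, not part of aws-c-mqtt.

```shell
#!/bin/sh
# Hypothetical per-test limit in seconds; override it from the environment.
TEST_TIMEOUT="${TEST_TIMEOUT:-60}"

# Run one command under a time limit.
# GNU timeout exits with status 124 when the limit is hit.
run_with_timeout() {
    timeout "$TEST_TIMEOUT" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "TIMEOUT after ${TEST_TIMEOUT}s: $*"
    elif [ "$status" -ne 0 ]; then
        echo "FAILED (exit $status): $*"
    fi
    return "$status"
}
```

In the loop from the original script, the call would then become `run_with_timeout ./aws-c-mqtt-tests "$TEST" >> tests.log`, so a stuck test is recorded as a timeout in the log and the loop moves on to the next one.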

Looking at the log, the CONNACK timeout is triggering immediately after the CONNECT packet is sent, shutting down the channel before the test can set up what it's trying to do. Maybe the timeout is set to zero somehow, or the high-resolution clock is misbehaving (seems super unlikely, dunno).

Looking at the client implementation, we use the ping timeout as the CONNACK timeout. Looking at all the flaky connection state tests, I see us setting the ping timeout to 10 ms, which is fine for local runs given the domain socket transport, but in a container or emulated environment that's probably not going to work. I'm going to bump those up quite a bit.

Are you able to re-test this without me making a release?

commented

Sure, I will test the branch / commit.

commented

Yes, I can confirm this works.
Thank you. If you are interested in how to reproduce it: https://github.com/thomas-roos/yocto_example