awslabs / aws-c-mqtt

C99 implementation of the MQTT 3.1.1 specification.

Segfault when resubscribing with no subscriptions

Octogonapus opened this issue

I have run into a segfault when resubscribing a connection that had no subscriptions to begin with.

I've debugged the issue as follows. s_resubscribe_complete is called even when there are no prior subscriptions, because of the following control flow. s_resubscribe_send includes a check for having no subscriptions:

size_t sub_count = aws_mqtt_topic_tree_get_sub_count(&task_arg->connection->thread_data.subscriptions);
if (sub_count == 0) {
    AWS_LOGF_TRACE(
        AWS_LS_MQTT_CLIENT,
        "id=%p: Not subscribed to any topics. Resubscribe is unnecessary, no packet will be sent.",
        (void *)task_arg->connection);
    return AWS_MQTT_CLIENT_REQUEST_COMPLETE;
}

Normally, s_resubscribe_send allocates and initializes task_arg->topics. When it exits early through the check above, task_arg->topics is never initialized and is still null from the initial allocation inside aws_mqtt_resubscribe_existing_topics.
According to the docs, on_complete is called once the request has completed, either in success or in error. Therefore s_resubscribe_complete still runs after s_resubscribe_send returns. It then tries to access task_arg->topics with aws_array_list_get_at(&task_arg->topics, &topic, 0), which causes the segfault.

I think I have a decent understanding of this problem and can submit a patch, but I'd first like to know which approaches the maintainers are open to. I propose that s_resubscribe_send initialize the topics list as an empty list, even when it exits early. s_resubscribe_complete can then check the size of this list before accessing its elements. With an empty list it would skip the element access, complete its cleanup steps, and exit.
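
To make the proposal concrete, here is a minimal, self-contained sketch of the idea. It does not use the real aws-c-mqtt or aws-c-common types: `topic_list`, `resub_task`, `resubscribe_send`, and `resubscribe_complete` are hypothetical stand-ins I made up for illustration, and the actual change in the library may look quite different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the library's list type (not aws_array_list). */
struct topic_list {
    const char **items;
    size_t length;
};

/* Hypothetical stand-in for the resubscribe task argument. */
struct resub_task {
    size_t sub_count;         /* subscriptions the connection currently holds */
    struct topic_list topics; /* filled in by resubscribe_send */
    bool on_complete_fired;
};

enum send_result { REQUEST_ONGOING, REQUEST_COMPLETE };

/* Proposed behavior: leave `topics` as a valid, possibly empty list on
 * every path, including the early return for zero subscriptions. */
enum send_result resubscribe_send(struct resub_task *task) {
    assert(task != NULL);
    task->topics.items = NULL;
    task->topics.length = 0;

    if (task->sub_count == 0) {
        return REQUEST_COMPLETE; /* nothing to send; completion still runs */
    }

    task->topics.items = calloc(task->sub_count, sizeof(*task->topics.items));
    if (task->topics.items == NULL) {
        return REQUEST_COMPLETE; /* allocation failed; list stays empty */
    }
    for (size_t i = 0; i < task->sub_count; ++i) {
        task->topics.items[i] = "placeholder/topic"; /* illustrative value */
    }
    task->topics.length = task->sub_count;
    return REQUEST_ONGOING;
}

/* Checks the list size before touching any element, then cleans up.
 * Returns the number of topics visited so the guard can be observed. */
size_t resubscribe_complete(struct resub_task *task) {
    assert(task != NULL);
    size_t visited = 0;

    if (task->topics.length > 0) {
        for (size_t i = 0; i < task->topics.length; ++i) {
            const char *topic = task->topics.items[i];
            (void)topic; /* a real callback would report this topic */
            ++visited;
        }
    }

    /* Cleanup runs on every path; free(NULL) is a no-op. */
    free(task->topics.items);
    task->topics.items = NULL;
    task->topics.length = 0;
    task->on_complete_fired = true;
    return visited;
}
```

In this model, a task with `sub_count == 0` goes through `resubscribe_send` and `resubscribe_complete` without any element access, while a task with subscriptions still visits each topic before cleanup.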

Fixed via #215