ros2 / rmw_dps

Implementation of the ROS Middleware (rmw) Interface using Intel's Distributed Publish & Subscribe.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

rmw_dps support for RMW 0.7.1

alsora opened this issue · comments

The RMW has been updated few days ago to version 0.7.1 and this is not supported yet from rmw_dps.

First of all there are build errors, due to the new APIs (allocation and liveliness mostly).

Moreover there is a runtime error when a subscriber try to deserialize a message of type ParameterEvent.

I saw a similar problem mentioned for Fast-RTPS:
ros2/rmw_fastrtps#36 (comment)

I'm working on a PR: #7

It allows to build without errors and, if you remove the ParameterEvents publisher and subscribers, it allows nodes to communicate.

Do you have any ideas on how to fix this deserialization issue?

@AAlon

I'll take a look at the deserialization issue. Can you point to me the failing test?

There are a bunch of packages with failed tests.

  8 packages had test failures: message_filters rcl rcl_action rclcpp rclcpp_action rclcpp_components rclpy tlsf_cpp

The main things I can see are:

  • The APIs for getting node names are returning empty strings (RCL test_get_node_names__rmw_dps_cpp, RCL test_count_matched__rmw_dps_cpp, RCLCPP test_node, RCLCPP test_publisher_subscription_count_api, RCLCPP test_subscription_publisher_count_api)
  • RCLCPP test_timer and RCLCPP test_time_source segfault failure.

The error I was referring to is probably happening in the last two tests mentioned above.
However, it's sufficient to run any 2 nodes to notice it

ros2 run examples_rclcpp_minimal_publisher publisher_lamda
ros2 run examples_rclcpp_minimal_subscriber subscriber_lambda

In any case, the segfault occurs when calling the resize_function of the MessageMember

member->resize_function(field, array_size);

This happens only for ParameterEvents messages, while it works for custom and other standard messages.
Moreover, note that the error does not occur when using ROS2 Crystal

Thanks for the test pointer. I will reproduce on my end and fix it.

Implementation of the discovery/introspection APIs is next.

@alsora the issue appears to be that the resize_function is not hooked into the introspection structure in the latest code while it was in Crystal. I'm still tracking down exactly when and why this occurred.

Some background: in the deserialization code you highlighted above, field points to a std::vector of some type inside a ROS message. The code does a placement new to initialize the memory inside the message then calls the resize_function of the actual type to allocate the memory for the elements. With no resize_function in place, we crash.

With the proposed PR ros2/rosidl#379
The two rclcpp tests that were having a segfault now succed.
Moreover, I'm able to run nodes with parameter events

The previously mentioned PR has been merged, so it's possible to close this issue.