failed to create domain error when spawning many python nodes at once from launch file with cyclonedds
firesurfer opened this issue · comments
Bug report
I have a launch file that starts a fairly large number of Python nodes (~15-20). For some of those nodes I get an error like this:
[spawner-33] 1706000264.234912 [5] spawner: Failed to find a free participant index for domain 5
[spawner-33] [ERROR] [1706000264.234976868] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error
[spawner-33]
[spawner-33] >>> [rcutils|error_handling.c:108] rcutils_set_error_state()
[spawner-33] This error state is being overwritten:
[spawner-33]
[spawner-33] 'error not set, at ./src/rcl/node.c:262'
[spawner-33]
[spawner-33] with this new error message:
[spawner-33]
[spawner-33] 'rcl node's rmw handle is invalid, at ./src/rcl/node.c:433'
[spawner-33]
[spawner-33] rcutils_reset_error() should be called after error handling to avoid this.
[spawner-33] <<<
[spawner-33] [ERROR] [1706000264.235045107] [rcl]: Failed to fini publisher for node: 1
[spawner-33] Traceback (most recent call last):
[spawner-33] File "/opt/ros/iron/lib/controller_manager/spawner", line 33, in <module>
[spawner-33] sys.exit(load_entry_point('controller-manager==3.21.2', 'console_scripts', 'spawner')())
[spawner-33] File "/opt/ros/iron/lib/python3.10/site-packages/controller_manager/spawner.py", line 207, in main
[spawner-33] node = Node("spawner_" + controller_names[0])
[spawner-33] File "/opt/ros/iron/lib/python3.10/site-packages/rclpy/node.py", line 185, in __init__
[spawner-33] self.__node = _rclpy.Node(
[spawner-33] rclpy._rclpy_pybind11.RCLError: error creating node: rcl node's rmw handle is invalid, at ./src/rcl/node.c:433
The reason I submitted this in the rclpy repository is that it only seems to happen for Python nodes (perhaps because there are so many of them?). The exact nodes that fail during startup change between runs.
Required Info:
- Operating System:
- Ubuntu 22.04 - Podman container
- Installation type:
- Binary
- Version or commit hash:
- rclpy : 4.1.4
- DDS implementation:
- rmw_cyclonedds_cpp: 1.6.0
- Client library (if applicable):
- rclpy
Steps to reproduce issue
Have a launch file that starts many Python nodes at once. In my case I have a lot of controller spawners from ros2_control:
from launch_ros.actions import Node

servo_status_spawner = Node(
    package="controller_manager",
    executable="spawner",
    arguments=["status_controller_servo",
               "--controller-manager", "/controller_manager"],
)
# And many, many more
Expected behavior
All nodes should start.
Actual behavior
For at least 4-5 nodes I get an error:
[spawner-33] 1706000264.234912 [5] spawner: Failed to find a free participant index for domain 5
[spawner-33] [ERROR] [1706000264.234976868] [rmw_cyclonedds_cpp]: rmw_create_node: failed to create domain, error Error
(See above for full error log)
Additional information
As mentioned above, the setup runs in a Podman container. I will test it in a native installation this week.
Environment settings:
export ROS_DOMAIN_ID=5
source /opt/ros/iron/setup.bash
export RMW_IMPLEMENTATION=rmw_cyclonedds_cpp
export ROS_AUTOMATIC_DISCOVERY_RANGE=LOCALHOST
This is likely another case of ros2/rmw_cyclonedds#458.
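For readers unfamiliar with the "free participant index" message, here is a rough sketch of why the number of participants per domain can be capped when unicast ports are assigned by index. It uses the well-known port mapping from the DDSI-RTPS specification. The cap of 9 is an assumption about Cyclone DDS's MaxAutoParticipantIndex default for the version in use, not something confirmed in this thread, and whether Cyclone DDS opens both ports per participant is likewise not verified here. The function names are made up for illustration.

```python
# Well-known port parameters from the DDSI-RTPS specification.
PORT_BASE = 7400          # PB
DOMAIN_GAIN = 250         # DG
PARTICIPANT_GAIN = 2      # PG
D1_UNICAST_DISCOVERY = 10  # d1: offset for unicast discovery traffic
D3_UNICAST_USER = 11       # d3: offset for unicast user traffic

# ASSUMPTION: highest index tried when the participant index is chosen
# automatically (MaxAutoParticipantIndex). 9 is assumed here; check the
# configuration reference of the Cyclone DDS version you run.
ASSUMED_MAX_AUTO_INDEX = 9


def unicast_ports(domain_id, participant_index):
    """Return (discovery_port, user_port) for one participant index."""
    base = (PORT_BASE
            + DOMAIN_GAIN * domain_id
            + PARTICIPANT_GAIN * participant_index)
    return base + D1_UNICAST_DISCOVERY, base + D3_UNICAST_USER


def available_slots(domain_id, max_index=ASSUMED_MAX_AUTO_INDEX):
    """List the port pairs for every automatically assignable index."""
    return [unicast_ports(domain_id, i) for i in range(max_index + 1)]


if __name__ == "__main__":
    slots = available_slots(5)
    print(f"{len(slots)} participant slots on domain 5")
    print("first:", slots[0], "last:", slots[-1])
```

Under these assumptions, domain 5 offers only ten index-based slots, so a launch file that starts 15-20 processes at once could exhaust them, which would be consistent with a handful of spawners failing on each run.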
@clalancette I can confirm this.
The solution presented in ros2/rmw_cyclonedds#458 (comment) worked for me.
To be precise, I had to use the first line:
export CYCLONEDDS_URI='<CycloneDDS><Domain><Discovery><ParticipantIndex>none</ParticipantIndex></Discovery></Domain></CycloneDDS>'
The second suggestion, which also enables multicast, didn't work for me; I then got this error message:
[spawner-32] 1706083268.665794 [5] spawner: selected interface "lo" is not multicast-capable: disabling multicast
[spawner-32] 1706083268.667774 [5] spawner: Failed to find a free participant index for domain 5
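If the workaround needs to be applied from a Python wrapper script rather than a shell profile, the same CYCLONEDDS_URI value from the first suggestion can be built and placed into a child process environment. This is only a sketch: the helper names `cyclonedds_uri` and `child_env` are hypothetical, and it assumes the inline-XML form of CYCLONEDDS_URI shown in the export line above.

```python
import os
import xml.etree.ElementTree as ET


def cyclonedds_uri(participant_index="none"):
    """Build the inline XML used in the export line above."""
    root = ET.Element("CycloneDDS")
    domain = ET.SubElement(root, "Domain")
    discovery = ET.SubElement(domain, "Discovery")
    ET.SubElement(discovery, "ParticipantIndex").text = participant_index
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


def child_env(base_env=None):
    """Return a copy of the environment with CYCLONEDDS_URI set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CYCLONEDDS_URI"] = cyclonedds_uri()
    return env


if __name__ == "__main__":
    print(child_env()["CYCLONEDDS_URI"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when starting the launch command, so the setting reaches every node the launch file spawns without changing the calling shell.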
Thanks. I'm going to close this one in favor of that one.