
As far as I can tell from the (sometimes rather dramatically) abbreviated names in the flame graph: dq.builtin is the thread processing "built-in" data, which really is primarily discovery data. The recv one is the thread most likely to be receiving that data and handing it off. The amount of time spent there suggests there are quite a few endpoints, but this should be time spent only during startup and so not the primary issue. recvMC is the thread receiving incoming multicast data.

Otherwise I think this is the one labelled deli; that bit has to do with delivering data to readers. The one that is zoomed in on delivery_locally_all_in_sync seems to be wasteful by spending time both in entity_status_signal and in signalling DATA_AVAILABLE. I think I know why that is, and a few tweaks to the RMW layer would probably help with that.
The remainder of the time spent in Cyclone is minor, I'd say. The overhead of the ROS 2 executors (rclcpp::Executor) is a well-known performance issue within ROS 2, and the cost of rcl_wait is due to the total mismatch between the RMW wait interface and the DDS wait interface.
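
To make that concrete, here is a minimal sketch of where that cost enters: nothing in this node is special, yet every wake-up of the executor's spin() goes through rcl_wait() and the RMW translation onto the DDS waitset, which is where that overhead shows up in a profile. The node and topic names are just placeholders.

```cpp
// Minimal subscriber driven by the standard single-threaded executor.
// Each iteration of spin() blocks in rcl_wait(), where the ROS wait set is
// mapped onto the DDS wait set; that mapping is the overhead mentioned above.
#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/string.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("wait_cost_demo");

  // Trivial subscription so the executor has something to wait on.
  auto sub = node->create_subscription<std_msgs::msg::String>(
    "chatter", 10,
    [](std_msgs::msg::String::SharedPtr /*msg*/) { /* handle the message */ });

  rclcpp::executors::SingleThreadedExecutor executor;
  executor.add_node(node);
  executor.spin();  // wake-ups funnel through rcl_wait() / rmw_wait()

  rclcpp::shutdown();
  return 0;
}
```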

Reducing the number of packets would probably help. Fortunately, the default maximum message size used by Cyclone is a bit odd, and raising it to (nearly) 64kB should significantly reduce the number of packets. Setting General/MaxMessageSize to 65500B in the Cyclone DDS configuration should take care of that. I think you know how to do those settings, right?
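
In case it helps, this is roughly what that setting looks like as a Cyclone DDS configuration file; the exact XML skeleton (namespace, Domain element) can differ a bit between Cyclone versions, so treat it as a sketch rather than gospel:

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<CycloneDDS xmlns="https://cdds.io/config">
  <Domain id="any">
    <General>
      <!-- Raise the maximum RTPS message size so large samples end up in
           far fewer UDP packets than with the default setting. -->
      <MaxMessageSize>65500B</MaxMessageSize>
    </General>
  </Domain>
</CycloneDDS>
```

Then point Cyclone at the file via the CYCLONEDDS_URI environment variable, for example export CYCLONEDDS_URI=file://$HOME/cyclonedds.xml (the path is just an example).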

If this is all within a single machine, then the Iceoryx integration work should allow switching to end-to-end zero-copy communication, but whether that's an option for you depends on various factors - ROS 2 version, QoS settings, whether you're using a fixed-size type. I'm mentioning it mostly as a heads-up of what is to come, but you are of course welcome to try it.
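
If you do end up trying the zero-copy path, the usual entry point on the rclcpp side is the loaned-message API. The sketch below assumes a ROS 2 version and RMW that actually support it and a fixed-size (plain-old-data) message type; the node and topic names are placeholders.

```cpp
// Hedged sketch of zero-copy publishing via loaned messages. Whether the
// loan really lives in shared memory depends on the ROS 2 version, the RMW
// and its shared-memory support, QoS settings, and the message type being
// fixed-size.
#include <utility>

#include <rclcpp/rclcpp.hpp>
#include <std_msgs/msg/float64.hpp>

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("zero_copy_sketch");
  auto pub = node->create_publisher<std_msgs::msg::Float64>("samples", 10);

  // Ask the middleware for a message buffer instead of allocating one here;
  // with a zero-copy capable RMW this buffer can live in shared memory.
  auto loaned = pub->borrow_loaned_message();
  loaned.get().data = 42.0;

  // Publishing the loaned message hands the buffer back to the middleware,
  // avoiding a serialization/copy step for subscribers on the same machine.
  pub->publish(std::move(loaned));

  rclcpp::shutdown();
  return 0;
}
```

As far as I know, rclcpp falls back to an ordinary heap allocation when the middleware cannot provide a loan, so the same code stays usable either way.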
