Adding Object Detection in ROS 2

The Object Detection module can be configured to use one of the available detection models:

MODEL	Description
`MULTI_CLASS_BOX_FAST`	Any objects, bounding box based
`MULTI_CLASS_BOX_MEDIUM`	Any objects, bounding box based, compromise between accuracy and speed
`MULTI_CLASS_BOX_ACCURATE`	Any objects, bounding box based, more accurate but slower than the base model
`PERSON_HEAD_BOX_FAST`	Bounding box detector specialized in person heads; particularly well suited to crowded environments, and it also improves person localization
`PERSON_HEAD_BOX_ACCURATE`	Bounding box detector specialized in person heads; particularly well suited to crowded environments, and it also improves person localization; more accurate but slower than the base model
`CUSTOM_YOLOLIKE_BOX_OBJECTS`	Runs internal inference with your own custom YOLO-like model. This mode requires an ONNX file to be passed in the ObjectDetectionParameters.

The MULTI_CLASS_BOX and PERSON_HEAD_BOX modes use internal models that the SDK provides and downloads automatically.

The CUSTOM_YOLOLIKE_BOX_OBJECTS mode lets you load a custom YOLO model in ONNX format. The section on exporting an ONNX model for custom YOLO-like detections shows an example of how to export this model from a trained PyTorch model. The export only needs to be done once.

The detection results are published in a custom message of type zed_interfaces/ObjectsStamped, defined in the zed_interfaces package.

Enable Object Detection

Object detection can be started automatically when the ZED Wrapper node starts by setting the parameter object_detection.od_enabled to true in the file common.yaml.

It is also possible to start the Object Detection processing manually by calling the service ~/enable_obj_det with the parameter True.

In both cases, the Object Detection processing can be stopped by calling the service ~/enable_obj_det with the parameter False.
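The manual start and stop calls can be sketched from Python by shelling out to the ros2 command-line tool. This is a minimal sketch under assumptions that the text above does not state: the node is assumed to run as /zed/zed_node, the service is assumed to be of type std_srvs/srv/SetBool, and both helper names are hypothetical. Checking the actual name and type with `ros2 service list -t` before relying on it is advisable.

```python
import subprocess

# Assumption: default node name; adjust to your own namespace and node name.
DEFAULT_NODE = "/zed/zed_node"
# Assumption: the service type is std_srvs/srv/SetBool (verify with `ros2 service list -t`).
SERVICE_TYPE = "std_srvs/srv/SetBool"


def build_enable_obj_det_command(enable, node=DEFAULT_NODE):
    """Return the argv list for a `ros2 service call` on ~/enable_obj_det."""
    service = node.rstrip("/") + "/enable_obj_det"
    request = "{data: true}" if enable else "{data: false}"
    return ["ros2", "service", "call", service, SERVICE_TYPE, request]


def set_object_detection(enable, node=DEFAULT_NODE):
    """Run the call; needs a sourced ROS 2 environment and a running ZED node."""
    return subprocess.run(build_enable_obj_det_command(enable, node), check=True)
```

Calling `set_object_detection(True)` would start the processing and `set_object_detection(False)` would stop it, provided the assumptions above match your setup.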

See the services documentation for more info.

Exporting an ONNX model for Custom YOLO-like detections

Object Detection inference can be performed using a custom model in YOLO-like ONNX format.

Please refer to the YOLO ONNX model export documentation page for instructions on how to proceed.

Here’s a quick overview of exporting a YOLOv8 ONNX model with the Ultralytics framework:

python -m pip install -U ultralytics
yolo export model=yolov8s.pt format=onnx simplify=True dynamic=False imgsz=512
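The same export can also be driven from Python rather than from the command line. The following is a minimal sketch that mirrors the CLI options above through the Ultralytics `YOLO.export` call. The function name is hypothetical, and the weights file name yolov8s.pt is an assumption (the usual name of the YOLOv8-small checkpoint); replace it with your own trained weights.

```python
# Export settings mirroring the command-line options shown above.
EXPORT_KWARGS = {
    "format": "onnx",
    "simplify": True,
    "dynamic": False,
    "imgsz": 512,
}


def export_yolo_to_onnx(weights="yolov8s.pt"):
    """Export Ultralytics weights to ONNX and return the result of model.export()."""
    # Imported lazily so this module still loads where ultralytics is not installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # Assumption: `weights` is a trained YOLO checkpoint.
    return model.export(**EXPORT_KWARGS)
```

Keeping the options in one dictionary makes it easier to keep `imgsz` consistent with the input size configured later in the wrapper.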

Using a Custom YOLO-like model

Modify the common.yaml parameters to match your configuration:

Set object_detection.model to CUSTOM_YOLOLIKE_BOX_OBJECTS
Set object_detection.custom_onnx_file to the full path of your custom ONNX file
[Optional] Set object_detection.onnx_input_size to the size of the YOLO input tensor if the model has a dynamic input size, e.g. 512
[Optional] Set object_detection.custom_label_yaml to the full path of your YAML file storing the class labels in COCO format
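The settings listed above can be sanity-checked before launching the node. The sketch below is an illustration only, not part of the wrapper: the function name is hypothetical, it takes the object_detection parameters as a plain dictionary, and it assumes POSIX-style paths when checking that a path is a full path.

```python
from pathlib import PurePosixPath

# Model names taken from the table of available detection models.
KNOWN_MODELS = {
    "MULTI_CLASS_BOX_FAST",
    "MULTI_CLASS_BOX_MEDIUM",
    "MULTI_CLASS_BOX_ACCURATE",
    "PERSON_HEAD_BOX_FAST",
    "PERSON_HEAD_BOX_ACCURATE",
    "CUSTOM_YOLOLIKE_BOX_OBJECTS",
}


def check_object_detection_params(od):
    """Return a list of problems found in the object_detection parameters."""
    problems = []
    model = od.get("model")
    if model not in KNOWN_MODELS:
        problems.append(f"unknown model: {model!r}")
    if model != "CUSTOM_YOLOLIKE_BOX_OBJECTS":
        return problems

    # Required for the custom mode: full path to the ONNX file.
    onnx = od.get("custom_onnx_file") or ""
    if not onnx:
        problems.append("custom_onnx_file must be set for a custom model")
    elif not PurePosixPath(onnx).is_absolute():  # Assumption: POSIX paths.
        problems.append("custom_onnx_file should be a full path")

    # Optional: input tensor size for models with a dynamic input size.
    size = od.get("onnx_input_size")
    if size is not None and (isinstance(size, bool) or not isinstance(size, int) or size <= 0):
        problems.append("onnx_input_size should be a positive integer, e.g. 512")

    # Optional: full path to the YAML file with the class labels.
    labels = od.get("custom_label_yaml") or ""
    if labels and not PurePosixPath(labels).is_absolute():
        problems.append("custom_label_yaml should be a full path")
    return problems
```

An empty list means that no problem was found by these checks; it does not guarantee that the SDK will accept the model.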

📌 Note: The first time the custom model is used, the ZED SDK optimizes it to get the best performance from the GPU installed on the host. Please wait for the optimization to complete.

📌 Note: When using Docker, we recommend using a shared volume to store the optimized file on the host and perform the optimization only once. Read here for more information

Console log while optimization is running:

[zed_wrapper-3] [INFO] [1729184874.634985183] [zed.zed_node]: *** Starting Object Detection ***
[zed_wrapper-3] [2024-10-17 17:07:55 UTC][ZED][INFO] Please wait while the AI model is being optimized for your graphics card
[zed_wrapper-3]  This operation will be run only once and may take a few minutes

Object Detection results in RViz 2

To visualize the results of the Object Detection processing in RViz 2, the ZedOdDisplay plugin is required. The plugin is available in the zed-ros2-examples GitHub repository and can be installed by following the online instructions.

📌 Note: the source code of the plugin is a valid example of how to process the data of the topics of type zed_interfaces/ObjectsStamped.
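For a lighter starting point than the plugin source, the sketch below shows one way a received message could be processed in Python. It is an illustration under assumptions: the message is assumed to expose an `objects` list whose items carry `label` and `confidence` fields, and the function name and the `min_confidence` parameter are hypothetical. Check the message definition in the zed_interfaces package before relying on these field names.

```python
from collections import Counter


def summarize_objects(msg, min_confidence=0.0):
    """Count detected objects per label, skipping detections below min_confidence.

    Assumption: `msg` looks like zed_interfaces/ObjectsStamped, i.e. it has an
    `objects` sequence whose items expose `label` and `confidence`.
    """
    counts = Counter()
    for obj in msg.objects:
        if obj.confidence >= min_confidence:
            counts[obj.label] += 1
    return dict(counts)
```

Because the function only reads attributes, it can be used directly as the body of an rclpy subscription callback, or exercised offline with simple stand-in objects.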

Parameters:

Topic: Selects the object detection topic to visualize from the list of available topics in the combo box.
Depth: The depth of the incoming message queue.
History policy: Set the QoS history policy. Keep Last is suggested for performance and compatibility.
Reliability Policy: Set the QoS reliability policy. Best Effort is suggested for performance and compatibility.
Durability Policy: Set the QoS durability policy. Volatile is suggested for compatibility.
Transparency: The transparency level of the structures composing the detected objects.
Show skeleton: Not used.
Show Labels: Enable/disable the visualization of the object label.
Show Bounding Boxes: Enable/disable the visualization of the bounding boxes of the detected objects.
Link Size: The size of the bounding boxes’ corner lines.
Joint Radius: The radius of the spheres placed on the corners of the bounding boxes.
Label Scale: The scale of the label of the object.