Tutorial - Using 3D Object Detection

This tutorial shows how to use your ZED 3D camera to detect, classify and locate persons in space (compatible with ZED 2 only). Detection and localization works with both a static or moving camera.

Getting Started

First, download the latest version of the ZED SDK.
Download the Object Detection sample code in C++, Python or C#.

Code Overview

Open the camera

In this tutorial, we will use the Object Detection AI module of the ZED SDK. As in previous tutorials, we create, configure and open the camera.

C++
Python
C#

// Create ZED objects
Camera zed;
InitParameters initParameters;
initParameters.camera_resolution = RESOLUTION::HD720;
initParameters.depth_mode = DEPTH_MODE::ULTRA;
initParameters.sdk_verbose = true;

// Open the camera
ERROR_CODE zed_error = zed.open(initParameters);
if (zed_error != ERROR_CODE::SUCCESS) {
	std::cout << "Error " << zed_error << ", exit program.\n";
	return 1; // Quit if an error occurred
}

# Create ZED objects
zed = sl.Camera()
init_params = sl.InitParameters()
init_params.camera_resolution = sl.RESOLUTION.HD720
init_params.depth_mode = sl.DEPTH_MODE.ULTRA
init_params.sdk_verbose = True

# Open the camera
err = zed.open(init_params)
if err != sl.ERROR_CODE.SUCCESS:
    # Quit if an error occurred
    exit()

// Create a ZED camera object
Camera zed = new Camera(0);
// Set configuration parameters
InitParameters init_params = new InitParameters();
init_params.resolution = RESOLUTION.HD720; 
init_params.depthMode = DEPTH_MODE.ULTRA;

// Open the camera
ERROR_CODE err = zed.Open(ref init_params);
if (err != ERROR_CODE.SUCCESS)
    Environment.Exit(-1)

Enable 3D Object detection

Before enabling object detection, we specify the ObjectDetectionParameters of the module. In this tutorial, we use the following settings:

C++
Python
C#

// Define the Object Detection module parameters
ObjectDetectionParameters detection_parameters;
detection_parameters.image_sync = true;
detection_parameters.enable_tracking = true;
detection_parameters.enable_mask_output = true;

// Object tracking requires camera tracking to be enabled
if (detection_parameters.enable_tracking)
	zed.enablePositionalTracking();

# Define the Object Detection module parameters
detection_parameters = sl.ObjectDetectionParameters()
detection_parameters.image_sync = True
detection_parameters.enable_tracking = True
detection_parameters.enable_mask_output = True

# Object tracking requires camera tracking to be enabled
if detection_parameters.enable_tracking:
    zed.enable_positional_tracking()

// Define the Object Detection module parameters
ObjectDetectionParameters detection_parameters = new ObjectDetectionParameters();
detection_parameters.imageSync = true;
detection_parameters.enableObjectTracking = true;
detection_parameters.enable2DMask = true;

if (detection_parameters.enableObjectTracking){
    PositionalTrackingParameters trackingParams = new PositionalTrackingParameters();
    zed.EnablePositionalTracking(ref trackingParams);
}

image_sync determines if object detection runs for each frame or asynchronously in a separate thread.
enable_tracking allows objects to be tracked across frames and keep the same ID as long as possible. Positional tracking must be active in order to track objects movements independently from camera motion.
enable_mask_output outputs 2D masks over detected objects. Since it requires additional processing, disable this option if not used.

Now let’s enable object detection which will load an AI model. This operation can take a few seconds. The first time the module is used, the model will be optimized for your hardware and this can take up to a few minutes. The model optimization operation is done only once.

C++
Python
C#

cout << "Object Detection: Loading Module..." << endl;
err = zed.enableObjectDetection(detection_parameters);
if (err != ERROR_CODE::SUCCESS) {
	cout << "Error " << err << ", exit program.\n";
	zed.close();
	return 1;
}

print("Object Detection: Loading Module...")
err = zed.enable_object_detection(detection_parameters)
if err != sl.ERROR_CODE.SUCCESS:
    print("Error {}, exit program".format(err))
    zed.close()
    exit()

Console.WriteLine("Object Detection: Loading Module...");
err = zed.EnableObjectDetection(ref obj_det_params);
if (err != ERROR_CODE.SUCCESS) {
    Console.WriteLine("Error " + err + ", exit program.");
    zed.Close();
    Environment.Exit(-1);
}

Retrieve object data

To retrieve detected objects in an image, use the retrieveObjects() function with an Objects parameter that will store objects data.

Since image_sync is enabled, for each grab call, the image will be fed into the AI module that will output the detected objects for each frame. We also set object confidence threshold at 40 to keep only very confident detections.

C++
Python
C#

// Set runtime parameter confidence to 40
ObjectDetectionRuntimeParameters detection_parameters_runtime;
detection_parameters_runtime.detection_confidence_threshold = 40;

Objects objects;

// Grab new frames and detect objects
while (zed.grab() == ERROR_CODE::SUCCESS) {
	err = zed.retrieveObjects(objects, detection_parameters_runtime);

	if (objects.is_new) {
        // Count the number of objects detected
        cout << objects.object_list.size() << " Object(s) detected"

        // Display the 3D location of an object    
        cout << " 3D position: " << first_object.position;

        // Display its 3D bounding box coordinates        
        cout << " Bounding box 3D \n";
        for (auto it : first_object.bounding_box)
          cout << "    " << it;    
	}
}

# Set runtime parameter confidence to 40
detection_parameters_runtime = sl.ObjectDetectionRuntimeParameters()
detection_parameters_runtime.detection_confidence_threshold = 40

objects = sl.Objects()

# Grab new frames and detect objects
while zed.grab() == sl.ERROR_CODE.SUCCESS:
    err = zed.retrieve_objects(objects, detection_parameters_runtime)

    if objects.is_new:
        # Count the number of objects detected
        print("{} Object(s) detected".format(len(objects.object_list)))

        if len(objects.object_list):
            # Display the 3D location of an object
            first_object = objects.object_list[0]
            position = first_object.position
            print(" 3D position : [{0},{1},{2}]".format(position[0],position[1],position[2]))

            # Display its 3D bounding box coordinates
            bounding_box = first_object.bounding_box
            print(" Bounding box 3D :")
            for it in bounding_box:            
                print(" " + str(it),end='')

// Set runtime parameter confidence to 40
ObjectDetectionRuntimeParameters detection_parameters_runtime = new ObjectDetectionRuntimeParameters();
detection_parameters_runtime.detectionConfidenceThreshold = 40;

Objects objects = new Objects();

// Grab new frames and detect objects
while (zed.Grab(ref runtimeParameters) == ERROR_CODE.SUCCESS) {
    err = zed.RetrieveObjects(ref objects, ref detection_parameters_rt);

    if (objects.isNew) {
        // Count the number of objects detected
        Console.WriteLine(objects.numObject + " Object(s) detected");

        // Display the 3D location of an object 
        if(objects.numObject > 0)   
            ObjectData first_object = objects.objectData[0];
        Console.WriteLine(" 3D position: " + first_object.position);

        // Display its 3D bounding box coordinates        
        Console.WriteLine(" Bounding box 3D)";
        foreach (Vector3 it in first_object.boundingBox)
            Console.WriteLine("    " + it);    
    }
}

Disable modules and exit

Before exiting the application, modules need to be disabled and the camera closed. Note that zed.close() can also disable properly all active modules. The close() function is also called automatically by the destructor if necessary.

C++
Python
C#

// Disable object detection and close the camera
zed.disableObjectDetection();
zed.close();
return 0;

# Disable object detection and close the camera
zed.disable_object_detection()
zed.close()

// Disable object detection and close the camera
zed.DisableObjectDetection();
zed.Close();

And this is it!

Next Steps

At this point, you know how to retrieve image, depth and 3D objects data from ZED stereo cameras.

To detect objects in the scene and display their 3D bounding boxes over the live point cloud, check the 3D Object Detection advanced sample code.