Tutorial - Using 3D Object Detection

This tutorial shows how to use your ZED 3D camera to detect, classify and locate persons in space (compatible with ZED 2 only). Detection and localization work with either a static or a moving camera.

Getting Started

First, download the latest version of the ZED SDK.
Download the Object Detection sample code in C++, Python or C#.

Code Overview

Open the camera

In this tutorial, we will use the Object Detection AI module of the ZED SDK. As in previous tutorials, we create, configure and open the camera.

C++

    // Create ZED objects
    Camera zed;
    InitParameters initParameters;
    initParameters.camera_resolution = RESOLUTION::HD720;
    initParameters.depth_mode = DEPTH_MODE::ULTRA;
    initParameters.sdk_verbose = true;

    // Open the camera
    ERROR_CODE zed_error = zed.open(initParameters);
    if (zed_error != ERROR_CODE::SUCCESS) {
        std::cout << "Error " << zed_error << ", exit program.\n";
        return 1; // Quit if an error occurred
    }

Python

    # Create ZED objects
    zed = sl.Camera()
    init_params = sl.InitParameters()
    init_params.camera_resolution = sl.RESOLUTION.HD720
    init_params.depth_mode = sl.DEPTH_MODE.ULTRA
    init_params.sdk_verbose = True

    # Open the camera
    err = zed.open(init_params)
    if err != sl.ERROR_CODE.SUCCESS:
        # Quit if an error occurred
        exit()

C#

    // Create a ZED camera object
    Camera zed = new Camera(0);

    // Set configuration parameters
    InitParameters init_params = new InitParameters();
    init_params.resolution = RESOLUTION.HD720;
    init_params.depthMode = DEPTH_MODE.ULTRA;

    // Open the camera
    ERROR_CODE err = zed.Open(ref init_params);
    if (err != ERROR_CODE.SUCCESS)
        Environment.Exit(-1); // Quit if an error occurred

Enable 3D Object detection

Before enabling object detection, we specify the ObjectDetectionParameters of the module.
In this tutorial, we use the following settings:

C++

    // Define the Object Detection module parameters
    ObjectDetectionParameters detection_parameters;
    detection_parameters.image_sync = true;
    detection_parameters.enable_tracking = true;
    detection_parameters.enable_mask_output = true;

    // Object tracking requires camera tracking to be enabled
    if (detection_parameters.enable_tracking)
        zed.enablePositionalTracking();

Python

    # Define the Object Detection module parameters
    detection_parameters = sl.ObjectDetectionParameters()
    detection_parameters.image_sync = True
    detection_parameters.enable_tracking = True
    detection_parameters.enable_mask_output = True

    # Object tracking requires camera tracking to be enabled
    if detection_parameters.enable_tracking:
        zed.enable_positional_tracking()

C#

    // Define the Object Detection module parameters
    ObjectDetectionParameters detection_parameters = new ObjectDetectionParameters();
    detection_parameters.imageSync = true;
    detection_parameters.enableObjectTracking = true;
    detection_parameters.enable2DMask = true;

    // Object tracking requires camera tracking to be enabled
    if (detection_parameters.enableObjectTracking) {
        PositionalTrackingParameters trackingParams = new PositionalTrackingParameters();
        zed.EnablePositionalTracking(ref trackingParams);
    }

image_sync determines whether object detection runs for each frame or asynchronously in a separate thread.

enable_tracking allows objects to be tracked across frames and to keep the same ID for as long as possible. Positional tracking must be active in order to track objects' movements independently of camera motion.

enable_mask_output outputs 2D masks over detected objects. Since it requires additional processing, disable this option if the masks are not used.

Now let’s enable object detection, which loads an AI model. This operation can take a few seconds. The first time the module is used, the model is optimized for your hardware, which can take up to a few minutes. This optimization is done only once.

C++

    cout << "Object Detection: Loading Module..." << endl;
    err = zed.enableObjectDetection(detection_parameters);
    if (err != ERROR_CODE::SUCCESS) {
        cout << "Error " << err << ", exit program.\n";
        zed.close();
        return 1;
    }

Python

    print("Object Detection: Loading Module...")
    err = zed.enable_object_detection(detection_parameters)
    if err != sl.ERROR_CODE.SUCCESS:
        print("Error {}, exit program".format(err))
        zed.close()
        exit()

C#

    Console.WriteLine("Object Detection: Loading Module...");
    err = zed.EnableObjectDetection(ref detection_parameters);
    if (err != ERROR_CODE.SUCCESS) {
        Console.WriteLine("Error " + err + ", exit program.");
        zed.Close();
        Environment.Exit(-1);
    }

Retrieve object data

To retrieve the objects detected in an image, use the retrieveObjects() function with an Objects parameter that will store the object data. Since image_sync is enabled, each grab call feeds the image into the AI module, which outputs the detected objects for that frame. We also set the object confidence threshold to 40 so that low-confidence detections are filtered out.

C++

    // Set runtime parameter confidence to 40
    ObjectDetectionRuntimeParameters detection_parameters_runtime;
    detection_parameters_runtime.detection_confidence_threshold = 40;

    Objects objects;

    // Grab new frames and detect objects
    while (zed.grab() == ERROR_CODE::SUCCESS) {
        err = zed.retrieveObjects(objects, detection_parameters_runtime);

        if (objects.is_new) {
            // Count the number of objects detected
            cout << objects.object_list.size() << " Object(s) detected" << endl;

            if (!objects.object_list.empty()) {
                // Display the 3D location of the first object
                auto first_object = objects.object_list.front();
                cout << " 3D position: " << first_object.position;

                // Display its 3D bounding box coordinates
                cout << " Bounding box 3D \n";
                for (auto it : first_object.bounding_box)
                    cout << " " << it;
            }
        }
    }

Python

    # Set runtime parameter confidence to 40
    detection_parameters_runtime = sl.ObjectDetectionRuntimeParameters()
    detection_parameters_runtime.detection_confidence_threshold = 40

    objects = sl.Objects()

    # Grab new frames and detect objects
    while zed.grab() == sl.ERROR_CODE.SUCCESS:
        err = zed.retrieve_objects(objects, detection_parameters_runtime)

        if objects.is_new:
            # Count the number of objects detected
            print("{} Object(s) detected".format(len(objects.object_list)))

            if len(objects.object_list):
                # Display the 3D location of the first object
                first_object = objects.object_list[0]
                position = first_object.position
                print(" 3D position : [{0},{1},{2}]".format(position[0], position[1], position[2]))

                # Display its 3D bounding box coordinates
                bounding_box = first_object.bounding_box
                print(" Bounding box 3D :")
                for it in bounding_box:
                    print(" " + str(it), end='')

C#

    // Set runtime parameter confidence to 40
    ObjectDetectionRuntimeParameters detection_parameters_runtime = new ObjectDetectionRuntimeParameters();
    detection_parameters_runtime.detectionConfidenceThreshold = 40;

    Objects objects = new Objects();
    RuntimeParameters runtimeParameters = new RuntimeParameters();

    // Grab new frames and detect objects
    while (zed.Grab(ref runtimeParameters) == ERROR_CODE.SUCCESS) {
        err = zed.RetrieveObjects(ref objects, ref detection_parameters_runtime);

        if (objects.isNew) {
            // Count the number of objects detected
            Console.WriteLine(objects.numObject + " Object(s) detected");

            if (objects.numObject > 0) {
                // Display the 3D location of the first object
                ObjectData first_object = objects.objectData[0];
                Console.WriteLine(" 3D position: " + first_object.position);

                // Display its 3D bounding box coordinates
                Console.WriteLine(" Bounding box 3D");
                foreach (Vector3 it in first_object.boundingBox)
                    Console.WriteLine(" " + it);
            }
        }
    }

Disable modules and exit

Before exiting the application, the modules need to be disabled and the camera closed. Note that zed.close() also properly disables all active modules. The close() function is called automatically by the destructor if necessary.

C++

    // Disable object detection and close the camera
    zed.disableObjectDetection();
    zed.close();
    return 0;

Python

    # Disable object detection and close the camera
    zed.disable_object_detection()
    zed.close()

C#

    // Disable object detection and close the camera
    zed.DisableObjectDetection();
    zed.Close();

And that's it!

Next Steps

At this point, you know how to retrieve image, depth and 3D object data from ZED stereo cameras. To detect objects in the scene and display their 3D bounding boxes over the live point cloud, check the 3D Object Detection advanced sample code.
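
As a small follow-up to the position and bounding box values printed in the retrieval loop, the sketch below shows one way those numbers could be turned into a distance and an approximate object size. It is plain Python and makes no SDK calls. The helper names distance_from_camera and box_extents are hypothetical, the sample values are made up, and it assumes that the position is an (x, y, z) triple and the bounding box is a list of (x, y, z) corners, expressed in whatever unit the camera was configured with.

```python
import math


def distance_from_camera(position):
    """Euclidean distance from the reference origin to a 3D point (x, y, z)."""
    x, y, z = position
    return math.sqrt(x * x + y * y + z * z)


def box_extents(corners):
    """Axis-aligned size of a 3D box given as a list of (x, y, z) corners.

    Uses min/max per axis, so it does not rely on any corner ordering.
    Returns None when the list is empty.
    """
    if not corners:
        return None
    xs, ys, zs = zip(*corners)
    return (max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))


# Hypothetical values standing in for first_object.position and
# first_object.bounding_box from the tutorial loop.
position = (3.0, 0.0, 4.0)
corners = [(x, y, z)
           for x in (2.5, 3.5)
           for y in (-1.0, 1.0)
           for z in (3.75, 4.25)]

print(distance_from_camera(position))  # 5.0
print(box_extents(corners))            # (1.0, 2.0, 0.5)
```

Because the extents are axis-aligned, they overestimate the size of a box that is rotated relative to the coordinate axes; treat them as a rough bound rather than the object's true dimensions.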