Using the Object Detection API

Object Detection Configuration

To configure object detection, use ObjectDetectionParameters at initialization and ObjectDetectionRuntimeParameters to change specific parameters during use.

C++

    // Set initialization parameters
    ObjectDetectionParameters detection_parameters;
    detection_parameters.enable_tracking = true; // Objects will keep the same ID between frames
    detection_parameters.enable_mask_output = true; // Outputs 2D masks over detected objects

    // Set runtime parameters
    ObjectDetectionRuntimeParameters detection_parameters_rt;
    detection_parameters_rt.detection_confidence_threshold = 25;

Python

    # Set initialization parameters
    detection_parameters = sl.ObjectDetectionParameters()
    detection_parameters.enable_tracking = True  # Objects will keep the same ID between frames

    # Set runtime parameters
    detection_parameters_rt = sl.ObjectDetectionRuntimeParameters()
    detection_parameters_rt.detection_confidence_threshold = 25

C#

    // Set initialization parameters
    ObjectDetectionParameters detection_parameters = new ObjectDetectionParameters();
    detection_parameters.enableObjectTracking = true; // Objects will keep the same ID between frames

    // Set runtime parameters
    ObjectDetectionRuntimeParameters detection_parameters_rt = new ObjectDetectionRuntimeParameters();
    detection_parameters_rt.detectionConfidenceThreshold = 35;

Several object box detection models are available in the ZED SDK:

- General-purpose object detection: DETECTION_MODEL::MULTI_CLASS_BOX, DETECTION_MODEL::MULTI_CLASS_BOX_MEDIUM and DETECTION_MODEL::MULTI_CLASS_BOX_ACCURATE. Choose one of them depending on the balance you need between performance and accuracy. These models can detect multiple object classes (see OBJECT_CLASS).
- Head detection: DETECTION_MODEL::PERSON_HEAD_BOX. It is specialized in person head detection and tracking. It can be beneficial in crowded scenes, where people in the background are barely detected by the general-purpose person detection model.
We have separated this model from the general-purpose object detection model and added specific optimizations to improve detection and tracking accuracy. It only detects a single class, OBJECT_CLASS::PERSON, with subclass OBJECT_SUBCLASS::PERSON_HEAD.

You can use detection_parameters.detection_model to set the detection model:

C++

    // Choose a detection model
    detection_parameters.detection_model = DETECTION_MODEL::MULTI_CLASS_BOX;

Python

    # Choose a detection model
    detection_parameters.detection_model = sl.DETECTION_MODEL.MULTI_CLASS_BOX

C#

    // Choose a detection model
    detection_parameters.detectionModel = sl.DETECTION_MODEL.MULTI_CLASS_BOX;

If you want to track objects' motion within their environment, you will first need to activate the positional tracking module. Then, set detection_parameters.enable_tracking to true.

C++

    if (detection_parameters.enable_tracking) {
        // Set positional tracking parameters
        PositionalTrackingParameters positional_tracking_parameters;
        // Enable positional tracking
        zed.enablePositionalTracking(positional_tracking_parameters);
    }

Python

    if detection_parameters.enable_tracking:
        # Set positional tracking parameters
        positional_tracking_parameters = sl.PositionalTrackingParameters()
        # Enable positional tracking
        zed.enable_positional_tracking(positional_tracking_parameters)

C#

    if (detection_parameters.enableObjectTracking) {
        // Set positional tracking parameters
        PositionalTrackingParameters trackingParams = new PositionalTrackingParameters();
        // Enable positional tracking
        zed.EnablePositionalTracking(ref trackingParams);
    }

With these parameters configured, you can enable the object detection module:

C++

    // Enable object detection with initialization parameters
    zed_error = zed.enableObjectDetection(detection_parameters);
    if (zed_error != ERROR_CODE::SUCCESS) {
        cout << "enableObjectDetection: " << zed_error << "\nExit program.";
        zed.close();
        exit(-1);
    }

Python

    # Enable object detection with initialization parameters
    zed_error = zed.enable_object_detection(detection_parameters)
    if zed_error != sl.ERROR_CODE.SUCCESS:
        print("enable_object_detection", zed_error, "\nExit program.")
        zed.close()
        exit(-1)

C#

    // Enable object detection with initialization parameters
    zed_error = zed.EnableObjectDetection(ref detection_parameters);
    if (zed_error != ERROR_CODE.SUCCESS) {
        Console.WriteLine("enableObjectDetection: " + zed_error + "\nExit program.");
        zed.Close();
        Environment.Exit(-1);
    }

Note: Object Detection has been optimized for ZED2/ZED2i and uses the camera motion sensors for improved reliability. Therefore, the Object Detection module requires a ZED2/ZED2i or ZED-Mini, and sensors cannot be disabled when using the module.

Getting Object Data

To get the detected objects in a scene, grab a new image with grab(...) and extract the detected objects with retrieveObjects(). The objects' 2D positions are relative to the left image, while the 3D positions are either in the CAMERA or WORLD reference frame, depending on RuntimeParameters.measure3D_reference_frame (given to the grab() function).

C++

    sl::Objects objects; // Structure containing all the detected objects
    if (zed.grab() == ERROR_CODE::SUCCESS) {
        zed.retrieveObjects(objects, detection_parameters_rt); // Retrieve the detected objects
    }

Python

    objects = sl.Objects()  # Structure containing all the detected objects
    if zed.grab() == sl.ERROR_CODE.SUCCESS:
        zed.retrieve_objects(objects, detection_parameters_rt)  # Retrieve the detected objects

C#

    sl.Objects objects = new sl.Objects(); // Structure containing all the detected objects
    RuntimeParameters runtimeParameters = new RuntimeParameters();
    if (zed.Grab(ref runtimeParameters) == ERROR_CODE.SUCCESS) {
        zed.RetrieveObjects(ref objects, ref detection_parameters_rt); // Retrieve the detected objects
    }

The sl::Objects class stores all the information about the objects present in the scene in its object_list attribute.
Each individual object is stored as an sl::ObjectData with all the information about it, such as bounding box, position, mask, etc. All objects from a given frame are stored in a vector within sl::Objects. sl::Objects also contains the timestamp of the detection, which can help connect the objects to the images.

You can iterate through the objects as follows:

C++

    for (auto& object : objects.object_list)
        std::cout << object.id << " " << object.position << std::endl;

Python

    for object in objects.object_list:
        print("{} {}".format(object.id, object.position))

C#

    for (int idx = 0; idx < objects.numObject; idx++)
        Console.WriteLine(objects.objectData[idx].id + " " + objects.objectData[idx].position);

Each detected object can be accessed by using its ID as follows:

C++

    sl::ObjectData object;
    objects.getObjectDataFromId(object, 0); // Get the object with ID = 0

Python

    object = sl.ObjectData()
    objects.get_object_data_from_id(object, 0)  # Get the object with ID = 0

C#

    sl.ObjectData objectData = new ObjectData();
    objects.GetObjectDataFromId(ref objectData, 0); // Get the object with ID = 0

Accessing Object Information

Once an sl::ObjectData is retrieved from the object vector, you can access information such as its ID, position, velocity, label, and tracking_state:

C++

    unsigned int object_id = object.id; // Get the object ID
    sl::float3 object_position = object.position; // Get the object position
    sl::float3 object_velocity = object.velocity; // Get the object velocity
    sl::OBJECT_TRACKING_STATE object_tracking_state = object.tracking_state; // Get the tracking state of the object
    if (object_tracking_state == sl::OBJECT_TRACKING_STATE::OK) {
        cout << "Object " << object_id << " is tracked" << endl;
    }

Python

    object_id = object.id  # Get the object ID
    object_position = object.position  # Get the object position
    object_velocity = object.velocity  # Get the object velocity
    object_tracking_state = object.tracking_state  # Get the tracking state of the object
    if object_tracking_state == sl.OBJECT_TRACKING_STATE.OK:
        print("Object {0} is tracked\n".format(object_id))

C#

    // "object" is a reserved keyword in C#, so the variable is named objectData
    uint object_id = objectData.id; // Get the object ID
    Vector3 object_position = objectData.position; // Get the object position
    Vector3 object_velocity = objectData.velocity; // Get the object velocity
    OBJECT_TRACK_STATE object_tracking_state = objectData.objectTrackingState; // Get the tracking state of the object
    if (object_tracking_state == sl.OBJECT_TRACK_STATE.OK) {
        Console.WriteLine("Object " + object_id + " is tracked");
    }

You can also access the confidence of the detection for each object. This confidence reflects how likely it is that a detected object is actually present in the scene, so it can be used to post-filter the detected objects. For example, you can ignore objects with a confidence below 10%:

C++

    for (auto& object : objects.object_list) {
        if (object.confidence < 10.f) // Confidence is expressed on a 0-100 scale
            continue;
        // Work with other objects
    }

Python

    for object in objects.object_list:
        if object.confidence < 10:  # Confidence is expressed on a 0-100 scale
            continue
        # Work with other objects

C#

    for (int idx = 0; idx < objects.numObject; idx++) {
        if (objects.objectData[idx].confidence < 10f) // Confidence is expressed on a 0-100 scale
            continue;
        // Work with other objects
    }

Getting 3D Bounding Boxes

Each detected object contains two bounding boxes: a 2D bounding box and a 3D bounding box. The 2D bounding box is defined in the image frame, while the 3D bounding box is provided with the depth information. The 2D bounding box is represented as four 2D points starting from the top left corner of the object.
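To make the four-point representation of the 2D box concrete, here is a minimal sketch that derives the box origin, width, and height from four corner points. It does not use the ZED SDK: the helper `bbox_2d_extent` and the corner values are hypothetical, and plain tuples stand in for the SDK's 2D point type. It uses min/max so that it does not depend on any corner ordering beyond what the text states.

```python
from typing import List, Tuple

Point2D = Tuple[int, int]


def bbox_2d_extent(corners: List[Point2D]) -> Tuple[int, int, int, int]:
    """Return (x_min, y_min, width, height) for a four-corner 2D box."""
    if len(corners) != 4:
        raise ValueError("expected exactly four corner points")
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    # min/max makes the result independent of the corner ordering
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


# Hypothetical pixel coordinates, listed starting from the top left corner
example_corners = [(100, 50), (220, 50), (220, 290), (100, 290)]
print(bbox_2d_extent(example_corners))
```

With a real detection, the four points would come from the object's 2D bounding box attribute after converting each point to a pair of numbers.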
The 3D bounding box is represented by eight 3D points starting from the top left front corner.

The 2D and 3D bounding boxes are accessible in sl::ObjectData:

C++

    std::vector<sl::uint2> object_2Dbbox = object.bounding_box_2d; // Get the 2D bounding box of the object
    std::vector<sl::float3> object_3Dbbox = object.bounding_box; // Get the 3D bounding box of the object

Python

    object_2Dbbox = object.bounding_box_2d  # Get the 2D bounding box of the object
    object_3Dbbox = object.bounding_box  # Get the 3D bounding box of the object

C#

    Vector2[] object_2Dbbox = objects.objectData[idx].boundingBox2D; // Get the 2D bounding box of the object
    Vector3[] object_3Dbbox = objects.objectData[idx].boundingBox; // Get the 3D bounding box of the object

Getting the Object Mask

Each object can also be represented by its mask. The mask covers the pixels within the 2D bounding box: pixels that belong to the object are set to 255, while background pixels are set to 0. You can access the mask of an object with sl::Mat object_mask = object.mask.

Code Example

For code examples, check out the Tutorial and Sample on GitHub.
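As a small illustration of the mask convention described under Getting the Object Mask (255 for object pixels, 0 for background), the sketch below computes the fraction of the 2D bounding box that the object occupies. It is not SDK code: the function `mask_coverage` and the toy mask are hypothetical, and a nested list stands in for the pixel data of the sl::Mat mask. How the mask is converted to such a grid is left out.

```python
from typing import List


def mask_coverage(mask: List[List[int]]) -> float:
    """Fraction of pixels in the box that belong to the object (value 255)."""
    total = sum(len(row) for row in mask)
    if total == 0:
        return 0.0  # Empty mask: nothing to measure
    object_pixels = sum(1 for row in mask for value in row if value == 255)
    return object_pixels / total


# Hypothetical 3x4 mask: 255 marks the object, 0 marks the background
toy_mask = [
    [0, 255, 255, 0],
    [255, 255, 255, 255],
    [0, 255, 255, 0],
]
print(mask_coverage(toy_mask))
```

A ratio like this could serve, for instance, as a rough check of how tightly the 2D box fits the object.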