# Using the Body Tracking API

## Object detection API extension

The Body Tracking API is an extension of the Object Detection API: almost all functions are shared between the two modules, and new functions are added for Body Tracking usage only. See Using the Object Detection API for more details on the overall API usage.

## Body Tracking Configuration

The Body Tracking configuration shares many parameters with the Object Detection configuration.

### Opening the ZED Camera

Some parameters are set when opening the camera. In the example below, the ZED camera is opened with 2K resolution, all 3D data are expressed in the RIGHT_HANDED_Y_UP coordinate frame, and each 3D position is expressed in METER.

**C++**

```cpp
// Set camera configuration parameters
InitParameters init_parameters;
init_parameters.camera_resolution = sl::RESOLUTION::HD2K;
init_parameters.coordinate_system = sl::COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP;
init_parameters.coordinate_units = sl::UNIT::METER;

// Open the camera
auto returned_state = zed.open(init_parameters);
if (returned_state != ERROR_CODE::SUCCESS) {
    print("Camera Open", returned_state, "Exit program.");
    return EXIT_FAILURE;
}
```

**Python**

```python
# Set camera configuration parameters
init_params = sl.InitParameters()
init_params.camera_resolution = sl.RESOLUTION.HD2K
init_params.coordinate_units = sl.UNIT.METER
init_params.coordinate_system = sl.COORDINATE_SYSTEM.RIGHT_HANDED_Y_UP

# Open the camera
status = zed.open(init_params)
if status != sl.ERROR_CODE.SUCCESS:
    print(repr(status))
    exit()
```

**C#**

```csharp
// Set camera configuration parameters
InitParameters init_params = new InitParameters();
init_params.resolution = RESOLUTION.HD2K;
init_params.coordinateUnits = UNIT.METER;
init_params.coordinateSystem = COORDINATE_SYSTEM.RIGHT_HANDED_Y_UP;

// Open the camera
zedCamera = new Camera(0);
ERROR_CODE err = zedCamera.Open(ref init_params);
if (err != ERROR_CODE.SUCCESS)
    Environment.Exit(-1);
```

### Enable Body Tracking module

The initial Body Tracking configuration is set with ObjectDetectionParameters, and the runtime configuration with ObjectDetectionRuntimeParameters. Note that the initial configuration is set only once, when enabling the module, while the runtime configuration can be changed at any time. See Object Detection Configuration for more details on these parameters. Here we focus only on the new attributes and options introduced by the Body Tracking module:

- ObjectDetectionParameters::detection_model: new detection models that enable human body detection. These new presets trade off speed against accuracy of the human body detector:
  - DETECTION_MODEL::HUMAN_BODY_FAST: real-time performance, even on Jetson or low-end GPU cards
  - DETECTION_MODEL::HUMAN_BODY_MEDIUM: a compromise between accuracy and speed
  - DETECTION_MODEL::HUMAN_BODY_ACCURATE: state-of-the-art accuracy, requires a powerful GPU
- ObjectDetectionParameters::enable_body_fitting: enables the fitting process for each detected person.
- ObjectDetectionParameters::body_format: the body format output by the ZED SDK. The currently supported body formats are:
  - BODY_FORMAT::POSE_18: an 18-keypoint body model. This is the COCO18 format and is not directly compatible with software such as Unreal or Unity. For this reason, local keypoint rotations and translations are not available with this format.
  - BODY_FORMAT::POSE_34: a 34-keypoint body model. This model is compatible with such software, and all available body data can be extracted with this format.

The code below shows how to set these new attributes.
**C++**

```cpp
// Set initialization parameters
ObjectDetectionParameters detection_parameters;
detection_parameters.detection_model = DETECTION_MODEL::HUMAN_BODY_ACCURATE; // specific to human skeleton detection
detection_parameters.enable_tracking = true; // Objects will keep the same ID between frames
detection_parameters.enable_body_fitting = true; // Fitting process is called; the user has access to all available data for a person processed by the SDK
detection_parameters.body_format = BODY_FORMAT::POSE_34; // selects the 34-keypoint body model for SDK outputs

// Set runtime parameters
ObjectDetectionRuntimeParameters detection_parameters_rt;
detection_parameters_rt.detection_confidence_threshold = 40;
```

**Python**

```python
# Set initialization parameters
detection_parameters = sl.ObjectDetectionParameters()
detection_parameters.detection_model = sl.DETECTION_MODEL.HUMAN_BODY_ACCURATE  # specific to human skeleton detection
detection_parameters.enable_tracking = True  # Objects will keep the same ID between frames
detection_parameters.enable_body_fitting = True  # Fitting process is called
detection_parameters.body_format = sl.BODY_FORMAT.POSE_34  # selects the 34-keypoint body model for SDK outputs

# Set runtime parameters
detection_parameters_rt = sl.ObjectDetectionRuntimeParameters()
detection_parameters_rt.detection_confidence_threshold = 40
```

**C#**

```csharp
// Set initialization parameters
ObjectDetectionParameters detection_parameters = new ObjectDetectionParameters();
detection_parameters.detectionModel = sl.DETECTION_MODEL.HUMAN_BODY_ACCURATE; // specific to human skeleton detection
detection_parameters.enableObjectTracking = true; // Objects will keep the same ID between frames
detection_parameters.enableBodyFitting = true; // Fitting process is called
detection_parameters.bodyFormat = sl.BODY_FORMAT.POSE_34; // selects the 34-keypoint body model for SDK outputs

// Set runtime parameters
ObjectDetectionRuntimeParameters detection_parameters_rt = new ObjectDetectionRuntimeParameters();
detection_parameters_rt.detectionConfidenceThreshold = 40;
```

If you want to track persons' motion within their environment, you will first need to activate the positional tracking module, then set detection_parameters.enable_tracking to true:
**C++**

```cpp
if (detection_parameters.enable_tracking) {
    // Set positional tracking parameters
    PositionalTrackingParameters positional_tracking_parameters;
    // Enable positional tracking
    zed.enablePositionalTracking(positional_tracking_parameters);
}
```

**Python**

```python
if detection_parameters.enable_tracking:
    # Set positional tracking parameters
    positional_tracking_parameters = sl.PositionalTrackingParameters()
    # Enable positional tracking
    zed.enable_positional_tracking(positional_tracking_parameters)
```

**C#**

```csharp
if (detection_parameters.enableObjectTracking) {
    // Set positional tracking parameters
    PositionalTrackingParameters trackingParams = new PositionalTrackingParameters();
    // Enable positional tracking
    zedCamera.EnablePositionalTracking(ref trackingParams);
}
```

With these parameters configured, you can enable the Body Tracking module in exactly the same way as the Object Detection module:

**C++**

```cpp
// Enable object detection with initialization parameters
zed_error = zed.enableObjectDetection(detection_parameters);
if (zed_error != ERROR_CODE::SUCCESS) {
    cout << "enableObjectDetection: " << zed_error << "\nExit program.";
    zed.close();
    exit(-1);
}
```

**Python**

```python
# Enable object detection with initialization parameters
zed_error = zed.enable_object_detection(detection_parameters)
if zed_error != sl.ERROR_CODE.SUCCESS:
    print("enable_object_detection", zed_error, "\nExit program.")
    zed.close()
    exit(-1)
```

**C#**

```csharp
// Enable object detection with initialization parameters
zed_error = zedCamera.EnableObjectDetection(ref detection_parameters);
if (zed_error != ERROR_CODE.SUCCESS) {
    Console.WriteLine("EnableObjectDetection: " + zed_error + "\nExit program.");
    zedCamera.Close();
    Environment.Exit(-1);
}
```

**Note:** Body tracking has been optimized for the ZED 2's wide-angle field of view and uses the camera motion sensors for improved reliability. The Body Tracking module therefore requires a ZED 2, and the sensors cannot be disabled when using the module.

## Getting Human Body Data

To get the persons detected in a scene, grab a new image with grab(...) and extract the detections with retrieveObjects(). This process is exactly the same as getting new objects with the Object Detection module.

**C++**

```cpp
sl::Objects objects; // Structure containing all the detected objects

// grab runtime parameters
RuntimeParameters runtime_parameters;
runtime_parameters.measure3D_reference_frame = sl::REFERENCE_FRAME::WORLD;

if (zed.grab(runtime_parameters) == ERROR_CODE::SUCCESS) {
    zed.retrieveObjects(objects, detection_parameters_rt); // Retrieve the detected objects
}
```

**Python**

```python
objects = sl.Objects()  # Structure containing all the detected objects

# grab runtime parameters
runtime_params = sl.RuntimeParameters()
runtime_params.measure3D_reference_frame = sl.REFERENCE_FRAME.WORLD

if zed.grab(runtime_params) == sl.ERROR_CODE.SUCCESS:
    zed.retrieve_objects(objects, detection_parameters_rt)  # Retrieve the detected objects
```

**C#**

```csharp
sl.Objects objects = new sl.Objects(); // Structure containing all the detected objects

// grab runtime parameters
RuntimeParameters runtimeParameters = new RuntimeParameters();
runtimeParameters.measure3DReferenceFrame = sl.REFERENCE_FRAME.WORLD;

if (zedCamera.Grab(ref runtimeParameters) == ERROR_CODE.SUCCESS) {
    zedCamera.RetrieveObjects(ref objects, ref detection_parameters_rt); // Retrieve the detected objects
}
```

The sl::Objects class stores all the data on the persons present in the scene in its vector<sl::ObjectData> object_list attribute; each person is stored as an sl::ObjectData. sl::Objects also contains the timestamp of the detection, which can help connect the objects to the images.
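For example, a minimal C++ sketch (assuming the grab/retrieve loop above has run and `<iostream>` is included) that walks through object_list and prints each tracked person's ID, 3D position and detection confidence:

```cpp
// List the detected persons from the sl::Objects structure filled by retrieveObjects()
for (const auto &person : objects.object_list) {
    if (person.tracking_state == sl::OBJECT_TRACKING_STATE::OK) {
        std::cout << "Person " << person.id
                  << " at (" << person.position.x << ", " << person.position.y
                  << ", " << person.position.z << ")"
                  << ", confidence " << person.confidence << std::endl;
    }
}
// The detection timestamp can be used to match these results with the corresponding image
std::cout << "Detected at " << objects.timestamp.getMilliseconds() << " ms" << std::endl;
```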
All 2D data are related to the left image, while the 3D data are expressed either in the CAMERA or the WORLD reference frame, depending on the RuntimeParameters.measure3D_reference_frame value given to the grab() function. The 2D data are expressed in the initial camera resolution (RESOLUTION); a scaling can be applied if the values are needed at another resolution (see the sketch after the keypoint example below). For 3D data, the coordinate frame and units can be set by the user with COORDINATE_SYSTEM and UNIT respectively. These settings are available in InitParameters when opening the ZED camera with the open function.

## Accessing Human Body Data

Once an sl::ObjectData is retrieved from the object vector, all the data available with respect to the initial configuration can be accessed. All Object Detection data are accessible; see Getting Object Data for the data shared between the Object Detection and Body Tracking modules.

### Getting 2D and 3D human keypoints

The 2D and 3D keypoints of a detected person are accessible through two new attributes of sl::ObjectData: keypoint_2d, a vector of pixel coordinates, and keypoint, a vector of 3D positions. From an sl::ObjectData object obj, they can be accessed as follows:

**C++**

```cpp
// collect all 2D keypoints
for (auto &kp_2d : obj.keypoint_2d) {
    // user code using each kp_2d point
}

// collect all 3D keypoints
for (auto &kp_3d : obj.keypoint) {
    // user code using each kp_3d point
}
```

**Python**

```python
# collect all 2D keypoints
for kp_2d in obj.keypoint_2d:
    pass  # user code using each kp_2d point

# collect all 3D keypoints
for kp_3d in obj.keypoint:
    pass  # user code using each kp_3d point
```

**C#**

```csharp
// collect all 2D keypoints
foreach (var kp_2d in obj.keypoints2D) {
    // user code using each kp_2d point
}

// collect all 3D keypoints
foreach (var kp_3d in obj.keypoints) {
    // user code using each kp_3d point
}
```

See the keypoint index and name correspondence, as well as the output skeleton format, here.
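Since the 2D keypoints are expressed in the resolution the camera was opened with, a rescaling step is needed when drawing them on a render target of a different size. A minimal C++ sketch, where display_width and display_height are hypothetical values for your render target (not part of the SDK):

```cpp
// Hypothetical render-target size (assumption, adapt to your application)
const float display_width = 1280.f, display_height = 720.f;

// Scale factors from the initial camera resolution to the render target
sl::Resolution camera_resolution = zed.getCameraInformation().camera_configuration.resolution;
float scale_x = display_width / (float)camera_resolution.width;
float scale_y = display_height / (float)camera_resolution.height;

for (auto &kp_2d : obj.keypoint_2d) {
    // kp_2d is expressed in the initial camera resolution; rescale it before drawing
    float x_display = kp_2d.x * scale_x;
    float y_display = kp_2d.y * scale_y;
    // user code drawing at (x_display, y_display)
}
```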
### Getting more results

When fitting is enabled in the initial configuration, more results are available depending on the chosen BODY_FORMAT. With BODY_FORMAT::POSE_34, the local rotation and local translation of each keypoint become available. From an sl::ObjectData object obj, these data can be accessed as below:

**C++**

```cpp
// collect the local rotation of each keypoint
for (auto &kp : obj.local_orientation_per_joint) {
    // kp is the local keypoint rotation, represented by a quaternion
    // user code
}

// collect the local translation of each keypoint
for (auto &kp : obj.local_position_per_joint) {
    // kp is the local keypoint translation
    // user code
}

// get the global root orientation
auto global_root_orientation = obj.global_root_orientation;
// note that the global root translation is available in obj.keypoint[root_index],
// where root_index is the root index of the body model
```

**Python**

```python
# collect the local rotation of each keypoint
for kp in obj.local_orientation_per_joint:
    pass  # kp is the local keypoint rotation, represented by a quaternion

# collect the local translation of each keypoint
for kp in obj.local_position_per_joint:
    pass  # kp is the local keypoint translation

# get the global root orientation
global_root_orientation = obj.global_root_orientation
# note that the global root translation is available in obj.keypoint[root_index],
# where root_index is the root index of the body model
```

**C#**

```csharp
// collect the local rotation of each keypoint
foreach (var kp in obj.localOrientationPerJoint) {
    // kp is the local keypoint rotation, represented by a quaternion
    // user code
}

// collect the local translation of each keypoint
foreach (var kp in obj.localPositionPerJoint) {
    // kp is the local keypoint translation
    // user code
}

// get the global root orientation
Quaternion globalRootOrientation = obj.globalRootOrientation;
```

Note that all these data are expressed in the chosen COORDINATE_SYSTEM and UNIT.

## Code Example

For code examples, check out the Tutorial and Sample on GitHub.
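As a complement to those samples, here is a sketch (not from the official samples) of how the fitted outputs fit together: it rebuilds the root transform of a skeleton from the global root orientation and the root keypoint. The root_index value is an assumption for the 34-keypoint model (PELVIS at index 0); check the skeleton format reference for your SDK version.

```cpp
// Sketch: build the root transform of a fitted skeleton (BODY_FORMAT::POSE_34)
const int root_index = 0; // assumption: PELVIS is the root joint of the 34-keypoint model

sl::float3 root_kp = obj.keypoint[root_index];    // global root translation
sl::float4 root_q = obj.global_root_orientation;  // global root orientation (quaternion)

sl::Transform root_transform;
root_transform.setOrientation(sl::Orientation(root_q));
root_transform.setTranslation(sl::Translation(root_kp.x, root_kp.y, root_kp.z));

// root_transform is expressed in the chosen COORDINATE_SYSTEM and UNIT; the local
// joint rotations from obj.local_orientation_per_joint can then be applied down
// the bone hierarchy on the engine side.
```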