Fusion

The Fusion API of the ZED SDK allows developers to easily create applications using data from multiple cameras. The module handles time synchronization and geometric calibration, and fuses 360° data coming from multiple, potentially noisy, camera and sensor sources.

Overview #

The Fusion module extends the functionalities of the ZED camera, without changing the way developers use it. It provides fine control over each camera and enables simultaneous utilization of multi-camera data. The Fusion API is continually evolving to introduce new features and enable the development of next-generation applications.

Developers can easily integrate the Fusion API into their existing applications. The module operates based on the Publish/Subscribe pattern, where the connected ZED cameras publish data, and the Fusion API subscribes to it, making data transmission seamless.

Overall, the Fusion API is a powerful tool that gives developers unprecedented control and versatility over their camera data, making it easier to build innovative applications that leverage multiple camera inputs.

Workflow #

To support most setups, the Fusion API can operate with two different workflows:

LOCAL

Presentation: Fully local application, where all the ZED cameras are connected to the same host computer, which runs both the publishers and the subscriber.

Pros:
  • Low latency
  • Efficient shared-memory protocol (even for GPU memory)
  • Can achieve high FPS with high-end hardware
  • Simple setup for development

Cons:
  • Requires a powerful GPU (large GPU memory) to support the execution of multiple camera instances
  • Requires high USB bandwidth to connect all of your cameras
  • Can be limited by camera cable length

NETWORK

Presentation: Network (edge computing) oriented application based on ZED Boxes, where each publisher and the subscriber run on different host machines.

Pros:
  • High availability (no single point of failure)
  • No limitation on the number of publishers
  • Large variety of ZED Boxes available to fit your computation requirements
  • Adapted for large-scale scenarios
  • Remote monitoring of your system with ZED Hub

Cons:
  • Network data transfer introduces a small latency (< 5 ms), depending on the network configuration

📌 Note: The Fusion module itself is lightweight in terms of compute requirements compared to the camera publishers.

Available modules #

Currently, the Fusion API integrates the following ZED SDK modules:

  • Object detection
  • Body tracking
  • Spatial mapping
  • Positional tracking with external GNSS (refer to this page for additional details on this fusion module)

Getting Started #

To achieve optimal performance and accurate results when using the Fusion API of the ZED SDK, it is crucial to calibrate your camera array. The Fusion API requires knowledge of each camera’s position to create a common 3D world representation, so the calibration process needs to be precise; otherwise, the results obtained from the Fusion API may be inconsistent.

To simplify this process, we have developed a tool called ZED360. You can find more information on its usage in its documentation. By using ZED360, you can accurately calibrate your camera array, ensuring that you get the best possible results from the Fusion API.

There are various calibration methods that you can use depending on your setup, and as the Fusion API continues to evolve, more options will be added.

The calibration process is critical, as accurate calibration is necessary to obtain reliable and precise results from the Fusion API. Proper calibration enables you to leverage the full potential of the Fusion API and its versatility.

Try it out #

Once the calibration is done, you can try out the Fusion with our C++ and Python multi-camera samples, or in Unity and Unreal Engine via their Live Link implementations.

For a local workflow, where all the cameras are connected to the same host machine, the code samples provided above are sufficient. However, an extra step is required when the cameras are connected to different machines on the local network.

Setting up a network workflow #

Each camera must run a streaming application that computes all the required data and sends it to the Fusion server over the local network. This task can be performed with the following code:

Body Tracking Fusion Sender
// ZED includes
#include <sl/Camera.hpp>


// Using std and sl namespaces
using namespace std;
using namespace sl;

int main(int argc, char **argv) {

#ifdef _SL_JETSON_
    const bool isJetson = true;
#else
    const bool isJetson = false;
#endif

    // Create and configure the ZED camera
    Camera zed;
    InitParameters init_parameters;
    init_parameters.camera_resolution = RESOLUTION::AUTO;
    init_parameters.depth_mode = DEPTH_MODE::NEURAL_LIGHT;
    init_parameters.coordinate_system = COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP;

    // Open the camera
    auto returned_state = zed.open(init_parameters);
    if (returned_state != ERROR_CODE::SUCCESS) {
        zed.close();
        return EXIT_FAILURE;
    }

    // Enable positional tracking (mandatory for body tracking)
    PositionalTrackingParameters positional_tracking_parameters;
    // If the camera is static, uncomment the following line for better performance
    //positional_tracking_parameters.set_as_static = true;

    returned_state = zed.enablePositionalTracking(positional_tracking_parameters);
    if (returned_state != ERROR_CODE::SUCCESS) {
        zed.close();
        return EXIT_FAILURE;
    }

    // Enable the Body tracking module
    BodyTrackingParameters body_tracker_params;
    body_tracker_params.enable_tracking = false; // cross-camera tracking is handled by the Fusion module
    body_tracker_params.enable_body_fitting = false; // smooth skeleton movements
    body_tracker_params.body_format = sl::BODY_FORMAT::BODY_18;
    body_tracker_params.detection_model = isJetson ? BODY_TRACKING_MODEL::HUMAN_BODY_FAST : BODY_TRACKING_MODEL::HUMAN_BODY_ACCURATE;
    //body_tracker_params.allow_reduced_precision_inference = true;

    returned_state = zed.enableBodyTracking(body_tracker_params);
    if (returned_state != ERROR_CODE::SUCCESS) {
        zed.close();
        return EXIT_FAILURE;
    }

    // Configure body tracking runtime parameters
    BodyTrackingRuntimeParameters body_tracker_parameters_rt;
    body_tracker_parameters_rt.detection_confidence_threshold = 40;
    body_tracker_parameters_rt.skeleton_smoothing = 0.7;
    
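    // Start publishing the camera data so a Fusion instance can subscribe to it (by default, streamed on port 30000)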
    CommunicationParameters configuration;
    zed.startPublishing(configuration);

    // Bodies object filled in the main loop
    Bodies bodies;
    while (true) {
        // Grab images
        auto err = zed.grab();
        if (err == ERROR_CODE::SUCCESS) {
            // Retrieve detected human bodies (made available to the subscribed Fusion instance)
            zed.retrieveBodies(bodies, body_tracker_parameters_rt);
        } else
            break; // stop on camera error or end of stream so the modules below are released properly
    }

    // Release Bodies
    bodies.body_list.clear();

    // Disable modules
    zed.disableBodyTracking();
    zed.disablePositionalTracking();
    zed.close();

    return EXIT_SUCCESS;
}

The most important part is the call to the startPublishing method, which sets this SDK instance as a “Fusion data provider”. By default, the data is streamed on port 30000, but this can be changed through the CommunicationParameters structure. This is useful when multiple cameras are connected to the same PC: in that case, each instance must use a different port.
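
For example, a publisher can be moved to another port as in the minimal sketch below. It assumes the setForLocalNetwork setter of CommunicationParameters found in recent ZED SDK releases; check the exact name and signature against your SDK version.

// Hedged sketch: publish this camera's data on port 30001 instead of the default 30000
sl::CommunicationParameters configuration;
configuration.setForLocalNetwork(30001); // each camera connected to the same PC needs its own port
zed.startPublishing(configuration);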

Configuration files #

To accommodate varying workflows and provide the ability to update without requiring changes to application code, a configuration file has been designed to contain all the necessary data for the Fusion API of the ZED SDK. The ZED SDK includes a support function that enables direct parsing of the configuration file and API setup based on its contents.
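
As a minimal sketch of that support function, here is how a configuration file can be parsed with the readFusionConfigurationFile helper used in the ZED SDK 4.x samples; the file path is a placeholder and the exact signature should be checked against your SDK version.

// Hedged sketch: parse a Fusion configuration file and list the cameras it describes
#include <cstdlib>
#include <iostream>
#include <sl/Fusion.hpp>

int main() {
    // The target coordinate system and unit are passed so that the SDK converts the stored poses
    auto configurations = sl::readFusionConfigurationFile("fusion_config.json",
            sl::COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP, sl::UNIT::METER);

    // Each entry describes one camera: serial number, input, communication settings and pose
    for (const auto &conf : configurations)
        std::cout << "Found camera " << conf.serial_number << " in the configuration file" << std::endl;

    return configurations.empty() ? EXIT_FAILURE : EXIT_SUCCESS;
}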

The configuration file contains the information the publisher needs to open the camera and the information the subscriber needs to connect to the publisher, as well as the camera pose. Data is packed per camera, referenced by its serial number. The input part is separated into:

  • zed: how the publisher opens the camera. The parameters mirror the API InputType: type is the API enum INPUT_TYPE and configuration is a string containing the input data.
  • fusion: how the subscriber contacts the publisher. If your workflow is local, simply set the type to INTRA_PROCESS; there is no need to define configuration. If your workflow runs over the network, set the type to LOCAL_NETWORK and add an ip string and a port number inside configuration.

The file also contains the camera’s world pose, defined by a rotation (in radians) and a translation (in meters). The position should be defined in UNIT::METER and COORDINATE_SYSTEM::IMAGE; the API read/write functions handle unit and coordinate system conversions.

A configuration file is composed of a list of parameters for each camera included in the calibration. You can find details on each attribute below:

Configuration File Example
{
    "12345678": { //← serial number of camera (you can find it in the ZED Explorer tool)
        "input": {
            "zed": {
                "type":  // could be "USB_SERIAL", "USB_ID", "GMSL_SERIAL", "GMSL_ID", "SVO_FILE" or "STREAM"
                "configuration": // for type X_ID: camera ID, for type STREAM: IP [and port] formatted as "IP:PORT", for type SVO_FILE, path to SVO
            },
            "fusion": {
                "type": // "LOCAL_NETWORK": (ZED Hub) Sender of this camera not running on the same machine than Fusion
                        // "INTRA_PROCESS": (Local workflow or reading SVOs) Sender of this camera & Fusion running on the same machine
                "configuration": { // needed only in case of 'LOCAL_NETWORK'
                    "ip": "192.168.1.234", // the ip of the edge device that runs the camera
                    "port": 30000 // the port the data is streamed on
                }
            }
        },
        "world": {
            "rotation": [ // orientation of the camera in radians
                0,
                0,
                0
            ],
            "translation": [ // position of the camera in meters
                0,
                0,
                0
            ]
        }
    },
// ...
// [same pattern for the other cameras]
// ...
}
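
On the subscriber side, the Fusion instance reads this configuration file, subscribes to each declared camera at its calibrated pose, and then retrieves the fused data. The sketch below follows the body tracking Fusion samples shipped with ZED SDK 4.x; the class and method names used here (Fusion, InitFusionParameters, subscribe, process, retrieveBodies) and their member names should be verified against your SDK version.

Body Tracking Fusion Receiver (sketch)
// ZED includes
#include <cstdlib>
#include <iostream>
#include <sl/Fusion.hpp>

// Using std and sl namespaces
using namespace std;
using namespace sl;

int main(int argc, char **argv) {

    // Read the calibration file produced by ZED360 ("fusion_config.json" is a placeholder path)
    auto configurations = readFusionConfigurationFile("fusion_config.json",
            COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP, UNIT::METER);
    if (configurations.empty())
        return EXIT_FAILURE;

    // Initialize the Fusion module with the same unit and coordinate system as the senders
    Fusion fusion;
    InitFusionParameters init_fusion_parameters;
    init_fusion_parameters.coordinate_system = COORDINATE_SYSTEM::RIGHT_HANDED_Y_UP;
    init_fusion_parameters.coordinate_units = UNIT::METER;
    if (fusion.init(init_fusion_parameters) != FUSION_ERROR_CODE::SUCCESS)
        return EXIT_FAILURE;

    // Subscribe to every camera listed in the configuration file, at its calibrated pose
    for (auto &conf : configurations) {
        CameraIdentifier uuid(conf.serial_number);
        if (fusion.subscribe(uuid, conf.communication_parameters, conf.pose) != FUSION_ERROR_CODE::SUCCESS)
            cout << "Unable to subscribe to camera " << conf.serial_number << endl;
    }

    // Enable fused body tracking: cross-camera tracking is performed here, not in the senders
    BodyTrackingFusionParameters body_fusion_parameters;
    body_fusion_parameters.enable_tracking = true;
    body_fusion_parameters.enable_body_fitting = false;
    fusion.enableBodyTracking(body_fusion_parameters);

    BodyTrackingFusionRuntimeParameters body_fusion_runtime_parameters;

    // Main loop: gather the data published by the senders and retrieve the fused skeletons
    Bodies fused_bodies;
    bool run = true; // in a real application, stop the loop from the UI or a signal handler
    while (run) {
        if (fusion.process() == FUSION_ERROR_CODE::SUCCESS)
            fusion.retrieveBodies(fused_bodies, body_fusion_runtime_parameters);
    }

    fusion.close();
    return EXIT_SUCCESS;
}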