How to Use PyTorch with ZED

Introduction

The ZED SDK can be interfaced with a PyTorch project to add 3D localization of objects detected with a custom neural network. In this tutorial, we will combine Mask R-CNN with the ZED SDK to detect, segment, classify and locate objects in 3D using a ZED stereo camera and PyTorch.

Installation

The Mask R-CNN 3D project depends on the following libraries:

ZED SDK and Python API
PyTorch (with cuDNN)
OpenCV
CUDA
Python 3
Apex

ZED SDK

Install the ZED SDK and Python API.

PyTorch Installation

Using Conda (recommended)

A dedicated environment can be created to set up PyTorch. Keep this environment activated while installing the following packages.

$ conda create --name pytorch1 -y
$ conda activate pytorch1

When installing PyTorch, the selected CUDA version must match the one used by the ZED SDK. Here, we use CUDA version 10.0:

$ conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
$ conda install -c conda-forge --yes --file requirements.txt

Note: Do not forget to install the ZED Python API inside your current environment.

Using Pip

$ pip3 install torch torchvision
$ pip3 install -r requirements.txt

For more information, please refer to the PyTorch setup page.

Apex Installation

We make use of NVIDIA's Apex API. To install it, run the following:

$ git clone https://github.com/NVIDIA/apex
$ cd apex
$ python3 setup.py install

Mask R-CNN Installation

Set up Mask R-CNN. If you are using a conda environment, make sure it is still active before running the following commands.

$ git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
$ cd maskrcnn-benchmark
$ python3 setup.py install

Running Mask R-CNN 3D

Download the sample project code from GitHub. The following commands are run from the sample directory.

Run the code with Python 3. The sample should detect objects captured by your ZED camera using the Mask R-CNN ResNet 50 model and localize them in 3D.

$ python zed_object_detection.py --config-file configs/caffe2/e2e_mask_rcnn_R_50_C4_1x_caffe2.yaml --min-image-size 256

Testing Other Models

Pre-trained models can be found in MODEL_ZOO.md. Selected models are downloaded automatically. Here we test Mask R-CNN with ResNet 101.

$ python zed_object_detection.py --config-file configs/caffe2/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml --min-image-size 300

Now let's test 3D keypoint extraction:

$ python zed_object_detection.py --config-file configs/caffe2/e2e_keypoint_rcnn_R_50_FPN_1x_caffe2.yaml --min-image-size 300

Other Options

You can run object segmentation on recorded videos in SVO format using the following command:

$ python zed_object_detection.py --svo-filename path/to/svo_file.svo

The best accuracy is obtained with --min-image-size 800, at the cost of a lower frame rate.

$ python zed_object_detection.py --min-image-size 800

To display heatmaps, use --show-mask-heatmaps.

$ python zed_object_detection.py --min-image-size 300 --show-mask-heatmaps

Finally, to run the model on CPU, append MODEL.DEVICE cpu to the command.

$ python zed_object_detection.py --min-image-size 300 MODEL.DEVICE cpu
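
The commands above mix named flags with a trailing key/value pair (MODEL.DEVICE cpu). The following is a minimal sketch of how such a command line could be parsed with Python's argparse. The flag names are taken from the commands above, but the defaults, the build_parser and pair_overrides helpers, and the idea that trailing arguments are treated as configuration overrides are assumptions made for illustration. This is not the actual code of zed_object_detection.py.

```python
import argparse


def build_parser():
    # Hypothetical parser: flag names come from the tutorial's commands,
    # defaults and help strings are assumptions made for this sketch.
    parser = argparse.ArgumentParser(description="Mask R-CNN 3D options (sketch)")
    parser.add_argument("--config-file", metavar="FILE", default=None,
                        help="path to a model configuration file")
    parser.add_argument("--min-image-size", type=int, default=256,
                        help="smallest image side fed to the network")
    parser.add_argument("--show-mask-heatmaps", action="store_true",
                        help="display mask heatmaps instead of binary masks")
    parser.add_argument("--svo-filename", default=None,
                        help="recorded SVO file to read instead of a live camera")
    # Everything after the named flags is collected verbatim,
    # e.g. ["MODEL.DEVICE", "cpu"].
    parser.add_argument("opts", nargs=argparse.REMAINDER, default=None,
                        help="trailing KEY VALUE configuration overrides")
    return parser


def pair_overrides(opts):
    """Turn a flat [KEY, VALUE, KEY, VALUE, ...] list into a dict."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must be given as KEY VALUE pairs")
    return dict(zip(opts[0::2], opts[1::2]))


# Equivalent of: --min-image-size 300 MODEL.DEVICE cpu
args = build_parser().parse_args(["--min-image-size", "300", "MODEL.DEVICE", "cpu"])
overrides = pair_overrides(args.opts)
print(args.min_image_size, overrides)
```

In this sketch the overrides must come after all named flags, because argparse.REMAINDER collects every remaining argument once the first positional value is seen.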