How to Use PyTorch with ZED
Introduction #
The ZED SDK can be interfaced with a PyTorch project to add 3D localization of objects detected with a custom neural network. In this tutorial, we will combine Mask R-CNN with the ZED SDK to detect, segment, classify and locate objects in 3D using a ZED stereo camera and PyTorch.
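To make "locating objects in 3D" concrete, here is a minimal sketch of one way a segmentation mask could be combined with a per-pixel point cloud. It assumes the mask and the point cloud are aligned pixel for pixel, and it takes the median of the valid points under the mask. The function name `locate_object_3d` and the nested-list layout are hypothetical and chosen for illustration; the sample project may compute positions differently.

```python
import math
from statistics import median


def locate_object_3d(mask, point_cloud):
    """Return the median (x, y, z) of the valid 3D points under a binary mask.

    mask:        rows of 0/1 values, one per pixel (assumed layout).
    point_cloud: rows of (x, y, z) tuples aligned with the mask (assumed layout).
    Returns None when no valid point lies under the mask.
    """
    xs, ys, zs = [], [], []
    for mask_row, cloud_row in zip(mask, point_cloud):
        for inside, (x, y, z) in zip(mask_row, cloud_row):
            # Skip pixels outside the object and pixels without a valid depth.
            if inside and all(math.isfinite(v) for v in (x, y, z)):
                xs.append(x)
                ys.append(y)
                zs.append(z)
    if not xs:
        return None
    # The median is less sensitive than the mean to stray background points.
    return median(xs), median(ys), median(zs)


if __name__ == "__main__":
    nan = float("nan")
    mask = [[1, 1, 1],
            [1, 0, 0]]
    cloud = [[(0.0, 0.0, 1.0), (1.0, 0.5, 2.0), (2.0, 1.0, 3.0)],
             [(nan, nan, nan), (50.0, 50.0, 50.0), (60.0, 60.0, 60.0)]]
    print(locate_object_3d(mask, cloud))
```

In a real pipeline the same idea would normally be applied to NumPy arrays rather than nested lists.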
Installation #
The Mask R-CNN 3D project depends on the following libraries:
- ZED SDK and Python API
- PyTorch (with cuDNN)
- OpenCV
- CUDA
- Python 3
- Apex
ZED SDK #
Install the ZED SDK and Python API.
PyTorch Installation #
Using Conda (recommended) #
A dedicated environment can be created to set up PyTorch. Keep your environment activated while installing the following packages.
$ conda create --name pytorch1 -y
$ conda activate pytorch1
When installing PyTorch, the selected CUDA version must match the one used by the ZED SDK. Here, we use CUDA 10.0:
$ conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
$ conda install -c conda-forge --yes --file requirements.txt
📌 Note: Do not forget to install the ZED Python API inside your current environment.
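To confirm that the installed PyTorch build targets the same CUDA version as the ZED SDK, a short check along these lines can help. It is a sketch: the helper names are hypothetical, and the expected version `"10.0"` is an assumption that should be replaced with the CUDA version your ZED SDK was installed for. `torch.version.cuda` is `None` on CPU-only builds.

```python
def cuda_major_minor(version):
    """Return (major, minor) from a version string such as '10.0' or '10.0.130'."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def cuda_versions_match(pytorch_cuda, sdk_cuda):
    """Compare two CUDA version strings on their major.minor components."""
    if pytorch_cuda is None:  # CPU-only PyTorch build
        return False
    return cuda_major_minor(pytorch_cuda) == cuda_major_minor(sdk_cuda)


if __name__ == "__main__":
    expected = "10.0"  # assumption: CUDA version used by your ZED SDK install
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print("PyTorch CUDA build:", torch.version.cuda)
        print("GPU visible:", torch.cuda.is_available())
        print("Matches ZED SDK CUDA:",
              cuda_versions_match(torch.version.cuda, expected))
```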
Using Pip #
$ pip3 install torch torchvision
$ pip3 install -r requirements.txt
For more information, please refer to the PyTorch setup page.
Apex Installation #
We make use of NVIDIA’s Apex API. To install it, run the following:
$ git clone https://github.com/NVIDIA/apex
$ cd apex
$ python3 setup.py install
Mask R-CNN Installation #
Set up Mask R-CNN. If you’re using a conda environment, make sure it is still active before running the following commands.
$ git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
$ cd maskrcnn-benchmark
$ python3 setup.py install
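Before running the sample, it can be useful to verify that every dependency is importable from the active environment. The sketch below uses only the standard library. The module names (`pyzed` for the ZED Python API, `torch`, `torchvision`, `cv2` for OpenCV, `apex`, and `maskrcnn_benchmark`) are the usual import names for these packages, but treat the list as an assumption and adjust it if your installation differs.

```python
import importlib.util

# Assumed top-level import names for the dependencies listed in this tutorial.
REQUIRED_MODULES = ["pyzed", "torch", "torchvision", "cv2", "apex", "maskrcnn_benchmark"]


def missing_modules(names):
    """Return the subset of module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All dependencies found")
```

Run it with the same interpreter (and conda environment, if any) that you will use to launch the sample.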
Running Mask R-CNN 3D #
Download the sample project code from GitHub. The next commands are launched from the sample directory.
Run the code with Python 3. The sample should now detect objects captured by your ZED camera using the Mask R-CNN ResNet 50 model and localize them in 3D.
$ python zed_object_detection.py --config-file configs/caffe2/e2e_mask_rcnn_R_50_C4_1x_caffe2.yaml --min-image-size 256
Testing Other Models #
Pre-trained models can be found in MODEL_ZOO.md. Selected models are downloaded automatically. Here we test Mask R-CNN with ResNet 101.
$ python zed_object_detection.py --config-file configs/caffe2/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml --min-image-size 300
Now let’s test 3D keypoint extraction:
$ python zed_object_detection.py --config-file configs/caffe2/e2e_keypoint_rcnn_R_50_FPN_1x_caffe2.yaml --min-image-size 300
Other Options #
You can launch object segmentation on recorded videos in SVO format using the following command:
$ python zed_object_detection.py --svo-filename path/to/svo_file.svo
The best accuracy can be obtained using --min-image-size 800 (with reduced FPS).
$ python zed_object_detection.py --min-image-size 800
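The trade-off between accuracy and FPS comes from the number of pixels the network has to process. The sketch below estimates the input resolution under two assumptions that are not stated in this tutorial: that the image is scaled so its shorter side equals the requested minimum size, and that the longer side is capped at a maximum (1333 is used here as an illustrative value). The function `resized_shape` is hypothetical and only approximates the resizing done by the model's preprocessing.

```python
def resized_shape(width, height, min_size, max_size=1333):
    """Estimate the (width, height) fed to the model for a given minimum size.

    Assumes the shorter side is scaled to min_size unless that would push the
    longer side beyond max_size, in which case the longer side is scaled to max_size.
    """
    scale = min_size / min(width, height)
    if max(width, height) * scale > max_size:
        scale = max_size / max(width, height)
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    for min_size in (256, 300, 800):
        w, h = resized_shape(1280, 720, min_size)
        print(f"min-image-size {min_size}: {w}x{h} ({w * h} pixels)")
```

For a 1280x720 frame under these assumptions, going from 256 to 800 increases the pixel count by more than eight times, which is consistent with the lower frame rate.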
To display heatmaps, use --show-mask-heatmaps.
$ python zed_object_detection.py --min-image-size 300 --show-mask-heatmaps
Finally, to run the model on the CPU, use MODEL.DEVICE cpu.
$ python zed_object_detection.py --min-image-size 300 MODEL.DEVICE cpu
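The commands above mix named flags with trailing KEY VALUE pairs such as MODEL.DEVICE cpu. The following sketch shows how a command line of this shape could be parsed with `argparse`. It is an illustration, not the sample's actual code: the function names, help strings, and the default of 256 for --min-image-size are assumptions.

```python
import argparse


def build_parser():
    """Build an illustrative parser for the options used in this tutorial."""
    parser = argparse.ArgumentParser(description="Mask R-CNN 3D options (illustrative)")
    parser.add_argument("--config-file", metavar="FILE",
                        help="path to the model configuration file")
    parser.add_argument("--min-image-size", type=int, default=256,  # assumed default
                        help="smallest image side fed to the model")
    parser.add_argument("--svo-filename", default=None,
                        help="optional SVO recording to use instead of a live camera")
    parser.add_argument("--show-mask-heatmaps", action="store_true",
                        help="display mask heatmaps instead of binary masks")
    # Everything after the named flags is collected as raw KEY VALUE overrides.
    parser.add_argument("opts", nargs=argparse.REMAINDER,
                        help="trailing KEY VALUE configuration overrides")
    return parser


def parse_overrides(opts):
    """Turn ['MODEL.DEVICE', 'cpu'] into {'MODEL.DEVICE': 'cpu'}."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must be given as KEY VALUE pairs")
    return dict(zip(opts[0::2], opts[1::2]))


if __name__ == "__main__":
    args = build_parser().parse_args()
    print("min image size:", args.min_image_size)
    print("overrides:", parse_overrides(args.opts))
```

With this layout, named flags must come before the trailing overrides, which matches the order used in the commands above.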