- (2023/3) SPVNAS is integrated into MMDetection3D.
- (2021/12) SPVNAS wins second place in the NuScenes LiDAR Panoptic Segmentation challenge.
- (2021/5) SPVNAS wins first place in the NuScenes LiDAR Semantic Segmentation challenge.
- (2020/8) Check out our ECCV 2020 tutorial on AutoML for Efficient 3D Deep Learning.
- (2020/7) SPVNAS achieves the state-of-the-art performance on the SemanticKITTI leaderboard leaderboard.
- (2020/7) SPVNAS is accepted to ECCV 2020.
The code is built with following libraries:
For easy installation, use conda:
conda create -n torch python=3.7
conda activate torch
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
conda install numba opencv
pip install torchpack
pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git
Please follow the instructions from here to download the SemanticKITTI dataset (both KITTI Odometry dataset and SemanticKITTI labels) and extract all the files in the sequences
folder to /dataset/semantic-kitti
. You shall see 22 folders 00, 01, …, 21; each with subfolders named velodyne
and labels
.
We share the pretrained models for MinkowskiNets, our manually designed SPVCNN models and also SPVNAS models found by our 3D-NAS pipeline. All the pretrained models are available in the model zoo. Currently, we release the models trained on sequences 00-07 and 09-10 and evaluated on sequence 08.
#Params (M) | MACs (G) | mIoU (paper) | mIoU (reprod.) | |
---|---|---|---|---|
SemanticKITTI_val_MinkUNet@29GMACs |
5.5 | 28.5 | 58.9 | 59.3 |
SemanticKITTI_val_SPVCNN@30GMACs |
5.5 | 30.0 | 60.7 | 60.8 ± 0.5 |
SemanticKITTI_val_SPVNAS@20GMACs |
3.3 | 20.0 | 61.5 | - |
SemanticKITTI_val_SPVNAS@25GMACs |
4.5 | 24.6 | 62.9 | - |
SemanticKITTI_val_MinkUNet@46GMACs |
8.8 | 45.9 | 60.3 | 60.0 |
SemanticKITTI_val_SPVCNN@47GMACs |
8.8 | 47.4 | 61.4 | 61.5 ± 0.2 |
SemanticKITTI_val_SPVNAS@35GMACs |
7.0 | 34.7 | 63.5 | - |
SemanticKITTI_val_MinkUNet@114GMACs |
21.7 | 113.9 | 61.1 | 61.9 |
SemanticKITTI_val_SPVCNN@119GMACs |
21.8 | 118.6 | 63.8 | 63.7 ± 0.4 |
SemanticKITTI_val_SPVNAS@65GMACs |
10.8 | 64.5 | 64.7 | - |
Here, the results are reproduced using 8 NVIDIA RTX 2080Ti GPUs. Result variation for each single model is due to the existence of floating point atomic addition operation in our TorchSparse CUDA backend.
You can run the following command to test the performance of SPVNAS / SPVCNN / MinkUNet models.
torchpack dist-run -np [num_of_gpus] python evaluate.py configs/semantic_kitti/default.yaml --name [num_of_net]
For example, to test the model SemanticKITTI_val_SPVNAS@65GMACs
on one GPU, you may run
torchpack dist-run -np 1 python evaluate.py configs/semantic_kitti/default.yaml --name SemanticKITTI_val_SPVNAS@65GMACs
You can run the following command (on a headless server) to visualize the predictions of SPVNAS / SPVCNN / MinkUNet models:
xvfb-run --server-args="-screen 0 1024x768x24" python visualize.py
If you are running the code on a computer with monitor, you may also directly run
python visualize.py
The visualizations will be generated in assets/
.
We currently release the training code for manually-designed baseline models (SPVCNN and MinkowskiNets). You may run the following code to train the model from scratch:
torchpack dist-run -np [num_of_gpus] python train.py configs/semantic_kitti/[model name]/[config name].yaml
For example, to train the model SemanticKITTI_val_SPVCNN@30GMACs
, you may run
torchpack dist-run -np [num_of_gpus] python train.py configs/semantic_kitti/spvcnn/cr0p5.yaml
To train the model in a non-distributed environment without MPI, i.e. on a single GPU, you may directly call train.py
with the --distributed False
argument:
python train.py configs/semantic_kitti/spvcnn/cr0p5.yaml --distributed False
The code related to architecture search will be coming soon!
If you use this code for your research, please cite our paper.
@inproceedings{tang2020searching,
title = {Searching Efficient 3D Architectures with Sparse Point-Voxel Convolution},
author = {Tang, Haotian* and Liu, Zhijian* and Zhao, Shengyu and Lin, Yujun and Lin, Ji and Wang, Hanrui and Han, Song},
booktitle = {European Conference on Computer Vision},
year = {2020}
}