Aerial image segmentation is semantic segmentation from a top-down perspective. It has several challenging characteristics: a strong imbalance in the foreground-background distribution, complex backgrounds, intra-class heterogeneity, inter-class homogeneity, and small objects. To handle these problems, we build on the advantages of Transformers and propose AerialFormer, which unifies Transformers in the contracting path with lightweight Multi-Dilated Convolutional Neural Networks (MD-CNNs) in the expanding path. AerialFormer is designed as a hierarchical structure in which the Transformer encoder outputs multi-scale features and the MD-CNN decoder aggregates information across those scales. It thus takes both local and global context into consideration to produce powerful representations and high-resolution segmentation. We benchmarked AerialFormer on three common datasets: iSAID, LoveDA, and Potsdam. Comprehensive experiments and extensive ablation studies show that AerialFormer outperforms previous state-of-the-art methods.
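As a rough illustration of why dilated convolutions let a lightweight decoder see both local and wider context, the sketch below computes how far a 3x3 kernel reaches at several dilation rates. The dilation rates (1, 2, 3) and the helper name `effective_kernel_size` are assumptions chosen for illustration; they are not taken from the AerialFormer configuration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by a dilated kernel: d * (k - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


# Hypothetical dilation rates, used only to show the trend.
for d in (1, 2, 3):
    span = effective_kernel_size(3, d)
    print(f"dilation={d}: a 3x3 kernel spans {span} pixels per side")
```

The number of weights stays at 3x3 in every case; only the spacing between sampled pixels grows, which is why combining several dilation rates is a cheap way to mix scales.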
Our code is built on mmsegmentation, which is updated frequently, so make sure you use the same or a compatible version. Please refer to get_started for installation and dataset_prepare for dataset preparation in mmsegmentation. Note, however, that NOT all of the code is identical to mmsegmentation (e.g. the Potsdam dataset).
Since some datasets do not allow redistribution, you need to obtain the zip files yourself. Please check mmsegmentation/dataset_prepare for how to get them.
After that, run the following commands to prepare the datasets (iSAID, LoveDA, Potsdam).
iSAID
Download the original images from DOTA and the annotations from iSAID. Put your dataset source files in one directory. For more details, check the iSAID DevKit.
python tools/convert_datasets/isaid.py /path/to/iSAID
Potsdam
For the Potsdam dataset, run one of the following commands to re-organize the dataset. Put your dataset source files in one directory. We used '2_Ortho_RGB.zip' and '5_Labels_all_noBoundary.zip'.
- With clutter (6 classes).
python tools/convert_datasets/potsdam.py /path/to/potsdam
- Without clutter (5 classes).
python tools/convert_datasets/potsdam_no_clutter.py /path/to/potsdam
Note that we changed some settings from the original convert_dataset code from mmsegmentation.
LoveDA
Download the dataset from Google Drive here. For LoveDA dataset, please run the following command to re-organize the dataset.
python tools/convert_datasets/loveda.py /path/to/loveDA
More details about LoveDA can be found here.
We use mmcv-full==1.7.1 and mmsegmentation==0.30.0. For the other dependencies, please follow mmsegmentation.
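Because version mismatches are a common source of errors, a quick check of the installed packages can help. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper name `check_versions` is hypothetical and not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions stated in this README.
EXPECTED = {"mmcv-full": "1.7.1", "mmsegmentation": "0.30.0"}


def check_versions(expected, get_version=version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = get_version(name)
        except PackageNotFoundError:
            have = None  # the package is not installed at all
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

An empty result means both packages match the versions listed above.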
If you have not installed Singularity yet, please refer to AICV for installation instructions.
Environment Setup
Build Image from docker/Dockerfile
export REGISTRY_NAME="user"
export IMAGE_NAME="aerialformer"
docker build -t $REGISTRY_NAME/$IMAGE_NAME docker/ # You can use 'thanyu/aerialformer'
Training
- Single GPU
export DATAPATH="path/to/data" #If you do not specify, it'll be "$PWD/data"
bash tools/singularity_train.sh configs/path/to/config
For example, to train AerialFormer-T on LoveDA dataset:
bash tools/singularity_train.sh configs/aerialformer/aerialformer_tiny_512x512_loveda.py
- Multi GPUs
export DATAPATH="path/to/data" #If you do not specify, it'll be "$PWD/data"
bash tools/singularity_dist_train.sh configs/path/to/config num_gpus
For example, to train AerialFormer-S on LoveDA dataset:
bash tools/singularity_dist_train.sh configs/aerialformer/aerialformer_small_512x512_loveda.py 2
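The `DATAPATH` fallback described in the comments above can be pictured with the following sketch. It only mirrors the stated behaviour (use `DATAPATH` if set, otherwise `$PWD/data`); the function name `resolve_datapath` is hypothetical, and treating an empty value as unset is an assumption, since the Singularity scripts themselves may handle that case differently.

```python
import os


def resolve_datapath(environ=os.environ, cwd=None):
    """Return DATAPATH if it is set and non-empty, else '<cwd>/data'."""
    cwd = os.getcwd() if cwd is None else cwd
    value = environ.get("DATAPATH")
    # Assumption: an empty DATAPATH is treated the same as an unset one.
    return value if value else os.path.join(cwd, "data")


print(resolve_datapath())
```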
Evaluation
- Single GPU
bash tools/singularity_test.sh configs/path/to/config work_dirs/path/to/trained_weight --eval metrics
For example, to test AerialFormer-T on LoveDA dataset:
bash tools/singularity_test.sh configs/aerialformer/aerialformer_tiny_512x512_loveda.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/latest.pth --eval mIoU
- Multi GPUs
bash tools/singularity_dist_test.sh configs/path/to/config work_dirs/path/to/trained_weight num_gpus --eval metrics
For example, to test AerialFormer-S on LoveDA dataset:
bash tools/singularity_dist_test.sh work_dirs/aerialformer_small_512x512_loveda/2023_0612_1009/aerialformer_small_512x512_loveda.py work_dirs/aerialformer_small_512x512_loveda/2023_0612_1009/latest.pth 2 --eval mIoU
Environment Setup
STEP 1. Install mmsegmentation with the following commands.
For more information, refer to mmsegmentation/get_started.
pip install -U openmim && mim install mmcv-full=="1.7.1"
pip install mmsegmentation==0.30.0
STEP 2. Clone this repository and install.
git clone https://github.com/UARK-AICV/AerialFormer.git
cd AerialFormer
pip install -v -e .
Training
- Single GPU
python tools/train.py configs/path/to/config
For example, to train AerialFormer-T on LoveDA dataset:
python tools/train.py configs/aerialformer/aerialformer_tiny_512x512_loveda.py
- Multi GPUs
bash tools/dist_train.sh configs/path/to/config num_gpus
For example, to train AerialFormer-B on LoveDA dataset on two GPUs:
bash tools/dist_train.sh configs/aerialformer/aerialformer_base_512x512_loveda.py 2
Note that batch size matters. We use a batch size of 8.
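When the number of GPUs changes, the per-GPU batch size has to change too if the overall batch size is to stay the same. The sketch below assumes that 8 is the total batch size across all GPUs (the README does not say whether it is total or per GPU) and that the configs use mmsegmentation 0.x's `data.samples_per_gpu` field; the helper name `samples_per_gpu` is hypothetical.

```python
def samples_per_gpu(total_batch_size: int, num_gpus: int) -> int:
    """Per-GPU batch size that keeps total_batch_size fixed across num_gpus."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if total_batch_size % num_gpus != 0:
        raise ValueError("total batch size must be divisible by num_gpus")
    return total_batch_size // num_gpus


# Assumption: 8 is the total batch size, as discussed in the text above.
for n in (1, 2, 4, 8):
    print(f"{n} GPU(s): --cfg-options data.samples_per_gpu={samples_per_gpu(8, n)}")
```

If this repository's `tools/train.py` keeps mmsegmentation's `--cfg-options` flag, the printed override can be appended to the training command; otherwise set the value in the config file directly.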
Evaluation
- Single GPU
python tools/test.py configs/path/to/config work_dirs/path/to/checkpoint --eval metrics
For example, to test AerialFormer-T on LoveDA dataset:
python tools/test.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/aerialformer_tiny_512x512_loveda.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/latest.pth --eval mIoU
- Multi GPUs
bash tools/dist_test.sh configs/path/to/config work_dirs/path/to/checkpoint num_gpus --eval metrics
For example, to test AerialFormer-T on LoveDA dataset on two GPUs:
bash tools/dist_test.sh work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/aerialformer_tiny_512x512_loveda.py work_dirs/aerialformer_tiny_512x512_loveda/2023_0101_0000/latest.pth 2 --eval mIoU
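For readers unfamiliar with the `mIoU` metric requested by `--eval mIoU`, here is a minimal pure-Python sketch of how mean Intersection-over-Union is typically computed from flattened label arrays. It is an illustration, not the evaluation code used by mmsegmentation; the function names are hypothetical, and the ignore index of 255 is an assumption based on mmsegmentation's usual default.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count pixels as cm[true_class][predicted_class], skipping ignored labels."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # ignored pixels do not contribute to any class
        cm[t][p] += 1
    return cm


def mean_iou(cm):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(row[c] for row in cm) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and target
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes; the last pixel is ignored.
target = [0, 0, 1, 1, 2, 255]
pred = [0, 1, 1, 1, 2, 2]
print(round(mean_iou(confusion_matrix(pred, target, num_classes=3)), 4))
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18.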
We thank the following open-source project(s).
If you find this work helpful, please consider citing the following paper:
@article{yamazaki2023aerialformer,
title={AerialFormer: Multi-resolution Transformer for Aerial Image Segmentation},
author={Yamazaki, Kashu and Hanyu, Taisei and Tran, Minh and Garcia, Adrian and Tran, Anh and McCann, Roy and Liao, Haitao and Rainwater, Chase and Adkins, Meredith and Molthan, Andrew and others},
journal={arXiv preprint arXiv:2306.06842},
year={2023}
}