An Efficient Saliency Prediction Model for Unmanned Aerial Vehicle Video (ISPRS 2022)

IIP_UAVSal_Saliency

This repository provides a re-implementation of the UAVSal model.

Related Project

  • Kao Zhang, Zhenzhong Chen, Shan Liu. A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction. IEEE Transactions on Image Processing (TIP), vol. 30, pp. 572-587, 2021.
    GitHub: https://github.com/zhangkao/IIP_STRNN_Saliency

  • Kao Zhang, Zhenzhong Chen. Video Saliency Prediction Based on Spatial-Temporal Two-Stream Network. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 29, no. 12, pp. 3544-3557, 2019.
    GitHub: https://github.com/zhangkao/IIP_TwoS_Saliency

Installation

Environment:

The code was developed with Python 3.6+, PyTorch 1.4+, and CUDA 10.0+. Other software versions may cause compatibility problems.

  • Windows 10/11 or Ubuntu 20.04
  • Anaconda (latest), Python
  • CUDA, cuDNN

Python requirements

You can create a new environment in Anaconda as follows:

*For GeForce GTX 10 series GPUs, such as GTX 1080, Titan Xp, etc. (PyTorch 1.4.0~1.7.1, Python 3.6~3.8)

    conda create -n uavsal python=3.8
    conda activate uavsal
    conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
    pip install numpy hdf5storage h5py==2.10.0 scipy matplotlib opencv-python scikit-image torchsummary

*For GeForce RTX 30 series GPUs, such as RTX 3060, RTX 3080, etc.
    
    conda create -n uavsal python=3.7
    conda activate uavsal
    conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
    pip install numpy hdf5storage h5py==2.10.0 scipy matplotlib opencv-python scikit-image torchsummary
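
After creating the environment, it may help to confirm that the installed packages are importable before running the demos. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical and not part of this repository, and the import names `cv2` (for opencv-python) and `skimage` (for scikit-image) are the usual import names for those distributions.

```python
import importlib.util

# Import names for the packages installed above. Some distributions are
# imported under a different name than the one used to install them.
REQUIRED = ["torch", "torchvision", "numpy", "hdf5storage", "h5py",
            "scipy", "matplotlib", "cv2", "skimage", "torchsummary"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("All packages found." if not missing else f"Missing: {missing}")
```

An empty result only means the packages are present; it does not verify that their versions match the ones pinned above.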

Pre-trained models

Download the pre-trained models and put them into the "weights" folder.

Train and Test

The parameters

  • Set the working directory "dataDir" to your own path in the "Demo_Test.py" and "Demo_Train_Test.py" files, for example:

      dataDir = '/home/name/DataSet/'
    
  • More parameters are in the "train" and "test" functions.

  • Run "Demo_Test.py" to test the model, or "Demo_Train_Test.py" to train and test it.
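
If the same scripts are used on several machines, one option is to read the dataset root from an environment variable instead of editing the path each time. This is only a sketch: the function `resolve_data_dir` and the variable name `UAVSAL_DATA_DIR` are hypothetical and do not exist in the repository. It keeps a trailing separator because the example path above ends with one.

```python
import os


def resolve_data_dir(default="/home/name/DataSet/", env_var="UAVSAL_DATA_DIR"):
    """Pick the dataset root from an environment variable, else use `default`.

    `env_var` is a hypothetical name chosen for this sketch.
    """
    path = os.environ.get(env_var, default)
    # Joining with an empty component guarantees exactly one trailing separator.
    return os.path.join(os.path.expanduser(path), "")


dataDir = resolve_data_dir()
```

With this in place, setting `UAVSAL_DATA_DIR` before launching a demo would override the default path without touching the script.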

The full training process:

  • We initialize the SRF-Net with the pretrained MobileNet V2 and fine-tune the model on the SALICON dataset. Then we train the whole model on EyeTrackUAV2 and AVS1K, respectively.

The training and testing datasets:

The training and test data examples:

Output

It is easy to change the output format in our code.

  • The results of the video task are saved in ".mat" (uint8) format.
  • You can get color visualization results with the "Visualization Tools".
  • You can evaluate the performance with the "EvalScores Tools".
  • You can get the parameter size of each component with the "Getmodelsize Tools".
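
To illustrate what storing a saliency map as uint8 involves, here is a minimal NumPy sketch that scales a floating-point map to the 0..255 range and back. How this repository scales its predictions before saving is not described here, so the min-max scaling and the helper names `to_uint8` and `to_float` are assumptions made for illustration only.

```python
import numpy as np


def to_uint8(saliency):
    """Min-max scale a float saliency map to the 0..255 uint8 range."""
    saliency = np.asarray(saliency, dtype=np.float64)
    span = saliency.max() - saliency.min()
    if span == 0:
        # A constant map carries no contrast; return all zeros.
        return np.zeros(saliency.shape, dtype=np.uint8)
    scaled = (saliency - saliency.min()) / span
    return np.round(scaled * 255).astype(np.uint8)


def to_float(saliency_u8):
    """Map a uint8 saliency map back to floats in [0, 1]."""
    return np.asarray(saliency_u8, dtype=np.float64) / 255.0
```

The round trip is lossy: values are quantized to 256 levels, which is usually acceptable for visualization and for most saliency metrics.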

Results: ALL (5.8G):

The model is trained using the Adam optimizer with lr=0.0001 and weight_decay=0.00005

The model is trained using the Adam optimizer with lr=0.00001 and weight_decay=0.000005
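
The two optimizer settings quoted above can be written down as keyword arguments for `torch.optim.Adam`. This is a hedged sketch rather than the repository's training code: the names `ADAM_CONFIGS` and `build_optimizer` are hypothetical, and the text here does not state which setting belongs to which dataset or training stage.

```python
# Hyperparameters quoted in this README, in the order they appear.
ADAM_CONFIGS = [
    {"lr": 1e-4, "weight_decay": 5e-5},
    {"lr": 1e-5, "weight_decay": 5e-6},
]


def build_optimizer(params, config):
    """Create an Adam optimizer for `params` from one of the configs above."""
    # Imported lazily so this module can be loaded without PyTorch installed.
    import torch
    return torch.optim.Adam(params, **config)
```

A call such as `build_optimizer(model.parameters(), ADAM_CONFIGS[0])` would then create the optimizer for a given `model`.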

The model can achieve a faster speed (85 FPS) with similar performance by slightly reducing the input size from the original 360 x 640 pixels to 288 x 512 pixels.
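
The size of this reduction can be worked out directly. The short calculation below only shows how many pixels the network processes at each input size; it does not claim that speed scales linearly with pixel count, and the helper `pixel_count` is introduced purely for illustration.

```python
def pixel_count(height, width):
    """Number of pixels in one input frame."""
    return height * width


original = pixel_count(360, 640)   # 230400 pixels
reduced = pixel_count(288, 512)    # 147456 pixels

side_scale = 288 / 360             # each side is scaled by 0.8 (512 / 640 is also 0.8)
pixel_ratio = reduced / original   # 0.64 of the original pixel count
```

Each side shrinks by 20%, so the smaller input contains 36% fewer pixels per frame while keeping the same aspect ratio.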

Video Demo

A video demo (OneDrive, 596M) is provided for comparison with state-of-the-art methods, including:

  • DL based models: STRNN*, TwoS*, TASED*, UNISAL*

  • non-DL based models: GBVSm, AWSD.

  • The models fine-tuned on the corresponding dataset (UAV2 and AVS1K) are marked with *.

  • After fine-tuning, the performance of these models improved significantly.

  • The first four scenes are from the UAV2-TE dataset; the rest are from AVS1K-TE.

Paper & Citation

If you use the UAVSal video saliency model, please cite the following paper:

@article{zhang2022an,
  title={An Efficient Saliency Prediction Model for Unmanned Aerial Vehicle Video},
  author={Zhang, Kao and Chen, Zhenzhong and Li, Songnan and Liu, Shan},
  journal={ISPRS Journal of Photogrammetry and Remote Sensing},
  volume={xxxx},
  pages={xxxx},
  year={2022}
}

Contact

Kao ZHANG
Laboratory of Intelligent Information Processing (LabIIP)
Wuhan University, Wuhan, China.
Email: [email protected]

Zhenzhong CHEN (Professor and Director)
Laboratory of Intelligent Information Processing (LabIIP)
Wuhan University, Wuhan, China.
Email: [email protected]
Web: http://iip.whu.edu.cn/~zzchen/
