Wavelet Transform Feature Enhancement for Semantic Segmentation of Remote Sensing Images

Published: 2023-12-06 Issue: 24 Volume: 15 Page: 5644
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing

Author:

Li Yifan¹, Liu Ziqian¹, Yang Junli¹, Zhang Haopeng²³⁴

Affiliation:

1. International School, Beijing University of Posts and Telecommunications, Beijing 100876, China

2. Department of Aerospace Information Engineering, School of Astronautics, Beihang University, Beijing 102206, China

3. Beijing Key Laboratory of Digital Media, Beijing 102206, China

4. Key Laboratory of Spacecraft Design Optimization and Dynamic Simulation Technologies, Ministry of Education, Beijing 102206, China

Abstract

With developments in deep learning, semantic segmentation of remote sensing images has made great progress. Currently, mainstream methods are based on convolutional neural networks (CNNs) or vision transformers. However, these methods are not very effective at extracting features from remote sensing images, which are usually of high resolution with plenty of detail. Operations such as downsampling cause the loss of such features. To address this problem, we propose a novel module called Hierarchical Wavelet Feature Enhancement (WFE). The WFE module involves three sequential steps: (1) performing multi-scale decomposition of an input image based on the discrete wavelet transform; (2) enhancing the high-frequency sub-bands of the input image; and (3) feeding them back to the corresponding layers of the network. Our module can be easily integrated into various existing CNNs and transformers, and does not require additional pre-training. We conducted experiments on the ISPRS Potsdam and ISPRS Vaihingen datasets, with results showing that our method improves on CNN and transformer baselines at little additional computational cost.
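
The three steps named in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the abstract does not specify the wavelet, the enhancement operation, or the fusion rule, so the code assumes a Haar wavelet, a fixed scalar gain standing in for the (presumably learned) enhancement, and simple addition as the feedback step. All function names (`haar_dwt2`, `wavelet_pyramid`, `enhance`, `inject`) are hypothetical.

```python
# Minimal sketch of "decompose, enhance, feed back" on a single-channel image.
# Assumptions (not from the paper): Haar wavelet, scalar gain, additive fusion.

def haar_dwt2(img):
    """One level of an orthonormal 2-D Haar DWT.

    img is a list of rows with even height and width. Returns the
    low-frequency band and a tuple of three high-frequency bands,
    each at half the input resolution. Sub-band naming conventions
    vary between libraries; here lh/hl/hh are just labels.
    """
    ll, lh, hl, hh = [], [], [], []
    for i in range(0, len(img), 2):
        r_ll, r_lh, r_hl, r_hh = [], [], [], []
        for j in range(0, len(img[0]), 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            r_ll.append((a + b + c + d) / 2)  # local average
            r_lh.append((a + b - c - d) / 2)  # difference between rows
            r_hl.append((a - b + c - d) / 2)  # difference between columns
            r_hh.append((a - b - c + d) / 2)  # diagonal difference
        ll.append(r_ll)
        lh.append(r_lh)
        hl.append(r_hl)
        hh.append(r_hh)
    return ll, (lh, hl, hh)


def wavelet_pyramid(img, levels):
    """Step 1: multi-scale decomposition by re-applying the DWT to the LL band."""
    pyramid = []
    ll = img
    for _ in range(levels):
        ll, high = haar_dwt2(ll)
        pyramid.append(high)  # pyramid[k] has resolution 1 / 2**(k + 1)
    return ll, pyramid


def enhance(high, gain=2.0):
    """Step 2 (placeholder): amplify high-frequency coefficients by a fixed gain."""
    return tuple([[gain * v for v in row] for row in band] for band in high)


def inject(feature, high):
    """Step 3 (placeholder): add the detail magnitude to a same-sized feature map."""
    return [
        [f + sum(abs(band[i][j]) for band in high) for j, f in enumerate(row)]
        for i, row in enumerate(feature)
    ]


if __name__ == "__main__":
    image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]  # vertical edge
    coarse, bands = wavelet_pyramid(image, levels=2)
    stage_feature = [[1.0]]  # stand-in for a network feature map at 1/4 scale
    print(inject(stage_feature, enhance(bands[1])))
```

In a real network the feature maps would be multi-channel tensors, and the enhanced sub-bands at each scale would be fused into the encoder stage with matching spatial resolution; the list-of-lists representation here only keeps the example self-contained.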

Funder

National Natural Science Foundation of China

Beijing University Student Innovation and Entrepreneurship Training Intercollegiate Cooperation Program

Publisher

MDPI AG

Subject

General Earth and Planetary Sciences

Link

https://www.mdpi.com/2072-4292/15/24/5644/pdf
