Wavelet Transform Feature Enhancement for Semantic Segmentation of Remote Sensing Images
Published: 2023-12-06
Issue: 24
Volume: 15
Page: 5644
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing
Author:
Li Yifan (1), Liu Ziqian (1), Yang Junli (1), Zhang Haopeng (2, 3, 4)
Affiliation:
1. International School, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Department of Aerospace Information Engineering, School of Astronautics, Beihang University, Beijing 102206, China
3. Beijing Key Laboratory of Digital Media, Beijing 102206, China
4. Key Laboratory of Spacecraft Design Optimization and Dynamic Simulation Technologies, Ministry of Education, Beijing 102206, China
Abstract
With developments in deep learning, semantic segmentation of remote sensing images has made great progress. Current mainstream methods are based on convolutional neural networks (CNNs) or vision transformers. However, these methods struggle to extract features from remote sensing images, which are usually high-resolution and rich in fine detail; operations such as downsampling discard much of this detail. To address this problem, we propose a novel module called Hierarchical Wavelet Feature Enhancement (WFE). The WFE module involves three sequential steps: (1) performing a multi-scale decomposition of the input image using the discrete wavelet transform; (2) enhancing the high-frequency sub-bands of the input image; and (3) feeding them back to the corresponding layers of the network. The module can be easily integrated into various existing CNNs and transformers and does not require additional pre-training. We conducted experiments on the ISPRS Potsdam and ISPRS Vaihingen datasets; the results show that our method improves on the CNN and transformer baselines while adding little computation.
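The first two steps of the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the Haar wavelet, the fixed scalar `gain`, and the function names `haar_dwt2` and `wfe_sketch` are all assumptions chosen for illustration, and the abstract does not say how the enhancement is actually performed. Step (3), feeding the sub-bands back into the network, is only indicated in a comment.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def haar_dwt2(img: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """One level of an orthonormal 2D Haar DWT (assumed wavelet).

    Expects even height and width. Returns (LL, LH, HL, HH), each half
    the input size. Sub-band naming conventions vary between libraries.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, h, 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, w, 2):
            a, b = img[r][c], img[r][c + 1]          # top row of the 2x2 block
            d, e = img[r + 1][c], img[r + 1][c + 1]  # bottom row of the block
            row_ll.append((a + b + d + e) / 2.0)  # low-pass approximation
            row_lh.append((a + b - d - e) / 2.0)  # top minus bottom rows
            row_hl.append((a - b + d - e) / 2.0)  # left minus right columns
            row_hh.append((a - b - d + e) / 2.0)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def wfe_sketch(img: Matrix, levels: int = 2, gain: float = 1.5) -> List[List[Matrix]]:
    """Multi-scale decomposition with boosted high-frequency sub-bands.

    Returns one [LH, HL, HH] triple per level, finest scale first.
    The scalar `gain` is a hypothetical stand-in for the enhancement step.
    """
    features = []
    current = img
    for _ in range(levels):
        ll, lh, hl, hh = haar_dwt2(current)
        boosted = [[[gain * v for v in row] for row in band] for band in (lh, hl, hh)]
        # In a network, these would be fused with the feature map of matching
        # spatial resolution (step 3); that fusion is not shown here.
        features.append(boosted)
        current = ll  # recurse on the low-frequency approximation
    return features
```

Each level halves the spatial resolution, which is why the sub-bands from level k line up naturally with a backbone stage that has been downsampled k times.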
Funder
National Natural Science Foundation of China; Beijing University Student Innovation and Entrepreneurship Training Intercollegiate Cooperation Program
Subject
General Earth and Planetary Sciences