CPROS: A Multimodal Decision-Level Fusion Detection Method Based on Category Probability Sets
Published: 2024-07-27
Issue: 15
Volume: 16
Page: 2745
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing
Author:
Li Can 1, Zuo Zhen 1, Tong Xiaozhong 1, Huang Honghe 1, Yuan Shudong 1, Dang Zhaoyang 1
Affiliation:
1. College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China
Abstract
Images acquired by different sensors exhibit different characteristics because of their varied imaging mechanisms. The fusion of visible and infrared images is valuable for many imaging applications: infrared images provide stronger object features under poor illumination and smoke interference, whereas visible images offer rich texture and color information about the target. Using dual-optical fusion as an example, this study explores fusion detection methods at different levels and proposes a multimodal decision-level fusion detection method based on category probability sets (CPROS). YOLOv8, a single-modality detector with good detection performance, was chosen as the benchmark. We then introduced an improved Yager formula and proposed a simple, non-learning fusion strategy based on CPROS, which combines the detection results of multiple modalities and effectively improves target confidence. We validated the proposed algorithm on the public VEDAI dataset, which was captured from a drone perspective. The results showed that the mean average precision (mAP) of YOLOv8 with the CPROS method was 8.6% and 16.4% higher than that of YOLOv8 detecting the single-modality visible and infrared datasets, respectively. The proposed method significantly reduces the miss rate (MR) and the number of false positives per image (FPPI), and it generalizes well to other detectors.
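The abstract's core idea is combining the per-category confidence vectors of two single-modality detectors at the decision level via a Yager-style evidence combination. The paper's improved Yager formula and the exact CPROS construction are not given here, so the following is only a minimal sketch of the classical Yager rule for two singleton category probability sets: agreeing masses multiply, and the conflicting mass is assigned to the frame of discernment (uncertainty) rather than renormalized away as in Dempster's rule. The function name `yager_fuse` and the toy probability vectors are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def yager_fuse(p_vis, p_ir):
    """Yager-style combination of two category probability sets.

    p_vis, p_ir : per-category confidence vectors from the visible
    and infrared detectors (treated as singleton basic belief masses).
    Returns (fused, conflict): the element-wise product on matching
    categories, and the total conflicting mass K, which Yager's rule
    assigns to the whole frame instead of renormalizing.
    """
    p_vis = np.asarray(p_vis, dtype=float)
    p_ir = np.asarray(p_ir, dtype=float)
    fused = p_vis * p_ir              # evidence agreeing on each category
    conflict = 1.0 - fused.sum()      # mass from cross-category conflicts
    return fused, conflict

# Toy example: two detectors that mostly agree on category 0.
fused, k = yager_fuse([0.7, 0.2, 0.1], [0.6, 0.3, 0.1])
# fused = [0.42, 0.06, 0.01], conflict k = 0.51; the final decision
# would take argmax(fused), with k quantifying inter-modality conflict.
```

Because the conflict mass stays on the frame rather than inflating the winning category, a high `k` flags cases where the two modalities disagree and the fused confidence should be treated cautiously.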
Funder
National Natural Science Foundation of China; National Natural Youth Science Foundation of China