WaterSAM: Adapting SAM for Underwater Object Segmentation
Published: 2024-09-11
Issue: 9
Volume: 12
Page: 1616
ISSN: 2077-1312
Container title: Journal of Marine Science and Engineering
Language: en
Short container title: JMSE
Authors:
Hong Yang 1, Zhou Xiaowei 1, Hua Ruzhuang 1, Lv Qingxuan 1, Dong Junyu 1
Affiliation:
1. School of Computer Science and Technology, West Coast Campus, Ocean University of China, No. 1299 Sansha Road, Binhai Street, Huangdao District, Qingdao 266100, China
Abstract
Object segmentation, a key type of image segmentation, focuses on detecting and delineating individual objects within an image and is essential for applications such as robotic vision and augmented reality. Although deep learning has advanced object segmentation in general, underwater object segmentation remains challenging due to complexities unique to the underwater environment, such as turbulence diffusion, light absorption, noise, low contrast, uneven illumination, and intricate backgrounds. The scarcity of underwater datasets further compounds these challenges. The Segment Anything Model (SAM) has shown potential in addressing these issues, but its existing underwater adaptation, AquaSAM, requires fine-tuning all parameters, demanding more labeled data and incurring high computational costs. In this paper, we propose WaterSAM, an adapted model for underwater object segmentation. Inspired by Low-Rank Adaptation (LoRA), WaterSAM injects trainable rank decomposition matrices into the Transformer layers of the image encoder. This approach reduces the number of trainable parameters to 6.7% of SAM's total, substantially lowering computational costs. We validated WaterSAM on three underwater image datasets: COD10K, SUIM, and UIIS. Results demonstrate that WaterSAM significantly outperforms pre-trained SAM on underwater segmentation tasks, contributing to advancements in marine biology, underwater archaeology, and environmental monitoring.
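The abstract describes LoRA-style adaptation of SAM's image encoder: the pre-trained weights are frozen and small trainable rank decomposition matrices are added alongside the Transformer projections. Below is a minimal PyTorch-style sketch of that idea, assuming a ViT image encoder whose blocks expose an attn.qkv linear layer (attribute names taken from the public SAM implementation); the LoRALinear wrapper, the inject_lora helper, and the rank/scaling values are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap a frozen linear projection with a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep the pre-trained projection frozen
        # A projects down to rank r, B projects back up; only these matrices are trained
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # zero init so training starts from the original SAM behaviour
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


def inject_lora(image_encoder: nn.Module, rank: int = 4) -> nn.Module:
    """Freeze the encoder and wrap each attention qkv projection with a LoRA layer (illustrative)."""
    for p in image_encoder.parameters():
        p.requires_grad = False  # freeze all pre-trained encoder weights
    for blk in image_encoder.blocks:  # ViT blocks; attribute names assumed from the public SAM code
        blk.attn.qkv = LoRALinear(blk.attn.qkv, rank=rank)
    return image_encoder
```

After injection, only the low-rank A/B matrices (plus any decoder parameters deliberately left trainable) receive gradients, which is how the trainable-parameter count can drop to a few percent of the full model.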
Funders:
Postdoctoral Fellowship Program of CPSF; Sanya Science and Technology Special Fund
References (19 articles)
1. Jian et al. Underwater image processing and analysis: A review. Signal Process. Image Commun., 2021.
2. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2023.
3. Xu, M.; Su, J.; Liu, Y. AquaSAM: Underwater image foreground segmentation. In Proceedings of the International Forum on Digital TV and Wireless Multimedia Communications, Beijing, China, 2023.
4. Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. arXiv, 2021.
5. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015.