WaterSAM: Adapting SAM for Underwater Object Segmentation
Published: 2024-09-11
Issue: 9
Volume: 12
Page: 1616
ISSN: 2077-1312
Container title: Journal of Marine Science and Engineering
Language: en
Short container title: JMSE
Authors:
Hong Yang 1, Zhou Xiaowei 1, Hua Ruzhuang 1, Lv Qingxuan 1, Dong Junyu 1
Affiliation:
1. School of Computer Science and Technology, West Coast Campus, Ocean University of China, No. 1299 Sansha Road, Binhai Street, Huangdao District, Qingdao 266100, China
Abstract
Object segmentation, a key type of image segmentation, focuses on detecting and delineating individual objects within an image and is essential for applications such as robotic vision and augmented reality. Although deep learning has advanced object segmentation in general, underwater object segmentation remains challenging due to complexities unique to the underwater environment, such as turbulence diffusion, light absorption, noise, low contrast, uneven illumination, and intricate backgrounds. The scarcity of underwater datasets further compounds these challenges. The Segment Anything Model (SAM) has shown potential in addressing these issues, but its existing underwater adaptation, AquaSAM, requires fine-tuning all parameters, demanding more labeled data and incurring high computational costs. In this paper, we propose WaterSAM, an adapted model for underwater object segmentation. Inspired by Low-Rank Adaptation (LoRA), WaterSAM injects trainable rank decomposition matrices into the Transformer layers of the image encoder. This approach reduces the number of trainable parameters to 6.7% of SAM's total, substantially lowering computational costs. We validated WaterSAM on three underwater image datasets: COD10K, SUIM, and UIIS. Results demonstrate that WaterSAM significantly outperforms pre-trained SAM on underwater segmentation tasks, contributing to advancements in marine biology, underwater archaeology, and environmental monitoring.
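The abstract describes LoRA-style adaptation of SAM's image encoder: the pre-trained weights are frozen and small trainable rank decomposition matrices are added alongside the Transformer projections. Below is a minimal PyTorch-style sketch of that idea, assuming a ViT image encoder whose blocks expose an attn.qkv linear layer (attribute names taken from the public SAM implementation); the LoRALinear wrapper, the inject_lora helper, and the rank/scaling values are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap a frozen linear projection with a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 4.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep the pre-trained projection frozen
        # A projects down to rank r, B projects back up; only these matrices are trained
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # zero init so training starts from the original SAM behaviour
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


def inject_lora(image_encoder: nn.Module, rank: int = 4) -> nn.Module:
    """Freeze the encoder and wrap each attention qkv projection with a LoRA layer (illustrative)."""
    for p in image_encoder.parameters():
        p.requires_grad = False  # freeze all pre-trained encoder weights
    for blk in image_encoder.blocks:  # ViT blocks; attribute names assumed from the public SAM code
        blk.attn.qkv = LoRALinear(blk.attn.qkv, rank=rank)
    return image_encoder
```

After injection, only the low-rank A/B matrices (plus any decoder parameters deliberately left trainable) receive gradients, which is how the trainable-parameter count can drop to a few percent of the full model.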
Funders:
Postdoctoral Fellowship Program of CPSF; Sanya Science and Technology Special Fund
References (19 articles)
1. Jian et al. Underwater image processing and analysis: A review. Signal Process. Image Commun., 2021.
2. Kirillov, A.; Mintun, E.; Ravi, N.; Mao, H.; Rolland, C.; Gustafson, L.; Xiao, T.; Whitehead, S.; Berg, A.C.; Lo, W.Y. Segment Anything. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2023.
3. Xu, M.; Su, J.; Liu, Y. AquaSAM: Underwater image foreground segmentation. In Proceedings of the International Forum on Digital TV and Wireless Multimedia Communications, Beijing, China, 2023.
4. Hu, E.J.; Shen, Y.; Wallis, P.; Allen-Zhu, Z.; Li, Y.; Wang, S.; Wang, L.; Chen, W. LoRA: Low-Rank Adaptation of Large Language Models. arXiv, 2021.
5. Long, J.; Shelhamer, E.; Darrell, T. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 2015.