A 3D hierarchical cross‐modality interaction network using transformers and convolutions for brain glioma segmentation in MR images

Authors:

Zhuang Yuzhou1, Liu Hong1, Fang Wei2, Ma Guangzhi1, Sun Sisi1, Zhu Yunfeng1, Zhang Xu3, Ge Chuanbin3, Chen Wenyang1, Long Jiaosong4, Song Enmin1

Affiliation:

1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China

2. Wuhan Zhongke Industrial Research Institute of Medical Science Co., Ltd, Wuhan, China

3. Wuhan United Imaging Healthcare Surgical Technology Co., Ltd, Wuhan, China

4. School of Art and Design, Hubei University of Technology, Wuhan, China

Abstract

Background

Precise glioma segmentation from multi-parametric magnetic resonance (MR) images is essential for brain glioma diagnosis. However, because of the indistinct boundaries between tumor sub-regions and the heterogeneous appearance of gliomas in volumetric MR scans, designing a reliable, automated glioma segmentation method remains challenging. Although existing 3D Transformer-based or convolution-based segmentation networks have obtained promising results via multi-modal feature fusion strategies or contextual learning methods, they generally lack hierarchical interactions between different modalities and cannot effectively learn comprehensive feature representations for all glioma sub-regions.

Purpose

To overcome these problems, we propose a 3D hierarchical cross-modality interaction network (HCMINet) using Transformers and convolutions for accurate multi-modal glioma segmentation. HCMINet leverages an effective hierarchical cross-modality interaction strategy to learn both modality-specific and modality-shared knowledge relevant to glioma sub-region segmentation from multi-parametric MR images.

Methods

In the HCMINet, we first design a hierarchical cross-modality interaction Transformer (HCMITrans) encoder that hierarchically encodes and fuses heterogeneous multi-modal features through Transformer-based intra-modal embeddings and inter-modal interactions across multiple encoding stages, capturing complex cross-modality correlations while modeling global context. We then pair the HCMITrans encoder with a modality-shared convolutional encoder to form a dual-encoder architecture that learns rich contextual information from both global and local perspectives.
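The abstract does not spell out the internals of the inter-modal interactions, but such Transformer-based fusion is generically built on cross-attention, where tokens from one MR modality attend to tokens from another. Below is a minimal single-head numpy sketch of that generic mechanism; all shapes, variable names, and the choice of modalities (T1ce, FLAIR) are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_tokens, kv_tokens, Wq, Wk, Wv):
    """Single-head cross-attention: tokens of one modality (queries)
    attend to tokens of another modality (keys/values)."""
    Q = q_tokens @ Wq                           # (Nq, d)
    K = kv_tokens @ Wk                          # (Nk, d)
    V = kv_tokens @ Wv                          # (Nk, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])     # (Nq, Nk) scaled dot products
    return softmax(scores, axis=-1) @ V         # (Nq, d) fused features

rng = np.random.default_rng(0)
d = 8
t1ce = rng.normal(size=(16, d))    # hypothetical T1ce-modality token embeddings
flair = rng.normal(size=(16, d))   # hypothetical FLAIR-modality token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
fused = cross_attention(t1ce, flair, Wq, Wk, Wv)
print(fused.shape)  # (16, 8)
```

In a hierarchical design like the one described, such interactions would be repeated at multiple encoding stages so that fused features at coarser resolutions build on those from finer ones.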
Finally, in the decoding stage, we present a progressive hybrid context fusion (PHCF) decoder that progressively fuses the local and global features extracted by the dual-encoder architecture, using a local-global context fusion (LGCF) module to efficiently alleviate the contextual discrepancy among decoding features.

Results

Extensive experiments were conducted on two public, competitive glioma benchmark datasets: the BraTS2020 dataset with 494 patients and the BraTS2021 dataset with 1251 patients. Results show that our proposed method outperforms existing Transformer-based and CNN-based methods that use other multi-modal fusion strategies. Specifically, the proposed HCMINet achieves state-of-the-art mean DSC values of 85.33% on the BraTS2020 online validation dataset and 91.09% on the BraTS2021 local testing dataset.

Conclusions

Our proposed method can accurately and automatically segment glioma regions from multi-parametric MR images, which benefits the quantitative analysis of brain gliomas and helps reduce the annotation burden on neuroradiologists.
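The DSC reported above is the Dice similarity coefficient, the standard overlap metric in the BraTS benchmarks. A minimal sketch of how it is typically computed for a binary segmentation mask follows; the toy volumes and variable names are illustrative, not data from the paper.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    DSC = 2 * |P intersect T| / (|P| + |T|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy 3D volumes standing in for predicted and ground-truth tumor masks.
pred = np.zeros((4, 4, 4), dtype=np.uint8)
target = np.zeros((4, 4, 4), dtype=np.uint8)
pred[1:3, 1:3, 1:3] = 1     # 8 voxels predicted
target[1:4, 1:3, 1:3] = 1   # 12 voxels in ground truth; 8 overlap
print(round(dice_coefficient(pred, target), 4))  # 2*8 / (8+12) = 0.8
```

In BraTS evaluation, this score is computed separately for each tumor sub-region (whole tumor, tumor core, enhancing tumor) and then averaged, which is what a "mean DSC" refers to.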

Publisher

Wiley
