A vision transformer for emphysema classification using CT images

Author:

Wu Yanan, Qi Shouliang, Sun Yu, Xia Shuyue, Yao Yudong, Qian Wei

Abstract

Objective. Emphysema is characterized by the destruction and permanent enlargement of the alveoli in the lung. According to its visual appearance on CT, emphysema can be divided into three subtypes: centrilobular emphysema (CLE), panlobular emphysema (PLE), and paraseptal emphysema (PSE). Automating emphysema classification can help precisely determine the patterns of lung destruction and provide a quantitative evaluation. Approach. We propose a vision transformer (ViT) model to classify the emphysema subtypes from CT images. First, large patches (61 × 61) containing areas of normal lung parenchyma, CLE, PLE, or PSE are cropped from the CT images. After resizing, each large patch is divided into small patches, which are converted to a sequence of patch embeddings by flattening and linear embedding. A class embedding is concatenated to the sequence of patch embeddings, and a positional embedding is added to the result. The resulting embedding is then fed into the transformer encoder blocks to generate the final representation. Finally, the learnable class embedding is fed to a softmax layer to classify the emphysema. Main results. To overcome the lack of massive data, transformer encoder blocks pre-trained on ImageNet are transferred and fine-tuned in our ViT model. The pre-trained ViT model achieves an average accuracy of 95.95% on our lab's own dataset, which is higher than that of AlexNet, Inception-V3, MobileNet-V2, ResNet34, and ResNet50. The pre-trained ViT model also outperforms the ViT model without pre-training. On the public dataset, the accuracy of our pre-trained ViT model is higher than or comparable to that of available methods. Significance. The results demonstrate that the proposed ViT model can accurately classify the subtypes of emphysema using CT images. The ViT model can help make an effective computer-aided diagnosis of emphysema, and the ViT method can be extended to other medical applications.
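The embedding steps described in the Approach can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the resized image size (224), small-patch size (16), embedding width (64), random weights, and the helper names `patchify`, `embed`, and `softmax` are all assumptions made for illustration, and the transformer encoder blocks are omitted entirely.

```python
import numpy as np

def patchify(image, patch_size):
    """Split a 2-D image (H, W) into flattened, non-overlapping square patches."""
    h, w = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p) -> (gh, gw, p, p) -> (gh * gw, p * p)
    patches = image.reshape(gh, patch_size, gw, patch_size).transpose(0, 2, 1, 3)
    return patches.reshape(gh * gw, patch_size * patch_size)

def embed(image, patch_size, w_embed, cls_token, pos_embed):
    """Flatten patches, project them linearly, prepend the class embedding,
    and add the positional embedding."""
    patches = patchify(image, patch_size)                  # (N, p * p)
    tokens = patches @ w_embed                             # (N, D)
    tokens = np.concatenate([cls_token, tokens], axis=0)   # (N + 1, D)
    return tokens + pos_embed

def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - x.max())
    return e / e.sum()

# All sizes below are assumed values, not taken from the paper.
rng = np.random.default_rng(0)
image_size, patch_size, dim, n_classes = 224, 16, 64, 4   # 4: normal, CLE, PLE, PSE
n_patches = (image_size // patch_size) ** 2

image = rng.standard_normal((image_size, image_size))     # stand-in for a resized CT patch
w_embed = rng.standard_normal((patch_size * patch_size, dim)) * 0.02
cls_token = np.zeros((1, dim))                             # learnable in a real model
pos_embed = rng.standard_normal((n_patches + 1, dim)) * 0.02
w_head = rng.standard_normal((dim, n_classes)) * 0.02

tokens = embed(image, patch_size, w_embed, cls_token, pos_embed)
# The transformer encoder blocks would transform `tokens` here (omitted).
probs = softmax(tokens[0] @ w_head)                        # classify from the class position
print(tokens.shape, probs.shape)
```

With these assumed sizes, the sequence holds 196 patch embeddings plus one class embedding. In a trained model the projection, class embedding, positional embedding, and classification head would be learned rather than randomly initialized.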

Funder

National Natural Science Foundation of China

Key R&D Program Guidance Projects in Liaoning Province

Fundamental Research Funds for the Central Universities

Publisher

IOP Publishing

Subject

Radiology, Nuclear Medicine and Imaging; Radiological and Ultrasound Technology

References: 52 articles.

Cited by 37 articles.
