Hierarchical learning of gastric cancer molecular subtypes by integrating multi‐modal DNA‐level omics data and clinical stratification

Author:

Yang Binyu1,Liu Siying1,Xie Jiemin1,Tang Xi1,Guan Pan1,Zhu Yifan1,Liu Xuemei2,Xiong Yunhui1,Yang Zuli3,Li Weiyao3,Wang Yonghua4,Chen Wen4,Li Qingjiao5,Xia Li C.1

Affiliation:

1. Department of Statistics and Financial Mathematics School of Mathematics South China University of Technology Guangzhou China

2. School of Physics and Optoelectronics South China University of Technology Guangzhou China

3. Department of Pathology The Sixth Affiliated Hospital Sun Yat‐sen University Guangzhou China

4. School of Food Science and Engineering South China University of Technology Guangzhou China

5. Department of Laboratory Medicine the Eighth Affiliated Hospital Sun Yat‐sen University Shenzhen China

Abstract

AbstractMolecular subtyping of gastric cancer (GC) aims to comprehend its genetic landscape. However, the efficacy of current subtyping methods is hampered by their mixed use of molecular features, a lack of strategy optimization, and the limited availability of public GC datasets. There is a pressing need for a precise and easily adoptable subtyping approach for early DNA‐based screening and treatment. Based on TCGA subtypes, we developed a novel DNA‐based hierarchical classifier for gastric cancer molecular subtyping (HCG), which employs gene mutations, copy number aberrations, and methylation patterns as predictors. By incorporating the closely related esophageal adenocarcinomas dataset, we expanded the TCGA GC dataset for the training and testing of HCG (n = 453). The optimization of HCG was achieved through three hierarchical strategies using Lasso‐Logistic regression, evaluated by their overall the area under receiver operating characteristic curve (auROC), accuracy, F1 score, the area under precisionrecall curve (auPRC) and their capability for clinical stratification using multivariate survival analysis. Subtype‐specific DNA alteration biomarkers were discerned through difference tests based on HCG defined subtypes. Our HCG classifier demonstrated superior performance in terms of overall auROC (0.95), accuracy (0.88), F1 score (0.87) and auPRC (0.86), significantly improving the clinical stratification of patients (overall p‐value = 0.032). Difference tests identified 25 subtype‐specific DNA alterations, including a high mutation rate in the SYNE1, ITGB4, and COL22A1 genes for the MSI subtype, and hypermethylation of ALS2CL, KIAA0406, and RPRD1B genes for the EBV subtype. HCG is an accurate and robust classifier for DNA‐based GC molecular subtyping with highly predictive clinical stratification performance. The training and test datasets, along with the analysis programs of HCG, are accessible on the GitHub website (github.com/LabxSCUT).

Publisher

Wiley

Reference77 articles.

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3