Robust Evaluation of Deep Learning-based Representation Methods for Survival and Gene Essentiality Prediction on Bulk RNA-seq Data

Author:

Gross Baptiste, Dauvin Antonin, Cabeli Vincent, Kmetzsch Virgilio, Khoury Jean El, Dissez Gaëtan, Ouardini Khalil, Grouard Simon, Davi Alec, Loeb Regis, Esposito Christian, Hulot Louis, Ghermi Ridouane, Blum Michael, Darhi Yannis, Durand Eric Y., Romagnoni Alberto

Abstract

Deep learning (DL) has shown potential to provide powerful representations of bulk RNA-seq data in cancer research. However, there is no consensus regarding the impact of design choices of DL approaches on the performance of the learned representation, including the model architecture, the training methodology and the various hyperparameters. To address this problem, we evaluate the performance of various design choices of DL representation learning methods using the TCGA and DepMap pan-cancer datasets, and assess their predictive power for survival and gene essentiality prediction. We demonstrate that baseline methods not based on DL achieve performance comparable or superior to that of more complex models on survival prediction tasks. DL representation methods, however, are the most efficient at predicting the gene essentiality of cell lines. We show that auto-encoders (AE) are consistently improved by techniques such as masking and multi-head training. Our results suggest that the impact of DL representations and of pre-training is highly task- and architecture-dependent, highlighting the need for rigorous evaluation guidelines. These guidelines for robust evaluation are implemented in a pipeline made available to the research community.

Publisher

Cold Spring Harbor Laboratory

