Optimised Pre-Processing of Raman Spectra for Colorectal Cancer Detection Using High-Performance Computing

Author:

Woods Freya E. R. (1), Jenkins Cerys A. (1), Jenkins Rhys A. (2), Chandler Susan (3), Harris Dean A. (3, 4), Dunstan Peter R. (1)

Affiliation:

1. Department of Physics, Swansea University, Swansea, UK

2. Blackett Laboratory, Imperial College London, London, UK

3. Medical School, Swansea University, Swansea, UK

4. Department of Colorectal Surgery, Morriston Hospital, Swansea, Wales, UK

Abstract

Spectral pre-processing is an essential step in data analysis for biomedical diagnostic applications of Raman spectroscopy, allowing the removal of undesirable spectral contributions that could mask the biological information used for diagnosis. However, because pre-processing is specific to a given sample type and the number of potential pre-processing combinations is vast, manual “trial and error” optimisation is often time-intensive, with no guarantee that the chosen method is optimal for the sample type. Here we present the use of high-performance computing (HPC) to trial over 2.4 million pre-processing permutations, demonstrating optimisation of the pre-processing of human serum Raman spectra for colorectal cancer detection. The effect of varying pre-processing order, using extended multiplicative scatter correction, spectral smoothing, baseline correction, binning and normalisation, was considered. Permutations were assessed on their ability to detect patients with disease using a random forest (RF) algorithm trained with 102 patients (510 spectra) and independently tested with a set of 439 patients (1317 spectra) in a primary care patient cohort. Optimising via HPC improves diagnostic performance, with sensitivity increasing by 14.6%, specificity by 6.9%, positive predictive value by 3.4%, and negative predictive value by 2.4% compared with a standard pre-processing optimisation. The final values of these metrics matter greatly for diagnostic adoption: once a diagnostic demonstrates good accuracy, optimisations of this kind can make a significant difference to the roll-out of a test and to demonstrating its advantages over existing tests. From the HPC permutations, we also recommend appropriate parameter constraints for conducting a more basic pre-processing optimisation, helping model development for researchers without access to HPC.
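The search described in the abstract combines two things: enumerating pre-processing pipelines (step order plus parameter choices) and scoring each pipeline by sensitivity, specificity, positive predictive value and negative predictive value. The following is a minimal sketch of both pieces, not the authors' implementation. The step names, parameter values and the helper names `enumerate_pipelines` and `diagnostic_metrics` are illustrative assumptions, and the actual pre-processing and random forest training are omitted.

```python
from itertools import permutations, product

# Hypothetical parameter grid: step names and values are illustrative
# assumptions, not the settings used in the paper.
PARAM_GRID = {
    "smoothing": [{"window": 5}, {"window": 9}],
    "baseline": [{"poly_order": 3}, {"poly_order": 5}],
    "normalisation": [{"kind": "vector"}, {"kind": "area"}],
}


def enumerate_pipelines(param_grid):
    """Yield every ordering of the steps with every parameter choice."""
    names = list(param_grid)
    for order in permutations(names):
        # For a fixed order, take the Cartesian product of each step's options.
        for params in product(*(param_grid[name] for name in order)):
            yield list(zip(order, params))


def diagnostic_metrics(y_true, y_pred):
    """Return sensitivity, specificity, PPV and NPV (label 1 = disease)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Guard against empty classes rather than raising ZeroDivisionError.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


pipelines = list(enumerate_pipelines(PARAM_GRID))
# 3! = 6 orderings times 2 * 2 * 2 = 8 parameter choices gives 48 pipelines.
print(len(pipelines))

# Toy labels standing in for an independent test set.
example = diagnostic_metrics([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
print(example)
```

Even this toy grid shows how quickly the search space grows: each added step multiplies the number of orderings, and each added parameter value multiplies the number of combinations per ordering, which is why a search over millions of permutations is a natural fit for HPC.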

Funder

Cancer Research Wales

Publisher

SAGE Publications

Subject

Spectroscopy, Instrumentation

Cited by 5 articles.