Affiliation:
1. Research Center for Clinical and Translational Medicine, Huazhong University of Science and Technology Union Shenzhen Hospital, and the 6th Affiliated Hospital of Shenzhen University Medical School, Shenzhen, Guangdong, China
2. Guangdong Jiyin Biotech Co. Ltd, Shenzhen, Guangdong, China
Abstract
Background: Cancer screening and early detection greatly increase the chances of successful treatment. However, most cancer types lack effective early screening biomarkers. In recent years, natural language processing (NLP)-based text-mining methods have proven effective in searching the scientific literature and identifying promising associations between potential biomarkers and disease, but unfortunately few are widely used.
Methods: In this study, we used an NLP-enabled text-mining system, MarkerGenie, to identify potential stool bacterial markers for early detection and screening of colorectal cancer. After filtering markers based on text-mining results, we validated bacterial markers using multiplex digital droplet polymerase chain reaction (ddPCR). Classifiers were built based on ddPCR results, and sensitivity, specificity, and area under the curve (AUC) were used to evaluate the performance.
Results: A total of 7 of the 14 bacterial markers showed significantly increased abundance in the stools of colorectal cancer patients. A five-bacteria classifier for colorectal cancer diagnosis was built, and achieved an AUC of 0.852, with a sensitivity of 0.692 and specificity of 0.935. When combined with the fecal immunochemical test (FIT), our classifier achieved an AUC of 0.959 and increased the sensitivity of FIT (0.929 vs. 0.872) at a specificity of 0.900.
Conclusions: Our study provides a valuable case example of the use of NLP-based marker mining for biomarker identification.
Funder
The Funds of Health Science and Technology Research Key Project of Nanshan District, Shenzhen
The Research Funds of the Science, Technology and Innovation Commission of Shenzhen
Subject
Cancer Research, Clinical Biochemistry, Oncology, Pathology and Forensic Medicine