Affiliation:
1. University of Shanghai for Science and Technology
2. Shanghai University of Traditional Chinese Medicine
3. Shanghai University of Medicine and Health Sciences
Abstract
This study integratively analyzed methylation and expression data from non-small cell lung cancer (NSCLC). From the methylation data, we obtained 19,784 differentially methylated probes (DMPs) and examined their distribution. Enrichment analysis of the DMPs was performed using Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG). Subsequently, we focused on the 6,089 DMPs located in enhancers, which accounted for a relatively large proportion. We used weighted gene co-expression network analysis (WGCNA) to identify NSCLC-related genes from the enhancer DMPs. Least absolute shrinkage and selection operator (LASSO) regression and Cox regression were then used to identify characteristic genes and construct a prognostic risk model based on the expression data. The areas under the curve (AUC) of the 3-, 5-, and 10-year time-dependent receiver operating characteristic (ROC) curves for the prognostic risk model were all higher than 0.7 in both the training set and the validation set, and the model had higher predictive capacity than other clinical variables. Finally, we plotted a nomogram for 3, 5, and 10 years. In conclusion, the prognostic risk model had high predictive capacity for the long-term overall survival (OS) of patients with NSCLC.
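As a rough illustration of how a Cox-based prognostic risk model of this kind is typically applied, the sketch below computes a risk score as a weighted sum of gene expression values and splits patients at the cohort median. The gene names, coefficients, patient values, and the median cutoff are all hypothetical assumptions for illustration; none of them are taken from the study.

```python
from statistics import median

# Hypothetical Cox regression coefficients (illustration only, not from the study).
COEFFICIENTS = {"GENE_A": 0.40, "GENE_B": -0.25, "GENE_C": 0.10}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor: sum of coefficient * expression over the model genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median score.

    The median split is an assumed convention, not a detail stated in the abstract.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Made-up expression values for three hypothetical patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.5},
    "P2": {"GENE_A": 0.5, "GENE_B": 2.0, "GENE_C": 1.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)
```

In practice the coefficients would come from the fitted Cox model on the LASSO-selected genes, and the resulting groups would be compared with survival curves and time-dependent ROC analysis.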
Publisher
Research Square Platform LLC