Abstract
Introduction: Randomized clinical trials (RCTs) are limited in reflecting outcomes observed outside controlled settings, so lengthy follow-up observational studies are often required. Real-world data (RWD) have recently been considered a viable alternative to overcome these limitations and complement certain clinical conclusions. Transcriptomics and other high-throughput data contain a molecular description of medical conditions and disease states. When linked to RWD, including demographic information, transcriptomics data can elucidate nuances in disease pathways in specific patient populations. This work focuses on the construction of a patient repository database with clinical information resulting from the integration of publicly available transcriptomics datasets.

Results: Patient samples were integrated into the repository using a new post-processing technique that allows samples originating from different Gene Expression Omnibus (GEO) datasets to be used together. RWD were mined from the metadata of GEO samples, and a clinical and demographic characterization of the database was obtained. Our post-processing technique, which we call MACAROON, aims to standardize and integrate transcriptomics data while accounting for batch effects and possible processing-related artefacts. It reproduced downstream biological conclusions better than other available methods, with a 10% improvement. RWD mining relied on a manually curated synonym dictionary, which allowed the correct assignment of medical conditions (95.33% median accuracy).

Conclusion: Our strategy produced an RWD repository that includes molecular information together with clinical and demographic RWD. Exploring these data helps shed light on clinical outcomes and pathways specific to predetermined patient populations by integrating multiple public datasets.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.