Affiliation:
1. Department of Computation and Artificial Intelligence, Euskal Herriko Unibertsitatea UPV-EHU, Donostia, Spain
2. Departament d’Estadística, Universitat de Barcelona, Barcelona, Spain
Abstract
In diagnosis and classification diseases multiple outcomes, both molecular and clinical/pathological are routinely gathered on patients. In recent years, many approaches have been suggested for integrating gene expression (continuous data) with clinical/pathological data (usually categorical and ordinal data). This new area of research integrates both clinical and genomic data in order to improve our knowledge about diseases, and to capture the information which is lost in independent clinical or genomic studies. The related metric scaling distance is a not well-known, but very valuable distance to integrate clinical/pathological and molecular information. In this article, we present the use of the related metric scaling distance in biomedical research. We describe how this distance works, and we also explain why it may sometimes be preferred. We discuss the choice of the related metric scaling distance and compare it with other proximity measures to include both clinical and genetic information. Furthermore, we comment the choice of the related metric scaling distance when classical clustering or discriminant analysis based on distances are performed and compare the results with more complex cluster or discriminant procedures specially constructed for integrating clinical and molecular information. The use of the related metric scaling distance is illustrated on simulated experimental and four real data sets, a heart disease, and three cancer studies. The results present the flexibility and availability of this distance which gives competitive results.
Subject
Health Information Management,Statistics and Probability,Epidemiology
Cited by
5 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献