CODA: an open-source platform for federated analysis and machine learning on distributed healthcare data

Authors:

Mullie Louis123, Afilalo Jonathan4, Archambault Patrick567, Bouchakri Rima8, Brown Kip8, Buckeridge David L39, Cavayas Yiorgos Alexandros10, Turgeon Alexis F611, Martineau Denis11, Lamontagne François12, Lebrasseur Martine8, Lemieux Renald12, Li Jeffrey8, Sauthier Michaël213, St-Onge Pascal8, Tang An214, Witteman William7, Chassé Michaël12

Affiliations:

1. Department of Medicine, Centre Hospitalier de l'Université de Montréal, Montréal, H2X 3E4, Canada

2. Faculty of Medicine, Université de Montréal, Montréal, H3C 3J7, Canada

3. Mila Quebec Artificial Intelligence Institute, Montréal, H2S 3H1, Canada

4. Department of Medicine, Jewish General Hospital, Montréal, H3T 1E4, Canada

5. Department of Emergency Medicine and Family Medicine, Université Laval, Québec, G1V 0A6, Canada

6. Department of Anesthesiology and Critical Care Medicine, Université Laval, Québec, G1V 0A6, Canada

7. Centre de Recherche Intégré pour un Système Apprenant en santé et Services Sociaux, Centre intégré de santé et de Services Sociaux de Chaudière-Appalaches, Lévis, G6V 3Z1, Canada

8. Centre de Recherche du Centre Hospitalier de l'Université de Montréal, Université de Montréal, Montréal, H2X 0A9, Canada

9. Department of Epidemiology and Biostatistics, School of Population and Global Health, McGill University Health Centre, Montréal, H3A 1G1, Canada

10. Department of Medicine, Hôpital du Sacré-Coeur de Montréal, Montréal, H4J 1C5, Canada

11. Centre de recherche du CHU de Québec-Université Laval, Université Laval, Québec, G1V 4G2, Canada

12. Centre de recherche du CHUS, Centre Hospitalier Universitaire de Sherbrooke, Sherbrooke, J1G 2E8, Canada

13. Department of Pediatrics, Université de Montréal and CHU Sainte-Justine Research Centre, Montréal, H3C 3J7, Canada

14. Department of Radiology, Centre Hospitalier de l'Université de Montréal, Montréal, H2X 3E4, Canada

Abstract

Objectives: Distributed computations facilitate multi-institutional data analysis while avoiding the costs and complexity of data pooling. Existing approaches lack crucial features, such as built-in medical standards and terminologies, no-code data visualizations, explicit disclosure control mechanisms, and support for basic statistical computations in addition to gradient-based optimization capabilities.

Materials and methods: We describe the development of the Collaborative Data Analysis (CODA) platform and the design choices made to address the key needs identified during our survey of stakeholders. We use a public dataset (MIMIC-IV) to demonstrate end-to-end multi-modal federated learning (FL) using CODA. We assess the technical feasibility of deploying the CODA platform at 9 hospitals in Canada, describe implementation challenges, and evaluate its scalability on large patient populations.

Results: The CODA platform was designed, developed, and deployed between January 2020 and January 2023. Software code, documentation, and technical documents were released under an open-source license. Multi-modal federated averaging is illustrated using the MIMIC-IV and MIMIC-CXR datasets. To date, 8 of the 9 participating sites have successfully deployed the platform, with a total enrolment of more than 1 million patients. Mapping data from legacy systems to FHIR was the biggest barrier to implementation.

Discussion and conclusion: The CODA platform was developed and successfully deployed in a public healthcare setting in Canada, with heterogeneous information technology systems and capabilities. Ongoing efforts will use the platform to develop and prospectively validate models for risk assessment, proactive monitoring, and resource usage. Further work will also make tools available to facilitate migration from legacy formats to FHIR and DICOM.
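The federated averaging mentioned in the abstract can be illustrated with a minimal sketch. This is not CODA's implementation: the function name `federated_average` and the input layout (one parameter vector plus a local sample count per site) are assumptions made for illustration only. The idea is that each site trains locally and shares only its model parameters, which a coordinator combines into a global model weighted by each site's sample count.

```python
from typing import List, Sequence, Tuple


def federated_average(
    site_updates: Sequence[Tuple[List[float], int]],
) -> List[float]:
    """Combine per-site parameter vectors into one global vector.

    Each entry is (parameters, n_local_samples). Sites with more
    samples contribute proportionally more to the average.
    Hypothetical helper for illustration; not part of CODA.
    """
    if not site_updates:
        raise ValueError("need at least one site update")

    total_samples = sum(n for _, n in site_updates)
    if total_samples <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(site_updates[0][0])
    averaged = [0.0] * dim
    for params, n in site_updates:
        if len(params) != dim:
            raise ValueError("all sites must share the same parameter shape")
        # Weight this site's parameters by its share of all samples.
        for i, value in enumerate(params):
            averaged[i] += value * n / total_samples
    return averaged


# Two hypothetical hospitals: only parameters and counts leave each site.
global_params = federated_average([
    ([1.0, 2.0], 100),
    ([3.0, 4.0], 300),
])
print(global_params)
```

In this sketch no patient-level records are exchanged; the coordinator sees only parameter vectors and sample counts, which is the property that lets analysis proceed without pooling data.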

Funder

Canadian Institutes of Health Research

Québec Table Nationale des Directeurs de Recherche

Réseau de bio-imagerie du Québec

Publisher

Oxford University Press (OUP)

Subject

Health Informatics

