Abstract
Background
Catheter-associated urinary tract infections (CA-UTIs) substantially increase the clinical burden. Identifying patients at high risk of CA-UTIs is crucial in clinical practice. In this study, we developed and externally validated an explainable prognostic prediction model for CA-UTIs among hospitalized individuals receiving urinary catheterization.
Methods
We applied a retrospective cohort design to select data from a clinical research database covering three hospitals in Taiwan. We developed the prediction model using data from two hospitals and used the third hospital's data for external validation. We selected predictors by multivariable regression analysis using a Cox proportional-hazards model. Both statistical and computational machine learning algorithms were applied for predictive modeling: (1) ridge regression; (2) decision tree; (3) random forest (RF); (4) extreme gradient boosting; and (5) deep-insight visible neural network. We evaluated calibration, clinical utility, and discrimination ability on the validation set to choose the best model. Shapley additive explanations (SHAP) were used to assess the explainability of the best model.
Results
We included 122,417 instances from subjects aged 20 to 75 years with multiple visits (n = 26,401) and multiple orders of urinary catheterization per visit (n = 35,230). Fourteen predictors were selected from 20 candidate variables. The best prediction model was the RF for predicting CA-UTIs within 6 days. It detected 97.63% (95% confidence interval [CI]: 97.57%, 97.69%) of CA-UTI-positive cases (sensitivity), and 97.36% (95% CI: 97.29%, 97.42%) of individuals predicted to be CA-UTI negative were true negatives (negative predictive value). Among those predicted to be CA-UTI positive, 22.85% (95% CI: 22.79%, 22.92%) were expected to be truly high-risk individuals (positive predictive value).
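The three reported figures correspond to the standard confusion-matrix metrics. A minimal sketch of how they are computed, using illustrative counts (these are not the study's data; the counts below are hypothetical and chosen only for demonstration):

```python
# Hypothetical confusion-matrix counts for illustration only; the study
# reports sensitivity, NPV, and PPV but not the raw counts behind them.
tp, fn = 950, 23        # CA-UTI-positive cases: detected vs. missed
tn, fp = 11800, 3207    # CA-UTI-negative cases: ruled out vs. falsely flagged

sensitivity = tp / (tp + fn)  # share of true positives that are detected
npv = tn / (tn + fn)          # share of predicted negatives that are truly negative
ppv = tp / (tp + fp)          # share of predicted positives that are truly positive

print(f"sensitivity={sensitivity:.4f} NPV={npv:.4f} PPV={ppv:.4f}")
```

A high sensitivity with a modest PPV, as reported here, is typical when the outcome is rare: most positives are caught, but many flagged individuals are false alarms.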
We also provide a web-based application and a paper-based nomogram for applying the best model.
Conclusions
Our prediction model was clinically accurate: it detected most CA-UTI-positive cases, while most individuals predicted to be negative were correctly ruled out. However, future studies are needed to prospectively evaluate the implementation, validity, and reliability of this prediction model among users of the web application and nomogram, and the model's impact on patient outcomes.
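The modeling workflow the abstract describes (train a random forest, evaluate on held-out data, then inspect which predictors drive the model) can be sketched as follows. The data, feature count, and importance measure here are stand-ins: the study uses 14 clinical predictors and SHAP values, whereas this sketch uses synthetic data and scikit-learn's impurity-based importances.

```python
# Minimal sketch, assuming a scikit-learn-style workflow; not the study's code.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import recall_score

rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))  # 5 synthetic predictors (the study uses 14)
# Synthetic binary outcome driven mainly by the first two predictors.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 1).astype(int)

# Hold out a split for evaluation, analogous to the study's validation step.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Sensitivity (recall) on the held-out split.
sens = recall_score(y_te, model.predict(X_te))
print(f"held-out sensitivity: {sens:.3f}")

# Impurity-based importances as a quick explanation proxy (the study uses SHAP).
print("feature importances:", np.round(model.feature_importances_, 3))
```

In the study's setting, the same fitted model would instead be passed to a SHAP tree explainer to attribute each prediction to the 14 clinical predictors.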
Publisher
Cold Spring Harbor Laboratory