Affiliation:
1. Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, New York
2. Medical Scientist Training Program, University of Rochester School of Medicine and Dentistry, Rochester, New York
Abstract
Existing statistical methods can estimate a policy, or a mapping from covariates to decisions, which can then instruct decision makers (eg, whether to administer hypotension treatment based on the covariates blood pressure and heart rate). There is great interest in using such data‐driven policies in healthcare. However, it is often important to explain to the healthcare provider, and to the patient, how a new policy differs from the current standard of care. This end is facilitated if one can pinpoint the aspects of the policy (ie, the parameters for blood pressure and heart rate) that change when moving from the standard of care to the new, suggested policy. To this end, we adapt ideas from Trust Region Policy Optimization (TRPO). In our work, however, unlike in TRPO, the difference between the suggested policy and the standard of care is required to be sparse, aiding interpretability. This yields “relative sparsity,” where, as a function of a tuning parameter, we can approximately control the number of parameters in our suggested policy that differ from their counterparts in the standard of care (eg, heart rate only). We propose a criterion for selecting the tuning parameter, perform simulations, and illustrate our method with a real, observational healthcare dataset, deriving a policy that is easy to explain in the context of the current standard of care. Our work promotes the adoption of data‐driven decision aids, which have great potential to improve health outcomes.
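The core idea of relative sparsity, that a suggested policy's coefficients are penalized toward the standard-of-care coefficients so that only a few of them differ, can be sketched in Python. This is a minimal toy surrogate, assuming a logistic policy, a plain logistic loss in place of the paper's off-policy value estimate, and a proximal-gradient solver; the names `beta_std` and `lam` and the simulated data are illustrative, not the authors' implementation.

```python
import numpy as np

def soft_threshold(z, t):
    # Proximal operator of t * ||.||_1 (shrinks each coordinate toward 0).
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fit_relative_sparse_policy(X, y, beta_std, lam, lr=0.1, iters=2000):
    """Toy sketch: minimize logistic loss + lam * ||beta - beta_std||_1
    by proximal gradient descent. Larger lam forces more coefficients of
    the suggested policy to equal their standard-of-care counterparts."""
    beta = beta_std.copy()
    n = X.shape[0]
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))      # policy's action probabilities
        grad = X.T @ (p - y) / n                 # gradient of the smooth loss
        # Gradient step, then shrink the *difference* from the standard of care.
        beta = beta_std + soft_threshold(beta - lr * grad - beta_std, lr * lam)
    return beta
```

For a very large `lam`, the fitted policy is exactly the standard of care; as `lam` decreases, coefficients are allowed to deviate one at a time (eg, heart rate only), which is what makes the suggested policy easy to explain relative to current practice.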
Funder
National Institute of Environmental Health Sciences
National Institute of General Medical Sciences
National Institute of Neurological Disorders and Stroke
Subject
Statistics and Probability, Epidemiology
Cited by
1 article.