Affiliation:
1. Independent Consultant, Knoxville, United States
Abstract
Four linear multilevel mixed-risk models were compared using model assumption tests and predictions. The models differed in the number of random intercepts, from 1 to 4, producing 2-level through 5-level models of the same measure, operative time. Normality of the dependent variable and residuals, variance homoscedasticity, and level-1 and level-2 exogeneity were tested using the robust test of the level-1 residual variance by surgeon, estimates of density and skew, and the Hausman test. Measure (operative time by hospital and surgeon) aberrancy and risk classification were evaluated using traditional methods and used to assess distribution measures. The dependent variable and the level-1 residuals required transformation for linearity and variance stabilization, respectively. Normality criteria were met for both level-1 and level-2 residuals and standardized residuals. The likelihood ratio was significantly larger for the 5-level model (1016.1; P<0.00005) than for the 4-level and other models. Shrinkage was greatest for the 2-level model (0.039; P<0.00005) and least for the 5-level model (0.028; P<0.00005). Level-1 variance homoscedasticity was confirmed by the robust variance test across all models (P>F=1). Aberrant value detection did not require the exclusion of any observations, while prediction intervals classified 54.2% of surgeons as low or high risk under the 2-level model and 8.6% under the 5-level model. The traditional (χ² = -11.01; P=1) and instrumental variable (χ² = 21.06; P=1) Hausman tests show that the null hypothesis cannot be rejected for level-1 or level-2 exogeneity. Because level-1 and level-2 exogeneity were confirmed and deconfounding was a model consideration, causal inferential capacity was assumed. The likelihood ratio, residual variance, shrinkage, and predictions show that the 5-level model is preferred to the other models.