Exploring relationships between in-hospital mortality and hospital case volume using random forest: results of a cohort study based on a nationwide sample of German hospitals, 2016–2018
BMC Health Services Research volume 22, Article number: 1 (2022)
Relationships between in-hospital mortality and case volume were investigated for various patient groups in many empirical studies with mixed results. Typically, those studies relied on (semi-)parametric statistical models like logistic regression. Those models impose strong assumptions on the functional form of the relationship between outcome and case volume. The aim of this study was to determine associations between in-hospital mortality and hospital case volume using random forest as a flexible, nonparametric machine learning method.
We analyzed a sample of 753,895 hospital cases with stroke, myocardial infarction, ventilation > 24 h, COPD, pneumonia, and colorectal cancer undergoing colorectal resection treated in 233 German hospitals over the period 2016–2018. We derived partial dependence functions from random forest estimates capturing the relationship between the patient-specific probability of in-hospital death and hospital case volume for each of the six considered patient groups.
Across all patient groups, the smallest hospital volumes were consistently related to the highest predicted probabilities of in-hospital death. We found strong relationships between in-hospital mortality and hospital case volume for hospitals treating a (very) small number of cases. Slightly higher case volumes were associated with substantially lower mortality. The estimated relationships between in-hospital mortality and case volume were nonlinear and nonmonotonic.
Our analysis revealed strong relationships between in-hospital mortality and hospital case volume in hospitals treating a small number of cases. The nonlinearity and nonmonotonicity of the estimated relationships indicate that studies applying conventional statistical approaches like logistic regression should consider these relationships adequately.
Volume-outcome relationships in inpatient care were investigated in a large number of studies for various patient groups [1,2,3,4,5]. In research on patient outcomes in critical care and surgery, special emphasis has been placed on hospital mortality. Empirical analyses suggested that higher case volumes were related to lower mortality in patients with stroke [6, 7], acute myocardial infarction [8, 9], mechanical ventilation [10, 11], respiratory diseases [12, 13], and surgical interventions [14,15,16,17,18,19]. However, evidence is inconclusive as the results of several studies cast doubt on the proposed volume-outcome associations [20,21,22,23].
Typically, studies investigating relationships between case volume and hospital mortality applied (semi-)parametric statistical models like logistic regression [4, 7, 10, 13]. A main advantage of those conventional approaches is that they facilitate adjustment for patient-specific risk factors while ensuring easily interpretable results in terms of effect sizes (e.g. due to estimation of odds ratios). However, this advantage comes at the cost of flexibility in modeling volume-outcome relationships. Statistical models like logistic regression impose a specific functional form on the relationships between outcome and covariates, including case volume, e.g. via the logistic link function. This functional form represents a strong assumption, particularly if case volume enters the regression as a continuous variable [2, 4]. In this case, the relationship between the probability of outcome occurrence and case volume is assumed to be logistic over the whole range of case volumes included in the data. Deviations from this assumption may result in biased estimates. As an alternative strategy, case volume may be divided into groups, which then enter the regression as separate indicator variables [2, 7, 8, 13]. However, this approach involves the definition of thresholds for volume groups, which may be chosen in an arbitrary way. Importantly, inappropriate definition of those thresholds (e.g. thresholds assigning too small or too large widths to specific volume groups) may lead to inadequate results and conclusions.
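For illustration, the two conventional specifications described above can be written out as follows. The notation is an assumption introduced here and does not appear in the cited studies: $y_{ih}$ indicates in-hospital death of patient $i$ in hospital $h$, $v_h$ is the hospital case volume, $x_{ih}$ collects the patient-specific risk factors, and $c_0 < c_1 < \dots < c_K$ are volume thresholds chosen by the analyst, with the lowest group serving as reference.

```latex
\begin{align}
\text{continuous volume:}\quad
  \operatorname{logit}\,\Pr(y_{ih}=1) &= \beta_0 + \beta_1 v_h + x_{ih}^{\top}\gamma, \\
\text{volume groups:}\quad
  \operatorname{logit}\,\Pr(y_{ih}=1) &= \beta_0
    + \sum_{k=2}^{K} \delta_k \,\mathbf{1}\{c_{k-1} \le v_h < c_k\}
    + x_{ih}^{\top}\gamma .
\end{align}
```

In the first form, a single coefficient $\beta_1$ fixes the shape of the volume effect over the whole range of $v_h$. In the second form, the shape is a step function whose jump points are determined entirely by the chosen thresholds.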
Against that background, the objective of this study was to exploit advantages of random forest as a flexible, nonparametric machine learning method for estimating volume-outcome relationships. Random forest facilitates exploration of associations between in-hospital mortality and hospital case volume without presuming a specific functional relationship between outcome and risk factors. Instead, those relationships were explored by estimating partial dependence functions. Within the framework of the IMPRESS study (“Effectiveness of the IQM-PR procedure to improve in-patient care - a pragmatic cluster randomized controlled trial”), we aimed to determine associations between in-hospital mortality and hospital case volume. To this end, we analyzed a large sample of German hospital cases covering the period 2016–2018. Based on random forest estimates, we derived partial dependence functions capturing associations between patient-specific probabilities of in-hospital death and hospital case volume for patients with stroke, myocardial infarction, colorectal resection with cancer, ventilation > 24 h, COPD, and pneumonia.
The IMPRESS study
The IMPRESS study was a cluster-randomized controlled trial (cluster RCT) on the effect of clinical peer review, conducted in member hospitals of the German Initiative Qualitätsmedizin (IQM), on mortality in patients ventilated > 24 h. The cluster RCT was embedded in a prospective cohort study, which provided the basis for exploratory analysis of risk factors for in-hospital mortality. The primary outcome was in-hospital mortality in patients ventilated for more than 24 h. Secondary outcomes were in-hospital mortality in patients with stroke, myocardial infarction, colorectal resection, COPD, and pneumonia. The study has been registered and its procedures were described in detail elsewhere. Here, we report exploratory results from the cohort study. Our analyses were based on data from 233 IQM member hospitals which agreed to participate in the IMPRESS study, covering the period 2016–2018.
Outcome and patient groups
The outcome of this study was in-hospital mortality. We estimated relationships between in-hospital mortality and hospital case volume for patients with stroke, myocardial infarction, ventilation > 24 h, COPD, pneumonia, and patients with colorectal cancer undergoing colorectal resection. We identified these patients based on ventilation time, diagnoses according to the German modification of the International Classification of Diseases (ICD-10-GM), and medical procedures according to the Operation and Procedure Classification System (OPS). Inclusion and exclusion criteria followed the corresponding German Inpatient Quality Indicators (G-IQI, version 5.0) definitions (Table 1). Departing from the G-IQI, we included patients with colorectal resection only if they had a documented diagnosis of colorectal cancer (ICD-10-GM: C18-C20) to ensure specificity and homogeneity of the underlying medical condition [18, 19].
Data sources and variables
The analysis was based on secondary data and did not involve human participants. We gathered patient data of included IQM member hospitals according to German law, §21 Krankenhausentgeltgesetz (KHEntgG). These data are collected by inpatient care providers for accounting purposes and are harmonized at the national level. In addition, we used data on hospital characteristics from the German Hospital Directory (Deutsches Krankenhausverzeichnis).
We calculated yearly hospital case volumes as the number of patients with a specific indication (stroke, myocardial infarction, colorectal resection, ventilation > 24 h, COPD or pneumonia) treated in a specific hospital in a specific year. To adjust for the influence of relevant patient characteristics, multivariable analyses included age (in years), sex (male; female), and dummy variables for all 31 Elixhauser comorbidities. In addition, we used admission reason (referral; emergency case admission; transfer from other hospital) and intensive care unit (ICU) admission as proxies for urgency and disease severity. Regarding potentially relevant hospital characteristics, we accounted for urban/rural location (as defined by Bundesinstitut für Bau-, Stadt- und Raumforschung (BBSR)), hospital ownership (public, non-profit, private), and university hospital status.
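The definition of yearly case volume given above amounts to counting cases per hospital, indication, and year. The following is a minimal Python sketch of that counting step, not the authors' implementation (the analysis itself was run in R). The record layout and the key names `hospital_id`, `indication`, and `year` are assumptions made for illustration, and the toy records are invented.

```python
from collections import Counter

def yearly_case_volumes(cases):
    """Count cases per (hospital, indication, year).

    `cases` is a list of dicts, one per hospital case, with the
    hypothetical keys 'hospital_id', 'indication', and 'year'.
    """
    return Counter((c["hospital_id"], c["indication"], c["year"]) for c in cases)

def attach_volume(cases):
    """Return copies of the case records with the matching case volume added."""
    volumes = yearly_case_volumes(cases)
    return [
        {**c, "case_volume": volumes[(c["hospital_id"], c["indication"], c["year"])]}
        for c in cases
    ]

# Toy records, invented for illustration only.
toy_cases = [
    {"hospital_id": "A", "indication": "stroke", "year": 2016},
    {"hospital_id": "A", "indication": "stroke", "year": 2016},
    {"hospital_id": "A", "indication": "stroke", "year": 2017},
    {"hospital_id": "B", "indication": "COPD", "year": 2016},
]
toy_with_volume = attach_volume(toy_cases)
```

Attaching the volume to every case record reflects that the analysis is run at the patient level, with case volume entering as a covariate of each patient.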
Since the data sources provide full data on patient and hospital characteristics, all participating hospitals and patients fulfilling the inclusion criteria could be included in our analysis.
Data protection and ethics
We obtained written consent on study participation from all included hospitals prior to the start of the IMPRESS study. The study data trust site at Koordinierungszentrum für Klinische Studien (KKS) Dresden ensured anonymization of the data. These anonymized data were analyzed at the Center for Evidence-Based Healthcare (ZEGV) Dresden. The ethics committee of the TU Dresden approved the study protocol on 24/04/2017 (registered at the Institutional Review Board (IRB): Office for Human Research Protections (OHRP); identification numbers: IRB00001473 and IORG0001076).
We characterized the distributions of hospital and patient characteristics by absolute and relative frequencies in case of categorical variables and by median and 1st and 3rd quartile (Q1; Q3) in case of continuous variables. For descriptive analysis, we divided hospital case volumes into nine categories (1–9; 10–19; 20–49; 50–99; 100–199; 200–499; 500–999; 1000–1999 and 2000+). Hospitals with zero cases per considered group were excluded. The smaller widths assigned to categories capturing lower volumes reflect that volume-outcome relationships may be more pronounced at smaller case volumes. For each volume category, we calculated the raw mortality rate across all hospitals and used bar charts to visualize its relationship with case volume.
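The descriptive step can be sketched in a few lines of Python. This is an illustrative sketch only: the category bounds are taken from the text, while the input layout (pairs of case volume and a death flag) and the pooling of cases across all hospitals within a category are assumptions about how the raw rates were formed.

```python
from bisect import bisect_right

# Lower bounds and labels of the volume categories listed in the text.
LOWER_BOUNDS = [1, 10, 20, 50, 100, 200, 500, 1000, 2000]
LABELS = ["1-9", "10-19", "20-49", "50-99", "100-199",
          "200-499", "500-999", "1000-1999", "2000+"]

def volume_category(volume):
    """Map a yearly case volume (>= 1) to its category label."""
    if volume < 1:
        raise ValueError("hospitals with zero cases are excluded")
    return LABELS[bisect_right(LOWER_BOUNDS, volume) - 1]

def raw_mortality_by_category(cases):
    """Raw mortality rate per volume category, pooled over cases.

    `cases` is a list of (case_volume, died) pairs, one per hospital case,
    where `died` is True for in-hospital death.
    """
    deaths, totals = {}, {}
    for volume, died in cases:
        label = volume_category(volume)
        totals[label] = totals.get(label, 0) + 1
        deaths[label] = deaths.get(label, 0) + int(died)
    # Report only categories that actually contain cases, in volume order.
    return {label: deaths[label] / totals[label]
            for label in LABELS if label in totals}

# Toy data, invented for illustration only.
toy_rates = raw_mortality_by_category(
    [(5, True), (5, False), (15, False),
     (2500, True), (2500, False), (2500, False), (2500, False)]
)
```

Because the rates are unadjusted, differences between categories in such a table may reflect case mix as much as volume.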
Descriptive, bivariate analysis of relationships between mortality and case volume may be subject to uncontrolled confounding. We therefore modeled patient-specific mortality risk conditional on all patient and hospital characteristics outlined above. In contrast to conventional statistical approaches, we used a random forest classifier [24, 31]. Random forest is a tree-based machine-learning algorithm that constructs a multitude of decision trees based on bootstrapped samples of the original data. As a nonparametric statistical method, random forest does not make assumptions on the functional form of the relationships between outcome and covariates. Thus, it allows for flexible, data-driven exploration of those relationships and even captures complex interactions between covariates. Based on random forest results, relationships between the outcome and specific covariates may be explored by estimating the partial dependence function. The partial dependence function represents the effect of a specific covariate on the outcome after accounting for average effects of the other covariates. Due to the flexibility of random forest, estimated partial dependence functions can be highly nonlinear and may even include discontinuities. Since the analysis was conducted at the patient level, we calculated and visualized partial dependence functions capturing relationships between the average patient-specific probability of in-hospital death and hospital case volume for all considered patient groups. Statistical analysis was performed using the packages “ranger” and “pdp” in R version 4.0.2.
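To make the partial dependence idea concrete, the following Python sketch shows the brute-force averaging that underlies it. It is not the authors' code, which relied on the R packages named above. The function `toy_predict` is a hypothetical stand-in for a fitted model's predicted probability of in-hospital death, and the feature names and toy patients are invented for illustration.

```python
def partial_dependence(predict, rows, feature, grid):
    """Brute-force partial dependence of `predict` on one feature.

    predict : callable mapping a feature dict to a predicted probability
    rows    : list of feature dicts (the observed patients)
    feature : name of the covariate of interest, e.g. "case_volume"
    grid    : values of that covariate at which to evaluate the function
    """
    curve = []
    for value in grid:
        # Set the covariate to `value` for every patient, keep all other
        # covariates as observed, and average the predictions.
        total = sum(predict({**row, feature: value}) for row in rows)
        curve.append((value, total / len(rows)))
    return curve

# Hypothetical stand-in for a fitted model; not estimated from any data.
def toy_predict(row):
    risk = 0.05 + 0.002 * (row["age"] - 60)
    if row["case_volume"] < 10:
        risk += 0.10
    return risk

# Toy patients, invented for illustration only.
toy_rows = [{"age": 60, "case_volume": 5}, {"age": 80, "case_volume": 300}]
toy_curve = partial_dependence(toy_predict, toy_rows, "case_volume", [5, 50])
```

Each point of the curve is computed separately for one grid value, which is what is meant by a pointwise calculation. Nothing in the procedure forces neighboring points to be similar, so any smoothness of the resulting curve comes from the fitted model and the data.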
We assessed the robustness of our results in multiple sensitivity analyses (see supplementary material). These included 1) adjustment for type or severity of the considered indication, 2) exclusion of individuals belonging to more than one of the considered patient groups, 3) estimation of volume-outcome relationships for patients ventilated > 24 h with specific medical conditions (stroke, myocardial infarction, and COPD).
Hospital and patient characteristics
The full dataset included 12,140,587 cases treated in the participating hospitals in the period 2016–2018 (see flow chart provided in the supplementary material). Of these cases, 753,895 fulfilled the inclusion criteria for at least one patient group. The resulting sample covered a wide range of average yearly hospital case volumes, which differed between indications (Table 2). While the median case volume was lowest for colorectal resection (35 cases), the highest median case volume was observed for pneumonia (189 cases). Most hospitals were located in urban areas and more than 40% were privately owned. The sample included eight university hospitals. Mortality was highest in patients with ventilation > 24 h (overall mortality rate: 32.9%) and lowest in patients with colorectal resection (overall mortality rate: 3.2%). The median age of patients ranged between 69 years (ventilation > 24 h) and 77 years (pneumonia). Across all indications, men accounted for the majority of cases. Compared to the other patient groups, patients with ventilation > 24 h had the highest median number of Elixhauser comorbidities. Most patients with stroke, myocardial infarction, and pneumonia were admitted as emergency cases. The share of emergency cases was 48.5% in patients with ventilation > 24 h, 47.7% in patients with COPD, and 17.1% in patients with colorectal cancer undergoing colorectal resection. ICU admission was most frequent in patients ventilated > 24 h (44.7%).
Descriptive relationships between in-hospital mortality and hospital case volume
With 2000 or more cases per year, the largest hospital volumes were observed for stroke and ventilation > 24 h (Table 3). Calculating the average mortality rates by case volume for each indication did not reveal clear patterns (Fig. 1). For most patient groups, hospitals with small volumes were characterized by relatively high mortality rates. However, the evolution of mortality rates across volume groups was non-monotonic. There was an increase in mortality rates in hospitals belonging to the highest volume groups for stroke, myocardial infarction, colorectal resection, and pneumonia. Opposite trends of mortality rates in high-volume groups were found for ventilation > 24 h and COPD.
Partial dependence functions based on random forest estimates
In contrast to descriptive evidence, the partial dependence functions derived from random forest estimations revealed clear and qualitatively similar patterns across most patient groups (Fig. 2). Please note that volume- and probability-scales are specific to each subfigure. The strongest relationships between in-hospital death and hospital case volume were revealed for those hospitals treating a small number of cases. The smallest case volumes were consistently related to the highest patient-specific probabilities of in-hospital death. In all patient groups, slightly higher case volumes compared to these smallest case volumes were associated with substantially lower predicted probabilities of in-hospital death. Notably, the estimated partial dependence functions were relatively smooth although they were calculated pointwise for specific hospital volumes. In relative terms, estimated differences between the lowest and the highest average predicted probability of in-hospital death exceeded 50% for all indications except for ventilation > 24 h. In case of the latter, the maximum absolute difference in the volume-specific predicted probabilities of in-hospital death was approximately 10 percentage points. The partial dependence functions also indicated increases in the probability of in-hospital death for case volumes exceeding certain, indication-specific thresholds. Again, the only exception was ventilation > 24 h for which this upward trend in partial dependence at higher case volumes was not observed.
As shown in the supplementary material, the results remained qualitatively stable when adjusting for additional risk factors (type or severity of indication) and when excluding individuals belonging to more than one of the considered patient groups. Volume-outcome relationships found for the total population of patients ventilated > 24 h were also revealed in subgroups of ventilated patients with stroke, myocardial infarction, and pneumonia, respectively.
Volume-outcome relationships in inpatient care were explored and discussed controversially in a multitude of studies. Typically, estimation of those relationships relied on (semi-)parametric statistical models. The two main strategies of handling case volume in those analyses - treating case volume as continuous variable or defining volume groups - either impose strong assumptions on the functional form of the relationship between outcome and volume or rely on the definition of arbitrary volume thresholds.
Using random forest as a flexible, nonparametric statistical method, this study contributes to the literature by providing real-world evidence on volume-outcome relationships for six patient groups without presuming a specific functional form. Using a sample of more than 230 German hospitals over the period 2016–2018, our results consistently indicate that hospitals with small case volumes were characterized by the highest predicted probabilities of in-hospital death in patients with stroke, myocardial infarction, colorectal resection, ventilation > 24 h, COPD and pneumonia. Estimated volume-outcome relationships were particularly pronounced in small-volume hospitals. Slightly higher volumes were associated with substantially lower mortality in the group of hospitals treating a (very) small number of cases. Thus, our findings suggest that particularly hospitals with very small case volumes showed deficient performance. This finding is in line with previous studies on volume-outcome relationships in similar clinical settings. Although the estimated partial dependence functions were calculated pointwise for specific hospital volumes, they were notably smooth for all patient groups. This supports the notion of systematic relationships between in-hospital mortality and hospital case volume.
Moreover, our results show that volume-outcome relationships were nonlinear and non-monotonic. Except for ventilation > 24 h, we found that the average predicted probability of in-hospital death increased with case volume after reaching a certain, indication-specific threshold. A possible explanation for this finding is that hospitals with very high case volumes may be characterized by a patient population that systematically differs from those of hospitals with lower case volumes in terms of disease severity. If patients treated in hospitals with very high volumes were characterized by systematically higher disease severity that was not fully captured by comorbidities and admission reasons included in our analysis, this incomplete adjustment may result in increasing predicted probabilities of in-hospital death for higher case volumes. The fact that this upward trend in high-volume hospitals was not observed for ventilation > 24 h may reflect that protective volume effects outweigh incomplete adjustment for disease severity for this indication. Since long-term ventilation is a highly demanding task that places high demands on equipment and on the skills and capabilities of medical personnel, outcome improvements resulting from increased treatment experience and expertise may saturate only at higher case volumes. A complementary explanation for the finding of increasing mortality at high case volumes is that high-volume hospitals often have multiple departments treating patients with the same indication. As a result, each of those departments only accounts for a certain share of total hospital volume. The existence of multiple departments may be related to heterogeneity in performance and increase the risk of misallocation of patients, which, in turn, may be reflected in higher mortality.
Strengths and limitations of this study
A main strength of this study is the analysis of data on more than 753,000 cases of patients treated in more than 230 hospitals covering the period 2016–2018. This broad dataset allowed for reliable nonparametric estimation of volume-outcome relationships for six indications. By using a nonparametric machine-learning approach, our study complements conventional statistical approaches to estimate volume-outcome relationships and, thus, also makes an important methodological contribution to the literature.
As a main limitation, our results do not support a causal interpretation due to the use of secondary, observational data. In fact, the estimated partial dependence functions reflect the relation of predicted patient outcomes to hospital case volume. Since experimental studies on volume-outcome relationships are difficult to realize, this limitation is shared with the vast majority of related studies.
For descriptive analysis, we divided hospital volumes into volume groups. As already mentioned above, definition of those volume groups is arbitrary and different definitions may have led to different descriptive evidence on the relationship between in-hospital mortality and hospital volume. To overcome this shortcoming, we applied random forest, which does not rely on definition of volume groups and does not impose a specific functional form on the relationship between in-hospital mortality and hospital case volume. A limitation of random forest analysis is that it does not take the multilevel nature of the data (i.e. nesting of patients within hospitals) into account. Consequently, we could not derive valid uncertainty estimates (e.g. confidence intervals) for partial dependence functions. Although hospitals with small case volumes were consistently characterized by the highest mortality estimates, these estimates are based on a relatively small number of patients treated in these hospitals. This relatively small number of patients may induce low precision of partial dependence estimates that cannot be captured by our methodological approach. However, the fact that we estimated the highest mortality rates for small-volume hospitals across all six considered indications suggests reliability of our findings. Moreover, missing uncertainty estimates do not affect the validity of the point estimates of the partial dependence functions, which allowed us to explore relationships between in-hospital mortality and case volume in a flexible way.
Our data do not include all possibly relevant hospital characteristics like team/surgeon volumes, information concerning certifications [38,39,40], staffing, and qualification. This is also true with respect to patient-specific risk factors. This may result in incomplete adjustment in the framework of statistical analysis and may explain the estimated upward trend in the partial dependence between in-hospital mortality and case volume for the largest hospitals in our sample. However, our results remained robust against inclusion of additional, indication-specific risk factors available in our data and the exclusion of patients belonging to more than one of the considered patient groups (see supplementary material). Since the focus of our analysis was on hospital volume, we did not account for the existence of multiple specialized departments in high-volume hospitals. Consequently, we could not capture intra-hospital heterogeneity in terms of volume and, possibly, performance.
Coding bias and differences in coding practices between hospitals may limit the validity of our results [42, 43]. However, this limitation is only relevant to the extent that coding practices are systematically related to hospital case volume. Since this study focused on mortality, we did not consider other relevant patient outcomes. The main advantage of using mortality as outcome is that data on in-hospital death have high validity. Extending this analysis to outcomes other than mortality would require careful examination and discussion of whether these outcomes can be operationalized with sufficient validity using administrative hospital data. Finally, our analysis did not indicate causes of the estimated volume-outcome relationships. In addition to learning-by-doing, such relationships may be explained by selective referral. Gaining a deeper understanding of the underlying causes is therefore an important task for further research.
The results of this study support previous evidence on the existence of volume-outcome relationships in inpatient care of patients with stroke, myocardial infarction, ventilation > 24 h, COPD, pneumonia, and patients with colorectal cancer undergoing colorectal resection. From a policy perspective, these results suggest that patient outcomes may be systematically worse in small-volume hospitals and, thus, support arguments for centralization or regionalization of care of specific patient groups [46, 47].
The nonlinearity, nonmonotonicity, and indication-specific shape of the estimated relationships suggest that future studies should pay special attention to the valid specification of statistical models. We found the most pronounced volume effects at small case volumes. Hence, as a general recommendation, empirical studies using (semi-)parametric methods like logistic regression should assign only small widths to low-volume groups or use appropriate transformations of volume data to model these relationships adequately.
To assess the generalizability of our findings, additional studies applying flexible estimation of volume-outcome relationships in similar settings would be valuable. Further indication-specific evidence on the shape of relationships between in-hospital mortality and case volume would be an important contribution to the literature and may allow for derivation of robust implications for targeted improvement of hospital outcomes. This may include reliable, indication-specific estimation of volume thresholds for sufficiently high outcome quality.
Availability of data and materials
Data are not publicly available due to legal restrictions (German law). The data sets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Nguyen Y-L, Wallace DJ, Yordanov Y, Trinquart L, Blomkvist J, Angus DC, et al. The volume-outcome relationship in critical care. Chest. 2015;148(1):79–92.
Nimptsch U, Mansky T. Hospital volume and mortality for 25 types of inpatient treatment in German hospitals: observational study using complete national data from 2009 to 2014. BMJ Open. 2017;7(9):e016184.
Halm EA, Lee C, Chassin MR. Is volume related to outcome in health care? A systematic review and methodologic critique of the literature. Ann Intern Med. 2002;137(6):511–20.
Ross JS, Normand S-LT, Wang Y, Ko DT, Chen J, Drye EE, et al. Hospital volume and 30-day mortality for three common medical conditions. N Engl J Med. 2010;362(12):1110–8.
Hendricks A, Diers J, Baum P, Weibel S, Kastner C, Müller S, et al. Systematic review and meta-analysis on volume-outcome relationship of abdominal surgical procedures in Germany. Int J Surg. 2021;86:24–31.
Saposnik G, Baibergenova A, O’Donnell M, Hill MD, Kapral MK, Hachinski V. Hospital volume and stroke outcome. Does it matter? Neurology. 2007;69(11):1142–51.
Tsugawa Y, Kumamaru H, Yasunaga H, Hashimoto H, Horiguchi H, Ayanian JZ. The Association of Hospital Volume with Mortality and Costs of Care for Stroke in Japan. Med Care. 2013;51(9):782–8.
Tu JV, Austin PC, Chan BTB. Relationship between annual volume of patients treated by admitting physician and mortality after acute myocardial infarction. JAMA. 2001;285(24):3116–22.
Vakili BA, Robert K, Brown DL. Volume-outcome relation for physicians and hospitals performing angioplasty for acute myocardial infarction in New York state. Circulation. 2001;104(18):2171–6.
Kahn JM, Goss CH, Heagerty PJ, Kramer AA, O’Brien CR, Rubenfeld GD. Hospital volume and the outcomes of mechanical ventilation. N Engl J Med. 2006;355(1):41–50.
Kahn JM, Ten Have TR, Iwashyna TJ. The relationship between hospital volume and mortality in mechanical ventilation: an instrumental variable analysis. Health Serv Res. 2009;44(3):862–79.
Lin H-C, Xirasagar S, Chen C-H, Hwang Y-T. Physician’s case volume of intensive care unit pneumonia admissions and in-hospital mortality. Am J Respir Crit Care Med. 2008;177(9):989–94.
Kumamaru H, Tsugawa Y, Horiguchi H, Kumamaru KK, Hashimoto H, Yasunaga H. Association between hospital case volume and mortality in non-elderly pneumonia patients stratified by severity: a retrospective cohort study. BMC Health Serv Res. 2014;14(1):302.
Gooiker GA, van Gijn W, Wouters MWJM, Post PN, van de Velde CJH, Tollenaar RAEM. Systematic review and meta-analysis of the volume–outcome relationship in pancreatic surgery. Br J Surg. 2011;98(4):485–94.
Karanicolas PJ, Dubois L, Colquhoun PHD, Swallow CJ, Walter SD, Guyatt GH. The more the better?: the impact of surgeon and hospital volume on in-hospital mortality following colorectal resection. Ann Surg. 2009;249(6):954–9.
Zevin B, Aggarwal R, Grantcharov TP. Volume-outcome Association in Bariatric Surgery: a systematic review. Ann Surg. 2012;256(1):60–71.
Birkmeyer JD, Finlayson SRG, Tosteson ANA, Sharp SM, Warshaw AL, Fisher ES. Effect of hospital volume on in-hospital mortality with pancreaticoduodenectomy. Surgery. 1999;125(3):250–6.
Diers J, Wagner J, Baum P, Lichthardt S, Kastner C, Matthes N, et al. Nationwide in-hospital mortality rate following rectal resection for rectal cancer according to annual hospital volume in Germany. BJS Open. 2020;4(2):310–9.
Diers J, Wagner J, Baum P, Lichthardt S, Kastner C, Matthes N, et al. Nationwide in-hospital mortality following colonic cancer resection according to hospital volume in Germany. BJS Open. 2019;3(5):672–7.
Mehta AB, Douglas IS, Walkey AJ. Hospital noninvasive ventilation case volume and outcomes of acute exacerbations of chronic obstructive pulmonary disease. Ann Am Thorac Soc. 2016;13(10):1752–9.
Lindenauer PK, Behal R, Murray CK, Nsa W, Houck PM, Bratzler DW. Volume, quality of care, and outcome in pneumonia. Ann Intern Med. 2006;144(4):262–9.
Shackley P, Slack R, Booth A, Michaels J. Is there a Positive Volume–Outcome Relationship in Peripheral Vascular Surgery? Results of a Systematic Review. Eur J Vasc Endovasc Surg. 2000;20(4):326–35.
Glance LG, Dick AW, Osler TM, Mukamel DB. The relation between surgeon volume and outcome following off-pump vs on-pump coronary artery bypass graft surgery. Chest. 2005;128(2):829–37.
Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. 2nd ed: Springer Science & Business Media; 2009.
Walther F, Schoffer O, Schmitt J. Effectiveness of a collegial consultation procedure to improve in-patient care - a pragmatic cluster randomized controlled trial [Internet]. 2018. Available from: https://doi.org/10.1186/ISRCTN10188560.
Schmitt J, Schoffer O, Walther F, Roessler M, Grählert X, Eberlein-Gonska M, et al. Effectiveness of the IQM peer review procedure to improve in-patient care - a pragmatic cluster randomized controlled trial (IMPRESS): study design and baseline results. J Public Health (Berl). 2021;29(1):195–203.
Mansky T, Nimptsch U, Cools A, Hellerhoff F. G-IQI German Inpatient Quality Indicators Version 5.0 [Internet]. 2016 [cited 2020 Jul 16]. Available from: www.seqmgw.tu-berlin.de/fileadmin/fg241/GIQI_V50_Band_1.pdf
Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care. 1998;36(1):8–27.
BBSR. Laufende Raumbeobachtung – Raumabgrenzungen [Internet]. 2020 [cited 2020 Jul 16]. Available from: https://www.bbsr.bund.de/BBSR/DE/forschung/raumbeobachtung/Raumabgrenzungen/deutschland/regionen/Regionstypen/regionstypen.html
Hentschker C, Mennicken R. The Volume–Outcome Relationship Revisited: Practice Indeed Makes Perfect. Health Serv Res. 2018;53(1):15–34.
Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.
Wright MN, Ziegler A. ranger: A fast implementation of random forests for high dimensional data in C++ and R. J Stat Softw. 2017;77(1):1–17.
Greenwell BM. pdp: an R package for constructing partial dependence plots. The R Journal. 2017;9(1):421–36.
R Core Team. R: A language and environment for statistical computing [Internet]. R Foundation for Statistical Computing; 2020 [cited 2020 Jul 16]. Available from: https://www.R-project.org/
Schoffer O, Roessler M, Walther F, Eberlein-Gonska M, Scriba PC, Albrecht M, et al. Patient-level and hospital-level risk factors for in-hospital mortality in patients ventilated for more than 24 hours: results of a nationwide cohort study. J Intensive Care Med. 2020;36(8):954–62.
Hill NS. Where should noninvasive ventilation be delivered? Respir Care. 2009;54(1):62–70.
Morche J, Mathes T, Pieper D. Relationship between surgeon volume and outcomes: a systematic review of systematic reviews. Syst Rev. 2016;5(1):204.
Ross MA, Amsterdam E, Peacock WF, Graff L, Fesmire F, Garvey JL, et al. Chest pain center accreditation is associated with better performance of Centers for Medicare and Medicaid Services Core measures for acute myocardial infarction. Am J Cardiol. 2008;102(2):120–4.
Lichtman JH, Jones SB, Wang Y, Watanabe E, Leifheit-Limson E, Goldstein LB. Outcomes after ischemic stroke for hospitals with and without joint commission-certified primary stroke centers. Neurology. 2011;76(23):1976–82.
Falstie-Jensen AM, Larsson H, Hollnagel E, Nørgaard M, Svendsen MLO, Johnsen SP. Compliance with hospital accreditation and patient mortality: a Danish nationwide population-based study. Int J Qual Health Care. 2015;27(3):165–74.
Griffiths P, Ball J, Drennan J, James L, Jones J, Recio A, et al. The association between patient safety outcomes and nurse/healthcare assistant skill mix and staffing levels and factors that may influence staffing requirements. Centre for Innovation and Leadership in Health Sciences: University of Southampton; 2014.
Serdén L, Lindqvist R, Rosén M. Have DRG-based prospective payment systems influenced the number of secondary diagnoses in health care administrative data? Health Policy. 2003;65(2):101–7.
Iezzoni LI, Foley SM, Daley J, Hughes J, Fisher ES, Heeren T. Comorbidities, complications, and coding bias: does the number of diagnosis codes matter in predicting in-hospital mortality? JAMA. 1992;267(16):2197–203.
Maass C, Kuske S, Lessing C, Schrappe M. Are administrative data valid when measuring patient safety in hospitals? A comparison of data collection methods using a chart review and administrative data. Int J Qual Health Care. 2015;27(4):305–13.
Christian CK, Gustafson ML, Betensky RA, Daley J, Zinner MJ. The Volume–Outcome Relationship: Don’t Believe Everything You See. World J Surg. 2005;29(10):1241–4.
Aquina CT, Probst CP, Becerra AZ, Iannuzzi JC, Kelly KN, Hensley BJ, et al. High volume improves outcomes: the argument for centralization of rectal cancer surgery. Surgery. 2016;159(3):736–48.
Fargen KM, Jauch E, Khatri P, Baxter B, Schirmer CM, Turk AS, et al. Needed dialog: regionalization of stroke systems of care along the trauma model. Stroke. 2015;46(6):1719–26.
Swart E, Gothe H, Geyer S, Jaunzeme J, Maier B, Grobe TG, et al. Good practice of secondary data analysis (GPS): guidelines and recommendations. Gesundheitswesen (Bundesverband der Ärzte des Öffentlichen Gesundheitsdienstes (Germany)). 2015;77(2):120–6.
The authors thank Jochen Strauß, Claudia Winklmair, the IQM executive board, and all participating hospitals.
Open Access funding enabled and organized by Projekt DEAL. The IMPRESS study was funded by the Innovation Fund of the Joint Federal Committee (Gemeinsamer Bundesausschuss, G-BA), Germany. Funding number: 01VSF16013.
Ethics approval and consent to participate
The ethics committee of the TU Dresden approved the study protocol on 24/04/2017 (registered at the Institutional Review Board (IRB): Office for Human Research Protections (OHRP); identification numbers: IRB00001473 and IORG0001076). We obtained consent to participate from the included hospitals before the start of the IMPRESS study. Due to the use of secondary, routine care data, patient consent to participate was not required as confirmed by the ethics committee of the TU Dresden (reference number: EK 186052017). The study adheres to all relevant ethical standards and guidelines, including the Declaration of Helsinki, the Memorandum on the Assurance of Good Scientific Practice of the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG), the Memorandum III “Methoden für die Versorgungsforschung” of the Deutsches Netzwerk Versorgungsforschung and the guideline “Good practice secondary data analysis” (GPS) of the German Society for Epidemiology (Deutsche Gesellschaft für Epidemiologie, DGEpi).
Consent for publication
Competing interests
Peter C. Scriba and Ralf Kuhlen are members of the scientific advisory board of IQM. Maria Eberlein-Gonska serves as an external expert for IQM. The other authors declare that they have no conflict of interest.
: Additional, indication-specific risk factors. Figure S1: Partial dependence functions capturing the relationship between the probability of in-hospital death and hospital case volume derived from random forest estimates including additional, indication-specific risk factors. Table S2: Cases excluded due to membership in multiple patient groups. Figure S2: Partial dependence functions capturing the relationship between the probability of in-hospital death and hospital case volume derived from random forest estimates excluding cases belonging to multiple patient groups. Figure S3: Partial dependence functions capturing the relationship between the probability of in-hospital death and hospital case volume derived from random forest for patients ventilated > 24 h with specific indications.
Cite this article
Roessler, M., Walther, F., Eberlein-Gonska, M. et al. Exploring relationships between in-hospital mortality and hospital case volume using random forest: results of a cohort study based on a nationwide sample of German hospitals, 2016–2018. BMC Health Serv Res 22, 1 (2022). https://doi.org/10.1186/s12913-021-07414-z
- Hospital mortality
- Volume-outcome relationship
- Cohort study
- Risk factors
- Random Forest
- Nonparametric modelling