- Research article
Assessment of hospital performance with a case-mix standardized mortality model using an existing administrative database in Japan
BMC Health Services Research, volume 10, Article number: 130 (2010)
Few studies have examined whether risk adjustment is evenly applicable to hospitals with various characteristics and case-mix. In this study, we applied a generic prediction model to nationwide discharge data from hospitals with various characteristics.
We used standardized data of 1,878,767 discharged patients provided by 469 hospitals from July 1 to October 31, 2006. We generated and validated a case-mix in-hospital mortality prediction model using 50/50 split sample validation. We classified hospitals into two groups based on c-index value (hospitals with c-index ≥ 0.8; hospitals with c-index < 0.8) and examined differences in their characteristics.
The model demonstrated excellent discrimination as indicated by the high average c-index and small standard deviation (c-index = 0.88 ± 0.04). Expected mortality rate of each hospital was highly correlated with observed mortality rate (r = 0.693, p < 0.001). Among the studied hospitals, 446 (95%) had a c-index of ≥0.8 and were classified as the higher c-index group. A significantly higher proportion of hospitals in the lower c-index group were specialized hospitals and hospitals with convalescent wards.
The model fits well across hospitals treating a wide variety of acute care conditions, though model fit is less satisfactory for specialized hospitals and those with convalescent wards. Further refinement of the generic prediction model is recommended to obtain indices optimized for region-specific conditions.
Initiatives to measure healthcare quality attract serious attention from policy-makers and consumers who believe that such measurements can drive improvements in the quality of the service. Recent enthusiasm for outcome evaluation such as in-hospital mortality, however, has been challenged because of the difficulty of ensuring adequate risk adjustment for different patient populations, an indispensable factor for fairly evaluating healthcare performance. Owing to the clear definition of outcome and available knowledge of influential patient conditions, disease-specific risk adjustment models have been developed in several specialties, including cardiovascular diseases, and have been used in various quality improvement studies [3–6]. However, a risk adjustment model for more generic outcome evaluation has not been fully developed. In our previous study, we proposed and tested a generic risk prediction model for in-hospital mortality, with variables easily obtainable from large electronic administrative databases. Our model showed excellent precision and calibration compared to other risk adjustment models [9–12].
However, the dataset used in the previous study was derived mainly from large university-affiliated teaching hospitals, which may limit the generalizability of the results to a broader array of hospitals. Since risk-adjusted in-hospital mortality is often calculated for benchmarking purposes, whether the risk adjustment model is applicable to hospitals with varying characteristics and case-mix must be clarified. To date, few studies have examined whether case-mix risk adjustment can be evenly applied to such hospitals. In this study, we applied a generic case-mix-based risk adjustment model for in-hospital mortality prediction to hospitals with varying characteristics, and evaluated its performance for benchmarking risk-adjusted hospital mortality using a nationwide database of discharge cases.
We used an electronic, standardized dataset of discharged patients provided by 469 hospitals that participated in a Japanese patient classification system and related evaluation scheme from July 1 to October 31, 2006. The patient classification system, or Diagnosis Procedure Combination (DPC), includes information for up to two major diagnoses and up to six co-existing diagnoses. The 2008 version of the DPC system includes 18 major diagnostic categories (MDC) and 506 disease subcategories coded in ICD-10. For analytic purposes, we re-categorized the 18 MDCs into 10 MDCs based on mortality rates. The dataset also includes additional information such as patient demographics, uses and types of surgical procedures, emergency/elective hospitalization, length of stay, and discharge status (including in-hospital death) [13–15]. Records for 1,878,767 discharge cases were available for analysis. Cases were randomly assigned into two subsets with an approximate 50/50 split: one for model development and the other for validation tests. The resulting model development dataset included 939,409 records and the validation dataset included 939,358 records. Because of the anonymous nature of the data, the requirement for informed consent was waived. Study approval was obtained from the institutional review board of the hospital with which the last author was affiliated.
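The 50/50 split-sample assignment described above can be sketched as follows. This is an illustrative reconstruction in Python (the record identifiers and seed are hypothetical), not the actual assignment procedure used in the study.

```python
import random

def split_half(records, seed=2006):
    """Randomly assign records to development and validation subsets (~50/50 split)."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    development, validation = [], []
    for rec in records:
        (development if rng.random() < 0.5 else validation).append(rec)
    return development, validation

dev, val = split_half(list(range(100000)))
```

Because each case is assigned independently, the two subsets are approximately (not exactly) equal in size, mirroring the 939,409/939,358 split reported above.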
Model building and validation
We started with the mortality prediction model used in our previous study. The model includes age, gender, use of an ambulance at admission, admission status (emergency/elective), MDC of the primary diagnosis, and comorbidity. Based on Quan's methodology, the ICD-10 code of each co-existing diagnosis was converted into a Charlson Comorbidity Index score. We classified scores into five categories: 0, 1-2, 3-6, 7-12, and 13 and over. We further modified our former model by including "admission purpose." In the previous study, we found that the mortality risk of patients with cardiovascular diseases tended to be underestimated because this group included patients hospitalized only for post-operative evaluation. Thus, including admission purpose should improve the precision of low-risk prediction. We also included the Eastern Cooperative Oncology Group performance status (grade 0, fully active; grade 4, completely disabled) and the Fletcher-Hugh-Jones classification of respiratory status (class 1, patient's breathing is similar to that of others of the same sex and age; class 5, patient is breathless when talking or undressing, or is unable to leave the house due to breathlessness). These parameters were included because the mortality risk of patients with cancer and chronic pulmonary diseases tended to be overestimated, and inclusion of these additional scores should improve predictive precision for such patients. Given that the Fletcher-Hugh-Jones classification and performance status scores were required only for patients with chronic pulmonary diseases and cancer, missing observations were treated as null values. A multivariate logistic regression analysis including the variables mentioned above was performed to predict in-hospital mortality using the development dataset. Tests of model performance and fit were conducted using the validation dataset. Accuracy of the prediction models was determined with the c-index.
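Two of the steps above can be sketched in code: banding a Charlson score into the five model categories, and computing a predicted in-hospital mortality probability from a fitted logistic model. The coefficient names and values below are hypothetical placeholders for illustration, not the study's estimates.

```python
import math

def charlson_category(score):
    """Band a Charlson Comorbidity Index score into the five model categories."""
    if score <= 0:
        return "0"
    if score <= 2:
        return "1-2"
    if score <= 6:
        return "3-6"
    if score <= 12:
        return "7-12"
    return "13 and over"

def predicted_mortality(intercept, coefs, covariates):
    """Predicted death probability from a fitted logistic regression:
    p = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).
    `coefs` maps covariate names to (hypothetical) fitted coefficients."""
    linear = intercept + sum(coefs[name] * x for name, x in covariates.items())
    return 1.0 / (1.0 + math.exp(-linear))

# Illustrative use: an emergency admission raises the predicted risk.
p_emergency = predicted_mortality(-4.0, {"emergency": 1.2}, {"emergency": 1})
p_elective = predicted_mortality(-4.0, {"emergency": 1.2}, {"emergency": 0})
```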
We assessed the ability of the model to accurately predict mortality across all ranges of risk by comparing predicted and observed mortality rates in predicted mortality risk deciles.
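The decile-based calibration check can be sketched as follows; this minimal Python version assumes paired lists of predicted risks and observed outcomes (1 = in-hospital death, 0 = survival), sorts cases by predicted risk, and compares mean predicted with observed mortality within each decile.

```python
def calibration_by_decile(pred, obs):
    """Group cases into predicted-risk deciles and return
    (mean predicted rate, observed rate) per decile, lowest risk first."""
    pairs = sorted(zip(pred, obs))  # sort cases by predicted mortality risk
    n = len(pairs)
    deciles = []
    for d in range(10):
        chunk = pairs[d * n // 10:(d + 1) * n // 10]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(o for _, o in chunk) / len(chunk)
        deciles.append((mean_pred, obs_rate))
    return deciles
```

A well-calibrated model yields predicted and observed rates that track each other across all ten deciles, which is the comparison shown by risk decile in Figure 1.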
Comparison of hospital performance
We excluded from analysis one hospital that had a mortality rate of zero because the c-index could not be calculated. Given that a c-index of 0.8 to 0.9 is considered excellent, we divided hospitals into two groups using a c-index of 0.8 as the cut-off point. We then examined differences in characteristics between the two groups of hospitals, including size, number of admissions, crude and predicted mortality, and distribution of patient demographics and diseases, using Fisher's exact test and the t-test as appropriate. Hospitals in which a single MDC category accounted for more than half of all hospitalized cases were considered "specialized hospitals." All statistical tests were two-tailed and the significance level was set at p < 0.05.
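For a binary outcome, the c-index equals the probability that a randomly chosen death received a higher predicted risk than a randomly chosen survivor (ties counted as 0.5). The minimal sketch below also shows why the index is undefined for a hospital with no deaths, the reason for the exclusion noted above; it is an illustrative O(n²) implementation, not the study's computation.

```python
def c_index(pred, obs):
    """Concordance index for binary outcomes: fraction of death/survivor
    pairs in which the death had the higher predicted risk (ties = 0.5)."""
    deaths = [p for p, o in zip(pred, obs) if o == 1]
    survivors = [p for p, o in zip(pred, obs) if o == 0]
    if not deaths or not survivors:
        return None  # undefined when there are no deaths (or no survivors)
    concordant = 0.0
    for d in deaths:
        for s in survivors:
            if d > s:
                concordant += 1.0
            elif d == s:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))
```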
Standardized mortality ratios (SMRs) were obtained by calculating the ratio of observed mortality to expected mortality estimated by the model. Standardized mortality rate was obtained by multiplying SMRs and the average in-hospital mortality rate for all hospitals. All analyses were conducted with SPSS version 15.0J (SPSS Japan, Inc).
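The SMR arithmetic above can be expressed directly; the function below is a trivial sketch of the two definitions (ratio of observed to expected deaths, then scaling by the all-hospital average rate), not code from the study.

```python
def standardized_mortality(observed_deaths, expected_deaths, overall_rate):
    """Return (SMR, standardized mortality rate) for one hospital.
    SMR = observed / expected deaths; standardized rate = SMR * overall
    average in-hospital mortality rate across all hospitals."""
    smr = observed_deaths / expected_deaths
    return smr, smr * overall_rate

# Illustrative figures: 30 observed vs 20 expected deaths, 2% overall rate.
smr, std_rate = standardized_mortality(30, 20.0, 0.02)
```

An SMR above 1 indicates more deaths than the case-mix model predicts; the standardized rate simply re-expresses that excess on the scale of the average mortality rate.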
Table 1 shows patient characteristics in the development and validation datasets. Among the 939,409 patients (male, 53.0%; age under 50 years at admission, 32.6%; age of 90 years or older, 1.6%) in the development dataset, the MDC with the highest proportion was the "digestive system (20.7%)," followed by "skin, ear, eye, pediatric, and newborn (14.7%)," "musculoskeletal, injuries, and others (14.6%)," "respiratory system (10.7%)," "circulatory system (9.8%)," "female, breast (8.8%)," "kidney (7.8%)," "nervous system (6.9%)," "endocrine (3.4%)," and "blood, blood forming organs, and immunological disorders (2.4%)." The majority of patients (69.5%) had a total score of 0 for the Charlson Comorbidity Index, and only 2.5% of patients had a score higher than 6. With regard to admission status, 42.3% had emergency status, 12.5% used an ambulance, 6.7% stayed in the hospital for examination, and 4.7% planned short-term admissions. For cancer performance status, almost all patients were grade 0, grade 1, or missing (98.1%), while only 1.8% of patients were grade 2 or higher. For the Fletcher-Hugh-Jones classification, almost all patients were class 1, class 2, or missing (97.1%), while only 2.9% of patients were class 3 or higher.
Table 2 shows the in-hospital mortality prediction model applied to the development dataset. Using the "musculoskeletal, injuries, and others" MDC as a reference, MDCs for "endocrine" and "skin, ear, eye, pediatric, and newborn," showed a significantly lower odds ratio for in-hospital deaths compared to other MDCs. Older age, male gender, use of ambulance at admission, and emergency admission status showed a significantly higher odds ratio. Hospitalization for examination and planned short-term admission showed a significantly lower odds ratio. As scores increased for the Charlson Comorbidity Index, performance status, and Fletcher-Hugh-Jones classification, the odds ratio exhibited a linearly increasing trend.
The risk prediction model exhibited a c-index of 0.882 for both development and validation datasets. Predicted and observed deaths in the validation dataset are shown in Figure 1 by risk decile. Expected mortality was lower than observed mortality in higher deciles, whereas the reverse was observed in lower deciles.
Table 3 summarizes major characteristics of the 468 hospitals (mean ± standard deviation of the c-index across hospitals, 0.88 ± 0.04). Among these hospitals, 446 were allocated to the higher c-index group (average c-index, 0.882; 95% CI, 0.878-0.885) and 22 to the lower c-index group (average c-index, 0.772; 95% CI, 0.757-0.786). The higher c-index group had a significantly higher number of admissions, hospital mortality rate, and standardized mortality rate. Hospitals in the lower c-index group were significantly more likely to be specialized hospitals, hospitals with convalescent wards, and private hospitals. Figure 2 plots expected and observed mortality by higher and lower c-index groups (expected mortality rates represent average predicted risk in each hospital). The lower c-index group tended to be positioned off-diagonal in the plot, but no systematic trend of overestimation or underestimation was found between the two groups. Expected mortality in each hospital was highly correlated with observed mortality (total, r = 0.693, p < 0.001). The correlation between expected and observed mortality in the higher c-index group (r = 0.702, p < 0.001) was higher than that in the lower c-index group (r = 0.663, p < 0.01). The average observed-to-expected mortality (OE) ratio by risk decile for hospitals is shown in Table 4. A comparison of the standardized and raw mortality rate quartiles is displayed in Table 5. After risk adjustment, 62% of hospitals (n = 290) were categorized in a different quartile.
In this study, we developed a modified case-mix-based risk adjustment model for in-hospital mortality using administrative data, and tested its performance in various types of hospitals. The model demonstrated excellent discrimination as indicated by the high average c-index, and was applicable to the majority of hospitals in our sample set taken from a large hospital discharge database. However, our finding that a few hospitals had a lower c-index warrants further discussion.
The hospitals with a lower c-index were characterized by a case-mix predominantly involving circulatory and nervous system disorders, and older patients with higher mortality. These characteristics indicate that hospitals with a lower c-index were those that provided a combination of acute and long-term care. As is often reported, Japanese hospitals, especially small and middle-sized private hospitals, are not well differentiated with respect to the provision of acute and long-term care. The hospitals with a lower c-index provided both acute and long-term care, specifically to stroke patients. Although the Japanese patient classification system includes the majority of acute-care hospitals, and our dataset should cover a large share of these hospitals, the recent expansion of the system to include a wider range of hospitals has led to increased heterogeneity in the functions of participating hospitals. Our results may suggest that the proposed risk prediction model does not apply as well to mixed-care hospitals, and should be selectively applied to general hospitals that provide acute care.
Our model demonstrated excellent discrimination without the need for detailed clinical data. As discussed in a previous study, our model's high predictive precision was made possible by including patient demographics and admission status, combined with MDCs and the Charlson Comorbidity Index. All variables are easily accessible from administrative data properly coded with internationally standardized disease codes such as ICD-10, which allows for excellent model performance. Our model framework may be applicable and useful in other countries as well.
Public disclosure of hospital performance (e.g., hospital-standardized mortality rate) is considered to support informed choice by consumers/patients, provide a benchmark for hospital management, and enhance the efficiency of the health care system by promoting competition on quality. Proper risk adjustment is therefore crucial for providing unbiased information on the quality of hospital performance. As we have demonstrated, risk adjustment had a marked impact on hospital ranking, since a large share of hospitals shifted to a different quartile of hospital mortality rate after adjustment. These results suggest that our model can be used for benchmarking hospital-standardized mortality rate with fair risk adjustment among acute-care hospitals.
A potential limitation of our study worth noting is the quality of diagnosis coding in the database. We relied on the original data submitted by participating hospitals because the same information is used in actual billing statements for claim reimbursement. Our preliminary analysis did not identify serious flaws in the quality of ICD-10 codes, although coding quality and how it affects the precision of risk prediction may be an important issue to address in future studies. Regional applicability, however, may be a greater concern for the risk adjustment framework. A recent international comparative study demonstrated that while cross-national application of a formula can achieve high predictive accuracy, the level of accuracy varied across countries [8, 21]. This may be partly because disease distribution and burden differ between countries with different health care systems. Thus, it may be preferable for investigators to develop "optimal" indices by estimating model coefficients specific to their own data and conditions.
Physicians and hospitals will strongly oppose public reporting if risk-adjusted outcomes are not reflective of provider-specific performance. Enhanced validity and reliability of standardized mortality rates and other risk-adjusted outcomes may be essential not only for benchmarking, but also for public reporting. Process measures may also be utilized in conjunction with risk-adjusted outcomes for quality improvement, as some research has documented an association between higher adherence to care guidelines and better outcomes of patients who receive that care [23, 24]. However, other research has suggested that hospital performance measures predict only small differences in hospital risk-adjusted mortality rates. Further efforts are needed to develop performance measures that are tightly linked to patient outcomes. We also note that in-hospital mortality reflects just one aspect of hospital performance. To properly reflect patient values, it may be necessary to assess hospital performance using other factors as well, such as potentially avoidable adverse events (e.g., readmissions and complications).
The risk model developed in this study exhibited good predictive accuracy for benchmarking hospital mortality with variables easily accessible from administrative data. The model fits general acute care hospitals better and can be applied selectively to benchmarking such hospitals. However, model fit is less satisfactory for specialized hospitals and those with convalescent wards. Further refinement of the generic prediction model is recommended to obtain indices optimized for region-specific conditions.
Galvin R, Milstein A: Large employers' new strategies in health care. N Engl J Med. 2002, 347: 939-42. 10.1056/NEJMsb012850.
Iezzoni LI: Risk adjustment for measuring health care outcomes. 2003, Chicago, IL: Health Administration Press
Bradley EH, Herrin J, Elbel B, McNamara RL, Magid DJ, Nallamothu BK, Wang Y, Normand SL, Spertus JA, Krumholz HM: Hospital quality for acute myocardial infarction: correlation among process measures and relationship with short-term mortality. JAMA. 2006, 296 (1): 72-78. 10.1001/jama.296.1.72.
Krumholz HM, Wang Y, Mattera JA, Wang Y, Han LF, Ingber MJ, Roman S, Normand SL: An administrative claims model suitable for profiling hospital performance based on 30-day mortality rates among patients with an acute myocardial infarction. Circulation. 2006, 113 (13): 1683-1692. 10.1161/CIRCULATIONAHA.105.611186.
Motomura N, Miyata H, Tsukihara H, Takamoto S: Risk Model of Thoracic Aortic Surgery in 4707 Cases from a Nationwide Single-race Population, via a Web-based Data Entry System: The First Report of 30-day and 30-day Operative Outcome Risk Models for Thoracic Aortic Surgery. Circulation. 2008, 118: S153-9. 10.1161/CIRCULATIONAHA.107.756684.
Miyata H, Motomura N, Ueda U, Matsuda H, Takamoto S: Effect of procedural volume on outcome of CABG surgery in Japan: Implication toward minimal volume standards and public reporting. Journal of Thoracic and Cardiovascular Surgery. 2008, 135: 1306-12. 10.1016/j.jtcvs.2007.10.079.
Rosen AK, Loveland S, Anderson JJ, Rothendler JA, Hankin CS, Rakovski CC, Moskowitz MA, Berlowitz DR: Evaluating diagnosis-based case-mix measures: how well do they apply to the VA population?. Medical Care. 2001, 39 (7): 692-704. 10.1097/00005650-200107000-00006.
Miyata H, Hashimoto H, Horiguchi H, Matsuda S, Motomura N, Takamoto S: Performance of in-hospital mortality prediction models for acute hospitalization: Hospital Standardized Mortality Ratio in Japan. BMC Health Services Research. 2008, 8: 229-10.1186/1472-6963-8-229.
Quan H, Sundararajan V, Halfon P, Fong A, Burnand B, Luthi JC, Saunders LD, Beck CA, Feasby TE, Ghali WA: Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005, 43: 1130-39. 10.1097/01.mlr.0000182534.19832.83.
Jarman B, Gault S, Alves B, Hider A, Dolan S, Cook A, Hurwitz B, Iezzoni L: Explaining differences in English hospital death rates using routinely collected data. BMJ. 1999, 318: 1515-20.
Lakhani A, Coles J, Eayres D, Spence C, Rachet B: Creative use of existing clinical and health outcomes data to assess NHS performance in England: Part 1 - performance indicators closely linked to clinical care. BMJ. 2005, 330: 1426-1431. 10.1136/bmj.330.7505.1426.
Aylin P, Bottle A, Majeed A: Use of administrative data or clinical databases as predictors of risk of death in hospital: comparison of models. BMJ. 2007, 334: 1044-10.1136/bmj.39168.496366.55.
Matsuda S, Fushimi K, Hashimoto H, Kuwabara K, Imanaka Y, Horiguchi H, Ishikawa KB, Anan M, Ueda K: The Japanese Case-Mix Project: Diagnosis Procedure Combination (DPC). Proceedings of the 19th International Case Mix Conference PCS/E; 8-11 October 2003; Washington, DC. 2003, 121-124.
Okamura S, Kobayashi R, Sakamaki T: Case-mix payment in Japanese medical care. Health Policy. 2005, 74: 282-286. 10.1016/j.healthpol.2005.01.009.
Fushimi K, Hashimoto H, Imanaka Y, Kuwabara K, Horiguchi H, Ishikawa KB, Matsuda S: Functional mapping of hospitals by diagnosis-dominant case-mix analysis. BMC Health Serv Res. 2007, 7: 50-10.1186/1472-6963-7-50.
Oken MM, Creech RH, Tormey DC, Horton J, Davis TE, McFadden ET, Carbone PP: Toxicity and response criteria of the Eastern Cooperative Oncology Group. Am J Clin Oncol. 1982, 5: 649-55. 10.1097/00000421-198212000-00014.
Hugh-Jones P, Lambert AV: A simple standard exercise test and its use for measuring exertion dyspnoea. BMJ. 1952, 1: 65-71. 10.1136/bmj.1.4749.65.
Ash A, Schwartz M: Evaluating the performance of risk-adjustment methods: dichotomous variables. Risk adjustment for measuring health care outcomes. Edited by: Iezzoni L. 1994, Ann Arbor, MI: Health Administration Press, 313-46.
Hosmer DW, Lemeshow S: Applied Logistic Regression. 2000, New York, NY: Wiley & Sons, 2nd ed.
Ikegami N, Campbell JC: Health care reform in Japan: the virtues of muddling through. Health Aff. 1999, 18 (3): 56-75. 10.1377/hlthaff.18.3.56.
Sundararajan V, Quan H, Halfon P, Fushimi K, Luthi JC, Burnand B, Ghali W: Cross-national comparative performance of three versions of the ICD-10 Charlson Index. Med Care. 2007, 45: 1210-1215. 10.1097/MLR.0b013e3181484347.
Birkmeyer NJO, Birkmeyer JD: Strategies for improving surgical quality: Should payers reward excellence or effort?. N Engl J Med. 2006, 354 (8): 864-870. 10.1056/NEJMsb053364.
Peterson ED, Roe MT, Mulgund J, et al: Association between hospital process performance and outcomes among patients with acute coronary syndromes. JAMA. 2006, 295: 1912-1920. 10.1001/jama.295.16.1912.
Higashi T, Shekelle PG, Adams JL, Kamberg CJ, Roth CP, Solomon DH, Reuben DB, Chiang L, MacLean CH, Chang JT, Young RT, Saliba DM, Wenger NS: Quality of care is associated with survival in vulnerable older patients. Ann Intern Med. 2005, 143: 274-281.
Werner RM, Bradlow ET: Relationship between Medicare's hospital compare performance measures and mortality rates. JAMA. 2006, 296 (22): 2694-2702. 10.1001/jama.296.22.2694.
Halfon P, Eggli Y, Melle GV, Chevalier J, Wasserfallen JB, Burnand B: Measuring potentially avoidable hospital readmissions. Journal of Clinical Epidemiology. 2002, 55: 573-587. 10.1016/S0895-4356(01)00521-2.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6963/10/130/prepub
This study was supported by a grant-in-aid for "Research on Policy Planning and Evaluation" from the Ministry of Health, Labour and Welfare (2008). The authors express thanks to the following researchers in the Study Group on Diagnosis Procedure Combination that made DPC data publicly available: Makoto Anan, Yuichi Imanaka, Koichi B. Ishikawa, Kenji Hayashida, and Kazuaki Kuwabara.
The authors declare that they have no competing interests.
HM conceived the study and designed the protocol. HM and HH1 wrote the manuscript. HH2, KF, and SM managed data collection and processing. All authors have read and approved the final manuscript.