Feasibility of coding-based Charlson comorbidity index for hospitalized patients in China, a representative developing country

Abstract

Background

The Charlson Comorbidity Index (CCI) can be automatically calculated from the International Classification of Disease (ICD) code. However, the feasibility of this transformation has not been acknowledged, particularly in hospitals without a qualified ICD coding system. Here, we investigated the utility of coding-based CCI in China.

Methods

A multi-center, population-based, retrospective observational study was conducted, using a dataset incorporating 2,464,395 adult subjects from 15 hospitals. CCI was calculated using both ICD-10-based and diagnosis-based methods, according to the transformation rule reported previously and to the literal description from the discharge diagnosis, respectively. The κ coefficient was used as a measure of agreement between the two methods for each hospital. The discriminative abilities of the two methods were compared using the receiver operating characteristic (ROC) curve for prediction of in-hospital mortality.

Results

Total agreement between the ICD-based and diagnosis-based CCI for each index ranged from 86.1 to 100%, with κ coefficients from 0.210 [95% confidence interval (CI) 0.208–0.212] to 0.932 (95% CI 0.924–0.940). None of the 19 indices of CCI had a κ coefficient > 0.75 in all the hospitals included for study. The area under the curve of ROC for in-hospital mortality of all 15 hospitals was significantly lower for ICD-based than diagnosis-based CCI [0.735 (0.732, 0.739) vs 0.760 (0.757, 0.764)], indicative of more limited discriminative ability of the ICD-based calculation.

Conclusions

CCI calculated using ICD-10 coding did not agree with diagnosis-based CCI. ICD-based CCI displayed diminished discrimination performance in terms of in-hospital mortality, indicating that this method is not promising for CCI scoring in China under the present circumstances.

Background

The Charlson comorbidity index (CCI) is a scoring system to classify or assign weights to comorbid conditions. The index was initially developed in a small cohort of patients for predicting one-year mortality and tested in another cohort during a 10-year follow-up period [1]. After years of clinical practice, CCI not only facilitated prediction of short- and long-term mortality but could also be utilized to measure disease burden in multiple clinical settings [2, 3].

CCI involves 19 comorbidities, which can be extracted from clinical diagnoses or the corresponding International Classification of Disease (ICD) codes. Compared to the considerable work involved in one-to-one calculations based on diagnosis, CCI can be automatically and quickly calculated using the ICD code [4]. Accordingly, ICD-based CCI is widely used. The most extensively applied version is ICD-10, published in 1993. However, CCI assessed using the ICD-10 code does not completely match that from clinical diagnosis. Accurate reclassification of clinical diagnoses that do not match the ICD-10 code requires the professional clinical knowledge of coders and occasionally clinicians [5]. Due to disagreements between diagnosis and ICD-10 code-based methods, ICD-10 generalization involves long time-periods. National administrative departments in developed countries, such as the Department of Health and Human Services in the United States, are in charge of adaptations of ICD modifications and updates to ensure concordance with diagnosis [6], (https://www.cdc.gov/nchs/data/icd/10cmguidelines_2017_final.pdf). ICD has also been widely applied in developing countries [7,8,9] but its use in these cases is non-standard. China officially started to use ICD-10 in 2002 and attempted to promote a 6-digit extension code of ICD-10 in 2012. As the world’s largest developing country, China should provide valuable information for the effective implementation of ICD. Previous studies in China disclosed a coding error rate of 4.79–73.08% [10]. Considering the overall heterogeneity and relatively poor coding quality in China, the feasibility of coding-based CCI should be investigated.

The main objective of the present study was to ascertain the utility of coding-based CCI through comparison with diagnosis-based CCI.

Methods

Study design and data sources

A multi-center, population-based, retrospective observational study was conducted, using the phase 1 dataset of the China Collaborative Study on Acute Kidney Injury, which contains all the literal discharge diagnoses with the corresponding ICD-10 codes and in-hospital death records. This multicenter retrospective observational study was designed to identify novel risk factors of acute kidney injury. The registration number in clinicaltrials.gov is NCT03061786. The study protocol complied with the Declaration of Helsinki and was approved by the Ethics Research Committees of Guangdong General Hospital (GDREC2016327H).

The phase 1 dataset included 3,616,478 adult (18 years or older) admissions in 15 hospitals from January 2012 to December 2016 across 9 provinces in China (Guangdong, Sichuan, Zhejiang, Anhui, Jilin, Shanghai, Chongqing, Inner Mongolia and Xinjiang). Twelve of these were tertiary hospitals and the remaining three were secondary hospitals (Supplementary Table 1). The hospital names were anonymized in reports owing to privacy considerations. The exclusion criteria were as follows: 1) missing or abnormal data (including age, length of hospital stay or medical cost); 2) younger than 18 years old; 3) repeated hospitalization (Fig. 1).

Fig. 1 Flow chart of the selected study population

CCI calculation

CCI was calculated using both ICD-10-based and diagnosis-based methods. ICD-10-based CCI was assessed according to the transformation rule reported in previous studies (Supplementary Table 2) [11,12,13] while diagnosis-based CCI was calculated based on the literal description from discharge diagnosis, regarded as the “gold standard”. Calculations were independently performed by two trained physicians. In cases where the calculations were inconsistent, final classification was made by the research group.
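The ICD-10-based calculation described above can be sketched as a prefix lookup followed by a weighted sum. This is a minimal illustration, not the study's transformation rule: the four categories, their code prefixes and their weights are an assumed subset chosen for the example (Supplementary Table 2 is not reproduced here), and the names `CHARLSON_RULES`, `charlson_categories` and `charlson_score` are hypothetical.

```python
# Illustrative subset only. The prefixes and weights below are assumptions
# for this sketch and do not reproduce the study's Supplementary Table 2.
CHARLSON_RULES = {
    "myocardial_infarction": (1, ("I21", "I22", "I252")),
    "congestive_heart_failure": (1, ("I50",)),
    "renal_disease": (2, ("N18", "N19")),
    "metastatic_solid_tumor": (6, ("C77", "C78", "C79", "C80")),
}


def normalize(code):
    """Upper-case an ICD-10 code and drop the dot, e.g. 'i25.2' -> 'I252'."""
    return code.strip().upper().replace(".", "")


def charlson_categories(codes):
    """Return the set of categories matched by any discharge code."""
    cleaned = [normalize(c) for c in codes]
    return {
        name
        for name, (_, prefixes) in CHARLSON_RULES.items()
        if any(c.startswith(prefixes) for c in cleaned)
    }


def charlson_score(codes):
    """Sum the category weights; each category counts once per admission."""
    return sum(CHARLSON_RULES[name][0] for name in charlson_categories(codes))
```

For example, `charlson_score(["I21.0", "N18.5", "I21.9"])` counts myocardial infarction once (weight 1) and renal disease once (weight 2). A code that is mistyped, missing, or extended in a hospital-specific way simply fails to match a prefix, which is one way coding quality can lower an ICD-based score relative to a diagnosis-based one.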

Statistical analysis

Data with normal distribution are presented as means ± SD and data with non-normal distribution as median values (25th and 75th percentiles). Differences between two groups were determined using the independent-samples t-test or Mann–Whitney U test, as appropriate. Numerical data were evaluated as proportions. Percentage agreement and the κ statistic were calculated to evaluate the degree of agreement between ICD-based and diagnosis-based CCI. The κ coefficient of variation (SD/mean × 100%) was applied as a measure of variation in agreement among hospitals, with a κ coefficient < 0.75 defined as poor agreement. Discrimination abilities of the methods were compared based on the area under the receiver operating characteristic curve (AUC of ROC) using R software (Version 1.0.153). Other statistical analyses were undertaken using SPSS version 24.0 (IBM, Armonk, NY, USA). Two-tailed P < 0.05 was considered statistically significant.
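The agreement measures named above can be sketched for a single comorbidity as follows. This is a minimal illustration under stated assumptions: each admission carries a 0/1 flag from each method, κ is computed as Cohen's κ for two binary ratings, and the coefficient of variation uses the sample SD (the text does not say whether the sample or population SD was used). The function names are hypothetical.

```python
from statistics import mean, stdev


def agreement_and_kappa(icd_flags, dx_flags):
    """Return (percentage agreement as a fraction, Cohen's kappa) for 0/1 flags."""
    n = len(icd_flags)
    if n == 0 or n != len(dx_flags):
        raise ValueError("flag lists must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(icd_flags, dx_flags)) / n
    p_icd = sum(icd_flags) / n  # share flagged positive by ICD coding
    p_dx = sum(dx_flags) / n    # share flagged positive by diagnosis text
    # Agreement expected by chance if the two methods were independent.
    expected = p_icd * p_dx + (1 - p_icd) * (1 - p_dx)
    if expected == 1.0:
        return observed, float("nan")  # kappa is undefined in this case
    return observed, (observed - expected) / (1 - expected)


def coefficient_of_variation(values):
    """SD / mean x 100%, using the sample SD (an assumption of this sketch)."""
    return stdev(values) / mean(values) * 100


# Toy data: 100 admissions, 10 with the comorbidity by diagnosis,
# of which ICD coding captures only 2 and adds no false positives.
dx = [1] * 10 + [0] * 90
icd = [1] * 2 + [0] * 8 + [0] * 90
observed, kappa = agreement_and_kappa(icd, dx)
```

In this toy example the two methods agree on 92% of admissions, yet κ is only about 0.31, because most of the agreement comes from the many admissions in which both methods record no comorbidity. This is why a high percentage agreement for a rare comorbidity can coexist with a low κ coefficient.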

Results

Clinical characteristics of subjects

A total of 2,464,395 subjects were included. According to diagnosis-based CCI, the median number of comorbidities was 1 (range 0 to 10). The characteristics of the subjects are presented in Table 1.

Table 1 Demographic and clinical characteristics

Comorbidity distributions

According to discharge diagnoses, the comorbidity frequencies of CCI (from high to low) were as follows: cerebrovascular disease, tumor, mild liver disease, diabetes without chronic complication, congestive heart failure, chronic pulmonary disease, peripheral vascular disease, renal disease, metastatic solid tumor, diabetes with chronic complication, myocardial infarction, rheumatologic disease, peptic ulcer disease and hemiplegia (Supplementary Table 3). The other six rare comorbidities with < 1% incidence were lymphoma, moderate or severe liver disease, leukemia, dementia, hemiplegia, and acquired immune deficiency syndrome (AIDS) (Supplementary Table 3).

Disagreement between ICD-based and diagnosis-based CCI

Total agreement between ICD-based and diagnosis-based CCI for each index ranged from 86.1% (κ = 0.210, 95% CI 0.208–0.212) to 100% (κ = 0.932, 95% CI 0.924–0.940) (Table 2). None of the 19 indices had a κ coefficient > 0.75 in all the hospitals examined (Fig. 2). Low κ coefficients (< 0.75) were obtained for peripheral vascular disease in all 15 hospitals, for moderate or severe liver disease in 13 hospitals, and for mild liver disease in 9 hospitals (Fig. 2).

Table 2 Correlation coefficient and κ statistic between ICD-based and diagnosis-based CCI
Fig. 2 Agreement between the ICD-based and diagnosis-based CCI for each index. The Y-axis values denote the κ coefficient, which is used as a measure of agreement variation. The red horizontal line denotes a κ coefficient of 0.75

Discrimination ability of ICD-based and diagnosis-based CCI for in-hospital death

We further compared the discrimination ability of ICD-based and diagnosis-based CCI for in-hospital mortality by calculating the AUC of ROC. AUCs of ICD-based CCI ranged from 0.556 (95% CI 0.516, 0.596) to 0.844 (95% CI 0.819, 0.868) and those of diagnosis-based CCI from 0.585 (95% CI 0.562, 0.608) to 0.849 (95% CI 0.817, 0.865). Across all 15 hospitals combined, the AUC was significantly lower for ICD-based CCI than for diagnosis-based CCI [0.735 (0.732, 0.739) vs 0.760 (0.757, 0.764), P < 0.001] (Fig. 3), as were the AUC values in 10 individual hospitals (Supplementary Table 4). In two hospitals, AUC values for ICD-based CCI were similar to those for diagnosis-based CCI [0.843 (0.819, 0.868) vs 0.849 (0.817, 0.865), P = 0.625; 0.713 (0.700, 0.725) vs 0.718 (0.705, 0.730), P = 0.234]. One of these two hospitals also had the highest AUC with both methods. In three other hospitals, AUCs for ICD-based CCI were higher than those for diagnosis-based CCI [0.739 (0.716, 0.761) vs 0.717 (0.694, 0.740), P = 0.011; 0.603 (0.582, 0.625) vs 0.585 (0.562, 0.608), P = 0.013; 0.670 (0.652, 0.689) vs 0.657 (0.638, 0.675), P < 0.001]. The relatively low AUC values in these three hospitals are indicative of limited value of any type of CCI (Supplementary Table 4).
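The AUC values compared above can be read as the probability that a randomly chosen patient who died in hospital has a higher CCI than a randomly chosen survivor, with ties counted as one half. The sketch below computes that quantity directly by pairwise comparison. It is an illustration only, with a hypothetical function name and toy data; it does not reproduce the R analysis used in the study and does not cover the significance test behind the reported P values.

```python
def auc(scores, died):
    """AUC as P(score of a death > score of a survivor), ties counted as 0.5."""
    deaths = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for x in deaths:
        for y in survivors:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(deaths) * len(survivors))


# Toy data: five admissions, the last two ending in in-hospital death.
died = [0, 0, 0, 1, 1]
dx_cci = [0, 1, 2, 3, 6]   # diagnosis-based scores
icd_cci = [0, 1, 2, 1, 6]  # same patients, one comorbidity missed by coding
auc_dx = auc(dx_cci, died)
auc_icd = auc(icd_cci, died)
```

Here a single missed code lowers the score of one patient who died, so that patient no longer outranks every survivor and the AUC drops. The pairwise loop is quadratic in the number of admissions; for a dataset of the size analysed here a rank-based implementation from a statistics library would be the practical choice.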

Fig. 3 Discriminatory ability of ICD-based and diagnosis-based CCI for in-hospital mortality

Discussion

This hospitalized population-based study revealed significant differences in intra-hospital comorbidity distributions [14]. ICD-based CCI did not match the corresponding diagnosis-based CCI, particularly for peripheral vascular and liver diseases. None of the 19 indices showed satisfactory agreement (κ coefficient > 0.75) in all 15 hospitals examined, reflecting frequent discrepancies, although the κ coefficients were generally higher than those Januel et al. reported in 2003 [15]. Furthermore, ICD-based CCI was associated with a lower AUC of ROC for in-hospital mortality than diagnosis-based CCI, indicative of diminished discrimination performance, consistent with earlier studies [16, 17].

Several factors may contribute to the poor performance of ICD-based CCI, the most important being variable intra-hospital coding quality. Unlike American hospitals, in which a national standard of ICD-Clinical Modification is adopted, Chinese hospitals modify ICD coding at the individual hospital level. Experienced coding personnel are particularly scarce in China and most are not fully trained [10]. Second, diagnoses entered in Chinese do not always map onto ICD codes in a one-to-one manner, leading to inaccurate classification or even a missing ICD code [18]. Third, the quality of ICD coding and recording is not comprehensively evaluated. Thus, in hospitals without a qualified coding system, direct application of ICD-based CCI should be avoided.

In addition to the implication of lower discrimination performance of ICD-based CCI, its convenience merits consideration. Notably, in a few hospitals (for example, hospital No. 15), ICD-based CCI displayed discriminative value for in-hospital mortality comparable to that of diagnosis-based CCI. Based on our results, we recommend that in hospitals with or without a qualified coding system, physicians and researchers should be aware of the limitations of CCI involving indices and acknowledge the potential errors of direct adoption of ICD-based CCI. Further validation of these indices is advocated, and standardization of ICD-10 coding remains an urgent task. In the future, national standards, specialized training and transformation software should be implemented to improve the reliability of ICD-based CCI along with the progress of hospital information management.

The large sample size, with more than 2.4 million patients included, is a major strength of this study. Data were derived from hospital populations and both tertiary and secondary hospitals were included, thus minimizing selection bias. In addition, the hospitals included for study were distributed across various geographical and economic regions in China. Our experiences may therefore be applicable to other developing countries.

Conclusion

In conclusion, ICD-10 coding-based CCI does not concur with diagnosis-based CCI and is therefore not a promising technique for CCI scoring in China under the present circumstances.

Availability of data and materials

The data that support the findings of this study are available from the corresponding authors, but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of the corresponding authors.

Abbreviations

AUC: Area under the curve

CCI: Charlson Comorbidity Index

ICD: International Classification of Disease

ROC: Receiver operating characteristic curve

References

  1. Charlson ME, Pompei P, Ales KL, MacKenzie CR. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40(5):373–83.

  2. Kolhe NV, Muirhead AW, Wilkes SR, Fluck RJ, Taal MW. National trends in acute kidney injury requiring dialysis in England between 1998 and 2013. Kidney Int. 2015;88(5):1161–9.

  3. Soliman IW, Frencken JF, Peelen LM, et al. The predictive value of early acute kidney injury for long-term survival and quality of life of critically ill patients. Crit Care. 2016;20(1):242.

  4. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43(11):1130–9.

  5. Yeoh C, Davies H. Clinical coding: completeness and accuracy when doctors take it on. BMJ. 1993;306(6883):972.

  6. ICD-10-CM Official Guidelines for Coding and Reporting FY 2017. https://www.aapc.com/documents/2017_icd-10-cm_cms_guidlines_new.pdf.

  7. Riegel B, Driscoll A, Suwanno J, et al. Heart failure self-care in developed and developing countries. J Card Fail. 2009;15(6):508–16.

  8. Galicia-Hernández G, Parra-Salcedo F, Ugarte-Martínez P, Contreras-Yáñez I, Ponce-de-León A, Pascual-Ramos V. Sustained moderate-to-high disease activity and higher Charlson score are predictors of incidental serious infection events in RA patients treated with conventional disease-modifying anti-rheumatic drugs: a cohort study in the treat-to-target era. Clin Exp Rheumatol. 2016;34(2):261–9.

  9. Zampieri FG, Bozza FA, Moralez GM, et al. The effects of performance status one week before hospital admission on the outcomes of critically ill patients. Intensive Care Med. 2017;43(1):39–47.

  10. Zhou J, Bai X, Cui S, Pang C, Liu A. A systematic review of the quality of coding for disease classification by ICD-10 in China. Chinese Hospital Management. 2015;35(12):32–5.

  11. Sundararajan V, Henderson T, Perry C, Muggivan A, Quan H, Ghali WA. New ICD-10 version of the Charlson comorbidity index predicted in-hospital mortality. J Clin Epidemiol. 2004;57(12):1288–94.

  12. Bannay A, Chaignot C, Blotière PO, et al. The best use of the Charlson comorbidity index with electronic health care database to predict mortality. Med Care. 2016;54(2):188–94.

  13. Thygesen SK, Christiansen CF, Christensen S, Lash TL, Sørensen HT. The predictive value of ICD-10 diagnostic coding used to assess Charlson comorbidity index conditions in the population-based Danish National Registry of patients. BMC Med Res Methodol. 2011;11:83.

  14. Walker RL, Hennessy DA, Johansen H. Implementation of ICD-10 in Canada: how has it impacted coded hospital discharge data? BMC Health Serv Res. 2012;12:149.

  15. Januel JM, Luthi JC. Improved accuracy of co-morbidity coding over time after the introduction of ICD-10 administrative data. BMC Health Serv Res. 2011;11:194.

  16. Kaspar M, Fette G, Güder G, et al. Underestimated prevalence of heart failure in hospital inpatients: a comparison of ICD codes and discharge letter information. Clin Res Cardiol. 2018;107(9):778–87.

  17. Kuhle S, Kirk SF, Ohinmaa A, Veugelers PJ. Comparison of ICD code-based diagnosis of obesity with measured obesity in children and the implications for health care cost estimates. BMC Med Res Methodol. 2011;11:173.

  18. O'Malley KJ, Cook KF, Price MD, Wildes KR, Hurdle JF, Ashton CM. Measuring diagnoses: ICD code accuracy. Health Serv Res. 2005;40(5 Pt 2):1620–39.

Acknowledgments

Not applicable.

Funding

This study was financially supported by the Guangdong Science and Technology Project (2017A070709008) and the Guangzhou Science and Technology Project (201604020037). The funder of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the Article.

Author information

Contributions

XLL: concept and design of the research; draft the work; critical revision of the work; obtaining funding and supervising the work; final approval of the version to be published; YHC: concept and design of the research; draft the work; critical revision of the work; obtaining funding and supervising the work; final approval of the version to be published; LYM: concept and design of the research; draft the work; acquisition of data; analysis and interpretation of data; critical revision of the manuscript for important intellectual content; ZX: acquisition and analysis of the data; critical revision of the manuscript for important intellectual content; GHL: concept of the work; analysis and acquisition of the data; HQ: acquisition and interpretation of the data; ZMM: acquisition of data and revise the work; YHW: acquisition and interpretation of data; revising the work for important intellectual content; WJW: acquisition of data and draft the work; FD: acquisition and analysis of data and revise the work for important intellectual content; YJL: acquisition and analysis of data; LH: acquisition of data and revise the work; CL: acquisition of the data and draft the work; JS: acquisition of the data and draft the work; LBX: acquisition and analysis of the data and revise the work; YSZ: acquisition, analysis and interpretation of the data; RG: acquisition of the data and revise the work; HWP: analysis and acquisition of the data and draft the work; XHW: acquisition of the data and revise the work; JLX: acquisition of the data and revise the work. All authors have approved the submitted version and have agreed both to be personally accountable for the author’s own contributions and to ensure that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Corresponding authors

Correspondence to Yuanhan Chen or Xinling Liang.

Ethics declarations

Ethics approval and consent to participate

The study protocol complied with the Declaration of Helsinki and was approved by the Ethics Research Committees of Guangdong General Hospital (GDREC2016327H). Owing to the retrospective design, consent to participate is not applicable.

Consent for publication

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

Table S1. Geographical and economic information about the hospitals included for study.

Additional file 2.

Table S2. Previously reported ICD-10 coding for Charlson comorbidity index.

Additional file 3.

Table S3. Prevalence of comorbidities based on ICD-10 Coding Algorithms and diagnosis at different hospital levels.

Additional file 4.

Table S4. AUC for in-hospital mortality using ICD-based and diagnosis-based CCI.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Mo, L., Xie, Z., Liu, G. et al. Feasibility of coding-based Charlson comorbidity index for hospitalized patients in China, a representative developing country. BMC Health Serv Res 20, 432 (2020). https://doi.org/10.1186/s12913-020-05273-8

  • DOI: https://doi.org/10.1186/s12913-020-05273-8
