
Validity of a stroke severity index for administrative claims data research: a retrospective cohort study

Abstract

Background



Ascertaining stroke severity in claims data-based studies is difficult because clinical information is unavailable. We assessed the predictive validity of a claims-based stroke severity index (SSI) and determined whether it improves case-mix adjustment.

Methods


We analyzed patients with acute ischemic stroke (AIS) from hospital-based stroke registries linked with a nationwide claims database. We estimated the SSI according to patient claims data. Actual stroke severity measured with the National Institutes of Health Stroke Scale (NIHSS) and functional outcomes measured with the modified Rankin Scale (mRS) were retrieved from stroke registries. Predictive validity was tested by correlating SSI with mRS. Logistic regression models were used to predict mortality.

Results


The SSI correlated with mRS at 3 months (Spearman rho = 0.578; 95 % confidence interval [CI], 0.556–0.600), 6 months (rho = 0.551; 95 % CI, 0.528–0.574), and 1 year (rho = 0.532; 95 % CI 0.504–0.560). Mortality models with the SSI demonstrated superior discrimination to those without. The AUCs of models including the SSI and models with the NIHSS did not differ significantly.

Conclusions


The SSI correlated with functional outcomes after AIS and improved the case-mix adjustment of mortality models. It can act as a valid proxy for stroke severity in claims data-based studies.

Background



With the advent of information technology, large amounts of health care utilization data are collected routinely for reimbursement purposes and the administration of health services. These administrative claims data offer several potential advantages to researchers, including large sample sizes, representativeness, longitudinal follow-up, comparatively low cost, and relative efficiency [1]. Therefore, in addition to clinical epidemiological research [2], administrative claims data are suitable for examining outcomes, performing pharmacoeconomic analyses, and monitoring the quality of clinical care [3, 4].

However, because of insufficient clinical information, studies using administrative claims data have shared a crucial shortcoming, namely, a lack of adjustment for disease severity. This limitation is particularly relevant for stroke outcomes research because the heterogeneity of stroke syndromes causes stroke severity to vary greatly among patients. For example, studies based on administrative claims data that have investigated mortality and readmission rates in stroke patients have generally been limited by inadequate adjustment for stroke severity [5, 6]. In addition, claims data-based risk models for predicting mortality and readmission might be prone to mischaracterizing the quality of stroke care delivered by hospitals, because stroke severity is not included in the models [7]. Therefore, it has been advocated that more research be done to ascertain stroke severity using administrative billing codes or electronic health records [8].

Administrative claims data reflect routine clinical practice and could be analyzed as a set of proxies that indirectly represent the health status of patients [9]. In response to the pressing need for a better ascertainment of stroke severity using administrative claims data, we previously developed several models to derive a stroke severity index (SSI) that reflects the stroke severity of patients hospitalized for acute ischemic stroke (AIS), using only the information generally available in claims data [10]. Before the SSI can be applied to measure stroke severity in claims-based research, its criterion-related validity should be examined. Criterion-related validity is the extent to which the score from a new measurement instrument correlates with that of other measures evaluating the same or a very similar construct [11]. We have assessed the concurrent validity, a type of criterion-related validity, of the SSI by demonstrating its close correlation with the actual severity of neurological deficit at admission as assessed using the National Institutes of Health Stroke Scale (NIHSS) [10], the current gold standard to measure stroke severity.

In the current study, we aimed to evaluate the predictive validity of the SSI, another type of criterion-related validity, by examining the extent to which the SSI predicts future functional outcomes in a cohort of stroke patients assembled by linking hospital-based registry databases with a nationwide claims database. We also investigated whether the SSI improves case-mix adjustment in claims data-based outcomes research by examining the magnitude of improvement in model performance when the SSI was added to models predicting mortality in patients with AIS.

Methods


Study population

We identified adult patients with AIS in stroke registries from the 1300-bed Chi Mei Medical Center and the regional 600-bed Landseed Hospital in Taiwan. Patients admitted to the two hospitals between August 2006 and December 2010 due to AIS were included. Patients with in-hospital stroke were excluded. The study hospitals prospectively registered all stroke patients admitted within 10 days of symptom onset conforming to the design of the nationwide Taiwan Stroke Registry [12]. Ischemic stroke was defined as an acute onset of neurologic deficits persisting longer than 24 h with no hemorrhage visible in the first brain computed tomography or with acute corresponding ischemic lesion(s) on diffusion weighted magnetic resonance imaging. Stroke severity was determined using the NIHSS at admission. The NIHSS is a 15-item neurologic examination scale designed to assess neurological deficits in stroke patients. The total NIHSS scores range from 0 to 42, with higher values representing greater stroke severity.

The functional status of patients who consented to follow-up was evaluated with the modified Rankin Scale (mRS) at 3 months, 6 months, and 1 year after stroke, during an in-person assessment or by telephone interview. The mRS is a commonly used tool for measuring the degree of disability of stroke patients. It is a seven-point scale with 0 for no symptoms, higher scores for increasing disability, and 6 for death. Although a structured interview format has been developed to improve the assessment of the mRS [13], the mRS was determined based on the Chinese translation of the mRS criteria according to the Taiwan Stroke Registry operation manual [12]. We dichotomized the mRS at ≤ 2 versus > 2 in accordance with previous stroke trials [14]. An mRS score of ≤ 2 indicates a good functional outcome (slight or no disability with preserved ability to look after own affairs without assistance). An mRS score of > 2 indicates a poor functional outcome (dependence in daily activities or death). To ensure patient anonymity, the data collected from the registry databases were limited to gender, birth date, admission date, discharge date, admission NIHSS score, and follow-up mRS scores (Fig. 1). Both the Chi Mei Medical Center Institutional Review Board and the Landseed Hospital Institutional Review Board approved the study protocol. Because the present study involved analysis of secondary data and all patient data were deidentified, signed informed consent to participate in the study was deemed unnecessary.

Fig. 1

Flow diagram of study procedure. Abbreviations: CMMC, Chi Mei Medical Center; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; LH, Landseed Hospital; mRS, modified Rankin scale; NHIRD, National Health Insurance Research Database; NIHSS, National Institutes of Health Stroke Scale

National Health Insurance Research Database linkage

The National Health Insurance (NHI) is a mandatory, single-payer enrollment healthcare program that covers nearly the entire population of Taiwan. The program provides universal coverage for prescription medications, inpatient care, and ambulatory care. The National Health Insurance Research Database (NHIRD) comprises medical care claims data on NHI enrollees and is maintained and released for research by the National Health Research Institutes of Taiwan.

To identify the inpatient population of patients with AIS in the NHIRD, we extracted data on all patients hospitalized between August 2006 and December 2010 for ischemic stroke (International Classification of Diseases, Ninth Revision, Clinical Modification [ICD-9-CM] diagnosis code 433.x or 434.x) as their primary discharge diagnosis [15] from an NHIRD data set. We then linked registry data with the NHIRD according to 4 nonunique patient characteristics: gender, date of birth, admission date, and discharge date (Fig. 1) [15]. Because errors in coding or entering data in administrative claims data might be present, failure of linkage is inevitable [16]. Patients who were successfully linked and had follow-up mRS scores comprised the validation cohort. The linked hospitalization record in the NHIRD was defined as the index hospitalization.
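The linkage step described above can be illustrated with a minimal sketch of exact (deterministic) matching on the four characteristics. The field names, the toy records, and the rule that a registry record with zero or multiple matching claims counts as unlinked are assumptions made for illustration; the source does not describe how ambiguous matches were handled.

```python
from collections import defaultdict

# The four nonunique linkage keys named in the text (field names are hypothetical).
KEYS = ("gender", "birth_date", "admission_date", "discharge_date")


def link_records(registry_rows, claims_rows):
    """Return (linked, unlinked) registry rows using exact matching on KEYS.

    A registry row is linked only when exactly one claims row shares all four
    key values. Treating zero or multiple candidates as a linkage failure is
    an assumption of this sketch.
    """
    index = defaultdict(list)
    for row in claims_rows:
        index[tuple(row[k] for k in KEYS)].append(row)

    linked, unlinked = [], []
    for row in registry_rows:
        candidates = index.get(tuple(row[k] for k in KEYS), [])
        if len(candidates) == 1:
            linked.append({**row, "claim_id": candidates[0]["claim_id"]})
        else:
            unlinked.append(row)
    return linked, unlinked


# Toy records, invented for illustration only.
registry = [
    {"gender": "F", "birth_date": "1940-02-01", "admission_date": "2008-03-10",
     "discharge_date": "2008-03-20", "nihss": 7},
    {"gender": "M", "birth_date": "1951-07-15", "admission_date": "2009-11-02",
     "discharge_date": "2009-11-09", "nihss": 3},
]
claims = [
    {"gender": "F", "birth_date": "1940-02-01", "admission_date": "2008-03-10",
     "discharge_date": "2008-03-20", "claim_id": "A1"},
    # Discharge date is off by one day (e.g., a data-entry error), so no link.
    {"gender": "M", "birth_date": "1951-07-15", "admission_date": "2009-11-02",
     "discharge_date": "2009-11-10", "claim_id": "B2"},
]

linked, unlinked = link_records(registry, claims)
linkage_rate = len(linked) / len(registry)
```

The second toy record shows why some linkage failure is unavoidable under exact matching: a single discrepant date is enough to break the match.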

For each case in the validation cohort, we obtained all the diagnosis codes from the index hospitalization as well as inpatient and outpatient claims within the 1-year look-back period before the index hospitalization using all the available nationwide claims records (Fig. 1). Patients were identified as having a comorbid condition if its corresponding ICD-9-CM codes appeared in at least one inpatient claim or three outpatient claims during the 1-year look-back period [17]. We estimated the modified Charlson comorbidity index (CCI) according to the ICD-9-CM diagnosis codes [18]. In addition, we retrieved reimbursement data for medications, laboratory tests, imaging studies, procedures, and clinical services from the index hospitalization, which were used to estimate the SSI for each patient.
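The comorbidity rule above (at least one inpatient claim or three outpatient claims within the 1-year look-back period) might be sketched as follows. The claim tuple layout, the function name, and the choice to count only claims dated strictly before the index admission are assumptions; the ICD-9-CM prefix 250 (diabetes mellitus) is used purely as an example.

```python
from datetime import date, timedelta


def has_comorbidity(claims, code_prefixes, index_admission, lookback_days=365):
    """Apply the '>= 1 inpatient or >= 3 outpatient claims' rule.

    claims: iterable of (claim_date, setting, icd9_code), where setting is
    'inpatient' or 'outpatient' (a hypothetical layout for this sketch).
    Only claims dated within the look-back window before the index admission
    are counted; excluding the index date itself is an assumption.
    """
    start = index_admission - timedelta(days=lookback_days)
    prefixes = tuple(code_prefixes)
    inpatient = outpatient = 0
    for claim_date, setting, code in claims:
        if not (start <= claim_date < index_admission):
            continue
        if not code.startswith(prefixes):
            continue
        if setting == "inpatient":
            inpatient += 1
        elif setting == "outpatient":
            outpatient += 1
    return inpatient >= 1 or outpatient >= 3


# Toy claims history, invented for illustration only.
index_admission = date(2008, 9, 1)
history = [
    (date(2008, 1, 5), "outpatient", "250.00"),
    (date(2008, 2, 9), "outpatient", "250.00"),
    (date(2008, 6, 1), "outpatient", "250.02"),
    (date(2006, 5, 1), "inpatient", "250.00"),  # outside the look-back window
]

diabetes = has_comorbidity(history, ["250"], index_admission)
```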

Stroke severity index

We have previously developed several SSI models using various regression methods, including k-nearest neighbor regression, regression tree, and multiple linear regression, based on the data from 3577 patients with AIS in a single hospital [10]. This index predicts the neurological deficit severity of patients hospitalized for AIS according to seven predictors (Table 1). These predictors essentially reflect the care given to stroke patients and management of stroke-related complications, which generally correlate with stroke severity [19, 20]. For example, among stroke patients, frequent airway suctioning is generally needed for those with difficulty in handling their own secretions, the placement of a nasogastric tube for feeding is usually required for those with dysphagia, bacterial sensitivity tests are often performed for those with severe stroke and therefore prone to infections, and mannitol osmotherapy is frequently prescribed for those with brain edema [19, 20].

Table 1 Multiple linear regression model for the stroke severity index [10]

An online tool accompanying our previous study is available for estimating the SSI using three kinds of regression methods. In the present study, we used the multiple linear regression model because it is simple to implement and to disseminate to other researchers. We determined the presence of each predictor according to the administrative billing codes (see Additional file 1: Table S1) listed in each patient’s claims from the index hospitalization. If a patient was transferred to another hospital during the index hospitalization, the claims records from the second hospitalization were ignored. The SSI was obtained by using the regression coefficients estimated from a multiple linear regression equation in our previous study (Table 1) [10].

Statistical analysis

Continuous variables were summarized with mean (standard deviation) or median (interquartile range), and categorical variables with frequency and percentages. Two-tailed P values of < 0.05 were considered significant. To summarize the relationships between stroke severity and outcomes after stroke, the NIHSS was categorized as mild (≤5), moderate (> 5 to ≤ 13), or severe (> 13) stroke in accordance with a prior study [21], in which the NIHSS was used to predict stroke outcome. Because the SSI can be converted to the NIHSS as follows: NIHSS = 1.1722 × SSI − 0.7533 (see Additional file 1: Table S2), we categorized patients as having mild (SSI ≤ 5), moderate (SSI > 5 to ≤ 12), or severe (SSI > 12) stroke in accordance with the study mentioned above. Chi-squared tests for trend were used to evaluate trends across the categories of stroke severity.
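As a worked illustration of the conversion and cut-offs above, the following sketch inverts the stated equation to show where the SSI thresholds of 5 and 12 come from and applies the three-level categorization. The function names are hypothetical, and rounding the inverted thresholds to whole numbers is an assumption about how the cut-offs were chosen.

```python
# Coefficients taken from the conversion equation in the text:
# NIHSS = 1.1722 * SSI - 0.7533
SLOPE = 1.1722
INTERCEPT = -0.7533


def ssi_to_nihss(ssi):
    """Convert an SSI value to the NIHSS scale using the stated equation."""
    return SLOPE * ssi + INTERCEPT


def nihss_to_ssi(nihss):
    """Invert the stated equation to express an NIHSS value on the SSI scale."""
    return (nihss - INTERCEPT) / SLOPE


def severity_from_nihss(nihss):
    """Categories from the text: mild (<= 5), moderate (> 5 to <= 13), severe (> 13)."""
    if nihss <= 5:
        return "mild"
    if nihss <= 13:
        return "moderate"
    return "severe"


def severity_from_ssi(ssi):
    """Categories from the text: mild (<= 5), moderate (> 5 to <= 12), severe (> 12)."""
    if ssi <= 5:
        return "mild"
    if ssi <= 12:
        return "moderate"
    return "severe"


# NIHSS cut-offs of 5 and 13 map to roughly 4.9 and 11.7 on the SSI scale,
# which round to the SSI cut-offs of 5 and 12 used in the text.
ssi_cut_mild = nihss_to_ssi(5)
ssi_cut_severe = nihss_to_ssi(13)
```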

To assess the predictive validity of the SSI (the extent to which the SSI predicts future functional outcomes), we examined the Spearman rank correlations between the SSI and the mRS at 3 months, 6 months, and 1 year. The correlations between the NIHSS and the mRS were also estimated. We further tested whether the correlations between the SSI and the mRS and those between the NIHSS and the mRS were equal [22].
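For readers unfamiliar with the statistic, a Spearman rank correlation with tied values (common with the mRS) can be sketched as the Pearson correlation of average ranks. The confidence interval below uses the Fisher z-transformation with a standard error of 1/sqrt(n − 3), which is a common approximation and an assumption here; the source does not state how its intervals were computed. The toy data are invented.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rho computed as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fisher_ci(rho, n, z_crit=1.959964):
    """Approximate 95 % CI via the Fisher z-transformation (an assumption)."""
    z = math.atanh(rho)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Toy SSI and mRS values, invented for illustration only.
ssi = [4.2, 4.2, 5.1, 7.9, 12.3, 15.0, 18.4, 4.2]
mrs = [0, 1, 1, 3, 4, 5, 6, 2]

rho = spearman_rho(ssi, mrs)
ci_low, ci_high = fisher_ci(rho, len(ssi))
```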

To investigate the performance of models in predicting mortality at 3 months, 6 months, and 1 year after stroke, we fitted separate logistic regression models using age, sex, vascular risk factors (hypertension, diabetes mellitus, hyperlipidemia, prior stroke, atrial fibrillation, coronary heart disease, chronic kidney disease), and modified CCI as covariates with or without the SSI. The modified CCI was dichotomized into low comorbidity (0 or 1) and high comorbidity (≥ 2) for analysis [18]. The SSI was entered as a continuous variable. For comparison, additional logistic regression models were fitted using age, sex, vascular risk factors, modified CCI, and the NIHSS as covariates. The NIHSS was also treated as a continuous variable. Model discrimination was assessed using the area under the receiver operating characteristic curve (AUC), and AUCs were compared using the nonparametric method of DeLong et al. [23]. We used integrated discrimination improvement (IDI) to evaluate the improvement of predictive ability after adding either the SSI or the NIHSS to the models [24]. A higher IDI indicates greater risk discrimination and improved classification. We assessed model calibration with the Hosmer–Lemeshow goodness-of-fit test. All statistical analyses were performed using Stata 13.1 (StataCorp, College Station, Texas).
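The two discrimination measures named above can be sketched in a few lines. The AUC is computed as the probability that a patient who died receives a higher predicted risk than one who survived (ties counted as one half), and the IDI as the change in the difference between mean predicted risks of events and non-events. The predicted probabilities below are invented, the function names are hypothetical, and the DeLong significance test for comparing AUCs is not implemented here.

```python
def auc(scores, labels):
    """AUC as P(score of an event > score of a non-event), ties counted as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def discrimination_slope(probs, labels):
    """Mean predicted risk among events minus mean predicted risk among non-events."""
    events = [p for p, y in zip(probs, labels) if y == 1]
    nonevents = [p for p, y in zip(probs, labels) if y == 0]
    return sum(events) / len(events) - sum(nonevents) / len(nonevents)


def idi(probs_old, probs_new, labels):
    """Integrated discrimination improvement: gain in discrimination slope."""
    return discrimination_slope(probs_new, labels) - discrimination_slope(probs_old, labels)


# Hypothetical predicted mortality risks for six patients (1 = died).
died = [1, 1, 0, 0, 0, 0]
p_base = [0.30, 0.10, 0.20, 0.10, 0.05, 0.25]      # model without severity
p_with_ssi = [0.60, 0.35, 0.15, 0.05, 0.05, 0.20]  # model with the SSI added

auc_base = auc(p_base, died)
auc_ssi = auc(p_with_ssi, died)
idi_ssi = idi(p_base, p_with_ssi, died)
```

In this toy example the model with the severity term ranks both deaths above every survivor, so its AUC rises and the IDI is positive, which is the pattern the analysis looks for when a severity measure is added to a base model.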


Results

In total, 4438 (Chi Mei Medical Center) and 1148 (Landseed Hospital) adult patients hospitalized for AIS were identified from the stroke registries of the two study hospitals. After linkage with the NHIRD, 3816 and 962 patients, respectively, were successfully linked (Fig. 1). Overall, the rate of successful linkage was 85.5 %. Compared with the unlinked patients, the NHIRD-linked patients were slightly younger and had somewhat lower NIHSS scores (see Additional file 1: Table S3). Among the 4778 linked patients, follow-up mRS scores were available for 3630 patients at 3 months after stroke, 3545 at 6 months, and 2478 at 1 year. Patients with follow-up mRS scores were marginally older and had slightly lower NIHSS scores than those without (see Additional file 1: Table S4). Table 2 lists the characteristics of the validation cohort. The mortality rates were 4.7 % at 3 months, 7.0 % at 6 months, and 14.2 % at 1 year, which were comparable with those (30-day mortality 4.2 to 4.3 %; 1-year mortality 13.9 %) found in previous nationwide studies [25, 26].

Table 2 Characteristics of the validation cohort

Of the 3630 patients in the validation cohort, 2188 (60.3 %) were categorized as having mild stroke (NIHSS ≤ 5), 863 (23.8 %) as having moderate stroke (NIHSS > 5 to ≤ 13), and 579 (16.0 %) as having severe stroke (NIHSS > 13). According to the SSI, 2404 (66.2 %) were categorized as having mild stroke (SSI ≤ 5), 630 (17.4 %) as having moderate stroke (SSI > 5 to ≤ 12), and 596 (16.4 %) as having severe stroke (SSI > 12). Figure 2 illustrates the proportions of mortality and good functional outcome (mRS ≤ 2) across the NIHSS and SSI categories at 3 months, 6 months, and 1 year after stroke. With rising stroke severity, as assessed using either the NIHSS or the SSI, the risk of mortality increased and the probability of good functional outcome decreased significantly (all were P < 0.001, chi-squared tests for trend).

Fig. 2

Mortality and proportions of good functional outcomes stratified by stroke severity at 3 months (a, b), 6 months (c, d), and 1 year (e, f) after stroke. Abbreviations: mRS, modified Rankin Scale; NIHSS, National Institutes of Health Stroke Scale; SSI, stroke severity index

The SSI correlated significantly with the 3-month mRS (Spearman rho = 0.578; 95 % confidence interval [CI], 0.556–0.600), 6-month mRS (rho = 0.551; 95 % CI, 0.528–0.574), and 1-year mRS (rho = 0.532; 95 % CI 0.504–0.560), indicating a decrease in the magnitude of correlation over time. The admission NIHSS also correlated significantly with the 3-month mRS (rho = 0.674; 95 % CI, 0.656–0.692), 6-month mRS (rho = 0.640; 95 % CI, 0.620–0.659), and 1-year mRS (rho = 0.595; 95 % CI 0.569–0.620). The correlations between the SSI and the mRS were lower than those between the NIHSS and the mRS (P < 0.001 for 3-month mRS, P < 0.001 for 6-month mRS, and P = 0.001 for 1-year mRS). Figure 3 illustrates the distribution of the SSI and the NIHSS across all mRS grades at 3 months, 6 months, and 1 year after stroke. The SSI showed a floor effect with mRS ≤ 2.

Fig. 3

Box plots showing the distribution of the SSI and the NIHSS across all mRS grades at 3 months (a), 6 months (b), and 1 year (c) after stroke. Abbreviations: mRS, modified Rankin Scale; NIHSS, National Institutes of Health Stroke Scale; SSI, stroke severity index

Table 3 reports the performance of the logistic regression models for 3-month, 6-month, and 1-year mortality. Based on the Hosmer-Lemeshow statistics, all models demonstrated adequate model fit. For the base models including age, sex, vascular risk factors, and the modified CCI as covariates, the AUCs for 3-month, 6-month, and 1-year mortality models were 0.786, 0.777, and 0.767, respectively. Including the SSI in the models improved model discrimination substantially (AUCs for 3-month, 6-month, and 1-year mortality models, 0.869, 0.844, and 0.823, respectively; all were P < 0.001 as compared with the base models). Similarly, model discrimination was enhanced by adding the NIHSS to the base models (AUCs for 3-month, 6-month, and 1-year mortality models, 0.860, 0.837, and 0.816, respectively; all were P < 0.001 as compared with the base models). The AUCs of the models including the SSI and models with the NIHSS did not differ significantly (P = 0.347 for 3-month mortality, P = 0.409 for 6-month mortality, and P = 0.353 for 1-year mortality). The IDI also indicated that adding stroke severity, as assessed using either the SSI or the NIHSS, to mortality models significantly improved model discrimination.

Table 3 Performance of mortality models for acute ischemic stroke

Discussion


Our study demonstrated that the claims-based SSI could be a feasible proxy indicator for stroke severity when conducting outcomes research using claims data. The predictive validity of the SSI was shown by its significant correlations with the follow-up mRS up to 1 year. The SSI helped predict long-term stroke outcomes, including mortality and functional outcomes, just as the NIHSS did. Adding the SSI to mortality risk models based on claims data for patients with AIS considerably improved model discrimination and the magnitude of improvement was similar to that when the NIHSS was added to the models.

Stroke severity, as assessed using the NIHSS or other stroke scales, is not only a strong predictor of stroke outcomes such as mortality and readmission [27, 28], but also a major determinant of hospital costs for stroke patients [29]. Without information about stroke severity, the results of stroke outcomes research could be biased. One study found that adding the NIHSS, obtained by linkage with the Get With The Guidelines–Stroke registry, to a claims-based 30-day mortality model used for profiling hospital performance in AIS treatment both enhanced model discrimination and substantially improved hospital performance rankings [30]. Nevertheless, the opportunity to link claims data with a nationwide clinical registry is generally unavailable to researchers, and record linkage may be restricted by privacy concerns and regulatory constraints [31]. In circumstances where information on clinical stroke severity is unavailable, the claims-based SSI could be used for risk adjustment in outcomes research that relies solely on claims data.

Various proxy measures of stroke severity have been used in previous claims data-based studies, including length of stay [32], total medical expenditure during stroke hospitalization [33], stroke-related neurological deficits (e.g., hemiplegia and aphasia) and complications (e.g., pneumonia and decubitus ulcers) [33–35], and procedures (e.g., mechanical ventilation and craniotomy) [34, 35]. However, our previous work found that many of these measures might not be valid indicators of stroke severity [25]. In contrast, the SSI has been externally validated by its strong correlation with clinical stroke severity as measured using the NIHSS [10].

Some may argue that the SSI depends on the management and treatment given to stroke patients during hospitalization, and that its performance might thus be affected by variations in practice among physicians and differences in resources across hospitals. Furthermore, physicians may selectively withhold certain management or treatment from some patients, in particular those with very severe stroke who are expected to die regardless. Despite these concerns, the components of the SSI are widely available in hospitals throughout Taiwan, which may explain its consistent performance across cohorts from 4 hospitals of different sizes and types [10]. Because the SSI is superior to other proxy measures of stroke severity [25], for the time being it is a reasonable substitute for a clinical stroke scale in research with administrative claims data, in which clinical information is unavailable.

In addition to stroke outcomes research [5, 6, 32, 36], administrative claims data have been widely used in other types of stroke studies, including studies of health care utilization [35, 37, 38] and studies investigating risk factors for stroke [39–41]. However, only a few studies have attempted to use information available in claims data as surrogates for stroke severity [32, 34–37]. With its capability to estimate stroke severity based on claims data, the SSI could be applied in these claims data-based stroke studies. In studies that investigate risk factors for stroke, investigators could compare stroke severity between patients with and without the presumed risk factor and thus might refine the inferences drawn from such associations. Furthermore, with its ability to predict long-term stroke outcomes, the SSI could assist surveillance of stroke burden using administrative claims data. For example, investigators could examine the temporal trends and spatial variations in patient stroke severity in addition to stroke incidence. The results of such studies may promote effective planning for health care facilities.

Our study has limitations. First, the study patients were enrolled from only 2 hospitals, one being a medical center and the other a regional hospital, and might not be representative of the general population. In Taiwan, approximately 70 % of stroke patients are admitted to medical centers and regional hospitals, with the remainder admitted to district hospitals [35]. However, the seven SSI components are quite ordinary and widely available at each level of hospital providing acute care for patients with stroke. Thus, the SSI is unlikely to be affected significantly by the size or type of a hospital. Second, the distribution of stroke severity is typically skewed toward mild symptoms [42]. As shown in our study, more than half of the patients had mild stroke. However, unlike the NIHSS, the lowest possible value of the SSI was not zero according to the regression equation (Table 1). Consequently, the SSI has a floor effect and is insensitive when discriminating among patients with mild stroke. Third, the study results may not be generalizable to administrative claims databases other than Taiwan’s NHIRD. The SSI requires administrative billing codes, which are typically collected in healthcare systems operating on a fee-for-service basis, so it may not be applicable in organizational contexts that do not have such accounting schemes. More validation studies are required to test the generalizability of this index to other administrative databases and other healthcare systems. Fourth, the SSI is probably not suitable for case-mix adjustment when comparing hospital performance in the future, because this use might create an incentive for hospitals seeking to improve their ranking to over-report management and treatment. However, the SSI has merit for conducting stroke outcome studies with the existing NHIRD datasets [26].

Conclusions


The claims-based SSI is a valid substitute for the NIHSS for estimating the stroke severity of patients hospitalized for AIS. The SSI correlated with functional outcomes up to 1 year after stroke and it improved case-mix adjustment for mortality to a similar degree as that of the NIHSS. The SSI has the potential to improve stroke research based on administrative claims data.



Abbreviations

AIS: Acute ischemic stroke

AUC: Area under the receiver operating characteristic curve

CCI: Charlson comorbidity index

CI: Confidence interval

ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification

IDI: Integrated discrimination improvement

mRS: Modified Rankin Scale

NHI: National Health Insurance

NHIRD: National Health Insurance Research Database

NIHSS: National Institutes of Health Stroke Scale

SSI: Stroke severity index


  1. Virnig BA, McBean M. Administrative data for public health surveillance and planning. Annu Rev Public Health. 2001;22:213–30.

    Article  CAS  PubMed  Google Scholar 

  2. Schneeweiss S, Avorn J. A review of uses of health care utilization databases for epidemiologic research on therapeutics. J Clin Epidemiol. 2005;58:323–37.

    Article  PubMed  Google Scholar 

  3. Birnbaum HG, Cremieux PY, Greenberg PE, LeLorier J, Ostrander JA, Venditti L. Using healthcare claims data for outcomes research and pharmacoeconomic analyses. Pharmacoeconomics. 1999;16:1–8.

    Article  CAS  PubMed  Google Scholar 

  4. Crystal S, Akincigil A, Bilder S, Walkup JT. Studying prescription drug use and outcomes with medicaid claims data: strengths, limitations, and strategies. Med Care. 2007;45:S58–65.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Lichtman JH, Jones SB, Wang Y, Leifheit-Limson EC, Goldstein LB. Seasonal variation in 30-day mortality after stroke: teaching versus nonteaching hospitals. Stroke. 2013;44:531–3.

    Article  PubMed  PubMed Central  Google Scholar 

  6. Lichtman JH, Leifheit-Limson EC, Jones SB, Wang Y, Goldstein LB. Preventable readmissions within 30 days of ischemic stroke among Medicare beneficiaries. Stroke. 2013;44:3429–35.

    Article  PubMed  PubMed Central  Google Scholar 

  7. Fonarow GC, Alberts MJ, Broderick JP, Jauch EC, Kleindorfer DO, Saver JL, et al. Stroke outcomes measures must be appropriately risk adjusted to ensure quality care of patients: a presidential advisory from the American Heart Association/American Stroke Association. Stroke. 2014;45:1589–601.

    Article  PubMed  Google Scholar 

  8. Katzan IL, Spertus J, Bettger JP, Bravata DM, Reeves MJ, Smith EE, et al. Risk adjustment of ischemic stroke outcomes for comparing hospital performance: a statement for healthcare professionals from the American Heart Association/American Stroke Association. Stroke. 2014;45:918–44.

    Article  PubMed  Google Scholar 

  9. Schneeweiss S, Rassen JA, Glynn RJ, Avorn J, Mogun H, Brookhart MA. High-dimensional propensity score adjustment in studies of treatment effects using health care claims data. Epidemiology. 2009;20:512–22.

    Article  PubMed  PubMed Central  Google Scholar 

  10. Sung S-F, Hsieh C-Y, Kao Yang Y-H, Lin H-J, Chen C-H, Chen Y-W, et al. Developing a stroke severity index based on administrative data was feasible using data mining techniques. J Clin Epidemiol. 2015;68:1292–300.

    Article  PubMed  Google Scholar 

  11. Kimberlin CL, Winterstein AG. Validity and reliability of measurement instruments used in research. Am J Health Syst Pharm. 2008;65:2276–84.

    Article  PubMed  Google Scholar 

  12. Hsieh F-I, Lien L-M, Chen S-T, Bai C-H, Sun M-C, Tseng H-P, et al. Get With the Guidelines-Stroke performance indicators: surveillance of stroke care in the Taiwan Stroke Registry: Get With the Guidelines-Stroke in Taiwan. Circulation. 2010;122:1116–23.

    Article  PubMed  Google Scholar 

  13. Wilson JTL, Hareendran A, Grant M, Baird T, Schulz UGR, Muir KW, et al. Improving the assessment of outcomes in stroke: use of a structured interview to assign grades on the modified Rankin Scale. Stroke. 2002;33:2243–6.

    Article  PubMed  Google Scholar 

  14. Wardlaw JM, Murray V, Berge E, Del Zoppo GJ. Thrombolysis for acute ischaemic stroke. Cochrane Database Syst Rev. 2014;7:CD000213.

    PubMed Central  Google Scholar 

  15. Cheng C-L, Kao Y-HY, Lin S-J, Lee C-H, Lai M-L. Validation of the National Health Insurance Research Database with ischemic stroke cases in Taiwan. Pharmacoepidemiol Drug Saf. 2011;20:236–42.

    Article  PubMed  Google Scholar 

  16. Setoguchi S, Zhu Y, Jalbert JJ, Williams LA, Chen C-Y. Validity of deterministic record linkage using multiple indirect personal identifiers: linking a large registry to claims data. Circ Cardiovasc Qual Outcomes. 2014;7:475–80.

    Article  PubMed  Google Scholar 

  17. Sung S-F, Hsieh C-Y, Lin H-J, Chen Y-W, Yang Y-HK, Li C-Y. Validation of algorithms to identify stroke risk factors in patients with acute ischemic stroke, transient ischemic attack, or intracerebral hemorrhage in an administrative claims database. Int J Cardiol. 2016;215:277–82.

    Article  PubMed  Google Scholar 

  18. Goldstein LB, Samsa GP, Matchar DB, Horner RD. Charlson Index comorbidity adjustment for ischemic stroke outcome studies. Stroke. 2004;35:1941–5.

    Article  PubMed  Google Scholar 

  19. Kumar S, Selim MH, Caplan LR. Medical complications after stroke. Lancet Neurol. 2010;9:105–18.

    Article  PubMed  Google Scholar 

  20. Balami JS, Chen R-L, Grunwald IQ, Buchan AM. Neurological complications of acute ischaemic stroke. Lancet Neurol. 2011;10:357–71.

    Article  PubMed  Google Scholar 

  21. Schlegel D, Kolb SJ, Luciano JM, Tovar JM, Cucchiara BL, Liebeskind DS, et al. Utility of the NIH Stroke Scale as a predictor of hospital disposition. Stroke. 2003;34:134–7.

    Article  PubMed  Google Scholar 

  22. Goldstein R. Testing dependent correlation coefficients. Stata Tech Bull. 1996;32:18.

    Google Scholar 

  23. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837–45.

    Article  CAS  PubMed  Google Scholar 

  24. Pencina MJ, D’Agostino RB, Vasan RS. Evaluating the added predictive ability of a new marker: from area under the ROC curve to reclassification and beyond. Stat Med. 2008;27:157–72.

    Article  PubMed  Google Scholar 

  25. Sung S-F, Chen SC-C, Hsieh C-Y, Li C-Y, Lai EC-C, Hu Y-H. A comparison of stroke severity proxy measures for claims data research: a population-based cohort study. Pharmacoepidemiol Drug Saf. 2016;25:438–43.

    Article  PubMed  Google Scholar 

  26. Hsieh C-Y, Lin H-J, Chen C-H, Li C-Y, Chiu M-J, Sung S-F. “Weekend effect” on stroke mortality revisited: Application of a claims-based stroke severity index in a population-based cohort study. Medicine (Baltimore). 2016;95:e4046.

  27. Rost NS, Bottle A, Lee J-M, Randall M, Middleton S, Shaw L, et al. Stroke severity is a crucial predictor of outcome: An international prospective validation study. J Am Heart Assoc. 2016;5:e002433.

  28. Strowd RE, Wise SM, Umesi UN, Bishop L, Craig J, Lefkowitz D, et al. Predictors of 30-day hospital readmission following ischemic and hemorrhagic stroke. Am J Med Qual. 2015;30:441–6.

  29. Luengo-Fernandez R, Silver LE, Gutnikov SA, Gray AM, Rothwell PM. Hospitalization resource use and costs before and after TIA and stroke: results from a population-based cohort study (OXVASC). Value Health. 2013;16:280–7.

  30. Fonarow GC, Pan W, Saver JL, Smith EE, Reeves MJ, Broderick JP, et al. Comparison of 30-day mortality models for profiling hospital performance in acute ischemic stroke with vs without adjustment for stroke severity. JAMA. 2012;308:257–64.

  31. Durham E, Xue Y, Kantarcioglu M, Malin B. Quantifying the correctness, computational complexity, and security of privacy-preserving string comparators for record linkage. Inf Fusion. 2012;13:245–59.

  32. Kang J-H, Xirasagar S, Lin HC. Lower mortality among stroke patients with schizophrenia: a nationwide population-based study. Psychosom Med. 2011;73:106–11.

  33. Liao CC, Chang PY, Yeh CC, Hu CJ, Wu CH, Chen TL. Outcomes after surgery in patients with previous stroke. Br J Surg. 2014;101:1616–22.

  34. Smith MA, Frytak JR, Liou J-I, Finch MD. Rehospitalization and survival for stroke patients in managed care and traditional Medicare plans. Med Care. 2005;43:902–10.

  35. Lee H-C, Chang K-C, Huang Y-C, Lan C-F, Chen J-J, Wei S-H. Inpatient rehabilitation utilization for acute stroke under a universal health insurance system. Am J Manag Care. 2010;16:e67–74.

  36. Hou W-H, Ni C-H, Li C-Y, Tsai P-S, Lin L-F, Shen H-N. Stroke rehabilitation and risk of mortality: a population-based cohort study stratified by age and gender. J Stroke Cerebrovasc Dis. 2015;24:1414–22.

  37. Chang K-C, Lee H-C, Huang Y-C, Hung J-W, Chiu HE, Chen J-J, et al. Cost-effectiveness analysis of stroke management under a universal health insurance system. J Neurol Sci. 2012;323:205–15.

  38. Bottacchi E, Corso G, Tosi P, Morosini MV, De Filippis G, Santoni L, et al. The cost of first-ever stroke in Valle d’Aosta, Italy: linking clinical registries and administrative data. BMC Health Serv Res. 2012;12:372.

  39. Lee CC, Su YC, Chien SH, Ho HC, Hung SK, Lee MS, et al. Increased stroke risk in Bell’s palsy patients without steroid treatment. Eur J Neurol. 2013;20:616–22.

  40. Mangla A, Navi BB, Layton K, Kamel H. Transient global amnesia and the risk of ischemic stroke. Stroke. 2014;45:389–93.

  41. Wu M-P, Lin H-J, Weng S-F, Ho C-H, Wang J-J, Hsu Y-W. Insomnia subtypes and the subsequent risks of stroke: report from a nationally representative cohort. Stroke. 2014;45:1349–54.

  42. Ali K, Cheek E, Sills S, Crome P, Roffe C. Development of a conversion factor to facilitate comparison of National Institute of Health Stroke Scale scores with Scandinavian Stroke Scale scores. Cerebrovasc Dis. 2007;24:509–15.


Acknowledgements

This study is based in part on data from the National Health Insurance Research Database provided by the Bureau of National Health Insurance, Department of Health, and managed by the National Health Research Institutes. The interpretation and conclusions contained herein do not represent those of the Bureau of National Health Insurance, the Department of Health, or the National Health Research Institutes.

The authors thank Professor Sue-Jane Lin for her critical review of the manuscript.


Funding

This research was supported in part by the Tainan Sin Lau Hospital (grant number SLH-104004), the National Cheng Kung University (grant number NCKUH-10206008), and the Ministry of Science and Technology of Taiwan (grant number MOST 105-2314-B-705-001).

Availability of data and materials

Data and materials are available upon written request to the corresponding author.

Authors’ contributions

SFS and CYH: conception and design, analysis and interpretation of data, and drafting the article. HJL, YWC, CHC, and YHKY: analysis and interpretation of data, and revising the manuscript critically for important intellectual content. YHH: conception and design, analysis and interpretation of data, revising the manuscript critically for important intellectual content, and final approval of the version to be published. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Both the Chi Mei Medical Center Institutional Review Board and the Landseed Hospital Institutional Review Board approved the study protocol and determined that a signed informed consent to participate in the present study was unnecessary because all patient data were deidentified.

Author information

Corresponding author

Correspondence to Ya-Han Hu.

Additional file

Additional file 1: Table S1. Predictors of the stroke severity index and their corresponding billing codes in Taiwan’s National Health Insurance fee schedule. Table S2. Distribution of the NIHSS score across stroke severity groups stratified by the SSI based on unpublished data from the validation cohorts (n = 6617) of our prior study (Sung S-F, et al. J Clin Epidemiol. 2015;68:1292–1300). Table S3. Comparison between patients with and without successful linkage. Table S4. Comparison between patients with and without follow-up mRS. (PDF 93 kb)

Rights and permissions

Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Sung, SF., Hsieh, CY., Lin, HJ. et al. Validity of a stroke severity index for administrative claims data research: a retrospective cohort study. BMC Health Serv Res 16, 509 (2016).
