The Procedural Index for Mortality Risk (PIMR): an index calculated using administrative data to quantify the independent influence of procedures on risk of hospital death

Background Surgeries and other procedures can influence the risk of death in hospital. All published scales that predict post-operative death risk require clinical data and cannot be measured using administrative data alone. This study derived and internally validated an index that can be calculated using administrative data to quantify the independent risk of hospital death after a procedure. Methods For all patients admitted to a single academic centre between 2004 and 2009, we estimated the risk of all-cause death using the Kaiser Permanente Inpatient Risk Adjustment Methodology (KP-IRAM). We determined whether each patient underwent one of 503 commonly performed therapeutic procedures using Canadian Classification of Interventions codes and whether each procedure was emergent or elective. Multivariable logistic regression modeling was used to measure the association of each procedure-urgency combination with death in hospital independent of the KP-IRAM risk of death. The final model was modified into a scoring system to quantify the independent influence each procedure had on the risk of death in hospital. Results 275 460 hospitalizations were included (137 730 derivation, 137 730 validation). In the derivation group, the median expected risk of death was 0.1% (IQR 0.01%-1.4%) with 4013 (2.9%) dying during the hospitalization. 56 distinct procedure-urgency combinations entered our final model, resulting in Procedural Index for Mortality Risk (PIMR) score values ranging from -7 to +11. In the validation group, the PIMR score significantly predicted the risk of death by itself (c-statistic 67.3%, 95% CI 66.6%-68.0%) and when added to the KP-IRAM model (c-statistic improved significantly from 0.929 to 0.938). Conclusions We derived and internally validated an index that uses administrative data to quantify the independent association of a broad range of therapeutic procedures with risk of death in hospital. This scale will improve risk adjustment when administrative data are used for analyses.


Background
Surgeries and procedures are major functions of hospitals that importantly influence patient outcomes and hospital performance. Procedural outcomes are often used to compare surgeons, clinical divisions, hospitals, and health jurisdictions. Many different types of surgeries and procedures exist in different specialties, involving very different patient populations. As a result, the influence of different types of procedures on hospital outcomes can vary greatly.
Quantifying the independent influence of a broad range of different types of procedures on outcomes would allow analysts, administrators, and researchers to measure, compare, and adjust for the importance of each procedure. Six indexes have been developed to quantify the risk of post-operative death after a range of surgeries (Table 1) [1][2][3][4][5][6]. Each of these indexes, however, requires clinical information that is usually unavailable in routinely collected administrative data.
In this study, we derived and internally validated an index to measure the influence of a broad range of surgeries on in-hospital mortality. Our goal was to quantify the independent association of all procedures with the risk of death in hospital. To do this, we first grouped procedures based on administrative codes and the procedure's urgency status and then determined which of these procedure-urgency groups were associated with risk of death in hospital after adjusting for factors that are highly predictive of this outcome. We then created a scoring system to quantify the independent association of significant procedures with risk of death in hospital. This index can be calculated using administrative data and estimates the risk of death in hospital from these procedures that is independent of other factors associated with this outcome. It can be used to help risk-adjust analyses using administrative data that have death in hospital as an outcome. Such analyses could be done to identify factors independently associated with death in hospital and, in some situations, compare quality of care between institutions.

Study Setting
This study took place at The Ottawa Hospital (TOH), a tertiary-care teaching facility with three sites that averaged 20 000 admissions annually during the study period. TOH functions within a publicly funded health care system. TOH is the sole regional provider of trauma care, thoracic surgery, and neurosurgical interventions, and provides most of the region's oncological care.

Patients
We included all admissions to the hospital (including same-day surgeries) between 1 April 2004 and 1 April 2009. "Same-day surgeries" included patients who had their surgery on the same day on which they were admitted to hospital. These patients were typically discharged home the same day but may have been kept in hospital if complications occurred or if additional monitoring was required. We started patient recruitment in April 2004 to ensure that our hospital had at least two years of experience coding procedures with the Canadian Classification of Interventions (CCI) coding system (which was introduced in April 2002). Patient recruitment ended in April 2009 (the last complete year of data available when the analyses were conducted). To apply the Kaiser Permanente In-patient Risk Adjustment Model (KP-IRAM) [7] (the method used to adjust for other risk factors associated with death in hospital), we excluded all patients with age ≤ 15 years at admission, all delivery-related obstetrical admissions, and those who were transferred to or from TOH. Throughout this study, the unit of analysis was the hospitalization.

Candidate Procedures
We used multiple binomial logistic regression to derive our index. We chose death in hospital as the model outcome because it is accurately recorded and is important to all potential users of the index. There were a total of 4013 hospital deaths (2.9% of all admissions) in the derivation cohort. Our logistic model could therefore test a maximum of 400 procedures or surgeries (i.e. 10 deaths per exposure) to safely avoid problems with over-fitting and model instability [8].
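The ceiling on candidate variables follows from simple arithmetic. The sketch below restates the 10-events-per-variable rule of thumb described above; the function name and its default argument are illustrative and are not part of the original analysis.

```python
def max_candidate_predictors(n_events: int, events_per_variable: int = 10) -> int:
    """Upper bound on the number of candidate predictors under an
    events-per-variable rule of thumb (integer division, rounded down)."""
    if n_events < 0 or events_per_variable <= 0:
        raise ValueError("n_events must be >= 0 and events_per_variable > 0")
    return n_events // events_per_variable


# Deaths in the derivation cohort, as reported in the text.
derivation_deaths = 4013
print(max_candidate_predictors(derivation_deaths))  # 401, i.e. roughly 400
```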
We identified candidate procedures using their Canadian Classification of Interventions (CCI) code. The CCI system contains more than 18,000 unique codes. We therefore grouped procedures using the first five alphanumerics of each code (which identifies the anatomical area and the intervention type) and limited our study to therapeutic procedures (i.e. CCI section 1). We used the admission status of the hospitalization (i.e. elective vs. non-elective admission) to classify the procedure urgency since urgency is an important and independent predictor of post-procedural outcomes [9][10][11][12][13][14]. Procedures that could not be performed electively (such as cardiac resuscitation, implantation of an internal device in the thoracic descending aorta, and control of bleeding in the thoracic cavity) were classified as "non-elective" regardless of the admission status of the hospitalization.
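The grouping and urgency rules in this paragraph can be sketched as follows. This is a minimal illustration, not the study's code: it assumes CCI codes arrive as strings with punctuation separators, the example code `1.AA.11.BB-CC` is a made-up placeholder rather than a real CCI code, and the set of never-elective groups is a hypothetical input that would have to be supplied by the analyst.

```python
import re


def procedure_group(cci_code: str) -> str:
    """Return the first five alphanumeric characters of a CCI code,
    ignoring separators such as '.' and '-'."""
    alphanumerics = re.sub(r"[^0-9A-Za-z]", "", cci_code).upper()
    return alphanumerics[:5]


def is_therapeutic(cci_code: str) -> bool:
    """Therapeutic interventions are those in CCI section 1."""
    return procedure_group(cci_code).startswith("1")


def procedure_urgency(group: str, admission_elective: bool,
                      never_elective: frozenset) -> str:
    """Classify urgency from the admission status, except for groups
    that cannot be performed electively."""
    if group in never_elective:
        return "non-elective"
    return "elective" if admission_elective else "non-elective"


# Placeholder code and placeholder never-elective set (both hypothetical).
group = procedure_group("1.AA.11.BB-CC")
print(group)                                                  # 1AA11
print(procedure_urgency(group, True, frozenset()))            # elective
print(procedure_urgency(group, True, frozenset({"1AA11"})))   # non-elective
```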
There were 3984 unique procedure-urgency combinations during the study period. Since this exceeded the maximum number of variables allowed in our model without overfitting (n = 400), we used three filters to exclude procedures. First, we only included procedures that were conducted on the day of the principal procedure (defined as the procedure considered by the health records analyst to be most significant during the patient's hospital stay). In 5% of hospitalizations, coded procedures occurred on more than one day. In such cases, only procedures that occurred on the day of the principal procedure were considered. Second, procedures had to be conducted at least once per month at our hospital during the study period (independent of its urgency status). Finally, the p-value for the association of the procedure with death in hospital (after adjusting for risk of death in-hospital measured with KP-IRAM) had to be less than 0.5.
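The three filters can be expressed compactly. The sketch below is illustrative only: the class and function names are hypothetical, and "at least once per month" is interpreted here as an average over the 60-month study period (a count of at least 60), which is an assumption about how the frequency rule was applied.

```python
from dataclasses import dataclass
from datetime import date

STUDY_MONTHS = 60  # 1 April 2004 to 1 April 2009


@dataclass(frozen=True)
class CodedProcedure:
    group: str          # five-character procedure group
    performed_on: date  # day the procedure was done


def same_day_as_principal(procedures, principal_date):
    """Filter 1: keep only procedures done on the day of the principal procedure."""
    return [p for p in procedures if p.performed_on == principal_date]


def is_frequent(n_performed: int, study_months: int = STUDY_MONTHS) -> bool:
    """Filter 2: performed on average at least once per month (assumed reading)."""
    return n_performed >= study_months


def passes_screen(n_performed: int, adjusted_p_value: float,
                  p_threshold: float = 0.5) -> bool:
    """Filters 2 and 3 combined: frequent enough, and a KP-IRAM-adjusted
    p-value below the screening threshold."""
    return is_frequent(n_performed) and adjusted_p_value < p_threshold
```

For example, `passes_screen(59, 0.01)` is rejected on frequency alone, while `passes_screen(500, 0.5)` is rejected because the p-value is not strictly below 0.5.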

Adjusting for Risk of Death in Hospital
To adjust for risk of death in hospital due to patient and hospitalization factors, we used the Kaiser Permanente In-patient Risk Adjustment Model (KP-IRAM) [7]. This model was derived and internally validated on almost 260,000 hospitalizations at 17 hospitals belonging to the Kaiser Permanente Health Plan and was subsequently validated at our hospital [15]. The KP-IRAM includes six covariates: patient age; patient sex; admission urgency (i.e. elective or emergent) and service (i.e. medical or surgical); admission diagnosis; severity of acute illness as measured by the Laboratory-based Acute Physiology Score (LAPS); and chronic comorbidities measured by the Comorbidity Point Score (COPS). Using the admission diagnosis, hospitalizations were grouped into "Primary Conditions," and a separate logistic regression model was created for each group. Interaction terms between age, LAPS, and comorbidity score were included. The model had excellent discrimination (c-statistic = 0.88) and calibration (p-value of the Hosmer-Lemeshow statistic for the entire cohort was 0.66) for all-cause death in hospital.
We made three minor modifications to the KP-IRAM for this study. First, Canada switched from the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) system (used in the KP-IRAM) to the ICD-10-CA system in 2002. We therefore used tables (provided by the Canadian Institute for Health Information) to translate ICD-9-CM admission diagnoses to ICD-10-CA codes. Second, we measured chronic comorbidities using the Elixhauser Index [16] instead of the COPS because the KP-IRAM performed equally well using either comorbidity index [15]. Finally, the KP-IRAM was calculated on the day of the procedure (rather than at admission) for people who had one of the procedures included in the model. This model was used to estimate each patient's risk of death in hospital at the time of the procedure (expressed as a number that ranged between 0 and 1).

Creation of the Procedural Index for Mortality Risk (PIMR) Score
We randomly separated patients into equally sized derivation and validation groups. Using the derivation group, we ran a binary logistic regression model with death in hospital as the outcome and the KP-IRAM estimated risk as the adjusting covariate. The index day for patients undergoing one of the procedures considered for the model was the day of the procedure. For all other patients, the index day was the day of admission. Values of all covariates for the KP-IRAM model were those on the index day. We used stepwise variable selection to identify which candidate procedure-urgency combinations were independently associated with death in hospital. Surgeries with a 2-sided p-value less than 0.05 were retained in the model.
We then used the methods described by Sullivan et al. [17] to modify the parameter estimates of this regression model into an index. The number of points assigned to each procedure equaled its regression coefficient divided by the coefficient in the model with the smallest absolute value. We rounded this quotient to the nearest whole number. This number translated the parameter estimates into units relative to the procedure with the smallest independently significant association with death in hospital. Therefore, the association of a procedure assigned two points was twice as important for predicting risk of death in hospital as that of a procedure assigned one point. Each person's total Procedural Index for Mortality Risk (PIMR) score was then calculated by summing the points of all significant procedural groups for which they had been coded.
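The conversion from regression coefficients to points, and the summation into a total score, can be sketched as below. The coefficients and procedure labels are invented for illustration and are not values from the fitted model; the tie-breaking rule (halves rounded away from zero) is also an assumption, since the text specifies only rounding to the nearest whole number.

```python
import math


def round_half_away_from_zero(x: float) -> int:
    """Round to the nearest integer, with exact halves rounded away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def points_from_coefficients(coefficients: dict) -> dict:
    """Divide each coefficient by the smallest absolute coefficient and round."""
    base = min(abs(b) for b in coefficients.values())
    if base == 0:
        raise ValueError("coefficients must be non-zero")
    return {name: round_half_away_from_zero(b / base)
            for name, b in coefficients.items()}


def total_score(coded_groups, points: dict) -> int:
    """Sum points over the procedure groups coded on the index day;
    groups that are not in the index contribute nothing."""
    return sum(points.get(g, 0) for g in set(coded_groups))


# Hypothetical log-odds coefficients for four procedure-urgency groups.
example = {"A": 0.30, "B": 0.62, "C": -0.50, "D": 1.50}
points = points_from_coefficients(example)
print(points)                                # {'A': 1, 'B': 2, 'C': -2, 'D': 5}
print(total_score(["B", "D", "Z"], points))  # 7
```

Negative coefficients yield negative points, which is how protective procedures lower the total score.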
When calculating the PIMR score, we tallied up only those procedures that were performed on the index day (i.e. the day on which the principal procedure was conducted). Procedures done on other days did not influence the PIMR score. The PIMR score also did not capture whether or not the procedure was the first procedure conducted during the hospitalization.

Assessment of the PIMR score
In the validation group, we described the distribution of the PIMR score and used logistic regression to measure the association of the PIMR score alone with risk of death in hospital.
We then measured the influence of the PIMR score on risk of death in hospital independent of other factors associated with this outcome. "Discrimination" measures a model's ability to distinguish between patients who did and did not die in hospital and was measured using the c-statistic [18]. "Calibration" measures the accuracy of a model's predicted risk of death and was measured by dividing the study cohort into deciles and strata based on the estimated risk of death. Within each decile and stratum, observed and expected death rates were deemed similar if the 95% confidence interval around the former (calculated using exact methods [19]) included the latter. Overall calibration was summarized using the Hosmer-Lemeshow statistic [20]. Table cells containing fewer than five observations were censored to maintain patient confidentiality.
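To make the two measures concrete, the following sketch computes a c-statistic by direct pairwise comparison and tabulates expected versus observed death rates within risk-ordered groups. It is a teaching illustration rather than the SAS procedure used in the study: the pairwise loop is quadratic and unsuitable for large cohorts, the exact confidence intervals and the Hosmer-Lemeshow statistic are omitted, and the four example patients are invented.

```python
def c_statistic(risks, died):
    """Proportion of (death, survivor) pairs in which the patient who died
    had the higher predicted risk; ties count as one half."""
    events = [r for r, d in zip(risks, died) if d]
    nonevents = [r for r, d in zip(risks, died) if not d]
    if not events or not nonevents:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))


def calibration_table(risks, died, n_groups=10):
    """Return (mean expected risk, observed death rate) for each of
    n_groups equally sized groups ordered by predicted risk."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    rows = []
    for g in range(n_groups):
        idx = order[g * len(order) // n_groups:(g + 1) * len(order) // n_groups]
        if idx:
            expected = sum(risks[i] for i in idx) / len(idx)
            observed = sum(died[i] for i in idx) / len(idx)
            rows.append((expected, observed))
    return rows


# Four invented patients: predicted risk and whether they died (1) or not (0).
risks = [0.10, 0.40, 0.35, 0.80]
died = [0, 0, 1, 1]
print(c_statistic(risks, died))  # 0.75
```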
In the validation group, we then compared the predictive performance of models containing the KP-IRAM with and without the PIMR score. To do this, we used two statistical measures: the Integrated Discrimination Improvement (IDI) [21] and the Net Reclassification Improvement (NRI) [22]. The IDI is the discrimination slope (the mean predicted risk in patients with the event minus that of patients without the event) of a model with the KP-IRAM and PIMR as independent predictors minus the discrimination slope of a model with the KP-IRAM alone as the independent predictor. An IDI above zero indicates improved discrimination (i.e. a larger separation in mean predicted risk between events and nonevents) with the addition of the PIMR. The NRI represents the net proportion of correct reclassification (with correct reclassification defined as the predicted risk moving upwards for events and downwards for non-events) among events and non-events (calculated separately and then summed) when the predicted risk from the model with KP-IRAM and PIMR is compared to that from the model with KP-IRAM alone. We also calculated the net number of correct reclassifications when the PIMR was added to the KP-IRAM. SAS 9.2 (Cary, NC) was used for all analyses. The study was approved by The Ottawa Hospital Research Ethics Board.
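The definitions above translate directly into code. The sketch below implements the IDI as a difference of discrimination slopes and a category-free NRI in which any upward or downward movement in predicted risk counts as reclassification; whether the study used risk categories is not stated here, so the category-free form is an assumption. The predicted risks are invented, with `old` standing for a model with the KP-IRAM alone and `new` for a model with the KP-IRAM and PIMR.

```python
def _mean(values):
    return sum(values) / len(values)


def discrimination_slope(risks, died):
    """Mean predicted risk in events minus mean predicted risk in non-events."""
    events = [r for r, d in zip(risks, died) if d]
    nonevents = [r for r, d in zip(risks, died) if not d]
    return _mean(events) - _mean(nonevents)


def idi(old, new, died):
    """Integrated Discrimination Improvement: new slope minus old slope."""
    return discrimination_slope(new, died) - discrimination_slope(old, died)


def net_reclassification(old, new, died):
    """Return (NRI, net number of correct reclassifications).
    Correct means risk moved up for events or down for non-events."""
    net_events = net_nonevents = n_events = n_nonevents = 0
    for o, n, d in zip(old, new, died):
        if d:
            n_events += 1
            net_events += (n > o) - (n < o)
        else:
            n_nonevents += 1
            net_nonevents += (n < o) - (n > o)
    nri = net_events / n_events + net_nonevents / n_nonevents
    return nri, net_events + net_nonevents


# Invented predictions for two survivors followed by two deaths.
old = [0.10, 0.20, 0.30, 0.40]
new = [0.05, 0.15, 0.50, 0.35]
died = [0, 0, 1, 1]
print(round(idi(old, new, died), 3))          # 0.125
print(net_reclassification(old, new, died))   # (1.0, 2)
```

Because the NRI sums two proportions with different denominators while the net number sums raw counts, the two quantities can have opposite signs when events are rare.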

Results
There were 369 588 admissions to The Ottawa Hospital between 1 April 2004 and 1 April 2009. 93 971 of these hospitalizations were excluded from this study because patients were less than 15 years of age (n = 36 820), patients were transferred from or to another hospital (n = 12 931), or admissions were obstetrical and delivery-related (n = 44 220). We excluded another 157 admissions because they were missing a primary condition group (required to calculate the KP-IRAM). This left a total of 275 460 hospital admissions (137 730 in each of the derivation and validation groups) involving 172 396 unique individuals. A description of patients in the derivation cohort is provided in Table 2. The validation group did not differ significantly from the derivation group (see additional file 1).
In the entire cohort, a total of 1939 therapeutic procedures were coded during the study period. 1436 procedures were excluded because they were performed less than once per month during the study period. The remaining 503 procedures included a total of 938 procedure-urgency combinations. After adjusting for the Kaiser Permanente In-patient Risk Adjustment Model (KP-IRAM) death risk estimate, 726 of these procedure-urgency combinations had a p-value for their association with death in hospital that exceeded 0.5 in the derivation cohort and were therefore excluded. This left a total of 212 procedure-urgency combinations (including 168 individual surgeries) expressed as binomial (i.e. 1/0) variables that were offered to the logistic model (see additional file 2).
After adjusting for important patient and admission factors, 56 procedure-urgency combinations (comprising 52 individual procedures) were independently associated with death in hospital (Table 3). 37 emergent and eight elective procedures were independently associated with an increased risk of death in hospital, while four emergent and seven elective procedures were protective. In the validation set, there were 22 664 (16.4%) admissions where the patient underwent at least one PIMR procedure, with 83% of these procedures occurring within the first three days of the hospitalization. Procedures having the strongest association with death in hospital included cardiac resuscitation, ventriculectomy, pericardial drainage, and pelvic irradiation. A full description of each procedure that was independently associated with death in hospital is given in Additional File 3.
Four procedures were independently associated with risk of death in hospital regardless of whether the procedure was done emergently or electively (Table 4). In two cases, the elective version of the procedure was assigned more points (indicating a higher risk of death in hospital) than the emergent version of the procedure.
Parameter estimates for procedures in the final logistic model were modified into the Procedural Index for Mortality Risk (PIMR) score (Table 3). The PIMR score for individual procedures ranged from -7 to +11. Since 84% of admissions had none of the included procedures, most hospitalizations had a total PIMR score of 0 (Figure 1, left axis). The risk of death in hospital was significantly associated with the PIMR score (Figure 1, right axis). By itself, the PIMR score was moderately discriminative for death in hospital (c-statistic 67.3%, 95% CI 66.6%-68.0%).
The total PIMR score significantly changed the expected risk of death in hospital beyond that estimated by the KP-IRAM (Figure 2). The total PIMR score also significantly improved the ability to predict risk of in-hospital death beyond that generated by the KP-IRAM. Model discrimination improved, as indicated by the c-statistic (increased from 0.929 [95% CI 0.926-0.932] to 0.938 [95% CI 0.935-0.941]) and the Integrated Discrimination Improvement (IDI) (0.04327, 95% CI 0.0384-0.0482; p < .0001). Model calibration (Figure 3) did not change (Hosmer-Lemeshow fit statistic decreased from 37.56 to 36.51). The Net Reclassification Improvement (NRI) analysis showed that although the overall net proportion of correct reclassification was negative (-18.4%), the overall net number of correct reclassifications was positive (+17 923 or 13% of the entire cohort, Table 5).

Discussion
We derived and internally validated an index that used administrative data to quantify the relative contribution of a broad range of therapeutic procedures to the risk of death in hospital. We identified 52 procedures that (after adjusting for a robust and validated hospital mortality model) were significantly associated with the risk of death in hospital. We modified this model into an index that reflects the independent contribution of each procedure to the risk of death in hospital. By itself, and when added to an accurate model to predict hospital mortality, the total Procedural Index for Mortality Risk (PIMR) score significantly predicted risk of death in hospital.
The importance of surgical interventions on hospital outcomes is reflected by the large number of indexes that use patient and hospitalization factors to predict the risk of post-procedural death (Table 1) [1][2][3][4][5][6]. The clinical variables in these indexes, along with their simplicity, increase their face validity to practicing clinicians. However, these clinical variables prohibit calculation of these indexes using administrative data. To develop our index, we started with a validated, highly accurate model to predict hospital mortality risk in all hospital patients. We then determined the risk of death after a broad range of procedures independent of that predicted from the KP-IRAM. Both by itself and when added to the KP-IRAM model, the PIMR was significantly associated with the risk of death in hospital.
The PIMR would primarily be used in analyses involving administrative data. Expressing this risk as a simple score facilitates our understanding of the relative importance of various interventions for death in hospital. When combined with the KP-IRAM, the PIMR had excellent discrimination and calibration for predicting risk of death in hospital. It is notable that the discrimination achieved with the KP-IRAM and PIMR was similar to that achieved using clinically based models (Table 1). The PIMR will allow researchers and administrators to gauge the patient and procedural complexity of individual surgeons, services, or hospitals for descriptive or comparative purposes and will also let analysts adjust for the influence of a large range of therapeutic procedures on risk of hospital mortality.
The independent association between many of the PIMR procedures and risk of hospital death may reflect unresolved confounding of patient or hospitalization factors. The significance of several procedures (e.g. cardiac resuscitation) is likely due to important clinical events (e.g. cardiac arrest) that are identified by the procedure code and are not captured by the KP-IRAM. Further work is required to determine how much mortality risk is due to the procedure and how much is due to other underlying patient factors.
The addition of the PIMR to the KP-IRAM model significantly improved the ability to predict hospital mortality. The absolute increase in the model's c-statistic was small (0.009 or 0.9%). Several studies have shown that the incremental improvement in model performance decreases as more variables are added [23,24]. Moreover, the c-statistic of the KP-IRAM was already very high without the PIMR score (92.9%). With the PIMR, the c-statistic increased by more than 10% of the distance between the KP-IRAM and perfect discrimination. This indicates, along with the results presented in Figure 1, the strength of the PIMR to predict risk of death with or without other covariates associated with death risk in hospital.
Figure 3. Calibration of KP-IRAM and PIMR to predict death in hospital. These figures compare observed and expected death rates when the validation group was divided into expected risk deciles (top) and strata (bottom). The decile plot presents observed mortality rates with 95% confidence intervals, with those in red differing significantly from expected.
We believe that two steps could greatly improve the PIMR. First, the PIMR relies on procedure codes whose accuracy has not been validated. Our study's objective was to derive and validate an index that determines the independent influence of various procedures on hospital mortality. Strictly speaking, however, the PIMR measures the independent influence of codes for various procedures, rather than the procedures themselves, on hospital mortality. Without knowing the accuracy of each code for its respective procedure, we are uncertain how strong a surrogate each code is for the actual procedure. Before one uses the PIMR for individual patient risk prediction, the accuracy of the procedure codes contained in the PIMR should be validated.
The second major limitation of the PIMR is its imputation of procedural urgency using admission urgency status. For most hospitalizations, admission and procedural urgency will be identical, but situations could arise in which they differ. For example, consider a patient admitted electively for a hip replacement who has an acute myocardial infarction requiring an emergent angioplasty. In this case, the angioplasty would be misclassified as an elective procedure. We believe that this bias explains why the number of points assigned to two elective procedures exceeded that for their emergent counterparts (Table 4). The PIMR could be improved by using an accurate classification of procedural urgency.
There are other limitations to the PIMR. First, the PIMR requires that surgical procedures are coded using the Canadian Classification of Interventions (CCI). Without validated translation tables to other procedural coding systems, this limits its use to Canadian institutions. Second, the PIMR was derived and validated in a single hospital. While objective and universal criteria are used to code procedures, it is possible that local coding practices could change the PIMR's validity in other patient populations. Third, most procedures are not included in the PIMR because they were not independently associated with risk of death in hospital. As a result, the PIMR should be used as an adjunct to other factors associated with risk of death in hospital (such as those in the KP-IRAM) to compare outcomes after various surgeries. Researchers should exercise caution if this index is used to infer institutional quality of care from hospital mortality. Some of the components in the PIMR (such as cardiac resuscitation) could result from poor quality of care, and adjusting for them could hide such problems.
Finally, our analyses did not include surgeries that were infrequently conducted at our hospital.