Study participants
The study was carried out at Tan Tock Seng Hospital, a 1,200-bed public sector facility with 390 general medicine beds and the second largest acute care general hospital in Singapore. Subjects aged 65 years and older who were admitted consecutively on weekdays between 6 April and 29 October 2008 to the general medicine department were eligible. There were 2,171 potential subjects. Patients were excluded if they could not be interviewed for reasons including intubation, coma, severe aphasia, or terminal condition (n = 326); if they were discharged, transferred, or died within 48 hours (n = 53); if they declined participation (n = 90); or for other reasons (patient was asleep or not in the ward) (n = 282). Medical notes were not available for a further 18 subjects. The final sample comprised 1,402 participants. Informed consent was obtained from all participants.
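The participant flow can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the counts are taken from the text, while the dictionary keys are labels chosen here for readability.

```python
# Participant flow as reported in the text.
# The key names are illustrative labels, not variable names from the study.
potential_subjects = 2171

exclusions = {
    "could_not_be_interviewed": 326,
    "discharged_transferred_or_died_within_48h": 53,
    "declined_participation": 90,
    "other_reasons": 282,
    "medical_notes_unavailable": 18,
}

# Subtract every exclusion from the pool of potential subjects.
final_sample = potential_subjects - sum(exclusions.values())
print(final_sample)
```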
The study was approved by the Institutional Review Board of the National Healthcare Group, Singapore.
Data collection
Trained research assistants carried out structured interviews with the patients or their proxies within 48 hours of admission. The interview covered socio-demographics, level of social support, and pre-admission and on-admission basic activities of daily living (ADLs). Data on primary diagnosis and hospital mortality were extracted from the hospital’s administrative database.
Another research assistant, blinded to the study aims and the results of the patient interviews, abstracted data on co-morbid conditions and pre-admission and on-admission functional status of patients from the medical records. The nurse-trained abstractor reviewed the medical record, including notes of medical and nursing staff.
Study variables
Based on clinical judgment and on predictors reported in the literature to significantly predict mortality in older patients [1, 16–19], we collected the following variables: demographics (age, sex, ethnic group), marital status, housing type, primary diagnosis, co-morbid disease burden and functional status.
The baseline interview included questions regarding basic ADLs. Patients or their proxies were asked whether they experienced difficulty in performing each of the ADLs (feeding, bathing, grooming, dressing, toileting, transferring and walking) 2 weeks before hospitalization and at the point of admission. All items were rated either yes or no. Patients who used an adaptive device for an ADL function were not considered impaired if assistance from another person was not required.
Pre-admission function is routinely assessed by doctors as part of history taking and documented in the medical records. There are three checkboxes allowing doctors to indicate whether the patient is “independent”, “needs assistance” or “totally dependent” for the different functional aspects. Impairment in feeding, bathing, grooming, dressing, or toileting was captured as a single item, whereas walking was a separate category. Information on transferring was obtained from documentation by doctors or therapists. In contrast, on-admission function was assessed and documented by ward nurses as part of their assessment of patients’ nursing care needs. The checklist used asked whether patients required assistance in the following ADL categories: feeding, dressing or grooming, toileting or bathing, transferring, and walking. All items were rated either yes or no in the medical records.
Because the interview and the medical record used different data collection systems, we compared the two sources on three measures: difficulty in performing at least 1 of the 5 basic care skills (feeding, bathing, grooming, dressing, toileting), difficulty in transferring, and difficulty in walking. We also examined impairment in at least 1 of these 7 ADLs.
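The harmonised measures described above can be derived from the seven yes/no interview items. The following Python sketch is illustrative only and is not the authors' code; the function name and dictionary keys are assumptions introduced here.

```python
# The five basic care skills that are collapsed into a single indicator.
BASIC_CARE_SKILLS = ("feeding", "bathing", "grooming", "dressing", "toileting")


def summarise_adls(difficulty):
    """Collapse seven yes/no ADL items into the comparison measures.

    `difficulty` maps each of the seven ADL names to True (difficulty
    reported) or False. The key names are hypothetical.
    """
    basic_care = any(difficulty[skill] for skill in BASIC_CARE_SKILLS)
    transferring = difficulty["transferring"]
    walking = difficulty["walking"]
    return {
        "basic_care": basic_care,        # at least 1 of the 5 basic care skills
        "transferring": transferring,
        "walking": walking,
        "any_adl": basic_care or transferring or walking,  # at least 1 of 7
    }


# Example: a patient who reports difficulty with bathing only.
example = dict.fromkeys(BASIC_CARE_SKILLS + ("transferring", "walking"), False)
example["bathing"] = True
print(summarise_adls(example))
```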
There were a total of 165 different primary diagnoses in the study sample. Only primary diagnoses that are highly predictive of mortality in older hospitalized patients were coded. This was based on the High-Risk Diagnoses for the Elderly Scale [18], which comprises 22 diagnostic groups. These include bone marrow failure, cancer (metastatic), cancer (solid tumour, localised), cirrhosis/end-stage liver disease, congestive heart failure, decubitus ulcer, delirium, dementia, lymphoma/leukaemia, major depression, malnutrition, renal failure (acute), renal failure (chronic), respiratory failure, chronic obstructive pulmonary disease, diabetes mellitus with end-organ damage, major stroke, multiple trauma, myocardial infarction, pneumonia and severe peripheral vascular disease. In our study, the International Classification of Diseases, Ninth Revision was used to identify and group the diagnoses. Diagnoses with a prevalence of less than 1% in the sample were excluded from further analysis to avoid having too few observations per cell [20].
The Charlson Comorbidity Index (CCI) is a weighted prognostic index based on the number and severity of 19 comorbid conditions [21]. It was originally developed to predict 1-year mortality of internal medicine patients. The medical conditions are weighted from 1 to 6, with total scores ranging from 0 to 37. For this study, the research team computed the score from data abstracted from the medical records, using the documented medical history and the diagnoses listed at admission. We coded the score into four categories: 0, 1, 2 and ≥ 3.
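The scoring and grouping step can be sketched as follows. This Python example is illustrative only and is not the authors' code; the function names are assumptions, and the weights passed in the example describe a hypothetical patient rather than study data.

```python
def cci_score(weights_present):
    """Sum the weights of the comorbid conditions documented for a patient.

    `weights_present` holds one weight (1 to 6) per condition present.
    """
    for weight in weights_present:
        if not 1 <= weight <= 6:
            raise ValueError("condition weights must lie between 1 and 6")
    return sum(weights_present)


def cci_category(score):
    """Map a total score (0 to 37) to the categories 0, 1, 2 and >=3."""
    if not 0 <= score <= 37:
        raise ValueError("total score must lie between 0 and 37")
    return str(score) if score < 3 else ">=3"


# Hypothetical patient with one condition weighted 1 and one weighted 2.
print(cci_category(cci_score([1, 2])))
```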
Statistical analysis
Comparison of data sources
Pre-admission and on-admission functional status obtained from the interview was compared with that extracted from the medical notes. Cohen's kappa coefficient (κ) was computed to measure the level of agreement between the two sources of data. Values of κ above 0.75 indicate excellent agreement beyond chance, values of 0.40 through 0.75 indicate fair to good agreement, and values below 0.40 indicate poor agreement [22]. Using data from the patient or proxy interviews as the reference standard [23], we computed the sensitivity, specificity, and negative and positive agreement values. Because non-documentation cannot be assumed to be equivalent to functional independence [6], we excluded cases with missing documentation when comparing the data sources.
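For a single yes/no measure, these statistics follow from a 2 × 2 table of interview against medical record. The Python sketch below is illustrative only and is not the authors' code; the function names and the cell counts in the example are made up for demonstration.

```python
def agreement_stats(a, b, c, d):
    """Kappa, sensitivity and specificity with the interview as reference.

    a: impaired in both interview and record
    b: impaired in interview, not impaired in record
    c: not impaired in interview, impaired in record
    d: not impaired in either source
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Agreement expected by chance, from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    sensitivity = a / (a + b)   # record detects interview-reported impairment
    specificity = d / (c + d)   # record confirms interview-reported independence
    return kappa, sensitivity, specificity


def interpret_kappa(kappa):
    """Apply the cut-offs stated in the text."""
    if kappa > 0.75:
        return "excellent"
    if kappa >= 0.40:
        return "fair to good"
    return "poor"


# Made-up counts, for illustration only.
kappa, sens, spec = agreement_stats(40, 10, 5, 45)
print(round(kappa, 2), sens, spec, interpret_kappa(kappa))
```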
Development of hospital mortality prediction models
Katz et al. suggested that functional abilities are lost in the reverse of the order in which they were gained during childhood (gained first, feeding and transferring; later, toileting and dressing; last, bathing) [24]. Therefore, in the prediction of mortality, we defined “ADL impairment” as needing physical assistance in transferring rather than as impairment in any one of the seven basic care skills. Given this hierarchy of impairment in basic ADLs, the latter measure would have categorised patients requiring assistance only in bathing as functionally impaired, which may not sufficiently discriminate survivors from those who died.
Bivariate analyses were conducted to examine the relationship between inpatient mortality and each variable including pre- and on-admission functional status drawn from the medical records and from the patient interviews. We used the Chi-square test or the Kruskal-Wallis test to assess differences among or between groups. Variables statistically significant at P < 0.25 were selected for possible inclusion in multivariable logistic regression models to predict hospital mortality. We chose P = 0.25 as the threshold for including variables in the multivariable model because this has been suggested elsewhere as an appropriate threshold [25].
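For a binary predictor, the screening step amounts to a chi-square test on a 2 × 2 table followed by a comparison with the 0.25 threshold. The Python sketch below is illustrative only and is not the authors' code; it uses the Pearson statistic without continuity correction, and the function names and counts are assumptions introduced here.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction).

    a, b: deaths and survivors among patients with the characteristic
    c, d: deaths and survivors among patients without it
    """
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column total must be positive")
    stat = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def is_candidate(p_value, threshold=0.25):
    """Variables with P below the threshold go forward to multivariable modelling."""
    return p_value < threshold


# Made-up counts, for illustration only.
stat, p = chi_square_2x2(30, 70, 20, 80)
print(round(stat, 3), round(p, 3), is_candidate(p))
```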
The following four functional status variables were included in separate multivariable logistic regression models: “pre-admission ADL impairment” based on interview, “pre-admission ADL impairment” based on medical records, “on-admission ADL impairment” based on interview and “on-admission ADL impairment” based on medical records. For missing pre- and on-admission functional status data in the medical records, the median value, adjusted for mortality and diagnosis, was used.
We adopted a model selection method that combines bootstrap resampling with automated variable selection [26]. Because automated variable selection with logistic regression has been found to yield nonreproducible models, this approach allowed us to judge with greater confidence, from the empirical distribution of a variable's selection, whether it is indeed an independent predictor [27]. Our model selection method was based on drawing 1000 bootstrap samples from the original dataset (using the user-written Stata command “swboot”). Within each bootstrap sample, backward elimination was applied with the significance level for removal set at 0.10 (variables with P ≥ 0.10 were removed). For each variable, we determined the proportion of bootstrap samples in which it was identified as an independent predictor of the outcome. Variables that were significant in at least 60% of the bootstrap samples were included in the model [26]. Results were considered significant at P < 0.05.
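The resampling logic can be sketched independently of the regression itself. The Python example below is illustrative only; it is not the “swboot” command or the authors' code. The `select` argument is a hypothetical callable that stands in for backward-elimination logistic regression and returns the names of the variables retained in one sample.

```python
import random
from collections import Counter


def bootstrap_inclusion(rows, select, n_boot=1000, threshold=0.60, seed=0):
    """Tally how often each variable is retained across bootstrap samples.

    rows:   list of patient records
    select: hypothetical callable; takes a resampled list of rows and
            returns the variable names retained by the selection procedure
    Returns (inclusion frequencies, variables meeting the threshold).
    """
    rng = random.Random(seed)
    counts = Counter()
    n = len(rows)
    for _ in range(n_boot):
        # Draw n rows with replacement from the original dataset.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        counts.update(set(select(sample)))
    frequencies = {name: counts[name] / n_boot for name in counts}
    kept = sorted(name for name, f in frequencies.items() if f >= threshold)
    return frequencies, kept
```

In practice, `select` would fit a logistic regression to each resampled dataset and apply backward elimination; that step is left abstract here because it depends on the statistical software used.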
The area under the receiver operating characteristic curve (AUC), also known as the c-statistic, was used to evaluate the discriminatory power of the different predictive models [28]. Paired comparisons of the AUC based on different sources of data were conducted using the Hanley and McNeil method [29]. The Hosmer-Lemeshow test was used to examine the models’ goodness of fit [25]. The models were ranked according to Akaike's information criterion (AIC), with the model having the lowest AIC considered the best. Analyses were performed using Stata version 9.2 (Stata Corp, College Station, Texas).
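Two of these measures are simple to state in code. The Python sketch below is illustrative only and is not the authors' Stata analysis; it computes the c-statistic by pairwise comparison and the AIC from a model's log-likelihood, and it does not cover the Hanley and McNeil comparison or the Hosmer-Lemeshow test. The function names and example values are assumptions introduced here.

```python
def c_statistic(risk_died, risk_survived):
    """Probability that a patient who died was assigned a higher predicted
    risk than a patient who survived; tied pairs count as one half."""
    if not risk_died or not risk_survived:
        raise ValueError("both groups must contain at least one patient")
    concordant = 0.0
    for died in risk_died:
        for survived in risk_survived:
            if died > survived:
                concordant += 1.0
            elif died == survived:
                concordant += 0.5
    return concordant / (len(risk_died) * len(risk_survived))


def aic(log_likelihood, n_parameters):
    """Akaike's information criterion; lower values indicate a better model."""
    return 2 * n_parameters - 2 * log_likelihood


# Made-up predicted risks, for illustration only.
print(c_statistic([0.8, 0.6, 0.4], [0.5, 0.3, 0.2, 0.1]))
print(aic(-100.0, 5))
```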