Setting and participants
The University Hospital of Parma is one of the largest Italian health care facilities, with 1267 beds, over 52000 admissions per year, and full-time core residency training programs in medicine and surgery. The General Medicine and Geriatrics units admit patients mainly from the Emergency Room (90% of cases) and account for 14% of overall admissions; 6% of their patients are transferred to the long-term care units. The hospital works in close cooperation with community health and social services.
The 12 hospital wards with the longest LOS participated in the study; these are all medical wards: eight general medicine, two geriatrics, and two long-term care units. Although long-term care units are expected to care for patients requiring extended hospitalisations, these wards were included in the study because they exhibited longer LOS than similar wards in other institutions of the region. The Directors of the participating wards, or a delegate, acted as reference physicians for the project; the study was presented to them as a quality-improvement project, without emphasis on its aims and research methodology.
Objectives and outcomes
The primary objective was to evaluate the effect of a strategy aimed at reducing unnecessary hospital days over a 12-month period. The main efficacy endpoint was the percentage of patient-days compatible with discharge, measured on an index day. To identify such days we employed the Delay Tool developed by Carey et al. (see Data collection).
Secondary objectives were:
To describe the strategy’s long-term effect, measured in both arms on an index day one year after the end of implementation.
To analyse the strategy’s effect in terms of overall length of stay for subjects included in the investigation. Information on each patient’s length of stay was retrieved from routinely collected administrative data.
To verify whether the strategy’s effect was greater for specific causes of delay or generally distributed, according to the information gathered with the Delay Tool.
To compare readmission and mortality rates in the year of implementation between the two arms. The readmission rate is defined as the number of subjects included in the investigation who experienced an unintended, acute readmission to any ward of any hospital within 30 days of discharge, divided by the total number of patients included in the study. The mortality rate is defined as the number of subjects included in the investigation who died within 30 days of discharge, divided by the total number of patients included in the study. All outcomes pertain to the individual level, with clustering taken into account.
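The two 30-day rate definitions above amount to simple proportions over all included patients. A minimal sketch (the field and function names are ours, and the records are hypothetical):

```python
from dataclasses import dataclass

@dataclass
class Patient:
    readmitted_within_30d: bool  # unintended, acute readmission, any hospital
    died_within_30d: bool

def thirty_day_rates(patients):
    """Readmission and mortality rates as defined in the protocol: events
    within 30 days of discharge, divided by all included patients."""
    n = len(patients)
    readmission_rate = sum(p.readmitted_within_30d for p in patients) / n
    mortality_rate = sum(p.died_within_30d for p in patients) / n
    return readmission_rate, mortality_rate
```

Note that the denominator is all included patients, not only those at risk of the event.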
Study design
This was a cluster, parallel-group, randomised trial: the tested strategy was targeted at the wards (the units of random assignment) and effectiveness was measured at the individual level, with the patient-day as the unit of analysis. The strategy was implemented from February 2008 to February 2009, excluding August 2008. August was excluded because the reduction in staff and bed capacity during the holiday season, with the consequent ward reorganisation (e.g. temporary merging of wards), made this month non-homogeneous with the other periods. One year after the end of implementation (March 2010), the long-term effect was measured in both arms. Data analysis was completed in February 2011.
Randomisation and masking
The 12 wards were assigned by equal randomisation (1:1) to an intervention arm, where the strategy was implemented, or to a control arm, where only assessment was introduced. Randomisation was stratified by ward type, which was possible because each type comprised, coincidentally, an even number of wards.
Centralised randomisation with computer-generated sequences, for both ward allocation and identification of the index days, was performed by a blinded statistician. The sequence was concealed until interventions were assigned. The staff of all participating wards were blinded to the index days for data collection, and staff in the control-arm wards, as well as data analysts, were blinded to the trial’s objectives and interventions.
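The stratified 1:1 allocation described above can be sketched as follows (a hypothetical illustration, not the statistician's actual program; the ward labels are ours, reflecting the eight general medicine, two geriatrics, and two long-term care wards):

```python
import random

# Hypothetical ward labels for the three strata.
wards = {"GM": [f"GM{i}" for i in range(1, 9)],
         "GER": ["GER1", "GER2"],
         "LTC": ["LTC1", "LTC2"]}

def stratified_allocation(wards_by_type, seed=None):
    """1:1 allocation to intervention/control, stratified by ward type:
    within each stratum, half the wards are randomly drawn for intervention.
    Even stratum sizes make the 1:1 ratio exact within every stratum."""
    rng = random.Random(seed)
    allocation = {}
    for ward_type, members in wards_by_type.items():
        shuffled = members[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        for w in shuffled[:half]:
            allocation[w] = "intervention"
        for w in shuffled[half:]:
            allocation[w] = "control"
    return allocation
```

Shuffling within each stratum is what guarantees the balance the text attributes to the even number of wards per type.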
Intervention
The strategy was intended to motivate individual physicians to adopt more efficient practice patterns. It comprised two integrated components:
Distribution of two monthly reports. The first listed the patients who, through data collection performed on the index days with the Delay Tool (see Data collection), were classified as present on the ward although their clinical status was considered compatible with discharge. The second featured individual length-of-stay profiles for each physician operating in the intervention arm (derived from administrative data), allowing them to compare their own performance with that of the rest of the medical staff, similar to the approach described by Lagoe et al.
Audits, performed by professionals of each ward in the intervention arm, designed to discuss cases judged compatible with discharge. The organisation of this work, as well as the identification and implementation of improvement measures, was left to the wards, without any interference from the project team.
The study was reviewed by the Institutional Ethics Committee of the University Hospital of Parma, and conducted in accordance with the protocol. Because this study aimed to test the effect of quality improvement measures, with no direct intervention on the patients’ diagnostic-therapeutic pathway, informed consent was not required.
Data collection and measuring tools
The Delay Tool we used was developed by Carey et al., who conducted an observational study at an American university-affiliated tertiary care hospital with the aim of detecting, quantifying, and characterising delays that unnecessarily prolonged hospitalisation. It comprises two separate parts (see Additional file 1). The first contains questions to determine whether a hospitalised patient’s clinical status was compatible with discharge. If so, the second part requires the identification of the factors that may have contributed to the delay, distinguishing between medical and nonmedical causes. The tool thus allows the identification of patients whose hospital stay was unnecessary, i.e. patients who had no symptoms, signs, or likely diagnoses placing them at high risk for immediate morbidity or mortality, or who had one or more of these risks but no anticipated risk reduction from hospitalisation. The tool also enables the determination of the reasons for failure to discharge, and gathers information on patient age, sex, hospitalisation ward, and date of admission. The Charlson Comorbidity Index was derived from ICD9-CM diagnostic and procedural codes available in administrative datasets. This indicator ensures more precise control for case mix and works as a proxy for severity at the patient level: the higher the value, the greater the severity.
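The two-part structure of the tool can be sketched as a simple record type (field names are ours, not Carey et al.'s; part 2 is filled in only when part 1 judges the stay compatible with discharge):

```python
from dataclasses import dataclass
from typing import Optional

MEDICAL, NONMEDICAL = "medical", "nonmedical"

@dataclass
class DelayToolRecord:
    """Minimal sketch of one patient-day assessed with the Delay Tool."""
    age: int
    sex: str
    ward: str
    compatible_with_discharge: bool    # part 1: clinical status assessment
    delay_cause: Optional[str] = None  # part 2: MEDICAL or NONMEDICAL

def unnecessary_patient_days(records):
    """Patient-days whose stay was judged unnecessary on the index day."""
    return [r for r in records if r.compatible_with_discharge]
```

Selecting the part-1 positives is exactly what produced the monthly lists of patients compatible with discharge distributed to the intervention wards.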
Information was collected for both arms by specifically trained personnel (five senior physicians: one Director, TM, and four experienced collaborators, AN, EP, BP, TS) operating in a ward that did not participate in the study. Training consisted of a half-day seminar, during which TM introduced the project and provided instructions on the use of the tool, followed by a 2-day implementation in the physicians’ own wards and discussion of encountered problems. The analysis included all patients present on the participating wards during one of 12 randomly selected index days (one for each month of data collection). A monthly data collection pattern was considered adequate to ensure that any seasonal variations in organisation, patient flow, etc. were taken into account. Patients admitted or discharged on the index days, and patients with LOS >90 days, were excluded. Gathered information referred to the day preceding the index day and was derived from clinical documentation; healthcare staff were interviewed only if clarifications were needed. During data collection, all five physicians were present in the ward simultaneously, and controversial cases were discussed until agreement was reached.
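The exclusion rules above reduce to a small eligibility check. A sketch under our own naming, using standard-library dates (`discharge=None` stands for a patient still on the ward):

```python
import datetime as dt

def eligible(admission, discharge, index_day, max_los_days=90):
    """Exclusion rules described in the text: patients admitted or discharged
    on the index day, and patients with LOS > 90 days, are excluded."""
    if admission == index_day or discharge == index_day:
        return False
    # LOS up to the index day for patients still hospitalised.
    los = ((discharge or index_day) - admission).days
    return los <= max_los_days
```

Applying this filter to every patient present on an index day yields the patient-days entering the analysis.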
The monthly reports containing physician length-of-stay profiles were compiled using hospital administrative data. They contained, for each physician, the number of patients discharged in the month, along with the relative mean observed and expected LOS. The names of all physicians except the recipient’s were hidden to ensure anonymity (see Additional file 2). The hospital’s database of discharge summaries, used in this study, is periodically validated against clinical records by trained external personnel, in accordance with regulations established by the Region.
Sample size
To define the size and test the feasibility of the investigation, a one-day pilot study was conducted at a nonparticipating centre (the Neurology Unit of the University Hospital of Parma). This preliminary investigation allowed us to estimate the baseline value of the main endpoint, 50% of patient-days judged compatible with discharge, as well as the proportion of ineligible cases, 9.4% (3/32).
As the primary endpoint was measured over a one-year period, and the mean number of patients present in the 12 participating wards was 350 per day, it was estimated that approximately 4000 patient-days would be investigated overall, of which 10% would be ineligible.
We defined an expected difference of 10% (from 50% to 40%), slightly smaller than that reported by Lagoe et al., considering that probable contamination would imply an underestimation of the effect.
The intracluster correlation coefficient (ICC) was not taken into account when calculating the sample size, because at the time of protocol development no study was available that provided an estimate of the variance inflation resulting from clustering. Without the ICC, a sample size of 834 had been estimated, assuming 80% power at a two-sided significance level of 0.05. However, starting from the 4000 patient-days estimated in one year, and accounting for the clustering effect with a mean cluster size of 300, an ICC of 0.02 was obtained and considered plausible [23, 24].
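The variance inflation referred to above is conventionally expressed through the design effect for cluster designs, DE = 1 + (m − 1) × ICC, where m is the mean cluster size. A minimal sketch of that arithmetic (function names are ours):

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation factor for cluster-randomised designs:
    DE = 1 + (m - 1) * ICC."""
    return 1 + (mean_cluster_size - 1) * icc

def effective_sample_size(total_observations, mean_cluster_size, icc):
    """Number of independent observations a clustered sample is worth."""
    return total_observations / design_effect(mean_cluster_size, icc)
```

With the paper's mean cluster size of 300, even a small ICC inflates variance substantially, which is why the plausibility of the 0.02 value mattered for the design.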
Statistical analysis
We summarised the baseline characteristics of the clusters using the mean number of patient-days (± standard deviation), and the baseline characteristics of the subjects using frequencies (with relative percentages) for categorical variables and medians (with interquartile ranges) for continuous variables. We used cluster-specific methods because wards, rather than patient-days, were randomised. To account for the clustering of patient-days from the same ward, we used logistic regression with a generalized linear mixed model (PROC GENMOD, SAS release 8.2; SAS Institute, Cary, NC). The ward was considered the clustering unit for the patient-days observed in it. To control for differences in patient-day and ward characteristics between the intervention and control arms, we modelled variability both between clusters (wards) and within clusters (patients within the same ward), and adjusted for patient and ward covariates. All analyses were intention-to-treat, and effects are presented as odds ratios (or incidence rate ratios) with 95% confidence intervals. To assess the model’s goodness of fit, we used the scaled Pearson chi-square statistic, comparing the deviance with its degrees of freedom: the closer the ratio of the two values is to 1, the better the fit. We assessed multicollinearity by examining tolerance and the variance inflation factor (VIF).
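As an illustration of the multicollinearity check mentioned last: with exactly two covariates, the R² of one regressed on the other equals their squared correlation, so tolerance and VIF reduce to simple functions of r (a sketch only; the study itself ran in SAS):

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def tolerance_and_vif(x, y):
    """For two predictors: tolerance = 1 - r^2, VIF = 1 / tolerance.
    VIF near 1 indicates little collinearity; large values (commonly > 10)
    flag a problem."""
    r2 = pearson_r(x, y) ** 2
    tol = 1 - r2
    return tol, 1 / tol
```

With more covariates, each VIF comes from regressing that covariate on all the others, but the tolerance/VIF relationship is unchanged.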