Study design
This is a retrospective cohort study based on secondary analysis of prospectively collected data in the NHI system and government death registry data. It is part of a research project on PMV care. The National Health Research Institutes (NHRI), the BNHI, and the Department of Health (DOH) approved the research project, and granted our request to construct a database linking NHI data and death-certificate data for patients receiving PMV care in the NHI. All individual identification numbers were scrambled by the BNHI for privacy protection.
Setting
The study was set in the NHI system. The NHI covers almost all Taiwanese, and provides medical services in almost all outpatient visits and hospital admissions [7]. The NHI care program for PMV patients 17 years of age or older covers invasive ventilators, negative pressure ventilators, and positive pressure ventilators, under the condition that at least some use of an invasive ventilator or a negative pressure ventilator is made prior to the first day of using a positive pressure ventilator [8]. Thus, PMV patients in the NHI only include those under MV care. Patients with a tracheostomy tube in situ not connected to a ventilator are not regarded as ventilator-dependent in the NHI.
While the BNHI defines PMV as the use of ventilation care for at least 21 consecutive days, a patient might have used ventilation care for fewer than 21 days at the time of entering the non-ICU care program for PMV in the NHI. This is because, when aggregating a patient's days of use to judge eligibility for the program, the BNHI includes periods of discontinuation of MV care lasting 4 days or fewer [8]. The BNHI sets limits on the lengths of stays in ICUs (an acute stage, < 21 days) and respiratory care centres (a subacute stage for weaning training, up to 42 days), but no limit for stays in respiratory care wards (a chronic stage or long-term care for those with little chance of weaning). Therefore, the NHI in effect sets no limit on how long a patient may use MV care. The PMV program also includes homecare services for patients in a stable condition who are cared for by family members or other caregivers at home. However, extremely few patients use homecare services, because families have to pay much higher out-of-pocket expenses under this care pattern.
Data sources
Two Taiwan governmental organizations provided original data for the study. The BNHI provided NHI data, and the Ministry of the Interior provided data from the national mandatory death registry system through the DOH. The BNHI maintains a comprehensive database of claims and registration data. The NHI database has detailed information on health services, procedures and prescriptions provided in the NHI, and their payments and times of use. It also includes data on diagnoses for patients, as well as background information on patients, physicians and healthcare institutions. The NHI system codes diagnoses using the International Classification of Diseases, Ninth Revision (ICD-9). The quality of NHI data is generally reliable, because the BNHI has been routinely auditing data submitted by healthcare institutions to prevent fraud in the NHI [9]. In Taiwan, it is also widely believed that death-certificate data are highly reliable, as death registry is mandatory in a well-maintained household registration system of Taiwan.
The database we acquired contains person-level longitudinal NHI claims and registration data for a nationally representative sample of patients who ever used invasive or non-invasive respiratory care in the NHI during 1996-2007. This cohort included 2,619,534 patients, representing 10% of all NHI enrolees and 29% of all patients ever using any kind of respiratory care under the NHI in these 12 years. This was the maximum number of enrolees that the government allowed in an application for NHI data use in 2008 (the year of our data application). We also acquired longitudinal NHI registration data for healthcare institutions and physicians.
Establishment of a PMV patient cohort
We used SAS software version 9.1.3 (SAS Institute Inc., Cary, NC) to extract, organize, and link each patient's data on NHI registration and hospital care, and death certificate data. Because the BNHI counts periods of discontinuation of MV care lasting 4 days or fewer when determining PMV status, we combined inpatient data for two admissions requiring MV if the readmission occurred within 4 days after the discharge of the previous admission. We treated two such admissions as the same episode of hospital stay in later analysis.
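As an illustration only, the merging rule can be sketched in Python (the study itself used SAS). The function name, the tuple layout, and the way the gap is measured (admission date minus previous discharge date) are assumptions made for this sketch, not details taken from the project code.

```python
from datetime import date

MAX_GAP_DAYS = 4  # gap threshold stated in the text


def merge_admissions(admissions):
    """Merge one patient's MV admissions into episodes of hospital stay.

    admissions: list of (admit_date, discharge_date) tuples.
    Two admissions are merged when the readmission falls within
    MAX_GAP_DAYS after the previous discharge (assumed gap definition).
    """
    episodes = []
    for admit, discharge in sorted(admissions):
        if episodes and (admit - episodes[-1][1]).days <= MAX_GAP_DAYS:
            start, end = episodes[-1]
            # Extend the current episode to cover the readmission.
            episodes[-1] = (start, max(end, discharge))
        else:
            episodes.append((admit, discharge))
    return episodes


example = [
    (date(2000, 1, 1), date(2000, 1, 10)),
    (date(2000, 1, 13), date(2000, 1, 20)),  # 3 days after discharge: merged
    (date(2000, 2, 1), date(2000, 2, 5)),    # 12 days after discharge: new episode
]
episodes = merge_admissions(example)
```

In this hypothetical example the first two admissions form one episode and the third forms a second episode.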
Although the BNHI has data on the exact dates of days with MV, the database we acquired only includes the amount of MV services for each medical order, measured in days. Thus, we had to determine PMV status by counting the total number of days with MV in a hospital stay. For an episode of hospital stay that included two or more admissions requiring MV, we included the days in the short periods between admissions when calculating the total number of days with MV. A hospital stay that had 21 or more days with MV was defined as one with a PMV incidence. To determine the time of PMV onset, we assumed that MV services were delivered in the middle of a hospital stay with 21 or more days of MV, and took the resulting 21st day of MV use as the time of PMV onset.
Taking into account the requirements of the NHI care program for PMV patients, we added two more criteria for selecting study patients: (1) use of invasive ventilators or negative pressure ventilators at the initiating stage of care, and (2) ≥ 17 years of age on the 21st day of MV. We also confined the patient cohort to those whose PMV began in or after 1998, when use of MV started to receive much attention. Finally, we decided to include in our project database only data for patients whose 1-year survival could be observed. Note that some patients used mechanical ventilators intermittently, and had multiple episodes of PMV. Our study focuses on episodes that were separated by at least 365 days from all prior hospital stays with MV. The rationale is that findings on outcomes among "new" patients with PMV can provide more useful reference information to physicians and policy makers than results on future outcomes among patients who have been using MV services continuously or intermittently for months or even years. We finally identified 50,481 new patients whose PMV began in 1998-2006.
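The onset rule can be illustrated with a short Python sketch. It rests on one reading of the assumption above, namely that the block of MV days is centred within the episode of hospital stay; the function name, the inclusive day count, and the rounding of the offset are all assumptions of this sketch rather than documented details of the study.

```python
from datetime import date, timedelta

PMV_THRESHOLD_DAYS = 21  # PMV definition used in the text


def pmv_onset(episode_start, episode_end, mv_days):
    """Return the assumed date of the 21st MV day, or None if not PMV.

    Assumption: the mv_days of ventilation form one block centred
    within the episode (start and end dates both counted).
    """
    if mv_days < PMV_THRESHOLD_DAYS:
        return None
    stay_days = (episode_end - episode_start).days + 1
    mv_days = min(mv_days, stay_days)  # MV days cannot exceed the stay
    # Non-MV days are split evenly before and after the MV block.
    offset = (stay_days - mv_days) // 2
    return episode_start + timedelta(days=offset + PMV_THRESHOLD_DAYS - 1)


onset = pmv_onset(date(2000, 1, 1), date(2000, 1, 30), mv_days=24)
```

For this hypothetical 30-day stay with 24 MV days, ventilation is assumed to start on 4 January, so the 21st MV day falls on 24 January.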
Validation of data on PMV status and the time of PMV incidence
Because we included the days in short periods between two admissions when calculating the total number of days with MV, some patients in our PMV patient cohort might actually have had orders for fewer than 21 days of MV services. To examine whether this might substantially affect the accuracy of identifying PMV status, we investigated the proportion of patients in the study cohort whose MV services were ordered for fewer than 21 days. The proportion was 0.5%, suggesting that bias due to wrong identification of PMV status was minor. Furthermore, we calculated the ratio of "the total number of days with medical orders for MV services" to "the counted number of days with MV" (including short periods between two admissions requiring MV) for each patient. The average ratio across the 50,481 patients was 0.99. Only 2% of the patients had a ratio smaller than 0.8, while 87% had a ratio equal to 1. These results indicate that inaccuracy in identifying PMV patients and determining the time of PMV onset in this project is minor.
Study participants
We selected new PMV patients in 1998-2003 to assure a complete 4-year follow-up observation for each patient surviving to the end of 2007. The inclusion criteria for these patients were: (1) continuous use of invasive ventilators, negative pressure ventilators, and/or positive pressure ventilators for at least 21 days; (2) use of invasive ventilators or negative pressure ventilators at the initiating stage of care; (3) ≥ 17 years of age on the 21st day of MV; (4) the date of the 21st day of MV falling in 1998-2003; and (5) no use of invasive ventilators, negative pressure ventilators, or positive pressure ventilators for at least one year before the first day of this PMV event. We excluded patients with missing data for gender in descriptive analysis, and further excluded those with missing data for other explanatory variables in modelling survival prediction. The proportion of patients with missing data in the NHI database was very small, so excluding these patients from the study sample should not result in substantial bias.
Variables
For each patient passing away before the end of 2007, we calculated post-PMV survival time as the period between the day of PMV incidence and the death date. For each patient who was alive at the end of 2007, we calculated post-PMV survival time that was censored at the end of 2007, and created a marker variable to denote right-censored data. Outcome variables for modelling survival prediction included binary variables showing 3-month, 6-month, 1-year, 2-year, 3-year and 4-year survival status after the PMV onset, with a value of 1 indicating successful survival and a value of 0 denoting failure. Data on factors associated with survival were from the NHI database. We counted the total number of days alive free of hospital stays with MV in the 4 years following a PMV incidence by subtracting the sum of the lengths of all hospital stays requiring MV in the 4 years following the PMV onset from the length of survival time in the immediate 4-year period after PMV.
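The construction of these variables can be sketched in Python as follows. The horizon lengths in days, the field names, and the convention that a patient "survives" a horizon when the observed survival time reaches it are assumptions of this sketch; the text does not specify them.

```python
from datetime import date

STUDY_END = date(2007, 12, 31)  # end of follow-up stated in the text
# Horizon lengths in days are an assumption (months taken as 30 days).
HORIZONS = {"3m": 90, "6m": 180, "1y": 365, "2y": 730, "3y": 1095, "4y": 1460}
FOUR_YEARS = HORIZONS["4y"]


def survival_record(onset, death_date):
    """Survival time, censoring marker, and binary survival outcomes."""
    censored = death_date is None or death_date > STUDY_END
    end = STUDY_END if censored else death_date
    days = (end - onset).days
    # With onsets in 1998-2003, every censored patient has been
    # followed for at least 4 years, so all horizons are observable.
    status = {k: int(censored or days >= h) for k, h in HORIZONS.items()}
    return {"days": days, "censored": int(censored), "status": status}


def mv_free_days(survival_days, mv_stay_lengths):
    """Days alive and free of MV hospital stays within 4 years of onset.

    mv_stay_lengths: lengths (days) of MV stays inside the 4-year window.
    """
    return min(survival_days, FOUR_YEARS) - sum(mv_stay_lengths)


rec = survival_record(date(2000, 1, 1), date(2000, 6, 1))
free = mv_free_days(rec["days"], [40, 12])
```

In this hypothetical case the patient survives 152 days, is counted as a 3-month survivor but not a 6-month survivor, and has 100 days alive outside MV hospital stays.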
Modelling survival prediction
We estimated 3-month, 6-month, 1-year, and 2-year survival models to determine factors associated with survival and construct prediction functions for survival among PMV patients. For each survival model (3-month, 6-month, 1-year, or 2-year), we established a set of hospital and patient factors that potentially influenced survival. Other factors included diseases diagnosed during the index hospital stay and during hospital stays in the immediate 1-year period before PMV.
Variables of hospital characteristics at the PMV onset included a set of binary variables indicating a hospital's accreditation level and region. The accreditation level of a hospital in Taiwan reflects the hospital's size and clinical capabilities [9]. Because there are 6 regional NHI branch offices, the region of a hospital might capture the effects of a regional office's managerial pattern on hospital behaviours or outcomes. We also included a continuous variable showing the year of PMV incidence. This variable served as a proxy for the effect of generous NHI coverage of PMV care on the trend toward expanded use of MV care to prolong PMV patients' lives after the new care policy was introduced.
Patient characteristics included a patient's gender and age, which are generally associated with disease prognosis. Also included was a set of binary variables indicating the urbanization level of a patient's NHI registration location that was measured according to population density and the local industrial pattern [10]. To estimate the effect of a patient's socioeconomic status, we included a set of binary variables showing the salary tertile.
Variables of diseases included a set of binary variables indicating conditions reported at the PMV onset, and a set of count variables showing the numbers of hospital admissions for treating various conditions in the immediate 1-year period before PMV. To classify diseases, we first used the Clinical Classifications Software developed by the U.S. Agency for Healthcare Research and Quality to categorize all diseases [11], and further reduced the number of disease classes down to 43 after discussion with physicians in related fields (see Additional file 1). While NHI claims data for reimbursement purposes do not provide very detailed information on disease categorization for each admission, the general quality of NHI data on disease diagnosis is acceptable [9]. After discussion with physicians in related fields, we believed that inpatient NHI data on disease diagnosis were adequate for this study, which looks into major and broad disease categories. We excluded variables for "respiratory failure" from the two sets of variables of diseases, because most physicians would include this diagnosis for PMV patients.
Statistical analysis
We used Stata software version 9 (StataCorp, College Station, TX) for both descriptive statistics and multivariable regression analysis. Our descriptive analysis investigated survival up to 4 years, and examined the total length of time free of hospital stays with MV in the immediate 4-year period after PMV. Binary variables indicating successful survival were reported as percentages. The number of days alive free of hospital stays with MV was summarized by selected percentiles (including minimum, median, maximum, and some others), mean and standard deviation.
Our estimation of prediction models focused on survival of 2 years or less, as a primary purpose of the prediction was to identify patients who were expected to die in the near future. Prediction of the likelihood of surviving a specific period of time can potentially yield information for facilitating communication with patients approaching the end of life and their families. We adopted logistic regression to identify factors associated with 3-month, 6-month, 1-year, and 2-year survival, and to generate coefficient estimates for predicting a specific patient's probabilities of surviving different lengths of time. Compared to other contemporary methods of survival analysis, logistic regression is an easily understood method for generating predictions of survival likelihood and for subsequently comparing predicted outcomes with actual outcomes to assess the quality of predictions.
Among explanatory variables for survival prediction, binary variables of hospital and patient characteristics were reported as percentages. Both the binary variables and the count variables indicating morbid conditions were reported as percentages on the basis of the proportion of patients with a non-zero value. To examine collinearity between explanatory variables, we calculated the variance inflation factor (VIF) for each variable, and the mean value of all VIFs. Results of the collinearity diagnostics indicate that inclusion of these variables would not result in problematic collinearity in multivariable regression analysis.
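For readers unfamiliar with the VIF, the following Python sketch shows one standard way to compute it: the VIF of variable j equals the j-th diagonal element of the inverse of the correlation matrix of the explanatory variables. This is an illustration only (the study used Stata); the helper names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vifs(columns):
    """VIF of each variable: diagonal of the inverse correlation matrix."""
    p = len(columns)
    corr = [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(p)] for i in range(p)]
    inv = invert(corr)
    return [inv[j][j] for j in range(p)]


# Toy data: two uncorrelated variables, then two nearly collinear ones.
independent = vifs([[1, -1, 1, -1], [1, 1, -1, -1]])
collinear = vifs([[1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.9, 5.1]])
```

Uncorrelated variables give VIFs of 1, while the nearly collinear pair gives VIFs far above the commonly used rule-of-thumb threshold of 10.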
For comparison purposes, we also used the probit regression method to estimate survival models. To investigate whether morbid conditions before PMV significantly influenced post-PMV survival, we also estimated survival models that excluded from the explanatory factors the count variables showing the numbers of hospital admissions for treating various conditions in the immediate 1-year period before PMV. Additionally, we used a stepwise approach for selecting explanatory variables of diseases. We did not use the stepwise approach to select hospital and patient characteristics, and included all of them in model estimation: because each hospital or patient feature was represented by a set of binary variables, it was not appropriate to select individual variables from within these sets. We presented the results of survival models by reporting adjusted odds ratios (ORs) of explanatory variables and their 95% confidence intervals, using a significance level of 5%. Data of 1998-2002 were used to estimate prediction functions; the 2003 data were employed to investigate model performance.
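The relation between a logistic regression coefficient and the reported OR with its 95% confidence interval can be sketched as below. This assumes the usual Wald interval on the log-odds scale; the function name and the example values are hypothetical and are not results from the study.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(coef, se):
    """Odds ratio and Wald 95% CI from a logit coefficient and its SE."""
    return (
        math.exp(coef),
        math.exp(coef - Z_95 * se),
        math.exp(coef + Z_95 * se),
    )


# Hypothetical coefficient: log(2) with a standard error of 0.1.
or_, lower, upper = odds_ratio_ci(math.log(2), 0.1)
```

An interval that excludes 1 corresponds to statistical significance at the 5% level under this approximation.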
The cutoff value of predicted survival probability for classifying patients
We selected 10% as the cutoff value of predicted probability to identify patients with low survival likelihood and classify patients after prediction. With this cutoff value, patients with a predicted probability of survival < 10% were classified as a group that would die before the end of the observation period, and those with a predicted probability of survival ≥ 10% were classified as a group that would survive the whole observation period. There is no universal standard for choosing a cutoff value for prediction modelling; one criterion for judging the appropriateness of a cutoff value is to weigh the benefits and harms of decisions based on wrongly predicted outcomes [12]. After discussion with some physicians in intensive care, we decided that 10% was an appropriately low survival probability for identifying patients who would be expected to pass away in the near future, while assuring a small chance of wrongly categorizing a patient into the group with predicted death.
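The classification step can be illustrated with a minimal Python sketch: a logistic prediction function turns a patient's covariates into a predicted survival probability, which is then compared with the 10% cutoff. The function names, the intercept, and the coefficient values are hypothetical and serve only as an example.

```python
import math

CUTOFF = 0.10  # cutoff value of predicted survival probability from the text


def predicted_survival(intercept, coefs, x):
    """Predicted survival probability from a fitted logistic model."""
    linear = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-linear))


def classify(prob):
    """1 = predicted to survive the period, 0 = predicted to die."""
    return 1 if prob >= CUTOFF else 0


# Hypothetical model with two covariates and two hypothetical patients.
p_low = predicted_survival(-3.0, [0.5, -1.0], [0.0, 1.0])
p_high = predicted_survival(-3.0, [0.5, -1.0], [4.0, 0.0])
groups = [classify(p_low), classify(p_high)]
```

Here the first patient falls below the cutoff and is placed in the predicted-death group, while the second is placed in the predicted-survival group.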
Using data for PMV patients of 2003, we compared predicted outcomes with actual outcomes, and further calculated four measures for reflecting the quality of predictions in different perspectives. The four measures and their formulae are below:
(1) sensitivity = (the number of patients with a predicted probability of survival ≥ 10% who actually survived the whole observation period)/(the number of patients who actually survived the whole observation period);
(2) specificity = (the number of patients with a predicted probability of survival < 10% who actually died before the end of the observation period)/(the number of patients who actually died before the end of the observation period);
(3) positive predictive value (PPV) = (the number of patients with a predicted probability of survival ≥ 10% who actually survived the whole observation period)/(the number of patients with a predicted probability of survival ≥ 10%);
(4) negative predictive value (NPV) = (the number of patients with a predicted probability of survival < 10% who actually died before the end of the observation period)/(the number of patients with a predicted probability of survival < 10%).
Sensitivity measured the proportion of patients with a correct predicted outcome among those actually surviving the whole observation period. Specificity measured the proportion of patients with a correct predicted outcome among those actually passing away during the observation period. PPV indicated the proportion of patients who did survive the whole observation period among those who were expected to survive. NPV showed the proportion of patients who did die during the observation period among those who were expected to die. When the cutoff value changes, sensitivity and specificity tend to move in opposite directions, as do PPV and NPV.
We emphasized NPV when assessing the quality of predictions, because we aimed to reduce the number of patients wrongly classified into the group with anticipated death in the near future (called "false negative cases" in statistics). During communication concerning future healthcare for a PMV patient, NPV can give the patient and the family reference information on whether a patient whose death is anticipated in the near future, based on the past prognosis of PMV patients, would really pass away soon. Under this standard for assessing the quality of predictions, we inevitably accepted a low level of accuracy in detecting deaths among those who actually died (that is, low specificity).
We also calculated one more performance measure: the c-statistic. This measure is also termed the AUC, the area under the receiver operating characteristic curve. It assesses the extent to which predicted outcomes discriminate between subjects with different actual outcomes. In addition to using the log-likelihood ratio test to examine model significance, we used the c-statistic to evaluate the overall adequacy of a prediction model.
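The four measures can be computed from paired predicted and actual outcomes as in the following Python sketch, in which survival is treated as the positive class (1 = survived, 0 = died). The function name and the toy data are hypothetical.

```python
def prediction_quality(predicted, actual):
    """Sensitivity, specificity, PPV and NPV with survival coded as 1.

    predicted, actual: equal-length sequences of 0/1 values.
    Assumes every denominator is non-zero.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # predicted and did survive
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)  # predicted and did die
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # predicted survive, died
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # predicted die, survived
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Toy data for six hypothetical patients.
quality = prediction_quality(
    predicted=[1, 1, 1, 0, 0, 1],
    actual=[1, 1, 0, 0, 1, 1],
)
```

In this toy example three of the four actual survivors are predicted to survive (sensitivity 0.75), and one of the two patients predicted to die actually died (NPV 0.5).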
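The c-statistic has a simple pairwise interpretation that can be sketched in Python: it is the proportion of (survivor, non-survivor) pairs in which the survivor received the higher predicted survival probability, with ties counted as one half. The function name and the toy data are hypothetical, and this direct pairwise computation is meant for illustration rather than for large samples.

```python
def c_statistic(probs, actual):
    """Concordance statistic (AUC) by direct pairwise comparison.

    probs: predicted survival probabilities.
    actual: 1 = survived, 0 = died.
    """
    survivors = [p for p, a in zip(probs, actual) if a == 1]
    deaths = [p for p, a in zip(probs, actual) if a == 0]
    score = 0.0
    for ps in survivors:
        for pd in deaths:
            if ps > pd:
                score += 1.0   # concordant pair
            elif ps == pd:
                score += 0.5   # tied pair
    return score / (len(survivors) * len(deaths))


perfect = c_statistic([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
chance = c_statistic([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0])
```

A value of 1 indicates perfect discrimination and a value of 0.5 indicates discrimination no better than chance.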