Setting
In Finland, reliable provider-specific information about the effectiveness of treatments has been considered the only way to monitor the progress of centralization and constitute justified limits for the sizes of practically reasonable units in the Finnish health care system [14].
The organization of social and health care - both of which are incorporated into the same national planning and tax-based financing system - has long been considered a public responsibility in Finland [15]. The country's numerous local authorities - municipalities - are responsible for arranging primary care and other basic services, such as nursing homes and other social services for the elderly [16]. In addition, each municipality is a member of one of the 21 hospital district joint authorities that are responsible for organizing specialized medical services and coordinating hospital treatment in their own districts. Primary health care is mainly provided at health centers that are owned by municipalities or federations of municipalities. The health centers also contain inpatient wards that are mainly used by elderly and chronically ill patients. Secondary and tertiary level medical care is provided by a hierarchy of hospitals, including about forty regional hospitals, sixteen central hospitals, and five university teaching hospitals [17]. Publicly owned hospitals are not run for profit, and there are only a few private hospitals in Finland.
In regard to hip fracture treatment, virtually all hip fracture patients are first referred for examination and surgical treatment to the nearest public hospital with orthopedic services in Finland. After very short postoperative hospital treatment, a hip fracture patient is typically transferred for rehabilitation to the health center [18]. Other services used by hip fracture patients include nursing home care, outpatient health services, and home-help services. Patients have very limited possibilities to choose treatment units, as these are determined based on the patient's municipality of residence.
In the case of hip fracture treatment, there are two volume-related factors that can be regulated fairly easily: the number of orthopedic treatment units and the number of rehabilitation units. The main policy-relevant question can be stated as: Is it possible to improve the effectiveness of hip fracture treatment by regulating the minimum volume for the treatment units?
Data
In order to examine the volume-effectiveness relationship, data on comparatively risk-adjusted effectiveness indicators are needed for all care providers. The amount of data required is so massive that administrative registers are the only realistic source of such data, in spite of their known shortcomings, such as their secondary nature and the lack of clinical data for risk-adjustment purposes [19, 20]. In Finland, very good administrative registers are available, and the personal identification number allows deterministic record-linkage within and between registers. In general, the complete registration, combined with easily linkable registers, makes large, longitudinal population-based studies feasible in Finland [21].
For the purposes of this study, the total population of hip fracture patients in 1998-2001 was identified in the Finnish Health Care Register. The medical histories (1987-2002) and deaths (1998-2002) of the hip fracture population were extracted from the Finnish Hospital Discharge Register, the Finnish Health and Social Welfare Care Register, and the National Causes of Death statistics using the unique personal identification codes of the patient population. Each record in these registers includes data such as patient and provider ID numbers, age, sex, area codes, and diagnosis and operation codes, as well as dates of admission, operation, and discharge (or death). The validity of Finnish register-data for studying the effectiveness of hip fracture treatment is known to be good [22].
Data were pre-processed so that the information concerning hip fracture patients with their first hip fracture could be accurately identified. The details of the process are reported elsewhere [23]. The existence of possible comorbidities was extracted for each patient from his or her medical history using the diagnosis codes recorded in the data. The extraction method was adapted from the Charlson comorbidity categories, and the application to the current data set was done in a similar fashion to that of previous hip fracture studies [24–26]. Other relevant variables available in register-based data, such as age, sex, source of admission, and prior use of care, were also extracted from the data for risk adjustment purposes.
The data set used in this study included data for 22,857 hip fracture patients from 52 hospitals. The volume-effectiveness relationship for rehabilitation units was investigated using a subset of data including hip fracture patients aged 65 years and older who lived at home before the fracture. This subset included 10,384 patients who were transferred to a rehabilitation unit (n = 272) after an operation.
Effectiveness indicators
While using data from administrative registers, only a limited number of validated effectiveness indicators are available. The most common one is mortality. The use of short-term mortality as an effectiveness measure in volume-effectiveness studies has been criticized, however, because it is a rather crude proxy for effectiveness and also a rather uncommon event that may cause problems in statistical modeling [13]. Moreover, short-term mortality is a weak effectiveness indicator in the sense that many of the perioperative deaths of hip fracture patients may be unavoidable [27]. Four-month mortality was therefore selected as a primary effectiveness indicator in this study. The limit of four months corresponds to the population level maximum for the length of the acute hip fracture treatment episode [28].
There are other possible effectiveness indicators, such as re-hospitalizations or the occurrence of complications. Unfortunately, the indicators that require complex data abstraction using diagnosis codes, such as in the identification of complications, are prone to severe bias caused by existing differences in the registration practices of (secondary) diagnoses. It has been shown, however, that the Finnish register data allow a complete reconstruction of hip fracture treatment episodes in terms of daily levels of care, for which the directly observable levels of care are: 1) home (including home care, ordinary service houses, and outpatient care), 2) nursing home (service houses with 24-hour assistance and residential homes), 3) health center (inpatient ward of local primary care unit), 4) hospital, and 5) death [23]. It is also known that each level of care reflects a certain intensity and need for care [29]. In this sense, it can be interpreted that the directly observable backward steps in the levels of (inpatient) care in the treatment episode following the hip fracture reflect an increased need for care, i.e., obvious drops in the health status of the patient. For the purposes of this study, a new effectiveness measure of maintainability was defined: maintainability can be considered satisfactory if no backward steps are observed in the levels of care. In practice, this measure describes whether there have been some unexpected steps during the treatment (by capturing deaths, readmissions, and referrals to higher-level hospitals). Here, maintainability was operationalized as a dummy variable that indicates unsuccessful maintainability if an event that breaks maintainability was observed during the first four months after the hip fracture.
Basic model for the volume-effectiveness relationship
The basic idea in volume-effectiveness analyses is to compare the effectiveness of treatment between providers (such as hospitals). This kind of activity is commonly referred to as profiling of providers. Profiling can be quite complicated, as there is variation between providers for at least three reasons: 1) differences may be attributable to random variation due to the size of the provider, 2) the patient case-mix varies from provider to provider, and 3) providers may differ in the effectiveness of their care [30]. For these reasons, a statistical model for provider profiling, in which provider differences are modeled explicitly, must be considered for justified conclusions.
Traditionally, the ratio of observed to expected outcomes multiplied by the mean rate is used as the risk-adjusted rate for providers [31]. In the case of a binary response variable, a logistic regression is a suitable tool for the calculation of expected outcomes. The idea is to construct and estimate a model in which the observed outcome (Y) is a dependent variable and patient characteristics (x) are independent variables. With this kind of model, it is possible to calculate predicted values for all individuals, using patient characteristics and estimated values of parameters with the inverse logit transformation. As the focus of profiling is on providers and not on individuals, the observed and expected outcomes must be aggregated to the provider level as follows:
and
where the sums are over j patients treated by provider i, while β is an estimated parameter vector [32].
As the observed outcomes Oi are non-negative integers describing frequencies of events, they can be assumed to have a Poisson distribution with an unknown mean μi:
where
and i is the provider index [33]. In other words, it is assumed that the expected outcomes Ei adjust the patient characteristics, and θi describes the variation caused by the provider. The use of logarithms guarantees that θi remains positive in this kind of random effects model.
In data sets with a hierarchical structure, there often exist correlations between observations that may result in overestimated differences in profiling analyses. Small sizes of providers may also cause some estimation problems. Assuming exchangeability of providers (i.e., that the results for all providers are equal if there is an infinite number of [similar] patients), a two-level hierarchical model can be used to deal with such problems. A simple solution is to assume that variation caused by providers is normally distributed:
where exp(α) is the "general" risk-adjusted ratio and σ2 describes the variance between providers [33]. In order to define a full probability model, prior distributions for the parameters α and precision τ = 1/σ2 must also be defined. Suitable non-informative priors are
and
Extended model for the volume-effectiveness relationship
Hierarchical models, similar to the one presented above, are widely applied in provider profiling and are known to be superior to non-hierarchical models [34]. Unfortunately, the presented model is not optimal for the investigation of a possible relationship between effectiveness and volume because the observations are shrunk towards the global mean, even though it can be hypothesized that there will be some kind of trend between the volume and provider-specific effectiveness measures. In fact, it has been hypothesized that the relationship between volume and effectiveness may be non-linear, linear, stepwise, or may have a single cut-off [13].
In the model presented above, the logarithm of the ratio between observed and expected outcomes was used as a convenient starting point for the model. This means that technically, it would be convenient to incorporate also the possibility of a trend on the logarithmic scale. In fact, the ratio between observed and expected outcomes is a measure of relative difference, and the log difference is the preferred scale for such measures [35]. It is also known that the relative difference approximates to the more adequate log-difference measure in the proximity of ratio one, which means that the interpretations are approximately equal, if the differences are quite small.
The basic model is actually a special case of a linear trend model in which the slope parameter is fixed at zero. The model can be modified in a straightforward way to include the possibility of a volume-related linear trend. More specifically, let zi be the provider-specific volume and
where
and priors for α and τ = 1/σ2 are as above, and, correspondingly, a non-informative prior for the slope parameter is
In principle, the same model works in the single cut-off case: if zi is changed to a dummy-variable indicating the "high-volume" provider. Similarly, a stepwise model could be implemented by adding regression parameters and dummy variables to the model. The practical problem for the non-continuous models is the determination of appropriate cut points. It is possible to use predetermined limits or try to estimate optimal cut points with the data [36]. With the hierarchical full probability models, it would be possible to build a model for the single cut-off case where the cut-off point is treated as a parameter that is estimated simultaneously with the other parameters. Such a model, however, is not considered here because the estimation easily results in multimodal posterior distributions.
The extension of the model to incorporate a non-linear trend is a little more challenging. The simple parametric approach of using low-order polynomials in the regression model offers only a limited family of shapes and, with more complex forms, it is typically very difficult to choose between well-fitting models. In principle, regression using the fractional polynomial approach could be a satisfactory compromise but would require the fitting of numerous regression models [37]. With the hierarchical modeling approach, it is actually more tempting to use the recently invented connection between penalized splines and linear mixed models to extend the standard regression model to a semi-parametric form in which the non-linear relationship is not restricted by the parametric forms [38]. The aim of such models is to describe the local structure of the relationship between outcome and covariate, resulting in a good fit across the range of the covariate.
The linear model presented above can also be extended to the semi-parametric form. In fact, with a thin-plate spline regression modification, the model remains similar in regard to θi, but αi is extended to the form
where the random coefficients are normally distributed with zero mean and variance σb
2, i.e.,
k is the number of so-called knots, and wij are special design variables calculated using k sample quantiles of the covariate [39]. The priors for α, γ, and σ2 are as above, and an adequate non-informative prior for τb = 1/σb
2 is
Application of the models
In this study, the three different volume models described above - a mean model, a linear trend model, and a spline model - were applied to the examination of the volume-effectiveness relationship between the four-year (1998-2001) pooled hospital or rehabilitation unit volume and two effectiveness measures, four-month mortality, and maintainability. The predicted probabilities of mortality and maintainability required for risk-adjustment purposes were estimated using the logistic regression model, and the predictive power of the model was measured using the c-statistics. The hierarchical models were estimated using MCMC simulation. Five knots were used in the specification of the spline model. The mixing of the estimation procedure was examined using two chains in the estimation, and the convergence was evaluated on the basis of Gelman-Rubin convergence plots [40]. A hundred thousand iterations following ten thousand burn-in iterations were used in the actual estimation of the parameters for each model. The complexity and relative fit of the hierarchical models were assessed with the deviance information criterion (DIC) [41].