Is the health of people living in rural areas different from those in cities? Evidence from routine data linked with the Scottish Health Survey

Background To examine the association between rurality and health in Scotland, after adjusting for differences in individual and practice characteristics. Methods Design: Mortality and hospital record data linked to two cross-sectional health surveys. Setting: Respondents in the community-based 1995 and 1998 Scottish Health Survey who consented to record-linkage follow-up. Main outcome measures: Hypertension, all-cause premature mortality, total hospital stays and admissions due to coronary heart disease (CHD). Results Older age and lower social class were strongly associated with an increased risk of each of the four health outcomes measured. After adjustment for individual and practice characteristics, no consistent pattern of better or poorer health in people living in rural areas was found, compared to primary cities. However, individuals living in remote small towns had a lower risk of a hospital admission for CHD and those in very remote rural areas had lower mortality, both compared with those living in primary cities. Conclusion This study has shown how linked data can be used to explore the possible influence of area of residence on health. We found no consistent evidence that people living in rural areas have materially different health from those living in primary cities. Instead, we found stronger relationships between compositional determinants (age, gender and socio-economic status) and health than between contextual factors (including rurality) and health.


Background
Rural Scotland comprises 89% of Scotland's landmass and contains 20% of the population and 27% of those employed [1]. There is growing interest in the health of people living in rural and remote areas, and in the study of health care services provided to them [2][3][4]. Urban-rural variations in health outcomes have been studied within Scotland [5][6][7][8][9] and the UK [10,11]. Studies so far have used a variety of health outcomes, including long standing illness [12], mortality [13], cancer [14][15][16], hypertension and cardiovascular disease [17][18][19][20], and respiratory health [5]. The available evidence is derived from specific, one-off projects and the collective evidence is inconclusive [5,16,21,22,23]. Some studies in the US, Australia, and Canada have found rural residents to be in poorer health than their urban counterparts [24][25][26]. In Scotland, there is little evidence of important widespread urban-rural differences. A recent literature review about urban-rural health status differentials in developed countries suggested that rurality per se is not associated with poor health, but rural location is a major determinant of the nature, level of access to, and provision of health services [27].
The Scottish NHS resource allocation formula (the Arbuthnott Allocation Formula) is the first in the UK to include a cost adjustment for remoteness and rurality [28]. The Kerr report highlighted that the rural population tends to have a significant proportion of older people, who often have chronic diseases and require more health care [29]. There is, however, limited empirical evidence to indicate whether health outcomes are significantly different between rural and urban Scotland. Using objective health outcome measures, this study examines whether there are significant differences in health between people living in rural and urban areas in Scotland.
Most of the previous studies were small area variation analyses and did not include the structure of the healthcare (especially primary care) serving the populations studied. Thus, they did not control for the characteristics of individuals or the healthcare services received, either of which might influence health outcomes. Previous studies have suggested that geographical and organizational variations in the structure of primary care services should be considered in studies of health outcomes and health care services [30]. General practices vary in their structural characteristics, their use of health service resources, their standards of clinical care, and in patient outcomes. Structural characteristics of general practices, such as the partnership size [31][32][33]; whether the practitioner works single-handedly; the age and sex of the general practitioner [34][35][36]; whether the practice provides vocational training [37] or engages in undergraduate medical education [38]; the size and characteristics of the practice list [39][40][41]; and whether the practice is in a rural location [42,43], have been found to be associated either with process of health care, or with different health outcomes or health care utilization [44][45][46][47][48]. There are few female GPs working in rural areas and this leads to reduced choice for patients and poorer access to some treatments [49][50][51].
The characteristics of the population served, as well as those of the practices providing care, are complex confounders that previous studies have not adequately accounted for. Information from such studies is of limited use for health care planners, who require representative, regularly updated information, ideally collected for a number of purposes in order to reduce administration costs.
The rationale for this study therefore was to assess whether routinely available national datasets can be used to examine whether there are rural-urban variations in health outcomes, after allowing for differences in the characteristics of the populations, the practices serving them, and the place of residence of the population.
To control for the characteristics of the population and the practices, we need data on the socio-demographic characteristics of the population, the characteristics of the general practices serving them, and the rural-urban classification of the residence of the population. Such data are not easy to obtain because of logistical and ethical issues. That is why previous studies on this issue linked different databases to create a dataset for analysis. Most of these studies, however, used aggregated area-based indicators of population health and structure of care [52]. Such aggregation might mask the characteristics of individual general practitioners or practices and thereby affect the results. In epidemiological terms, such approaches are prone to the ecological fallacy, i.e. area-level aggregate results do not necessarily mean that relationships hold at the individual level. Individual-level data avoid the problem of ecological fallacy which can arise when area-based data are used [53].
In 2004, a record linkage exercise was undertaken by the Information Services Division (ISD) of NHS Scotland to link both the 1995 and 1998 Scottish Health Survey data to the linked Scottish hospital admission and mortality database (SMR). The resulting dataset provided an ideal opportunity to examine the association between rurality and health outcomes. In order to adjust for some aspects of the structure of primary care, we linked these data to the Scottish General Practitioners Census Data (SGPC).
The main focus of the Scottish Health Survey (SHS) was on cardiovascular disease. We therefore included admissions due to coronary heart disease (CHD), total hospital stays and hypertension as health outcome measures. The SHS is linked to death records from the General Register Office for Scotland. We therefore included mortality which is usually used as an overall health outcome indicator.

Individual characteristics
The Scottish Health Survey (SHS) studied a nationally representative sample of people living in private households in Scotland (n = 7,932 adults aged 16-64 in 1995; n = 9,047 adults aged 16-74 and children aged 2-15 in 1998) [54]. In both years a particular focus was on cardiovascular disease (CVD) and related risk factors. Each survey collected information about the demographic and socio-economic characteristics of participants, including age, gender, social class, education and housing tenure. Both surveys are claimed to be the first reliable surveys in Scotland that give a comprehensive picture of the health of the whole population, its biological characteristics and health-related behavior [55]. The surveys were made available through the UK Data Archive at the University of Essex. To enable regional comparisons, the survey divided Scotland into seven regions by aggregating health boards. Both surveys employed stratified, multi-stage random sampling to provide a nationally representative sample [55].
Of those interviewed, 6,958 informants in 1995 (and 7,455 adults and 3,211 children in 1998) were subsequently visited by a nurse. During these visits, at least one usable blood sample was taken from 6,184 adults in 1995, and from 6,178 adults and 466 children aged 11-15 in 1998. The Information Services Division (ISD) of National Health Service (NHS) Scotland linked the Scottish Health Survey to the Scottish Morbidity Records and made the data available for this study. These data were used to answer our research question.

Definition of rurality
There is no universally agreed definition of what constitutes an 'urban' or 'rural' area [56]. The rurality of where individuals lived was measured using the Scottish Executive Urban Rural Classification (SEURC) [57].
The SEURC was selected as a pragmatic definition of rurality in this study. Its advantages include: it takes on board several indicators that are likely to be associated with economic issues, such as the dispersed nature of the population lacking economies of scale and long travelling times affecting access to care; it is available at national level, enabling linkage to other routinely collected national datasets to examine the relationship between rurality, health, health care provision and utilization; it is an appropriate generic definition to describe the variance in the characteristics of the populations living in remote rural areas, the practices serving them, and variations in health and health care between rural and urban areas; and it enables individuals to be analyzed in terms of their rurality and urbanity versus their remoteness and accessibility [57]. The SEURC is being increasingly used in many studies [6,8,9,22,58,59].
The SEURC divides Scotland into eight categories based on settlement size and remoteness: Primary Cities (settlements with over 125,000 people); Urban Settlements (settlements with between 10,000 and 124,999 people); Accessible Small Towns (settlements with between 3,000 and 10,000 people and within 30 minutes drive time of a settlement of 10,000 or more); Remote Small Towns (settlements with between 3,000 and 10,000 people and between 30 and 60 minutes drive time of a settlement of 10,000 or more); Very Remote Small Towns (settlements with between 3,000 and 10,000 people and more than 60 minutes drive time from a settlement of 10,000 or more); Accessible Rural (settlements of less than 3,000 people and within 30 minutes drive time of a settlement of 10,000 or more); Remote Rural (settlements of less than 3,000 people and between 30 and 60 minutes drive time of a settlement of 10,000 or more); and Very Remote Rural (settlements of less than 3,000 people and more than 60 minutes drive time from a settlement of 10,000 or more).
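The eight-category rule described above can be illustrated with a minimal Python sketch. The function name and argument names are hypothetical, and the handling of boundary values (exactly 125,000 or 10,000 people, exactly 30 or 60 minutes) is an assumption, since the category definitions quoted here do not state which side the boundaries fall on. In the study itself, ISD assigned categories from postcodes rather than from a rule of this kind.

```python
def seurc_category(population: int, drive_minutes: float) -> str:
    """Return the eight-fold urban-rural category for a settlement.

    population    -- settlement population
    drive_minutes -- drive time to the nearest settlement of 10,000 or more
                     (ignored for settlements that are themselves that large)

    Boundary handling is an assumption made for this illustration.
    """
    if population >= 125_000:
        return "Primary Cities"
    if population >= 10_000:
        return "Urban Settlements"

    # Remoteness band, shared by small towns and rural settlements.
    if drive_minutes <= 30:
        remoteness = "Accessible"
    elif drive_minutes <= 60:
        remoteness = "Remote"
    else:
        remoteness = "Very Remote"

    # Settlement size band for places below 10,000 people.
    size = "Small Towns" if population >= 3_000 else "Rural"
    return f"{remoteness} {size}"
```

For example, a settlement of 5,000 people located 45 minutes from the nearest settlement of 10,000 or more would be classed as a Remote Small Town under these assumptions.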

Health outcomes
As part of the survey, a nurse took three blood pressure measures for all individuals who gave consent. We used the mean of the three measures and defined people as hypertensive if their systolic blood pressure was ≥ 150 mmHg, or their diastolic blood pressure was ≥ 90 mmHg. This definition was taken from the Quality and Outcomes Framework criteria for hypertension in the new General Medical Services contract [60]. Death records from the General Register Office for Scotland were used to identify all those in the sample who had died by November 2006. Scottish Morbidity Records (SMR) data, routinely collected by ISD and covering general acute hospital inpatient and day case episodes, provided information about hospital admissions due to coronary heart disease (CHD) and the total number of days spent in hospital up to November 2006. SMR data go back to January 1981. We used weights provided by the surveys to account for the sampling design and non-response bias.
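The hypertension definition above is simple enough to express as a short Python sketch. The function and constant names are hypothetical; the thresholds and the use of the mean of three readings are taken from the text.

```python
from statistics import mean

SYSTOLIC_THRESHOLD_MMHG = 150   # threshold stated in the text
DIASTOLIC_THRESHOLD_MMHG = 90   # threshold stated in the text


def is_hypertensive(systolic_readings, diastolic_readings):
    """Classify a respondent from repeated nurse-measured blood pressures.

    Each argument is a sequence of readings in mmHg (three per respondent
    in the surveys). The respondent is classed as hypertensive if the mean
    systolic OR the mean diastolic pressure reaches its threshold.
    """
    mean_systolic = mean(systolic_readings)
    mean_diastolic = mean(diastolic_readings)
    return (mean_systolic >= SYSTOLIC_THRESHOLD_MMHG
            or mean_diastolic >= DIASTOLIC_THRESHOLD_MMHG)
```

A respondent with systolic readings of 148, 152 and 153 mmHg has a mean of 151 mmHg and would be classed as hypertensive, even though one individual reading is below the threshold.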

Practice characteristics
The ISD maintains a dataset of all doctors working as general practitioners in Scotland. For the years 1996 and 1998, we obtained the age and sex of each general practitioner (GP), the size of the partnership that each doctor worked in, the contracted time commitment of each principal (expressed as whole time equivalent, WTE), and the number, age, and gender of people registered with each practitioner. We linked the Scottish General Practitioners Census to the survey using the serial number and practice code pertaining to each participant in the SHS at the time of the survey (see the "Linkage" section for details). We also ascertained whether the practice had a GP vacancy. We calculated practice-level partnership size and availability of a female practitioner.

Linkage
Everyone participating in the Health Surveys was asked to give explicit consent to their information being passed to ISD for subsequent linkage to routinely available datasets. For those who gave such permission, their health survey information was linked by ISD, on our behalf, to the SMR data. This was done using standard probability matching, based on name, postcode and date of birth. The postcode of each individual was also used by ISD to assign the appropriate SEURC category. The resulting Scottish Health Survey-linked-to Scottish Morbidity Data (SHS-SMR) dataset was stripped of any identifying information before being released to us. ISD also provided the serial number and practice code pertaining to each participant in the SHS at the time of the survey. This enabled us to link the new SHS-SMR information to the characteristics of each participant's practice.
Our analysis was limited to the 15,668 survey respondents who agreed that their information could be linked to the administrative datasets (Table 1). The ISD linked 7,363 adults aged 16-64 (1995 survey) and 8,305 adults aged 16-74 (1998 survey). The linked dataset contained mortality data from 1995 up to November 2006 and hospital admission data for CHD from 1981 up to November 2006.
We were unable to assign a practice to 146 (1.9%) respondents in the 1995 SHS and 87 (1.1%) respondents in the 1998 survey. These individuals were excluded from the analyses. Our analysis was restricted to respondents who were known to be registered with a practice at the time of survey. This enabled us to examine any association between practice characteristics and health. From the available hospitalization data, only the time of the first event is known. We ran separate models, one for all respondents and one excluding respondents who had an event prior to the survey. Comparison of the two models showed no significant difference in the pattern observed, so we have shown results using all available information for respondents.

Analysis
Both surveys used stratified, multi-stage random sampling to provide a nationally representative sample [55]. Observations are therefore nested within 983 general practices and within seven regions, and we used hierarchical models to account for this clustering.
In analyzing the data, we used sampling weights from the survey to account for the sampling design (nonprobabilistic) and non-response bias. This provided robust standard errors that relax the assumption of homoscedasticity and adjust for heteroscedasticity [61]. The surveys over-sampled rural areas in order to provide sufficient sample sizes within each region.
For each health outcome, we started with a basic model which adjusted for age and gender only. We created five-year age bands to capture the specific effect of age in men and women. Males < 30 years were taken as the reference group. Place of residence was then introduced to examine age-sex standardized differences in outcome between people living in different parts of Scotland. We then introduced socio-economic characteristics. We included a year dummy to capture change in health outcome measures over time, with year 1995 as the base year. The final model also adjusted for practice characteristics including WTE of GPs per 1000 population, and mean GP age. Additional variables about the structure of the practices were then included to see if the model was improved: availability of at least one female GP, whether the practice had a GP vacancy, and practice-level partnership size (total number of GPs in a practice). We compared the performance of these models using the log likelihood, the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), obtained with the 'fitstat' command in Stata. The AIC is defined as AIC = -2 ln L + 2k, and the BIC is defined as BIC = -2 ln L + k ln(N), where ln L is the maximized log likelihood of the model, k is the number of parameters in the model, and N is the sample size. We preferred the model with larger values of the log likelihood and smaller values of AIC and BIC. We have not presented these results due to space limitations but they are available from the authors. All three criteria favoured the model with WTE of GPs per 1000 population and mean age of GPs in a practice. Including practice structural variables (availability of a female GP, GP vacancy, and number of GPs in a practice) did not improve the model. This is probably because WTE of GPs per 1000 population and mean GP age capture the significant characteristics of a practice.
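The two information criteria defined above can be computed directly from a fitted model's maximized log likelihood. The following Python sketch is illustrative only: the function names are hypothetical, and the analysis in this study was carried out in Stata, not Python.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = -2 ln L + 2k."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: BIC = -2 ln L + k ln(N)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def prefer_smaller(criterion_a: float, criterion_b: float) -> str:
    """Return 'a' or 'b' for the model with the smaller criterion value."""
    return "a" if criterion_a <= criterion_b else "b"
```

Because the BIC penalty per parameter, ln(N), exceeds the AIC penalty of 2 whenever N is greater than about 7, the BIC penalizes additional practice variables more heavily than the AIC in a sample of this size.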
The number of observations was reduced when the survey data was linked to the practice and hospitalization (SMR) data. To reduce the probability of getting statistically significant associations by chance (type I error), a small significance level (p < 0.01) was used [62]. The study did not require ethical approval as respondents were not identifiable.
A set of eight urban-rural category dummy variables was created for rurality. Summary statistics for urban-rural categories were compared using the t-test, with Primary Cities (the largest group) as the reference group. The joint significance of differences between categories was measured using the F-test, with seven degrees of freedom.
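Dummy coding with a reference group can be sketched in Python as follows. The names are hypothetical, and the sketch assumes that the seven non-reference indicators are the ones entering a model, which is consistent with the seven degrees of freedom of the joint test but is not spelled out in the text.

```python
SEURC_CATEGORIES = [
    "Primary Cities",
    "Urban Settlements",
    "Accessible Small Towns",
    "Remote Small Towns",
    "Very Remote Small Towns",
    "Accessible Rural",
    "Remote Rural",
    "Very Remote Rural",
]
REFERENCE_CATEGORY = "Primary Cities"


def dummy_code(category: str) -> dict:
    """Return 0/1 indicators for the seven non-reference categories.

    A respondent in the reference category has all indicators equal to 0,
    so each coefficient is a contrast with Primary Cities.
    """
    if category not in SEURC_CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return {
        name: int(name == category)
        for name in SEURC_CATEGORIES
        if name != REFERENCE_CATEGORY
    }
```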
The dependent variables, prevalence of hypertension and all-cause mortality, were dichotomous variables, while hospital admissions due to CHD and total number of hospital admissions were non-negative count values. For the dichotomous variables, multivariate logit models were generated. For the non-negative integer count variables, negative binomial regression was used. The analyses used Stata version 11.0 [63].
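Logit models produce coefficients on the log-odds scale, which are usually reported as odds ratios with confidence intervals. The Python sketch below shows the standard conversion, assuming a Wald-type interval on the coefficient; the function name is hypothetical and the text does not state how the reported intervals were computed. A 99% level is used as the default to match the p < 0.01 significance threshold.

```python
import math
from statistics import NormalDist


def odds_ratio_with_ci(coef: float, std_err: float, level: float = 0.99):
    """Convert a logit coefficient and its standard error to (OR, lower, upper).

    Assumes a Wald-type interval: coef +/- z * std_err on the log-odds
    scale, exponentiated to the odds-ratio scale.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = math.exp(coef - z * std_err)
    upper = math.exp(coef + z * std_err)
    return math.exp(coef), lower, upper
```

An interval that excludes 1 corresponds to an association that is statistically significant at the chosen level.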

Results
The distribution of the population across the eight geographic categories and socio-demographic characteristics by survey year is presented in Table 1. The three rural categories contained 12% of the total survey respondents. The pooled dataset consisted of 15,668 subjects (7,363 participants in the 1995 survey and 8,305 in the 1998 survey: Table 1). The average age (standard deviation: sd) was 42.9 (14.9) years. Both surveys had a similar distribution of participants with respect to gender, social class, housing tenure and place of residence. The characteristics of the practices serving the survey participants were also similar in the two surveys: GP principals working in rural areas tended to be older, male and single-handed, when compared with those in primary cities (data not shown).

Hypertension
Older people were more likely to be hypertensive in both sexes (Table 2 Model 1). Adjusting for rurality, individuals living in remote small towns were less likely to have hypertension than those living in primary cities (OR = 0.57, 99% CI 0.33 to 1.00, p = 0.01: Table 2 Model 2). This relationship did not persist when we controlled for socio-economic differences. Living in publicly owned housing was highly associated with an increased probability of hypertension (OR = 1.34, 99% CI 1.11 to 1.60, p < 0.001: Table 2 Model 3). When practice characteristics and year dummies were introduced, the association between living in remote small towns and hypertension did not reach our threshold for statistical significance (p = 0.096). Being older and living in publicly owned housing were each associated with a statistically significantly higher likelihood of hypertension in the fully adjusted model.
None of the practice characteristics were associated with hypertension (Table 2 Model 4). The joint significance test indicated no significant variation in the chances of hypertension across the eight urban-rural categories.

Mortality
In the first model, older males and females were more likely to die than males less than 30 years old (Table 3 Model 1). The risk estimates for those aged fifty years and older were highly significant (p < 0.001). After adjustment for rurality, older age remained strongly associated with mortality. Individuals in remote rural (OR = 0.51, 99% CI 0.30 to 0.87, p < 0.01) and very remote rural (OR = 0.45, 99% CI 0.25 to 0.81, p < 0.001) areas were less likely to die than those in primary cities (Model 2). After allowing for the social status and housing tenure of individuals, those in very remote rural areas still had a lower chance of dying, but the strength of the association was diminished slightly. Individuals with the highest socio-economic status (professionals) had a significantly lower likelihood of mortality than the intermediate socio-economic group (Table 3 Model 3). Living in publicly owned housing was associated with higher mortality. The final model, which adjusted for the practice characteristics, indicated that individuals in very remote rural areas had lower mortality than those in primary cities. Older age and lower social class remained significantly associated with mortality. Indeed, the relationships with social class increased (in terms of magnitude and significance) after adjustment for practice characteristics.

Hospital admissions due to CHD
In the final model, which adjusted for individual and practice characteristics and year of survey, older age and public housing tenure were associated with significantly higher levels of admission for CHD than the respective reference groups (Table 4 Model 4). There was also some evidence of a relationship between hospital admissions for CHD and living in remote small towns (p = 0.009).

Total hospital stay
As expected, older individuals had more hospital stays than younger individuals, in both sexes (Table 5 Model 4). Individuals living in public housing had significantly more hospital stays than people who owned their house. Adjusted for individual and practice characteristics, there was no association between total hospital stays and place of residence.

Discussion
Our analyses of four health outcomes using individual-based data, adjusted for the characteristics of people living in different parts of Scotland and the general practices serving them, failed to reveal a consistent pattern of substantially different health among those living in rural areas compared with primary cities. Older age and living in publicly owned housing appeared to be more important determinants of health than rurality or the structure of the practices serving the population. Producing information about the socio-demographic characteristics of individuals living in different urban-rural areas, the practices serving them and relevant health outcomes is currently not straightforward because no single dataset contains everything. We have shown that it is possible to link different routinely collected datasets to explore urban-rural issues. An alternative approach would be to conduct specific large-scale epidemiological surveys, which tend to be more expensive.
The health surveys used in our study were nationally representative, so should produce more representative results for Scotland as a whole than studies representing particular groups or places. The SHS over-sampled rural areas in order to provide sufficient sample sizes within each region, and also provided sampling weights to account for the sampling design and non-response bias. These gave us a sufficient sample in remote rural areas and made our results robust. We overcame the limitation of area-based analyses by using individual-based socio-economic characteristics.
To the best of our knowledge, this is the first study to examine the association between rurality and health in Scotland after adjusting for the demographic and socioeconomic characteristics of the individuals living in different areas, and the characteristics of the general practices serving them. The few international studies that adjusted for the structure of care have tended to use area-based aggregated data rather than individual-based information [52,64,65,66,67]. This is in part due to lack of relevant data.
A limitation of the study was the small number of health indicators available for analysis. Health is a complex issue, with many factors influencing health status. Rurality is one factor that has been proposed as an important influence on health, but it is not easy to ascertain whether place or other physical, social and cultural environmental factors (e.g., pollution, traffic and neighborhood noise) are important. We were unable to examine any of these factors. Furthermore, in addition to demographic and socio-economic factors, there are other important influences on health, such as drinking alcohol, smoking and substance abuse. We were unable to determine whether these potential confounding variables affected our results. As wellbeing is a combination of physical and mental health, adjusting for psychological problems might have produced different results.
Routine data are primarily collected for administrative purposes and their accuracy has been questioned [68]. Population-based administrative data sets have been used to assess service utilization, among other outcomes, for many years, and the data linkage and analysis procedures have been validated and are well established [69,70]. The use of administrative data eliminates biases associated with the use of self-report data and attrition problems common in studies involving long-term follow-up. On the other hand, routinely collected administrative data can suffer from problems with the completeness and accuracy of data collection, issues which are not under the direct control of researchers and which can be difficult to quantify [71].
Most previous studies of rural-urban differences in health have adjusted for the demographic and/or socioeconomic characteristics of the population [12,72]. Few studies in the US, and only one study in the UK, have allowed for the structure of health care providers [52,64,65,66,67]. This is in part due to lack of relevant data. Studies in the US have found higher mortality in areas with fewer primary care doctors [64][65][66][67]. The study in England concluded that mortality levels were weakly associated with the characteristics of practices delivering primary medical care [52]. That study used health authority aggregated data about the structure of care. Such aggregation might mask the characteristics of individual general practitioners or practices and thereby affect overall results. In our study, the practice characteristics assessed were not associated with the health outcomes measured. Using individual-based data, we did not find strong or consistent significant associations between the various health indicators assessed and location, after allowing for population and practice characteristics.

Conclusions
Compositional determinants of health (age and gender) and socio-economic characteristics were found to be more strongly associated with the health outcomes examined than contextual factors (including rurality). Similar studies, using more health measures, should be carried out to confirm or refute our findings.