The effects of changes in distance to nearest health facility on under-5 mortality and health care utilization in rural Malawi, 1980–1998

Background Despite important progress, the burden of under-5 mortality remains unacceptably high, with an estimated 5.3 million deaths in 2018. Lack of access to health care is a major risk factor for under-5 mortality, and distance to health care facilities has been shown to be associated with less access to care in multiple contexts, but few such studies have used a counterfactual approach to produce causal estimates.
Methods We combined retrospective reports on 18,714 births between 1980 and 1998 from the 2000 Malawi Demographic and Health Survey with a 1998 health facility census that includes the date of construction for each facility, including 335 maternity or maternity/dispensary facilities built in rural areas between 1980 and 1998. We estimated associations between distance to nearest health facility and (i) under-5 mortality, using Cox proportional hazards models, and (ii) maternal health care utilization (antenatal visits prior to delivery, place of delivery, receiving skilled assistance during delivery, and receiving a check-up following delivery), using linear probability models. We also estimated the causal effect of reducing the distance to nearest facility on those outcomes, using a two-way fixed effects approach.
Findings We found that greater distance was associated with higher mortality (hazard ratio 1.007 for one additional kilometer [95%CI 1.001 to 1.014]) and lower health care utilization (for one additional kilometer: 1.2 percentage point (pp) increase in homebirth [95%CI 0.8 to 1.5]; 0.8 pp decrease in at least three antenatal visits [95%CI −1.4 to −0.2]; 1.2 pp decrease in skilled assistance during delivery [95%CI −1.6 to −0.8]). However, we found no effects of a decrease in distance to the nearest health facility on the hazard of death before age 5 years, nor on antenatal visits prior to delivery, place of delivery, or receiving skilled assistance during delivery. We also found that reductions in distance decrease the probability that a woman receives a check-up following delivery (2.4 pp decrease for a 1 km decrease [95%CI 0.004 to 0.044]).
Conclusion Reducing under-5 mortality and increasing utilization of care in rural Malawi and similar settings may require more than the construction of new health infrastructure. Importantly, the effects estimated here likely depend on the quality of health care, the availability of transportation, the demand for health services, and the underlying causes of mortality, among other factors.


Data
Health facility data A national census of health facilities in 1998 carried out by the Malawi Ministry of Health and the Japanese International Cooperation Agency (MOH & JICA 1999) listed 719 facilities in operation. For most facilities (589 of 719, or 82%), the data included the date of construction; for all facilities, they included the GPS coordinates, the facility type, the principal funder, and the owner.
The dates of construction ranged from 1889 (when the first missionary hospital was built) to 1998. We restricted our analysis to the period from 1980 to 1998 to reduce recall bias in the mortality data (described below). Of the 589 facilities for which we have date of construction, 337 (57%) were built between 1980 and 1998. There was no apparent temporal trend in the number of facilities built per year; the number ranged from a minimum of 10 in 1987 to a maximum of 32 in 1998 (Fig. 1).
Of the 337 new facilities, 92% were one of two types: 135 (40%) are classified as "dispensary" and 175 (52%) are classified as "dispensary/maternity." We restricted our analysis to these facilities. Dispensaries are permanent structures from which drugs are distributed. They provide outpatient care and may contain holding beds. Dispensary/maternities are similar but provide more extensive services to expectant mothers (antenatal, delivery, and postnatal care). The other facility types are district hospital, hospital, mental hospital, primary health center and urban health center. These are almost exclusively located in urban areas.
We assumed that the facilities missing a date of construction (130 out of 719) were built prior to 1980. If some of these facilities were in fact built between 1980 and 1998, then some births will be erroneously coded as being closer to a health facility than they actually were. This will bias the effect estimate towards the null if the true effect of being closer to a health facility is to reduce mortality or increase utilization.

Mortality and utilization data
The data on under-5 mortality and health care utilization came from the 2000 Malawi Demographic and Health Survey (MDHS), a nationally representative survey targeting all resident women aged 15-49 [20]. Variables collected include the date of birth and date of death (if applicable) for all children ever born to respondents (n = 40,221 children). The data also included GPS coordinates for centroids of MDHS enumeration areas, which we refer to as villages in the remainder of the paper (Fig. 2). Enumeration areas were based on the 1998 census, which identified 9213 in total. Rural enumeration areas have populations of between 800 and 1200 persons.
For births in the 5 years prior to the survey, the MDHS also contains information on the following utilization outcomes: (1) place of delivery, (2) receiving a check-up following delivery, (3) number of antenatal visits prior to delivery, and (4) receiving skilled assistance during delivery.
Migration has the potential to cause measurement error in the treatment variable since mothers' residences in 2000 may not be in the same location as their residences in previous years. We restricted our analysis to rural births that occurred at the same location where the mother was living at the time of the survey, as reported in an MDHS question about length of time at one's current residence.
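The residence restriction can be expressed as a simple filter. The sketch below is illustrative only: the function name `is_eligible`, its arguments, and the encoding of "length of time at current residence" as a number of years are assumptions, not MDHS variable names.

```python
def is_eligible(birth_year: int, is_rural: bool,
                years_at_residence: float, survey_year: int = 2000) -> bool:
    """True if a birth meets the sample restrictions described above."""
    # Births must fall inside the 1980-1998 analysis window.
    in_period = 1980 <= birth_year <= 1998
    # The mother must have moved to her current residence no later than
    # the year of the child's birth (hypothetical year-level encoding).
    moved_in_year = survey_year - years_at_residence
    resident_at_birth = moved_in_year <= birth_year
    return in_period and is_rural and resident_at_birth
```

A mother who reports 15 years at her current residence in 2000 would, under this encoding, contribute births from 1985 onward.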

Operationalization of treatments and outcomes
The primary outcome of interest was the hazard of death between birth and age five, estimated from retrospective birth histories in the MDHS that included the date of birth and date of death for each child. Children still alive and under age five at the time of the survey were right-censored (i.e. they have missing survival data between the age at which the survey occurred and age five). Additional file 1 includes tests showing that recall bias is not a concern in these data.
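The censoring rule described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `survival_record`, the use of month counts on a common scale, and the 60-month cutoff for age five are assumptions made for the example.

```python
from typing import Optional, Tuple

AGE_FIVE_MONTHS = 60  # assumed cutoff: follow-up ends at the fifth birthday


def survival_record(birth_month: int,
                    death_month: Optional[int],
                    survey_month: int) -> Tuple[int, bool]:
    """Return (months of follow-up, died_before_five) for one child.

    All dates are month counts on a common scale (hypothetical encoding).
    `death_month` is None when the child was alive at the survey.
    """
    if death_month is not None:
        age_at_death = death_month - birth_month
        if age_at_death < AGE_FIVE_MONTHS:
            # Death observed before age five: an event.
            return age_at_death, True
        # Died after the fifth birthday: survived the whole window.
        return AGE_FIVE_MONTHS, False
    age_at_survey = survey_month - birth_month
    # Alive at survey: censored at survey age or age five, whichever is first.
    return min(age_at_survey, AGE_FIVE_MONTHS), False
```

A child born 30 months before the survey and still alive contributes 30 months of follow-up with no event, which is the right-censored case.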
For the causal analysis, the treatment of interest is the reduction in distance to the nearest health facility caused by the construction of a new facility, conditional on distance to the nearest health facility prior to the construction of a new facility. In addition to models in which the linear reduction in distance is the treatment variable, in other models we use multiple treatment variables to reflect the intuition that the benefit of a new facility depends on both the distance from the village to the old facility and the distance from the village to the new facility.
For each village we calculated the distance to the nearest health facility in each year from 1979 to 1998. When a new facility was built, resulting in a change in distance to nearest facility, we assigned that village to one of the above distance change categories for all remaining years. We linked the change in distance category (including no change) for each village-year to each child-year in the mortality dataset. In villages where no new facility was built, all person-time was assigned to the 'no change' category. In villages where a new facility was built, all person-time before the facility was built was assigned to the 'no change' category, and all person-time after the facility was built was assigned to the appropriate distance change category (e.g. > 10 km to 5-10 km). If a facility was built during a child's life from age 0 to 5, the portion before the facility was built was assigned to the 'no change' category, and the portion after the facility was built was assigned to the appropriate distance change category. The dates of construction did not include the month of construction. Therefore, to avoid over-estimating exposure to new facilities, construction was assumed to have occurred on December 31.
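The village-year distance calculation can be sketched as below. Several details are assumptions for illustration: the paper does not state which distance formula was used, so great-circle (haversine) distance stands in here, and the names `haversine_km` and `nearest_distance_by_year` are hypothetical. The two rules taken from the text are that facilities with no construction date are treated as built before 1980, and that construction is assumed to occur on December 31, so a facility built in year y first counts in year y + 1.

```python
import math
from typing import Dict, Iterable, List, Optional, Tuple

Facility = Tuple[float, float, Optional[int]]  # (lat, lon, year built or None)


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def nearest_distance_by_year(village: Tuple[float, float],
                             facilities: List[Facility],
                             years: Iterable[int]) -> Dict[int, float]:
    """Distance (km) from a village to its nearest available facility, per year."""
    out = {}
    for year in years:
        dists = [
            haversine_km(village[0], village[1], lat, lon)
            for lat, lon, built in facilities
            # Missing date: treated as built before 1980, so always available.
            # Built on December 31 of `built`: available from the next year.
            if built is None or built < year
        ]
        out[year] = min(dists) if dists else math.inf
    return out
```

The reduction in distance for a village is then the drop in this value between consecutive years; years with no drop correspond to the 'no change' category.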
For the secondary outcomes, we used the linear reduction in distance as our treatment variable. We do not use the multiple category distance reduction variable due to much smaller sample sizes.

Identification strategy and statistical analysis
An ideal study of the effects of new health facilities on under-5 mortality would randomly assign villages to receive a new health facility. By comparing the mortality rates before and after health facility construction in villages that did receive a facility to those that did not, the average treatment effect of a new facility could be easily calculated. In the current study, the location for new facilities may be endogenous to the under-5 mortality rate. If one were to simply carry out a cross-sectional comparison of areas with new health facilities to those without, it is unclear which direction the bias would take. For example, if facilities tend to be located in areas with higher disease burden, they may be positively associated with mortality, even if they have a beneficial effect. Conversely, if they tend to be built in wealthier areas, they may appear to be negatively associated with mortality, even if they have no effect.
We estimated the association between distance to nearest facility and mortality or utilization as a first step to investigating causality. Mortality was measured as survival time at the child level, and many observations were right-censored (i.e. children were under five and still alive at the time of the survey). Therefore, we fit semiparametric Cox proportional hazards models [21,22]. Models 1-4 used linear distance to test for a non-linear relationship between distance and mortality. These include models with and without dummies for year (n = 18; to capture temporal trends in mortality unrelated to new facilities) and month (n = 11; to capture seasonality in mortality). Model 4 adds controls for child and mother characteristics. As a sensitivity analysis, we ran the same models with the logarithm of distance rather than linear distance (see Additional file 1). For our secondary outcomes, which are binary and thus not right-censored, we used linear probability models.
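As a reading aid, the association models described above can be written in standard Cox notation. The symbols are introduced here for illustration and are not the authors': $h_i(t)$ is the hazard of death for child $i$ at age $t$, $h_0(t)$ is an unspecified baseline hazard, $d_{v(i),t}$ is the distance in kilometers from the child's village to its nearest facility, $x_i$ collects the child and mother characteristics, and $\tau$ and $\mu$ stand for the year and month dummies (omitted in the simpler models).

```latex
h_i(t) = h_0(t)\,\exp\!\left(\beta\, d_{v(i),t} + \gamma^{\top} x_i
         + \tau_{\mathrm{year}} + \mu_{\mathrm{month}}\right),
\qquad
\mathrm{HR}_{1\,\mathrm{km}} = e^{\beta}
```

Under this form, a reported hazard ratio of 1.007 per kilometer corresponds to $\beta = \ln 1.007 \approx 0.007$.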
To estimate the causal effect of changes in distance on mortality, we used stratified Cox models, with each stratum corresponding to one village. This controls for time-invariant characteristics of each village, and uses within-village variation in distance and mortality to estimate effects. In some models, we again added year dummies to capture temporal trends; this is sometimes referred to as a two-way fixed effects model (two-way referring to time and space) [23]. We thus estimated the multiplicative change in the hazard ratio for mortality within these villages, before and after changes in distance. For our secondary outcomes, we took a similar approach, using linear probability models rather than Cox models. We included fixed effects for each village and each year.
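For the linear probability models, the two-way fixed effects idea can be illustrated with a small within-transformation. This sketch makes a simplifying assumption that does not hold in the actual data: a balanced panel with exactly one observation per village-year (for example, a village-year share of births with the outcome). In that case, subtracting village means and year means and adding back the grand mean gives the same slope as including village and year dummies. The function name `twfe_slope` and the row layout are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Row = Tuple[str, int, float, float]  # (village, year, distance_km, outcome)


def twfe_slope(rows: List[Row]) -> float:
    """Two-way fixed effects slope of outcome on distance (balanced panel)."""
    n = len(rows)

    def group_means(key_idx: int, val_idx: int) -> Dict[Hashable, float]:
        total, count = defaultdict(float), defaultdict(int)
        for r in rows:
            total[r[key_idx]] += r[val_idx]
            count[r[key_idx]] += 1
        return {k: total[k] / count[k] for k in total}

    x_by_village, x_by_year = group_means(0, 2), group_means(1, 2)
    y_by_village, y_by_year = group_means(0, 3), group_means(1, 3)
    x_bar = sum(r[2] for r in rows) / n
    y_bar = sum(r[3] for r in rows) / n

    sxy = sxx = 0.0
    for village, year, x, y in rows:
        # Remove village and year effects from both variables.
        xd = x - x_by_village[village] - x_by_year[year] + x_bar
        yd = y - y_by_village[village] - y_by_year[year] + y_bar
        sxy += xd * yd
        sxx += xd * xd
    return sxy / sxx
```

With two villages and two years, this reduces to a difference-in-differences: the change in the outcome in the village whose distance changed, minus the change in the village whose distance did not, divided by the change in distance.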
One key assumption, inherent in Cox models, is that the hazards are proportional. We tested this assumption in two ways (Additional file 1). First, we tested for nonzero slope in a generalized linear regression of the scaled Schoenfeld residuals on time [24]. Second, because that test can be "over-powered" (with many observations it may classify substantively insignificant changes in the hazard ratio as statistically significant), we visually assessed plots of the scaled Schoenfeld residuals for the covariates that the test identified as violating the proportional hazards assumption [25].
P-values for distance reduction coefficients in the categorical distance models were adjusted for multiple testing using a false discovery rate procedure [26].
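The text does not say which false discovery rate procedure was applied. As an illustration only, the sketch below implements the widely used Benjamini-Hochberg step-up adjustment; treating it as the procedure in [26] is an assumption, and the function name `benjamini_hochberg` is hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A coefficient would then be called significant at a 5% false discovery rate if its adjusted p-value is below 0.05.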
Ethical approval was obtained from Simmons University Institutional Review Board.

Results
There were a total of 40,306 births reported in the 2000 MDHS. Of those births, 18,714 were eligible for the mortality analysis, meaning that they occurred in a rural area between 1980 and 1998 to a woman in a village who reported living in the same village since at least the date of birth (Table 1). Of the mothers in the mortality analysis sample, 43.8% had no education, 55.0% had primary education, and 1.2% had secondary education or higher. For the utilization analysis, the number of eligible births ranged from 2333 to 4926.
Kernel density plots of births by distance to nearest facility for selected years show that the average distance has been decreasing over time, driven in particular by facilities 10-20 km from a village being replaced by facilities less than 10 km away from a village (Fig. 3). Overall, the modal distance is 5-10 km, and very few births occurred within 1 km of a health facility (Fig. 4). Turning to the number of births per year, the number was higher in more recent years because the population was growing rapidly and the data come from women who were aged 15-49 in 2000. The sharp decrease in births in 1995 is almost certainly due to misreporting of 4 year olds as 5 year olds. This means that for a small number of child-years, children will be coded as being untreated when they were in fact treated. If the true effect of health facilities is protective, then this will bias estimates toward the null.
Each village contributed an average of 1643 child-months (137 child-years) to the analysis (min 32; max 4493) (Table 2). On average, 1289 of those child-months (78%) occurred prior to the construction of a new facility. Of the child-months that occurred after the construction of a new facility, relatively few were contributed by villages in which the distance to nearest facility decreased from 2-5 km to less than 2 km (only six villages fell into this category). Thus the statistical analysis that follows may have relatively little power to detect effects from changes to less than 2 km.
In all four models testing an association between distance and mortality, we found a significant relationship at p < 0.05 (Table 3). The bivariate model with linear distance showed that each additional kilometer was associated with a 1.1% increase in the hazard of death (95%CI 0.5 to 1.8%). Adding controls for year and month reduced the hazard ratio slightly to 1.007 (95%CI 1.001 to 1.014). Similar results were found after adding controls for child and mother characteristics (column 4) and using log (distance) instead of linear distance (see Additional file 1).
We found no statistically significant effect of reductions in linear distance to nearest health facility on under-5 mortality, using Cox models stratified by village (Table 4). The effect was marginally significant (p < 0.10) when linear distance was the only variable, but no longer significant after we added controls for year, month, and mother and child characteristics.
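To put the per-kilometer hazard ratios from the association models in perspective, they can be rescaled to larger distances. The sketch below assumes the log-linear form of the Cox model with linear distance holds across the whole range considered, which is a modeling assumption rather than a result; the helper names are hypothetical.

```python
import math


def scale_hazard_ratio(hr_per_km: float, km: float) -> float:
    """Hazard ratio implied for a `km` difference under a log-linear effect."""
    return math.exp(math.log(hr_per_km) * km)


def percent_change(hr: float) -> float:
    """Percentage change in the hazard implied by a hazard ratio."""
    return (hr - 1.0) * 100.0
```

For example, `scale_hazard_ratio(1.007, 10)` is about 1.072, so the adjusted association corresponds to a hazard roughly 7% higher for a village 10 km further from its nearest facility.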
Using categorical variables to account for differences in initial distance to nearest health facility in Cox models stratified by village, we again found no statistically significant effect of reductions in distance to nearest health facility on under-5 mortality (Table 5). Reductions in distance from 5-10 km to 2-5 km caused a 37.5% reduction in the hazard of mortality in the model with no controls for year, seasonality, or mother and child characteristics; however, once those controls were added, there was no effect from any category of distance reduction.
Distance to nearest health facility was associated with three out of four measures of health care utilization (Table 6). One additional kilometer was associated with a 1.2 percentage point (pp) increase in homebirth, a 0.8 pp decrease in having done at least three antenatal visits, and a 1.2 pp decrease in skilled assistance at birth. We obtained similar results without control variables (not shown).
We found that a 1 km reduction in distance to nearest health facility caused a 2.4 percentage point decrease in the probability that a professional checked the mother's health after a birth (Table 7). We did not find any effect on probability of homebirth, having done at least three antenatal visits, or skilled assistance at birth.

Key results
We found that children born further from health facilities were at higher risk of dying before age five. However, we found no evidence that reductions in distance to the nearest health facility caused by the construction of new health facilities in rural Malawi between 1980 and 2000 caused a change in the risk of children dying before age five. Similarly, we found that pregnant women living further from health facilities were more likely to give birth at home, less likely to have at least three antenatal care visits, and less likely to have skilled assistance at delivery. However, we did not find that reductions in the distance to the nearest health facility changed utilization of any of those three services. We did find, surprisingly, that reductions in distance caused a decrease in the probability that women received a postnatal checkup. Overall, this suggests that the associations between distance and our outcomes are driven by omitted variables, such as local burden of disease or social determinants of health (beyond those controlled for here).
Table notes: U5M, under-5 mortality. Hazard ratios (95% confidence intervals) from proportional hazards models. Distance is linear distance to nearest health facility from village centroid. The coefficient on the distance variable represents the HR for a one-kilometer increase in distance. The reference category for mother's education is 'less than primary', and for mother's age is '19-35 years old'. *** p < 0.01, ** p < 0.05, * p < 0.1.

Limitations
There are several important limitations to this study. First, new health facilities were not constructed in randomly assigned locations. Therefore, it is possible that children born in areas where new facilities were built were systematically more or less likely to die before age five than children in other areas. We used a two-way fixed effects estimation strategy to overcome this endogeneity, but it relied on the assumption that the underlying village-level hazard of under-5 mortality can be modeled as a multiplicative combination of time-invariant village effects and year-specific effects that are common across villages. That assumption implies that, after adjusting for mother's education, age at birth of her child, whether or not the child is first born, and village characteristics that do not change during the study period, secular trends in under-5 mortality are equivalent across villages. Second, while we estimated the average treatment effect, new health facilities are likely to produce heterogeneous effects depending on the quality of care provided. The training of the facility staff, the frequency with which they work, and the availability of pharmaceuticals and other medical supplies may vary widely across facilities. We restricted the analysis to dispensaries and maternity/dispensaries, but even within these categories the variability in quality of care may be significant. Variability in quality of care over time (if, for example, new and better health care technologies become available in more recent years) can also lead to heterogeneous effects for similar reasons.
Table notes: Total child-months 737,547 (in each of three columns). Hazard ratios from Cox proportional hazards models, with baseline hazard stratified by village of birth (n = 449). The reference category for mother's education is 'less than primary', and for mother's age is '19-35 years old'. ***p < 0.01.
Table 6 Associations between distance to nearest health facility and maternal health care utilization. Notes: Coefficients from linear probability models. ***p < 0.01, **p < 0.05, *p < 0.1.
Third, we had limited data on utilization. The MDHS only included those variables for births in the 5 years prior to the survey, so the sample size was roughly one-sixth of that for the mortality analysis. It remains possible that utilization of those services increased in earlier years, or that utilization of other services (e.g., treatment for diarrhea, pneumonia or malaria) increased at any time in the analysis period.
Fourth, the mortality data were retrospective accounts of women who were 15-49 years old in 2000. Children born to women who died were not represented. Because a child is more likely to die if her mother dies, retrospective estimates of under-5 mortality will tend to be underestimates if a substantial proportion of women die between the ages 15 and 49. The potential for bias in this study depends on two factors: the impact of new facilities on adult mortality, and the propensity for new facilities to be built in areas with increasing or decreasing adult mortality. Assume first that new facilities have no impact on adult or child mortality. If new facilities were built in areas where adult mortality is increasing and, thus, under-5 mortality is underestimated, then it will appear that the new facility caused a reduction in under-5 mortality. If, on the other hand, new facilities were built in areas where adult mortality was decreasing, then the opposite will occur. The onset of the HIV epidemic in Malawi in the mid-1980s increased adult mortality substantially [27]. It is unlikely, however, that new facilities were targeted to areas with higher or lower HIV prevalence. The Malawian government in 1980-1998 is unlikely to have possessed adequate capacity to monitor the burden of disease at a high enough spatial and temporal resolution to target new facilities to the areas with the highest burden [13]. Even with sufficient information on disease burden, health systems face a tradeoff between equity and efficiency when deciding where to locate new facilities [28,29]. Furthermore, there is likely to have been political pressure to use facilities as patronage, targeting areas to gain support rather than reduce disease [30]. Another determinant of adult mortality would have been the availability of antiretroviral therapy (ART). In Malawi, ART was not widely available until after 2004 [31], thus it would not affect the analysis presented here.
Fifth, there is measurement error in our distance measure, which is from the centroid of the village to the health facility. To the extent that households in a village are not located at the centroid, our measure of distance is not accurate. We expect over- and under-estimates from this mismeasurement to be equally likely. Nonetheless, this measurement error would be expected to bias our effect estimates towards the null. There may also be measurement error from facilities that closed between 1980 and 1998 and were not replaced, because they would not be in the 1998 health facility census. In those cases, we would be overestimating the true distance from the village to that facility. If shorter distances cause lower mortality, then our current analysis will underestimate the effect of reducing distance on mortality. However, we find it unlikely that a large number of facilities were shut down and not replaced during a time period when Malawi's population nearly doubled, from 6.25 million to 11 million.
Table 7 The effect of changes in distance to nearest health facility on maternal health care utilization
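The claim that symmetric measurement error pulls estimates towards the null can be illustrated with a small simulation. This is a generic sketch, not a reanalysis: it uses simple least squares instead of the Cox or fixed effects models in the paper, and the sample size, slope, and noise levels are arbitrary values chosen for the example.

```python
import random
from typing import List, Tuple


def ols_slope(x: List[float], y: List[float]) -> float:
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_attenuation(n: int = 5000, true_slope: float = 2.0,
                         noise_sd: float = 1.0,
                         seed: int = 0) -> Tuple[float, float]:
    """Return (slope using the true regressor, slope using a noisy one)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [true_slope * x + rng.gauss(0.0, 0.5) for x in x_true]
    # Classical measurement error: symmetric noise unrelated to x or y,
    # so over- and under-estimates are equally likely.
    x_obs = [x + rng.gauss(0.0, noise_sd) for x in x_true]
    return ols_slope(x_true, y), ols_slope(x_obs, y)
```

With equal variance in the true regressor and in the noise, the expected slope from the noisy regressor is about half the true slope, even though the errors themselves average to zero.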

Comparison to similar studies
Previous studies on the relationship between distance to nearest health facility and child mortality show mixed results. A study that combined survey data from 21 countries found that greater distance was associated with higher neonatal mortality, but not mortality at later ages [32]. A study in a rural Kenyan district with high health facility density found no association between travel time and child mortality [8], nor did a case-control study in rural Gambia [9]. The file drawer problem suggests that studies finding no relationship between distance and mortality are less likely to be published [33].
Studies reporting an association between distance and mortality include a study in rural Burkina Faso which estimated that under-5 mortality was 50% higher at a distance of 4 h walking time to the nearest facility compared with having a facility in the village [4]. A matched pairs study found that the construction of maternity clinics in Indonesia in the mid-1980s reduced infant mortality in the surrounding area by 15% [34]. In South Africa, allowing blacks to utilize facilities that were formerly restricted to whites increased the weight-for-age scores for male infants, but had no effect on female infants [35]. Several other observational studies have found a positive association between distance and under-5 mortality [36][37][38].

Generalizability
The external validity of this study benefits from the fact that it used data from villages and health facilities throughout rural Malawi, over a period of 18 years. Nevertheless, this is a study of a single country during a particular phase of history. As discussed above, the effects estimated here likely depend on the quality of health care, the availability of transportation, the demand for health services, and the underlying causes of mortality, among other factors. Data on these variables are scarce in Malawi and other high mortality countries, and more resources should be invested in collecting that information.

Conclusion
The results presented here suggest that there is more to the story of reducing under-5 mortality than increasing the availability of health infrastructure by reducing distances to the nearest facility. This finding may hold across other low-income, high-mortality countries, particularly in sub-Saharan Africa. More research is needed on the relationship between access to care, quality of care, and perceived quality of care in these settings.
Additional file 1: Additional results and sensitivity analysis.
Abbreviations ART: Antiretroviral therapy; MDHS: Malawi Demographic and Health Survey