Study design, setting and period
A secondary data analysis was conducted using data from three consecutive EDHSs conducted in 2005, 2011, and 2016 [24,25,26]. These surveys are nationally representative studies conducted in Ethiopia, which is situated in the Horn of Africa. Ethiopia is the second most populous country in Africa after Nigeria and has nine regional states (Afar, Amhara, Benishangul-Gumuz, Gambela, Harari, Oromia, Somali, Southern Nations, Nationalities, and People’s Region (SNNP), and Tigray) and two administrative cities (Addis Ababa and Dire-Dawa). Ethiopia is an agrarian country: 84% of the population lives in rural areas, and 80% of the country’s total population lives in the regional states of Amhara, Oromia, and SNNP [27]. About 60% of the total population lives in the pastoral regions (Somali, Afar, Oromia, Southern region, Gambela, and Benishangul-Gumuz), where settlement is sparse and communities have benefited least from health sector development [28]. Ethiopia is a multi-religious country in which Orthodox Christians and Muslims predominate, and it has more than 80 ethnic groups, each with its own culture and language. Ethiopia currently has one of the fastest-growing economies in Africa, and agriculture accounts for 40.5% of national GDP [29]. GDP has grown at an average annual rate of 11%, but 24% of the population still lives below the national poverty line [30, 31]. Healthcare funding for the country depends heavily on donors, followed by households in the form of out-of-pocket expenditure [32].
Ethiopia has a three-tier health system: primary health care units (primary hospitals, health centers, health posts, primary clinics, and medium clinics); secondary health care (general hospitals, specialty clinics, and specialty centers); and tertiary health care (specialized hospitals). The number of hospitals varies from region to region in response to differences in population size [33].
Sample and population
For this study, the data were obtained from eligible women aged 15–49 years who participated in the surveys. A stratified two-stage cluster sampling technique was employed in all three EDHSs, using the population and housing census as the sampling frame. In total, 21 sampling strata were created. In the first stage, 540 Enumeration Areas (EAs) in EDHS 2005, 624 EAs in EDHS 2011, and 645 EAs in EDHS 2016 were selected with probability proportional to EA size and with independent selection in each sampling stratum. In the second stage, on average 28–32 households were systematically selected in each selected EA. On this basis, a total weighted sample of 14,062 reproductive-age women in EDHS 2005, 16,490 in EDHS 2011, and 15,863 in EDHS 2016 was included in the analysis. The detailed sampling procedure is presented in the full EDHS reports [24,25,26]. For the spatial analysis, the geographic coordinates (longitude and latitude) of the selected enumeration areas were used. The EDHS data sets and the geographic coordinate data were accessed through an online request to the MEASURE DHS program that explained the objective of the study, after which we received an authorization letter.
Measurement of variables
The dependent variable, health care access challenges, was derived from a composite of four items and categorized dichotomously as Yes/No. To measure it, each reproductive-age woman was asked whether each of the following factors was a big problem in seeking medical advice or treatment for herself when sick: 1) getting permission to go to the doctor, 2) getting money for advice or treatment, 3) distance to a health facility, and 4) not wanting to go alone [34]. We then created a composite variable labeled “health care access challenges”: a woman who reported at least one item as a big problem was classified as “had health care access challenges”, and a woman who reported “not a big problem” for all items was classified as “had no health care access challenges” [35, 36]. Based on prior similar studies [37,38,39], the independent variables included in this study were maternal age (recoded as 15–24, 25–34, and 35–49), residence (recoded as urban and rural), maternal education (recoded as no education, primary education, and secondary and above), husband education (recoded as no education, primary education, and secondary and above), marital status (recoded as never married, married/living together, and separated/widowed/divorced), wealth status (recoded as poor, middle, and rich), visiting a health facility in the last 12 months (recoded as Yes and No), ANC visit (recoded as Yes and No), place of delivery (recoded as home and health facility), maternal occupation status (recoded as working and not working), contraceptive use and intention (recoded as using a modern method, using a traditional method, non-user who intends to use later, and does not intend to use), household head (recoded as male and female), preceding birth interval (recoded as < 2 years and ≥ 2 years), media exposure (generated by aggregating three variables, namely reading newspapers, listening to the radio, and watching television, and recoded as No and Yes), and current pregnancy.
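As an illustration of the composite outcome described above, the following minimal Python sketch codes a woman as having a health care access challenge if she reports at least one of the four items as a big problem. The item names and response labels are hypothetical placeholders, not actual EDHS variable names or value codes.

```python
# Hypothetical item names and response labels; the real EDHS recode
# variables and value codes should be taken from the survey documentation.
ITEMS = ("permission", "money", "distance", "not_alone")
BIG_PROBLEM = "big problem"

def has_access_challenge(responses):
    """Return 1 if any of the four items is reported as a big problem, else 0."""
    return int(any(responses.get(item) == BIG_PROBLEM for item in ITEMS))

# Two illustrative respondents: one with a money barrier, one with no barrier.
women = [
    {"permission": "not a big problem", "money": "big problem",
     "distance": "not a big problem", "not_alone": "not a big problem"},
    {item: "not a big problem" for item in ITEMS},
]
outcome = [has_access_challenge(w) for w in women]
```

In this sketch a missing response is treated as "not a big problem"; whether that is appropriate depends on how missing values are handled in the actual analysis.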
Data collection procedure
This study used data from the three EDHSs, obtained from the official MEASURE DHS website (www.measuredhs.com) after permission was granted through an online request specifying the objective of our analysis. We used the individual recode (IR) data set and extracted the outcome and independent variables. The location data (latitude and longitude) were also obtained from the MEASURE DHS program.
Data management and analysis
The data were weighted using the sampling weight, primary sampling unit, and strata before any statistical analysis, both to restore the representativeness of the survey and to have STATA account for the sampling design when calculating standard errors, so that the statistical estimates are reliable. Cross-tabulations and summary statistics were used to describe the study population. Analyses were performed using STATA version 14, ArcGIS version 10.6, SaTScan version 9.6, and R software.
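To make the role of the sampling weight concrete, the sketch below computes a weighted prevalence in Python. It assumes DHS-style weights stored with six implied decimals (so they are divided by 1,000,000 before use); the function name and toy values are illustrative. It yields a point estimate only: design-based standard errors additionally require the primary sampling unit and strata, as handled by survey routines in STATA.

```python
def weighted_prevalence(outcomes, raw_weights, scale=1_000_000):
    """Weighted proportion of women with outcome == 1.

    raw_weights are assumed to be DHS-style integer weights that must be
    divided by `scale` before use.
    """
    weights = [w / scale for w in raw_weights]
    total = sum(weights)
    if total == 0:
        raise ValueError("sum of weights is zero")
    return sum(w * y for w, y in zip(weights, outcomes)) / total

# Toy example: four women with unequal weights.
prev = weighted_prevalence([1, 0, 1, 0],
                           [2_000_000, 1_000_000, 500_000, 500_000])
```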
Decomposition analysis
Data from EDHS 2005 and EDHS 2016 were appended for the decomposition analysis. The trend was assessed separately in three phases: phase 1 (2005–2011), phase 2 (2011–2016), and phase 3 (2005–2016). A multivariate decomposition analysis was fitted to identify the factors that contributed significantly to the decrease in health care access challenges over the last 11 years (2005–2016). Because the outcome was binary, the logit-based multivariate decomposition technique for non-linear response models (MVDCMP) was used. This is a regression-based analysis of the decrease in health care access challenges between EDHS 2005 and EDHS 2016. The technique uses the output of a logit model to parcel out the observed decrease in the percentage of health care access challenges between the surveys into two components.
The multivariate decomposition analysis splits the overall decrease in health care access challenges over time into two parts: the decrease due to differences in women’s composition (endowments) across the surveys, and the decrease due to differences in the effects of those characteristics (coefficients) between the surveys. The overall decomposition therefore gives the percentage of the total decrease attributable to each part.
Hence, the observed decrease in health care access challenges between surveys is additively decomposed into a characteristics (or endowments) component and a coefficient (or effects of characteristics) component.
For logistic regression, the difference in the logit (log-odds) of health care access challenges between surveys A and B is taken as:
$$ \mathrm{Logit}(A) - \mathrm{Logit}(B) = F(X_A \beta_A) - F(X_B \beta_B) $$
$$ = \underbrace{\left[ F(X_A \beta_A) - F(X_B \beta_A) \right]}_{E} + \underbrace{\left[ F(X_B \beta_A) - F(X_B \beta_B) \right]}_{C} $$
The E component refers to the part of the differential owing to differences in endowments or characteristics. The C component refers to that part of the differential attributable to differences in coefficients or effects.
The equation can be presented as:
$$ \mathrm{Logit}(A) - \mathrm{Logit}(B) = \left[ \beta_{0A} - \beta_{0B} \right] + \sum X_{ijB} \left[ \beta_{ijA} - \beta_{ijB} \right] + \sum \beta_{ijB} \left[ X_{ijA} - X_{ijB} \right] $$
where:
- XijB is the proportion of the jth category of the ith determinant in EDHS 2005,
- XijA is the proportion of the jth category of the ith determinant in EDHS 2016,
- βijB is the coefficient of the jth category of the ith determinant in EDHS 2005,
- βijA is the coefficient of the jth category of the ith determinant in EDHS 2016,
- β0B is the intercept in the regression equation fitted to EDHS 2005, and
- β0A is the intercept in the regression equation fitted to EDHS 2016.
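The two-component split can be illustrated with a short Python sketch. It assumes the convention in which F is the mean predicted probability under a logit link, the endowment component E changes the covariates while holding the 2016 coefficients fixed, and the coefficient component C changes the coefficients while holding the 2005 covariates fixed. The coefficients and toy data are invented for illustration; this is not the mvdcmp implementation, which also provides detailed per-variable contributions and standard errors.

```python
import math

def mean_prob(X, beta):
    """Mean predicted probability F(X beta) under a logit link.

    beta[0] is the intercept; beta[1:] align with the columns of X.
    """
    total = 0.0
    for row in X:
        z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        total += 1.0 / (1.0 + math.exp(-z))
    return total / len(X)

def decompose(X_a, beta_a, X_b, beta_b):
    """Split F(X_a beta_a) - F(X_b beta_b) into endowment (E) and coefficient (C) parts."""
    E = mean_prob(X_a, beta_a) - mean_prob(X_b, beta_a)  # composition change
    C = mean_prob(X_b, beta_a) - mean_prob(X_b, beta_b)  # effect change
    return E, C

# Toy data: one binary covariate; coefficients are made up for illustration.
X_2016 = [[1], [1], [0], [1]]
X_2005 = [[0], [1], [0], [0]]
beta_2016 = [-0.5, -0.8]  # [intercept, slope]
beta_2005 = [0.2, -0.3]

E, C = decompose(X_2016, beta_2016, X_2005, beta_2005)
total = mean_prob(X_2016, beta_2016) - mean_prob(X_2005, beta_2005)
```

By construction E and C sum to the total difference, so each can be reported as a percentage of the overall change.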
The recently developed multivariate decomposition for non-linear models was used for the decomposition analysis of health care access challenges, via the mvdcmp STATA command [40]. Variables with a p-value < 0.2 in the bivariable multivariate decomposition analysis were considered for the multivariable multivariate decomposition analysis. In the multivariable analysis, variables with a p-value < 0.05 in the endowment and coefficient components were considered significant contributors to the decrease in health care access challenges over time. The Variance Inflation Factor (VIF) and tolerance were computed to check for significant multicollinearity among the independent variables. The mean VIF in this study was less than 10 and the tolerance was greater than 0.1, indicating no significant multicollinearity.
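For reference, the VIF and tolerance check can be sketched in Python with NumPy. The sketch relies on the identity that the VIF of predictor j equals the jth diagonal element of the inverse of the predictors' correlation matrix, with tolerance as its reciprocal. The design matrix is toy data; categorical predictors are assumed to have been dummy-coded beforehand.

```python
import numpy as np

def vif_and_tolerance(X):
    """Return (VIF, tolerance) for each column of a numeric design matrix X.

    VIF_j is the j-th diagonal element of the inverse correlation matrix;
    tolerance_j = 1 / VIF_j.
    """
    corr = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    vif = np.diag(np.linalg.inv(corr))
    return vif, 1.0 / vif

# Toy design matrix with three predictors (no intercept column).
X = [[1, 2, 0], [2, 1, 1], [3, 4, 0], [4, 3, 1], [5, 6, 1], [6, 5, 0]]
vif, tol = vif_and_tolerance(X)
```

Under the thresholds stated above, a predictor would be flagged when its VIF reaches 10 or its tolerance falls to 0.1.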
Spatial analysis
ArcGIS version 10.6 and SaTScan version 9.6 were used to explore the spatio-temporal distribution of health care access challenges. Global spatial autocorrelation (Global Moran’s I) was computed to assess whether women’s health care access challenges were dispersed, clustered, or randomly distributed in the study area [25]. Global Moran’s I is a spatial statistic that measures spatial autocorrelation across the entire data set and produces a single output value ranging from − 1 to + 1. A Moran’s I value close to − 1 indicates that health care access challenges are dispersed, a value close to + 1 indicates that they are clustered, and a value close to 0 indicates that they are randomly distributed. A statistically significant Moran’s I (p < 0.05) indicates that the distribution of women’s health care access challenges is non-random.
Kriging interpolation was employed to estimate the burden of health care access challenges in unsampled areas of the country from the observed data. Spatial interpolation predicts values at unsampled locations from the values observed at sampled EAs. There are various deterministic and geostatistical interpolation methods. Among them, ordinary kriging and empirical Bayesian kriging are considered the best because they incorporate spatial autocorrelation and statistically optimize the weights [26]. In this study, ordinary kriging was used to predict women’s health care access challenges in unobserved areas of Ethiopia because it had the lowest residual.
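The Global Moran’s I statistic itself can be written in a few lines of Python. The sketch below takes one value per location (for example, the proportion of women with access challenges in each EA) and a spatial weights matrix supplied by the analyst; the chain-shaped neighbour structure in the example is a toy assumption, and the significance test performed by ArcGIS is not reproduced here.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` given a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                # deviations from the mean
    s0 = sum(sum(row) for row in weights)         # sum of all weights
    num = sum(weights[i][j] * z[i] * z[j]
              for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * (num / den)

# Four locations on a line; each is a neighbour of the adjacent ones.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
clustered = morans_i([1, 1, 0, 0], chain)   # similar values sit together
dispersed = morans_i([1, 0, 1, 0], chain)   # values alternate
```

In this toy example the clustered pattern gives a positive value and the alternating pattern gives − 1, matching the interpretation described above.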
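A minimal ordinary kriging predictor is sketched below in Python with NumPy to show how the weights are obtained. It assumes an exponential semivariogram with fixed, illustrative parameters (in practice these are fitted to the empirical semivariogram) and uses plain Euclidean distance between coordinates; it is not the ArcGIS implementation.

```python
import numpy as np

def exp_variogram(h, sill=1.0, range_=1.0, nugget=0.0):
    """Exponential semivariogram; the parameter values are illustrative assumptions."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + sill * (1.0 - np.exp(-h / range_)), 0.0)

def ordinary_kriging(coords, values, target, **vparams):
    """Predict the value at `target` from observed (coords, values).

    Solves the ordinary kriging system, whose last row and column enforce
    that the weights sum to one (unbiasedness constraint).
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d, **vparams)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    d0 = np.linalg.norm(coords - np.asarray(target, dtype=float), axis=1)
    b[:n] = exp_variogram(d0, **vparams)
    sol = np.linalg.solve(A, b)
    w = sol[:n]                       # kriging weights (last entry is the multiplier)
    return float(w @ values), w

# Toy data: four sampled locations on a unit square.
coords = [(0, 0), (1, 0), (0, 1), (1, 1)]
values = [0.2, 0.4, 0.6, 0.8]
pred_center, w_center = ordinary_kriging(coords, values, (0.5, 0.5))
```

With a zero nugget the predictor reproduces the observed value at a sampled location, and the weights always sum to one.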
Bernoulli-based spatial scan statistical analysis was employed to detect the primary and secondary significant spatial clusters of health care access challenges using Kulldorff’s SaTScan version 9.6 software. The spatial scan statistic uses a circular scanning window that moves across the study area. To fit the Bernoulli model, women with health care access challenges were taken as cases and women without health care access challenges were taken as controls. The default maximum spatial cluster size of < 50% of the population was used since it allowed both small and large clusters to be detected and ignored clusters that contained more than the maximum limit. For each potential cluster, a likelihood ratio test statistic and its p-value were used to determine whether the number of observed health care access challenge cases within the potential cluster was significantly higher than expected. The scanning window with the maximum likelihood was the most likely cluster, and a p-value was assigned to each cluster using Monte Carlo hypothesis testing, by comparing the rank of the maximum likelihood from the real data with the maximum likelihoods from the random data sets. The primary and secondary clusters were identified, assigned p-values, and ranked by their likelihood ratio test statistics, based on 999 Monte Carlo replications [27].
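The test statistic for a single candidate window under the Bernoulli model, and the rank-based Monte Carlo p-value, can be sketched in Python as follows. The function and argument names are illustrative; the search over window locations and sizes and the generation of random data sets, which SaTScan performs, are omitted.

```python
import math

def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)

def bernoulli_llr(c, n, C, N):
    """Log-likelihood ratio for one window under the Bernoulli model.

    c, n: cases and total women inside the window (0 < n < N).
    C, N: cases and total women in the whole study area.
    Only windows with a higher case rate inside than outside score above 0.
    """
    if c / n <= (C - c) / (N - n):
        return 0.0

    def ll(k, m):  # log-likelihood of k cases among m at the observed rate
        return _xlogy(k, k / m) + _xlogy(m - k, (m - k) / m)

    return ll(c, n) + ll(C - c, N - n) - ll(C, N)

def monte_carlo_p(observed, simulated):
    """Rank-based p-value of the observed maximum against simulated maxima."""
    rank = 1 + sum(s >= observed for s in simulated)
    return rank / (len(simulated) + 1)

llr = bernoulli_llr(c=9, n=10, C=10, N=20)
p = monte_carlo_p(llr, [0.0] * 999)
```

With 999 replications the smallest attainable p-value is 1/1000 = 0.001, which is why that number of replications is a common choice.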
Ethical approval and consent to participate
Since the study was a secondary analysis of publicly available survey data from the MEASURE DHS program, ethical approval and participant consent were not necessary for this particular study. We requested access from the DHS Program, and permission was granted to download and use the data for this study from http://www.dhsprogram.com. There are no names of individuals or household addresses in the data files. The geographic identifiers only go down to the regional level (where regions are typically very large geographical areas encompassing several states/provinces). In surveys that collect GIS coordinates in the field, the coordinates are only for the enumeration area (EA) as a whole, and not for individual households, and the measured coordinates are randomly displaced within a large geographic area so that specific enumeration areas cannot be identified.