Study site
Burkina Faso reported approximately 12 million malaria cases and 4000 deaths in 2018 [1]. Malaria transmission is intense and seasonal [16, 17] and, despite SMC, clinical malaria remains on the rise [1]. The Nanoro health and demographic surveillance site (HDSS), in the centre-west of the country, provides rich household survey data suitable for microplanning studies [18]. Three villages (Soaw, Rakolo, Mogdin) with different geographic or demographic characteristics were selected to test the potential of microplanning in optimizing SMC deployment. Population raster data were extracted to compare the villages’ characteristics; the comparison is presented in the Results section.
Census data were obtained from the HDSS and incidence data from the national District Health Information Software 2 (DHIS2) platform.
Microplanning model
To improve SMC door-to-door delivery, we used an accessibility model based on the traveling salesman algorithm (Fig. 1) to determine the optimal itinerary for CHWs to efficiently visit all households in each village. The model computes the shortest door-to-door visit itinerary from the global positioning system (GPS) coordinates of households and outputs travel distance, travel time and treatment duration. Households’ GPS coordinates and family sizes were extracted from the population raster of each village. The 2015 population raster of Burkina Faso, provided by the Center for International Earth Science Information Network (CIESIN) and the Connectivity Lab at Facebook, stores the number of individuals in each raster pixel, which can be processed to extract family sizes [19]. The fraction of children under 5 per household was subsequently derived from the corresponding family size, assuming that children under 5 make up approximately 18% of the total population [20].
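The under-5 derivation described above amounts to scaling each family size by the assumed 18% share. A minimal sketch follows; the function name and the sample family sizes are hypothetical, and how the source rounds the result is not stated, so the values are left fractional here.

```python
# Share of children under 5 in the total population, as assumed in the text [20].
UNDER5_SHARE = 0.18


def under5_estimates(family_sizes):
    """Return the expected number of children under 5 for each household.

    family_sizes: iterable of household sizes extracted from the raster.
    The result is left fractional; any rounding rule would be an extra assumption.
    """
    return [size * UNDER5_SHARE for size in family_sizes]


# Hypothetical family sizes for three households.
print(under5_estimates([5, 10, 7]))
```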
Generating household visit itinerary using the salesman algorithm
We describe the traveling salesman problem (TSP) as a graph theory problem. Each household is treated as a vertex and each pair of vertices is connected by an edge, so our graph is G = (V, E), where V is the set of vertices and E is the set of edges. Each edge has an associated cost cij, which is the distance between the two households. Our goal is to find the shortest path that starts from any vertex and passes through all other vertices exactly once. Unlike the classic TSP, we do not return to the starting household. We solve this problem with the Held-Karp algorithm, a dynamic programming solution [21]. The idea is to compute optimal sub-paths. We compute table entries C(S, i, j) for each subset S ⊂ V and each pair i, j ∈ S with i ≠ j, defined as the length of the shortest path from vertex i to vertex j that visits each vertex in S exactly once (and no vertex outside of S). The algorithm computes C(S, i, j) for an increasing number of vertices in the set S, up to N, the total number of vertices.
1. Let C({i, j}, i, j) = cij for all i ≠ j.
2. For k = 3 to N, do
   a. For all sets S ⊂ V with k vertices, compute \( C\left(S,i,j\right)=\underset{l\in S\backslash \left\{i,j\right\}}{\min}\left[C\left(S\backslash \left\{j\right\},i,l\right)+{c}_{lj}\right] \)
3. Return the optimal cost \( L=C\left(V,{n}_1,{n}_N\right)=\underset{i\ne j}{\min }C\left(V,i,j\right) \)
We now can recover the path as follows: n1, nN are our starting and ending vertices respectively. Vertex nN − 1 is the unique vertex satisfying
$$ \mathrm{C}\left(\mathrm{V},{\mathrm{n}}_1,{\mathrm{n}}_{\mathrm{N}}\right)=\mathrm{C}\left(\mathrm{V}\backslash \left\{{\mathrm{n}}_{\mathrm{N}}\right\},{\mathrm{n}}_1,{\mathrm{n}}_{\mathrm{N}-1}\right)+{\mathrm{c}}_{{\mathrm{n}}_{\mathrm{N}-1}{\mathrm{n}}_{\mathrm{N}}}. $$
If we have computed nN − 1, …, nj + 1, then vertex nj is the unique vertex satisfying
$$ \mathrm{C}\left(\mathrm{V}\backslash \left\{{\mathrm{n}}_{\mathrm{N}},\dots, {\mathrm{n}}_{\mathrm{j}+2}\right\},{\mathrm{n}}_1,{\mathrm{n}}_{\mathrm{j}+1}\right)=\mathrm{C}\left(\mathrm{V}\backslash \left\{{\mathrm{n}}_{\mathrm{N}},\dots, {\mathrm{n}}_{\mathrm{j}+1}\right\},{\mathrm{n}}_1,{\mathrm{n}}_{\mathrm{j}}\right)+{\mathrm{c}}_{{\mathrm{n}}_{\mathrm{j}}{\mathrm{n}}_{\mathrm{j}+1}}. $$
We now have the whole path (n1, …, nN) along with optimal distance L = C(V, n1, nN).
With the path linking households via their GPS coordinates, we estimated walking distance using the Euclidean distance formula.
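The dynamic programme and path recovery described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes household coordinates are already projected to a planar system (for example metres), uses straight-line Euclidean distances as edge costs, and stores the predecessor of each table entry so that the path can be read back directly. The function name and the sample coordinates are hypothetical. Because the table has on the order of 2^N · N² entries, the sketch is only practical for a small number of households.

```python
from itertools import combinations
from math import dist, inf


def shortest_visit_path(coords):
    """Shortest open path visiting every point exactly once (no return to start).

    coords: list of (x, y) tuples in a planar coordinate system (assumption).
    Returns (length, path) where path is a list of indices into coords.
    """
    n = len(coords)
    if n < 2:
        return 0.0, list(range(n))
    c = [[dist(p, q) for q in coords] for p in coords]

    # C[(mask, i, j)]: shortest path from i to j visiting exactly the vertices in mask.
    C, parent = {}, {}
    for i in range(n):
        for j in range(n):
            if i != j:
                C[((1 << i) | (1 << j), i, j)] = c[i][j]

    for k in range(3, n + 1):
        for subset in combinations(range(n), k):
            mask = sum(1 << v for v in subset)
            for i in subset:
                for j in subset:
                    if i == j:
                        continue
                    rest = mask & ~(1 << j)  # S \ {j}
                    best, arg = inf, None
                    for l in subset:
                        if l in (i, j):
                            continue
                        cand = C[(rest, i, l)] + c[l][j]
                        if cand < best:
                            best, arg = cand, l
                    C[(mask, i, j)] = best
                    parent[(mask, i, j)] = arg

    full = (1 << n) - 1
    length, start, end = min(
        (C[(full, i, j)], i, j) for i in range(n) for j in range(n) if i != j
    )

    # Walk the stored predecessors back from the end vertex to the start vertex.
    path, mask, j = [end], full, end
    while mask != ((1 << start) | (1 << j)):
        prev = parent[(mask, start, j)]
        mask &= ~(1 << j)
        j = prev
        path.append(j)
    path.append(start)
    path.reverse()
    return length, path


# Hypothetical household positions in metres.
length, path = shortest_visit_path([(0, 0), (300, 0), (100, 0), (200, 0)])
print(length, path)
```

Storing the predecessor alongside each table entry avoids re-evaluating the recovery equations; it is a convenience of this sketch rather than something stated in the text.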
Subdividing hard-to-reach areas using k-means clustering
For hard-to-reach villages with accessibility constraints (e.g. rivers), the model first clusters households using the constrained K-Means algorithm before determining the optimal itinerary and unmet needs for each cluster [22].
We give the mathematical formulation of the constrained K-Means problem. Let the dataset be D = {x1, …, xm}, where xi ∈ Rn. Let 1 ≤ k ≤ m be the number of clusters. We want to find cluster centers C1, …, Ck such that the sum of squared distances between each point xi and the cluster center Ch it is assigned to is minimized, under the condition that cluster h must contain at least τh data points, where \( {\sum}_{h=1}^k{\tau}_h\le m \). Choosing τh > 0 forces clusters to be non-empty, and τh can also be chosen so that all clusters have roughly the same number of data points. We let Ti, h ∈ {0, 1} denote the “selection variables” that indicate whether xi belongs to cluster h. The constrained K-Means problem is as follows.
$$ \underset{C,T}{\min }{\sum}_{i=1}^m{\sum}_{h=1}^k{T}_{i,h}\left(\frac{1}{2}{\left\Vert {x}_i-{C}_h\right\Vert}^2\right) $$
We can solve this iteratively. At iteration t, let C1, t, …, Ck, t be the cluster centers. We compute the cluster centers C1, t + 1, …, Ck, t + 1 at iteration t + 1 in 2 steps.
1. Cluster assignment: let \( {\mathrm{T}}_{\mathrm{i},\mathrm{h}}^{\mathrm{t}} \) be a solution to the following linear program with Ch, t fixed
$$ \underset{\mathrm{T}}{\min }{\sum}_{\mathrm{i}=1}^{\mathrm{m}}{\sum}_{\mathrm{h}=1}^{\mathrm{k}}{\mathrm{T}}_{\mathrm{i},\mathrm{h}}\left(\frac{1}{2}{\left\Vert {\mathrm{x}}_{\mathrm{i}}-{\mathrm{C}}_{\mathrm{h},\mathrm{t}}\right\Vert}^2\right) $$
$$ \mathrm{subject}\ \mathrm{to}\ {\sum}_{\mathrm{h}=1}^{\mathrm{k}}{\mathrm{T}}_{\mathrm{i},\mathrm{h}}=1,\mathrm{i}=1,\dots, \mathrm{m} $$
$$ \kern5.25em {\sum}_{\mathrm{i}=1}^{\mathrm{m}}{\mathrm{T}}_{\mathrm{i},\mathrm{h}}\ge {\uptau}_{\mathrm{h}},\mathrm{h}=1,\dots, \mathrm{k} $$
$$ \kern8.75em {\mathrm{T}}_{\mathrm{i},\mathrm{h}}\ge 0,\mathrm{i}=1,\dots, \mathrm{m},\mathrm{h}=1,\dots, \mathrm{k} $$
2. Update Ch, t + 1 as follows. If \( {\sum}_{\mathrm{i}=1}^{\mathrm{m}}{\mathrm{T}}_{\mathrm{i},\mathrm{h}}^{\mathrm{t}}=0 \), then no update is made: Ch, t + 1 = Ch, t. If \( {\sum}_{\mathrm{i}=1}^{\mathrm{m}}{\mathrm{T}}_{\mathrm{i},\mathrm{h}}^{\mathrm{t}}>0 \), then
$$ {\mathrm{C}}_{\mathrm{h},\mathrm{t}+1}=\frac{\sum_{\mathrm{i}=1}^{\mathrm{m}}{\mathrm{T}}_{\mathrm{i},\mathrm{h}}^{\mathrm{t}}{\mathrm{x}}_{\mathrm{i}}}{\sum_{\mathrm{i}=1}^{\mathrm{m}}{\mathrm{T}}_{\mathrm{i},\mathrm{h}}^{\mathrm{t}}} $$
We terminate when Ch, t + 1 = Ch, t for all h. This algorithm is guaranteed to converge to a locally optimal solution. The linear program in the cluster assignment step is equivalent to a Minimum Cost Flow (MCF) problem, a linear network optimization problem.
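The two-step iteration above can be illustrated with a small sketch. Note that it swaps the linear program (or minimum cost flow) solver for exhaustive enumeration of all 0/1 assignments. Enumeration returns an optimal assignment that satisfies the minimum-size constraints, but it is only practical for a handful of points. Function names, the starting centers and the sample points are hypothetical.

```python
from itertools import product


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers, tau):
    """Cluster assignment step by brute force (stand-in for the LP/MCF solver).

    Returns a tuple of cluster labels minimising the objective such that
    cluster h receives at least tau[h] points.
    """
    k = len(centers)
    if sum(tau) > len(points):
        raise ValueError("sum of minimum cluster sizes exceeds number of points")
    best, best_cost = None, float("inf")
    for labels in product(range(k), repeat=len(points)):
        if any(labels.count(h) < tau[h] for h in range(k)):
            continue  # violates a minimum-size constraint
        cost = sum(0.5 * sq_dist(x, centers[h]) for x, h in zip(points, labels))
        if cost < best_cost:
            best, best_cost = labels, cost
    return best


def update(points, labels, centers):
    """Cluster update step: mean of assigned points, unchanged if cluster is empty."""
    new_centers = []
    for h, center in enumerate(centers):
        members = [x for x, lab in zip(points, labels) if lab == h]
        if members:
            center = tuple(sum(col) / len(members) for col in zip(*members))
        new_centers.append(center)
    return new_centers


def constrained_kmeans(points, centers, tau, max_iter=100):
    """Alternate assignment and update until the centers stop changing."""
    labels = assign(points, centers, tau)
    for _ in range(max_iter):
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
        labels = assign(points, centers, tau)
    return labels, centers


# Hypothetical households: four on one side of a river, two on the other.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
labels, centers = constrained_kmeans(pts, [(0.0, 0.0), (10.0, 10.0)], tau=[1, 3])
print(labels, centers)
```

With `tau=[1, 3]` the second cluster must hold at least three households, so one household from the first group is reassigned even though it lies closer to the first center. This is the behaviour the minimum-size constraint is meant to produce.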
SMC performance under current standard deployment
Current standard SMC deployment refers to door-to-door delivery performed by a CHW whose geographical orientation and time management are based solely on the CHW’s own perception. The number of CHWs for Rakolo during the 2016 SMC campaign was limited to two, who were trained by one supervisor (health facility nurse). Maps were not carried by CHWs, although a roughly sketched paper map was used by the supervisor to macro-plan SMC deployment across the health facility catchment area. SMC coverage was defined as follows: \( SMC\ Coverage={\sum}_{i=1}^{T}\frac{n_i}{N}\times 100 \), where T is the campaign duration in days, n_i is the number of children treated on day i and N is the total number of children. Based on personal communications and on reports, we estimated the average treatment duration per child at 12.5 min, with treatment taking under 15 min on 63% of occasions and more than 30 min on 22% of occasions [23]. During the 2016 SMC campaign, CHWs’ service packages were fitted with GPS devices, which recorded the CHWs’ itineraries in Rakolo without their knowledge. Walking distances were estimated using visited household coordinates and converted to travel times assuming a 20 min walk per km in the wet season [24, 25].
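As a worked illustration of the coverage definition and the distance-to-time conversion above, the following sketch computes both quantities. The function names and all input numbers are hypothetical; only the formula and the 20 min per km walking pace come from the text.

```python
def smc_coverage(treated_per_day, total_children):
    """Coverage (%) = sum over campaign days of n_i / N x 100."""
    return 100.0 * sum(treated_per_day) / total_children


def travel_time_min(distance_km, min_per_km=20):
    """Convert a walking distance to minutes at the wet-season pace from the text."""
    return distance_km * min_per_km


# Hypothetical 4-day campaign: children treated each day, out of 200 eligible.
print(smc_coverage([50, 40, 30, 20], total_children=200))
# Hypothetical 6 km itinerary.
print(travel_time_min(6))
```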
SMC performance under microplanning
To predict SMC coverage under microplanning, the model assumes an initial number of 2 CHWs, a daily working time of 8 h and a campaign duration of 4 days. Walking distances in optimized visit itineraries were converted to travel times [24, 25]. Based on current CHWs’ experience, we assumed random draws of treatment duration (in minutes) for 1, 2 or 3 children per household as follows: t ∼ U(10, 15), t ∼ U(15, 20) or t ∼ U(20, 25), respectively. Assuming a household of 3 children, we estimated on average 8.5 min (25 min/3) per child as the optimal treatment duration. Predicted SMC coverage was computed as follows: \( SMC\ Coverage=\frac{T}{t}\times 100 \), where T is the campaign duration in days and t is the total of treatment and travel times in days (Fig. 1).
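The prediction step can be sketched as a small simulation. Several details are assumptions of this sketch rather than statements from the text: the total time is split equally between CHWs, minutes are converted to days using the 8 h working day, households with more than 3 children are treated like households with 3, and the result is not capped at 100%. The function names and inputs are hypothetical.

```python
import random

# Uniform bounds (minutes) for treating 1, 2 or 3 children in one household.
TREATMENT_BOUNDS_MIN = {1: (10, 15), 2: (15, 20), 3: (20, 25)}


def draw_treatment_minutes(n_children, rng):
    """Draw one household's treatment duration; sizes above 3 reuse the 3-child bounds."""
    low, high = TREATMENT_BOUNDS_MIN[min(n_children, 3)]
    return rng.uniform(low, high)


def predicted_coverage(children_per_household, walk_km, n_chw=2,
                       hours_per_day=8, campaign_days=4, min_per_km=20, seed=0):
    """Return (coverage %, t in working days per CHW) for one simulated campaign."""
    rng = random.Random(seed)
    treat_min = sum(draw_treatment_minutes(n, rng)
                    for n in children_per_household if n > 0)
    travel_min = walk_km * min_per_km
    # Assumption: work is shared equally and a working day lasts hours_per_day.
    t_days = (treat_min + travel_min) / n_chw / (hours_per_day * 60)
    return campaign_days / t_days * 100, t_days


# Hypothetical village: 400 households with 3 children each, 30 km itinerary.
coverage, t_days = predicted_coverage([3] * 400, walk_km=30)
print(round(coverage, 1), round(t_days, 2))
```

A value above 100 would mean the simulated workload fits within the campaign duration; whether the source caps the figure at 100% is not stated.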
We assessed CHW performance (SMC coverage) and unmet needs under the current SMC deployment mode and two microplanning scenarios (A and B). Microplanning A consists of optimizing the visit itinerary and the time invested in treatment, whereas microplanning B optimizes the visit itinerary, the time invested in treatment and the number of CHWs.
Proportions of treated children and visited households were compared between current SMC deployment and microplanning A or B using the chi-squared (χ²) test. Uncertainties around the optimal number of children treated daily and kilometres walked daily were estimated as 95% confidence intervals using the t-distribution.
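For a comparison of two proportions, the chi-squared test reduces to a 2 × 2 table. The sketch below computes the Pearson statistic without continuity correction and its p-value for one degree of freedom. The counts are hypothetical, and whether the source applied a continuity correction is not stated.

```python
from math import erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test for a 2x2 table [[a, b], [c, d]].

    Rows: deployment modes; columns: treated / not treated (for example).
    Returns (statistic, p-value) with 1 degree of freedom, no continuity correction.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail probability equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: treated vs untreated children under two deployment modes.
stat, p = chi2_2x2(30, 70, 50, 50)
print(round(stat, 3), round(p, 4))
```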
Unmet needs for SMC performance maximization
To maximize SMC performance, unmet needs were estimated by converting the maximum time needed to reach 100% SMC coverage into the number of CHWs needed:
Unmet needs (CHWs) = \( \frac{t}{T}\times 2 \), where T is the campaign duration in days and t is the total time invested in treatment and travel (Fig. 1). We chose to express unmet needs as additional CHWs rather than additional campaign days in order to reduce the workforce burden (fatigue).
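A short worked example of the unmet-needs formula follows. It reads the factor 2 as the initial number of CHWs assumed by the model and expresses t in the same day units as T. Both readings are assumptions of this sketch, and the function name and input value are hypothetical.

```python
from math import ceil


def unmet_needs_chw(t_days, campaign_days=4, initial_chw=2):
    """Unmet needs (CHWs) = t / T x 2, with 2 read as the initial number of CHWs."""
    return t_days / campaign_days * initial_chw


# Hypothetical total time of 10 days against a 4-day campaign.
needs = unmet_needs_chw(10)
print(needs, ceil(needs))  # rounding up to whole CHWs is an added convention
```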