
The effectiveness of pay-for-performance contracts with non-governmental organizations in Afghanistan – results of a controlled interrupted time series analysis



In many contexts, including fragile settings like Afghanistan, the coverage of basic health services is low. To address this challenge, there has been considerable interest in working with NGOs and in examining the effect of financial incentives on service providers. The Government of Afghanistan has used contracting with NGOs for more than 15 years and in 2019 introduced pay-for-performance (P4P) into the contracts. This study examines the impact of P4P on health service delivery in Afghanistan.


We conducted an interrupted time series (ITS) analysis with a non-randomized comparison group that employed segmented regression models and used independently verified health management information system (HMIS) data from 2015 to 2021. We compared 31 provinces with P4P contracts to 3 provinces where the Ministry of Public Health (MOPH) continued to deliver services without P4P. We used data from annual health facility surveys to assess the quality of care.


Independent verification of the HMIS data found that consistency and accuracy were greater than 90% in the contracted provinces. The introduction of P4P increased the 10 P4P-compensated service delivery outcomes by a median of 22.1 percentage points (range: 10.2 to 43.8) for the two-arm analysis and 19.9 percentage points (range: −8.3 to 56.1) for the one-arm analysis. There was a small initial decrease in quality of care, but it was short-lived. We found few other unintended consequences.


P4P contracts with NGOs led to a substantial improvement in service delivery at lower cost despite a very difficult security situation. The promising results from this large-scale experience warrant more extensive application of P4P contracts in other fragile settings or wherever coverage remains low.


Strengths and limitations of the study

  • We have 48 months of data before introduction of P4P and 30 months of data afterwards. The HMIS data we use were verified robustly and independently.

  • The scheme was implemented on a very large scale — covering a population of about 29 million in the intervention provinces and 1.4 million in the comparison arm. It was also implemented at a cost that is reproducible and was done in the context of increasing insecurity. This was not a lavish pilot.

  • We rely primarily on HMIS data as no household survey data are available after 2018 to ascertain the population-level impact of the scheme.

  • The HMIS data from the comparison provinces were less consistent and accurate than those from the intervention provinces, which could bias the estimates (and understate the differences).

  • There is some heterogeneity in the results between services.

  • Valid sub-population estimates were not available for use as denominators.


Fragile and conflict-affected settings, like Afghanistan, contribute disproportionately to worldwide poverty and the global burden of disease. In 2021, about 48% of the extreme poor lived in such settings, and this is expected to increase to 66% by 2030 [1]. Of the 10 countries with the highest rates of infant mortality, seven are classified as fragile states; and 14 of the 20 countries with the weakest progress on reducing maternal mortality from 1990 to 2015 were fragile [2]. Finding effective ways of improving health services in fragile settings is thus of global consequence.

Significant attention has recently focused on using financial incentives to improve the delivery and uptake of health services in fragile and low-resource settings. This has usually been referred to as results-based financing although the literature is replete with different nomenclature. In this paper, we examine pay-for-performance (P4P) contracts with non-governmental organizations (NGOs) in which they are paid partly on the basis of the number of defined services they deliver.

Much of the recent literature on results-based financing focuses on performance-based financing (PBF), in which funds are provided to individual public sector health facilities based mostly on the number of services they deliver. A recent systematic review of PBF [3, 4] combined the results of 22 randomized studies. The authors concluded that PBF had a modest effect on utilization: it increased institutional deliveries by only 4.4 percentage points and uptake of modern contraceptives by 2.4 percentage points. This is consistent with a recent Cochrane Collaboration systematic review [5] which found, based on a review of 14 randomized studies predominantly in the public sector, limited effects on institutional deliveries (−3 to +18%), modern family planning use (0.2%), and pentavalent vaccination coverage (−5.7%). Numerous articles have highlighted the importance of context in assessing PBF impact [6,7,8].

We evaluated whether contracts with NGOs that use an approach similar to PBF would be effective. The P4P contracts were similar to PBF in that they made incentive payments based on the number of specifically selected services (such as institutional deliveries) that were provided to beneficiaries and the services had to be verified independently. The P4P contracts differed from PBF in that: (i) NGOs rather than the public sector provided the services; and (ii) payments were made to the NGO for a defined geography, in this case a whole province rather than to an individual health facility.

There has been interest in harnessing the non-state sector to improve health services; however, a recent systematic review [9] of contracting with NGOs found only two studies [10, 11] that used rigorous methodologies and concluded that more studies were needed. The quasi-experimental study from Cambodia [10] randomized districts to contracts with NGOs or continued public sector management. Contracting led to an average 0.5 baseline standard deviation increase in coverage compared to the control districts (e.g., Vitamin A coverage increased by 21 percentage points and antenatal care by 33 percentage points). Similarly, a quasi-experimental study from Guatemala [11] found that mobile outreach teams implemented by contracted NGOs led to an increase in prenatal care provided by a nurse or physician (29 percentage points), in women receiving 3 or more prenatal visits (38 percentage points), and in DTP3 coverage (24 percentage points), but no discernible increase in the uptake of family planning.

NGOs have played an important role in health care in Afghanistan over the last 40 years. From the 1980s until the early 2000s, most health services in Afghanistan, about 80% in rural areas, were provided by international humanitarian NGOs that worked “cross-border,” primarily from Pakistan. Starting in 2002, with the establishment of a new government, the Ministry of Public Health (MOPH) defined its priorities through the Basic Package of Health Services (BPHS) and the Essential Package of Hospital Services (EPHS), which laid out the structures of the country’s primary health care and secondary hospital services [12]. With financial support from the international community, the Government contracted with NGOs for the delivery of these two packages of services. By setting priorities, allocating geographical responsibilities, providing financing, and carefully monitoring performance, the MOPH was able to provide strategic direction to what was previously an uncoordinated sector and helped address serious constraints, such as scarce human resources and a lack of physical facilities [13].

Health services and health outcomes in Afghanistan improved considerably from 2004 to 2015, although significant gaps remained [14]. For example, the under-five mortality rate as measured by demographic and health surveys declined from 97 per 1000 live births in 2010 to 55 in 2015, but the maternal mortality ratio remained high [15]. Starting in 2015, a project called SEHAT (“health” in Dari) pooled funds from different donors so that contracts were the same across the country. A presidential health summit in 2017 identified key areas where progress was lagging: 1) family planning to provide women with greater reproductive choice; 2) institutional deliveries to decrease maternal mortality; 3) growth monitoring to address widely prevalent child malnutrition; and 4) immunization. There was also consensus on the need to shift from “contract management” (adherence to contractual obligations) to “performance management” (greater focus on results). This shift provided the impetus for the design of a new health project called “SEHATMANDI” (“healthy” in Dari), which introduced an explicit P4P aspect to the NGO contracts. These contracts began on January 1, 2019 and covered about 29 million people in 31 of the country’s 34 provinces.

P4P contracts were implemented during a period of increasing insecurity in Afghanistan. The Uppsala Conflict Data Program [16] found that total fatalities from conflict increased annually from 2008 to 2019 before declining in 2020 (Fig. 1). The conflict in Afghanistan was considerably bloodier than those in Somalia, South Sudan, and Yemen; and in 2018, Afghanistan replaced Syria as the world’s deadliest conflict.

Fig. 1 Number of Conflict-Related Fatalities per Year from the Uppsala Conflict Data Program

We conducted an interrupted time series analysis with a non-randomized comparison group to assess the effectiveness of a P4P scheme with NGOs in Afghanistan in increasing the volume of services. We compared this approach to a “business as usual” model of government delivery of services. This contributes to the literature on the use of financial incentives and contracting with non-state actors for health service delivery.



P4P contracts under SEHATMANDI (2019–2021)

The P4P contracts in Afghanistan have been described in detail elsewhere [17]. Briefly, they involved paying the selected NGOs a fixed tariff for each of 11 services, mostly focused on reproductive, maternal, neonatal, and child health. The number of services claimed by the NGO was verified by an independent third-party under a separate contract with the MOPH. Payments to the NGO were adjusted based on the findings of the third-party. For example, if an NGO claimed 1000 institutional deliveries in 6 months, the tariff per delivery was $20, and the third-party was able to verify 90% of the deliveries in the health facilities they visited, then the NGO would receive $18,000 (1000 × $20 × 90%). In addition to the P4P payments, NGOs were provided a lump-sum payment to cover their overheads and other services not compensated by the P4P mechanism. The amount of the lump sum was determined through a competitive tendering process and averaged 40.2% of the total contract value (range: 4 to 69%).
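The worked example above can be restated as a short calculation. The sketch below is illustrative only: the function name `p4p_payment` and its parameter names are hypothetical, and it assumes the adjustment is a simple multiplication by the verified share, as the example in the text implies.

```python
def p4p_payment(claimed_services: int, tariff_usd: float, verified_share: float) -> float:
    """Payment for one service line after third-party verification.

    Assumption: the payment is scaled down in proportion to the share of
    claimed services that the third party could verify (0.0 to 1.0).
    """
    if not 0.0 <= verified_share <= 1.0:
        raise ValueError("verified_share must be between 0 and 1")
    return claimed_services * tariff_usd * verified_share


# Figures from the example in the text: 1000 deliveries, $20 tariff, 90% verified.
payment = p4p_payment(claimed_services=1000, tariff_usd=20.0, verified_share=0.90)
print(f"${payment:,.0f}")  # $18,000
```

The lump-sum component of the contract is not modelled here, since it was set by competitive tender and not by service volume.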

SEHAT contracts (2015–2018)

SEHAT contracts differed from SEHATMANDI contracts principally in the way NGOs were paid. Under SEHAT, NGOs received 80% of the price that they bid as a lump-sum payment every 6 months. They could earn up to 10% of the contract value based on the functionality of their health facilities as judged by the availability of technical staff, functional equipment, and pharmaceuticals. They could earn a further 10% of the contract price based on reaching targets using HMIS data. This was not a PBF approach because it was based on achievement of pre-specified targets not a payment for each service provided. The approach resulted in less of the payments to the NGOs being at risk. In practice, contracted NGOs earned about 96% of the contract value. BPHS contracts under SEHAT cost 15% more than under SEHATMANDI.
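To make the difference in payment risk concrete, the SEHAT arrangement described above can be sketched as follows. This is a hypothetical illustration: the function name, the scoring of each 10% tranche as a fraction between 0 and 1, and the example scores are all assumptions, since the text does not say how partial achievement was scaled.

```python
def sehat_payment(contract_value: float,
                  functionality_score: float,
                  target_score: float) -> float:
    """Total SEHAT payment under the 80/10/10 split described in the text.

    Assumption: each 10% tranche is paid in proportion to a score in [0, 1].
    """
    for score in (functionality_score, target_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be between 0 and 1")
    lump_sum = 0.80 * contract_value                       # paid regardless of results
    functionality_part = 0.10 * contract_value * functionality_score
    target_part = 0.10 * contract_value * target_score
    return lump_sum + functionality_part + target_part


# Hypothetical contract and scores, chosen so the total matches the ~96% average.
contract_value = 1_000_000.0
paid = sehat_payment(contract_value, functionality_score=0.9, target_score=0.7)
share_paid = paid / contract_value
print(f"{share_paid:.0%}")  # 96%
```

Under this structure at most 20% of the contract value was at risk, whereas under SEHATMANDI the lump sum averaged 40.2% of the contract value, leaving roughly 60% tied to verified service delivery.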


The MOPH Strengthening Mechanism (MOPH-SM) involved providing a fixed annual budget and technical assistance to the provincial health directorates of 3 provinces (Kapisa, Panjshir, and Parwan) with a combined population of 1.4 million. These provinces were considered the comparison area in this study. They were chosen by the Government in 2004 as a testbed for public sector delivery of services and were maintained that way until the fall of the Government in 2021. The MOPH-SM provinces were close to Kabul, were relatively easy to access, and suffered less insecurity than other provinces. The government continued to deliver health services and competitively recruited 25 Afghan consultants to support the provinces. The government paid the salaries of about two-thirds of the frontline health workers, while international donors paid for the technical assistance, drugs, supplies, salaries for about one-third of the frontline workers, and other operating expenses. The main differences between the three approaches are summarized in Table 1.

Table 1 Characteristics of the different types of contracts

Data sources

We analyzed quarterly HMIS data collected by the MOPH from January 1, 2015, to June 30, 2021, for both the intervention and comparison provinces. Each facility submitted standard HMIS forms through the contracted NGO to the provincial health units. From there, the forms were submitted electronically to the central MOPH. All P4P indicators (see Table 3, column 1) were included in the HMIS. One of the 11 indicators, growth monitoring and promotion, was introduced as a new indicator only in 2019 and so could not be analyzed. Prior to doing the analysis, we chose 4 non-compensated services (last 4 rows of Table 3) for which there were HMIS data during the study period to determine whether they were negatively affected.

Verification of the HMIS data was performed by an independent third-party contracted by the MOPH. Every 6 months, the third-party chose a stratified random sample of 25% of the health facilities in each province (with a minimum of 10 facilities) and visited them in person. For the 11 P4P indicators, the third-party compared the quantity of services found in the hard copy registers in the facility to the numbers reported through the HMIS. This ratio was defined as “consistency” and was capped at 100%. A “consistency” above 100% would have represented under-reporting of performance, which was seen as less serious than over-reporting.

The third-party also randomly selected 5 patients for each of the 11 services (55 patients in total) and visited them at home to see whether they had received the specified service at the indicated time. The ratio of patients found to exist and to have received the service to the number of patients sampled was referred to as “accuracy.” A composite score of HMIS data “correctness” equaled the product of the consistency and accuracy scores. The verification procedures were similar during SEHAT and SEHATMANDI except that during SEHAT, only 20 patients were sampled to assess the accuracy of the facility records and only 3 randomly chosen indicators were assessed. The verification during both projects covered all provinces, including the comparison group, and the verification cost was about $600,000 per year, 0.5% of the annual total contract value ($120 million).
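The three verification scores defined above can be written as a minimal sketch. The function names and the example counts are hypothetical and are not taken from the verification reports; only the definitions (register-to-HMIS ratio capped at 100%, share of sampled patients confirmed, and the product of the two) come from the text.

```python
def consistency(register_count: int, hmis_count: int) -> float:
    """Services found in facility registers divided by services reported in HMIS, capped at 1.0."""
    if hmis_count <= 0:
        raise ValueError("hmis_count must be positive")
    return min(register_count / hmis_count, 1.0)


def accuracy(patients_confirmed: int, patients_sampled: int) -> float:
    """Share of sampled patients found at home who confirmed receiving the service."""
    if patients_sampled <= 0:
        raise ValueError("patients_sampled must be positive")
    return patients_confirmed / patients_sampled


def correctness(consistency_score: float, accuracy_score: float) -> float:
    """Composite score: the product of consistency and accuracy."""
    return consistency_score * accuracy_score


# Hypothetical facility: 95 services in the register against 100 reported in HMIS,
# and 52 of the 55 sampled patients confirmed at home.
c = consistency(register_count=95, hmis_count=100)
a = accuracy(patients_confirmed=52, patients_sampled=55)
print(f"{c:.2f} {a:.3f} {correctness(c, a):.3f}")  # 0.95 0.945 0.898

# Under-reporting in HMIS (register shows more than was reported) is capped at 100%.
print(consistency(register_count=110, hmis_count=100))  # 1.0
```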

The consistency and accuracy of the HMIS data were above 95% in the contracted provinces but somewhat lower in the MOPH-SM provinces (Table 2), indicating more over-reporting in the latter. There is no evidence that the consistency, accuracy, or “correctness” of the HMIS data deteriorated with the introduction of P4P contracts.

Table 2 Verification of HMIS Data by an Independent Third-Party Monitor in Intervention (P4P) and Comparison (MOPH-SM) Provinces

To assess quality of care, we used information from a series of annual health facility surveys that were known as the Balanced Scorecard. The facility surveys were conducted by the third-party in PHC facilities (to assess BPHS implementation) and separately in hospitals (to assess EPHS execution). They involved direct observation of patient-provider interactions, examination of the facility, exit interviews with patients, and interviews with staff and community members.

For the BPHS Balanced Scorecard, the third-party visited an average of 24 facilities per province (about 30% of all PHC facilities) and examined 23 composite indicators. For EPHS, the third-party visited all provincial hospitals and regional hospitals (in those provinces where they were located) and examined 34 composite indicators. Even at the height of the conflict (in 2019 and 2020) the third-party was able to visit 99.2% of the sampled facilities. The cost of carrying out the Balanced Scorecard was about $400,000 per round.


We analyzed quarterly HMIS data points as rates (e.g., number of institutional deliveries/100,000 population) with segmented linear regression modelling. The rates were calculated based on the population size for the corresponding year when services were delivered. The baseline time segment was January 2015–December 2018, scale-up was January–March 2019 (excluded from the analysis), and follow-up was April 2019–June 2021. Models were first run with a term to adjust for autocorrelation from repeated observations over time; if the autocorrelation term was not significant at the p < 0.05 level, it was dropped. The modeling was performed separately for each study arm, and the results were used to estimate three effect sizes for each outcome: 1) level change from the end of the baseline period to immediately after scale-up (Table 3, column 2), 2) baseline-to-follow-up change in outcome slope (Table 3, column 3), and 3) a relative-change “combined” effect size of the level and slope changes at the mid-point of the follow-up period (Table 3, column 4). A fourth effect size was the arithmetic difference between the “combined” effect sizes of the intervention and comparison arms (Table 3, column 5). We calculated 95% confidence intervals using standard errors estimated with the delta method, which estimates the variance of a function by combining the uncertainty of each element of the function [18]. We computed the combined effect size at 4.5 quarters because that was the mid-point of the follow-up period. This approach was used because other studies [19, 20] have found that effect sizes of ITS studies calculated at the mid-point of the follow-up period are similar to effect sizes derived from a randomized-controlled study design. The overall analytic approach is identical to that used by a large systematic review of the effectiveness of strategies to improve health worker performance in LMICs [21].
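The segmented regression and the resulting effect sizes can be illustrated with a minimal sketch. It is not the authors' code (the analysis was run in R) and makes several simplifying assumptions: ordinary least squares without an autocorrelation term, no delta-method confidence intervals, time coded in whole quarters with follow-up time counted from the first follow-up quarter, and noise-free synthetic series whose parameters are invented purely for illustration. All function and variable names are hypothetical.

```python
import numpy as np

# Quarter index: 0-15 = baseline (2015-2018), 16 = scale-up (dropped), 17-25 = follow-up.
SCALE_UP = 16
FIRST_FOLLOWUP = 17
t = np.array([q for q in range(26) if q != SCALE_UP], dtype=float)


def design_matrix(t: np.ndarray) -> np.ndarray:
    """Columns: intercept, baseline trend, level change, slope change."""
    post = (t >= FIRST_FOLLOWUP).astype(float)
    since = np.where(post == 1.0, t - FIRST_FOLLOWUP, 0.0)
    return np.column_stack([np.ones_like(t), t, post, since])


def fit_segmented(y: np.ndarray, t: np.ndarray) -> np.ndarray:
    """OLS fit of y = b0 + b1*t + b2*post + b3*since for one study arm."""
    beta, *_ = np.linalg.lstsq(design_matrix(t), y, rcond=None)
    return beta


def combined_effect(beta: np.ndarray, midpoint: float = 4.5) -> float:
    """Relative change at the follow-up mid-point versus the extrapolated baseline trend."""
    b0, b1, level, slope = beta
    counterfactual = b0 + b1 * (FIRST_FOLLOWUP + midpoint)
    return (level + slope * midpoint) / counterfactual


def synthetic_arm(b0: float, b1: float, level: float, slope: float) -> np.ndarray:
    """Noise-free illustrative series (real HMIS data would include noise)."""
    return design_matrix(t) @ np.array([b0, b1, level, slope])


# Invented parameters, services per 100,000 population per quarter.
y_p4p = synthetic_arm(b0=400.0, b1=5.0, level=60.0, slope=8.0)
y_cmp = synthetic_arm(b0=380.0, b1=5.0, level=10.0, slope=1.0)

beta_p4p = fit_segmented(y_p4p, t)
beta_cmp = fit_segmented(y_cmp, t)

effect_p4p = combined_effect(beta_p4p)   # one-arm "combined" effect
effect_cmp = combined_effect(beta_cmp)
two_arm = effect_p4p - effect_cmp        # arithmetic difference between arms

print(f"level change: {beta_p4p[2]:.1f}, slope change: {beta_p4p[3]:.1f}")
print(f"one-arm: {100 * effect_p4p:.1f}%, two-arm: {100 * two_arm:.1f} points")
```

Fitting each arm separately and then differencing the combined effects mirrors the one-arm and two-arm estimates described in the text; a fuller implementation would add the autocorrelation adjustment and the confidence intervals.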

Table 3 Modeling results on the effects of the Pay for Performance (P4P) intervention – Services provided per 100,000 population

Further details on the analytical approach have been described elsewhere [see Appendix 1, pages 52–56 and 69–72 of reference 21] and are in Table 4. A pre-planned sensitivity analysis was performed in which one data point (April–June 2020) was omitted because the COVID-19 pandemic had a particularly strong negative effect on utilization of health services during this quarter. We assessed the goodness-of-fit of our models using R² statistics. Statistical analyses were conducted using R Statistical Software (version 4.1.2; R Foundation, Vienna, Austria).

Table 4 Details on the estimation of effect sizes from the main analysis of the institutional deliveries outcome

No ethical approval of this study was sought because it relies on published HMIS data that had no individual identifiers and no disaggregation below the level of a province.


Figure 2a compares the number of institutional deliveries per 100,000 population in the contracted provinces and the MOPH-SM provinces. The levels and rates of change appeared similar during the SEHAT project (2015–2018). The rates of increase diverged sharply after the introduction of P4P in the first quarter of 2019, and the change in slope was accompanied by a substantial “step change.”

Fig. 2 a Institutional deliveries in Contracted (Intervention) and MOPH-SM (Control) Provinces. b Institutional deliveries - Intervention (P4P) arm only

Figure 2b shows the level effect and slope effect in just the contracted provinces (the intervention arm). The implementation of P4P immediately increased the number of institutional deliveries by 60.6 deliveries per 100,000 population (95% CI: 30.9 to 90.2). The slope also increased, by 8.7 (95% CI: 4.0 to 13.3) deliveries per 100,000 population per quarter. Combining the level effect and slope effect, we estimate that the number of institutional deliveries was 609.0 per 100,000 population at the midpoint of P4P implementation (after 4.5 quarters) compared to 509.0 per 100,000 in the counterfactual scenario. The increase is equivalent to 20.2 percentage points (95% CI: 14.4–26.0%). Effect sizes are further described in Table 4 (with institutional deliveries as an example) and graphs for the other nine P4P indicators are in Additional file 2: Annex B.
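As a rough arithmetic check on how the level and slope effects combine, the rounded figures quoted above can be plugged in directly. This is only a back-of-the-envelope sketch under the assumption that the combined effect equals the level change plus the slope change multiplied by 4.5 quarters; the variable names are hypothetical, and the published estimate comes from the fitted models summarized in Table 4, not from this calculation.

```python
# Rounded values quoted in the text (per 100,000 population).
level_change = 60.6        # immediate step after scale-up
slope_change = 8.7         # additional increase per quarter
midpoint_quarters = 4.5    # mid-point of the follow-up period
counterfactual = 509.0     # projected value without P4P at the mid-point

absolute_effect = level_change + slope_change * midpoint_quarters
relative_effect = absolute_effect / counterfactual

print(round(absolute_effect, 2))        # 99.75
print(round(100 * relative_effect, 1))  # 19.6
```

The absolute effect of about 99.75 agrees closely with the gap of 100.0 between the reported 609.0 and 509.0. The relative figure from these rounded inputs, about 19.6%, is close to but not identical to the reported 20.2, which presumably reflects rounding of the inputs or details of the model estimation not reproduced here.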

Table 3 shows the impact of P4P on the 10 compensated services (including institutional deliveries as described above) and four non-compensated health services. Overall, the introduction of P4P was associated with a median increase in service delivery of 22.1 percentage points (range: 10.2 to 43.8) in the two-arm analysis (Table 3, column 5). The one-arm analysis of the P4P services showed a median increase in service delivery of 19.9 percentage points (range: −8.3 to 56.1) (Table 3, column 4). The greatest increase was in couple years of protection, with an increase of 43.8 percentage points (95% CI: 22.3–65.4%), and the smallest was in pentavalent 3 immunization, with an increase of 10.2 percentage points (95% CI: 1.2–19.3%). We examined services that were not compensated using P4P under SEHATMANDI (bottom of Table 3) and found that for three of the four indicators examined there was no statistically significant impact of P4P on the volume of services provided. Only in minor surgery was there a sharp reduction.

The sensitivity analysis showed that the COVID-19 pandemic had little impact on P4P effectiveness (see Additional file 1: Annex A). The services that the Government of Afghanistan particularly targeted for improvement, namely family planning (couple years of protection), institutional delivery, and Pentavalent 3 child vaccination, saw substantial improvements relative to the comparison provinces of 44, 12, and 10 percentage points, respectively.

Results of the annual health facility surveys (Balanced Scorecard) show a decline in the quality of care in 2019 in BPHS facilities in P4P provinces (Fig. 3). However: (i) this decline began in 2018 (before the P4P contracts began); (ii) the decline was short-lived; and (iii) the BSC scores rose after the first year of the contracts. We do not observe a decline in EPHS quality.

Fig. 3 Mean Balanced Scorecard Scores (out of 100) for BPHS & EPHS in Contracted (Intervention) and MOPH-SM (Comparison) Provinces


Our analysis indicates that P4P contracting with NGOs in Afghanistan was very effective in increasing the availability of key services, particularly those related to reproductive, maternal, and child health. In the medium term, the progress in increasing the quantity of health services impaired neither the quality of care nor progress on other health services. The improvements from baseline for the priority services were significantly larger in the P4P (intervention) provinces than those in the MOPH-SM (comparison) provinces despite lower per capita investments. The P4P results in Afghanistan are impressive given the deteriorating security situation during much of the study period and the 15% lower cost per capita of the P4P contracts.

Our study has several strengths: (i) we have 48 months of data before P4P and 30 months of data afterwards; (ii) the HMIS data were verified robustly and independently; (iii) the intervention was done on a very large scale — covering a population of about 29 million in the P4P arm and 1.4 million in the MOPH-SM arm; (iv) the goodness of fit of the models we employed was generally good; (v) we are conservative in calculating effect sizes as we estimate impact only at the mid-point of implementation (not at the end); and (vi) besides the introduction of P4P, there were few other important changes between SEHAT and SEHATMANDI.

Our conclusions are tempered by limitations in our data: (i) we rely primarily on HMIS data as no household survey data are available after 2018 to ascertain the population-level impact of P4P; (ii) the HMIS data from the MOPH-SM provinces were less consistent and accurate (thus, more likely to be overstated) than those from the contracted provinces, which could bias the estimates (and understate the differences); (iii) we do not have estimates of sub-populations (such as infants or pregnant women) with which to construct the denominators for coverage estimates; and (iv) there is some heterogeneity in the results (see below).

The median effect sizes we found for P4P contracts with NGOs (22.1 and 19.9 percentage points) are consistent with previous quasi-experimental studies on the impact of contracting with non-state actors [10, 11]. They are larger than those found in the systematic reviews of PBF. We hypothesize that the differences between our results and the findings in the PBF literature are due to: (i) a greater ability of NGOs to take advantage of a P4P scheme, possibly because of greater flexibility and autonomy (for example, in managing their human resources) and stronger management (for example, some NGOs extended opening hours or invested in real-time data solutions); and (ii) a larger share of their payment being at risk (on average 60% was linked to performance) than is the case in most PBF schemes (where health worker salaries typically increase between 10 and 30% and there is no risk of earning less than they did before). It appears that P4P worked as a strong and effective signal to the providers about what was important, which could explain the substantial “step change” in the first quarter of 2019 after the introduction of P4P.

The improvements that we observed are also large compared to other possible interventions. For example, the removal of user fees has been estimated to increase service delivery by about 15 percentage points [22]. The effect sizes we observed were achieved in a context where publicly financed services were already free. The MOPH abolished user fees for primary health care in 2007, and the verification studies found no evidence of “under-the-table” payments.

Besides increasing key service delivery outcomes, we believe that P4P contracts strengthened the government’s stewardship of the health sector by ensuring all service providers remained focused on its priorities. We also observed that a strong focus on performance helped the government increase the alignment of its development partners with the government’s health sector strategy.

We observed heterogeneity in the response of different indicators to P4P incentives. For example, immunization (Penta3, measles, and tetanus toxoid) saw smaller improvements than other indicators. Penta3 coverage in the 2018 Afghanistan Health Survey [23] was 60.8%, indicating that there was substantial room for improvement. We hypothesize that the relatively modest progress in immunization could be due to: (i) too low a tariff for vaccination relative to the effort required to reach the unimmunized; (ii) demand-side issues (such as lack of parental interest); (iii) the presence of many organizations providing vaccination services besides the contracted NGOs; and (iv) explicit Taliban prevention of outreach immunization efforts.

With all financial incentives, like P4P, there is a legitimate concern about unintended negative consequences. We did see an initial drop in quality of care. However, this was quickly rectified. Non-compensated services were not much affected by P4P, except minor surgeries. We are unsure about the reason for this decline, but it may be due to insufficient attention to uncompensated services during semi-annual reviews by the MOPH. This highlights the importance of regular review of contractor performance on all indicators. We also observed a 30 percentage-point increase in caesarean sections, which raises the specter of excessive reliance on surgical deliveries. The 2018 survey [23] found a C-section rate of 6.6%, suggesting that there was still some benefit to increasing access to this service. One commonly expressed concern with P4P is that the cost of verification is high and takes funds away from service delivery. We found that verification costs under SEHATMANDI were about 0.5% of the contract value.

The success of P4P contracts with NGOs in Afghanistan justifies their continuation, especially as they are widely accepted by most stakeholders. The results also warrant their extension to other fragile and conflict-affected settings. The compelling results achieved in Afghanistan, at scale, at reasonable cost, and despite serious security challenges, suggest that P4P contracts with NGOs could be considered wherever the coverage of basic health services remains a challenge.

Research in context panel

Evidence before this study

Fragile and conflict-affected countries account for a large proportion of global poverty and burden of disease. Despite their importance to the achievement of global goals, there are few rigorous evaluations of approaches to improve service delivery in such contexts. In these, and other settings where the coverage of health services is low, there is a need to understand what works. There is growing interest in, and use of, NGOs to provide health services, especially in fragile environments; but a recent systematic review found only two robust evaluations of such efforts. The evidence on pay-for-performance (P4P) is mixed, and a recent meta-analysis of P4P schemes in the public sector in lower and lower-middle income countries (LMICs) has shown that its impact on service coverage is only a few percentage points.

Added value of this study

We describe here a formal interrupted time series (ITS) analysis to assess the effectiveness of P4P contracts with NGOs in Afghanistan to improve service delivery. It benefits from independent verification of the routine information, more than 6 years of data, consideration of quality of care, implementation on a very large scale (covering almost 30 million people), and comparison to a set of provinces where P4P contracts were not implemented. The introduction of P4P increased the delivery of 10 key services by a median of 22.1 percentage points.

Implications of all the available evidence

P4P contracts with NGOs had a large impact (with some heterogeneity of effects) on service delivery in Afghanistan and are worthy of expansion in other fragile settings or contexts where coverage is low. We found it possible to employ an ITS design to assess the impact of large-scale policy reform in a fragile context and the cost of evaluation (verification) was modest. The discussion about the value of P4P approaches, such as performance-based financing (PBF), needs to consider whether it is being implemented by the public sector or by NGOs (or other non-state actors). The latter seems better able to take advantage of the P4P approach. The lessons from the extensive experience in Afghanistan emphasize the importance of: 1) being clear about priorities; 2) providing NGOs with substantial autonomy so they can respond flexibly to the context they’re working in; 3) managing contractor performance rather than just contractual obligations; and 4) investing the relatively modest funds needed for independent and robust verification and measurement of performance.

Availability of data and materials

All data used during the current study are available and have been uploaded as supplementary material “Quarterly HMIS data 2015-2021”.


  1. World Bank Group Strategy for Fragility, Conflict, and Violence 2020–2025 – Accessed 20 Oct 2022 at

  2. Ager A, Saleh S, Wurie H, Witter S. Health systems research in fragile settings. Bull World Health Organ. 2019;97(6):378–378A.

    Article  Google Scholar 

  3. de Walque D, Kandpal E, Wagstaff A, Friedman J, Neelsen S, Piatti-Fünfkirchen M, et al. Improving effective coverage in health: do Financial incentives work? Policy research report. Washington, DC: World Bank; 2022. License: Creative Commons Attribution CC BY 3.0 IGO

    Book  Google Scholar 

  4. de Walque D, Kandpal E. Reviewing the evidence on health financing for effective coverage: do financial incentives work? BMJ Glob Health. 2022;7:e009932.

    Article  Google Scholar 

  5. Diaconu K, Falconer J, Verbel A, Fretheim A, Witter S. Paying for performance to improve the delivery of health interventions in low- and middle-income countries. Cochrane Database Syst Rev. 2021;(5):CD007899.

  6. Coulibaly A, Gautier L, Zitti T, Ridde V. Implementing performance-based financing in peripheral health centres in Mali: what can we learn from it? Health Research Policy and Systems. 2020;18:54.

    Article  Google Scholar 

  7. Bertone MP, Falisse J-B, Russo G, Witter S. Context matters (but how and why?) a hypothesis-led literature review of performance based financing in fragile and conflict-affected health systems. PLoS One. 2018;13(4):e0195301.

    Article  CAS  Google Scholar 

  8. de Walque D, Robyn PJ, Saidou H, Sorgho G, Steenland M. Looking into the performance-based financing black box: evidence from an impact evaluation in the health sector in Cameroon. Health Policy Plan. 2021:1–13.

  9. Odendaal WA, Ward K, Uneke J, Uro-Chukwu H, Chitama D, Balakrishna Y, et al. Contracting out to improve the use of clinical health services and health outcomes in low- and middle-income countries. Cochrane Database Syst Rev. 2018;(4):CD008133.

  10. Bloom E, King E, Bhushan I, et al: Contracting for Health: Evidence from Cambodia. Accessed 20 Oct 2022.

  11. Cristia J, Evans WN, Kim B. Improving the health coverage of the rural poor: does contracting-out mobile medical teams work? J Dev Stud. 2015;51:247–61.

    Google Scholar 

  12. Dalil S, Newbrander W, Loevinsohn B, et al. Aid effectiveness in rebuilding the Afghan health system: A reflection. Global Public Health. 2014;9(Suppl 1):S124–36 Published online June 16, 2014.

    Article  Google Scholar 

  13. Loevinsohn B, Sayed GD. Lessons from the health sector in Afghanistan - how Progress can be made in challenging circumstances. JAMA. 2008;300(6):724–6.

    Article  CAS  Google Scholar 

  14. Akseer N, Salehi AS, Hossain SM, et al. Achieving maternal and child health gains in Afghanistan: a countdown to 2015 country case study. Lancet Glob Health. 2016;4:e395–413.

    Article  Google Scholar 

  15. WHO, UNICEF. UNFPA, World Bank Group, United Nations Population Division. Trends in maternal mortality 2000 to 2017: estimates by WHO, UNICEF; 2019. p. 93–103.

    Google Scholar 

  16. Uppsala Conflict Data Program - Accessed 20 Oct 2022.

  17. Anderson CT, Ahmadzai H, Rasekh W, et al. Improving health service delivery in conflict-affected settings: lessons from a nationwide strategic purchasing mechanism in Afghanistan. J Glob Health. 2021;11:04049.

    Article  Google Scholar 

  18. Rice JA. Mathematical statistics and data analysis. Belmont: Wadsworth & Brooks/Cole Publishers; 1988. p. 142–7.

    Google Scholar 

  19. Fretheim A, Soumerai SB, Zhang F, Oxman AD, Ross-Degnan D. Interrupted time-series analysis yielded an effect estimate concordant with the cluster-randomized controlled trial result. J Clin Epidemiol. 2013;66:883–7.

    Article  Google Scholar 

  20. Fretheim A, Zhang F, Ross-Degnan D, Oxman AD, Cheyne D, Foy R, et al. A reanalysis of cluster randomized trials showed interrupted time-series studies were valuable in health system evaluation. J Clin Epidemiol. 2015;68:324–33.

    Article  Google Scholar 

  21. Rowe AK, Rowe SY, Peters DH, Holloway KA, Chalker J, Ross-Degnan D. Effectiveness of strategies to improve health-care provider practices in low-income and middle-income countries: a systematic review. Lancet Global Health. 2018;6(11):e1163–75.

    Article  Google Scholar 

  22. Rowe AK, Rowe SY, Peters DH, Holloway KA, Chalker J, Ross-Degnan D. A systematic review on the effectiveness of strategies to improve health worker performance in low- and middle-income countries: preliminary results on the utilization of health services. Oral presentation by AK Rowe at the 65th annual meeting of the American Society of Tropical Medicine and Hygiene, November 13–17, 2016, Atlanta, Georgia. Abstract number 1919, available at: Accessed 6 Feb 2023.

  23. KIT, Royal Tropical Institute: Afghanistan Health Survey 2018, Final report November 2018. Accessed 20 Oct 2022.

Download references


Not applicable.

BMC license agreement

The corresponding author has read the BMC journal policies on author responsibilities, submits this manuscript in accordance with those policies, and accepts the conditions of submission and the BMC Copyright and License Agreement.


Funding

We received no funding to carry out this study.

Author information

Contributions



Diwa Samad, Bashir Hamid, Ghulam Dastagir Sayed, and Benjamin Loevinsohn conceived of the study. Diwa Samad retrieved the data from the HMIS of the Ministry of Public Health of Afghanistan. Wu Zeng, Alexander K. Rowe, Yueming Liu, and Benjamin Loevinsohn performed the analysis. All authors contributed substantially to the analysis, interpretation of the results, and completion of the manuscript. All authors approved the final manuscript.

Corresponding author

Correspondence to Diwa Samad.

Ethics declarations

Ethics approval and consent to participate

The study was conducted in accordance with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

All authors have completed the ICMJE uniform disclosure form at and declare: no support from any organization for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Annex A.

Modeling results on the effects of the Pay for Performance (P4P) intervention – Services provided per 100,000 population with sensitivity analysis to examine the effects of COVID-19.

Additional file 2: Annex B.

Graphs for other nine P4P indicators.

Additional file 3.

Quarterly HMIS data 2015–2021.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Samad, D., Hamid, B., Sayed, G.D. et al. The effectiveness of pay-for-performance contracts with non-governmental organizations in Afghanistan – results of a controlled interrupted time series analysis. BMC Health Serv Res 23, 122 (2023).



  • NGOs
  • Contracting
  • Health
  • Afghanistan
  • Pay-for-performance
  • Results-based financing
  • Fragile settings