Skip to main content

The contribution of benchmarking to quality improvement in healthcare. A systematic literature review



Benchmarking has been recognised as a valuable method to help identify strengths and weaknesses at all levels of the healthcare system. Despite a growing interest in the practice and study of benchmarking, its contribution to quality of care have not been well elucidated. As such, we conducted a systematic literature review with the aim of synthesizing the evidence regarding the relationship between benchmarking and quality improvement. We also sought to provide evidence on the associated strategies that can be used to further stimulate quality improvement.


We searched three databases (PubMed, Web of Science and Scopus) for articles studying the impact of benchmarking on quality of care (processes and outcomes). Following assessment of the articles for inclusion, we conducted data analysis, quality assessment and critical synthesis according to the PRISMA guidelines for systematic literature review.


A total of 17 articles were identified. All studies reported a positive association between the use of benchmarking and quality improvement in terms of processes (N = 10), outcomes (N = 13) or both (N = 7). In the majority of studies (N = 12), at least one intervention, complementary to benchmarking, was undertaken to stimulate quality improvement. The interventions ranged from meetings between participants to quality improvement plans and financial incentives. A combination of multiple interventions was present in over half of the studies (N = 10).


The results generated from this review suggest that the practice of benchmarking in healthcare is a growing field, and more research is needed to better understand its effects on quality improvement. Furthermore, our findings indicate that benchmarking may stimulate quality improvement, and that interventions, complementary to benchmarking, seem to reinforce this improvement. Although this study points towards the benefit of combining performance measurement with interventions in terms of quality, future research should further analyse the impact of these interventions individually.

Peer Review reports


Introduced in the late 70s as an effort to reduce production costs in the manufacturing sector, benchmarking has since then been used as a method for continuous quality improvement in many different sectors and fields [1]. Although international literature has provided several definitions and taxonomies of benchmarking [2,3,4,5,6], all of them share a common theme, defined as a “continuous process of measuring products, services and practices against the toughest competitors or those companies recognized as industry leaders” [2].

Starting from the 1990s, benchmarking has been applied to the healthcare sector with the aim of measuring and comparing clinical outcomes across organizations as well as enabling them to learn from one another and apply best practices [1, 7]. Benchmarking has become a structured method in the United States and the United Kingdom with the end goal of comparing hospital outcomes for cost-containment purposes [8], although comparison of outcome indicators dates back to the seventeenth century. The increased use of benchmarking was influenced by different factors, including the need to identify and better understand differences in healthcare practices and outcomes between and within different geographical areas [9]. If properly used, benchmarking may also provide a mechanism to detect unwarranted variation and promote the reduction of such [10, 11].

Nowadays, benchmarking represents one of the strategies used for quality improvement, that is, «the changes that will lead to better patient outcomes (health), better system performance (care) and better professional development» [12]. When benchmarking is used to this end, it includes a series of steps such as: identification of best performers through data analysis as well as in-depth (qualitative) investigation of factors that support the observed performance and quality improvement. Performance indicators allow for the conversion of quality to quantifiable metrics that can provide simplified information about a larger area of interest and facilitate comparison across organizations [13, 14]. Depending on the context, the indicators reporting benchmarking data can be aimed at different users with varying decision-making capabilities, ranging from patients to clinicians and policy makers [1, 15]. For instance, comparative performance data of certain clinical processes may lead clinicians to engage in different quality improvement activities such as audit & feedback strategies as well as professional development programs, whereas governments and regional authorities may choose to set policies based on the reporting of certain outcomes [15,16,17]. Thus, it is crucial that performance indicators convey the right type of information to the right stakeholders. Another key element that contributes to the success of benchmarking is the development of reliable and valid performance indicators that are fit for use [13, 17]. This, however, remains a challenge, especially when it comes to cross-national comparisons as countries may differ in coding and methodologies they use to calculate indicators [14, 18]. Additionally, collaboration between benchmarking participants has also been shown to be a key factor contributing to the successful implementation and use of benchmarking in the healthcare sector [19, 20].

A number of reviews provided evidence that combining benchmarking with public reporting had a limited to moderate effect on quality improvement [21, 22]. However, public comparisons of performance of individuals or organizations could lead to controversy as poorer performers may be discouraged to improve if they feel their reputation has been damaged (e.g. “naming and shaming”) [23,24,25]. On the other hand, public reporting of performance can also be used to stimulate quality improvement if top performance is emphasized (e.g. “naming and faming”) [26].

What emerges from the existing literature is that there is a continuous and growing interest in the systematic assessment and practice of benchmarking undertaken by healthcare systems and international agencies [13, 27,28,29]. However, the contribution of benchmarking to quality of care has not been studied extensively.

To investigate this further, we conducted a systematic literature review with the aim of answering to the following research questions:

RQ1: Is there a relationship between the use of benchmarking and quality improvement in healthcare?

RQ2: Can benchmarking combined with additional strategies (e.g. meetings among participants, audit and feedback, use of incentives) further stimulate quality improvement?


A systematic literature review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines [30].

Search strategy

To identify articles, we searched the following three databases, PubMed, Web of Science and Scopus. Search terms and keywords were defined according to the current literature on benchmarking. We reported in Additional file 1 the search strategies used for each of the databases along with the number of studies found.

The three databases were searched in January 2021, from their inception date to December 2020. The screening of articles followed a two-step process including: i) screening of titles and abstracts and ii) full text reading. Additionally, the reference lists of relevant articles were scanned to overcome the lack of database search generated articles containing the defined keywords in their title or abstract text.

A quality appraisal of the eligible articles was performed using the quality assessment tool (QATSDD) developed by Sirriyeh et al. for reviewing studies with diverse designs [31] (see Additional file 4). Additionally, we summarized the methodological strengths and weaknesses in the results section.

Study selection

Our search was restricted to peer-reviewed articles published in the English language. Inclusion and exclusion criteria were defined a priori. Articles were considered eligible if they empirically assessed the relationship between benchmarking and clinical outcomes as well as processes across at least two entities over time. We considered healthcare entities at all scales of benchmarking analysis: international, national, and regional level.

While we excluded articles that only focused on the direct impact of public reporting on performance, we considered articles in which benchmarking results were publicly available. Furthermore, we included articles in which the benchmarking participants were the sole decision-makers and users of the benchmarking results. As such, we excluded articles where the decision-making was external to the benchmarking participants, as it is the case for value-based programs in the US or consumers making informed choices. Additionally, we excluded studies that estimated the potential effects of benchmarking on quality through prediction models and those in which the relationship between benchmarking and performance was considered too indirect. We also excluded articles which did not assess performance over time. Finally, we excluded conceptual and theoretical articles as well as review articles, although we did not apply a filter concerning the study design (qualitative versus quantitative) or methodological approach as mixed-methods bring valuable contribution to this research field.

Two reviewers (PB and CW) independently screened titles and abstracts for relevance (see step I in search strategy subsection). Once potentially eligible articles were identified, all four authors independently screened full-text articles for inclusion. Any disagreement between reviewers was resolved through internal discussion and until consensus was reached. Additionally, it is worth noting that the heterogeneity of the studies in terms of methodology, clinical areas and study design was taken into consideration during the undertaking of this systematic literature review.

Data extraction and analysis

Using a data-charting tool (see Additional file 2 for the list of the variables included), we extracted the following information from the articles: authors; title; year; reported impact of benchmarking; type of quality improvement activity; country; data related to the benchmarking initiative (scale, participation, development, communication and indicators); study design; research question and findings. The data-charting tool was designed collectively as well as piloted by all four investigators (CW, PB, AMM, MV). We performed additional searches using authors sources or institutional webpages when information concerning the benchmarking initiative was missing or not specified in the article directly.

Following Donabedian’s definition of quality [32], we classified the results by process and outcome domains. Due to the high level of heterogeneity between studies in terms of outcomes and methodological designs, we were unable to perform a meta-analysis. However, we provided a synthesis of the resulting evidence.


Literature search

As shown in Fig. 1, the literature search across the three databases identified 5935 articles. An additional 12 articles, identified through scanning of the articles’ references were integrated with the articles identified during the screening of titles and abstracts. Therefore, a total of 5947 articles were identified. The removal of duplicates (N = 999) narrowed down the number of articles to 4948. After applying the inclusion and exclusion criteria, a further 4879 articles were excluded from the second round of screening, thus resulting in 69 articles eligible for assessment. Finally, the full-text screening led to the exclusion of 52 articles, reasons being that they either did not meet the inclusion criteria previously defined in the methods section (see subsection “study selection”) or their full texts were unavailable. As such, a total of 17 articles were finally considered for qualitative assessment and synthesis [33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49].

Fig. 1
figure 1

Flowchart of the literature review process

Study characteristics and benchmarking approaches

Table 1 illustrates the characteristics of the 17 studies. These were published in academic journals between 2004 and 2020 and all benchmarking initiatives were implemented in either North America, Europe or Japan. Thus, all analysed studies took place in high-income countries, as classified by the World Bank [50].

Table 1 Description of the studies used in this paper

We found that the studies included diverse clinical areas. Nevertheless, a number of studies can be grouped in similar clinical areas (see column “Clinical area” in Table 1), namely oncological care (N = 4), surgical care – general and cardiovascular (N = 5) - and chronic illeness care (N = 3).

In all but one benchmarking initiative, participation was voluntary as opposed to mandatory. Participants varied from individual clinicians to hospitals. In terms of granularity of the analyses (see column “units analysed” in Table 1), the level of data aggregation ranged from individual procedures and patients to hospitals and regional healthcare systems.

Figure 2 Panel A illustrates the distribution of the different scales at which benchmarking was carried out. Benchmarking activities were mostly conducted at a national level: either covering an entire territory or selected regions. Only one initiative was implemented at the international level.

Fig. 2
figure 2

Scale of benchmarking initiatives (Panel A) and types of benchmarking developers (Panel B)

As displayed on Fig. 2 Panel B, the benchmarking activities were developed and implemented by a wide variety of actors within the healthcare system. Most of them, however, were carried out by either academia or medical associations (N = 13, see studies number 1,2,3,5,6,7,8,9,10,12,13,14,16 in Table 1). Additionally, the majority of benchmarking initiatives (N = 11, see column “Reporting frequency in Table 1) monitored performance continuously over time.

With reference to our research objective, we found that all studies included in our analysis reported quality improvement both in terms of care process and outcomes.

Secondly, we found that the use of benchmarking was generally associated with various complementary quality improvement strategies, as illustrated in the following subsections. Finally, all the results reported evidence of a positive contribution of benchmarking, suggesting a bias in the literature.

Quality improvement in terms of processes and outcomes

Evaluation of performance on process indicators over time was conducted in over half of the studies. Almost all of these studies (N = 10) reported significant improvement on these measures. Table 1 shows that measures on medication were most commonly reported (N = 4, see studies number 1,10,11,14 in Table 1), followed by measures on documentation of patient’s health (N = 3, see studies number 5,10,11 in Table 1), diagnostic test (N = 2, see studies number 10,13 in Table 1) and multidisciplinary meetings (N = 2, see studies number 8,12 in Table 1). Medication measures included use of B-blockers, anticoagulants and insulin. Six studies did not evaluate care processes (see studies number 3,6,7,9,16,17 in Table 1). Evaluation of performance on process indicators over time was conducted in over half of the studies. Almost all of these studies (N = 10, see studies number 1,2,5,8,10,11,12,13,14,15 in Table 1) reported significant improvement on these measures.

14 studies assessed outcome measures over time. Apart from two, all of these studies reported significant improvement on outcome measures, which largely consisted of measures on mortality and post-surgery complications (N = 6, see studies number 5,6,7,9,13,15 in Table 1), followed by outcomes for diabetic patients, e.g.systolic blood pressure, cholesterol and HbA1c levels (N = 2, see studies number 4,14 in Table 1, hospital length of stay (N = 2, see studies number 9,12,13 in Table 1) and time to surgery (N = 2, see studies number 5,12 in Table 1). Four of the studies reported adjusted outcome measurements at patient level (age, risk).

Seven studies reported performance improvement on both process and outcome indicators. The study period outlined in all the articles varied from 6 months to 18 years.

Performance changes on process and outcomes indicators reported by each study are described in Table 1.

Quality improvement related actions

The methods used to improve quality can be classified into two categories: strategies that made direct use of results on performance indicators to actively stimulate performance improvement – audit & feedback, quality improvement plans, Plan-do-check-act (PDCA) cycles, financial incentives - and strategies that indirectly supported quality improvement such as meetings, provision of guidelines as well as technical support.

Table 1 shows that meetings among participants were the most frequently used strategy by benchmarking initiatives to support performance improvement (N = 11, see studies number 1,3,5,6,7,8,11,12,14,15,16 in Table 1), followed by quality improvement plans (N = 4, see studies number 1,3,8,11 in Table 1), pay-for performance schemes (N = 3, see studies number 8,12,16 in Table 1), provision of guidelines (N = 2, see studies number 6,14 in Table 1) and audit & feedback (N = 2, see studies number 6,10 in Table 1). A combination of at least two strategies were present in over half of the studies (N = 10, see studies number 1,3,5,6,8,10,11,12,14,16 in Table 1). This combination would most commonly include meetings or discussions and direct quality improvement plans (N = 5, see studies number 3,5,8,10,11 in Table 1). Additionally, meetings were used as a single strategy in two of the studies. Five studies, on the other hand, did not report any type of quality improvement strategy implemented (see studies number 2,4,9,13,17 in Table 1).

Methodological approaches for quality improvement measurement

To assess the change in quality linked to benchmarking, most of the studies included in this analysis considered time trends, starting from the beginning of performance reporting (see studies number 1, 3, 6–9, 11–16 in Table 2). Other studies, however, used different approaches, including comparing performance between initial participants and those that joined the benchmarking initiative later (see studies number 2, 17 in Table 2), as well as comparing performance of facilities before and after initiation of benchmarking (see studies number 5, 10 in Table 2). In one case, a control group was used to evaluate the change in performance of facilities that underwent benchmarking (see study number 4 in Table 2). While the articles varied in terms of study periods, ranging from 6 months to 18 years, performance, was on average, monitored over a period of 4 years. The longer the study period was, the more likely information bias was reduced. Seven studies were population-based (see studies number 8, 9, 12, 13, 14, 15, 16 in Table 2), which reduced selection bias in these cases. In certain studies, data was aggregated at the healthcare provider or regional level (see studies number 8, 11,12 in Table 2). Methods for counteracting selection bias and accounting for differences between patients as well as care settings were specified in almost all articles. In certain smaller-scale studies, data analysis was performed and reported for each facility involved, thus also accounting for potential differences between care settings (see studies number 3, 5 in Table 2). In cases where no form of risk-adjustment was performed, the analysis was often focused on process rather than outcome indicators (see studies number 1, 2 in Table 2). Additionally, in two instances, data validation was performed to address information bias (see studies number 6, 13 in Table 2). Aside from one study in which long-term survival was analysed (see study number 15 in Table 2), the majority reported short-term outcomes.

Table 2 Summary of methodological strengths and weaknesses


Summary of main findings

This systematic literature review addresses our research questions by providing evidence concerning a positive association between the use of benchmarking and quality, which is further stimulated when combining benchmarking with specific interventions, such as meetings between participants, quality improvement plans and financial incentives.

The studies we analysed confirm that benchmarking is a useful tool which has yet to be systematically implemented at all levels of the healthcare system [1].

Most of the initiatives were voluntary based and had a bottom-up approach, involving mainly medical associations and academia. More specifically, our findings suggest that benchmarking data was in large part used at the micro level by speciality departments and hospitals, sometimes in the context of small-scale pilot studies that involved a small number of participants [35, 37, 42]. This raises questions regarding the involvement of high-level decision makers when it comes to the use of benchmarking. Importantly, the geographical scope of these studies was limited to Europe and North America.

Research on the practice of benchmarking

Healthcare systems worldwide are increasingly being called on to identify reliable methods for measuring quality of care [51, 52]. This is partly due to the increasing availability of data generated at all levels of the healthcare system. The practice of benchmarking and performance improvement has been considered, especially in Europe, a growing area of research which has received less attention than the identification of performance indicators that reliably benchmark information in different clinical areas [16].

Following the identification of indicators, the questions ensue as to which users they are intended for and the purpose of their use. The information needs of users may differ depending on their decision-making capacity when it comes to taking action based on benchmarking data. As such, the actionability of this type of evidence-based information remains debatable. Furthermore, certain studies [53, 54] have suggested that benchmarking data was generally underused by decision makers within the healthcare system. On the other hand, when healthcare providers do take into account benchmarking data, reluctance may arise when integrating this information into practice for changing behaviour and procedures [55]. The clinician’s subjective perception can also be a factor when deciding on which areas of performance to consider for improvement [37].

Benchmarking and quality improvement

All articles considered in this review reported performance improvement following communication of benchmarking data. One could argue, however, that the sustainability of the reported quality improvement could differ from one study to another depending on the length of follow-up time and monitoring of performance. For instance, in five of the articles, performance was monitored over a relatively short period of time, ranging from 6 months to 2 years [36, 37, 39, 42, 43]. Although these studies validate the use of benchmarking as a tool for quality improvement, researchers have argued that, in this case, performance improvement could be attributed to the experimental conditions under which benchmarking is taking place as well as the newness of the initiative itself, rather than a long-lasting impact of performance measurement [49, 56]. On the other hand, articles reporting a longer follow-up time have also shown sustained performance improvement [33, 38, 40, 41, 46]. Interestingly, only one article focused on the capacity of benchmarking to reduce geographical variation [11].

Furthermore, our results suggest that quality improvement was achieved not only by high performing organisations but also by those whose performance was initially suboptimal [38, 39]. It has long been speculated that the combination of continuous performance measurement with interventions, such as discussions of benchmarking results, was associated with long-lasting quality improvement [43, 46, 56]. The majority of articles from our results reported the implementation of these interventions in addition to benchmarking, ranging from meetings to quality improvement plans and audit & feedback. Meetings between benchmarking participants were the most frequently cited intervention by the articles. Although this type of intervention has a more supporting than active role in terms of quality improvement, interactions between benchmarking participants do facilitate direct exchange of experience and transfer of best practices, thus prompting organisations to further engage in activities adapted to their performance needs. Furthermore, our results showed that meetings were often combined with other interventions, such as quality improvement plans and financial incentives. For instance, Italy’s Tuscany region combines discussions of publicly reported benchmarking data between different stakeholders with pay for performance schemes for local decision-makers and clinicians [40, 44, 48]. Although many have recognized the positive effects that benchmarking and quality improvement activities have, some have argued that the extent of their impact on quality remains unclear, and as such, establishing a causal relationship between benchmarking and quality remains difficult [38, 43, 57].

The relationship between process and outcome indicators

Lastly, several articles included in our review suggest that performance improvement on process indicators is correlated with better outcomes as well, particularly in primary care and certain clinical areas such as diabetes and colorectal cancer [42, 44,45,46,47]. This should come to no surprise as it is widely accepted that processes of care contribute in large part to patient outcomes [58, 59]. However, it has been argued that outcomes are reflective of a wide variety of determinants, some related to healthcare and others not. Furthermore, processes of care that are measurable may represent only a fraction of all the processes that contribute to a particular outcome [60]. However, given the ongoing transformation of performance management systems and the rise of innovative measures, including patient reported data, population based indicators and measures on resilience and sustainability [61], one could expect the relationship between processes and outcomes to change.


This literature review included peer-reviewed studies in English, and excluded grey literature as well as foreign language journals. Furthermore, the results show a very limited number of studies on the relationship between benchmarking and quality improvement, despite the growing interest and research on this topic at the international level. Many articles focus on the practical actions to foster benchmarking as a tool to learn from excellence [62], set strategic planning [40, 63], and improve reputation by naming and faming and peer learning [26]. However, these articles provide specific frameworks on the use of benchmarking rather than report results and impacts of its application.

Another limitation relates to the robustness of the methods used as almost all articles are based on observational analysis and are thus susceptible to methodological biases.


The limited number of studies generated by this systematic literature review suggests that the contribution of benchmarking in healthcare needs to be further explored. Our findings also indicate that benchmarking may foster quality improvement, and that complementary interventions, such as meetings and audit & feedback, can also play a role in further reinforcing quality improvement.

As data becomes more widely available, it is becoming increasingly important for healthcare systems to identify reliable performance indicators that are adapted to the needs of different stakeholders, who ultimately, are the end-users of benchmarking information. As such, further research needs to be conducted as to discern the factors, including contextual elements, that could influence the uptake of benchmarking at all levels of the healthcare system. Although this study points towards the positive impact of combining performance measurement with interventions on quality, future research should analyse the individual impact of these interventions, including non traditional ones such as the promotion of good performance practices.

Availability of data and materials

The datasets analysed are available from the corresponding author upon reasonable request.


  1. Ettorchi - Tardy A, Levif M, Michel P. Benchmarking: A method for continuous quality improvement in health. Healthc Policy. 2012;7(4):101–19.

    Google Scholar 

  2. Camp RC. The search for industry best practices that lead to superior performance; 1989. p. 320.

    Google Scholar 

  3. Liebfried HJ, McNair CJ. In: Sons JW, editor. Benchmarking: a tool for continuous improvement. New York: Wiley; 1992.

    Google Scholar 

  4. Watson GH. Strategic benchmarking: How to rate your comoany’s performance against the world’s best. John Wiley & Sons Incorporated, editor. Somerset: Wiley; 1993.

  5. Bowerman M, Francis G, Ball A, Fry J. The evolution of benchmarking in UK local authorities. Benchmarking An Int J. 2002;9(5):429–49.

    Article  Google Scholar 

  6. Doug M, Gift B. Collaborative Benchmarking in Healthcare. J Qual Improv. 1994;20:239–49.

    Google Scholar 

  7. Camp RC, Tweet AG. Benchmarking applied to health care. Jt Comm J Qual Improv. 1994;20(5):229–38.

    CAS  PubMed  Google Scholar 

  8. Thonon F, Watson J, Saghatchian M. Benchmarking facilities providing care: an international overview of initiatives. SAGE Open Med. 2015;3:205031211560169.

    Article  Google Scholar 

  9. Wennberg JE. Understanding geographic variations in health care delivery. N Engl J Med. 1999;340(1):52–3.

    Article  CAS  PubMed  Google Scholar 

  10. Arah OA, Klazinga NS, Delnoij DMJ, Ten Asbroek AHA, Custers T. Conceptual frameworks for health systems performance: a quest for effectiveness, quality, and improvement. Int J Qual Heal Care. 2003;15(5):377–98.

    Article  CAS  Google Scholar 

  11. Nuti S, Seghieri C. Is variation management included in regional healthcare governance systems? Some proposals from Italy. Health Policy (New York). 2014;114(1):71–8.

    Article  Google Scholar 

  12. Batalden PB, Davidoff F. What is “quality improvement” and how can it transform healthcare? Qual Saf Heal Care. 2007;16(1):2–3.

    Article  Google Scholar 

  13. Smith P, Mossialos E, Papanicolas I, Leatherman S. Performance measurement and professional improvement. Cambridge: Cambridge University Press; 2009. p. 613–40.

    Google Scholar 

  14. Expert Group on Health Systems Performance Assessment. So what? - Strategies across Europe to assess quality of care: European Union; 2016. p. 92–107. Available from:

    Google Scholar 

  15. Oliver TR. Population health rankings as policy indicators and performance measures. Prev Chronic Dis. 2010;7(5):A101.

  16. Klazinga N, Fischer C, Ten Asbroek A. Health services research related to performance indicators and benchmarking in Europe. J Heal Serv Res Policy. 2011;16(SUPPL. 2):38–47.

    Article  Google Scholar 

  17. Barbazza E, Klazinga NS, Kringos DS. Exploring the actionability of healthcare performance indicators for quality of care: a qualitative analysis of the literature, expert opinion and user experience. BMJ Quality & Safety 2021;30:1010–20.

  18. Nolte E. International benchmarking of healthcare quality: a review of the literature. Rand Heal Q. 2012;1(4):e1000097.

  19. Nuti S, Vainieri M. Strategies and tools to manage variation in regional governance systems. In: Handbook of Health Services Research. Boston: Springer Reference; 2014. p. 23.

    Google Scholar 

  20. Codling S. In: Gower Publishing L, editor. Best practice benchmarking: a management guide. Aldershot: Gower Publishing, Ltd; 1995.

    Google Scholar 

  21. Lober WB, Flowers JL. Consumer reports in health care: do they make a difference? Semin Oncol Nurs. 2011;27(3):169–82. Available from:

    Article  PubMed  Google Scholar 

  22. Prang KH, Maritz R, Sabanovic H, Dunt D, Kelaher M. Mechanisms and impact of public reporting on physicians and hospitals’ performance: a systematic review (2000–2020). Plos One. 2021;16(2 February):1–24. Available from:

    Article  CAS  Google Scholar 

  23. Hibbard JH, Stockard J, Tusler M. Does publicizing hospital performance stimulate quality improvement efforts? Health Aff. 2003;22(2):84–94.

    Article  Google Scholar 

  24. Bevan G, Fasolo B. Models of governance of public services: Empirical and behavioural analysis of ‘econs’ and ‘humans’. In: Behavioural Public Policy. Cambridge: Cambridge University Press; 2013. p. 38–62.

    Chapter  Google Scholar 

  25. Bevan G, Wilson D. Does “naming and shaming” work for schools and hospitals? Lessons from natural experiments following devolution in England and Wales. Public Money Manag. 2013;33(4):245–52.

    Article  Google Scholar 

  26. Bevan G, Evans A, Nuti S. Reputations count: why benchmarking performance is improving health care across the world. Heal Econ Policy Law. 2019;14(2):141–61.

    Article  Google Scholar 

  27. World Health Organization. Health systems : improving performance: World Health Organization; 2000. p. 215.

    Google Scholar 

  28. WHO, OECD. DAC guidelines and reference series: poverty and health; 2003. p. 55. Available from:

    Google Scholar 

  29. Arah OA, Westert GP, Hurst J, Klazinga NS. A conceptual framework for the OECD health care quality indicators project. Int J Qual Heal Care. 2006;18(SUPPL. 1):5–13.

    Article  Google Scholar 

  30. Moher D, Liberati A, Tetzlaff J, Altman DG, Altman D, Antes G, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Plos Med. 2009;6(7).

  31. Sirriyeh R, Lawton R, Gardner P, Armitage G. Reviewing studies with diverse designs: the development and evaluation of a new tool. J Eval Clin Pract. 2012;18(4):746–52.

    Article  PubMed  Google Scholar 

  32. Donabedian A. The Quality of Care: How Can It Be Assessed? JAMA. 1988;260(12):1743–8.

    Article  CAS  PubMed  Google Scholar 

  33. Cronenwett JL, Likosky DS, Russell MT, Eldrup-Jorgensen J, Stanley AC, Nolan BW. A regional registry for quality assurance and improvement: the vascular study Group of Northern new England (VSGNNE). J Vasc Surg. 2007;46(6):1093–103.

    Article  PubMed  Google Scholar 

  34. Campion FX, Larson LR, Kadlubek PJ, Earle CC, Neuss MN. Advancing performance measurement in oncology. Am J Manag Care. 2011;17(Suppl 5):31–5.

    Google Scholar 

  35. Stern M, Niemann N, Wiedemann B, Wenzlaff P. Benchmarking improves quality in cystic fibrosis care: a pilot project involving 12 centres. Int J Qual Heal Care. 2011;23(3):349–56.

    Article  Google Scholar 

  36. Hermans MP, Elisaf M, Michel G, Muls E, Nobels F, Vandenberghe H, et al. Benchmarking is associated with improvedquality of care in type 2 diabetes: the OPTIMISE randomized, controlled trial. Diabetes Care. 2013;36(11):3388–95.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  37. Merle V, Moret L, Pidhorz L, Dujardin F, Gouin F, Josset V, et al. Does comparison of performance lead to better care? A pilot observational study in patients admitted for hip fracture in three French public hospitals. Int J Qual Heal Care. 2009;21(5):321–9.

    Article  Google Scholar 

  38. Hall BL, Hamilton BH, Richards K, Bilimoria KY, Cohen ME, Ko CY. Does surgical quality improve in the american college of surgeons national surgical quality improvement program: an evaluation of all participating hospitals. Ann Surg. 2009;250(3):363–74.

    Article  PubMed  Google Scholar 

  39. Tepas JJ, Kerwin AJ, Devilla J, Nussbaum MS. Macro vs micro level surgical quality improvement: a regional collaborative demonstrates the case for a national NSQIP initiative. J Am Coll Surg. 2014;218(4):599–604. Available from:

    Article  PubMed  Google Scholar 

  40. Nuti S, Vola F, Bonini A, Vainieri M. Making governance work in the health care sector: evidence from a “natural experiment” in Italy. Heal Econ Policy Law. 2016;11(1):17–38.

    Article  Google Scholar 

  41. Govaert JA, Van Dijk WA, Fiocco M, Scheffer AC, Gietelink L, Wouters MWJM, et al. Nationwide Outcomes Measurement in Colorectal Cancer Surgery: Improving Quality and Reducing Costs Presented at the European Society of Surgical Oncology 34th Congress, Liverpool, United Kingdom, October 2014. J Am Coll Surg. 2016;222(1):19–29.e2. Available from:

    Article  PubMed  Google Scholar 

  42. Piccoliori G, Mahlknecht A, Abuzahra ME, Engl A, Breitenberger V, Vögele A, et al. Quality improvement in chronic care by self-audit, benchmarking and networking in general practices in South Tyrol, Italy: results from an interventional study. Fam Pract. 2021;38(3):253–8.

    Article  PubMed  Google Scholar 

  43. Qvist P, Rasmussen L, Bonnevie B, Gjørup T. Repeated measurements of generic indicators: a Danish national program to benchmark and improve quality of care. Int J Qual Heal Care. 2004;16(2):141–8.

    Article  Google Scholar 

  44. Nuti S, Seghieri C, Vainieri M. Assessing the effectiveness of a performance evaluation system in the public health care sector: some novel evidence from the Tuscany region experience. J Manag Gov. 2013;17(1):59–69.

    Article  Google Scholar 

  45. Van Leersum NJ, Snijders HS, Henneman D, Kolfschoten NE, Gooiker GA, Ten Berge MG, et al. The dutch surgical colorectal audit. Eur J Surg Oncol. 2013;39(10):1063–70. Available from:

    Article  PubMed  Google Scholar 

  46. Margeirsdottir HD, Larsen JR, Kummernes SJ, Brunborg C, Dahl-Jørgensen K. The establishment of a new national network leads to quality improvement in childhood diabetes: implementation of the ISPAD guidelines. Pediatr Diabetes. 2010;11(2):88–95.

    Article  PubMed  Google Scholar 

  47. Kodeda K, Johansson R, Zar N, Birgisson H, Dahlberg M, Skullman S, et al. Time trends, improvements and national auditing of rectal cancer management over an 18-year period. Color Dis. 2015;17(9):O168–79.

    Article  CAS  Google Scholar 

  48. Pinnarelli L, Nuti S, Sorge C, Davoli M, Fusco D, Agabiti N, et al. What drives hospital performance? The impact of comparative outcome evaluation of patients admitted for hip fracture in two Italian regions. BMJ Qual Saf. 2012;21(2):127–34.

    Article  PubMed  Google Scholar 

  49. Miyata H, Motomura N, Murakami A, Takamoto S. Effect of benchmarking projects on outcomes of coronary artery bypass graft surgery: challenges and prospects regarding the quality improvement initiative. J Thorac Cardiovasc Surg. 2012;143(6):1364–9. Available from:

    Article  PubMed  Google Scholar 

  50. World Bank. World Bank country and lending groups [internet]. 2021. Available from:

    Google Scholar 

  51. World Health Organization, World Bank Group O. Delivering quality health services: World Health Organization, World Bank Group, OECD; 2018. p. 1–100. Available from:

    Google Scholar 

  52. Kruk ME, Gage AD, Arsenault C, Jordan K, Leslie HH, Roder-DeWan S, et al. High-quality health systems in the sustainable development goals era: time for a revolution. Lancet Glob Heal. 2018;6(11):e1196–252.

    Article  Google Scholar 

  53. Clarke A, Taylor-Phillips S, Swan J, Gkeredakis E, Mills P, Powell J, et al. Evidence-based commissioning in the English NHS: who uses which sources of evidence? A survey 2010/2011. BMJ Open. 2013;3(5):1–6.

    Article  Google Scholar 

  54. Ivankovic D, Poldrugovac M, Garel P, Klazinga NS, Kringos DS. Why, what and how do European healthcare managers use performance data? Results of a survey and workshop among members of the European hospital and healthcare federation. Plos One. 2020;15(4):1–19. Available from:

    Article  CAS  Google Scholar 

  55. De Lange DW, Dongelmans DA, De Keizer NF. Small steps beyond benchmarking. Rev Bras Ter Intensiva. 2017;29(2):128–30.

    Article  PubMed  PubMed Central  Google Scholar 

  56. Lied TR, Kazandjian VA. A Hawthorne strategy: implications for performance measurement and improvement. Clin Perform Qual Health Care. 1998;6(4):201–4.

    CAS  PubMed  Google Scholar 

  57. Braithwaite J, Yukihiro M, Johnson J. Healthcare reform, quality and safety: perspectives, participants, partnerships and prospects in 30 countries. Boca Raton: CRC Press; 2017.

    Book  Google Scholar 

  58. Obit LJ. The measurement of health service outcomes. In: Oxford Textbook of Health Care; 1993.

    Google Scholar 

  59. Goldstein, H., Spiegelhalter DJ. League Tables and Their Limitations : Statistical Issues in Comparisons of Institutional Performance Author ( s ): Harvey Goldstein and David J . Spiegelhalter Source : Journal of the Royal Statistical Society . Series A ( Statistics in Society ), Vol . 1. Society 2008;159(3):385–443.

  60. Lovaglio PG. Benchmarking strategies for measuring the quality of healthcare: Problems and prospects. Sci World J. 2012;2012(iii):606154.

  61. Vainieri M, Noto G, Ferre F, Rosella LC. A performance management system in healthcare for all seasons? Int J Environ Res Public Health. 2020;17(15):1–10.

    Article  Google Scholar 

  62. Borghini A, Corazza I, Nuti S. Learning from excellence to improve healthcare services: the experience of the maternal and child care pathway. Int J Environ Res Public Health. 2021;18(4):1–10.

    Article  Google Scholar 

  63. Vainieri M, Lungu DA, Nuti S. Insights on the effectiveness of reward schemes from 10-year longitudinal case studies in 2 Italian regions. Int J Health Plann Manag. 2018;33(2):e474–84.

    Article  Google Scholar 

Download references


The authors want to acknowledge the members of the Healthcare Management Laboratory of the Scuola Superiore Sant’Anna.


The manuscript was developed with support from a member of a Marie Skłodowska-Curie Innovative Training Network (HealthPros—Healthcare Performance Intelligence Professionals) that has received funding from the European Union’s Horizon 2020 research and innovation programme under Grant agreement No. 765141.

Author information

Authors and Affiliations



CW and PB drafted the Introduction, Methods, Results and Discussion sections of the manuscript. MV and AMM contributed to the study design and to the interpretation and implication of the findings and revised the manuscript critically for intellectual content. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Paolo Belardi.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Willmington, C., Belardi, P., Murante, A.M. et al. The contribution of benchmarking to quality improvement in healthcare. A systematic literature review. BMC Health Serv Res 22, 139 (2022).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: