What is important in evaluating health care quality? An international comparison of user views

Background: Quality of care from the perspective of users is increasingly used in evaluating health care performance. Going beyond satisfaction studies, quality of care from the users' perspective is conceptualised in two dimensions: the importance users attach to aspects of care and their actual experience with these aspects. It is well established that health care systems differ in performance. The question in this article is whether there are also differences in what people in different health care systems view as important aspects of health care quality. The aim is to describe and explain international differences in the importance that health care users attach to different aspects of health care.
Methods: Data were used from different studies that all used a version of the QUOTE-questionnaire, which measures user views of health care quality in two dimensions: the importance that users attach to aspects of care and their actual experience. Data from 12 European countries and 5133 individuals were used. They were analysed using multi-level analysis.
Results: Although most of the variation in the importance people attach to aspects of health care is located at the individual level, there are also differences between countries. The ranking of aspects shows similarities. 'My GP should always take me seriously' was ranked first in nearly all countries, while an item about waiting time in the GP's office was always ranked lowest.
Conclusion: Differences between countries in how health care users value different aspects of care are difficult to explain. Further theorising should take into account that importance and performance ratings are positively related, that people compare their experiences with those of others, and that general and instrumental values might be related through the institutions of the health care system.

people from different countries view as important in evaluating health care quality? The World Health Report 2000 has been criticized for assuming a value base that is universal to all health care systems; values such as responsiveness may be valued differently in different countries [8].
In this article we address this issue by comparing what people find important in general practice care in different countries. Grol et al. studied patients' priorities in general practice [9]. They found many similarities as well as differences between countries. In particular, doctor-patient communication and accessibility of services were common priorities among general practice visitors in different countries. Service aspects, such as waiting times, were considered less important.
In this study we performed a secondary analysis of surveys of patient views on quality of health care. Patients' views were measured using the QUOTE-questionnaire (the acronym QUOTE stands for QUality Of care Through the patients' Eyes), which distinguishes two quality of care dimensions: performance and importance [10]. Performance relates to the actual experience of the use of health care services (rather than a patient satisfaction judgement), which is in line with recent developments within health services research. Importance refers to the fact that people see some features of health services as more significant than others. Importance ratings reflect what people see as desired qualities in health care. This approach avoids problems with conceptualising people's evaluations of health care in terms of satisfaction (usually high levels of satisfaction, not specific enough to be used in quality improvement) and expectations (ambiguous relations between expectations and actual experiences).
To select relevant quality of care aspects, a general and a disease-specific approach were followed, using focus group discussions [10]. With this procedure a series of QUOTE-questionnaires has been tailored to the needs of various patient groups. In these QUOTE-questionnaires the expectations of people are reflected in the statements included in the instruments. These questionnaires have been used in several studies in different countries.

Research questions
In this article we first compare the importance dimension of QUOTE across several European countries and Israel to gain insight into the similarities and differences in people's views on quality of care. Secondly, we will look at the relationship between importance and performance ratings as part of an explorative analysis to explain differences in importance ratings between patients and/or countries. The general research question is: Do patients in different European countries think differently about the relative importance of various aspects of quality of care, and if so, how can these differences be explained?
This general question is divided into the following separate questions:
a. To what extent do the importance judgements of patients cluster within countries when individual characteristics of patients are taken into account?
b. Does the ranking of importance judgements vary between countries?
c. What is the relationship between the average performance of health care systems, as judged by patients, and the individual importance judgements?
When we compare the importance judgements of patients between countries, we will take into account individual characteristics of respondents to rule out differences in the composition of the groups of respondents. With respect to the relationship between importance and performance scores, it might be anticipated that in general people attach more importance to those aspects that they actually experience less often. Analogous to the economic mechanism of decreasing marginal utility, quick service without waiting time in the doctor's office, for example, might be valued as less important if services are generally quick and people do not have to wait long. However, at the same time it can be hypothesized that it is no use aspiring to something that nobody has. If quality of care ratings, as seen through the eyes of the patient, are low on average and if there is little variation in these performance ratings between individuals, people will probably not find these aspects important. This idea is based on a mechanism of social comparison [11].

Material
The first Dutch QUOTE-questionnaires (for disabled persons, COPD, arthritis and frail elderly people) served as a starting point for our database [10,12]. These questionnaires contain 16 general importance and performance items. In the SCOPE-project (Supporting Clinical Outcomes in Primary Care for the Elderly) the QUOTE-elderly was used in Finland, Ireland and the Netherlands [13]. A large contribution to the database comes from an international study of patients with inflammatory bowel disease (IBD) in eight countries [14]. This study used ten generic questions (of the original 16) relating to GPs. Additional material was obtained from the UK (QUOTE-disabled) and from Belarus and the Ukraine [15,16].
The QUOTE-questionnaires have been translated in the context of different projects. In all but two cases backward-forward translations have been used. Answering formats of importance items were: Not important (1); Fairly important (2); Important (3); and Extremely important (4). The answering formats for the performance items were: No (1); Not really (2); On the whole, yes (3); and Yes (4). The equivalence of the answering formats in different languages has not been assessed. The wording of the importance items that were used in the QUOTE-questionnaires is presented in tables 2 and 5 and throughout the results section of the article. The performance items ask for the actual experience of respondents. One of the importance items is, e.g., 'my GP should always take me seriously'. The corresponding performance item is: 'my GP always takes me seriously'. QUOTE-items included in the analysis refer to the organisation of health care services and the care giving process.
Table 1 gives the number of respondents in each user group and country and the selection of respondents. In the case of Belarus and Ukraine respondents were selected by distributing questionnaires to people who visited general practices. Response rates are not available for these two countries. In all other countries but one, addresses were randomly selected from the files of health care (and in one case home care) institutions and (in the Netherlands) from membership lists of voluntary patient organisations, irrespective of actual visits to a GP. In the case of the QUOTE-Migrants, respondents from ethnic groups in the Netherlands were selected using a snowball sampling method; data for these respondents were collected through oral interviews in the respondents' mother language. In all other cases postal questionnaires were used, followed by one or two reminders. Response rates vary between 35% (elderly in the Netherlands) and 78% (the average of the IBD samples).

Statistical analysis
All 5133 health care users reported importance and performance ratings for each of (at most) ten items. The importance ratings are dependent variables in a series of statistical analyses with patients hierarchically nested in countries. In contrast to traditional forms of analysis of variance in which factors have 'fixed' effects, countries are considered to have 'random' effects. Such a variance component model is preferred over traditional analysis if the number of categories exceeds ten [17,18]. The degree of resemblance between patients belonging to the same country can be expressed by the intraclass correlation coefficient (ICC). If there is no resemblance between patients within countries, the ICC is zero or near zero. An ICC of .15 is considered quite high [19]. Most commonly ICCs are lower. For instance, the median ICC calculated for more than 1000 primary care variables was .01 [20].
The ICC is statistically defined as the variance between countries divided by the total variance. An ICC of zero therefore implies no variance between countries, indicating the absence of differences between countries in patients' importance ratings. Age and sex were included as covariates to take into account differences in the composition of respondent groups in the different countries, because of their association with importance scores [9,10,21-23]. Correction for different user groups turned out to be impossible due to the small number of countries for some user groups. Differences in number of cases between countries were taken into account in the statistical analysis. The estimates of country parameters are more precise with larger numbers per country.
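The definition of the ICC given above can be illustrated with a minimal sketch. It does not reproduce the multilevel model used in this study: it uses a simpler one-way random-effects ANOVA estimator, ignores the age and sex covariates, and the function name `icc_oneway` and all ratings in the usage note are invented for illustration.

```python
from statistics import mean


def icc_oneway(groups):
    """Estimate ICC = between-country variance / total variance.

    `groups` is a list of lists; each inner list holds the importance
    ratings (1-4) of the respondents of one country. This is a one-way
    random-effects ANOVA estimator, an assumption of this sketch and
    not the multilevel software used in the study.
    """
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Sums of squares between and within countries
    ss_between = sum(n * (mean(g) - grand_mean) ** 2
                     for g, n in zip(groups, sizes))
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)

    # Effective group size, which allows for unequal numbers per country
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (k - 1)

    # Negative variance estimates are truncated at zero
    var_between = max((ms_between - ms_within) / n0, 0.0)
    return var_between / (var_between + ms_within)
```

With invented ratings, `icc_oneway([[1, 2, 3], [2, 3, 4]])` gives 1/7 (about .14), while countries with identical rating distributions give an ICC of zero.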
In order to explore the relationship between importance and performance ratings Pearson correlations were calculated between the performance items, both at the individual level and aggregated to country level, and the importance rating on user level. In the introduction we have put forward two hypothetical relations between variation of performance within the countries and the importance ratings. To look at the relation between variation of performance and importance ratings, a distinction was made between countries with large variation in experienced health care quality and countries with smaller variation. Based on the mean standard deviation (SD) of all ten performance items, the countries were equally split into:

Results
We start the presentation of the results with a description of the overall importance that respondents in all twelve countries attached to the different aspects and the clustering of their answers within countries. We then move to the differences in the ranking of aspects between countries. Finally, we will present results for the relationship between the actual experiences of respondents, both individually and aggregated to an average for each country, and the importance they attach to the different aspects.

Importance judgements and clustering
As shown in table 2, 'The GP should always take me seriously' is seen as most important, halfway between 'important' and 'extremely important' on the Likert scale. Less than 1% of users rated this item as 'not important' (not in table). The differences between users as well as countries for this aspect are the smallest of all aspects (smallest variance, both at user and at country level). 'The GP should not keep me in the waiting room for more than 15 minutes' is seen as least important, halfway between 'important' and 'fairly important'. About 20% of users rated this aspect as 'not important' (not in table). The differences between users for this aspect are largest (highest variance at user level). The importance of 'The GP should make sure that I can see a specialist within 2 weeks' shows the biggest differences between countries. The uncorrected ICCs vary from low (.058, aspect 5) to high (.251, aspect 9). The sex-age adjusted ICCs are a little (7%) lower on average, but still range up to high values.
In order to explore differences between user groups, we have computed intra-class correlation coefficients for user groups within the Netherlands only. These coefficients are on average higher than those regarding countries. We also analysed the differences between countries within the IBD patient groups. These differences are much like the figures of table 2. So the estimated intra-class coefficients of table 2 seem to refer more to differences between countries than differences between user groups.

Differences between countries and ranking
The variation between the countries for each aspect is illustrated in table 3 by mean importance scores for all aspects. For instance, 'The GP should always take me seriously' is seen as most important in Israel and as least important in Finland. 'The GP should not keep me in the waiting room for more than 15 minutes' is also seen as most important in Israel, but as least important in Italy.
In order to look at the consistency of user views across the different countries, the ten importance aspects were ranked according to their mean value within each country. Table 4 gives the ranks for countries where all ten aspects are available. Some rankings differ between countries. For instance in Denmark 'My GP should inform me, in understandable language, about the medicines that are prescribed for me' is ranked first. In Portugal it is 'My GP should prescribe medicines which are fully covered by the National Health System or social services'. But there is also a general pattern. The service aspect 'My GP should not keep me in the waiting room for more than 15 minutes' is ranked last in all countries, while 'My GP should always take me seriously' is ranked high in all countries.
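The ranking step described above, in which aspects are ordered by their mean importance within each country, could be sketched as follows. The function name `rank_aspects`, the country labels and all mean scores are invented for illustration and are not taken from tables 3 or 4; how ties were handled in the study is not stated, so this sketch simply keeps the input order for tied means.

```python
def rank_aspects(mean_scores):
    """Return {aspect: rank} for one country; rank 1 = highest mean importance.

    `mean_scores` maps an aspect label to its mean importance score.
    Tied means keep their input order (an assumption of this sketch).
    """
    ordered = sorted(mean_scores, key=mean_scores.get, reverse=True)
    return {aspect: position
            for position, aspect in enumerate(ordered, start=1)}


# Invented mean importance scores on the 1-4 scale, for illustration only
country_means = {
    "Country A": {"taken seriously": 3.6,
                  "medicine information": 3.4,
                  "waiting room": 2.4},
    "Country B": {"taken seriously": 3.3,
                  "medicine information": 3.5,
                  "waiting room": 2.6},
}

ranks = {country: rank_aspects(scores)
         for country, scores in country_means.items()}
```

In this invented example the two countries disagree on which aspect ranks first but agree that the waiting room item ranks last, which mirrors the kind of pattern reported in the text.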

Importance and performance
Looking at the relationship between importance and performance ratings by means of correlation coefficients,
Table note: Italic means in the column 'Low variation at country level/Low performance at individual level' differ statistically significantly from means in the column 'Low variation at country level/High performance at individual level' (Scheffé contrasts).
If we take into account the variation in performance ratings within countries, the positive relationship between importance and performance ratings is somewhat stronger in the 'high variation' mode compared to the 'low variation' mode. This can be seen by comparing the difference between the first two columns of table 6 and those between the last two columns. We had specifically expected to find a difference between the last two columns of table 6, i.e. within countries with low variation in aggregate performance. However, for only half of the items the difference is statistically significant.
Overall, importance scores are somewhat lower in countries with high variation in performance ratings as compared to countries with low variation in performance scores.
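The grouping of countries into a 'high variation' and a 'low variation' condition, based on the mean standard deviation of the performance items as described under Statistical analysis, could be sketched as follows. This is an illustration under stated assumptions: the function name `split_by_variation` is hypothetical, the sample standard deviation is assumed (the text does not say which SD was used), and the split is a simple equal split of the countries ordered by mean SD.

```python
from statistics import mean, stdev


def split_by_variation(performance):
    """Label each country 'low' or 'high' variation.

    `performance` maps country -> {item: [performance ratings]}.
    For each country the SD of every performance item is averaged,
    and the countries are then split into two equally sized halves.
    """
    mean_sd = {
        country: mean(stdev(ratings) for ratings in items.values())
        for country, items in performance.items()
    }
    ordered = sorted(mean_sd, key=mean_sd.get)  # smallest mean SD first
    half = len(ordered) // 2
    return {country: ("low" if index < half else "high")
            for index, country in enumerate(ordered)}


# Invented performance ratings (1-4), for illustration only
example = {
    "Country A": {"item 1": [3, 3, 3, 4], "item 2": [4, 4, 3, 4]},
    "Country B": {"item 1": [1, 4, 2, 4], "item 2": [1, 3, 4, 2]},
    "Country C": {"item 1": [3, 4, 3, 3], "item 2": [3, 3, 3, 4]},
    "Country D": {"item 1": [1, 1, 4, 4], "item 2": [2, 4, 1, 4]},
}

groups = split_by_variation(example)
```

Mean importance scores can then be compared between respondents with low and high individual performance ratings within each of the two groups of countries.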

Discussion
The objective of our study was to gain insight into similarities and differences in the value users of health care in different countries attach to aspects of care. As an indicator of these values, importance scores of the QUOTE-questionnaires were used. These scores reflect what is important in evaluating health care quality according to users. Our results show that health care users in different countries to some extent think differently about the relative importance of various aspects of quality of care. Intra-class correlation coefficients were calculated to measure the difference between countries. They range from low to high. Sex-age adjusted intra-class correlation coefficients were only slightly lower. This means that demographic differences between the groups that filled in the questionnaires in different countries cannot explain the differences in average importance between the countries.
Although there are differences between countries, the importance rankings of the aspects also show consensus. 'My GP should always take me seriously' is nearly always ranked highest, while the item about waiting time is always ranked as least important. Since we only analysed a small sample of countries, it is difficult to generalise this result. However, it might say something about a hierarchy of these instrumental health care values, suggesting that values concerning respectfulness are seen as more important than service aspects, such as waiting time. This is in line with the findings of Grol et al. [9]. There are no accepted explanations for these value differences between countries. General theories about dimensions of culture see culture as the independent variable, explaining differences between countries in institutions and organisations [24,25]. A more specific application in the health care field is Payer's book Medicine and Culture, which relates differences in culture to variations in the practice of medicine [26]. In this article, however, differences in values are what we want to explain.
In general, there is a positive relation between what people find important and their experiences, both on an individual level and related to the average experience in a country. The positive relation between importance and experience could probably be explained by a general tendency of cognitive consistency [27] or alternatively by processes of selection where people try to find those health care providers that do what they value most. This alternative hypothesis would mean that the correlation between importance and experience is stronger the more freedom of choice of health care provider people have. One of the assumptions behind the QUOTE-questionnaires is that importance and experience are two different aspects, together constituting quality judgements. The correlation at the individual level is not so high as to invalidate this assumption.
By looking at the variation in the performance scores, we have tried to include social comparison mechanisms in the analysis. It can be argued that if people's individual experience indicates low performance for certain aspects in countries, they will value these aspects as more important. However, such a hypothesis has to be rejected on the basis of the results presented in this article. A contrasting hypothesis is that in countries with low variation in actual experiences, people who experience low performance themselves will not aspire to something that is apparently (from their own and others' experience) out of reach. This hypothesis is only partly supported by our findings. Although people in this low variation condition have significantly lower importance scores for half of the items, the differences in the high variation condition are still larger.
Looking at the individual aspects, it can be argued that people do not find issues important if they are more or less guaranteed by the health care system. An example of this is the issue of prescribing pharmaceuticals that are fully covered by the health care insurance plans of patients. On the basis of the material presented in this article this hypothesis too has to be rejected. People in countries with low levels of cost sharing in this field found this item more important than people in countries with higher levels of cost sharing. However, an alternative explanation for this finding could be that structural aspects of the health care system, e.g. on the public-private dimension, reflect general values [28,29]. If these general values also relate to instrumental values, then the relationship we found is understandable: people in countries that took pains to organise their health care system in such a way that financial access is very good might find this issue more important. In this explanation the mechanism between general and instrumental values is the institutional makeup of health care systems. In general, we believe that further theorising about differences between health care systems in what people find important might start from the positive relation between importance and experience, from the idea that people compare their experiences with those of others, and from the idea that general and instrumental values might be related through the institutions of the health care system. Hofstede [24] has identified a number of general dimensions of culture that might also be related to instrumental health care values through different types of welfare states [30].
The analysis presented in this article has its shortcomings. The number of countries is small. The analysis of differences between countries is based on only twelve countries, even though large numbers of respondents were involved. The number of respondents increases our confidence in the averages per country, but this does not solve the problem of small numbers at the country level. Differences between user groups could not be taken into account because of the small number of groups for all countries except the Netherlands, although one might expect these differences to exist [29].
The multilevel model now contained two levels, respondents and countries. However, as table 1 indicated, the selection of respondents was through health care institutions. Hence, a level between respondents and countries should ideally have been specified. However, the data did not allow this. Differences between user groups are relatively large. However, the estimated intraclass correlations of table 2 seem to refer more to variation between countries than between user groups. Future research with larger numbers of user groups would be helpful to be able to take case mix differences into account.
We used existing data, collected in different studies with different aims and methods. The response rates differed across studies. The user groups that the surveys refer to are very different, ranging from GP-patients in Belarus to disabled people in the UK and IBD patients in Israel. Apart from the apparent differences between user groups, there are also differences between countries in the position and tasks of GPs. The situation in Ukraine and Belarus was transitional, even ten years after the fall of the Iron Curtain [31]. This may have reduced comparability and created variations in the respondents and in the services to be evaluated.
The QUOTE-questionnaires provide a general framework, and researchers have adapted them to their own aims. Against the advantage of flexible adaptation to different research aims and populations stands the disadvantage of reduced (international) comparability. For (international) comparisons more standardisation is very important. Apart from that, the ten QUOTE-aspects we used in our analyses cover different dimensions of the quality of care concept and can be understood as being comprehensive enough for this explorative type of research.
In conclusion, we believe it is important to continue to do research into health care related values because of the increasing importance of user views, both in the health policies of European countries separately and in the international debate about the performance of health care systems. There is not much ground for strong cultural relativism, saying that what is important in the eyes of health care users is so different that it is not possible to develop performance measures that can be used in a wide range of countries.

Conclusion
There are differences between countries in the importance people attach to aspects of health care. Most of this variation is related to individual differences, but there is also significant variation between countries. The ranking of aspects shows similarities between countries. In nearly all countries, people ranked the item that their GP should take them seriously as most important, while an item about waiting time was always ranked lowest. It is difficult to explain the variation between countries. Further theorising should take into account that importance and performance ratings are positively related, that people compare their experiences with those of others, and that general and instrumental values might be related through the institutions of the health care system.