Doctor and practice characteristics associated with differences in patient evaluations of general practice
BMC Health Services Research volume 7, Article number: 46 (2007)
Variation in patients' evaluation due to general practitioner (GP) and practice factors may provide information useful in a quality improvement context. However, the extent to which differences in patients' evaluation of the GPs are associated with differences in GP and practice characteristics must also be ascertained in order to facilitate comparison of adjusted patient evaluations between GPs. The aim of this study was to determine such associations in a setting where GPs serve a list of patients and act as gatekeepers.
We carried out a patient evaluation survey among voluntarily participating GPs using the EUROPEP questionnaire, which produced 28,260 patient evaluations (response rate 77.3%) of 365 GPs. In our analyses we compared the prevalence of positive evaluations in groups of GPs.
Our principal finding was a negative association between the GP's age and the evaluation of all aspects, except accessibility. We also found an association between the way the practice was organised and the patients' evaluation of accessibility, with GPs in single-handed practices receiving by far the most positive evaluations. Long weekly working hours were associated with more positive evaluations of all dimensions except accessibility, whereas more than 0.5 full-time employees per GP, a higher number of listed patients per GP and working in a training practice were associated with negative evaluation of accessibility.
GP characteristics are mainly associated with patients' experience of interpersonal aspects of care, while practice characteristics are associated with evaluation of accessibility. These differences need to be accounted for when comparing patient evaluations of different practices.
Ongoing efforts to improve the quality of general practice care increasingly include patients' evaluation of various aspects of care [1–3]. Variation in patient evaluation of general practice reflects differences in performance, which, to some extent, may be associated with GP and practice characteristics, and differences in the patients' perception of performance. Crude unadjusted results, though, may serve best for quality improvement at a practice level as they reflect the extent to which the GPs succeed in meeting the patients' individual needs and dealing with different working conditions.
When comparing practices and when holding practices against standards, comparability is obtained through standardisation – that is, adjustment for systematic variation in patient, GP and practice characteristics between the compared groups. Studies of associations between patient evaluations and patient, GP and practice characteristics are needed to determine which characteristics to adjust for. Knowledge of such associations may also serve quality improvement purposes at regional and national levels.
Earlier studies report a positive association between patients' evaluation and the GP's age, but a negative association between the evaluation and the GP's age and seniority has also been found [6, 7]. Baker [5, 6] and Campbell found an inverse relation between assessment and total list size as well as list size per GP. Lin found better evaluations of group practices than of single-handed practices, while Hombergh found the opposite, which was also supported by Wensing, who found better ratings of practices with fewer and full-time working GPs. This supports the general finding that patients prefer personal, continuous care [10, 12, 13]. In addition, Baker [5, 6] found that being a training practice produced less positive evaluations.
These results derive from studies in different settings and with different degrees of adjustment for differences in patient characteristics. Hence, there seems to be a need for a sufficiently powered comprehensive study of associations between patient evaluations and GP and practice characteristics adjusted for systematic variation in patient characteristics. Such a study will provide us with information about GP and practice factors of significance to the quality of general practice and information on how to standardize patient evaluation results for comparison between (groups of) practices.
Against this background, the aim of the present study is to determine to what extent variation in patient evaluations of the GPs is associated with GP and practice characteristics.
This study was carried out in a general practice setting where self-employed GPs work as gatekeepers for the public health services on a contract basis serving patients on their list (a brief introduction to the Danish general practice is given in Additional file 1). During 2002–4 all 2181 GPs from ten Danish counties received an invitation to carry out patient evaluations of their practices. A total of 365 GPs (16%–34% of all GPs in these counties) entered the project and undertook an evaluation. This was part of the national project on patient evaluations, the DanPEP. The participating GPs handed out questionnaires to 100 successive patients seen by the GP in the surgery or at home visits. The patients were at least 18 years of age, were listed in the practice and accepted a Danish language questionnaire. They were informed that their replies were anonymous to the doctor. Each questionnaire was identified by a serial number connecting it with the GP who handed it out and with the patient. All questionnaires not distributed within two weeks could be returned to the research office. All GPs filled in a form with information about the GP and the practice. Practice information from GPs working in the same practice was cross-checked.
The questionnaire contained the 23 items forming the EUROPEP instrument, which is a European validated instrument for patient evaluation of general practice care based on literature analysis and surveys of patients' expectations and opinions on good care [14, 15]. The 23 items are displayed in Additional file 2.
These questions covered specific aspects of general practice and were grouped into five dimensions: doctor-patient relationship, medical care, information and support, organisation of care and accessibility. The answers were marked on a 5-point scale ranging from "poor" to "excellent", with "acceptable" as the middle value. Alternatively, the patients could choose a sixth category "not able to answer/not relevant". The questionnaire also included questions about the patient's gender, age, educational level, frequency of attendance to a general practice, time listed with the GP, self-rated health and chronic conditions.
The patients were asked to assess the GP they considered to be their personal GP (and his practice when assessing aspects of accessibility) based on their contact experience over the past 12 months. They were also asked to write the GP's name on the questionnaire to confirm which GP was assessed and to allow individual assessment of GPs in partnership practices. The questionnaires were returned by the patients in prepaid envelopes to the research office.
In order to be able to carry out the reminder procedure, the GPs recorded the names, addresses and serial numbers from the questionnaires handed out. Reminders with new questionnaires were sent out by the research office to non-responding patients three to five weeks after the GPs distributed the questionnaires, and the patient lists were then destroyed. At the research office, the diagnoses reported by patients with chronic conditions were coded according to the major ICPC-2 groups.
For each dimension, a patient's evaluation was included in the calculations only if 50% or more of the items had been answered in one of the six categories. An answer was considered positive if it fell into one of the two most favourable categories. An assessment of a dimension was categorised as 100%, 50–99% or 0–49% positive depending on the proportion of positively evaluated items in that dimension. In the analyses we compared the prevalence of assessments in the 100% category between strata and the prevalence in the 0–49% category, respectively.
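The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the numeric coding (1 = "poor" to 5 = "excellent", 0 = "not able to answer/not relevant", `None` = left blank), the function name `classify_dimension`, and the choice to compute the proportion of positive items among items rated on the 5-point scale are all hypothetical, since the text does not specify the denominator.

```python
from typing import Optional, Sequence

# Hypothetical coding (not from the article):
# 1 = "poor" ... 5 = "excellent", 0 = "not able to answer/not relevant",
# None = item left blank.
NOT_RELEVANT = 0


def classify_dimension(answers: Sequence[Optional[int]]) -> Optional[str]:
    """Return "100%", "50-99%" or "0-49%" for one patient and one dimension,
    or None when the evaluation is excluded from the calculations."""
    n_items = len(answers)
    answered = [a for a in answers if a is not None]
    # Inclusion rule: at least 50% of the items answered in any of the
    # six categories (the sixth being "not relevant").
    if n_items == 0 or len(answered) * 2 < n_items:
        return None
    # Assumption: the proportion is taken over items rated on the 5-point scale.
    rated = [a for a in answered if a != NOT_RELEVANT]
    if not rated:
        return None
    # An answer is positive if it is in one of the two most favourable categories.
    share_positive = sum(1 for a in rated if a >= 4) / len(rated)
    if share_positive == 1.0:
        return "100%"
    if share_positive >= 0.5:
        return "50-99%"
    return "0-49%"
```

Each patient then contributes at most one category per dimension, and the analyses compare how often the "100%" and the "0-49%" categories occur across groups of GPs.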
The gender and age of the GPs, their seniority in present practice, number of weekly working hours (out-of-hours-duties, teaching away from practice and consultative services were not included), number of listed patients and full-time working staff per GP were categorised as displayed in Table 1 (in shared single-handed practices we only counted one GP). Practices were registered as urban, rural or mixed. They were divided into categories of single-handed, shared single-handed, groups of single-handed or partnership practices with doctors working either full-time or part-time (practice types are explained in Additional file 1). Involvement in education was registered as education given in the practice and/or in settings outside the practice or as no involvement.
Associations between the GP and practice characteristics and the assessment scores for each of the five dimensions were estimated as prevalence ratios (PR). The PRs with 95% confidence intervals (95% CI) were chosen instead of odds ratios, which would tend to overestimate the associations owing to the high prevalence of the variables [18, 19].
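A small numerical sketch may help to show why odds ratios overstate associations when the outcome is common. The prevalences below (80% versus 70%, and 8% versus 7%) are hypothetical and chosen only for illustration; they are not results from this study.

```python
def prevalence_ratio(p_exposed: float, p_reference: float) -> float:
    """Ratio of two prevalences (proportions strictly between 0 and 1)."""
    return p_exposed / p_reference


def odds_ratio(p_exposed: float, p_reference: float) -> float:
    """Ratio of the corresponding odds p / (1 - p)."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_reference = p_reference / (1.0 - p_reference)
    return odds_exposed / odds_reference


# Hypothetical common outcome: 80% vs 70% positive evaluations.
pr_common = prevalence_ratio(0.80, 0.70)  # about 1.14
or_common = odds_ratio(0.80, 0.70)        # about 1.71

# Hypothetical rare outcome with the same prevalence ratio: 8% vs 7%.
pr_rare = prevalence_ratio(0.08, 0.07)    # about 1.14
or_rare = odds_ratio(0.08, 0.07)          # about 1.16
```

With a rare outcome the two measures nearly coincide, whereas with a prevalence around 70–80% the odds ratio is markedly further from 1 than the prevalence ratio.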
In the first model we estimated the crude PRs between the GP and practice characteristics and the proportion of 100% and 0–49% positive evaluations of each dimension. Based on these univariate analyses we identified confounding GP and practice variables. Hence, we found statistically significant correlations between GP age, gender and seniority and between organisation of practice, list size per GP, weekly working hours, urbanity and staff per GP and a high collinearity between GPs' age and seniority. GP age was more closely associated with patient evaluation than seniority and was therefore chosen as the adjusting variable. List size and urbanity were not associated with assessment and, hence, despite correlation with other variables, did not act as confounders.
In the next model we estimated the association between the GP and practice characteristics and the evaluation adjusted for patient characteristics (patient-adjusted PR). The confounding patient variables (patients' age and gender, frequency of attending a GP and self-rated health) were identified from analyses of associations between patient characteristics and their GP evaluation performed on the data set also used in the present study (Heje et al., submitted).
In the last model we included the confounding GP and practice variables resulting from the analyses in the first model, which were GP gender, age and weekly working hours, organisation of practice and staff (number of employees converted into full-time) per GP, along with the patient variables used in the former model and calculated the fully adjusted PRs for the associations between GP and practice characteristics and the patients' evaluations.
We did two sets of analyses. In the first set the dependent variable was the prevalence of assessments in the 100% category. In the second set it was the prevalence of assessments in the 0–49% category.
Depending on the model (crude, patient adjusted or fully adjusted), our independent variables were GP gender, age and weekly working hours, organisation of practice and staff per GP and patient age, gender, frequency of attending a GP and self-rated health.
We used generalised linear models (GLM) with a log link and the Bernoulli family, i.e. modelling the PRs directly. Due to the high prevalences, some of the adjusted GLM analyses could not converge using the Bernoulli family. In these situations, we used Poisson regression [20, 21]. Model fit for each multivariate model was tested using the Hosmer-Lemeshow test for goodness of fit, developed for testing logistic modelling. However, where Poisson regression was used, the high prevalences could produce estimated probabilities greater than one, which would seriously hamper the Hosmer-Lemeshow goodness of fit test. In such situations, we therefore used the post-estimation goodness of fit test for Poisson regression based on deviance statistics.
Patient clustering by GPs produced relatively high intra-class correlation coefficients (ranging from 0.034 to 0.134). We accounted for patient clustering by using robust standard errors in all analyses [24, 25]. Analyses were performed using complete data only, i.e., the univariate and the GLM analyses were performed using the same set of data so that an increasing number of missing values would not explain differences in associations when adjusting for more variables. We used Stata 9.1 for data processing.
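To illustrate why a Poisson model with a log link can stand in for the log-Bernoulli model, the following sketch fits such a model to a binary outcome with a single binary covariate using plain Newton iterations. It is a minimal pure-Python illustration, not the authors' Stata analysis: the function name `fit_log_poisson`, the toy data and the fixed number of iterations are assumptions, and the robust standard errors used in the study are not computed here.

```python
import math
from typing import List, Tuple


def fit_log_poisson(x: List[int], y: List[int], n_iter: int = 25) -> Tuple[float, float]:
    """Fit log E[y] = b0 + b1 * x by Newton's method on the Poisson likelihood."""
    b0 = math.log(sum(y) / len(y))  # start at the overall prevalence
    b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum of (y - mu) * (1, x).
        u0 = sum(yi - mi for yi, mi in zip(y, mu))
        u1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix: sum of mu * (1, x)(1, x)^T.
        i00 = sum(mu)
        i01 = sum(xi * mi for xi, mi in zip(x, mu))
        i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = i00 * i11 - i01 * i01
        # Newton step: beta += inverse(information) * score (2x2 inverse written out).
        b0 += (i11 * u0 - i01 * u1) / det
        b1 += (i00 * u1 - i01 * u0) / det
    return b0, b1


# Hypothetical toy data: 7 of 10 positive in the reference group (x = 0),
# 8 of 10 positive in the comparison group (x = 1).
x = [0] * 10 + [1] * 10
y = [1] * 7 + [0] * 3 + [1] * 8 + [0] * 2
b0, b1 = fit_log_poisson(x, y)
estimated_pr = math.exp(b1)  # equals the ratio of the two observed prevalences
```

Because the Poisson likelihood places no upper bound on the fitted mean, the iterations do not break down when fitted values approach one. The price is that the model-based standard errors are too wide for binary data, which is one reason such models are usually paired with robust standard errors.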
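As a rough indication of how much clustering of this size matters, the standard design effect for equally sized clusters, 1 + (m − 1) × ICC, can be evaluated for the reported range of intra-class correlation coefficients. This is a back-of-the-envelope sketch only: it assumes equal cluster sizes and uses the average number of valid responses per GP (28,260 responses from 365 GPs, as reported in this article), whereas the study itself handled clustering through robust standard errors.

```python
def design_effect(icc: float, mean_cluster_size: float) -> float:
    """Variance inflation for equally sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


# Average number of valid responses per GP, from figures reported in the article.
mean_responses_per_gp = 28260 / 365

# Reported range of intra-class correlation coefficients.
deff_low = design_effect(0.034, mean_responses_per_gp)
deff_high = design_effect(0.134, mean_responses_per_gp)

# Approximate effective sample sizes if clustering were ignored.
n_effective_low_icc = 28260 / deff_low
n_effective_high_icc = 28260 / deff_high
```

Under these simplifying assumptions, even the smallest reported coefficient inflates the variance several-fold, which is why standard errors that ignore clustering would be too narrow.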
A total of 36,561 questionnaires were distributed by the 365 GPs. After a reminder procedure and exclusion of responses from patients not indicating which GP they assessed or assessing non-participating GPs, we had 28,260 (77.3%) valid responses. Characteristics of the evaluated GPs and their practices are displayed in Table 1. Respondent characteristics are shown in Table 2. There were more than twice as many female as male respondents (Table 2), which reflects that women are more inclined than men to attend a GP and to respond to questionnaires (Heje et al., submitted).
GP gender was not associated with evaluation outcome when we adjusted for patient, GP and practice confounders. However, adjustment did not neutralize the negative association found with GP age for all dimensions except accessibility. GPs aged 30–39 years received the most positive evaluations, whereas there was no difference between the other age categories. There was no association with seniority, neither before nor after adjustment.
In the following, the word "association" refers to the fully adjusted PR, i.e. the PR adjusted for confounding patient, GP and practice variables and for patient clustering.
For the first four dimensions ("GP-patient-relationship", "medical care", "information and support" and "organisation of care"), we found a positive association between the evaluation and GP working hours in excess of 45 hours a week (Tables 3, 4, 5, 6). Patients' evaluation of the organisation of practice (e.g. continuity) was positively associated with the GP's weekly working hours exceeding 37 hours. There was no association between weekly working hours and patients' assessment of accessibility (Table 7).
We saw practically no association with practice urbanisation. The number of patients and the number of employees per GP were negatively associated with the experienced accessibility (Table 7), but not with the evaluation of the other dimensions (Tables 3, 4, 5, 6). The way the GPs had arranged themselves in practice played a minor role regarding the first four dimensions (Tables 3, 4, 5, 6), while there were major differences regarding accessibility. GPs in single-handed practices were assessed more positively than GPs in shared single-handed, groups of single-handed and partnership practices. Accessibility to GPs working part-time in partnership practices was assessed least favourably (Table 7).
Working in a training practice was only associated with the assessment of accessibility aspects (Table 7), where we found a negative association, while the evaluations were not associated with teaching outside the practice.
We found a strong negative association between the GP's age and the evaluation of all aspects, except accessibility. We also found a strong association between the way the practice was organised and the patients' evaluation of accessibility, with GPs in single-handed practices receiving by far the most positive evaluations. Long weekly working hours were associated with more positive evaluations of all dimensions, except accessibility, whereas more than 0.5 full-time employees per GP, a comparatively high number of listed patients per GP and working in a training practice were associated with negative evaluation of accessibility.
Discussion of methods
We chose to include the GPs' age as ten-year categories, and weekly working hours were categorised into two categories above and two categories below standard Danish full-time hours. For both variables we found general trends in their associations with assessments, which indicates that more narrow ranges would hardly have revealed additional patterns of associations. Seniority in present practice was preferred to seniority as a GP as the latter to some extent would correlate with the GP's age, while the former indicates the GP's opportunity to provide continuity.
We did not include locums and vocational trainees in the variable "number of GPs sharing premises", even though they may have influenced the total practice resources and the number of staff needed.
We chose a one-level model to account for patient clustering. However, GPs were also clustered in practices and a two-level model may therefore have been more applicable. Due to difficulties in converging the two-level model, we tested the differences between one- and two-level models and found that the variances between GPs were small, and differences in PR estimates were therefore negligible (below 2%).
Overall, the model fit proved to be good, both using the Bernoulli family and in the situations where Poisson regression was used. As explained earlier in the methods section, the high prevalences forced us to use Poisson regression in some instances to enable the model to converge, which in this study did not change the estimates [20, 21].
The study enjoyed a very high statistical precision, which meant that we were able to detect quite small statistically significant associations and one might question their clinical relevance. However, the precision was, indeed, considerable, so the risk of overlooking associations (type II-error) was accordingly extremely low.
In several cases, we found that GP and practice characteristics were differently associated with the evaluation of various aspects of care. Such differences may be blurred in measures of general satisfaction, which makes it difficult to compare our results with results from studies using this kind of measure [28, 29].
The project was part of a large national patient evaluation project. This may have introduced some methodological weaknesses in relation to the aim of this particular study. All GPs in the involved counties were invited, but those who entered the project were not necessarily a representative fraction. The method of patient inclusion should ideally secure a random draw from the attending part of the listed patients, although frequently attending patients would be overrepresented. We do not know to what extent GPs forgot to hand out questionnaires or more systematically left some patients out. We focused on adjusted associations between evaluations and GP and practice characteristics. Selection of GPs and patients therefore did not have the same impact as if the aim had been to investigate actual evaluation levels.
Discussion of results
Our finding that GP gender was not associated with the evaluation is in agreement with earlier findings [6, 7], whereas the statistically significantly more positive evaluation of the youngest GPs in this field represents new knowledge. Earlier studies have reported a positive association between patients' evaluation of general practice care and the GP's age, but negative associations between the evaluation and the GP's age and seniority have also been found [6, 7]. We found no correlation between the GP's age and the number of patients per GP or practice organisation that could explain the association found. The negative association between GP age and the evaluation may indicate that patients experience younger GPs as more skilled or it may reflect possible deficiencies in the continuous medical education (CME) of older GPs. Perhaps patients expect more from older GPs, or perhaps GPs over time adjust their effort in order to counter burnout. We would have expected the impact of continuity in the relationship to be reflected in a positive association between GP age and the evaluation score. Perhaps a "shift of scenes" is sometimes appreciated by the patients. If the GP is not aware of this potential problem, continuity may imply a degree of preoccupancy that hinders him from perceiving and adjusting to changes in the patient's life situation and health. This challenges the perception of personal, continuous care as an unconditionally valuable quality [10, 12, 13]. Yet another possible explanation may be that the association originated in a cohort effect, reflecting improvement in the CME of GPs, including both technical and inter-personal competencies.
The GP's weekly working hours were not associated with the patients' perception of accessibility, but were positively associated with aspects of care organisation, including continuity, presumably because continuity is closely related to the GP personally, whereas lack of accessibility may be compensated for by the colleagues or managed otherwise.
The geographical location of the practice was not associated with the evaluation. It is surprising, though, to see that practising in rural districts was not negatively associated with the evaluation of accessibility, even though regions with shortage of GPs were included in this study. This may possibly be explained by an adjustment of patient expectations due to public discussion of the problem.
As in earlier studies [5, 6, 8, 30], the number of patients per GP was not consistently associated with the evaluation in the first four dimensions, but we were not surprised to see that it was negatively associated with the patients' perception of accessibility. Inversely, the number of staff was also negatively associated with accessibility. In Denmark, practice nurses and laboratory assistants can conduct consultations on their own if supervised by the GP. Delegating certain services to practice staff may induce a sense of limited access to the GP, which is supported by the fact that we found that GPs with more employees served larger lists of patients.
In concordance with earlier findings [8, 31], there was a slight tendency towards more positive evaluation of the GP's technical care in part-time partnership practices, but otherwise practice organisation mainly influenced the patients' experience of accessibility. GPs in single-handed practices got, by far, the most positive assessments of all, while part-time partnership practices obtained the most negative assessments. This supports results from earlier studies [8, 10, 11]. More modest patient expectations of the single-handed GP's accessibility may explain this, or the explanation may be concealed by other factors regarding organisation, service and priorities that we have not been able to adjust for.
Baker [5, 6] found training practices to obtain less favourable assessments than non-training practices. Being involved with teaching and training out of practice was not associated with the evaluation in our study and working in a training practice was only associated with a comparatively lower evaluation of accessibility.
Policy and practice implications
Our results suggest adjustment for GP age whenever evaluation results are compared between practices. On the other hand, both the GPs themselves and policy makers must be aware of correctable consequences of increasing GP age. Depending on the situation, adjustment for practice organisation may also be appropriate, but as practice organisation may be closely interwoven with other organisational aspects, this may blur important differences between practices that ought to be addressed.
We found single-handed practice, a short patient list, only a few employees, and not being a training practice to be associated with a better patient-experienced accessibility to care. This contrasts with the finding of better technical performance of GPs in larger partnerships [8, 31], with the current trend in society that favours larger primary care units, and with the wish of younger doctors to work part-time hours in partnership practices. Further studies are therefore needed to discover what lies behind these results and how practice may be organised in order to secure high professional standards and at the same time satisfy the patients' needs for personal continuity and accessibility. As this is in some way a general paradox concerning all practices, policy makers and GPs should come together to seek possible solutions to this problem.
It cannot be concluded that patient evaluation results should always or never be adjusted for differences between practices. Adjustment may be appropriate when comparing results between (groups of) practices, but in the perspective of individual practices, one may claim that every GP must strive to satisfy his patients regardless of the fact that his working conditions may be different from those of his colleagues.
We found the GP's age to be negatively associated with the patients' evaluation of all aspects of care, except accessibility. We also found a strong association between the way the practice was organised and the patients' evaluation of accessibility, with GPs in single-handed practices receiving by far the most positive evaluations. We suggest that future evaluations be adjusted for differences in the GPs' age and practice organisation before comparing (groups of) practices. Long weekly working hours were associated with more positive evaluations of all dimensions, except accessibility, whereas more than 0.5 full-time employees per GP, a comparatively high number of listed patients per GP and working in a training practice were associated with negative evaluation of accessibility. These results may be used to adjust practice in order to increase the patient-experienced accessibility to care.
Measurement of patients' satisfaction with their care. 1993, London: Royal College of Physicians of London, 1
Jung HP: Quality of care in general practice. The patient perspective [thesis]. 1999, Nijmegen: University of Nijmegen, University of Maastricht
O'Riordan M, Seuntjens L, Grol R, (editors): Improving patient care in primary care in Europe. 2004, Netherland: EQuiP, 1
Perneger TV: Adjustment for patient characteristics in satisfaction surveys. Int J Qual Health Care. 2004, 16: 433-435. 10.1093/intqhc/mzh090.
Baker R, Streatfield J: What type of general practice do patients prefer? Exploration of practice characteristics influencing patient satisfaction. Br J Gen Pract. 1995, 45: 654-659.
Baker R: Characteristics of practices, general practitioners and patients related to levels of patients' satisfaction with consultations. Br J Gen Pract. 1996, 46: 601-605.
Kvamme OJ, Sandvik L, Hjortdahl P: [Practice patterns, physicians' characteristics and patient-evaluated quality of general practice in Norway] (in Norwegian). Tidsskr Nor Laegeforen. 2000, 120: 2499-2502.
Campbell JL, Ramsay J, Green J: Practice size: impact on consultation length, workload, and patient assessment of care. Br J Gen Pract. 2001, 51: 644-650.
Lin HC, Xirasagar S, Laditka JN: Patient perceptions of service quality in group versus solo practice clinics. Int J Qual Health Care. 2004, 16: 437-445. 10.1093/intqhc/mzh072.
van den Hombergh P, Engels Y, van den Hoogen H, van Doremalen J, van den Bosch W, Grol R: Saying 'goodbye' to single-handed practices; what do patients and staff lose or gain?. Fam Pract. 2005, 22: 20-27. 10.1093/fampra/cmh714.
Wensing M, Vedsted P, Kersnik J, Peersman W, Klingenberg A, Hearnshaw H, Hjortdahl P, Paulus D, Künzi B, Mendive J, Grol R: Patient satisfaction with availability of general practice: an international comparison. Int J Qual Health Care. 2002, 14: 111-118.
Hjortdahl P, Laerum E: Continuity of care in general practice: effect on patient satisfaction. BMJ. 1992, 304: 1287-1290.
Baker R, Mainous AG, Gray DP, Love MM: Exploration of the relationship between continuity, trust in regular doctors and patient satisfaction with consultations with family doctors. Scand J Prim Health Care. 2003, 21: 27-32. 10.1080/02813430310002995.
Grol R, Wensing M: Patients evaluate general/family practice. The EUROPEP instrument. 2000, EQuiP, WONCA Region Europe
Grol R, Wensing M, Mainz J, Jung HP, Ferreira P, Hearnshaw H, Hjortdahl P, Olesen F, Reis S, Ribacke M, Szecsenyi J: Patients in Europe evaluate general practice care: an international comparison. Br J Gen Pract. 2000, 50: 882-887.
Bjørner JB, Damsgaard MT, Watt T, Bech P, Rasmussen NK, Kristensen TS, Modvig J, Thunedborg K: [The Danish manual for SF-36. A health status questionnaire]. 1997, Copenhagen: Lif, (in Danish)
Lamberts H, Wood M: ICPC. [International Classification for Primary Care] (Danish version). 1990, Oxford Medical Publications
Clayton D, Hills M: Statistical Models in Epidemiology. 1993, Oxford: Oxford University Press
Modern Epidemiology. 1998, Philadelphia: Lippincott-Raven Publishers, Second
Barros AJ, Hirakata VN: Alternatives for logistic regression in cross-sectional studies: an empirical comparison of models that directly estimate the prevalence ratio. BMC Med Res Methodol. 2003, 3: 21-10.1186/1471-2288-3-21.
Zou G: A modified poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004, 159: 702-706. 10.1093/aje/kwh090.
Hosmer DW, Lemeshow S: Applied Logistic Regression. 1989, New York: WILEY, 1
Armitage P, Berry G, Matthews JNS: Statistical Methods in Medical Research. 2005, Oxford: Blackwell Science, Fourth
Donner A, Klar N: Design and Analysis of Cluster Randomisation Trials in Health Research. 2000, London: Hodder Arnold
Stata Statistical Software: Release 8.0. 2003, College Station, TX: Stata Corporation
Stata Statistical Software: Release 9.0. 2005, College Station, TX: StataCorp LP
Statistics Denmark, Statbank Denmark. (Accessed November 2005), [http://www.statbank.dk]
Cleary PD, McNeil BJ: Patient satisfaction as an indicator of quality care. Inq. 1988, 25: 25-36.
Sitzia J, Wood N: Patient satisfaction: a review of issues and concepts. Soc Sci Med. 1997, 45: 1829-1843. 10.1016/S0277-9536(97)00128-7.
Campbell SM, Hann M, Hacker J, Burns C, Oliver D, Thapar A, Mead N, Safran DG, Roland MO: Identifying predictors of high quality care in English general practice: observational study. BMJ. 2001, 323: 784-787. 10.1136/bmj.323.7316.784.
Baker R: General practice in Gloucestershire, Avon and Somerset: explaining variations in standards. Br J Gen Pract. 1992, 42: 415-418.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6963/7/46/prepub
This study was carried out as part of the national project on patient evaluations, the DanPEP. In the DanPEP project, the local committees for quality improvement in general practice in each county were approached in order to gain their support prior to the individual invitation of GPs to enter the project. The project served as a pilot study prior to the decision on how to conduct future systematic quality assessment of general practice in the patients' perspective.
We wish to thank all the GPs and patients who through their evaluation provided this project with valuable data and Ms. Gitte Hove, cand. scient. bibl., for her competent management of the data. The DanPEP study was supported by grants from the Central Committee on Quality Development and Informatics in General Practice and the Danish Ministry of the Interior and Health. Direct expenses incurred by the participating GPs were refunded by the local Committees for Quality Improvement in General Practice in the counties of Aarhus, Frederiksborg, Funen, Ribe, Southern Jutland, Vejle and Western Zealand and the municipalities of Bornholm, Copenhagen and Frederiksberg.
The author(s) declare that they have no competing interests.
HH, PV and FO planned the study, and HH carried out the patient evaluation survey assisted by the research secretariat. IS, PV and HH planned the statistical analyses, which were performed by IS. HH drafted the manuscript, which was rewritten by HH, PV, FO and IS. All authors read and approved the final manuscript.