Effect of response format for clinical vignettes on reporting quality of physician practice
BMC Health Services Research volume 9, Article number: 128 (2009)
Clinical vignettes have been used widely to compare quality of clinical care and to assess variation in practice, but the effect of different response formats has not been extensively evaluated. Our objective was to compare three clinical vignette-based survey response formats – open-ended questionnaire (A), closed-ended (multiple-choice) questionnaire with deceptive response items mixed with correct items (B), and closed-ended questionnaire with only correct items (C) – in rheumatologists' pre-treatment assessment for tumor-necrosis-factor (TNF) blocker therapy.
Study design: Prospective randomized study. Setting: Rheumatologists attending the 2004 French Society of Rheumatology meeting. Physicians were given a vignette describing the history of a fictitious woman with active rheumatoid arthritis, who was a candidate for therapy with TNF blocking agents, and then were randomized to receive questionnaire A, B, or C, each containing the same four questions but with different response formats, that asked about their pretreatment assessment. Measurements: Long (recommended items) and short (mandatory items) checklists were developed for pretreatment assessment for TNF-blocker therapy, and scores were expressed on the basis of responses to questionnaires A, B, and C as the percentage of respondents correctly choosing explicit items on these checklists. Statistical analysis: Comparison of the selected items using pairwise chi-square tests with Bonferroni correction for variables with statistically significant differences.
Data for all surveys distributed (114 As, 118 Bs, and 118 Cs) were complete and available for analysis. The percentage of questionnaire A, B, and C respondents who correctly selected all items of the short checklist was 50.4%, 84.0% and 95.0%, respectively, and was 0%, 5.0% and 5.9%, respectively, for the long version. As an example, 65.8%, 85.7% and 95.8% of respondents to questionnaires A, B, and C, respectively, correctly identified the need for a tuberculin skin test (p < 0.0001).
In evaluating clinical practice with use of a clinical vignette, a multiple-choice format rather than an open-ended format overestimates physician performance. Inserting deceptive response items among the correct items in a closed-ended (multiple-choice) questionnaire failed to avoid this overestimation.
Improvement of quality of clinical practice needs quality measurements. These measurements must be accurate, valid and feasible. The advantages and disadvantages of different methods of measuring the process of care, including both the competence of the clinician and what the clinician actually does, have been well described. Methods include chart extraction, standardized patients and clinical vignettes. Substantial inaccuracies in administrative data are common, which leads to expensive data extraction and difficulty in validating data [1–3]. Compared to standardized patients and chart extraction, clinical vignettes are an accurate, valid, feasible and inexpensive tool to measure quality of health care [4, 5]. Thus, clinical vignettes have been used widely to compare quality of clinical care and to assess variation in practice across countries, health care systems, specialties or clinicians [6–11].
Vignette-based surveys for physicians feature open-ended questions rather than multiple-choice questions or checklists. In this way, physicians can give a personal response to each question, which ensures that the survey captures the full range of practice variation [11, 12]. Open-ended questions avoid the "cueing" inherent in multiple-choice questions, which could overestimate real physician performance. However, the accuracy of closed-ended questionnaires has been demonstrated in several domains: in eyewitness testimony, for example, the closed-ended format yielded a significantly higher rate of accuracy than an open-ended format. Closed-ended questionnaires also maximize the questionnaire response rate and ensure questionnaire completeness.
Moreover, different formats yield different answers: in one study, a closed-ended questionnaire produced results reflecting higher willingness to pay for a health care intervention, with different justifications for those evaluations, than did an open-ended one. A public opinion survey with two different response modalities asked subjects about the most important problem facing the United States: respondents to the open-ended format most often complained about political leadership, whereas those to the closed-ended format considered violence the most important.
Given these findings, we wanted to evaluate how the response format of clinical vignette-based surveys influences physician responses. Assuming that closed-ended (multiple-choice) questions for vignettes produce different responses than open-ended questions, leading to an overestimation of professional performance, we aimed to determine whether including deceptive response items in the closed-ended questionnaire results in a better assessment of professional performance.
We conducted a prospective randomized study aimed at comparing three response modalities for a vignette-based survey: open-ended questionnaire, closed-ended questionnaire (with only correct response items) and closed-ended questionnaire with deceptive response items mixed with correct items.
The survey was composed of two parts. The first part was short, identical in each questionnaire, and collected demographic characteristics and specialties of physicians. The second part was the clinical vignette.
The vignette reported the history of a fictitious 50-year-old woman with active rheumatoid arthritis, a candidate for therapy with tumor necrosis factor (TNF) blocking agents. Physicians were asked to answer four questions about their pre-treatment assessment, considering that TNF-blocking treatment was planned: 1) What specific data are you searching for in this patient's history? 2) What clinical data are you personally searching for during the physical examination? 3) Which biological, radiographic or other tests do you request? 4) What other preventive measures do you take? Physicians were given these questions in one of three questionnaire formats: open-ended questionnaire (questionnaire A) [see Additional file 1], closed-ended (multiple-choice) questionnaire with deceptive response items mixed with correct items (n = 73) (questionnaire B) [see Additional file 2], or closed-ended questionnaire with only correct items (n = 35) (questionnaire C) [see Additional file 3]. Correct and deceptive response items were formulated by three experts (XM, TP and FL), who worked from published international and national recommendations intended to help physicians care for patients receiving this treatment [17–21]: they first determined the correct items and then proposed deceptive items. Each expert drafted 20 deceptive items across 4 categories: patient's history; physical examination; biological, radiographic or other tests; and other preventive measures. Of the 53 items that remained after duplicates were eliminated, only the most believable were kept, yielding 38 deceptive items, which were mixed with the 35 correct items in questionnaire B.
Responses to questionnaire A were coded for comparison to those of the other two questionnaires. For each item, the response was classified as "item correctly selected"; "item incorrectly selected"; "item correctly not selected"; or "item incorrectly not selected." We classified each item according to three sources: an evidence-based literature search of clinical practice concerning TNF-blocking drug management, international and national guidelines, and a French clinical tool guide on use of TNF-blocking agents developed by an expert panel of academic and community physicians. From these sources, we developed two checklists of items for pretreatment assessment for TNF-blocker therapy: a long version extracted from the French clinical tool guide on use of TNF-blocking agents, with detailed data on research into possible contraindications (Table 1), and a short version extracted from the same clinical tool guide and from the French agency for health care recommendations, with items mandatory in France (Table 2).
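The four-way coding of responses and the "all short-checklist items selected" score can be sketched as follows. This is a minimal illustration, not the authors' coding procedure: the answer key is a hypothetical four-item excerpt (the real questionnaires had 35 correct and 38 deceptive items), and it assumes that "correctly not selected" refers to a deceptive item left unselected and "incorrectly not selected" to a correct item left unselected.

```python
# Hypothetical excerpt of an answer key: True = correct item, False = deceptive item.
# The item names come from examples in the text; the full key is not reproduced here.
ANSWER_KEY = {
    "tuberculin skin test": True,
    "chest X-ray": True,
    "blood sugar level": False,
    "systematic lung specialist advice": False,
}

# Assumed subset of mandatory items (the two short-checklist items named in the text).
SHORT_CHECKLIST = {"tuberculin skin test", "chest X-ray"}


def classify(item, selected_items):
    """Return one of the four coding categories for a single item."""
    selected = item in selected_items
    is_correct = ANSWER_KEY[item]
    if selected and is_correct:
        return "item correctly selected"
    if selected and not is_correct:
        return "item incorrectly selected"
    if not selected and not is_correct:
        return "item correctly not selected"
    return "item incorrectly not selected"


def completes_short_checklist(selected_items):
    """True when every mandatory item was selected by this respondent."""
    return SHORT_CHECKLIST <= set(selected_items)


def percent_completing(respondents):
    """Percentage of respondents who selected all mandatory items."""
    complete = sum(completes_short_checklist(r) for r in respondents)
    return 100.0 * complete / len(respondents)


# Two made-up respondents, for illustration only.
respondents = [
    {"tuberculin skin test", "chest X-ray", "blood sugar level"},
    {"chest X-ray"},
]
for item in ANSWER_KEY:
    print(item, "->", classify(item, respondents[0]))
print(percent_completing(respondents))
```

For an open-ended questionnaire, the set of selected items would first have to be derived by coding free-text answers against the same key, which is the step described above for questionnaire A.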
During the 2004 French Society of Rheumatology meeting, rheumatologists were asked to participate in a survey concerning pretreatment assessment in cases of therapy with TNF blockers, which aimed at detecting contraindications to treatment. The survey was conducted on behalf of the Club Rhumatismes et Inflammation (CRI), the division of the French Society of Rheumatology dedicated to musculoskeletal inflammatory diseases.
Until the targeted sample size was achieved, the survey distribution was randomized, with each physician receiving only one questionnaire format (A, B or C). Rheumatologists were blinded to the hypothesis. In particular, they were unaware of the existence of different response modalities, of deceptive items in questionnaire B, and that all items of questionnaire C were correct. Time to complete the survey was limited to fifteen minutes.
Four interviewers were responsible for encouraging participation in the survey, explaining the official nature of the survey, checking that all the questionnaires were correctly completed in the time allowed, and checking the randomization was achieved. Participation was voluntary, and physicians' responses were kept anonymous.
A chi-square test was used to compare, across the three questionnaire formats, the proportion of items selected or not selected. A p < 0.05 was considered statistically significant. Pairwise chi-square tests with Bonferroni correction (corrected significance threshold of 0.017) were used to compare variables with statistically significant differences. Statistical analysis involved use of SAS Release 8.2 and Splus 6.2.
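The two-step procedure (an overall chi-square test across the three formats, then pairwise tests at the Bonferroni-corrected threshold of 0.05/3 ≈ 0.017) can be sketched in plain Python. The sketch assumes a Pearson chi-square without continuity correction, which the text does not specify, and the counts are approximate values back-calculated from the reported percentages and group sizes for the tuberculin skin test item. They are illustrative only, not the authors' raw data.

```python
import math
from itertools import combinations


def chi_square_stat(rows):
    """Pearson chi-square statistic for an r x 2 table of (selected, not selected)."""
    n = sum(a + b for a, b in rows)
    col_selected = sum(a for a, _ in rows)
    col_not = n - col_selected
    stat = 0.0
    for a, b in rows:
        row_total = a + b
        for observed, col_total in ((a, col_selected), (b, col_not)):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat


def p_value(stat, df):
    """Upper-tail chi-square probability, closed forms for df = 1 and df = 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


# Illustrative counts (selected, not selected), reconstructed from percentages.
counts = {"A": (75, 39), "B": (101, 17), "C": (113, 5)}

ALPHA = 0.05
overall_p = p_value(chi_square_stat(list(counts.values())), df=2)

pairs = list(combinations(counts, 2))
bonferroni_alpha = ALPHA / len(pairs)  # 0.05 / 3, about 0.017

pairwise = {}
if overall_p < ALPHA:  # pairwise tests only for items that differ overall
    for g1, g2 in pairs:
        p = p_value(chi_square_stat([counts[g1], counts[g2]]), df=1)
        pairwise[(g1, g2)] = (p, p < bonferroni_alpha)

for pair, (p, significant) in pairwise.items():
    print(pair, f"p = {p:.4g}", "significant" if significant else "not significant")
```

Dividing the threshold by the number of comparisons keeps the overall type I error for the three pairwise tests at or below 0.05, at the cost of some power for each individual comparison.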
Sample size calculation: Three sets of 100 questionnaires – one set for each of the three questionnaires – were planned for the analysis. Considering pairwise comparisons for the response item "tuberculin skin test," with a sample size of 100 in each of two groups, a two-group chi-square test with a 0.017 two-sided significance level would have 80% power to detect a difference between a 65% proportion in one group and an 85% proportion in the other group. Because we expected 15% incomplete or non-analyzable questionnaires, we distributed 350 questionnaires.
Of 350 questionnaires distributed (114 questionnaire As, 118 questionnaire Bs and 118 questionnaire Cs), all were completed, and all responses were eligible for further analysis. Table 3 displays demographic and specialty characteristics of physicians responding to the identical format part of the survey. Physicians were similar in terms of sex, practice duration and practice modalities. Questionnaire A respondents were younger than those of the other two questionnaires. Only two questions were asked about rheumatologists' experience with TNF-blocking drugs: 69.4% had already prescribed anti-TNF therapy and 43.1% had access to a checklist for screening potential contraindications in their department.
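The stated power can be checked approximately with the standard normal approximation for comparing two proportions. This is a sketch under that assumption. The authors do not say which software or formula they used, so the result is expected to land near, not exactly on, the 80% they report.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha):
    """Approximate power of a two-sided test comparing two proportions.

    Normal approximation with a pooled variance under the null hypothesis and
    unpooled variance under the alternative; no continuity correction.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)
    se_alt = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    return NormalDist().cdf((abs(p1 - p2) - z_crit * se_null) / se_alt)


# Values taken from the sample size paragraph: 65% vs 85%, 100 per group,
# Bonferroni-corrected two-sided significance level of 0.017.
power = power_two_proportions(0.65, 0.85, n_per_group=100, alpha=0.017)
print(f"approximate power: {power:.2f}")
```

Using the corrected level of 0.017 instead of 0.05 in the power calculation is what ties the planned group size to the pairwise comparisons instead of the overall test.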
Although we expected 15% incomplete or non-analyzable questionnaires, we did not observe any missing data for open-ended or closed-ended questionnaires.
Significant differences depending on questionnaire format were found in reporting pre-treatment assessment. Compared with the two closed-ended questionnaires, the open-ended questionnaire gave lower reporting of items correctly selected and correctly not selected (Table 4).
In terms of global results, none of the questionnaire A respondents proposed all response items of the long checklist, whereas 5.0% and 5.9% of questionnaire B and C respondents, respectively, correctly selected all items (Table 4).
For the A, B and C questionnaires, 50.4%, 84.0% and 95.0% of respondents, respectively, correctly selected all mandatory response items of the short checklist. Among items within the short checklist, responses to questionnaires B and C did not differ for the item "order chest X-rays," but a difference was observed between the open- and closed-ended questionnaires (p < 0.0001) (Table 4). In contrast, respondents to questionnaires A, B, and C differed significantly for another mandatory item, "obtaining a tuberculin skin test": 65.8%, 85.7% and 95.8% of respondents, respectively, identified this item.
Rheumatologists completing the closed-ended questionnaire B, with deceptive response items, more often chose these items, such as systematically seeking the advice of a lung specialist (26.1%) or determining blood sugar level (40.3%). None of the questionnaire A respondents spontaneously proposed these items. Questionnaire B respondents showed a tendency for a lower percentage of correctly selected items than questionnaire C respondents. The open-ended format allowed for collecting qualitative data on items that we did not propose in the closed-ended questionnaires, such as "give information to the patient on potential adverse effects" or "give information to the patient on monitoring these drugs."
We compared three clinical vignette-based survey response formats: an open-ended questionnaire, a closed-ended (multiple-choice) questionnaire with cued correct items and a closed-ended questionnaire with deceptive items mixed with correct items. As expected, use of a closed-ended questionnaire with cued items overestimated physicians' performance as compared with an open-ended questionnaire, given that the latter is considered the gold standard in assessing practice [5, 12, 24]. Also as expected, the open-ended questionnaire supplied more information on clinical practice than the closed-ended questionnaires; physicians were more willing to provide information to the patient. Although we included response items on examinations or tests, such as cutaneous examination, in the closed-ended questionnaires, none of the respondents to the open-ended questionnaire suggested such tests.
Our study focuses on the difficulty in evaluating the quality of physician performance for specific domains with open-ended questionnaires. Physicians may be briefer with open-ended formats, and responses may be less accurate. Of the 114 questionnaire A respondents, 74.4% responded with "tuberculosis" to the "other tests" question but gave no specific description of a test or of the clinical examination they would perform to evaluate this tuberculosis risk. In the closed-ended format, we assumed that including deceptive items would influence respondents' answers and lower the overestimation inherent in the closed-format survey. To our knowledge, this is the first time that deceptive items have been mixed with cued items in a closed-ended questionnaire. Questionnaire B respondents indeed selected fewer correct items than did questionnaire C respondents. However, these results were very different from those obtained with the open-ended questionnaire (A), which are probably closer to reality.
The influence of framing questionnaire items remains crucial for clinical practice evaluation. This bias in response acquiescence has been reported in a study of two versions of a training satisfaction questionnaire randomly distributed to medical residents; in one, half the items were stated positively and half negatively, and in the other, all items were stated positively. Results showed a significant effect of positive versus negative framing.
In conclusion, even if closed-ended questionnaires may provide more accurate data in clinical practice evaluation, the general open-question format has value in such evaluation. Strategies for generating quantitative and qualitative data from open-ended questionnaires, whether or not combined with closed-ended questionnaires, and for facilitating survey analysis would be worth developing to improve the evaluation of physician performance.
Peabody JW, Luck J, Jain S, Bertenthal D, Glassman P: Assessing the accuracy of administrative data in health information systems. Med Care. 2004, 42 (11): 1066-72. 10.1097/00005650-200411000-00005.
Lloyd SS, Rissing JP: Physician and coding errors in patient records. JAMA. 1985, 254 (10): 1330-6. 10.1001/jama.254.10.1330.
Green J, Wintfeld N: How accurate are hospital discharge data for evaluating effectiveness of care?. Med Care. 1993, 31 (8): 719-31. 10.1097/00005650-199308000-00005.
Peabody JW, Luck J, Glassman P, Jain S, Hansen J, Spell M, et al: Measuring the quality of physician practice by using clinical vignettes: a prospective validation study. Ann Intern Med. 2004, 141 (10): 771-80.
Peabody JW, Luck J, Glassman P, Dresselhaus TR, Lee M: Comparison of vignettes, standardized patients, and chart abstraction: a prospective validation study of 3 methods for measuring quality. JAMA. 2000, 283 (13): 1715-22. 10.1001/jama.283.13.1715.
Peabody JW, Nordyke RJ, Tozija F, Luck J, Muñoz JA, Sunderland A, et al: Quality of care and its impact on population health: a cross-sectional study from Macedonia. Soc Sci Med. 2006, 62 (9): 2216-24. 10.1016/j.socscimed.2005.10.030.
Peabody JW, Tozija F, Muñoz JA, Nordyke RJ, Luck J: Using vignettes to compare the quality of clinical care variation in economically divergent countries. Health Serv Res. 2004, 39 (6 Pt 2): 1951-70. 10.1111/j.1475-6773.2004.00327.x.
Peabody JW, Liu A: A cross-national comparison of the quality of clinical care using vignettes. Health Policy Plan. 2007, 22 (5): 294-302. 10.1093/heapol/czm020.
Sriram TG, Chandrashekar CR, Isaac MK, Srinivasa Murthy R, Kishore Kumar KV, Moily S, et al: Development of case vignettes to assess the mental health training of primary care medical officers. Acta Psychiatr Scand. 1990, 82 (2): 174-7. 10.1111/j.1600-0447.1990.tb01377.x.
Landon BE, Reschovsky J, Reed M, Blumenthal D: Personal, organizational, and market level influences on physicians' practice patterns: results of a national survey of primary care physicians. Med Care. 2001, 39 (8): 889-905. 10.1097/00005650-200108000-00014.
Veloski J, Tai S, Evans AS, Nash DB: Clinical vignette-based surveys: a tool for assessing physician practice variation. Am J Med Qual. 2005, 20 (3): 151-7. 10.1177/1062860605274520.
Sandvik H: Criterion validity of responses to patient vignettes: an analysis based on management of female urinary incontinence. Fam Med. 1995, 27 (6): 388-92.
Venter A, Louw DA: The effect of confidence and method of questioning on eyewitness testimony. Med Law. 2005, 24 (2): 369-89.
Griffiths LE, Cook DJ, Guyatt GH, Charles CA: Comparison of open and closed questionnaire formats in obtaining demographic information from Canadian general internists. J Clin Epidemiol. 1999, 52 (10): 997-1005. 10.1016/S0895-4356(99)00106-7.
Frew EJ, Whynes DK, Wolstenholme JL: Eliciting willingness to pay: comparing closed-ended with open-ended and payment scale format. Med Decis Making. 2003, 23: 150-159. 10.1177/0272989X03251245.
Schuman H, Presser S: Questions and answers in attitude surveys: experiments on question form, wording, and context. New York: Academic Press. 1981.
Emery P, Reginster JY, Appelboom T, Breedveld FC, Edelmann E, Kekow J, et al: WHO Collaborating Centre consensus meeting on anti-cytokine therapy in rheumatoid arthritis. Rheumatology (Oxford). 2001, 40 (6): 699-702. 10.1093/rheumatology/40.6.699.
Ledingham J, Deighton C: Update on the British Society for Rheumatology guidelines for prescribing TNFalpha blockers in adults with rheumatoid arthritis (update of previous guidelines of April 2001). Rheumatology (Oxford). 2005, 44 (2): 157-63. 10.1093/rheumatology/keh464.
Pham T, Guillemin F, Claudepierre P, Luc M, Miceli-Richard C, Fautrel B, et al: TNFalpha antagonist therapy in ankylosing spondylitis and psoriatic arthritis: recommendations of the French Society for Rheumatology. Joint Bone Spine. 2006, 73 (5): 547-53. 10.1016/j.jbspin.2006.02.005.
Fautrel B, Constantin A, Morel J, Vittecoq O, Cantagrel A, Combe B, et al: Recommendations of the French Society for Rheumatology. TNFalpha antagonist therapy in rheumatoid arthritis. Joint Bone Spine. 2006, 73 (4): 433-41. 10.1016/j.jbspin.2006.04.001.
BTS recommendations for assessing risk and for managing Mycobacterium tuberculosis infection and disease in patients due to start anti-TNF-alpha treatment. Thorax. 2005, 60 (10): 800-5. 10.1136/thx.2005.046797.
Pham T, Claudepierre P, Deprez X, Fautrel B, Goupille P, Hilliquin P, et al: Anti-TNF alpha therapy and safety monitoring. Clinical tool guide elaborated by the Club Rhumatismes et Inflammations (CRI), section of the French Society of Rheumatology (Societe Francaise de Rhumatologie, SFR). Joint Bone Spine. 2005, 72 (Suppl 1): S1-58.
AFSSAPS: Recommandations nationales. Prevention et prise en charge des tuberculoses survenant sous anti-TNF. 2005, [http://www.afssaps.fr/content/download/12022/143647/version/2/file/reco.pdf]
Veloski J, Rabinowitz HK, Robeson MR, Young PR: Patients don't present with five choices: an alternative to multiple-choice tests in assessing physicians' competence. Academic Medicine. 1999, 74 (5): 539-46. 10.1097/00001888-199905000-00022.
O'Cathain A, Thomas KJ: "Any other comments?" Open questions on questionnaires – a bane or a bonus to research?. BMC Med Res Methodol. 2004, 4: 25. 10.1186/1471-2288-4-25.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6963/9/128/prepub
The authors thank Laura Heraty for her thoughtful review of this manuscript before submission.
This study was supported by an educational grant from Wyeth Pharmaceuticals. The pharmaceutical company paid for the 3-day employment of the 4 Ipsos interviewers. Ipsos is an independent company focused on survey-based research. The authors did not receive money from the pharmaceutical company.
TP participated in the design of the study, carried out the survey, participated in the elaboration of the clinical vignettes and drafted the manuscript. CS performed the statistical analysis. XM and FL participated in the elaboration of the clinical vignettes and especially the formulation of the correct and deceptive items. PD participated in the design of the study and its coordination. PR conceived of the study, and participated in its design and coordination. All authors read and approved the final manuscript.