Health service utilization in IBD: comparison of self-report and administrative data

Background The reliability of self-report regarding health care utilization in inflammatory bowel disease (IBD) is unknown. If proven reliable, it could help justify self-report as a means of determining health care utilization and associated costs. Methods The Manitoba IBD Cohort Study is a population-based longitudinal study of participants diagnosed within 7 years of enrollment. Health care utilization was assessed through standardized interview. Participants (n = 352) reported the total number of nights hospitalized, frequency of physician contacts in the prior 12 months and whether the medical contacts were for IBD-related reasons or not. Reports of recent antibiotic use were also recorded. Actual utilization was drawn from the administrative database of Manitoba Health, the single comprehensive provincial health insurer. Results According to the administrative data, 15% of respondents had an overnight hospitalization, while 10% had an IBD-related hospitalization. Self-report of these hospitalizations was highly sensitive (92% and 82%, respectively) and specific (96% and 97%, respectively). According to the administrative data, 97% of participants had contact with a physician in the previous year, and 69% had IBD-related visits. The number of physician visits was significantly under-reported and there was a trend to over-report the number of nights in hospital. Conclusions Self-report data can be helpful in evaluating health service utilization, provided that the researcher is aware of the systematic sources of bias. Outpatient visits are well identified by self-report. The discordance for the type of outpatient visit may be either a weakness of self-report or a flaw in diagnosis coding of the administrative data. If administrative data are not available, self-report information may be a cost-effective alternative, particularly for hospitalizations.


Background
It is estimated that 0.6% of Canadians and 0.4% of Americans have Crohn's disease (CD) or ulcerative colitis (UC), the two forms of inflammatory bowel disease (IBD) [1,2]. These chronic diseases are characterized by episodic flares that can contribute to significant disruption and overall poor quality of life. With the median incident age between 20 and 30 years, patients are often required to take medication for the duration of their lives. Up to one-third of UC and almost three-quarters of CD patients will require surgery in their lifetime [3,4]. Health care utilization for IBD is higher than in the general population [5].
There is interest in understanding health care utilization in IBD as it informs cost-effectiveness analysis and resource allocation planning, including decisions about adopting expensive medications and diagnostic tests. One context is determining base rates of health care utilization and then the impact of any intervention in reducing them. The two most widely used methods of data collection for health service utilization are administrative records and self-report [6,7]. Self-report measures are widely used in economic and epidemiological research for representative population samples [6]. There have been two large-scale reviews comparing administrative to self-report data [6,7]. Each method has advantages and disadvantages. Health insurance claims records are a commonly used form of administrative data. However, these data are costly to maintain and analyze. Privacy and administrative issues limit access by independent researchers. Self-report data, while less costly to collect, may be inaccurate due to poor recall, particularly when longer time periods are considered or more complex categorization of information is needed [6-8]. Previous studies of self-report data on health care utilization suggest that concordance with administrative data depends on the level of cognition and age of respondents, the severity of illness, length of recall period, and types of utilization assessed [6,7]. Given that errors in reporting have been found to be associated with some disease groups, and since IBD patients have not been among the study populations to date, careful examination of concordance in an IBD sample is needed to establish accuracy [8]. This study directly compared self-report and administrative data regarding hospital and physician visits over the prior 12 months for a population-based IBD cohort. We were interested in evaluating the accuracy and limitations of self-report data.

Methods
This study was approved by the University of Manitoba Research Ethics Board and the Manitoba Health-Health Information Privacy Committee.

Data Sources

Administrative Data
In this province of 1.15 million, Manitoba Health is the government organization that serves as the single, comprehensive health insurance provider for all residents [9]. The organization maintains four large databases: a population registry, a hospital discharge database, a physician claims database and a prescription drug purchases database. The population registry contains a unique personal health identification number for each insured individual, which is used to link the databases. This system is described in detail on the Manitoba Centre for Health Policy website [10]. The accuracy of the provincial administrative health data has been demonstrated across a number of studies for various medical conditions [10-14].
The methods used to establish and validate an administrative definition of IBD in the provincial database were described by Bernstein and colleagues, and the IBD identification algorithm was found to have sensitivity and specificity of 90% [5,15]. In the 2005-2006 fiscal year, there were 7375 IBD cases identified in the population registry, which were subsequently included in an epidemiology database. All individuals in this University of Manitoba IBD Epidemiology Database were mailed invitations to enroll in the University of Manitoba IBD Research Registry, a database used by our Centre, and almost half of the province's IBD population agreed to participate [15].

Manitoba IBD Cohort
The Manitoba IBD Cohort Study is a prospective longitudinal study following adults with IBD who were diagnosed within 7 years of enrollment [16]. This population-based study was initiated in 2002, recruiting eligible participants from the University of Manitoba IBD Research Registry. It has collected information on disease course, health service utilization, and mental and physical health outcomes. Participants are surveyed every 6 months, and interviewed annually. There were 388 individuals enrolled in the Manitoba IBD Cohort Study. At the time health care utilization information was obtained (12-month interview), 352 cohort study participants had complete data, were insured by Manitoba Health, and had their IBD diagnosis independently confirmed by chart review.
Demographic and disease variables included age and disease duration as of the date of interview, sex, disease type (CD or UC), and disease activity in the prior 12 months. Disease activity was assessed based on the Manitoba IBD Index (MIBDI), a validated patient report measure of symptom frequency over a six-month period [17]. Disease activity across the 12 months was categorized as consistently inactive, consistently active, or fluctuating between inactive and active, based on two consecutive six-month MIBDI assessments.
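The three-level activity categorization described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name is hypothetical, and it assumes each six-month MIBDI assessment has already been reduced to an active/inactive flag (the scoring rule that produces that flag is not restated here).

```python
def classify_disease_activity(first_half_active: bool, second_half_active: bool) -> str:
    """Combine two consecutive six-month activity flags into a 12-month pattern.

    Assumption: each flag is True when the corresponding MIBDI assessment
    indicated active disease and False when it indicated inactive disease.
    """
    if first_half_active and second_half_active:
        return "consistently active"
    if not first_half_active and not second_half_active:
        return "consistently inactive"
    # One period active and the other inactive, in either order.
    return "fluctuating"


# Hypothetical participant: inactive in the first six months, active in the second.
pattern = classify_disease_activity(False, True)
```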

Assessment of Health Care Utilization: Self-report
Data on health care utilization were obtained from the Cohort participants through standardized interview between the months of August 2003 and January 2005, the time period during which the Cohort 12-month follow-up interviews took place. The questions were drawn directly from a large national health survey (Canadian Community Health Survey) [18]. Respondents were told to consider the prior 12-month period. They were asked whether they had been a patient overnight in a hospital (excluding emergency room visits) and if so, how many nights they spent in hospital over the 12-month recall period. They were also asked how many nights were for IBD reasons. In addition, they were asked whether they had seen a general or specialist physician, whether they had seen a physician for IBD reasons and how many physician visits they had. Frequency of physician visits for IBD reasons was not recorded.
Participants brought medication used in the prior 2 weeks to the interview. Antibiotics were selected for comparison of self-report and administrative data as they are typically consumed soon after purchase, with little discretionary use by the patient. Other drugs may have been purchased outside the 2-week window but consumed during the recall period leading to a false positive or alternately, purchased during the recall period, but consumed at another time resulting in a false negative. Seven types of antibiotics were tracked: amoxicillin, ciprofloxacin, clarithromycin, clindamycin, erythromycin, metronidazole, and penicillin.

Assessment of Concordance Between Administrative Data and Self-Report
The self-report data regarding utilization were linked to the administrative databases, using the interview date to extract utilization records from the administrative data for the previous 12 months. Variables for any overnight hospitalization (dichotomous) and the total number of nights in hospital were created. An IBD-related overnight hospitalization was flagged if any one of the hospital diagnosis codes had an ICD-9-CM prefix of 555 (Crohn's disease) or 556 (ulcerative colitis), or since April 1, 2004, ICD-10-CM code prefix of K50 or K51. A physician visit was flagged if a physician consultation occurred at a time outside of a hospitalization. ICD-9-CM physician codes of 555 (CD) and 556 (UC) were used to create variables for the total number of physician visits and to indicate that an IBD-related visit occurred.
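The extraction and flagging steps described above can be illustrated with a short Python sketch. It is not the study's actual code (the analyses were run in Stata), and several details are assumptions made for illustration: the record layout (`admit_date`, `nights`, `diagnosis_codes`), the use of the admission date to place a stay in the window, a 365-day window ending the day before the interview, and matching against both code sets without enforcing the April 1, 2004 changeover (the two prefix sets cannot collide, since ICD-10 codes begin with a letter).

```python
from datetime import date, timedelta

# Diagnosis code prefixes named in the text: ICD-9-CM 555/556, ICD-10 K50/K51.
IBD_PREFIXES = ("555", "556", "K50", "K51")


def is_ibd_code(code: str) -> bool:
    """True if a diagnosis code starts with one of the IBD prefixes."""
    return code.strip().upper().startswith(IBD_PREFIXES)


def summarize_hospital_use(stays, interview_date, window_days=365):
    """Summarize overnight stays in the window before the interview.

    `stays` is a list of dicts using a hypothetical schema:
    {"admit_date": date, "nights": int, "diagnosis_codes": [str, ...]}.
    """
    window_start = interview_date - timedelta(days=window_days)
    overnight = [
        s for s in stays
        if window_start <= s["admit_date"] < interview_date and s["nights"] >= 1
    ]
    ibd_stays = [
        s for s in overnight if any(is_ibd_code(c) for c in s["diagnosis_codes"])
    ]
    return {
        "any_overnight": bool(overnight),
        "total_nights": sum(s["nights"] for s in overnight),
        "any_ibd_overnight": bool(ibd_stays),
        "ibd_nights": sum(s["nights"] for s in ibd_stays),
    }


# Hypothetical records for one participant interviewed on 2004-09-15.
example_stays = [
    {"admit_date": date(2004, 3, 10), "nights": 3, "diagnosis_codes": ["555.9", "276.5"]},
    {"admit_date": date(2004, 6, 2), "nights": 2, "diagnosis_codes": ["K57.3"]},
    {"admit_date": date(2002, 11, 20), "nights": 5, "diagnosis_codes": ["556.9"]},
]
summary = summarize_hospital_use(example_stays, date(2004, 9, 15))
```

In this hypothetical example the 2002 stay falls outside the window, so only the first two stays contribute, and only the first of those carries an IBD prefix.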
Sensitivity and specificity measures were computed to summarize concordance of self-reported use with administrative data. To test the concordance in the self-reported frequency of use against the administrative data, the concordance correlation coefficient was computed and the correlated t-test was used to compare the means of the distributions [19,20]. The concordance correlation coefficient is the product of the Pearson correlation coefficient, which is a measure of precision (how far each observation deviates from the best-fit line), and the bias correction factor, which is a measure of accuracy (how far the best-fit line deviates from the 45° line). One-way analysis of variance and Bonferroni multiple comparisons were used to detect differences in frequency of physician contacts or self-reported contacts attributable to the level of disease activity.
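As an illustration of the decomposition just described, the following Python sketch computes the Pearson correlation, the bias correction factor, and their product, the concordance correlation coefficient. It is a minimal sketch rather than the study's Stata code: the function name and the example visit counts are hypothetical, and it assumes the usual formulation with 1/n variances and neither series constant.

```python
from math import sqrt


def concordance_correlation(x, y):
    """Return (pearson_r, bias_correction, ccc) for paired measurements.

    ccc = 2*cov / (var_x + var_y + (mean_x - mean_y)**2), which equals
    pearson_r * bias_correction. Variances and covariance use 1/n.
    Assumes neither series is constant (non-zero variance).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    # Precision: scatter around the best-fit line.
    pearson_r = cov_xy / sqrt(var_x * var_y)
    # Accuracy: departure of the best-fit line from the 45-degree line.
    bias_correction = 2 * sqrt(var_x * var_y) / (
        var_x + var_y + (mean_x - mean_y) ** 2
    )
    return pearson_r, bias_correction, pearson_r * bias_correction


# Hypothetical physician visit counts for six respondents.
self_reported_visits = [2, 4, 3, 6, 1, 8]
administrative_visits = [3, 6, 5, 9, 2, 12]
r, c_b, ccc = concordance_correlation(self_reported_visits, administrative_visits)
```

A series that tracks the reference perfectly but is shifted by a constant keeps a Pearson correlation of 1 while the bias correction factor, and therefore the concordance coefficient, falls below 1. This is why the coefficient is suited to detecting systematic under- or over-reporting.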

Representativeness of the Manitoba IBD Cohort Participants
To assess how well the Manitoba IBD Cohort participants modeled the broader population in the province with IBD, they were compared on demographic characteristics to a population reference group composed of all individuals in the IBD Epidemiology Database who (a) were also in the population registry file over the period of the Cohort Study 12-month interviews from August 6, 2003 to January 24, 2005, (b) had IBD of similar duration to the Manitoba IBD Cohort sample (i.e., 2 to 10 years as of the end of the period, January 24, 2005), and (c) were not already participating in the Manitoba IBD Cohort Study. A total of 2963 patients from the University of Manitoba IBD Epidemiology Database met these criteria and formed the population reference group. For the purposes of standardization in comparison, age was computed as of January 24, 2005 for all the Manitoba IBD Cohort and University of Manitoba IBD Epidemiology Database subjects. Proportions were compared using Chi-square tests. STATA SE 10.1 (College Station, Texas) was used to conduct all of the analyses. Although the reference period for testing individual concordance in administrative data and self-report was 12 months for health service utilization, and 2 weeks for antibiotic prescriptions, this 18-month period was used to compare Cohort Study respondents against IBD subjects in the administrative data as it coincided with the total period over which the 12-month Cohort study interviews were conducted. The Cohort group did not differ significantly on age, sex, or proportion of disease subtype (CD or UC) from the reference group, suggesting excellent representativeness of the Cohort. Manitoba IBD Cohort participants had higher levels of health service use over this period, and were also more likely to use services for IBD reasons.
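The comparison of proportions between the Cohort and the reference group can be sketched as a Pearson chi-square test on a 2x2 table. The study ran these tests in Stata; the Python below is an illustrative stand-in, written without a continuity correction. The group sizes (352 and 2963) come from the text, but the split of each group by sex is hypothetical and chosen only to show the call.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                 category yes   category no
        group 1       a              b
        group 2       c              d

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the statistic is a squared standard normal,
    # so the upper-tail probability is erfc(sqrt(statistic / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts of women and men; only the group totals come from the text.
cohort_women, cohort_men = 211, 141            # sums to 352
reference_women, reference_men = 1630, 1333    # sums to 2963
statistic, p_value = chi_square_2x2(
    cohort_women, cohort_men, reference_women, reference_men
)
```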

Participant Characteristics and Representativeness
More detailed clinical information was obtained from the Cohort participants. Twenty-three percent had consistently inactive disease over the previous 12 months, 21% had fluctuating disease and 56% had consistently active disease. Disease activity patterns across time were similar between those with CD and UC.

Concordance of Self-report Compared to Administrative Health Care Utilization Data
The comparisons of self-report and administrative measures of health service utilization over the 12 months prior to the interview are shown in Table 2. The administrative data indicated that 52 out of 352 (15%) of Manitoba IBD Cohort respondents were hospitalized overnight and 34 out of 352 (10%) were hospitalized overnight for IBD reasons. There was reasonably good agreement of self-report data with high sensitivity and specificity. Taking the administrative data as the reference, there were 4 false negatives for any overnight hospitalization and 6 for IBD-related hospitalization, along with 11 false positives for any overnight hospitalization and 10 for IBD-related hospitalization.
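The sensitivity and specificity figures follow directly from the counts above, taking the administrative data as the reference. The short Python sketch below reproduces the arithmetic; the function name is hypothetical. The first call uses the counts for the 52 administratively recorded overnight hospitalizations (4 false negatives, 11 false positives) and the second uses the 34 IBD-related hospitalizations (6 false negatives, 10 false positives).

```python
def sensitivity_specificity(n_total, n_reference_positive, false_negatives, false_positives):
    """Sensitivity and specificity of self-report against a reference standard."""
    n_reference_negative = n_total - n_reference_positive
    true_positives = n_reference_positive - false_negatives
    true_negatives = n_reference_negative - false_positives
    sensitivity = true_positives / n_reference_positive
    specificity = true_negatives / n_reference_negative
    return sensitivity, specificity


# Any overnight hospitalization: 48/52 sensitivity, 289/300 specificity.
any_hosp = sensitivity_specificity(352, 52, 4, 11)
# IBD-related hospitalization: 28/34 sensitivity, 308/318 specificity.
ibd_hosp = sensitivity_specificity(352, 34, 6, 10)

for label, (sens, spec) in [("any", any_hosp), ("IBD-related", ibd_hosp)]:
    print(f"{label}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

Rounded to whole percentages, these give 92% and 96% for any overnight hospitalization and 82% and 97% for IBD-related hospitalization, matching the values reported in the abstract.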
To clarify discrepancies for hospitalizations, specific cases were examined more closely. Errors included the failure to estimate a stay within the recall period, omitting stays following an emergency room visit, mistakenly including day surgery as hospitalization, and misclassifying an IBD stay. Of the four cases in which overnight hospitalization was not reported but was evident in the administrative database, two were seen by a physician in an emergency room and were subsequently hospitalized. As participants were told not to report emergency room visits, some may have failed to report a hospitalization that was a consequence of an emergency room visit, potentially accounting for these two discrepancies. Further, another one of these cases had a 10-night stay in hospital, which occurred just less than a year before the interview. Thus, an explanation for failure to report a hospitalization may have been the long recall period leading the patient to consider the visit as outside the specified period.
Eleven respondents reported a hospitalization when there was no record of one in the administrative database. A closer review of the administrative record indicated that four of these 11 were hospitalized 15 months to 2 years before their interview date. Four others had been admitted for day surgery within a year of the interview and may have considered this in their reporting.
Considering IBD-related hospitalizations, 10 respondents reported an IBD-related hospitalization that was not congruent with the administrative data. According to the administrative record, one of the 10 had an admission for liver disease. Four did have an IBD-related hospitalization, but it was 15 months to 2 years before the interview date. There was no record of any overnight hospitalization within the province for the other 5 patients. It is certainly possible that some respondents had been hospitalized out of province, but this information was not available. Six respondents did not report an IBD-related hospitalization that was on record in the administrative data. Of those, two had hospitalizations that occurred just under the annual threshold according to the administrative data, and 4 had reported a hospitalization but did not specify that it was an IBD-related stay.
Although overnight hospitalization was uncommon, 97% of the Manitoba IBD Cohort participants saw a physician in the previous year, with 69% visiting a physician for IBD-related reasons, as seen in Table 2. Sensitivity for reporting physician visits was similar to sensitivity for reporting hospitalizations; however, specificity was weaker, likely because so few participants did not see a physician.
With regard to discrepancies in reporting physician visits, the 6 respondents who had visits recorded in the administrative data but who did not directly report physician visits had a total of 22 visits recorded (range 1 to 9 visits). The most recent visits before the interview date for these 6 people ranged from 4 to 338 days earlier. Of the 30 who did not report an IBD-related visit, two had not reported any visits, and 28 reported visits but did not identify IBD as the reason. In contrast, 40 respondents reported having an IBD-related visit, but it was not coded as such in the administrative data. Of those 40, 37 had a recorded visit, and 3 had no visits recorded in the administrative database. Of the 37 who had a recorded visit, the diagnosis codes showed no clear indication of IBD co-morbidities such as abdominal pain, diarrhea or arthritis, although for 9 respondents the records indicated a code for "other and unspecified noninfectious gastroenteritis and colitis" (ICD-9-CM code 558). For both types of discordant reporting, the discrepancy was primarily in the understanding or recall of the details of the visit, not whether one had occurred.
Table 3 provides more detailed information comparing the number of hospital nights and physician visits based on self-report or administrative data. Considering the number of physician visits, while the self-report and administrative data were significantly correlated, the self-report data underestimated the number of physician visits relative to the administrative data. In analyzing the number of hospital nights, we considered both the means for all respondents and the means for those respondents with at least one hospital admission (based on either administrative data or self-report), considering overall admissions and IBD-related admissions. The administrative and self-report data were highly correlated for the whole group, as well as for the subgroup of respondents with at least one admission. While there was a tendency to over-estimate the number of hospital nights in the self-report data, the differences were not large enough to be statistically significant.
At the time of the 12-month Manitoba IBD Cohort interview, 16 of 352 Cohort participants (4.5%) reported taking one or more of the seven antibiotics over the preceding two weeks. There was complete (100%) agreement between participants' reported use of antibiotics, as assessed by review of their submitted prescription containers, and the purchase of antibiotics recorded in the database, resulting in 100% sensitivity and specificity (data not shown).

Disease Activity and Reporting Concordance
Since nearly all of the Manitoba IBD Cohort participants had seen a physician in the previous year, the relationship of disease activity and accuracy of reporting number of physician visits was assessed. As shown in Table 4, based on both self-report and administrative data, those with inactive disease over the 12-month period had significantly fewer visits than those with consistently active disease, by almost half (self-report: F = 8.31, p = 0.0003; administrative data: F = 8.46, p = 0.0003). Regardless of whether disease was inactive, fluctuating or active across the full 12 months, the number of physician visits tended to be underestimated by at least one-third, comparing the self-report information to the administrative data (p < 0.0001).

Discussion
An advantage of the sample used in this study was that it was community-based from the population of patients in Manitoba with IBD, and not from those attending health care settings for treatment [16]. Further, the Manitoba IBD Cohort sample was found to be closely representative of the province's IBD population, based on demographic comparisons. Interestingly, the Cohort participants were more likely to use health care services and take prescription medication than the general IBD population. It may be that persons who are experiencing more symptoms, or are more engaged in the health care system, are more likely to agree to take part in a longitudinal study. From the current study we were unable to determine if the increased rate of utilization was due to legitimate medical illness. Alternatively, participation in a study may increase health care utilization due to improved rapport with physicians, as suggested by a report on persons enrolled in a study of traumatic events, in which higher rates of health care utilization were found following 12-month enrolment in a research study [21].
Data sets contain both random and systematic sources of error. Considering self-report and administrative health service utilization data, random sources of error regarding service type and frequency of use may be related to problems in memory (for self-report data), data entry errors (administrative data), and interpretation or coding differences (both self-report and administrative data). For example, a patient may have understood a visit for a skin rash due to IBD medication side effects to be an IBD-related visit, whereas the physician coded the visit for dermatitis.
In this study, when discrepancies between self-report and administrative data were reviewed in more detail, some of these types of errors were identified, including recall errors (i.e., reporting a hospitalization that was outside the one year recall period) and likely mis-categorization (e.g., emergency room visits reported as hospitalizations). Discrepancies related to emergency room visits have been identified in other studies [22,23]. Unfortunately we were unable to verify this directly in our sample, since emergency room visits are not recorded in the administrative data.
Systematic errors, on the other hand, may be identified by comparing one imperfect source of information (such as self-report data) with another imperfect but probably more accurate source of information (such as administrative data), and looking for directional trends when there are discrepancies. The administrative health data have been evaluated and established as a reliable source of health care utilization information [10-14]. The perfect match of patient and administrative data for the medication use found in this study, which used a short time frame and a specifically defined health service (i.e., antibiotic dispensing), provides further confidence in the reliability of the data. Comparison of self-report and administrative data in this sample suggested that there were systematic sources of error concerning length of hospitalization and number of physician visits over a one-year recall period. There was a trend to over-report, by 25% to 35%, the number of nights in hospital, regardless of whether the hospitalization was for IBD or other reasons. While not statistically significant in this sample, a difference of this magnitude would likely have been significant in a larger sample with a higher base rate of hospitalizations. Other studies have found severity of illness associated with over-reporting the length of hospital stays [24]. On the other hand, physician visits, which had a high base rate, were underestimated by 35% to 45% using self-report relative to administrative information, and these discrepancies held across different levels of disease activity. This tendency to under-report physician visits was also found in other studies [6,24]. Researchers should be aware of these systematic errors in reports of hospitalization and physician visit data when evaluating self-report data.
The magnitude of these self-report errors will depend on a variety of factors including time, the type of health service utilization reported (including its salience in memory), the medical condition, and characteristics of the specific sample. For instance, the detailed evaluation in this study suggested that several of the discrepancies in reporting may have been avoided with a shorter recall time frame than the 12-month period.
In spite of the clear problems with over-reporting of hospital nights and under-reporting of physician visits, as in previous research there were still relatively high correlations between self-report and administrative data for hospital nights and physician visits [6]. When administrative data are not available, self-report data may still be able to provide general estimates of health care use, although further attention could be paid to methods to enhance accuracy.
In their review of accuracy of self-report information concerning health care utilization, Bhandari and Wagner identified some modifiable factors that affect accuracy of this type of information such as questionnaire design, mode of data collection, and memory aids [6]. Recent research suggests that memory aids and probes using approaches like the event history calendar can facilitate better accuracy and substantially reduce under-reporting of health care utilization, including when longer time intervals (such as 12-month periods) are used [25,26]. Recalling the sequence of health care events (for example: Who did you see first? And then whom did you see after that?) can help the respondent recall more specific details than using just a single question. In IBD, for instance, some services (such as endoscopy) are more likely to be used, so inquiring specifically about those services may assist in more accurately reporting utilization. In work with an elderly population, one group obtained very accurate self-report data by instructing participants in a longitudinal study to track health utilization via a calendar, with regular bi-monthly phone contacts to gather data [27].
As the use of electronic health records becomes more widespread, there is the potential to improve both administrative and self-report data [28]. A growing trend in many countries is to encourage patients to interact with their health information through the use of electronic personal health records. Examples include patient-generated health and lifestyle records that are stored and managed using personal computer or web applications, and passive access to provider-held records through waiting room kiosks, the internet, or digital copy (such as on a CD or smart card) [29]. These electronic health records can assist patients in more actively managing their health condition through individualized care plans, graphing of symptoms, passive feedback, tailored instructive or motivational feedback, decisional aids, and reminders [29]. Our findings suggest that IBD patients, unaided by tracking prompts or information tools such as personal health records or visit receipts, can readily recall medical contacts, but are poor at detailing specifics such as frequency of physician visits and length of hospital stays. Until high-quality information tools become accessible to patients, when health care utilization is of interest, patients should be encouraged to keep a record of symptoms, utilization and treatments in a diary or in a calendar and not to rely solely on self-recall.

Conclusions
In summary, the validity of self-report data, as it informs the estimation of health care costs, is critical. This study has drawn attention to some areas of reasonable accuracy and others of systematic inaccuracy in self-report data describing service utilization over a 12-month period. Although recall of a medical contact was good, there were systematic errors in the reported frequency of physician visits (under-reported) and hospital nights (over-reported). If adequate administrative data are not available, self-report data may be a cost-effective alternative, depending on the level and precision of data needed. Use of data collection techniques designed to improve the quality of self-report data may also enhance utility. Recall accuracy may be improved by assessing shorter time frames with more specific time cues and questions about use of specific types of service. Prompts may be used to clarify common areas of miscategorization, such as recall of emergency room visits or day surgeries. Self-report questions may allow opportunities to explore aspects of health care utilization not captured in administrative data.

Abbreviations
IBD: inflammatory bowel disease; CD: Crohn's disease; UC: ulcerative colitis; ICD-9-CM: International Classification of Diseases, Ninth Revision, Clinical Modification.

Authors' contributions
[...] final manuscript. JRW contributed to study conception and design, data analysis and interpretation, revising the manuscript critically for important intellectual content, edited and approved the final manuscript. LAG contributed to study conception and design, data analysis and interpretation, revising the manuscript critically for important intellectual content, edited and approved the final manuscript. CNB contributed to study conception and design, acquisition of data, data analysis and interpretation, revising the manuscript critically for important intellectual content, responded to editors' requested revisions, edited the manuscript accordingly, edited and approved the final manuscript. All authors read and approved the final manuscript.

Competing interests
In the past 5 years Dr Bernstein has served as a consultant to or on the advisory boards of Abbott Canada, Shire Canada, Axcan Pharma, Astra Zeneca Canada, and Janssen Canada and has received an unrestricted educational grant from Axcan Pharma, an unrestricted research grant from UCB Pharma Canada and a research grant from Abbott Canada.