Technical evaluation of methods for identifying chemotherapy-induced febrile neutropenia in healthcare claims databases

Background: Healthcare claims databases have been used in several studies to characterize the risk and burden of chemotherapy-induced febrile neutropenia (FN) and the effectiveness of colony-stimulating factors against FN. The accuracy of methods previously used to identify FN in such databases has not been formally evaluated. Methods: Data comprised linked electronic medical records from Geisinger Health System and healthcare claims data from Geisinger Health Plan. Subjects were classified into subgroups based on whether or not they were hospitalized for FN per the presumptive “gold standard” (ANC <1.0 × 10^9/L, and body temperature ≥38.3°C or receipt of antibiotics) and claims-based definition (diagnosis codes for neutropenia, fever, and/or infection). Accuracy was evaluated principally based on positive predictive value (PPV) and sensitivity. Results: Among 357 study subjects, 82 (23%) met the gold standard for hospitalized FN. For the claims-based definition including diagnosis codes for neutropenia plus fever in any position (n=28), PPV was 100% and sensitivity was 34% (95% CI: 24–45). For the definition including neutropenia in the primary position (n=54), PPV was 87% (78–95) and sensitivity was 57% (46–68). For the definition including neutropenia in any position (n=71), PPV was 77% (68–87) and sensitivity was 67% (56–77). Conclusions: Patients hospitalized for chemotherapy-induced FN can be identified in healthcare claims databases--with an acceptable level of misclassification--using diagnosis codes for neutropenia, or neutropenia plus fever.


Background
Neutropenia is a common side effect of myelosuppressive chemotherapy. Neutropenia both increases the risk of infection and diminishes patients' ability to fight infection. When neutropenic patients develop fever, a cardinal sign of infection, the high likelihood of serious consequences usually results in hospitalization for urgent evaluation, ongoing monitoring, and administration of intravenous antibiotics. This common and well-studied complication of neutropenia is called febrile neutropenia (FN), regardless of whether infection is ultimately documented as the cause of fever. FN, as well as severe or prolonged neutropenia, can lead to chemotherapy dose delays, dose reductions, and discontinuation, interfering with the delivery of optimal treatment and possibly adversely affecting patient outcomes [1-4]. For these reasons, prophylactic administration of a colony-stimulating factor (CSF)--which has been shown in clinical trials to reduce the risk of FN, FN-related hospitalization, and infection, and to reduce antibiotic use--is recommended concurrently with myelosuppressive chemotherapy when FN risk is estimated to be approximately 20% or greater [3,5-8].
Using public and private healthcare claims databases as well as hospital discharge records, a number of retrospective studies have been undertaken to assess the clinical risk and economic burden of FN requiring hospitalization in clinical practice and to evaluate the comparative effectiveness of CSF agents against chemotherapy-induced hospitalized FN [7,9-14]. Because the study databases did not include information on absolute neutrophil count (ANC) and body temperature, and because there is no specific diagnosis code for FN, various combinations of codes for neutropenia, fever, and/or infection were utilized to identify FN in these studies. The accuracy of these definitions is unknown, however, and thus it is uncertain whether the methods of case ascertainment may have biased study results.
To the best of our knowledge, only one study has evaluated the test characteristics of claims-based definitions for FN, and this study included a small sample of older adults with a single tumor type (i.e., non-Hodgkin's lymphoma) and evaluated only a single definition [15]. In light of the dearth of data on the accuracy of the claims-based definitions that have been used to identify FN hospitalizations, and because such data may be important in the design of future studies as well as in reducing bias in estimators based on definitions that are less than perfectly sensitive and specific (i.e., <100%), a new evaluation was undertaken [16].

Data source
Data were obtained from the MedMining Database and spanned the period January 2004 through April 2010. The MedMining Database comprises electronic medical records (EMR) for services provided by the Geisinger Health System (GHS) to more than 2 million persons, as well as healthcare claims for services received within GHS (and selected other health systems) by the approximately 225,000 members of the Geisinger Health Plan (GHP).
Data from the GHS EMR include: patient demographics (e.g., age, sex, race); ambulatory care visits and inpatient admissions (along with associated diagnoses [ICD-9-CM] and procedures [ICD-9-CM, HCPCS], and dates of service); medication orders/lists; and clinical laboratory and exam results (including white blood cell count, neutrophil count, body temperature, and dates of observation). Data from the GHP healthcare claims database include inpatient and outpatient diagnoses and procedures, outpatient drug utilization, and dates of service. EMR and healthcare claims data can be linked for all GHP members using unique patient identifiers, and all such data can be arrayed chronologically to provide a detailed longitudinal profile of medical and pharmacy services used within GHS by each GHP member.
Patient-identifying information was encrypted or removed from the study database prior to its release to the study investigators, as set forth in the corresponding Data Use Agreement. The study database has been evaluated and certified by an independent third party to be in compliance with the Health Insurance Portability and Accountability Act of 1996 (HIPAA) statistical de-identification standards and to satisfy the conditions set forth in Sections 164.514 (a)-(b)1ii of the HIPAA Privacy Rule regarding the determination and documentation of statistically de-identified data (45 CFR 46 §46.101). Use of the study database for health services research is therefore fully compliant with the HIPAA Privacy Rule and federal guidance on Public Welfare and the Protection of Human Subjects. Permission to use the data for this study was requested by study investigators and granted by MedMining.

Source population
The source population included all GHP members, aged ≥18 years, who began one or more courses of myelosuppressive cancer chemotherapy between January 1, 2004 and April 30, 2010. Receipt of chemotherapy was ascertained based on the presence of one or more claims for a chemotherapy drug or administration, which were identified using HCPCS and ICD-9-CM codes in the healthcare claims database. Chemotherapy agents were classified by level of myelosuppression--any vs none--based on expert opinion. Regimens including one or more myelotoxic agents were considered to be myelosuppressive. Evidence of initiation of a new course of chemotherapy was based on the earliest claim for chemotherapy during the study period that was preceded by a 60-day or longer period without any other claims for chemotherapy; the date of the earliest such claim was designated as the "index date". Among these identified patients, those with evidence of cancer as indicated by two or more medical claims with a qualifying 3-digit ICD-9-CM diagnosis code during the period beginning 60 days prior to the index date were selected for inclusion in the source population. For patients with multiple courses of chemotherapy (based on a gap of ≥60 days between the last administration of chemotherapy in a given course and the first administration in the next course) during the study period, all courses were considered. We excluded patients from the source population if they were not continuously eligible for comprehensive medical and drug benefits throughout the 60-day period prior to chemotherapy initiation. The classification of chemotherapy agents by level of myelosuppression is available in Additional file 1: Table S1 of the online supplement; cancer codes are available in Additional file 2: Table S2 of the online supplement.
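The 60-day washout rule for identifying the start of each chemotherapy course can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual implementation (analyses were conducted in SAS). The function name `course_start_dates`, the `enrollment_start` argument, and the reading of "a 60-day or longer period without any other claims" as a gap of at least 60 days between consecutive claim dates are all assumptions made here.

```python
from datetime import date

WASHOUT_DAYS = 60  # claim-free gap that marks the start of a new course


def course_start_dates(claim_dates, enrollment_start, washout_days=WASHOUT_DAYS):
    """Return the first chemotherapy claim date of each course.

    A claim opens a new course when at least `washout_days` separate it from
    the previous chemotherapy claim. For the first claim, the gap is measured
    from `enrollment_start` (hypothetical field), so that the washout period
    is fully observed.
    """
    starts = []
    previous = enrollment_start  # nothing is observable before enrollment
    for current in sorted(set(claim_dates)):
        if (current - previous).days >= washout_days:
            starts.append(current)  # this date would serve as an index date
        previous = current
    return starts
```

For example, claims on 10 Jan, 31 Jan, 21 Feb, and 1 Jun 2005 for a member enrolled since 1 Jun 2004 would yield two course start dates (10 Jan and 1 Jun), because only those claims follow a gap of 60 days or more.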

Chemotherapy cycles and regimens
For each patient in the source population, we identified each unique cycle within each course of chemotherapy. The first chemotherapy cycle was defined as beginning with the date of initiation of chemotherapy (i.e., the index date) and ending with the first service date for the next administration of chemotherapy (as evidenced by a medical claim with a corresponding HCPCS or ICD-9-CM code) occurring at least 7 days--but no more than 59 days--after the date of initiation of chemotherapy. If a second chemotherapy cycle did not commence prior to day 60, both the first cycle of chemotherapy and the course of chemotherapy were considered to have been completed 30 days following the beginning of the cycle. The second and all subsequent chemotherapy cycles were similarly defined, up to a maximum of nine cycles in total.
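The cycle rules above (next administration 7 to 59 days after the cycle start, a default 30-day length when no such administration exists, and a cap of nine cycles) can be illustrated as follows. This is a sketch under stated assumptions: the function name `split_into_cycles` is hypothetical, each cycle is represented as a `(start, end)` pair, and the end of a cycle is taken to be the start date of the next one.

```python
from datetime import date, timedelta

MIN_GAP, MAX_GAP = 7, 59  # days after cycle start in which the next cycle may begin
DEFAULT_LENGTH = 30       # days assigned to a cycle with no qualifying successor
MAX_CYCLES = 9            # maximum number of cycles per course


def split_into_cycles(index_date, admin_dates):
    """Return a list of (cycle_start, cycle_end) tuples for one course."""
    later = sorted(d for d in set(admin_dates) if d > index_date)
    cycles = []
    start = index_date
    while len(cycles) < MAX_CYCLES:
        # First administration falling 7-59 days after the current cycle start.
        next_start = next(
            (d for d in later if MIN_GAP <= (d - start).days <= MAX_GAP), None
        )
        if next_start is None:
            # No new cycle before day 60: cycle and course end 30 days after start.
            cycles.append((start, start + timedelta(days=DEFAULT_LENGTH)))
            break
        cycles.append((start, next_start))
        start = next_start
    return cycles
```

Administrations fewer than 7 days after a cycle start (for example, day 2 of a multi-day regimen) are treated here as part of the same cycle.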

Study population
The study population included all patients in the source population who were admitted to an acute care, short stay hospital in the GHS during the chemotherapy course--based on healthcare claims data--and who had an absolute neutrophil count (ANC) within 1 day of hospital admission (i.e., the day before, the day of, or the day after admission)--based on EMR data. Because chemotherapy-induced FN almost never occurs earlier than the fifth day of a chemotherapy cycle, and because some of the aforementioned retrospective studies did not consider hospitalizations during this period, only hospitalizations occurring on or after the fifth day of a chemotherapy cycle were included [11-13].

Febrile neutropenia
The presumptive "gold standard" for identification of FN hospitalization, for purposes of this retrospective evaluation, was evidence in EMR data of ANC <1.0 × 10^9/L and either body temperature ≥38.3°C (101°F) or administration of antibiotic or antiviral therapy following ANC assessment, all having occurred within 1 day of hospitalization (i.e., the day before, day of, or day after hospitalization). Receipt of antibiotics was included in the definition because some patients (12%) had no recorded body temperature data within 1 day of hospital admission, while others may have had fever prior to admission that had resolved with antipyretic therapy. We chose our gold standard definitions of neutropenia and fever to be as consistent as possible with those employed by the National Comprehensive Cancer Network (NCCN) and Infectious Diseases Society of America (IDSA), given the limitations of our database. The NCCN/IDSA defines neutropenia as ANC <0.5 × 10^9/L or ANC <1.0 × 10^9/L with a predicted decline to <0.5 × 10^9/L over the next 48 hours. Because the study database does not include information on anticipated decline in ANC over time, and because we believe that many patients with ANC of 0.5-1.0 × 10^9/L would be expected to decline to ANC <0.5 × 10^9/L, we defined neutropenia as ANC <1.0 × 10^9/L [17,18]. (We note that less authoritative guidelines have employed the higher threshold in their definition [19,20].) The NCCN/IDSA defines fever as a single temperature ≥38.3°C orally or ≥38.0°C over 1 hour. Because the study database does not provide information on duration of temperature readings, and because we believe that the large majority of recorded readings were obtained orally, we defined fever as ≥38.3°C.
In addition, because body temperature readings were not consistently recorded for all patients, our definition of fever was expanded to include administration of antibiotic or antiviral therapy, which was assumed to be a proxy for presence of infection and fever. Claims-based operational definitions of FN hospitalization were based on ICD-9-CM inpatient diagnosis codes used in the aforementioned retrospective studies, and included: (1) neutropenia (ICD-9-CM 288.0) in the primary position; (2) neutropenia in any position; (3) neutropenia plus fever (780.6) in any position; (4) neutropenia or fever in any position; and (5) neutropenia, fever, or infection in any position. (Because the last definition has been used in analyses to identify all neutropenic complications [including FN] resulting in hospitalization, its sensitivity was expected to be greater than the others, although at some loss of specificity and positive predictive value [PPV].) All of the aforementioned claims-based definitions were set forth on an a priori basis, and thus were "blinded" to evaluation of the gold standard; however, operational definitions considering the ICD-9-CM diagnosis code for pancytopenia (284.1) (i.e., either neutropenia or pancytopenia) were evaluated on a "post-hoc" basis.
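The presumptive gold standard described in this section (ANC below threshold, plus either fever or antimicrobial therapy as a proxy for fever) reduces to a simple Boolean rule. The sketch below is illustrative only: the `AdmissionRecord` layout and its field names are hypothetical, and it assumes that inputs have already been restricted to EMR observations within 1 day of hospital admission.

```python
from dataclasses import dataclass
from typing import Optional

ANC_THRESHOLD = 1.0       # x 10^9/L; values strictly below are neutropenic
FEVER_THRESHOLD_C = 38.3  # degrees Celsius; values at or above count as fever


@dataclass
class AdmissionRecord:
    """EMR observations within 1 day of admission (hypothetical layout)."""
    min_anc: float                 # lowest ANC, x 10^9/L
    max_temp_c: Optional[float]    # highest temperature, None if not recorded
    antimicrobial_after_anc: bool  # antibiotic/antiviral given after ANC assessment


def meets_gold_standard(rec: AdmissionRecord) -> bool:
    """True if the admission satisfies the presumptive gold standard for FN."""
    neutropenic = rec.min_anc < ANC_THRESHOLD
    febrile = rec.max_temp_c is not None and rec.max_temp_c >= FEVER_THRESHOLD_C
    # Antimicrobial therapy stands in for fever when temperature is missing or normal.
    return neutropenic and (febrile or rec.antimicrobial_after_anc)
```

Under this rule, a patient with ANC of 0.4 × 10^9/L, no recorded temperature, and antimicrobial therapy qualifies, whereas a febrile patient with ANC of 1.6 × 10^9/L does not.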
Each hospitalization was classified as either an FN hospitalization (FN positive) or a non-FN hospitalization (FN negative) based on each of the alternative claims-based definitions and the gold standard, respectively. Thus, for each comparison of a claims-based definition versus the gold standard, four mutually exclusive and collectively exhaustive results were possible: true-positive, false-positive, false-negative, and true-negative (Table 1).
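The five a priori claims-based definitions can be expressed as checks on the ordered list of inpatient diagnosis codes. The following is a hedged sketch, not the authors' code. It assumes codes are stored as dotted ICD-9-CM strings with the primary diagnosis first, and it matches on code prefixes so that subcodes such as 288.00 are captured. The infection code list is not enumerated in the text, so it is left as a caller-supplied argument (`infection_prefixes`, a hypothetical name).

```python
NEUTROPENIA_PREFIX = "288.0"  # ICD-9-CM neutropenia
FEVER_PREFIX = "780.6"        # ICD-9-CM fever


def _has(codes, prefixes):
    """True if any diagnosis code starts with one of the given prefixes."""
    return any(code.startswith(prefixes) for code in codes)


def claims_definitions(dx_codes, infection_prefixes=()):
    """Evaluate the five claims-based FN definitions for one hospitalization.

    dx_codes: diagnosis codes ordered by position; dx_codes[0] is primary.
    infection_prefixes: caller-supplied infection code prefixes (assumption).
    """
    neut_primary = _has(dx_codes[:1], (NEUTROPENIA_PREFIX,))
    neut_any = _has(dx_codes, (NEUTROPENIA_PREFIX,))
    fever_any = _has(dx_codes, (FEVER_PREFIX,))
    infection_any = _has(dx_codes, tuple(infection_prefixes))
    return {
        "neutropenia_primary": neut_primary,
        "neutropenia_any": neut_any,
        "neutropenia_plus_fever": neut_any and fever_any,
        "neutropenia_or_fever": neut_any or fever_any,
        "neutropenia_fever_or_infection": neut_any or fever_any or infection_any,
    }
```

Pairing each definition's result with the gold-standard result for the same hospitalization places that hospitalization in one of the four cells of the comparison table.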

Measures and analyses
The accuracy of claims-based definitions for FN, relative to the gold standard, was evaluated based on their test characteristics and performance characteristics, principally PPV and sensitivity; specificity, negative predictive value, accuracy, and positive and negative likelihood ratios also were evaluated. Hospitalizations misclassified by the claims-based definitions (i.e., false-positives and false-negatives) were further investigated and their diagnosis codes were reported. Confidence intervals for estimates of test and performance characteristics were calculated using techniques of nonparametric bootstrapping (percentile method) from the study population (1,000 replicates with replacement). Analyses were conducted using SAS® Software, Release 9.2 (SAS Institute Inc., Cary, NC).
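A nonparametric percentile bootstrap of the kind described above can be sketched as follows. The study's analyses were conducted in SAS; this Python version is only an illustration, and the function names (`confusion_counts`, `ppv`, `sensitivity`, `bootstrap_ci`), the fixed random seed, and the use of `statistics.quantiles` to obtain the 2.5th and 97.5th percentiles are assumptions made here.

```python
import random
import statistics


def confusion_counts(pairs):
    """Count (TP, FP, FN, TN) from (claims_positive, gold_positive) pairs."""
    tp = fp = fn = tn = 0
    for claims_pos, gold_pos in pairs:
        if claims_pos and gold_pos:
            tp += 1
        elif claims_pos:
            fp += 1
        elif gold_pos:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def ppv(tp, fp, fn, tn):
    """Positive predictive value: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else float("nan")


def sensitivity(tp, fp, fn, tn):
    """Sensitivity: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else float("nan")


def bootstrap_ci(pairs, statistic, replicates=1000, seed=0):
    """95% percentile bootstrap interval for `statistic`."""
    rng = random.Random(seed)
    pairs = list(pairs)
    estimates = []
    for _ in range(replicates):
        # Resample hospitalizations with replacement, same size as the original.
        resample = rng.choices(pairs, k=len(pairs))
        value = statistic(*confusion_counts(resample))
        if value == value:  # skip undefined (NaN) replicates
            estimates.append(value)
    # 39 cut points at 2.5% steps; the first and last bound the central 95%.
    cuts = statistics.quantiles(estimates, n=40, method="inclusive")
    return cuts[0], cuts[-1]
```

Resampling whole hospitalizations, rather than the cells of the table separately, keeps the claims-based and gold-standard classifications paired within each replicate.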

Results
A total of 3,193 patients received one or more courses of myelosuppressive chemotherapy during the period of interest, were continuously eligible for comprehensive medical and drug benefits during the 60 days prior to initiation of chemotherapy, and had evidence of cancer (Table 2). Among these patients, 772 (24%) had a hospitalization during a course of chemotherapy on or after the fifth day of a new cycle, and of these patients, 357 (46%) were hospitalized within GHS and had available data on ANC within 1 day of hospitalization, and thus were included in the study population. No patient in the study population had more than one qualifying hospital admission during the study period. Mean age of the study population was 64 (SD: 13) years and 56% of patients were male (Table 3). Overall, mean ANC at the time of hospital admission was 4.4 × 10^9/L. Common types of cancer included breast (10%), lung (19%), colon/rectum (11%), pancreas (8%), and non-Hodgkin's lymphoma (11%). Thirty-five percent of subjects had metastatic disease, and 68% received chemotherapy with ≥2 myelosuppressive agents. Hospitalizations occurred most often in cycle 1 (28%) and cycle 2 (17%). Among the 357 patients in the study population, 82 (23%) met the gold standard for hospitalized FN; 28 (34%) of the 82 cases had low ANC, fever, and receipt of antimicrobial therapy, 30 (37%) had low ANC and receipt of antimicrobial therapy only (5 of the 30 were missing temperature data), and 24 (29%) had low ANC and fever only (all within 1 day of hospitalization).
For the claims-based definition identifying FN hospitalization based on a diagnosis of neutropenia in any position, eight of the 27 false-negative patients had a diagnosis of septic shock/severe sepsis/septicemia, two had a diagnosis of pancytopenia, five had diagnoses of other infections, and 12 had neither a diagnosis of infection nor fever. Among the 16 false-positives, six patients had neutropenia (ANC <1,000), but none received antimicrobials and three had no temperature data recorded in the relevant time period. In addition, two patients, of whom one was febrile, had an ANC >1,000 but <1,500, and one had fever and received antimicrobial therapy with an ANC of 1,617.
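The counts reported for the definition based on neutropenia in any position (71 hospitalizations flagged, 16 false-positives, 27 false-negatives, 82 gold-standard cases among 357 patients) are sufficient to reconstruct the full two-by-two table. The short sketch below does this arithmetic; the function name `derive_metrics` is hypothetical, and any measure other than PPV and sensitivity is derived here from those counts rather than quoted from the text.

```python
def derive_metrics(total, gold_positive, flagged, false_pos, false_neg):
    """Rebuild the 2x2 table from reported counts and compute test characteristics."""
    tp = flagged - false_pos
    # Consistency check: true-positives plus false-negatives are all gold-standard cases.
    if tp + false_neg != gold_positive:
        raise ValueError("reported counts are not internally consistent")
    tn = total - tp - false_pos - false_neg
    sens = tp / (tp + false_neg)
    spec = tn / (tn + false_pos)
    return {
        "tp": tp, "fp": false_pos, "fn": false_neg, "tn": tn,
        "ppv": tp / (tp + false_pos),
        "sensitivity": sens,
        "specificity": spec,
        "npv": tn / (tn + false_neg),
        "accuracy": (tp + tn) / total,
        "lr_positive": sens / (1 - spec),
        "lr_negative": (1 - sens) / spec,
    }


m = derive_metrics(total=357, gold_positive=82, flagged=71, false_pos=16, false_neg=27)
print(m["tp"], m["tn"])                                   # 55 259
print(round(100 * m["ppv"]), round(100 * m["sensitivity"]))  # 77 67
```

The reconstructed PPV (55/71) and sensitivity (55/82) agree with the 77% and 67% reported for this definition.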

Discussion
Because they are easily accessible and provide readily available information on a large number of patients receiving care in real-world clinical practice, private and public healthcare claims databases have been used extensively to date in health-economic and outcomes research. Such databases, however, are not without limitations, since they are not designed to capture certain components of care (e.g., laboratory results) and thus may lack data on clinical parameters that are often important for case-ascertainment. Inaccuracies in case-ascertainment may confer unknown biases in the estimation of study results.
In this study, we evaluated the accuracy of operational definitions previously used to identify chemotherapy-induced hospitalized FN in healthcare claims databases. We found that PPV exceeded 80%--the assumed minimum acceptable threshold--for the definitions including the diagnosis code for neutropenia in the principal position (87%) and diagnosis codes for neutropenia plus fever (100%), and that PPV was close to the acceptable threshold for the definition including a diagnosis code for neutropenia in any position (77%). Among these definitions, sensitivity was highest for the last one (67%), although this figure is somewhat lower than that (81%) reported for a similar definition in the aforementioned evaluation of older adults with non-Hodgkin's lymphoma [15]. While sensitivity was improved in our study by including codes for fever and infection (87%), PPV (35%) was lower. We note, however, that this claims-based definition has most often been employed in comparative effectiveness studies of CSF agents, in which the focus was not on FN per se but all neutropenic complications (including severe neutropenia, afebrile infections in neutropenic patients, as well as FN) resulting in hospitalization that may be prevented with prophylaxis. When the gold standard was modified (i.e., ANC <1.5 × 10^9/L) to capture additional neutropenic complications, irrespective of whether the patient was febrile or received antimicrobials, PPV improved (45% [38-53]), but sensitivity was lower (81% [74-88]). Because no one definition was found to be associated with both high PPV and high sensitivity, the selection of a specific definition for identifying hospitalized FN should be guided by the potential impact of misclassification on study outcomes.
For example, in studies evaluating the comparative effectiveness of CSFs in clinical practice, a premium should be placed on a high PPV and limiting the inclusion of false-positive cases. Since CSFs can only reduce the risk of infections associated with true neutropenia, a low PPV would result in a dilution of the estimated effectiveness of CSF prophylaxis and estimated relative differences in effectiveness among agents being studied. A high PPV alone may not be sufficient, however, since if sensitivity is low, analyses may not be adequately powered to evaluate study objectives. For other studies, however, such as those in which the outcome is the cost of FN hospitalization, some misclassification may be acceptable to the extent that there are offsetting effects (e.g., if sensitivity and PPV are comparable; and either costs of false-positives, false-negatives, and true-positives are similar or costs of false-negatives and false-positives differ in the opposite direction by the same degree). The same also may be true for studies characterizing the clinical risk (i.e., incidence) of FN, for which--given the limitations of the alternative definitions--the goal should be to select one that is adequately sensitive and/or yields a balance between the number of patients who are misclassified (i.e., as false-positives and false-negatives). It therefore is essential to consider the importance of the individual test and performance characteristics of a measure, relative to the objectives of the study, before selecting one definition for use, and we recommend considering several alternative definitions.
Our study has a number of limitations. First, we excluded from the study population all patients who were not hospitalized in a GHS facility (because ANC data were not available for these patients) and those hospitalized in a GHS facility who did not have ANC data proximal to hospital admission. Only 35% of patients in the source population who were hospitalized (including patients treated inside and outside GHS) had ANC data during the study time period and thus could be included in the study population. For these reasons, the study population was limited to patients who were treated at a GHS facility and who had ANC data. To the extent that excluded patients actually had FN and differed from the study population in important respects, our results may not be reflective of the larger population. In addition, because patients who had ANC data proximal to hospital admission may be selectively different from those who did not (because of, for example, a history of FN), our study population may be "enriched" and subject to verification bias. Second, even for some patients included in the study population, critical data (especially body temperature) were missing, and thus we used a gold standard based on ANC and body temperature or administration of antibiotic or antiviral therapy. Because we used an "imperfect" gold standard, patients' true status in terms of FN was unknown, and errors in the classification of patients by the gold standard could result in biased estimates of PPV, sensitivity, and other measures. Third, since the data for the study come from a single health plan and coding practices may vary across plans and geographic regions of the country, the generalizability of our findings is uncertain.
Fourth, the accuracy of claims-based definitions in identifying FN may vary across subgroups of patients (e.g., those defined on the basis of type of cancer or regimen), and thus caution should be employed when generalizing study results to specific populations; such subgroup analyses, however, were not possible given the relatively small size of the study population. Fifth, while acknowledging that culture data are less than perfectly sensitive in identifying infection, the lack of such information in the data extract precluded any evaluation of the presence of infection in true FN patients based on positive cultures. Finally, because chemotherapy-induced FN almost never occurs earlier than the fifth day of a chemotherapy cycle, and because some of the aforementioned prior studies did not consider hospitalizations during this period, only hospitalizations occurring on or after the fifth day of a chemotherapy cycle were included. Results were not sensitive to this methodological feature as <5% of all hospitalizations occurred during the first five days of a chemotherapy cycle.

Conclusion
In summary, the results of our study suggest that patients hospitalized for chemotherapy-induced FN can be identified in healthcare claims databases--with an acceptable level of misclassification--using the diagnosis code for neutropenia or diagnosis codes for neutropenia plus fever. Research investigating the test and performance characteristics of operational definitions of FN using alternative databases (especially those in which ANC, body temperature, and culture data are available) is needed to verify the findings of this study, as is research in tumor-specific subgroups.

Additional files
Additional file 1: Table S1. Chemotherapy agents.

Competing interests
Declaration of Funding. Funding for this research was provided by Amgen Inc. to Policy Analysis Inc. (PAI). Declaration of Financial/Other Relationships. Derek Weycker, Oleg Sofrygin, Kim Seefeld, and John Edelsberg are employed by PAI. Robert Deeter and Jason Legg are employed by Amgen Inc. and own Amgen stock.

Authors' contributions
Authorship was designated based on the guidelines promulgated by the International Committee of Medical Journal Editors (2006). All persons who meet criteria for authorship are listed as authors. All authors have read and approved the final version of the manuscript. The study sponsor reviewed the study research plan and study manuscript, and was provided an opportunity to submit (non-binding) comments on study-related materials to study investigators. Data management, processing, and analyses were conducted by PAI, and all final decisions on data analytics as well as the content and structure of written materials--including the study manuscript--were made by study investigators.