In this nationwide, observational study of women with PMO, the performance of diagnosis codes in identifying hypocalcemia and dermatologic adverse events from health insurance claims data varied across care settings and by provider specialty. Our definition of hypocalcemia (as the primary reason for obtaining ED or inpatient care) yielded a PPV of 40%; the corresponding PPV for dermatologic adverse events was 70%. The inclusion of incidental cases increased the PPV of hypocalcemia appreciably, suggesting that secondary hypocalcemia may frequently be recorded in the primary position on claims. Incidental cases were infrequent for dermatologic adverse events, possibly because these events generally represent the true primary reason for the patients’ care. For both outcomes, diagnosis codes from ED claims were more accurate than those from inpatient claims. Serious hypocalcemia and dermatologic adverse events may be treated and resolved within the ED without requiring hospital admission, and if hospitalization does occur, these outcomes (hypocalcemia in particular) may be considered a secondary concern.
There are few published data for comparison. Strom et al. reported that within Medicaid claims, 60.9% of the erythematous events captured through presence of ICD-9 695.1 (erythema multiforme, Stevens-Johnson syndrome, and toxic epidermal necrolysis) were later confirmed as true cases [15]. Within a health plan database, Chan et al. reported that the presence of a discharge diagnosis of erythema multiforme yielded a PPV of 60.7% [16]. These are similar to our PPV finding of 56 to 67% (including incidental cases) for serious erythematous events, which also included ICD-9 695.5 (exfoliation due to erythematous conditions).
Historically, 70 to 80% of medical records requested by our research group (and similar institutions) are obtained [17, 18]. In this study, as expected, our retrieval rate was at the lower end of this range because we sought medical records only from the principal site of care. This choice arose from the study objective, which was to validate outcomes associated with a specific medical claim rather than to confirm the presence of an outcome. While our lower retrieval rate decreased the precision of the PPVs, leading to wider confidence intervals, it likely did not bias the PPV estimates unless chart retrieval was differential with respect to true case status. For example, if hospitals were more likely to provide charts for true cases of hypocalcemia, our PPV estimates would be biased toward 100%. However, this scenario seems unlikely. With other study objectives, it is generally feasible to seek charts from multiple providers or institutions (e.g., a dermatologist and a hospital) to increase the fraction of events for which at least one medical record is available.
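The effect of chart retrieval on precision can be illustrated with a short calculation. The following is a minimal sketch, not the method used in this study: it computes a Wilson score 95% confidence interval for a PPV and compares two hypothetical numbers of retrieved charts at the same observed PPV. The function name `wilson_interval` and all counts are illustrative assumptions, not study data.

```python
import math


def wilson_interval(confirmed, reviewed, z=1.96):
    """Wilson score interval for a proportion (here, a PPV).

    confirmed: charts in which the outcome was confirmed (hypothetical)
    reviewed:  charts retrieved and reviewed (hypothetical)
    z:         normal quantile; 1.96 gives an approximate 95% interval
    """
    if reviewed <= 0 or not 0 <= confirmed <= reviewed:
        raise ValueError("need reviewed > 0 and 0 <= confirmed <= reviewed")
    p = confirmed / reviewed
    denom = 1 + z**2 / reviewed
    center = (p + z**2 / (2 * reviewed)) / denom
    half = z * math.sqrt(p * (1 - p) / reviewed + z**2 / (4 * reviewed**2)) / denom
    return center - half, center + half


# Same observed PPV (40%), different numbers of retrieved charts.
# These counts are made up for illustration only.
low_retrieval = wilson_interval(confirmed=20, reviewed=50)
high_retrieval = wilson_interval(confirmed=40, reviewed=100)

for label, (lower, upper) in [("50 charts", low_retrieval),
                              ("100 charts", high_retrieval)]:
    print(f"{label}: PPV 40%, 95% CI {lower:.1%} to {upper:.1%} "
          f"(width {upper - lower:.1%})")
```

With the point estimate held fixed, the interval narrows as more charts are reviewed. The estimate itself would shift only if retrieval depended on true case status.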
In this study, we expected that limiting our algorithm to first-position diagnoses on claims would increase the PPV for capturing serious adverse events that were the primary reason for seeking care. However, we found that clinically incidental or secondary events are also captured through diagnosis codes recorded in the primary position. Conversely, outcomes that led to hospitalization or an ED visit may have had their ICD-9 codes recorded in a secondary position on claims. Such cases were not counted in this study; incidence estimates derived from these code sets will therefore be underestimated.
This study was conducted in a US commercially insured population which, on average, tends to be slightly younger than the US general population. While we expect the results of this study to be generalizable to other insured populations, caution is warranted if coding standards for reimbursement for hypocalcemia or dermatologic adverse events differ across insurers. Further, as PPVs vary according to disease prevalence, our PPVs may underestimate those observed in populations with a higher prevalence of hypocalcemia and/or dermatologic events than our study population, and overestimate those observed in populations with a lower prevalence of these conditions. This highlights the need to assess the performance of case-identification algorithms within specific populations of interest. Lastly, we recognize that additional work is needed to assess the performance of algorithms for identifying other outcomes of interest that are associated with the use of osteoporosis medication, including osteonecrosis of the jaw and atypical femur fractures.
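The dependence of PPV on prevalence noted above follows from Bayes' theorem and can be made concrete with a small calculation. The sketch below is illustrative only: this study did not estimate sensitivity or specificity, so the values used here (and the function name `ppv_at_prevalence`) are hypothetical assumptions chosen to show the direction of the effect.

```python
def ppv_at_prevalence(sensitivity, specificity, prevalence):
    """PPV implied by Bayes' theorem for a given outcome prevalence.

    PPV = true positives / (true positives + false positives),
    with both terms expressed as fractions of the whole population.
    """
    for name, value in [("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)]:
        if not 0 <= value <= 1:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positives expected; PPV is undefined")
    return true_pos / (true_pos + false_pos)


# Hypothetical algorithm performance, held fixed across populations.
SENSITIVITY = 0.80  # assumed, not estimated in this study
SPECIFICITY = 0.99  # assumed, not estimated in this study

# Hypothetical outcome prevalences in three populations.
prevalences = [0.001, 0.01, 0.05]
ppvs = [ppv_at_prevalence(SENSITIVITY, SPECIFICITY, p) for p in prevalences]

for p, value in zip(prevalences, ppvs):
    print(f"prevalence {p:.1%}: PPV {value:.1%}")
```

Holding the algorithm's sensitivity and specificity fixed, the PPV rises with prevalence, which is why a PPV estimated in one population may not carry over to another.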