This study used the most recent race and ethnicity data collected as part of the administrative records of a large, integrated health plan and compared them with information available from birth certificates. The information was accurate for ethnicity and for the three largest racial groups (White, Black, and Asian). Two major causes of disagreement between administrative and birth certificate records were identified: (1) missing information in administrative records and (2) classification of children of multiple races based on information from only one parent. Eliminating these causes would increase the sensitivity for correct racial classification from 66.4% to 95.7%. Because race and ethnicity information in health plan administrative records is constantly updated, information was more accurate in children with more medical encounters. Sensitivity and PPVs were generally higher in non-Hispanics than in Hispanics. Limitations in data quality were noted for children of multiple races and children of AIAN origin.
The quality of racial and ethnic information in children has not been well studied. Nevertheless, the results of the present study were comparable to those of two previous studies investigating race and ethnicity information in adults [14, 15]. In these studies, PPVs for Whites and Blacks were between 86.7% and 95.1%, whereas PPVs and sensitivity for small minority groups such as AIAN were generally poor. Similarly, PPV and sensitivity for Hispanic adults were lower than for non-Hispanic Whites and Blacks. These patterns are generally consistent with the accuracy observed for racial and ethnic information in Medicare enrollment databases. The present study also shows that the patterns of misclassification varied greatly between Hispanic and non-Hispanic children.
In the present study, one major reason for race/ethnicity misclassification in the administrative records was missing information (non-classification). After exclusion of non-classified individuals, the sensitivity improved significantly for Whites, Blacks, and AIAN. This partially explains the lower sensitivity observed in our study compared with other studies that excluded non-classified individuals from their study populations [14, 15]. Incomplete and missing information on race, ethnicity, and language in databases from health care organizations has been reported previously by others. The results of our study suggest that birth certificate information is not routinely used to fill in missing information in administrative records, even when it is available, as in this setting.
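The effect of non-classification on sensitivity can be illustrated with a minimal sketch. It assumes that sensitivity and PPV are computed per group against the birth certificate as the criterion standard, and that a missing administrative entry counts as a disagreement unless it is excluded. The function name `sensitivity_and_ppv` and the toy records are hypothetical and are not study data.

```python
from typing import Optional, Sequence, Tuple


def sensitivity_and_ppv(
    reference: Sequence[str],
    recorded: Sequence[Optional[str]],
    group: str,
    exclude_missing: bool = False,
) -> Tuple[float, float]:
    """Sensitivity and PPV of `recorded` for one group, with `reference` as criterion standard."""
    pairs = list(zip(reference, recorded))
    if exclude_missing:
        # Drop individuals with no administrative classification (None).
        pairs = [(ref, rec) for ref, rec in pairs if rec is not None]
    true_pos = sum(1 for ref, rec in pairs if ref == group and rec == group)
    in_reference = sum(1 for ref, _ in pairs if ref == group)
    in_recorded = sum(1 for _, rec in pairs if rec == group)
    # Sensitivity: share of the reference group that is recorded correctly.
    sensitivity = true_pos / in_reference if in_reference else float("nan")
    # PPV: share of those recorded as the group who truly belong to it.
    ppv = true_pos / in_recorded if in_recorded else float("nan")
    return sensitivity, ppv


# Hypothetical toy records (illustration only, not study data).
birth_certificate = ["White", "White", "White", "White", "Black", "Black", "Asian", "Asian"]
administrative = ["White", "White", None, "Black", "Black", None, "Asian", "White"]

print(sensitivity_and_ppv(birth_certificate, administrative, "White"))
print(sensitivity_and_ppv(birth_certificate, administrative, "White", exclude_missing=True))
```

In this toy example, excluding the non-classified records raises the sensitivity for the White group from 2/4 to 2/3, while the PPV stays at 2/3, because a missing entry never counts toward the records classified as White.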
The second important cause of disagreement between administrative records and birth certificates was the misclassification of children whose parents were of different races (i.e., children of multiple races). Among children of multiple races, the vast majority were misclassified because the racial information of only one parent, mostly the mother, was used for classification. One possible explanation for this misclassification is the often-observed simplification of multiracial heritage: multiple races are often reported as one main race [25, 26]. Multiracial identification varies across regions and races; in particular, AIAN are less likely to report themselves as multiracial. It may also be speculated that maternal presence during birth as well as at later medical encounters accounts for this observation.
The present study adds new information on changes in the quality of race information over the course of membership. Race and ethnicity data in the integrated health care system used in the present study are updated during medical visits, whereas in other settings, such as health insurance claims, race/ethnicity information is usually collected at enrollment. The present study shows that the quality of information increased over time with an increasing number of medical encounters, especially inpatient visits. Although the effects may differ in magnitude by organization, we assume that our results are generalizable to other integrated health care settings that update their patients' demographic data during office visits.
Our study benefited from a large, diverse population with adequate representation of Hispanic and non-Hispanic racial and ethnic groups, which provided ample statistical power and allowed valid estimates of sensitivity and PPVs. A limitation of the present study is the use of information obtained from birth certificate records as a criterion standard. Previous studies that carefully reviewed birth certificate records have reported that these records provide relatively valid information on race and ethnicity [16, 19]. Race and ethnicity from birth certificates are also used as standards for federal statistics such as intercensal population estimates [20–22]. Despite PPVs of 96% and above for most races, significant limitations in data quality were described for individuals of AIAN origin.
Misclassification of racial and ethnic minorities can lead to data misinterpretation and erroneous conclusions. Incorrect classification of individuals of a small minority group may lead to over- or underestimation of health disparities and race-related risk factors. Therefore, accurate racial and ethnic information is crucial for health care research.