Hypertension affects an estimated 30% of the population in the United States [1–3] and is associated with adverse health outcomes such as cardiovascular disease, heart attack and stroke [4–8]. Population prevalence of hypertension is often estimated through large-scale surveys that rely on participants' self-reports of a previous clinical diagnosis of hypertension [5, 9]. Self-reported data are often more economically feasible to collect and more readily available (e.g., through telephone interviews [10, 11]) than clinically measured high blood pressure (HBP). However, given substantial evidence that awareness of hypertension is imperfect (for example, discrepancies between clinical measures and self-reported hypertension), reliance on self-reported data may contribute to inaccuracies in estimating population prevalence of hypertension [12–15]. Furthermore, given evidence that awareness varies across subgroups within the United States [16–19], reliance on self-reported data to estimate prevalence in small areas where population characteristics differ from national characteristics may contribute to inaccuracies in prevalence estimates.
Several studies have examined the validity of self-reported hypertension and its use for surveillance of hypertension trends. Studies using national data such as NHANES [18, 20] or large samples [11, 21, 22] have suggested that self-reported data may underestimate hypertension prevalence [10, 12–15], given that some people with hypertension are unaware of the condition or otherwise do not report it [5, 16, 23]. Age, gender, education, geographic area, marital status, race and ethnicity have been found to be associated with the accuracy of self-reported HBP [4, 6, 7, 16, 24–27]. Studies that have attempted to gauge the extent of this problem have reported differences between clinically measured and self-reported HBP that range from 2.0% to 27.0%. Most studies designed to assess the accuracy of self-reported data have compared self-reported high blood pressure to a ‘gold standard’ [17, 23, 28–31] such as measurements obtained from physical examinations using a mercury sphygmomanometer [26, 32]. The majority of these studies have been based on small samples, have relied on volunteers, have included only persons in good health, or have recruited participants from particular organizations (e.g., an HMO) or screening programs. These factors limit the ability either to generalize to broader populations or to identify characteristics that may be associated with differential accuracy of self-reported versus clinically measured HBP. One validation study has been based on a nationally representative sample, and this study identified a prediction model used to estimate prevalence of high blood pressure. These methods were developed for large-scale national samples, and require fairly sophisticated statistical expertise to implement.
However, there are well-established differences in the rates, awareness and treatment of hypertension across racial and ethnic groups, by socioeconomic status, and across geographical areas within the United States [25, 34]. Thus, the applicability of national models within specific communities or areas may vary. In addition, the severity of the underestimation in self-reported data varies across different chronic diseases [16, 23] such as diabetes, stroke and heart attacks [11, 35–38]. Assessing the validity of self-reported data in estimating hypertension prevalence in specific geographic areas, and developing simple prediction models that correct for possible misreporting of HBP in self-reported data, can be essential to the creation of accurate population-level estimates, and to population-level efforts to effectively prevent or treat HBP within particular contexts. To date, no studies of which we are aware have developed such a correction model for self-reported data at local geographic levels.
Thus, our objective in this paper is to examine the accuracy of self-reported data in describing the prevalence of hypertension in a racially and ethnically diverse urban community, and to develop a simple tool to correct self-reported data to more accurately reflect the clinical prevalence of HBP. Specifically, we aim to:
Aim 1: Examine the extent to which reliance on self-reported data may mischaracterize hypertension prevalence in a multiethnic urban community.
Aim 2: Develop a prediction model to calibrate self-reported data to more closely correspond to the clinical prevalence of hypertension in a local community sample.
To address these aims, we draw on data from two multiethnic urban samples, the 2002 Healthy Environments Partnership (HEP) community survey and the NHANES 2001–2002 national survey, restricted to residents of metropolitan areas aged 25 years and older, as described in the following section.