Survey development and administration
Drawing on questions from the Health Information National Trends Survey (HINTS) [9], a national population-based survey administered by the National Cancer Institute (NCI), we added validated questions about health care access [10] and health literacy [11, 12] to create a local survey (see Additional file 1). We have described the development and translation of the full survey and its administration in greater detail elsewhere [13]. Briefly, we administered the survey in the City and County of San Francisco in 2017. We used a community-based sampling approach to optimize survey recruitment from populations that are likely to have liver disease and that bear the burden of cancer disparities; specifically, we had prespecified targets for race/ethnicity (25% Black American) and language (50% English, 25% Spanish, 25% Chinese) [13]. We worked with the San Francisco Cancer Initiative (SF CAN), a local collaborative ‘collective impact’ effort to reduce cancer burden in San Francisco [14], as well as several local community-based organizations to identify community events and popular community establishments at which to recruit participants. Participants included adults who were 18 to 75 years old, lived in San Francisco, and were able to complete the survey in English, Spanish, or Chinese. Trained bilingual staff administered the survey in person in the participants’ preferred language (English, Spanish, or Chinese). Participants provided informed written consent, which was reviewed at the start of each participant survey. The University of California San Francisco Institutional Review Board approved this study (16–20,707). All methods were performed in accordance with relevant ethical guidelines and regulations of the Declaration of Helsinki.
Variables
The outcome of interest was self-reported prior HBV screening, defined as participants answering ‘Yes’ to the validated question: ‘Have you ever had a blood test to check for hepatitis B?’ We used the Health Behavior Framework (see Additional file 2) to identify a list of potential factors that could explain variations in HBV screening rates. The predictors of interest were race/ethnicity, language preference, and having a usual place of care. Other variables included age, gender, education, English proficiency, health literacy, and insurance status.
Age (i.e. 18–34 as the reference category, 35–49, 50–64, and ≥ 65 years) and educational attainment (less than high school, high school or equivalent, some college/vocational training, and college graduate or higher as the reference category) were categorical variables. We collected disaggregated data on gender identity and race/ethnicity. For analysis purposes, we dichotomized gender (i.e. women as the reference category, and men) and categorized race/ethnicity (i.e. White as the reference category, Black, Asian, Latinx, and Other) rather than using disaggregated data because of small numbers in some groups.
Language preference was reported by the participant and was the language in which the participant completed the survey. English proficiency was defined as limited if participants reported that they spoke English “not at all”, “poor”, or “not well” [15]. The health literacy question was asked in reference to materials in participants’ preferred language. Health literacy was classified as limited if participants answered “sometimes”, “often”, or “always” to the question: “How often do you need to have someone help you when you read instructions, pamphlets, or other written material from your doctor or pharmacy?” [11] We used this single-question self-report health literacy item because it has been validated against sentence-completion and vocabulary-based direct health literacy measures in English and Spanish [11, 12]. In addition, it has been used in multiple studies [16,17,18] in place of more burdensome health literacy testing.
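The two dichotomizations described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the study's analysis code: the function and constant names are hypothetical, and the non-limited response options used in the usage lines (such as "very well" and "never") are assumed for illustration because the text lists only the responses coded as limited.

```python
# Responses coded as "limited", taken from the definitions in the text.
LIMITED_ENGLISH_RESPONSES = {"not at all", "poor", "not well"}
LIMITED_LITERACY_RESPONSES = {"sometimes", "often", "always"}


def has_limited_english_proficiency(response: str) -> bool:
    """Return True if the spoken-English response is coded as limited."""
    return response.strip().lower() in LIMITED_ENGLISH_RESPONSES


def has_limited_health_literacy(response: str) -> bool:
    """Return True if the help-with-reading response is coded as limited."""
    return response.strip().lower() in LIMITED_LITERACY_RESPONSES


# Usage with hypothetical responses (the non-limited options are assumed).
example_english = has_limited_english_proficiency("Not well")
example_literacy = has_limited_health_literacy("never")
```

Any response outside the listed sets is treated as not limited in this sketch; how missing responses were handled is not specified in the text and is not modeled here.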
Statistical analyses
We calculated descriptive statistics for participants, including means and standard deviations for numeric variables, and frequencies and percentages for categorical variables. We assessed differences between language groups using chi-squared tests for categorical variables. We computed weights using iterative proportional fitting (raking), a technique used for nonprobability samples that adjusts weights iteratively over a set of variables (age, gender, and race/ethnicity) so that the weighted sample matches the distribution of the reference population (San Francisco). Among respondents without missing data, we assessed the association between predictor variables and HBV screening using univariable logistic regressions. In addition, we analyzed the association of the primary predictor variables (race/ethnicity, language preference, and a usual place of care) with HBV screening using multivariable logistic regression, adjusting for age, gender, education, English proficiency, health literacy, and insurance status. We found no statistically significant collinearity or interaction between variables. Given the potential interaction between language preference and nativity, we conducted sensitivity analyses that evaluated models with both language preference and nativity, with each variable alone, and with both variables and their interaction. In these analyses, only language preference was significant; therefore, our final model included only language preference. We assessed statistical significance at the p < 0.05 level for all tests. Stata 16 (College Station, TX) was used for data analysis [19].
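For readers unfamiliar with raking, the following Python sketch illustrates the general idea of iterative proportional fitting over categorical margins. It is not the study's Stata implementation: the function names, the toy respondents, and the target proportions are all hypothetical, and only two variables are raked here to keep the example short.

```python
from collections import defaultdict


def rake(rows, targets, max_iter=100, tol=1e-8):
    """Iterative proportional fitting over categorical margins.

    rows: list of dicts, one per respondent (variable name -> category).
    targets: dict of variable name -> {category: population proportion}.
    Returns one weight per respondent; weights sum to len(rows).
    """
    n = len(rows)
    weights = [1.0] * n
    for _ in range(max_iter):
        max_change = 0.0
        for var, margin in targets.items():
            # Current weighted total in each category of this variable.
            totals = defaultdict(float)
            for row, w in zip(rows, weights):
                totals[row[var]] += w
            # Rescale so the weighted share matches the target share.
            for i, row in enumerate(rows):
                factor = margin[row[var]] * n / totals[row[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # all margins matched in this pass
            break
    return weights


def weighted_share(rows, weights, var, category):
    """Weighted proportion of respondents in a given category."""
    in_category = sum(w for row, w in zip(rows, weights) if row[var] == category)
    return in_category / sum(weights)


# Hypothetical toy sample and reference-population margins.
sample = [
    {"age": "18-34", "gender": "women"},
    {"age": "18-34", "gender": "men"},
    {"age": "35+", "gender": "women"},
    {"age": "35+", "gender": "men"},
    {"age": "35+", "gender": "men"},
]
population = {
    "age": {"18-34": 0.4, "35+": 0.6},
    "gender": {"women": 0.5, "men": 0.5},
}
raked_weights = rake(sample, population)
```

After raking, `weighted_share(sample, raked_weights, "gender", "women")` should agree with the target of 0.5 to within the tolerance. The sketch assumes every target category appears at least once in the sample; empty categories would need to be collapsed before raking.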