The Safety Attitudes Questionnaire: psychometric properties, benchmarking data, and emerging research

Background: There is widespread interest in measuring healthcare provider attitudes about issues relevant to patient safety (often called safety climate or safety culture). Here we report the psychometric properties, establish benchmarking data, and discuss emerging areas of research with the University of Texas Safety Attitudes Questionnaire.
Methods: Six cross-sectional surveys of health care providers (n = 10,843) in 203 clinical areas (including critical care units, operating rooms, inpatient settings, and ambulatory clinics) in three countries (USA, UK, New Zealand). Multilevel factor analyses yielded results at the clinical area level and the respondent nested within clinical area level. We report scale reliability, floor/ceiling effects, item factor loadings, inter-factor correlations, and percentage of respondents who agree with each item and scale.
Results: A six-factor model of provider attitudes fit the data at both the clinical area and respondent nested within clinical area levels. The factors were: Teamwork Climate, Safety Climate, Perceptions of Management, Job Satisfaction, Working Conditions, and Stress Recognition. Scale reliability was 0.9. Provider attitudes varied greatly both within and among organizations. Results are presented to allow benchmarking among organizations and emerging research is discussed.
Conclusion: The Safety Attitudes Questionnaire demonstrated good psychometric properties. Healthcare organizations can use the survey to measure caregiver attitudes about six patient safety-related domains, to compare themselves with other organizations, to prompt interventions to improve safety attitudes and to measure the effectiveness of these interventions.


Background
Experts believe that healthcare quality and safety must be investigated within the framework of systems and contextual factors in which errors and adverse events occur. [1][2][3][4][5][6][7][8] Vincent and colleagues describe several factors that influence clinical practice: organizational factors such as safety climate and morale, work environment factors such as staffing levels and managerial support, team factors such as teamwork and supervision, and staff factors such as overconfidence and being overly self assured. [8] Healthcare provider attitudes about these and related factors are one component of an organization's safety culture. Influential organizations such as the UK National Health Service, the Joint Commission on Accreditation of Healthcare Organizations, the Agency for Healthcare Research and Quality, and the U.S. National Quality Forum are encouraging the measurement of safety culture. This interest derives in part from the experience of other industries (nuclear power, naval aircraft carriers, NASA) that are known for their ability to reliably deal with risky processes. [9] Despite considerable interest, there is limited psychometric and benchmarking data available for the surveys designed to measure these attitudes [10][11][12][13]. The aim of this paper is to present our experience with the University of Texas Safety Attitudes Questionnaire. We describe the survey's background and psychometric characteristics, provide benchmarking data, discuss how the survey can be used, and note emerging areas of research.

Terminology
Safety culture has been defined as "the product of individual and group values, attitudes, perceptions, competencies, and patterns of behavior that determine the commitment to, and the style and proficiency of, an organization's health and safety management." [13] The SAQ elicits a snapshot of the safety culture through surveys of frontline worker perceptions. When using questionnaires to study group-level perceptions, the most appropriate term to use is climate (e.g., safety climate, or teamwork climate). Climates are more readily measurable aspects of safety culture (perceptions are part of both definitions), but surveys are generally not capable of measuring all other aspects of culture, such as behavior, values, and competencies. However, readers should be aware that some papers, organizations, and opinion leaders use the terms climate and culture interchangeably. We use the term climate where some may expect to see the phrase culture of patient safety.
Here we use clinical areas (a.k.a., work units, patient care areas, nursing units) as the group-level of interest. By testing the psychometrics of the SAQ at the individual level and the clinical area level, we can test the appropriateness of conceptualizing patient safety issues at the clinical area level, because clinical areas are generally associated with managers, geographical locations, and specific clinical and operational outcomes.

Lineage and conceptual background of the Safety Attitudes Questionnaire (SAQ)
The Safety Attitudes Questionnaire (SAQ) is a refinement of the Intensive Care Unit Management Attitudes Questionnaire, [14,15] which was derived from a questionnaire widely used in commercial aviation, the Flight Management Attitudes Questionnaire (FMAQ). [16,17] The FMAQ was created after researchers found that most airline accidents were due to breakdowns in interpersonal aspects of crew performance such as teamwork, speaking up, leadership, communication, and collaborative decision making. The FMAQ measures crew member attitudes about these topics.
Because 25% of the FMAQ items demonstrated utility in medical settings in terms of the subject covered and factor loadings, they were retained on the SAQ. The new SAQ items were generated by discussions with healthcare providers and subject matter experts. In addition, we relied upon two conceptual models to decide which items to include: Vincent's framework for analyzing risk and safety [8] and Donabedian's conceptual model for assessing quality [18]. This generated a pool of over 100 new items covering four themes: safety climate, teamwork climate, stress recognition, and organizational climate. Items were evaluated through pilot testing and exploratory factor analyses. This phase of survey development consistently yielded 6 factor-analytically derived attitudinal domains containing 40 items from the survey (two, three, four, and five factor structures were less robust). Three of the targeted themes, safety climate, teamwork climate, and stress recognition, emerged as factors. In particular, safety climate and stress recognition are conceptually quite similar to their counterparts in aviation. [19] The fourth targeted theme, organizational climate, consistently emerged as three distinct but related factors: perceptions of management, working conditions, and job satisfaction. Organizational climate plays a decisive role in setting the preconditions for success or failure in managing risks [3,4,20], and we therefore retained these three factors as part of safety attitude assessment. An additional 20 items were retained because they were deemed interesting and valuable to the unit managers and senior hospital leadership to whom we reported the results of our pilot studies.
The SAQ has been adapted for use in intensive care units (ICU) [15,21], operating rooms (OR), general inpatient settings (medical ward, surgical ward, etc.), and ambulatory clinics. For each version of the SAQ, item content is the same, with minor modifications to reflect the clinical area. For example, "In this ICU, it is difficult to discuss mistakes," vs. "In the ORs here, it is difficult to discuss mistakes." The SAQ elicits caregiver attitudes through the 6 factor analytically derived climate scales: teamwork climate; safety climate; job satisfaction; perceptions of management; working conditions; and stress recognition (Figure 1).
The SAQ is a single page (double sided) questionnaire with 60 items and demographics information (age, sex, experience, and nationality). The questionnaire takes approximately 10 to 15 minutes to complete. Each of the 60 items is answered using a five-point Likert scale (Disagree Strongly, Disagree Slightly, Neutral, Agree Slightly, Agree Strongly). Some items are negatively worded. There is also an open-ended section for comments: "What are your top three recommendations for improving patient safety in this clinical area?" Each version of the SAQ in the current study includes a "Collaboration and Communication" section, where respondents are asked to indicate the quality of collaboration and communication they have experienced with each of the types of providers in their clinical area (e.g., Staff Surgeons, Surgical Residents, Staff Anesthesiologists, OR Nurses, etc.) using a five-point Likert scale (Very Low, Low, Adequate, High, Very High).

Chronology of the Safety Attitudes Questionnaire administrations
Early survey development, pilot studies, and exploratory factor analyses were conducted in four USA critical care sites. [14,15] This work led to a six-factor solution using 40 of the 60 items, [21] and set the stage for the subsequent survey administrations reported in the current study. The data presented here came from six administrations (Table 1) of the SAQ between 2000 and 2003, totalling 203 sites (in the discussion we briefly note results of more recent survey administrations led by other investigators). We conducted further pilot testing of the SAQ for the United Kingdom and New Zealand, but aside from simple translations (e.g., USA Attendings and Residents became UK Consultants and Registrars, respectively) there were no substantial revisions. The first non-pilot version of the SAQ was administered in 106 United Kingdom (UK) ICUs. The second administration took place in 20 New Zealand (NZ) ICUs. The subsequent administrations occurred in the following sequence: 11 USA Inpatient settings, 2 USA OR settings, 11 USA Ambulatory Clinics, and 53 USA ICUs.

Participants
To qualify for inclusion, both full- and part-time staff had to have worked in the unit (including those not based in the unit, but with a significant work commitment to it) for at least one month prior to administration of the questionnaire. The "rule of thumb" we applied was that all personnel within a clinical area who either influence or are influenced by the "working environment" in that clinical area were invited to participate (e.g., Attendings/Staff Physicians, Resident Physicians, Registered Nurses, Charge Nurses, Pharmacists, Respiratory Therapists, Technicians, Ward Clerks, and a write-in "Other" category). Response was voluntary, and administration techniques included hand-delivery, meeting administrations, and in-house mailing administrations.

Data management and processing
SAQs were read into an OpScan8® OMR scanner using ScanTools® software, producing a tab-delimited file, which was converted into an SPSS Version 11.5 file for analysis. The Likert scale (1 = Disagree Strongly, 2 = Disagree Slightly, 3 = Neutral, 4 = Agree Slightly, 5 = Agree Strongly) was used to score each of the 60 items. Negatively worded items were reverse scored so that their valence matched the positively worded items.
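The scoring rule above can be sketched in a few lines of Python. This is only an illustration of reverse scoring on a 1 to 5 scale, not the SPSS procedure used in the study; the item identifiers (`q1`, `q2`, ...) and the set of negatively worded items are hypothetical placeholders.

```python
# Hypothetical item identifiers; the actual SAQ item numbering is not reproduced here.
NEGATIVELY_WORDED = {"q2", "q11"}

LIKERT_MIN, LIKERT_MAX = 1, 5


def score_item(item_id, raw):
    """Return the analysis score for one response, or None if the item was left blank."""
    if raw is None:
        return None
    if not LIKERT_MIN <= raw <= LIKERT_MAX:
        raise ValueError(f"response {raw!r} is outside the 1-5 Likert range")
    if item_id in NEGATIVELY_WORDED:
        # Reverse scoring maps 1<->5 and 2<->4 and leaves 3 unchanged.
        return LIKERT_MIN + LIKERT_MAX - raw
    return raw


# One respondent's raw answers (illustrative values only).
responses = {"q1": 4, "q2": 2, "q3": None, "q11": 5}
scored = {item: score_item(item, raw) for item, raw in responses.items()}
print(scored)
```

After this step a higher score always indicates a more favorable attitude, so items within a scale can be averaged.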

Data analysis
Each clinical area possesses a unique social fabric, leading respondents who work within the same clinical area to respond more similarly than respondents who are members of different clinical areas. Consequently, it is important to control for the non-independence of responses gathered from the same clinical area by performing analyses that address the multilevel nature of the data in order to obtain accurate model test statistics and scale reliability estimates. Therefore, we fit the hypothesized six-factor model via multilevel confirmatory factor analysis using Mplus version 2.12. [22] We used the entire sample of respondents in order to make the maximum number of clinical areas (n = 203) available for parameter estimation at the clinical area level. To evaluate the overall fit of each model to the data, we used the Mplus MLR chi-square test of model fit that is robust to non-normal data. [23] This estimator uses White's sandwich-based method to yield test statistics that are robust to misspecification of the model's factor structure and non-normal input data [24].
Figure 1. SAQ factor definitions and example items. Scale: Teamwork climate. Definition: perceived quality of collaboration between personnel. Example items: "Disagreements are appropriately resolved (i.e., not who is right, but what is best for the patient)"; "Our doctors and nurses work together as a well coordinated team".
While this robust estimator yields superior results compared to standard maximum likelihood when input data are non-normal, the chi-square test of absolute model fit can still be sensitive to trivial misspecifications in the model's structure. We therefore also evaluated the following descriptive measures of model fit: the standardized root mean square residual (SRMR) [25], the Comparative Fit Index (CFI) [26], and the Root Mean Square Error of Approximation (RMSEA) [27], using the recommended cutoff values of .90 for the CFI and related incremental fit indices, .08 for the RMSEA, and .10 for the SRMR [28]. We initially fit a six factor multi-level confirmatory factor analysis model that contained the 40 items retained in previous studies that explored the SAQ's construct validity [21]. Items with weak factor-item associations at the clinical area level or individual level were then deleted sequentially via a backward elimination procedure until satisfactory model fit was attained.
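As a small illustration of how the cutoff values above might be applied, the following Python sketch checks a set of fit indices against them. The cutoffs are the ones stated in the text; the function name and the example index values are hypothetical and are not results from this study.

```python
# Cutoffs as stated in the text: CFI at least .90, RMSEA at most .08, SRMR at most .10.
CUTOFFS = {
    "CFI": (">=", 0.90),
    "RMSEA": ("<=", 0.08),
    "SRMR": ("<=", 0.10),
}


def evaluate_fit(indices):
    """Return {index name: True/False} for whether each index meets its cutoff."""
    results = {}
    for name, value in indices.items():
        direction, cutoff = CUTOFFS[name]
        results[name] = value >= cutoff if direction == ">=" else value <= cutoff
    return results


# Hypothetical values for illustration only.
example_indices = {"CFI": 0.92, "RMSEA": 0.05, "SRMR": 0.12}
fit_check = evaluate_fit(example_indices)
print(fit_check)
```

A model can meet some cutoffs and miss others, which is why the text speaks of satisfactory fit on the majority of indices rather than on all of them.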
For purposes of consistency and to display separate reliability results for clinical areas and for individuals nested within clinical areas, we computed coefficient alpha values in Mplus using the structural equation modeling-based approach of Miller [29] and Raykov [30].
Once satisfactory model fit was obtained, we used the model results to compute composite scale reliability using Raykov's ρ coefficient. Coefficient alpha, the usual statistic used to estimate scale reliability, assumes that all items' factor loadings are identical, a restrictive assumption that biases scale reliability estimates [30]. Raykov's ρ relaxes this assumption, yielding more accurate reliability estimates. Moreover, coefficient alpha is limited to single-level analyses whereas ρ has recently been extended to incorporate multilevel analysis scenarios of the type presented here [31]. Accordingly, we report ρ below as the scale reliability estimate for the SAQ.
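To make the contrast between the two coefficients concrete, here is a minimal Python sketch of the standard single-level composite reliability formula, ρ = (Σλ)² / ((Σλ)² + Σθ), where λ are factor loadings and θ are error variances. It assumes a one-factor model with the factor variance fixed at 1 and uncorrelated errors; it is not the multilevel Mplus computation used in the study, and the loadings are hypothetical.

```python
def composite_reliability(loadings, error_variances):
    """Composite reliability of a unit-weighted sum score for a one-factor model.

    Assumes the factor variance is fixed at 1 and item errors are uncorrelated.
    """
    if len(loadings) != len(error_variances) or len(loadings) < 2:
        raise ValueError("need matching loadings and error variances for at least two items")
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_variances))


def model_implied_alpha(loadings, error_variances):
    """Coefficient alpha computed from the same model-implied item covariances."""
    k = len(loadings)
    total_var = sum(loadings) ** 2 + sum(error_variances)
    item_var = sum(l * l + e for l, e in zip(loadings, error_variances))
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical standardized loadings; error variance is 1 - loading**2.
unequal = [0.8, 0.7, 0.6]
unequal_err = [1 - l * l for l in unequal]
equal = [0.7, 0.7, 0.7]
equal_err = [1 - l * l for l in equal]

print("unequal loadings:", composite_reliability(unequal, unequal_err),
      model_implied_alpha(unequal, unequal_err))
print("equal loadings:  ", composite_reliability(equal, equal_err),
      model_implied_alpha(equal, equal_err))
```

With equal loadings the two coefficients coincide. With unequal loadings alpha falls below ρ, which is the bias described above.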

Terminology and interpretation
For ease of interpretability, we conducted analyses on mean scores, but also present percent agreement to facilitate understanding of the items and scales. The percentage of respondents within a clinical area reporting "agree slightly" or "agree strongly" for each of the items within a given scale was charted as the percent positive. When individual attitudes are aggregated by clinical area, the SAQ provides a snapshot of the climate in a given clinical area (i.e., one attitude is an opinion, but the aggregate attitudes of everyone in a clinical area are climate). Attitudinal questionnaires are also informative in organization-wide assessments of climate, but it is important to interpret organization-wide results at the work-unit or clinical area level as well, due to the high degree of variability between clinical areas within the same organization [32]. Variability within an organization is not unique to healthcare settings, as we have found that there is generally more variability within an airline between fleets (types of aircraft) and departments, than there is between organizations. For clarification of terms, we use the phrases "clinical area" and "site" to refer to all of the respondents from a given ICU, OR, Inpatient Ward, or Ambulatory Clinic.
Table 1. This chart details the number of clinical areas within an SAQ administration, the number of respondents, and the response rate. Note: percentage floor/percentage ceiling = the percent of scores at the extremes of the scaling range.
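The percent agreement described above can be sketched for a single item as follows. This is an illustrative Python reading of the definition, assuming reverse scoring has already been applied and that unanswered items are excluded from the denominator; the clinical area labels and responses are hypothetical. How item-level agreement is combined into a scale-level percent positive is not spelled out here, so the sketch stops at the item level.

```python
from collections import defaultdict

AGREE = {4, 5}  # "agree slightly" or "agree strongly" on the 1-5 scale


def percent_agree_by_area(records):
    """Percent of respondents agreeing with one item, per clinical area.

    records: iterable of (clinical_area, score) pairs; None scores are skipped.
    """
    agree = defaultdict(int)
    answered = defaultdict(int)
    for area, score in records:
        if score is None:
            continue
        answered[area] += 1
        if score in AGREE:
            agree[area] += 1
    return {area: 100.0 * agree[area] / answered[area] for area in answered}


# Hypothetical responses to a single item from two clinical areas.
records = [
    ("ICU-A", 5), ("ICU-A", 4), ("ICU-A", 2), ("ICU-A", None),
    ("ICU-B", 1), ("ICU-B", 3),
]
by_area = percent_agree_by_area(records)
print(by_area)
print("range across areas:", min(by_area.values()), "to", max(by_area.values()))
```

Reporting the minimum and maximum across clinical areas alongside the overall figure is what exposes the between-area variability that an organization-wide percentage hides.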

SAQ administrations
The overall SAQ response rate was 67.0% (10,843 out of 16,184 questionnaires), with a range of 65.7% to 72.2% across administrations. Response rates and floor/ceiling effects for each scale are presented in Table 1, by administration. Table 2 presents the SAQ factors' descriptive data by administration, including overall means, minimum and maximum clinical area means within an administration, and overall standard deviations. Incomplete data at the item level was approximately 1.5% overall (Table 3).
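The two summary quantities in this paragraph are simple proportions. The Python sketch below reproduces the overall response rate from the counts given in the text and shows one way to compute floor and ceiling percentages, taking them as the percent of scores at the lowest and highest possible values. The function names and the example scale scores are hypothetical.

```python
def response_rate(returned, distributed):
    """Response rate as a percentage of distributed questionnaires."""
    return 100.0 * returned / distributed


def floor_ceiling(scores, low=1.0, high=5.0):
    """Percent of scores at the bottom (floor) and top (ceiling) of the scaling range."""
    if not scores:
        raise ValueError("no scores supplied")
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == low) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == high) / n
    return floor_pct, ceiling_pct


# Counts taken from the text.
overall_rate = round(response_rate(10843, 16184), 1)
print(overall_rate)

# Hypothetical scale scores for five respondents.
floor_pct, ceiling_pct = floor_ceiling([1.0, 2.5, 5.0, 5.0, 4.0])
print(floor_pct, ceiling_pct)
```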
Descriptive analyses of individual items should not be appreciably affected by such a small amount of incomplete data. [33] There was substantial variability across the 203 clinical areas at the item level. In total, for example, one out of five respondents reported that it is difficult to speak up if they perceive a problem with patient care, but at the clinical area level, the percent of respondents who agree ranged from 0% to 50%. In other words, zero respondents reported difficulty speaking up in some clinical areas, while in other clinical areas, half of the caregivers reported difficulty speaking up.

Safety Attitudes Questionnaire: factor structure and multilevel modeling
To assess the fit of the expected six factor structure to the data, we fit a sequence of six factor multi-level confirmatory factor analysis models to the survey data. [34,35] Of the original 10,843 cases, 10,810 were associated with an identifiable clinical area; there were 203 available clinical areas for these analyses.
The SAQ with six factors and 40 items (plus 20 additional items) was used in all the administrations reported here. However, our analysis for this paper used a more rigorous multi-level confirmatory factor analysis and prompted us to drop ten items to attain satisfactory model fit for the majority of fit indices. The fit of the final model containing the 30 remaining items was generally satisfactory:  Table 3. The correlations between the factors are shown in Table 4.

Reliability assessments
Composite scale reliability for the SAQ was assessed via Raykov's ρ coefficient. The ρ value for the SAQ in this sample was .90, indicating strong reliability of the SAQ. Overall, this finding, in conjunction with the multi-level factor analyses, demonstrated that the SAQ has good psychometric properties. Also, anecdotal evidence from respondents during feedback presentations indicates that the SAQ items are in fact assessing topics of importance to front-line personnel.

Benchmarking climate
The percentage of respondents within a clinical area reporting "agree slightly" or "agree strongly" for each of the items within a given scale was charted as the percent positive for each SAQ factor. The six SAQ distributions in Figure 2 demonstrate the variability in percent positive SAQ scores across the 203 clinical areas in the present study.

Discussion
The SAQ is a psychometrically sound instrument for assessing six safety-related climate domains by systematically eliciting input from front-line caregivers. The SAQ can be used to meet the increasing demand for safety climate (often called safety culture) assessment at the clinical area level. For comparison purposes, those interested may use the 203 clinical areas reported here, as they demonstrated substantial variability in teamwork climate, safety climate, job satisfaction, stress recognition and working conditions. The item descriptives (Table 3), together with the percent positive distributions (Figure 2), and the administration-level descriptives (Table 1 and Table 2), serve as benchmarking data for the SAQ. Examination of Table 4 [41] Similarly, ICUs with high stress recognition scores made more recommendations regarding the need for increased staffing levels relative to low stress recognition ICUs (i.e., respondents who acknowledge the effects of stress on their performance were much more likely to identify the need for improved staffing levels).
Our results indicate that researchers should treat a hospital as a collection of clinical areas, much as a corporation is a collection of organizations, because each clinical area appears to have the kind of distinct climate that is typically attributed to an organization. The multi-level model demonstrated that there is more variability between clinical areas than within clinical areas. In other words, context of care assessments appear to be more robust, meaningful, and interpretable at the clinical area level. Climate at the clinical area level is important as many clinical and operational outcomes are tracked at the clinical area level (e.g., catheter related blood stream infections in intensive care units), and it is easier to target clinical area level improvements than hospital wide improvements. We see the focus on clinical area level climate as a way to acknowledge the complexity of the systems in which caregivers work, rather than assuming monolithic hospital climates that lack diagnosticity of clinical area level issues.
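One common way to express how much of the variability in a score lies between groups is the intraclass correlation ICC(1) from a one-way analysis of variance. The Python sketch below is offered only as an illustration of that idea; it is not the multilevel factor model used in this study, no ICC values are reported in this paper, and the clinical area labels and scores are hypothetical.

```python
from statistics import mean


def icc1(groups):
    """ICC(1) from a one-way ANOVA.

    groups: dict mapping clinical area -> non-empty list of respondent scores.
    Uses the usual adjusted group size so unequal group sizes are allowed.
    """
    sizes = [len(scores) for scores in groups.values()]
    g, n_total = len(sizes), sum(sizes)
    if g < 2 or n_total <= g:
        raise ValueError("need at least two groups and more respondents than groups")
    grand_mean = sum(sum(scores) for scores in groups.values()) / n_total
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
    ms_between = ss_between / (g - 1)
    ms_within = ss_within / (n_total - g)
    # Adjusted average group size; equals the common size when groups are balanced.
    k0 = (n_total - sum(n * n for n in sizes) / n_total) / (g - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)


# Hypothetical scale scores: areas differ from each other, respondents within an area agree.
distinct_areas = {"ICU-A": [1.0, 1.0], "ICU-B": [5.0, 5.0]}
# Hypothetical scale scores: areas are identical, respondents within an area disagree.
identical_areas = {"ICU-A": [1.0, 5.0], "ICU-B": [1.0, 5.0]}

print(icc1(distinct_areas))
print(icc1(identical_areas))
```

Values near 1 indicate that most variability lies between clinical areas; values near or below 0 indicate that clinical area membership explains little.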
The SAQ differs from other medical safety climate or "culture" surveys [10][11][12][13] in three respects: first, the SAQ has been more widely used for a longer period of time, so there is benchmarking data available and many of the challenges of longitudinal assessment have been encountered and addressed; second, a larger amount of psychometric data is available for the SAQ; and third, the SAQ maintains continuity with its predecessor (the FMAQ), a traditional human factors survey with a 20 year history in aviation. [16,17] The availability of benchmarking data in the public domain enables organizations to evaluate their own climate data. Also, preserving item continuity with other high-reliability industries allows for comparisons between professions [14], and assists with the search for universal human factors issues across professions.
Table 3. General descriptive information at the item level (Likert scale: 1 = disagree strongly, 2 = disagree slightly, 3 = neutral, 4 = agree slightly, 5 = agree strongly): percent missing data; overall mean (standard deviation); overall percent agree (minimum agree-maximum agree by clinical area); overall percent disagree (minimum disagree-maximum disagree by clinical area); standardized factor loadings at the between-area and within-area levels.

Limitations
The SAQ demonstrates generally good psychometric properties, though a number of the factor loadings at the clinical area level were smaller in magnitude than the corresponding factor loadings at the individual level. Not surprisingly, the standardized root mean square residual (SRMR) model fit statistic at the clinical area level was larger than desirable, indicating that further scale refinement could result in stronger factor loadings and a better fitting model at the clinical area level without sacrificing integrity of measurement at the individual caregiver level. As noted above, a number of studies have linked SAQ factor scores to important clinical outcomes [38] and other instruments that are used in healthcare. [40] Nonetheless, further research on the relationship between SAQ factors and other variables such as staff turnover, patient morbidity, length of stay, and errors is needed.
Lastly, due to the limited scope of this study, it was not possible to assess factor structure invariance across countries, job categories of respondents, or other stratification variables. At a minimum, we demonstrated the validity of the SAQ in a large heterogeneous sample made up of many different healthcare provider types, clinical areas of various acuity levels, countries of origin, hospital types, gender, experience level, etc., in hopes that the results will generalize to a wide variety of healthcare providers.

Conclusion
It is possible to reliably and meaningfully measure caregiver attitudes and perceptions relevant to the safety of healthcare. Use of the SAQ to assess climate in clinical areas will allow valid comparisons between hospitals, patient care areas, and types of caregivers, and tracking of change over time. We can and should do more to tap into the wisdom and perspective of the frontline caregivers regarding the contexts in which they deliver care. Versions of the SAQ, as well as the SAQ Users Manual and additional benchmarking data can be downloaded from our website. [42]