English as First Language
The aim of this paper was to report the psychometric properties of responses obtained with the CRU scale when used with healthcare aides in nursing homes. In line with previous studies [57, 58], a substantial number (48%) of the healthcare aides in the TREC study (which comprised our sample 3) were not from Canada and, did not speak English as their first language. This is challenging from a psychometric perspective because a homogenous sample is preferred for psychometric assessments such as factor analysis. There is some evidence to suggest that healthcare aides differ on several psychological concepts, for example, job satisfaction and burnout [58, 59], by ethnicity  of which first language spoken is a component. In our analysis, we found that healthcare aides who spoke English as their first language reported significantly lower scores on the CRU scale in comparison to healthcare aides who did not report English was their first language. These differences may reflect difficulty generally in understanding of the English language. It may also reflect difficulty in comprehending the concept of CRU and what the items comprising the scale were asking. Another possible explanation for the difference noted in the scores is a social desirability bias effect on part of healthcare aides who do not speak English as their first language since their scores on all items were consistently 'higher' than the scores of aides who did speak English as their first language. The differences in scores may, however, also be a valid discovery that can be explained by examining the specific cultural practices of the healthcare aides that did not speak English as their first language; the vast majority came from a variety of non-western cultures. This could be a fruitful area for future investigation. Although the finding that healthcare aides who speak English as their first language responded differently on the CRU scale compared to healthcare aides who do not speak English as their first language is not fully understood at this time, this study underscores the importance of collecting demographic data on healthcare aides' native language and ethnicity, as well as assessing differences by both variables prior to conducting psychometric analyses. In future research we will conduct additional qualitative work to explore reasons why healthcare aides who do not speak English as their first language score higher on the CRU scale then those that do speak English as their first language. We will also conduct a differential item analysis using item response theory to determine whether the items are biased towards healthcare aides who do or do not speak English as their first language. Bias occurs when one group of individuals has a different probability of endorsing a response category to an item, compared to a second group of individuals, after controlling for the value of the latent trait .
In this study, we aimed to assess the validity of the CRU scale and each of its items when completed by healthcare aides in nursing homes. A sound validity argument integrates various types of evidence to make a determination about the degree to which existing evidence and theory support the intended interpretations of scale scores for specific uses . The Standards', adopted in this study, focuses on content, response processes, internal structure, and relations to other variables evidence to obtain a unitary and comprehensive perspective of validity. In this framework all validity contributes to construct validity and exists as a matter of degree, meaning interpretations from scores are more or less valid given a specific context. The Standards' approach therefore provides an alternative to the traditional conceptualization of validity which views validity as: (1) distinct types (e.g., content, criterion, construct), and (2) existing or not.
In this study, we systematically performed several analyses to seek validity evidence (in each of the four domains comprising the Standards) with respect to the scores and interpretations obtained from the CRU scale when completed by healthcare aides in nursing homes. While it does do not provide a complete picture of all aspects of validity, it does provide a much needed first look at several critical issues that need to be addressed before more in-depth validity studies can be undertaken with additional samples.
Content validity is an important source of validity evidence; it is essential to identifying the concept being measured and is an early step in establishing construct validity. We explored content validity in a number of ways. First, we attempted to include a representative sample of items by reviewing the existing literature and modifying previously developed statements designed to capture conceptual use of knowledge in acute care hospitals with professional nurses. Second, before conducting a formal content validity assessment with experts, we assessed the appropriateness of the scale with respondents representative of those for whom it was developed (i.e., healthcare aides). This latter activity is formally labeled as 'response processes' validity evidence in the Standards. Based on this analysis, several revisions were made to the scale before it was formally assessed for item-concept relevance (i.e., content validity) with an expert panel. This process (integrating content and response process approaches to validation) illustrates the importance of considering multiple evidence sources. A traditional (more compartmentalized) approach to validity assessment would have resulted in the original items being assessed for relevance by an expert panel without knowledge of misfit between the items (as interpreted by the healthcare aides) and the concept of CRU. However, by adopting the Standards approach and letting multiple evidence sources inform one another, we were able to pilot test a form of the CRU scale that produced more valid score interpretations, then would have been used, if a traditional approach to validity assessment was undertaken.
Our validity assessment revealed problems with two of the five items in the CRU Scale: item 1 (give new knowledge or information) and item 3 (help change your mind). The formal (expert) content validity assessment resulted in item 1 (give new knowledge or information) being rated at an unacceptable level overall with respect to its relevance to CRU. Some experts also identified item 1 as having content overlap with the concept of instrumental research utilization. The ICC (2,1) measure of agreement further supported item 1 needing removal and/or revision; ICC (2,1) increased substantially when item 1 was removed from the scale (0.317 with item 1 to 0.793 without item 1). While the bivariate correlation between item 1 and instrumental research utilization was low - moderate (0.295), of the five scale items, it correlated the strongest with instrumental research utilization, lending some empirical support to the expert panel's assessment of the item (that it had content overlap with instrumental research utilization). Other issues with item 1 also emerged in our analysis. For example, item 1 had the second lowest factor loading in the PCA (though still substantial, Table 3), and model fit increased significantly in the CFA when the item was removed from the model. Post-analysis inspection of the item also revealed it to be a 'double-barreled' item, meaning it conveys two ideas: (1) give new knowledge; and, (2) give new information. Such items should be avoided wherever possible in instrument development since endorsement of the item might refer to either or both ideas ; however the item was not discovered to be double barreled until after the pilot test. Taken together, these findings suggest removal and/or revision of item 1 is required. Revision of the item so that it represents a single idea may lead to improved fit with the remaining four items. However, it is also possible that item 1 represents a distinguished aspect of CRU (i.e., an aspect not captured by the remaining four items); this would mean CRU is a more complex concept then the literature portrays and is multi-dimensional in nature. If this is confirmed in future research, an additional item group to assess this distinguished aspect of CRU should be developed. Until further research is conducted on item 1 (testing whether rewording the item improves its fit with the remaining four scale items or whether it represents a distinguished aspect of CRU), we recommend only using the four-item version of the scale (i.e., without item 1) in assessments of CRU by healthcare aides.
Item 3 (help change your mind) received a perfect relevance score in the formal content validity assessment (Table 2). However, the healthcare aides experienced difficulty comprehending this item according to our response processes work, which occurred prior to this assessment. Item 3 also exhibited the lowest factor loading of the five items in the PCA and CFA and the lowest corrected item total correlation (Tables 3 and 4). In our assessment of change in mean values with increasing levels of instrumental, persuasive, and overall research utilization, item 3 displayed the least change (Table 5). Combined, these findings indicate the healthcare aides may have had continued difficulty interpreting the item. These findings also demonstrate the importance of taking a comprehensive approach to validity assessment. While the formal content assessment revealed a perfect match between item 3 and CRU as a concept, the other evidence sources rendered the scores and interpretations from this item as less valid which affects the overall validity of the CRU scale. We trust the formal content validity assessment finding that the item is a good match with CRU. However, we believe, as seen in the response processes evidence, that the healthcare aides in our sample had difficulty understanding the item, thus rendering their responses to it as less valid. Future work on this item is required and should entail in-depth response processes work with healthcare aides to ensure clarity in item wording without appreciable loss in meaning.
Relations with other variables evidence also added to the construct validity argument for the CRU scale. Statistically significant bivariate correlations (Table 5) between the CRU latent scale score and the five item's scores with instrumental, persuasive, and overall research utilization reinforce past empirical research [2, 7], providing supporting validity evidence. The regression analysis (Table 6) also provided supporting validity evidence by showing that the CRU scale score was a predictor of overall research utilization, after controlling for other covariates [2, 7].
The Factor Model
While the items comprising the CRU scale were originally selected to cluster on one dimension (CRU) they were also intentionally selected to be non-redundant, allowing each item to focus on a slightly different feature of CRU. The intended 'clustering' of the items onto a factor renders the factor model the most appropriate model for assessing the internal structure of the CRU scale but the purposefully non-redundant nature of items meant that the scale would not function perfectly as a factor model. We employed three factor models: Model 1 with the five items loading onto a single factor, Model 2 with the five items loading onto a single factor with correlated errors between two sets of items (items 1 and 2, and items 3 and 4), and Model 3 with four items (item 1 was removed) loading onto a single factor with correlated errors between one set of items (items 3 and 4). A fourth model with one of items 3 or 4 also removed (in addition to item 1) would have been the next logical alternative model. However, this model would be just identified (df = 0) and thus, not testable. Item parceling (i.e., combining items into small groups of items within scales or subscales) has been used by others to deal with issues around local dependence and lack of unidimensionality. This was not an option here given the small number of items in the CRU Scale; by parceling items 3 and 4 along with removal of item 1, the model would remain 'just identified' and not testable.
As an alternative to the strict factor models assessed in this study, a model appropriately acknowledging the non-redundancy of the CRU items could be used. This would require use of single-item latent concepts, but such a model does not provide the kind evidence required by the Standards. A better model may be to simultaneously assess both measurement and latent structures using structural equation modeling. However, at this stage we do not know enough about the causal world of conceptual research utilization by healthcare aides to construct this model. Further research is needed to identify predictors of and outcomes to CRU, following which a causal model of CRU can be developed and tested. A CFA model was therefore our next best choice at this stage of the development of CRU with which to assess the internal structure of the CRU Scale.