Content analysis is defined as a “research technique for making replicable and valid inferences from texts (or other meaningful matter) to the contexts of their use”, i.e. systematically evaluating texts (e.g. documents of various forms and verbal communication) for specific words and phrases, followed by coding and categorising to draw inferences. Content analysis was therefore used to record the number of times specific knowledge terms are referenced in media, specifically to collect and analyse data from the websites of Australian public hospitals.
Selecting the research sample
Purposive sampling was used to select the entities for the research, using the “Hospital resources 2016–17: Australian hospital statistics” list. The list contains 695 public Australian hospitals from all six Australian states and two territories. The research sample comprised 151 hospitals and their websites: 100 small hospitals and 51 large hospitals from all states and territories. Hospital size was defined by the number of beds, with large hospitals having more than 6 beds and small hospitals having 6 or fewer. Other criteria included the location (state) and characteristics of the website, such as its complexity and whether the hospital had its own domain. Both public acute hospitals and public psychiatric hospitals were included in the research sample. Private hospitals were excluded from the study. The number of hospitals included from each state and territory was as follows: Australian Capital Territory (ACT) (1); New South Wales (NSW) (33); Northern Territory (NT) (1); Queensland (QLD) (22); South Australia (SA) (12); Tasmania (TAS) (11); Victoria (VIC) (33) and Western Australia (WA) (38).
Ten variables were recorded:
1) Size – the number of beds.
2) State – ACT; NSW; NT; QLD; SA; TAS; VIC and WA.
3) P or O – ‘P’ indicates that the website is a complex portal consisting of multiple pages; ‘O’ represents a one-page website that does not have a deeper structure and shows all the information about the hospital on one page.
4) Dom – whether the hospital has its own domain. ‘2nd’ indicates that the hospital website URL looks like, e.g., ‘shvs.org.au’ and thus runs on its own second-level domain; ‘3rd’ indicates that the website is operated on its own subdomain placed before the main domain name, such as ‘fionastanley.health.wa.gov.au’; ‘no’ indicates that the hospital website resides on a general domain shared by multiple entities, such as a health network domain, with a homepage URL such as ‘health.qld.gov.au/townsville’.
5) K – the total number of documents containing the word ‘knowledge’.
6) KC – the total number of documents containing terms related to knowledge creation.
7) KR – the total number of documents containing terms related to knowledge retention.
8) KS – the total number of documents containing terms related to knowledge sharing.
9) KM – the total number of documents containing terms related to knowledge management.
10) Total – represents the sum of values for variables 6 to 9, i.e. the total number of documents containing any of these knowledge terms.
To collect the data for variables 3 (P/O) and 4 (Dom), observation and visual analysis were used. Links on the website were followed to see whether there was a deeper structure behind the first page (homepage). The sitemap, if available, was opened to confirm the findings.
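The study assigned the Dom categories by observation. Purely to illustrate the three categories, the heuristic below is a minimal sketch that derives them from a homepage URL. The function name `classify_domain` and the suffix list `REGISTRY_SUFFIXES` are assumptions introduced here, not part of the study, and the suffix list is not exhaustive.

```python
from urllib.parse import urlparse

# Assumed, non-exhaustive list of suffixes under which Australian
# organisations register their names (illustrative only).
REGISTRY_SUFFIXES = (
    "org.au", "com.au", "net.au", "edu.au",
    "nsw.gov.au", "vic.gov.au", "qld.gov.au", "wa.gov.au",
    "sa.gov.au", "tas.gov.au", "nt.gov.au", "act.gov.au",
)


def classify_domain(homepage_url: str) -> str:
    """Return '2nd', '3rd' or 'no' for a hospital homepage URL (heuristic)."""
    if "://" not in homepage_url:
        homepage_url = "https://" + homepage_url
    parsed = urlparse(homepage_url)

    # A homepage that lives under a path sits on a domain shared with others.
    if parsed.path.strip("/"):
        return "no"

    host = parsed.hostname or ""
    if host.startswith("www."):
        host = host[4:]

    for suffix in REGISTRY_SUFFIXES:
        if host.endswith("." + suffix):
            # Labels left of the suffix: one means an own registered name,
            # more than one means an own subdomain of a larger domain.
            labels = host[: -len(suffix) - 1].split(".")
            return "2nd" if len(labels) == 1 else "3rd"

    raise ValueError(f"suffix of {host!r} is not in the assumed list")


for url in ("shvs.org.au",
            "fionastanley.health.wa.gov.au",
            "health.qld.gov.au/townsville"):
    print(url, "->", classify_domain(url))
```

The heuristic cannot recognise a shared domain whose homepage has no path, which is one reason manual inspection, as used in the study, remains necessary.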
To gather values for variable 5, a series of searches was performed using the Australian version of Google. The google.com.au website was used in incognito mode to enter the search phrases. The keyword ‘knowledge’ was used as the search phrase, and the website domain was added to limit the search results to the examined hospital website only. The search string looked like:
The number of online resources containing this keyword was recorded. The value was further refined by manually checking every page with search results, as Google occasionally includes a page that does not contain the keyword in the search results.
For variables 6 to 9, the same principle was used, but multiple keyword combinations were entered. For each variable, a series of keywords closely related to the examined area was defined and used as search strings. For example, when counting the websites mentioning terms related to knowledge creation (KC), terms such as ‘create knowledge’, ‘generate knowledge’ and ‘gain knowledge’ were included. The search string used looked like:
“generate knowledge” site:https://fionastanley.health.wa.gov.au
The results of this search string include only pages that contain this exact phrase, with both words next to each other. To increase the relevance of the results, variations with another word between them were also considered. For example, if the text on a web page reads ‘gain new knowledge’, it should be counted as communicating about knowledge creation. The number of these occurrences was added to the previous value. The modified search string was:
“generate * knowledge” site:https://fionastanley.health.wa.gov.au
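The two query forms described above can be generated mechanically. The following is a minimal sketch, assuming that the wildcard variant is built for every two-word keyword (the text only shows it for one example); `build_queries` and the shortened keyword list are hypothetical names used for illustration.

```python
from itertools import chain

# Hypothetical subset of the KC keyword list, for illustration only.
KC_KEYWORDS = ["generate knowledge", "gain knowledge", "knowledge creation"]


def build_queries(keyword: str, site: str) -> list[str]:
    """Return the exact-phrase query and, for two-word keywords,
    a variant with a * placeholder between the two words."""
    queries = [f'"{keyword}" site:{site}']
    words = keyword.split()
    if len(words) == 2:
        # The placeholder lets phrases such as "gain new knowledge" match.
        queries.append(f'"{words[0]} * {words[1]}" site:{site}')
    return queries


site = "https://fionastanley.health.wa.gov.au"
all_queries = list(
    chain.from_iterable(build_queries(k, site) for k in KC_KEYWORDS)
)
for query in all_queries:
    print(query)
```

Each query would still be entered into the search engine and its result pages checked manually, as described for variable 5.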
The principles of the methodology introduced in an earlier study were applied to enable a comparison of the results with different industries. That study evaluated three options for collecting data to analyse online communication about knowledge terminology. Both studies used Google search to determine the number of websites mentioning the terms; in the present study, however, the method was enhanced by including keywords separated by another word and by expanding the number of keywords for each area. The earlier study examined three areas based on three knowledge processes: knowledge creation, knowledge sharing, and knowledge implementation. The methodology was further adjusted in the present study by adding a fourth area, knowledge retention. The final keyword list for each variable (area) was settled after extensive testing, in which multiple keyword combinations were used to examine the results for the hospitals with the most keyword mentions on their websites. This iterative approach enabled the researchers to include the most relevant keywords and to drop keyword combinations without sufficient coverage. The keyword list thus reflects the existence of a knowledge focus in an organisation. The researchers are aware that the list is not, and will never be, complete.
We considered the number of websites containing each of the following terms:
KC (knowledge creation)
build knowledge, building knowledge, create knowledge, creating knowledge, generate knowledge, generating knowledge, acquire knowledge, acquiring knowledge, improve knowledge, improving knowledge, increase knowledge, increasing knowledge, develop knowledge, developing knowledge, expand knowledge, expanding knowledge, gain knowledge, gaining knowledge, knowledge building, knowledge creation, knowledge expansion, knowledge acquisition, knowledge generation
KR (knowledge retention)
knowledge retention, knowledge capture, knowledge base, retain knowledge, retaining knowledge, retention of knowledge, capture knowledge, capturing knowledge, body of knowledge
KS (knowledge sharing)
knowledge sharing, knowledge transfer, knowledge exchange, knowledge dissemination, knowledge application, sharing knowledge, apply knowledge, applying knowledge, share knowledge, dissemination of knowledge, disseminate knowledge, transfer knowledge, transfer of knowledge
KM (knowledge implementation/management)
implement knowledge, manage knowledge, managing knowledge, knowledge implementation, knowledge management
Before the analyses were conducted, the data were cleaned and prepared. Descriptive statistics were used to determine the total keyword occurrences and to illustrate how the use of knowledge-related terminology on the websites differs between hospitals.
To allow for significance testing, numerical variables were recoded as categorical variables as follows:
Size: this was recoded with categories Large (more than 6 beds) or Small (up to 6 beds). The decision to use 6 beds as the cut-off came from the data, which showed two clear groupings.
MentionedK: this was recoded as a binary variable indicating whether the phrase K was mentioned (count of at least 1) or not mentioned (count of 0). In the raw data, counts of mentions of “K” were dominated by 0s, which severely limits the statistical analysis possible, hence the choice to convert the variable to binary.
MentionedKC, KR, KS, KM, Total: “Mentioned” binary variables were also created for the remaining phrase counts for KC, KR, KS, KM and Total using the same rationale.
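The recoding rules above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the two records and their counts are invented, and the field and function names (`beds`, `recode`) are assumptions.

```python
# Hypothetical records; the counts are invented for illustration.
hospitals = [
    {"name": "Hospital A", "beds": 4, "K": 0, "KC": 0, "KR": 0, "KS": 0, "KM": 0},
    {"name": "Hospital B", "beds": 250, "K": 12, "KC": 3, "KR": 0, "KS": 2, "KM": 1},
]

SIZE_CUTOFF = 6  # beds; the cut-off stated in the text
COUNT_VARS = ("K", "KC", "KR", "KS", "KM", "Total")


def recode(record: dict) -> dict:
    """Add Total, the Size category and the Mentioned* binary variables."""
    out = dict(record)
    # Total is the sum of variables 6 to 9 (KC, KR, KS, KM).
    out["Total"] = sum(record[v] for v in ("KC", "KR", "KS", "KM"))
    out["Size"] = "Large" if record["beds"] > SIZE_CUTOFF else "Small"
    for var in COUNT_VARS:
        # True if the term group was mentioned at least once.
        out[f"Mentioned{var}"] = out[var] >= 1
    return out


recoded = [recode(h) for h in hospitals]
for row in recoded:
    print(row["name"], row["Size"], row["MentionedK"], row["MentionedTotal"])
```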
The chi-square test of independence was used to analyse the two-way contingency table for each variable combination to determine whether there is an association between the two variables of interest. If the result was significant, indicating an association between the categories of the two variables, post-hoc analysis was applied to identify the categories producing the significant result. In the two-way tables where cell counts equalled 0, violating a fundamental assumption of the chi-square test of independence, Fisher’s exact test was used to validate the results of the chi-square test.
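For a 2×2 table, both tests can be written out with the standard library alone. The sketch below is an illustration under stated assumptions, not the software used in the study: the table values are invented, the chi-square statistic is computed without continuity correction, and the function names are hypothetical.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table
    (no continuity correction; assumes all margins are non-zero)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    stat = 0.0
    for i, r in enumerate(rows):
        for j, cl in enumerate(cols):
            expected = r * cl / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the survival function is erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(table):
    """Two-sided Fisher's exact test: sum the probabilities of all tables
    with the same margins that are no more likely than the observed one."""
    (a, b), (c, d) = table
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2
    lo, hi = max(0, c1 - r2), min(r1, c1)
    probs = {x: comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)
             for x in range(lo, hi + 1)}
    p_obs = probs[a]
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))


# Invented counts: rows = Size (Large, Small), columns = MentionedK (yes, no).
table = [[8, 2], [1, 9]]
stat, p_chi = chi_square_2x2(table)
p_fisher = fisher_exact_2x2(table)
print(f"chi-square = {stat:.3f}, p = {p_chi:.4f}; Fisher p = {p_fisher:.4f}")
```

Tables larger than 2×2, such as those involving State, need the general chi-square distribution and a generalised exact test, which a statistics package would normally provide.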