Profiling the different needs and expectations of patients for population-based medicine: a case study using segmentation analysis

Background: This study illustrates an evidence-based method for the segmentation analysis of patients that could greatly improve the approach to population-based medicine by filling a gap in the empirical analysis of this topic. Segmentation facilitates individual patient care in the context of the culture, health status, and health needs of the entire population to which that patient belongs. Because many health systems are engaged in developing better chronic care management initiatives, patient profiles are critical to understanding whether some patients can move toward effective self-management and can play a central role in determining their own care, which fosters a sense of responsibility for their own health. A review of the literature on patient segmentation provided the background for this research.

Method: First, we conducted a literature review on patient satisfaction and segmentation to build a survey. Then, we surveyed 3,461 users of outpatient services. The key structures on which the subjects’ perception of outpatient services was based were extrapolated using principal component factor analysis with varimax rotation. After the factor analysis, segmentation was performed through cluster analysis to better analyze the influence of individual attitudes on the results.

Results: Four segments were identified through factor and cluster analysis: the “unpretentious,” the “informed and supported,” the “experts” and the “advanced” patients. Their policy and managerial implications are outlined.

Conclusions: With this research, we provide the following:
- a method for profiling patients based on common patient satisfaction surveys that is easily replicable in all health systems and contexts;
- a proposal for segments based on the results of a broad-based analysis conducted in the Italian National Health System (INHS).
Segments represent profiles of patients requiring different strategies for delivering health services. Knowledge and analysis of these segments might support an effort to build an effective population-based medicine approach.

determining their care, which fosters a sense of responsibility for their own health (endnote c).
Second, within a single healthcare organization, whether private or public, for-profit or not-for-profit, profiles are fundamental for developing targets and increasing understanding of the forces that drive healthcare consumption. Customer relations could also benefit from a better understanding of the different patient clusters.
Third, in all types of systems and organizations, profiles and their corresponding segments are important for building empowerment strategies to facilitate a shift from simple compliance to a "concordance" approach (endnote d) [20][21][22][23].
We have noticed a lack of both empirical analysis within health contexts and research on market segments and targets within health systems. Most of the analysis is inferred from general surveys of consumer behavior. In this research, we provide the following: a method for profiling based on common patient satisfaction surveys, which is easily replicable in all health systems and contexts; a proposal for segments based on the results of a broad-based analysis conducted in the Italian National Health System (INHS).
A review of the literature on patient segmentation provided the background for this research. After presenting our analyses and findings, we draw some preliminary conclusions that could have implications for healthcare policy.

Background
Several studies highlight a process for differentiating the populations of developed countries [24][25][26][27][28][29][30][31]. This process is often referred to as a polarization process, and studies found in the literature can be divided into two interpretations.
From the perspective of the health and social conditions of a potential patient, many studies introduce a dichotomous scheme with two poles: frail, vulnerable elderly patients who lack family support, have multiple chronic conditions, are not self-sufficient, have cognitive disorders, are financially distressed, and are unable to express an appropriate demand for health and social services; and healthy and wealthy elderly patients who are educated, pursue well-being through recurring access to an extended range of health services (preventive, curative, and aesthetic), and are willing to pay out-of-pocket or premium prices for high-quality and additional services.
From the perspective of the role of behaviors, some studies have theorized that patient polarization may occur among different dominant profiles [32][33][34]. Researchers at the Institute for the Future have identified the concept of 'Personal Health Ecologies' (PHEs), which reflects a consumer's unique approach to managing their health. The principal PHEs proposed are illustrated in Table 1.
Although both efforts to categorize patients into clusters are effective to some extent, they are either too narrow or too broad and abstract to provide useful information to health care managers and policy makers. A context-embedded and evidence-based segmentation of patient populations could help to explain the driving forces necessary to improve service delivery (appropriateness, access, timeliness) and to engage and empower the patient in the care process [35]. This approach contributes to the transformation of the health care system from one that is essentially reactive, primarily responding when a person becomes ill, to one that is proactive and focused on keeping a person as healthy as possible. In this respect, we describe a new analysis of patient segmentation derived from the combination of patient characteristics and evidence gathered from survey data. The analysis is easily replicable and can be contextualized to specific subpopulations or geographical areas.

Survey definition
In the first phase of research, a literature review was conducted to identify patient satisfaction questionnaires and surveys of patient opinions regarding outpatient services published between January 1990 and January 2009, in order to identify the issues that patients may consider in their assessment of such services. The bibliographic databases used were Medline, Scopus, Social Science Citation Index and EconLit. Search terms included "satisfaction," "evaluation," "assessment," "judgement," "opinions," "perceptions," "questionnaire," "patient," "user," "people," "outpatient," "primary care," "out of hours," and "care continuity." The bibliographical search was supplemented by reviewing references from identified articles and by an Internet search of relevant web sites. Studies were included if they were experimental and if they were written in English. The search methodology generated 497 possible references. By reading the complete articles or abstracts, possible evaluation schemes, such as those related to outpatient services [36][37][38][39], after-hours services [40] and walk-in clinics [41], were identified.

In the second phase, to ensure valid content, a preliminary list of issues and statements produced from the literature review was submitted to two focus groups of 5 patients who had recently used outpatient services. The method was chosen to generate a limited number of relevant variables and items related to patient satisfaction and experiences with outpatient services. The patients were asked to select the most important aspects affecting their satisfaction and the significant features influencing their utilization of the services, and to suggest additional issues or questions. Both groups were interviewed by a trained interviewer, and the sessions followed a similar structure. The groups were audiotaped and coded separately by two researchers. Analysis revealed 4 recurrent domains that characterized patient responses: quality of healthcare services, quality of administrative services, access to out-of-hours care and interpersonal aspects. These dimensions formed the basis for developing the questionnaire.

In the third phase, an expert panel of primary care managers and the district managers of local health authorities (LHAs) (endnote e) reviewed the questionnaire to ensure relevance and clarity of the items. Questions that were confusing or ambiguous were removed, replaced or rewritten with the appropriate terminology. Moreover, two additional items regarding home care and vaccinations were included. Therefore, the final set of evaluation characteristics included specialist visits, diagnostic services, administrative services, home care, advisories, vaccinations, and the coordination of continuity of care. The professionalism and kindness of the health care and the administrative staff working in the outpatient clinics and the professionalism of the after-hours doctors (AHDs) were also rated for satisfaction. These 12 variables were fed into a 12-question questionnaire using a five-point rating scale. The variables were integrated with another 10 multiple choice or yes/no questions designed to examine the subjects' experiences with these services. The questionnaire also included sociodemographic data (see Additional file 1: Appendix 1). Four questions, unrelated to the outpatient services assessment, were incorporated in the final section of the survey to evaluate communication initiatives of the LHAs within the Tuscan region. The results of these 4 variables are not reported in this paper.

Data collection
The reference population consisted of Tuscan citizens over 18 years of age (3,168,955 in 2009). We selected Tuscan citizens because they are the target population for the Tuscany Regional Health System (TRHS). During 2008-2010, the TRHS introduced a regional health plan [42] with a strategic priority to develop a proactive and chronic care model for population-based medicine. All Tuscan LHAs and their primary care and district managers are engaged in this strategic goal (endnote f). Therefore, the results of this research would be of great interest to the TRHS. The sample was stratified into the 34 health districts in the region. In each health district, a sample size of approximately 196 subjects was required, assuming a 50% satisfaction at a 95% confidence level with a margin of error of ± 7%. Assuming a response rate of approximately 24%, which is in line with previous studies in that area, oversampling was performed to ensure that the minimum sample size was obtained. The calculated sample size was then multiplied by 34 to obtain the total sample size of 27,300, representing 0.9% of the reference population. The sample used was randomly selected from an updated regional phone directory containing all listed residential telephone numbers. A pilot test was performed on a sample of 34 individuals of differing ages and geographical locations to verify whether the subjects understood the questionnaire and to determine if other relevant issues had been omitted from the survey. Based on the respondents' feedback, response rates and item response rates, the pilot program indicated that no topics other than those already included in the questionnaire were considered relevant. Some changes were made to the wording of the questions and the instructions to eliminate ambiguous phrasing. Interviews were conducted during the summer of 2009 with a computer-aided telephone interview system during both working and non-working hours to reach a wide variety of patients (endnote g).
The use of a sample list from the telephone directory created intrinsic variation in the sociodemographic composition of the interviewees. Women and elderly subjects were overrepresented with respect to the actual composition of the population. However, a nonresponse bias test to identify possible distortions among the opinions of the interviewees classified in the dataset [43][44][45][46][47][48] provided reassuring results. The testing compared the responses from the first 200 and the last 200 respondents for each of the 12 questions to determine the degree of satisfaction. A chi-square test indicated no statistically significant differences (at the 5% level) between the scores of the two groups for the different variables.

Table 1. Principal Personal Health Ecologies (PHEs)
- Allopathic self-care: prefer over-the-counter products or toughing it out rather than seeing a physician;
- Maximizers: highly engaged with their physicians and try to get the most out of their health care plans;
- Nutritionists: rely on food and diet to prevent illness;
- Naturalists: rely on complementary and alternative medicine and their bodies' natural healing process and dislike using the health care system;
- Integrators: those who rely on the health care system for medical diagnoses but also dabble in complementary and alternative medicine (CAM);
- Holistics: use the health care delivery system and CAM for the things each modality excels in;
- Healthy Lifestylers: dramatically change their lives to maximize their health and look for health benefits across a wide range of products and services.
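The sample-size reasoning in the data collection description can be checked with a short calculation. The following is a minimal sketch, assuming the standard normal-approximation formula for a proportion and an exact 24% response rate; the function names are illustrative and do not come from the study.

```python
import math

def required_sample_size(p, margin, z=1.96):
    """Respondents needed to estimate a proportion p within +/- margin.

    Uses the normal-approximation formula n = z^2 * p * (1 - p) / margin^2.
    The rounding step guards against floating-point noise before ceil.
    """
    n = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(round(n, 6))

def contacts_needed(respondents, response_rate):
    """People to contact so that the expected number of responses is reached."""
    return math.ceil(round(respondents / response_rate, 6))

# Figures quoted in the text
N_DISTRICTS = 34
POPULATION = 3_168_955
REPORTED_TOTAL = 27_300

per_district = required_sample_size(p=0.5, margin=0.07)   # 196
per_district_contacts = contacts_needed(per_district, response_rate=0.24)
total_contacts = per_district_contacts * N_DISTRICTS

print(per_district, per_district_contacts, total_contacts)
print(f"reported total as share of population: {REPORTED_TOTAL / POPULATION:.1%}")
```

With an exact 24% response rate this sketch yields a total somewhat above the 27,300 reported. The text only states "approximately 24%", so the gap presumably reflects rounding of the assumed rate rather than a different method.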

Statistical analyses
Questions used to investigate more specific services with a percentage of missing values greater than 10% (home care, advisories and vaccinations) were eliminated from the satisfaction assessment, leaving nine variables. "I don't know" and similar answers were considered missing values and were replaced using an "expectation-maximization" algorithm [49]. Principal component factor analysis with varimax rotation was used to extrapolate the key structures on which the subjects' perceptions of outpatient services were based. After the factor analysis, segmentation was performed using cluster analysis to better analyze the influence of individual attitudes on the results [44][45][46]. The "scores" given to various factors were employed for hierarchical cluster analysis using Ward's method to identify the correct number of clusters and their respective centers [47] (endnote h) while accounting for any a priori expectations concerning the data structure. Using the previously identified cluster centers, K-means cluster analysis was performed (endnote i), and the results were validated through linear discriminant analysis [48]. All statistical analyses were performed using SPSS 16.0.
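The second stage of the clustering procedure, K-means started from centers found in a prior hierarchical step, can be illustrated as follows. This is a minimal sketch in plain Python, not the SPSS 16.0 procedure used in the study: the factor scores and seed centers are hypothetical toy values, and the hierarchical (Ward) step that would normally produce the seeds is assumed to have been run already.

```python
import math

def assign(points, centers):
    """Index of the nearest center (Euclidean distance) for each point."""
    return [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points]

def update(points, labels, centers):
    """Recompute each center as the mean of the points assigned to it."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if not members:
            new_centers.append(old)  # an empty cluster keeps its old center
        else:
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
    return new_centers

def kmeans_from_seeds(points, seeds, max_iter=100):
    """K-means that starts from given centers instead of random ones."""
    centers = [tuple(s) for s in seeds]
    labels = assign(points, centers)
    for _ in range(max_iter):
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return labels, centers

# Hypothetical factor scores (staff, services, continuity of care)
scores = [(1.0, 0.9, 1.1), (0.8, 1.2, 0.7), (1.1, 1.0, 0.9),
          (-1.0, -0.8, -1.2), (-0.9, -1.1, -0.7), (-1.2, -1.0, -1.0)]
# Hypothetical seeds standing in for centers from the hierarchical step
seeds = [(0.5, 0.5, 0.5), (-0.5, -0.5, -0.5)]

labels, centers = kmeans_from_seeds(scores, seeds)
print(labels)
print(centers)
```

Seeding K-means with centers from a hierarchical solution makes the result reproducible and ties the final partition to the number of clusters chosen in the first stage.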

Results
The interviewees evaluated the analyzed characteristics, which are listed in Table 2. When analyzing single variables, neither the effectiveness of the outpatient services nor the professionalism of their staff appears to be objectionable. However, questions regarding the administrative services of the outpatient clinics and the continuity of care received lower scores. The results also indicate a degree of variability in the average judgments, especially those focused on variables related to diagnostics, administrative services and the service staff.

Factor identification
Factor analysis was performed to reduce the nine characteristics to a more condensed set of dimensions and to test the construct validity of the questionnaire. The Kaiser-Meyer-Olkin measure of sampling adequacy was 0.8, and Bartlett's test of sphericity yielded a highly significant chi-square value. During the extraction, all communalities exhibited values greater than 0.7. Based on the explained variance and the scree plot results, a three-factor solution (with 87.5% of the total variance explained) was considered appropriate. Table 3 shows the rotated structure matrix used to identify the extracted factors and the variables related to each factor. The first factor identified was the outpatient clinic staff, which is correlated to variables assessing the professionalism and kindness of the staff in the outpatient district structures that perform health and administrative duties. As this dimension exhibited the greatest percentage of variance explained (37.3%), the 4 correlated items were reanalyzed using principal component analysis with varimax rotation and eigenvalues greater than 1 to determine whether this factor could be divided into 2 components representing, for example, the characteristics of the health staff and those of the administrative staff. The analysis revealed loading values greater than 0.8 for all variables and no statistically significant loadings for the other factors, suggesting the homogeneity of this domain. The second factor (28.5% of variance explained) includes the other aspects of the outpatient clinics and indicates a close connection between all attributes of the services provided, including diagnostic tests, specialist visits and administrative services. The high correlation coefficients for all variables and the absence of large variations in them allow one to define this factor as representative of all the scores given to the outpatient clinic services.
The third factor (21.7% of variance explained) was continuity of care. Although this factor consists of only two variables, it was robust: it did not change with modifications to the factors used for the analysis or the methods of calculation. Continuity of care is closely associated with the organization and professionalism of the after-hours doctors (AHDs). The item internal consistency was satisfactory for all dimensions, as the correlation level of each item with its scale achieved the 0.40 standard [50]. The item discriminant validity was also adequate [51], indicating that all items correlate more highly with the dimension in which they fit than with the other dimensions (0.18-0.52 for the first factor, 0.20-0.32 for the second factor and 0.17-0.31 for the third factor). Cronbach's alpha coefficient (which should exceed 0.7 [52]) was 0.92 for the first dimension, 0.95 for the second dimension and 0.89 for the third, indicating a high internal reliability for each factor. The discriminant validity was tested by comparing the mean dimension scores across the patient groups (age, gender and education). As in the literature [53][54][55], we found that older patients and those with lower education levels are more satisfied with respect to all dimensions; furthermore, gender does not have a significant effect on the satisfaction scores for any dimension.
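For readers who want to reproduce the internal reliability check on their own survey data, Cronbach's alpha can be computed directly from item responses. The sketch below assumes the usual formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the ratings are hypothetical toy values, not data from this study.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a tuple of item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores)),
    using sample variances throughout.
    """
    k = len(rows[0])                      # number of items in the scale
    items = list(zip(*rows))              # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)

# Hypothetical five-point ratings: 5 respondents x 3 items of one dimension
ratings = [(4, 5, 4), (3, 3, 2), (5, 5, 5), (2, 3, 2), (4, 4, 5)]

alpha = cronbach_alpha(ratings)
print(f"alpha = {alpha:.2f}")
print("meets the 0.7 threshold" if alpha > 0.7 else "below the 0.7 threshold")
```

Applied separately to the items of each extracted factor, this gives one coefficient per dimension that can be compared with the 0.7 threshold cited in the text.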

Group creation
After reducing the district service assessments to three macroelements, cluster analysis was used to investigate how they varied within the interview samples [56][57][58][59][60][61][62]. The results of the initial hierarchical cluster analysis, performed with Ward's method, indicated the presence of four groups. The final cluster centers were obtained using the K-means method. Statistical tests confirmed the robustness of the analysis. The high values of the F-test for each of the factors used in the analysis demonstrate that the differences in the means of the groups in each factor were statistically significant (p < .001). To validate the results, discriminant analysis was performed using the original composition of the different groups as a grouping variable. The discriminant functions were significant (p < .001) and support the existence of four different clusters. Based on the values of the three factors, the confusion matrix (Table 4) indicates the success of the prediction algorithm, confirming that the three discriminant functions correctly sorted 99.6% of the cases into the four groups. The reliability of the clusters obtained was also investigated by running the cluster analysis on a random sample of 50% of the cases and a second analysis on the remaining cases. For both samples, the composition of the clusters was the same.
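The share of correctly classified cases reported from a confusion matrix is the sum of its diagonal divided by the total number of cases. The sketch below shows that calculation on a hypothetical 4 x 4 matrix; the counts are invented for illustration and are not those of Table 4.

```python
def hit_rate(confusion):
    """Share of cases on the diagonal of a square confusion matrix.

    Rows are the original cluster memberships and columns are the groups
    predicted by the discriminant functions.
    """
    if any(len(row) != len(confusion) for row in confusion):
        raise ValueError("confusion matrix must be square")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical counts for four clusters (not the study's Table 4)
toy_matrix = [
    [120, 0, 0, 0],
    [0, 80, 1, 0],
    [0, 0, 60, 0],
    [0, 0, 0, 39],
]

print(f"correctly classified: {hit_rate(toy_matrix):.1%}")
```

A high hit rate here mainly shows that the clusters are well separated on the factor scores used to build them; the split-sample check described in the text is the more independent test of stability.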
For each segment, Table 5 provides the mean scores of the three factors in the various groups, the sociodemographic characteristics and the principal variables for access to and use of the outpatient services. Due to the high number of missing values, 4 variables related to patient experiences were not included in the analysis. With the exception of the sex variable, the values from the chi-square test indicated that the sociodemographic and behavioral differences between clusters were statistically significant. Therefore, identification of each group is potentially useful for developing policies aimed specifically at that group. Finally, the size of each group could also lead to better prioritization of decision-making processes when coping with allocation of scarce resources.

Segment 1: The unpretentious patients
This segment includes the highest number of interviewees who gave positive evaluations for all three factors and comprises the highest percentages of the elderly, those who only completed elementary school or had no qualifications, retired people, the chronically sick, and those living alone or, at most, with one other person. This segment utilizes outpatient clinics less frequently. In general, the users included in this segment appear to be fragile, with no clear ideas regarding available health care and appear to be incapable of turning their needs into demands. These characteristics lead to the hypothesis that the positive opinions of the considered services derive from the fact that the group members received better results than expected (hence, the name "unpretentious," given their low expectations). Due to their poorer health and social conditions compared with the other groups, this segment uses home visits more frequently to consult with their AHD and reports a high level of satisfaction with the AHD. The frequent use of AHDs explains the low use of the accident and emergency department (A&ED). Although not completely comparable due to differences in the methodological approach and the subject of analysis, this group's characteristics are similar to those of the "easy to please" cluster identified by Morrison et al. [43] in their segmentation analysis of GP services, which indicated that the subjects in this segment exhibit a laissez-faire attitude toward factors such as communication, relationships and empowerment and are quite satisfied with their GPs.

Segment 2: The informed and supported patients
The subjects in this segment gave positive evaluations of the outpatient clinic staff, but they were not satisfied with the services received. In addition, they were confused by the continuity of care. The health needs of this group, though less complex, do not differ greatly from those in the previous segment. In fact, when compared with the other groups, the percentage of chronically ill patients is high due to the high number of individuals over 50 years of age. This segment differs from the first because of their greater awareness of their health needs (due to a higher level of education and a higher number of qualified professionals) and the availability of a better network of health care facilitators (support from a larger family group and a deeper knowledge of the system, which leads to a greater use of GPs). Therefore, these patients utilize outpatient services more frequently than the previous group, hence the name "informed and supported." These patients tend to blame the system more than the workers for failing to deliver processes, which could explain the positive perception of the district staff. This group may be related to the "engagement needed" segment described by Morrison et al. [43], in which people are particularly interested in the caring qualities of the GPs yet have low health status.

Segment 3: The expert patients
This segment includes all patients who consider the staff the weak element in the services. Although these patients are not enthusiastic, they are generally satisfied with the services. However, they have a slightly negative opinion of the continuity of care, not unlike the second group. From a sociodemographic and epidemiological point of view, this segment does not differ significantly from the others and can be defined as an "evolution" of segment 2. The sociocultural level of this segment and its frequent use of the services indicate that these subjects are knowledgeable about the system (hence, the name "expert patients"). They expect the staff to solve problems and to improve the system. These patients are experts because they can self-manage their symptoms. With regard to continuity of care, the same considerations apply as for segment 2.
After interactions with the AHDs, these patients understand the nature of their problems, which confirms a high capacity to interpret their symptoms correctly.

Segment 4: The advanced patients
Although the previously identified cluster is rather limited in number, its members provided such distinctive scores that a fourth segment was required. These subjects provided positive evaluations for both the services and the staff of the outpatient clinics, but gave the continuity of care strongly negative evaluations. This group is characterized by a higher percentage of young users with the highest level of education and includes highly skilled professionals and technicians (twice as many as in the other groups). The proportion of retired persons is 50% lower than that in the other groups. Its patients are the least burdened by chronic diseases, and nearly 50% of them live in a family of more than three members. One can assume that these characteristics entail a good capacity to interpret health needs, a deep knowledge of the health system and the ability to independently identify the best offers available. As a result, this group is the greatest user of the district services, primarily responding to personal situations and needs (low compliance with letters of invitation from the clinics, but high use of GPs and high individual initiative). Given these characteristics and their significant experience with the previously mentioned services, this segment represents "advanced" patients. This group provides an extremely negative evaluation of the continuity of care, which corresponds to the hypotheses formulated for segments 2 and 3. Furthermore, this segment has common characteristics with the "generation X" and "service users" groups described by Morrison et al. [43] and Gabbott and Hogg [63], respectively. Both groups include young people with high socio-economic and health status. In the former, the subjects demonstrated preferences for quality communication and an aversion to GP advice regarding their treatment options. In the latter, individuals who are frequent users of general practice services are concerned with the overall health care experience, including empathy from the staff and the responsiveness of the service.

Conclusion: preliminary implications for a policy and research agenda
Some early conclusions can be drawn from the segmentation. Actions can be taken to address specific priorities and alter the driving forces for each segment and the direction and change for health organizations and managerial practices.
Better knowledge of the patient segments could be useful on three levels:
- First, this knowledge would aid in the design of more effective communication tools and relationship processes. Interactive web design provides an example. How should health organizations use the internet to respond to the expectations and capabilities of different segments? Access processes are another example. Should health organizations diversify channels of access to meet different patient profiles? For example, could some segments have direct access to secondary care, or should everything originate with the GPs?
- Second, strategies for empowering patients might differ. For example, some segments could have more control over their health budgets and could be targets for a policy of healthcare vouchers with more responsibility and the freedom to choose their own providers, thus making them more engaged in appraising their medical services.
- Third, segmentation could become a mechanism to address cultural issues and could provide a good opportunity to engage clinicians and health staff in reviewing their patient relationship practices. Do they recognize and pay attention to differences? Different segments might require different language, information, and individual approaches (paternalistic, autocratic, democratic, etc.). Conversely, the segments could be used to prompt patients to reflect on their attitudes toward health issues and clinicians. For example, patients could be asked to identify the segment to which they believe they belong and to discuss the implications with their GP.

Table 5 (final row and notes)
Further information on diagnosis/therapy proposed by AHD: 7.1, 8.7, 0.0, 6.5, 6.1
GP: General practitioner; PD: Pediatrician; AHD: After-hours doctor; A&ED: Accident & emergency department.
a: Percentages are based on responses. p < .001 for age, education, job, family situation, no. of visits in outpatient clinic in the last year, AHD consultation, method of AHD consultation, A&ED visit after AHD consultation, reason for A&ED visit after AHD consultation; p = .005 for those referred to outpatient clinic; p = .025 for chronic diseases.
*: Higher factor scores indicate that the respondents are more satisfied with the items in the factor or have rated the items in the factor more positively.
The results for the "unpretentious" patients could be used to prioritize preventive actions, such as the creation of medical records for chronic illnesses (hypertension, diabetes, etc.) to more accurately monitor the clinical evolution of more serious patients. This strategy should be reinforced, whenever possible, by specific incentives to aid the outpatient services staff (in collaboration with the primary health services) in implementing proactive attitudes for contacting and guiding patients who do not thrive and who can neither interpret the nature and dynamics of their pathology nor manage the necessary stages of their diagnostic/therapeutic procedures.

For the "informed and supported" patients, integrated medical records could be helpful but are not a priority because of the patients' greater awareness of their health needs and their better use of the healthcare network. In this case, the role of the GP should be stressed. These patients expect "customized" diagnostic/therapeutic paths or direction toward the best possible paths. The GPs should work as "mentors" and supervisors to patients who, given the proper health care procedures and "activated" by empowerment, could make more autonomous use of the services they need.

Quick access to information seems to be critical for "expert" patients. This group could benefit from more exhaustive and rapid information on specific services (including other providers who can offer these services) and ways to make the most appropriate use of this information. The process for determining the most adequate provider could be greatly simplified, and the patient would have greater responsibility for the correct use of all available resources. Information should be provided from a variety of sources (endnote j) because these patients often do not consider GPs a primary source of advice.

Finally, communication and marketing (or demarketing) initiatives are central for the "advanced" patients as a way to direct them toward a more appropriate use of general and specialist services. This segment appears to be independent in its decision-making and is expected, due to its members' relatively young age, to respond to informative materials and educational initiatives.

Limitations
Some limitations must be noted and should be addressed in future research. First, the questionnaire used in this study has proven reliable and valid; however, further tests are needed to assess the stability of our findings in other samples. Second, the profiles developed in this work are the result of an analysis conducted in one region of Italy; they therefore have some path dependency and are strongly influenced by the dominant cultural traits in Tuscany. Therefore, the segments are not universally valid, although we can expect some agreement with similar analyses conducted in other developed countries. However, the method is universally valid and can be used by managers and policy makers to investigate their own systems and to develop their own profiles. Third, a comparison of the mean age of the subjects sampled with the Tuscan population data revealed that, to a small extent, younger females and older adults were less likely to respond, while individuals aged 46-65 were more likely to respond [64] (see Additional file 2: Appendix 2). This result highlights a possible bias in the sampling that suggests caution when generalizing these findings to the wider community. The telephone directory sampling may yield biased samples (for example, with younger persons less likely to be listed in the sampling frame) [65], but it is still considered a reasonable approach because the selection bias is sufficiently small, particularly for health-related variables [66]. In future studies, the potential for this bias must be addressed by expending additional resources on the recruitment of subjects.

Endnotes
a For example, a national or public health system, an integrated delivery system, and insurance or other third-party payer.
b As Snyderman and Williams stated, "The ability to identify those individuals most at risk for developing chronic diseases and to provide a customized means to prevent or slow that progression are emerging competencies and provide the foundation for prospective care" [37].
c If we consider that the most recent data indicate that almost half of all US citizens live with a chronic condition [33] and the rate of increase is estimated at more than one percent per year by 2030, resulting in an estimated chronically ill population of 171 million, we can understand the urgent need for information and tools for improved population-based medicine. Almost half of all patients suffering chronic illness have multiple conditions, and their treatment suffers from deficiencies such as the following: hurried practitioners who do not follow established guidelines; a lack of care coordination; a lack of active follow-up to ensure the best outcomes; and patients who are inadequately trained to manage their illnesses.
d "Concordance" refers to the explicit participation of the patient in the decision-making process. We do not refer to the definition of concordance as the similarity, or shared identity, between the physician and patient based on demographic attributes, such as race, sex, or age [16].
e LHAs are integrated delivery systems, or "umbrella" organizations that manage the entire spectrum of services and levels of care. LHAs might involve different combinations, including community services with hospitals, home care schemes, rehabilitation facilities, nursing homes, mental health centers, etc. More specifically, LHAs regroup facilities providing care at different levels: prevention and environmental health services, primary care (GPs), secondary care (outpatient services), tertiary care (general or community hospitals), quaternary care (academic medical centers and specialty hospitals), rehabilitation (nursing homes, rehabilitation centers), and long-term care (long-stay inpatient centers, home care units). In short, an LHA provides or aims to provide a coordinated continuum of services to a defined population and is willing to be held clinically and fiscally accountable for the outcomes and the health status of the populations served.
f The Tuscan Regional Health System serves a population of roughly 3.7 million and is organized into 12 LHAs and 4 independent teaching hospitals. The LHAs are accountable for the residents of a provincial geographical area and are sub-organized into health districts run by managers responsible for planning and governing the delivery network of primary care and continuity of care.
g Participants who had not used outpatient services in the previous 12 months were only asked to answer questions regarding communication initiatives and sociodemographics.
h One of the most widespread hierarchical clustering methods is Ward's method [56,57], which attempts to generate clusters that minimize the within-cluster variance. Starting with t clusters, each containing one object, Ward's method at each step combines the two clusters whose merger results in the smallest increase in the sum-of-squares index (or variance), and it repeats the process until a single cluster containing all the objects remains. At each stage, the mean of each cluster, that is, the average of the variable values for the objects in the given cluster, is first calculated. Then, the sum of the squared differences between each object in a given cluster and its cluster mean is computed [58].
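The sum-of-squares index described above can be written compactly. The notation below is an assumption introduced for illustration and does not appear in the cited sources: C_k is the k-th cluster, n_k its number of objects, x_i the vector of variable values for object i, and \bar{x}_k the cluster mean. The second expression is the standard closed form for the increase in the index when two clusters A and B are merged, which is the quantity Ward's method minimizes at each step.

```latex
% Within-cluster sum of squares over K clusters
\mathrm{ESS} = \sum_{k=1}^{K} \sum_{i \in C_k} \lVert x_i - \bar{x}_k \rVert^2,
\qquad
\bar{x}_k = \frac{1}{n_k} \sum_{i \in C_k} x_i

% Increase in ESS caused by merging clusters A and B
\Delta(A, B) = \frac{n_A \, n_B}{n_A + n_B} \, \lVert \bar{x}_A - \bar{x}_B \rVert^2
```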
The method has been shown to perform better than the other hierarchical procedures [59] because it tends to produce robust, dense, spherical clusters with distinct characteristics [60]. However, the solutions it provides tend to be distorted by outliers [59] and are poorer than those of K-means non-hierarchical partitioning [61,62] when a nonrandom starting point is specified for the latter [47]. Hence, a two-stage clustering approach has been suggested. In the first step, a hierarchical method should determine a candidate number of clusters and a starting point for the iterative partitioning analysis, and it should identify outliers that may be eliminated from further analysis. Then, a non-hierarchical approach should be performed to refine the clusters [47].
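The two-stage approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure or software used in the study: the function name `two_stage_clustering`, the use of SciPy, the synthetic two-dimensional "factor scores", and the choice of four clusters are all hypothetical, and the outlier-screening part of the first stage is omitted for brevity.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2


def two_stage_clustering(scores, n_clusters):
    """Ward's hierarchical clustering followed by K-means refinement.

    `scores` is an (objects x variables) array, e.g. factor scores.
    Returns the Ward labels, the refined K-means labels, and the
    final centroids. Hypothetical helper written for illustration.
    """
    scores = np.asarray(scores, dtype=float)

    # Stage 1: build the Ward merge tree and cut it into n_clusters groups.
    tree = linkage(scores, method="ward")
    ward_labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # The Ward cluster means act as the nonrandom starting points.
    seeds = np.vstack(
        [scores[ward_labels == k].mean(axis=0) for k in np.unique(ward_labels)]
    )

    # Stage 2: K-means started from the Ward centroids refines the partition.
    centroids, kmeans_labels = kmeans2(scores, seeds, minit="matrix", iter=50)
    return ward_labels, kmeans_labels, centroids


# Synthetic data (assumption): four well-separated groups of 25 objects each.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
demo_scores = np.vstack([c + rng.normal(scale=0.5, size=(25, 2)) for c in centers])

ward_labels, kmeans_labels, centroids = two_stage_clustering(demo_scores, 4)
print("cluster sizes after refinement:", np.bincount(kmeans_labels))
```

In practice, the number of clusters passed to the second stage would be chosen by inspecting the hierarchical solution (for example, the dendrogram) rather than fixed in advance as it is here.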
i Such clustering procedures yield solutions for discrete optimization problems, as opposed to model-based clustering methods that posit an underlying statistical model, producing for each object a probability of membership in each group. In general, insufficient evidence exists to recommend model-based over more deterministic methods for clustering applications [67].
j A survey conducted in the US [68] analyzed a variety of sources for health-related information used by patients: 64% of those sampled consider a GP or a specialist doctor the primary source; 54% use the family network; 47% use specialized web sites; 32% use specific mailing lists; and 26% use media programs (especially television).