A network perspective on patient experiences and health status: the Medical Expenditure Panel Survey 2004 to 2011
BMC Health Services Research volume 17, Article number: 579 (2017)
There is a growing emphasis on the need to engage patients in order to improve the quality of health care and improve health outcomes. However, we still lack a comprehensive understanding of how different measures of patient experience interact with one another or relate to health status. This study takes a network perspective to 1) study the associations between patient characteristics and patient experience in health care and 2) identify factors that could be prioritized to improve health status.
This study uses data from the two-year panels of the Medical Expenditure Panel Survey (MEPS) initiated between 2004 and 2011 in the United States. The 88 variables regarding patient health and experience with health care were identified through the MEPS documentation. Sex, age, race/ethnicity, and years of education were also included in the analysis. The bnlearn package within R (v3.2.0) was used to 1) identify the structure of the network of variables, 2) assess the model fit of candidate algorithms, 3) cross-validate the network, and 4) fit conditional probabilities with the given structure.
There were 51,023 MEPS interviewees aged 18 to 85 years (mean = 44, 95% CI = 43.9 to 44.2), with years of education ranging from 1 to 19 (mean = 7.4, 95% CI = 7.40 to 7.46). Overall, 55% were female and 74% were white. Nine networks were identified, and 17 variables were not linked to any others, including death in the second years, sex, entry years to the MEPS, and relations of proxies. The health status in the second years was directly linked to that in the first years. The health care ratings were associated with how often professionals listened to patients and whether professionals’ explanations were understandable.
It is feasible to construct Bayesian networks with information on patient characteristics and experiences in health care. Network models help to identify significant predictors of health care quality ratings. With temporal relationships established, the structure of the variables can be meaningful for health policy researchers, who search for one or a few key priorities to initiate interventions or health care quality improvement programs.
Engaging patients is an important element of healthcare, as it improves patient experience and thus leads to improved health outcomes. For example, experiences in the timeliness and perceived quality of health care and communication with physicians are measured with the Consumer Assessment of Health Plans Study (CAHPS) questionnaire. The perceived quality of primary care has been linked to outcomes, such as emergency room visits and other health care use. The recent focus on patient experience and patient-oriented practice, especially engagement, aims to improve health care quality and patient health [5, 6]. However, there are several problems in studying the relationship between patient experience and health outcomes. One issue is that patient experiences may be reduced to a single dimension. This may oversimplify the complexity of patient experiences in health care and overlook opportunities to improve healthcare quality and patient health. Conversely, there is no common measure or universally agreed-upon definition of patient engagement or patient satisfaction, each an important aspect of patient experience. In fact, various experience measures and definitions have been proposed to identify important components or priorities that can be adapted to improve patient centeredness or engage patients effectively. Only a few of them, including the CAHPS, show promising results in validity or reliability. Furthermore, the input of all types of health professionals may not be properly measured in questionnaires. For example, communication with providers other than doctors may not be considered when assessing patient experience. This can underestimate the contribution and effectiveness of patient-provider communication in improving patient health.
Occasionally, the objectives of improving patient experience and subsequently health outcomes may be in conflict. One reason is that the priorities identified via various methods may not be compatible with each other. There is not enough evidence to help us understand whether there are pathways or networks that link individual characteristics or experiences to better outcomes in health care. Whether multiplicative benefits can be achieved through changes in upstream policies or interventions remains unclear [11, 12]. This problem is aggravated by the fact that few studies have sufficient samples to systematically identify the key factors for improving patient experience.
To address these problems, we adopt a network approach to examine all possible inter-dependence of these measures of patient experience and health outcomes in a large population, while also taking into account individual characteristics. This can help us to identify potential intervention priorities as well as avoid incurring undesirable or adverse interactions between them. This study aims to 1) construct a Bayesian network model with individual characteristics and commonly used measures of patient experience, especially quality of care in the CAHPS, and 2) illustrate the relationships between patient experience and health outcomes through graphics and probability distributions.
This study uses data from the Medical Expenditure Panel Survey (MEPS), which was implemented with a focus on both self-perceived health status and patient experience with health care. MEPS staff have interviewed non-institutionalized civilians in the United States since 1996. The MEPS provides a nationally representative sample with an oversample of blacks and Hispanics. Interviewees were followed for 2 years in each MEPS panel.
The questions about patient-perceived health and experience in health care were asked once per year, or twice in the two-year panels. The questionnaires were modified and new variables were added over time. To ensure the consistency of the variables across the MEPS panels, only data from two-year panels initiated between 2004 and 2011 were used. Because of the lack of adequate tools to adjust for complex survey design in graphical models, all statistics in this study were unweighted and not nationally representative. The flowchart of data processing and analysis is in Fig. 1.
Inclusion and exclusion criteria
This study included those aged 18 years and over. Only those who made an appointment with a doctor or clinic for health care were included. Those with missing data in the following sections were not included: the self-administered questionnaire (SAQ), which contained questions on patient experience of care (Consumer Assessment of Healthcare Providers and Systems, CAHPS), and three types of self-reported physical or mental health status: 1) the Short-Form 12 Version 2 (SF-12v2), along with its Physical Component Summary (PCS) and Mental Component Summary (MCS); 2) the Kessler Index (K6) of non-specific psychological distress; and 3) the Patient Health Questionnaire (PHQ-2). Missing data were defined as the following answer categories: no data in round, refused, don’t know, or not ascertained. If subjects were not eligible for specific questions, their replies were coded as not applicable (see Additional file 1: Appendix 1 for proportions of ineligibility). For example, interviewees were asked whether they had experienced an illness or injury that had required immediate care. This question might not apply to all surveyed individuals.
The variables regarding patient experience, especially health care ratings, and patient-reported outcomes were selected and categorized according to the MEPS documentation (see Additional file 1: Appendix 1 for details). The patient-reported rating of health care, ranging from zero to 10, was provided by those who had visited health care professionals (variable names: adhecr2 and adhecr4 for the first and second years, respectively). Individual health status, measured by the SF-12v2 (adgenh2 and adgenh4 for the first and second years, respectively), was reported in five categories: poor, fair, good, very good, and excellent. The labels for variables imported from the MEPS documentation are also listed in Table 1. Moreover, sex, age, race/ethnicity, region (Northeast, Midwest, South, and West regions of the United States), and years of education were also included in the network analysis.
There were five rounds of data collection in each two-year panel. The SAQ was administered during the second and fourth rounds, which took place approximately in the middle of each year. Therefore, each outcome or experience measure was numbered with one or two to represent first- and second-year information.
Bayesian networks consist of nodes and arcs that represent variables of interest and the relationships between them. The relationships between variables are described as conditional dependencies and tested with Chi-square tests or other score-based methods. Because they offer visual presentation and network structures that are well suited to describing interactions between variables, Bayesian networks have been used in medical, biological, and social research to study the conditional dependencies between random variables.
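As an illustration of the kind of dependence test mentioned above, the following minimal sketch computes a Pearson Chi-square statistic for two categorical variables. It is not the bnlearn implementation used in this study; the function name `chi_square_statistic` and the toy responses are hypothetical and are there only to show the arithmetic.

```python
from collections import Counter


def chi_square_statistic(xs, ys):
    """Return the Pearson chi-square statistic and degrees of freedom
    for two categorical variables observed on the same individuals."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    statistic = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            observed = joint.get((r, c), 0)
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical toy responses, not MEPS data.
listened = ["always"] * 4 + ["sometimes"] * 4
rating = ["high", "high", "high", "low", "high", "low", "low", "low"]
stat, dof = chi_square_statistic(listened, rating)
```

A larger statistic relative to its degrees of freedom indicates stronger evidence against independence; converting it to a p-value requires the Chi-square distribution, which is omitted here.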
Layers of variables
The directions of arcs between variables were constrained by layers. Higher-layer variables were allowed to be directed to variables of other layers. However, variables of other layers were not permitted to be directed to higher-layer variables. The highest layer was layer one, which included the following variables: year of entry to the MEPS panels, race, age in years, region, education, and sex. The second-layer variables were income, health status, and experience in health care in the first years of the panels. The third-layer variables were deaths in the second years, and the health status and experience in health care in the second years of the panels (see Additional file 1: Appendix 1). In other words, some of the directions between variables were blacklisted in the network modeling.
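The layering rule can be sketched as a list of forbidden arcs. The example below is a minimal illustration under the assumption that only arcs pointing from a lower layer to a higher layer are forbidden, while arcs within a layer remain allowed. The function `layer_blacklist` and the shortened variable names are hypothetical and do not correspond to the MEPS variable names.

```python
from itertools import product


def layer_blacklist(layers):
    """Return the arcs (from_node, to_node) that are forbidden.

    `layers` is ordered from the highest layer (layer one) downward.
    An arc is forbidden when it points from a lower layer to a higher one.
    """
    blacklist = []
    for i, upper in enumerate(layers):
        for lower in layers[i + 1:]:
            blacklist.extend(product(lower, upper))
    return blacklist


# Hypothetical, shortened version of the three layers described above.
layers = [
    ["sex", "age", "education"],        # layer one
    ["income", "health_status_y1"],     # layer two
    ["health_status_y2", "death_y2"],   # layer three
]
blacklist = layer_blacklist(layers)
```

With this rule, an arc from first-year health status to second-year health status is allowed, while the reverse arc is blacklisted.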
Model development process
The network modeling followed a previously published development process for reviewing and revising models. The initial Bayesian network models were developed after data cleaning and missing data assessment. The following steps were taken to finalize the model. First, data-driven models were built and assessed for adequacy according to expert opinions by all authors. This led to some adjustments in the variables to be specified in different layers. The performance of different algorithms was also compared. Second, the expert panel made decisions about 1) the retention of essential variables for further model selection, 2) the identification of important links between variables, and 3) the validation of the conditional probability distributions based on prior knowledge from research on patient experience and engagement. After discussion and model re-specification, the Bayesian network models were rerun to obtain stable network structures based on 10-fold cross-validation.
The temporal relationships between first-year and second-year variables in the MEPS panels were considered in the model development process. In some complex time series studies based on network modeling, the relationships among variables at different time points were assumed to be similar. For example, the relationships between insulin adjustment and other related variables were assumed to be similar across different time points. However, we considered that there was limited evidence to justify imposing similar network structures on the first-year and second-year variables, given that this is one of the first studies using Bayesian network modeling with patient experience data.
Bayesian network implementation
The bnlearn package, available within the R environment (v3.2.0, released in April 2015), was used to 1) apply several of the best heuristic algorithms, including Max-Min Hill Climbing, which obtained the best scores in network modeling with the MEPS data (see Additional file 2: Appendix 2 for the scores), 2) verify the stability and strength of network arcs by averaging 200 bootstrapped networks, 3) query the conditional probability distributions in the finalized network, and 4) illustrate the final networks with visualization tools. If the Bayesian network models were found to be inadequate by expert opinion at any step of the development process, these procedures were rerun to obtain finalized network models.
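The idea of averaging bootstrapped networks can be sketched as follows. This is a simplified illustration and not the bnlearn routine used in this study: it assumes that each bootstrap replicate has already produced a list of directed arcs, and the 0.5 inclusion threshold, the function names, and the toy arcs are all hypothetical.

```python
from collections import Counter


def arc_strengths(networks):
    """Fraction of bootstrapped networks in which each directed arc appears."""
    counts = Counter(arc for arcs in networks for arc in set(arcs))
    return {arc: count / len(networks) for arc, count in counts.items()}


def averaged_network(networks, threshold=0.5):
    """Keep arcs whose strength reaches the (assumed) threshold."""
    strengths = arc_strengths(networks)
    return sorted(arc for arc, s in strengths.items() if s >= threshold)


# Four hypothetical bootstrap replicates; the study averaged 200.
boot = [
    [("listen_y1", "rating_y1"), ("health_y1", "health_y2")],
    [("listen_y1", "rating_y1"), ("health_y1", "health_y2")],
    [("listen_y1", "rating_y1")],
    [("health_y1", "health_y2"), ("age", "rating_y1")],
]
strengths = arc_strengths(boot)
kept = averaged_network(boot, threshold=0.5)
```

Arcs that appear in most replicates are retained as stable, while arcs that appear only occasionally are dropped from the averaged network.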
Correlations between variables and cross-group comparisons
In addition to Bayesian network modeling, the associations between variables were also assessed with Spearman’s correlation coefficients. The differences in continuous and categorical variables across countries or parent variables were also tested with Student’s t-tests and Chi-square tests, respectively. The significance level was 0.05 (two-tailed).
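For readers unfamiliar with Spearman’s correlation, the coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below illustrates this definition only; the helper names are hypothetical, and it does not compute the p-value of the test.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))
```

Because it depends only on ranks, the coefficient suits ordinal answers such as the five-category health status or the zero-to-10 health care rating.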
The demographic characteristics of the MEPS participants are listed in Table 2. Between 2004 and 2011, there were 51,023 MEPS participants aged 18 to 85 years (mean = 44.1, 95% CI = 43.9 to 44.2). The years of education ranged from 1 to 18 years (mean = 7.4, 95% CI = 7.40 to 7.46). The proportion of female participants was 55% and did not change significantly across MEPS panels. The majority of those sampled (76%) were white, and the largest regional group (38%) was from the South.
Nine networks were identified, and 17 variables were not linked to any others, including death in the second years, sex, entry years to the MEPS, and relations of proxies (see Additional file 1: Appendix 1 for details; see Additional file 3: Appendix 3, Additional file 4: Appendix 4 and Additional file 5: Appendix 5 for all networks). Variables of different categories tended to group in different networks. The largest network contained 42 variables, of which 22 were measures of health status from the SF-12v2 and 12 were measures of non-specific psychological distress (Fig. 2a). The second largest network consisted of ten CAHPS variables (Fig. 2b). The third and fourth largest networks each had seven variables: one was related to the interactions between different types of attitudes toward health and the other was related to health care needs and appointment making (Fig. 2c and d, respectively). The other networks contained three or fewer variables (see Additional file 3: Appendix 3, Additional file 4: Appendix 4 and Additional file 5: Appendix 5 for all networks).
Patient experience: rating of healthcare
The healthcare ratings in the first years of the MEPS panels were directly linked to whether health professionals listened to patients (marked by an arrow, p. 6 in Additional file 3: Appendix 3). Healthcare was rated higher when professionals listened to the patients more frequently. The same figure indicated that patients found professionals more understandable when the professionals listened to them more frequently.
In contrast, the patient-reported healthcare ratings in the second years were linked to whether the health professionals explained patients’ conditions in a way that they understood (the arc marked by an arrow, Fig. 2b and p. 2 in Additional file 3: Appendix 3). The probability distributions of the healthcare ratings are shown in Fig. 3. The more frequently the health providers explained things in a way that was easy to understand, the more likely the patients were to rate health care higher.
The health status in the first and second years was directly linked, as shown by the arc marked with an arrow in Fig. 2a. The probability distributions of general health status in the first and second years are shown in Fig. 4. More than 47% of the individuals remained in the same category of health status throughout the two-year panels. Two variables were linked to health status in the first years: how often individuals felt everything was an effort and whether health status limited moderate activities. How often individuals felt everything was an effort is a question that assesses non-specific psychological distress. If patients felt everything was an effort more often, or reported more limitation on moderate activities due to health, they were more likely to report a worse health status in the first year. The probability distribution of the health status in the second years was related to health status in the first year and whether health limited climbing stairs in the second years.
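The conditional probability distributions referred to here can be illustrated with a small sketch that estimates the distribution of second-year health status given first-year health status from paired observations. The function names and the eight toy records are hypothetical; they are not MEPS data and do not reproduce the fitted network.

```python
from collections import Counter, defaultdict


def conditional_distribution(parent, child):
    """Estimate P(child | parent) from paired categorical observations."""
    pair_counts = Counter(zip(parent, child))
    parent_counts = Counter(parent)
    table = defaultdict(dict)
    for (p, c), count in pair_counts.items():
        table[p][c] = count / parent_counts[p]
    return dict(table)


def share_unchanged(first, second):
    """Proportion of individuals reporting the same category in both years."""
    return sum(a == b for a, b in zip(first, second)) / len(first)


# Hypothetical toy records of health status in years one and two.
year1 = ["good", "good", "good", "good", "fair", "fair",
         "excellent", "excellent"]
year2 = ["good", "good", "fair", "very good", "fair", "good",
         "excellent", "very good"]
cpt = conditional_distribution(year1, year2)
unchanged = share_unchanged(year1, year2)
```

Each row of `cpt` sums to one, and `unchanged` is the toy-data counterpart of the share of individuals who stayed in the same category across the two years.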
Connection between first-year and second-year variables
There were limited connections between the first-year and second-year variables. In addition to the link between health status in the first and second years, there were ten other arcs linking first-year and second-year variables. First, if individuals had a lot of energy for a majority of the time during the first year, they tended to feel the same in the second year. Second, if patients were able to make medical appointments when desired in the first year, they were more likely to report having their blood pressure checked by health professionals in the second year. Third, the degree to which pain limited normal work in the first year was related to the same variable in the second years. Fourth, the more frequently health professionals showed respect to patients in the first year, the more likely patients were to report being shown respect in the second years. Another three arcs involved the number of visits to medical offices for care in the second years. The last three were the linkages between attitudes about health. Patients’ attitudes about whether they needed insurance, whether insurance was worth its cost, and whether they could overcome illness without medical help were consistent in the first and second years.
This study shows that network modeling is both feasible and useful for further policy or academic research. The measures of patient experience and physical or mental health are interconnected across time. The results not only show the complexity of patients’ interactions with healthcare systems, but also point to possible approaches to navigating the intricacies of these interactions. The first important finding is that measures of patient experience and health status are interconnected, but only to a certain degree. The largest network consists of 42 variables that are predominantly dimensions of mental and physical health measured by the SF-12v2, non-specific psychological distress, and the Kessler scale, along with three measures of patient experience and two general health questions. In this network, self-rated health status in the first and second years is linked. The temporal association of health status across different time points is supported by previous studies.
Second, 13 patient experience variables measured by the CAHPS are included in two separate networks, while seven others are included in two other networks that include measures of health status and SF-12v2 functional status. The association between health care rating and understandable explanation by providers can be found in the first or second years. The networks in this study show that age, education, or income may not have extensive connections with patient experience in health care if conditional dependencies are taken into account. This differs from previous studies that use regression models and show associations between patient experience and individual characteristics, especially age and sex [7, 27]. Other researchers find that the degree to which patients engage in their care could be explained predominantly by income, with race/ethnicity playing a lesser role. Differences in the results from various methods are expected, and the network perspective shows that the inter-dependencies between patient experience measures may need to be considered in regression models as well.
Third, there seem to be key arcs that link health status and patient engagement across time. The first- and second-year variables of health status, how often providers show respect to what patients have to say, whether pain limits normal work, and attitudes about insurance in the first and second year, are connected. The other measures are not well connected across time.
In addition, this study highlights some of the undervalued associations and opportunities to improve both patient experience and health status. For example, there are extensive interactions between functional status and psychological distress. The importance of mental status has been well demonstrated, and the findings suggest that some measures of psychological distress may be more important than others. For example, feeling that everything is an effort is directly linked to self-reported health status.
Strengths and limitations
This network approach is useful for handling a large number of variables with or without prior knowledge of the interactions or interdependencies between them. The visual presentation is appealing to audiences interested in exploring interactions between variables or measures of patient experience.
Despite the large sample size and standardized questionnaires used in the MEPS, there are several limitations in the newly identified networks. First, the MEPS is designed to produce nationally representative statistics for civilians in the United States through adjustment for the survey design. However, it is not feasible to account for the survey design in the Bayesian network models, and certain population groups may be overrepresented. This can limit the generalizability of the results. Second, the research tools, Bayesian networks and graphical models, may not be widely known to health policymakers or researchers, who are more familiar with regression models that summarize the significance and associations of all predictors with a single outcome.
Third, one inherent difficulty in health care quality research with observational data is that potential interventions are not randomly assigned and healthcare is rated only by those with any exposure to health care systems. It is unclear how the causes of health care consumption, chronic or acute, relate to or influence the ratings of health care, whether through self-selection, lack of insurance coverage, or other mechanisms. Fourth, the purpose of the MEPS is to understand the health status of populations on a yearly basis. The measurement of patient experiences over the past year may be subject to recall bias. The CAHPS questionnaire may be widely used, but it is not specific to events such as specialty visits or hospitalization. Fifth, there was no information about whether patients switched health providers, which might be important to patients’ experience in health care. Lastly, causal relationships cannot be established with cross-sectional data. They remain to be studied and analyzed with trials or interventions in the future.
These findings are important for the planning of future research. First, the identified networks are meaningful for health policy researchers, who search for one or a few priorities to design and initiate interventions on patient experience in health care in order to improve health status and health care quality. For example, there are several variables linked to more than three measures of patient experience and could serve as intervention priorities, such as making doctors’ explanations understandable or improving appointment-making procedures for routine care.
Second, the network also contrasts two distinct approaches to improving patient experience and health status. The first is to use immediate parent variables in the network as the targets of intervention. The other is the trickle-down approach, which focuses more on upstream factors that may not have an immediate influence on patient experience or health. This approach may be of interest to policy makers who aim to improve overall well-being broadly, through one of the beginning variables in the network, such as whether health professionals spend enough time with patients.
Bayesian network modeling is feasible with health experience data. A network perspective highlights the interactions between the measures of patient experience in health care. This can be used to identify potential priorities for interventions that aim to improve health status or experience in health care. Researchers can evaluate potential interventions by using this model to identify immediate parent variables or distinct upstream factors. The effectiveness of these two different approaches will require further testing.
CAHPS: Consumer Assessment of Healthcare Providers and Systems
MCS: Mental Component Summary
MEPS: Medical Expenditure Panel Survey
PCS: Physical Component Summary
PHQ: Patient Health Questionnaire
SF-12v2: Short-Form 12 Version 2
Clancy CM. Patient Engagement in Health Care. Health Serv Res. 2011;46(2):389–93.
Hays RD, et al. Psychometric Properties of the CAHPS™ 1.0 Survey Measures. Med Care. 1999;37(3 Suppl):MS22–31.
Brousseau DC, et al. Quality of Primary Care and Subsequent Pediatric Emergency Department Utilization. Pediatrics. 2007;119(6):1131.
Raphael JL, et al. Associations between quality of primary care and health care use among children with special health care needs. Arch Pediatr Adolesc Med. 2011;165(5):399–404.
Boivin A, et al. What are the key ingredients for effective public involvement in health care improvement and policy decisions? A randomized trial process evaluation. Milbank Q. 2014;92(2):319–50.
Boivin A, et al. Involving patients in setting priorities for healthcare improvement: a cluster randomized trial. Implement Sci. 2014;9(24):24.
Manary MP, et al. The Patient Experience and Health Outcomes. N Engl J Med. 2012;368(3):201–3.
Simmons LA, et al. Patient engagement as a risk factor in personalized health care: a systematic review of the literature on chronic disease. Genome Med. 2014;6(2):16.
Phillips NM, Street M, Haesler E. A systematic review of reliable and valid tools for the measurement of patient participation in healthcare. BMJ Qual Saf. 2015:2015–004357.
Herrin J, et al. Patient and family engagement: a survey of US hospital practices. BMJ Qual Saf. 2015:2015–004006.
McKinlay JB, Marceau LD. Upstream healthy public policy: lessons from the battle of tobacco. Int J Health Serv. 2000;30(1):49–69.
Williams DR, et al. Moving Upstream: How Interventions that Address the Social Determinants of Health can Improve Health and Reduce Disparities. J Public Health Manag Pract. 2008;14(Suppl):S8–17.
Cohen SB, Cohen JW. The capacity of the Medical Expenditure Panel Survey to inform the Affordable Care Act. Inquiry. 2013;50(2):124–34.
Cohen JW, et al. The Medical Expenditure Panel Survey: a national health information resource. Inquiry. 1996;33(4):373–89.
Agency for Healthcare Research and Quality. MEPS HC-147 2011 Full Year Consolidated Data File. Rockville, MD: Agency for Healthcare Research and Quality; 2013. p. C-29.
Scutari M, Strimmer K. Introduction to Graphical Modelling, in Handbook of Statistical Systems Biology. Hoboken: Wiley; 2011.
Scutari M. Learning Bayesian Networks with the bnlearn R Package. J Stat Softw. 2010;35(3):1–22.
Gevaert O, et al. Predicting the prognosis of breast cancer by integrating clinical and microarray data with Bayesian networks. Bioinformatics. 2006;22(14):e184–90.
Jansen R, et al. A Bayesian Networks Approach for Predicting Protein-Protein Interactions from Genomic Data. Science. 2003;302(5644):449–53.
Conati C, et al. On-Line Student Modeling for Coached Problem Solving Using Bayesian Networks. In: Jameson A, Paris C, Tasso C, editors. User Modeling. Vienna: Springer; 1997. p. 231–42.
Sambo F, et al. A Bayesian Network analysis of the probabilistic relations between risk factors in the predisposition to type 2 diabetes. Conf Proc IEEE Eng Med Biol Soc. 2015;2015:2119–22.
Constantinou AC, et al. From complex questionnaire and interviewing data to intelligent Bayesian network models for medical decision support. Artif Intell Med. 2016;67:75–93.
Fuster-Parra P, et al. Bayesian network modeling: A case study of an epidemiologic system analysis of cardiovascular risk. Comput Methods Prog Biomed. 2016;126:128–42.
Andreassen S, et al. A Model-Based Approach to Insulin Adjustment. In: Stefanelli M, et al., editors. AIME 91: Proceedings of the Third Conference on Artificial Intelligence in Medicine, Maastricht, June 24–27, 1991. Berlin, Heidelberg: Springer Berlin Heidelberg; 1991. p. 239–48.
Nagarajan R, Scutari M, Lèbre S. Bayesian Networks in R: with Applications in Systems Biology. Use R! 2013. New York: Springer.
Bailis DS, Segall A, Chipperfield JG. Two views of self-rated general health status. Soc Sci Med. 2003;56(2):203–17.
Osborn R, Squires D. International perspectives on patient engagement: results from the 2011 Commonwealth Fund Survey. J Ambul Care Manage. 2012;35(2):118–28.
Cox ED, et al. Influence of Race and Socioeconomic Status on Engagement in Pediatric Primary Care. Patient Educ Couns. 2012;87(3):319–26.
World Health Organization. The World Health Report 2001: Mental Health: New Understanding, New Hope. Geneva, Switzerland: World Health Organization; 2001.
Pei B, Shin DG. Reconstruction of biological networks by incorporating prior knowledge into Bayesian network models. J Comput Biol. 2012;19(12):1324–34.
Shin J, Moon S. HMO plans, self-selection and utilization of health care services. Appl Econ. 2007;39(21):2769–84.
O'Connor SJ. Listening to patients: the best way to improve the quality of cancer care and survivorship. Eur J Cancer Care. 2011;20(2):141–3.
YSC is financed by the Fonds de recherche du Québec – Santé (FRQS) fellowship.
Availability of data and materials
All data sets can be freely accessed via the Agency for Healthcare Research and Quality website (https://meps.ahrq.gov/data_stats/download_data_files.jsp).
Not applicable. The MEPS data are publicly available and there is no patient consent form available for download.
Consent to participate
Ethics approval and consent to participate
This secondary data analysis study was approved by the ethics committee of the Centre hospitalier de l’Université de Montréal (number: 2016–6095).
The authors declare that they have no competing interests.
Appendix 1. Characteristics of the included variables from the Medical Expenditure Panel Survey. The characteristics of the variables included for analysis. (XLSX 14 kb)
Appendix 2. Scores of the Bayesian network algorithms. The scores of the Bayesian network algorithms, used to assess model fit and select the algorithm for analysis. (XLSX 7 kb)
Appendix 3. Networks of the measures of patient experiences and health status with the short names of the Medical Expenditure Panel Survey variables in the nodes. The results of the Bayesian network modeling with all connected networks. The short names of the variables are labelled in the nodes. Corresponding variable names can be found in Additional file 1: Appendix 1. (TIFF 36363 kb)
Appendix 4. Networks of the measures of patient experiences and health status with the short names of the Medical Expenditure Panel Survey variables in the nodes. The results of the Bayesian network modeling with all connected networks. The original variables are labelled in the nodes. Corresponding variable names can be found in Additional file 1: Appendix 1. (PDF 72 kb)
Appendix 5. Networks of the measures of patient experiences and health status with the short names of the Medical Expenditure Panel Survey variables in the nodes. The results of the Bayesian network modeling with all connected networks. The names of the variables are labelled in the nodes. Corresponding variable names can be found in Additional file 1: Appendix 1. (PDF 76 kb)
Cite this article
Chao, YS., Wu, Ht., Scutari, M. et al. A network perspective on patient experiences and health status: the Medical Expenditure Panel Survey 2004 to 2011. BMC Health Serv Res 17, 579 (2017). https://doi.org/10.1186/s12913-017-2496-5