The impact of payer status on hospital admissions: evidence from an academic medical center

Background: Many studies have investigated payer-status disparities in access to care. However, most are either disease-specific or cohort-specific, and large controlled studies that quantify the disparity at the facility level are rare. This study examines how payer status affects patient hospitalization from the perspective of a facility.
Methods: We extracted all patients with a visit record at a medical center between 5/1/2009 and 4/30/2014, and linked each patient's outpatient and inpatient records from the three years before the target admission time. We conducted a retrospective observational study using conditional logistic regression. To control for illness severity across patients with different diseases when training the model, we constructed a three-dimensional variable using data stratification. The model was validated on a dataset distinct from the one used for training.
Results: Patients who are covered by private insurance or uninsured are less likely to be hospitalized than patients insured by government. For uninsured patients, inequity in access to hospitalization is observed. The standardized coefficients indicate that government-sponsored insurance has the greatest impact on improving patients' hospitalization.
Conclusion: Attention is needed on improving access to care for uninsured patients. Basic preventive care services should also be enhanced, especially for people insured by government. The findings can serve as a baseline against which to measure the anticipated effect of measures to reduce payer-status disparities in hospitalization.
Supplementary Information: The online version contains supplementary material available at (10.1186/s12913-021-06886-3).


Background
Based on their primary insurance payers, patients can be classified into five groups: Medicare, Medicaid, Commercial, Uninsured, and Other. Medicare and Medicaid are government-administered programs, whereas commercial insurance is provided by private insurers. Medicare is available to patients aged 65 or older, younger people with disabilities, and dialysis patients; Medicaid is available to low-income individuals and can also pay for costs as a supplement to Medicare. In reality, health care services are not experienced equitably by all populations. For example, 49.6% of adults aged 16 to 64 in Massachusetts had difficulty accessing care in 2015 because of insurance provider issues [1]. Reducing inequity has therefore long been a focus of research and health care reform in the U.S.
*Correspondence: zhaoyanying.happy@outlook.com. 1 School of Management, Fudan University, 670 Guoshun Road, Yangpu District, Shanghai 200433, China. Full list of author information is available at the end of the article.
Generally, equity in healthcare encompasses timely access, equivalence of care, and the absence of avoidable or remediable differences among groups of people defined by social, economic, demographic or geographical criteria [2]. Previous studies have demonstrated that inequity exists throughout the process of care, including access [3][4][5], in-hospital experience [6], and treatment and outcomes [7][8][9]. In terms of access to care, for example, Hsiang et al. [4] found that Medicaid patients have a 1.6-fold lower likelihood of scheduling a primary care appointment and a 3.3-fold lower likelihood of scheduling a specialty appointment compared with privately insured patients. The avoidable hospitalization rate for Medicaid and uninsured patients is higher than that for privately insured patients [10]. These studies indicate that more attention should be given to low-income patients to improve their access to care.
Expanding insurance coverage is one of the actions taken by the government to improve equity in health care. For example, Massachusetts passed a bill titled "An act providing access to affordable, quality, accountable health care" in 2006, which raised insurance coverage among non-elderly adults (ages 16 to 64) in Massachusetts from 86.0% in 2006 to 95.7% [1]. Generally speaking, health insurance coverage expansion has improved patients' access to care [11,12]. Dzordzormenyoh [13] further investigated the impact of Medicaid expansion on three aspects (access to a physician, access to basic healthcare, and access to specialized care) and found that Medicaid expansion can significantly improve access to basic care and specialized care, but that access to physicians is weakened by the low reimbursement rate and complex paperwork. Decker et al. [14] found that 31% of office-based physicians are unwilling to accept any new Medicaid patients, while only 17% are unwilling to accept any new privately insured patients. Limited access to office-based physicians drives Medicaid patients to seek care from public institutions, for example, hospital emergency departments and outpatient departments [15,16]. According to Sutton et al. [17], 40% of inpatients in safety-net hospitals were either covered by Medicaid (34.7%) or uninsured (6.7%), compared to just 20.7% of inpatients in non-safety-net hospitals (16.8% Medicaid and 3.9% uninsured).
Medicaid is a costly program for the government to sustain. As the number of Medicaid patients grows, the financial problems of public institutions become more severe. Obtaining sustainable and secure financial support has become the primary issue for their administrators [18]. Some studies suggest that the improvement in access to care from Medicaid expansion may weaken in the long run, because public institutions may have difficulty providing continued, quality care [19]. Clearly, in addition to insurance coverage expansion, more measures should be taken to protect low-income patients' health.
Therefore, it is very important for policy makers and hospital administrators to have a comprehensive understanding of payer-status disparities in access to care. However, most related studies are either disease-specific or cohort-specific [3,5,7,8]. Few quantify the impact of payer status on access to care, especially hospitalization, with a large controlled study at a more "macro" level. To our knowledge, Cai et al. [20] is one of the few studies exploring the differences in hospitalization between Medicaid patients and private-pay patients within a facility. Studying the problem at the facility level helps reveal inherent differences in hospitalization and thus prompts hospitals to identify the reasons and take corresponding action [20].
Following the spirit of Cai et al. [20], this study aims to quantify payer-status disparities in hospitalization from the perspective of a facility. Compared with Cai et al. [20], our study includes all payer statuses and compares their impact on hospitalization using standardized regression coefficients. This helps not only to promote a more comprehensive understanding of the impact of payer status on hospitalization, but also to give hospital administrators a basis for deciding which group of patients should receive more attention.
It is worth mentioning that this study is based on a safety-net hospital, which is required to provide health care to all patients regardless of their ability to pay [21]. Ideally, the hospitalization rate should be the same for patients with the same health conditions, and any disparity in hospitalization should be due solely to patients' severity of illness. Given the nature of a safety-net hospital, we think it provides a more appropriate setting in which to reveal the inherent gaps in hospitalization among patients with different payer statuses.

Data
The data we used come from a large academic medical center in the U.S. Only patients who had visit records during the period of 5/1/2009-4/30/2014 were included in our study, resulting in a dataset containing information for 490,761 patients. For each patient in this set, we extracted their demographic characteristics and medical history (outpatient/emergency room visit records, diagnoses, medications, hospitalizations) during the period of 5/1/2006-4/30/2014. Table 1 describes the extracted features corresponding to each patient.

Dependent Variable
We define a binary variable, Admission, indicating whether a patient was hospitalized during the 5-year period we considered.
We find a significant imbalance between the numbers of admitted and non-admitted patients. To reduce the imbalance, we introduce a "floating" time node, the "target time", which is similar to the "target year" used in Brisimi et al. [22]. The target time is defined according to the following rules:
(1) For patients who were hospitalized during the 5-year period, the target time is the time of their last hospitalization.
(2) For patients who were not hospitalized during the 5-year period, the target time is the last day of the sub-period (each sub-period is 365 or 366 days long) in which their last visit record falls.
Figure 1 presents the process of obtaining a patient's target time. The model we derive seeks to predict admission at the target time. With this method, patients labeled as admitted account for 16.53% of all patients.
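The two target-time rules can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and it assumes that the sub-periods are the five consecutive years running from May 1 to April 30 inside the study window.

```python
from datetime import date

# Study window taken from the Data section.
STUDY_START = date(2009, 5, 1)
STUDY_END = date(2014, 4, 30)


def subperiod_end(day):
    """Last day of the May 1 to April 30 sub-period containing `day` (assumed layout)."""
    end_year = day.year if (day.month, day.day) <= (4, 30) else day.year + 1
    return date(end_year, 4, 30)


def target_time(admission_dates, visit_dates):
    """Return (target_time, admission_label) for one patient.

    Rule 1: if the patient was hospitalized in the window, use the last hospitalization.
    Rule 2: otherwise use the last day of the sub-period holding the last visit.
    Every patient in the cohort is assumed to have at least one visit in the window.
    """
    admissions = [d for d in admission_dates if STUDY_START <= d <= STUDY_END]
    if admissions:
        return max(admissions), 1
    last_visit = max(d for d in visit_dates if STUDY_START <= d <= STUDY_END)
    return subperiod_end(last_visit), 0
```

For example, a never-admitted patient whose last visit was on 1 May 2011 would receive the target time 30 April 2012 under these assumptions.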

Independent Variable
There are 8 payers in our sample: Medicare, Medicaid, Commonwealth care (offered by the state), Health safety net, Accident, Commercial, Self-pay and Other. According to the nature of the payers, we classified them into 5 groups: government-private-sponsored, government-sponsored, private-sponsored, uninsured, and other (see Appendix A in Additional file 1 for detailed information on group composition). We use a categorical variable, Payer, to denote a patient's payer status.
We note that some patients have multiple payers during the study period. Since a patient's insurance status may change over time, we take the latest status as their payer status.

Covariates
In order to organize all the available information in a uniform way for all patients, some preprocessing of the data is needed. Pre-processing steps include imputing missing values and summarizing historical factors.

Sample imputation
Missing data imputation replaces missing data with substituted values. Some records lack information on the primary diagnosis. We use a matching algorithm to impute the missing values according to the following rules: (1) for patients who have prescriptions, we select the most common disease treated with their prescriptions as the substituted value; (2) for patients who have no prescriptions, we select the most common disease in the history of their medical visits; and (3) for patients who have neither prescriptions nor prior medical visits, we label their diagnosis as "Unknown." The most common disease here means the disease with the highest frequency. For example, if a patient visits the medical center 3 times for Circulatory, Circulatory and Digestive diseases, respectively, her most common disease is labeled as Circulatory.
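The three imputation rules can be sketched as a short Python function. This is an illustrative sketch only: the function names are hypothetical, and it assumes that each prescription has already been mapped to the disease category it treats, a mapping the text does not describe.

```python
from collections import Counter


def most_common(items):
    """Most frequent item; ties go to the item encountered first."""
    return Counter(items).most_common(1)[0][0]


def impute_primary_diagnosis(prescription_diseases, visit_diagnoses):
    """Impute a missing primary diagnosis using the three rules in order.

    prescription_diseases: disease categories treated by the patient's prescriptions
                           (assumed to be derived beforehand).
    visit_diagnoses:       disease categories from the patient's prior medical visits.
    """
    if prescription_diseases:      # rule (1): patient has prescriptions
        return most_common(prescription_diseases)
    if visit_diagnoses:            # rule (2): no prescriptions, but prior visits
        return most_common(visit_diagnoses)
    return "Unknown"               # rule (3): neither is available
```

Applied to the example in the text, a patient without prescriptions and with visits for Circulatory, Circulatory and Digestive diseases is assigned Circulatory.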
We note that a patient may seek care for different causes that correspond to different primary diagnoses in their visit records. To keep each patient's primary diagnosis unique in our sample, we choose the most common visit cause as the primary diagnosis for a patient who has never been hospitalized, and the most common hospitalization cause for a patient who has been hospitalized. We define Diagnosis as an unordered categorical variable denoting a patient's primary diagnosis (23 categories; cf. Table 2).

Summarization of historical factors
We summarize records from the three years before a patient's target time. A patient's most common primary diagnosis is stored in the variable Diagnosis. To account for other diagnoses, we introduce a variable called Combinations, the total number of a patient's diseases other than her Diagnosis. PrscNum denotes the total number of prescription orders in the patient's record that are attributed to the primary diagnosis; it takes values from 0 to 30, with 30 covering cases of 30 or more prescriptions. Prior Admissions and ER Visits are both ordered categorical variables coded 0 to 2, with 0 representing no earlier visit/admission record, 1 representing one earlier record, and 2 representing more than one. Chronic is a binary variable indicating whether a patient's disease is chronic. We also include demographic factors: Age, Race, and Marital Status. Age is an ordinal categorical variable coded 1 to 8, with 1 indicating ages of no more than 10 years, 2 indicating ages from 10 to 20 years, and so on, up to 7 for ages from 60 to 70 years and 8 for ages above 70 years. We convert the categorical features (Race, Marital Status, Diagnosis and Payer) into sets of binary variables with one-hot encoding. For all binary variables, zero indicates absence of the feature. In total, there are about 50 features for each patient. A summary description of the variables is given in Appendix D in Additional file 1.
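The coding scheme described above can be sketched as follows. The helper names, the dictionary keys and the category labels are hypothetical, and the treatment of boundary ages (for example, exactly 20 years falling in group 2) is an assumption, since the text does not say which group owns each boundary.

```python
import math


def age_group(age_years):
    """Ordinal age code 1..8: 1 = at most 10 years, 2 = (10, 20], ..., 8 = above 70."""
    if age_years <= 10:
        return 1
    if age_years > 70:
        return 8
    return math.ceil(age_years / 10)


def capped(count, cap):
    """Top-code a count, e.g. PrscNum at 30 or prior visits at 2."""
    return min(count, cap)


def one_hot(prefix, value, categories):
    """One binary indicator per category; 0 means the feature is absent."""
    return {f"{prefix}={c}": int(c == value) for c in categories}


def encode_patient(age_years, prescriptions, prior_admissions, er_visits,
                   payer, payer_groups):
    """Build a partial feature dictionary for one patient (illustrative subset)."""
    features = {
        "Age": age_group(age_years),
        "PrscNum": capped(prescriptions, 30),
        "PriorAdmissions": capped(prior_admissions, 2),
        "ERVisits": capped(er_visits, 2),
    }
    features.update(one_hot("Payer", payer, payer_groups))
    return features
```

In a regression, one indicator per categorical variable is normally left out as the reference category; the Results section uses Private as the reference for Payer.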

Patient and feature selection
We remove features that are available for only a small number of patients. Patients under 3 years old are also removed because they do not have enough historical records to indicate their physical condition before the selected target time. After these steps, 462,809 patients are retained.

Statistical analysis
In the following analysis, we randomly select 70% of the patients for training the logistic regression model and retain the remaining 30% for testing the performance of the trained model. The model predicts admission at the target time for each patient.
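A random 70/30 split at the patient level can be sketched as below. The function name and the fixed seed are assumptions made for reproducibility of the illustration; the text does not state how the random split was generated.

```python
import random


def split_patients(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient identifiers and split them into training and test lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)      # local generator, so the split is reproducible
    rng.shuffle(ids)
    cut = int(round(train_fraction * len(ids)))
    return ids[:cut], ids[cut:]
```

Splitting by patient rather than by record keeps all information about one patient on the same side of the split.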
First, we present the descriptive characteristics of key factors. A chi-square test is used to compare the frequencies of the categories of each categorical variable between the admitted and non-admitted cohorts. For continuous variables, a t-test is used to test the null hypothesis that the means of the variable are equal in the admitted and non-admitted cohorts.
Second, following Brisimi et al. [22], we use a statistical hypothesis test for the difference in admission proportions between insured and uninsured patients. Details of the method can be found in [23]. We split our data set into two groups of size $N_1$ and $N_2$, where the first group contains the insured patients and the second the uninsured. The corresponding admission rates are denoted by $p_1$ and $p_2$. If being insured does not influence hospitalization, $p_1$ should be statistically similar to $p_2$. Under the null hypothesis $p_1 = p_2$, the difference $p_1 - p_2$ approximately follows a normal distribution with mean $\mu = 0$ and standard deviation $\sigma = \sqrt{PQ(1/N_1 + 1/N_2)}$, where $P = (p_1 N_1 + p_2 N_2)/(N_1 + N_2)$ is the pooled admission rate and $Q = 1 - P$. We can then use the test statistic $z = (p_1 - p_2)/\sigma$ to assess whether the null hypothesis holds.
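The test statistic can be computed in a few lines of Python. This is a minimal sketch with a hypothetical function name; the two-sided p-value uses the normal approximation, and the counts in the usage note are made up for illustration rather than taken from the study.

```python
import math


def two_proportion_z_test(admitted_1, n_1, admitted_2, n_2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_1 = admitted_1 / n_1
    p_2 = admitted_2 / n_2
    pooled = (admitted_1 + admitted_2) / (n_1 + n_2)   # P = (p1*N1 + p2*N2) / (N1 + N2)
    sigma = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))         # equals 2 * (1 - Phi(|z|))
    return z, p_value
```

With hypothetical counts of 300 admissions among 1,000 insured patients and 200 among 1,000 uninsured patients, the pooled rate is 0.25 and the statistic is roughly 5.16.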
Then, we conduct a logistic regression to further examine how a patient's hospitalization is affected. To adjust for the effect of confounding factors, we stratify the samples before analysis. A more detailed description of the stratification technique can be found in Kleinbaum et al. [24]. Generally speaking, the distributions of cases and controls in different strata are usually substantially [...] where Prob(Admission) denotes the probability of hospitalization at the target time and $\beta = (\beta_0, \beta_1, \ldots, \beta_9)$ are unstandardized coefficients. Finally, a Receiver Operating Characteristic (ROC) curve, which plots the sensitivity (or detection rate, or recall) as a function of the false positive rate (equal to one minus the specificity), is presented to demonstrate the performance of the model.
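As a generic illustration of how unstandardized coefficients map to an admission probability, the sketch below evaluates a plain logistic model. It is not the authors' conditional (stratified) specification, whose exact form is not recoverable here; the function name is hypothetical, and the coefficient and feature values in the usage note are made up.

```python
import math


def admission_probability(beta, x):
    """Plain logistic model: Prob = 1 / (1 + exp(-(beta[0] + sum_i beta[i] * x[i-1])))."""
    if len(beta) != len(x) + 1:
        raise ValueError("beta needs one intercept plus one coefficient per feature")
    linear = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    if linear >= 0:
        return 1.0 / (1.0 + math.exp(-linear))
    e = math.exp(linear)           # numerically stable branch for negative values
    return e / (1.0 + e)
```

For instance, with made-up coefficients `[-2.0, 0.5, 1.0]` and features `[2.0, 1.0]`, the linear predictor is 0 and the probability is 0.5.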

Results
After the data preprocessing, our study population consisted of 462,809 patients, 67,332 of whom were admitted during the target year. Table 2 presents summary statistics of our sample. Admitted patients are more likely to be male, white, and divorced or separated. The number of patients insured by a government program is much higher than the number with any other payer status, which is consistent with the mission of safety-net hospitals to provide health care to all populations regardless of their payer status or ability to pay. The severity of illness, as defined earlier, differs significantly between admitted and non-admitted patients. Appendix C in Additional file 1 illustrates how the three factors we used to define severity affect admissions.
As outlined in Methods, we first use a hypothesis test to compare admission rates between insured and uninsured patients. We obtain $z = 49.3$, which corresponds to a p-value far below 0.0001 under the null hypothesis $p_1 = p_2$. We can therefore reject the null hypothesis and conclude that insurance status is associated with hospitalization.
Then, we examine how payer status affects admissions with a conditional logistic regression model. Table 3 presents our results. We use Private as the reference payer status. The significantly positive unstandardized coefficients of Multiple Payers and Government indicate that, controlling for the other variables in the model, the admission probability for patients who are totally or partially insured by government is higher than that for patients insured by private insurers. Similarly, the significantly negative unstandardized coefficient of Uninsured indicates that uninsured patients are less likely to be admitted than patients insured by private insurers.
The influence of payer status on admission can be further described by the corresponding odds ratios, each relative to the reference group (Private) and controlling for the other variables in the model.
(1) The odds ratio of Multiple Payers is 1.956, indicating that the odds of being admitted are 95.6% higher for patients with multiple payers than for privately insured patients.
(2) The odds ratio of Government is 1.214, indicating that the odds of being admitted are 21.4% higher for patients insured by the government than for privately insured patients.
(3) The odds ratio of Uninsured is 0.815, indicating that the odds of being admitted are 18.5% lower for uninsured patients than for privately insured patients.
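The arithmetic linking a logistic coefficient, its odds ratio and a percentage change in odds can be sketched as follows. The function names are hypothetical, and the 10% reference probability in the usage note is an assumed value chosen only to show that a change in odds is not the same as a change in probability.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by an unstandardized logistic coefficient."""
    return math.exp(beta)


def percent_change_in_odds(ratio):
    """Percentage change in the odds relative to the reference group."""
    return (ratio - 1.0) * 100.0


def probability_after(reference_probability, ratio):
    """Probability obtained by multiplying the reference group's odds by `ratio`."""
    odds = reference_probability / (1.0 - reference_probability) * ratio
    return odds / (1.0 + odds)
```

An odds ratio of 1.956 corresponds to odds that are 95.6% higher, and one of 0.815 to odds that are 18.5% lower. If the reference group had an admission probability of 10% (an assumed figure), an odds ratio of 1.956 would imply a probability of about 17.9%, not 19.6%.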
The standardized coefficients in the fifth column of Table 3 indicate how many standard deviations of change in logit(Admission) are associated with a one-standard-deviation increase in an independent variable. According to Menard [25] and Agresti [26], standardized coefficients can be used to compare the relative influence of independent variables within a logistic regression model when those variables are measured in different units. A similar application of standardized coefficients can be found in [27]. In our findings, Combinations has the greatest influence on hospitalization among all factors (see Appendix B in Additional file 1 for the graph of ordered standardized coefficients). As the standardized coefficients in Table 3 show, for example, a one-standard-deviation increase in Combinations is associated with a 1.113-standard-deviation increase in logit(Admission). The values for payer status are lower (for example, 0.153 for Multiple Payers and 0.097 for Government). Among the payer statuses, Multiple Payers has the greatest impact on admissions, followed in order by Government, Private and Uninsured.
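One simple way to obtain a coefficient with the interpretation given above is to rescale the unstandardized coefficient by the ratio of two standard deviations. The sketch below follows that reading; it is an assumption about the calculation rather than the procedure used for Table 3, and other definitions of standardized logistic coefficients exist. The function name and the toy data are hypothetical.

```python
import statistics


def standardized_coefficient(beta, x_values, logit_values):
    """Rescale beta to 'standard deviations of logit per standard deviation of x'.

    x_values:     observed values of one independent variable.
    logit_values: fitted logits (linear predictors) for the same patients.
    """
    return beta * statistics.pstdev(x_values) / statistics.pstdev(logit_values)
```

As a sanity check, in a model with a single predictor the fitted logit is a linear function of that predictor, so the standardized coefficient has absolute value 1.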

Validity check
We validate the estimated model via its prediction accuracy on a dataset distinct from the one used for training. Figure 2 plots the ROC curve for a random split of the data into a training set and a test set. The model has an Area Under the ROC Curve (AUC) of 0.92, indicating excellent predictive power.
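The AUC has a direct probabilistic reading: it is the chance that a randomly chosen admitted patient receives a higher predicted score than a randomly chosen non-admitted patient. The sketch below computes it from that definition. The function name is hypothetical, and the pairwise loop is quadratic in the number of patients, so it is meant for illustration on small samples rather than on a test set of the size used here.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (admitted, non-admitted) pairs ranked correctly.

    labels: 1 for admitted, 0 for non-admitted.
    scores: predicted admission probabilities; tied pairs count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of admitted from non-admitted patients.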

Discussion
In an analysis of the relationship between admissions and payer status in a safety-net hospital, we found that uninsured patients are still less likely to be admitted than insured patients after controlling for demographics and prior medical conditions. These results are consistent with a stream of work demonstrating inequity in access to health care [10,13]. Generally, most uninsured people are in low-income families and may not be receiving public financial assistance for various reasons (e.g., age, income cutoffs for financial assistance, undocumented immigration status, mental illness). As a consequence, they may not seek timely health care or receive treatment because of their economic, legal, or mental health condition. Further, we found that patients who are partially or totally insured by government are more likely to be admitted than those insured by private insurers. The association remains significant even though we controlled for prior medical conditions and age. According to Exhibit A1 in Additional file 1, 77.18% of patients insured by government are totally or partially covered by Medicaid; we can therefore infer that the high odds ratio of admission for patients insured by government is mainly driven by patients with Medicaid. Since our findings are not causal, this relationship only demonstrates an association between payer status and admissions. There are several plausible explanations for this association:
(1) Low-income patients tend to experience worse health and delayed diagnosis and treatment, leading to worse health conditions by the time they are forced to seek health care. This is consistent with Adepoju et al. [8] and Giacovelli et al. [3]. From this perspective, there may be more opportunities to design preventive care programs for patients insured by the government than for those insured by private insurers, because basic preventive care can significantly decrease avoidable hospitalizations [28]. Given the characteristics of the safety-net hospital, we feel this explanation is very plausible.
(2) Patients with sufficient ability to pay may elect to seek care at a non-safety-net hospital, and these admissions are not present in our dataset. As suggested by Sutton et al. [17], inpatient stays in a safety-net hospital are usually associated with pregnancy and injuries, whereas inpatient stays in non-safety-net hospitals are more likely to be related to surgery.
(3) Patients insured by Medicare and Medicaid include patients with disabilities, who may require more frequent hospitalizations. However, this effect is mitigated by the fact that our model controls for the complexity of a patient's health condition.
Our results also provide additional information on factors that can influence a patient's hospitalization. Demographics play a role in explaining the relationship between admissions and payer status. Among all factors included in our analysis, the number of a patient's other diseases (Combinations) contributes the most to hospitalization, as this variable has the largest standardized coefficient. In addition, the standardized coefficients suggest that demographics are not a major factor.
There exist significant differences among disease types in admissions. Specifically, the top 3 disease types contributing to admission are Pregnancy, Injury & Poison and Endocrine (see Table 3 or Appendix B in Additional file 1).
Our results lead to two policy recommendations. (1) Attention is needed on improving access to care for vulnerable (low-income) patients, for example, by actively advertising free care programs, reaching out to community organizations with better access to these individuals, or offering assurances that access to care is not linked to immigration procedures. (2) To reduce preventable admissions, basic preventive care services should be enhanced. These recommendations are in line with the World Health Organization's call for long-term growth in health spending and effective health policies. Similar recommendations can be found in Jakovljevic et al. [29], who call for policies that improve low-income patients' health by promoting healthy lifestyles and enhancing basic care.

Limitations
First, the differences among payer groups may be biased by not controlling for lab tests and other unaccounted factors in our analysis, including disability status for those insured by government programs. Though we use surrogates for illness severity, they may not fully account for the true health status of a patient.
Second, the identification of hospitalization is imprecise. Hospitalized patients in our sample are limited to those who were actually hospitalized, omitting those who were suggested to be hospitalized but did not follow through, or elected to be hospitalized at a different hospital, or even moved or died. This type of label bias makes our results underestimate the probability of hospitalization.
Third, we lack information on patient source. Because we do not know whether a patient receives primary care through the hospital we considered, we cannot distinguish patients who get their primary care there from those who come only for a hospitalization. This may lead to gaps in the historical information for some admitted patients.
Fourth, some estimates of independent variable coefficients may not be accurate, in part due to the dependence between payer status and disease types. Nevertheless, according to a rough rule in estimating the severity of collinearity resulting from dependence between variables [30,31], the collinearity can be tolerated if the standardized coefficient is smaller than 1 or the unstandardized coefficient is smaller than 2. In our analysis, the standardized coefficients are all smaller than 1 except for Combinations, which suggests that the results are plausible.

Conclusions
This study provides a snapshot of the differences in hospitalization among patients with different payer statuses, and it is one of the first such studies done at the facility level.
Based on insurance status, we stratified patients into five groups: government-sponsored, private-sponsored, multiple-sponsored, uninsured and other. We then used a conditional logistic regression model that controls for the influence of a patient's illness severity to investigate the influence of other potential social factors. We found that coverage by Medicaid or Medicare plays a significant role in improving access to care (e.g., hospitalization) for low-income patients, but there may be preventable admissions among this group of patients. For uninsured patients, inequity in hospitalization still exists. Therefore, strategies to prevent avoidable hospitalizations among low-income insured patients and to help uninsured patients may be advisable.
We believe this study offers some insights for hospital administrators and policy makers on payer-status disparities in hospitalization. The findings can serve as a baseline against which to measure the anticipated effect of measures to reduce those disparities.