Impact of a New York City supportive housing program on Medicaid expenditure patterns among people with serious mental illness and chronic homelessness

Background A rapid increase of Medicaid expenditures has been a serious concern, and housing stability has been discussed as a means to reduce Medicaid costs. A program evaluation of a New York City supportive housing program has assessed the association between supportive housing tenancy and Medicaid savings among New York City housing program applicants with serious mental illness and chronic homelessness or dual diagnoses of mental illness and substance use disorder, stratified by distinctive Medicaid expenditure patterns. Methods The evaluation used matched data from administrative records for 2827 people. Sequence analysis identified 6 Medicaid expenditure patterns during 2 years prior to baseline among people placed in the program (n = 737) and people eligible but not placed (n = 2090), including very low Medicaid coverage, increasing Medicaid expenditure, low, middle, high, and very high Medicaid expenditure patterns. We assessed the impact of the program on Medicaid costs for 2 years post-baseline via propensity score matching and bootstrapping. Results The housing program was associated with Medicaid savings during 2 years post-baseline (−$9526, 95% CI = −$19,038 to -$2003). Stratified by Medicaid expenditure patterns, Medicaid savings were found among those with very low Medicaid coverage (−$15,694, 95% CI = −$35,926 to -$7983), increasing Medicaid expenditures (−$9020, 95% CI = −$26,753 to -$1705), and high Medicaid expenditure patterns (−$14,450, 95% CI = −$38,232 to -$4454). Savings were largely driven by shorter psychiatric hospitalizations in the post-baseline period among those placed. Conclusions The supportive housing program was associated with Medicaid savings, particularly for individuals with very low Medicaid coverage, increasing Medicaid expenditures, and high Medicaid expenditures pre-baseline. Electronic supplementary material The online version of this article (10.1186/s12913-017-2816-9) contains supplementary material, which is available to authorized users.


Background
Medicaid is a public health insurance program in the United States (US) that provides comprehensive healthcare services to vulnerable populations including individuals with disabilities, older adults, and low-income families [1]. Several studies have demonstrated that almost 50% of total US Medicaid expenditures are driven by only about 4% of Medicaid enrollees who are charac-terized by being elderly and/or having non-stable housing; having mental illness, chronic medical conditions, or substance use disorders; and being in long-term institutional care [2]. As Medicaid expenditures have rapidly increased from $201 billion in 2000 to $408 billion in 2011 [3], there has been growing interest in the reduction of Medicaid costs by improving housing stability [4,5]. In particular, the "housing first model" where unstably-housed individuals are placed into permanent housing without requiring adherence to substance use treatment or other services has been proposed as an intervention for health-promoting behaviors and improved healthcare access [6,7]. These benefits, in turn, are thought to decrease the likelihood of high-cost medical events such as emergency department visits and hospitalizations [5,[8][9][10].
A recent review of US and Canadian studies found that the housing first model for homeless individuals with serious mental illness has led to decreases in emergency department (ED) visits and hospitalizations [11]. However, among unstably-housed individuals who incur high medical costs due to frequent ED visits or hospitalizations, no studies have identified high users based on duration and sequencing of medical expenditures. Given that extremely high Medicaid costs are often due to only a few hospitalization events, using aggregated Medicaid costs as a threshold for high Medicaid users (e.g., 75th percentile of 1-year total Medicaid costs among Medicaid enrollees) may not identify individuals with consistently high Medicaid costs. In a program evaluation of supportive housing in New York City (NYC), we sought to determine whether there was an association between supportive housing tenancy and Medicaid savings among those with serious mental illness and chronic homelessness or dual diagnoses of mental illness and substance use when stratified by distinctive Medicaid expenditure patterns. We employed group-based trajectory modeling to identify groups of individuals who shared similar trajectories of monthly Medicaid expenditures during the 2-year pre-housing period.

Setting and sample
The population in this evaluation was composed of individuals aged 18 years or older who were eligible for a NYC supportive housing program because they met one the 2 following conditions: 1) chronic homelessness and serious mental illness or, 2) dual diagnosis of mental illness and a substance use disorder. This is one of 9 population groups in a NYC supportive housing program launched by NYC and New York State governments in 2007 to create 9000 supportive housing units for people who were homeless or at risk of homelessness [12]. The program followed the housing first model with housing placement not being contingent on adhering to treatment or services. Chronic homelessness is determined if individuals had to have a record of either 1) staying at least 2 out of the last 4 years in a homeless shelter or living on the street or 2) being disabled and spending at least one of the last 2 years in shelter or living on the street. To qualify as having a mental illness condition, they submitted comprehensive psychiatric evaluations signed by psychiatrists or nurse practitioners conducted within 6 months prior to the application date. This clinical information (i.e., Axis I and Axis II according to the Diagnostic and Statistical Manual of Mental Disorders IV criteria and a comprehensive psychosocial summary) in the housing application was independently assessed and verified by clinical staff from the NYC Human Resources Administration. We matched multiple administrative records to evaluate the program's impact on utilization of government services and benefits Data included utilization of government housing, jails, homeless shelters, New York State psychiatric facilities, Medicaid claims, and food stamps. These data were provided by the NYC Human Resources Administration's Customized Assistance Services and HIV/AIDS Services Administration, the NYC Department of Health and Mental Hygiene, the NYC Department of Homeless Services, the NYC Department of Correction, and the New York State Office of Mental Health. Data matching was performed deterministically between service data (Medicaid, State psychiatric facilities, food stamps, some types of subsidized housing) and the NYC supportive housing program data using identifiers (first and last name, birth date, and social security number). For all the other data, we performed probabilistic matching using first, middle, and last names, birth date, social security number, sex, and address (when available) via QualityStage software (IBM Corporation, Armonk, New York) [13]. According to an independent human review of sample cases for the probabilistic match, matching performance was acceptable (sensitivity: 93%, specificity: 97%).
In this evaluation, we included individuals who were eligible for the program during 2007-10. Housing or program directors received multiple eligible applicants per 1 vacancy and determined a placement decision after interviewing a maximum of 3 candidates. We categorized these individuals into 2 groups: 1) applicants who were placed in the program for more than 7 days ("placed") and 2) applicants who were not placed in the program during 2 years after eligibility ("unplaced"), and were not placed in any other government-subsidized housing tracked by the evaluation during the first 6 months after eligibility (i.e., program eligibility period) to ensure lack of exposure to a supportive housing program during the initial treatment assignment. We excluded those who were housed in other government-subsidized housing programs immediately before becoming placed in the program (no gap between move-out and movein dates; n = 20) because these continuous housing experiences might have confounded changes in outcomes associated with the program. Baseline (i.e., index date) was defined as the first placement date (for the placed group) or the first eligibility date (for the unplaced group). For each individual, we examined the 2-year period prior to baseline and 2-year period post baseline. For example, for those who became eligible for the program on December 31, 2010, the evaluation period was January 1, 2008 through December 31, 2012. For placed people, a median of 95 days (51 days on 25th percentile, 208 days on 75th percentile) passed between the first eligibility date and the first move-in date. This gap between the eligibility and placement dates reflected administrative processes to finalize placement (e.g., documentation, service arrangement, and entitlement paperwork).
For the analysis, we retained placed applicants in the placed group even if they left the program, and unplaced applicants were included in the unplaced group even if they were placed in any other government-subsidized housing after the first 6-month period. On average the placed group stayed in the program for 661 days (162 on standard deviation, 730 on 25th percentile, 730 days on median, 730 days on 75th percentile) during the 2 years after placement. The final dataset contained 2827 people. This sample selection process was illustrated in Additional file 1. The NYC Department of Health and Mental Hygiene Institutional Review Board determined that this was a program evaluation activity and non-human subject research, and therefore did not fall under the purview of the Institutional Review Board.

Variables
In this program evaluation, the exposure variable was an indicator of supportive housing placement (defined above). The outcomes were aggregated Medicaid costs during 2 years after baseline, adjusted for medical inflation to 2012 dollars. The total Medicaid costs were further divided into costs from 1) outpatient care, 2) inpatient care, 3) emergency department visits, and 4) prescription drugs. The remaining costs ("other costs") mainly consisted of those from home health agencies and personal care, and residential care. Capitation payments were classified in the other costs category and combined with fee-for-service costs because Medicaid savings in this evaluation were assessed using the costto-government perspective (i.e., government payments for Medicaid services) rather than the societal cost perspective. Because capitation payments did not provide information about costs associated with specific use of services, service-specific costs in the remaining categories might have been underestimated; yet this potential bias is likely small because behavioral healthrelated claims, which represented most claims in this population, were carved out and reimbursed via a feefor-service mechanism during the evaluation period.
As an additional outcome, we calculated the percent of individuals who participated in managed care for at least 6 months during the 2 year pre-or 2 year post-baseline because pre-and post-changes in managed care enrollment by placement status, which is a cost containment effort, could help explain changes in Medicaid costs associated with the housing program.
Lastly, to describe and account for differences between placed and unplaced people, we included covariates that captured the use of government services and benefits during 2 years prior to baseline using the other administrative data sources and monthly Medicaid coverage dates during 2 years prior to baseline and post-baseline using Medicaid data. To construct the Medicaid expenditure patterns via sequence analysis, we calculated monthly Medicaid expenditures during 2 years prior to baseline. We also obtained demographic and clinical descriptors at the time of program application using the NYC supportive housing application data (Additional file 1).

Statistical analyses
We performed sequence analysis to identify distinctive patterns of Medicaid expenditures during the 2 years prior to baseline among the placed and unplaced groups combined. Sequence analysis is a method that identifies mutually exclusive groups of individuals who share similar trajectories of events (in this analysis monthly Medicaid costs) [14]. This method compares individuals' duration and sequencing of events among all possible pairs and determines unique temporal patterns [14,15]. The resulting temporal grouping can be more instructive than grouping based on conventional aggregated measures (e.g., high Medicaid users based on 2-year aggregated Medicaid costs) when there are multiple types of events (e.g., no Medicaid coverage, zero Medicaid cost) and costly outlying events (e.g., psychiatric hospitalizations) [15]. The utility of sequence analysis in health services research has been demonstrated by recent findings of prenatal care trajectories among French population [16] and sequences of clinical instabilities such as fever and hypotension among United States patients who were hospitalized due to community-acquired pneumonia [17]. We first categorized monthly Medicaid costs for each month during the 2-year pre-baseline period using a ranking: 0 = no Medicaid coverage; 1 = $0; 2 = $1-$292; 3 = $293-$690; 4 = $691-$1379; 5 = $1380-$2898; 6 = ≥$2899. Ranks 2 through 6 were based on the quintiles for all non-zero monthly costs. To determine whether zero cost was due to no coverage (rank 0) or no utilization (rank 1), we examined Medicaid coverage in each month when there were zero costs during the 2-year period, and assigned the rank 1 to the month if individuals were on Medicaid; if not, we assigned the rank 0. We converted Medicaid costs into these 7 categories using this ranking approach because there was a high percentage of individuals with no costs and a few individuals who had extremely high expenditures. In Fig. 1, individual-level sequences of monthly Medicaid (categorized by no coverage, zero, and quintile non-zero monthly Medicaid costs and presented using different colors) were represented as horizontal lines. For example, if a line is dark brown-colored up to the first year (= t12) and subsequently light blue-colored for the second year, it represents an individual who initially did not have Medicaid coverage and then consistently incurred small Medicaid costs. A line that does not change over time represents a time period when there is no change in Medicaid costs. Then we adopted the optimal matching algorithm to determine the extent of dissimilarity between pairs of 2-year sequences of monthly Medicaid cost ranks [14]. We translated pair-wise differences of sequences into weights based on transition probabilities, created a distance matrix by applying this procedure to all possible pairs, and then conducted a hierarchical cluster analysis with the Ward method [14,18]. In other words, individual-level expenditure sequences were stacked together based on their similarities. We found that the cluster analysis identified 6 non-overlapping groups of individuals who shared similar trajectories of monthly Medicaid expenditures over 2 years prior to baseline. By examining these trajectories, we labeled each group to conceptualize distinct patterns of Medicaid expenditures among the NYC supportive housing applicants.
We used propensity score matching to minimize baseline differences between the placed and unplaced groups [19]. For each Medicaid expenditure pattern separately, we ran a logistic regression model to predict the probabilities of placement (i.e., propensity scores) using characteristics of program applicants prior to baseline or at baseline as the independent variables (Additional file 2). In the final propensity score models, we excluded variables with greatly inflated standard errors because these were interpreted as evidence of multicollinearity (Additional file 2). Then, we carried out optimal full matching, which yielded matched sets of varying numbers of individuals from both placed and unplaced groups [20]. In this propensity score matching method, differences between placed and unplaced groups were achieved by using varying matching ratios (e.g., 1:1, 1:N, N:1, and N:N), which is more efficient than the conventional 1:1 nearest neighbor matching [21]. Another advantage of this optimal full matching was it retained all sample members, which was important for this program evaluation because we were interested in the impact of the housing program among all eligible applicants [21]. We assessed the extent to which the propensity score matching reduced observed differences between placed and unplaced individuals by observing the standardized difference in average covariate values before and after matching [22]. Specifically, we calculated the standardized absolute difference by dividing absolute difference in a covariate value between placed and unplaced people (continuous covariate: difference in average scores; categorical covariate: difference of proportion in each category) by pooled standard errors via a variance formula for either continuous or categorical variables. These estimated differences across most covariates and in each of the Medicaid expenditure patterns substantially decreased after the propensity score matching, suggesting that the propensity score matching performed well in minimizing the observed differences (Additional file 3). For each Medicaid expenditure pattern, we calculated mean differences in Medicaid costs between placed and unplaced groups during 2 years post-baseline using standardization, In this technique the mean cost differences were calculated per matched set, and proportionally weighted according to the size of the matched set relative to the total sample size, which allowed accounting for propensity score matching in mean cost difference estimates. To perform statistical tests for the mean differences between the 2 groups, we used 10,000 bootstrap replicates, which is a non-parametric variance estimation approach to address the violation of the normal distribution assumption [22]. To incorporate propensity score matching into this estimation of 95% confidence interval (CI), in each bootstrap replicate, we performed propensity score matching and calculated the mean cost differences using standardization. We corrected bootstrapping distributions using bias and acceleration statistics to address potential bias in point estimates and account for varying variances, which could result from random replications [23]. The percent of the managed care participation, which was a secondary outcome, was compared between placed and unplaced people per each expenditure group using conditional logistic regression that accounted for propensity score matching.
We performed independent t-tests for continuous variables and chi-square tests for categorical variable to test whether placed people were different from unplaced people in baseline or pre-baseline characteristics before and after propensity score matching. In addition, we described baseline characteristics and pre-baseline Medicaid utilization (i.e., Medicaid costs due to outpatient care, inpatient care, emergency department visits, prescription drugs, and other uses) by 6 Medicaid expenditure groups. Similar to tests for difference between placed and unplaced people, we performed independent t-tests for continuous variables and chi-square tests for categorical variables as we tested differences across Medicaid expenditure groups. Statistical significance for these tests was established using 2-tailed p < 0.05. For post-baseline mean comparisons, statistical significance was established using 95% confidence intervals via bootstrapping. Sequence analysis, optimal full matching, and bootstrapping were performed using TraMineR, cluster, optmatch, and boot packages in R 3.3.1 software (Vienna, Austria) [24]. All other analyses were performed using SAS 9.4 software (Cary, NC) [25]. Table 1 shows that a majority of the eligible applicants were non-Hispanic Black or Hispanic males and did not need assistance with activities of daily living (i.e., capacity of living independently). Sixty four percent were aged to 35 years to 54 years at the time of the application. About 50% were diagnosed with substance use disorders, while 74% reported not currently using substances. Baseline demographic and behavioral characteristics were in general similar between the placed and unplaced groups, and differences in particular characteristics (e.g., higher likelihood of being capable of living independently among the placed group) substantially decreased after propensity score matching ( Table 1).

Results
The sequence analysis identified 6 distinct patterns (Fig. 1). These represent non-overlapping Medicaid expenditure patterns during the 2 years prior to baseline, in- In general, each expenditure pattern displayed a unique profile of baseline characteristics that was distinguishable from the others ( Table 2). Individuals with patterns of high expenditures were disproportionately more likely to have a history of psychiatric hospitalizations  and substance use disorders and at the time of application were more likely to have been participating in a substance use treatment course than those with the other patterns.
In addition, those with very low coverage and emerging user patterns were less likely than others to have been receiving supplemental security at the time of application. Lastly, for more than 80% of the 2-year period prior to baseline, the housing applicants were on Medicaid, except for the very low coverage and emerging user whose Medicaid coverage was less than 53%. The total mean Medicaid costs over the 2 years prior to baseline was the highest among high user ($149,203, 97,882 on standard deviation), followed by second-highest user ($56,683, 34,390 on standard deviation) ( Table 2). Medicaid utilization among individuals with very low coverage and emerging user patterns were similar when the contributions of Medicaid subcategories were compared (Fig. 2). Inpatient care constituted the largest proportion of total Medicaid costs (>50%), followed by outpatient care. Medicaid utilization among those with low user, middle user, second-highest user, and high user patterns were different from those of the previously described very low coverage and emerging user patterns. Specifically, the percentage of Medicaid costs due to prescription costs (for high user) or other expenses (for low user, middle user) was greater than that due to outpatient care. Other Medicaid costs for middle user were mostly due to managed care costs (73%), whereas those for low user were more evenly attributed to three major types of costs (19% managed care, 22% residential care, and 18% home care). Medicaid costs for second-highest user were largely attributed to inpatient (34%) and outpatient care types (29%). Emergency department visits contributed the smallest amount of Medicaid costs across all 6 Medicaid expenditure patterns.
Of the 6 expenditure patterns, there were statistically significant cost savings among those with very low coverage pattern (−$15,694, 95% CI = −$35,926 to -$7983), emerging user (−$9020, 95% CI = −$26,753 to -$1705), and second-highest user (−$14,450, 95% CI = −$38,232 to -$4454) after accounting for baseline characteristics via propensity score matching (Table 3). Among all of the expenditure patterns combined, the housing program    was also associated with total Medicaid cost savings (−$9526, 95% CI = −$19,038 to -$2003). For emerging user, second-highest user, and all 6 expenditure patterns combined, these savings represented about a quarter of 2-year Medicaid costs prior to baseline; for individuals with very low coverage, the savings were equivalent with almost 240% of baseline costs. Cost savings were mostly driven by a reduction in inpatient services, which represented more than 75% of cost differences between the placed and unplaced individuals. Cost savings from inpatient services were due to cost reductions in psychiatric hospitalizations among placed individuals (Additional file 4). Additionally, some cost savings from emergency department visits were observed among individuals with second-highest user and high user patterns.th=tlb= Table 4 shows that pre-baseline managed care enrollment was not different between placed and unplaced individuals. After placement, the overall percent of those who were in managed care was greater among placed than unplaced individuals (43% vs. 33%, p < 0.05). Stratified by Medicaid expenditure patterns, this difference was statistically significant only among emerging user and second-highest user.
Cost savings for those with very low coverage, emerging user, and second-highest user patterns were mostly driven by reduced costs due to psychiatric hospitalizations. Whereas psychiatric inpatient costs more than doubled from pre-baseline to post-baseline among unplaced individuals with the very low coverage pattern ($2573 to $5731), slightly decreased among those with the emerging user pattern ($7920 to $5767), or were unchanged among those with the second-highest user pattern ($6596 to $6338), substantial reductions in psychiatric inpatient costs were observed among placed individuals (very low coverage: $940 to $743; emerging user: $6700 to $3432; second-highest user: $5884 to $4128). This implies that the supportive housing program may be particularly effective in preventing future inpatient events and reducing the length of hospitalization, which in turn contributes to decreased total Medicaid costs. Along with small cost savings from emergency department visits, these inpatient cost savings are consistent with previous findings that supportive housing is associated with a reduction in the number of emergency department visits and hospital admissions among high Medicaid users [26], homeless adults with disabilities [27], and those with chronic medical illness [28].
Another possible explanation for these cost savings was the increased managed care enrollment after placement. Among the applicants overall, placed people were more likely to enroll in managed care than unplaced people during the post-baseline period. Given that high Medicaid costs were likely to have been controlled under the managed care payment system at least for nonbehavioral health-related claims, managed care enrollment among the placed group might have contributed to containing Medicaid costs during 2-year post baseline [29]. Similarly, managed care enrollment was much higher among the placed vs. unplaced people with the emerging user and second-highest user pattern when stratified by Medicaid expenditure patterns, which supported this hypothesis. Future studies with detailed housing service data are warranted to better understand the association between housing placement and increased managed care enrollment. Additionally, estimates of the real cost of managed care beyond capitation costs are needed to explain contributions of managed care enrollment to cost savings.
Despite the fact that this evaluation, like other studies, found an association between living in a supportive housing program and mean Medicaid savings in inpatient and emergency department hospitalizations, the magnitude of the program impact on Medicaid savings in this evaluation is smaller than has been found in previous studies [5,10,26,27]. In addition, although the largest point estimate of Medicaid savings was observed among individuals with the high user pattern (−$34,100, 95% CI = −$72,238 to $24,126), this estimate was not statistically significant, indicating high estimation uncertainty. This lack of statistically significant savings may reflect the fact that the high user group had the highest prevalence of pre-existing chronic diseases that would still require treatment after placement. Even if some medical costs for treatment inevitably occur after housing placement, previous studies highlight reduced numbers of emergency department and hospital admission events instead of evaluating total medical costs [28]. The difference in findings may be also due to the small sample size [5,28] or lack of appropriate control groups in previous studies [10]. This evaluation has some limitations. First, because applicants were not randomly selected for housing placement, our results might be biased due to unobserved confounders. Despite thorough efforts to ensure comparability between placed and unplaced groups via propensity score matching and sequence analysis, two groups could be different in unobserved characteristics, which could affect Medicaid utilization post-baseline. Second, Medicaid expenditure patterns were identified based on the assumption that there were underlying groups of individuals who have distinct trajectories of Medicaid utilization. This assumption is consistent with health policy approaches of distinguishing high Medicaid users from low users [2,4,30]. However, clustering from sequence analysis has been criticized because it has been difficult to replicate using a population other than the one studied in any given analysis [31]. Third, Medicaid costs from managed care capitations were classified as costs due to "other" utilization. Although Medicaid expenditure patterns were based on total Medicaid costs and all behavioral health-related claims were carved out and reimbursed via a fee-for-service mechanism during the evaluation period, Medicaid savings due to specific services may be under-or over-estimated. Fourth, even if we accounted for pre-baseline Medicaid coverage, post-baseline coverage might be different between the placed and unplaced groups, which could introduce bias in the observed cost savings. To address this concern, we calculated the average Medicaid coverage days post baseline in each expenditure pattern and found that differences in the length of Medicaid coverage postbaseline between the 2 groups were very small during the full 2-year period, ranging from 5 to 48 days. Fifth, our finding might be biased because the index date was different for the two groups (the first move-in date for the placed group, the first eligibility date for the unplaced group). For placed persons, there was median gap of 95 days between eligibility and housing placement. However, even if substantial changes in Medicaid utilization were unlikely to occur during this period, we included Medicaid costs due to 5 service types as covariates in the propensity score models in order to minimize bias. Lastly, we assumed that housing placement status and confounders were not time-varying. Although we captured anyone placed in the supportive housing program and included them in the placed group, an inability to account for a dynamic aspect of housing and confounding could result in bias.
A main strength of this evaluation is that sequence analysis identified distinct groups of individuals based on 2 years of pre-baseline Medicaid records. Compared with conventional mean-or frequency-based methods, our method produced groups that likely more validly captured patterns of Medicaid utilization because they preserved sequences of claims over a sufficiently long time period [14,15,18]. Another strength is that this analysis controlled for confounding using a large number of variables that measure baseline demographic and behavioral characteristics. Lastly, the use of comparison groups and the larger sample size relative to that in previous studies strengthened the internal validity of the findings.

Conclusion
In conclusion, in an NYC supportive housing program for people with serious mental illness and chronic homelessness or dual diagnoses of mental illness and substance use disorder, there were Medicaid savings overall as well as for tenants who had patterns of very low coverage, emerging user, or second-highest user of Medicaid-funded services prior to baseline. This evaluation found that tenancy in a supportive housing program among individuals at risk of homelessness and with serious mental illness had a significant impact on Medicaid expenditures, and that cost savings were mostly driven by reduced psychiatric hospitalizations. These findings call for further evaluation of this program to examine its long-term impact and inclusion of detailed data about housing services or referrals to identify mechanisms through which supportive housing placement influences Medicaid cost savings.

Additional files
Additional file 1: A flow chart of sample selection process. This file illustrates how exclusion criteria were applied in order to generate a final evaluation sample. (DOCX 20 kb) Additional file 2: Covariates included in the propensity score models. This file contains a list of covariates that were included in the propensity score models. (DOCX 19 kb)