Longitudinal and cross-sectional assessments of health utility in adults with HIV/AIDS: a systematic review and meta-analysis

Background Utility estimates are important health outcomes for economic evaluation of care and treatment interventions for patients with HIV/AIDS. We conducted a systematic review and meta-analysis of utility measurements to examine the performance of preference-based instruments, estimate health utility of patients with HIV/AIDS by disease stages, and investigate changes in their health utility over the course of antiretroviral treatment. Methods We searched PubMed/Medline, Cochrane Database of Systematic Reviews, NHS Economic Evaluation Database and Web of Science for English-language peer-reviewed papers published during 2000–2013. We selected 49 studies that used 3 direct and 6 indirect preference-based instruments to make a total of 218 utility measurements. Random-effects models with robust estimation of standard errors and multivariate fractional polynomial regression were used to obtain the pooled estimates of utility and model their trends. Results Reliability of direct-preference measures tended to be lower than that of other types of measures. Utility elicited by two of the indirect preference measures, SF-6D (0.171) and EQ-5D (0.114), and by Time Trade-Off (TTO) (0.151) was significantly different from utility elicited by Standard Gamble (SG). Compared to asymptomatic HIV patients, symptomatic and AIDS patients reported decrements of 0.025 (p = 0.40) and 0.176 (p = 0.001) in utility scores, respectively, adjusting for method of assessment. In longitudinal studies, the pooled health utility of HIV/AIDS patients significantly decreased in the first 3 months of treatment, and rapidly increased afterwards. Magnitude of change varied depending on the method of assessment and length of antiretroviral treatment. Conclusion The study provides an accumulation of evidence on measurement properties of health utility estimates that can help inform the selection of instruments for future studies.
The pooled estimates of health utilities and their trends are useful in economic evaluation and policy modelling of HIV/AIDS treatment strategies. Electronic supplementary material The online version of this article (doi:10.1186/s12913-014-0640-z) contains supplementary material, which is available to authorized users.

structural components, it is important to have measures that can capture this complexity.
While in general quality of life is an abstract concept that is difficult to quantify, health-related quality of life (HRQL) is a concept that researchers and clinicians have used to assess a patient's ability to function in daily life and their perceived well-being [4]. Many different tools have been developed for the measurement of HRQL, and although they vary widely, HRQL is commonly treated as a multi-dimensional construct that captures all the relevant areas of a patient's life, including physical health, mental health and functioning, social interaction and role functioning, and general well-being [5]. HRQL can be assessed using generic or condition-specific measures. Generic measures are applicable to the general population and a large variety of diseases, while condition-specific measures are concerned with the issues and symptoms of a specific disease. Generic measures can typically be categorized as health status profiles, in which each domain of a patient's HRQL is scored separately, or as preference-based HRQL (utility) measures, in which patients' individual scores are preference-weighted to produce a single aggregate score [6]. In health assessment, utility is defined as "a cardinal measure of the preference for, or desirability of, a specific level of health status or specific health outcome". It is also described as a function of health status and the consumption of goods, services, and leisure over a specified period of time [7]. Utility measures are classified into two major approaches: direct and indirect preference-based measurement. Direct preference-based measures ask patients about the value they attach to their current subjective health states. Indirect preference-based approaches, by contrast, use preferences from other samples, usually from the general population, to generate preference index scores for hypothetical health states from an HRQL instrument [8].
Various generic and disease-specific HRQL measures have been applied in HIV populations [5,9-11], most of which, however, were developed before the advent of ART. As a result, these measures might include aspects of HRQL that are now less relevant while lacking issues that are increasingly important in HIV care and treatment [11]. For example, HIV patients may have concerns with sexual functioning, stigma, or body image, and their HRQL may be negatively affected by some of the side-effects of antiretroviral medication [5,9]. In addition, some important methodological considerations of HRQL measures have emerged, such as their sensitivity or responsiveness, and the appropriateness of repeated use in HIV populations [12]. Since many clinical interventions for HIV patients result in small but significant changes, it is important that HRQL measures used in HIV/AIDS populations are sensitive to such treatment changes [9]. Additionally, since HIV is a progressive and episodic disease, with different symptoms appearing at different times, any HRQL tool must also be responsive to patients' disease states over time. Finally, the ability of a tool to capture changes in HRQL over time is complicated by the fact that patients often become acclimated to their own disease state, and thus rate their current health higher even though there has been no change in clinical health status [3].
One of the most important uses of HRQL assessments in the sphere of HIV/AIDS is in decision making about the effectiveness and cost-effectiveness of treatments and interventions [13]. Generic, preference-based measures provide a single summary score of HRQL outcomes, an integral part of the quality-adjusted life-year (QALY) estimation, a measure which has been widely used in cost-effectiveness analyses of health interventions [8,14]. Although utility approaches have been increasingly applied in HIV interventions [15-18], measurements indicate a wide range of scores and use a wide range of methods [15,16]. Therefore, pooled estimates of utility measures both aggregate these data and maximize their external validity, making them more relevant and useful for policy makers and for researchers conducting economic evaluations of HIV interventions [19].
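To illustrate how a single preference-based utility score enters a QALY calculation, the following is a minimal sketch that integrates a utility trajectory over time with the trapezoidal rule. The function name and the example numbers are hypothetical and are not estimates from this review.

```python
def qalys(times_years, utilities):
    """Area under a utility-over-time curve (trapezoidal rule), in QALYs.

    times_years: observation times in years, in non-decreasing order.
    utilities:   utility scores (0 = dead, 1 = full health) at those times.
    """
    if len(times_years) != len(utilities) or len(times_years) < 2:
        raise ValueError("need at least two matching time/utility points")
    total = 0.0
    for i in range(1, len(times_years)):
        dt = times_years[i] - times_years[i - 1]
        if dt < 0:
            raise ValueError("times must be non-decreasing")
        # Average utility over the interval, weighted by its length.
        total += dt * (utilities[i] + utilities[i - 1]) / 2.0
    return total


# Hypothetical trajectory: 0.70 at baseline, 0.80 after one and two years.
print(round(qalys([0.0, 1.0, 2.0], [0.70, 0.80, 0.80]), 3))
```

In a cost-effectiveness model, the difference in QALYs between two such trajectories would form the denominator of an incremental cost-effectiveness ratio.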
Previous reviews have compared various instruments in HIV studies [9,11,12,20]; however, they did not sufficiently identify the applications of preference-based HRQL measures [9,11,21], nor did they examine the longitudinal changes in HRQL captured by these measures [16]. We hypothesized that the choice between indirect and direct preference-based HRQL measures might yield significantly different utility scores, and that the utility of patients deteriorated as the disease progressed and could be improved by antiretroviral treatment. The objectives of this study were to systematically review utility measures applied in HIV studies, estimate health utility of HIV/AIDS patients by disease stages, and investigate changes in their health utility over the course of antiretroviral treatment.

Eligibility criteria
This review followed the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines when selecting studies for inclusion [22]. Studies were included if 1) they were published in English between 2000 and February 2014 and retrieved by our search strategy; 2) they were longitudinal or cross-sectional studies that employed preference-based instruments of health utility and reported the composite score of health utility; 3) their sample included adult participants (≥18 years old); and 4) their full-text articles were available. To minimize the file-drawer effect, we contacted the principal investigators of identified studies on health utility and HIV/AIDS for which no paper or report had been published. In addition, we specifically searched for current well-known utility measures that have been applied to HIV populations, including indirect utility measures such as EuroQol (EQ-5D-3L and EQ-5D-5L), Health Utilities Index (HUI), Quality of Well-Being (QWB), Short Form-6D (SF-6D) and 15D, and direct utility measures such as Standard Gamble (SG), Time Trade-off (TTO) and Visual Analogue Scale (VAS). Studies were excluded if they 1) were letters, opinion pieces, editorials, ecological studies, abstracts, or conference proceedings for which full reports were not available; 2) were systematic reviews or meta-analyses; 3) used non-utility measures; or 4) reported health utility from proxies (e.g. doctors or caregivers). Due to accessibility, we limited our search to English-language papers. Since a previous study by Tengs and Lin synthesized utility estimates among HIV/AIDS patients up to 2000, we restricted our search to studies published after 2000 [16].

Information sources and search strategy
Two separate search strategies were performed: 1) searching with a combination of free-text keywords and 2) searching for the application of well-known utility measures in the HIV/AIDS field. The search process was conducted from 15th February, 2014 to 8th March, 2014 (date of last search). Four databases were used for the search process: PubMed/Medline, Cochrane Database of Systematic Reviews, NHS Economic Evaluation Database and Web of Science. The search terms used are listed in Table 1. The search strategy was modified for each database by experienced experts and librarians. Finally, the bibliographies of selected papers were reviewed and the authors of unpublished papers were contacted to identify all potentially relevant studies.

Study selection
After the search was completed, all duplicate studies were removed. Next, the titles and abstracts of all remaining studies were screened by the research team to ensure that they matched the selection criteria. All papers whose title and abstract showed that they did not match the selection criteria were excluded. Several further studies were excluded when their full-text articles revealed that they did not measure utility or that they duplicated data.

Data items and data collection
Using a data extraction form, three independent reviewers extracted specified data from the final selected studies. These reviewers compared their extraction results, discussing and resolving any disagreements prior to producing the final data file for the statistical analysis. Reliability of the data extraction among the three independent reviewers was 90%.
Data collected included information about study setting, study design, sample size, utility measure used, mean or median utility scores, standard deviations, methods of assessment, length of follow-up, and clinical and demographic characteristics of respondents. We collected some additional information about the measures used, including data about validity, reliability and responsiveness of each measure (if available).
To define the health utility of each subject based on clinical characteristics, we divided subjects into 3 disease stage categories: asymptomatic, symptomatic and AIDS. However, when we coded disease stage, we found that HIV/AIDS status was reported in numerous ways. For example, some articles simply divided their cohorts into 3 groups (asymptomatic HIV infection, symptomatic HIV infection, and AIDS) [23], while other authors reported CD4 cell counts or the presence of HIV/AIDS-defining illnesses. In the latter case, we used all available data to identify the health state based on the current Centers for Disease Control and Prevention (CDC) guidelines [23]. If authors described subjects without indicating data about HIV/AIDS stages or CD4 counts, the HIV/AIDS status was classified as "combined stages". If two articles described overlapping research findings from the same dataset, we removed the article that reported less methodological information.
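The coding of disease stage described above could be sketched roughly as follows. This is an illustrative assumption rather than the rule actually applied in the review. The function name, its arguments, the CD4 threshold of 200 cells/µL for AIDS, and the fallback to "combined stages" whenever symptom status is unknown are all hypothetical choices made for this example.

```python
STAGES = {"asymptomatic", "symptomatic", "AIDS"}

# ASSUMPTION: CD4 < 200 cells/uL (or an AIDS-defining illness) is treated as AIDS.
# The exact criteria used by the reviewers are not stated in the text.
AIDS_CD4_THRESHOLD = 200


def classify_stage(reported_stage=None, cd4_count=None,
                   aids_defining_illness=False, symptomatic=None):
    """Map the information reported for a study group to a disease stage category."""
    # Use the stage reported by the study authors when one is given.
    if reported_stage in STAGES:
        return reported_stage
    # Otherwise fall back on clinical data, if any were reported.
    if aids_defining_illness or (
        cd4_count is not None and cd4_count < AIDS_CD4_THRESHOLD
    ):
        return "AIDS"
    if symptomatic is True:
        return "symptomatic"
    if symptomatic is False:
        return "asymptomatic"
    # No usable stage information: keep the group as a mixed category.
    return "combined stages"


print(classify_stage(cd4_count=150))
print(classify_stage())
```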

Data analysis
We used two approaches in analyzing the data. The first aimed to obtain the pooled estimates of utility and examine the influence of study characteristics on these estimates [24]. We considered every assessment using a specific tool in both cross-sectional and longitudinal studies as a single measurement, making a dataset of 218 observations. Since most studies applied several HRQL measures, these studies were treated as clusters in the model, in which each within-study measurement was a nested observation [25]. Therefore, we conducted meta-regression analysis using a random-effects model with robust estimation of standard errors. If the standard deviation of the estimated utility was missing, we calculated it from the standard error or 95% confidence interval of the estimated utility. In the first model, individual measures were compared. Second, we fitted separate models for each of the subgroups of interest, adjusting for type of HRQL measure. Finally, we included all study characteristics in a multivariate model. The second approach was applied to the longitudinal measurements (n = 99) to estimate the changes in health utility of patients during ART. Conventional regression models often impose a linear dose-response relationship that might not truly reflect the variability of health outcomes over different durations of ART. To better describe the association between utility scores and duration on ART, we applied multivariate fractional polynomial models, which are intermediate between polynomials and non-linear curves. We fitted first-order and second-order fractional polynomial regressions with powers (−2, −1, −0.5, 0, 0.5, 1, 2, 3) for "duration on ART" to increase the flexibility in estimating the best-fitting curve for the health utility trajectories. Data were analyzed using STATA 12.0 with the 'xtmixed' and 'mfp' commands. The details of data analysis and the extracted data set are provided in Additional files 1 and 2.
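The recovery of a missing standard deviation from a standard error or a 95% confidence interval, and the idea of random-effects pooling, can be sketched as follows. This is a simplified illustration under stated assumptions: it assumes a normal approximation for the confidence interval, and it uses the DerSimonian-Laird estimator as a simpler stand-in for the clustered mixed model with robust standard errors fitted in STATA. All function names are hypothetical.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval (assumed)


def sd_from_se(se, n):
    """Standard deviation from the standard error of a mean and sample size."""
    return se * math.sqrt(n)


def sd_from_ci95(lower, upper, n):
    """Standard deviation from a 95% confidence interval of a mean."""
    se = (upper - lower) / (2.0 * Z_95)
    return sd_from_se(se, n)


def pool_random_effects(means, ses):
    """DerSimonian-Laird pooled mean, its standard error, and tau^2."""
    k = len(means)
    w = [1.0 / se ** 2 for se in ses]  # inverse-variance weights
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum(w)
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    pooled = sum(wi * m for wi, m in zip(w_star, means)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Made-up utility means and standard errors, for illustration only.
print(pool_random_effects([0.6, 0.8], [0.05, 0.05]))
```

Unlike this sketch, the model described above also accounts for several measurements being nested within the same study.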

Ethical approval
All data included in this review were previously published and publicly available. We only synthesized and analyzed aggregated data. Therefore, this study did not require ethical approval.

Results
Our systematic literature search yielded 49 studies for inclusion in this study (see Figure 1 for flow chart of the search). We selected these studies for their application of nine utility instruments to the field of HIV. These utility measures included 6 indirect and 3 direct preference-based measures (see Table 2 for descriptions of the measures and their psychometric properties). Of the 49 total studies, 14 utilized longitudinal designs, while 37 studies were cross-sectional, generating 218 utility estimates.
Of these 218 utility measures, 8 were of asymptomatic patients, 15 were of symptomatic patients, 56 were from AIDS patients, and 139 were of a combination of patients of different stages (Table 3). VAS accounted for the majority of utility measures (100 times, 45.9%), while HUI2 was only used in 1 measure (0.5%).

Psychometric properties of utility measures in HIV population
Few studies have reported the reliability of these measures. Stavem (2005) [17] determined that the test-retest reliability of EQ-5D, 15D and SF-6D was 0.78, 0.90 and 0.94, respectively. Among direct utility measures, Lara (2008) showed a low reliability of 0.41 for SG, while it was around 0.71-0.83 for TTO and VAS [16]. Many studies evaluated the validity of utility measures using concurrent and predictive validation. Several studies established convergent validity of EQ-5D, EQ-VAS, HUI3, SG, TTO and VAS by demonstrating their correlation with the subscales of the condition-specific MOS-HIV [17,26,27]. In addition, the EQ-5D and HUI3, along with 3 direct preference-based measures, were shown to discriminate subjects by disease severity according to the levels of CD4+ and viral load. Finally, the EQ-5D single index, 15D and SF-6D demonstrated responsiveness relative to a global rating of change [18], while the EQ-VAS and HUI3 demonstrated responsiveness to the development of opportunistic infections, clinical AIDS-defining events, and adverse events [18,26,27] (Table 4).

Utility estimates
Data from the 218 utility measurements of 27,951 subjects were extracted for meta-analysis. The meta-regression results are shown in Table 5, including Model 1 for comparison of individual measures, Models 2-6 for the subgroups of interest, adjusted for type of HRQL measure, and Model 7 for all characteristics.
Type of instrument used was a significant predictor of health utility estimates. Adjusting for study characteristics, the SF-6D and the HUI yielded the highest and lowest scores, respectively. We found large, statistically significant differences between utility elicited by SF-6D (0.171), EQ-5D (0.114), and TTO (0.151) and the reference measure, SG. Meanwhile, VAS and HUI provided utility estimates that were not significantly different from SG.
Health utility of HIV/AIDS patients in developing countries was 0.082 lower than that of patients living in developed countries. We did not find significant differences in utility estimates across different years of publication.

Longitudinal changes in health utility of HIV/AIDS patients
We used a multivariate fractional polynomial model of the 99 utility measurements from the 14 selected longitudinal studies to analyse changes in health utility over time (see Table 5-Model 8, Figures 2 and 3). The model's coefficients show that the duration of ART was a significant predictor of the changes in health utility scores of HIV/AIDS patients, after adjusting for study characteristics. Health utility of HIV/AIDS patients significantly decreased in the first 3 months of treatment, and rapidly increased afterwards (Figure 2). The magnitude of change was also affected by duration of ART, as well as by the methods of assessment. Direct preference-based measures resulted in greater changes in utility scores than indirect preference-based measures during the first year of treatment. Starting from the second year, though, the magnitude of change in health utility measured by indirect-preference instruments was larger than that measured by direct-preference ones. While this trend was typical for studies conducted in developed countries, it was slightly different in developing countries. In such countries as South Africa, Brazil, Thailand, Uganda, and Vietnam, patients' health utility markedly increased right after the initiation of ART, and then changed only slightly during the first 6 months of treatment, before increasing rapidly again afterwards (Figure 3).
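The fractional polynomial idea behind this model can be sketched as follows, assuming NumPy is available. The sketch fits every first- or second-order combination of the powers (−2, −1, −0.5, 0, 0.5, 1, 2, 3) to a single predictor by ordinary least squares and keeps the combination with the smallest residual sum of squares. This is a simplification: the 'mfp' command in STATA uses a formal selection procedure and adjusts for other covariates, neither of which is reproduced here. The function names and the example data are hypothetical and are not the review's data.

```python
import itertools

import numpy as np

POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_term(x, p):
    """Fractional polynomial transform: x**p, with p = 0 meaning ln(x)."""
    return np.log(x) if p == 0 else x ** p


def fp_design(x, powers):
    """Design matrix (with intercept) for the given tuple of FP powers."""
    cols = [np.ones_like(x)]
    prev = None
    for p in powers:
        term = fp_term(x, p)
        if p == prev:
            # A repeated power is multiplied by ln(x), by FP convention.
            term = term * np.log(x)
        cols.append(term)
        prev = p
    return np.column_stack(cols)


def best_fp(x, y, degree=2):
    """Return (rss, powers, coefficients) of the best-fitting FP model."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x <= 0):
        raise ValueError("x must be strictly positive for FP transforms")
    best = None
    for powers in itertools.combinations_with_replacement(POWERS, degree):
        X = fp_design(x, powers)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        if best is None or rss < best[0]:
            best = (rss, powers, beta)
    return best


# Made-up months on ART and utility scores, for illustration only.
months = np.array([1, 2, 3, 6, 9, 12, 18, 24, 36], dtype=float)
utility = 0.6 - 0.1 / months + 0.05 * np.log(months)
rss, powers, beta = best_fp(months, utility, degree=2)
print(powers)
```

Because durations must be strictly positive for the logarithmic and negative-power terms, a baseline measurement at time zero would need to be shifted by a small constant before fitting.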

Discussion
By systematically reviewing studies of health utility among HIV/AIDS patients, we provide an accumulation of psychometric evidence on the preference-based HRQL instruments applied in this patient group. Moreover, we compared the performance and utility estimates of various instruments, and modelled the changes in health utility over the course of HIV/AIDS treatment. Prior to this work, Tengs and Lin conducted a meta-analysis of health utility estimates from studies published from 1985-2000 [16]. Like them, we found that disease stage is an important predictor of health utility and that different HRQL instruments might yield clinically important differences in health utility scores. Moreover, this study provides the most up-to-date evidence on preference-based HRQL assessments among patients with HIV/AIDS during 2000-2013, the period when HIV/AIDS treatment services were rapidly scaled up in developing countries. We extend previous work by analyzing the changes in health utility of patients over the course of ART. In particular, we found that different types of instruments had different levels of responsiveness over the early and stable periods of ART. When analyzing the performance of the different instruments, we found that the Time Trade-off (TTO) instrument, SF-6D, and EQ-5D yielded higher utility scores than the reference Standard Gamble (SG) instrument, while the Visual Analogue Scale (VAS), HUI, and 15D showed no statistically significant difference from the SG. This is in contrast to various other studies, in which the SG method generally yields the highest utility score among direct-preference instruments [8,72]. Generally, it is believed that SG yields higher health utilities because it asks patients to make a gamble between a chance of good health and a chance of death, and most people are reluctant to accept a large risk of death to avoid an adverse health state [72,73].
There has been very little research about the effect of context on SG and TTO instruments, and yet our results indicate that these instruments may perform differently in HIV/AIDS populations [74]. Indeed, one of the papers included in this review showed that SG was an unreliable measure of health utility in HIV/AIDS patients (0.41) and that TTO and VAS were much more reliable (0.71-0.83) [17]. This low reliability may help explain why SG yielded lower utility scores, contrary to what was expected.
Finally, our use of pooled health utility estimates to determine the changes in HRQL during treatment has significant implications for economic and clinical evaluation of HIV/AIDS care and treatment interventions. In particular, the rapid reduction in health utility during the first 3 months of ART highlights the importance of intensive support for patients after ART initiation to relieve both the physical and psychological burden experienced by these patients.
The strengths of this meta-analysis include a systematic approach to synthesizing evidence from the literature. In addition, we applied multivariate fractional polynomial models to select the best-fitting model for changes in health utility and length of ART. However, there are some limitations to be acknowledged. First, aggregated data in some studies limited the estimation ability of the model. Second, the length of ART was inconsistent between different patient groups and health utility measures. Third, some instruments, such as 15D and HUI2, were applied very infrequently, which resulted in imbalanced models. Finally, since the selected measures are generic instruments, we were not able to identify a set of common measures, including HIV-specific items, to be used for comparing across studies.
The pooled estimates of health utilities and their trends throughout the course of ART reported in this study offer valuable information about the effect of ART on HIV/AIDS patients' health-related quality of life, which in turn can support the development of economic models for evaluating the cost-effectiveness of HIV/AIDS treatment strategies. Researchers can use the utility scores estimated in this study to quantify time-dependent health outcomes of interventions in their cost-effectiveness models. In addition, significant reductions in health utility during the first six months on ART suggest that additional care and support and intensive monitoring should be incorporated into clinical practice. Finally, this study provides a basis for the selection of preference-based HRQL instruments for future research in HIV populations.

Conclusion
The study provides an accumulation of evidence on measurement properties of health utility estimates that can help inform the selection of instruments for future studies. The pooled estimates of health utilities and their trends are useful in economic evaluation and policy modelling of HIV/AIDS treatment strategies.