The statistical approach in trial-based economic evaluations matters: get your statistics together!

Mutubuki, Elizabeth N.; El Alili, Mohamed; Bosmans, Judith E.; Oosterhuis, Teddy; J. Snoek, Frank; Ostelo, Raymond W. J. G.; van Tulder, Maurits W.; van Dongen, Johanna M.

doi:10.1186/s12913-021-06513-1

BMC Health Services Research

Table 1 Statistical challenges in trial-based economic evaluations

1) Baseline imbalances
It is commonly assumed that the random allocation of participants in trial-based economic evaluations ensures that observed and unobserved characteristics are well balanced across study groups. Nevertheless, between-group differences in baseline values and/or important prognostic factors regularly occur [18]. Failure to account for such baseline imbalances will likely lead to biased results [18, 19]. Various methods have been suggested to account for baseline imbalances in trial-based economic evaluations, including mean difference adjustment, regression-based adjustment, and matching methods [13, 15, 18, 19, 20]. In the literature, regression-based approaches are most commonly used. Advantages of regression-based adjustment are that it allows adjustment for several covariates simultaneously, that it can identify important subgroups by using interaction terms, and that it is relatively simple to implement in standard statistical software packages [13, 15, 18, 20, 21, 22, 23, 24, 25, 26]. However, regression-based adjustment also has a number of drawbacks, including the need for normally distributed residuals and similar distributions of covariates across treatment arms [15]. Moreover, adjusting for several covariates simultaneously might result in overfitting of the model, which leads to misleading goodness-of-fit statistics, regression coefficients and p-values. Matching methods, which include propensity score adjustment and propensity score matching, can also be used. However, these methods were specifically developed for economic evaluations that use non-randomized study designs, because of their ability to deal with the non-randomized nature of the data.
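As an illustration of regression-based adjustment, the minimal sketch below regresses follow-up costs on a treatment indicator and baseline costs using ordinary least squares. The data, the variable names and the `ols` helper are hypothetical and are not taken from the article; the data are constructed without noise so that the true treatment effect (50) is known and the treated group happens to have higher baseline costs.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical trial: follow-up cost = 100 + baseline cost + 50 * treatment.
baseline = [200.0, 400.0, 600.0, 800.0, 300.0, 500.0, 700.0, 900.0]
treated = [0, 0, 0, 0, 1, 1, 1, 1]
followup = [100.0 + b + 50.0 * t for b, t in zip(baseline, treated)]

# Unadjusted difference in mean follow-up costs (confounded by baseline imbalance).
unadjusted = (mean([y for y, t in zip(followup, treated) if t == 1])
              - mean([y for y, t in zip(followup, treated) if t == 0]))

# Adjusted difference: coefficient of the treatment indicator,
# with baseline costs included as a covariate.
X = [[1.0, float(t), b] for t, b in zip(treated, baseline)]
intercept, adjusted, slope = ols(X, followup)
```

In this constructed example the unadjusted difference mixes the treatment effect with the baseline imbalance, whereas the coefficient of the treatment indicator recovers the effect built into the data. Real cost data are noisy and skewed, so the drawbacks listed above still apply.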
2) Skewed costs
Costs are generally right-skewed, as there are relatively few participants with (very) high costs and it is impossible to incur negative costs. Consequently, the normality assumption of standard parametric methods, such as t-tests and linear regression analyses, is violated [4]. If the sample size is large enough, however, the central limit theorem ensures that sample means will be approximately normally distributed, so that standard parametric statistical methods may still be used [27]. Log-transformations and standard non-parametric tests (e.g., Mann-Whitney U) are unsuitable for trial-based economic evaluations, since neither provides an estimate of the mean difference in costs, which decision-makers need to estimate the total budget impact of a new intervention [1, 7, 28, 29]. Methods that account for the skewness of costs while still comparing mean costs are non-parametric bootstrapping and generalized linear models (GLMs) assuming the distribution that fits the data best (e.g. Gamma, Log-Normal and Inverse-Gaussian) [1, 4, 7, 14, 28, 29, 30, 31, 32]. An important advantage of non-parametric bootstrapping is that it avoids the need for making distributional assumptions. Of the different non-parametric bootstrapping techniques, Bias-Corrected and Accelerated (BCa) bootstrapping is generally recommended, because it better adjusts for skewness and bias of the sampling distribution, resulting in more accurate confidence intervals than other techniques (e.g. normal bootstrap, percentile bootstrap, bootstrap-t method) [27, 33, 34, 35]. GLMs can deal with non-normally distributed data as well. However, a choice needs to be made about the most appropriate distribution of the outcome as well as the appropriate link function, which can sometimes be challenging [21, 36].
For the comparison of arithmetic means, an identity link could be specified, which obviates the need for complex retransformation techniques [37]. Nevertheless, specifying the identity link is suboptimal, because it does not ensure that only positive mean values are estimated, whereas the Gamma, Log-Normal and Inverse-Gaussian distributions are defined for positive values only [38].
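To make the bootstrapping idea concrete, the sketch below estimates a confidence interval around the difference in mean costs between two arms. Note that it uses the simpler percentile bootstrap, not the BCa variant recommended above, purely to keep the example short. The cost values, the function name `bootstrap_mean_difference` and its parameters are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def bootstrap_mean_difference(control, intervention,
                              n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap interval for the difference in mean costs
    (intervention minus control). Arms are resampled separately."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        c = rng.choices(control, k=len(control))            # with replacement
        i = rng.choices(intervention, k=len(intervention))  # with replacement
        diffs.append(mean(i) - mean(c))
    diffs.sort()
    # Approximate percentile ranks of the sorted bootstrap differences.
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return mean(intervention) - mean(control), lower, upper


# Hypothetical right-skewed costs: most participants are cheap, a few are expensive.
control_costs = [120, 150, 180, 200, 220, 250, 300, 400, 900, 2500]
intervention_costs = [100, 130, 160, 170, 210, 240, 280, 350, 700, 1800]

estimate, ci_lower, ci_upper = bootstrap_mean_difference(
    control_costs, intervention_costs)
```

The point estimate remains a difference in arithmetic means, which is the quantity decision-makers need, and no distribution is assumed for the costs.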
3) Correlated costs and effects
Costs and effects are typically correlated and hence their correlation should be accounted for [26, 37, 39]. Proposed methods for dealing with the correlated nature of costs and effects include non-parametric bootstrapping and seemingly unrelated regression (SUR). When bootstrapping, costs and effects are resampled in pairs, so that the correlation structure is kept intact when estimating statistical uncertainty [10]. When using SUR, two regression models are specified simultaneously, i.e. one for costs and one for effects, and the correlation between costs and effects is accounted for through correlated error terms [26, 40].
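Resampling in pairs can be sketched as follows: each bootstrap draw selects a participant and takes that participant's cost together with his or her own effect, so the within-participant correlation carries over into every bootstrap sample. The data (costs and QALYs), function names and parameters below are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def resample_pairs(costs, effects, rng):
    """Draw participants with replacement, keeping each cost with its own effect."""
    idx = [rng.randrange(len(costs)) for _ in costs]
    return [costs[i] for i in idx], [effects[i] for i in idx]


def bootstrap_increments(arm0, arm1, n_boot=1000, seed=7):
    """Return n_boot (incremental cost, incremental effect) pairs.
    Each arm is a (costs, effects) tuple and is resampled separately."""
    rng = random.Random(seed)
    increments = []
    for _ in range(n_boot):
        c0, e0 = resample_pairs(arm0[0], arm0[1], rng)
        c1, e1 = resample_pairs(arm1[0], arm1[1], rng)
        increments.append((mean(c1) - mean(c0), mean(e1) - mean(e0)))
    return increments


# Hypothetical data: costs and QALYs per participant.
control = ([500.0, 800.0, 1200.0, 3000.0, 650.0],
           [0.70, 0.65, 0.60, 0.40, 0.72])
intervention = ([700.0, 900.0, 1500.0, 2600.0, 750.0],
                [0.78, 0.74, 0.66, 0.55, 0.80])

pairs = bootstrap_increments(control, intervention)
```

The resulting pairs are the points that would typically be plotted on a cost-effectiveness plane. Resampling costs and effects independently of each other would break the link between them.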
4) Missing data
In clinical trials, missing data are common [1, 28, 29, 41]. This is of great concern for trial-based economic evaluations, because total costs are calculated as the sum of several cost components measured at different time points. If one resource use item or one time point is missing, total costs will also be missing [1, 28, 29]. According to Rubin [41], missing data can be classified into three mechanisms. First, if missing values do not depend on any observed or unobserved variable, data are said to be missing completely at random (MCAR). Second, when missing values are related to one or more observed variables, but not to the missing values themselves, data are said to be missing at random (MAR). Third, when missing data depend on the missing values themselves, data are said to be missing not at random (MNAR). In case of MAR and MNAR, bias may be introduced when analyses are restricted to complete cases or when naïve imputation methods, such as mean imputation or last observation carried forward, are used [1, 28, 29, 42]. Only when missing data can validly be assumed to be MCAR or the proportion of missing data is low (i.e. < 5%) may naïve imputation methods be used [43, 44]. In all other situations, naïve imputation methods are discouraged, because, amongst other things, they do not account for the uncertainty related to imputing missing values [9, 11, 45, 46, 47]. More advanced methods to account for missing data assuming MAR in trial-based economic evaluations include multiple imputation and statistical models with maximum likelihood estimation [9, 11, 45, 46]. Multiple imputation takes into account that imputed values are not the truth and results in valid estimates of mean outcomes and the associated uncertainty [48, 49]. However, multiple imputation relies heavily on correct specification of the imputation model [12]. In addition, with increased complexity of the imputation model, the model might fail to converge [50].
When using maximum likelihood estimation for multilevel missing data, missing data are not imputed; instead, all available data are used to compute the maximum likelihood estimate, that is, the most likely value had the variable been observed. However, maximum likelihood methods may not be appropriate when observations are missing for multiple variables, which is typically the case for cost data [48, 50].
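How multiple imputation carries imputation uncertainty into the final result can be illustrated with the pooling step. The sketch below combines estimates from several imputed datasets using Rubin's rules, the standard pooling formulas, which the table itself does not spell out. The function name `pool_rubin` and all numbers are hypothetical, and the imputation step itself is not shown.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their squared standard errors
    with Rubin's rules. Returns (pooled estimate, total variance)."""
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # variance across imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical mean cost differences from m = 5 imputed datasets,
# each with its own squared standard error.
estimates = [100.0, 120.0, 110.0, 90.0, 130.0]
variances = [400.0, 420.0, 380.0, 410.0, 390.0]

pooled_estimate, total_variance = pool_rubin(estimates, variances)
```

The between-imputation term is what naïve single imputation leaves out: it inflates the total variance in proportion to how much the estimates disagree across imputed datasets.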

ISSN: 1472-6963
