Multilevel latent class casemix modelling: a novel approach to accommodate patient casemix
BMC Health Services Research volume 11, Article number: 53 (2011)
Using routinely collected patient data we explore the utility of multilevel latent class (MLLC) models to adjust for patient casemix and rank Trust performance. We contrast this with ranks derived from Trust standardised mortality ratios (SMRs).
Patients with colorectal cancer diagnosed between 1998 and 2004 and resident in Northern and Yorkshire regions were identified from the cancer registry database (n = 24,640). Patient age, sex, stage-at-diagnosis (Dukes), and Trust of diagnosis/treatment were extracted. Socioeconomic background was derived using the Townsend Index. Outcome was survival at 3 years after diagnosis. MLLC-modelled and SMR-generated Trust ranks were compared.
Patients were assigned to two classes of similar size: one with reasonable prognosis (63.0% died within 3 years), and one with better prognosis (39.3% died within 3 years). In patient class one, all patients diagnosed at stage B or C died within 3 years; in patient class two, all patients diagnosed at stage A, B or C survived. Trusts were assigned to two classes, with 51.3% and 53.2% of patients respectively dying within 3 years. Differences in the ranked Trust performance between the MLLC model and SMRs were all within estimated 95% CIs.
A novel approach to casemix adjustment is illustrated, ranking Trust performance whilst facilitating the evaluation of factors associated with the patient journey (e.g. treatments) and factors associated with the processes of healthcare delivery (e.g. delays). Further research can demonstrate the value of modelling patient pathways and evaluating healthcare processes across provider institutions.
Survival from cancer varies according to many factors including place of diagnosis and treatment centre (Trust), [1, 2] stage at diagnosis, [3, 4] and associated risk factors such as age at diagnosis, sex, and socioeconomic background (SEB) [5–9]. Some Trusts perform better or worse than others in terms of average survival rates, perhaps due to patient casemix at the time of entry into the healthcare system, though patient outcome differences may also reflect underlying differences in the effectiveness of healthcare organisations. Much interest lies in identifying good and poor performing healthcare providers, to identify best practice and advocate changes in under-performing institutions. It is important to account for patient casemix when evaluating institutional performance, and several strategies are currently in use.
Regression (linear or logistic) is a traditional and well-documented approach, [10] where variables relating to patient characteristics are modelled, effectively adjusting the outcome for the likely influences of these factors. Methods such as matching, stratification, [10] or propensity score analysis [11, 12] may also be used, though these techniques make potentially untestable assumptions and do not account for the impact of unmeasured variables or accommodate Trust-level variation. Although multilevel modelling accounts for patients nested within Trusts, and provides improved estimates compared with logistic regression, [13, 14] it makes parametric assumptions that may not be tenable. Other methods, such as boosted decision trees, [15] have occasionally been used, though these can be difficult to interpret.
No casemix-adjustment strategy will eliminate all bias due to unmeasured differences amongst patients; [16] some procedures may even increase bias [17]. Accommodating patient variation through measured variables only is crude: models ought to reflect the uncertainty associated with patient casemix characteristics. Furthermore, casemix adjustment does not account for differences in patient treatments. Failure to capture variation in patient pathways and their consequences may result in over-simplistic interpretation of healthcare processes and consequent outcomes. Models need to accommodate patient casemix, the patient experience, and uncertainty in both.
Multilevel latent class (MLLC) modelling is proposed to: (i) adjust for patient casemix whilst accommodating uncertainty surrounding unrecorded patient characteristics; (ii) adjust for patient pathways in terms of the delivery of appropriate healthcare (e.g. treatments); and (iii) differentiate patient outcomes in relation to institutional process characteristics (e.g. delays to treatment). To demonstrate and validate all three steps simultaneously is challenging. The first of these is explored here. We contrast the MLLC model ranking of Trust performance with that of ranks derived from calculating Trust standardised mortality ratios (SMRs). To illustrate our methodology, we study routine data on colorectal cancer patients from a large UK health region.
The illustrative colorectal cancer dataset
Patients with colorectal cancer (ICD10 [18] codes C18, C19 and C20) diagnosed between 1998 and 2004 and resident in the Northern and Yorkshire regions were identified from the Northern and Yorkshire Cancer Registry and Information Service (NYCRIS) database. Patient age, sex, tumour stage at diagnosis (using the Dukes classification [19]), Trust of diagnosis/treatment, and whether or not the patient received treatment were extracted. Initial data extraction yielded 26,455 unique patient records. Socioeconomic background (SEB) was defined at the 2001 enumeration district level of residence (super output area) using the Townsend Index [20] and matched to patients using postcode. The primary outcome was dead or alive three years following diagnosis, which is clinically meaningful since colorectal cancer has a median survival of approximately three years and survival to three years is often considered for policy reasons.
An area deprivation score could not be obtained for one case. Patients with age at diagnosis greater than 100 years (7 patients) and patients identified by death certificate only (364; 1.4%) were excluded. Some patients had multiple diagnosis codes and for patients attending more than one hospital (16,549; 63%), the location of the most recent Trust with a relevant diagnosis code was recorded as the diagnostic/treatment centre, as this provided the latest staging information. For patients who did not have a relevant diagnosis code for any Trust visits (220; 0.83%), the location of their first Trust visit was taken as the diagnostic/treatment centre. Some 1,239 (4.7%) patients were excluded as their diagnostic centres were outside the NYCRIS region. Following exclusions, 24,640 (93%) of the identified patients remained for analysis.
Latent class analysis (LCA) is well established within single-level regression analysis. Also known as discrete latent variable modelling, or mixture modelling, it requires one to determine a number of latent classes, or subgroups, the optimum choice of which is typically informed by log-likelihood statistics. The Bayesian Information Criterion (BIC), [21] the Akaike Information Criterion (AIC), [22] and changes in log-likelihood (LL) are used as model-fit indicators, though models might also be selected on the basis of interpretation [23]. Model parameters of each latent class are determined empirically, along with their contribution to the outcome distribution. LCA models are useful where subtypes are sought and one wishes to model uncertainty surrounding class membership, since observations may belong to all classes, with probabilities determined empirically. LCA thus reflects the uncertainty associated with a limited number of predictors when determining subtypes of outcomes.
The proposed LCA models are multilevel because patients are nested within diagnostic/treatment centres (Trusts). LCA extends to a multilevel setting by incorporating discrete latent variables at all levels of the hierarchy. For the colorectal cancer data, latent classes at the patient level model uncertainty surrounding affiliation to patient subgroups, and latent classes at the Trust level model Trust variation. The modelling strategy was to determine patient-level latent classes (having included patient-level covariates) with Trust-level variation accommodated initially by a continuous latent variable. With the patient-level subtype structure fixed, Trust classes were then sought by switching the Trust-level latent variable from continuous to categorical. A minimum of two Trust classes was required to exhibit discretised Trust class differences in patient outcomes.
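The role of the model-fit indicators can be sketched in a few lines. The example below computes AIC and BIC from a log-likelihood and a parameter count and picks the number of patient classes with the smallest BIC. The log-likelihoods and parameter counts are invented for illustration; only the sample size of 24,640 comes from the text, and the helper names are hypothetical.

```python
import math

def information_criteria(log_likelihood, n_params, n_obs):
    """Return (AIC, BIC) for a fitted model; lower values indicate better fit."""
    aic = -2.0 * log_likelihood + 2.0 * n_params
    bic = -2.0 * log_likelihood + n_params * math.log(n_obs)
    return aic, bic

# Hypothetical fits: number of patient classes -> (log-likelihood, parameter count).
# These values are invented for illustration only.
candidate_fits = {1: (-16950.0, 10), 2: (-16800.0, 21), 3: (-16790.0, 32)}
n_patients = 24640  # analysis sample size reported in the text

def best_by_bic(fits, n_obs):
    """Return the number of classes whose fit has the smallest BIC."""
    return min(fits, key=lambda k: information_criteria(*fits[k], n_obs)[1])

chosen = best_by_bic(candidate_fits, n_patients)
```

With these invented values the third class improves the log-likelihood only slightly, so the BIC penalty for its extra parameters outweighs the gain and the two-class model is preferred.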
The proposed modelling strategy builds upon work originated by Downing et al., [24] where multilevel LCA circumvented potential bias due to the 'reversal paradox' when adjusting for confounders on the causal path between exposure and outcome [25]. We have no such concerns here, since we are not seeking inference of any exposure nor confounder adjustment: rather, we seek to optimise outcome prediction by modelling patient characteristics to accommodate casemix differences. Consequently, all available covariates for which there were complete data (age, sex, and SEB) were considered by the modelling process, along with stage at diagnosis (coded A to D for increasing severity, with missing coded X). Stage was included despite a degree of missing data (13.1%), because it is known to influence survival, [3, 4] and a 'missing' category was added for convenience. Although additional patient variables were available, such as time-to-first-treatment and treatment-received, these had substantial missing data, which calls their utility into question, and were therefore not used. Patient age at diagnosis and Townsend score (SEB) were continuous measures; age was centred on the study mean (71.5 years) and SEB was centred on the population mean of zero (study mean was -0.040). Both covariates exhibited a non-linear relationship with 3-year survival, so a quadratic term for age was included in the model; and by 'trimming' the tails of SEB (assigning rare values beyond ±5.0 to ±5.0), it was possible to avoid higher order terms for Townsend score. The model is described in the Appendix.
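The covariate preparation described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name, the dictionary keys, and the example record are assumptions, while the centring value (71.5 years), the ±5.0 trimming limit, and the stage codes A to D with X for missing come from the text.

```python
AGE_MEAN = 71.5      # study mean age reported in the text
SEB_LIMIT = 5.0      # Townsend scores beyond +/-5.0 are trimmed to the limit
STAGES = ("A", "B", "C", "D")

def prepare_covariates(age, townsend, stage):
    """Return model-ready covariates for one patient record."""
    age_c = age - AGE_MEAN                               # centre age on the study mean
    seb = max(-SEB_LIMIT, min(SEB_LIMIT, townsend))      # trim the tails of SEB
    stage_code = stage if stage in STAGES else "X"       # missing stage becomes its own category
    return {"age_c": age_c, "age_c_sq": age_c ** 2, "seb": seb, "stage": stage_code}

# Hypothetical patient: aged 80.5, very deprived area, stage not recorded.
example = prepare_covariates(age=80.5, townsend=7.2, stage=None)
```

For this record the centred age is 9.0 (squared term 81.0), the Townsend score is trimmed from 7.2 to 5.0, and the stage is coded X.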
SMRs were calculated for each Trust (standardised by age, sex, deprivation and stage) and a scaled difference from 'SMR = 1' was determined for each Trust by dividing by the square root of the Trust size. For both the SMRs and the MLLC models, 200 bootstrapped datasets were generated and each was analysed in the same manner to determine 95% confidence intervals (CIs). We used MLLC to calculate absolute differences in Trust effects on the log odds scale (with patient-level values aggregated to the Trust level) before ranking in order of 'best' to 'worst' survival, to compare with the ranks generated from the Trust SMRs. For data manipulation, summary statistics, tabulation, and charts, Stata was used; [26] for latent variable models, Latent GOLD [27] was used.
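As a rough sketch of the SMR calculation, the example below uses indirect standardisation: expected deaths for a Trust are the sum, over its patients, of the pooled death rate in each patient's stratum, and the SMR is observed divided by expected deaths. The choice of indirect standardisation, the single "stratum" label standing in for the age, sex, deprivation and stage combination, and the toy records are all assumptions. The scaled difference follows the wording of the text (difference from 1 divided by the square root of Trust size).

```python
from collections import defaultdict
import math

def trust_smrs(records):
    """records: iterable of (trust, stratum, died) with died in {0, 1}.

    Returns {trust: (smr, scaled_difference)} using pooled stratum death
    rates as the reference (indirect standardisation).
    """
    records = list(records)
    stratum_n = defaultdict(int)
    stratum_deaths = defaultdict(int)
    for _, stratum, died in records:
        stratum_n[stratum] += 1
        stratum_deaths[stratum] += died
    rate = {s: stratum_deaths[s] / stratum_n[s] for s in stratum_n}

    observed = defaultdict(int)
    expected = defaultdict(float)
    size = defaultdict(int)
    for trust, stratum, died in records:
        observed[trust] += died
        expected[trust] += rate[stratum]
        size[trust] += 1

    result = {}
    for trust in size:
        if expected[trust] == 0:
            smr = float("nan")          # no deaths expected: SMR undefined
        else:
            smr = observed[trust] / expected[trust]
        # Scaled difference from SMR = 1, as worded in the text.
        result[trust] = (smr, (smr - 1.0) / math.sqrt(size[trust]))
    return result

# Toy data: two Trusts, two strata, four patients each.
toy = [
    ("T1", "old", 1), ("T1", "old", 1), ("T1", "young", 1), ("T1", "young", 0),
    ("T2", "old", 1), ("T2", "old", 0), ("T2", "young", 0), ("T2", "young", 0),
]
smrs = trust_smrs(toy)
```

In the toy data both Trusts have the same casemix and therefore the same expected deaths (2.0), so T1 with three deaths has an SMR of 1.5 and T2 with one death has an SMR of 0.5.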
Table 1 summarises the 'ideal' MLLC model determined by the procedures described. Patients were assigned to two latent classes of similar size, one with reasonable prognosis (PC1: 54.3% of cases, of which 63.0% died within three years), and one with better prognosis (PC2: 45.7% of cases, of which 39.3% died within three years). Trusts were similarly assigned to two latent classes. The largest Trust class, with 53.1% of patients, had better prognosis (TC1: 51.3% of patients died within three years; TC2: 53.2% of patients died within three years). Table 2 summarises the number of deaths within each patient class by stage. Allocating patients to classes according to their largest class probability (modal assignment), all patients in PC1 diagnosed at either stage B or C died within three years; in PC2, all patients diagnosed at stage A, B or C survived. This difference is anticipated, as stage at diagnosis is an important predictor of survival. More of the early- or mid-stage patients died within three years in PC1 than in PC2, and there was a clear gradation in survival with increasing stage at diagnosis from early- to late-stage within both classes. The predictor age differed substantially across classes. In contrast, the predictors deprivation and sex differed only marginally between patient classes.
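Modal assignment, as used for Table 2, simply allocates each patient to the class with the largest posterior probability. A minimal sketch follows; the patient identifiers and probabilities are hypothetical.

```python
def modal_class(posteriors):
    """Return the 1-based index of the class with the largest posterior probability."""
    return max(range(len(posteriors)), key=posteriors.__getitem__) + 1

# Hypothetical posterior probabilities (PC1, PC2) for three patients.
patients = {"p1": (0.82, 0.18), "p2": (0.35, 0.65), "p3": (0.51, 0.49)}
assigned = {pid: modal_class(p) for pid, p in patients.items()}
```

Patient p3 shows what modal assignment discards: the patient is allocated to PC1 although the two classes are almost equally likely, which is why the model itself works with the full probabilities rather than with hard allocations.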
Trust ranks and their bootstrapped 95% CIs are summarised in Table 3; a low ranking value indicates a better survival rate than expected. Differences in the median rank of Trust performance between the MLLC model approach and the Trust SMRs are within their estimated 95% CIs. Figure 1 provides a graphical representation of these results, in order of increasing median probability of belonging to the best survival Trust class by the MLLC methodology.
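The bootstrapped rank intervals can be illustrated with a simplified sketch. Here patients are resampled with replacement, a crude death proportion per Trust stands in for the SMR or MLLC-based Trust effect, Trusts are ranked within each resample, and the median and 95% percentile interval of each Trust's rank are reported. The resampling scheme, the percentile method, the tie-breaking rule, and the toy data are assumptions; only the figure of 200 resamples comes from the text.

```python
import random

def death_rates(records):
    """Crude death proportion per Trust; a stand-in for an SMR or model-based effect."""
    totals, deaths = {}, {}
    for trust, died in records:
        totals[trust] = totals.get(trust, 0) + 1
        deaths[trust] = deaths.get(trust, 0) + died
    return {t: deaths[t] / totals[t] for t in totals}

def ranks(scores):
    """Rank Trusts from 1 (lowest score, best survival); ties broken by Trust name."""
    ordered = sorted(scores, key=lambda t: (scores[t], t))
    return {t: i + 1 for i, t in enumerate(ordered)}

def bootstrap_rank_intervals(records, n_boot=200, seed=0):
    """Return {trust: (lower, median, upper)} rank summaries from resampled datasets."""
    rng = random.Random(seed)
    trusts = sorted({t for t, _ in records})
    collected = {t: [] for t in trusts}
    for _ in range(n_boot):
        sample = [rng.choice(records) for _ in records]
        r = ranks(death_rates(sample))
        if len(r) < len(trusts):      # a Trust is absent from this resample; skip it
            continue
        for t in trusts:
            collected[t].append(r[t])
    summary = {}
    for t, values in collected.items():
        if not values:
            continue
        values.sort()
        n = len(values)
        summary[t] = (values[int(0.025 * n)], values[n // 2],
                      values[min(n - 1, int(0.975 * n))])
    return summary

# Toy data: three Trusts of 40 patients with 10, 20 and 30 deaths respectively.
toy = ([("A", 1)] * 10 + [("A", 0)] * 30 +
       [("B", 1)] * 20 + [("B", 0)] * 20 +
       [("C", 1)] * 30 + [("C", 0)] * 10)
intervals = bootstrap_rank_intervals(toy)
```

Two ranking methods can then be compared Trust by Trust by checking whether the difference in median ranks lies within these intervals.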
In a standard multilevel setting, where a continuous latent variable is adopted at the Trust level, the implicit assumption is that Trust-level outcomes have an underlying normal distribution (conditional on Trust-level covariates): Trusts are effectively treated as a random sample of a larger (infinite) population of Trusts. Trusts are not, however, randomly placed geographically, nor are patients randomly assigned to Trusts. By adopting discrete latent variables, parametric assumptions were therefore replaced by less restrictive ones, although a degree of geographical dependency remains unaccounted for, which is a limitation. The simplest MLLC model, adopted here, is one in which the continuous latent variable at the upper level is replaced by a categorical latent variable. The model estimates the mean outcome for each Trust class and the size of each Trust class (the sum of Trust probabilities for that class); no assumptions are made regarding the underlying distribution or class sizes. More complex models can extend this approach to accommodate spatial dependencies, though this will be part of future developments.
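To illustrate how a categorical latent variable assigns a Trust probabilistically to classes, the following minimal sketch computes posterior class probabilities for a single Trust from a two-class binomial mixture. It is a deliberate simplification and not the fitted MLLC model: it ignores patient-level classes and covariates. The class death probabilities are the Trust-class figures reported in the results, whereas the equal class sizes and the example Trust (250 deaths among 500 patients) are assumptions.

```python
import math

def trust_class_posterior(deaths, n, class_sizes, class_death_probs):
    """Posterior probability of each Trust class for a Trust with `deaths` out of `n`.

    Each class is a binomial component; the binomial coefficient is common to
    all classes and cancels, so it is omitted. Requires 0 < p < 1 for every class.
    """
    log_w = [
        math.log(size) + deaths * math.log(p) + (n - deaths) * math.log(1.0 - p)
        for size, p in zip(class_sizes, class_death_probs)
    ]
    top = max(log_w)                       # subtract the maximum for numerical stability
    w = [math.exp(v - top) for v in log_w]
    total = sum(w)
    return [v / total for v in w]

# Assumed equal class sizes; death probabilities for TC1 and TC2 from the results.
posterior = trust_class_posterior(
    deaths=250, n=500, class_sizes=(0.5, 0.5), class_death_probs=(0.513, 0.532)
)
```

A Trust with an observed death rate of 50% is more likely to belong to the better-survival class (TC1), and the assignment becomes sharper as the number of patients in the Trust grows.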
An upper-level discrete latent variable allows individual Trusts to be assigned probabilistically across the discrete latent classes, providing less restricted weighting of Trust relative performance. This may improve the accuracy of the estimated patient outcome differences across Trust classes, which in turn may improve the estimated patient casemix adjustment for individual Trusts. The MLLC model is more likely to capture contextual effects due to the inherent data hierarchy than either a standard multilevel approach or merely estimating Trust ranks according to their SMRs. Continuous and discrete latent variables, if combined, may prove more parsimonious, with variation within each Trust class captured by the continuous latent variable, potentially leading to fewer Trust classes being needed to describe overall Trust-level variation. Where determination of Trust ranks is important, the estimation of Trust outcomes is simpler if only the categorical latent variable is adopted at the Trust level, avoiding derivation of the normally distributed effects within each Trust class. Addressing spatial dependencies amongst the Trusts may nevertheless warrant incorporating upper-level effects.
In fixing patient-level latent class composition and accommodating patient casemix differences, the residual Trust-class differences in outcome reflect variations in Trust performance that depend upon Trust characteristics (differences in the treatments given and healthcare delivery processes). Model improvement might be feasible with more patient-level variables, but this would incorporate incomplete data, which can cause bias. Within a latent class framework the uncertainty surrounding unrecorded or unused patient characteristics is modelled explicitly: 'fuzzy' matching. Trust-level covariates might explain some of the Trust-class outcome differences if included. The optimum number and composition of Trust (and possibly patient) classes may change with the inclusion/exclusion of different covariates.
The probabilities of Trust class membership in Table 3 were marked, with most Trusts belonging entirely or predominantly to one Trust class. This is unsurprising, as there is only a modest difference between the two classes in median survival, and probabilistic assignment differentiates between the two, providing a class-weighted combined survival rate. It is not feasible, however, for a Trust to be assigned a class-weighted survival rate below that of the poorer survival class, or above that of the better survival class. This is an implicit constraint on the estimated weighted survival for Trusts allocated entirely to one of the two classes (e.g. Trust 1). To alleviate this, more Trust-level classes could be sought, increasing the number until no Trust had a probabilistic assignment of exactly one for classes at the extremities of the range of Trust outcome means. More research is needed, but as applied here, the estimated ranks appear robust.
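The constraint on class-weighted rates can be seen in a short worked example. The sketch below computes a Trust's class-weighted death rate from its class membership probabilities, using the two Trust-class death rates reported in the results; the membership probabilities are hypothetical.

```python
CLASS_DEATH_RATES = (51.3, 53.2)  # % dying within three years in TC1 and TC2 (from the text)

def weighted_death_rate(class_probs, class_rates=CLASS_DEATH_RATES):
    """Class-weighted death rate (%) for one Trust."""
    if abs(sum(class_probs) - 1.0) > 1e-9:
        raise ValueError("class probabilities must sum to 1")
    return sum(p * r for p, r in zip(class_probs, class_rates))

mixed = weighted_death_rate((0.8, 0.2))    # Trust mostly in TC1
extreme = weighted_death_rate((1.0, 0.0))  # Trust allocated entirely to TC1
```

The mixed Trust receives 0.8 × 51.3 + 0.2 × 53.2 = 51.68%, while the Trust allocated entirely to TC1 receives exactly 51.3%. No choice of probabilities can produce a value outside the range 51.3% to 53.2%, which is the implicit constraint described above.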
Although the analyses undertaken were primarily for illustration of the proposed methodology, the results are to be taken seriously. Bias may have occurred, however, due to patients with more than one Trust visit having been assigned the most recent Trust visited as the treatment centre. If diagnosis was made at a separate Trust to that which subsequently provided treatment, it would be the latter that was important when modelling healthcare delivery and process variables. In our dataset, 75% of patients visited only one Trust. Nevertheless, some inaccuracies may remain, which could be addressed by screening each patient journey to determine where the majority of interventions take place, or by using multilevel multiple membership models for multiple treatment centres. Furthermore, technically, we have cross-classified data, with patients nested in both area of residence (which yields the patient SEB) and diagnostic centre (Trust); the area level is thus crossed with the Trust level. The number of patients in each area, however, is small and for simplicity of illustration we discarded this level in our model. The methodological principles of MLLC modelling extend theoretically to a cross-classified context, but software does not yet facilitate this.
We have satisfactorily demonstrated the principles of step (i) outlined previously, but there is more research to be undertaken to determine the processes for steps (ii) and (iii), which embark upon modelling patient pathways and the evaluation of process differences that vary across healthcare provider institutions. Distinction could then be made between the delivery of care (e.g. treatments) and health service process characteristics (e.g. delays to treatment) that make up the total patient experience. The proposed methodology paves the way for a more advanced modelling approach to the analysis of treatment centre characteristics (in addition to patient casemix characteristics), where differences in the patient pathway of care are modelled to evaluate organisational features in relation to patient outcomes. Such strategies permit hypothesis generation around which healthcare delivery and organisational features warrant intervention, informing prospective cluster-randomised trials targeted at improving service organisation and delivery. This feeds into existing approaches for quality improvement research, consistent with the principles of the MRC framework for the development and evaluation of complex interventions [28].
The main advantage of the MLLC approach is that it provides accurately derived estimates of the outcome differences across Trust classes, and hence improved 'casemix adjustment' for individual Trusts. Trust-level covariates may be included, capturing additional casemix complexity. Although deliberately simplified, our illustration demonstrates a principle that could readily extend to a number of more sophisticated scenarios (e.g. time-to-event analysis, multiple treatment centres, cross-classified structures). The MLLC model paves the way to adjust for variations in the patient pathway (especially delivery of appropriate healthcare), permitting the evaluation of institutional processes, which should provide a more robust approach to evaluating institutional performance than is current practice.
The multilevel latent class model used in this study takes the form:
where $y_{ij}$ is the outcome (death = 1, alive = 0) for patient $i$ within Trust $j$; $\mathbf{x}_{ij}$ is the vector of patient-level covariates; $t$ are the Trust classes ($1 \ldots T$); and $c$ are the patient classes ($1 \ldots C$); $p(c \mid t)$ is the probability of being in patient class $c$ conditional on being in Trust class $t$, and in this study $C$ is taken as the same for each Trust. The patient class model, $P(c)$, expands to:
where $\beta_0^{(c)}$ to $\beta_5^{(c)}$ are the patient-class specific coefficients for the patient-level covariates.
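As a hedged sketch, a generic two-level latent class specification consistent with these definitions can be written as below. It is an assumed form based on standard multilevel mixture notation, not the authors' exact equations: the Trust class sizes $\pi_t$, the number of patients $n_j$ in Trust $j$, and the assignment of the six coefficients to intercept, age, age squared, sex, SEB and stage are all assumptions (stage is categorical, so its term would in practice be a set of indicator effects). A Trust-class-specific intercept could also be added to the second expression.

```latex
P(\mathbf{y}_j \mid \mathbf{x}_j)
  = \sum_{t=1}^{T} \pi_t \prod_{i=1}^{n_j} \sum_{c=1}^{C}
    p(c \mid t)\, P(y_{ij} \mid \mathbf{x}_{ij}, c)

\operatorname{logit} P(y_{ij} = 1 \mid \mathbf{x}_{ij}, c)
  = \beta_0^{(c)} + \beta_1^{(c)}\,\mathrm{age}_{ij} + \beta_2^{(c)}\,\mathrm{age}_{ij}^{2}
  + \beta_3^{(c)}\,\mathrm{sex}_{ij} + \beta_4^{(c)}\,\mathrm{SEB}_{ij}
  + \beta_5^{(c)}\,\mathrm{stage}_{ij}
```

In this form the first expression mixes over Trust classes at the upper level and over patient classes within each Trust, and the second gives the class-specific logistic regression for the three-year outcome.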
1. Kee F, Wilson RH, Harper C, Patterson CC, McCallion K, Houston RF, et al: Influence of hospital and clinician workload on survival from colorectal cancer: cohort study. BMJ. 1999, 318: 1381-1385.
2. Steele RJ: The influence of surgeon case volume on outcome in site-specific cancer surgery. Eur J Surg Oncol. 1996, 22: 211-213.
3. Monnet E, Faivre J, Raymond L, Garau I: Influence of stage at diagnosis on survival differences for rectal cancer in three European populations. Br J Cancer. 1999, 81: 463-468. 10.1038/sj.bjc.6690716.
4. Woodman CB, Gibbs A, Scott N, Haboubi NY, Collins S: Are differences in stage at presentation a credible explanation for reported differences in the survival of patients with colorectal cancer in Europe?. Br J Cancer. 2001, 85: 787-790. 10.1054/bjoc.2001.1958.
5. McMillan DC, Hole DJ, McArdle CS: The impact of old age on cancer-specific and non-cancer-related survival following elective potentially curative surgery for Dukes A/B colorectal cancer. Br J Cancer. 2008, 99: 1046-1049. 10.1038/sj.bjc.6604669.
6. Pollock AM, Vickers N: Breast, lung and colorectal cancer incidence and survival in South Thames Region, 1987-1992: the effect of social deprivation. J Public Health Med. 1997, 19: 288-294.
7. Schrijvers CT, Mackenbach JP, Lutz JM, Quinn MJ, Coleman MP: Deprivation, stage at diagnosis and cancer survival. Int J Cancer. 1995, 63: 324-329. 10.1002/ijc.2910630303.
8. Wichmann MW, Muller C, Hornung HM, Lau-Werner U, Schildberg FW: Gender differences in long-term survival of patients with colorectal cancer. Br J Surg. 2001, 88: 1092-1098. 10.1046/j.0007-1323.2001.01819.x.
9. Wolters U, Stutzer H, Isenberg J: Gender related survival in colorectal cancer. Anticancer Res. 1996, 16: 1281-1289.
10. Normand SLT, Sykora K, Li P, Mamdani M, Rochon PA, Anderson GM: Readers guide to critical appraisal of cohort studies: 3. Analytical strategies to reduce confounding. BMJ. 2005, 330: 1021-1023. 10.1136/bmj.330.7498.1021.
11. Rosenbaum PR, Rubin DB: The central role of the propensity score in observational studies for causal effects. Biometrika. 1983, 70: 41-55. 10.1093/biomet/70.1.41.
12. Rosenbaum PR, Rubin DB: Reducing bias in observational studies using subclassification on the propensity score. J Am Stat Assoc. 1984, 79: 516-524. 10.2307/2288398.
13. Cohen ME, Dimick JB, Bilimoria KY, Ko CY, Richards K, Hall BL: Risk adjustment in the American College of Surgeons National Surgical Quality Improvement Program: A comparison of logistic versus hierarchical modeling. Journal of the American College of Surgeons. 2009, 209: 687-693. 10.1016/j.jamcollsurg.2009.08.020.
14. Damman OC, Stubbe JH, Hendriks M, Arah OA, Spreeuwenberg P, Delnoij DM, et al: Using multilevel modeling to assess case-mix adjusters in consumer experience surveys in health care. Med Care. 2009, 47: 496-503. 10.1097/MLR.0b013e31818afa05.
15. Neumann A, Holstein J, Le Gall JR, Lepage E: Measuring performance in health care: case-mix adjustment by boosted decision trees. Artif Intell Med. 2004, 32: 97-113. 10.1016/j.artmed.2004.06.001.
16. Nicholl J: Case-mix adjustment in non-randomised observational evaluations: the constant risk fallacy. J Epidemiol Community Health. 2007, 61: 1010-1013. 10.1136/jech.2007.061747.
17. Deeks JJ, Dinnes J, D'Amico R, Sowden AJ, Sakarovitch C, Song F, et al: Evaluating non-randomised intervention studies. Health Technol Assess. 2003, 7: iii-173.
18. World Health Organisation: The International Statistical Classification of Diseases and Health Related Problems ICD-10: Tenth Revision. 2005, Geneva: World Health Organisation, 2
19. Dukes CE: The surgical pathology of rectal cancer. J Clin Pathol. 1949, 2: 95-98. 10.1136/jcp.2.2.95.
20. Townsend P, Phillimore P, Beattie A: Health and Deprivation: Inequality and the North. 1988, London: Routledge
21. Schwarz G: Estimating the dimension of a model. Annals of Statistics. 1978, 6: 461-464. 10.1214/aos/1176344136.
22. Akaike H: A new look at the statistical model identification. IEEE Trans Auto Control. 1974, 19: 716-723. 10.1109/TAC.1974.1100705.
23. Gilthorpe MS, Frydenberg M, Cheng Y, Baelum V: Modelling count data with excessive zeros: the need for class prediction in zero-inflated models and the issue of data generation in choosing between zero-inflated and generic mixture models for dental caries data. Stat Med. 2009, 28: 3539-3553. 10.1002/sim.3699.
24. Downing A, Harrison WJ, West RM, Forman D, Gilthorpe MS: Latent Class Modelling of the Association between Socioeconomic Background and Breast Cancer Survival Status at 5 Years Whilst Incorporating Stage of Disease. J Epidemiol Community Health. 2009
25. Tu YK, West R, Ellison GT, Gilthorpe MS: Why evidence for the fetal origins of adult disease might be a statistical artifact: the "reversal paradox" for the relation between birth weight and blood pressure in later life. Am J Epidemiol. 2005, 161: 27-32. 10.1093/aje/kwi002.
26. Stata Corp: Stata 11 Reference Manuals. 2009, Stata Corp
27. Vermunt JK, Magidson J: Latent GOLD 4.0 User's Guide. 2005, Belmont, Massachusetts: Statistical Innovations Inc
28. Craig P, Dieppe P, Macintyre S, Michie S, Nazareth I, Petticrew M: Developing and evaluating complex interventions: the new Medical Research Council guidance. BMJ. 2008, 337: a1655. 10.1136/bmj.a1655.
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1472-6963/11/53/prepub
The authors would like to thank the Northern and Yorkshire Cancer Registry and Information Service (NYCRIS) for access to the routinely collected data for the purposes of this research.
The authors declare that they have no competing interests.
MSG conceived the idea, planned the study, drafted the manuscript, and coordinated input from all coauthors; WJH did the analysis, addressing the statistical problems throughout the course of analytical developments, produced the results (tables and chart), and drafted the manuscript; AD contributed her expertise from previous work that led on to this study and commented on the manuscript; DF provided cancer epidemiology expert advice and commented on the manuscript; RMW contributed to initial discussions surrounding concept and study design, helped steer the analyses, and contributed to the interpretation of the results and the writing of the manuscript. All authors read and approved the final manuscript.