Assessment of The ACG System in Dutch Primary Care Using GP’s Electronic Health Records: A Retrospective Cross-Sectional Study

Background Within the Dutch health care system the focus is shifting from a disease-oriented approach to a more population-based approach. Since every inhabitant of the Netherlands is registered with one general practice, this offers a unique possibility to perform Population Health Management analyses based on general practitioners’ (GP) registries. The Johns Hopkins Adjusted Clinical Groups (ACG) System is an internationally used method for predictive population analyses. The model categorizes individuals based on their complete health profile, taking into account age, gender, diagnoses and medication. However, the ACG system was developed with non-Dutch data. Consequently, for wider implementation in Dutch general practice, the system needs to be validated in the Dutch healthcare setting. In this paper we show the results of the first use of the ACG system on Dutch GP data. The aim of this study is to explore how well the ACG system can distinguish between different levels of GP healthcare utilization. This study showed that the ACG is applicable as a risk stratification tool in Dutch primary care using routinely registered data from general practitioners’ registries. The ACG system yields good results compared to the traditional ICPC classification. Country-specific adjustments in the classification and validation of specific risks are necessary.

0.85. These models performed better than the base model (age and gender only) which showed AUC values between 0.64 and 0.71.
Conclusion The results of this study show that the ACG system is a useful tool to stratify Dutch primary care populations with GP healthcare utilization as the outcome variable.

Background
With rising health care utilization and costs, a shift from disease-oriented to population-based approaches is being advocated worldwide. With the upcoming need for improved organization and management of healthcare and the increasing possibilities of big data, strategies based on health registry analyses are becoming popular. One use of health registry data in population health management strategies is risk stratification. With risk stratification, differences in individual health risks can be screened for, and used to assign interventions to the population and individuals that will benefit the most. With rising pressure on medical services provided by general practitioners (GPs) in most European countries (1), primary care can benefit from proven advantages of risk stratification approaches, such as improved care management (2), resource allocation (3) and identification of subpopulations for tailored care interventions (4).
Despite the proven benefits of using risk stratification, especially in primary care, there is no evidence for application of internationally used risk stratification tools in Dutch primary care. Risk stratification approaches using Dutch GP registry data can be especially beneficial due to the gatekeeper function of Dutch GPs, providing the opportunity to overview a near total population. Different tools for risk stratification are used worldwide, amongst which is the Adjusted Clinical Groups (ACG) tool developed by the Johns Hopkins University. The ACG system is an internationally used tool for risk stratification on a generic level and is one of the most frequently used risk stratification tools in primary care. Evidence has also shown stronger statistical validity for the ACG compared with other risk stratification tools, regarding predictions of different healthcare utilization outcomes (5-7).
The ACG system uses registered diagnoses over a twelve-month period to assign individuals to one of 98 ACG categories, based on their healthcare profiles and expected health utilization (8). ACG categories are based on combinations of diagnosis types. Registered diagnoses processed by the ACG system can include International Classification of Primary Care (ICPC) codes (9), a commonly used registration method for diagnoses in primary care (10).
In this study we explored the potential use of the Johns Hopkins University ACG System with routine registration data extracted from Dutch primary care practices. The aim of this study is to explore how well the ACG system, compared to the 17 chapters of the ICPC coding system, can distinguish between different levels of GP healthcare utilization in Dutch general practice registries.

Study design and data
For this retrospective cross-sectional study, we used data from patients registered in primary care. Patients were registered during 2014 with one of the five participating GP practices in Nijkerk, the Netherlands. Data at individual patient level were extracted from the practices' electronic health records.
Obtained data included age, gender, and coded healthcare procedures, diagnoses and pharmaceutical data for 30,596 patients over the year 2014. Diagnoses were registered as ICPC-1 diagnosis codes, as used in the Netherlands (11). For proper recognition by the ACG system, ICPC-1 codes were converted to ICPC-2 codes. Prescribed medication was registered as Anatomical Therapeutic Chemical (ATC) codes (12), the classification system for pharmacy products. The number of GP visits was extracted from all healthcare procedures in 2014. GP visits were defined as all GP encounters, including physical and telephone consults and home visits by either GPs or nurse practitioners working at the GP practices.
From the original datasets 4,289 cases were removed due to corrupted patient identification numbers. Another 2,689 cases, belonging to three specific ACG categories, were left out of the analyses: No Diagnosis or Only Unclassified Diagnosis (n=281), Non-Users (n=2,407) and Invalid Age or Date of Birth (n=1). The final analyses were performed with data for 23,618 persons (77% of 30,596 registered people).
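The exclusion flow described above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in this section; the variable names are illustrative and not part of the study's SPSS workflow.

```python
# Counts reported in the text; variable names are illustrative assumptions.
registered = 30_596
removed_corrupt_ids = 4_289
excluded_acg = {
    "No Diagnosis or Only Unclassified Diagnosis": 281,
    "Non-Users": 2_407,
    "Invalid Age or Date of Birth": 1,
}

# Total excluded because of the three special ACG categories.
excluded_total = sum(excluded_acg.values())

# Persons remaining for the final analyses, and their share of all registered.
analysed = registered - removed_corrupt_ids - excluded_total
share = analysed / registered

print(excluded_total)   # 2689
print(analysed)         # 23618
print(f"{share:.0%}")   # 77%
```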
Data preparation and analyses were performed with IBM SPSS Statistics 24.

ACG System software
The study was conducted using the Johns Hopkins University's ACG® System software 11. The ACG® System software 11 is a risk stratification tool that assigns each patient to one of the 98 mutually exclusive ACG categories. Assignment to ACG categories is based on combinations of diagnosis types. With the ACG system the diagnoses for each patient are grouped into 32 Aggregated Diagnosis Groups (ADGs), based on type of diagnosis rather than on specific diagnoses, i.e. specific ICPC codes. Individuals' patterns of ADGs determine the assignment of patients to one of the 98 mutually exclusive ACG categories (8).
Information on diagnoses and medication, in addition to age and gender, was used as input data for the ACG® System software 11.

Assessment of the ACG system
To assess the applicability of the ACG system in Dutch primary care, we looked at two aspects: face validity and model performance.

Face Validity
According to Mosier (13) an important aspect of the testing of an instrument lies in the 'consumer acceptance'. The first step in effective use of a test is the actual selection for use and acceptance of the results. Mosier describes one of the translations of face validity as the appearance of validity: the test must appear valid in addition to the statistical validity. In this study we defined face validity as this appearance of validity described by Mosier (13).
We assessed the ACG system's face validity by exploring the actual ACG categorization with regard to age. Age distributions for each ACG category were created and ACG categories were assessed on recognition of multimorbidity in relation to age.

Model Performance
To investigate the impact of the ACG system in Dutch primary care, four different logistic regression models were estimated.

Dependent variable
The outcome variable, number of GP visits, was transformed into binary variables according to four definitions. According to the first definition, no GP visits was defined as no utilization of care, whereas one or more GP visits were defined as utilization of care. With the second definition, a distinction between zero or one GP visit and two or more GP visits was made. With the third definition, a distinction between zero to two GP visits and three or more GP visits was made. Accordingly, for the final definition the outcome was defined as a distinction between zero to three and four or more GP visits. The performance of each of these models was investigated.
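The four definitions above amount to dichotomizing the visit count at cut-offs of one, two, three and four visits. A minimal sketch of that transformation is given below; the function names and the example visit counts are hypothetical and serve only as illustration.

```python
# Cut-offs for the four binary outcome definitions described above:
# an outcome is 1 when the number of GP visits reaches the cut-off.
CUTOFFS = (1, 2, 3, 4)


def dichotomize(gp_visits, cutoff):
    """Return 1 if the visit count is at least the cut-off, else 0."""
    return 1 if gp_visits >= cutoff else 0


def binary_outcomes(gp_visits):
    """Return the four binary outcomes for one patient, keyed by cut-off."""
    return {cutoff: dichotomize(gp_visits, cutoff) for cutoff in CUTOFFS}


# Hypothetical visit counts, for illustration only.
for visits in (0, 1, 3, 92):
    print(visits, binary_outcomes(visits))
```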

Independent variables
In the null or base model only age as a continuous variable and gender were included as explanatory variables.
Model 2 included age, gender and ADG diagnoses as independent variables. As an individual can have more than one ADG, the 32 ADGs were added to the model as 32 dummy variables.
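Because ADGs are not mutually exclusive, each patient is represented by 32 indicator variables rather than by a single category. The sketch below illustrates this dummy coding; indexing the ADGs from 1 to 32 is an assumption made for illustration and need not match the numbering used by the ACG software.

```python
N_ADGS = 32  # number of Aggregated Diagnosis Groups, as stated in the text


def adg_dummies(patient_adgs):
    """Encode a set of ADG indices (assumed 1..32) as 32 0/1 indicators."""
    for adg in patient_adgs:
        if not 1 <= adg <= N_ADGS:
            raise ValueError(f"ADG index out of range: {adg}")
    return [1 if adg in patient_adgs else 0 for adg in range(1, N_ADGS + 1)]


# Hypothetical patient with three ADGs; the indices are illustrative only.
row = adg_dummies({1, 10, 32})
print(len(row), sum(row))
```

A patient without any ADG is coded as 32 zeros, so age and gender are the only remaining predictors for that patient.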
Model 3 included age, gender and mutually exclusive ACGs. Before estimating the logistic regression, the numbers of individuals in each ACG category were checked. Aggregation of some ACG categories was necessary due to categories with small numbers of individuals. In the supplementary file 1 the aggregation of the original ACG categories is presented. Table 1 gives an overview of the four different models estimated. To select the best model, the performance of each logistic regression with outcome variable as defined above, was investigated. The Area Under the Curve (AUC) values were calculated for each model.
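For readers less familiar with the AUC, it can be read as the probability that a randomly chosen patient with the outcome receives a higher predicted probability than a randomly chosen patient without it. The sketch below computes the AUC directly from that definition. It is not the procedure used in the study, which relied on SPSS, and the example outcomes and predicted probabilities are hypothetical.

```python
def auc(y_true, scores):
    """AUC as the share of positive/negative pairs in which the positive
    case has the higher score (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical outcomes and predicted probabilities, for illustration only.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(auc(y, p))  # 0.75
```

The pairwise loop keeps the definition visible but scales with the product of the two group sizes; for a dataset of the size used here a rank-based calculation would be the more practical choice.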

Ethics approval and patients' consent
The need for ethical approval was waived by the medical ethical committee of Leiden University Medical Center (CME-LUMC), the Netherlands.
Participants were not asked for their consent because we used routinely collected de-identified data.

standard deviation of 5.0 and the maximum number of GP visits was 92. In figure 1 the distribution of the number of GP visits within the study population is presented. As expected, this is a skewed distribution, where most of the population has had zero or one GP visits. Figure 2 shows the health problems within the study population according to the 17 chapters of the ICPC registry system. The percentages of the study population with at least one diagnosis code corresponding to a specific ICPC chapter are presented in the figure. ICPC chapters Musculoskeletal (L), Respiratory (R) and Skin (S) had the highest frequencies, with percentages between 43 and 49.

Face validity of ACG categorization
In figure 3 the distribution of age within each ACG category is presented with boxplots. The ACG categories are grouped according to the number of ADGs: one, two to three, four to five, six to nine and lastly ten plus ADGs. Each group of ACGs corresponds with a different color, red being the highest numbers of ADGs. The figure shows that the number of ADGs gradually goes up with increasing age.
Mean ages of the ACG categories with only one ADG (green) are mostly under 30. Exceptions are the ACG categories Chronic medical: Stable and Eye/Dental, which have mean values above 50. The mean age of ACGs with two to three ADGs (yellow) is mostly between 30 and 40, with the exception of ACG category Acute Minor and Chronic Medical: Stable (mean age of 50+). For three out of four of the ACG categories with four to five ADGs, the mean ages are between 50 and 62. However, the ACG category Acute Minor/Acute Major/Likely Recur/Psychosocial has a mean age of under 40. The ACG categories with six to nine ADGs have a mean age of around 63, whereas the mean age of ACG categories with ten or more ADGs is above 70. An extended overview of individuals from each ACG category, distributed over 10-year age bands, is presented in the supplementary file 2.
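The grouping by number of ADGs used for the boxplots can be written as a simple binning rule. The sketch below is illustrative only; the function name and the group labels are assumptions, not output of the ACG software.

```python
def adg_count_group(n_adgs):
    """Map a number of ADGs to the grouping used in the age boxplots."""
    if n_adgs < 1:
        raise ValueError("expected at least one ADG")
    if n_adgs == 1:
        return "1"
    if n_adgs <= 3:
        return "2-3"
    if n_adgs <= 5:
        return "4-5"
    if n_adgs <= 9:
        return "6-9"
    return "10+"


# Hypothetical ADG counts, for illustration only.
for n in (1, 3, 4, 9, 12):
    print(n, adg_count_group(n))
```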

Model performance
To investigate the model performances, where the outcome variable utilization of GP was defined as discussed in the methods section, AUCs along with their confidence intervals were computed.

Discussion
The results of this study suggest that the ACG system can be applied to Dutch primary care data, when regarding both face validity and model performance. With regard to the face validity, it can be concluded that the assignment of ACG categories is as expected: the ACG categories which indicate higher multimorbidity and thus higher expected care burden, are found amongst older patients. With respect to model performance, results showed that distinctions between the different levels of GP healthcare utilization can be made with the ACG system. The ACG and ADG categories, as well as the ICPC chapters (the commonly used primary care coding system), are highly associated with GP utilization. However, the ACG system is at patient level and provides a variety of other risk stratification variables, such as multimorbidity measures, risks of hospitalization and high costs, making the use of the ACG as a risk stratification tool a good addition to the use of the ICPC coding system.
Comparison of the results of this study to previous research is challenging, as most previous studies investigated the association of the ACG system with continuous utilization outcome measures. Some previous studies, however, were carried out on dichotomous variables and showed C-statistics and AUC values between 0.73 and 0.82 for the ACG as predictor for hospitalization (5,6,14). In addition, the study by Haas et al. presented C-statistics of 0.67 for emergency department visitation and 0.76 for top 10% healthcare costs (5).
Adding to the above-mentioned studies, this study suggests that the ACG system is applicable in primary care. Analyzing primary care data in such a manner is of great importance for the understanding of efficiency of healthcare systems that are under increased physical and financial pressure. A study by Sibley et al. showed that administrative data can be used to determine morbidity burden, an important indicator for future care utilization (15). Kristensen and colleagues assessed the use of the ACG system as a morbidity-based casemix adjustment system amongst type 2 diabetes patients in order to allocate resources according to degree of co-morbidity (3). They stated that the Danish healthcare system, which is based on fee-for-service incentives, would profit from a morbidity-based casemix adjustment system.
The ACG has also been shown by Shadmi et al. to be effective for identifying inequities in healthcare utilization (7). Identifying inequities is the first step towards minimizing unwarranted care gaps. With risk stratification tools such as the ACG, case finding for inclusion in population-level interventions can be performed in more health systems worldwide. A study by Soto-Gordoa used risk stratification to select cases for a patient-centered intervention for multimorbid patients with the goal of lowering hospitalization. The approach avoided nine percent of hospitalizations when cases were selected with the ACG tool (4).
With our study, a first step has been taken towards validation of the ACG system, a tool to shift from disease-oriented to population-based approaches, for use in the Netherlands. This opens up a variety of opportunities to reorganize and manage Dutch primary care in an efficient way.
Although the ACG seems an excellent tool to be used in the Netherlands, local adjustment of the software is of eminent importance. The main limitation of this study lies in the availability of only GP data (without, for example, hospital and mental health care data), forcing us to restrict healthcare utilization outcomes to GP visits, whereas healthcare utilization may be better defined as a total overview of healthcare use. With our research we were not able to explore other types of healthcare utilization, for example defined by total healthcare costs or more costly types of healthcare utilization such as hospitalization and emergency department visitation. Consequently, a full adjustment of the ACG system for use with Dutch data was not possible.
Moreover, the quality of data needs to be considered. For this study, historic data from GP registries were used. Risk stratification with routinely collected primary care data is an easy and practical way to perform risk stratification on a large scale. Data quality for risk stratification purposes can be improved and strengthened by linkage with different data sources such as hospital and social care registries.

Further research
Before applying the ACG system in Dutch primary care, further research is required. This study showed associations between just two components of the ACG system, the ADG and ACG categories, and GP visitation. Risk scores, for example, for future hospitalization and total healthcare costs were outside the scope of this study. To justify the use of the ACG system as a risk stratification tool in Dutch primary care, studies validating the ACG risk scores should be conducted. In addition, the ACG models need to be adjusted and improved for use with Dutch primary care data.

Conclusions
This study showed that the ACG is applicable as a risk stratification tool in Dutch primary care using routinely registered data from general practitioners' registries.

Declarations

Ethics approval and patients' consent
The need for ethical approval was waived by the medical ethical committee of Leiden University Medical Center (CME-LUMC), the Netherlands.
Participants were not asked for their consent because we used routinely collected de-identified data.

Consent for publication
Not applicable

Availability of Data and Materials

Figure 1
Distribution of the number of GP visits within the study population

Figure 2
Overview of health problems within the study population according to the 17 main ICPC chapters. ICPC chapters form the basis of the International Classification of Primary Care (ICPC) coding system.