Cross-validation of comorbidity items in two national databases in a sample of patients with end-stage kidney disease

Abstract

Background

The use of national medico-administrative databases for epidemiological studies has increased in the last decades. In France, the Healthcare Expenditures and Conditions Mapping (HECM) algorithm has been developed to analyse and monitor the morbidity and economic burden of 58 diseases. We aimed to assess the performance of the HECM in identifying different conditions in patients with end-stage kidney disease (ESKD) using data from the REIN registry (the French National Registry for patients with ESKD).

Methods

We included all patients over 18 years of age who started renal replacement therapy in France in 2018. Five conditions with a similar definition in both databases were included (ESKD, diabetes, human immunodeficiency virus [HIV], coronary insufficiency, and cancer). The performance of each SNDS algorithm was assessed using sensitivity, specificity, positive predictive values (PPVs), negative predictive values (NPVs), and Cohen’s kappa coefficient.

Results

In total 5,971 patients were included. Among them, 81% were identified as having ESKD in both databases. Diabetes was the condition with the best performance, with a sensitivity, specificity, PPV, NPV, and Kappa coefficient all over 80%. Cancer had the lowest level of agreement with a Kappa coefficient of 51% and a high specificity and high NPV (94% and 95%). The conditions for which the definition in the HECM included disease-specific medications performed better in our study.

Conclusion

The HECM showed good to very good concordance with the REIN database information overall, with the exception of cancer. Further validation of the HECM tool in other populations should be performed.

Background

The use of national medico-administrative databases for epidemiological studies has increased in the last two decades as an alternative to traditional observational studies. These databases were conceived to survey healthcare systems from a financial and administrative point of view, with information such as reimbursement claims, healthcare services, medical procedures, daily compensation, etc. [1]. The use of such databases for research has the potential to reduce the risk of selection bias often present in epidemiological surveys, as they are almost exhaustive. In addition, it is less costly, as the data are collected systematically and are relatively easily accessible. It also eliminates recall bias, as the data do not depend on patient reporting, with its potential recollection mistakes. National databases are helpful for longitudinal studies, as they make it possible to include extended follow-up times and large sample sizes, as well as for studying rare events and for epidemiological surveillance or surveys. Such databases are, however, not exempt from information bias [2, 3], as the information tends to be essentially administrative. For example, pharmaceutical information is limited to the dispensation of prescribed and reimbursed medications that are registered in the insurance records [1]. Over-the-counter medications can be easily missed.

The French population benefits from universal public healthcare coverage. All information concerning the use of the healthcare system is recorded in the National Health Data System (“Système National des Données de Santé, SNDS”) [4]. Since 2012, the French National Health Insurance has developed a tool based on the SNDS to analyse and monitor the morbidity and economic burden of 58 treated diseases, chronic treatments, and episodes of care through healthcare utilization [5]. Healthcare Expenditures and Conditions Mapping (HECM) allows the identification of diseases by means of medical algorithms based on the diagnoses for hospitalization, long-term disease diagnoses, and reimbursement of specific treatments for certain diseases for a given year and a period up to four years before. This algorithm is repeated for each year providing a cross-sectional study repeated over time [6]. The HECM has provided information to improve healthcare policies in France (preparing the French Social Security Funding Act and the Public Health Act). The findings of the HECM on disease prevalence and expenditures are similar to those of studies conducted in other countries [6].

A previous study in France compared the performance of various SNDS-based algorithms to identify treated diabetes against clinical data from CONSTANCES (a national French cohort of professionally active or retired salaried workers and their families), showing excellent performance for the three algorithms, including HECM’s current algorithm concerning diabetes [7]. However, such algorithm validity assessments are still scarce. Data from registries offer this opportunity because they provide gold-standard data: they are exhaustive for a given territory, registered manually, and controlled by experienced research assistants.

We aimed to assess the performance of five HECM algorithms (ESKD, diabetes, HIV infection, cancer, and coronary disease) in patients with ESKD against information from the French Renal Epidemiology and Information Network (REIN). The REIN database provides national quality-controlled data on patients with ESKD. It relies on a network of nephrologists, epidemiologists, patients, and public health representatives who are coordinated regionally and nationally by the French biomedical agency, collecting exhaustive information on patients with ESKD (treatment and its changes, demographics, comorbid conditions, treatment center location, etc.) [8, 9].

Methods

Data sources

The REIN registry

The REIN registry was started in 2002 and covered all of France by 2012. It includes all patients receiving renal replacement therapy (RRT) in mainland France and its overseas territories. The REIN database collects information on patient characteristics (body mass index [BMI], age, sex, RRT modality, date of RRT start) and conditions (e.g., diabetes, coronary artery disease, cancer) based on medical records. Nephrologists, health managers, nurses, medical secretaries, and research assistants collect the data. Continuous controls are ensured during the year (with a strict focus on inclusion criteria, which excludes patients with acute renal disease). Yearly updates are performed to allow the inclusion of new information on patient treatment status, as well as comorbid condition updates. Detailed information on the definitions of comorbidities and coding in REIN can be found in Caillet et al. [9]. Quality controls and data collection procedures are detailed in Couchoud et al. [8].

The SNDS

The SNDS (a medico-administrative database) collects individual data from various French health insurance schemes. This database contains exhaustive expense and reimbursement information on hospitalizations, ambulatory care, medications, laboratory analyses, and consultations for both public and private healthcare facilities, as well as transportation, compensatory daily allowances, and third-party compensatory indemnity, regardless of the payer of the services (state, complementary insurance, or out of pocket). It does not record primary care consultation diagnoses or clinical results. For reimbursement, the SNDS includes information on long-term chronic diseases (LTD, a status that guarantees 100% reimbursement of healthcare expenses related to the disease when reported, bearing in mind that a patient may already have been considered for LTD due to another medical condition) [4].

The HECM applied to the SNDS database uses discharge diagnoses, as well as the chronic diseases registered for healthcare reimbursement and/or specific medical acts/drugs, to identify patient conditions (different algorithms for each condition, see details in Supplementary Table 1). These algorithms are applied to all beneficiaries of the health insurance schemes in France (66.3 million inhabitants) who have used the healthcare system at least once during the year of interest. The pathologies, chronic treatments, and use of healthcare identified are, for the most part, non-exclusive, as the same person can be affected by several pathologies [5].

Study population

We included patients over 18 years of age who started RRT (either dialysis or renal transplant) in France in 2018, were identified through the REIN registry, and could be linked to the SNDS database.

Independently of the present study, all REIN patients were matched with SNDS patients over the available extraction period (2006–2020) by the national coordination of REIN, using an indirect deterministic linkage based on a combination of six items: sex, age, location of residence, date and facility of kidney transplant or start of dialysis treatment, and date of death, if available, with varying granularity (age ± 1 year, location at municipality or district level, date ± 2 months, exact facility or facility in the same district). Further details on the linkage procedure, which is applied yearly, can be found in Raffray et al. [10]. For the purpose of this study, we selected only subjects from our incident population considered to have “good linkage”. Good linkage was defined as an exact match on sex plus either (1) an exact match on date of death, whatever the granularity of the other four items, or (2) an exact match on two or more of the following items: age, location of residence, date of RRT, and facility of RRT. Other combinations were not included in the present study.
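The “good linkage” definition can be read as a simple predicate over exact-match flags. The following Python sketch is illustrative only: the `LinkageMatch` structure and its field names are hypothetical and are not part of the REIN or SNDS tooling; the actual procedure is the one described in Raffray et al. [10].

```python
from dataclasses import dataclass


@dataclass
class LinkageMatch:
    """Exact-match flags for one REIN-SNDS pair (hypothetical field names)."""
    sex: bool
    age: bool
    residence: bool
    rrt_date: bool
    rrt_facility: bool
    death_date: bool  # False when no date of death is available


def is_good_linkage(match: LinkageMatch) -> bool:
    """Apply the 'good linkage' definition given in the text."""
    # An exact match on sex is required in every case.
    if not match.sex:
        return False
    # Criterion 1: exact match on date of death, whatever the other items.
    if match.death_date:
        return True
    # Criterion 2: at least two exact matches among the four remaining items.
    exact_items = sum(
        [match.age, match.residence, match.rrt_date, match.rrt_facility]
    )
    return exact_items >= 2
```

Under this reading, a pair that matches exactly on sex, age, and facility is retained, whereas a pair that matches exactly on sex and age only is not.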

Health conditions compared

For the purposes of this study, the conditions identified in the REIN registry were considered as the reference.

The following conditions identified in both the SNDS and the REIN registry were included in this study: ESKD, diabetes, HIV, coronary disease, and cancer (see the definition of each in Supplementary Table 1). These conditions were selected because their identification methods in the two databases were comparable. In addition, the conditions studied presented an opportunity to explore the performance of algorithms with different characteristics. Diabetes and HIV are treated with disease-specific medications and are likely to be well identified in pharmaceutical records, the former being very frequent and the latter much less so. Coronary disease identification relies mainly on clinical criteria, and cancer represents a combination of both cases. The definitions of the other conditions identified with the HECM differed too much from those in the REIN registry.

Statistical analysis

A descriptive analysis was performed comparing subjects with and without good REIN-SNDS linkage (patients included vs those excluded from the study). The characteristics compared were survival after 2018 (recruitment year), first RRT, sex, comorbid conditions, age, and region of residence in France.

The performance of each algorithm was evaluated using sensitivity, specificity, the positive predictive value (PPV), the negative predictive value (NPV), and Cohen’s kappa coefficient, together with the 95% confidence interval (CI). The level of agreement was assessed as poor (K-coefficient ≤ 0.20), fair (0.20 < K-coefficient ≤ 0.40), moderate (0.40 < K-coefficient ≤ 0.60), good (0.60 < K-coefficient ≤ 0.80), or very good (K-coefficient > 0.80) [11]. All patients included in the REIN registry had ESKD by definition. Therefore, only true positives and false negatives could be calculated for the item ESKD.
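As an illustration of these measures, the sketch below computes them from a 2×2 table in which the REIN status is the reference and the HECM status is the test. It is a minimal Python sketch, not the SAS code used in the study; the function names are hypothetical, confidence intervals are omitted, and the agreement bands are read as half-open intervals (for example, fair means above 0.20 and up to 0.40).

```python
from typing import Optional


def ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy measures for a 2x2 table (reference = registry status)."""
    n = tp + fp + fn + tn
    observed = ratio(tp + tn, n)
    # Agreement expected by chance, computed from the marginal totals.
    expected = ratio((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn), n * n)
    if observed is None or expected is None or expected == 1:
        kappa = None
    else:
        kappa = (observed - expected) / (1 - expected)
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "kappa": kappa,
    }


def agreement_level(kappa: float) -> str:
    """Map a kappa coefficient to the agreement bands used in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Hypothetical counts, for illustration only (not study data).
metrics = diagnostic_metrics(tp=20, fp=5, fn=5, tn=70)
print(metrics, agreement_level(metrics["kappa"]))
```

When a condition has no reference negatives, as for ESKD here, the specificity denominator is zero and the function returns `None` for it rather than raising an error.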

To account for the fact that HECM algorithms were designed for medico-economic purposes and that individuals may not be taken into account when they are treated at the beginning or end of the year, a secondary analysis was performed for subjects whose comorbidity data did not match for the year 2018. In these cases (unmatched conditions for 2018), the comorbidity information from the HECM for the years 2017 and 2019 was taken into consideration and new comparisons were performed. As an example, if a patient’s diabetes status did not match for the year 2018, we considered their HECM diabetes status for the year 2017 and repeated the comparison for the whole population. This secondary analysis was carried out for all conditions.
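One possible reading of this secondary analysis is sketched below in Python. It is an interpretation, not the authors’ SAS implementation: the record layout `(rein, hecm_2018, hecm_alt)` and the function names are hypothetical, with `hecm_alt` standing for the HECM status in the alternative year (2017 or 2019).

```python
from collections import Counter


def apply_secondary_rule(records):
    """Replace the HECM 2018 status by the alternative-year status
    only for patients whose 2018 status disagrees with REIN.

    records: iterable of (rein, hecm_2018, hecm_alt) booleans.
    Returns a list of (rein, hecm_used) pairs for the whole population.
    """
    adjusted = []
    for rein, hecm_2018, hecm_alt in records:
        hecm_used = hecm_2018 if hecm_2018 == rein else hecm_alt
        adjusted.append((rein, hecm_used))
    return adjusted


def confusion_counts(pairs):
    """Count TP, FP, FN, and TN with REIN as the reference."""
    counts = Counter(tp=0, fp=0, fn=0, tn=0)
    for rein, hecm in pairs:
        if rein and hecm:
            counts["tp"] += 1
        elif not rein and hecm:
            counts["fp"] += 1
        elif rein and not hecm:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Hypothetical records, for illustration only (not study data).
records = [
    (True, True, False),   # concordant in 2018: the 2018 status is kept
    (True, False, True),   # false negative in 2018, recovered by the other year
    (False, True, True),   # false positive in both years
    (False, False, True),  # concordant in 2018: the 2018 status is kept
]
print(confusion_counts(apply_secondary_rule(records)))
```

Under this rule, concordant patients are never reclassified, so the adjusted agreement can only stay the same or improve relative to the 2018 comparison.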

A comparison of certain characteristics was conducted (survival after 2018, first renal replacement treatment, sex, age, region of residence, nephropathy at recruitment, acute kidney disease diagnosis) to better understand the population whose conditions matched and did not match for the year 2018.

All analyses were performed using SAS Enterprise Guide software (version 8.3; SAS Institute Inc., Cary, NC, USA).

Ethical approval

The creation of the REIN registry was approved by the relevant French committees: the Comité consultatif sur le traitement de l’information en matière de recherche (CCTIRS N°03–149) and the Commission nationale de l’informatique et des libertés (CNIL N° 903188).

The French national health insurance (CNAM) in charge of the SNDS (Système National des Données de Santé) has permanent access to the pseudonymized reimbursement data in application of the provisions of articles R. 1461‐12 et seq. of the French Public Health Code, with rules and criteria similar to the Helsinki declaration and permanent full access to the SNDS by decree (Décret n° 2016–1871 du 26 décembre 2016 relatif au traitement de données à caractère personnel dénommé « système national des données de santé»). The CNAM has authorization to perform studies based on SNDS data from the CNIL (National independent Commission for Computing and Freedom, the French data protection agency for sensitive information). All methods were carried out in accordance with relevant guidelines and regulations.

Results

In total, 8,309 individuals were identified as incident patients in the REIN registry for the year 2018 (present in both databases). Among them, 5,971 patients were included in our study because of good linkage between the REIN and SNDS databases. The excluded population (those without good linkage) was more likely to include those who died in the year of their diagnosis, started RRT with dialysis, were older, or were a resident of Ile-de-France (Paris region) (Table 1).

Table 1 Description and comparison between the included and excluded populations in relationship to the linkage between the SNDS and REIN database

ESKD status

With the HECM 2018, 81% of the subjects with ESKD were true positives. In a secondary analysis that included information on the ESKD status from the HECM for 2019, the percentage of patients correctly identified by the SNDS database increased to 93% (Table 2). The 1,126 false negative ESKD patients (HECM 2018) were more likely to have died in the year they started treatment, to have started treatment with dialysis, to be older, to be residents of Ile-de-France, and to be classified in the SNDS database as having acute renal disease (Table 3).
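The proportion of true positives reported for ESKD can be checked from the counts given in the text (5,971 included patients, 1,126 false negatives). A short Python check, with variable names chosen for illustration:

```python
included = 5971         # patients with good linkage, all with ESKD in REIN
false_negatives = 1126  # not identified as ESKD by the HECM 2018

true_positives = included - false_negatives
sensitivity = true_positives / included

print(f"{true_positives} true positives, sensitivity = {sensitivity:.1%}")
```

Because every included patient has ESKD in the reference database, this proportion is the sensitivity of the HECM 2018 ESKD algorithm, and it rounds to the 81% reported above.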

Table 2 Comparison between patient comorbidities in the two databases
Table 3 Characteristics of the matched and unmatched populations by disease

Diabetes

Forty-two percent of the population identified in the REIN database were registered as having diabetes. Diabetes status differed between the two databases for 8% of the population with the HECM 2018, with the discordance distributed equally between false positives and false negatives (Table 2). The population of 530 patients with differing diabetes status had a higher proportion of patients who had transplantation as their first RRT, were over 75 years of age, or were residents of Ile-de-France (Table 3). The Kappa coefficient of agreement was found to be very good (82%), as were the specificity, sensitivity, NPV, and PPV (over 89%). No great improvement was observed when including the patients’ diabetes status in the HECM for 2017 or 2019.

HIV infection

Only 1% of patients identified in the REIN database were HIV positive. HIV/AIDS status differed between the databases for approximately 0.4% of the population (Table 2). Among the 21 patients with discordant HIV status, no transplant patients were misclassified, a higher percentage were aged between 45 and 64, and most were identified as residents of Ile-de-France (Table 3). This comparison showed a good Kappa coefficient of agreement. The sensitivity and PPV were the lowest of the parameters measured, at 83% and 66%, respectively. An improvement to 0.2% was observed for the false positives when including information from the HECM for the years before and after recruitment.

Coronary disease

Twenty-four percent of the patients identified in the REIN database were recorded as having coronary disease. Fifteen percent of the population differed on coronary disease status, and two thirds of these discordant patients were false positives (Table 2). The 872 patients with discordant coronary disease status were more likely to be patients who died early or started treatment with dialysis (Table 3). The sensitivity was 79% and the specificity 87%. The Kappa coefficient of agreement between the REIN and SNDS databases on coronary disease was 62%. The level of agreement improved to 75% and 69% when considering the information from the HECM for 2017 and 2019, respectively.

Cancer

Eleven percent of the population was identified as having cancer. Cancer status differed in 10% of the patients (Table 2). The 546 patients with discordant cancer status died early, all started treatment with dialysis, and were more likely to be part of the older group (Table 3). The sensitivity was 58% and the specificity 94%. This comparison showed the lowest PPV of the comorbidities studied, at 56%, as well as a Kappa coefficient of agreement of only 51%.

Discussion

In this study, we compared the information on patient conditions between the REIN registry collected based on clinical data and the HECM algorithm based on health consumption reimbursement data. The agreement between diagnoses as identified by the REIN and the SNDS varied between conditions, with the highest for diabetes and the lowest for cancer. Specificity was above 85% and the PPV over 95% for all conditions, suggesting overall good performance of the HECM algorithms in identifying the conditions of interest in this study.

Ease of diagnosis

Pathologies with tracer drugs or tracer medical acts are better identified in medico-administrative refund information databases [12, 13]. The Kappa coefficients for the status of diabetes and HIV were higher than those for coronary disease and cancer. These comorbidities identified in the SNDS database are treated with medications that are specifically used for the disease, allowing us to identify patients whose LTD registration or hospitalization diagnoses are not reported. Coronary disease as a medical diagnosis is slightly more difficult to identify in the SNDS database, as it relies on discharge records for patients hospitalized during the given period or an LTD reported in the four years before the year of interest. There are no specific drugs or medical procedures that are integrated into the HECM that can help identify patients who do not comply with the specified conditions. The REIN database benefits from direct patient interviews and medical records to record information on these conditions.

The definition of active cancer in the HECM is based on patients with a reported LTD and hospital diagnosis during the year. These definitions could lead to an underestimation of patients who either did not receive treatment or whose treatment was received in ambulatory care (whose LTD is not reported for the year of interest). As an example, a patient receiving antiestrogen therapy for breast cancer treatment in an ambulatory setting, without hospitalization associated with the reported disease and no LTD reported could be missed by the HECM tool [14]. On the contrary, the REIN database reports active cancer regardless of the patients’ current treatment status. These differences in definition could explain some of the false negative patients.

The sensitivity and specificity were high (> 80%) for most of the assessed diseases, except the sensitivity for coronary disease (79%) and cancer (58%). A high level of sensitivity suggests that the HECM tool is able to identify patients with a disease (it is unlikely to produce false negatives). High specificity was seen for all comorbidities assessed, suggesting a low number of patients being categorized as having the condition when they do not (false positives).

Timelines

In comparing these databases, we should consider that the REIN database collects information prospectively and that the HECM categorizes diseases retrospectively for a given year. The identification of patients with ESKD improved when adding information from the year 2019. A great number of patients with mismatched ESKD status were found to be coded in the SNDS as having acute kidney disease in the REIN incident year. These may have been patients with chronic kidney disease who started chronic dialysis after an acute episode and did not fulfill the required time under treatment to be classified as ESKD by the end of the year of interest. In addition, despite the work of the REIN registry's research assistants, whose mission is to check the completeness of the cases and compliance with the protocol, we cannot rule out a few marginal errors.

Concerning false negatives for diabetes, a patient identified in the REIN database in December as being diabetic who did not fulfill the requirements to be identified by the HECM for that year (e.g., three antidiabetic drug deliveries are needed to be identified through medication) would have resulted in a mismatch. A patient identified in the REIN database in January as a patient without cancer might have developed the disease later in the same year and the HECM tool would register them as positive for cancer in that same year, resulting in a false positive. For coronary disease, we observed better performance when data for the year 2017 were added. This could be a result of the HECM considering data for the four years prior to the year of interest to classify patients, and therefore including additional information for patients from 2015.

Patient characteristics

We compared the characteristics of the unmatched and matched populations for each condition. We found a higher proportion of early deaths, first RRT with dialysis, males, and residents of Ile-de-France among the unmatched population. Patients with short survival would not have the opportunity to have their record corrected in the REIN database and, in the SNDS, they may not have had sufficient healthcare consumption to be identified. First RRT with dialysis and residency in Ile-de-France were the biggest subgroups for which linkage was more likely to be less precise. The Ile-de-France region is a densely populated region where patients can easily move between the different facilities [8]. Patients could start their treatment at an ICU (recorded in the SNDS) in one postal code and later be transferred to a less medicalized center elsewhere (recorded in the REIN database). The prevalence of a disease in a population can influence the PPV and NPV. When prevalence increases, the PPV increases but the NPV decreases [15, 16]. In this population, the prevalence of diabetes, coronary disease, and cancer was higher than in the general population (prevalence estimated to be 5.88%, 3.11%, and 4.98% in 2018, respectively, for the general French population [17]). These accuracy parameters (PPV and NPV) may, therefore, not be replicable in the general population.
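The dependence of the PPV and NPV on prevalence follows from Bayes’ theorem and can be illustrated with a short Python sketch. The sensitivity and specificity of 0.90 are hypothetical round values chosen for illustration, not the study estimates; the two prevalences are the 42% of patients with diabetes in the study population and the 5.88% reported for the general French population.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected proportions of the population in each cell of the 2x2 table.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical sensitivity and specificity; prevalences taken from the text.
for prevalence in (0.42, 0.0588):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.4f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With the same sensitivity and specificity, the lower prevalence yields a markedly lower PPV and a slightly higher NPV, which is why these two parameters may not transfer directly to the general population.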

Strengths and limitations

The strength of this study was that it used two national databases in which comorbidities are identified by two different methods. However, this study also had several limitations. First, even though the parameters used to categorize a patient as having or not having a condition are comparable between the two databases, they are not identical. Therefore, certain patients’ conditions could be discordant even when correctly categorized in both databases. Unfortunately, among the 58 conditions of the HECM, only five had an identification method similar to that of REIN. Many medical conditions explored by the HECM, such as precise cancer location, psychiatric disorders, or neurodegenerative diseases, are not collected in REIN. We recognise that the results observed for these five diseases would have been significantly poorer if we had used diseases whose identification methods initially differed.

Second, for legal reasons, the databases used do not have a shared unique patient identifier, and we therefore relied on an indirect deterministic algorithm to link patient information between them. Even when only including patients with a good linkage, there might have been certain patients who were imperfectly linked. The choice to keep only patients with a good match led to the exclusion of 2,338 patients. It seemed to us that, given our objective, this did not constitute a bias but may have reduced the scope for extrapolating our comparison.

Third, HECM algorithms were designed for medico-economic rather than epidemiological purposes. As such, they do not aim to collect the exhaustive number of incident cases over one year, as economists are generally more interested in the longitudinal evolution of healthcare expenditure and consumption, observed on specific samples. The pathologies categorized by the algorithm are based on short periods, with individuals not taken into account when they are treated at the beginning or end of the year. This may explain the improvement in performance when the search was extended to the years 2017 and 2019. On the other hand, despite the fact that completeness and accuracy are ascertained by REIN research assistants during regular visits to every dialysis centre and updated at each annual visit, we cannot exclude coding errors in transcription from medical records.

The REIN database included only patients with ESKD, representing only a small proportion of the French population. Therefore, the generalisability of the results to other populations should be explored. Other French registries have successfully linked most of their patients (all over 85%) to the SNDS database: CONSTANCES, FRESH HR, ACIRA, France-TAVI, CANARI [18,19,20,21]. These linkages have been used to enrich the databases of the registries and could potentially be used as a starting point to further validate the HECM tool.

Conclusion

The development of tools that allow the use of medico-administrative databases for epidemiological research is of great importance, as they provide information at the national level, limiting the costs and time required for more traditional data collection methods. The HECM algorithm matched the information provided by the REIN database with that of the SNDS database relatively well. However, further validation of the HECM tool on other populations should be performed.

Availability of data and materials

The author is authorized by the REIN scientific council to use the datasets used and/or analysed during the current study. These databases are available from the corresponding author on reasonable request.

References

  1. Gavrielov-Yusim N, Friger M. Use of administrative medical databases in population-based research. J Epidemiol Community Health. 2014;68(3):283–7. Available from: https://www-jstor-org.ezproxy.universite-paris-saclay.fr/stable/43281961.

  2. Suissa S, Garbe E. Primer: administrative health databases in observational studies of drug effects—advantages and disadvantages. Nat Clin Pract Rheumatol. 2007;3(12):725–32.

  3. Ray W. Improving automated database studies. Epidemiology. 2011;22(3):302–4.

  4. Tuppin P, Rudant J, Constantinou P, Gastaldi-Ménager C, Rachas A, de Roquefeuil L, et al. Value of a national administrative database to guide public decisions: From the système national d’information interrégimes de l’Assurance Maladie (SNIIRAM) to the système national des données de santé (SNDS) in France. Rev Epidemiol Sante Publique. 2017;65(4):S149–67.

  5. Rachas A, Gastaldi-Ménager C, Denis P, Barthélémy P, Constantinou P, Drouin J, et al. The economic burden of disease in France from the National Health Insurance Perspective: the healthcare expenditures and conditions mapping used to prepare the French Social Security Funding Act and the Public Health Act. Med Care. 2022;60(9):655–64.

  6. Caisse nationale de l’Assurance Maladie (Cnam). Méthodologie médicale de la cartographie des pathologies et des dépenses, version G9 (années 2015 à 2020, Tous Régimes). 2022. Available from: https://assurance-maladie.ameli.fr/sites/default/files/2022_methode-reperage-pathologies_cartographie_0.pdf. Accessed 1 Mar 2023.

  7. Fuentes S, Cosson E, Mandereau-Bruno L, Fagot-Campagna A, Bernillon P, Goldberg M, et al. Identifying diabetes cases in health administrative databases: a validation study based on a large French cohort. Int J Public Health. 2019. Available from: https://pubmed.ncbi.nlm.nih.gov/30515552/. Accessed 15 Jan 2023.

  8. Couchoud C, Stengel B, Landais P, Aldigier JC, de Cornelissen F, Dabot C, et al. The renal epidemiology and information network (REIN): a new registry for end-stage renal disease in France. Nephrol Dial Transplant. 2006;21(2):411–8.

  9. Caillet A, Mazoué F, Wurtz B, Larre X, Couchoud C, Lassalle M, et al. Which data in the French registry for advanced chronic kidney disease for public health and patient care? Nephrol Ther. 2022;18(4):228–36.

  10. Raffray M, Bayat S, Lassalle M, Couchoud C. Linking disease registries and nationwide healthcare administrative databases: the French renal epidemiology and information network (REIN) insight. BMC Nephrol. 2020;21(1):25.

  11. McHugh ML. Interrater reliability: the kappa statistic. Biochem Medica. 2012;22(3):276–82.

  12. Malone D, Billups S, Valuck R, Carter B. Development of a chronic disease indicator score using a veterans affairs medical center medication database. J Clin Epidemiol. 1999;52(6):551–7.

  13. Barnett M, Khosraviani V, Doroudgar S, Ip E. A narrative review of using prescription drug databases for comorbidity adjustment: A less effective remedy or a prescription for improved model fit? Res Soc Adm Pharm. 2022;18(2):2283–300.

  14. Etude des algorithmes de definition de pathologies dans le Systeme National d’Information inter-regimes de l’Assurance Maladie (SNIIRAM). Available from: https://www.ameli.fr/sites/default/files/2014_etude-algorithmes-definition-pathologies-partie-1_cartographie.pdf. Accessed 17 Jan 2023.

  15. Chubak J, Pocobelli G, Weiss N. Tradeoffs between accuracy measures for electronic health care data algorithms. Available from: https://www-sciencedirect-com.ezproxy.universite-paris-saclay.fr/science/article/pii/S0895435611002782?via%3Dihub. Accessed 19 Jan 2023.

  16. Tenny S, Hoffman MR. Prevalence. In: StatPearls. Treasure Island: StatPearls Publishing; 2022. Available from: http://www.ncbi.nlm.nih.gov/books/NBK430867/. Accessed 18 Jan 2023.

  17. Caisse nationale de l’Assurance Maladie (Cnam). Data pathologies. Available from: https://data.ameli.fr/pages/data-pathologies/. Accessed 19 Jan 2023.

  18. Lesaine E, Belhamri NM, Legrand JP, Domecq S, Coste P, Lacroix A, et al. Appariement entre un registre régional de pratiques en cardiologie interventionnelle et la base médico-administrative d’hospitalisation française : développement et validation d’un algorithme d’appariement déterministe. Rev Epidemiol Sante Publique. 2021;69(2):78–87.

  19. Scailteux LM, Droitcourt C, Balusson F, Nowak E, Kerbrat S, Dupuy A, et al. French administrative health care database (SNDS): the value of its enrichment. Therapies. 2019;74(2):215–23.

  20. Didier R, Gouysse M, Eltchaninoff H, Le Breton H, Commeau P, Cayla G, et al. Successful linkage of French large-scale national registry populations to national reimbursement data: Improved data completeness and minimized loss to follow-up. Arch Cardiovasc Dis. 2020;113(8):534–41.

  21. Logeart D, Damy T, Doublet M, Salvat M, Tribouilloy C, Bauer F, et al. Feasibility and accuracy of linking a heart failure registry to the national claims database using indirect identifiers. Arch Cardiovasc Dis. 2023;116:18–24.

Acknowledgements

We thank CIFRE for giving us the opportunity to carry out this research.

Funding

Not applicable.

Author information

Contributions

All listed authors contributed to the conception and design, the acquisition of data, or the analysis and interpretation of data; AND to drafting the article or revising it critically for important intellectual content; AND to the final approval of the version to be published. IV: CD, AI and D; CP: CD, AD, AI and D; AH: AD and D; EC: AD and D; AR: AD, AI and D; PT: CD and D; CC: CD, AD, AI and D. Abbreviations: CD, conception and design; AD, acquisition of data; AI, analysis and interpretation; D, drafting the article or revising it critically for important intellectual content.

Corresponding author

Correspondence to Isabella Vanorio-Vega.

Ethics declarations

Ethics approval and consent to participate

The creation of the REIN registry was approved by the relevant French committees: the Comité consultatif sur le traitement de l’information en matière de recherche (CCTIRS N°03–149) and the Commission nationale de l’informatique et des libertés (CNIL N° 903188). For population-based registries requiring exhaustiveness, French regulations require that clinics inform patients that, if they object to the recording of their identifying data, they will be recorded anonymously. Under CNIL regulations, patients have the right to withdraw; information was anonymized and de-identified before extraction for analysis. Patients in the REIN registry have been given the option to opt out of the use of their personal data. Those who consented agreed to the use of their data for research. The databases are used in compliance with the approval granted by the CNIL (French regulations); more information is available at: The Data Protection Act | CNIL [Internet]. [Cited 2023 Apr 19]. Available from: https://www.cnil.fr/fr/la-loi-informatique-et-libertes#article4. The patient information letter and REIN compliance details (REIN information letter) are available at: R.E.I.N. (Réseau Epidémiologique et Information en (…) - Agence de la biomédecine [Internet]. 2021. Available from: https://www.agence-biomedecine.fr/R-E-I-N-Reseau-Epidemiologique-et-Information-en-Nephrologie

The French national health insurance (CNAM), which is in charge of the SNDS (Système National des Données de Santé), has permanent access to the pseudonymized reimbursement data under articles R. 1461‐12 et seq. of the French Public Health Code, with rules and criteria similar to those of the Declaration of Helsinki, and permanent full access to the SNDS by decree (Décret n° 2016–1871 du 26 décembre 2016 relatif au traitement de données à caractère personnel dénommé « système national des données de santé »). The CNAM is authorized by the CNIL (National independent Commission for Computing and Freedom, the French data protection agency for sensitive information) to perform studies based on SNDS data. All methods were carried out in accordance with relevant guidelines and regulations.

The main author of this article is bound by the ethical requirements of both the Agence de la biomédecine (the institution managing the REIN registry and database) and the CNAM. The scientific committee of the Agence de la biomédecine approved the use of its databases for this study.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Supplementary Table 1. Comparison between the definitions used in the REIN and SNDS databases.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Vanorio-Vega, I., Constantinou, P., Hami, A. et al. Cross-validation of comorbidity items in two national databases in a sample of patients with end-stage kidney disease. BMC Health Serv Res 23, 1140 (2023). https://doi.org/10.1186/s12913-023-10145-y

  • DOI: https://doi.org/10.1186/s12913-023-10145-y

Keywords