  • Research article
  • Open access

Exploring the concurrent validity of the nationwide assessment of permanent nursing home residence in Denmark - A cross-sectional data analysis using two administrative registries



Many register studies make use of information about permanent nursing home residents. Statistics Denmark (StatD) identifies nursing home residents by two different indirect methods, one based on reports from the municipalities on whether home care has taken place in a nursing home, and the other based on an algorithm created by StatD.

The aim of the present study was to validate StatD’s nursing home register using dedicated administrative municipality records on individual nursing home residents as gold standard.


In total, ten Danish municipalities were selected. Within each Danish region, we randomly selected one municipality reporting to StatD (Method 1) and one non-reporting municipality, for which an algorithm created by StatD was used to identify nursing home residents (Method 2). Under Method 1, municipalities report to StatD whether home care has taken place in a nursing home or in a private home. Method 2 is based on an algorithm created by StatD for the municipalities where Method 1 is not applicable. Our gold standard was the information from the local administrative system in all ten selected municipalities. Each municipality provided a list of all individuals >65 years living in a nursing home on January 1st, 2013, together with their central personal numbers. This list was compared with the list of individuals >65 years living in nursing home facilities in the same ten municipalities on January 1st, 2013, retrieved from StatD.


According to the data received directly from the municipalities, which were used as our gold standard, 3,821 individuals were identified as nursing home residents. The StatD register identified 6,141 individuals as residents. Additionally, 556 of the individuals identified by the municipalities were not identified in the StatD register.

Overall sensitivity for the ten municipalities in the StatD nursing home register was 0.85 (95% CI 0.84-0.87) and the PPV was 0.53 (95% CI 0.52-0.54). The municipalities for which nursing home status was based on the StatD algorithm (Method 2) had a sensitivity of 0.84 (95% CI 0.82-0.86) and a PPV of 0.48 (95% CI 0.46-0.50). Both were slightly lower than in the reporting municipalities (Method 1), where the sensitivity was 0.87 (95% CI 0.85-0.88) and the PPV was 0.57 (95% CI 0.56-0.59).

Additionally, the sensitivity and PPV of the StatD register varied considerably among the ten municipalities: sensitivity ranged from 0.51 (95% CI 0.43-0.59) to 0.96 (95% CI 0.95-0.98) and PPV from 0.14 (95% CI 0.11-0.17) to 0.73 (95% CI 0.69-0.77).


The overall PPV of the StatD nursing home register was low, and it differed between municipalities. Even in countries with extensive nationwide registers, validation studies should be conducted for outcomes based on these registers.


Permanent nursing home is a common term for a staffed residence for individuals who are unable to take care of themselves due to, for example, immobility and severe health problems [1]. In a number of countries, including Denmark, permanent nursing home residents are the most frail and ill of the elderly [1,2,3] with a high prevalence of multimorbidity, cognitive impairment and functional limitation [3]. Hence, permanent nursing home placement may be a determining factor or an outcome in epidemiological studies [4,5,6,7]. Residency at a permanent nursing home may also be an important confounder to consider, and nursing home admission is already being used as a confounding variable in many research studies in the field of public health and medicine [5, 8]. One study used it to determine place of death [4], another investigated the association between subjective memory complaints and nursing home placement [5], and a further paper looked at factors contributing to the mortality of nursing home inhabitants [7]. Gonzalez-Colaco Harmand et al. were interested in cognitive decline after nursing home admission [9]. Further, reviews have addressed nursing home residence as a relevant outcome/proxy [8, 10, 11].

Denmark is well known for its comprehensive administrative registers, which can be linked to other registers and biobanks by using a personal identification number [12]. Administratively collected data are often used, based on several assumptions, to derive other variables that may have high face validity but often lack proper validation. In Denmark, the national authority for statistical data, Statistics Denmark (StatD), identifies individuals living in permanent nursing homes according to one of two indirect methods: 1) for municipalities reporting data about home care, persons living in nursing homes are identified as individuals who have received home care in a nursing home (and not in their own home); 2) for non-reporting municipalities, persons living in nursing homes are identified by matching the address of the individual against addresses expected to be nursing homes (based on an algorithm).

For epidemiological studies, it is important to know the validity of such indirect methods. To our knowledge, no validation studies have yet been published. Hence, we aimed to validate StatD’s register of permanent nursing home residency by using the municipalities’ administrative data on permanent nursing home residents as a gold standard.



Since 1968, all persons living in Denmark have been registered with a unique personal Central Personal Register (CPR) identification number in the Danish Civil Registration System (CRS) for administrative purposes. This CPR-number consists of 10 digits that include information on date of birth and sex, and it makes it possible to link various administrative registers at an individual level [12].

In Denmark, the clear majority of nursing homes are managed and owned by the municipalities and their residents are therefore billed by the municipalities. The rest of the nursing homes are managed by private organizations, but they are economically supported by the municipalities if they provide information on their residents to the municipality – if not, they do not get economic support. Since all people in Denmark have a CPR identification number and all people are registered in their municipality to be able to receive health care, pay taxes, get pension and so on, the administrative records of the municipalities can identify all individuals living in nursing homes.

Statistics Denmark’s identification of residency in permanent nursing home

The variable “resident of permanent nursing home” is constructed by StatD by one of two indirect methods:

Residents identified as persons who have received home care in a nursing home (Method 1)

At the time of data collection, 40 of the 98 municipalities in Denmark reported data to StatD about their citizens’ use of home care and indicated where services were given. Persons who have received care in a nursing home within a specific year are labeled as nursing home residents.

Residents identified by living on addresses likely to be nursing homes (Method 2)

For the 58 municipalities not providing information about use of home care, the identification of residents is computed from an address list of home care facilities. It is assumed that an address shared by six or more people aged 80 years or older is a nursing home when this age group represents more than 75% of the residents at the address (in the Capital Region, only 50%). StatD uses three of its own registers for this algorithm: “The population register”, “Home care in home register”, and the “Home care at nursing home register”.
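
The address rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not StatD's actual implementation: the function name, the input layout (a list of ages per address), the example addresses and the reading of the Capital Region threshold as "more than 50%" are all assumptions made for the example.

```python
def is_likely_nursing_home(ages, capital_region=False):
    """Apply the address rule as described in the text (illustrative only).

    ages: ages in years of everyone registered at one address.
    capital_region: use the 50% share threshold instead of 75%.
    """
    aged_80_plus = sum(1 for age in ages if age >= 80)
    if aged_80_plus < 6:  # fewer than six residents aged 80 or older
        return False
    share = aged_80_plus / len(ages)
    threshold = 0.50 if capital_region else 0.75
    return share > threshold  # assumed to mean strictly more than the threshold


# Hypothetical addresses, invented for the example
addresses = {
    "Address A": [82, 85, 88, 90, 91, 84, 79, 95],          # 7 of 8 aged 80+
    "Address B": [81, 83, 45, 50, 12],                      # only 2 aged 80+
    "Address C": [80, 81, 82, 83, 84, 85, 60, 62, 64, 70],  # 6 of 10 aged 80+
}
flags = {name: is_likely_nursing_home(ages) for name, ages in addresses.items()}
```

Under this reading, any address that meets the thresholds is flagged, whether or not it is actually a nursing home, and "Address C" would be flagged only if it lay in the Capital Region.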

Population and data used for the validation study

Permanent nursing home residents >65 years old were identified by their CPR-number on the 1st of January 2013. We included individuals from all five regions of Denmark. Two municipalities were selected randomly from each region: one for which the identification of individuals living in nursing home facilities was based on home care data delivered by the municipality to StatD (Method 1) and one for which it was based on addresses from StatD’s algorithm (Method 2).

The process of selecting data from the 10 municipalities within the five regions had the following steps (Fig. 1):

  1. StatD was asked for a data extraction including all permanent nursing home residents in Denmark on January 1st, 2013 (irrespective of the method for finding addresses of nursing homes).

  2. The municipalities in each of the five regions were divided into two groups according to which of the two StatD methods applied.

  3. Ten municipalities, two from each region stratified according to the method for determining the addresses, were randomly selected.

  4. The selected municipalities were asked to provide information from their administrative systems about individuals living in nursing homes, and all 10 selected municipalities agreed to participate. One municipality lacked information on the last four digits of the CPR-number for 50 persons registered as living in permanent nursing homes. However, information about names, addresses and the first six digits was sufficient for StatD to identify the full CPR-numbers of all 50 individuals.
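
The stratified random selection in step 3 can be sketched as follows. The sampling frame, the placeholder municipality and region names, the function name and the seed are hypothetical; the source does not describe how the random draw was actually carried out.

```python
import random


def select_municipalities(municipalities, seed=None):
    """Pick one municipality at random per (region, method) stratum.

    municipalities: iterable of (name, region, method) tuples.
    Returns a dict mapping (region, method) -> selected name.
    """
    rng = random.Random(seed)
    strata = {}
    for name, region, method in municipalities:
        strata.setdefault((region, method), []).append(name)
    # Sort strata and names so a given seed always gives the same draw
    return {key: rng.choice(sorted(names)) for key, names in sorted(strata.items())}


# Hypothetical sampling frame: placeholder names, two regions only
frame = [
    ("M1", "Region A", 1), ("M2", "Region A", 1), ("M3", "Region A", 2),
    ("M4", "Region B", 1), ("M5", "Region B", 2), ("M6", "Region B", 2),
]
selected = select_municipalities(frame, seed=1)
```

With five regions and two methods, the same function would return ten municipalities, one per stratum.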

Fig. 1 Flowchart showing the random selection of participating municipalities

Gold standard

Information on permanent nursing home residents from the municipalities was used as gold standard since it comes from a dedicated administrative system used to bill permanent nursing home residents for their accommodation. This register is the only one that can be used when direct knowledge about individuals in nursing homes is wanted, since all individuals in Denmark are registered in their municipality to receive, for example, a pension.

Statistical analysis

For the analyses, two binary variables were used:

  1. Nursing home resident according to StatD. Resident of a nursing home as registered in StatD (yes/no).

  2. Nursing home resident according to municipality (gold standard). Resident of a nursing home as documented by each of the 10 selected municipalities (yes/no).

First, we visualized the distribution of the population in the 10 selected municipalities by sex, age, inhabitants >65 years old and socio-economic status, together with the number of inhabitants labeled as nursing home residents based on information from the municipality records versus the StatD register. Information about the municipalities was obtained from StatD’s publicly accessible tabulations (Table 1). Secondly, we examined the sensitivity and positive predictive value (PPV), each with a 95% confidence interval (CI), of StatD’s register, taking into consideration how the information about permanent nursing home addresses was obtained (from home care reports of the municipality or by algorithm).

Table 1 Population size, distribution of sex, marital status, inhabitants >65 years, and socio economic position (SEP) among participating municipalities

All statistical analyses were performed with SAS version 9.4 (SAS Institute, Cary, NC, USA).
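
The analysis code is not given in the article, and the authors worked in SAS. As a rough Python illustration of how the two binary variables can be cross-classified, the sketch below compares two sets of person identifiers and derives sensitivity and PPV. The identifiers are made up and merely stand in for CPR-numbers; the function and variable names are assumptions.

```python
def cross_classify(statd_ids, municipality_ids):
    """Cross-classify two sets of person identifiers.

    municipality_ids is treated as the gold standard.
    Returns counts of true positives, false positives and false negatives.
    """
    statd_ids = set(statd_ids)
    municipality_ids = set(municipality_ids)
    return {
        "true_positive": len(statd_ids & municipality_ids),
        "false_positive": len(statd_ids - municipality_ids),
        "false_negative": len(municipality_ids - statd_ids),
    }


# Made-up identifiers standing in for CPR-numbers
statd = {"id01", "id02", "id03", "id04", "id05"}
municipality = {"id02", "id03", "id04", "id06"}

counts = cross_classify(statd, municipality)
tp = counts["true_positive"]
sensitivity = tp / (tp + counts["false_negative"])  # share of gold-standard residents found by StatD
ppv = tp / (tp + counts["false_positive"])          # share of StatD residents confirmed by the gold standard
```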


Characteristics for the participating 10 municipalities are presented in Table 1.

Based on the data retrieved directly from the 10 municipalities (gold standard), a total of 3,821 individuals were identified as permanent nursing home residents, while the StatD register identified 6,141 individuals as residents. Furthermore, 556 of the individuals identified by the municipalities were not identified in the StatD register (Table 2). The overall sensitivity of StatD’s register was 0.85 (95% CI 0.84–0.87) and the overall PPV was 0.53 (95% CI 0.52–0.54) (Tables 3 and 4).
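
The overall estimates can be checked from the three counts given above. The sketch below assumes that the number of true positives is 3,821 minus 556, and it uses a normal-approximation (Wald) confidence interval; the article does not state which interval method was used, so that choice is an assumption.

```python
import math


def proportion_with_ci(successes, total, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% confidence interval."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, p - half_width, p + half_width


gold_standard = 3821    # residents according to the municipalities
statd_positive = 6141   # residents according to StatD
missed_by_statd = 556   # in the municipality data but not in StatD

true_positive = gold_standard - missed_by_statd  # 3265

sensitivity = proportion_with_ci(true_positive, gold_standard)
ppv = proportion_with_ci(true_positive, statd_positive)

print("sensitivity: %.2f (%.2f-%.2f)" % sensitivity)
print("PPV: %.2f (%.2f-%.2f)" % ppv)
```

Under these assumptions the sketch gives 0.85 (0.84-0.87) for sensitivity and 0.53 (0.52-0.54) for PPV, in agreement with the reported values.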

Table 2 Numbers of individuals identified as nursing home residents in the municipality data (MD) and in the Statistics Denmark register (SD)
Table 3 Number of individuals identified as nursing home residents by the Statistics Denmark (SD) register and by the municipality data (MD), and sensitivity and PPV for each municipality and in total, for non-reporting municipalities where the algorithm was applied (StatD’s Method 2)
Table 4 Number of individuals identified as nursing home residents by the Statistics Denmark (SD) register and by the municipality data (MD), and sensitivity and PPV for each municipality and in total, for reporting municipalities where StatD’s Method 1 was applied

Among the municipalities, the accuracy of the StatD register in identifying permanent nursing home residents varied substantially: sensitivity ranged from 0.51 (95% CI 0.43-0.59) to 0.96 (95% CI 0.95-0.98) and PPV from 0.14 (95% CI 0.11-0.17) to 0.73 (95% CI 0.69-0.77) (Tables 3 and 4).

Table 2 presents the numbers of individuals in all 10 municipalities identified as nursing home residents in the municipality data and in the StatD register.


The validation of StatD’s permanent nursing home register showed that 85% of the individuals identified by the municipalities as residents of nursing homes (gold standard) were labeled as such by StatD. The PPV was 0.53, i.e. only 53% of those identified by StatD were registered as residents of nursing homes in the dedicated records of the municipalities. The accuracy of StatD’s permanent nursing home register, in terms of sensitivity and PPV, was lower in the municipalities where address lists of nursing homes were created by an algorithm (Method 2) than in municipalities that had provided information about home care for the individuals (Method 1). The difference was, however, rather small. We found high variability between the municipalities, which might be explained by municipalities having different procedures for registering nursing home residents, and perhaps by problems in the interface between some of these procedures and StatD.

Sensitivity, which is independent of the population studied, may be useful for validating a register, but PPV is more useful on an individual level. PPV indicates how accurate the register is for a specific individual found to be a nursing home resident, and it depends on the prevalence of residency. We did not report specificity or negative predictive value (NPV) in this study since most inhabitants of a municipality do not live in a nursing home. Therefore, specificity and NPV are close to 100% and are not relevant to consider as quality indicators of StatD’s register.
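
The dependence of PPV on prevalence follows from Bayes' theorem and can be illustrated with a short sketch. Only the sensitivity of 0.85 is taken from the study; the specificity and the prevalence values below are hypothetical and were chosen purely for illustration.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Positive predictive value via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# 0.85 is the overall sensitivity reported in this study;
# the specificity (0.97) and the prevalences are hypothetical.
for prevalence in (0.02, 0.05, 0.10):
    ppv = ppv_from_prevalence(0.85, 0.97, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}")
```

With sensitivity and specificity held fixed, the PPV rises as the prevalence rises, which is why the same register can perform differently in populations with different shares of nursing home residents.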

One potential limitation of our study is that our gold standard is also a “register” that has not yet been validated. However, administrative data like our municipality gold standard data are generally considered valid (7,8). The municipalities have a strong incentive to register all nursing home residents in order to collect payment from citizens and receive reimbursement from the government for individuals living in permanent nursing homes. Registers where a reimbursement is involved are often considered useful for research purposes [13]. Therefore, we hypothesize that the gold standard used in this study is trustworthy. Further, the best available administrative registers often have to be used as gold standard to validate other registers, as for example in the study by Guldberg et al., where the Danish Urogynaecological Database was validated with the Danish National Patient Registry as gold standard [14].

An additional limitation could be that we only examined permanent nursing home residents on a specific day (the 1st of January) and therefore the prevalence may not be representative of any other day of the year. However, the scope of our paper was not to examine the prevalence but to validate StatD’s nursing home register using dedicated administrative municipality records on individual nursing home residents as gold standard.

Danish nursing homes were established before the law of general homes for elderly was implemented, and after the year 1987, a differentiation between nursing residents and general dwellings for elderly emerged. Nursing dwellings replaced nursing homes; however, no difference exists between care-taking or nursing. Individuals are admitted to a nursing home or a nursing dwelling depending on availability [15]. In this paper both nursing homes and nursing dwellings are therefore labeled as permanent nursing homes.

Several factors may explain the misclassification found in StatD’s register. The gold standard applied in our study is based on data about residency from municipalities that are responsible for permanent nursing homes and data is used for sending bills to citizens. Such data is likely to identify residency on an exact day and can be used to calculate a prevalence of persons dwelling in nursing homes.

The StatD methods may have misclassifications for several reasons. The method based on provision of home care may include short term nursing home rehabilitation units and not solely permanent nursing home residents. Furthermore, a misclassification of place for provision of care has few direct consequences for municipalities and citizens and may therefore not be corrected. The method based on combination of addresses can give misclassification of residency due to possible errors in one or both addresses. The list of nursing home addresses based on the StatD algorithm may provide some misclassification in situations where the elderly keep their original home address when moving into a nursing home facility [16].

We have found no other studies that have tried to validate nursing home residence status data based on addresses. In a study validating the Danish National Patient Registry, the register was found to be a valuable tool for epidemiological research, but not without considering its strengths and limitations [17]. The same national register was validated by Mason et al., who found a low completeness which, without precaution, could lead to bias [18]. Our results are in line with these studies, since we found a rather low sensitivity and PPV. Other national Danish registers have been validated as well. For example, Uggerby et al. investigated the validity of the schizophrenia diagnosis in the Danish Psychiatric Central Research Register, which they found to be well suited for research [19]. Petri et al. validated specific data in the Danish Gynecological Cancer Database and found it sufficient for quality monitoring [20]. Another study validated variables in the national clinical thyroid cancer database, also finding it reliable for research at a national level [21].

Denmark has some of the most comprehensive registers in the world, and many are hosted and maintained by StatD. However, changes in the organization and provision of health services can affect some registers and their completeness. Furthermore, different registers use different definitions of variables, so the ability to link across registers is very important. Moreover, changes in the codes used for registration and in coding practices may also affect the validity of StatD’s permanent nursing home register [22, 23].

Permanent nursing home residence is already used as a confounding variable in many research studies [5, 8, 9]. In observational studies, administrative data can be used as a confounding variable or as a proxy for frailty, which is difficult to measure in other registers [24]. Previous studies using this algorithm to identify permanent nursing home residents might have overestimated residency, meaning that the factor they examined may have had less impact than reported. For example, if one wanted to investigate whether a specific diagnosis led to permanent nursing home residency, the true association might have been overestimated. Further, if a study includes subgroups between which the accuracy differs (like the variability between municipalities in our study), the estimates of effect when nursing home placement is used as an outcome could be biased. Therefore, it would be necessary to adjust for municipality. Additionally, estimates of regional differences may be biased. Consequently, a valid algorithm for nursing home status is important in epidemiological studies.

Implications for research: Given the low PPV in our results, we conclude that validation studies are important for the accuracy of studies involving registers, even in countries like Denmark that have comprehensive registers.

As an implication for future use, we suggest that one needs to be careful in the interpretation of StatD’s nursing home variable, especially since its accuracy varies greatly between municipalities.

The validity of the variable could be improved if StatD made direct use of the municipalities’ registers.


The overall predictive value of StatD’s permanent nursing home register was found to be low. When used for epidemiological studies, an overestimation of nursing home residency is to be expected. At present direct municipality-administrative registers regarding permanent nursing home residents could be more suitable for epidemiological studies.



CI: Confidence interval

NPV: Negative predictive value

PPV: Positive predictive value

StatD: Statistics Denmark


  1. Saga S, Vinsnes AG, Morkved S, Norton C, Seim A. What characteristics predispose to continence in nursing home residents?: A population-based cross-sectional study. Neurourology and urodynamics. 2014;

  2. Zuckerman IH, Sato M, Hsu VD, Hernandez JJ. Validation of a method for identifying nursing home admissions using administrative claims. Bmc Health Serv Res. 2007;7

  3. Koroukian SM, Xu F, Murray P. Ability of Medicare Claims Data to Identify Nursing Home Patients A Validation Study. Med Care. 2008;46(11):1184–7.

  4. Aabom B, Kragstrup J, Vondeling H, Bakketeig LS, Stovring H. Population-based study of place of death of patients with cancer: implications for GPs. The British journal of general practice : the journal of the Royal College of General Practitioners. 2005;55(518):684–9.

  5. Waldorff FB, Siersma V, Waldemar G. Association between subjective memory complaints and nursing home placement: a four-year follow-up. International journal of geriatric psychiatry. 2009;24(6):602–9.

  6. Mulders AJ, Zuidema SU, Verhey FR, Koopmans RT. Characteristics of institutionalized young onset dementia patients--the BEYOnD study. International psychogeriatrics / IPA. 2014;26(12):1973–81.

  7. Navarro-Gil P, Gonzalez-Velez AE, Ayala A, Martin-Garcia S, Martinez-Martin P, Forjaz MJ. Which factors are associated with mortality in institutionalized older adults with dementia? Archives of gerontology and geriatrics. 2014;59(3):522–7.

  8. Gaugler JE, Duval S, Anderson KA, Kane RL. Predicting nursing home admission in the U.S: a meta-analysis. BMC geriatrics. 2007;7:13.

  9. Gonzalez-Colaco Harmand M, Meillon C, Rullier L, Avila-Funes JA, Bergua V, Dartigues JF, et al. Cognitive decline after entering a nursing home: a 22-year follow-up study of institutionalized and noninstitutionalized elderly people. Journal of the American Medical Directors Association. 2014;15(7):504–8.

  10. Muller C, Lautenschlager S, Meyer G, Stephan A. Interventions to support people with dementia and their caregivers during the transition from home care to nursing home care: A systematic review. International journal of nursing studies. 2017;71:139–52.

  11. Baumgarten M, Battista RN, Infante-Rivard C, Hanley JA, Becker R, Bilker WB, et al. Use of physician services among family caregivers of elderly persons with dementia. Journal of clinical epidemiology. 1997;50(11):1265–72.

  12. Pedersen CB. The Danish Civil Registration System. Scandinavian journal of public health. 2011;39(7 Suppl):22–5.

  13. Andersen JS, Olivarius Nde F, Krasnik A. The Danish National Health Service Register. Scandinavian journal of public health. 2011;39(7 Suppl):34–7.

  14. Guldberg R, Brostrom S, Hansen JK, Kaerlev L, Gradel KO, Norgard BM, et al. The Danish Urogynaecological Database: establishment, completeness and validity. International urogynecology journal. 2013;24(6):983–90.

  15. Statistics Denmark: Nyt fra Danmarks Statistik, 5. november 2013, Nt 583. Pleje og ældreboliger 2013, available at:

  16. Statistics Denmark, Imputering af borgere på plejehjem/-bolig 2014. Available from:

  17. Schmidt M, Schmidt SA, Sandegaard JL, Ehrenstein V, Pedersen L, Sorensen HT. The Danish National Patient Registry: a review of content, data quality, and research potential. Clinical epidemiology. 2015;7:449–90.

  18. Mason K, Thygesen LC, Stenager E, Bronnum-Hansen H, Koch-Henriksen N. Evaluating the use and limitations of the Danish National Patient Register in register-based research using an example of multiple sclerosis. Acta neurologica Scandinavica. 2012;125(3):213–7.

  19. Uggerby P, Ostergaard SD, Roge R, Correll CU, Nielsen J. The validity of the schizophrenia diagnosis in the Danish Psychiatric Central Research Register is good. Danish medical journal. 2013;60(2):A4578.

  20. Petri AL, Kjaer SK, Christensen IJ, Blaakaer J, Hogdall E, Jeppesen U, et al. Validation of epithelial ovarian cancer and fallopian tube cancer and ovarian borderline tumor data in the Danish Gynecological Cancer Database. Acta obstetricia et gynecologica Scandinavica. 2009;88(5):536–42.

  21. Londero SC, Mathiesen JS, Krogdahl A, Bastholt L, Overgaard J, Bentsen J, et al. Completeness and validity in a national clinical thyroid cancer database: DATHYRCA. Cancer epidemiology. 2014;38(5):633–7.

  22. Lynge E, Sandegaard JL, Rebolj M. The Danish National Patient Register. Scandinavian journal of public health. 2011;39(7 Suppl):30–3.

  23. Helweg-Larsen K. The Danish Register of Causes of Death. Scandinavian journal of public health. 2011;39(7 Suppl):26–9.

  24. Yun H, Kilgore M, Curtis J, Delzell E, Gary L, Saag K, et al. Identifying types of nursing facility stays using medicare claims data: an algorithm and validation. Health Serv Outcomes Res Method. 2010;10(1-2):100–10.


We would like to thank data manager Willy Karlslund at The Research Unit and Section of General Practice, Institute of Public Health, University of Copenhagen, Denmark, for initiating the data management, and professor Niels de Fine Olivarius at The Research Unit and Section of General Practice, Institute of Public Health, University of Copenhagen, Denmark, for contributing important input. Finally, we would like to thank all the participating municipalities for being willing to participate and contribute data.


This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Availability of data and materials

Since data are managed at Statistics Denmark’s server this is a systematic review of the published literature there are no individual-level data available. Data available in articles meeting inclusion criteria are published in Table 2 and Fig. 1. The databases used are publicly accessible: PubMed (, WHO Index Medicus (, International AIDS Society, Conference on Retroviruses and Opportunistic Infections, HIV/AIDS Implementers’ Meeting, The WHO International Clinical Trials Registry Platform (, the Cochrane Central Register of Controlled Trials (, the International Standard Randomised Controlled Trial Number Register (, and (

Author information



AB carried out the study conception, study design and drafted the manuscript. FBW contributed to the study design, reviewed the abstracts and drafted the manuscript. FBW, VS and DRN participated in the design of the study. VS and DRN abstracted study data, performed the statistical analyses and interpreted the data. ABSN, TGW, JS and JK drafted the manuscript and revised the manuscript for important intellectual content, participated in its design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Anna Bebe.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors have declared that no competing interests exist. The opinions and statements in this article are those of the authors and do not necessarily represent the official policy, endorsement or views of their organizations.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Bebe, A., Nielsen, A.B.S., Willadsen, T.G. et al. Exploring the concurrent validity of the nationwide assessment of permanent nursing home residence in Denmark - A cross-sectional data analysis using two administrative registries. BMC Health Serv Res 17, 607 (2017).
