Health insurance claims data are an important source for analysing the incidence or burden of disease at the population level and for conducting health services research [3, 5–8, 20–23]. Regional or temporal differences in the burden of disease derived from these data may reflect "true" regional or temporal differences in incidence, but they can also be due to regional or temporal differences in coding practices. Health services research based on hospitalisation records may therefore be affected by coding practices. To our knowledge, variations in coding have not yet been investigated for Germany. Our results showed that overall hospitalisation rates for OB and UGIB were only marginally higher in the East than in the West. Nevertheless, the coding for OB and UGIB was more specific in former East Germany, although the differences between the two regions decreased over time. The historical separation of East and West Germany might have contributed to these regional differences in coding practice.
Coding differences became smaller from 2003 onwards, especially with respect to specific and unspecific OB diagnoses. In 2003, a reimbursement system based on diagnosis related groups (DRG) was introduced for hospitals in Germany, while detailed mandatory coding rules for hospital diagnoses had first been introduced in 2002. For the following years, our results showed an increase in the proportion of unspecific codes for OB hospitalisations and a decrease in the proportion of unspecific codes for UGIB hospitalisations, while the total incidence of hospitalisations due to OB and UGIB, respectively, remained at the same level as in previous years.
Consistent with this pattern, Preyra showed that the introduction of a complexity adjustment to a Case Mix Groups (CMG) system (a Canadian adaptation of DRGs) resulted in a significant increase in hospital case mix variance without concomitant changes in morbidity or resource use. In a random sample of inpatient cases, Klaus et al. estimated overcoding in 34% of diagnoses under DRG conditions, whereas undercoding was present in only 9% of diagnoses.
In Germany, in addition to reducing the costs of patient treatment and optimising health care, one of the intended effects of the DRG system was more uniform and specific coding of hospitalisation diagnoses. This effect could be seen for UGIB, where the percentage of cases with unspecific coding decreased in East and West Germany for both sexes in the years following the introduction of DRGs. For OB, by contrast, we found an increase in unspecific coding in the former East after 2003, both for men and women, while in the former West the percentage of cases with unspecific coding was nearly unchanged over the same period. The reasons for this difference and its change over time are not clear. Although a long time has passed since the reunification of Germany, some regional variation in the training of medical staff might still exist. There are also slightly different traditions in the organisation of the health care system, with a more pronounced role of polyclinics for outpatient services in the former East. Still, it is not clear why these differences should result in a different coding specificity. One potential explanation for the increase in unspecific coding of OB is that the reimbursement for treatment of cases coded with unspecific bleeding diagnoses was higher than for cases coded with specific bleeding diagnoses, even when ascertaining the cause of bleeding required additional procedures (gastroscopy). In this way, providing additional services reduced the reimbursement. While we do not assume that indicated gastroscopies were not conducted, it is possible that unspecific coding was preferred. This problem of "reduced reimbursement while providing additional service" was recognised in 2004, but it was not solved immediately and therefore persisted in 2005 (see , page 105). From today's perspective, it is difficult to understand the interaction between the change in coding rules and the coding practices that affected reimbursement for the hospitals.
Nevertheless, our study provides evidence that such factors need to be considered. Further studies are also needed to assess why regions differ in their vulnerability to changes in coding rules. For Belgian hospitals, Aelvoet et al. found not only that coding practices improved over time but also that there was evidence of fraudulent undercoding.
Our study is limited to the investigation of only two disease entities. We do not know whether the observed regional and temporal variations in coding also apply to other disease entities. For Germany, no systematic analysis of the coding quality of claims data, or of regional variations in that quality, has been published to date, and published data on the quality of coding using ICD-10 are rare in general [29, 30]. In Australia, Henderson et al. found a high level of reliability for principal codes of hospital discharge letters when comparing hospital coding and auditor coding. However, Stausberg et al. demonstrated that agreement in coding decreased when more detailed codes were used (for example, five-digit versus three-digit ICD-10 codes) and concluded that very detailed classification and complex coding rules for ICD-10 diagnoses cause significant difficulties even for coding experts. The complexity and ambiguity of the coding rules may contribute to the formation of different coding preferences, which could result in the differences we observed.
Furthermore, our analysis could not distinguish directly between "true" differences in disease epidemiology and coding differences. For example, we were not able to adjust for levels of alcohol consumption, which may increase the incidence of oesophageal bleeding. On the other hand, we could show that regional and temporal differences in the incidence of hospitalisations due to OB or UGIB nearly disappeared when specific and unspecific codes were analysed together. This supports the assumption that "true" disease differences accounted only to a minor degree for the observed regional and temporal differences.
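The reasoning behind this comparison can be illustrated with a minimal sketch. The counts below are invented purely for illustration and are not results from this study; the names `RegionCounts`, `rate_per_100k` and `summarise` are hypothetical helpers, not part of any analysis code we used. The sketch shows how two regions can have the same combined hospitalisation rate while differing in the share of unspecific codes, which is the pattern one would expect if the difference lies in coding rather than in disease occurrence.

```python
from dataclasses import dataclass


@dataclass
class RegionCounts:
    """Hospitalisation counts for one region and year (hypothetical layout)."""
    specific: int     # cases coded with a specific bleeding diagnosis
    unspecific: int   # cases coded with an unspecific bleeding diagnosis
    population: int   # persons at risk


def rate_per_100k(cases: int, population: int) -> float:
    """Crude rate per 100,000 persons (no age or sex standardisation)."""
    return cases * 100_000 / population


def summarise(c: RegionCounts) -> dict:
    """Return separate and combined rates plus the share of unspecific codes."""
    total = c.specific + c.unspecific
    return {
        "specific_rate": rate_per_100k(c.specific, c.population),
        "unspecific_rate": rate_per_100k(c.unspecific, c.population),
        "combined_rate": rate_per_100k(total, c.population),
        "unspecific_share": c.unspecific / total,
    }


# Made-up numbers for illustration only; NOT data from this study.
east = RegionCounts(specific=80, unspecific=20, population=100_000)
west = RegionCounts(specific=60, unspecific=40, population=100_000)

for name, counts in (("East", east), ("West", west)):
    print(name, summarise(counts))
```

In this toy example the specific and unspecific rates differ between the regions, but the combined rates are identical, so the apparent regional difference is attributable to code choice alone. Real data would additionally require standardisation for age and sex, which the sketch omits.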