
Exploring hot topics and evolutionary paths in the Diagnosis-Related Groups (DRGs) field: a comparative study using LDA modeling



This study reviews the research status of Diagnosis-related groups (DRGs) payment system in China and globally by analyzing topical issues in this field and exploring the evolutionary trends of DRGs in different developmental stages.


Abstracts of relevant literature in the field of DRGs were extracted from the China National Knowledge Infrastructure (CNKI) database and the Web of Science (WoS) core database and used as text data. A probabilistic distribution-based Latent Dirichlet Allocation (LDA) topic model was applied to mine the text topics. Topical issues were determined by topic intensity, and the cosine similarity of the topics in adjacent stages was calculated to analyze the topic evolution trend.


A total of 6,758 English articles and 3,321 Chinese articles were included. Foreign research on DRGs focuses on grouping optimization, implementation effects, and influencing factors, whereas research topics in China focus on grouping and payment mechanism establishment, medical cost change evaluation, medical quality control, and performance management reform exploration.


Currently, the field of DRGs in China is developing rapidly and attracting deepening research. However, the implementation depth of research in China remains insufficient compared with the in-depth research conducted abroad.



Owing to an aging population, medical innovation, and rising health needs, countries face increasing medical demands and rapidly growing service costs. Controlling medical expenses, conserving health resources, and providing affordable, high-quality medical care have become urgent problems for public hospitals [1]. To solve this practical problem, in the 1970s, diagnosis-related groups (DRGs) were established in the USA [2]. DRGs prioritize clinical similarity and consider resource consumption, grouping cases by disease severity, diagnosis complexity, treatment mode, and resource consumption to set payment standards [3]. As a refined medical payment tool, its appearance is conducive to protecting the rights and interests of patients, hospitals, and medical insurance. It can effectively meet the requirements of health expenditure control, hospital management, and medical resource allocation [4]. Therefore, different countries have developed and implemented adapted versions based on the payment method of DRGs in the USA and have achieved results in controlling medical expenses [5,6,7]. Since the late 1980s, China has introduced DRGs to control medical expenses and reduce patient burdens [8]. After more than 30 years of localization development and pilot programs, DRGs have proven effective in managing insurance fees, resource use, and medical quality [9, 10].

As medical insurance payment system reforms deepen, scholars worldwide are exploring diverse topics, generating new ideas, and proposing innovative theories and methods. Research on DRGs in China started relatively late, and the overall quality of the published papers is lower than that of papers from developed countries. Problems such as limited research scope, insufficient methodological innovation, and considerable homogenization persist [11]. Exploring the development status and research prospects of DRGs in China in light of foreign research experience is therefore crucial. To this end, numerous researchers have reviewed the DRG literature using bibliometric tools such as CiteSpace and VOSviewer and have summarized worldwide research directions through keyword clustering [12, 13]. However, this method is limited to keywords, authors, and research institutions and cannot analyze the overall content of the text. Moreover, for documents with unclear semantic relations and a loose logical structure, keyword clustering can easily miss the main keywords [14]. Latent Dirichlet Allocation (LDA) addresses this problem by using the text itself to strengthen the semantic associations among keywords within mined topics [15]. Few studies, however, have applied the LDA topic model to the DRG literature. This study therefore comprehensively reviews the Chinese and international literature on the DRG payment system since its implementation. Using the LDA topic model, it identifies hot topics and developmental trends, drawing on international experiences to support DRG implementation in China.


Source of data

The data sources for this study were the Web of Science (WoS) and the China National Knowledge Infrastructure (CNKI). The WoS database covers multidisciplinary, high-impact, international academic journals, which ensures the representativeness and authority of the literature sources [16]. It was chosen for the English literature because of its high applicability to the LDA topic model [17]. CNKI is the largest and most comprehensive Chinese literature database, covering more than 99% of Chinese academic and practical journals, and likewise ensures representative and authoritative sources [18]. Both databases serve as effective tools for bibliometric analysis. Therefore, this study selected the WoS and CNKI databases as the data sources for English and Chinese literature, respectively. For English literature, the following search strategy was employed: Topic = (“Diagnosis Related Groups”) OR Topic = (“Diagnosis Related Group”) OR Topic = (“Diagnosis-Related Groups”) OR Topic = (“Disease Related Group”) OR Topic = (“casemix”) OR Topic = (“case-mix”); Document Types = Article; Language = English; Indexes = Web of Science Core Collection (WoSCC). Non-research literature (such as conference notices, news reports, industry trends, and policy documents), review articles, literature with incomplete data, and literature with unavailable full texts were excluded. Literature on other professional terms sharing the abbreviation “DRG,” such as Dorsal Root Ganglion, Digital Raster Graphics, and Dynamic Route Guidance System, was also excluded. The cut-off date for data retrieval was April 12, 2023. After screening, 6,758 English and 3,321 Chinese articles were included.

Two reviewers independently screened the literature, assessing each retrieved record against the inclusion and exclusion criteria. Disagreements were resolved through consensus, and in cases of persistent differences, a third reviewer was consulted to make a final decision. Two reviewers also independently extracted the data from each included record, and a third reviewer then assessed the completeness and accuracy of the extracted data.

Research method

We retrieved relevant literature on DRGs from the WoS Core and CNKI databases. We used the abstracts of the literature as text data. We preprocessed the text data by incorporating keyword information and removing stop words. We applied the LDA topic model to explore textual topics, identify hot topics in the field based on topic strength, and analyze the evolutionary trends of topics by calculating the cosine similarity between adjacent-stage topics. The specific steps are as follows:

Text preprocessing

We standardized and supplemented the exported bibliographic data and built a custom dictionary and stop-word list. We segmented the abstract text, replaced synonyms, removed stop words, and built a dictionary and corpus according to the bag-of-words model. The custom dictionary was derived from the keywords of the literature included in this study. The stop list merged the Stanford University stop list [19], the Harbin Institute of Technology stop list, the Baidu stop list [20], and the NLTK stop list [21], and it was supplemented according to the pre-segmentation results to remove high-frequency words with no practical meaning. Chinese and English abstracts were segmented with the Jieba and NLTK modules in Python, respectively, based on the custom dictionary in precise mode. In addition, the term frequency-inverse document frequency (TF-IDF) algorithm was used to extract text features and to quantify the importance of each term in a document, verifying keyword relevance in the corpus [22] and improving the accuracy and interpretability of the topic model.
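The preprocessing chain described above (segmentation, synonym replacement, stop-word removal, bag-of-words counts, TF-IDF weighting) can be sketched in a few lines. The sketch below is only illustrative: it uses the Python standard library instead of Jieba and NLTK, and the stop-word set, synonym map, and sample abstracts are hypothetical stand-ins rather than the resources used in the study.

```python
import math
import re
from collections import Counter

# Hypothetical toy resources; the study's stop lists and custom dictionary are not reproduced here.
STOP_WORDS = {"the", "of", "and", "in", "a", "on", "for", "is"}
SYNONYMS = {"drgs": "drg"}

def tokenize(text):
    """Lowercase, split on non-letters, map synonyms, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [SYNONYMS.get(t, t) for t in tokens]
    return [t for t in tokens if t not in STOP_WORDS]

def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count Counter per document."""
    counts = [Counter(doc) for doc in docs]
    vocab = sorted(set().union(*docs))
    return vocab, counts

def tf_idf(counts):
    """TF-IDF with tf = count / document length and idf = log(N / document frequency)."""
    n_docs = len(counts)
    df = Counter(term for c in counts for term in c)  # number of documents containing each term
    weights = []
    for c in counts:
        total = sum(c.values())
        weights.append({t: (n / total) * math.log(n_docs / df[t]) for t, n in c.items()})
    return weights

abstracts = [
    "DRGs payment reform and the cost of hospital care",
    "Hospital performance management under DRG payment",
    "Coding quality of the medical record in DRG grouping",
]
docs = [tokenize(a) for a in abstracts]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
```

In this toy corpus the term "drg" occurs in every abstract, so its TF-IDF weight is zero, whereas a term confined to one abstract (such as "reform") receives the highest weight. This is the sense in which TF-IDF separates discriminating keywords from ubiquitous ones.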

LDA topic model

LDA, first proposed by Blei et al. [23], is a probabilistic topic model that identifies hidden semantic structures in unstructured text for topic abstraction and clustering. The model has a three-layer Bayesian structure of “document-topic-word”: each document is a mixture of several topics, and each topic is a probability distribution over the words of the vocabulary. The topic of a document can thus be represented by the group of words with the highest probability of occurrence [24]. Through the “document-topic probability distribution,” the model indicates which documents belong to a specific topic in the DRG domain; through the “topic-word probability distribution,” it yields the latent topics of the DRG domain and their word distributions [25], thereby capturing the semantic relevance between topics and documents and between topics and words. This study used the Gensim library in Python to construct the topic model. After debugging and validation, the hyperparameters α and β were set to 0.1 and 0.01, respectively, with 1000 iterations and 5 passes over the corpus. Under these settings, the model trained sufficiently to converge while showing good robustness and generalization.
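To make the “document-topic-word” structure concrete, the following sketch combines a document-topic distribution with topic-word distributions to obtain the probability of each word in a document, p(w | d) = Σ_k θ_k φ_kw, and reads off the highest-probability words of each topic. All numbers and the names theta, phi, and vocab are hypothetical toy values, not parameters estimated in the study.

```python
# Hypothetical toy parameters, not values estimated in the study.
vocab = ["payment", "cost", "coding", "quality"]

# Topic-word distributions: one row per topic, each row sums to 1.
phi = [
    [0.5, 0.4, 0.05, 0.05],   # topic 0: payment and cost
    [0.05, 0.05, 0.5, 0.4],   # topic 1: coding and quality
]

# Document-topic distribution for a single document (sums to 1).
theta = [0.8, 0.2]

def word_probabilities(theta, phi):
    """p(w | d) = sum over topics k of theta[k] * phi[k][w]."""
    n_words = len(phi[0])
    return [sum(theta[k] * phi[k][w] for k in range(len(theta)))
            for w in range(n_words)]

def top_words(topic_row, vocab, n=2):
    """Words with the highest probability under one topic."""
    ranked = sorted(zip(vocab, topic_row), key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ranked[:n]]

p_w = word_probabilities(theta, phi)
labels = [top_words(row, vocab) for row in phi]
```

When fitting such a model with Gensim's LdaModel, the document-topic prior α is passed as the alpha argument and the topic-word prior that the text calls β is passed as eta.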

The number of topics in an LDA model must be set in advance, commonly by calculating topic perplexity and coherence [26]. Perplexity [27] measures how well a probability distribution or probability model predicts a sample and helps determine the optimal number of topics. The lower the perplexity, the more stable the topic structure of the model and the lower the topic uncertainty. Topic coherence [28] measures how semantically related the high-probability words of each topic are. We calculated it with the sliding-window-based C_V coherence measure. The higher the score, the more distinct and interpretable the topics and the better the clustering effect of the model. We used the perplexity and coherence scores, along with visual analysis, to select the optimal number of topics, aiming for low perplexity and high coherence.

Perplexity is calculated as follows:

$$\mathrm{Perplexity}(D)=\exp\left\{-\frac{\sum_{d=1}^{M}\log p(w_d)}{\sum_{d=1}^{M}N_d}\right\}$$

where D is the document set, exp{} is the exponential function with base e, \(p(w_d)\) is the generation probability of document d, \(N_d\) is the number of words in document d, and M is the number of documents.
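As a small numerical check of the perplexity formula, the sketch below computes it from per-document log-likelihoods and document lengths. The function name and the toy inputs are illustrative assumptions; in practice the log-likelihoods would come from the fitted topic model.

```python
import math

def perplexity(log_likelihoods, doc_lengths):
    """exp(-(sum_d log p(w_d)) / (sum_d N_d)) over a set of documents."""
    total_tokens = sum(doc_lengths)
    if total_tokens == 0:
        raise ValueError("corpus has no tokens")
    return math.exp(-sum(log_likelihoods) / total_tokens)

# Hypothetical sanity check: a model that gives every token probability 1/V.
vocab_size = 50
doc_lengths = [10, 25, 5]
log_likelihoods = [n * math.log(1 / vocab_size) for n in doc_lengths]
uniform_ppl = perplexity(log_likelihoods, doc_lengths)
```

A model that assigns each token the uniform probability 1/V has perplexity V, so perplexity can be read as the effective number of equally likely words the model is choosing among. Lower values indicate a better fit.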

The coherence score is calculated as follows:

$$\mathrm{Coherence}(V)=\sum_{(v_i,v_j)\in V}\mathrm{score}(v_i,v_j,\epsilon)$$

Here, V is the set of words describing the topic, and \(\epsilon\) is a smoothing factor that guarantees that the score returns a real number.
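The pairwise structure of the coherence formula can be illustrated with a simpler scoring function than the sliding-window measure used in the study. The sketch below deliberately swaps in the document co-occurrence (UMass-style) score, score(v_i, v_j, ε) = log((D(v_i, v_j) + ε) / D(v_j)), where D counts the documents containing the given words. The function name and toy corpus are hypothetical.

```python
import math
from itertools import combinations

def umass_coherence(top_words, documents, eps=1.0):
    """Sum of log((D(later, earlier) + eps) / D(earlier)) over ranked word pairs.

    This is a UMass-style document co-occurrence score, used here as a
    simpler stand-in for the sliding-window measure described in the text.
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for i, j in combinations(range(len(top_words)), 2):
        earlier, later = top_words[i], top_words[j]
        d_earlier = doc_freq(earlier)
        if d_earlier == 0:
            continue  # skip words that never occur in the corpus
        score += math.log((doc_freq(later, earlier) + eps) / d_earlier)
    return score

# Hypothetical tokenized corpus.
documents = [
    ["payment", "cost", "reform"],
    ["payment", "cost", "hospital"],
    ["coding", "quality", "record"],
    ["coding", "quality", "payment"],
]
coherent = umass_coherence(["payment", "cost"], documents)
mixed = umass_coherence(["payment", "quality"], documents)
```

Words that tend to appear in the same documents ("payment" and "cost") score higher than words that rarely co-occur ("payment" and "quality"). The smoothing term eps keeps the logarithm finite when a pair never co-occurs.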

Thematic similarity

Topics in adjacent developmental stages often exhibit high similarity during evolution, and the evolutionary relationships between topics can be identified by extracting topic pairs with high similarity. Cosine similarity is a widely used measure that assesses the similarity between topics by the angle between their vectors, thereby determining the degree of correlation and the evolutionary path of topics [29]. Because the topic vectors have non-negative components, the cosine value lies in [0, 1]; the closer it is to 1, the more similar the two topic vectors. Following existing research, the average cosine similarity between topics of adjacent stages is used as a threshold: if the cosine value between two topics exceeds this mean, the two topics are considered to have an evolutionary relationship [30]. The cosine similarity is calculated as follows:

$$\mathrm{cosine\ similarity}(A,B)=\cos(\theta)=\frac{A\cdot B}{\|A\|\,\|B\|}=\frac{\sum_{i=1}^{n}A_iB_i}{\sqrt{\sum_{i=1}^{n}A_i^{2}}\sqrt{\sum_{i=1}^{n}B_i^{2}}}$$

Here, \(A_i\) and \(B_i\) are the i-th components of the vector representations of the two topics A and B.
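A minimal sketch of the cosine similarity calculation follows. The topic vectors are hypothetical topic-word probabilities over a shared four-word vocabulary, and returning 0.0 for a zero vector is a convention chosen here, not one stated in the text.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention for a zero vector (an assumption of this sketch)
    return dot / (norm_a * norm_b)

# Hypothetical topic-word vectors over the same vocabulary.
topic_a = [0.5, 0.3, 0.2, 0.0]
topic_b = [0.4, 0.4, 0.1, 0.1]
topic_c = [0.0, 0.0, 0.1, 0.9]

sim_ab = cosine_similarity(topic_a, topic_b)
sim_ac = cosine_similarity(topic_a, topic_c)
```

Topics A and B place most of their probability on the same words and are therefore far more similar than A and C. Note that the two topics must be expressed over the same vocabulary for the comparison to be meaningful.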

Thematic intensity

Topic intensity is the degree of attention paid to a topic in a certain period, reflecting how strongly the topic is represented across the documents of that period. The greater a topic’s intensity, the more likely it is to be considered a hot topic.

The formula for calculating the topic intensity is as follows:


Here, \(T_k\) is the intensity value of topic k, and \(\theta_k^d\) is the probability of topic k appearing in document d.
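Under the definition commonly used in LDA-based literature analysis, which is assumed here, the intensity of a topic is the mean of its document-topic probabilities. The symbol N, denoting the number of documents in the period, is introduced for this illustration and is not defined in the text above:

$$T_k=\frac{1}{N}\sum_{d=1}^{N}\theta_k^d$$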


Time and geographical distribution

Time distribution

Since the late 1970s, DRGs have predominantly been studied abroad. With the implementation of DRGs in an increasing number of countries and regions, the number of related studies is also increasing annually. According to the trend in the literature quantity distribution curve, the development of foreign DRGs can be divided into four stages: exploration stage (1979–1990), embryonic stage (1991–2002), development stage (2003–2013), and maturity stage (2014–2023). The number of published articles has shown a steady growth trend overall, and research in the field of DRGs in developed countries has matured. In the late 1980s, China introduced the concept of DRGs to control unreasonable growth in medical expenses and reduce the medical burden on patients. According to the number of papers published, the development of DRGs in China can be divided into three stages: the initial trial period (1985–2005), the active exploration period (2006–2015), and the rapid development period (2016–2023). Figure 1 shows the trend in the number of annual publications of DRG-related literature, both in China and internationally.

Fig. 1

Annual trend of diagnosis-related group (DRG) publications in Chinese and international contexts. Note: The study data search deadline was April 12, 2023. Therefore, the literature data for 2023 were incomplete

Geographical distribution

Figure 2 shows the number of publications by country. Research in the field of DRGs has primarily been conducted in developed countries and is concentrated in the USA and Europe. The top five countries by number of publications were the USA, UK, Canada, Netherlands, and Germany. The USA, where DRG-based payment originated and was first applied to the settlement of hospitalization expenses, leads the field, and its number of publications far exceeds that of other countries. China accounts for only 2.69% of the articles published globally, leaving considerable room for development in this field and for greater international influence.

Fig. 2

Top 10 countries for DRG-related research and number of publications

Figure 3 shows the regional distribution of DRG-related publications within China. Research in the field of DRGs is mainly concentrated in economically developed eastern coastal areas, such as Beijing and Shanghai. Among these, Beijing, the leading city in the development of DRGs in China, has the largest number of publications. Notably, Xinjiang stands out for its number of publications, suggesting that the reform of the medical and health system and high-quality development in western China are taking effect.

Fig. 3

Regional distribution of the number of DRG-related publications in China

Topic extraction and analysis based on LDA topic model

Determination of the optimal number of topics

The optimal number of topics was selected by combining the perplexity and coherence scores with the visualization results, favoring a number of topics with lower perplexity and a higher coherence score. Taking Chinese literature as an example, we took each integer from 1 to 20 as a candidate number of topics and calculated the corresponding perplexity and coherence scores, as illustrated in Fig. 4A. At K = 8, the perplexity is relatively low and the coherence score is relatively high, and both curves have obvious inflection points. Therefore, K = 8 was selected as the optimal number of topics for the entire period of Chinese literature. Similarly, according to Fig. 4B, C and D, the optimal numbers of topics for 1985–2005, 2006–2015, and 2016–2023 were determined to be K = 6, K = 6, and K = 8, respectively. By the same procedure, the optimal numbers of topics for the full period and the four successive stages of the English literature were K = 6, K = 5, K = 8, K = 5, and K = 6, respectively.

Fig. 4

A Determination of the optimal number of topics over the entire period. (a) Perplexity Score; (b) Coherence Score. B Determination of the optimal number of topics over the initial trial period. (a) Perplexity Score; (b) Coherence Score. C Determination of the optimal number of topics over the active exploration period. (a) Perplexity Score; (b) Coherence Score. D Determination of the optimal number of topics over the rapid development period. (a) Perplexity Score; (b) Coherence Score

Topic content analysis

The LDA modeling results give the topics studied in the field of DRGs in China and internationally, together with the word distribution of each topic, namely the “topic-word probability distribution.” For each topic, the 10 feature words with the highest probability were sorted, and a topic name was summarized from them to form a list of topics in DRG research.

The six topics in the English literature were payment systems and cost analysis; effect evaluation and performance management; disease mortality and status assessment; case grouping based on disease severity; internal medicine nursing and quality control; and surgical procedures and operating room management. Table 1 presents the topic summary and subject keywords. The eight topics in the Chinese literature were payment mechanisms, medical expenses, the front sheet of medical records, reform background, performance management, medical service, clinical pathway, and effectiveness evaluation. The topic summary and subject keywords are presented in Table 2.

Table 1 Probability distribution of topics and keywords in English literature
Table 2 Probability distribution of topics and keywords in Chinese literature

Hot topics analysis

The LDA topic mining results give the probability of topic k appearing in document d, namely the “document-topic probability distribution,” which represents the probability distribution of the topics within each document. The larger the probability value, the more likely it is that a document belongs to the topic. The topic intensity of each topic is calculated from the document-topic probability distribution, as shown in Fig. 5. The red dotted line is the average intensity across topics, and a topic whose intensity exceeds this average is a hot topic in the DRG field. The figure shows that the hot topics in the field of DRGs abroad are Topic1 (payment system and cost analysis), Topic5 (internal medicine nursing and quality control), and Topic6 (surgical procedures and operating room management), while those in China are Topic1 (payment mechanism), Topic4 (reform background), Topic5 (performance management), Topic7 (clinical pathway), and Topic8 (effectiveness evaluation).
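The hot-topic rule (intensity above the average intensity) can be sketched as follows. The sketch assumes that topic intensity is the mean document-topic probability, which is a common definition but an assumption here, and the document-topic matrix is a hypothetical toy example.

```python
def topic_intensity(doc_topic):
    """Mean probability of each topic across documents (assumed definition)."""
    n_docs = len(doc_topic)
    n_topics = len(doc_topic[0])
    return [sum(row[k] for row in doc_topic) / n_docs for k in range(n_topics)]

def hot_topics(intensity):
    """Indices of topics whose intensity exceeds the average intensity."""
    mean = sum(intensity) / len(intensity)
    return [k for k, value in enumerate(intensity) if value > mean]

# Hypothetical document-topic distribution: one row per document, rows sum to 1.
doc_topic = [
    [0.7, 0.2, 0.1],
    [0.6, 0.1, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.2, 0.3],
]
intensity = topic_intensity(doc_topic)
hot = hot_topics(intensity)
```

Because each row sums to 1, the intensities also sum to 1, so under this definition the average-intensity threshold is simply 1/K for K topics.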

Fig. 5

Histogram of topic intensity corresponding to each topic category. a English literature; b Chinese literature

Analysis of theme evolution path

After LDA topic modeling was performed on the literature of each stage, the cosine similarity between topics in adjacent time stages was calculated. Based on the topic pairs whose cosine similarity indicated an evolutionary relationship, a Sankey diagram of topic evolution was drawn, as shown in Fig. 6, revealing the evolutionary paths and logical relationships between topics. The horizontal axis represents the stages of DRG development in China and internationally, the vertical axis represents the topics at each stage, and the connecting lines run from left to right. The thickness of the line between two rectangular blocks represents the similarity of the topics: the thicker the line, the higher the similarity and the stronger the evolutionary relationship.
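The link-building step behind such a diagram can be sketched as follows: compute the cosine similarity for every pair of topics in two adjacent stages, then keep the pairs above the mean similarity. The topic vectors are hypothetical and are assumed to be expressed over a shared vocabulary. The function names are illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def evolution_links(stage_prev, stage_next):
    """Return (i, j, similarity) for topic pairs above the mean pairwise similarity."""
    sims = [(i, j, cosine(a, b))
            for i, a in enumerate(stage_prev)
            for j, b in enumerate(stage_next)]
    mean_sim = sum(s for _, _, s in sims) / len(sims)
    return [(i, j, s) for i, j, s in sims if s > mean_sim]

# Hypothetical topic-word vectors for two adjacent stages, same 4-word vocabulary.
stage_1 = [[0.6, 0.3, 0.1, 0.0], [0.0, 0.1, 0.3, 0.6]]
stage_2 = [[0.5, 0.4, 0.1, 0.0], [0.1, 0.1, 0.4, 0.4]]

links = evolution_links(stage_1, stage_2)
link_pairs = [(i, j) for i, j, _ in links]
```

Each retained triple corresponds to one band in the Sankey diagram, with the similarity value setting the band's thickness.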

Fig. 6

Sankey diagram of topic evolution. a English literature; b Chinese literature


The DRG payment system groups homogeneous cases on the basis of disease diagnosis and operation. In practice, it controls medical expenses and improves the medical environment. Global research into DRGs continues to increase. Since the reform of payment modes in China, the quantity and quality of DRG-related research have increased markedly; however, a gap remains compared with developed countries. Based on the above results, the following conclusions can be drawn:

DRG research in China has developed rapidly, but regional development remains uneven

Globally, English literature on DRGs was published earlier than Chinese literature, and the number of published articles generally showed an increasing trend. In the 1970s, Robert B. Fetter and John D. Thompson of Yale University in the USA pioneered the design and development of the DRG payment system [31], a milestone for case-based payment grounded in clinical and resource consumption similarity. The DRG payment system thus entered an “exploration stage.” Since the 1990s, most high-income countries have adopted DRG payment as the primary means of reimbursing hospital acute inpatient expenses. Building upon the USA DRGs, countries have successfully explored reforms and adapted their own DRG systems [32,33,34], ushering DRG development into an “embryonic stage.” Liu [35] attributed the sudden surge in literature volume in 1991 to database constraints. The literature volume has increased steadily since then, in line with the observations of Liu et al. [36]. Furthermore, with expanded patient coverage, new medical technologies, and standardized diagnosis and treatment coding, the USA has developed a medical payment system based on an international single-disease grouping system. DRGs have thus been promoted globally as a revolutionary means of medical quality management and cost reimbursement [37], entering a “development stage.” In recent years, the number of studies has stabilized, indicating that research on DRGs in developed countries has reached a “maturity stage” [38].

China began to focus on DRGs in the 1980s and made its first attempt at a large-scale DRG pilot study in hospitals in Beijing. However, informatization in China was relatively underdeveloped at that time, no unified electronic medical record had been established, and little data was available to support DRG-related research [39]. The number of DRG studies was therefore low, and this was the “initial trial period.” In 2006, China began localizing its DRG system and encouraged regions with suitable conditions to gradually explore payment by disease group. Advances in network technology, the standardization of electronic medical records, and policy formulation fostered the development of DRGs [40]. The number of DRG publications began to increase, marking the “active exploration period.” In 2016, China began to issue national guidance on deepening the reform of basic medical insurance payment methods, released the CHS-DRG grouping scheme based on China’s national conditions, and designated 30 cities as national pilots for DRG payment. DRGs thus entered the actual payment stage [41,42,43], during which the number of DRG publications in China increased significantly, marking the “rapid development period.”

Research on DRGs in China is primarily distributed in the eastern coastal areas, possibly because DRGs were first piloted in Beijing after their introduction into China. Information technology developed earlier in Beijing, Shanghai, Guangdong, and other first-tier areas, where scientific research institutions and talent are more concentrated; these areas therefore produced more literature. In contrast, owing to less developed informatization, an insufficient supply of medical resources, and an uneven distribution of talent in western China [44], the amount of DRG research literature there is relatively small. Compared with other western provinces and cities, however, Xinjiang has a relatively large number of papers. This may be because it covers one-sixth of the total area of China, and its strategic position and geographical advantages make it representative of the western region [45]. In addition, Urumqi, an early national pilot city for DRG payment, implemented actual DRG payment in six local hospitals. The research output of Xinjiang therefore merits attention.

DRG research in China covers a wide range of topics but lacks depth

The LDA topic modeling shows that, within the DRG payment policy environment, foreign research also addresses difficult problems concerning diseases and fields not yet covered by DRGs. The topics include influencing factors and optimization suggestions for DRG grouping and payment based on disease severity, internal medicine nursing, and surgical operations, as well as DRG versions for nursing homes, home care, and long-term care [46]. Chinese research has primarily focused on DRG reform background analysis, payment mechanisms, cost accounting, implementation effect evaluation, and medical quality management. To strengthen medical quality, standardize medical behavior, and save medical resources during DRG implementation, topics such as clinical pathway management, quality control of the medical record front sheet and coding, and the selection of performance appraisal indicators have also attracted attention. This attention may stem from moral hazard behaviors such as upcoding, patient selection, inhibition of new technologies and services, and reduced investment in disease prevention and health promotion [10, 47]. Research on DRGs in China is progressing; however, compared with the mature experience abroad, Chinese research tends to study the theoretical framework and practical exploration of DRGs as a whole. It lacks refined research on the DRG implementation path from a disease perspective, and research on quality control in DRG operation is not yet mature. The overall implementation depth is therefore insufficient, and the research field must be expanded.

Additionally, the topic intensity shows that the research hotspots of foreign scholars in the field of DRGs are care quality for specific diseases in surgery, the impact of surgical procedures on DRGs, and cost analysis and payment system research. In contrast, research hotspots in China are mechanism analysis, hospital management, and effect evaluation. Difficult problems such as accurate case grouping for various diseases, continuous innovation of payment modes, and efficient evaluation of medical services require further study. This conclusion is consistent with the findings of Yin Yani [48].

DRG research in China is progressing but remains in a long-term exploration stage

The Sankey diagram of literature evolution shows that, after a short period of methodological research, foreign countries began applying DRGs to medical nursing and surgical operations. DRGs were optimized based on indices such as disease severity and patient survival status and were used in hospital management, cost analysis, cost control, efficiency evaluation, and quality management. After a long period of development and innovation, DRG-related research abroad has almost reached maturity. Research on DRGs in China, by contrast, has followed the development path of “basic theory-practical application-quality control,” which is consistent with the results of Liu [49] on the evolutionary history of DRGs in developing countries. Early research focused on basic theory, examining DRGs and prepayment mechanisms in developed countries such as the United States and proposing a group payment strategy suitable for China. In the middle stage, most studies were practical explorations and comparative analyses. China began to conduct application research on DRGs, focusing on hospital pilot research, medical insurance system reform, payment mode comparisons, and implementation effect evaluations. The negative impact of implementing DRGs attracted the attention of many scholars, so research on quality evaluation and clinical pathways gradually increased in this stage to strengthen the standardized management of diagnosis and treatment processes. In recent years, China has begun to conduct research related to DRG hospital management and quality control, including performance management, the medical record front sheet, and pathway optimization.

The evolutionary path shows that DRG research in China is steadily improving; however, it remains in a long-term stage of learning from others and exploration. The challenges that implementing DRGs poses to healthcare quality continue to warrant attention. Gu [50] argued that DRG payment in China is only in the experimental stage and that the healthcare supply system lacks a sufficient governance structure and incentive measures. This indicates that a management system better suited to national conditions must be found during the DRG pilot process. Meanwhile, topics related to expenses run through the entire evolutionary path. As the level of medical services in China develops rapidly, controlling the growth of medical expenses has become the core goal and research focus of DRGs. Li [51] reached a similar conclusion in their review: the main research content of DRGs in China is medical expenses. Thus, Chinese research should analyze the factors influencing medical expenses under the DRG payment system. The results on research hotspots and topic evolution paths suggest that research on DRGs will focus on medical quality management, cost control effect evaluation, accurate case grouping, and implementation path optimization.

These findings show that research has made breakthroughs since the DRG payment mode was piloted and applied in China. However, compared with the widespread adoption of DRGs in developed countries, China still faces challenges. Therefore, China should strengthen cooperation between Chinese and international academic institutions, keep abreast of the latest research results and practical experience, and actively explore and implement a payment system adapted to its national conditions. China should also further optimize the allocation of medical resources, improve the quality and efficiency of medical services, reform and innovate medical insurance payment methods, and support the popularization and application of DRGs. The details are as follows:

  • First, regional resources should be balanced so that the payment reform achieves effective coverage. DRG pilots should be extended from the national pilot sites to inland regions. DRG management modes and reimbursement standards should be formulated according to the economic conditions and epidemiological characteristics of each region. Provincial DRG payment reform pilots should be promoted, and the regional coverage of DRG reform should be increased. At the same time, multi-party cooperation among government departments and medical institutions at all levels should be strengthened. Dedicated DRG working groups or expert advisory committees should be established, and meetings, seminars, and experience-exchange activities should be organized regularly to promote information sharing, communication, and DRG research and application nationwide.

  • Second, interdepartmental linkage should be strengthened and the depth of DRG research improved. While the research field continues to expand, studies should address the key and difficult problems in implementing DRGs and be refined to ensure the accuracy and applicability of DRGs across disease areas. At the same time, health, finance, medical insurance, and other departments should coordinate to advance system reform, payment reform, performance reform, and revenue reform together. Such coordination integrates policies, optimizes resource allocation, and deeply integrates the achievements of multiple reforms through the linkage of medical treatment, medical insurance, and medicine.

  • Third, theory should be combined with practice to promote high-quality medical reform. Scientific, transparent, and operable theoretical frameworks, such as the DRG classification standard, the cost weight calculation method, and clinical pathway management guidelines, should be formulated and improved. These frameworks provide operational guidance for quality control in medical institutions and promote fine-grained management and operational efficiency in hospitals. At the same time, training, consultation, and technical support should be provided to address practical difficulties such as medical data collection, completion of the first page of medical records, and weight calculation rules. This support helps medical institutions fully understand and appropriately apply the DRG system and provides an empirical basis and direction for improving theoretical research. Additionally, a rigorous, high-quality monitoring and evaluation system should regularly assess the impact of DRGs, identify problems, and prompt timely improvements to inform policy adjustments and enhance DRG application in China’s medical system.


We analyzed the temporal and regional distribution of Chinese and international publications by comprehensively searching Chinese and English databases. We used the LDA model to mine latent topics in Chinese and international DRG research, identified hot topics from topic intensity, and analyzed the evolution path and research trends of DRG development from the cosine similarity between topics. The results provide a reference for in-depth and future research on DRGs. Abroad, research hotspots primarily concerned the exploration of DRGs, related research under the payment policy environment, and previously unexplored fields of DRGs. In contrast, Chinese research started late; it first focused on basic theory and foreign experience and then gradually turned to practical exploration and comparative analysis. In recent years, Chinese research has focused mainly on hospital management and quality control. In the future, Chinese research should focus on the quality of medical care during the implementation of DRGs and analyze the factors influencing medical expenses from the perspective of DRGs. Future research should also explore related fields such as evaluation of cost control effects, accurate case grouping, and clinical pathway optimization. Additionally, DRG optimization management should be strengthened for complex and high-risk diseases, as well as for special populations. This would promote DRG application and popularization in China. Strengthening academic exchange and cooperation, deepening coordination between medical insurance and relevant departments, and establishing a reform mechanism that integrates research and practice will enhance China’s medical quality management, optimize resource allocation, control expenses, and contribute to global healthcare development.
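As a minimal sketch of the two quantities mentioned above, the following Python snippet assumes that topic intensity is the mean document-topic probability across the abstracts of a stage (a common definition; the exact formula is not restated in this section) and that topics in adjacent stages are compared through their topic-word distributions over a shared vocabulary. The function names and toy matrices are hypothetical and do not reproduce the study's code or data.

```python
import numpy as np


def topic_intensity(doc_topic):
    """Mean probability of each topic across documents (assumed definition)."""
    doc_topic = np.asarray(doc_topic, dtype=float)
    return doc_topic.mean(axis=0)


def cosine_similarity_matrix(topics_a, topics_b):
    """Cosine similarity between every topic in stage A and every topic in stage B.

    Each row is a topic-word distribution over the same vocabulary.
    """
    a = np.asarray(topics_a, dtype=float)
    b = np.asarray(topics_b, dtype=float)
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


# Toy document-topic matrix: 4 abstracts, 2 topics (illustrative values only).
doc_topic = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
intensity = topic_intensity(doc_topic)

# Toy topic-word matrices for two adjacent stages over a 3-word vocabulary.
stage_1 = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
stage_2 = [[0.6, 0.3, 0.1], [0.2, 0.2, 0.6]]
similarity = cosine_similarity_matrix(stage_1, stage_2)

# For each stage-1 topic, the most similar stage-2 topic suggests an evolution link.
best_match = similarity.argmax(axis=1)
print(intensity, best_match)
```

In practice, the document-topic and topic-word matrices would come from the LDA models fitted for each stage. Any similarity threshold used to decide which links appear in an evolution diagram would be a further assumption.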


First, the literature sample was drawn only from the WoS and CNKI databases. Although these databases are authoritative, DRG-related literature indexed only in other databases may have been omitted. Therefore, future research should include additional databases. Second, this study included only literature in Chinese and English; future research should broaden the range of languages. Additionally, although the LDA model overcomes the one-sidedness of mining research hotspots from journal keywords, this study analyzed only abstracts, and the topic labels were assigned according to the topic words and subjective judgment, without the participation of domain experts. Moreover, literature processing was performed manually, which may have led to incomplete data collection and omissions.

Availability of data and materials

The datasets and analysis code used and/or analyzed during the current study are available from the corresponding author on reasonable request.



DRGs: Diagnosis-Related Groups

LDA: Latent Dirichlet Allocation

WoS: Web of Science

SCIE: Science Citation Index Expanded

CNKI: China National Knowledge Infrastructure

TF-IDF: Term Frequency–Inverse Document Frequency

CV: Coefficient of Variation


  1. Chen YJ, Zhang XY, Yan JQ, Xue T, Qian MC, Ying XH. Impact of diagnosis-related groups on inpatient quality of health care: a systematic review and meta-analysis. Inquiry. 2023;60:469580231167011.

  2. Fetter RB, Shin Y, Freeman JL, Averill RF, Thompson JD. Case mix definition by diagnosis-related groups. Med Care. 1980;18(2 Suppl):iii, 1–53.

  3. Wu Y, Fung H, Shum HM, et al. Evaluation of length of stay, care volume, in-hospital mortality, and emergency readmission rate associated with use of diagnosis-related groups for internal resource allocation in public hospitals in Hong Kong. JAMA Netw Open. 2022;5(2):e2145685.

  4. Yu L, Lang J. Diagnosis-related Groups (DRG) pricing and payment policy in China: where are we? Hepatobiliary Surg Nutr. 2020;9(6):771–3.

  5. Pilla J, Hindle D. Adapting DRGs: the British, Canadian and Australian experiences. Health Inf Manag. 1994;24(3):87–93.

  6. Busse R, Geissler A, Aaviksoo A, et al. Diagnosis related groups in Europe: moving towards transparency, efficiency, and quality in hospitals? BMJ. 2013;346:f3197.

  7. Mathauer I, Wittenbecher F. Hospital payment systems based on diagnosis-related groups: experiences in low- and middle-income countries. Bull World Health Organ. 2013;91(10):746–756a.

  8. Dong Y, Wang S. Current situation and suggestions on the development of Diagnosis Related Group (DRG) policy in China. Asian J Soc Pharm. 2022;17(4):367–75.

  9. Jian WY, Lu M, Liu GF, Chan KY, Poon AN. Beijing’s diagnosis-related group payment reform pilot: impact on quality of acute myocardial infarction care. Soc Sci Med. 2019;243:112590.

  10. Zhang LL, Sun LH. Impacts of diagnosis-related groups payment on the healthcare providers’ behavior in China: a cross-sectional study among physicians. Risk Manag Healthc Policy. 2021;14:2263–76.

  11. Chen ME, Yang JY, Yue WJ, Kong FX, Su B, Qian YL. Research status and trend of diagnosis related group (DRG) at home and abroad. Chin Rural Health Service Administration. 2022;42(09):670–8.

  12. Lang X, Guo J, Li Y, Yang F, Feng X. A bibliometric analysis of diagnosis related groups from 2013 to 2022. Risk Manag Healthc Policy. 2023;16:1215–28.

  13. Xu H, Hua SL. International Diagnosis Related Groups (DRG) research: progress and trends, a bibliometric analysis based on CiteSpace and VOSviewer. Journal of Shenyang Pharmaceutical University. 2020;37(12):1125–32.

  14. Qian WJ, Zhang YK. Visual knowledge mapping analysis of human destiny community research. Journal of Southwest Minzu University (Humanities and Social Sciences Edition). 2020;41(07):222–30.

  15. Tazibt AA, Aoughlis F. Latent Dirichlet allocation-based temporal summarization. International Journal of Web Information Systems. 2019;15(1):83–102.

  16. Birkle C, Pendlebury DA, Schnell J, et al. Web of science as a data source for research on scientific and scholarly activity. Quantitative Science Studies. 2020;1(1):363–76.

  17. Ye Z, Li Z, Zhong S, et al. The recent two decades of traumatic brain injury: a bibliometric analysis and systematic review. Int J Surg. 2024;110(6):3745–59.

  18. Sun S, Zhong L, Law R, et al. Health tourism evolution: a review based on bibliometric analysis and the China national knowledge infrastructure database. Sustainability. 2022;14(16):10435.

  19. Singh SP, Karkare S, Baswan SM, et al. The application of text mining algorithms in summarizing trends in antiepileptic drug research. bioRxiv. 2018:269308. Posted February 22, 2018.

  20. goto456. Commonly used Chinese stop word list. Accessed 18 Apr 2023.

  21. Perkins J. Python 3 text processing with NLTK 3 cookbook. Packt Publishing Ltd; 2014.

  22. Albalawi R, Yeap TH, Benyoucef M. Using topic modeling methods for short-text data: a comparative analysis. Front Artif Intell. 2020;3:42.

  23. Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. Journal of Machine Learning Research. 2003;3(Jan):993–1022.

  24. Gan J, Qi Y. Selection of the optimal number of topics for LDA topic model—taking patent policy analysis as an example. Entropy. 2021;23(10):1301.

  25. Wu S, Liu J, Liu L. Modeling method of internet public information data mining based on probabilistic topic model. J Supercomput. 2019;75:5882–97.

  26. Zhao W, Chen JJ, Perkins R, et al. A heuristic approach to determine an appropriate number of topics in topic modeling. BMC Bioinf. 2015;16(Suppl 13):S8.

  27. Arun R, Suresh V, Veni Madhavan C, Narasimha Murthy M. On finding the natural number of topics with latent Dirichlet allocation: some observations. In: Paper presented at: Advances in knowledge discovery and data mining: 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010. Proceedings, Part I. 2010.

  28. Stevens K, Kegelmeyer P, Andrzejewski D, Buttler D. Exploring topic coherence over many models and many topics. In: Paper presented at: Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. 2012.

  29. Qian Y, Liu Y, Sheng QZ. Understanding hierarchical structural evolution in a scientific discipline: a case study of artificial intelligence. J Informet. 2020;14(3):101047.

  30. Jung S, Segev A. Semantic similarity analysis between future topics and their neighbors in topic networks for network-based topic evolution. In: Paper presented at: 2022 IEEE International Conference on Big Data (Big Data). 2022.

  31. Fetter RB, Thompson JD. Diagnosis related groups (DRGs) and nursing resources. Health Systems Management Group, School of Organization and Management, Yale University; 1987.

  32. Geissler A, Quentin W, Scheller-Kreinsen D, Busse R. Introduction to DRGs in Europe: common objectives across different hospital systems. In: Diagnosis related groups in Europe: moving towards transparency, efficiency and quality in hospitals. 2011. p. 9–21.

  33. Hayashida K, Murakami G, Matsuda S, Fushimi K. History and profile of diagnosis procedure combination (DPC): development of a real data collection system for acute inpatient care in Japan. J Epidemiol. 2021;31(1):1–11.

  34. Böcking W, Ahrens U, Kirch W, Milakovic M. First results of the introduction of DRGs in Germany and overview of experience from other DRG countries. J Public Health. 2005;13:128–37.

  35. Liu W. Caveats for the use of Web of Science Core Collection in old literature retrieval and historical bibliometric analysis. Technol Forecast Soc Chang. 2021;172:121023.

  36. Liu W, Ni R, Hu G. Web of Science Core Collection’s coverage expansion: the forgotten Arts & Humanities Citation Index? Scientometrics. 2024;129:933–55.

  37. Goldfield N. The evolution of diagnosis-related groups (DRGs): from its beginnings in case-mix and resource use theory, to its implementation for payment and now for its current utilization for quality within and outside the hospital. Quality Management in Healthcare. 2010;19(1):3–16.

  38. Zhong Q, Huang Z, Chen C, Chen F, Wang H, Zhu X. Study on DRGs of single disease based on drug cost of patients with primary liver cancer. In: Paper presented at: Proceedings of the 1st International Symposium on Artificial Intelligence in Medical Sciences. 2020.

  39. Jiao WP. Diagnosis-related groups’ payment reform in Beijing. Chin Med J (Engl). 2018;131(14):1763–4.

  40. Jian W, Lu M, Han W, Hu M. Introducing diagnosis-related groups: is the information system ready? Int J Health Plann Manage. 2016;31(1):E58-68.

  41. General Office of the State Council. Guiding opinions of the General Office of the State Council on further deepening the reform of the payment method of basic medical insurance. In: Gazette of the State Council of the People’s Republic of China, no. 20. 2017. p. 9–13.

  42. National Healthcare Security Administration of The People's Republic of China. Notice on printing and distributing the national pilot technical specifications and grouping scheme for DRG payment related to disease diagnosis. Accessed 24 Apr 2023.

  43. National Healthcare Security Administration of The People's Republic of China. Notice of the Office of the National Medical Security Administration on printing and distributing the list of DRG/DIP payment demonstration sites. Accessed 29 Apr 2023.

  44. Wang Z, He HY, Liu X, Wei HK, Feng QM, Wei B. Health resource allocation in Western China from 2014 to 2018. Arch Public Health. 2023;81(1):30.

  45. Zhang X, Liu J, Han X. Study on the development and evaluation of open economy in Xinjiang. The Border Econ Cult. 2023;05:25–30.

  46. Wang HY, Zhou JH, Fang L, Peng Y, Jin C. Development, evolution and features of DRGs in the United States and enlightenment for China. Chinese Health Quality Management. 2018;25(06):25–7.

  47. Zou K, Li HY, Zhou D, Liao ZJ. The effects of diagnosis-related groups payment on hospital healthcare in China: a systematic review. BMC Health Serv Res. 2020;20(1):112.

  48. Yin YN, Miao CX, Zhuo L, et al. Researches on the evolution and visualization of DRGs in China based on CiteSpace. Chinese Health Service Management. 2021;38(12):957–60.

  49. Liu R, Shi JW, Yang BL, et al. Charting a path forward: policy analysis of China’s evolved DRG-based hospital payment system. Int Health. 2017;9(5):317–24.

  50. Gu E, Page-Jarrett I. The top-level design of social health insurance reforms in China: towards universal coverage, improved benefit design, and smart payment methods. Journal of Chinese Governance. 2018;3(3):331–50.

  51. Li F, Zhao YH, He H. Bibliometric analysis of studies on diagnosis-related groups in China. Chinese Medical Record English Edition. 2013;1(2):81–4.



The authors would like to thank everyone who helped with this study.


This research was funded by the Chongqing Talent Plan for 2022 of the Chongqing Municipal Science and Technology Bureau (cstc 2022 ycjh-bgzxm 0015) and by the Program for Youth Innovation in Future Medicine, Chongqing Medical University (W0063).

Author information

Authors and Affiliations



XC: Conception and design of the work, as well as manuscript writing; MZ, QB, and BT: Data analysis and interpretation; PP, YZ, YT, and XT: Data collection; DD: Critical revision of the article. All authors approved the final manuscript before submission.

Corresponding author

Correspondence to Dan Deng.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Chen, X., Zhang, M., Bu, Q. et al. Exploring hot topics and evolutionary paths in the Diagnosis-Related Groups (DRGs) field: a comparative study using LDA modeling. BMC Health Serv Res 24, 756 (2024).
