Alongside economic growth, interest in healthcare information among South Korean citizens has increased, propelled by advances in medical technology and improvements in education. The desire for better health has intensified as the population ages rapidly. Both individuals and society are actively interested in enhancing health and fitness and prolonging lifespan. This growing interest extends to oral health, which plays a role in improving quality of life1). As consumer interest in oral health continues to rise, competition in the premium-grade toothbrush market within the oral care product industry has intensified. Growing competition has likewise led to the introduction of various functional toothpaste products2). Interest in oral cleansers has been increasing, not only for the management of oral diseases and the oral environment but also for functions such as bad breath elimination and teeth whitening. Effective oral cleansers targeting periodontal disease and oral environmental management have been consistently researched. In South Korea, active research focuses on natural substances that induce changes in oral environments and examines their antibacterial effects on periodontal diseases and oral bacteria3). Interest in dental treatments extends beyond oral health and includes an increased focus on aesthetic dental care. Owing to recent westernized dietary changes, there has been a growing demand for and interest in orthodontics4). Moreover, with an increasing proportion of adults seeking orthodontic treatment and an increase in income levels, there is a heightened desire for aesthetic procedures. Consequently, ongoing developments in orthodontic device innovations and clinical research are aligned with these trends. Transparent aligners and lingual orthodontic devices have been widely adopted as key tools in aesthetic orthodontic treatments. 
Companies that provide transparent aligners are developing various methods and auxiliary devices for tooth movement5). In dental prosthodontics, numerous aesthetic materials and restorative products have been introduced, notably zirconia, which has garnered significant attention owing to its excellent strength, wear resistance, and high biocompatibility6). As the demand for aesthetic restorative materials increases, computer-aided design/computer-aided manufacturing (CAD/CAM) methods have been introduced. In dentistry, the use of three-dimensional (3D) printing technology is steadily increasing in areas such as tooth models, temporary teeth, transparent aligners, and implants, leading to continuous advancements in associated technologies7,8).
Although many studies have advanced alongside the development of oral health-related technologies, comprehensive research that encompasses all such technologies is difficult to find. There are diverse methods for predicting promising technologies through technological trend analysis. Traditional methods such as the Delphi technique, analytic hierarchy process, scenario technique, expert panels, and trend extrapolation are limited by their reliance on the subjective opinions of technology experts. Consequently, data-based methods utilizing paper and patent information have been predominantly employed9). Patent data, for example, can be used as a metric for measuring trends and achievements in technological research and development. A patent is a textual document containing specific technical and scientific information about an invention and detailing the elements intended for legal protection. Moreover, patent data are quantifiable, objective data on the latest technology and are globally standardized according to specific formats, which makes them applicable for various purposes. Sawng et al.10) and Jo et al.11) have previously used patent information as analytical data for technological trends12). In this study, we performed network analysis and topic modeling using text analysis programs to comprehensively analyze patent data. Network analysis breaks down the words constituting sentences to extract meaningful concepts and represents the relationships among words as a network. It allows us to identify keywords and understand the connections between words, thereby enabling us to infer the context of sentences within a text13). Kim et al.14) also applied network analysis methods to analyze keywords related to dental hygiene. Topic modeling is a technique used to estimate latent and meaningful topics within a collection of unstructured text data. 
Among topic-modeling algorithms, Latent Dirichlet Allocation (LDA) is a method that estimates a specified number of topics by considering the probability distribution of terms related to the topics15). Kim et al.16) and Lee17) conducted patent analyses by applying topic-modeling techniques to uncover hidden themes within a large volume of documents.
Although previous studies have examined text analysis using patent data, there remains a shortage of such research in the healthcare field. This study aimed to use text-mining techniques to examine the interrelationships among keywords in patents related to oral health and to extract latent topics for visualization. Through this process, the study sought to derive significant keywords, analyze core and specific technologies within different topics, and comprehend technological trends. Additionally, it sought to encourage the active use of text analysis techniques in dental hygiene and to provide patent-derived analysis as fundamental data for promoting advancements in oral health-related technologies.
This study progressed sequentially, as depicted in Fig. 1, and involved data collection, data preprocessing, keyword extraction, frequency analysis, network analysis and visualization, and topic modeling.
As of July 31, 2023, a total of 11,710 patent documents related to oral health were collected from the domestic patent database service platform, KIPRIS (www.kipris.or.kr). The scope of data collection was not limited by time and included patents that were either publicly disclosed or registered. Search queries were constructed with keywords related to “dentistry,” “teeth,” and “oral health,” and keyword selection was based on synonyms provided by the KIPRIS search term expansion function. After gathering the patent data, invention titles and abstracts were reviewed, and titles associated with pet-related content were excluded. Additionally, among Korean invention titles containing the terms “teeth,” “tooth,” or “dental,” those not semantically related to oral health were excluded. The final research dataset comprised 6,865 invention titles in Korean, collected from March 2003 to July 2023.
We used Python version 3.10 (Python Software Foundation, Santa Clara, CA, USA) as the analytical tool for text analysis. For the Korean text, we employed the Hannanum morphological analyzer from the Korean Natural Language Processing in Python (KoNLPy) package. To extract meaningful keywords, we preprocessed the tokenized words. Transliterated Korean variants, such as “seuk-ru, im-peul-laen-teu, im-peul-lam-teu, beu-rae-kit, pik-seu-chyu, pik-seu-chyu-eo, pik-seu-chwo, yu-nit, yu-ni-teu, eo-bu-teu-meon-teu, and eo-byu-teu-meon-teu,” were standardized to single forms with the same meanings: screw, implant, bracket, fixture, unit, and abutment. Preprocessing also removed punctuation, numbers, and stop words, and only words consisting of more than one syllable were retained for the analysis.
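The preprocessing described above can be illustrated with a minimal sketch. It is not the authors' code: the function name `clean_tokens`, the stop-word list, and the use of the romanized spellings as dictionary keys are assumptions made for illustration, and the morphological analysis step (Hannanum) is assumed to have already produced the token list.

```python
import re

# Variant spellings mapped to one standard form. The romanized keys stand in
# for the Korean spellings listed in the text (an illustrative assumption).
VARIANT_MAP = {
    "seuk-ru": "screw",
    "im-peul-laen-teu": "implant",
    "im-peul-lam-teu": "implant",
    "beu-rae-kit": "bracket",
    "pik-seu-chyu": "fixture",
    "pik-seu-chyu-eo": "fixture",
    "pik-seu-chwo": "fixture",
    "yu-nit": "unit",
    "yu-ni-teu": "unit",
    "eo-bu-teu-meon-teu": "abutment",
    "eo-byu-teu-meon-teu": "abutment",
}

# Hypothetical stop-word list; the study's actual list is not given.
STOP_WORDS = {"the", "and", "for"}


def clean_tokens(tokens):
    """Standardize variants, strip punctuation and digits, drop stop words."""
    cleaned = []
    for token in tokens:
        token = token.lower()
        # 1) Map known spelling variants to their standard form.
        token = VARIANT_MAP.get(token, token)
        # 2) Remove punctuation, digits, and underscores.
        token = re.sub(r"[\W\d_]", "", token)
        # 3) Keep tokens longer than one character that are not stop words.
        #    For Hangul, one character is one syllable block, so this length
        #    check approximates "more than one syllable".
        if len(token) > 1 and token not in STOP_WORDS:
            cleaned.append(token)
    return cleaned


print(clean_tokens(["im-peul-laen-teu", "pik-seu-chwo", "the", "2023", "a", "guide!"]))
```

Applying the variant map before stripping punctuation keeps the hyphenated keys intact; with real Hangul tokens the order would matter less.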
Using the Pandas module and the Counter class in Python, we conducted a frequency analysis of the keywords (37,155) extracted from patent invention titles and extracted the 50 keywords with the highest occurrence frequencies.
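As a rough illustration of this step, the sketch below counts keyword occurrences across tokenized titles with `collections.Counter` from the standard library. The helper name `top_keywords` and the toy titles are hypothetical, and Pandas is left out to keep the example self-contained.

```python
from collections import Counter


def top_keywords(token_lists, n=50):
    """Return the n most frequent keywords as (keyword, count) pairs."""
    counts = Counter(token for title in token_lists for token in title)
    return counts.most_common(n)


# Toy data: each inner list is one tokenized invention title.
titles = [
    ["method", "tooth"],
    ["method", "implant"],
    ["method", "tooth"],
]
print(top_keywords(titles, n=2))
```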
We conducted a network analysis to understand the connections between the keywords. We compared the degree centrality indices of the top 50 keywords by frequency. Degree centrality is based on the number of direct connections between nodes and indicates how central a node is within the network. A higher degree centrality implies that a node is directly connected to many other nodes; it is calculated from the number of links directly connected to a specific node18,19). The analysis was performed using a network analysis library in Python. From the total keywords, the 50 most frequently appearing keywords were used to derive a 2D adjacency matrix (50×50) indicating the co-occurrence relationships among these keywords. A portion of the top 10 results is presented in tabular form. For visual inspection of the degree of connectivity between nodes, we visualized the keyword network using Gephi 0.10.1 (Gephi Consortium, Paris, France). The nodes were positioned based on their degree centrality values. Nodes ranked from 1 to 10 were centrally placed and colored red. The nodes ranked 11∼20 were orange, 21∼30 were yellow, 31∼40 were green, and 41∼50 were blue.
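The adjacency matrix and degree centrality can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the study's implementation: co-occurrence is assumed to mean that two keywords appear in the same invention title, degree centrality is taken in its standard normalized form (number of neighbours divided by n − 1), and the function names are hypothetical.

```python
from itertools import combinations


def cooccurrence_matrix(token_lists, keywords):
    """Build a symmetric keyword-by-keyword co-occurrence count matrix."""
    index = {kw: i for i, kw in enumerate(keywords)}
    n = len(keywords)
    matrix = [[0] * n for _ in range(n)]
    for title in token_lists:
        # Each keyword counts at most once per title.
        present = sorted({index[t] for t in title if t in index})
        for i, j in combinations(present, 2):
            matrix[i][j] += 1
            matrix[j][i] += 1
    return matrix


def degree_centrality(matrix):
    """Fraction of the other nodes each node is directly linked to."""
    n = len(matrix)
    return [
        sum(1 for j, weight in enumerate(row) if j != i and weight > 0) / (n - 1)
        for i, row in enumerate(matrix)
    ]


keywords = ["method", "tooth", "implant", "apparatus"]
titles = [
    ["method", "tooth", "implant"],
    ["method", "apparatus"],
    ["tooth", "method"],
]
matrix = cooccurrence_matrix(titles, keywords)
print(dict(zip(keywords, degree_centrality(matrix))))
```

Under this definition, a keyword linked to 48 of the other 49 keywords in a 50-node network would score 48/49, or about 0.98, and a keyword linked to all 49 would score 1.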
To extract meaningful topics from the patent data, we performed topic modeling based on the connection rules between keywords. We used the “gensim.models.LdaMulticore” module from the Gensim library, version 4.3.0 in Python, as our research tool. To determine the optimal number of topics, we conducted topic optimization experiments based on topic coherence values. We varied the number of topics from 1 to 20, measured the coherence value for each number of topics, and selected the number with the highest coherence value as the appropriate number of topics for our study. The repetition count for topic sampling was set to 500, as in a previous study20). We used the LDA method to categorize the collected data into topics. For the parameter settings, we designated the number of topics (num_topics) as eight. The chunk size (chunksize), which represents the number of documents processed in a single training chunk, was set to 2,000. The total number of training passes (passes) was set to 20, and the per-document iteration count (iterations) was set to 1,500, as in a previous study21). The results of the topic modeling were visualized using the pyLDAvis library, which displays the trained model. The keywords associated with the topics were generated by setting the lambda (λ) value to 0.6, following a previous study22). Lambda values range from 0 to 1 and act as hyperparameters that determine the diversity of word selection23). Furthermore, the lambda value represents the weight that indicates the degree of relevance between each topic and keyword. When the weight is set to 1, terms are ranked by how frequently they appear in each topic. Setting the weight to 0 instead prioritizes words that differ markedly among topics24,25). The topic modeling results were interpreted by two researchers (dental hygienists) under the guidance of a topic-modeling expert. The criteria for topic selection were centered on five main keywords. During the topic-labeling process, topics were structured to include at least one of the five keywords.
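The role of λ described above can be made concrete with the term relevance score that underlies pyLDAvis rankings: λ·log p(w|t) + (1 − λ)·log(p(w|t)/p(w)), where p(w|t) is the probability of a word within a topic and p(w) is its overall probability in the corpus. The sketch below is a minimal illustration with made-up probabilities, not output from the study's model; the function names are hypothetical.

```python
import math


def relevance(p_w_given_t, p_w, lam=0.6):
    """Relevance of a term to a topic for a given lambda in [0, 1]."""
    lift = p_w_given_t / p_w
    return lam * math.log(p_w_given_t) + (1 - lam) * math.log(lift)


def rank_terms(topic_probs, corpus_probs, lam=0.6):
    """Sort the terms of one topic from most to least relevant."""
    scores = {w: relevance(p, corpus_probs[w], lam) for w, p in topic_probs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative numbers only: "method" is common everywhere,
# "abutment" is rarer overall but concentrated in this topic.
topic_probs = {"method": 0.30, "abutment": 0.10}
corpus_probs = {"method": 0.30, "abutment": 0.02}

print(rank_terms(topic_probs, corpus_probs, lam=1.0))  # frequency within the topic dominates
print(rank_terms(topic_probs, corpus_probs, lam=0.0))  # topic-specific terms dominate
```

With λ = 1 the common term ranks first, whereas with λ = 0 the topic-specific term does, which matches the behaviour described in the text.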
Table 1 presents the top 50 words by frequency of occurrence in invention titles, along with their respective degree centrality values. “Method” ranked first in both frequency and centrality, followed in frequency by “Apparatus” and “Tooth.” Words with a centrality of 0.9 or higher, excluding words with a centrality of 1, were “Apparatus,” “Treatment,” “System,” “Inclusion,” “Possibility,” and “Oral.” The words ranked 11th to 15th by centrality were “Implant,” “Orthodontics,” “3D,” “Processing,” and “Usage.” Among the top 50 keywords, those with the lowest centrality were “Extracts,” “Zirconia,” and “Active ingredient,” which ranked 48th, 49th, and 50th, respectively (Table 1).
Table 1. Frequency and Degree Centrality of the Top 50 Keywords
Word | Frequency rank | Frequency | Degree centrality rank | Degree centrality |
---|---|---|---|---|
Method | 1 | 2,347 | 1 | 1 |
Apparatus | 2 | 1,677 | 5 | 0.979 |
Tooth | 3 | 1,502 | 2 | 1 |
Oral | 4 | 1,491 | 10 | 0.918 |
Manufacture | 5 | 1,001 | 3 | 1 |
Composition | 6 | 949 | 17 | 0.775 |
Implant | 7 | 939 | 11 | 0.897 |
Orthodontic | 8 | 736 | 12 | 0.897 |
System | 9 | 626 | 7 | 0.938 |
Use | 10 | 577 | 4 | 1 |
Inclusion | 11 | 564 | 8 | 0.938 |
Toothbrush | 12 | 326 | 43 | 0.51 |
Dentition | 13 | 254 | 34 | 0.632 |
Instrument | 14 | 245 | 20 | 0.734 |
For treating | 15 | 238 | 27 | 0.693 |
Prevention | 16 | 223 | 45 | 0.489 |
Treatment | 17 | 202 | 6 | 0.979 |
Extracts | 18 | 194 | 48 | 0.387 |
Recording | 19 | 193 | 24 | 0.714 |
Prosthetic | 20 | 189 | 28 | 0.673 |
Picture | 21 | 188 | 25 | 0.714 |
Containing | 22 | 182 | 44 | 0.51 |
Bracket | 23 | 172 | 47 | 0.408 |
Guide | 24 | 171 | 18 | 0.775 |
Disease | 25 | 169 | 39 | 0.591 |
Procedure | 26 | 166 | 35 | 0.632 |
Processing | 27 | 161 | 14 | 0.836 |
Imaging | 28 | 147 | 38 | 0.612 |
Management | 29 | 145 | 21 | 0.734 |
Possibility | 30 | 140 | 9 | 0.938 |
Production | 31 | 131 | 19 | 0.755 |
Data | 32 | 130 | 22 | 0.734 |
Artificial | 33 | 130 | 23 | 0.734 |
3dimension | 34 | 128 | 13 | 0.877 |
Medium | 35 | 128 | 29 | 0.673 |
Scanner | 36 | 123 | 40 | 0.571 |
Image | 37 | 121 | 36 | 0.632 |
Provide | 38 | 120 | 16 | 0.795 |
Active ingredient | 39 | 118 | 50 | 0.367 |
Generate | 40 | 117 | 26 | 0.714 |
3D | 41 | 114 | 30 | 0.673 |
Scan | 42 | 113 | 31 | 0.653 |
Computer | 43 | 113 | 32 | 0.653 |
Abutment | 44 | 113 | 41 | 0.551 |
Toothpaste | 45 | 112 | 46 | 0.448 |
Usage | 46 | 105 | 15 | 0.836 |
Zirconia | 47 | 98 | 49 | 0.387 |
Digital | 48 | 95 | 37 | 0.632 |
Purpose | 49 | 94 | 42 | 0.53 |
Medical | 50 | 94 | 33 | 0.653 |
Fig. 2 displays the network visualization, which was examined together with the 2D adjacency matrix (50×50) of co-occurrence relationships; thicker links between nodes represent a higher frequency of co-occurrence. For instance, the links between “Method” and “Manufacture (866),” “Apparatus (680),” and “Tooth (593)” appear thicker. “Oral” is connected to “Composition (451).” “Implant” shows multiple connections with “Method (260),” “Apparatus (137),” and “Tooth (260).” “Orthodontic” exhibits close associations with “Method (228),” “Apparatus (266),” “Tooth (228),” and “Dentition (220)” (Fig. 2).
To determine the optimal number of topics, topic coherence values were computed for topic counts ranging from 1 to 20. The coherence value was highest for eight topics, at 0.807 (rounded to three decimal places) (Fig. 3).
Table 2 displays the top ten core keywords for each of the eight topics, together with the label assigned to each topic. Each topic was analyzed based on its primary keywords to understand the respective subject. Topic 4, labeled “Implant,” had the highest weight. The lowest-weighted topics, Topic 1 and Topic 3, were labeled “Instrument and Materials for Oral Health” and “Toothbrush and Oral Health Care,” respectively.
Table 2. Results of the Topic Modeling
Topic number | Topic label (Keywords 1∼10) |
---|---|
1 | Instrument and Materials for Oral Health (Instrument, Ingredient, Care, Hygiene, Improvement, Washing, Sintering, Structure, Impression Material, Tissue) |
2 | Orthodontics (Method, Apparatus, Orthodontic, Dentition, Recording, Picture, Data, Processing, Tooth, Image) |
3 | Toothbrush and Oral Health Care (Toothbrush, Fixing, Able, Removing, Electric, Unit chair, Gum, Driver, Toothbrush hair) |
4 | Implant (Manufacture, Implant, Method, Use, Guide, Abutment, Prosthetic, Tooth, 3D, Fixture) |
5 | Oral Composition for Prevention and Treatment (Composition, Oral, Inclusion, For treating, Prevention, Disease, Containing, Extracts, Active ingredient, Toothpaste) |
6 | Dental Treatment Aid Apparatus (Apparatus, Measurement, Mask, Auxiliary, Root canal, Dental, Occlusion, Material, Medical, Cleaning) |
7 | Apparatus or Method Based on Artificial Intelligence (Tooth, Orthodontic, Artificial, Model, Apparatus, Whitening, Purpose, Derived, Transparent, Attachment) |
8 | Oral Care System or Service (Management, System, Provide, Service, Method, Oral, Zirconia, Predict, Graft material, Health) |
The lambda (λ) value, determining the diversity of word selection, was set to 0.6.
Fig. 4 presents a visualization of the topic model generated with pyLDAvis after setting the number of topics to eight. It displays the Intertopic Distance Map (IDM) and the 30 core keywords. In the IDM, topics 6 and 7 intersect within one quadrant, while in another quadrant, topics 2, 4, and 8 are situated closely and overlap with each other. In contrast, topics 1, 3, and 5 appear relatively distant from the other groups and do not overlap with any other topics. The weight of each topic was as follows: Topic 1 (5.5%), 2 (19.6%), 3 (5.5%), 4 (22.0%), 5 (17.4%), 6 (8.9%), 7 (10.5%), and 8 (10.5%). The topic with the highest weight, “Implant,” is represented by the largest circle, while the lowest-weighted topics, “Instrument and Materials for Oral Health” and “Toothbrush and Oral Health Care,” are represented by smaller circles. The core keywords across the entire set of topics are “Manufacture,” “Composition,” “Implant,” “Oral,” and “Method,” with “Method” being the most frequent (Fig. 4).
This study was conducted to understand technological trends through keyword network analysis and topic modeling based on domestic patents related to oral health. In terms of occurrence frequency, words such as “Method,” “Apparatus,” “Tooth,” “Oral,” and “Manufacture” showed higher proportions. This could be attributed to the data collection, which focused primarily on patents related to oral health and therefore reflects these prominent terms. In previous research26) that analyzed patents using network analysis methods, words such as “Method,” “Apparatus,” and “System” were not treated as stop words and were derived as key terms. Considering that specific technical domains might influence the interpretation of the results, these words were not excluded and were analyzed together with the others. Additionally, studies on medical procedures and patents note that medical procedures involving human life are excluded from patent rights to prevent restrictions on such procedures. Instead, according to patent office practices, the functional and systemic operating methods of medical devices themselves are recognized as patentable, not as medical procedures or treatments27). This can be interpreted as one of the reasons why words related to “method” and “apparatus” accounted for a significant proportion of the frequency analysis results of this study.
To identify keywords that are more important than their frequency of occurrence alone suggests, we analyzed their degree centrality. Some keywords can be regarded as important because they have a relatively high degree centrality despite a low frequency of occurrence. For instance, the keyword “Use,” ranking 10th in frequency, had the highest degree centrality of 1, while “Treatment,” ranking 17th, showed the next highest value of 0.979. Therefore, relying solely on the frequency of occurrence to derive key terms should be avoided. Furthermore, when comparing the rank differences between frequency of occurrence and degree centrality, “3D” ranked 34th in frequency with 128 occurrences but secured a relatively higher position at 13th with a degree centrality of 0.877. The importance of “3D” is also evident in the network visualization, where it is connected with nodes such as “Data,” “Scanner,” and “Scan.” In the collected patent data, titles such as “Method for 3D scan data processing for dental prosthesis manufacture” and “Apparatus and method for restoring 3D oral scan data using computerized tomography images” were identified, confirming the relationships between these keywords within the titles. Digital dental technology based on 3D printing, one of the core technologies of the Fourth Industrial Revolution in dentistry, is advancing. The introduction of 3D scanners into dentistry has significantly affected CAD/CAM technology in prosthetic manufacturing systems, and digital impression techniques using intraoral or extraoral scanners are commonly used in clinical settings28).
We conducted topic modeling and keyword analysis to extract potential themes from the text. Determining the number of topics before executing the model is crucial in this regard29). There is no statistical solution for determining the appropriate number of topics, as it depends heavily on the interpretability, validity, and research questions that influence the topics generated through topic modeling. The determination of the number of topics is subjective and is decided by researchers based on what they believe would yield the most meaningful results through topic modeling.
To minimize subjectivity, topic coherence values were used, following the methodology of a previous study30): a function measuring the topic coherence value for each number of topics from 2 to 20 was implemented in Python to determine the appropriate number of topics. In topic modeling analysis, topics with low relevance to the research subject are sometimes excluded based on the researchers’ judgment. In related studies, researchers reviewed the words included in the derived topics and removed those with low relevance to the research subject from the analysis, and this exclusion by the research team has been acknowledged as part of the topic inference method25,31).
Topic 1 comprises words like “Instrument” and “Ingredient,” followed by “Care,” “Hygiene,” and “Improvement.” Interpreting this as a comprehensive concept encompassing oral health, it was named “Instrument and Materials for Oral Health.”
Topic 2 mainly consists of words such as “Orthodontic” and “Dentition,” and this topic was named “Orthodontics.” The analysis also confirmed the association of words like “Recording,” “Picture,” and “Data” with “Orthodontic.” Orthodontics involves clinical assessments based on quantitative analyses of the human skeletal structure and dental alignment. In clinical practice, there are ongoing developments in orthodontic analysis systems in South Korea, such as WebCeph software (AssembleCircle Corp., Seongnam, Korea), using imaging data, indicating a significant attempt to employ artificial intelligence (AI) technology and digital image recognition and processing in the field of dental alignment32).
Topic 3 is focused on words like “Tooth,” “Removing,” and “Electric,” and was therefore named “Toothbrush and Oral Health Care.” Choi et al.33) analyzed 512 patents from 2005 to 2014 to examine patent trends related to toothbrushes. They suggested that various shapes, materials, functions, and socially oriented aspects of toothbrushes have been consistently researched and patented. Moreover, they anticipated the continued production of electric toothbrushes.
Topic 4 is inferred as “Implant” based on words such as “Manufacture,” “Implant,” “Method,” “Use,” and “Guide.” Unlike in the past, dental implants are now considered and proposed as the foremost treatment when teeth are lost, demonstrating a high long-term success rate and reliability. Studies continue to focus on the materials and surface forms of implants for osseointegration, aiming to address the drawbacks or complications associated with them34). In the field of surgery, diagnostic models that replicate the structure or form of surgical sites are being developed to aid in surgical decision-making. Templates and surgical guides that utilize 3D printing technology are being developed to enhance the accuracy and safety of surgeries8). Recently, the majority of the domestic implant companies in Korea have expanded the distribution of advanced software and 3D printers. This has enabled practitioners to directly develop surgical guides for implants. The use of implant surgical guides has become widespread, facilitating their diverse production using various materials and methods. Several researchers are actively studying this trend35,36).
Topic 5 was inferred based on the words “Composition,” “Oral,” “For treating,” “Prevention,” and “Extracts,” and was named “Oral Composition for Prevention and Treatment.” According to the study of Hwang et al.37), when analyzing patents related to toothpaste on KIPRIS, composition accounted for 35% of the patents, representing the most developed field. Various extracts have been used in patents related to preventing and treating periodontal diseases. Moreover, studies are being conducted on the effects of extracts, such as their antibacterial effects, on oral health37).
Topic 6 appears to be associated with “Dental Treatment Aid Apparatus,” based on the words “Apparatus,” “Measurement,” and “Root Canal.” Through research on devices for occlusal force measurement38), wireless handpieces39) for root canal treatment, and devices for root canal irrigation40), it is evident that devices and instruments related to measurement and root canals are under investigation. Devices for measuring root canal length, vertical height, and bone density were identified in the collected patent inventories.
Topic 7 highlights the development of devices for artificial tooth processing and orthodontic systems utilizing AI, with prominent keywords including “Tooth,” “Orthodontic,” “Artificial,” “Model,” and “Apparatus.” Therefore, it was named “Apparatus or Method Based on Artificial Intelligence.” The future of AI foresees its extensive adoption in various healthcare sectors, particularly in the diagnostic imaging field, and it is anticipated to be widely applied in dentistry. Research suggests that AI will enhance diagnostic efficiency by automating repetitive tasks, thereby reducing costs and alleviating the burden on healthcare systems in an aging society. This efficiency is expected to extend to dental inventory management, patient scheduling, automatic generation of electronic medical records, and surgical feedback, eventually evolving into an integrated dental operation and management system that utilizes AI41). In a study by Choi et al.42), it was highlighted that clinical trials of AI-assisted software aiding physicians’ diagnoses in South Korea increased significantly from six cases in 2018 to 17 cases in 2019. Moreover, the range of conditions targeted by AI technology has expanded beyond prostate and breast cancer to include diverse conditions, such as lung diseases, vertebral compression fractures, and dental conditions42). The potential for the advancement of patented technology related to “Apparatus or Method Based on Artificial Intelligence” is considerably high and requires continuous attention.
Topic 8 was named “Oral Care System or Service,” derived from major keywords such as “Management,” “System,” “Provide,” “Service,” and “Method.” With the growing interest in oral health, governments have been continuously implementing oral health promotion programs targeting vulnerable populations, such as students, seniors, and people with disabilities, by establishing oral health clinics and regional clinics for dental care43,44). The approach to providing oral health services has also changed over time. The widespread adoption of smartphones has led to increased interest in applications aimed at improving the quality of life of people with disabilities. For instance, an Android-based “15 Minutes of Daily Oral Exercise” app was developed to enhance oral motor skills in individuals with cerebral palsy45). Moreover, studies analyzing oral health-related apps have highlighted their significance as educational tools for acquiring oral health knowledge and proper maintenance methods46). These developments indicate the advancement of technology related to oral care services, the theme of Topic 8.
The technologies associated with oral health have advanced in various aspects. However, while there have been studies on specific technological trends in dentistry and dental hygiene, there is a lack of comprehensive research analyzing patented technologies. Therefore, a significant contribution of this study lies in deriving the potential topics related to oral technology over the past 20 years and segmenting them into particular topics. This not only aids in setting future directions for detailed research and formulating patent strategies but also provides valuable data. Despite the methodological limitations, the value of this study lies in deriving results based on a vast amount of patent data in the dental hygiene field using Python and proposing a method to extract keywords representing topics. We hope that the methodology and visualization tools used in this study will find applications in various research fields within dental hygiene. Moreover, the results from network analysis and topic modeling can be utilized to guide technological development in the field.
In this study, text mining techniques were employed for network analysis, whereas in topic modeling, efforts were made to maintain objectivity by undergoing refinement processes and utilizing stop-word lists. While objectivity regarding the number of topics was maintained using topic coherence values, the process of inferring and naming topics unavoidably involved qualitative interpretation by the researchers, potentially incorporating subjective judgments based on their choices. Exploring various methods to maintain objectivity in topic modeling is essential, and further research is required to select topic labels that semantically represent topics effectively. Additionally, applying time-series analysis to topic modeling to analyze patent trends across different years would aid in predicting competitiveness and promising technologies for various technical changes over time.
This study conducted text analysis on 11,710 patent documents related to oral health and dentistry obtained from KIPRIS as of July 31, 2023, using a program developed in Python. Of these, 6,865 invention titles highly relevant to the field were selected. The analysis involved determining the relationships among keywords in patent titles by analyzing centrality based on the top 50 words by frequency and applying topic modeling using LDA to infer topics.
The analysis of the connectivity between adjacent words revealed that the centrality index was highest for “Method” and lowest for “Active ingredient.” In network visualization, “Method” exhibited thicker links when connected to “Manufacture,” “Tooth,” and “Apparatus,” while “Oral” was closely linked to “Composition.”
Regarding topic modeling derived from keywords, the topic weights indicated “Implant” as the highest, followed by “Orthodontics,” “Oral Composition for Prevention and Treatment,” “Apparatus or Method Based on Artificial Intelligence,” “Oral Care System or Service,” and “Dental Treatment Aid Apparatus.” Conversely, “Instrument and Materials for Oral Health” and “Toothbrush and Oral Health Care” had the lowest weights. These results suggest that patent technology trends have been concentrated primarily on “Implant” and “Orthodontics.”
This study aimed to comprehensively understand and categorize technological trends related to oral health and utilize the frequency and relationships of the analyzed keywords to grasp technological advancements. It is anticipated that this research will serve as fundamental material for technological developments associated with patents in dentistry and dental hygiene.
None.
No potential conflict of interest relevant to this article was reported.
Not applicable.
Conceptualization: Hee-Kyeong Bak. Data acquisition: Hee-Kyeong Bak. Formal analysis: Hee-Kyeong Bak. Supervision: Han-Na Kim. Writing-original draft: Hee-Kyeong Bak and Han-Na Kim. Writing-review & editing: Hee-Kyeong Bak, Yong-Hwan Kim, and Han-Na Kim.
None.
Raw data are available from the corresponding author upon reasonable request.