
Dental morphology is a foundational subject in dental and dental hygiene education, focusing on the structure, function, and developmental processes of teeth. Knowledge of the morphological characteristics of teeth is essential for diagnosis and treatment, enabling students to assess oral conditions and establish suitable treatment plans1).
With the rapid development of artificial intelligence (AI) technology, attempts to introduce AI into dental education are expanding. AI can analyze complex data and provide critical information during the learning process, opening possibilities for its use in diagnostics, treatment planning, and educational support in dentistry2). Large language models (LLMs) such as ChatGPT, which interact with learners in natural language, hold particularly high potential as educational tools. Consequently, studies abroad are exploring whether ChatGPT can be applied in dental education1-14). However, most of these studies focus on clinical or anatomical education, and research on the academic examination of dental morphology remains limited. By explaining complex concepts simply and offering an environment where students can ask questions in real time, ChatGPT can greatly enhance learning efficiency15). In dental morphology education, ChatGPT could therefore be used to explain tooth structures and developmental processes or to provide real-time feedback on clinical issues that students may face. Such an interactive learning tool could significantly boost students’ motivation and comprehension.
ChatGPT has progressed from GPT-3 to GPT-4, which has many more parameters and is trained on more data, providing a higher level of understanding and more consistent responses. The GPT-3 model on which ChatGPT was originally based is known to have 175 billion parameters, which function like synapses in the human brain16). According to Achiam et al.16), GPT-4 scored among the top 10% of examinees on a simulated bar exam, outperforming GPT-3.5. GPT has since garnered global attention and is being applied across various fields1-17). Therefore, its feasibility in dental morphology education merits investigation.
This study focused on exploring the potential of using AI tools such as ChatGPT in dental morphology education. The specific objectives were as follows: first, to check the accuracy of GPT-4’s responses to multiple-choice questions posed in Korean; second, to examine the reasoning behind GPT-4’s incorrect answer choices.
This exploratory study aimed to determine the applicability of ChatGPT in dental morphology, a foundational subject in dental education, by analyzing its responses to academic questions in this field.
The subject of this study was ChatGPT’s ability to answer multiple-choice questions on dental morphology. Twenty-one dental morphology questions were selected from the first session of the Korean national dental hygiene examination (questions 28∼34), seven from each of the past three years (2021∼2023). Each question was posed to ChatGPT in Korean.
The multiple-choice questions (2021∼2023 national dental hygiene exam, first session, questions 28∼34; Fig. 1∼3) were provided to GPT-4 in PDF format, and the responses were checked for accuracy. For incorrect answers, the reasoning behind the chosen answer was elicited through follow-up questions. All prompts took the form of “Please tell me” or “Please explain why this answer was chosen.”
ChatGPT (Mar 24 version; OpenAI) was used, and data were collected on October 29, 2024. ChatGPT was accessed via Google Chrome at https://chat.openai.com/chat. To use GPT-4, the researcher logged in with a previously created personal Google ID and subscribed to the paid plan at a monthly fee of $20.
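Although all questions in this study were posed manually through the ChatGPT web interface, the same question-and-follow-up protocol could, in principle, be scripted for reproducibility. The following is a minimal illustrative sketch using the OpenAI Python client; the model identifier, question data, and answer-parsing logic are assumptions made for illustration and were not part of the actual study procedure.

```python
# Illustrative sketch only: in this study, all questions were posed manually
# through the ChatGPT web interface. This shows how the same protocol (pose
# each multiple-choice question in Korean, then request the reasoning for any
# incorrect answer) could hypothetically be scripted.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Hypothetical data: (question text, correct option number).
questions_2021 = [
    ("28. ...question text in Korean... Please tell me the answer.", 2),
    # ...remaining six questions of that year...
]

def chat(messages: list) -> str:
    """Send the conversation so far and return the model's reply text."""
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content

correct = 0
for text, answer_key in questions_2021:
    messages = [{"role": "user", "content": text}]
    reply = chat(messages)
    # Crude assumed parsing: take the first digit in the reply as the chosen option.
    chosen = next((int(ch) for ch in reply if ch.isdigit()), None)
    if chosen == answer_key:
        correct += 1
    else:
        # Follow-up prompt used in the study for incorrect answers.
        messages += [
            {"role": "assistant", "content": reply},
            {"role": "user", "content": "Please explain why this answer was chosen."},
        ]
        print(f"Chose {chosen}, key {answer_key}. Reasoning: {chat(messages)}")

# Accuracy: (correct / total) x 100, as defined in the Methods.
print(f"Accuracy: {correct / len(questions_2021) * 100:.2f}%")
```

In this sketch, the conversation history is resent with each follow-up request, mirroring how the web interface retains context within a single chat session.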
The research team consisted of one dental hygiene professor, one dental school professor, and one Ph.D. in dentistry. The dental hygiene and dental school professors were teaching dental morphology at the university in the second semester of 2024, while the Ph.D. in dentistry was conducting related research on teeth, including animal experiments in an integrated dentistry laboratory. Thus, the research team had extensive experience and knowledge of dental morphology. Additionally, to assess ChatGPT’s accuracy, the number of correctly answered questions for each year was divided by the total number of questions, and the result was multiplied by 100 to express it as a percentage. The formula is as follows:
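Accuracy (%) = (number of correctly answered questions / total number of questions) × 100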
When the 2021 questions were posed to GPT-4, it answered 4 out of 7 questions correctly, resulting in an accuracy rate of 57.14%. For the 2022 questions, GPT-4 answered 5 out of 7 questions correctly, achieving an accuracy rate of 71.43%. When the 2023 questions were presented, GPT-4 correctly answered 2 out of 7 questions, resulting in an accuracy rate of 28.57%.
When examining the reasoning behind incorrect answers for the 2021 questions, the results were as follows. For question 29, where the correct answer was option 5, GPT-4 selected option 3. Upon inquiry, it provided the explanation: “The reason option 3 was chosen for question 29 is that the ‘angle symbol’ is most prominently observed in the maxillary first premolar. The maxillary first premolar has distinct anatomical characteristics, and the angle symbol (a morphological characteristic determined by tooth angle) is one of the most prominent features of this tooth. Therefore, the best-matching choice was option 3: it appears most clearly in the maxillary first premolar.”
For question 30, where the correct answer was option 3, GPT-4 selected option 1. When asked for the reasoning, it explained: “Option 1 was chosen for question 30 because the question described a characteristic of the mandibular central incisor crown, where the mesial half is larger than the distal half. The mandibular central incisor indeed has a larger and broader mesial half. This asymmetry is a distinguishing characteristic of the mandibular central incisor, making option 1 the most fitting answer.”
For question 32, where the correct answer was option 1, GPT-4 selected option 5. Upon inquiry, it explained: “Option 5 was chosen for question 32 because the described features best match the maxillary first molar. The question describes characteristics such as the presence of a mesial marginal ridge visible from the mesial side, a 3:2 ratio between buccal and lingual halves, and from the occlusal view, a prominent mesial half. These traits are specific to the maxillary first molar, so option 5 (maxillary first molar) was considered the correct choice.”
When analyzing the reasoning behind GPT-4’s incorrect answers for the 2022 questions, the findings were as follows. For question 30, where the correct answer was option 5, GPT-4 selected option 3. When asked for an explanation, it responded: “The question asked about a tooth with an occlusal cusp. Here, the occlusal cusp refers to a characteristic edge seen in the premolar’s occlusal surface. The premolar is typically associated with a sharply angled structure on its occlusal surface, particularly visible in the mandibular first premolar. Therefore, option 3 (mandibular first premolar) was selected as the answer.”
For question 34, where the correct answer was option 4, GPT-4 selected option 5. When asked for reasoning, it explained: “The question focused on visible features on the occlusal surface of the maxillary first molar. The maxillary first molar possesses distinct structures, among which the distolingual triangular ridge is especially prominent. This ridge, found on the maxillary first molar’s occlusal surface, forms a distinct ridge on the lingual (tongue-facing) side. Thus, option 5 (distolingual triangular ridge) was chosen.”
When analyzing the reasoning behind GPT-4’s incorrect answers for the 2023 questions, the findings were as follows. For question 28, where the correct answer was option 2, GPT-4 selected option 4. Upon inquiry, it explained: “The question asked which anterior tooth undergoes the most morphological change and displays incisal wear or attrition. This concerns which tooth most often shows changes due to wear on its occlusal surface. The lower incisors, particularly the mandibular lateral incisor and central incisor, exhibit substantial morphological changes, making option 4 (mandibular lateral incisor) the most suitable answer.”
For question 29, where the correct answer was option 1, GPT-4 selected option 5. When asked for reasoning, it responded: “The question asked about the correct feature of the labial surface of the mandibular central incisor. Typically, the mesial incisal angle of the mandibular central incisor is sharper than the distal incisal angle, reflecting the anatomical structure where the mesial side is sharper and more distinct than the distal side. Therefore, option 5 (the mesial incisal angle is sharper than the distal incisal angle) was chosen as the correct answer.”
For question 31, where the correct answer was option 1, GPT-4 selected option 5. When asked for its reasoning, it explained: “Option 5 (mandibular first molar) was chosen as the correct answer because the given characteristics match those of the mandibular first molar. The question describes a complex molar with buccolingual division, a larger mesial buccal cusp than the distal buccal cusp, and a depression in the cervical one-third on the mesial surface. These features describe the mandibular first molar, making option 5 the most appropriate answer.”
For question 33, where the correct answer was option 1, GPT-4 selected option 4. When asked for reasoning, it responded: “Option 4 (mesiolingual cusp) was selected because the largest cusp of the mandibular first molar is the mesiolingual cusp. This anatomical feature of the mandibular first molar means that the mesiolingual cusp is larger and more developed than the others, so option 4 was chosen.”
For question 34, where the correct answer was option 4, GPT-4 selected option 2. When asked for reasoning, it explained: “Option 2 (lighter tooth color) was chosen because primary teeth, compared to permanent teeth, have a relatively lighter color. Primary teeth have thinner enamel and higher translucency, making them appear brighter and lighter in color, a unique feature when comparing primary and permanent teeth. Thus, option 2 was selected.”
Dental morphology is a foundational subject in dental and dental hygiene education, providing essential knowledge for accurate diagnosis and treatment planning. The course focuses on understanding tooth morphology, structure, and function, with applications across dental specialties such as orthodontics, conservative dentistry, and oral surgery6). Traditionally, dental morphology has been taught through a combination of theory and practice, often involving visual aids or carving exercises to enhance students’ comprehension7). However, some studies have found that these traditional methods have limitations in fostering student interaction and engagement7).
Recently, AI has been used in the digital health field9). In dentistry, AI applications are used to diagnose and plan treatment for conditions such as dental caries and periodontitis through medical image analysis10,11). Accordingly, there is ongoing research on using LLMs such as ChatGPT in dental education12,13). As an AI tool, ChatGPT provides an environment where learners can ask questions and receive real-time feedback, which can significantly improve learning efficiency. Furthermore, AI-based educational support tools enable students to reinforce difficult concepts through repetition, facilitating the acquisition of clinically relevant knowledge. Therefore, this study evaluated ChatGPT’s feasibility in dental morphology education with two specific objectives: first, to verify the accuracy of the responses GPT-4 generated to Korean multiple-choice questions and, second, to examine the reasoning behind its incorrect answers.
When checking the accuracy of GPT-4’s responses, it was found that while GPT-4 demonstrated relatively high accuracy on specific questions in dental morphology, it lacked consistent accuracy across different years’ questions. GPT-4’s performance varied depending on the structure and content of the data it learned. For example, its accuracy was 57.14% on the 2021 exam questions, rose to 71.43% on the 2022 questions, and fell to 28.57% on the 2023 questions. According to Ohta and Ohta13), who checked the accuracy of GPT-4’s answers on the Japanese dental licensing exam in 2023, GPT-4 outperformed GPT-3.5 and Bard in accuracy; however, its accuracy remained limited on essential and dental-specific questions, at 77.6% and 51.6%, respectively. Similarly, Suárez et al.14) reported that ChatGPT’s accuracy on endodontic questions was 85.44%, although it dropped to an average of 57.33% depending on question difficulty. Therefore, ChatGPT’s training in the field of dentistry still appears to be insufficient.
Further analysis of the reasoning behind incorrect answers showed that while GPT-4 accurately grasped medical terminology, it did not fully understand specific aspects of tooth morphology. For example, in a 2021 question about the angle symbol, it incorrectly answered that the feature was most prominent in the maxillary first premolar, demonstrating that it understood the general concept but not the specific tooth involved. Huh17) reported that when parasitology examination questions were posed to Korean medical students and to GPT, GPT scored lower than the students, indicating that it lacked the knowledge and interpretive skills needed for the parasitology exam. Likewise, this study found GPT to be lacking in knowledge and interpretive skills for dental morphology.
ChatGPT demonstrated relatively high accuracy on certain questions; however, its responses were inconsistent depending on the complexity and academic difficulty of the questions. These findings align with existing research indicating that the performance of LLMs such as ChatGPT can vary significantly with the structure and content of the training data. Three implications follow.

First, an expert verification system is needed in response to the errors generated by ChatGPT. There were cases where ChatGPT failed to explain specific academic concepts accurately, which is particularly concerning in specialized domains such as dental morphology. Hence, establishing a system in which experts or educators verify the responses generated by ChatGPT is critical. Such a verification process is essential to prevent students from uncritically accepting AI-generated answers and to help them identify and correct erroneous information.

Second, ChatGPT should be employed as a supplementary tool in the educational process rather than as a primary resource. Because its accuracy fluctuated with question difficulty, it may not be reliable enough for specialized fields such as dental morphology. It is therefore recommended that ChatGPT play an auxiliary role in student learning and be used only as a reference source for critical assessments or national exam preparation. This strategy can help minimize the learning distortions that AI inaccuracies could cause.

Lastly, regulations and guidelines governing the educational use of AI tools are urgently needed. When integrating AI into significant assessments, such as national licensing examinations, strict validation protocols and legal frameworks should be implemented to ensure reliability and accuracy. Such regulations would help prevent the misuse of AI tools and ensure fairness within educational contexts.
This study faced several limitations. First, the accuracy of GPT-4’s responses varied significantly with question difficulty and specific dental concepts. Its performance across different years’ questions was inconsistent, suggesting that the performance depends on the structure and content of the data it learned. Second, as AI models generate responses based on learned data, there are limitations when addressing recent dental research or complex clinical scenarios. Lastly, the study results are confined to specific conditions, such as particular question types and scope, limiting the generalizability of the findings.
This study explored the potential of using AI tools like ChatGPT in dental morphology education. While GPT-4 demonstrated relatively high accuracy on certain dental morphology questions, it lacked consistency and struggled with high-difficulty questions, indicating the need for supplementary educational adjustments. Future research should focus on building collaborative learning environments where AI and educators can work together to address AI limitations and continuously evaluate and improve educational effectiveness. Through these efforts, AI tools may play a significant role in enhancing students’ comprehension in dental education.
Acknowledgements
None.
Conflict of interest
Jeong-Hyun Lee has been the journal manager of the Journal of Dental Hygiene Science since January 2023 and was not involved in the review process of this article. Otherwise, no potential conflict of interest relevant to this article was reported.
Ethical approval
This study does not require IRB approval as it does not involve human subjects or the collection of personal data, relying solely on publicly available materials and AI tools.
Author contributions
Conceptualization: Eun-Young Jeon, Hyun-Na Ahn, and Jeong-Hyun Lee. Data acquisition: Eun-Young Jeon, Hyun-Na Ahn, and Jeong-Hyun Lee. Formal analysis: Eun-Young Jeon, Hyun-Na Ahn, and Jeong-Hyun Lee. Supervision: Eun-Young Jeon and Jeong-Hyun Lee. Writing-original draft: Eun-Young Jeon and Jeong-Hyun Lee. Writing-review & editing: Eun-Young Jeon and Jeong-Hyun Lee.
Funding
None.
Data availability
The raw data are available from the corresponding author upon reasonable request.