Presented By: Daniel Levine, Kirk Kerkorian School of Medicine at University of Nevada, Las Vegas
Co-Authors: Charissa Alo, Kirk Kerkorian School of Medicine at University of Nevada, Las Vegas
Emily Ames, Kirk Kerkorian School of Medicine at University of Nevada, Las Vegas
Justin Atkins, Kirk Kerkorian School of Medicine at University of Nevada, Las Vegas
Rosalie Kalili, Kirk Kerkorian School of Medicine at University of Nevada, Las Vegas
Colin Standifird, Kirk Kerkorian School of Medicine at University of Nevada, Las Vegas
Thomas Vida, Kirk Kerkorian School of Medicine at University of Nevada, Las Vegas
Purpose
Generative artificial intelligence (AI) systems such as ChatGPT, built and trained with natural language processing models, can potentially make a major impact on medical education. This study investigates whether ChatGPT can generate multiple-choice questions (MCQs) suitable for medical education. MCQs have played a pivotal role in medical education by enhancing long-term retention of fundamental concepts. They are a cornerstone of improving learning outcomes through retrieval practice, and they serve as a key tool for assessing the knowledge of medical students.
Methods
We prompted ChatGPT (GPT-3.5) to create 100 MCQs, each with 5 answer options, using information from trusted and reputable medical information resources. To assess validity, we independently prompted the questions back into ChatGPT in two ways. First, we posed the AI-generated questions as open-ended items and asked ChatGPT for a free-response answer. Second, we prompted the questions together with their answer choices and scored the responses for correctness.
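The study used the ChatGPT interface directly; purely as an illustration, the Python sketch below shows how the two-pass validation described above could be scripted against the OpenAI API. The model name ("gpt-3.5-turbo"), prompt wording, and helper functions are assumptions for illustration, not part of the study protocol.

# Illustrative sketch only: the study used the ChatGPT web interface (GPT-3.5).
# Model name, prompts, and helper names here are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
MODEL = "gpt-3.5-turbo"  # assumed API counterpart of the GPT-3.5 interface

def ask(prompt: str) -> str:
    """Send a single prompt to the model and return its reply text."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def free_response_answer(stem: str) -> str:
    """Pass 1: pose the question stem alone, open-ended."""
    return ask(f"Answer this medical question concisely: {stem}")

def multiple_choice_answer(stem: str, options: list[str]) -> str:
    """Pass 2: pose the same stem with its five labeled options (A-E)."""
    labeled = "\n".join(
        f"{letter}. {text}" for letter, text in zip("ABCDE", options)
    )
    return ask(f"{stem}\n{labeled}\nReply with the single best answer (A-E).")

Scoring would then compare each Pass 1 reply against the intended answer (a content-expert judgment in free-response form) and each Pass 2 reply against the keyed option letter.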
Results
When posed as free-response questions, ChatGPT answered 84% correctly; approximately 2% of its responses did not provide sufficient information for a focused answer. However, when the same questions were prompted with their multiple-choice options, ChatGPT answered 95% correctly.
Conclusions
This work is innovative in that it leverages freely available generative AI platforms to construct MCQs. These questions can be matched to a variety of difficulty levels, affording a customized study plan for medical students. One limitation is that a content expert may be needed to verify the accuracy of the generated questions. While more research is needed to define AI's role in medical education, ChatGPT's ready availability in medical school can potentially help bridge knowledge gaps. Generative AI is rapidly increasing in accuracy and breadth, making it a feasible low-stakes assessment tool for medical students.