Name
Enhancing Biochemistry Understanding in a Bridge Program through AI Use to Create a Question Bank for Self-Assessment - A Mixed-Methods Study
Date
Wednesday, October 2, 2024
Time
1:30 PM - 3:00 PM (EDT)
Authors

Brian Kaw - Touro University Nevada, College of Osteopathic Medicine
Naweed Yusufzai - Touro University Nevada, College of Osteopathic Medicine
Diorra Shelton - Touro University Nevada, College of Osteopathic Medicine
Eseosa Aigbe - Touro University Nevada, College of Osteopathic Medicine
Sherli Koshy-Chenthittayil - Touro University Nevada, Office of Institutional Effectiveness
Amina Sadik - Touro University Nevada, College of Osteopathic Medicine

Description

Our study explores the use of AI tools such as ChatGPT4 in medical education. It aims to identify a more efficient method of multiple-choice question (MCQ) creation while ensuring coverage of basic science concepts. To increase engagement with difficult content, graduate students in a biochemistry course were required to write MCQs based on learning objectives to build a self-assessment question bank under faculty supervision.


In the Master of Medical Health Science program, a bridge to medical programs, selected students were tasked with using ChatGPT while the remainder of the class wrote questions manually. Half of the research assistants used a free version of ChatGPT, whereas the other half used ChatGPT4. Data collection included the number of prompt iterations, MCQ generation time, and hallucination rates. Faculty ensured that MCQs covered the necessary concepts and were free of hallucinations before their use in self-assessment. Select AI-generated questions were included with faculty-generated questions in formative assessments. Using scores from the first exam, which was not preceded by an item-writing assignment or formative assessment, students were classified into low-performing (LPS), medium-performing (MPS), and high-performing (HPS) categories. These students' performance was tracked as the semester progressed. To evaluate the process, a reviewed and validated survey was deployed at the end of the semester.


Average prompt iterations ranged from 1.1 to 2.1, and generation time ranged from 1.8 to 7.5 minutes per question. The hallucination rate was 23-70% in the free version vs. 0-9% in ChatGPT4. ChatGPT4-generated MCQs used in the assessments had a point biserial of 0.42 in ExamSoft vs. 0.25 for MCQs written manually by faculty. The use of these MCQs for formative assessment improved the performance of LPS and MPS.
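For context, the point biserial is the correlation between a dichotomous item score (correct/incorrect) and the total exam score; higher values indicate that an item better discriminates between stronger and weaker examinees. The sketch below shows how this statistic can be computed; the arrays are hypothetical illustrations, not data from the study or from ExamSoft, and the sketch assumes SciPy is available.

```python
# Minimal sketch: point-biserial discrimination index for one MCQ.
# item_correct and total_score are hypothetical example data, not study data.
import numpy as np
from scipy.stats import pointbiserialr

# 1 = student answered the item correctly, 0 = answered incorrectly
item_correct = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 0])
# Each student's total exam score
total_score = np.array([88, 62, 91, 79, 58, 84, 65, 90, 76, 60])

# Correlation between item correctness and overall performance
r_pb, p_value = pointbiserialr(item_correct, total_score)
print(f"point biserial = {r_pb:.2f} (p = {p_value:.3f})")
```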


Indeed, students moved from the LPS into the MPS category, while MPS students moved into the HPS category. Over 94% of students surveyed found the AI-generated questions clear and beneficial for understanding basic science concepts. More than 95% agreed that the AI-generated questions used in self-assessment improved their understanding of those concepts.


More prompt iterations were required for complex, vignette-style questions. Lower hallucination rates were noted when more time was spent on prompt engineering. Considering the low hallucination rate and the efficiency in creating adequate vignette-style questions, it is advisable to use ChatGPT4. The user's depth of content knowledge and familiarity with prompt engineering are also essential. With point biserial as a quality indicator of ChatGPT4-generated MCQs, the findings suggest that the use of these questions should be extended to summative assessments. Students varied in their ability to distinguish AI-generated from faculty-generated vignette-style questions, underscoring the high quality of ChatGPT4's MCQs. Nearly 95% of students rated ChatGPT4-generated questions as highly satisfactory for assessing critical thinking and complex concepts.

Presentation Topic(s)
Innovations
Presentation Tag(s)
Post-Bacc, Innovation, Other