Name
Enhancing Assessment Using ChatGPT-4 by Streamlining MCQ Generation
Description

Presented By: Diorra Shelton, Touro University Nevada
Co-Authors: Eseosa Aigbe, Touro University Nevada
Brian Kaw, Touro University Nevada
Amina Sadik, Touro University Nevada
Naweed Yusufzai, Touro University Nevada

Purpose
Artificial intelligence (AI) large language models, such as ChatGPT-4, are novel tools transforming medical education. This study pioneers the use of ChatGPT-4 for medical science students' self-assessment. Its goals are to adopt a more efficient and effective method of item writing, ensure adequate coverage of basic science concepts, and encourage faculty to use AI.

Methods
Students in the medical biochemistry course are required to write MCQs based on learning objectives to create a question bank for self-assessment. Four students were selected to do the same using ChatGPT-4 to compare the effectiveness, quality, and time spent producing AI-generated vs. manually written MCQs. All questions underwent rigorous faculty evaluation to ensure coverage of basic science concepts. Final versions of the AI-generated questions were embedded in formative and summative assessments using ExamSoft. A survey was conducted to determine students' perceptions of the AI-generated questions used in assessments.

Results
Preliminary findings indicate that ChatGPT-4 is effective for MCQ generation, but the quality of user prompts is critical. ChatGPT-4's responses contain inaccuracies, albeit at a much lower rate than free models, necessitating faculty evaluation and requiring participants to have a deep grasp of the content to identify hallucinations. The point-biserial correlation of AI-generated questions in assessments averaged 0.42. Partial analysis of the survey revealed that 50% of students were comfortable using ChatGPT and 58% correctly identified the AI-generated questions. More than 93% agreed that the AI-generated questions enhanced their understanding of basic science concepts, provided a fair assessment of their knowledge, and were clear, and said they would like to see more of these questions on summative exams.

Conclusion
Given the effectiveness, efficiency, and student performance on AI-generated questions, we suggest that students and faculty use ChatGPT-4 to create practice questions and summative exam items. Work is ongoing to formulate more effective prompts, yielding more accurate ChatGPT-4-generated MCQs and possibly circumventing the need for faculty evaluation.

Date & Time
Sunday, June 16, 2024, 4:15 PM - 4:30 PM
Location Name
Marquette VIII