Name
AI-Assisted Assessment of Motivational Interviewing as a Valid Supplement to Faculty Evaluation in Telehealth Patient-Centered Education
Date & Time
Sunday, June 7, 2026, 4:57 PM - 5:12 PM
Location Name
Walsh
Authors
Victor Lim, Medical College of Georgia, Augusta University, Augusta, Georgia; Kuang-Drew Li, Medical College of Georgia, Augusta University, Augusta, Georgia; Henry Moon, Medical College of Georgia, Augusta University, Augusta, Georgia; Daniel Kaminstein, Medical College of Georgia, Augusta University, Augusta, Georgia
Presentation Topic(s)
Technology and Innovation
Description
PURPOSE
Effective motivational interviewing (MI) is crucial in supporting
patient-physician communication and encouraging behavioral change in chronic
disease management. Self-assessment of clinical communication skills often
lacks predictive validity for actual performance. This study assessed
generative AI (GenAI) scoring of MI compared to self-assessment and faculty
evaluation in medical students during clinical encounters.
METHODS
Sixty medical students completed telehealth encounters as part of a
required MI and patient-centered learning course. Students completed self-assessments
using the MIA-STEP rubric (eight items, 1-7 scale). Audio recordings were
transcribed and scored using advanced automatic speech recognition (ASR) and
large language model (LLM) systems trained with the same rubric, generating
GenAI-MIA scores. Faculty evaluated student performance using a SOAP rubric
assessing clinical reasoning, presentation structure, and communication.
Spearman's rank-order correlation was used to examine associations among the
three evaluations.
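
As an illustration of the statistic named above, the following is a minimal sketch of Spearman's rank-order correlation in plain Python: each score list is converted to ranks (ties receive the average rank), and the Pearson correlation of those ranks is returned. The function names and the sample values are hypothetical and are not the study's data or analysis code.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical example values, invented for illustration only.
genai_scores = [2.1, 3.4, 3.0, 4.2, 2.8, 3.6]
soap_scores = [72, 80, 85, 90, 70, 78]
print(round(spearman_rho(genai_scores, soap_scores), 2))
```

Because the statistic depends only on rank order, it suits ordinal rubric scores such as 1-7 ratings, where the spacing between scale points is not assumed to be equal.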
RESULTS
Students overestimated MI performance with a mean self-assessment of 5.82
(SD = 0.46), compared to mean GenAI scores of 3.28 (SD = 0.84). Self-assessment
showed weak, non-significant correlation with GenAI scores (ρ = .21, p = .11)
and with SOAP faculty evaluations (ρ = .09, p = .47). GenAI-MIA scores
demonstrated moderate, significant correlation with SOAP performance (ρ =
.36, p = .006). Overall, students with higher GenAI scores tended to deliver
higher-quality clinical presentations.
CONCLUSIONS
GenAI evaluation captures MI communication behaviors aligned with
faculty-assessed clinical competence, whereas student self-assessment does not.
Despite limitations with transcript-based analysis, GenAI scoring
demonstrates validity as a formative supplemental assessment tool and
provides immediate objective feedback on communication performance. These
results suggest that AI-assisted assessment tools for MI may address critical
limitations of self-evaluation and be integrated as a scalable, supplemental
tool into clinical communication training in medical education.
Presentation Tag(s)
Student Presentation