Kristy Motte, Texas A&M College of Medicine
Purpose
Objective Structured Clinical Examinations (OSCEs) are central to assessing clinical competency in medical education. Despite their standardized structure, subjectivity in scoring, particularly by standardized patients (SPs), may introduce grading inconsistency. In distributed campus models, variability in SP stringency or leniency can result in inequities that influence learner performance and downstream evaluation. This work seeks to identify patterns of inherent rater bias and propose strategies to improve objectivity in OSCE assessments.
Methods
We are conducting a retrospective analysis of OSCE data from Texas A&M College of Medicine. Our initial phase focuses on M1 and M2 pre-clinical OSCEs, where all students are graded on a pass/fail basis. We will evaluate whether certain SPs consistently rate students higher or lower, regardless of case or testing day. Using score distribution patterns, z-score normalization, and inter-rater comparisons, we aim to flag outlier raters whose evaluations may disproportionately impact student outcomes.
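For illustration only, a minimal sketch of this flagging step is shown below in Python (pandas); the column names (sp_id, case_id, test_day, score), the choice of baseline grouping, and the flagging threshold are assumptions for exposition rather than the study's actual analysis pipeline.

import pandas as pd

def flag_outlier_raters(scores: pd.DataFrame, threshold: float = 1.0) -> pd.DataFrame:
    """Sketch: flag SPs whose ratings run systematically high or low.

    Assumes one row per graded encounter with hypothetical columns
    sp_id, case_id, test_day, and score.
    """
    # Normalize each score against all scores for the same case on the
    # same testing day, so case difficulty and day effects are not
    # mistaken for rater stringency or leniency.
    grouped = scores.groupby(["case_id", "test_day"])["score"]
    z = (scores["score"] - grouped.transform("mean")) / grouped.transform("std")
    scores = scores.assign(z=z)

    # Summarize each SP's tendency: a persistently positive mean z-score
    # suggests leniency; a persistently negative mean suggests stringency.
    per_sp = scores.groupby("sp_id")["z"].agg(mean_z="mean", n_encounters="count")

    # Flag SPs whose average deviation exceeds the (illustrative) threshold.
    per_sp["flagged"] = per_sp["mean_z"].abs() > threshold
    return per_sp.sort_values("mean_z")

In practice, the baseline grouping and threshold would be tuned to the program's data and number of encounters per SP.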
Subsequent phases will incorporate M3 OSCEs, which contribute to tiered grading (e.g., Pass, High Pass, Honors) and can significantly affect clerkship grades and residency competitiveness. We will assess whether observed SP-level variability persists across campuses and over time, and how it correlates with learner outcomes.
Results
Data collection is ongoing, and preliminary analysis is underway. Identifying systematically stringent or lenient SPs could allow for more targeted rater training, real-time calibration, or weighting adjustments to ensure fairness.
Conclusion
This work aims to support equitable clinical assessment by reducing variability in SP scoring. By identifying and addressing rater bias, institutions that utilize SP-based assessments in multi-site models can enhance the reliability and fairness of OSCEs, promoting accurate evaluation of learner competence regardless of testing location or SP assignment.