Technical flaws in multiple-choice questions in the access exam to medical specialties ("examen MIR") in Spain (2009-2013).

BMC Medical Education

PubMedID: 26846526

The main factor that determines the selection of a medical specialty in Spain after obtaining a medical degree is the MIR ("médico interno residente", internal medical resident) exam. This exam consists of 235 multiple-choice questions with five options, some of which include images provided in a separate booklet. The aim of this study was to analyze the technical quality of the multiple-choice questions included in the MIR exam from 2009 to 2013.

All the questions included in the exams from 2009 to 2013 were analyzed. We studied the proportion of questions including clinical vignettes, the number of items related to an image and the presence of technical flaws in the questions. For the analysis of technical flaws, we adapted the National Board of Medical Examiners (NBME) guidelines. We looked for 18 different issues included in the manual, grouped into two categories: issues related to testwiseness and issues related to irrelevant difficulties.
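The per-year tabulation described above can be sketched as follows. This is a minimal illustration and not the authors' procedure: the record layout, the function name `summarize`, the flaw labels, and the toy items are all hypothetical, and the 18 issues actually checked in the study are not reproduced here.

```python
from collections import defaultdict

# Hypothetical flaw labels, for illustration only; the study's 18 issues
# adapted from the NBME guidelines are not listed in this sketch.
TESTWISENESS = {"implausible_option", "unbalanced_correct_option"}
IRRELEVANT_DIFFICULTY = {"negative_stem", "hinged_question"}


def summarize(items):
    """Return, for each exam year, the percentage of items with each property."""
    counts = defaultdict(lambda: defaultdict(int))
    for item in items:
        year = counts[item["year"]]
        flaws = set(item["flaws"])
        year["total"] += 1
        year["vignette"] += item["vignette"]  # True counts as 1
        year["image"] += item["image"]
        year["any_flaw"] += bool(flaws)
        # An item may carry flaws from both categories, so these can overlap.
        year["testwiseness"] += bool(flaws & TESTWISENESS)
        year["irrelevant_difficulty"] += bool(flaws & IRRELEVANT_DIFFICULTY)
    return {
        y: {k: round(100 * v / c["total"], 1) for k, v in c.items() if k != "total"}
        for y, c in counts.items()
    }


# Toy records (invented, not taken from the MIR exams).
TOY_ITEMS = [
    {"year": 2009, "vignette": True, "image": False, "flaws": ["negative_stem"]},
    {"year": 2009, "vignette": False, "image": False, "flaws": []},
    {"year": 2010, "vignette": True, "image": True,
     "flaws": ["implausible_option", "hinged_question"]},
]

summary = summarize(TOY_ITEMS)
```

Because a single item can show flaws from both categories, the two category percentages need not add up to the percentage of items with at least one flaw.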

The final number of questions analyzed was 1,143. The percentage of items based on clinical vignettes increased from 50 % in 2009 to 56-58 % in the following years (2010-2013). The percentage of items based on an image increased progressively from 10 % in 2009 to 15 % in 2012 and 2013. The percentage of items with at least one technical flaw varied between 68 and 72 %. The percentage of items with flaws related to testwiseness decreased from 30 % in 2009 to 20 % in 2012 and 2013. While most of these issues decreased dramatically or even disappeared (such as an uneven distribution of the correct answer across option numbers), non-plausible options remained frequent. No improvement was observed for technical flaws related to irrelevant difficulties, particularly negative stem questions and "hinged" questions.

The formal quality of the MIR exam items improved between 2009 and 2013 with regard to testwiseness. A more careful review of submitted items, checking systematically for technical flaws, could improve the validity and discriminatory power of the exam without increasing its difficulty.