Transforming Multiple Choice Questions to Effectively Assess Application of Knowledge

STReME Series, August 11, 2011

Brenda Roman, MD, Professor of Psychiatry, BSOM
Paul Koles, MD, Associate Professor of Pathology and Surgery, BSOM



Journey through Lunch

• Power and Purposes of Assessment
• Learning Approaches and Assessment
• Assessment Using Multiple-Choice Questions (MCQs)
• Evaluation of MCQ Quality
• Identification of Flaws in MCQs
• Practice: Find the Flaws
• Practice: Choose the Highest-Quality MCQ


Q1: Of the criteria listed below, which one do you believe is most important for judging the quality of a multiple choice question (MCQ)?

a. The MCQ assesses knowledge that is considered important by the writer of the question.

b. The MCQ is directly related to one or more of the course’s learning objectives.

c. The MCQ asks the student to make a decision that is based on critical interpretation of data.

d. The MCQ requires the student to appropriately apply knowledge, not just to recall facts.


Flaws in the previous MCQ

• Options
– Non-homogeneous options: (a) and (b) are about content; (c) and (d) are about format and purpose
– Unnecessarily long
– Only (d) has a contrasting clause
• Stem
– Question can’t be answered if the answer options are covered up
– “judging the quality”: which aspect of quality?
– “do you believe” implies that the best answer is a matter of personal opinion (there is no single best answer)



Power of Assessment

• “Assessment drives student learning. Student assessment can be designed to foster the development of elaborated knowledge structure by making relationships and understanding—rather than isolated facts—the objects of assessment.”

Bordage G: Elaborated Knowledge: A Key to Successful Diagnostic Thinking. Acad Med 69:883-885, 1994


Purposes of Assessment (using written questions)

• Assumption: performance on a sample of questions allows inferences about the skills of examinees in a broader domain

• Communicate what instructor views as important
• Motivate students to learn
• Allow objective comparisons among students who often experience variations in curriculum
• Compensate for instructional gaps by encouraging students to read broadly and utilize a variety of educational tools

Case SM, Swanson DB; Constructing Written Test Questions for the Basic and Clinical Sciences, 3rd edition, NBME 2002


Assumption Refuted:

Physicians who pass licensure exams may lack some essential skills for practicing medicine


Learning Behavior

• Learning behavior: “. . .the set of cognitive and metacognitive processes that learners draw on to acquire knowledge, skills, and understanding” (Mitchell R; Acad Med 84:918-926, 2009)
• 424 residents from 7 IM residencies completed a cognitive behavior survey (140 items, 7-point Likert scale)
• Seven learning behavior scales developed from survey data: memorization, conceptualization, reflection, independent learning, critical thinking, meaningful learning experience, attitude toward educational experience
• RESULTS
– Memorization not correlated positively with other 6 scales
– Memorization correlated negatively with critical thinking
– Residents in top 20% on reflection scale also conceptualized, learned independently, and thought critically more than the bottom 20%


Competent Physicians

• Integrate: “to bring together parts into a whole” (Webster’s)

Competent Physician (diagram labels):
• Knowledge of biomedical and clinical science
• Skills of analytical and critical thinking
• Ability to communicate
• Cultural and ethical competence


Assessment in Medical Education

• Primary purpose: measure student’s competence in course, clerkship, or residency

• Secondary purpose: develop competent physicians
– Motivate student to integrate new knowledge with previously mastered knowledge (longitudinal learning)
– Foster critical thinking skills (clinical decision-making)
– Impart direction for future learning (subliminal messages embedded in assessments)


Learning Approaches and Assessment

• Students adapt learning approaches to context in which learning occurs
• Three basic approaches identified
– Surface (memorization)
– Deep (comprehension and application)
– Strategic (adapted to meet perceived expectation of faculty)
• Teaching methods influence students’ approach to learning
• Some teaching methods hinder development of deep learning approach
• Education of competent physicians requires “substantial changes in teaching, curriculum and, particularly, assessment . . .”

Newble DI, Entwistle NJ: Learning Styles and Approaches: Implications for Medical Education. Medical Education 1986; 20:162-175


Can MCQs assess learner’s ability to apply knowledge by critical thinking and problem solving?

Authors: Coderre et al., BMC Medical Education 2004, 4:23
Method: Think-aloud protocols to determine problem-solving strategy used by gastroenterologists and MS4s in answering 8 questions about dysphagia, nausea/vomiting, diarrhea, and elevated liver enzymes
Results and conclusions:
1. Similar clinical reasoning skills used to answer 5-option and extended matching MCQs
2. Stem more important than options for testing clinical reasoning

Authors: Beullens et al., Medical Education 2005, 39:410-417
Method: 20 final-year med students and 20 final-year IM residents solved extended matching questions (EMQs) aloud
Results and conclusions:
1. Residents and upper 50% in both groups used more “forward” than “backward” reasoning
2. Processes of clinical reasoning can be assessed using EMQs

Authors: Cuddy et al., Acad Med 2004, 79:S43-45
Method: 27 experts completed a survey about the clinical relevance of 150 NBME Step 2 MCQs
Results and conclusions: 92% of questions clinically relevant; 85% of content used in clinical practice



* Bloom’s taxonomy of cognitive learning collapsed into 3 levels: (1) knowledge; (2) comprehension and application; (3) problem solving


MCQs using clinical vignettes in the stem

• “Questions with rich descriptions of clinical context invite the more complex cognitive processes that are characteristic of clinical practice.”

• “Conversely, context-poor questions can test basic factual knowledge but not its transferability to real clinical problems.”

Epstein RJ: Assessment in Medical Education, New England Journal of Medicine 2007; 356:387-396.


“There is nothing new under the sun”

(Ecclesiastes 1:9)

• “No teaching should be done without a patient for a text.” (Osler William: On the Need of A Radical Reform in our Methods of Teaching Medical Students; Medical News 82:49-53, 1904.)

• NBME announcement 2010-2011: decision to use only clinical or experimental vignette formats on USMLE step 1.


Format of Clinical Vignette

• Outline (not all parts necessary)
– Age and gender (“42-year-old woman”)
– Site of care (“comes to the emergency department”)
– Presenting complaint (“because of headache”)
– Duration (“has persisted for 2 days”)
– Past history (may not be relevant)
– Physical findings (“pulsating artery anterior to ear”)
– +/- diagnostic studies; +/- treatments
• Example
– “What area is supplied with blood by the posterior inferior cerebellar artery?”
– “A 62-year-old man develops left-sided limb ataxia, Horner’s syndrome, nystagmus, and loss of appreciation of facial pain and temperature sensations. Which of the following arteries is most likely to be occluded?”


How good is this MCQ?

• Subjective methods to evaluate quality
– Opinion of question author
– Opinions of other content experts
– Opinions of experienced MCQ writers
– Opinions of students (pre-test, post-test)

• Systematic identification of flaws by question author and trusted consultants (YOU ARE THE CONSULTANTS!)

• Gold standard: performance of MCQ in an exam, as demonstrated by difficulty index and discrimination factor


Gold Standard: Performance of MCQ on an examination

Item-analysis columns: Year, N, difficulty index, top 25%, bottom 25%, discrimination factor, answer, and responses to options A through E

Difficulty index: percentage of examinees who answered the question correctly

Discrimination factor: how well the item discriminates between students who performed highest on the exam (top 25%) and students who performed lowest on the exam (bottom 25%).

Higher D.F. suggests item is a more reliable measure of competence
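
As a rough illustration of the two statistics on this slide, the sketch below computes a difficulty index and a simple top-versus-bottom comparison from per-student results. The function names, the data layout, and the use of a plain difference between the top-25% and bottom-25% percentages are assumptions made for illustration. The slides do not state the formula behind the reported discrimination factor, and exam software may use a different statistic, so values from this sketch need not match the D.F. values reported elsewhere in this deck.

```python
from typing import List, Sequence, Tuple


def difficulty_index(item_correct: Sequence[bool]) -> float:
    """Percentage of examinees who answered the item correctly."""
    if not item_correct:
        raise ValueError("at least one examinee is required")
    return 100.0 * sum(item_correct) / len(item_correct)


def top_bottom_percent_correct(
    exam_scores: Sequence[float],
    item_correct: Sequence[bool],
    fraction: float = 0.25,
) -> Tuple[float, float]:
    """Percent correct on the item among the highest and lowest exam scorers."""
    if len(exam_scores) != len(item_correct) or not exam_scores:
        raise ValueError("scores and item results must be non-empty and aligned")
    n = len(exam_scores)
    group_size = max(1, int(n * fraction))
    # Rank examinees by total exam score, highest first.
    ranked = sorted(range(n), key=lambda i: exam_scores[i], reverse=True)
    top, bottom = ranked[:group_size], ranked[-group_size:]

    def percent(group: List[int]) -> float:
        return 100.0 * sum(item_correct[i] for i in group) / len(group)

    return percent(top), percent(bottom)


def upper_lower_difference(
    exam_scores: Sequence[float],
    item_correct: Sequence[bool],
    fraction: float = 0.25,
) -> float:
    """Assumed discrimination measure: (top % - bottom %) / 100."""
    top, bottom = top_bottom_percent_correct(exam_scores, item_correct, fraction)
    return (top - bottom) / 100.0


# Hypothetical data: total exam scores and results on one item for 8 students.
exam_scores = [95, 90, 85, 80, 70, 65, 60, 50]
item_correct = [True, True, True, True, True, False, True, False]

print(difficulty_index(item_correct))
print(top_bottom_percent_correct(exam_scores, item_correct))
print(upper_lower_difference(exam_scores, item_correct))
```

In this made-up example, 6 of 8 students answer correctly, both top-quartile students answer correctly, and one of the two bottom-quartile students does, so the item separates the groups even though most students get it right.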


Systematic Identification of Flaws in MCQs

A) 5 common flaws in stems
B) 8 common flaws in answer options


Systematic Identification of Flaws Pre-Exam: MCQ Stems

A1. Stem does not end with a question (lead-in) that can be answered by covering up answer options.

A 39-year-old female is seen for an annual exam. She had been on oral contraceptive pills as a teenager but discontinued that form of contraception over 15 years ago. Because of her contraceptive practice she has . . .

Prostate cancer is best treated . . .

Corticosteroid therapy . . .

According to the best scientific evidence available to date, HIV-1 came from . . .


Systematic Identification of Flaws Pre-Exam: MCQ Stems

A2. Stem is unnecessarily complicated: too long, with lots of irrelevant information.

A 48-year-old woman presents to the physician with lower back pain. She states that she has had the pain for about 2 weeks and that it has become steadily more severe. An x-ray film shows a lytic bone lesion in her lumbar spine. Review of systems reveals the recent onset of mild headaches, nausea, and weakness. Her CBC shows a normocytic anemia, and her erythrocyte sedimentation rate is elevated. Urinalysis shows heavy proteinuria, and a serum protein electrophoresis shows a monoclonal peak of IgG. Which of the following is responsible for this patient’s spinal lesions?

a. Bence-Jones protein
b. lymphoplasmacytoid proliferation
c. osteoblast activating factor
d. osteoclast activating factor
e. primary amyloidosis


Systematic Identification of Flaws Pre-Exam: MCQ Stems

A3. Stem contains vague terms that invite a wide range of interpretations.

A B-cell-deficient toddler recovers as well as a normal child does to infection with the chickenpox virus. This child's immune system is capable of developing . . .


Systematic Identification of Flaws Pre-Exam: MCQ Stems

A4. Stem contains abbreviations that are not clearly understood by all examinees.

A 32yo WF in her 1st trimester of pregnancy experiences GERD 3-4x/week and c/o heartburn. She has not responded to MOM. Which medication will be best to treat this patient?


Systematic Identification of Flaws Pre-Exam: MCQ Stems

A5. Stem contains words about quantity that are difficult or impossible to quantify: probably, usually, infrequently, sometimes, in most cases, in few cases, etc.

In most cases, men who develop prostate cancer usually have limited dietary intake of which of the following food groups?


Perception is unpredictable


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B1. One or more options do not follow grammatically from the stem.

Which of the following behaviors is most frequently observed in adolescents who smoke cigarettes?

a. intelligence quotient below 80
b. overeating
c. body mass index < 25
d. disrespect for authority
e. alcohol abuse


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B2. Options are heterogeneous in language or domains.

Which is necessary for the development of Burkitt lymphoma?

a. creation, by translocation, of a bcr/abl fusion gene in B-lymphocytes
b. deletion of p53 tumor suppressor gene in B-lymphocytes
c. infection of B-lymphocytes by Epstein-Barr virus
d. over-expression of the c-myc oncogene in B-lymphocytes
e. trisomy of chromosome 8


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B3. Option includes absolute terms that make it unlikely to be correct: “always”, “never”

In patients with advanced dementia due to Alzheimer disease, the memory defect

a. can be treated adequately with phosphatidylcholine (lecithin).
b. could be a sequela of early parkinsonism.
c. is never seen in patients with neurofibrillary tangles in the cerebral cortex.
d. is never severe.
e. possibly involves the cholinergic system.


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B4. Correct option is longer, more specific, or more complete than other options (“sore thumb”).

Secondary gain is

a. synonymous with malingering.
b. a frequent problem in obsessive-compulsive disorder.
c. a complication of a variety of illnesses and tends to prolong many of them.
d. never seen in organic brain damage.


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B5. Correct option contains the most elements in common with other options (“convergence”).

Intramedullary destruction of red blood cells in beta-thalassemia is best explained by which mechanism?

a. beta-4 tetramer oxidation and precipitation
b. excessive iron accumulation in macrophages
c. increased formation of alpha chain aggregates
d. increased formation of Hb H (beta 4)
e. increased formation of Hb F (alpha 2 gamma 2)


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B6. Options are long, complicated, or composed of 2-3 parts, imposing irrelevant difficulty.

The figure below shows the dose-response curves for four different derivatives of a muscarinic receptor agonist. Each derivative acts by binding to the same site on the muscarinic receptor. The Heptyl derivative

a. has a lower binding affinity for the receptor than does the Hexyl derivative.
b. has a lower intrinsic activity than does the Hexyl derivative because it has a lower receptor affinity.
c. is a full agonist when compared with the Octyl derivative.
d. is more potent than the Hexyl derivative.
e. may act as a mixed agonist-antagonist if it has a higher receptor affinity than the Hexyl derivative.


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B7. Options contain words about quantity that are difficult or impossible to quantify: probably, usually, infrequently, sometimes, in most cases, in few cases, etc.

Severe obesity in early adolescence

a. usually responds dramatically to dietary regimens.
b. often is related to endocrine disorders.
c. has a 75% chance of resolving spontaneously.
d. shows a poor prognosis.
e. usually responds to pharmacotherapy and intensive psychotherapy.


Systematic Identification of Flaws Pre-Exam: MCQ Answer Options

B8. “none of the above” or “all of the above” is used as an option.

Which of the following cities is closest to New York City?

a. Boston
b. Chicago
c. Dallas
d. Los Angeles
e. None of the above
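
Several of the flaws in this checklist depend on specific wording, so a first-pass screen can be partly automated. The sketch below flags vague quantity words (A5 in stems, B7 in options), absolute terms (B3), and “none of the above” or “all of the above” options (B8). The word lists are taken from the slides; the function name `screen_mcq` and the whole-phrase matching approach are illustrative assumptions. A match is only a prompt for human review, not a judgment that the item is flawed.

```python
import re
from typing import Dict, List

# Word lists taken from the flaw descriptions A5/B7, B3, and B8.
VAGUE_QUANTITY = ["probably", "usually", "infrequently", "sometimes",
                  "in most cases", "in few cases"]
ABSOLUTE_TERMS = ["always", "never"]
CATCH_ALL = ["none of the above", "all of the above"]


def _hits(text: str, phrases: List[str]) -> List[str]:
    """Return the phrases that occur as whole words in the text."""
    lowered = text.lower()
    return [p for p in phrases
            if re.search(r"\b" + re.escape(p) + r"\b", lowered)]


def _option_hits(options: List[str], phrases: List[str]) -> List[str]:
    """Collect matching phrases across all options, without repeats."""
    found: List[str] = []
    for option in options:
        for hit in _hits(option, phrases):
            if hit not in found:
                found.append(hit)
    return found


def screen_mcq(stem: str, options: List[str]) -> Dict[str, List[str]]:
    """Map flaw codes to the wording that triggered them (hypothetical helper)."""
    checks = {
        "A5": _hits(stem, VAGUE_QUANTITY),
        "B3": _option_hits(options, ABSOLUTE_TERMS),
        "B7": _option_hits(options, VAGUE_QUANTITY),
        "B8": _option_hits(options, CATCH_ALL),
    }
    # Report only the flaw codes that actually matched something.
    return {code: hits for code, hits in checks.items() if hits}


# Stem from the A5 example and options from the B8 example in the slides.
stem = ("In most cases, men who develop prostate cancer usually have limited "
        "dietary intake of which of the following food groups?")
options = ["Boston", "Chicago", "Dallas", "Los Angeles", "None of the above"]
print(screen_mcq(stem, options))
```

Flaws such as non-homogeneous options or a stem that cannot be answered with the options covered still need a human reader; this kind of screen only catches wording-level cues.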


Identify those flaws: Practice MCQ 1

P1) Which of the following applies to pseudogout?

a. It occurs frequently in women.
b. It is seldom associated with acute pain in a joint.
c. It may be associated with a finding of chondrocalcinosis.
d. It is clearly hereditary in most cases.
e. It responds well to treatment with allopurinol.

P1) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?

a. 1
b. 2
c. 3
d. 4
e. 5


Identify those flaws: Practice MCQ 2

P2) A 17-year-old male presents with a two-year history of "severe" acne. He has previously been treated with numerous topical treatments and several different oral antibiotics. Multiple nodules and cysts are present diffusely on the face, shoulders, back, and upper chest. He has multiple depressed scars on the cheeks. He is administered an oral agent which leads to significant improvement in his condition. This agent works by

a. disruption of bacterial cell membranes.
b. exfoliation.
c. increased sebum production.
d. reduction of androgen levels.
e. suppression of sebum production.

P2) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?

a. 1
b. 2
c. 3
d. 4
e. 5


Identify those flaws: Practice MCQ 3

P3) A 25-year-old woman consults her physician because she has decided to use oral contraceptives. After the physician asks about history of thrombophlebitis, pulmonary embolus, and smoking (all negative), he proceeds to physical exam:

Vital signs: within normal limits
Height 4'0" Weight 85 lbs.
HEENT: large head with prominent, rounded forehead
Heart, Lungs, Abdomen: within normal limits
Extremities: short arms and legs (compared to trunk length)

He writes a prescription for oral contraceptives, but also records her most likely physical diagnosis in the chart. Which molecular abnormality best explains her diagnosis?

a. constitutive activation of fibroblast growth factor receptor 2
b. constitutive activation of fibroblast growth factor receptor 3
c. expansion mutation in HOXD13 with altered length of transcription factor
d. mutation in COL1A1 with deficient synthesis of type 1 collagen
e. mutation in COL2A1 with deficient synthesis of type 2 collagen

P3) Of 13 flaws listed in your worksheet, how many flaws are present in this MCQ?

a. 1
b. 2
c. 3
d. 4
e. 5


High-Quality MCQ: in principle

“A high-quality multiple-choice question is one that assesses content considered to be important, is free of flaws in both stem and options, and effectively identifies those who can use their knowledge to skillfully assess data and make decisions.”

(modified from Case SM, Swanson DB: Constructing Written Test Questions for the Basic and Clinical Sciences, National Board of Medical Examiners, 2002)



Statistical Definition of High-Quality MCQs: ones that perform well on an exam, as judged by difficulty index and discrimination factor


Mastery MCQs

The data below show performance of 3 MCQs used in a final course exam for BSOM year 2 students.

All three assessed the same content domain. All three were classified as “mastery” questions (answered correctly by ≥ 90% of students).

QM1) Based on the performance data shown below, which one is the highest-quality MCQ?

Option   n    D.I.  Top 25%  Bottom 25%  D.F.   A    B    C    D    E
A)       101  90    96       86          0.27   0    10   91   0    0
B)       105  90    96       67          0.41   1    95   3    4    2
C)       105  90    93       81          0.22   2    7    94   1    1


Intermediate Difficulty MCQs

The data below show performance of 4 MCQs used in a final course exam for BSOM year 2 students.

All four assessed the same content domain. All four were classified as “intermediate difficulty” questions (answered correctly by 70.0–89.9% of students).

QM2) Based on the performance data shown below, which one is the highest-quality MCQ?

Option   n    D.I.  Top 25%  Bottom 25%  D.F.   A    B    C    D    E
A)       105  81    100      59          0.43   0    19   85   1    0
B)       93   81    96       63          0.31   14   4    75   0    0
C)       93   70    96       54          0.40   13   65   10   1    4
D)       93   75    92       67          0.21   1    1    19   2    70


Challenging MCQs

The data below show performance of 3 MCQs used in a final course exam for BSOM year 2 students.

All 3 assessed the same content domain. All 3 were classified as “challenging” questions (answered correctly by < 70% of students).

QM3) Based on the performance data shown below, which one is the highest-quality MCQ?

Option   n    D.I.  Top 25%  Bottom 25%  D.F.   A    B    C    D    E
A)       104  64    74       41          0.32   12   67   6    12   7
B)       93   57    84       42          0.26   0    22   17   1    53
C)       101  69    81       55          0.33   2    0    3    70   26
