
  • Dr. Mark Gierl, Professor and Canada Research Chair
    Centre for Research in Applied Measurement and Evaluation
    University of Alberta

    How You Can Learn To Love Large-Scale Assessment: Let Me Count the Ways

    An Outline For Our Future At The University of Alberta

    Presentation at the Centre for Teaching and Learning (CTL) "Teaching Big" Symposium, University of Alberta, August 2012

    Centre for Research in Applied Measurement and Evaluation

    TO BEGIN

    Educational measurement is a discipline and a profession focused on the use of methodologies for assigning test scores to examinees, typically on a numeric scale, so we can make inferences about their knowledge, skills, and competencies.

    Educational measurement was once a static and largely quantitatively driven field. Recent developments in the learning sciences, mathematical statistics, computer technology, educational psychology, and computing science are creating profound changes in it; as a result, our contemporary assessments barely resemble their predecessors of a decade ago.


    OVERVIEW

    BACKGROUND

    Measurement, Evaluation, and Cognition (MEC) Program in the Department of Educational Psychology
    Centre for Research in Applied Measurement and Evaluation (CRAME)

    PRESENTATION

    Four principles of testing in large classrooms
    Two applications for putting principles into practice
    Plea for our collective future

    My presentation today will have four key messages


    OVERVIEW

    Measurement, Evaluation, and Cognition (MEC) is 1 of 8 areas in the Department of Educational Psychology

    Graduate students (16 currently) who receive an MEd or PhD in MEC specialize in educational measurement, statistics, research methods, cognition applied to assessment, and/or program evaluation

    Our graduates work in the private sector at testing companies like the Educational Testing Service (ETS) or in the public sector for different agencies (e.g., Alberta Education; Medical Council of Canada)

    MEC has five faculty members: Drs. Mark Gierl, Jacqueline Leighton, Ying Cui, Cheryl Poth, and Sharla King

    The Centre for Research in Applied Measurement and Evaluation (CRAME) is a centre within MEC focused on conducting research in the areas of educational measurement, cognitive psychology, and statistics with the goal of making assessment an integral part of learning and instruction


    OVERVIEW

    MESSAGE #1: Educational measurement is a specialized discipline where you can earn a graduate degree at both the MEd and PhD levels. This indicates that testing is embedded in a discipline that requires rigorous and comprehensive training.

    MESSAGE #2: You have colleagues at the University of Alberta who actually love to talk about tests and who train graduate students who also like and excel in our discipline [resources exist on campus]


    TESTING TIPS BY MARK

    HOW TO MAKE A GOOD MULTIPLE-CHOICE TEST ITEM

    The item measures specific content, as outlined in the test specifications.
    The item is based on an important topic in the curriculum and is designed to measure key thinking and problem-solving skills.
    The item is carefully edited, formatted, and presented using correct grammar, punctuation, capitalization, and spelling.
    The central idea is included in the stem, not the options.
    The stem of the item is worded positively, and avoids negatives such as NOT or EXCEPT.
    Only one of the options is clearly correct.
    The correct option is not cued by item-writing errors such as presenting a conspicuously correct option or blatantly incorrect options.
    All of the distractors are plausible (e.g., basing distractors on typical errors made by students).

    Etc., etc., etc., etc., etc., etc.
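
    A few of these tips are mechanical enough to check automatically. The sketch below is only an illustration under stated assumptions: the Item class and the check_item function are hypothetical names, not part of any tool mentioned in this talk, and only three checks are covered (negatives in the stem, exactly one keyed option, duplicated options). Whether a distractor is plausible still needs a human item writer.

```python
import re
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical container for one multiple-choice item."""
    stem: str
    options: list[str]
    correct: set[int]  # indices of the options keyed as correct


# Negatives the tips say to avoid in the stem.
NEGATIVES = re.compile(r"\b(not|except)\b", re.IGNORECASE)


def check_item(item: Item) -> list[str]:
    """Return the guideline violations found in one item (empty if none)."""
    problems = []
    if NEGATIVES.search(item.stem):
        problems.append("stem uses a negative such as NOT or EXCEPT")
    if len(item.correct) != 1:
        problems.append("item must have exactly one keyed correct option")
    normalized = {option.strip().lower() for option in item.options}
    if len(normalized) != len(item.options):
        problems.append("options contain duplicates")
    return problems


good = Item("Which one of the following is a prime number?",
            ["4", "6", "7", "9"], {2})
bad = Item("Which of the following is NOT a prime number?",
           ["2", "3", "3", "9"], {3})

print(check_item(good))
print(check_item(bad))
```

    The first call reports no problems; the second flags the negative stem and the duplicated option.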


    OUR FOUR PRINCIPLES

    PRINCIPLE #1: We will shift from infrequent summative assessments (e.g., 2 midterms + final) to more frequent formative assessment (e.g., 8-10 exams or more per term)

    PRINCIPLE #2: Testing on-demand is required where students can write exams at any time and at any location

    PRINCIPLE #3: Assessments will be scored immediately and students will receive both instant and detailed feedback on their overall performance as well as their problem-solving strengths and weaknesses

    PRINCIPLE #4: You will spend less time and less effort implementing these principles in your large classes compared to the amount of time you currently spend on assessment-related activities; in fact, much less
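
    Principle #3 is the easiest of the four to picture in code. The sketch below assumes objectively scored items and a hypothetical tagging scheme in which each item is labelled with one skill; the function name score_exam, the item ids, and the skill labels are all made up for illustration and do not describe any particular testing system.

```python
from collections import defaultdict


def score_exam(responses, answer_key, skill_of):
    """Score objectively scored items and summarize results by skill.

    responses  -- {item_id: option chosen by the student}
    answer_key -- {item_id: correct option}
    skill_of   -- {item_id: skill label} (hypothetical tagging scheme)
    """
    per_skill = defaultdict(lambda: [0, 0])  # skill -> [correct, total]
    for item_id, correct_option in answer_key.items():
        skill = skill_of[item_id]
        per_skill[skill][1] += 1
        # An unanswered item is simply counted as incorrect.
        if responses.get(item_id) == correct_option:
            per_skill[skill][0] += 1
    return {
        "score": sum(correct for correct, _ in per_skill.values()),
        "out_of": len(answer_key),
        "by_skill": {skill: tuple(c) for skill, c in per_skill.items()},
    }


key = {"q1": "b", "q2": "d", "q3": "a", "q4": "c"}
skills = {"q1": "recall", "q2": "recall",
          "q3": "problem solving", "q4": "problem solving"}

report = score_exam({"q1": "b", "q2": "d", "q3": "c"}, key, skills)
print(report)
```

    Here the student gets both recall items right, misses one problem-solving item, and skips the other, so the report shows 2 out of 4 overall with the weakness isolated to problem solving.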


    APPLICATION #1: COMPUTER-BASED TESTING

    PAPER-BASED TESTING

    [Diagram: Test Development, Test Administration, Test Reporting]

    COMPUTER-BASED TESTING

    [Diagram slides; one is labelled AUTOMATED]

    In short, computer-based testing is a very good thing and it is here to stay. Computer-based testing either eliminates or automates 2/3 of the testing activities that, currently, you do manually.

    Admittedly, we are focusing on examples that use objectively-scored assessment items, but examples can also be cited for automated essay scoring of student-produced assessment tasks

    The architecture for a computer-based testing system is feasible [PAPER BASED TESTING IS DEAD]

    MESSAGE #3: The University of Alberta needs a computer-based testing system because YOU need this system for all of your classes, big and small

    COMPUTER-BASED TESTING

    [Diagram: Test Development, Test Administration, Test Reporting, with activities marked *ELIMINATED* or *AUTOMATED*]

    APPLICATION #2: AUTOMATIC ITEM GENERATION

    ONE WAY TO CREATE TEST ITEMS

    Professor writing test items the day before the midterm exam

    AUTOMATIC ITEM GENERATION

    Another way to address this item development challenge is with automatic item generation (AIG)

    Automatic item generation is the process of using item models to generate test items with the aid of computer technology. With this approach, hundreds or even thousands of items can be generated with a single item model.

    While the idea of automatic item generation may be viewed as a dream come true, I am here to tell you that the dream is well within our reach because of developments in modern educational measurement theory


    A 54-year-old woman has a laparoscopic cholecystectomy. On post-operative day 3 she has a temperature of 38.5 °C. Physical examination reveals a red and tender wound and calf tenderness. Which one of the following is the best next step?

    a. Mobilize
    b. Antibiotics
    c. Anticoagulation
    d. Reopen the wound

    AUTOMATIC ITEM GENERATION

    That ugly diagram is a cognitive model highlighting the knowledge, skills, and content required to make a medical diagnosis

    The model includes three key outcomes:

    1. Identify THE PROBLEM (i.e., Post-Operative Fever);

    2. Specify SOURCES OF INFORMATION required to diagnose the problem (e.g., Type of Surgery); and

    3. Describe KEY FEATURES within each information source (e.g., Guarding and Rebound) needed to create different instances of the problem

    AUTOMATIC ITEM GENERATION

    Next, an item model is created, where an item model is like a template or a mould of the assessment task (i.e., it is a target where we want to place the content in the test item)

    A 54-year-old woman has a [Type of Surgery]. On post-operative day [Timing of Fever] the patient has a temperature of 38.5 °C. Physical examination reveals [Physical Examination]. Which one of the following is the best next step?

    Type of Surgery: Gastrectomy, Right Hemicolectomy, Left Hemicolectomy, Appendectomy, Laparoscopic Cholecystectomy
    Timing of Fever: 1 to 6 days
    Physical Examination: Red and Tender Wound, Guarding and Rebound, Abdominal Tenderness, Calf Tenderness


    Finally, we combine this information systematically to produce new items

    To accomplish this complex combinatoric task, we created software for item generation called IGOR (Item GeneratOR)

    IGOR was programmed using JAVA
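
    The combination step can be pictured with a short Python sketch. To be clear, this is not IGOR (which is a Java program whose internals are not shown in this talk); TEMPLATE, ELEMENTS, and generate_stems are hypothetical names chosen for illustration. The element values are the ones listed for the post-operative fever item model. The sketch fills the stem with every possible combination, one physical finding per stem, and makes no attempt to generate answer options, assign the key, or filter out clinically implausible combinations, all of which a real generator has to handle.

```python
from itertools import product

# Item model (stem template) with three slots for the variable content.
TEMPLATE = (
    "A 54-year-old woman has a {surgery}. On post-operative day {day} "
    "the patient has a temperature of 38.5 °C. Physical examination "
    "reveals {finding}. Which one of the following is the best next step?"
)

# Values for each slot, taken from the slide.
ELEMENTS = {
    "surgery": [
        "gastrectomy",
        "right hemicolectomy",
        "left hemicolectomy",
        "appendectomy",
        "laparoscopic cholecystectomy",
    ],
    "day": [1, 2, 3, 4, 5, 6],
    "finding": [
        "a red and tender wound",
        "guarding and rebound",
        "abdominal tenderness",
        "calf tenderness",
    ],
}


def generate_stems(template, elements):
    """Yield one stem for every combination of element values."""
    names = list(elements)
    for values in product(*(elements[name] for name in names)):
        yield template.format(**dict(zip(names, values)))


stems = list(generate_stems(TEMPLATE, ELEMENTS))
print(len(stems))  # 5 surgeries * 6 days * 4 findings = 120
print(stems[0])
```

    Even this tiny model yields 120 distinct stems from one template, which is why adding a few more elements or values quickly pushes the count into the thousands.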

    AUTOMATIC ITEM GENERATION


    When we used our method with 5 different item models developed for the MCC QE Part I in surgery, more than 20,000 items were generated:

    Item Model 1: Gallstones                    288
    Item Model 2: Hernias                       256
    Item Model 3: Aneurysm                    5,184
    Item Model 4: Post Operation Management   7,488
    Item Model 5: Post Operation Fever        7,680

    We have also developed item models at the K-12 levels in Language Arts, Social, Science, and Math, as well as in AP Biology and Architecture, in addition to 10 different content areas in Medicine, producing millions of test items
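
    As a quick arithmetic check on the "more than 20,000" figure, the five surgery item-model counts can simply be added up. The dictionary name below is a hypothetical label; the numbers are the ones reported on the slide.

```python
# Items generated per surgery item model, as reported on the slide.
items_per_model = {
    "Gallstones": 288,
    "Hernias": 256,
    "Aneurysm": 5_184,
    "Post Operation Management": 7_488,
    "Post Operation Fever": 7_680,
}

total = sum(items_per_model.values())
print(total)  # 20896
```

    The total is 20,896 items from five item models.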


    AUTOMATIC ITEM GENERATION

    16. A 60-year-old woman has been booked for a laparoscopic cholecystectomy for symptomatic gallstones. Prior to her surgery, she presents to the Emergency Department with a history of feeling faint and unwell. She has had rigors. On physical examination, her temperature is 40 °C. Her white blood count is 22 × 10^9/L; aspartate aminotransferase 63 U/L; alanine aminotransferase 78 U/L; alkaline phosphatase 450 U/L; amylase level 200 U/L; and bilirubin 50 µmol/L. Which one of the following is the most likely diagnosis?

    (a) Cholecystitis.
    (b) Cholangitis.
    (c) Pancreatitis.
    (d) Hepatic abscess.
    (e) Duodenal ulcer.

    39. An obese 61-year-old male collapsed with sudden pain at a shopping center and is brought to hospital by ambulance. He is diaphoretic. His pulse is 96/minute; blood pressure 100/70 mm Hg; he complains of severe pain in his abdomen and left flank. Which one of the following is the most likely diagnosis?

    (a) Acute hemorrhagic pancreatitis.
    (b) Ruptured aortic aneurysm.
    (c) Mesenteric vascular occlusion.
    (d) Acute diverticulitis.
    (e) Volvulus of sigmoid colon.


    CONCLUSION

    Educational measurement is a specialized discipline requiring advanced graduate training. This implies that assessment contains many complex and thorny issues, but please remember that you have colleagues on campus who can help you deal with these issues.

    Our discipline is undergoing profound changes that will yield much better methods for evaluating students while at the same time requiring less time and effort for the examiner, because much of the unpleasant work is being automated. Computer-based testing and automatic item generation are but two examples from a list of many.

    MESSAGE #4: There is no going back to the good old days. Therefore, we must work together to structure our future at the University of Alberta by building and implementing these new assessment systems, but also recognize that this work is just getting started.


    THANK YOU

    Dr. Mark J. Gierl ([email protected])
    6-110 Education Centre North
