
  • Do all tests have washback?

    J Charles Alderson Lancaster University

  • Tests

    Washback

    Diagnosis

  • A four-letter word

    Elicitation device: getting somebody to perform their competence

    Description of performance

    Procedure for making judgements based on criteria

    Measurement, not the same as assessment

    Not observation

  • Tests whose results are seen, rightly or wrongly, by students, teachers, administrators, parents or the general public as being used to make important decisions that immediately and directly affect them.

    (Madaus, 1988)

  • Relates to the effects of tests on classroom practices, particularly teaching and learning.

    Can be positive or negative, to the extent that it either promotes or impedes the accomplishment of educational goals held by learners and/or programme personnel.

    (Bailey, 1996)

  • Mismatch between the stated goals of instruction and the focus of assessment

    May lead to the abandonment of instructional goals in favour of test preparation

    Forces teachers to do things they would not normally do

  • If a test has positive washback, there is no difference between teaching the curriculum and teaching to the test.

    (Weigle & Jensen, 1997, p. 205)

  • Tests can be a powerful, low-cost means of influencing the quality of what teachers teach and what learners learn at school.

    (Heyneman & Ransom, 1992)

  • psychometric imperialism

    Leads to cramming

    Narrows the curriculum

    Focuses the attention on skills that are easy to test

    Restricts teacher and student creativity

    Demeans the professional judgement of teachers

  • A test will influence teaching.

    A test will influence learning.

    A test will influence what teachers teach.

    A test will influence what learners learn.

    A test will influence how teachers teach.

    A test will influence how learners learn.

    (Alderson and Wall, 1993)

  • A test will influence the rate and sequence, and the degree and depth of teaching.

    A test will influence the rate and sequence, and the degree and depth of learning.

    A test will influence attitudes to the content, method, etc. of teaching and learning.

    Tests will have washback on all teachers and learners.

    Tests will have washback on some teachers and some learners but not on others.

    (Alderson and Wall, 1993)

  • Wall & Alderson 1993

    different amounts of washback on content, methods, means of assessment

    Alderson & Hamp-Lyons 1996, Watanabe 1996

    teachers are affected by tests in different ways

    Shohamy, Donitsa-Schmitt & Ferman, 1996

    the washback of tests can change over time

    Tsagari, 2006: The complexity of washback: Participants' perceptions, material design and classroom applications

    Virtually all studies relate to high-stakes tests

  • Curriculum: contents of curriculum, timetabling

    Teaching materials: choice of textbooks, use of past papers, teacher-made materials

    Teaching methods: choice of methods, teaching of test-taking skills

    Attitudes and feelings: of learners and teachers

    Learning: Do test results improve? Does learning improve?

    (Spratt, 2005)

  • The exam

    Teacher beliefs

    Teacher attitudes

    Teacher training

    Resources

    The school

    Cultural factors (Spratt, 2005)

  • Very under-developed and under-theorised in language testing and teaching

    Focus on learners' strengths and weaknesses; on their prediction, even explanation

    Diagnosis requires a better understanding of what the nature might be of strengths and weaknesses in particular language skills

    There are very few diagnostic SFL tests

    (Alderson 2005, 2007; Huhta 2008)

  • NOT Proficiency

    NOT Achievement

    NOT Progress

    NOT Placement

    NOT Aptitude

    BUT all the above could yield useful diagnostic information

    HOWEVER, better is diagnosis by design

  • Bachman, 1990: 60

    "Virtually any test has some potential for providing diagnostic information"

    But he then goes on to say:

    "When we speak of a diagnostic test ... we are generally referring to a test that has been designed and developed specifically to provide detailed information about the specific content domains that are covered in a given program or that are part of a general theory of language proficiency. Thus, diagnostic tests may be either theory or syllabus-based"

  • Yet Alderson (2005: 6) points out:

    "It would appear that we have a problem here: diagnosis (is said to be) useful, most language tests are (said to be) usable for diagnosis, it is common for universities to administer diagnostic tests (actually placement tests), and yet diagnostic tests are rare!"

    Two examples of diagnosis in action and in research into theory: DIALANG and DIALUKI

  • DIALANG

    Diagnosis in action

    Diagnosis by design

  • Computer-based diagnostic language testing system

    14 European languages

    Delivers tests across the Internet

    Supports language learners

    Institutional or private use, free of charge

    Still widely used throughout Europe and beyond, 8 years after launch

  • DIALANG is an application of the Common European Framework of Reference

    DIALANG uses Common European Framework scales and self-assessment statements (modified)

    DIALANG provides some evidence of their validity

  • to provide language users and learners with diagnostic information about their strengths and weaknesses and to help them to find ways of improving their proficiency

  • to raise the learners' awareness of their own language proficiency, of language learning and proficiency in general, and of the role that language tests might have in the learning process

    this takes place through the use of self-assessment and various kinds of feedback and information services

  • first large-scale system for diagnosis / feedback rather than certification

    on-line, Internet-delivered, universally available, not restricted to a particular place or time

  • available for all kinds & levels of learners & can support them throughout their language learning career

    multi-lingual (14 languages): tests; interface (instructions, help screens); self-assessment & advice / feedback

  • ASSESSMENT PROCEDURE (flowchart, steps 1-3)

    Client enters DIALANG

    Selection of section: reading, writing, listening, structures, vocabulary

    Vocabulary Size Placement Test

  • ASSESSMENT PROCEDURE (flowchart, steps 4-7)

    Self-assessment

    Responding to tasks

    Feedback

    Selection: another section/language, or EXIT ("Goodbye!")

  • Reading Comprehension (CEFR)

    Listening Comprehension (CEFR)

    Writing (CEFR)

    Structures

    Vocabulary

    no overall section (nor grade & feedback)

    from beginners to advanced

  • Danish

    Dutch

    English

    Finnish

    French

    German

    Greek

    Icelandic

    Irish

    Italian

    Norwegian

    Portuguese

    Spanish

    Swedish

  • VSPT: score band and description

    results (and self-assessment): CEFR scales and report on self-assessment

    explanatory feedback: why self-assessment may not match test result

    advisory feedback: what you can do and how to progress, based on CEFR

    item review

  • http://www.lancs.ac.uk/researchenterprise/dialang/about

  • Validity relates to what the test is intended to measure

    Design for diagnosis, don't retrofit

    Diagnosis should relate to future treatment

    Treatment should be teachable or learnable

    Diagnosis should be based on theory: what we know about what affects learning

  • Informed by SLA research

    Focus on weaknesses rather than strengths

    Enable a detailed analysis and report

    Give detailed feedback which can be acted on

    Provide immediate results

    Involve little anxiety

    Based on content covered in instruction

    Less authentic

    Discrete-point rather than integrated

    More likely to focus on low-level language skills than higher-order skills, which are more integrated

    Likely to be enhanced by being computer-based

  • DIALUKI

    Understanding Diagnosis

    Researching Diagnosis

  • Diagnosing Reading and Writing in a Second or Foreign Language

    Research project 2010-2013: work in progress

    Funded by the Academy of Finland, the University of Jyväskylä and the UK Economic and Social Research Council (ESRC)

    Cooperation between language testers, other applied linguists and psychologists (L1 reading)

  • The main research questions:

    Can different L1 and L2 linguistic, psycholinguistic, motivation and background measures predict difficulties in SFL R/W?

    How does SFL proficiency in R/W develop in psycholinguistic and linguistic terms?

    Which features or combinations of features characterise different CEFR proficiency levels?

  • Study 1: a cross-sectional study with 850 students. Data collection: 2010-11. Exploring the value of a range of L1 & L2 measures in predicting L2 reading & writing, in order to select the best predictors for further studies.

    Study 2: a longitudinal study. Data collection: 2010-13. The development of literacy skills, and the relationship of this development to the diagnostic measures.

    Study 3: an intervention study. Data collection: 2012-13. The effects of training on SFL reading and writing.

  • Finnish-speaking learners of English as FL

        primary school, 4th grade (age 10; N = 210)

        lower secondary school, 8th grade (age 14; N = 208)

        Gymnasium, 2nd year students (age 17; N = 218)

    Russian-speaking learners of Finnish as SL

        primary school (3-6th grade; N = 186)

        lower secondary school (7-9th grade; N = 78)

  • Independent predictor variables in L1 and FL

  • Instruments in DIALUKI STUDY ONE

    English as a Foreign Language

    Group tasks

  • QUESTIONNAIRES

    Groups: 4th grade (age 10-11), 8th grade (age 14-15), Gymnasium (age 17-18)

    Parents' questionnaire: all groups

    Students' questionnaire: all groups

    Motivational questionnaire: 4th grade: 49 statements; 8th grade: 58 statements; Gymnasium: 58 statements

    Self-assessment, reading & writing L1 (2 x 18 items): DIALANG (two groups)

    Self-assessment, reading & writing L2 (2 x 18 items): DIALANG (two groups)

  • LINGUISTIC MEASURES

    Groups: 4th grade (age 10-11), 8th grade (age 14-15), Gymnasium (age 17-18)

    Reading L1: 4th grade: ALLU (1 text, 12 items); 8th grade: PISA 2009 (3 texts, 11 items); Gymnasium: PISA 2009 (3 texts, 11 items)

    Reading L2: 4th grade: Pearson Young Learners (20 items); 8th grade: Pearson PTE General (25 items), Dialang (30 items); Gymnasium: Pearson PTE General (25 items), Dialang (30 items)

    Writing L1: 4th grade: an opinion (Mobile phones / Internet); 8th grade: an opinion (School food / Summer job), a complaint; Gymnasium: an opinion (School food / Summer job), a complaint

    Writing L2: 4th grade: a message to a friend, "How do you travel?"; 8th grade: an opinion (Mobile phones / Boys and girls in different classes), an article; Gymnasium: an opinion (Mobile phones / Boys and girls in different classes)

    Vocabulary L1: all groups: Dialang (75 items)

    Vocabulary L2: 4th grade: selected from the 1000 most common English words (60 items); 8th grade: selected from the 3000 most common English words (90 words); Gymnasium: selected from the 5000 most common English words + AWL (120 words)

    Segmentation L1: 4th grade: text "Isoisä" (Grandpa), 36 items; 8th grade: text "Lilli" (73 items); Gymnasium: text "Lilli" (73 items)

    Segmentation L2: 4th grade: text "Little pigs" (51 items); 8th grade: text "Australia" (59 items); Gymnasium: text "Coffee" (71 items)

    Typing errors L1: NMI test (100 words / 3 min 30 sec) (two groups)

    Dictation L2: 4th grade: 12 units with 2-4 words (32 words); 8th grade: 10 units with 3-11 words (52 words); Gymnasium: 12 units with 3-11 words (77 words)

  • Instruments in DIALUKI STUDY ONE

    English as a Foreign Language

    Individual tasks (1)

  • PSYCHOLINGUISTIC AND COGNITIVE TASKS

    Groups: 4th grade (age 10-11), 8th grade (age 14-15), Gymnasium (age 17-18)

    Backwards digit span L1: all groups: 2-8 digits, 14 items (numbers 1-9)

    Backwards digit span L2: all groups: 2-5 digits, 8 items (numbers 1-6)

    Rapidly presented words L1: all groups: 14 words (2-8 letters)

    Rapidly presented words L2: 4th grade: 8 words (2-4 letters); 8th grade: 12 words (2-9 letters); Gymnasium: 12 words (2-9 letters)

    List reading L1: all groups: 105 words, time limit 60 sec (Lukilasse)

    List reading L2: all groups: 105 words, time limit 60 sec

    Non-word reading L1 (mlkenti): all groups: 10 non-words with 3-4 syllables

    Non-word reading L2 (kipthirm): 10 non-words (Snowling et al 1996: Graded Nonword Reading Test) (two groups)

    Non-word repetition L1 (vrelyytti): all groups: 10 non-words with 2-5 syllables

  • PSYCHOLINGUISTIC AND COGNITIVE TASKS

    Groups: 4th grade (age 10-11), 8th grade (age 14-15), Gymnasium (age 17-18)

    Non-word repetition L2 (bassodoke): all groups: 10 non-words (selected from Gupta et al 2005)

    Non-word spelling L1 (peunumiile): all groups: 12 non-words with 4 syllables

    Phoneme deletion L1 (hamsa → hama): all groups: 12 non-words with 1-3 syllables

    Phoneme deletion L2 (nolcrid → olcrid): 4th grade: 8 non-words; 8th grade: 10 non-words; Gymnasium: 10 non-words

    Common unit L1 (lauhkua - terike): all groups: 10 pairs of non-words

    Common unit L2 (filk - maf): 10 pairs of non-words (two groups)

    Rapid automatic naming L1: all groups: mixed list of numbers, letters and colours (50 items)

    Rapid automatic naming L2: 4th grade: mixed list of numbers, colours and objects (30 items); 8th grade: mixed list of numbers, letters and colours (50 items); Gymnasium: mixed list of numbers, letters and colours (50 items)

  • Example Instruments

  • Reading rapidly presented words

    ***

  • Reading rapidly presented words

    day

  • Reading rapidly presented words

    %

  • Cognitive and psycholinguistic tasks (2)

    RAN Rapid Automatized Naming L1 and FL

    Mixed stimuli:

    numbers, letters and colours (L1)

    numbers, objects and colours (FL)

  • Backward digit span memory test in L1 and FL

    repeat the numbers you hear but backwards
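
    The slide does not say how the backward digit span task was scored. As a minimal sketch, assuming each trial is a pair of presented and recalled digit sequences, one plausible scoring counts the correctly reversed trials and records the longest sequence reversed without error. The function names and the scoring rule are illustrative assumptions, not the DIALUKI procedure.

```python
def is_correct(presented, recalled):
    """True if the recalled digits are the presented digits in reverse order."""
    return list(recalled) == list(reversed(presented))


def score_backward_span(trials):
    """Return (number of correct trials, length of longest correct trial).

    `trials` is a list of (presented, recalled) pairs. Both scoring rules
    are assumptions made for illustration only.
    """
    correct = [presented for presented, recalled in trials
               if is_correct(presented, recalled)]
    longest = max((len(p) for p in correct), default=0)
    return len(correct), longest


trials = [
    ([3, 7], [7, 3]),              # reversed correctly
    ([1, 9, 4], [4, 9, 1]),        # reversed correctly
    ([2, 5, 8, 6], [6, 8, 2, 5]),  # wrong order
]
print(score_backward_span(trials))
```

    Reporting both the count and the longest span keeps the two common conventions side by side, so either can be used as a predictor variable.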

  • Rapid reading (aloud) of a list of real L1 words

    read as many as you can in one minute

  • L1 1. viepere 6. kylmnsi

    2. larvaanto 7. hiemakkola

    3. mlkenti 8. sertsapeivo

    4. seivolssi 9. vaastiloima

    5. euksatus 10. ahkontalsi

    Non-word reading task

    L2 1. hast 6. tegwop

    2. mosp 7. molsmit

    3. prab 8. twamket

    4. gromp 9. hinshink

    5. trolb 10. kipthirm

  • L1 1. seitu 6. peunivatna

    2. ronksa 7. ysipulentti

    3. minksakka 8. restomeliitti

    4. kletsoma 9. plotiskntsingis

    5. vrelyytti 10. intjirinanttiin

    Non-word repetition task

    L2 1. bassim 6. kotiesote

    2. peggut 7. doosennane

    3. bipup 8. keegulol

    4. gaypoom 9. beenodoofop

    5. bassodoke 10. daysomaysice

  • L1 1. lauhkua terike 6. vaaso leikua

    2. mustele kyhinty 7. hirattu vnkki

    3. vommiras thmykkyyn 8. kanttuuso vyyrt

    4. tookselo murlain 9. aamestus hilpialli

    5. vapi lumpe 10. tlkys angilme

    Common Unit task

    L2 1. mip pank 6. madast wordle

    2. auk honch 7. prinkle mapgom

    3. skey twisp 8. sloskon nagar

    4. brang peb 9. larsk mambron

    5. kelpit membro 10. filk maf

  • L1 1. Tauk → auk 7. mesTo → meso

    2. Hok → ok 8. puLke → puke

    3. Peuk → euk 9. kelaMpa → kelapa

    4. gooK → goo 10. makalTo → makalo

    5. hamSa → hama 11. sinepTe → sinepe

    6. pokRi → poki 12. halneSko → halneko

    Phoneme deletion task

    L2 1. kisP → kis 6. stanseRt → stanset

    2. Drant → rant 7. dockOAn → dockn

    3. Apren → pren 8. pronaTE → prona

    4. balraS → balra 9. driggLE → drigg

    5. Nolcrid → olcrid 10. norCH → nor
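
    In the items above, the capitalised letters mark the sound to be deleted, and the second form is the expected answer. A minimal sketch of that convention, assuming written forms stand in for the spoken stimuli and responses (the real task is oral, and the function names are hypothetical):

```python
def expected_response(stimulus):
    """Drop the capitalised letters (the phoneme to delete), keep the rest."""
    return "".join(ch for ch in stimulus if not ch.isupper())


def count_correct(stimuli, responses):
    """Count responses that match the expected form, ignoring case and spaces."""
    return sum(
        expected_response(s) == r.strip().lower()
        for s, r in zip(stimuli, responses)
    )


# Items taken from the slide.
for stimulus in ["hamSa", "Nolcrid", "dockOAn", "norCH"]:
    print(stimulus, "->", expected_response(stimulus))
```

    Exact string matching is the simplest possible criterion; a transcription-based scoring of spoken answers would need a more tolerant comparison.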

  • Example:

    |thepigsweresohappytheysangthissong|

    |the|pigs|were|so|happy|they|sang|this|song|

    Task: |sothenextdaythethreelittlepigslefthomethefirstpigmadeahomefromstrawthesecondpig|

    |madeahomefromsticksbutthethirdpigwascleverhemadehishomefrombricksonedaythebig|

    |badwolfcametothestrawhouseheknockedonthedoor|

    Segmentation task in L2 (4th graders' version)
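
    The slides do not state how segmentation responses were scored. As a sketch under that caveat, one simple approach compares the word boundaries a learner marks with those in the target segmentation. The function names and the precision/recall measures are assumptions chosen for illustration.

```python
def boundaries(segmented, sep="|"):
    """Return the character offsets (in the unsegmented text) of each separator."""
    offsets = set()
    position = 0
    for ch in segmented:
        if ch == sep:
            offsets.add(position)
        else:
            position += 1
    return offsets


def segmentation_scores(target, response, sep="|"):
    """Return (hits, precision, recall) over word-internal boundaries."""
    text = target.replace(sep, "")
    if response.replace(sep, "") != text:
        raise ValueError("target and response must contain the same letters")
    outer = {0, len(text)}  # the line-initial and line-final bars are given
    gold = boundaries(target, sep) - outer
    marked = boundaries(response, sep) - outer
    hits = len(gold & marked)
    precision = hits / len(marked) if marked else 0.0
    recall = hits / len(gold) if gold else 0.0
    return hits, precision, recall


target = "|the|pigs|were|so|happy|they|sang|this|song|"
response = "|the|pigs|were|sohappy|they|sang|this|song|"  # one boundary missed
print(segmentation_scores(target, response))
```

    Separating precision from recall distinguishes a learner who inserts too many boundaries from one who misses them, which a single accuracy figure would hide.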

  • L2 Vocabulary OSA 1 2000

    1 birth 2 dust 3 operation 4 row 5 sport 6 victory

    ___ urheilu

    ___ voitto

    ___ syntyminen

    1 adopt 2 climb 3 examine 4 pour 5 satisfy 6 surround

    ___ kiivetä, nousta

    ___ katsoa tarkasti

    ___ olla joka puolella

    1 choice 2 crop 3 flesh 4 salary 5 secret 6 temperature

    ___ lämpö

    ___ liha

    ___ palkka

    1 bake 2 connect 3 inquire 4 limit 5 recognize 6 wander

    ___ yhdistää

    ___ kävellä ilman päämäärää

    ___ rajoittaa

    1 cap 2 education 3 journey 4 parent 5 scale 6 trick

    ___ koulutus

    ___ asteikko

    ___ matka

    1 burst 2 concern 3 deliver 4 fold 5 improve 6 urge

    ___ särkyä, puhjeta

    ___ tehdä paremmaksi

    ___ viedä jotakin jollekulle

    1 attack 2 charm 3 lack 4 pen 5 shadow 6 treasure

    ___ aarre

    ___ lumous, viehätysvoima

    ___ puuttua, olla vailla jotakin

    1 original 2 private 3 royal 4 slow 5 sorry 6 total

    ___ alkuperäinen

    ___ yksityinen

    ___ yhteensä

    1 cream 2 factory 3 nail 4 pupil 5 sacrifice 6 wealth

    ___ kerma

    ___ rikkaudet, varallisuus

    ___ oppilas

    1 brave 2 electric 3 firm 4 hungry 5 local 6 usual

    ___ tavallinen

    ___ nälkäinen

    ___ urhea, rohkea

  • Motivation

    English Self-concept

    Intrinsic interest

    Instrumentality

    Motivational Intensity

    Parental Encouragement

    Self-regulation

    Anxiety

  • ENGLISH SELF-CONCEPT

    Compared to other students, I'm good at English

    I have always done well in English.

    Studying English is easy for me.

    I get good marks in English.

    I learn English quickly.

    I'm better at English than most of my classmates.

    Items dropped

    I am hopeless when it comes to English

    I am satisfied with how well I do in English.

  • Independent variables: Students

    How much homework do you normally do during a normal school day?

    o Not at all

    o Half an hour or less a day

    o From half an hour to an hour a day

    o 1-2 hours a day

    o Over 2 hours a day

    How do you feel about reading in your free time?

    o I like reading a lot

    o I like reading somewhat

    o I don't like reading

  • Independent variables: Students

    How often do you read the following things in your free time?

    Response options: Daily or nearly daily / 1-2 times a week / 1-2 times a month / Rarely or never

    I read

    a) text messages

    b) email

    c) Facebook or Twitter conversations

    d) messages in chats (e.g. MSN, IRC)

    e) internet chat forums

    f) blogs or home pages

    g) news or other newspaper articles online

    h) online non-fiction texts (e.g. Wikipedia)

  • Independent variables: Parents

    Parents' education (answered for a) child's mother and b) child's father)

    1 Compulsory school

    2 Vocational school or institute

    3 Gymnasium

    4 Bachelor's degree

    5 Master's degree

    Parents' occupation (answered for a) child's mother and b) child's father)

    1 Working

    2 Retired

    3 Student

    4 Housewife/husband

    5 Unemployed

  • Independent variables: Parents

    Before the child learned to read, was somebody in the family engaged in the following activities with the child?

    Response scale: 1 = Rarely or never; 2 = 1-2 times a month; 3 = 1-2 times a week; 4 = Every day or nearly every day

    a) Read books or told stories

    b) Talked about everyday activities or events

    c) Sang

    d) Played with letter-toys (e.g. blocks)

    e) Played word-games

    f) Wrote letters or words

    g) Read aloud signs or labels

  • Structural Equation Modelling (SEM) Cognitive variables, 4th graders

    Three latent variables (path model)

  • Structural Equation Modelling (SEM) Cognitive variables, Gymnasium

    Three latent variables (path model)

  • 4th Grade

    Dependent variable: Pearson Young Learners Test in English

    Adjusted R Squared: .526 (53% variance)

    First IV: Size of English Vocab (.664)

    Second IV: Writing in L1 Finnish (.419)

    Third IV: L2 segmentation accuracy (-.584)

    Fourth IV: L1 Finnish Reading (ALLU) (.403)

    8th Grade

    Dependent variable: Pearson General + DIALANG Medium

    Adjusted R Squared: .671 (67% variance)

    First IV: Size of English Vocab (.740)

    Second IV: Writing in English (.696)

    Third IV: L2 segmentation accuracy (-.641)

    Fourth IV: Size of Finnish Vocab (.282)

    Gymnasium

    Dependent variable: Pearson General + DIALANG Advanced

    Adjusted R Squared: .708 (71% variance)

    First IV: English dictation (.795)

    Second IV: Size of English Vocab (.747)

    Third IV: L1 Finnish Reading (PISA) (.418)

    Fourth IV: L2 segmentation accuracy (-.677)

    Fifth IV: Writing in English (.680)
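
    For readers unfamiliar with the statistic, the standard definition of adjusted R squared is given below. The symbols are not from the slides: n is assumed to be the number of learners in the group and p the number of independent variables in the model. The "% variance" figures appear to be the adjusted value expressed as a rounded percentage (for example, .526 becomes 53%).

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
\text{\% variance} \approx 100 \times \bar{R}^2
```

    The correction factor shrinks R squared as more predictors are added, which matters when models with four and five independent variables are compared.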

  • Positive or negative?

    On teaching? On learning?

    On content? On method?

    On rate and sequence of learning?

    On degree and depth of learning?

    On attitudes?

    On all teachers and learners?

    On some teachers and learners?

  • Curriculum: contents of curriculum, timetabling

    Teaching materials: choice of textbooks, use of past papers, teacher-made materials

    Teaching methods: choice of methods, teaching of test-taking skills

    Attitudes and feelings: of learners and teachers

    Learning: Do test results improve? Does learning improve?

  • What might be the possible unintended negative consequences of diagnostic testing?

    The fact is that so far we have no research into the washback or impact of diagnostic tests.

    Empirical research is urgently needed: How might such research be designed and conducted?

  • Thank you for your attention!
