
How Multiple Choice Items Distort Test Takers' Results in Tests of Structure in Thailand

A presentation by Mick Currie and Nanta Chiramanee

Prince of Songkla University, Hatyai

14th December 2007


Overview

The context of the study: Thailand

Asia and the EFL/ESL world

Previous studies

Our study: Subjects

Methodology

Findings

Implications

Your questions and comments


Thailand has not been able to develop widespread communicative skills in English among its population.

Studies have repeatedly found that many teachers do not teach English as a communicative skill.

e.g. Musigrunsi (2002), Prapaisit (2003), Thongsri (2005)

All three identified tests as one of the reasons.


What is being tested? Communicative skills or grammar?

And how is it being tested?

Currie (2007): 97% of students interviewed had been tested in grammar; fewer than 60% had been tested in writing or speaking.

Tests overwhelmingly used the multiple choice format.

Upshur and Palmer (cited in Canale and Swain, 1980): The measurement of linguistic accuracy in Thai students is not an accurate predictor of their ability to communicate.

Knox (1996): “If the ability to communicate in English is to be taught (and more importantly to be learned) it is vital that this ability also be tested.”


Thailand is not alone.

Korea: Li (1998)

Japan: Gorsuch (2000)

- Multiple choice university entrance examinations affect the way teachers teach and students want to learn

China: Liu (2007)

- Tests in China concentrate on ‘linguistic competence’

- Multiple choice is the main method in high stakes tests taken annually by more than 8,000,000 students


Little research conducted into whether multiple choice tests effectively assess linguistic knowledge

No published studies comparing stem equivalent items in tests of structure

Rodriguez (2003): a meta-analysis of studies into construct equivalence. Found high correlations in stem equivalent items and identified differences in effects in different domains.

Pike (1979) compared constructed response and multiple choice formats by correlations of reliability (not stem equivalent)

Shohamy (1984) compared stem equivalent items for reading

Cheng (2004) compared stem equivalent items in listening

Both found large format effects induced by multiple choice items


Our research methodology

Subjects: 152 1st and 3rd year students from Prince of Songkla University

Instruments:

A short answer test with 40 structure items

3 multiple choice tests in 3, 4 and 5 option format

A post test questionnaire: why did subjects change their answer on one selected item?

Procedure:

All subjects sat the short answer test

Groups of 52, 55 and 45 sat the 3, 4 and 5-option multiple choice tests 5 to 6 weeks later

Post test questionnaire after the 2nd test


Our research methodology

Example of item construction method. Structure section item # 10

Constructed response (short answer) item: Stem only

Man: ———— a bank in the university? Student: Yes, it’s opposite the science faculty.

Multiple choice, 3, 4 and 5 option items: Stem and options

Man: ———— a bank in the university? Student: Yes, it’s opposite the science faculty.

Options offered in each format (figures in parentheses are the numbers of subjects who gave that answer in the short answer test):

3-option: a. Is; b. Where is; c. Is there*

4-option: a. Is; b. Where; c. Where is; d. Is there*

5-option: a. Is (19); b. Is there* (18); c. Where is (39); d. Where (17); e. Have (17)

* Expected response


Our research methodology

Analysis:

Comparison of subjects' scores between the two tests

Comparison of item performance

Direct comparison of subjects' responses (item by item) in the two tests

Controls:

Control group

Control items (4-option) in all m/choice tests

Criterion referenced test data: 1st year subjects, O-net scores; 3rd year subjects, mid term test

Established: Groups of equal ability and no practice effect


Our Findings:

Comparison of multiple choice test scores with O-net scores

Measure          O-net     Study     Correlation   t value
Control items    44.69%    44.00%    0.493**       0.538
Composite m/c    44.69%    42.80%    0.710**       2.083*

*significant at p < 0.05, **significant at p < 0.001 (df=156) (df=157)


Our Findings:

                            Short answer    3-option           4-option           5-option
                            test            m/choice test      m/choice test      m/choice test
Group n                     152             52                 55                 45
Mean score (31 items)       19.38%          52.23%             46.45%             45.10%
Min-max                     0 – 83.87%      12.90 – 90.32%     9.68 – 96.77%      9.68 – 90.32%
Reliability (alpha)         0.880           0.872              0.890              0.885
Mean item facility          0.19            0.52               0.46               0.45
Mean discrimination         0.45            0.45               0.48               0.54
Individual difference:                      33.33%             26.86%             25.74%
  2nd test over 1st test                    (10.23 items)      (8.33 items)       (7.98 items)
Min-max increase                            3.23 – 67.74%      6.45 – 51.61%      0 – 51.61%
t value (1st/2nd test)                      20.646* (df=51)    18.050* (df=54)    15.206* (df=51)

f value (ANOVA) on m/choice tests: 1.472 (df = 2 & 149)

*significant at p < 0.001

[Figure: Individual scores: 1st & 2nd test. Groups: 3-option, 4-option, 5-option. Series: 1st test (short answer), 2nd test (multiple choice). Score axis: 0 to 30.]

[Figure: Individual scores: 1st test vs 2nd test. Horizontal axis: score in 1st test (0 to 30). Vertical axis: score in 2nd test (0 to 30). Series: 3-option, 4-option, 5-option.]

Our Findings:

Correlation: 1st test score with 2nd test score

Test group   Correlation   Corrected for attenuation   Significance (p <)   df
3-option     0.842         0.966                       0.001                50
4-option     0.880         0.980                       0.001                53
5-option     0.886         exceeds 1                   0.001                43

Correlation: 1st test score with increase in score

Test group   Correlation   Corrected for attenuation   Significance (p <)   df
3-option     0.133         -                           -                    50
4-option     0.149         -                           -                    53
5-option     0.426         -                           0.01                 43
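
The "corrected for attenuation" figures can be illustrated with Spearman's standard correction, which divides the observed correlation by the square root of the product of the two tests' reliabilities. The sketch below is a minimal illustration, not the presenters' own calculation. It assumes the alpha reliabilities reported in the summary table, and it applies the whole-sample alpha of the short answer test (0.880) to every group because per-group values are not given. Its results therefore approximate, but do not exactly reproduce, the corrected figures above. The function name is hypothetical.

```python
import math


def correct_for_attenuation(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Spearman's correction: observed r divided by sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)


# Assumption: the whole-sample alpha of the short answer test is used for every group.
SHORT_ANSWER_ALPHA = 0.880

# (observed correlation, alpha of the multiple choice test), as reported in the slides
groups = {
    "3-option": (0.842, 0.872),
    "4-option": (0.880, 0.890),
    "5-option": (0.886, 0.885),
}

corrected = {
    name: correct_for_attenuation(r, SHORT_ANSWER_ALPHA, alpha_mc)
    for name, (r, alpha_mc) in groups.items()
}

for name, value in corrected.items():
    note = " (exceeds 1)" if value > 1 else ""
    print(f"{name}: {value:.3f}{note}")
```

A corrected value above 1, as in the 5-option group, can arise when the reliabilities are underestimated or when sampling error is present, so it is usually read as "effectively perfect" rather than as a literal correlation.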


Our Findings:

Code   1st test response   Option available or not available in 2nd test   Response in 2nd test
A      no answer                                                            incorrect
B      no answer                                                            correct
C      incorrect           available                                        incorrect (different from 1st test)
D      incorrect           available                                        incorrect (same as 1st test)
E      incorrect           available                                        correct
F      correct             available                                        incorrect
G      correct             available                                        correct
H      incorrect           not available                                    incorrect
J      incorrect           not available                                    correct
K      acceptable          not available                                    correct
L      acceptable          not available                                    incorrect


Our Findings:

Pattern   3-option group   4-option group   5-option group   Overall
A         2.85%            2.58%            2.94%            2.78%
B         1.99%            1.88%            1.36%            1.76%
C         5.89%            11.61%           13.05%           10.08%
D         13.40%           11.67%           12.69%           12.56%
E         11.66%           11.03%           10.97%           11.23%
F         2.23%            2.76%            2.72%            2.57%
G         14.45%           14.43%           14.19%           14.37%
H         22.52%           24.16%           22.65%           23.15%
J         22.21%           17.07%           17.06%           18.82%
K         1.67%            1.99%            1.22%            1.66%
L         1.12%            0.82%            1.15%            1.02%

Correlations (df = 9)

3/4 option: 0.950***

3/5 option: 0.939***

4/5 option: 0.995***

ANOVAs (df = 2 & 149)

C: f = 13.078***

J: f = 5.757**

**significant at p < 0.01

***significant at p < 0.001
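
The between-group correlations can be recomputed from the pattern percentages in the table. The sketch below assumes they are Pearson correlations across the 11 response patterns, which would give df = 11 - 2 = 9; the function name `pearson_r` is illustrative only. Because the percentages are rounded, the results may differ slightly from the reported values in the third decimal place.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Percentage of responses per pattern (A, B, C, D, E, F, G, H, J, K, L), from the table
three_option = [2.85, 1.99, 5.89, 13.40, 11.66, 2.23, 14.45, 22.52, 22.21, 1.67, 1.12]
four_option = [2.58, 1.88, 11.61, 11.67, 11.03, 2.76, 14.43, 24.16, 17.07, 1.99, 0.82]
five_option = [2.94, 1.36, 13.05, 12.69, 10.97, 2.72, 14.19, 22.65, 17.06, 1.22, 1.15]

r_34 = pearson_r(three_option, four_option)
r_35 = pearson_r(three_option, five_option)
r_45 = pearson_r(four_option, five_option)

print(f"3/4 option: {r_34:.3f}")
print(f"3/5 option: {r_35:.3f}")
print(f"4/5 option: {r_45:.3f}")
```

The 4- and 5-option distributions are almost identical, while the 3-option group departs from them mainly on patterns C and J, the two patterns singled out by the ANOVAs.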


Our Findings:

Columns: 3-option high, 4-option high, 5-option high, 3-option mid., 4-option mid., 5-option mid., 3-option low, 4-option low

4-option high 0.985**

5-option high 0.991** 0.988**

3-option mid. 0.675*

4-option mid. 0.430 0.866**

5-option mid. 0.512 0.860** 0.990**

3-option low 0.167 0.719*

4-option low 0.039 0.858** 0.970**

5-option low 0.062 0.866** 0.915** 0.974**

**significant at p < 0.001; df = 9


Our Findings:

High ability subjects:

Maintained more correct and fewer incorrect answers between tests

Selected the correct response 3 times out of 4 when their answer from the 1st test was not among the options in the multiple choice test

High and middle ability subjects:

Were twice as likely as low ability subjects to switch from their incorrect answer in the first test to the correct option

Low ability subjects:

Selected an incorrect response 3 times out of 4 when their answer from the 1st test was not among the options in the multiple choice test


Our Findings:

Subjects in the 3-option group:

Were more successful than subjects in the 4 and 5-option groups at selecting the correct response when their answer from the first test was not among the options (Pattern J)

Switched between incorrect options half as often as subjects in the 4 and 5-option groups when their original response was among the options (Pattern C)

But: Overall, the number of options had very little effect


Our Findings:

Why did the subjects change their answers between the two tests?

Knowledge

Learning

Cued recall

Test taking strategy/technique

Blind guessing


Our Findings:

First test:

Lack of knowledge 2 4 2 1 6 1 1 2

Incorrect partial knowledge 7 1 16 11 6 12 6 17 2 1

Incorrect knowledge 3 1 3 6 3

Correct partial knowledge 1 1 1 1 1

Correct knowledge 2 1 1

Ineffective test taking strategies 1 1 1

Ineffective blind guessing 1 1 1 1 1

Poor test technique 2 2

Second test: Incorrect partial knowledge; Ineffective learning; Correct partial knowledge; Effective learning; Ineffective cued recall; Effective cued recall; Ineffective test taking strategies; Effective test taking strategies; Ineffective blind guessing; Effective blind guessing; Poor test taking technique


What are we to make of these results?

Our Conclusions:

The multiple choice format enabled the subjects to achieve higher scores than the short answer test

[Figure: scores in the 1st test (short answer) and 2nd test (multiple choice) for the 3-option, 4-option and 5-option groups; score axis 0 to 30]


The improvements were only weakly correlated with scores on the first test, suggesting that language ability was not responsible for the improvement

Correlation: 1st test score/increase in score

3-option 0.133

4-option 0.149

5-option 0.426
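
One way to read "weakly correlated" is through the coefficient of determination, r squared: the proportion of variance in the score increase that is shared with the first test score. The short sketch below is an illustration added here, not part of the presenters' analysis; it only squares the three correlations reported above.

```python
# Correlations between 1st test score and increase in score, as reported above
correlations = {"3-option": 0.133, "4-option": 0.149, "5-option": 0.426}

# r squared expressed as a percentage of shared variance
shared_variance = {group: round(r ** 2 * 100, 1) for group, r in correlations.items()}

for group, pct in shared_variance.items():
    print(f"{group}: r = {correlations[group]:.3f}, shared variance = {pct}%")
```

On these figures, first test performance accounts for roughly 2% of the variance in score gains in the 3 and 4-option groups and about 18% in the 5-option group.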

The high correlations between the two tests are misleading

Correlation: 1st test score with 2nd test score

Test group   Correlation   Corrected for attenuation   Significance (p <)   df
3-option     0.842         0.966                       0.001                50
4-option     0.880         0.980                       0.001                53
5-option     0.886         exceeds 1                   0.001                43

Pattern   3-option group   4-option group   5-option group   Overall
A         2.85%            2.58%            2.94%            2.78%
B         1.99%            1.88%            1.36%            1.76%
C         5.89%            11.61%           13.05%           10.08%
D         13.40%           11.67%           12.69%           12.56%
E         11.66%           11.03%           10.97%           11.23%
F         2.23%            2.76%            2.72%            2.57%
G         14.45%           14.43%           14.19%           14.37%
H         22.52%           24.16%           22.65%           23.15%
J         22.21%           17.07%           17.06%           18.82%
K         1.67%            1.99%            1.22%            1.66%
L         1.12%            0.82%            1.15%            1.02%


Only around 27% of the answers from the 1st test were chosen from the multiple choice options in the 2nd test

73% of the 2nd test option selection was forced or induced by the test format
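
The 27% figure appears to correspond to the two patterns in which a subject's first test answer was available and was chosen again: D (same incorrect answer) and G (same correct answer). The sketch below checks that arithmetic against the overall column of the pattern table. Treating D and G as the source of the 27% is an assumption drawn from the pattern definitions, not something the slides state explicitly.

```python
# Overall percentage of responses per pattern, from the pattern table
overall = {
    "A": 2.78, "B": 1.76, "C": 10.08, "D": 12.56, "E": 11.23, "F": 2.57,
    "G": 14.37, "H": 23.15, "J": 18.82, "K": 1.66, "L": 1.02,
}

# Assumption: an answer is "retained" when the same response is given in both tests,
# i.e. pattern D (incorrect, same as 1st test) or pattern G (correct in both tests).
retained = overall["D"] + overall["G"]
not_retained = sum(overall.values()) - retained

print(f"Retained from 1st test: {retained:.2f}%")
print(f"Not retained:           {not_retained:.2f}%")
```

Under that reading, the remaining patterns (no answer, answer not available, or a switch to a different option) make up the 73% attributed to the test format.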


The multiple choice tests grossly distorted the measurement of the test takers' performance

What was actually being measured was largely the subjects’ ability to deal appropriately with the multiple choice format

Based on this study, the multiple choice format should not be used in tests of language structure


Your questions and comments are welcome.
