![Page 1: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/1.jpg)
Evaluating validity of criterion-referenced test score interpretations and uses
Takaaki Kumazawa
Kanto Gakuin University
Kintai Bridge, Japan (wiki)
![Page 2: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/2.jpg)
Purpose
- The purpose of my talk is to evaluate the validity of criterion-referenced placement test score interpretations and uses, using Kane’s (2006) argument-based validity framework
- This presentation is based on a paper I published in the JALT Journal (http://jalt-publications.org/jj/issues/2013-05_35.1)
![Page 3: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/3.jpg)
Classical view of validity
- Validity: the extent to which a test measures what it is supposed to measure
- Three types of validity
  - Criterion-related validity: correlation between an established, valid measure and the test being developed
  - Content validity: experts’ judgment on whether items measure what they are supposed to measure
  - Construct validity: statistical examination of whether items measure what they are supposed to measure
![Page 4: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/4.jpg)
Current view of validity
- Validity is “the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests” (American Educational Research Association, American Psychological Association, & National Council on Measurement in Education [AERA, APA, & NCME], 1999, p. 9).
![Page 5: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/5.jpg)
Argument-based validity framework
- Interpretive argument: an argument that the inferences to be made from test performance are theoretically valid
- Validity argument: an evaluation of the interpretive argument that provides backing for its warrants
Observation → (scoring) → Observed score → (generalization) → Universe score → (extrapolation) → Target score → (decision) → Use
![Page 6: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/6.jpg)
Interpretive argument
- Scoring inference
  - To what extent do examinees answer placement items correctly, and do high-scoring examinees answer more placement items correctly?
- Generalization inference
  - To what extent are placement items consistently sampled from a domain and sufficient in number to reduce measurement error?
- Extrapolation inference
  - To what extent does the difficulty of the placement items match the objectives of a reading course?
- Decision inference
  - To what extent do the decisions that place examinees in their proper level of the course have an impact on washback in the course?
![Page 7: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/7.jpg)
Participants
- 428 Japanese first-year university students majoring in law
- TOEIC scores of about 250-450
- Three courses in the English program
  - Reading
  - Listening
  - TOEIC skills
- Proficiency-based program with three levels
  - Level 1: 60 high-scoring students. Major objective of the reading course: improve reading skills such as fast reading
  - Level 2: about 300 students
  - Level 3: 50 low-scoring students. Major objective of the reading course: re-learn junior high and high school grammar
![Page 8: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/8.jpg)
Criterion-referenced placement test
- Grammar (k = 40)
  - Items are taken from textbooks used in junior high and high schools
  - Grammar points: present, past, and future tenses, continuous, relative pronouns, gerunds, participles, etc.
  - Sample: Hi, I ( ) Ken.
    1. am 2. are 3. is 4. be
- Vocabulary (k = 40)
  - Items are taken from high-frequency words (1000 to 3000 word levels) based on the JACET 8000 corpus
  - Sample: Bring
    1. 送る (send) 2. 持ってくる (bring) 3. 鳴る (ring) 4. 購入する (buy)
- Reading (k = 10)
  - Two passages are taken from two textbooks used in Level 1 and Level 3 reading classes
  - Sample: How do they travel?
    1. by plane 2. by bus 3. by car 4. by train
![Page 9: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/9.jpg)
Procedures
- On the first day of the semester, the placement test was administered in 45 minutes
- A grammar pretest (k = 55, α = .85) was given on the first day of class in Level 2 classes (n = 51) and Level 3 classes (n = 49)
- Thirty 90-minute lessons were taught over two semesters
- The same grammar test was given as a posttest (α = .92) on the last day of class to the same students (n = 51, 49)
- A course evaluation survey was given to the same students (n = 51, 49)
![Page 10: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/10.jpg)
Backing for scoring inference
- Item facility
  - 7 items below .29
  - 62 items between .30 and .70
  - 21 items above .71
- Item discrimination
  - 4 items below .19
  - 86 items above .20
- Rasch item difficulty estimates
  - -3.79 to 2.33
- Infit MS
  - 0.80 to 1.30
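The item facility and item discrimination counts above can be reproduced from a scored response matrix. The following is a minimal sketch, not the study's own analysis: the slides do not say which discrimination index was used, so the upper-lower index (top third minus bottom third) is an assumption, and the function names and toy data are hypothetical.

```python
from typing import List

Matrix = List[List[int]]  # rows = examinees, columns = items, entries 0 or 1


def item_facility(responses: Matrix) -> List[float]:
    """Proportion of examinees who answered each item correctly."""
    n_examinees = len(responses)
    n_items = len(responses[0])
    return [sum(row[j] for row in responses) / n_examinees for j in range(n_items)]


def item_discrimination(responses: Matrix, fraction: float = 1 / 3) -> List[float]:
    """Upper-lower index (assumed): facility in the top group minus the bottom group."""
    ranked = sorted(responses, key=sum, reverse=True)  # highest total score first
    size = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:size], ranked[-size:]
    return [u - l for u, l in zip(item_facility(upper), item_facility(lower))]


def summarize(responses: Matrix) -> dict:
    """Count items falling in bands like those reported on the slide."""
    facility = item_facility(responses)
    discrimination = item_discrimination(responses)
    return {
        "IF below .30": sum(f < 0.30 for f in facility),
        "IF .30 to .70": sum(0.30 <= f <= 0.70 for f in facility),
        "IF above .70": sum(f > 0.70 for f in facility),
        "ID below .20": sum(d < 0.20 for d in discrimination),
        "ID .20 or above": sum(d >= 0.20 for d in discrimination),
    }


# Hypothetical toy data: 6 examinees x 4 items (not the study's responses).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(summarize(toy))
```

With the real 428 x 90 matrix in place of `toy`, the same call would give counts comparable to those listed above, provided the same index and band boundaries are used.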
![Page 11: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/11.jpg)
Backing for generalization inference
- Multivariate generalizability theory (decision study of a persons × items design)
  - Grammar (k = 40, ρ = .85, Φ = .83)
  - Vocabulary (k = 40, ρ = .86, Φ = .84)
  - Reading (k = 10, ρ = .58, Φ = .55)
  - Total (k = 90, ρ = .92, Φ = .91)
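For readers unfamiliar with the two coefficients, the sketch below shows how a decision study for a single persons × items design turns variance components into a generalizability coefficient (ρ, relative decisions) and a dependability index (Φ, absolute decisions). It is a simplified univariate illustration, whereas the study used multivariate generalizability theory, and the variance components in the example are hypothetical values chosen only to show the effect of test length.

```python
from typing import Tuple


def d_study(var_person: float, var_item: float, var_residual: float, k: int) -> Tuple[float, float]:
    """Return (rho, phi) for a persons x items design with k items.

    rho uses relative error (residual variance only);
    phi uses absolute error (item variance plus residual variance).
    """
    relative_error = var_residual / k
    absolute_error = (var_item + var_residual) / k
    rho = var_person / (var_person + relative_error)
    phi = var_person / (var_person + absolute_error)
    return rho, phi


# Hypothetical variance components (not taken from the study).
components = dict(var_person=0.04, var_item=0.02, var_residual=0.18)

for k in (10, 40, 90):
    rho, phi = d_study(k=k, **components)
    print(f"k = {k:2d}: rho = {rho:.2f}, phi = {phi:.2f}")
```

Because both error terms are divided by k, a short section such as the 10-item reading section will tend to show lower ρ and Φ than the 40-item sections, and Φ can never exceed ρ.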
![Page 12: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/12.jpg)
Backing for extrapolation inference
Difficulty level estimates

| Level | Difficulty | SE | Infit MS |
|---|---|---|---|
| Junior High grammar | -0.65 | 0.03 | 1.00 |
| High School grammar | 0.29 | 0.02 | 1.00 |
| 1000 word level vocab | -0.94 | 0.03 | 1.00 |
| 2000 word level vocab | 0.15 | 0.03 | 1.00 |
| 3000 word level vocab | 0.12 | 0.05 | 1.00 |
| Level 3 reading | 0.30 | 0.05 | 1.00 |
| Level 1 reading | 0.73 | 0.05 | 1.10 |

FACETS map
- Cut point for Level 1: Level 1 reading
- Cut point for Level 3: Junior High grammar and 1000 word level
[Figure: FACETS variable map. Vertical logit scale (Measr) from +3 to -4, with columns for students (* = 4), items (* = 1), and levels. Cut points marked for Levels 1, 2, and 3: Level 1a (1.49), Level 1b (.77), Level 2 (.77 to -.70), Level 3a (-.70), Level 3b (-.99). Item groups labeled on the map: Level 1 Reading, Basic HS Grammar, JACET2000, JACET3000, Jr H Grammar, JACET1000.]
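One way to read the map is to ask how likely a student sitting exactly at a cut point is to answer items from each content group correctly. The sketch below uses the simple dichotomous Rasch model with the difficulty estimates from the table and the .77 and -.70 cut points from the map. It ignores any other facets in the FACETS analysis, treats Levels 1a/1b and 3a/3b as single levels, and handles ties at the boundaries in an arbitrary way; all of these are assumptions made for illustration.

```python
import math

# Difficulty estimates (logits) as reported in the table above.
LEVEL_DIFFICULTY = {
    "Junior High grammar": -0.65,
    "High School grammar": 0.29,
    "1000 word level vocab": -0.94,
    "2000 word level vocab": 0.15,
    "3000 word level vocab": 0.12,
    "Level 3 reading": 0.30,
    "Level 1 reading": 0.73,
}

CUT_LEVEL_1 = 0.77   # at or above: Level 1 (boundary handling is an assumption)
CUT_LEVEL_3 = -0.70  # at or below: Level 3 (boundary handling is an assumption)


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct answer under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def place(ability: float) -> int:
    """Assign a course level from a student's ability estimate in logits."""
    if ability >= CUT_LEVEL_1:
        return 1
    if ability <= CUT_LEVEL_3:
        return 3
    return 2


for label, cut in (("Level 1 cut", CUT_LEVEL_1), ("Level 3 cut", CUT_LEVEL_3)):
    print(label)
    for group, difficulty in LEVEL_DIFFICULTY.items():
        print(f"  {group}: {rasch_probability(cut, difficulty):.2f}")
```

Under these assumptions, a student at the Level 1 cut point has roughly an even chance on Level 1 reading items, and a student at the Level 3 cut point has roughly an even chance on junior high grammar items, which is consistent with the cut-point rationale given on the slide.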
![Page 13: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/13.jpg)
Backing for decision inference
Level 2 and Level 3 students’ (n = 51, 49) grammar pretest and posttest scores (k = 55)

| Class level | Pretest (α = .85) n | M | SD | Posttest (α = .92) n | M | SD |
|---|---|---|---|---|---|---|
| Level 2a | 26 | 30.38 | 6.34 | 21 | 12.14 | 2.50 |
| Level 2b | 25 | 32.36 | 8.47 | 24 | 28.63 | 7.93 |
| Level 2 | 51 | 31.35 | 7.45 | 45 | 20.93 | 10.24 |
| Level 3c | 25 | 20.80 | 5.09 | 22 | 26.82 | 5.21 |
| Level 3d | 24 | 19.88 | 4.29 | 23 | 26.78 | 5.95 |
| Level 3 | 49 | 20.35 | 4.69 | 45 | 26.80 | 5.53 |

- Pretest: Level 2 students scored higher
- Posttest: Level 3 students scored higher
- Level 2: 11 points down
- Level 3: 6 points up
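The level-wide means and the pretest-to-posttest changes can be checked directly from the class-level rows. The sketch below is only an arithmetic check on the reported figures: it pools class means weighted by n and subtracts pretest from posttest. The variable and function names are hypothetical, and no significance test is implied.

```python
from typing import Dict, List, Tuple

ClassStats = List[Tuple[int, float]]  # (n, mean) for each class in a level


def pooled_mean(classes: ClassStats) -> float:
    """Mean across classes, weighted by class size."""
    total_n = sum(n for n, _ in classes)
    return sum(n * mean for n, mean in classes) / total_n


# Class-level n and M as reported in the table above.
pretest: Dict[str, ClassStats] = {
    "Level 2": [(26, 30.38), (25, 32.36)],
    "Level 3": [(25, 20.80), (24, 19.88)],
}
posttest: Dict[str, ClassStats] = {
    "Level 2": [(21, 12.14), (24, 28.63)],
    "Level 3": [(22, 26.82), (23, 26.78)],
}

gains = {
    level: pooled_mean(posttest[level]) - pooled_mean(pretest[level])
    for level in pretest
}

for level, gain in gains.items():
    print(f"{level}: pre {pooled_mean(pretest[level]):.2f}, "
          f"post {pooled_mean(posttest[level]):.2f}, change {gain:+.2f}")
```

The pooled means match the Level 2 and Level 3 rows of the table, and the changes come to about -10.4 points for Level 2 and about +6.5 points for Level 3, which the slide summarizes as 11 points down and 6 points up.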
![Page 14: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/14.jpg)
Validity argument
- Scoring inference
  - Interpretive argument: To what extent do examinees answer placement items correctly, and do high-scoring examinees answer more placement items correctly?
  - Validity argument: Because most items were working well, the inference from observation to the observed score was valid
- Generalization inference
  - Interpretive argument: To what extent are placement items consistently sampled from a domain and sufficient in number to reduce measurement error?
  - Validity argument: Because dependability was high and measurement error was small, the inference from the observed score to the universe score was valid
- Extrapolation inference
  - Interpretive argument: To what extent does the difficulty of the placement items match the objectives of a reading course?
  - Validity argument: Because the difficulty of the items was appropriate to the objectives of the program, the inference from the universe score to the target score was valid
- Decision inference
  - Interpretive argument: To what extent do the decisions that place examinees in their proper level of the course have an impact on washback in the course?
  - Validity argument: Because Level 3 students were placed in the right level and were able to improve their grammar test scores, the inference from the target score to test use was valid
![Page 15: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/15.jpg)
Conclusion
- “Validation is simple in principle, but difficult in practice. The argument-based framework provides a relatively pragmatic approach to validation” (Kane, 2012, p. 15).
William Jolly Bridge, Brisbane (wiki)
![Page 16: Ail apresentation(kumazawa)](https://reader034.vdocument.in/reader034/viewer/2022042522/559b66d41a28ab2b3c8b47b3/html5/thumbnails/16.jpg)
References
- Kane, M. (2006). Validation. In R. Brennan (Ed.), Educational measurement (4th ed., pp. 17-64). Westport, CT: Greenwood Publishing.
- Kane, M. (2012). Validating score interpretations and uses. Language Testing, 29, 3-17. doi: 10.1177/0265532211417210
- Kumazawa, T. (2013). Evaluating validity for in-house placement test score interpretations and uses. JALT Journal, 35, 73-100.