Validating a Productive Vocabulary Knowledge Test for Novice Japanese Learners of English
The 32nd JACET Summer Seminar, Kusatsu Seminar House, Gunma, August 20, 2004
Rie Koizumi (Doctoral Course, University of Tsukuba,
e-mail: [email protected])
1. Theoretical background
1.1 Receptive vocabulary & productive vocabulary (e.g., Nation, 2001)
1.2 Productive vocabulary tests:
The Lexical Frequency Profile (Laufer & Nation, 1995)
A vocabulary-size test of controlled productive ability (Laufer & Nation, 1999)
The Lex30 (Meara & Fitzpatrick, 2000)
Problem: These tests may not be good measures of the productive vocabulary of novice learners.
2. Purposes and research questions
2.1 Purposes:
(a) To develop a sufficiently valid productive vocabulary knowledge test (Productive VKT) for novice Japanese
learners of English that uses a translation method
(b) To examine if there is a difference in the degree of validity between two scoring methods of assessing
productive vocabulary knowledge
2.2 Why is translation used?
Nation (2001): “the use of the first language to convey and test word meaning is very efficient” (p. 351).
Chen & Leung (1989), Jiang (2000), Kawakami (1994), Matsumi (1993): In beginners' vocabulary
representation, the L1 word meaning mediates between the L2 word and the concept.
2.3 Two scoring methods
The ultimate purpose of using a productive vocabulary knowledge test: to investigate the relationship between
productive vocabulary knowledge and speaking performance
→ Hence the need to assess knowledge of the spoken form (pronunciation) in a paper-and-pencil test
Spoken and Written (SW) scoring method vs. Written (W) method
Table 1. Two scoring methods for assessing productive vocabulary knowledge
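SW      | W        | Operationalization in scoring                                                       | e.g., 桃     | e.g., 袋
--------|----------|-------------------------------------------------------------------------------------|-------------|---------
1 point | 1 point  | Matches exactly.                                                                    | peach       | bag
1 point | 0 points | The spelling is wrong, but the test taker seems to know the correct pronunciation. | pich, pici  | bog
0 points| 0 points | No answer, OR both the spelling and the pronunciation are wrong.                    | pice, peahi | big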
Note. In scoring with the SW method, a lenient rating policy was adopted. For example, beginners often confuse the letters b and d, so sounb earned one point when the target answer was sound. Other examples include lnch for lunch; weighth, wate, wait, waght for weight; oringe, oring for orange; and ceng, cenge for change.
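The two scoring rules in Table 1 can be sketched in code. This is only an illustration, not the scoring procedure actually used in the study: the `Rating` structure and its `pronunciation_known` flag are hypothetical names, and the flag simply stands in for the human rater's (lenient) judgment that a misspelling still reflects the correct pronunciation.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    """One rater's judgment of one written answer (hypothetical structure)."""
    answer: str                # what the test taker wrote ("" if left blank)
    target: str                # the expected English word
    pronunciation_known: bool  # rater's judgment, not computed automatically


def score_w(r: Rating) -> int:
    """W method: 1 point only when the written form matches exactly."""
    return int(r.answer.strip().lower() == r.target.lower())


def score_sw(r: Rating) -> int:
    """SW method: exact match, or a misspelling judged to show the right sound."""
    if score_w(r) == 1:
        return 1
    return int(bool(r.answer.strip()) and r.pronunciation_known)


# Answers for the prompt 桃 (target: peach), following the rows of Table 1.
ratings = [
    Rating("peach", "peach", True),   # matches exactly
    Rating("pich", "peach", True),    # misspelled, pronunciation judged known
    Rating("pice", "peach", False),   # spelling and pronunciation both wrong
    Rating("", "peach", False),       # no answer
]

sw_total = sum(score_sw(r) for r in ratings)
w_total = sum(score_w(r) for r in ratings)
print(sw_total, w_total)  # 2 1
```

The only difference between the two totals comes from the second row of Table 1, which is exactly where the rater's judgment enters.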
2.4 Research questions (RQs)
RQ1: Does the Productive VKT developed in this study have an acceptable level of evidence for validity?
RQ2: Is the SW scoring method more valid than the W scoring method?
3. Method (see Abstract)
4. Results and discussion
4.1 Research Question 1
Inter-rater reliability: The percentage agreement between the two raters was very high for both the SW (98.5%) and W (99.3%) methods
Rasch reliability: very high for both the SW (1.00) and W (1.00) methods
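For readers unfamiliar with the agreement index, a minimal sketch of how a percentage of exact agreement between two raters can be computed is given here. The function name and the score vectors are invented for illustration; they are not the study's data.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Invented 0/1 scores for eight answers (illustration only).
rater_a = [1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 1, 0, 1, 1, 0, 1, 1]

agreement = percent_agreement(rater_a, rater_b)
print(f"{agreement:.1f}%")  # 87.5%
```

Simple percentage agreement does not correct for chance agreement, which is one reason it is reported alongside the Rasch reliability.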
Relationships with other tests (Productive VKT, Receptive VKT, and Grammar test)
Hypothesis: Productive VKT & Receptive VKT > Productive VKT & Grammar
> Receptive VKT & Grammar
The results were consistent with all the hypotheses.
Table 2. Intercorrelations for the scores on the productive and receptive vocabulary knowledge tests and the grammar test
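                          | Receptive | >a | Grammar | >ab | Receptive & Grammar
--------------------------|-----------|----|---------|-----|--------------------
Total (n = 343)           |           |    |         |     |
  Productive SW           | .91*      | >  | .86*    | >   | .86*
  Productive W            | .91*      | >  | .87*    | >   |
Junior 2nd year (n = 33)  |           |    |         |     |
  Productive SW           | .62*      | >  | .39*    | >   | .38*
  Productive W            | .60*      | >  | .51*    | >   |
Junior 3rd year (n = 149) |           |    |         |     |
  Productive SW           | .76*      | >  | .48*    | >   | .46*
  Productive W            | .76*      | >  | .50*    | >   |
Senior 1st year (n = 161) |           |    |         |     |
  Productive SW           | .5988*    | >  | .5981*  | >   | .49*
  Productive W            | .63*      | >  | .61*    | >   |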
Note. a = hypothesized direction; b = comparison between the second column and the sixth column. *p < .05.
Relationships with other tests (Productive VKT, Receptive VKT, Grammar test, and Speaking Test)
Hypothesis: Productive VKT & Speaking > Receptive VKT & Speaking
> Grammar & Speaking
All the hypotheses were met except in one case.
Table 3. Intercorrelations for the scores on the productive and receptive vocabulary knowledge tests, the speaking test, and the grammar test
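                          | Speak | >a | Receptive & Speak | >ab | Grammar & Speak
--------------------------|-------|----|-------------------|-----|----------------
Total (n = 170)           |       |    |                   |     |
  Productive SW           | .77*  | >  | .75*              | >   | .62*
  Productive W            | .80*  | >  |                   | >   |
Junior 3rd year (n = 147) |       |    |                   |     |
  Productive SW           | .66*  | <  | .68*              | >   | .49*
  Productive W            | .70*  | >  |                   | >   |
Senior 1st year (n = 23)  |       |    |                   |     |
  Productive SW           | .60*  | >  | .37*              | >   | .53*
  Productive W            | .63*  | >  |                   | >   |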
Note. a = hypothesized direction; b = comparison between the second column and the sixth column. *p < .05.
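The direction checks summarized in Table 3 can be reproduced with a short sketch. It uses only the Productive SW rows, since those are the rows for which all three coefficients appear in the table, and it compares the magnitudes descriptively (it does not test whether two correlations differ significantly). The variable and function names are hypothetical.

```python
# (Productive & Speak, Receptive & Speak, Grammar & Speak), SW rows of Table 3.
table3_sw = {
    "Total": (0.77, 0.75, 0.62),
    "Junior 3rd year": (0.66, 0.68, 0.49),
    "Senior 1st year": (0.60, 0.37, 0.53),
}


def unmet_hypotheses(rows):
    """List the comparisons whose observed direction contradicts the hypothesis."""
    unmet = []
    for group, (prod_speak, rec_speak, gram_speak) in rows.items():
        # Comparison a: second column vs. fourth column.
        if not prod_speak > rec_speak:
            unmet.append((group, "Productive & Speaking > Receptive & Speaking"))
        # Comparison b: second column vs. sixth column.
        if not prod_speak > gram_speak:
            unmet.append((group, "Productive & Speaking > Grammar & Speaking"))
    return unmet


unmet = unmet_hypotheses(table3_sw)
print(unmet)
# [('Junior 3rd year', 'Productive & Speaking > Receptive & Speaking')]
```

The single exception is the SW method for third-year junior high school students (.66 < .68).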
All approaches produced enough evidence for validity. Therefore, as an answer to the first research question (RQ1), the results demonstrate that there is an acceptable level of validity for the productive vocabulary knowledge test for novice Japanese learners of English.
4.2 Research Question 2
Correlation between the SW and W scores:
Total: rs = .97* (n = 343)
2nd junior (2nd-year junior high school students): rs = .73* (n = 33)
3rd junior: rs = .90* (n = 149)
1st senior: rs = .89* (n = 161)
Overall, the two scoring methods assess very similar aspects of productive vocabulary knowledge.
However, when English ability is very low (e.g., second-year junior high school students), the two
methods diverge more and tend to assess somewhat different aspects of knowledge.
→The two scoring methods might need to be selected according to the school years or proficiency levels
of test takers and according to the test purpose.
SW method: ○ Possibility that students’ knowledge of a spoken form can be reflected in a productive
vocabulary knowledge test.
× A case in an unexpected direction was found (see Table 3).
(Possibility: SW method was not inherently very valid or the scoring criteria were too lenient)
W method: ○ Better than the SW method in terms of practicality
4.3 Other points
Modeling (see Koizumi, in press)
Figure 1. Model 1.
n = 139; Vocabulary performance = The number of words uttered (Type); RMSEA (90%CI) = .04 (.00- .12).
Table 4: Words on the word list in the Course of Study (1989; italicized in Table 4) were easier.
The relationships between frequency levels and item difficulty were moderate for written-word frequency (rs = .63*
[Kendall's τb = .52], n = 30) and low for spoken-word frequency (rs = .40* [τb = .32], n = 30).
These results are consistent with Katagiri (2001), which reported a moderate relationship (τb = .64, p
< .01) between written word frequency and item difficulty.
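The two rank-order coefficients reported here (Spearman's rs and Kendall's τb) can be illustrated with a minimal sketch in plain Python. The frequency levels and item difficulties below are invented for illustration and are not the study's data; the function names are likewise hypothetical.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman_rs(x, y):
    """Spearman's rs: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b, which adjusts the denominator for ties."""
    concordant = discordant = only_x_tied = only_y_tied = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from both factors
        if dx == 0:
            only_x_tied += 1
        elif dy == 0:
            only_y_tied += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + only_x_tied) * (untied + only_y_tied)
    )


# Invented data: frequency level of six items and their difficulty in logits.
frequency_level = [1, 1, 2, 2, 3, 3]
difficulty = [-3.0, -1.5, -2.0, 0.5, 1.0, 2.5]

rs = spearman_rs(frequency_level, difficulty)
tau = kendall_tau_b(frequency_level, difficulty)
print(f"rs = {rs:.2f}, tau-b = {tau:.2f}")
```

Because frequency levels contain many ties, τb is usually somewhat lower than rs on the same data, which matches the pattern of the coefficients reported above.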
5. Conclusion
This Productive VKT may be useful for obtaining information on novice learners' productive vocabulary
knowledge and for giving feedback to students and teachers, which can facilitate vocabulary acquisition.
Further work:
(a) Develop a new version of the productive vocabulary knowledge test that can estimate vocabulary size
(b) Examine the criteria of the SW scoring method for further validation
References
Chen, H.-C., & Leung, Y.-S. (1989). Patterns of lexical processing in a nonnative language. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 15, 316-325.
Jiang, N. (2000). Lexical representation and development in a second language. Applied Linguistics, 21, 47-77.
Katagiri, K. (2001). Developing the ten-minute vocabulary tests for quick and approximate estimates of general English ability of
Japanese EFL learners. Unpublished Ph.D. dissertation, Tokyo Gakugei University.
Kawakami, A. (1994). The effect of proficiency in a second language on lexical-conceptual representation. Japanese Journal of
Psychology, 64, 426-433.
Koizumi, R. (2003). A productive vocabulary knowledge test for novice Japanese learners of English: Validity and its scoring
methods. JABAET Journal, 7, 23-52.
Koizumi, R. (in press). Predicting speaking ability from vocabulary knowledge. JLTA Journal, 7.
Laufer, B., & Nation, P. (1995). Vocabulary size and use: Lexical richness in L2 written production. Applied Linguistics, 16,
307-322.
Laufer, B., & Nation, P. (1999). A vocabulary-size test of controlled productive ability. Language Testing, 16, 33-51.
Matsumi, N. (1993). Retrieval processes of words for speaking in a second language. Japanese Journal of Educational
Psychology, 41, 424-434.
Meara, P., & Fitzpatrick, T. (2000). Lex30: An improved method of assessing productive vocabulary in an L2. System, 28, 19-30.
Ministry of Education, Science & Culture. (1989). Chugakkou shidousho gaikokugo hen [Instruction guidelines for junior high
school]. Tokyo: Kairyudo.
Mochizuki, M. (1998). A Vocabulary Size Test for Japanese Learners of English. IRLT Bulletin, 12, 27-53.
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge University Press.
Table 4. Summary map: Vertical rulers showing test-takers’ ability, item difficulty, and rater severity of the productive vocabulary knowledge test using the W scoring method
---------------------------------------------------------------------
|Measr|+Examinees|-Items                                    |-Judges|
---------------------------------------------------------------------
+   7 + *.       +                                          +       +
|     |          | 27 stir                                  |       |
|     |          |                                          |       |
+   6 +          + 19 crack                                 +       +
|     | *.       |                                          |       |
|     |          | 6 blame                                  |       |
+   5 +          +                                          +       +
|     | **.      |                                          |       |
|     | .        |                                          |       |
+   4 + *****.   +                                          +       +
|     | .        |                                          |       |
|     | ******.  |                                          |       |
+   3 + *****.   +                                          +       +
|     | .        | 17 feeling                               |       |
|     | ****.    | 15 vegetable 18 fail 26 cheap 34 chicken |       |
+   2 + ****.    + 30 funny                                 +       +
|     | ***.     |                                          |       |
|     | *******. | 38 notice                                |       |
+   1 + ***.     +                                          +       +
|     | **       | 21 touch                                 |       |
|     | **.      |                                          |   1   |
*   0 * **.      * 12 change 28 rabbit 3 behind             *   2   *
|     | ****     | 33 popular                               |       |
|     | **       | 11 fight                                 |       |
+  -1 + *******  + 22 yellow                                +       +
|     | *****    | 12 orange 39 ear 40 month                |       |
|     | ****.    | 32 white 36 point 8 sound                |       |
+  -2 + *****    + 25 move                                  +       +
|     | ***      | 7 sick                                   |       |
|     | ****.    | 16 must                                  |       |
+  -3 + *****    + 24 star                                  +       +
|     | ***.     | 1 lunch                                  |       |
|     | ****     | 29 read                                  |       |
+  -4 + .        +                                          +       +
|     | **.      |                                          |       |
|     |          |                                          |       |
+  -5 + *        +                                          +       +
|     |          |                                          |       |
|     | ****.    |                                          |       |
+  -6 +          + 9 box                                    +       +
|     |          |                                          |       |
|     |          |                                          |       |
+  -7 + **       +                                          +       +
---------------------------------------------------------------------
|Measr| * = 3    |-Items                                    |-Judges|
---------------------------------------------------------------------