assessing vocabulary recognitionhannehakonsen.weebly.com/.../al6730_project_paper.pdf1 assessing...



Assessing Vocabulary Recognition Vocabulary Recognition vs. Vocabulary Production

Hanne Hakonsen

AL 6730: Assessment in TESOL

Dr. Hanh Nguyen

December 6, 2012


Introduction

The purpose of this paper was to report on a group project completed in AL 6730:

Assessment in TESOL. The task was to design, develop, administer and evaluate a

vocabulary test. The outline of the paper is as follows. The paper begins with a detailed

description of the project, which involves a background context of the students and the

host institution the test was created for, in addition to a short description of the project’s

group members. This is followed by a detailed description of the test’s administration,

type, objectives, and its specifications. Then, the students’ results are presented and

discussed in an item analysis. Following that is an individual reflection on changes that could

have been made to the test. This part involves a literature review on assessing vocabulary

production, and proposals for future research. Finally, a sample of the test created can be

found at the end, under appendices. There are two versions of the vocabulary test, one

with and one without the answers.

Project Description

Background Information

Host class

The host class was Dr. Brian Rugen’s International Education class for the Bridge

Program. It was hosted at Hawaii Pacific University (HPU) every Monday, Wednesday,

and Friday from 12:55 pm to 1:40 pm. The students had to take a proficiency test and

obtain one of the following scores to gain entry to the Bridge Program: TOEFL iBT: 70-79,

CBT: 193-210, PBT: 523-547, TOEIC: 691-750, IELTS: 5.5. These scores indicated

that students were at an upper intermediate level. The students’ goal was to enter an


American university at the undergraduate level. They required academic preparation to

succeed in an undergraduate degree. The objectives of the course, taken from

Dr. Rugen’s syllabus, are listed below.

By the end of the series of modules in BR 1000, students will have:

1. Closely examined and challenged their beliefs and values regarding select issues

in higher education systems in various regions through extensive readings,

discussions, and mini-lectures.

2. Evaluated differing opinions on particular, controversial educational issues.

3. Learned strategies for intensive reading.

4. Increased their academic vocabulary.

5. Gained experience in preparing and delivering academic presentations to peers.

6. Applied learned skills of visual literacy in critically analyzing a multimodal text.

7. Demonstrated critical thinking skills through participation in an academic debate.

8. Written a successful in-class essay in response to a reading prompt.

The host teacher used a content-based, “English for Academic Purposes” instructional

approach. After collaborating with Dr. Rugen, the group observed that he liked to assess

his students’ recognition knowledge, and that he preferred to use multiple choice as an

assessment technique.

Host Institution

The host institution, HPU, was located in Downtown Honolulu, Hawaii. The goal

of the Bridge Program was to offer “an opportunity for international students to build


English language competency, academic skills, and acquire content based knowledge in

preparation for academic success at Hawaii Pacific University.”

Group members

The members of the group were Kri Howland of Massachusetts, U.S.A.; Rahma

Kadir of Indonesia; Ciwang Cirenwangdui of Tibet; and myself, Hanne Hakonsen of

Norway. Hanne has been a physical education teacher in Norway, at the junior high level.

Ciwang has taught English in Tibet for seven years, at a senior high school

level. Rahma has been teaching English for two years at an elementary school in

Indonesia. Kri Howland has had no teaching experience, but hopes to pursue an

administration path at Study Abroad and International Service Centers in which she

would incorporate her knowledge of the TESOL program.

Language Assessment Instrument

Administration of assessment

The vocabulary test was given to the host class on November 5th, 2012. It was

graded and returned to the teacher on November 9th. Two versions of the quiz can be

found in the appendices. The student-friendly version of the test can be found in Appendix

A, and the teacher’s version with the keys in Appendix B.

Type of assessment

The purpose of the test was to assess the students’ recognition of academic

vocabulary. We therefore created an achievement test for vocabulary recognition. The


Item-Design Approach components were as follows. The test was criterion-referenced,

since the students’ performance was in no way affected by the performance of their

classmates. The test was also indirect, since there were no productive tasks that could be

measured. Furthermore, the test was discrete-point, as we were only testing vocabulary

recognition, and the students did not need to utilize other language skills to perform the

task presented to them. Finally, the scoring was objective, as all the items were

multiple choice.

Objectives

The following were the objectives of our vocabulary test, given to us by Dr.

Rugen, the course instructor of the host class:

1. The student will be tested on the ability to choose the correct definition for select

vocabulary words from the Academic Word List (AWL).

2. The student will be tested on the ability to identify the meaning of selected

vocabulary words from the AWL based on their use in context.

3. The student will be tested on the ability to distinguish between multiple meanings

of select vocabulary words from the AWL based on their use in context.

4. The student will be tested on the ability to replace words from a short reading

with select vocabulary words from the AWL that have a similar meaning.

Specification

The following were the specifications of the test, written by the group while

preparing to create the test:


1. Specifications of content:

a. Operations: Recognition of academic words with and without context.

1. Recognizing word meaning in sentence context

2. Recognizing definitions

3. Recognizing synonyms without context

4. Recognizing word meaning in paragraph context

5. Spelling answers correctly

b. Types of text: Authentic, academic

c. Length of text: 277 words

d. Addressees of text: Non-native speakers at the undergraduate level in an

International Education class.

e. Topics: Single sex schooling.

f. Readability (Flesch-Kincaid or grade level): 7th-8th grade, as the students are at an

upper intermediate level.

g. Structural range: Simple grammar because we are testing them on

vocabulary.

h. Vocabulary range: Generally academic.

i. Dialect and style: Standard American English.

2. Structure, timing, medium, and techniques:

a. Test structure: 4 sections

1. Multiple choice with context

2. Matching


3. Multiple choice without context

4. Multiple choice in passage context

b. Number of items: 20 multiple choice items, 10 matching items. Total: 30

c. Number of passages: 4 sections

1. Section 1: 5 items

2. Section 2: 10 items

3. Section 3: 5 items

4. Section 4: 10 items

d. Medium: Paper and pen.

e. Testing techniques: Multiple choice and matching

3. Criterial level of performance:

Satisfactory performance is recognizing 80% of the vocabulary in each section.

Students who reach this level will be considered to have met the course

objectives in terms of this quiz.

4. Scoring procedure:

There will be objective scoring with four scorers. A correct answer will receive 1

point, an incorrect answer will receive 0 points and a misspelled answer will

receive ½ a point as long as it’s comprehensible to the scorers.

5. Sampling (where the materials for the test were drawn from):

Vocabulary will be selected from the Academic Word List (AWL), and the

passage came from a website (singlesexschools.org/evidence.html) on single sex

schooling, in relation to the class’s unit on single sex schooling.
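The scoring procedure and the criterial level of performance in the specifications can be sketched in code. This is only a minimal illustration, assuming that a scorer has already judged each response as "correct", "misspelled" (but comprehensible), or "incorrect". The function names and the sample student are hypothetical and were not part of the group's materials.

```python
# Points per judged response, following the scoring procedure in the specifications.
POINTS = {"correct": 1.0, "misspelled": 0.5, "incorrect": 0.0}

# Satisfactory performance: 80% of the vocabulary in each section.
CRITERION = 0.80


def section_score(judgments):
    """Return the total points for one section's list of judged responses."""
    return sum(POINTS[j] for j in judgments)


def meets_criterion(sections):
    """Return True if every section reaches the 80% criterion.

    `sections` maps a section name to its list of judged responses.
    """
    return all(
        section_score(judgments) / len(judgments) >= CRITERION
        for judgments in sections.values()
    )


# Hypothetical student, for illustration only (section sizes follow the specifications).
student = {
    "Section 1": ["correct"] * 4 + ["incorrect"],
    "Section 2": ["correct"] * 9 + ["misspelled"],
    "Section 3": ["correct"] * 5,
    "Section 4": ["correct"] * 8 + ["incorrect"] * 2,
}

for name, judgments in student.items():
    print(name, section_score(judgments), "out of", len(judgments))
print("Meets the 80% criterion in every section:", meets_criterion(student))
```

Checking the criterion per section, rather than on the total, follows the wording of the specification ("80% of the vocabulary in each section").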


Student Results

The table below shows each student’s score as a percentage. The

majority of the students performed well, with an average score of 83%. The

highest score was 29 out of 30, and the lowest was 19 out of 30. The most frequent score

was 26.5 out of 30, or 88%. The test time was 30 minutes, and all the students managed

to complete the test within the time limit.
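The summary figures above can be reproduced from the individual scores in the table of student results. The sketch below assumes the raw point scores obtained by converting each reported percentage back to points out of 30 (for example, 96.7% corresponds to 29 points); the variable and function names are illustrative only.

```python
from statistics import mean, multimode

MAX_POINTS = 30  # total number of items on the quiz

# Raw scores in points, recovered from the reported percentages.
scores = [29, 28, 26.5, 26.5, 26.5, 26.5, 25, 25, 24.5, 24.5, 23.5, 23, 21, 19]


def to_percent(points):
    """Convert a raw score to a percentage of the maximum, to one decimal place."""
    return round(points / MAX_POINTS * 100, 1)


average_percent = round(mean(scores) / MAX_POINTS * 100)
most_frequent = multimode(scores)

print("Average (%):", average_percent)
print("Highest:", max(scores), "Lowest:", min(scores))
print("Most frequent score(s):", most_frequent,
      "as percent:", [to_percent(s) for s in most_frequent])
```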

Table of student results

Student      Score (%)
Student 1    96.7
Student 2    93.3
Student 3    88.3
Student 4    88.3
Student 5    88.3
Student 6    88.3
Student 7    83.3
Student 8    83.3
Student 9    81.7
Student 10   81.7
Student 11   78.3
Student 12   76.7
Student 13   70.0
Student 14   63.3

Histogram

The histogram displayed below revealed that the most common scores were above 20 and

up to 25 points: as many as 7 students had a result in this range. There were also 6 students who

scored above 25. Only one student scored below 20 points, with a score of 19. The students

did very well on the test, and many achieved high scores.
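The histogram counts can be checked by binning the raw scores. The sketch below assumes that each bin label (15, 20, 25, 30) is an inclusive upper edge, as in a typical spreadsheet histogram tool; that reading of the axis labels is an assumption, and the raw point scores are converted back from the reported percentages.

```python
from collections import Counter

# Raw scores in points, recovered from the reported percentages (out of 30).
scores = [29, 28, 26.5, 26.5, 26.5, 26.5, 25, 25, 24.5, 24.5, 23.5, 23, 21, 19]

# Assumed bin labels: each value is the inclusive upper edge of its bin.
BIN_UPPER_EDGES = [15, 20, 25, 30]


def bin_label(score):
    """Return the upper edge of the bin a score falls into, or "More"."""
    for upper in BIN_UPPER_EDGES:
        if score <= upper:
            return upper
    return "More"


frequency = Counter(bin_label(s) for s in scores)

for label in BIN_UPPER_EDGES + ["More"]:
    print(label, frequency.get(label, 0))
```

Under this assumption a score of exactly 25 is counted in the "25" bin, which is why seven students fall in that bin and six in the "30" bin.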


Item Analysis

In order to analyze the students’ results, our group did an item analysis. An item

analysis allows the test creators to examine the contribution that each item is making to

the test. It can also reveal faulty or inefficient items (Hughes, 2003). As previously

mentioned, our testing technique was multiple choice, which requires the creation of

distractors. Unfortunately, our group’s item analysis revealed that several distractors were

not chosen by anyone, meaning they were inefficient, since they did not fulfill their

purpose of distracting.

Part I: Multiple Choice with Context.

Distractor Analysis

Item   A          B          C          D
1      1          12 (key)   1          0
2      0          1          13 (key)   0
3      14 (key)   0          0          0
4      10 (key)   0          3          1
5      14 (key)   0          0          0

[Histogram of student scores. Horizontal axis: Scores (bins 15, 20, 25, 30, More). Vertical axis: Frequency (number of students, 0-8).]


Distractors:

• Item 1: D
• Item 2: A, D
• Item 3: B, C, D
• Item 4: B
• Item 5: B, C, D

The table above shows how many times each option was selected.

The distractors listed in bullets were not selected by any of the students. Thus, they

did not fulfill their purpose of distracting. The distractor that worked best was 4C:

three out of 14 students chose it. Distractors 1A, 1C, 2B, and 4D were each

chosen by one student. All in all, nearly all the distractors in this part were too weak.
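The distractor analysis for Part I can be expressed as a small check over the option counts. The sketch below copies the counts and keys from the distractor analysis table; the data structure and the function name are illustrative choices, not the group's procedure.

```python
# Number of students choosing each option in Part I (n = 14), from the table.
counts = {
    1: {"A": 1, "B": 12, "C": 1, "D": 0},
    2: {"A": 0, "B": 1, "C": 13, "D": 0},
    3: {"A": 14, "B": 0, "C": 0, "D": 0},
    4: {"A": 10, "B": 0, "C": 3, "D": 1},
    5: {"A": 14, "B": 0, "C": 0, "D": 0},
}

# The keyed (correct) option for each item.
keys = {1: "B", 2: "C", 3: "A", 4: "A", 5: "A"}


def unused_distractors(item):
    """Return the distractors of an item that no student selected."""
    return [
        option
        for option, chosen in counts[item].items()
        if option != keys[item] and chosen == 0
    ]


for item in counts:
    print(item, unused_distractors(item))
```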

Item Facility (n=14)

Everyone answered items 3 and 5 correctly. The most challenging item was number four:

ten out of fourteen students answered it correctly. Items 1 and 2 were also easy items, as

their I.F. values were very close to 1.

Item   Number of students who answered correctly   I.F.
1      12                                          0.86
2      13                                          0.93
3      14                                          1
4      10                                          0.71
5      14                                          1
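Item facility is the proportion of test-takers who answered an item correctly. As a minimal sketch, the values for Part I can be computed from the counts in the table, rounded here to two decimal places; the function name is illustrative.

```python
N_STUDENTS = 14

# Number of students who answered each Part I item correctly, from the table.
correct_counts = {1: 12, 2: 13, 3: 14, 4: 10, 5: 14}


def item_facility(n_correct, n_students=N_STUDENTS):
    """Return the proportion of students who answered the item correctly."""
    return n_correct / n_students


for item, n_correct in correct_counts.items():
    print(item, round(item_facility(n_correct), 2))
```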


Item Discrimination (n=14)

High and low scorers: 25% of total number of students each (top 4 and bottom 4).

Item   Number of high scorers (top 4)   Number of low scorers (bottom 4)   I.D.
       who answered correctly           who answered correctly
1      4                                3                                  0.2
2      4                                3                                  0.2
3      4                                4                                  0
4      2                                3                                  -0.2
5      4                                4                                  0

Items 3 and 5 showed no difference between high and low scorers. In item 4 the low

scorers performed better than the high scorers. Items 1 and 2, with an I.D. of 0.2, are not within the acceptable

range of 0.35-1. All the items therefore did a poor job of discriminating between high and

low scorers, which would need to be improved upon in future tests.

Part II: Multiple Choice (Matching)

Options: A-M. Key=*

Item   Key   Other options
1      13*   1
2      13*   1
3      12*   1 + 1 (two different options)
4      14*   0
5      14*   0
6      14*   0
7      14*   0
8      13*   1
9      14*   0
10     14*   0


Distractor Analysis:

Two students left #6 blank.

Distractors:

• (3) I • (1) C • (1) J

Only these three distractors were selected, meaning too many of the other distractors were

poor. The best distractor was letter J.

Item Facility (n=14)

Item   Number of students who answered correctly   I.F.
1      13                                          0.93
2      13                                          0.93
3      12                                          0.86
4      14                                          1
5      14                                          1
6      14                                          1
7      14                                          1
8      13                                          0.93
9      14                                          1
10     14                                          1

The most challenging item for the students was #3. Items 4-7, 9, and 10 were answered

correctly by all of the students.

This is good from a teacher’s standpoint, because it was a criterion-referenced test and the

students learned the vocabulary well.

Item Discrimination (n=14)

25% of total number of students.


Item   Number of high scorers (top 4)   Number of low scorers (bottom 4)   I.D.
       who answered correctly           who answered correctly
1      4                                4                                  0
2      4                                3                                  0.29
3      3                                4                                  -0.29
4      4                                4                                  0
5      4                                4                                  0
6      4                                2                                  0.57
7      4                                4                                  0
8      4                                3                                  0.29
9      4                                4                                  0
10     4                                4                                  0

Items 1, 4, 5, 7, 9, and 10 did not show any difference between high scorers and low scorers.

Item 6 showed the biggest distinction between the high scorers and the low scorers.
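The item discrimination index compares how many high scorers and how many low scorers answered an item correctly. A common formulation divides the difference by the number of students in each group. The sketch below leaves that divisor as a parameter, because the I.D. values reported for Part II (0.29, 0.57) are numerically consistent with dividing by 3.5, that is, 25% of 14 students, rather than by 4. That divisor is an inference from the reported numbers, not something stated in the paper, and the function name is illustrative.

```python
def item_discrimination(high_correct, low_correct, group_size):
    """Return (high scorers correct - low scorers correct) / group size."""
    return (high_correct - low_correct) / group_size


# Part II counts from the table: item -> (high scorers correct, low scorers correct).
part_two = {
    1: (4, 4), 2: (4, 3), 3: (3, 4), 4: (4, 4), 5: (4, 4),
    6: (4, 2), 7: (4, 4), 8: (4, 3), 9: (4, 4), 10: (4, 4),
}

# Assumed divisor: 25% of the 14 test-takers.
GROUP_SIZE = 0.25 * 14

for item, (high, low) in part_two.items():
    print(item, round(item_discrimination(high, low, GROUP_SIZE), 2))
```

With a divisor of 4 (the actual number of students in each group), a difference of one student gives 0.25 instead of 0.29.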

Part III: Synonyms

Item Facility (n=14)

Item   Students who answered item correctly   I.F.
1      14                                     1
2      14                                     1
3      6                                      0.43
4      14                                     1
5      9                                      0.64

Items 1, 2 and 4 were too easy for the students. All of the students answered these items

correctly. Items 3 and 5 had moderate difficulty.


Distractor Analysis

Item   A     B     C     D
1      14*   0     0     0
2      0     0     14*   0
3      6*    4     0     4
4      0     14*   0     0
5      2     2     1     9*

Distractor

1. b, c, d were not working

2. a, b, d were not working

3. c was not working

4. a, c, d were not working

None of the distractors in items 1, 2 and 4 were working, as nobody chose them; in item 3,

distractor c was not chosen by anyone. These distractors need to be rejected and changed.

Item Discrimination

Items 1, 2, 4 and 5 could not discriminate between high-scoring and low-scoring

students (I.D. = 0); items 1, 2 and 4 were simply too easy.

Item   High scorers (top four)   Low scorers (bottom four)   I.D.
       with correct answers      with correct answers
1      4                         4                           0
2      4                         4                           0
3      4                         0                           1
4      4                         4                           0
5      3                         3                           0


Part IV: Passage

Item Facility (n= 14)

Item   Students who answered item correctly   I.F.
1      13                                     0.93
2      13                                     0.93
3      10                                     0.71
4      13                                     0.93
5      9                                      0.64
6      2                                      0.43
7      6                                      0.50
8      8                                      0.57
9      4                                      0.29
10     12                                     0.86

The most challenging items for the students were #9, #7 and #6. Only 2

students answered item #6 correctly, only 4 students

answered item #9 correctly, and 6 students answered item #7 correctly,

so the other options were very good distractors. Items #1, #2 and #4 were answered

correctly by all but one student each. They may need to be revised.

For item #3, 10 students gave correct answers, so the other options are

acceptable distractors. For item #5, 9 students gave correct answers, so the other options

are good distractors. For items #8 and #10, 8 students gave correct answers for

#8 and 12 students gave correct answers for #10, so the other options worked well as

distractors.


Item Discrimination (I.D.)

Item   High scorers (top four)   Low scorers (bottom four)   I.D.
       with correct answers      with correct answers
1      4                         3                           0.29
2      4                         3                           0.29
3      4                         1                           0.86
4      4                         3                           0.29
5      4                         1                           0.86
6      2                         0                           0.57
7      2                         3                           -0.29
8      4                         1                           0.86
9      1                         0                           0.29
10     4                         3                           0.29

The fact that items #1, #2, #4, #10 got an I.D. of 0.29 indicates that these items do

distinguish a slight difference between the high scorers and low scorers. Items #3, #5 and #8

got an I.D. of 0.86, which suggests that these items distinguish the difference between the

high scorers and low scorers well.


Reflection and Discussion

Only testing vocabulary recognition was one of the requirements in our project.

This led our group to use multiple-choice as our testing technique. However, the creation

of distractors proved to be very challenging. I especially struggled with this when I had

to create Part 1 (see appendices) of the vocabulary quiz. In that section, the students were

tested on their recognition of five words. Their task was to select the option that best

described the underlined word in the context sentence. For example: Anna felt she had

sole responsibility with the group project. Anna had: a) all the responsibility, b) a lot of

the responsibility, c) little responsibility, d) no responsibility. It was challenging to create

distractors, as there were many considerations to take into account. One of them was that the

distractors should ideally be similar in length to the target word, to avoid it standing

out. Also, if the key was an adjective, all the distractors had to be adjectives. Furthermore,

in the example item mentioned above, I had to list the options in a logical order, which in

that case was from more to less. Considerations like that made it especially difficult to

find appropriate distractors. It was a long process that required a lot of editing.

Unfortunately, our group’s item analysis revealed that many of the distractors were not

chosen by anyone. They were too easy. All this made me think that the test would have been

better for the students if we had assessed them on vocabulary production.

According to Hughes (2003), testing of productive vocabulary is very difficult,

and practically never attempted in large-scale proficiency tests. I found this statement to

be very contrary to our group’s personal experience. For us, only testing vocabulary

recognition was one of our main obstacles. This is why I chose to do research on

assessing vocabulary production. I will therefore look at some productive vocabulary


techniques and consider if they could have been adapted to our group’s test. I will first

talk about some subtests belonging to the Woodcock Reading Mastery Test (WRMT).

Then I will discuss the testing technique used in The Controlled Production Levels Test

(CPLT), and in the Lex 30 test. Finally, I will consider other techniques to use when

assessing vocabulary production.

The WRMT focused on assessing reading comprehension; however, it had two

subtests devoted to word comprehension (Pearson, Hiebert & Kamil, 2007). The

objectives of the subtests were to assess antonyms and synonyms. More specifically, the

items measured the test takers’ ability to read a word and then respond orally with a word

opposite in meaning (antonyms), or similar in meaning (synonyms). The instructions for

subtest two were: “Synonyms: Read this word out loud and then tell me a word that

means the same thing” (Pearson, Hiebert & Kamil, 2007, p. 286). In this item the

participant read cash and answered money.

This technique could be adapted to a different medium. Instead of face-to-face, it

could be paper and pencil, and it could have been applied to Part 3 of our test. In this section,

the students had to choose an option that was closest in meaning to a word. In other

words, they had to find the word’s synonym. For example, item one in section three

tested if the students could find a synonym for the word unique. The key was special, and

the distractors were multiple, usual, general. Our item analysis revealed that none of the

distractors were chosen. I believe the test item would have been better if we had tested

the students’ productive ability. For example, the item could have been changed to: Write

one word that is similar in meaning to unique.


The next test I will talk about is called The Controlled Production Levels Test

(CPLT). At first glance the items looked like a C-test as the second half of one word in a

sentence was deleted. However, the test did not use a paragraph where the second half of

every other word was deleted (Hughes, 2003). Instead they had individual sentences that

were not connected. Also, the cues were not always half a word. Moreover, there was

only one word in each sentence that was partially deleted. The number of letters deleted

from each word depended on the elimination of possible alternatives. For example, if two

letters could start two possible words an additional letter was added to eliminate the

alternative (Laufer & Nation, 1999). Here is an example of one of the items: “The book

covers a series of isolated epis_______from history” (Laufer & Nation, 1999, p. 37). We

could have adapted similar items to Part 4 of our test, which had a long paragraph with

ten blanks. Instead of the word bank we could have provided some of the initial letters of

the target word. The number of the cue letters would depend on the elimination of

alternative answers.
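The idea that the number of cue letters depends on eliminating alternatives can be illustrated with a short sketch: keep adding letters until no other candidate word starts with the same letters. This is a simplification, since it only compares spellings and ignores whether a competing word would actually fit the sentence. The candidate list is hypothetical, and the function is not taken from Laufer and Nation (1999).

```python
def minimal_cue(target, candidates):
    """Return the shortest prefix of `target` that no other candidate starts with."""
    others = [word for word in candidates if word != target]
    for length in range(1, len(target) + 1):
        prefix = target[:length]
        if not any(word.startswith(prefix) for word in others):
            return prefix
    # Every prefix is shared (the target is itself the start of another word).
    return target


# Hypothetical set of words a test-taker might consider for the gap.
candidates = ["episodes", "epics", "epidemics", "epilogues"]

cue = minimal_cue("episodes", candidates)
print(cue + "_" * (len("episodes") - len(cue)))
```

With this hypothetical list the cue is "epis", the same four letters given in the example item quoted above.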

However, many of my sources revealed flaws with this technique. According to

Webb (2008), the CPLT test might depend on grammatical knowledge. This would affect

our test’s validity, as the students’ grammatical knowledge was not the aim of the test. In

such cases, the scorers of a test should not mark down grammatical mistakes. Moreover,

Webb (2008) also talked about how the CPLT might be testing receptive knowledge,

instead of productive ability. Likewise, Morton (1979) discovered that test subjects were

able to recognize words when one phoneme was inaudible. He also found that

participants often did not notice when a word in a sentence was partially pronounced.

Studies like that indicate that the partial information or cues given to test-takers might be


enough to make them recognize the word. This might then affect the technique’s validity,

as it may not be testing the test-takers’ productive ability after all.

Lex 30 was the last test I looked at. According to Fitzpatrick and Clenton (2010),

the test was a frequency-based vocabulary test. In order to elicit vocabulary from the test-

takers, Lex 30 used a word association task. However, it was not a word association test;

rather, the vocabulary measured was elicited through a word association task. The aim of the test

was to assess the participants’ lexical ability. The test consisted of thirty items, and each

item had a cue word, like attack, board, close, cloth, dig etc. For each cue the test-takers

had to write four response words (Fitzpatrick & Clenton, 2010). The following were the

instructions of the test:

Look at the words below. Next to each word, write down any other words that it

makes you think of. Write down as many as you can (4, if possible). It doesn’t

matter if the connections between the word and your words are not obvious;

simply write down words as you think of them (Fitzpatrick & Clenton, 2010, p.

548).

The instructions told the test-takers that they were not limited to only writing words that

mean the same as the cue, were similar to the cue, or collocations. They had freedom to

write anything that came to mind. Fitzpatrick and Clenton (2010) included a sample of

the test, which revealed that the responses to the cue obey were commands, obedience,

demands, conform. According to Fitzpatrick and Clenton (2010), one point was given for

every response that could be classed as an infrequent word, meaning one outside the first

1000 most frequent English words. They also believed that being able to produce


infrequent words indicated how advanced a learner’s lexical development was. I think

this was a very interesting technique. However, the technique did not fit in with our test’s

objectives, so I would not have applied it.
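The scoring rule described for Lex 30 (one point for every response outside the first 1000 most frequent English words) can be sketched as follows. The frequency list here is a tiny hypothetical stand-in, the responses are invented for illustration, and any normalisation of responses beyond lowercasing is left out, so this is not the published scoring procedure.

```python
def lex30_score(responses, frequent_words):
    """Count the responses that fall outside the high-frequency word list."""
    return sum(1 for word in responses if word.lower() not in frequent_words)


# Hypothetical stand-in for a list of the 1000 most frequent English words.
frequent_words = {"war", "fight", "the", "go", "big"}

# Invented responses to the cue word "attack", for illustration only.
responses = ["war", "fight", "assault", "ambush"]

print(lex30_score(responses, frequent_words))
```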

I will now discuss some other techniques that have been used to assess productive

vocabulary. One of the techniques our group wanted to use was gap-filling. However, one

of the major issues with this technique is that there are often alternative answers that can

fit the gaps (Alderson, Clapham & Wall, 1995). A solution to this problem could be to

provide the initial letter of the target word. For instance, “The factory workers strongly

s_________ the Labor Party in every election” (Read, 2000, p. 164). However, in some

cases it may be necessary to provide several letters to ensure that only the target word fits

the sentence. Also, prompts like that may affect the item’s validity, as mentioned earlier

with the CPLT test.

Another way to use prompts when testing vocabulary production is illustrated in

the following item: “You have two of these, and you can hear through them” (McKay,

2006, p. 253). This item was meant for assessing children; the response had to be written

in a blank line next to the item. In that item the test-takers were presented with the

meaning of the vocabulary, instead of a context sentence. I think items like that could

have been created in Part 2 of our test, where the students matched words with

definitions. The students could have just been presented with the definitions, so they

would produce the vocabulary themselves.

Using images is another way to test vocabulary production. It is also an excellent

way to eliminate alternative answers. Picture gap-filling is often used with young


language learners, where they fill in the gap after identifying the name of an object in an

accompanying picture (McKay, 2006). Hughes (2003) also had a nice example of an item

using pictures. In the item, the students saw six pictures, each marked with a letter (A, B, C

etc.). Their task was to write the name of each object on the empty line next to its

picture. Even though pictures are often used with children, the technique can be

used with teenagers and adults as well. After looking at the twenty words our group was

assigned, I realized that none of them could have been visually illustrated, since they were

too abstract.

In this part of the paper, I have gone through several different techniques to use

when you are assessing vocabulary production. The subtests of the WRMT seem to be

very effective if you are testing antonyms or synonyms. The subtest for synonyms could

even have been adapted to Part 3 of our test. We could also have applied the technique

used in the CPLT to Part 4 of the test. However, some studies indicated that this

technique might not be testing production knowledge after all. In addition, the technique

could also be testing the students’ grammatical knowledge, when it should just focus on

vocabulary. This affects the technique’s validity. Despite this, I really want to try this

technique in the future, and do some item analysis. Research on the Lex 30 test indicated

that it was a valid measure of an individual’s lexical ability. It also seems to be a good

technique for productive vocabulary, but it did not fit our test’s objectives. Moreover, I

discovered that picture gap filling seemed to be a good technique for testing of concrete

nouns, verbs, and adjectives. Lastly, gap filling is a very common production technique

to use. However, its greatest weakness lies in the fact that test-takers might produce


alternative correct answers. A possible solution could be to provide the initial letters of

the target word. Nonetheless, this could cause the test-taker to recognize the word, which

would affect the technique’s validity. Another solution might be to provide a picture with

the gap, if the target word could be illustrated.

Future Inquiries

It was very difficult to find resources for my paper. Few people had done research

on assessing vocabulary production. It would be interesting to discover cases where

production techniques work better than recognition techniques and, more specifically, which

word classes should be tested with production or recognition techniques. Another

research option could be to compare two different production techniques. For instance,

you could select twenty concrete words, and create two different parts on the test. Part 1

could consist of ten pictures representing the words. The test-takers would have to write the name of each object on

the blank line underneath the picture, whereas Part 2 could involve gap filling of context

sentences. The candidates’ scores would reveal which part the students perform better on.

Lastly, as previously mentioned in the discussion, the gap filling technique can be

problematic since alternative answers can often fit the gap. Using pictures is one way to

limit alternative answers. However, many words cannot be visually illustrated. More

research could therefore be devoted to finding other solutions to limit alternative

responses.
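
The comparison of two production techniques proposed above could be scored along the following lines. This is only a minimal sketch in Python, assuming that each test-taker receives one point per correct item in each part. The function name `compare_parts` and the scores are hypothetical and were invented for illustration; they are not results from our test.

```python
from statistics import mean


def compare_parts(scores_part1, scores_part2):
    """Return the mean score of each part and the label of the part with the higher mean."""
    mean1 = mean(scores_part1)
    mean2 = mean(scores_part2)
    if mean1 > mean2:
        better = "Part 1"
    elif mean2 > mean1:
        better = "Part 2"
    else:
        better = "tie"
    return mean1, mean2, better


# Hypothetical scores out of 10 for five test-takers (illustration only).
picture_naming = [8, 7, 9, 6, 10]  # Part 1: naming pictures
gap_filling = [6, 7, 5, 8, 4]      # Part 2: gap filling of context sentences

print(compare_parts(picture_naming, gap_filling))
```

A difference in mean scores alone would not show why one technique is easier, so an item analysis of each part would still be needed.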


References

Alderson, C., Clapham, C., & Wall, D. (1995). Language test construction and evaluation.

Cambridge, UK: Cambridge University Press.

Fitzpatrick, T., & Clenton, J. (2010). The challenge of validation: Assessing the

performance of a test of productive vocabulary. Language Testing, 27(4),

537-554. DOI: 10.1177/0265532209354771.

Hughes, A. (2003). Testing for language teachers. Cambridge, UK: Cambridge University

Press.

Laufer, B., & Nation, P. (1999). A vocabulary-size test of controlled productive ability.

Language Testing, 16(1), 33-51. DOI: 10.1177/026553229901600103.

Morton, J. (1979). Word recognition. Ithaca, NY: Cornell University Press.

Pearson, P. D., Hiebert, E. H., & Kamil, M. L. (2007). Vocabulary assessment: What

we know and what we need to learn. Reading Research Quarterly, 42(2), 282-296. DOI:

10.1598/RRQ.42.2.4.

Read, J. (2000). Assessing vocabulary. Cambridge, UK: Cambridge University Press.

Webb, S. (2008). Receptive and productive vocabulary sizes of L2 learners. Studies in Second

Language Acquisition, 30, 79-95. DOI: 10.1017/S0272263108080042.


Appendix A

Vocabulary Quiz International Education

(This quiz will approximately take 30 minutes. Each item is worth 1 point)

Name:________________________________ Date: ________________

Part I Please choose the answer (a, b, c, or d) closest in meaning to the underlined word.

1) Jack thought the girl was very exuberant.

a) exciting b) energetic c) intelligent d) interesting

2) The fact that he is an honor roll student doesn’t warrant his arrogant nature.

a) create b) excel c) justify d) manage

3) Studying was a(n) integral part of Kate’s life as a graduate student.

a) necessary b) productive c) annoying d) frustrating

4) Anna felt she had sole responsibility with the group project.

Anna had: a) all the responsibility b) a lot of the responsibility c) little responsibility d) no responsibility

5) Luke’s parents thought he had blossomed during his senior year in college.

a) developed b) changed c) failed d) fought


Part II Please match each item with its corresponding definition. There will be more definitions than there are words, so choose carefully.

Vocabulary

1. Criteria _____
2. Derive _____
3. Dimension _____
4. Initiate _____
5. Integral _____
6. Orientation _____
7. Reside _____
8. Site _____
9. Sole _____
10. Unique _____

Definitions

a. To get something such as happiness or strength from someone or something
b. Only
c. To be a part of something bigger than yourself
d. A particular part of a situation
e. To arrange for something important to start
f. Necessary
g. Average or usual
h. To live in a place
i. Beliefs or interests that a person or group has
j. To rent out a space
k. Being the only one of its kind
l. Facts or standards used to help in deciding something
m. A place where something happened or where something is being built


Part III Choose the alternative (a, b, c, or d) which is closest in meaning to the word on the left of the page.

1. Unique a. special b. multiple c. usual d. general

2. Confidence a. self-assured b. self-doubt c. self-esteem d. self-distrust

3. Criteria a. benchmark b. testing c. possibility d. conjecture

4. Authorization a. breach b. warrant c. rejection d. dissent

5. Fragile a. durable b. tough c. firm d. brittle

Part IV Read this article about single-sex school in US. Complete it with the words and expressions from the box. There are more words than needed. Change the form to fit the gap. Copy the words into the gap. Each word should be used only once in the passage.

Single-sex Education

Advocates of single-sex education do not believe that "all girls learn one way and all boys learn another way." We cherish and celebrate the diversity among girls and among boys, but we also notice that each individual is 1_______. We understand that some boys would rather read a poem than play football. We understand that some girls would rather play football than play with Barbies. Educators who understand these differences have developed different ways to facilitate every child to learn to the best of her or his ability and help them to build their 2_________. Besides that, single-sex schools have students with different political 3________ (s), so it is the school’s job to lessen the reinforcement of gender stereotypes. However, because of the high cost, many single-sex schools are 4_________ for most American families. And high costs are not the 5______ challenge that single-sex education is facing…

…, the good news is that the gender-separate form can boost grades and test scores for both boys and girls. That is one of the reasons why people 6______ single-sex schools in the U.S. in the first place. In fact, some educators and parents recognize that all too often, girls or boys are still being 7__________ in coeducational settings. They believe that boys and girls would clearly 8_________ some benefit from living and studying in same sex groups. However, the opponents believe that single sex education reduces boys’ and girls’ opportunities to work together, and actually reinforces gender stereotypes. They also believe that the better educational outcome does not 9__________ in gender or gender separation. Therefore, the question is what 10_________ should we base single-sex school on?

criteria derive unique self-esteem shortchange

alma mater sole blossom orientation exuberant

out of reach reside fragile exuberant initiate


Appendix B

Vocabulary Quiz International Education

(This quiz will approximately take 30 minutes. Each item is worth 1 point)

Name:________________________________ Date: ________________

Part I Please choose the answer (a, b, c, or d) closest in meaning to the underlined word. 1) Jack thought the girl was very exuberant.

a) exciting b) energetic (key) c) intelligent d) interesting

2) The fact that he is an honor roll student doesn’t warrant his arrogant nature.

a) create b) excel c) justify (key) d) manage

3) Studying was a(n) integral part of Kate’s life as a graduate student.

a) necessary (key) b) productive c) annoying d) frustrating

4) Anna felt she had sole responsibility with the group project.

Anna had: a) all the responsibility (key) b) a lot of the responsibility c) little responsibility d) no responsibility

5) Luke’s parents thought he had blossomed during his senior year in college.

a) developed (key) b) changed c) failed d) fought


Part II Please match each item with its corresponding definition. There will be more definitions than there are words, so choose carefully.

Vocabulary

1. Criteria _____ (key: l)
2. Derive _____ (key: a)
3. Dimension _____ (key: d)
4. Initiate _____ (key: e)
5. Integral _____ (key: f)
6. Orientation _____ (key: i)
7. Reside _____ (key: h)
8. Site _____ (key: m)
9. Sole _____ (key: b)
10. Unique _____ (key: k)

Definitions

a. To get something such as happiness or strength from someone or something
b. Only
c. To be a part of something bigger than yourself
d. A particular part of a situation
e. To arrange for something important to start
f. Necessary
g. Average or usual
h. To live in a place
i. Beliefs or interests that a person or group has
j. To rent out a space
k. Being the only one of its kind
l. Facts or standards used to help in deciding something
m. A place where something happened or where something is being built


Part III Choose the alternative (a, b, c, or d) which is closest in meaning to the word on the left of the page.

1. Unique a. special b. multiple c. usual d. general (Key: A)

2. Confidence a. self-assured b. self-doubt c. self-esteem d. self-distrust (Key: C)

3. Criteria a. benchmark b. testing c. possibility d. conjecture (Key: A)

4. Authorization a. breach b. warrant c. rejection d. dissent (Key: B)

5. Fragile a. durable b. tough c. firm d. brittle (Key: D)

Part IV Read this article about single-sex school in US. Complete it with the words and expressions from the box. There are more words than needed. Change the form to fit the gap. Copy the words into the gap. Each word should be used only once in the passage.

Single-sex Education

Advocates of single-sex education do not believe that "all girls learn one way and all boys learn another way." We cherish and celebrate the diversity among girls and among boys, but we also notice that each individual is 1______. We understand that some boys would rather read a poem than play football. We understand that some girls would rather play football than play with Barbies. Educators who understand these differences have developed different ways to facilitate every child to learn to the best of her or his ability and help them to build their 2________. Besides that, single-sex schools have students with different political 3________ (s), so it is the school’s job to lessen the reinforcement of gender stereotypes. However, because of the high cost, many single-sex schools are 4_________ for most American families. And high costs are not the 5________ challenge that single-sex education is facing…

…, the good news is that the gender-separate form can boost grades and test scores for both boys and girls. That is one of the reasons why people 6________ single-sex schools in the U.S. in the first place. In fact, some educators and parents recognize that all too often, girls or boys are still being 7__________ in coeducational settings. They believe that boys and girls would clearly 8_________ some benefit from living and studying in same sex groups. However, the opponents believe that single sex education reduces boys’ and girls’ opportunities to work together, and actually reinforces gender stereotypes. They also believe that the better educational outcome does not 9__________ in gender or gender separation. Therefore, the question is what 10________ should we base single-sex school on?

criteria derive unique self-esteem shortchange

alma mater sole blossom orientation exuberant

out of reach reside fragile exuberant initiate

Keys: 1. unique 2. self-esteem 3. orientations 4. out of reach 5. sole 6. initiated 7. shortchanged 8. derive 9. reside 10. criteria