
An Investigation of the Selective Deletion Cloze Test as a Valid Measure of Grammar-Based Proficiency in Second Language Learning

Gregory S. Hadley and John E. Naaykens

Niigata University Research in Linguistics and Culture, Vol. 3, December 1997

Introduction

Few issues in the field of second language research have been as contentious as cloze testing. Over the years, opinions in the TEFL academic community have been divided over the applicability of cloze tests for the second language classroom. Some contend that cloze tests measure a language learner's overall communicative ability in the target language (Hanania and Shikhani 1986). Others maintain that cloze tests assess only the most basic aspects of second language learning and reading comprehension (Shanahan, Kamil and Tobin 1982). Still others support a moderate position. Ikeguchi (1995), who quotes Bachman (1990:86-89), states that cloze testing:

hold[s] potential for measuring aspects of students' written grammatical competence, "knowledge of vocabulary, morphology, syntax, and phonology," as well as textual competence, "knowledge of cohesive and rhetorical properties of text" in second language (p. 167).

Some years earlier, Bachman (1982:61-70) reported that certain types of cloze tests, such as the selective deletion cloze, can be used to investigate a subject's knowledge of written discourse items such as context cohesion, syntax and strategic textual comprehension. Alderson (1979) adds that cloze testing correlates more closely with grammar tests than with reading tests, and according to Bowen et al. (1985:376), the selective deletion cloze is ideal for testing vocabulary and grammar. Claims such as these should prompt us to find out for ourselves whether cloze tests, such as the selective deletion cloze, can measure a subject's knowledge of grammar. Would students with higher scores on a selective deletion cloze test also score higher on a criterion-referenced examination designed to measure grammatical competency?


Purpose

We will consider this question as we review a 1996 study conducted at Niigata University. The purpose of this study was to investigate whether the selective deletion cloze correlates highly with traditional, grammar-based tests. Many language teachers in the national university system opt for criterion-referenced tests (C-RTs) which attempt to measure grammatical knowledge (Garland 1996). Putting aside the issue of whether language teachers should focus primarily on grammatical proficiency, a selective deletion cloze test, if proven to be a valid measure of grammatical competency, might provide a time-saving method of examination which is both fair to students and easier for teachers to grade. Before looking at the findings of this study, however, we provide a brief history for those new to cloze testing.

Cloze Testing: An Overview

Cloze testing was first introduced by W.L. Taylor (1953), who developed it as a reading test for native speakers. He derived the term "cloze" from a gestalt concept which teaches that an individual will be able to complete a task only after its pattern has been discerned:

A cloze unit may be defined as: any single occurrence of a successful attempt to reproduce accurately a part deleted from a 'message' (any language product), by deciding from the context that remains, what the missing part should be (p. 416).

Cloze tests consist of a text (usually two or three paragraphs) which has had words or parts of words deleted from it. Test subjects must draw from their knowledge of the language in order to write appropriate words in the blanks (see Table One).

Ours was the marsh (1)____, down by the river, (2)____, as the river wound, (3)____ miles of the sea. (4)____ first most vivid and (5)____ impression of the identity (6)____ things, seems to me (7)____ have been gained on (8)____ memorable raw afternoon towards (9)____. At such a time (10)____ found out for certain, (11)____ this bleak place overgrown (12)____ nettles was the churchyard; (13)____ that Philip Pirrip, late (14)____ this parish, and the (15)____ wife of the above, (16)____ dead and buried.

These are the words to choose from:

I   were   that   My   to   within   with   a   of   broad   twenty   and   Georgiana   of   evening   country

Table 1: Example of a Fixed-Rate Cloze Test.
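As a minimal sketch of how a fixed-rate cloze might be produced mechanically, the Python function below leaves an opening stretch of text intact and then replaces every nth word with a numbered blank. The function name `make_fixed_rate_cloze`, its parameters, and the choice to measure the intact opening in words rather than sentences are illustrative assumptions, not part of the study.

```python
def make_fixed_rate_cloze(text, n=5, lead_in=10):
    """Blank out every n-th word after an intact lead-in of `lead_in` words.

    Hypothetical helper for illustration only.
    Returns a tuple: (cloze_text, answer_key).
    """
    words = text.split()
    cloze_words = []
    answers = []
    for index, word in enumerate(words):
        position = index - lead_in  # word position counted after the lead-in
        # Separate trailing punctuation so it stays visible outside the blank.
        core = word.rstrip(".,;:!?\"'")
        trailing = word[len(core):]
        if position >= 0 and (position + 1) % n == 0 and core:
            answers.append(core)
            cloze_words.append(f"({len(answers)})____{trailing}")
        else:
            cloze_words.append(word)
    return " ".join(cloze_words), answers


if __name__ == "__main__":
    sample = "one two three four five six seven eight nine ten"
    cloze, key = make_fixed_rate_cloze(sample, n=3, lead_in=2)
    print(cloze)  # one two three four (1)____ six seven (2)____ nine ten
    print(key)    # ['five', 'eight']
```

A selective deletion cloze differs only in how the blanks are chosen: the tester picks the target words by hand instead of counting.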

There are at least five main types of cloze tests available to language teachers: the fixed-rate deletion, the selective deletion (also known as the rational cloze), the multiple-choice cloze, the cloze elide and the C-test (Ikeguchi 1995; Weir 1990; Klein-Braley and Raatz 1984).

In the fixed-rate deletion, after one or two sentences, every nth word is deleted. Usually every fifth or seventh word is deleted, but Brown (1983) suggests that longer texts with every eleventh or fifteenth word deleted can be used with subjects who have a lower level of language proficiency. Multiple-choice cloze tests provide the subjects with several possible items to choose from for each blank. The cloze elide inserts words which do not belong in the text, and requires the subjects to identify the incorrect words and write appropriate items in their place. The C-test deletes only part of every second word in a text, and asks subjects to complete each truncated word. In the selective deletion or rational cloze, the tester chooses which items he or she wishes to delete from the text. The goal for teachers using this test is not only to fine-tune the level of difficulty of the text, but also to measure the knowledge of specific grammatical points and vocabulary items. Let us now consider whether the selective deletion cloze truly is a reliable measure of grammatical knowledge.

Subjects

One group (see Table Two) from Niigata University was selected for this study. As Table Two shows, all were native Japanese speakers, and most were first-year Science majors. No special criteria were used in selecting or excluding the subjects, nor was the group tested on its English proficiency level before entering the course. However, classroom experience with the subjects led us to believe that most group members had limited speaking, listening and writing skills, typical of a Japanese university first-year EFL class (cf. Wadden 1993).

Table 2

English 1B, Niigata University, 1996-1997
Language                   Japanese
Age                        18 (82%), 19 (18%)
Sex                        Male (55%), Female (45%)
Department                 Science (91%), Education (9%)
Skill Level                False Beginners
Total Number of Subjects   22


Materials

Interchange Two (Richards et al. 1993) was used as the primary text. The selective deletion cloze was created from one of the general interest reading texts in the first chapter of the course book (Richards et al. 1993:7, see Table Three). While the subjects had read the text several months earlier, we were fairly certain that very few, if any, of the students had read the text again since that time. The cloze test consisted of a 133-word passage with 25 blanks, meaning that roughly 19% of the total text was deleted. Test-retest reliability was checked on two separate occasions for this particular cloze. The resulting reliability coefficients were moderate and statistically significant, with a probability of less than one percent that the results were due to chance (r_xx = +.56 and +.60, p < .01).
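As a quick check of the deletion rate reported above, the proportion of deleted words follows directly from the two counts given for the passage:

```latex
\[
\text{deletion rate} = \frac{\text{blanks}}{\text{words in passage}} = \frac{25}{133} \approx 0.188 \approx 19\%
\]
```

This works out to roughly one blank for every five words of running text.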

There are many things people remember about the sixties. Some people remember it for mini-skirts, the Beatles, hippies and the flower children. It was a time when young people "owned" the world and thought that anything was possible. In art, fashion, and music, the big names were often in their early twenties, and some of them were already millionaires! The sixties was a time when young people used to do whatever things they wanted. "Don't trust anyone over 30!" they said. In the arts, people like Andy Warhol created "pop art." And fashions changed, too. The mini-skirt became popular, and then the "unisex" look followed. Young people started wearing blue jeans everywhere -- to school, fancy restaurants, and concerts. Many of them had very long hair and wore lots of rings, beads and bracelets.

Table 3: Text Selected for Use in this Study. Adapted from Richards et al. (1993:7).

Procedure

The cloze test (see Figure One) was administered to the subjects twice, two weeks apart. During the second administration, a grammar-based test created by the textbook designers was also given to the subjects (Richards et al. 1993:168-172). The instructions were given to the students verbally and in written form, in both English and Japanese, to facilitate a clear understanding of the task. On each occasion, the cloze tests were collected after 20 minutes. One significant difference between the two administrations, however, was that the first test was given during a regular class session, while the second was given during the midterm test. While this is certainly not standard practice when studying the validity of a test design, the procedure provided an opportunity to find out how the cloze test would function under a variety of classroom conditions.


Student Name: ____________    Student Number: ____________

Reading and Vocabulary

Instructions: Fill out the blanks below with the correct words.

There are many things people remember about the sixties. Some people ______ it for mini-skirts, ______ Beatles, hippies ______ the flower children. It ______ a time ______ young people "owned" the ______ and thought that anything was ______. In art, fashion, and music, the big names ______ often ______ their early twenties, and some ______ them ______ already millionaires! The ______ was ______ time when young people ______ to do whatever ______ they wanted. "Don't ______ anyone over 30!" they said. In ______ arts, people like Andy Warhol created "pop ______." And fashions changed, too. ______ mini-skirt ______ popular, ______ then ______ "unisex" look followed. Young people started ______ blue ______ everywhere -- to school, fancy ______, and concerts. Many of them had very long hair and wore lots of rings, beads and bracelets.

Figure 1: Selective Deletion Cloze Test Designed for this Study.

Analysis

The tests were graded by two scorers. The classroom teacher graded the grammar-based tests using the key provided in the teacher's manual (Richards et al. 1993:189-190), while a native English-speaking TEFL lecturer graded the cloze tests using the Semantically Acceptable Word (SEMAC) method. Typically, cloze tests can be graded using either the exact word or the SEMAC scoring method. In the exact word method, each blank must be completed with the exact word that appeared in the original text. Correct answers receive 1 point, while any other response receives no points. SEMAC scoring allows subjects to write answers which are grammatically and lexically appropriate, even if they are not the original words deleted from the text. For the purposes of this experiment, it did not matter whether the exact word method or the SEMAC method was used, since the two correlate highly with each other (cf. Owen et al. 1996; Hadley and Naaykens, in press). However, SEMAC scoring may require a subjective judgement by the scorer. To prevent the cloze test scores from being influenced by personal knowledge of the subjects, an evaluator unacquainted with the subjects was chosen. Before grading the tests, the blind evaluator was given a manuscript of the complete text and instructed to allow any words in the cloze that were either synonymous with the originals or lexically or grammatically correct. Mistakes in historical accuracy and minor spelling errors were ignored. If it was difficult to ascertain whether an answer was acceptable, it was scored as incorrect.
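The difference between the two scoring methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the case-insensitive comparison, and the `acceptable` mapping (standing in for the human scorer's judgement of which alternatives are allowed) are hypothetical and are not taken from the study or from any scoring software.

```python
def score_exact(responses, answer_key):
    """Exact word method: 1 point only when the response is the original word."""
    return sum(
        1
        for given, original in zip(responses, answer_key)
        if given.strip().lower() == original.lower()
    )


def score_semac(responses, answer_key, acceptable):
    """SEMAC method: 1 point for the original word or an approved alternative.

    `acceptable` maps a blank's index (starting at 0) to the set of
    alternative words a human scorer has judged acceptable for that blank.
    Anything not explicitly approved is scored as incorrect.
    """
    total = 0
    for index, (given, original) in enumerate(zip(responses, answer_key)):
        allowed = {original.lower()}
        allowed |= {word.lower() for word in acceptable.get(index, set())}
        if given.strip().lower() in allowed:
            total += 1
    return total


if __name__ == "__main__":
    # Hypothetical key and responses for three blanks.
    key = ["remember", "the", "and"]
    answers = ["recall", "the", "or"]
    approved = {0: {"recall"}}  # the scorer accepts "recall" for blank 0
    print(score_exact(answers, key))            # 1
    print(score_semac(answers, key, approved))  # 2
```

Because only explicitly approved alternatives earn a point, doubtful answers default to incorrect, which mirrors the rule applied by the blind evaluator.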

After the scores were totaled, all of the data were analyzed using the VAR Grade for Windows 2.0 software package (Revie 1997). The analysis was set up as a directional, one-tailed test using the Pearson r correlation coefficient. The cloze test scores were correlated with the scores of the grammar-based test, resulting in a correlation coefficient of +.72 (see Figure Two).
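For readers who wish to reproduce this kind of analysis without a dedicated package, the Pearson r coefficient can be computed directly from two lists of paired scores. The sketch below uses only the Python standard library. The function name `pearson_r` and the sample scores are hypothetical, and the code is not a reconstruction of how VAR Grade performs the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two paired score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return cross / math.sqrt(ss_x * ss_y)


if __name__ == "__main__":
    # Hypothetical scores for five students (not the study's data).
    cloze_scores = [12, 15, 18, 20, 22]
    grammar_scores = [34, 38, 37, 44, 47]
    print(round(pearson_r(cloze_scores, grammar_scores), 2))
```

The result ranges from -1 to +1, and a positive value indicates that students who score higher on one test tend to score higher on the other.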

Figure 2: Correlated Scores of Grammar-Based Test (horizontal axis) and Selective Deletion Cloze (vertical axis). n = 22, r = .72.

According to Brown (1993:132-141), at p < .005, the critical value of r for a group of 22 is approximately +.51 (see also Fisher and Yates 1963). Since the observed correlation of +.72 exceeds this value, the correlation between the grammar-based test and the selective deletion cloze appears to be statistically significant.

Implications for Language Teachers

It would be foolhardy for language teachers to completely change their testing practices simply on the basis of this one study. However, the findings of this research suggest that selective deletion cloze tests could be used in place of, or alongside, grammar-based language tests. If careful consideration is given to the design of the selective deletion cloze, it has a high potential for reliability, even under less than desirable testing conditions. It may be even more reliable than tests to which our learners are frequently exposed: tests which have been thrown together late at night by language teachers under the pressure of several deadlines. Conservative use of the selective deletion cloze could provide teachers with a time-saving method of testing their learners. Learners could be assured that, despite the brevity of the test, their level of grammatical competence in the target language is being, to a certain degree, reliably measured. Both teachers and learners might then be liberated from the unnecessary amount of time normally spent on testing, and more time could be dedicated to studying the target language.

Conclusion

It is hoped that language teachers will begin experimenting with cloze testing as a viable alternative to the traditional tests which are normally administered in university language classrooms. Even if some are uncertain about the reliability and validity of the selective deletion cloze for use as a C-RT, it could still be used as a quick measure to see if the learners are making progress in the course.

This study opens avenues for future research. For example, to what extent would a selective deletion cloze correlate with a test measuring oral proficiency, or with a listening proficiency test? If such scores did consistently correlate highly, would this suggest that cloze tests can measure more than just grammatical competence in second language learning? These are just a few of the many questions which deserve further investigation as we continue our search for innovative and effective methods of second language testing.

References

Alderson, J.C. (1979). "The cloze procedure and proficiency in English as a second language." TESOL Quarterly, 13, 219-226.

Bachman, L. (1990). Fundamental Considerations in Language Testing. Oxford: Oxford University Press.

Bachman, L. (1982). "The trait structure of cloze test scores." TESOL Quarterly, 16, 61-70.

Bowen, J.D., Madsen, H., and Hilferty, A. (1985). TESOL: Techniques and Procedures. Rowley, MA: Newbury House Publishers.

Brown, J.D. (1983). "A closer look at the cloze: Validity and reliability." In J.W. Oller, Jr. (Ed.), Issues in Language Testing Research (p. 237-250). Rowley, MA: Newbury House.

Brown, J.D. (1993). Understanding Research in Second Language Learning. New York: Cambridge University Press.

Brown, J.D. and Yamashita, S. (Eds.) (1995). Language Testing in Japan. Tokyo: The Japan Association for Language Teaching.

Fisher, R.A. and Yates, F. (1963). Statistical Tables for Biological, Agricultural and Medical Research. London: Longman.

Garland, V. (1996). "Teaching techniques and learning styles in Japanese universities." Journal of Cross-Cultural Studies, 6, 73-96.

Hadley, G. and Naaykens, J. (In Press). "Testing the Test: Comparing SEMAC and Exact Word Scoring on the Selective Deletion Cloze." Korea TESOL Journal.

Hanania, E. and Shikhani, M. (1986). "Interrelationships among three tests of language proficiency: Standardized ESL, cloze and writing." TESOL Quarterly, 20, 97-109.

Ikeguchi, C. (1995). "Cloze testing options for the classroom." In J.D. Brown and S. Yamashita (Eds.), Language Testing in Japan (p. 166-178). Tokyo: The Japan Association for Language Teaching.

Klein-Braley, C. and Raatz, U. (1984). "A survey of research on the C-test." Language Testing, 1, 134-146.

Oller, J.W., Jr. (Ed.) (1983). Issues in Language Testing Research. Rowley, MA: Newbury House.

Owen, C., Reeves, J. and Widener, S. (1996). Testing. Birmingham, UK: University of Birmingham.

Revie, D. (1997). VAR Grade for Windows 2.0: Grading Tools for Teachers. Thousand Oaks, CA: VARed Software.

Richards, J., Hull, J., and Proctor, S. (1993). Interchange 2: English for International Communication. New York: Cambridge University Press.

Richards, J., Hull, J., and Proctor, S. (1993). Interchange 2: English for International Communication: Teacher's Manual. New York: Cambridge University Press.

Shanahan, T., Kamil, M.L., and Tobin, A. (1982). "Cloze as a measure of intersentential comprehension." Reading Research Quarterly, 17, 229-255.

Taylor, W.L. (1953). "Cloze procedure: A new tool for measuring readability." Journalism Quarterly, 30, 415-433.

Wadden, P. (Ed.) (1993). A Handbook for Teaching English at Japanese Colleges and Universities. New York: Oxford University Press.

Weir, C. (1990). Communicative Language Testing. Hemel Hempstead: Prentice Hall International Ltd.