assessing speaking presentation 20412

ASSESSING SPEAKING

Emma Belfort

Introduction

This presentation focuses on the assessment of oral skills and is based on H. Douglas Brown’s treatment of the subject as detailed in his book, Language Assessment: Principles and Classroom Practices, published in 2004 by Pearson Longman.

First challenge: speaking vs. other skills

Listening and speaking almost always correlated

Only in very limited contexts (monologues, speeches, story-telling, and reading aloud) can oral language be assessed without the aural participation of an interlocutor.

Observations invariably tainted by other skills

Speaking is almost always colored by the accuracy and effectiveness of the test-taker’s listening or reading comprehension.

Second challenge: design and elicitation techniques

Most speaking is the product of creative construction of linguistic strings, in which the speaker makes choices of lexicon, structure, and discourse. As tasks become more open-ended, the freedom of choice given to test-takers creates a challenge in scoring procedures; therefore:

The stimulus used to elicit the target response for a particular category must be designed in a way that prevents test-takers from avoiding or paraphrasing, and thereby dodging, production of the target form.

In receptive performance: elicitation stimulus can be structured to anticipate predetermined responses and only those responses.

In productive performance: oral or written stimulus must be specific enough to elicit output within an expected range of performance such that scoring procedures apply appropriately.

Taxonomy for oral production

Imitative

Intensive

Responsive

Interactive

Extensive (monologue)

Micro- and macroskills of speaking

Microskills: smaller chunks of language such as phonemes, morphemes, words, collocations, and phrasal units.

Macroskills: larger elements such as fluency, discourse, function, style, cohesion, nonverbal communication, and strategic options.

Microskills of oral production

These skills comprise 11 different objectives for assessing speaking:

1. Differences among English phonemes and allophonic variants.

2. Chunks of language of different lengths.

3. English stress patterns, words in stressed and unstressed positions, rhythmic structure, and intonation contours.

4. Reduced forms of words and phrases.

5. Lexical units (words) to accomplish pragmatic purposes.

6. Fluent speech at different rates of delivery.

Microskills of oral production (ctd.)

7. Monitor one’s own oral production and use various strategic devices (pauses, fillers, self-corrections, backtracking) to enhance the clarity of the message.

8. Use grammatical word classes (nouns, verbs, etc.), systems (e.g., tense, agreement, pluralization), word order, patterns, rules, and elliptical forms.

9. Produce speech in natural constituents: in appropriate phrases, pause groups, breath groups, and sentence constituents.

10. Express a particular meaning in different grammatical forms.

11. Use cohesive devices in spoken discourse.

Macroskills of oral production

These skills comprise 5 different objectives for assessing speaking:

12. Communicative functions according to situations, participants, and goals.

13. Sociolinguistic features used in face-to-face conversations: styles, registers, implicature, redundancies, pragmatic conventions, conversation rules, floor-keeping and -yielding, and interrupting.

14. Links and connections between events, and such relations as focal and peripheral ideas, events and feelings, new information and given information, generalization and exemplification.

15. Facial features, kinesics, body language, and other nonverbal cues.

16. Speaking strategies: emphasizing key words, rephrasing, providing a context for interpreting the meaning of words, appealing for help, and accurately assessing how well the interlocutor is understanding the speaker.

Most common techniques and related tasks

As we review these techniques, three important issues must be considered when designing tasks:

No speaking task is capable of isolating the skill of oral production.

The designer must make sure the elicitation prompt achieves its aims as closely as possible.

Scoring procedures must be carefully specified so as to achieve as high a reliability index as possible.

Assessing Imitative speech

Task: to repeat the stimulus, which can be a pair of words, a sentence, or a question (to test for intonation production) with items focusing on a specific phonological criterion. A variation involves prompting test-takers with a brief written stimulus to read aloud.

Drawbacks: there is a potential negative washback effect; also, imitation should not occupy a dominant role in an overall oral production assessment.

Example: Word repetition task

Scoring scale:

2 Acceptable pronunciation

1 Comprehensible, partially correct pronunciation

0 Silence, seriously incorrect pronunciation
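
The 0-2 scale above can be applied item by item and then totaled. The following is only a minimal sketch of how a rater's judgments might be tallied. The function name and the sample ratings are hypothetical, and summing across items is an assumption made for illustration: the scale itself only defines the score for a single repeated item.

```python
# Scale descriptors taken from the word repetition example above.
SCALE = {
    2: "Acceptable pronunciation",
    1: "Comprehensible, partially correct pronunciation",
    0: "Silence, seriously incorrect pronunciation",
}


def score_repetition_task(item_scores):
    """Total a rater's 0-2 judgments across the items of one repetition task.

    Returns (points earned, points possible). Summing the items is an
    assumption for illustration; the scale only defines per-item scores.
    """
    for score in item_scores:
        if score not in SCALE:
            raise ValueError(f"score {score!r} is not on the 0-2 scale")
    return sum(item_scores), 2 * len(item_scores)


# Hypothetical ratings for a five-item task.
earned, possible = score_repetition_task([2, 1, 2, 0, 2])
print(f"{earned}/{possible}")  # prints 7/10
```

Rejecting any value outside the scale keeps the rater's record consistent, which supports the reliability concern raised earlier.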

Assessing intensive speaking

Test-takers are prompted to produce short stretches of discourse (no more than a sentence) through which they demonstrate linguistic ability at a specified level of language. Many tasks are “cued” in that they lead the test-taker into a narrow band of possibilities. Some of the techniques include:

Directed response tasks

Read-aloud tasks

Sentence/dialogue completion tasks and oral questionnaires

Picture-cued tasks

Translation (of limited stretches of discourse)

Assessing intensive speaking:

Directed response tasks

These are mechanical, non-communicative tasks; but they do require minimal processing of meaning in order to produce the correct grammatical output.

Example:

Assessing intensive speaking:

Read-aloud tasks

These tasks include reading beyond the sentence level up to a paragraph or two.

Advantages: predictable output, practicality, reliability in scoring

Disadvantages: somewhat inauthentic, in that we seldom do this in real life; also, this skill calls for certain specialized oral abilities which may not be reliable indicators of the test-taker’s pragmatic ability to communicate orally in face-to-face contexts.

Example: Read-aloud task

Prator’s (1972) Manual of American English Pronunciation’s diagnostic passage: test-takers read aloud into a recorder; scoring was based on a number of phonological factors (vowels, diphthongs, consonants, consonant clusters, stress, and intonation), with a two-page diagnostic checklist on which all errors and questionable items were noted and a four-point scale for pronunciation and for fluency.

Test of Spoken English scoring scale (1987)


Sentence/dialogue completion tasks

These include reading a dialogue in which one speaker’s lines have been omitted.

Other examples are form filling (Underhill 1987) or oral questionnaires. While individual variations in responses are accepted, this technique taps into a learner’s ability to discern expectancies in a conversation and to produce sociolinguistically correct language.

It could be contended that performance on these items is responsive rather than intensive, but notice there is a degree of control which predisposes the test-taker to respond with certain expected forms. In any event, according to Brown (2004), this argument underscores the fine lines of distinction between the five categories for assessing spoken language.

Advantages: moderate control of output; the written format allows a little more time for the test-taker to anticipate the answer and removes the potential ambiguity created by aural misunderstanding.

Disadvantages: the contrived, inauthentic nature of this task and the fact that it relies on literacy and an ability to transfer easily from written to spoken English.

Example: Dialogue completion task

Example: Sentence completion task


Picture-cued tasks

Designed to elicit a word or phrase. Pictures may be very simple, somewhat more elaborate and “busy”, or composed of a series that tells a story or incident.

Other cues: Maps to give instructions, directions, and specify locations.

Other techniques: pairing two test-takers supplied with identical sets of numbered pictures. One test-taker is cued to describe one of the pictures in as few words as possible for the other test-taker to identify.

Advantages: help to unlock the almost ubiquitous link between listening and speaking performance and remove the potential ambiguity created by aural misunderstanding.

Disadvantages: the inauthentic nature of this task and the fact that it relies on literacy and an ability to transfer easily from written instructions to spoken English.

Also, although this technique is quite versatile, it can be heavily dependent on very clear written instructions.

Scoring: may be problematic depending on the expected performance


Picture-cued tasks (ctd.)

Picture-based tasks are a very popular way to elicit oral language performance and can be used not only for intensive production but also for extensive output. When scoring multiple factors, recordings of the test-takers’ productions are very useful to the grader.

Types of language that can be elicited using pictures:

minimal pairs

comparatives

verb tenses

nouns, negative responses, numbers, and location

giving directions and instructions

elaborate responses and descriptions

Example: Picture-cued elicitation task

Minimal pairs and comparatives

Intuitive: do not rely on written instructions

Example: Picture-cued elicitation task

These pictures need clear written instructions as they could be misleading and confusing without them.

Brown & Sahni, 1994

Example: Picture-cued elicitation task

Brown & Sahni, 1994


Translation (of limited stretches of discourse)

According to Brown, translation methods are certainly passé in today’s communicative classroom; but he concedes that in countries (such as Venezuela) where English is still not a prevailing language, translation serves as a meaningful communicative device for the English learner.

In this technique, test-takers are given a native-language word, phrase, or sentence and are asked to translate it into its English equivalent.

Advantages: control of the output, which of course means that scoring is more easily accomplished.

Assessing responsive speaking:

Differs from intensive tasks in the increased creativity given to the test-taker and from interactive tasks by the somewhat limited length of utterances.

Involves brief interactions with an interlocutor

Some of the techniques commonly used include:

Question and Answer

Giving instructions and directions

Paraphrasing


Questions and Answers

These tasks can consist of simple and complex questions from an interviewer, or they can make up a portion of a whole battery of questions and prompts in an oral interview. There are two types of questions:

Display questions: the first type is intensive in its purpose (as we have seen previously, these questions are designed to elicit a predetermined correct response).

Referential questions: through these questions the test-taker is given the opportunity to produce meaningful language in response. In designing referential questions it is important to keep in mind why the question is being asked: is it to elicit a string of language output, or is it to gain a sense of the test-taker’s discourse competence?

Oral interaction with a test administrator often involves the latter asking all the questions. A variation on this format is to elicit questions from the test-taker.

One technique involves more than one test-taker with an interviewer. With two students in an interview context, both test-takers can ask questions of each other. This technique might meet practicality requirements but it might be troublesome to score.

Example: Question and answer task

Elicitation of questions from the test-taker

Questions eliciting open-ended responses


Giving instructions and directions

This technique is simple: the administrator poses the problem and the test-taker responds.

The task should require the test-taker to produce at least five or six sentences.

Topics need to be familiar (not beyond the content schemata of the test-taker) so that an impromptu delivery is attainable. This avoids having to supply the problem in advance, which in turn ensures that the test-taker does not parrot back a memorized set of sentences.

Advantages: Using this type of stimulus provides an opportunity for the test-taker to engage in a relatively extended stretch of discourse, to be very clear and specific, and to use appropriate discourse markers and connectors.

Scoring: based primarily on comprehensibility and secondarily on other specified grammatical or discourse categories.

Example: Giving instructions and directions task

These tasks can be designed to be simple or complex, with complex versions potentially falling into the category of extensive speaking. Objectives must be clearly set: if the purpose is to elicit a short and simple response, directives must be clear so as not to take the test-taker down a path of complexity for which she or he is not prepared.


Paraphrasing

These tasks require the test-taker to read or hear a limited number of sentences and produce a paraphrased version of the discourse.

It is important to pinpoint the objective of the task clearly. In these tasks the integration of listening and speaking is probably more at stake than simple oral production alone.

Advantages: elicit short stretches of output and perhaps tap into test-takers’ ability to practice the conversational art of conciseness by reducing the output/input ratio.

Some of the contexts that may be assessed include: Describing, comparing and contrasting, narrating, summarizing, giving an opinion, supporting an opinion, hypothesizing, defining, functioning “interactively”.

Assessing interactive speech:

These tasks include long stretches of interactive discourse. The difference between these types of oral production assessment and responsive speech is the length and complexity of the expected output. Interaction can take two forms:

Transactional language: to exchange specific information

Interpersonal exchanges: social exchanges and relationships

Some of the techniques commonly used include:

Interviews

Role plays

Discussions

Games


Interview

This technique involves a test administrator and a test-taker sitting down in a direct face-to-face exchange and proceeding through a protocol of questions and directives. Interviews can vary in length, depending on their purpose:

Placement interviews: designed to get a quick spoken sample from a student in order to verify placement into a course; these may need only five minutes if the interviewer is trained to evaluate the output accurately.

Comprehensive interviews: designed to cover predetermined oral production contexts and may require the better part of an hour.

A variation is to place two test-takers in one interview. The advantages of this technique are the opportunity for student-student interaction, which increases authenticity, and the practicality of scheduling twice as many candidates. The disadvantages are equalizing the output between two test-takers, discerning the interaction effect in case of unequal comprehension and production ability, and scoring two people simultaneously.

Scoring: based on a set of parameters which may include accuracy in pronunciation, grammar, vocabulary usage, fluency, sociolinguistic/pragmatic appropriateness, task accomplishment, and even comprehension. Scoring can be facilitated by recording the interview.


Interview (ctd.)

Disadvantages: the interview is open-ended and involves a significant level of interaction, so the interviewer is forced to make judgments that are susceptible to unreliability. Accuracy in scoring can be improved with careful attention to the linguistic criteria being assessed, as well as through experience and training that help administrators develop sound judgment.

The success of an oral interview will depend on:

clearly specifying administrative procedures of the assessment (practicality)

focusing the questions and probes on the purpose of the assessment (validity)

appropriately eliciting an optimal amount and quality of oral production from the test-taker (biased for best performance)

creating a consistent, workable scoring system (reliability)


Interview (ctd.)

According to Brown, every effective interview contains a number of mandatory stages.

Michael Canale (1984) proposed a framework which has withstood the test of time.

Canale suggested that test-takers will perform at their best if they are led through four stages: warm-up, level check, probe, and wind-down.

Example: Interviews


Role play

This is a popular pedagogical activity in communicative language teaching classes.

Advantages: it can be controlled or “guided” by the interviewer while freeing students to use discourse that might otherwise be difficult to elicit, allowing test-takers to go beyond simple intensive and responsive levels to a level of creativity and complexity that approaches real-world pragmatics.

Scoring: presents the same complications as any task that elicits somewhat unpredictable responses from test-takers.


Discussions and conversations

Difficult to specify and even more difficult to score.

Advantages: as informal techniques to assess learners, they offer a level of authenticity and spontaneity that other assessment techniques may not provide.

Discussion is an integrative task, so it is advisable to give some cognizance to comprehension performance in evaluating learners.

Scoring: checklists should be carefully designed to suit the objectives of the observed discussion


Discussions and conversations (ctd.)

Discussions may be especially appropriate tasks through which to elicit and observe such abilities as:


Games

Among informal assessment devices are a variety of games that involve language production. Some examples include:

“Tinkertoy” game

Crossword puzzles

Information gap grids

City maps

Advantages: as informal techniques to assess learners, they offer a level of authenticity and spontaneity that other assessment techniques may not provide.

Scoring: checklists should be carefully designed to suit the objectives of the observed discussion

Example: Games

Assessing extensive speech:

These tasks involve complex, relatively lengthy stretches of discourse. They are frequently variations on monologues, usually with verbal interaction from listeners or an interlocutor being either highly limited or ruled out altogether.

Some of the most commonly used techniques include:

Speeches and oral presentations

Picture-cued story-telling

Retelling a story or news event

Translation (of extended prose)


Oral Presentations

These tasks consist of having the test-taker present a report, a paper, a marketing plan, a sales idea, a design for a new product, or a method.

Scoring: checklists and grids are common means of scoring these tasks. Scoring is the key assessment challenge for oral presentations, so the rules for effective assessment must be invoked:

Specify the criterion clearly

Set appropriate tasks

Carefully elicit optimal output

Establish practical, reliable scoring procedures


Oral Presentations (ctd.)


Picture-cued story-telling

These tasks are similar to those we reviewed for assessing intensive production. The object is to elicit oral production through visual cues. Some of the stimuli used include:

Pictures

Photographs

Diagrams

Charts

Series of pictures for longer descriptions

Scoring: criteria need to be clear about what is being assessed. For example, it is insufficient to specify the objective as aiming to elicit narrative discourse. This must be further clarified by deciding whether the assessment is evaluating oral vocabulary, time relatives, sentence connectors, past tense of irregular verbs, etc.


For extensive production

Possible questions:

1. Who is eating?

2. Who is drinking?

3. Who is talking?

4. What is she doing?

In applying questions it is important to know the purpose of each question.

The purpose of the first three questions is to cue the test-taker toward inferring what the woman next to the table could be doing.

Brown & Sahni, 1994


For extensive production

This task elicits more open-ended performance, whereby test-takers have to elaborate with their own opinion, describe preferences, and accomplish a persuasive function. These tasks must have clearly defined goals and a scoring rubric.

Rubrics could include:

Grammar

Vocabulary

Comprehension

Fluency

Pronunciation

Task accomplishment (persuasive?)

Brown & Sahni, 1994


Retelling a story or news event

In these tasks test-takers hear or read a story or news event that they are asked to retell. The difference from paraphrasing lies in the longer stretches of discourse and the different genre.

Scoring: this is the most significant challenge, as with all extensive production assessments; scoring should therefore be designed to meet a clear set of criteria.

Some commonly used criteria include communicating sequences and relationships of events, stress and emphasis patterns, “expression” in the case of a dramatic story, fluency, and interaction with the hearer.


Translation (of extended prose)

Longer texts are presented for the test-taker to read in the native language and then translate into English. Some examples of texts include:

Dialogues

Directions for assembly of a product

A synopsis of a story or play or movie

Directions on how to find something on a map

Advantages: control of the content, the vocabulary, and, to some extent, the grammatical and discourse features.

Disadvantage: as we know, translation of text is a highly specialized skill for which some individuals obtain advanced degrees.

Scoring: criteria should therefore take into account not only the purpose in eliciting a translation but the possibility of errors that are unrelated to oral production ability.

Final comments

Oral Proficiency scoring categories (Brown 2001)

Phonepass ® (imitative and intensive) vs.

TSE (responsive and interactive) vs.

OPI (oral interview)

References

Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. Longman.
