ASSESSING SPEAKING
Emma Belfort
Introduction
This presentation focuses on the assessment of oral skills and is based on H. Douglas Brown’s treatment of the subject as detailed in his book, Language Assessment: Principles and Classroom Practices, published in 2004 by Pearson Longman.
First challenge: speaking vs. other skills
Listening and speaking are almost always correlated
Only in very limited contexts (monologues, speeches, story-telling, and reading aloud) can oral language be assessed without the aural participation of an interlocutor.
Observations are invariably tainted by other skills
Speaking is almost always colored by the accuracy and effectiveness of the test-taker’s listening or reading comprehension.
Second challenge: design and elicitation techniques
Most speaking is the product of creative construction of linguistic strings, in which the speaker makes choices of lexicon, structure, and discourse. As tasks become more open-ended, the freedom of choice given to test-takers creates a challenge in scoring procedures; therefore:
The stimulus used to elicit the target response for a particular category must be designed in a way that prevents test-takers from avoiding or paraphrasing and thereby dodging production of the target form.
In receptive performance: elicitation stimulus can be structured to anticipate predetermined responses and only those responses.
In productive performance: the oral or written stimulus must be specific enough to elicit output within an expected range of performance such that scoring procedures apply appropriately.
Taxonomy for oral production
Imitative
Intensive
Responsive
Interactive
Extensive (monologue)
Micro- and macroskills of speaking
Microskills: smaller chunks of language such as phonemes, morphemes, words, collocations, and phrasal units.
Macroskills: larger elements such as fluency, discourse, function, style, cohesion, nonverbal communication, and strategic options.
Microskills of oral production
These skills total 11 different objectives for assessing speaking
1. Differences among English phonemes and allophonic variants
2. Chunks of language of different lengths
3. English stress patterns, words in stressed and unstressed positions, rhythmic structure, and intonation contours
4. Reduced forms of words and phrases
5. Lexical units (words) to accomplish pragmatic purposes
6. Fluent speech at different rates of delivery
Microskills of oral production (ctd.)
7. Monitor one’s own oral production and use various strategic devices (pauses, fillers, self-corrections, backtracking) to enhance the clarity of the message.
8. Use grammatical word classes (nouns, verbs, etc.), systems (e.g., tense, agreement, pluralization), word order, patterns, rules, and elliptical forms.
9. Produce speech in natural constituents: in appropriate phrases, pause groups, breath groups, and sentence constituents.
10. Express a particular meaning in different grammatical forms.
11. Use cohesive devices in spoken discourse.
Macroskills of oral production
These skills total 5 different objectives to assess speaking
12. Communicative functions according to situations, participants, and goals.
13. Sociolinguistic features used in face-to-face conversations: styles, registers, implicature, redundancies, pragmatic conventions, conversation rules, floor-keeping and -yielding, and interrupting.
14. Links between events and such communicative relations as focal and peripheral ideas, events and feelings, new information and given information, generalization and exemplification.
15. Facial features, kinesics, body language, and other nonverbal cues.
16. Speaking strategies: emphasizing key words, rephrasing, providing a context for interpreting the meaning of words, appealing for help, and accurately assessing how well the interlocutor is understanding the speaker.
Most common techniques and related tasks
As we review these techniques, three important issues must be considered when designing tasks:
No speaking task is capable of isolating the skill of oral production.
The designer must make sure the elicitation prompt achieves its aims as closely as possible.
Scoring procedures must be carefully specified so as to achieve as high a reliability index as possible.
Assessing imitative speech
Task: to repeat the stimulus, which can be a pair of words, a sentence, or a question (to test for intonation production) with items focusing on a specific phonological criterion. A variation involves prompting test-takers with a brief written stimulus to read aloud.
Drawbacks: there is a potential negative washback effect, and this type of task should not occupy a dominant role in an overall oral production assessment.
Example: Word repetition task
Scoring scale:
2 Acceptable pronunciation
1 Comprehensible, partially correct pronunciation
0 Silence, seriously incorrect pronunciation
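As an illustration only, the 0 to 2 scale for the word repetition task could be tallied across a set of items as in the minimal Python sketch below. The stimulus words, the ratings, and the function name `total_repetition_score` are hypothetical and do not come from Brown.

```python
# Hypothetical sketch: tally a word repetition task rated on the 0-2 scale.
SCALE = {
    2: "Acceptable pronunciation",
    1: "Comprehensible, partially correct pronunciation",
    0: "Silence, seriously incorrect pronunciation",
}


def total_repetition_score(ratings):
    """Return (points earned, maximum possible) for a dict of item -> rating."""
    for item, rating in ratings.items():
        # Reject anything that is not one of the three scale points.
        if rating not in SCALE:
            raise ValueError(f"rating for {item!r} must be 0, 1, or 2")
    return sum(ratings.values()), 2 * len(ratings)


# Invented ratings for three invented stimulus items.
ratings = {"beat": 2, "bit": 1, "bat": 0}
earned, possible = total_repetition_score(ratings)
print(f"{earned}/{possible}")  # prints 3/6
```

Reporting the total next to the maximum keeps the result interpretable when the number of items differs between test forms.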
Assessing intensive speaking
Test-takers are prompted to produce short stretches of discourse (no more than a sentence) through which they demonstrate linguistic ability at a specified level of language. Many tasks are “cued” in that they lead the test-taker into a narrow band of possibilities. Some of the techniques include:
Directed response tasks
Read-aloud tasks
Sentence/dialogue completion tasks and oral questionnaires
Picture-cued tasks
Translation (of limited stretches of discourse)
Assessing intensive speaking:
Directed response tasks
These are mechanical, non-communicative tasks; but they do require minimal processing of meaning in order to produce the correct grammatical output.
Example:
Assessing intensive speaking:
Read-aloud tasks
These tasks include reading beyond the sentence level up to a paragraph or two.
Advantages: predictable output, practicality, reliability in scoring
Disadvantages: somewhat inauthentic, in that we seldom do this in real life. This skill also calls for certain specialized oral abilities which may not be reliable indicators of a test-taker’s pragmatic ability to communicate orally in face-to-face contexts.
Example: Read-aloud task
Diagnostic passage from Prator’s (1972) Manual of American English Pronunciation: the test-taker reads aloud into a recorder. Scoring is based on a number of phonological factors (vowels, diphthongs, consonants, consonant clusters, stress, and intonation), with a two-page diagnostic checklist on which all errors and questionable items are noted, and a four-point scale for pronunciation and for fluency.
Test of Spoken English scoring scale (1987)
Assessing intensive speaking:
Sentence/dialogue completion tasks
These include reading a dialogue in which one speaker’s lines have been omitted.
Other examples are form filling (Underhill 1987) or oral questionnaires. While individual variations in responses are accepted, this technique taps into a learner’s ability to discern expectancies in a conversation and to produce sociolinguistically correct language.
It could be contended that performance on these items is responsive rather than intensive, but notice that there is a degree of control which predisposes the test-taker to respond with certain expected forms. In any event, according to Brown (2004), this argument underscores the fine lines of distinction between the five categories for assessing spoken language.
Advantages: moderate control of output, the written format allows a little bit more time for the test-taker to anticipate the answer and removes the potential ambiguity created by aural misunderstanding.
Disadvantages: the contrived, inauthentic nature of this task and the fact that it relies on literacy and an ability to transfer easily from written to spoken English.
Example: Dialogue completion task
Example: Sentence completion task
Assessing intensive speaking:
Picture-cued tasks
Designed to elicit a word or phrase. Pictures may be very simple, somewhat more elaborate and “busy”, or composed of a series that tells a story or incident.
Other cues: Maps to give instructions, directions, and specify locations.
Other techniques: pairing two test-takers supplied with identical sets of numbered pictures. One test-taker is cued to describe one of the pictures in as few words as possible for the other test-taker to identify.
Advantages: help to unlock the almost ubiquitous link between listening and speaking performance and remove the potential ambiguity created by aural misunderstanding.
Disadvantages: the inauthentic nature of this task and the fact that it relies on literacy and an ability to transfer easily from written instructions to spoken English.
Also, although this technique is quite versatile, it can be heavily dependent on very clear written instructions.
Scoring: may be problematic depending on the expected performance.
Assessing intensive speaking:
Picture-cued tasks (ctd.)
Picture-based tasks are very popular for eliciting oral language performance and can be used not only for intensive production but also for extensive output. When multiple factors are scored, recordings of the test-takers’ productions are very useful to the grader.
Types of language that can be elicited using pictures:
minimal pairs
comparatives
verb tenses
nouns, negative responses, numbers, and location
giving directions and instructions
elaborate responses and descriptions
Example: Picture-cued elicitation task
minimal pairs comparatives
Intuitive: do not rely on written instructions
Example: Picture-cued elicitation task
These pictures need clear written instructions as they could be misleading and confusing without them.
Brown & Sahni, 1994
Example: Picture-cued elicitation task
Brown & Sahni, 1994
Assessing intensive speaking:
Translation (of limited stretches of discourse)
According to Brown, translation methods are certainly passé in today’s communicative classroom; but he concedes that in countries (such as Venezuela) where English is still not a prevailing language, translation serves as a meaningful communicative device for the English learner.
In this technique, test-takers are given a native-language word, phrase, or sentence and are asked to translate it into its English equivalent.
Advantages: control of the output, which of course means that scoring is more easily accomplished.
Assessing responsive speaking:
Differs from intensive tasks in the increased creativity given to the test-taker and from interactive tasks by the somewhat limited length of utterances.
Involves brief interactions with an interlocutor
Some of the techniques commonly used include:
Question and Answer
Giving instructions and directions
Paraphrasing
Assessing responsive speaking:
Questions and Answers
These tasks can consist of simple and complex questions from an interviewer, or they can make up a portion of a whole battery of questions and prompts in an oral interview. There are two types of questions:
Display questions: this first type is intensive in its purpose (as we have seen previously, these questions are designed to elicit a predetermined correct response).
Referential questions: through these questions the test-taker is given the opportunity to produce meaningful language in response. In designing referential questions it is important to keep in mind why the question is being asked: is it to elicit a string of language output, or is it to gain a sense of the test-taker’s discourse competence?
Oral interaction with a test administrator often involves the latter asking all the questions. An alternative to this approach is to elicit questions from the test-taker.
One technique involves more than one test-taker with an interviewer. With two students in an interview context, both test-takers can ask questions of each other. This technique might meet practicality requirements, but it might be troublesome to score.
Example: Question and answer task
Elicitation of questions from the test-taker
Questions eliciting open-ended responses
Assessing responsive speaking:
Giving instructions and directions
This technique is simple: the administrator poses the problem and the test-taker responds.
The task should require the test-taker to produce at least five or six sentences.
Topics need to be familiar (not beyond the content schemata of the test-taker) so that an impromptu delivery is attainable. This avoids having to supply the problem in advance, which in turn guarantees that the test-taker does not parrot back a memorized set of sentences.
Advantages: Using this type of stimulus provides an opportunity for the test-taker to engage in a relatively extended stretch of discourse, to be very clear and specific, and to use appropriate discourse markers and connectors.
Scoring: based primarily on comprehensibility and secondarily on other specified grammatical or discourse categories.
Example: Giving instructions and directions task
These tasks can be designed to be simple or complex, with the more complex versions potentially falling into the category of extensive speaking. Objectives must be clearly set: if the purpose is to elicit a short and simple response, directives must be clear so as not to take the test-taker down a path of complexity for which she or he is not prepared.
Assessing responsive speaking:
Paraphrasing
These tasks require the test-taker to read or hear a limited number of sentences and produce a paraphrased version of the discourse.
It is important to pinpoint the objective of the task clearly. In these tasks the integration of listening and speaking is probably more at stake than simple oral production alone.
Advantages: elicit short stretches of output and perhaps tap into test-takers’ ability to practice the conversational art of conciseness by reducing the output/input ratio.
Some of the contexts that may be assessed include: Describing, comparing and contrasting, narrating, summarizing, giving an opinion, supporting an opinion, hypothesizing, defining, functioning “interactively”.
Assessing interactive speech:
These tasks include long stretches of interactive discourse. The difference between these types of oral production assessment and responsive speech is the length and complexity of the expected output. Interaction can take two forms:
Transactional language: to exchange specific information
Interpersonal exchanges: social exchanges and relationships
Some of the techniques commonly used include:
Interviews
Role plays
Discussions
Games
Assessing interactive speech:
Interview
This technique involves a test administrator and a test-taker sitting down in a direct face-to-face exchange and proceeding through a protocol of questions and directives. Interviews can vary in length, depending on their purpose:
Placement interview: designed to get a quick spoken sample from a student in order to verify placement into a course. It may need only five minutes, provided the interviewer is trained to evaluate the output accurately.
Comprehensive interviews: designed to cover predetermined oral production contexts and may require the better part of an hour.
A variation is to place two test-takers in one interview. The advantages of this technique are the opportunity for student-student interaction, which increases authenticity, and the practicality of scheduling twice as many candidates. The disadvantages are the difficulty of equalizing the output between two test-takers, discerning the interaction effect in case of unequal comprehension and production ability, and scoring two people simultaneously.
Scoring: based on a set of parameters which may include accuracy in pronunciation, grammar, vocabulary usage, fluency, sociolinguistic/pragmatic appropriateness, task accomplishment, and even comprehension. Scoring can be facilitated by recording the interview.
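One way to keep multi-parameter interview scoring consistent is to record one rating per category and report an average alongside the profile. The Python sketch below is illustrative only: it assumes a 0 to 5 band for each category and equal weighting, neither of which is prescribed by the source, and the function name `interview_score` and the sample ratings are hypothetical.

```python
# Hypothetical sketch: average per-category interview ratings.
# Assumptions (not from the source): a 0-5 band and equal weights.
CATEGORIES = (
    "pronunciation",
    "grammar",
    "vocabulary",
    "fluency",
    "appropriateness",
    "task accomplishment",
    "comprehension",
)


def interview_score(ratings, max_band=5):
    """Return the mean rating over all categories, rounded to one decimal."""
    missing = [c for c in CATEGORIES if c not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    for category in CATEGORIES:
        if not 0 <= ratings[category] <= max_band:
            raise ValueError(f"{category} rating must be between 0 and {max_band}")
    return round(sum(ratings[c] for c in CATEGORIES) / len(CATEGORIES), 1)


# Invented ratings for one invented test-taker.
sample = {
    "pronunciation": 3,
    "grammar": 4,
    "vocabulary": 3,
    "fluency": 2,
    "appropriateness": 4,
    "task accomplishment": 5,
    "comprehension": 4,
}
print(interview_score(sample))  # prints 3.6
```

Refusing to compute a score when a category is missing forces the rater to judge every parameter, which supports the consistency that a workable scoring system requires.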
Assessing interactive speech:
Interview (ctd.)
Disadvantages: the interview is open-ended and involves a significant level of interaction, so the interviewer is forced to make judgments that are susceptible to unreliability. Accuracy in scoring can be improved with careful attention to the linguistic criteria being assessed, as well as through experience and training that help administrators develop sound judgment.
The success of an oral interview will depend on:
clearly specifying administrative procedures of the assessment (practicality)
focusing the questions and probes on the purpose of the assessment (validity)
appropriately eliciting an optimal amount and quality of oral production from the test-taker (biased for best performance)
creating a consistent, workable scoring system (reliability)
Assessing interactive speech:
Interview (ctd.)
According to Brown, every effective interview contains a number of mandatory stages.
Two decades ago Michael Canale (1984) proposed a framework which has withstood the test of time.
Canale suggested that test-takers will perform at their best if they are led through four stages: warm-up, level check, probe, and wind-down.
Example: Interviews
Assessing interactive speech:
Role play
This is a popular pedagogical activity in communicative language teaching classes.
Advantages: it can be controlled or “guided” by the interviewer while freeing students to use discourse that might otherwise be difficult to elicit. It allows test-takers to go beyond simple intensive and responsive levels to a level of creativity and complexity that approaches real-world pragmatics.
Scoring: presents the same complications as any task that elicits somewhat unpredictable responses from test-takers.
Assessing interactive speech:
Discussions and conversations
Difficult to specify and even more difficult to score.
Advantages: as informal techniques to assess learners, they offer a level of authenticity and spontaneity that other assessment techniques may not provide.
Discussion is an integrative task, so it is advisable to give some cognizance to comprehension performance in evaluating learners.
Scoring: checklists should be carefully designed to suit the objectives of the observed discussion.
Assessing interactive speech:
Discussions and conversations (ctd.)
Discussions may be especially appropriate tasks through which to elicit and observe such abilities as:
Assessing interactive speech:
Games
Among informal assessment devices are a variety of games that involve language production. Some examples include:
“Tinkertoy” game
Crossword puzzles
Information gap grids
City maps
Advantages: as informal techniques to assess learners, they offer a level of authenticity and spontaneity that other assessment techniques may not provide.
Scoring: checklists should be carefully designed to suit the objectives of the observed game.
Example: Games
Assessing extensive speech:
These tasks involve complex, relatively lengthy stretches of discourse. They are frequently variations on monologues, usually with verbal interaction from listeners or an interlocutor being either highly limited or ruled out altogether.
Some of the most commonly used techniques include:
Speeches and oral presentations
Picture-cued story-telling
Retelling a story or news event
Translation (of extended prose)
Assessing extensive speech:
Oral Presentations
These tasks consist of having the test-taker present a report, a paper, a marketing plan, a sales idea, a design for a new product, or a method.
Scoring: checklists and grids are common means of scoring these tasks. Scoring is the key assessment challenge for oral presentations, so the rules for effective assessment must be invoked:
Specify the criterion clearly
Set appropriate tasks
Carefully elicit optimal output
Establish practical, reliable scoring procedures
Assessing extensive speech:
Oral Presentations (ctd.)
Assessing extensive speech:
Picture-cued story-telling
These tasks are similar to those we reviewed for assessing intensive production. The object is to elicit oral production through visual cues. Some of the stimuli used include:
Pictures
Photographs
Diagrams
Charts
Series of pictures for longer descriptions
Scoring: criteria need to be clear about what is being assessed. For example, it is insufficient to specify the objective as aiming to elicit narrative discourse. This must be further clarified by deciding whether the assessment is evaluating oral vocabulary, time relatives, sentence connectors, past tense of irregular verbs, etc.
Example: Picture-cued elicitation task for extensive production
Possible questions:
1. Who is eating?
2. Who is drinking?
3. Who is talking?
4. What is she doing?
In applying questions it is important to know the purpose of each question.
The purpose of the first three questions is to cue the test-taker toward inferring what the woman next to the table could be doing.
Brown & Sahni, 1994
Example: Picture-cued elicitation task for extensive production
This task elicits more open-ended performance, whereby test-takers have to elaborate with their own opinion, describe preferences, and accomplish a persuasive function. These tasks must have clearly defined goals and a scoring rubric.
Rubrics could include:
Grammar
Vocabulary
Comprehension
Fluency
Pronunciation
Task accomplishment (persuasive?)
Brown & Sahni, 1994
Assessing extensive speech:
Retelling a story or news event
In these tasks, test-takers hear or read a story or news event that they are asked to retell. The task differs from paraphrasing in that it involves longer stretches of discourse and a different genre.
Scoring: the most significant challenge, as with all extensive production assessments; it should therefore be designed to meet a clear set of criteria.
Some commonly used rubrics include communicating sequences and relationships of events, stress and emphasis patterns, “expression” in the case of a dramatic story, fluency, and interaction with the hearer.
Assessing extensive speech:
Translation (of extended prose)
Longer texts are presented for the test-taker to read in the native language and then translate into English. Some examples of texts include:
Dialogues
Directions for assembly of a product
A synopsis of a story, play, or movie
Directions on how to find something on a map
Advantages: control of the content, the vocabulary, and, to some extent, the grammatical and discourse features.
Disadvantage: as we know, translation of text is a highly specialized skill for which some individuals obtain advanced degrees.
Scoring: criteria should therefore take into account not only the purpose in eliciting a translation but the possibility of errors that are unrelated to oral production ability.
Final comments
Oral Proficiency scoring categories (Brown 2001)
Phonepass ® (imitative and intensive) vs.
TSE (responsive and interactive) vs.
OPI (oral interview)
References
Brown, H. D. (2004). Language Assessment: Principles and Classroom Practices. Longman.