carla rubrics

Upload: polythenesh

Post on 07-Feb-2018


  • 7/22/2019 Carla Rubrics


    Process: Types of rubrics

    Rubrics are generally categorized as generic or task-specific. As is so often the case in assessment,

    the line between the two categories may blur so that rating instruments appear more or less generic

    or task-specific. Indeed, many task-based rubrics are adaptations of generic scales. It is also possible

    to design hybrid rubrics that combine features of both types.

    Generic rubrics can be applied to a number of different tasks. In language assessment, one

    frequently finds generic rubrics used with assessment tasks within a modality (generally writing and

    speaking) or mode (interpersonal and presentational). A truly generic rubric could be applied to any

    task within the same modality or mode.

    The dimensions in a generic rubric for second-language assessment often emphasize features of

    language production, such as comprehensibility, accuracy, and vocabulary, without making reference

    to specific content or task details. Generic rubrics are often derived from models of language

    proficiency and/or second language acquisition.

    Figure 1. Generic Rubric for Oral Presentations - Intermediate Level Learners

    Levels: Exemplary (4), Accomplished (3), Developing (2), Beginning (1)

    Comprehensibility
    4 Exemplary: Listeners accustomed to the speech of learners are able to understand all of the presentation.
    3 Accomplished: Listeners accustomed to the speech of learners are able to understand most of the presentation.
    2 Developing: Listeners accustomed to the speech of learners are able to understand the main ideas and some details of the presentation.
    1 Beginning: Listeners accustomed to the speech of learners are able to understand isolated bits of the presentation.

    Text Type
    4 Exemplary: Describes, narrates, and/or expresses own thoughts in paragraph-level discourse.
    3 Accomplished: Describes, narrates, and/or expresses own thoughts in connected strings of sentences.
    2 Developing: Speaks in loosely connected sentences.
    1 Beginning: Speaks in unconnected sentences and phrases.

    Language Control
    4 Exemplary: High degree of accuracy in grammar and word choice in connected, rehearsed, and occasionally complex discourse. Little or no interference from first language.
    3 Accomplished: Usually accurate grammar and word choice in connected, rehearsed discourse. Occasional interference from first language.
    2 Developing: Frequent, but usually minor, grammar and word choice errors in rehearsed, sentence-level discourse. Significant interference from first language.
    1 Beginning: Comprehension is impeded by frequent grammar and word choice errors in rehearsed discourse. High degree of interference from first language.

    Vocabulary Use
    4 Exemplary: Uses a broad range of familiar and new words, phrases, and idioms so that expression is highly varied and non-repetitive.
    3 Accomplished: Uses an adequate range of familiar and new words, phrases, and idioms so that expression is varied and only occasionally repetitive.
    2 Developing: Uses familiar words, phrases, and idioms, and rarely attempts to go beyond basic vocabulary. Speech is repetitive and lacks variety.
    1 Beginning: Uses very basic vocabulary and memorized phrases. Speech is limited and highly repetitive.

    Communication Strategies
    4 Exemplary: Always maintains communication. Able to circumlocute and self-correct when needed. Use of memory aids enhances presentation.
    3 Accomplished: Very few breaks in communication. Sometimes able to circumlocute and self-correct. Effective use of memory aids.
    2 Developing: Frequent breaks in communication. Rarely able to circumlocute or self-correct. Use of memory aids sometimes detracts from presentation.
    1 Beginning: Generally unable to maintain communication. Overreliance on memory aids detracts from presentation.

    Figure 1 shows a sample generic, analytic rubric for oral presentations (presentational mode) for

    Intermediate level learners adapted from the ACTFL Performance Guidelines for K-12 Learners (K-12

    Guidelines). This example reflects a focus on features of second-language production, but additional

    dimensions might be included in order to measure such aspects of oral presentation as content

    coverage, organization, connection with audience, elocution, use of graphics, and so on.

    The terminology in Figure 1 is accessible to language-teaching professionals, but it may not provide

    meaningful feedback to learners. Large-scale and external assessments for purposes such as

    certification, placement, articulation, and program evaluation often use generic scales that contain a

    high degree of professional language and require modification for classroom use.

    A high school French teacher who commented on the rubric in Figure 1 indicated that, for classroom

    use, he prefers a rubric with short descriptors that he can take in at a glance and that serve primarily

    to refresh his memory of what performance is like at each step on the scale. Because he has only a

    few seconds to evaluate each student, and because he wants to spend as little class time as possible

    explaining the terms in the rubric to his students, he prefers simple vocabulary that neither he nor his

    students must ponder (J.-L. Roche, personal communication, October 22, 2002). Figure 2 presents an

    adaptation of the oral presentation generic rubric for classroom use.

    Figure 2. "Classroom-Friendly" Generic Rubric for Oral Presentations - Intermediate Level Learners

    Levels: Exemplary (4), Accomplished (3), Developing (2), Beginning (1)

    Comprehensibility
    4 Exemplary: Listeners can understand all of the presentation.
    3 Accomplished: Listeners can understand almost all of the presentation.
    2 Developing: Listeners can understand the main ideas and some details.
    1 Beginning: Listeners can understand some phrases or sentences.

    Connected Language
    4 Exemplary: Speaks in paragraphs to describe, tell about a sequence of events, or express thoughts.
    3 Accomplished: Speaks in sentences to describe, tell about a sequence of events, or express thoughts.
    2 Developing: Sentences are loosely connected.
    1 Beginning: Phrases and sentences are unconnected.

    Language Control
    4 Exemplary: Makes rare grammar or vocabulary errors in prepared speech.
    3 Accomplished: Makes some grammar or vocabulary errors in prepared speech.
    2 Developing: Makes frequent grammar or vocabulary errors in prepared speech.
    1 Beginning: Makes so many errors that it appears speech was not prepared.

    Vocabulary Use
    4 Exemplary: Uses many familiar and new words, phrases, and expressions. Not repetitive.
    3 Accomplished: Uses an adequate range of familiar and new words, phrases, and expressions. Occasionally repetitive.
    2 Developing: Uses familiar and a few new words, phrases, and expressions. Repetitive.
    1 Beginning: Uses very basic vocabulary and memorized phrases. Very repetitive.

    Communication Strategies
    4 Exemplary: May glance at notes. No noticeable pauses or hesitations.
    3 Accomplished: May rely on notes several times. A few noticeable pauses or hesitations.
    2 Developing: Relies on notes often. Frequent noticeable pauses or hesitations.
    1 Beginning: Unable to speak without reading notes.

    It is certainly most efficient to design or identify rubrics that can be used for multiple purposes, but

    when weighing the use of generic versus task-based rubrics, efficiency is not the only important

    criterion. Tedick (2002) writes: "While some rubrics are created in such a way as to be generic in

    scope for use with any number of writing or speaking tasks, it is best to consider the task first and

    make sure that the rubric represents a good fit with the task and your instructional objectives. Just as

    a variety of task-types should be used in language classrooms, so should a variety of rubrics and

    checklists be used for assessing performance on those tasks" (p. 37). For learners who are new to

    performance assessment and evaluation, Tedick recommends making students comfortable with the

    process by first using generic rubrics and gradually introducing task-specific rubrics.

    Task-specific rubrics are used with particular tasks, and their criteria and descriptors reflect specific

    features of the elicited performance. Rubrics developed for a defined group of tasks within a modality

    or mode, such as writing narratives, performing role-plays, or exchanging e-mail messages, may

    combine elements of language production with dimensions related to the content and language

    function(s) of the lesson/task. For example, if an assessment task requires learners to use a series of

    pictures to tell a story in the past about a visit to monuments in Paris, the scoring criteria would focus

    on language competencies related to narration in past tense along with one or more dimensions

    measuring content and cultural knowledge. A possible rubric for this task is shown in Figure 3.

    Figure 3. Task-specific rubric for "A visit of Paris monuments"

    Levels: 4, 3, 2, 1

    Narration
    4 Story is well organized and connected; includes many details or elaboration about all pictures in the task.
    3 Story is mostly organized and connected; includes some details or elaboration about most pictures in the task.
    2 Story is sequence of loosely connected events; includes few details or elaboration about pictures in the task.
    1 Story is unconnected list of events; includes no details or elaboration about pictures in the task.

    Use of past tense
    4 Uses past tense at all appropriate times; use is accurate.
    3 Uses past tense frequently; use is mostly accurate.
    2 Uses other tenses sometimes where past is appropriate; use is accurate some of the time.
    1 Makes few attempts to use past tense; use is frequently inaccurate.

    Cultural knowledge
    4 Demonstrates extensive and correct knowledge of current and historical significance of all monuments.
    3 Demonstrates adequate and correct knowledge of current and historical significance of most monuments.
    2 Demonstrates partial and usually correct knowledge of current and historical significance of some monuments.
    1 Demonstrates minimal or no knowledge of current and historical significance of monuments.

    Rubrics that combine features of generic and task-specific rubrics are very useful in classroom

    assessment because they provide feedback to learners on broad dimensions of language production

    along with their performance on the particular competencies and knowledge targeted by course

    content and aligned assessments. When adapting the rubrics for other tasks, teachers may keep the

    generic language production elements as they are and change one or two categories to focus on task

    expectations. For example, one might add level-appropriate, generic dimensions such as pronunciation

    or fluency to the rubric in Figure 3, alongside its task-specific categories of narration, use of past

    tense, and knowledge about monuments of Paris.
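As a rough illustration of this kind of adaptation, the Python sketch below keeps a small set of generic dimensions fixed and swaps in the categories written for one task. The function name `build_hybrid_rubric`, the dimension identifiers, and the four-level scale are assumptions made for the example; they are not part of any published rubric.

```python
from typing import Dict, List

# Hypothetical generic dimensions reused from task to task; the text above
# offers pronunciation and fluency as examples.
GENERIC_DIMENSIONS = ["pronunciation", "fluency"]


def build_hybrid_rubric(task_dimensions: List[str],
                        generic_dimensions: List[str] = GENERIC_DIMENSIONS,
                        levels: int = 4) -> Dict[str, List[int]]:
    """Map every dimension, generic ones first, to its levels (highest first)."""
    duplicates = set(task_dimensions) & set(generic_dimensions)
    if duplicates:
        raise ValueError(f"dimension listed twice: {sorted(duplicates)}")
    scale = list(range(levels, 0, -1))
    # Each dimension gets its own copy of the scale so later edits stay local.
    return {name: list(scale) for name in generic_dimensions + task_dimensions}


# Task-specific categories taken from the Paris monuments task (Figure 3).
paris_rubric = build_hybrid_rubric(
    ["narration", "use_of_past_tense", "cultural_knowledge"]
)
for dimension, scale in paris_rubric.items():
    print(dimension, scale)
```

For a different task, only the list passed to `build_hybrid_rubric` would change; the generic dimensions stay as they are.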


    Holistic, analytic, primary trait and multiple trait rubrics may be seen as different ways of

    selecting and organizing rating criteria. These rubric types come from different contexts, and although

    their particular uses and characteristics have converged in current practice, there are some general

    guidelines for choosing among them. In addition, each type has advantages and disadvantages.

    In practice, you will probably find considerable variability in how rubric types are identified. Holistic

    and analytic scales may be identified as generic or task-specific, or they may include rating criteria of

    both types. Primary and multiple trait rubrics are essentially task-specific, but general language

    production categories may be added to multiple trait rubrics.

    Process: Types of rubrics: Holistic scales

    In holistic evaluation, raters make judgments by forming an overall impression of a performance and

    matching it to the best fit from among the descriptions on the scale. Each band on the scale describes

    performance on several criteria (e.g., range of vocabulary + grammatical accuracy + fluency). Four or

    six levels of performance are commonly found in holistic rubrics. Holistic scales may be either generic

    or task-specific. Large-scale assessments are often evaluated holistically, but teachers find holistic

    rubrics easy and efficient to use for classroom assessment as well.

    Fig. 1a Holistic rubric for speaking tasks (generic)

    Exceeds expectations: No errors in expression; near-native pronunciation; use of structures beyond expected proficiency; near-native use of appropriate cultural practices; exceeded task requirements.

    Meets expectations: Almost all expression is correct; easily understood with infrequent errors in pronunciation, structures, and vocabulary usage; almost all cultural practices demonstrated and appropriate; met task requirements.

    Developing: Some errors in expression; comprehensible with noticeable errors in pronunciation, structures, and/or vocabulary usage; some cultural practices demonstrated and appropriate; met most task requirements.

    Not there yet: Little or no expression is correct; nearly or completely incomprehensible; cultural practices were inappropriate or not demonstrated at all; little success in meeting task requirements.

    Adapted from sample rubric in the New Jersey World Languages Curriculum Framework, Appendix B (PDF document).

    Figures 1a-d present two holistic rubrics for speaking tasks (both generic) and two holistic rubrics for

    writing tasks (one generic, one task-specific).

    Advantages:

    They are often written generically and can be used with many tasks.

    They emphasize what learners can do, rather than what they cannot do.

    They save time by minimizing the number of decisions raters must make.

    Trained raters tend to apply them consistently, resulting in more reliable measurement.

    They are usually less detailed than analytic rubrics and may be more easily understood by

    younger learners.

    Disadvantages:

    They do not provide specific feedback to test takers about the strengths and weaknesses of

    their performance.

    Performances may meet criteria in two or more categories, making it difficult to select the one

    best description. (If this occurs frequently, the rubric may be poorly written.)

    Criteria cannot be differentially weighted.


    Process: Types of rubrics: Analytic scales

    Analytic scales are usually associated with generic rubrics and tend to focus on broad dimensions of

    writing or speaking performance. These dimensions may be the same as those found in a generic,

    holistic scale, but they are presented in separate categories and rated individually. Points may be

    assigned for performance on each of the dimensions and a total score calculated.
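The arithmetic behind a total score is simple, and a short Python sketch may make it concrete. The dimension names and the 1-4 scale below mirror the levels in Figure 1, but the function name `total_score` and the sample ratings are assumptions invented for illustration.

```python
from typing import Dict

# Assumed scale: the four levels used in Figure 1 (1 = Beginning, 4 = Exemplary).
MIN_LEVEL, MAX_LEVEL = 1, 4


def total_score(ratings: Dict[str, int]) -> int:
    """Add up the separately rated dimensions into one total score."""
    for dimension, level in ratings.items():
        if not MIN_LEVEL <= level <= MAX_LEVEL:
            raise ValueError(
                f"{dimension}: level {level} is outside {MIN_LEVEL}-{MAX_LEVEL}"
            )
    return sum(ratings.values())


# Hypothetical ratings for one learner's oral presentation.
ratings = {
    "comprehensibility": 3,
    "text_type": 2,
    "language_control": 3,
    "vocabulary_use": 4,
    "communication_strategies": 3,
}
print(total_score(ratings), "out of", MAX_LEVEL * len(ratings))
```

Keeping the per-dimension ratings alongside the total preserves the diagnostic detail that distinguishes analytic from holistic scoring.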

    Traditionally, analytic rubrics are associated with large-scale assessment of general dimensions of

    language performance. However, analytic rubrics certainly can be created or adapted for use in

    classroom settings and with particular tasks (e.g., Taggart et al., 1998). These rubrics often combine

    performance categories from a generic rubric with categories directly related to a task, such as

    demonstrating understanding of specific lesson content (Moskal, 2000). In practice, the names

    "analytic rubric" and "multiple trait rubric" may be used interchangeably.

    Performance dimensions commonly found in analytic rubrics include:

    Speaking & Writing: Content; Vocabulary; Accuracy/Grammar/Language Use; Task fulfillment; Appropriate use of language; Creativity; Sentence structure/Text type; Comprehensibility

    Writing: Organization; Style; Mechanics; Coherence and Cohesion

    Speaking: Fluency; Pronunciation; Intonation

    From Tedick, p. 35:

    "One of the best known analytic rubrics used for writing assessment in the field of English as a second language (ESL) was developed by Hughey et al. (1983, p. 140). This rubric has five categories: content, organization, vocabulary, language use, and mechanics. Drawing heavily upon characteristics of the Hughey et al. scale, Tedick and Klee developed an analytic rubric for use in scoring essays written for an immersion quarter for undergraduates studying Spanish (Klee, Tedick, & Cohen 1995)."

    Fig. Fx. Analytic Writing Scale for the Spanish Foreign Language Immersion Program
    University of Minnesota, Revised July, 1996

    CONTENT 30 POINTS POSSIBLE

    Score Range    Criteria    Comments

    30 - 27  Excellent to Very Good: addresses all aspects of the prompt; provides good support for and development of all ideas with range of detail; substantive
    26 - 22  Good to Average: prompt adequately addressed; ideas not fully developed or supported with detail, though main ideas are clear; less substance
    21 - 17  Fair: prompt may not be fully addressed (writer may appear to skirt aspects of prompt); ideas not supported well, main ideas lack detailed development; little substance
    16 - 13  Poor: doesn't adequately address prompt; little to no support or development of ideas; non-substantive


    ORGANIZATION 20 POINTS POSSIBLE

    Score Range    Criteria    Comments

    20 - 18  Excellent to Very Good: well-framed and organized (with clear introduction, conclusion); coherent; succinct; cohesive (excellent use of connective words)
    17 - 14  Good to Average: adequate, but loose organization with introduction and conclusion (though they may be limited or one of the two may be missing); somewhat coherent; more wordy rather than succinct; somewhat cohesive (good use of connective words)
    13 - 10  Fair: lacks good organization (no evidence of introduction, conclusion); ideas may be disconnected, confused; lacks coherence; wordy and repetitive; lacks consistent use of cohesive elements
    9 - 7  Poor: confusing, disconnected organization; lacks coherence, so much so that writing is difficult to follow; lacks cohesion

    LANGUAGE USE/GRAMMAR/MORPHOLOGY 25 TOTAL POINTS POSSIBLE

    Score Range    Criteria    Comments

    25 - 22  Excellent to Very Good: great variety of grammatical forms (e.g., range of indicative verb forms; use of subjunctive); complex sentence structure (e.g., compound sentences, embedded clauses); evidence of "Spanish-like" construction; mastery of agreement (subj/verb; number/gender); very few errors (if any) overall with none that obscure meaning
    21 - 18  Good to Average: some variety of grammatical forms (e.g., attempts, though not always accurate, of range of verb forms, use of subjunctive); attempts, though not always accurate, at complex sentence structure (e.g., compound sentences, embedded clauses); little evidence of "Spanish-like" construction, though without clear translations from English; occasional errors with agreement; some errors (minor) that don't obscure meaning
    17 - 11  Fair: less variety of grammatical forms (e.g., little range of verb forms; inaccurate, if any, attempts at subjunctive); simplistic sentence structure; evidence of "English-like" construction (e.g., some direct translation of phrases); consistent errors (e.g., with agreement), but few of which may obscure meaning
    10 - 5  Poor: very little variety of grammatical forms; simplistic sentence structure that contains consistent errors, especially with basic aspects such as agreement; evidence of translation from English; frequent and consistent errors that may obscure meaning

    VOCABULARY/WORD USAGE 20 TOTAL POINTS POSSIBLE

    Score Range    Criteria    Comments

    20 - 18  Excellent to Very Good: sophisticated, academic range; extensive variety of words; effective and appropriate word/idiom choice and usage; appropriate register
    17 - 14  Good to Average: good, but not extensive (less academic), range or variety; occasional errors of word/idiom choice or usage (some evidence of invention of "false" cognates), but very few or none that obscure meaning; appropriate register
    13 - 10  Fair: limited and "non-academic" range (frequent repetition of words); more consistent errors with word/idiom choice or usage (frequent evidence of translation; invention of "false" cognates) that may (though seldom) obscure meaning; some evidence of inappropriate register
    9 - 7  Poor: very limited range of words; consistent and frequent errors with word/idiom choice or usage (ample evidence of translation); meaning frequently obscured; evidence of inappropriate register

    MECHANICS 5 TOTAL POINTS POSSIBLE

    Score Range    Criteria    Comments

    5  Excellent to Very Good: demonstrates mastery of conventions; few errors in spelling, punctuation, capitalization, and use of accents
    4  Good to Average: occasional errors in spelling, punctuation, capitalization, and use of accents, but meaning is not obscured
    3  Fair: frequent errors in spelling, punctuation, capitalization, and use of accents that at times confuses or obscures meaning
    2  Poor: no mastery of conventions; dominated by errors in spelling, punctuation, capitalization, and use of accents

    Total Score _____ COMMENTS:

    Figure F presents an adaptation of a well-known analytic scale for evaluating ESL writing performance.

    Describing this rubric, Tedick (2002) writes: "Note that the scale assigns different weights to different

    features. This allows a teacher to give more emphasis to content than to grammar or mechanics, for

    example. The option to weigh characteristics on the scale represents an advantage to analytic

    scoring." (p. 35).
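To see how the differential weighting works out numerically, the following Python sketch totals one essay's category scores and reports each category's share of the 100 possible points. The score ranges come from the scale above (lowest and highest value printed for each category); the function names and the sample essay scores are hypothetical.

```python
from typing import Dict, Tuple

# (lowest, highest) score printed for each category of the writing scale above.
SCORE_RANGES: Dict[str, Tuple[int, int]] = {
    "content": (13, 30),
    "organization": (7, 20),
    "language_use": (5, 25),
    "vocabulary": (7, 20),
    "mechanics": (2, 5),
}


def total_score(scores: Dict[str, int]) -> int:
    """Validate one essay's category scores and return their sum."""
    if set(scores) != set(SCORE_RANGES):
        raise ValueError("scores must cover exactly the five categories")
    for category, points in scores.items():
        low, high = SCORE_RANGES[category]
        if not low <= points <= high:
            raise ValueError(f"{category}: {points} is outside {low}-{high}")
    return sum(scores.values())


def category_weights() -> Dict[str, float]:
    """Share of the maximum total that each category can contribute."""
    maximum = sum(high for _, high in SCORE_RANGES.values())
    return {name: high / maximum for name, (_, high) in SCORE_RANGES.items()}


# Hypothetical scores for a single essay.
essay = {"content": 27, "organization": 16, "language_use": 20,
         "vocabulary": 15, "mechanics": 4}
print("total:", total_score(essay))
for name, weight in category_weights().items():
    print(f"{name}: {weight:.0%} of the possible points")
```

Because the weights live in the point ranges themselves, a teacher who wanted a different emphasis would change the ranges rather than the scoring procedure.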

    Figure F2 shows an analytic scale for role plays and interviews used with students in first-year French

    courses at the University of Minnesota. This rubric can be used with other languages. In this example,

    all criteria are weighted equally.

    Analytic rubric for role plays and interviews
    Post-secondary, Year 1

    Communicative Success (Would a listener accustomed to the speech of learners understand?)
    A 6 / 5.5 Understand all of the message.
    A- 5 Understand the general message and most of the details.
    B 4.5 Understand general message, but only some of the details.
    C 4 Have some idea of the general message, but would not be sure to have understood.
    D-F 3.5 - 0 Do not understand what the speaker is trying to say.

    Pronunciation & Fluency
    A 6 / 5.5 Speech is smooth; speaker is comfortable and confident in use of the language. No mispronunciation that would interfere with comprehension by a sympathetic native speaker.
    A- 5 Speech is occasionally hesitant; some rephrasing. Mispronunciation causing misunderstanding occurs only rarely.
    B 4.5 Speech is hesitant (e.g. frequent rephrasing, sentences left unfinished, long pauses). Several misunderstandings arise from mispronunciation of words or errors in intonation.
    C 4 Speech hesitant and choppy; conversation is almost impossible. Mispronunciation and inaccurate stress make understanding difficult. Has to repeat a lot to be understood; OR not enough speech to evaluate.
    D-F 3.5 - 0 Speech limited to isolated words, or mispronunciation makes comprehension impossible.


    Vocabulary
    A 6 / 5.5 Shows control of a wide range of the vocabulary taught in class and always uses this vocabulary appropriately.
    A- 5 Shows control of an adequate range of the vocabulary taught in class and most often uses this vocabulary appropriately.
    B 4.5 Some control of new vocabulary, but relies on fixed expressions/basic vocabulary or uses vocabulary inappropriately.
    C 4 Shows very limited control of the vocabulary taught, making discussion of related topics extremely difficult; OR not enough speech to evaluate.
    D-F 3.5 - 0 Shows no command of the vocabulary taught, making communication impossible.

    Grammar
    A 6 / 5.5 Shows consistent control of the structures taught in class and communication is never impeded.
    A- 5 Usually controls structures taught in class.
    B 4.5 Shows partial control of structures taught in class.
    C 4 Speech is very difficult to understand due to lack of control of structures taught; OR not enough speech to evaluate.
    D-F 3.5 - 0 Extreme lack of control of structures taught in class.

    Role Plays/Interviews (Does it sound like a real conversation?)
    A 6 / 5.5 Exchange is well-connected and appropriate to the topic and situation. Amount of time spent conversing is appropriate for the task assigned and the topic is adequately covered.
    A- 5 Exchange is usually well-connected and appropriate to the topic and situation.
    B 4.5 Some misunderstandings occur because discourse is not sufficiently connected or conversation is not always appropriate to the topic and situation; or speaker(s) does not maintain conversation for assigned length of time and needs to be told to continue.
    C 4 Misunderstandings frequently occur between participants because discourse is not connected; or conversation is often inappropriate to topic or situation.
    D-F 3.5 - 0 Exchange is not connected (many non-sequiturs; speaker unable to hold up his/her end of the conversation); or conversation is entirely inappropriate to topic or situation.

    Department of French and Italian, College of Liberal Arts, University of Minnesota

    There are more sample analytic rubrics in the Evaluation > Examples section.

    Advantages:

    They provide useful feedback to learners on areas of strength and weakness.

    Their dimensions can be weighted to reflect relative importance.

    They can show learners that they have made progress over time in some or all dimensions

    when the same rubric categories are used repeatedly (Moskal, 2000).

    Disadvantages:

    "The whole is greater than the sum of its parts." Tedick (2002) notes: "Separate scores for different aspects of a student's writing or speaking performance may be considered artificial in

    that it does not give the teacher (or student) a good assessment of the "whole" of a

    performance." (p. 36).

    They take more time to create and use.

    There are more possibilities for raters to disagree. It is more difficult to achieve intra- and

    inter-rater reliability on all of the dimensions in an analytic rubric than on a single score

    yielded by a holistic rubric.

    There is some evidence that raters tend to evaluate grammar-related categories more harshly

    than they do other categories (McNamara, 1996), thereby overemphasizing the role of

    accuracy in providing a profile of learners' proficiency.


    There is some evidence that "when raters are asked to make multiple judgments, they really

    make one..." (Fulcher, 2009). Care must be taken to avoid a "halo effect" and focus on the

    individual criteria to assure that diverse information about the learner's performance is not

    lost.
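The reliability concern raised in this list can be checked with a quick calculation. The Python sketch below compares two raters' scores on a single rubric dimension using exact agreement and Cohen's kappa. Kappa is one common agreement statistic chosen here for illustration; the source does not prescribe it, and the ratings are invented sample data.

```python
from collections import Counter
from typing import Sequence


def exact_agreement(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Proportion of performances that both raters placed at the same level."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Agreement corrected for the agreement expected by chance alone."""
    observed = exact_agreement(rater_a, rater_b)
    n = len(rater_a)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal proportions per level.
    expected = sum(counts_a[level] * counts_b[level] for level in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical level throughout
    return (observed - expected) / (1 - expected)


# Invented ratings (levels 1-4) from two raters on one dimension, e.g. grammar.
rater_a = [4, 3, 3, 2, 1, 4, 2, 3]
rater_b = [4, 3, 2, 2, 1, 3, 2, 3]
print(f"exact agreement: {exact_agreement(rater_a, rater_b):.2f}")
print(f"Cohen's kappa:   {cohens_kappa(rater_a, rater_b):.2f}")
```

Running the same comparison for every dimension of an analytic rubric would show which categories raters disagree on most, which is where additional rater training or clearer descriptors may be needed.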

    Process: Types of rubrics: Primary Trait and Multiple Trait

    Primary trait scoring, as developed by Lloyd-Jones and Carl Klaus (Lloyd-Jones, 1977), was designed

    to evaluate the primary language function or rhetorical trait elicited by a given writing task or prompt.

    "Primary trait assessment in its initial formulations focused on the specific approach that a writer

    might take to be successful on a specific writing task; every task required its own unique scoring

    guide" (Applebee, 2000, p. 4). In its original form, primary trait scoring would be strictly classified as

    task-specific, and performance would be evaluated on only one trait, such as the "Persuading an

    audience" example from Tedick (2002, p. 36) for a task requiring learners to write a persuasive letter

    to the editor of the school newspaper:

    Fig. Fx. Primary Trait: Persuading an audience

    0 Fails to persuade the audience.

    1 Attempts to persuade but does not provide sufficient support.

    2 Presents a somewhat persuasive argument but without consistent development and support.

    3 Develops a persuasive argument that is well developed and supported.

    Today, you may find that primary trait rubrics vary markedly from their original design and intended

    use. Applebee notes: "Over the years as primary trait approaches were used more widely, they

    evolved into a more generic approach which recognized the similarities in approach within broad uses

    or purposes. The basic question addressed in scoring, however, remained, 'Did the writer successfully

    accomplish the purpose of this task?' To insure that raters maintained this focus, scoring guidelines

    usually instructed raters to ignore errors in conventions of written language, and to focus on overall

    rhetorical effectiveness" (p. 4). Primary trait scoring can be used with speaking tasks as well as with

    assessments of the interpersonal and presentational modes.

    If you search the Web for primary trait rubrics, you will occasionally find examples that include several

    traits rather than the one main criterion for successful communication within a specified rhetorical or

    functional domain (e.g., SUNY Oswego, Which type of rubric is best? Fig. 3). In the Virtual Assessment

    Center, we adopt the distinctions outlined by Tedick (2002) and refer to task-specific scoring grids

    with more than one dimension as multiple trait, or multitrait, rubrics.

    When would you use primary trait rubrics in the classroom? They provide minimal feedback to

    learners, and it probably would not be fair to base important decisions like grades on whether or not

    learners perform well on just one criterion. One scenario might be to use primary trait rubrics in

    formative assessments designed to determine how well learners perform a particular language

    function they have been working on in class. For example, if several lessons have been devoted to

    working on descriptive language, a culminating writing task might be scored solely on its effectiveness

    as a description.

    Multiple trait rubrics. Hamp-Lyons (1991) coined the term multiple trait scoring for rubrics that she

    designed, based on the concepts of primary trait scoring, to provide diagnostic feedback to learners

    and other stakeholders about performance on "context-appropriate and task-appropriate criteria" for a

    specified topic/text type. She designed her multiple trait rubrics to be applicable across a range of

    similar tasks. Currently, multiple trait (or multitrait) rubrics are commonly considered to be task-

    specific, although one or more of their dimensions might also be found in generic, analytic rubrics.

    Rubrics of this type that you find on the Web or in other resources often

    accompany a given task and may not be readily applicable to other tasks without adaptation. Figure

    Fy illustrates a task and multitrait scoring rubric from a resource for language teachers (Petersen,

    1999).


    Figure Fy. Task and multiple trait mini-rubric*

    Activity Description:

    If language teachers were paid every time they had to remind students to speak the target language (TL), they could probably retire.

    With this in mind, you will be a part of a group that will meet at the beginning of each class (groups will change from time to time). Your teacher will have an activity prepared for the groups to complete each day (this gives your teacher the opportunity to take attendance and finish all those other time-consuming tasks).

    Each group should elect a leader to take charge and keep the group on task, making sure that everyone participates. Always speak as much TL as possible during your warm-up. Help and learn from each other. Use the TL to greet and say your farewells to your group members at the end of the activity.

    Primary Activity Standard: Communication Standard 1.1 (Interpersonal Communication)

    Students engage in conversations, provide and obtain information, express feelings and emotions, and exchange opinions.

    Levels: Excellent, Average, Needs Work (point values for each level in parentheses)

    Time on Task

    Excellent (10 9): The group forms immediately to work on activity until the teacher indicates otherwise; if group finishes early, members discuss topics related to TL.

    Average (8 7 6): The group forms fairly soon to work mostly on activity until the teacher indicates otherwise; if group finishes early, members are either silent or discuss topics not related to TL.

    Needs Work (5 4 3 2 1 0): The group takes a long time to form; they do not work on activity (unless the teacher walks by); if group finishes early, members discuss topics not related to TL.

    Participation

    Excellent (5): All group members participate equally throughout the entire activity.

    Average (4 3): All group members but one participate equally throughout the activity.

    Needs Work (2 1 0): More than one group member does not participate equally throughout the activity.

    Group Cooperation

    Excellent (10 9): All members cooperate to help each other learn; if anyone has been absent, the group helps him/her; no one acts "superior."

    Average (8 7 6): Most members cooperate to help each other learn; if anyone has been absent, the group sometimes helps him/her; no one acts "superior."

    Needs Work (5 4 3 2 1 0): Members do not cooperate to help each other learn; if anyone has been absent, the group does not help; some members act "superior."

    Use of TL

    Excellent (5): Members use as much TL as possible (also to greet and say farewells).

    Average (4 3): Members use some TL during activity (also to greet and say farewells).

    Needs Work (2 1 0): Members rarely use TL during activity (neither do they greet nor say farewells).

    1999 Wade Petersen

    Multiple trait rubrics look like analytic rubrics in that performance is evaluated in several categories,

    and, in practice, you may find the terms used interchangeably. However, analytic rubrics usually

    evaluate the more traditional and generic dimensions of language production, while the criteria in

    multiple trait rubrics focus on specific features of performance necessary for successful fulfillment of a

    given task or tasks.
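
    As an illustration of how a multiple trait rubric such as Figure Fy yields a single score, the following minimal Python sketch totals one group's ratings. The maximum points per criterion are read from the figure (10, 5, 10, and 5); the function name, the dictionary layout, and the sample ratings are assumptions made for this example and are not part of the original resource.

```python
# Maximum points per criterion, read from the top of each scale in Figure Fy.
MAX_POINTS = {
    "Time on Task": 10,
    "Participation": 5,
    "Group Cooperation": 10,
    "Use of TL": 5,
}


def total_score(ratings):
    """Sum one group's ratings after checking each is within its criterion's range."""
    if set(ratings) != set(MAX_POINTS):
        raise ValueError("ratings must cover exactly the rubric's criteria")
    for criterion, points in ratings.items():
        if not 0 <= points <= MAX_POINTS[criterion]:
            raise ValueError(
                f"{criterion}: {points} is outside 0-{MAX_POINTS[criterion]}"
            )
    return sum(ratings.values())


# Hypothetical ratings for one group (not taken from the source).
ratings = {
    "Time on Task": 9,
    "Participation": 4,
    "Group Cooperation": 8,
    "Use of TL": 3,
}
print(total_score(ratings), "out of", sum(MAX_POINTS.values()))  # 24 out of 30
```

    Because two criteria run to 10 points and two run to 5, the point ranges themselves act as weights: Time on Task and Group Cooperation count twice as much toward the total as Participation and Use of TL.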


    Advantages

    - The rubrics are aligned with the task and curriculum.

    - Aligned and well-written primary and multiple trait rubrics can ensure construct and content validity of criterion-referenced assessments.

    - Feedback is focused on one or more dimensions that are important in the current learning context.

    - With a multiple trait rubric, learners receive information about their strengths and weaknesses.

    - Primary and multiple trait rubrics are generally written in language that students understand.

    - Teachers are able to rate performances quickly.

    - Many rubrics of this type have been developed by teachers who are willing to share them online, at conferences, and in materials available for purchase.

    Disadvantages

    - Information provided by primary trait rubrics is limited and may not easily translate into grades.

    - Task-specific rubrics cannot be applied to other tasks without adaptation of at least one dimension.

    Process: Creating rubrics

    There are many rubric resources available to teachers, online and in published materials, so the

    first piece of advice we have to offer is: Find and adapt existing rubrics! It is rare to find a rubric

    that is exactly right for your situation and your students, but by using rubrics that have worked well for others as a starting point, you can save a great deal of time.

    There are many rubric formats. In the grid format shown here, which is one of the possible ways

    to lay out a rubric, we illustrate a few common, frequently recommended, features of multiple

    trait rubrics:

    - An even number (4 or 6) of levels of performance on the scale. When there is an odd number of levels, the middle level tends to become a catch-all category. With an even number of levels, raters have to make a more precise judgment about a performance when its quality is not at the top or bottom of the scale.

    - High to low scale. In the graphic, the highest level of performance is described at the left. Students read first the description of an exemplary performance in each criterion. A few labels for a four-point scale include:

      4 | 3 | 2 | 1
      Exemplary | Excellent | Acceptable | Unacceptable
      Exceeds expectations | Meets expectations | Progressing | Not there yet
      Superior | Good | Fair | Needs work

    - Limited number of dimensions or criteria. The criteria are those components that are most important to evaluate in the given task and instructional context. A rubric with too many dimensions may be unworkable in classroom assessment.

    - Equal steps along the scale. The difference between 4 and 3 should be equivalent to the difference between 3 and 2 and between 2 and 1. "Yes, and more", "Yes", "Yes, but", and "No" are ways for the rubric developer to think about how to describe performance at each scale point. Some common descriptive terms are listed in the chart below.


    Criterion | 4 | 3 | 2 | 1
    Task requirements | All | Most | Some | Very few or none
    Frequency | Always | Usually | Some of the time | Rarely or not at all
    Accuracy | No errors | Few errors | Some errors | Frequent errors
    Comprehensibility | Always comprehensible | Almost always comprehensible | Gist and main ideas are comprehensible | Isolated bits are comprehensible
    Content coverage | Fully developed, fully supported | Adequately developed, adequately supported | Partially developed, partially supported | Minimally developed, minimally supported
    Vocabulary: Range | Broad | Adequate | Limited | Very limited
    Vocabulary: Variety | Highly varied; non-repetitive | Varied; occasionally repetitive | Lacks variety; repetitive | Basic, memorized; highly repetitive

    Considerations for Task-Based Rubric Development*

    Getting Started: Determining the criteria that will be valued for a particular task.

    1. Brainstorm all the possible elements or criteria that could be assessed in the

    performance task.

    2. Determine which elements are "non-negotiable." Which criteria could be part of the task

    description as baseline requirements or provided as a checklist?

    3. Prioritize the elements that are left:

    o What are the content and language goals of the unit?

    o What do you really want the students to emphasize in their performance?

    o How important is the overall "look" of the project (interest, appeal, creativity, neatness)?

    o Is culture represented in the rubric, if applicable?

    o Are the standards you targeted represented in the rubric?

    4. Determine 3 to 5 elements or criteria that will be incorporated in the rubric to define a

    quality performance. In trying to be thorough, you may construct an unwieldy rubric,

    with so many elements being assessed that the rubric is time-consuming to fill out or

    oral performances have to be taped (audio or video) so that they can be replayed several times for the purposes of assessment.

    Considering the Levels of the Rubric: Determining the number of levels and defining

    them.

    1. How many levels of performance do you wish to include in the rubric? How should they

    be defined? For example, "does not meet expectations," "meets expectations," "exceeds

    expectations." Or you may choose to use simply a 3, 4, or 5-point scale, noting,

    however, that a 3-point scale does not account for the fluctuation that exists within the

    average range. Some suggest that a 4-point scale is ideal, and that more than 4 points

    makes a scale cumbersome and difficult to use.


    2. Consider the elements or criteria you have chosen one at a time. Begin with the highest

    level of the scale to define top quality performance. This is the level that you want all

    students to achieve and it should be challenging. How would you describe a

    representation that exceeds expectations? Meets expectations? Does not meet

    expectations?

    3. Are the levels you have created parallel? That is, are the criteria present in all levels?

    4. Is there continuity in the difference between the criteria for exceeds vs. meets, and

    meets vs. does not meet expectations? The difference between a 2 and a 3 performance

    should not be more than the difference between a 3 and a 4 performance.

    5. Do the levels reflect variants in quality and not a shift in importance of the criteria?

    6. Is there an expectation of quality at the average (meets expectations) level of the scale?

    Other issues to consider for rubric creation:

    - Are the characteristics of each performance level described clearly? Will students be able to self-assess with the descriptors given? Will the descriptors give students enough information to know what they need to improve?

    - Does the rubric adequately reflect the range of levels at which students may actually perform given tasks?

    - Are the criteria at each level defined clearly enough to ensure that scoring is accurate, unbiased and consistent? Could several teachers use the rubric and score a student's performance within the same range?

    - Does the rubric attend to process as well as product?

    - Are all criteria equally important, or does it make sense to weight an element more than the others?

    - Are you attending carefully to the language used in the rubric? Use demonstrative verbs. Keep to observable behaviors. Avoid negatives ("begins without preparation" vs. "does not prepare"). Be specific. Instead of "many errors" you may want to specify "six or more errors". At the same time, be sure the rubric is generally qualitative in nature rather than quantitative.
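
    One way to check whether several teachers would score a performance "within the same range" is to have two raters score the same set of performances and count how often they agree. The Python sketch below is only an illustration under stated assumptions: the function name, the choice of exact and adjacent (within one scale point) agreement as measures, and the sample scores are all hypothetical and do not come from the source.

```python
def agreement_rates(rater_a, rater_b):
    """Return (exact, adjacent) agreement rates for two raters' scores.

    exact: share of performances given the same level by both raters.
    adjacent: share where the two levels differ by at most one scale point.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError(
            "both raters must score the same non-empty set of performances"
        )
    pairs = list(zip(rater_a, rater_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return exact, adjacent


# Hypothetical scores from two teachers on the same five performances
# (one criterion, 4-point scale).
teacher_1 = [4, 3, 2, 3, 1]
teacher_2 = [4, 2, 2, 3, 3]
exact, adjacent = agreement_rates(teacher_1, teacher_2)
print(exact, adjacent)  # 0.6 0.8
```

    Low agreement on a particular criterion may suggest that its level descriptors need sharper wording before the rubric is used for grading.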

    Other issues to consider when using rubrics:

    - Rubrics need to be piloted or field tested.

    - Rubrics need to be discussed with students to create an understanding of expectations; you cannot write a paragraph defining each word in the rubric.

    - Are the criteria at each level defined clearly enough to ensure that scoring is accurate, unbiased and consistent? Could several teachers use the rubric and score a student's performance within the same range?

    - There is a fine balance between modeling excellent work and creating a "template" that is replicated by the students ad nauseam to the detriment of creativity.

    - "If a student can achieve a high score on all the criteria and still not perform well at the task, you have the wrong criteria" (Wiggins, cited in Clementi, 1999).

    - Consider whether a rubric needs revision for a specific task. Do some of the criteria on the rubric go beyond this particular performance (that is, if you've created a rubric that is more "generic" and can be used for many tasks over time)?

    - Make sure that the expectations in the rubric are directly aligned with the instruction of the lesson/unit. Students shouldn't be expected to do what they haven't been previously taught or shown.

    - Some suggest that generic rubrics are more useful because creating rubrics is time-consuming and the more often they can be applied, the better. It is also more informative for students if the same rubric is used again and again, because they can see themselves making progress over time. On the other hand, generic rubrics are much less tied to the task and are not able to provide criteria for specific language use expectations or content knowledge.


    Process: Grades

    While we use rubrics to judge how well learners have met appropriate and clear

    expectations set for them and to give them feedback about their performance along a quality

    continuum, we also know that generally we are expected to issue grades as a descriptor of

    student performance.

    Teachers often construct their own scales for converting rubrics to grades. As with rubrics,

    students need to be clearly informed of what the lowest "acceptable" score would be. The

    Fairfax County (VA) Public Schools World Language web site provides one model of a formula

    for converting scores from a point-based analytical rubric to a grade. Choose Performance

    Assessments for Language Students (PALS). Under each level, you will see a conversion

    chart for each rubric and task type listed.

    The assessment section of the Nebraska Foreign Language Frameworks, available in PDF

    format, also features an assessment score conversion chart for converting a series of raw

    scores to percentage scores. The chart is found on the last page of the PDF document.
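
    To make the idea of a conversion scale concrete, here is a minimal Python sketch of one possible approach: a straight-line mapping in which the lowest "acceptable" rubric total corresponds to a passing percentage and the maximum total corresponds to 100. This is an assumption made for illustration only. It is not the Fairfax County formula or the Nebraska chart, and the function name, the 70 percent default, and the sample rubric are hypothetical.

```python
def rubric_to_percent(score, max_score, lowest_acceptable, passing_percent=70.0):
    """Linearly map a rubric total onto a percentage scale.

    The lowest acceptable total maps to passing_percent and the maximum
    total maps to 100; totals below the acceptable floor continue down the
    same line and are clipped at 0.
    """
    if not lowest_acceptable < max_score:
        raise ValueError("lowest_acceptable must be below max_score")
    if not 0 <= score <= max_score:
        raise ValueError("score must lie between 0 and max_score")
    slope = (100.0 - passing_percent) / (max_score - lowest_acceptable)
    percent = passing_percent + slope * (score - lowest_acceptable)
    return max(0.0, percent)


# Hypothetical rubric: four criteria on a 4-point scale (maximum total 16),
# with a total of 8 (a 2 on every criterion) treated as the lowest
# acceptable score.
for total in (16, 12, 8, 4):
    print(total, rubric_to_percent(total, max_score=16, lowest_acceptable=8))
```

    A mapping of this kind avoids treating the raw ratio as a percentage: in the hypothetical rubric above, a total of 8 out of 16 would read as 50 percent even though every criterion was rated at an acceptable level.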
