Topic 7: Scoring, Grading and Assessment Criteria

Lecturer: Yee Bee Choo

IPGKTHO

Approaches of Scoring: Objective, Holistic, Analytic

Objective scoring relies on quantified methods of evaluating students’ writing. A sample of how objective scoring is conducted is given by Bailey (1999) as follows:

1. Establish standardization by limiting the length of the assessment: Count the first 250 words of the essay.

2. Identify the elements to be assessed: Go through the essay up to the 250th word underlining every mistake – from spelling and mechanics through verb tenses, morphology, vocabulary, etc. Include every error that a literate reader might note.

3. Operationalise the assessment: Assign a weight score to each error, from 3 to 1. A score of 3 is a severe distortion of readability or flow of ideas; 2 is a moderate distortion; and 1 is a minor error that does not affect readability in any significant way.

4. Quantify the assessment: Calculate the essay Correctness Score by using 250 words as the numerator of a fraction, and the sum of error scores as the denominator.

Essay correctness = 250 / (sum of error scores)
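The four steps above can be sketched in a few lines of Python. This is only an illustration: the function name `correctness_score`, the sample error list, and the choice to return infinity for an essay with no errors are assumptions, not part of Bailey's procedure.

```python
def correctness_score(error_weights, word_limit=250):
    """Divide the word limit (numerator) by the sum of error weights (denominator).

    error_weights holds one weight per underlined error in the first
    `word_limit` words: 3 = severe, 2 = moderate, 1 = minor.
    """
    for weight in error_weights:
        if weight not in (1, 2, 3):
            raise ValueError(f"error weight must be 1, 2 or 3, got {weight!r}")
    total = sum(error_weights)
    if total == 0:
        # Assumption: the source does not say how to score an error-free
        # essay, so this sketch returns infinity instead of dividing by zero.
        return float("inf")
    return word_limit / total


# Hypothetical essay: one severe, two moderate and three minor errors.
errors = [3, 2, 2, 1, 1, 1]
print(correctness_score(errors))  # 250 / 10 = 25.0
```

With this formula a higher score means fewer or less serious errors in the first 250 words.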

Advantages

Emphasises the students’ strengths rather than their weaknesses.

Disadvantages

Still some degree of subjectivity involved.

Accentuates negative aspects of the learner’s writing without giving credit for what they can do well.

Holistic scoring is often referred to as "impressionistic" scoring.

Involves the assignment of a single score to a piece of writing on the basis of an overall impression of it.

Individual features of a text, such as grammar, spelling, and organization, should not be considered as separate entities.

Has the advantage of being very rapid (Hughes 1989: 86).

Holistic Scoring Scale used by Educational Testing Service for evaluating the Advanced Placement Examination in foreign languages:

◦ works well and can be altered to fit the level of the students and the focus of instruction.

◦ numerical scale that ranks performance at levels described as "superior," "competent," and "incompetent."

◦ for each level, the descriptions can be changed to reflect the kind of performance that teachers expect at a given level of language ability.

◦ reliability of this scoring method is considered good when the raters are trained to establish common standards based on practice with the kinds of writing samples that they will be evaluating (Cooper 1977).

Demonstrates Superiority

9 Strong control of the language; proficiency and variety in grammatical usage with few significant errors; broad command of vocabulary and of idiomatic language

Demonstrates Competence

7-8 Good general control of grammatical structures despite some errors and/or some awkwardness of style. Good use of idioms and vocabulary. Reads smoothly overall.

Suggests Competence

5-6 Fair ability to express ideas in target language; correct use of simple grammatical structures or use of more complex structures without numerous serious errors. Some apt vocabulary and idioms. Occasional signs of fluency and sense of style.

Suggests Incompetence

3-4 Weak use of language with little control of grammatical structures. Limited vocabulary. Frequent use of anglicisms, which force interpretations on the part of the reader. Occasional redeeming features.

Demonstrates Incompetence

1-2 Clearly unacceptable from most points of view. Almost total lack of vocabulary resources, little or no sense of idiom and/or style. Essentially translated from English.

Floating Point: A one-point bonus should be awarded for a coherent and well-organized essay or for a particularly inventive one.
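The scale above can be read as a simple lookup from score to level. The sketch below shows one way to encode it; the names `LEVELS`, `holistic_level` and `with_bonus` are illustrative, and capping the bonus at 9 is an assumption, since the scale does not say what happens when a 9 earns the floating point.

```python
# (lowest score, highest score, level label) taken from the scale above.
LEVELS = [
    (9, 9, "Demonstrates Superiority"),
    (7, 8, "Demonstrates Competence"),
    (5, 6, "Suggests Competence"),
    (3, 4, "Suggests Incompetence"),
    (1, 2, "Demonstrates Incompetence"),
]


def holistic_level(score):
    """Return the level label for a holistic score from 1 to 9."""
    for low, high, label in LEVELS:
        if low <= score <= high:
            return label
    raise ValueError(f"score must be between 1 and 9, got {score!r}")


def with_bonus(score, bonus=False, max_score=9):
    """Add the one-point floating bonus (assumption: capped at max_score)."""
    return min(score + 1, max_score) if bonus else score


# Hypothetical essay rated 6 that also earns the bonus for being well organized.
final = with_bonus(6, bonus=True)
print(final, holistic_level(final))  # 7 Demonstrates Competence
```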

In holistic scoring, the reader reacts to the students’ compositions as a whole and a single score is awarded to the writing.

Normally this score is on a scale of 1 to 4, or 1 to 6, or even 1 to 10 (Bailey, 1998: 187).

Each score on the scale will be accompanied with general descriptors of ability.

The following is an example of a holistic scoring scheme based on a 6 point scale.

TSL3123 Language Assessment

Assessment Rubrics

Task 1: Collecting Samples of Different Test Types (30%)

Scale Criteria

HIGH DISTINCTION (90-100): 27-30 MARKS

Excellent compilation of samples from different types.

Ideas are very effectively represented in i-Think thinking maps.

Accurate grammar, vocabulary and spelling; errors, if any, are only minor first-draft slips.

DISTINCTION (70-89): 23-26 MARKS

Very good compilation of samples from different types.

Ideas are effectively represented in i-Think thinking maps.

Almost entirely accurate grammar, vocabulary and spelling.

CREDIT (60-69): 18-22 MARKS

Good compilation of samples from different types.

Ideas are well represented in i-Think thinking maps.

A few errors in grammar, vocabulary and spelling.

PASS (50-59): 15-17 MARKS

Satisfactory compilation of samples from limited types.

Some ideas are satisfactorily represented in i-Think thinking maps.

Some errors in grammar, vocabulary and spelling.

FAIL (0-49): 0-14 MARKS

Incomplete compilation/poor choice of samples.

Inability to represent relevant ideas in i-Think thinking maps.

Errors are numerous and gross, and impede understanding.

Assessment Rubrics

Task 2: Designing Language Test and Preparing Marking Rubric (30%)

Scale Criteria

HIGH DISTINCTION (90-100): 27-30 MARKS

The test format is highly appropriate and meets the requirements of the learning outcomes.

The test is very effectively developed using Bloom’s and SOLO taxonomies.

The scoring rubric is very effectively developed.

The test consists of at least four items.

Excellent and well organised. Excellent language and no grammatical errors.

DISTINCTION (70-89): 23-26 MARKS

The test format is very appropriate and meets most of the requirements of the learning outcomes.

The test is effectively developed using Bloom’s and SOLO taxonomies.

The scoring rubric is effectively developed.

The test consists of at least four items.

Very good organisation. Very good language with one or two grammatical errors.

CREDIT (60-69): 18-22 MARKS

The test format is appropriate and meets many of the requirements of the learning outcomes.

The test is well developed using Bloom’s and SOLO taxonomies.

The scoring rubric is well developed.

The test consists of three items.

Good organisation. Good language with a few grammatical errors.

PASS (50-59): 15-17 MARKS

The test format is acceptable and meets some of the requirements of the learning outcomes.

The test is satisfactorily developed using Bloom’s and SOLO taxonomies.

The scoring rubric is satisfactorily developed.

The test consists of two items.

Satisfactory organisation. Satisfactory language with some grammatical errors.

FAIL (0-49): 0-14 MARKS

The test format is not acceptable and does not meet most of the requirements of the learning outcomes.

The test is very poorly developed using Bloom’s and SOLO taxonomies.

The scoring rubric is very poorly developed.

The test consists of one item.

Poor organisation. Poor language with many grammatical errors.

Assessment Rubrics

Task 3: Reporting the Test Results (30%)

Scale Criteria

HIGH DISTINCTION (90-100): 27-30 MARKS

Excellent description of the report, supported by a very comprehensive range of item analysis.

Convincing, critical and detailed interpretation based on the test results.

Convincing, critical and detailed discussion and suggestions.

All references are cited using correct APA format.

DISTINCTION (70-89): 23-26 MARKS

Very good description of the report, supported by a substantial range of item analysis.

Critical and detailed interpretation based on the test results.

Critical and detailed discussion and suggestions.

Most references are cited using correct APA format.

CREDIT (60-69): 18-22 MARKS

Detailed description of the report, supported by a good range of item analysis.

Detailed interpretation based on the test results.

Detailed discussion and suggestions.

Many references are cited using correct APA format.

PASS (50-59): 15-17 MARKS

Reasonable description of the report, but may lack some supporting item analysis.

Competent and acceptable interpretation based on the test results.

Competent and acceptable discussion and suggestions.

Some references are cited using correct APA format.

FAIL (0-49): 0-14 MARKS

Inadequate description of the report with insufficient supporting item analysis.

Inadequate interpretation based on the test results.

Inadequate discussion and suggestions.

References are poorly cited or not cited in APA format.

Assessment Rubrics

Task 4: Online Participation (10%)

Scale Criteria

HIGH DISTINCTION (90-100): 9-10 MARKS

Very high frequency of participation in answering the questions posed in the Facebook group.

Excellent opinion on the discussion question.

Excellent understanding of the issues involved in the discussion question.

Accurate grammar, vocabulary and spelling; errors, if any, are only minor first-draft slips.

DISTINCTION (70-89): 7-8 MARKS

High frequency of participation in answering the questions posed in the Facebook group.

Very good opinion on the discussion question.

Very good understanding of the issues involved in the discussion question.

Almost entirely accurate grammar, vocabulary and spelling.

CREDIT (60-69): 6 MARKS

Always participates in answering the questions posed in the Facebook group.

Good opinion on the discussion question.

Good understanding of the issues involved in the discussion question.

A few errors in grammar, vocabulary and spelling.

PASS (50-59): 5 MARKS

Sometimes participates in answering the questions posed in the Facebook group.

Adequate opinion on the discussion question.

Adequate understanding of the issues involved in the discussion question.

Some errors in grammar, vocabulary and spelling.

FAIL (0-49): 0-4 MARKS

Seldom participates in answering the questions posed in the Facebook group.

Weak opinion on the discussion question.

Limited understanding of the issues involved in the discussion question.

Many errors in grammar, vocabulary and spelling that impede communication.
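The percentage ranges shared by the four task rubrics can be expressed as a small lookup. The following is a sketch only: `CUTOFFS`, `BANDS` and `band_for` are illustrative names, it uses the percentage ranges rather than the per-task mark ranges, and treating a fractional percentage such as 49.5 as belonging to the lower band is an assumption.

```python
from bisect import bisect_right

# Lower bounds (in percent) of PASS, CREDIT, DISTINCTION and HIGH DISTINCTION.
CUTOFFS = [50, 60, 70, 90]
BANDS = ["FAIL", "PASS", "CREDIT", "DISTINCTION", "HIGH DISTINCTION"]


def band_for(percent):
    """Return the rubric band for a percentage between 0 and 100."""
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be between 0 and 100, got {percent!r}")
    # bisect_right counts how many cutoffs are less than or equal to percent,
    # which is exactly the index of the matching band.
    return BANDS[bisect_right(CUTOFFS, percent)]


for mark in (49, 50, 69, 70, 89, 90):
    print(mark, band_for(mark))
```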

Advantages

Quickly graded.

Provides a public standard that is understood by teachers and students alike.

Relatively higher degree of rater reliability.

Applicable to the assessment of many different topics.

Emphasises the students’ strengths rather than their weaknesses.

Disadvantages

The single score may actually mask differences across individual compositions.

Inasmuch as the whole is often greater than the sum of its parts, a composite score may be very reliable but not valid (Hughes 1989: 93-94).

Does not provide a lot of diagnostic feedback. The student will not necessarily know the reason for his or her grade on the writing.

Analytic scoring is a method of scoring that requires a separate score for each of a number of aspects of a task, such as grammatical accuracy, vocabulary, idiomatic expression, organization, relevance and coherence.

Disposes of the problem of uneven development of subskills in individuals.

Scorers are compelled to consider aspects of performance which they might otherwise ignore.

The very fact that the scorer has to give a number of scores will tend to make the scoring more reliable.

Characteristics:

Involves the separation of the various features of a composition into components for scoring purposes

Scoring schemes provide more detailed information about a test taker’s performance

Components Weightage

Content 30 points

Organisation 20 points

Vocabulary 20 points

Language use 25 points

Mechanics 5 points

TOTAL 100 points

The points assigned to each component reflect the importance of each of the components.
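The weightage table above lends itself to a short worked example. The sketch below adds up component scores and checks that none exceeds its allotted points; the name `analytic_total` and the sample essay scores are hypothetical.

```python
# Maximum points per component, from the weightage table above.
WEIGHTS = {
    "Content": 30,
    "Organisation": 20,
    "Vocabulary": 20,
    "Language use": 25,
    "Mechanics": 5,
}


def analytic_total(scores):
    """Sum the component scores after checking each against its maximum."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    for name, max_points in WEIGHTS.items():
        if not 0 <= scores[name] <= max_points:
            raise ValueError(f"{name} must be between 0 and {max_points}")
    return sum(scores[name] for name in WEIGHTS)


# Hypothetical essay scores, one per component.
essay = {
    "Content": 24,
    "Organisation": 15,
    "Vocabulary": 16,
    "Language use": 18,
    "Mechanics": 4,
}
print(analytic_total(essay))  # 24 + 15 + 16 + 18 + 4 = 77 out of 100
```

Keeping the per-component breakdown alongside the total is what gives analytic scoring its diagnostic value: the student can see which component lost the most points.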

Advantages

It provides clear guidelines in grading in the form of the various components.

Allows the graders to consciously address important aspects of writing.

It presents a good analysis of a problem and/or strongly states a position.

It allows students to see areas in their own essays that need work when accompanied by written comments and a breakdown of the final score. Its diagnostic nature helps students to improve.

Disadvantages

It is time-consuming.

Teachers may make many judgments about one piece of writing.

Writing ability is unnaturally split up into components.

Concentration on the different aspects may divert attention from the overall effect of the piece of writing.

Negative feedback can be pedagogically destructive. Teachers who combine analytic scoring with confrontational or unclear comments - especially about issues of grammar - may actually inhibit student growth.

Ways to Maximise the Effectiveness of the Scoring Methods:

Coming up with a written analytic scale to define grading criteria clearly.

Weighting the criteria according to their importance. For example, if the goal of an assignment is the assimilation of course material, then logic, ideas, arrangement and resourcefulness are rewarded more than grammar and mechanics.

Providing formative feedback in the form of marginal and end comments that are balanced and challenging yet supportive of the students.

Avoiding use of sarcasm in comments.

1. Traditional Letter Grade System

Students’ performance is summarised by means of letters

Easy to understand but of limited value when used as the sole report

Grades end up being a combination of achievement, effort, work habits and behaviour.

Difficult to interpret

Do not indicate patterns of strengths and weaknesses

• A: Excellent

• B: Good

• C: Average

• D: Needs Improvement

• E: Failure

2. Pass-Fail System

Utilises a dichotomous grade system

Does not provide much information

Students tend to work to the minimum (just to pass)

In mastery learning courses, no grades are reflected until the “mastery” threshold is reached.

Grades: Pass, Fail

3. Checklist of Objectives

Objectives of the courses are enumerated

After each objective, the students’ level of achievement is indicated

Very detailed reporting system

More informative for parents and students

Time-consuming to prepare

Potential problem: keeping the list manageable and understandable

• Objective 1: Fair

• Objective 2: Outstanding

• Objective 3: Very Poor

• Objective 4: Good

• Objective 5: Very Good