Topic 7: Scoring, Grading and Assessment Criteria
Lecturer: Yee Bee Choo
IPGKTHO
Approaches of Scoring: Objective, Holistic, Analytic
Objective scoring relies on quantified methods of evaluating students’ writing. A sample of how objective scoring is conducted is given by Bailey (1999) as follows:
1. Establish standardization by limiting the length of the assessment: Count the first 250 words of the essay.
2. Identify the elements to be assessed: Go through the essay up to the 250th word underlining every mistake – from spelling and mechanics through verb tenses, morphology, vocabulary, etc. Include every error that a literate reader might note.
3. Operationalise the assessment: Assign a weight score to each error, from 3 to 1. A score of 3 is a severe distortion of readability or flow of ideas; 2 is a moderate distortion; and 1 is a minor error that does not affect readability in any significant way.
4. Quantify the assessment: Calculate the essay Correctness Score by using 250 words as the numerator of a fraction, and the sum of error scores as the denominator.
Essay correctness = 250 / (sum of error scores)
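The four steps above can be sketched in code. This is only an illustration under stated assumptions: the function name `essay_correctness` and the sample error weights are hypothetical, and the formula follows step 4 as worded (250 words as the numerator, the sum of error scores as the denominator).

```python
WORD_LIMIT = 250  # step 1: only the first 250 words are assessed


def essay_correctness(error_weights):
    """Return 250 divided by the sum of the error weights (step 4).

    error_weights holds one integer per underlined error, each 1, 2 or 3 (step 3).
    Returns None when no errors were marked, because the fraction is undefined.
    """
    for weight in error_weights:
        if weight not in (1, 2, 3):
            raise ValueError(f"error weight must be 1, 2 or 3, got {weight}")
    total = sum(error_weights)
    if total == 0:
        return None
    return WORD_LIMIT / total


# Hypothetical essay: two severe, three moderate and four minor errors.
weights = [3, 3, 2, 2, 2, 1, 1, 1, 1]
print(essay_correctness(weights))  # 250 / 16 = 15.625
```

Under this formula, a higher score means fewer or less severe errors within the first 250 words.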
Advantages
Emphasises the students’ strengths rather than their weaknesses.
Disadvantages
Still some degree of subjectivity involved.
Accentuates negative aspects of the learner’s writing without giving credit for what they can do well.
Holistic scoring is often referred to as "impressionistic" scoring.
Involves the assignment of a single score to a piece of writing on the basis of an overall impression of it.
Individual features of a text, such as grammar, spelling, and organization, should not be considered as separate entities.
Has the advantage of being very rapid (Hughes 1989: 86).
Holistic Scoring Scale used by Educational Testing Service for evaluating the Advanced Placement Examination in foreign languages:
◦ works well and can be altered to fit the level of the students and the focus of instruction.
◦ numerical scale that ranks performance at levels described as "superior," "competent," and "incompetent."
◦ for each level, the descriptions can be changed to reflect the kind of performance that teachers expect at a given level of language ability.
◦ reliability of this scoring method is considered good when the raters are trained to establish common standards based on practice with the kinds of writing samples that they will be evaluating (Cooper 1977).
Demonstrates Superiority
9 Strong control of the language; proficiency and variety in grammatical usage with few significant errors; broad command of vocabulary and of idiomatic language
Demonstrates Competence
7-8 Good general control of grammatical structures despite some errors and/or some awkwardness of style. Good use of idioms and vocabulary. Reads smoothly overall.
Suggests Competence
5-6 Fair ability to express ideas in target language; correct use of simple grammatical structures or use of more complex structures without numerous serious errors. Some apt vocabulary and idioms. Occasional signs of fluency and sense of style.
Suggests Incompetence
3-4 Weak use of language with little control of grammatical structures. Limited vocabulary. Frequent use of anglicisms, which force interpretations on the part of the reader. Occasional redeeming features.
Demonstrates Incompetence
1-2 Clearly unacceptable from most points of view. Almost total lack of vocabulary resources, little or no sense of idiom and/or style. Essentially translated from English.
Floating Point: A one-point bonus should be awarded for a coherent and well-organized essay or for a particularly inventive one.
In holistic scoring, the reader reacts to the students’ compositions as a whole and a single score is awarded to the writing.
Normally this score is on a scale of 1 to 4, or 1 to 6, or even 1 to 10 (Bailey, 1998: 187).
Each score on the scale will be accompanied by general descriptors of ability.
The following are examples of holistic scoring schemes, each based on a five-band scale.
TSL3123 Language Assessment
Assessment Rubrics
Task 1: Collecting Samples of Different Test Types (30%)
Scale Criteria
HIGH DISTINCTION (90-100): 27-30 MARKS
Excellent compilation of samples from different types.
Ideas are very effectively represented in i-Think thinking maps.
Accurate grammar, vocabulary and spelling; errors, if any, are only minor first-draft slips.
DISTINCTION (70-89): 23-26 MARKS
Very good compilation of samples from different types.
Ideas are effectively represented in i-Think thinking maps.
Almost entirely accurate grammar, vocabulary and spelling.
CREDIT (60-69): 18-22 MARKS
Good compilation of samples from different types.
Ideas are well represented in i-Think thinking maps.
A few errors in grammar, vocabulary and spelling.
PASS (50-59): 15-17 MARKS
Satisfactory compilation of samples from limited types.
Some ideas are satisfactorily represented in i-Think thinking maps.
Some errors in grammar, vocabulary and spelling.
FAIL (0-49): 0-14 MARKS
Incomplete compilation or poor choice of samples.
Inability to represent relevant ideas in i-Think thinking maps.
Errors are numerous and gross, and impede understanding.
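The mark ranges in the Task 1 rubric can be read as a simple lookup from marks to band. The sketch below is illustrative only: the names `BANDS_30` and `band_for` are hypothetical, and it assumes marks are awarded as whole numbers out of 30.

```python
# Mark ranges for a task marked out of 30, copied from the Task 1 rubric.
BANDS_30 = [
    (27, 30, "HIGH DISTINCTION"),
    (23, 26, "DISTINCTION"),
    (18, 22, "CREDIT"),
    (15, 17, "PASS"),
    (0, 14, "FAIL"),
]


def band_for(marks, bands=BANDS_30):
    """Return the band label whose mark range contains the given whole-number mark."""
    for low, high, label in bands:
        if low <= marks <= high:
            return label
    raise ValueError(f"marks out of range: {marks}")


print(band_for(24))  # DISTINCTION
```

A different list of ranges can be passed as `bands` for a task with another mark total.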
Task 2: Designing Language Test and Preparing Marking Rubric (30%)
Scale Criteria
HIGH DISTINCTION (90-100): 27-30 MARKS
The test format is highly appropriate and meets the requirements of the learning outcomes.
The test is very effectively developed using Bloom’s and SOLO taxonomies.
The scoring rubric is very effectively developed.
The test consists of at least four items.
Excellent and well organised. Excellent language and no grammatical errors.
DISTINCTION (70-89): 23-26 MARKS
The test format is very appropriate and meets most of the requirements of the learning outcomes.
The test is effectively developed using Bloom’s and SOLO taxonomies.
The scoring rubric is effectively developed.
The test consists of at least four items.
Very good organisation. Very good language with one or two grammatical errors.
CREDIT (60-69): 18-22 MARKS
The test format is appropriate and meets many of the requirements of the learning outcomes.
The test is well developed using Bloom’s and SOLO taxonomies.
The scoring rubric is well developed.
The test consists of three items.
Good organisation. Good language with a few grammatical errors.
PASS (50-59): 15-17 MARKS
The test format is acceptable and meets some of the requirements of the learning outcomes.
The test is satisfactorily developed using Bloom’s and SOLO taxonomies.
The scoring rubric is satisfactorily developed.
The test consists of two items.
Satisfactory organisation. Satisfactory language with some grammatical errors.
FAIL (0-49): 0-14 MARKS
The test format is not acceptable and does not meet most of the requirements of the learning outcomes.
The test is very poorly developed using Bloom’s and SOLO taxonomies.
The scoring rubric is very poorly developed.
The test consists of one item.
Poor organisation. Poor language with many grammatical errors.
Task 3: Reporting the Test Results (30%)
Scale Criteria
HIGH DISTINCTION (90-100): 27-30 MARKS
Excellent description of the report, supported by a very comprehensive range of item analysis.
Convincing, critical and detailed interpretation based on the test results.
Convincing, critical and detailed discussion and suggestions.
All references are cited using correct APA format.
DISTINCTION (70-89): 23-26 MARKS
Very good description of the report, supported by a substantial range of item analysis.
Critical and detailed interpretation based on the test results.
Critical and detailed discussion and suggestions.
Most references are cited using correct APA format.
CREDIT (60-69): 18-22 MARKS
Detailed description of the report, supported by a good range of item analysis.
Detailed interpretation based on the test results.
Detailed discussion and suggestions.
Many references are cited using correct APA format.
PASS (50-59): 15-17 MARKS
Reasonable description of the report, but may lack some supporting item analysis.
Competent and acceptable interpretation based on the test results.
Competent and acceptable discussion and suggestions.
Some references are cited using correct APA format.
FAIL (0-49): 0-14 MARKS
Inadequate description of the report with insufficient supporting item analysis.
Inadequate interpretation based on the test results.
Inadequate discussion and suggestions.
References are poorly cited or not cited in APA format.
Task 4: Online Participation (10%)
Scale Criteria
HIGH DISTINCTION (90-100): 9-10 MARKS
Very high frequency of participation in answering the questions posed in the Facebook group.
Excellent opinion on the discussion question.
Excellent understanding of the issues involved in the discussion question.
Accurate grammar, vocabulary and spelling; errors, if any, are only minor first-draft slips.
DISTINCTION (70-89): 7-8 MARKS
High frequency of participation in answering the questions posed in the Facebook group.
Very good opinion on the discussion question.
Very good understanding of the issues involved in the discussion question.
Almost entirely accurate grammar, vocabulary and spelling.
CREDIT (60-69): 6 MARKS
Always participates in answering the questions posed in the Facebook group.
Good opinion on the discussion question.
Good understanding of the issues involved in the discussion question.
A few errors in grammar, vocabulary and spelling.
PASS (50-59): 5 MARKS
Sometimes participates in answering the questions posed in the Facebook group.
Adequate opinion on the discussion question.
Adequate understanding of the issues involved in the discussion question.
Some errors in grammar, vocabulary and spelling.
FAIL (0-49): 0-4 MARKS
Seldom participates in answering the questions posed in the Facebook group.
Weak opinion on the discussion question.
Limited understanding of the issues involved in the discussion question.
Many errors in grammar, vocabulary and spelling that impede communication.
Advantages
Quickly graded.
Provides a public standard that is understood by teachers and students alike.
Relatively high degree of rater reliability.
Applicable to the assessment of many different topics.
Emphasises the students’ strengths rather than their weaknesses.
Disadvantages
The single score may actually mask differences across individual compositions.
Inasmuch as the whole is often greater than the sum of its parts, a composite score may be very reliable but not valid (Hughes 1989: 93-94).
Does not provide a lot of diagnostic feedback. The student will not necessarily know the reason for his or her grade on the writing.
Analytic scoring is a method of scoring that requires a separate score for each of a number of aspects of a task, such as grammatical accuracy, vocabulary, idiomatic expression, organization, relevance and coherence.
Disposes of the problem of uneven development of subskills in individuals.
Scorers are compelled to consider aspects of performance which they might otherwise ignore.
The very fact that the scorer has to give a number of scores will tend to make the scoring more reliable.
Characteristics:
Involves the separation of the various features of a composition into components for scoring purposes
Scoring schemes provide more detailed information about a test taker’s performance
Components Weightage
Content 30 points
Organisation 20 points
Vocabulary 20 points
Language use 25 points
Mechanics 5 points
TOTAL 100 points
The points assigned to each component reflect the importance of each of the components.
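The weightage table can be turned into a small scoring check. The following is a minimal sketch, assuming each component is scored from 0 up to its listed maximum and the total is the plain sum; the names `MAX_POINTS` and `analytic_total` and the sample scores are hypothetical.

```python
# Maximum points per component, taken from the weightage table.
MAX_POINTS = {
    "Content": 30,
    "Organisation": 20,
    "Vocabulary": 20,
    "Language use": 25,
    "Mechanics": 5,
}


def analytic_total(scores):
    """Sum the component scores after checking each against its maximum."""
    if set(scores) != set(MAX_POINTS):
        raise ValueError("a score is required for every component, and no others")
    for component, points in scores.items():
        if not 0 <= points <= MAX_POINTS[component]:
            raise ValueError(
                f"{component} must be between 0 and {MAX_POINTS[component]}"
            )
    return sum(scores.values())


# Hypothetical scores for one essay.
sample = {
    "Content": 24,
    "Organisation": 15,
    "Vocabulary": 16,
    "Language use": 18,
    "Mechanics": 4,
}
print(analytic_total(sample))  # 77
```

Keeping the per-component scores alongside the total is what gives analytic scoring its diagnostic value: the breakdown shows where marks were lost.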
Advantages
It provides clear guidelines in grading in the form of the various components.
Allows the graders to consciously address important aspects of writing.
It presents a good analysis of a problem and/or strongly states a position.
It allows students to see areas in their own essays that need work when accompanied by written comments and a breakdown of the final score. Its diagnostic nature helps students improve.
Disadvantages
It is time-consuming.
Teachers may make many judgments about one piece of writing.
Writing ability is unnaturally split up into components.
Concentration on the different aspects may divert attention from the overall effect of the piece of writing.
Negative feedback can be pedagogically destructive. Teachers who combine analytic scoring with confrontational or unclear comments - especially about issues of grammar - may actually inhibit student growth.
Ways to Maximise the Effectiveness of the Scoring Methods:
Coming up with a written analytic scale to define grading criteria clearly.
Weighting the criteria according to their importance. For example, if the goal of an assignment is the assimilation of course material, then logic, ideas, arrangement and resourcefulness are rewarded more than grammar and mechanics.
Providing formative feedback in the form of marginal and end comments that are balanced and challenging yet supportive of the students.
Avoiding use of sarcasm in comments.
1. Traditional Letter Grade System
Students’ performance is summarised by means of letters
Easy to understand but of limited value when used as the sole report
Grades end up being a combination of achievement, effort, work habits and behaviour.
Difficult to interpret
Do not indicate patterns of strengths and weaknesses
A: Excellent
B: Good
C: Average
D: Needs Improvement
E: Failure
2. Pass-Fail System
Utilises a dichotomous grade system
Does not provide much information
Students tend to work to the minimum (just to pass)
In mastery learning courses, no grades are reflected until the “mastery” threshold is reached.
Grades: Pass or Fail
3. Checklist of Objectives
Objectives of the courses are enumerated
After each objective, the students’ level of achievement is indicated
Very detailed reporting system
More informative for parents and students
Time-consuming to prepare
Potential problem: keeping the list manageable and understandable
Example:
Objective 1: Fair
Objective 2: Outstanding
Objective 3: Very Poor
Objective 4: Good
Objective 5: Very Good