Michigan Assessment Consortium Common Assessment Development Series, Module 14 – Presenting the Results of an Assessment. Developed by Bruce R. Fay, PhD & Ellen Vorenkamp, EdD, Assessment Consultants, Wayne RESA.
![Page 1: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/1.jpg)
Michigan Assessment Consortium
Common Assessment Development Series
Module 14 – Presenting the Results
of an Assessment
![Page 2: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/2.jpg)
Developed by
Bruce R. Fay, PhD &
Ellen Vorenkamp, EdD
Assessment Consultants
Wayne RESA
![Page 3: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/3.jpg)
Support
The Michigan Assessment Consortium professional development series in common assessment development is funded in part by the Michigan Association of Intermediate School Administrators in cooperation with …
![Page 4: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/4.jpg)
In Module 14 you will learn about
Score types…
Standards-based reports…
Graphical representations…
![Page 5: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/5.jpg)
So, you’ve…
Developed a test (for use as a ‘common’ assessment)
Pilot / field-tested it (right?)
Looked at the field test results (of course)
Now what?
![Page 6: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/6.jpg)
Presenting Your Results
Before you present the results of your test, you need to be clear about:
Who the audience is
Why they are seeing this data (What?)
Why they should care about it (So what?)
What you want them to do as a result of seeing it (Now what?)
![Page 7: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/7.jpg)
SCORE TYPES
![Page 8: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/8.jpg)
A score by any other name
Many score types that you may have heard of are really only appropriate for Norm-Referenced Tests (NRT), such as percentile rank, stanine, and grade level equivalent.
Your common assessment is a Criterion-Referenced Test (CRT), so let’s focus on score types that are appropriate for that.
![Page 9: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/9.jpg)
Raw Scores
Number of items correct or Number of points earned
Q? What’s the difference?
A! None, if each item has the same point value, otherwise…
![Page 10: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/10.jpg)
Scaled Score (equal weight)
If each test item has the same “weight”, say 1 point (1 if correct, 0 if wrong), then % correct is:
The simplest scaled score you can create
The same as % points earned
Puts the raw score on a scale of 0 – 100
![Page 11: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/11.jpg)
Scaled Score (unequal weight)
If each test item does not have the same number of points (there are weighted and/or partial credit items on the test), then % correct becomes % of total possible points earned
You still end up with a 0 – 100 scale
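As a minimal sketch of the two cases above, percent correct can be computed as points earned divided by total points possible. The function name `percent_of_points` and the sample item scores are illustrative assumptions, not part of the module.

```python
def percent_of_points(earned, possible):
    """Return points earned as a percentage of total points possible (0-100)."""
    total_possible = sum(possible)
    if total_possible == 0:
        raise ValueError("test has no points possible")
    return 100 * sum(earned) / total_possible

# Equal weight: 25 one-point items, 20 answered correctly (hypothetical student).
equal_earned = [1] * 20 + [0] * 5
equal_possible = [1] * 25
print(percent_of_points(equal_earned, equal_possible))  # 80.0

# Unequal weight: three 1-point items plus one 4-point partial-credit item
# (hypothetical); the student earns 2 of the 4 partial-credit points.
weighted_earned = [1, 0, 1, 2]
weighted_possible = [1, 1, 1, 4]
print(round(percent_of_points(weighted_earned, weighted_possible), 1))  # 57.1
```

With equal weights this is the same as the fraction of items correct; with unequal weights only the point totals matter, so the result still lands on the 0 – 100 scale.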
![Page 12: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/12.jpg)
% Correct Features (Issues)
Features
A “common” scale, as in “widely used”
A “common” scale, as in “the same regardless of raw score points”
Intuitively interpretable (maybe)
Permits comparisons between different tests
Issues
Can/will be misinterpreted
Can make a 10 point test and a 100 point test appear equally important
Widely held belief that scores in certain ranges (60-70, 70-80, etc.) have some inherent meaning
![Page 13: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/13.jpg)
% Correct Interpretation
Q? Is 50% correct good or bad?
A! We don’t know yet. We don’t discuss standard-setting until the next module (15).
But most people think it is intuitively obvious that this is a “bad” score.
![Page 14: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/14.jpg)
Other ways to scale?
Yes, but we don’t really need them…
![Page 15: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/15.jpg)
STANDARDS-BASED REPORTS
![Page 16: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/16.jpg)
Two kinds of “standards”
Content Standards
The definition of the content to be learned; what students are to know and be able to do
Performance Standards
The definition of how good is good enough on a test to determine if, or the extent to which, students know and can do
![Page 17: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/17.jpg)
Reporting by Content Standards
This is our concern in this module
The next module (15) deals with performance standards
![Page 18: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/18.jpg)
Let’s consider…
A test covering 5 GLCEs with 5 selected-response items per GLCE, with each item worth 1 point (25 points total).
Q? What does a raw score of 20 (a % correct scaled score of 80%) mean?
A! It depends
![Page 19: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/19.jpg)
Depends on What?
Student A
GLCE 1: 4/5 GLCE 2: 4/5 GLCE 3: 4/5 GLCE 4: 4/5 GLCE 5: 4/5
Student B
GLCE 1: 5/5 GLCE 2: 5/5 GLCE 3: 5/5 GLCE 4: 3/5 GLCE 5: 2/5
Same or different?
![Page 20: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/20.jpg)
How about these two?
Student C
GLCE 1: 5/5 GLCE 2: 5/5 GLCE 3: 4/5 GLCE 4: 3/5 GLCE 5: 3/5
Student D
GLCE 1: 5/5 GLCE 2: 5/5 GLCE 3: 5/5 GLCE 4: 5/5 GLCE 5: 0/5
These 4 examples all have a raw score of 20 (80% correct) but represent 4 different performances by the students.
![Page 21: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/21.jpg)
Another way to see it
GLCE A # A % B # B % C # C % D # D %
1 4 80 5 100 5 100 5 100
2 4 80 5 100 5 100 5 100
3 4 80 5 100 4 80 5 100
4 4 80 3 60 3 60 5 100
5 4 80 2 40 3 60 0 0
total 20 80 20 80 20 80 20 80
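The breakdown in the table above could be reproduced with a short script. This is only a sketch: the names `scores`, `ITEMS_PER_GLCE`, and `report` are hypothetical, and the per-GLCE numbers are copied from the table.

```python
# Items correct per GLCE (out of 5) for the four students in the table above.
scores = {
    "A": [4, 4, 4, 4, 4],
    "B": [5, 5, 5, 3, 2],
    "C": [5, 5, 4, 3, 3],
    "D": [5, 5, 5, 5, 0],
}
ITEMS_PER_GLCE = 5  # each GLCE is measured by five 1-point items

def report(per_glce):
    """Return (raw total, overall % correct, list of % correct per GLCE)."""
    total = sum(per_glce)
    overall = 100 * total / (ITEMS_PER_GLCE * len(per_glce))
    by_glce = [100 * s / ITEMS_PER_GLCE for s in per_glce]
    return total, overall, by_glce

for student, per_glce in scores.items():
    total, overall, by_glce = report(per_glce)
    print(student, total, overall, by_glce)
```

Every student gets the same total (20) and the same overall percentage (80.0); only the per-GLCE list differs, which is the point of reporting by content standard.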
![Page 22: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/22.jpg)
Scores by “Standard”
Remember, we haven’t set performance standards yet, so we really can’t say what these scores mean
Even so, 5 out of 5 may suggest that a student knows the material and 0 out of 5 may suggest that they don’t (depends on item-GLCE match)
However…even though this is a CRT, you can’t make instructional decisions without the context of the overall pattern of scores
![Page 23: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/23.jpg)
Say what?
There will often be extreme scores (outliers) that are not representative of most of the scores in a set.
Q? What if most of the students scored a 0 or a 1 on GLCE 5 in the example?
A! Maybe a picture would help
![Page 24: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/24.jpg)
GRAPHICAL REPRESENTATIONS
Or, I can see clearly now
![Page 25: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/25.jpg)
Guidelines for Good Graphs
Title & Subtitles
Data Source and Time Frame
Axis Labels
Legend
Viewable Colors
Readability (3-D doesn’t make it better)
![Page 26: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/26.jpg)
Appropriate Type
Bar Graphs
Line Graphs
Scatterplots
Stem & Leaf
Pie Charts (evil)
![Page 27: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/27.jpg)
Results for 25 students (# scoring at each score point for each GLCE)
[Chart: number of students (axis from 0 to 12) at each score point 0 through 5, grouped by GLCE 1 through GLCE 5]
![Page 28: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/28.jpg)
The Data
Here’s how the spreadsheet is set up (rows are GLCEs; columns are the number of students at each score point, 0 through 5)
GLCE 0 1 2 3 4 5
GLCE 1 1 2 4 9 6 3
GLCE 2 0 1 2 10 7 5
GLCE 3 1 4 8 9 3 0
GLCE 4 7 4 2 2 4 6
GLCE 5 8 7 5 3 2 0
![Page 29: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/29.jpg)
Let’s Assume…
We have established that 3 out of 5 on each standard is an acceptable standard of evidence that a student understands the GLCE in question (hey, these were hard items)
Then students who score a 3, 4, or 5 on the cluster of items for a GLCE can be considered “proficient” while students with a 2, 1, or 0 are not.
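The assumed rule above can be applied to the score-point counts with a few lines of code. This is a sketch only: `counts`, `CUT_SCORE`, and `proficiency` are hypothetical names, the counts are taken from the spreadsheet data in this module, and the cut score of 3 is the assumption stated on this slide rather than an established performance standard.

```python
# Number of students (out of 25) at each score point 0-5, per GLCE,
# taken from the spreadsheet data in this module.
counts = {
    1: [1, 2, 4, 9, 6, 3],
    2: [0, 1, 2, 10, 7, 5],
    3: [1, 4, 8, 9, 3, 0],
    4: [7, 4, 2, 2, 4, 6],
    5: [8, 7, 5, 3, 2, 0],
}
CUT_SCORE = 3  # assumed cut: 3 or more of 5 items correct counts as proficient

def proficiency(by_score, cut=CUT_SCORE):
    """Return (# not proficient, % not proficient, # proficient, % proficient)."""
    n = sum(by_score)
    not_prof = sum(by_score[:cut])  # students at score points below the cut
    prof = sum(by_score[cut:])      # students at or above the cut
    return not_prof, 100 * not_prof / n, prof, 100 * prof / n

for glce, by_score in counts.items():
    print(glce, *proficiency(by_score))
```

Changing `CUT_SCORE` shows how sensitive the proficiency percentages are to where the cut is set.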
![Page 30: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/30.jpg)
Proficiency by Standard (for 25 Students)
GLCE # Not Prof % Not Prof # Prof % Prof
1 7 28 18 72
2 3 12 22 88
3 13 52 12 48
4 13 52 12 48
5 20 80 5 20
This is what the previous data looks like in table form.
Would a picture help?
![Page 31: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/31.jpg)
Proficiency by Standard (for 25 Students)
[Chart: percentage of students (0% to 100%) who are proficient (Prof) and not proficient (NP) on each of GLCEs 1 through 5]
![Page 32: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/32.jpg)
Here’s the data
GLCE NP Prof
1 7 18
2 3 22
3 13 12
4 13 12
5 20 5
![Page 33: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/33.jpg)
Repeated Measures
If you test the same content on more than one occasion, you can look at your test results over time.
As an example, let’s look at test results for our class of 25 students on a pre-test, two intermediate tests, and a post-test covering the same five GLCEs. We will look only at GLCE 1, with 5 points possible each time.
![Page 34: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/34.jpg)
The Data – Results for 25 students on 4 tests by score point
Score Points Pre-Test Test 1 Test 2 Post-Test
0 9 6 3 1
1 6 4 2 1
2 4 3 2 2
3 3 5 6 5
4 2 4 7 9
5 1 3 5 7
This is a somewhat idealized example. Even so, interpret it with caution!
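One way to summarize the four occasions is to compute, for each test, the mean score and the share of students at or above a cut score. The sketch below uses the counts from the table above; the names `results` and `summarize` are hypothetical, and the cut of 3 is carried over from the earlier assumption in this module, not a validated standard.

```python
# Students (out of 25) at each score point 0-5 on GLCE 1, per test occasion.
results = {
    "Pre-Test":  [9, 6, 4, 3, 2, 1],
    "Test 1":    [6, 4, 3, 5, 4, 3],
    "Test 2":    [3, 2, 2, 6, 7, 5],
    "Post-Test": [1, 1, 2, 5, 9, 7],
}

def summarize(by_score, cut=3):
    """Return (mean score, % of students scoring at or above the cut)."""
    n = sum(by_score)
    mean = sum(score * count for score, count in enumerate(by_score)) / n
    pct_at_or_above = 100 * sum(by_score[cut:]) / n
    return mean, pct_at_or_above

for test, by_score in results.items():
    mean, pct = summarize(by_score)
    print(f"{test}: mean {mean:.2f}, {pct:.0f}% at 3 or above")
```

These are class-level summaries only. They say nothing about whether the same students improved from one occasion to the next, which is one reason to read the trend with caution.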
![Page 35: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/35.jpg)
And here’s the picture – Results for 25 students on 4 tests by score point
[Chart: score points 0 through 5 on one axis, a 0% to 100% scale on the other, with series for Pre-Test, Test 1, Test 2, and Post-Test]
![Page 36: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/36.jpg)
The Excel spreadsheet
Score Pre-Test Test 1 Test 2 Post-Test
0 9 6 3 1
1 6 4 2 1
2 4 3 2 2
3 3 5 6 5
4 2 4 7 9
5 1 3 5 7
![Page 37: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/37.jpg)
Conclusions
![Page 38: Developed by](https://reader036.vdocument.in/reader036/viewer/2022081603/56814cfc550346895dba1b3c/html5/thumbnails/38.jpg)
Next Module