The Future of Measurement: Making learning visible
John Hattie Melbourne Education Research Institute (MERI)
University of Melbourne
The beginnings ...
Chinese Han Dynasty (206 BCE to 220 CE)
• Entrance examination for the civil service
• From observation and testimonials to written examinations
• To avoid nepotism and corruption
Classical model 1904+
It is simple, elegant, and surprisingly durable.
Presence of errors in measurement: the observed score is the true score plus error.
Lord and Novick (1968): $X_{jg} = \tau_{jg} + E_{jg}$
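A minimal sketch of this model in Python (the true scores and error variance below are invented for illustration): simulate observed scores as true score plus independent error, then estimate reliability as the ratio of true-score variance to observed-score variance.

```python
import numpy as np

rng = np.random.default_rng(0)

n_students, n_forms = 1000, 2
true_scores = rng.normal(50, 10, n_students)   # tau: latent true scores (assumed)
error_sd = 5.0                                 # assumed measurement-error SD

# Observed score = true score + error (X = tau + E), independently per form
observed = true_scores[:, None] + rng.normal(0, error_sd, (n_students, n_forms))

# Reliability = var(tau) / var(X); a parallel-forms correlation estimates it
theoretical = true_scores.var() / observed[:, 0].var()
parallel_forms = np.corrcoef(observed[:, 0], observed[:, 1])[0, 1]
print(f"reliability ~ {theoretical:.2f}, parallel-forms estimate ~ {parallel_forms:.2f}")
```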
Item Response Theories 1950s+
b = difficulty, a = discrimination, c = guessing
$P(u = 1 \mid \theta) = c + (1 - c)\,\dfrac{e^{a(\theta - b)}}{1 + e^{a(\theta - b)}}$
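The three-parameter logistic model translates directly into code; a small sketch with illustrative parameter values:

```python
import numpy as np

def p_correct_3pl(theta, a, b, c):
    """3PL probability of a correct response: c + (1-c) * logistic(a*(theta-b))."""
    logit = a * (np.asarray(theta) - b)
    return c + (1.0 - c) / (1.0 + np.exp(-logit))

# Example: moderately discriminating item (a=1.2), average difficulty (b=0),
# 20% guessing floor (c=0.2), evaluated at low, average, and high ability
for theta in (-2, 0, 2):
    print(theta, round(float(p_correct_3pl(theta, a=1.2, b=0.0, c=0.2)), 3))
```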
Major advances
• test-score equating
• detection of item bias
• adaptive testing
• extensions to polytomous items,
to attitudes and personality
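Of the advances above, adaptive testing follows most directly from the model: administer next whichever item is most informative at the current ability estimate. A minimal 2PL sketch (the item bank and ability estimate are invented):

```python
import numpy as np

def p_2pl(theta, a, b):
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1-p)."""
    p = p_2pl(theta, a, b)
    return a**2 * p * (1.0 - p)

# Hypothetical item bank: (discrimination a, difficulty b)
bank = [(1.5, -1.0), (1.0, 0.0), (2.0, 0.5), (0.8, 1.5)]

theta_hat = 0.0  # current ability estimate
next_item = max(range(len(bank)), key=lambda i: item_information(theta_hat, *bank[i]))
print("administer item", next_item, "with info",
      round(item_information(theta_hat, *bank[next_item]), 3))
```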
Moving beyond classical IRT 2000s+
Newer models based on:
• time completing items
• multiple attempts on items
• multiple abilities or cognitive processes
• multidimensionality
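Multidimensionality is the clearest break from classical IRT: one item can draw on several abilities at once. A sketch of a compensatory multidimensional 2PL item (the loadings are invented for illustration):

```python
import numpy as np

def p_mirt(theta, a, d):
    """Compensatory multidimensional 2PL: logistic(a . theta + d).
    theta: ability vector; a: discrimination per dimension; d: intercept."""
    return 1.0 / (1.0 + np.exp(-(np.dot(a, theta) + d)))

# Hypothetical item loading on two abilities (e.g. comprehension and numeracy)
a = np.array([1.2, 0.4])
d = -0.3
print(round(float(p_mirt(np.array([0.5, -1.0]), a, d)), 3))  # strong dim 1, weak dim 2
```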
Advances
• computer-based testing of many forms
• development of reporting engines
• computerized essay scoring
• beginnings of integrating cognitive processes
• merging testing advances with instruction
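Computerized essay scoring, mentioned above, is typically a supervised model over text features; a toy sketch with scikit-learn (the essays and scores are fabricated placeholders, not any particular engine):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical training data: essays with human-assigned scores
essays = ["The experiment shows ...", "I think it was good ...",
          "A detailed analysis of the causes ...", "stuff happened"]
scores = [4.0, 2.0, 5.0, 1.0]

# TF-IDF features feeding a ridge-regression scorer
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(essays, scores)
print(model.predict(["A detailed experiment shows the causes ..."]))
```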
Assessment in classrooms
Backward Design
• % of students talked to
• student questions
• peer feedback
• collaboration among students
• early years of reading
Software Applications
• Virtual realities
• Hypermedia testing
• Learner-controlled interactive solutions
• Interactive video
Information Function
Preference: 72% Web, 28% Paper
[Figure: Test Information Functions for Paper-and-Pencil and Web-based Versions; information plotted against theta (-3 to 3) for Form A TIF and Form B TIF, Paper vs. Web]
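A test information function is the sum of the item information functions, so curves like those in the figure can be computed directly; a sketch with invented item parameters for two hypothetical forms:

```python
import numpy as np

def item_info_2pl(theta, a, b):
    """Fisher information of a 2PL item: a^2 * p * (1 - p)."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

def test_info(theta, items):
    """Test information = sum of item informations at each theta."""
    return sum(item_info_2pl(theta, a, b) for a, b in items)

theta = np.linspace(-3, 3, 13)
form_a = [(1.2, -0.5), (1.0, 0.0), (1.4, 0.8)]   # hypothetical paper form
form_b = [(1.1, -0.3), (1.3, 0.2), (0.9, 1.0)]   # hypothetical web form
print(np.round(test_info(theta, form_a), 2))
print(np.round(test_info(theta, form_b), 2))
```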
Concept of Validity
Validity is an integrated evaluative judgment of the degree to which
empirical evidence and theoretical rationales support the “adequacy”
and “appropriateness” of “inferences” and “actions” based on test
scores or other modes of assessment.
What is the typical effect across all these influences?
• 900+ meta-analyses
• 50,000 studies
• 240+ million students
The Typical Effect
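The effect sizes (ES) in the tables below are standardized mean differences; a worked sketch of how one study's ES, and a simple average across studies, would be computed (all numbers are invented):

```python
import math

def cohens_d(mean_treat, mean_ctrl, sd_treat, sd_ctrl, n_treat, n_ctrl):
    """Standardized mean difference with a pooled standard deviation."""
    pooled_var = ((n_treat - 1) * sd_treat**2 + (n_ctrl - 1) * sd_ctrl**2) \
                 / (n_treat + n_ctrl - 2)
    return (mean_treat - mean_ctrl) / math.sqrt(pooled_var)

# Hypothetical studies of one influence
studies = [cohens_d(78, 72, 10, 11, 120, 118),
           cohens_d(65, 62, 9, 9, 80, 85)]
print([round(d, 2) for d in studies],
      "average ES:", round(sum(studies) / len(studies), 2))
```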
The disasters ...

Rank | BOTTOM influences | Studies | Effects | ES
113 | Class size | 113 | 802 | .21
114 | Charter schools | 18 | 18 | .20
125 | Matching learning styles | 244 | 1234 | .17
130 | Mentoring | 74 | 74 | .15
131 | Ability grouping | 500 | 1369 | .12
133 | Gender | 3168 | 6293 | .12
134 | Teacher education | 106 | 412 | .12
136 | Teacher subject matter knowledge | 92 | 424 | .09
145 | Class architecture | 315 | 333 | .01
148 | Retention | 229 | 2882 | -.13
Rank | AVERAGE influences | Studies | Effects | ES
41 | Study skills | 668 | 2217 | .59
43 | Outdoor/adventure programs | 187 | 429 | .52
46 | Play programs | 70 | 70 | .50
47 | Second/third chance programs | 52 | 1395 | .50
52 | Early intervention | 1704 | 9369 | .47
54 | Preschool programs | 358 | 1822 | .45
57 | Teacher expectations | 674 | 784 | .43
83 | Effects of testing | 569 | 1749 | .34
92 | Test-based accountability systems | 92 | 92 | .31
Rank | TOP influences | Studies | Effects | ES
1 | Student expectations | 209 | 305 | 1.44
3 | Response to intervention | 13 | 107 | 1.07
4 | Teacher credibility | 51 | 51 | .90
5 | Providing formative evaluation | 30 | 78 | .90
7 | Structured classroom discussion | 42 | 42 | .82
10 | Feedback | 1310 | 2086 | .75
12 | Teacher-student relationships | 229 | 1450 | .72
25 | Not labeling students | 79 | 79 | .61
32 | Worked examples | 62 | 151 | .57
48 | Goals (difficult vs. do your best) | 895 | 1111 | .50
Visible Teaching – Visible Learning
When teachers SEE learning through the eyes of the student
and when students SEE themselves as their own teachers.
[Figure: bar chart of effect sizes (-0.6 to 1.6) for influences, ordered from lowest to highest: Mobility, Welfare policies, Multi-grade/age classes, Whole language, Distance education, Ability grouping, Problem based learning, Teacher immediacy, Family structure, Aptitude/treatment interactions, Co-/team teaching, Teaching test taking, Summer school, Illness, Programmed instruction, Use of calculators, Mainstreaming, Ability grouping for gifted students, Teacher effects, Inductive teaching, Drama/arts programs, Attitude to mathematics/science, Adjunct aids, Time on task, Science, Behavioral organizers/adjunct questions, Expectations, Quality of teaching, Preschool programs, Concentration/persistence/engagement, Small group learning, Parental involvement, Interactive video methods, Peer influences, Visual-perception programs, Worked examples, Concept mapping, Mastery learning, Direct instruction, Problem solving teaching, Self-verbalization/self-questioning, Vocabulary programs, Spaced vs. mass practice, Reciprocal teaching, Classroom behavioral, Providing formative evaluation]
Feedback
Average effect = .40
The Power of Feedback
What do Victorian TEACHERS say ...
• Constructive, commentary, correction, conversation, compliance to task
• Give lots of feedback
RCT and experimental studies
[Diagram: FEEDBACK vs. NO FEEDBACK conditions (including effort; with praise), alongside clarity and challenge of success criteria and prior achievement; feedback types: competence/rubric, comparative/social, individual]
• Must emphasize interpretation and Where to next?
• Must have consequences
• Must tell teachers where and for whom they have had an impact on success
• Must relate to what has been recently taught relative to success criteria
• Must relate to curriculum fidelity: concepts NOT items
• Must be individual and group rich
• Must include Progress as well as Proficiency
• Must be immediate & tailored to the “here and now”
• Must include Diagnostic as well as Success reporting
• Must be based on modern test theory
• Must have multiple modes of delivery
• Must balance surface and deep processing
• Must emphasize test information – particularly formative interpretations
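A reporting engine that satisfies these requirements has to carry more than a single score. One possible shape for a report record, sketched in Python (all field names are invented for illustration):

```python
from dataclasses import dataclass, field

@dataclass
class StudentReport:
    """Pairs proficiency with progress and diagnostic detail (illustrative only)."""
    student_id: str
    concept: str                 # curriculum concept, not an individual item
    proficiency: float           # current level against success criteria
    progress: float              # gain since the last report
    surface_score: float         # surface-level processing
    deep_score: float            # deep-level processing
    diagnostics: list[str] = field(default_factory=list)  # gaps / misconceptions
    next_steps: list[str] = field(default_factory=list)   # "where to next?"

report = StudentReport("s-001", "fractions", proficiency=0.62, progress=0.15,
                       surface_score=0.8, deep_score=0.45,
                       diagnostics=["confuses numerator/denominator"],
                       next_steps=["equivalent fractions"])
print(report.concept, report.progress)
```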
Best test design for Visible Learning
ASSESSMENT REPORTING ENGINE
Additions to reporting
• Video assessment
• Essay scoring
• Complexity/cognitive process scoring, e.g. surface vs. deep items:
  SURFACE: One idea: Who painted Guernica? Many ideas: Outline at least two compositional principles that Picasso used in Guernica.
  DEEP: Relate ideas: Relate the theme of Guernica to a current event. Extend ideas: What do you consider Picasso was saying via his painting of Guernica?
• Reporting progress
• Value added (sketched below)
• Other domains: Year 0-4; adult literacy; linked to LMS, to parents, to curricula change; other tests (spelling, commercial, health)
• Worldwide testing
• Cognitive processes:
  - Pattern recognition
  - Developing production rules
  - Different levels of integration of knowledge
  - Different degrees of procedural skills
  - Speed of processing
  - Misconceptions
  - Reports about cognitive strengths and gaps
  - Activate instructional components
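Reporting progress and value added both presuppose a model of expected growth. One common simplification (not necessarily the approach used here) regresses current scores on prior scores and treats the residual as value added; a sketch with invented data:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented reference population: prior-year and current-year scores
prior_pop = rng.normal(50, 10, 500)
curr_pop = 0.8 * prior_pop + 12 + rng.normal(0, 5, 500)

# Expected-growth model fit on the population (simple linear regression)
slope, intercept = np.polyfit(prior_pop, curr_pop, 1)

# One class of 25 students who grew a little more than expected
prior_cls = rng.normal(48, 9, 25)
curr_cls = 0.8 * prior_cls + 14 + rng.normal(0, 5, 25)

# Value added = observed minus expected, averaged over the class
value_added = curr_cls - (slope * prior_cls + intercept)
print("class mean value-added:", round(value_added.mean(), 2))  # roughly +2
```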
Timeline of measurement models:
1788: Gaussian distribution of errors
1905: Spearman, start of CTT
1968: Lord & Novick, CTT to IRT
1998: Hambleton & van der Linden, IRT to MIRT
2012??: need for a new model
The Future of Assessment
TEST ITEMS REPORTING ENGINE
• Visible learning
• Adolescent reputations & risk
• Unlocking formative assessment
• Assessing teachers for professional certification
• Intelligence & intelligence testing
Forthcoming:
• Hattie, J., & Anderman, E. (Eds.). Handbook of Student Achievement.
• Yates, G., & Hattie, J. Making Learning Visible.
jhattie@unimelb.edu.au
The Future of Assessment:
Making learning visible