Reading Comprehension Assessment: One VERY Resilient Phenomenon
P. David Pearson, UC Berkeley
Diane Hamm, MSU
Key references
Pearson, P. D., & Hamm, D. (in press). The assessment of reading comprehension: A review of practices—past, present, and future. In S. G. Paris & S. A. Stahl (Eds.), Current issues in reading comprehension and assessment. Mahwah, NJ: Lawrence Erlbaum Associates.
Johnston, P. H. (1984a). Assessment in reading. In P. D. Pearson, R. Barr, M. Kamil, & P. Mosenthal (Eds.), Handbook of reading research (pp. 147-182). New York: Longman.
Johnston, P. H. (1984b). Reading comprehension assessment: A cognitive basis. Newark, DE: International Reading Association.
Purpose
Build an argument for a fresh line of inquiry into the assessment of reading comprehension
By providing a rich and detailed historical account of reading comprehension, both as a theoretical phenomenon and an operational construct
Bonus: Try to offer my explanation of why the short-passage, multiple-choice exam format is so resilient
Why now?
Renewed interest among scholars
Rand report
Uneasiness among practitioners that the code, as important as it is, may not be the point of reading
The most important outcome of reform
National thirst for accountability requires impeccable measures (both conceptually and psychometrically)
Pleas of teachers desperate for useful tools (need a tool that does for RC what running records do for word ID)
Reading comprehension assessment has always vexed researchers
We want to access the thing itself, the “click”
But
We only ever see its residue, its wake, its artifacts
We are stuck with artifacts
Require them to tell us whether they understood
Require them to tell us what they understood
Quiz them on the details
Request the big ideas
Most of the measures interpose some other skill or capacity between the act and the evidence
Writing
Talking
The conventions of multiple-choice assessments
These interposed processes inevitably compromise our capacity to draw inferences about comprehension both as a generic and a passage-specific enterprise
History nurtures modesty
Just about any approach to assessing reading comprehension that has arisen in the last 30 years has a precedent that is at least 70 years old.
Caveats
I apologize, in advance, for all the studies, especially those done by people in this room, that I will fail to cite.
I also want to make it clear that I am limiting myself to READING COMPREHENSION assessment, not prerequisites to or correlates of RC.
Just because I chronicle a practice does not mean that I advocate it!
The Century-Old Roots of Reading Comprehension Assessment
Fit the context of the period just after the last turn of the century
Curricular shift from oral to silent reading
From accuracy and expressiveness (oratory and declamation) of oral reading
To indicators of understanding
The Roots…
Emerging scientism in education
Growing numbers of students
Changing demographics: widespread illiteracy
Efficient means of assessing (e.g., Army Alpha exams)
Move from a high- to a low-inference index
1895 Binet
Used RC tasks as a measure of IQ, not reading
Not inconsistent with Binet’s notion that IQ should index school-based reasoning capacity
“Every one of us, whatever our speculative opinion, knows better than he practices, and recognizes a better law than he obeys.”
Check two of the following statements with the same meaning as the quotation above.
To know right is to do the right.
Our speculative opinions determine our actions.
Our deeds often fall short of the actions we approve.
Our ideas are in advance of our everyday behavior.
From Thurstone IQ test (undated), cited in Johnston, 1984
Note the multiple correct answers.
1916 Kansas Silent Reading Test*
Kelley
“fill in the blanks”
some verbal logic problems
some procedural tasks
Complete as many of 16 tasks as possible in a limited time
*The first published standardized comprehension test.
1917: Thorndike
Reading as Reasoning
Basically an error analysis leading to a set of categories and a theory
Understanding a paragraph is like solving a problem in mathematics. It consists in selecting the right elements in the situation and putting them together in the right relations, and also with the right amount of weight or influence or force of each.
Touton and Berry (1931) Error analyses
(a) failure to understand the question
(b) failure to isolate elements of “an involved statement” read in context
(c) failure to associate related elements in a context
(d) failure to grasp and retain ideas essential to understanding concepts
(e) failure to see setting of the context as a whole
(f) other irrelevant answers
A panoply of measures
Starch: relevant words recalled as a function of total words recalled
Courtis (1914): words remembered/words in text
Chapman (1924): Find the words in part 2 that do not fit the words in part 1 of the paragraph.
Note similarity to later free recall and error detection models of assessment
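The two ratio measures named on this slide can be sketched in a few lines of Python. This is only an illustration, not a reconstruction of the original scoring rules: the function names are hypothetical, words are compared case-insensitively, and treating "relevant" as "appears in a supplied list of relevant words" is an assumption.

```python
def starch_ratio(recalled_words, relevant_words):
    """Starch-style index: relevant words recalled / total words recalled.

    Assumption: a recalled word is "relevant" if it appears in relevant_words.
    """
    if not recalled_words:
        return 0.0
    relevant = {w.lower() for w in relevant_words}
    hits = sum(1 for w in recalled_words if w.lower() in relevant)
    return hits / len(recalled_words)


def courtis_ratio(recalled_words, text_words):
    """Courtis-style index: words remembered / words in text.

    Assumption: a word counts as remembered if it occurs anywhere in the text.
    """
    if not text_words:
        return 0.0
    in_text = {w.lower() for w in text_words}
    remembered = sum(1 for w in recalled_words if w.lower() in in_text)
    return remembered / len(text_words)


# Hypothetical example data
text = "the children made a book for their teacher".split()
recall = ["children", "book", "teacher", "camera", "dog"]
print(starch_ratio(recall, text))
print(courtis_ratio(recall, text))
```

Note that the two indices share a numerator here but differ in the denominator: the first is relative to what the reader produced, the second to what the text contained.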
Enter Psychometrics in the late 1930s
1935: IBM introduced the IBM 805 scanner
1935: Kelley: Factor Analysis
1944: Davis: Fundamental Factors
Davis 1944
Answer specific text-based questions
Main thought
Follow passage organization
Word meanings in context
Word meanings
Author’s purpose
Literary devices
Draw inferences about content
Text-based questions with paraphrase
Word factor and a reasoning factor
Other Factor Analyses
Harris (1948): found a single factor
Derrick (1953): found 3
Hunt (1957): vocabulary was everything
Schreiner, Hieronymus, and Forsyth (1971): No differentiation among paragraph meaning, cause and effect, reading for inferences, and selecting main ideas, BUT separate LC and lower-level processing
Davis (1968, 1972)
Davis 1972
1. Remembering word meaning
2. Word meanings in context
3. Understanding content stated explicitly
4. Weaving together ideas in the content
5. Drawing inferences from the content
6. Recognizing the author’s tone and mood and purpose
7. Recognizing literary techniques
8. Following the structure of the content
Davis 1972
Remembering word meanings
drawing inferences from content
structure of the passage
writerly techniques
explicit comprehension
Put an end to factor analytic studies
Cloze Procedure
Wilson Taylor (1953): every 5th word
Bormuth (1966): the basis of readability research
Modifications to Cloze
Allow synonyms to serve as correct answers
Delete only every 5th content word (leaving function words intact)
Use an alternative to every 5th word deletion
MAZE: MC for the blanks
Macro cloze: phrases
Delete words at the end of sentences and provide a set of choices from which examinees are to pick the best answer
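To make the basic procedure and the first modification concrete, here is a minimal Python sketch of every-nth-word deletion with optional synonym scoring. The function names, the blank marker, and the synonym table are all hypothetical; punctuation attached to a deleted word is simply dropped, which a real test form would handle more carefully.

```python
import string


def make_cloze(text, n=5, blank="_____"):
    """Replace every n-th word with a blank; return (cloze_text, answer_key)."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        # Store the deleted word without surrounding punctuation
        answers.append(words[i].strip(string.punctuation))
        words[i] = blank
    return " ".join(words), answers


def score_cloze(responses, answers, synonyms=None):
    """Proportion of blanks filled correctly.

    With synonyms=None only exact replacements count; otherwise any word
    listed under the answer in the (hypothetical) synonym table also counts.
    """
    if not answers:
        return 0.0
    synonyms = synonyms or {}
    correct = 0
    for response, answer in zip(responses, answers):
        accepted = {answer.lower()}
        accepted |= {s.lower() for s in synonyms.get(answer.lower(), ())}
        if response.strip().lower() in accepted:
            correct += 1
    return correct / len(answers)


cloze_text, key = make_cloze("The children wanted to make a book for their teacher.")
print(cloze_text)
print(score_cloze(["make", "instructor"], key))
print(score_cloze(["make", "instructor"], key, {"teacher": ["instructor"]}))
```

Changing `n`, or filtering which positions are eligible for deletion, would give the other variants listed above; a MAZE version would attach a set of choices to each blank instead of leaving it open.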
The conceptual death of cloze
Shanahan, Kamil, & Tobin (1983): not sensitive to “intersentential” comprehension
No differences when sentences were scrambled within or across passages or presented in isolation
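Passage Dependency
P(passage) - P(isolation): the probability of answering an item correctly with the passage available, minus the probability of answering it correctly without the passage
A quiet stir in the late 60s and early 70s (Tuinman)
Died in the wake of Schema Theory’s embrace of prior knowledge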
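The passage dependency index is a simple difference of proportions, which a short Python sketch can illustrate. The function name and the input format (one list of right/wrong outcomes per condition) are assumptions made for the example.

```python
def passage_dependency(correct_with_passage, correct_in_isolation):
    """Return P(passage) - P(isolation) for a single item.

    Each argument is a list of booleans, one per examinee: True if the item
    was answered correctly under that condition (hypothetical data format).
    """
    if not correct_with_passage or not correct_in_isolation:
        raise ValueError("both conditions need at least one response")
    p_passage = sum(correct_with_passage) / len(correct_with_passage)
    p_isolation = sum(correct_in_isolation) / len(correct_in_isolation)
    return p_passage - p_isolation


# Hypothetical item: 8 of 10 correct with the passage, 3 of 10 without it
with_passage = [True] * 8 + [False] * 2
in_isolation = [True] * 3 + [False] * 7
print(round(passage_dependency(with_passage, in_isolation), 2))
```

A value near zero suggests the item can be answered from prior knowledge alone; a large positive value suggests that reading the passage is what makes the difference.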
Criterion-referenced assessment
Make a virtue out of sub-skills
Took the notions of mastery learning coming out of Carroll, Gagné, and Bloom
Define sets of subskills
Set a level of mastery
Test-teach-test
Assumes a componential skill view of reading
Data: Bloom’s experiments with Ed Psych courses
CRT takes over
Wisconsin Design for Reading Skill Development
Fountain Valley
Virtually every basal program by the mid 1970s
The children wanted to make a book for their teacher. One girl brought a camera to school. She took a picture of each person in the class. Then they wrote their names under the pictures. One boy tied all the pages together. Then the children gave the book to their teacher.
1. What happened first?
a. The children wrote their names
b. Someone brought a camera to school
c. The children gave a book to their teacher
2. What happened after the children wrote their names?
a. A boy put the pages together.
b. The children taped their pictures.
c. A girl took pictures of each person
3. What happened last?
a. The children wrote their names under the pictures.
b. A girl took pictures of everyone.
c. The children gave the book to their teacher.
(adapted from the Ginn Reading Program, 1982)
Reactions to this movement
Provided fuel for the constructivist reforms that were gathering momentum
Disappeared from basals in the early 90s for about 6 years
Only to be revived recently
Johnson, D. D., & Pearson, P. D. (1975). Skills management systems: A critique. The Reading Teacher, 28, 757-764.
Domain-referenced assessment
John Bormuth, Toward a Theory of Achievement Test Items
Identify the domain as texts
Map all of the logical relations among sentences.
Using linguistic transformations, develop all possible Wh questions --> items
Randomly sample from the domain
Survives in Math, not reading
The Cognitive Revolution
The powerful impact of schema
The evolution of text analytic systems
Story grammars à la Stein & Glenn
Propositional analysis of texts à la Kintsch & van Dijk
Inference taxonomies à la Trabasso
The Impact of Cognitive Science on Assessment
More attention to the role of prior knowledge
Attention to text structure (in the form of story maps and visual displays to capture the organizational structure of text)
The introduction of metacognitive monitoring
Used to critique the existing assessment traditions on the way to new assessments
Contrasts between what we know and what we do
New views of the reading process tell us that . . . | Yet when we assess reading comprehension, we . . .
A complete story or text has structural and topical integrity. | Use short texts that seldom approximate the structural and topical integrity of an authentic text.
Prior knowledge is an important determinant of reading comprehension. | Mask any relationship between prior knowledge and reading comprehension by using lots of short passages on lots of topics.
The diversity in prior knowledge across individuals as well as the varied causal relations in human experiences invites many possible inferences to fit a text or question. | Use multiple-choice items with only one correct answer, even when many of the responses might, under certain conditions, be plausible.
Inference is an essential part of the process of comprehending units as small as sentences. | Rely on literal comprehension test items.
The ability to vary reading strategies to fit the text and the situation is one hallmark of an expert reader. | Seldom assess how and when students vary the strategies they use during normal reading, studying, or when the going gets tough.
From Valencia, S., & Pearson, P. D. (1987). Reading assessment: Time for a change. The Reading Teacher, 40, 726-733.
Structural representations
Used in test development
Determine hierarchical and sequential relations
A theory of importance
Determines which nodes should be assessed
Authentic Texts
Select, not construct, texts for understanding
(started a cottage industry for magazine publishers)
Can’t tinker with the text to rationalize items and distractors
(drove professional item writers crazy)
More than one right answer
How does Ronnie reveal his interest in Anne?
Ronnie cannot decide whether to join in the conversation.
Ronnie gives Anne his treasure, the green ribbon.
Ronnie gives Anne his soda.
Ronnie invites Anne to play baseball.
During the game, he catches a glimpse of the green ribbon in her hand.
Rate all of the responses on some scale of relevance
How does Ronnie reveal his interest in Anne?
(2)(1)(0) Ronnie cannot decide whether to join in the conversation.
(2)(1)(0) Ronnie gives Anne his treasure, the green ribbon.
(2)(1)(0) Ronnie gives Anne his soda.
(2)(1)(0) Ronnie invites Anne to play baseball.
(2)(1)(0) During the game, he catches a glimpse of the green ribbon in her hand.
Best predictor of retelling scores
Include
Complex indicators of comprehension
Prior knowledge
Metacognition
Habits, attitudes, and dispositions
Some findings from IGAP
When we plugged in Comprehension, Prior Knowledge, Metacognition, Habits/Attitude
We emerged with these factors
metacognitive
habits/attitudes items
a combination of the comprehension and prior knowledge items
Fate
Went the way of all tests that challenge the conventional wisdom
Not good to teach to (e.g., metacognitive items)
Went down in the mid 1990s when they tried to add on an individual score reporting component
Sentence verification task
Original: Verbatim repetition of a sentence in the passage
Paraphrase: The same meaning as an original but with lots of semantic substitutes for words in the original sentence.
Meaning change: Uses some of the words in the passage but in a way that changes the meaning of the original sentence.
Distractor: A sentence that differs in both meaning and wording from the original.
Judge each as old or new
Most people seem content with polyester fillings and such. (Original)
You don't know what comfort is until you've sunk your head into 3,000 bits of polyester. (Meaning change)
It is always fun visiting grandparents because they take you someplace exciting, like the zoo or the circus. (Distractor)
Being able to hear stories of when his mom and dad were kids was one of the great things about having grandparents around, Tim concluded. (Paraphrase)
His favorite grandparent was his mother's mother. (Distractor)
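Scoring for a task like this can be sketched in Python. The slide does not spell out the answer key, so the mapping below is an assumption: originals and paraphrases preserve the passage's meaning and are keyed "old", while meaning changes and distractors are keyed "new". The function and variable names are hypothetical.

```python
# Assumed answer key: meaning-preserving sentences are "old", the rest "new"
EXPECTED = {
    "original": "old",
    "paraphrase": "old",
    "meaning change": "new",
    "distractor": "new",
}


def score_svt(item_types, judgments):
    """Proportion of old/new judgments that match the assumed key."""
    if len(item_types) != len(judgments):
        raise ValueError("need one judgment per item")
    if not item_types:
        return 0.0
    correct = sum(
        1
        for item_type, judgment in zip(item_types, judgments)
        if EXPECTED[item_type.lower()] == judgment.lower()
    )
    return correct / len(item_types)


# The five example sentences above, in order, with one hypothetical reader
types = ["Original", "Meaning change", "Distractor", "Paraphrase", "Distractor"]
reader = ["old", "old", "new", "old", "new"]
print(score_svt(types, reader))
```

In this example the reader misjudges only the meaning-change sentence, the item type that most directly separates memory for wording from memory for meaning.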
Sociocultural and Literary Perspectives
Learning and understanding are inherently social
Assessment should be responsive, interactive, and dynamic
Texts are inherently political documents with points of view and agendas and authors
Rosenblatt: Reader, text, and poem
Langer: Into, through, and beyond
CLAS
If you were explaining what this essay is about to a person who had not read it, what would you say?
What do you think is important or significant about it?
What questions do you have about it?
This is your chance to write any other observations, questions, appreciations, and criticisms of the story.
Another CLAS
Now you will be working in a group. You will be preparing yourself to do some writing later. Your group will be talking about the story you read earlier. A copy of the story is provided before the group questions if you need to refer to it. Some of the activities in this section may direct you to work alone and then share with your group, and other activities may have all of you working together. It is important to take notes of your discussion because you will be able to use these notes when you do your writing.
Read the directions and do the activities described. Members of the group should take turns reading the directions. The group leader should keep the activities moving along so that you finish all activities.
You’ll have 15 minutes for these prewriting activities.
The demise of performance assessment in wide-scale
The social aspect: Whose work is it anyway?
Generalizability: Too passage specific
Expense: Scoring and rubric development
Invasion of privacy
The legacy:
Mixed models
Classroom assessment
NAEP 1970s
Demonstrate the ability to show comprehension of what was read
analyze what is read, use what is read
reason logically
make judgments
have attitude/interest in reading.
NAEP 1980s
value reading and literature
comprehend written works
respond to written works in interpretive and evaluative ways
apply study skills
NAEP 1990s
FORMING INITIAL UNDERSTANDING
Which of the following is the best statement of the theme of the story
DEVELOPING INTERPRETATIONS
What caused this event
PERSONAL REACTION AND RESPONSE
How did this character change your ideas of _____
DEMONSTRATE CRITICAL STANCE
What could be added to improve the author’s argument
NAEP concerns
The framework does not pass psychometric muster
Not much information at the lower end of the performance scale (no floor)
Item format: Do CR items add any value to the information gained?
Not if they are MC in disguise?
Reading for Understanding
The standards for good assessment, especially those dealing with instructional sensitivity, are critical
Notice that in most of our work, we assume the validity of our measures and test the validity of the interventions.
What if we turned that around?
What does it mean to achieve a given comprehension score?
Find a population of kids with a narrow band of overall comprehension scores
Administer lots of subskill tests, decoding, vocabulary, and comprehension
Evaluate prerequisiteness and compensatory hypotheses
Which types of knowledge/skill are essential
How many ways are there to get to 6.5?
Valencia study (later this morning)
Note that New Standards Reference Exam privileges compensatory concept.
More questions to answer?
For accommodations, how do we weigh increased participation against potential sources of invalidity?
Time
Glossary
Mode of presentation
Starting over
Go back to a set of theoretical conceptualizations of comprehension
Component Skills
Knowledge Driven models (Schema Theory and Construction-Integration)
Contextually Driven models (Socio-cultural or critical)
Executive Control models (metacognition and Cognitive Flexibility Theory)
Mine each for assessment implications
Apply each set of implications to a common set of passages to create a set of alternative theory-based assessments
More steps
Develop a “gold standard” for comprehension: how do we get as close as possible to that ineffable phenomenon?
My candidate: Some on-line assessment of both the content (ideas in text) and the affect (phenomenological sense) of comprehension (akin to the write alongs)
Examine the predictive validity of the assessment models generated from each theoretical perspective in relation to the gold standard
Be open to the possibility of a mixed model
Conclusion
We have traveled far, sometimes on new roads and sometimes on old.
Virtually all the old forms of assessment survive, even flourish because of their
Psychometric properties
Efficiencies
And because challengers often fail to meet either psychometric or efficiency standards
Conclusion
We seem poised to re-energize ourselves in this important enterprise
To build assessments that can meet the most rigorous of both measurement and conceptual standards
To serve the needs of both classroom teachers and policy makers
A welcome challenge