DOCUMENT RESUME
ED 414 997 PS 025 428
AUTHOR Kagan, Sharon Lynn, Ed.; Rosenkoetter, Sharon, Ed.; Cohen, Nancy, Ed.
TITLE Considering Child-Based Results for Young Children: Definitions, Desirability, Feasibility, and Next Steps.
INSTITUTION Yale Univ., New Haven, CT. Bush Center in Child Development and Social Policy.
PUB DATE 1997-00-00
NOTE 81p.; Based on Issues Forums on Child-Based Outcomes (New York, NY, June 1-2, 1995 and January 24, 1996).
PUB TYPE Information Analyses (070); Reports - Descriptive (141)
EDRS PRICE MF01/PC04 Plus Postage.
DESCRIPTORS *Academic Standards; Educational Change; *Educational Improvement; *Educational Objectives; Feasibility Studies; *Outcome Based Education; *Outcomes of Education; Preschool Education; Program Effectiveness; Theory Practice Relationship
ABSTRACT
While elementary and secondary educators are pursuing a results orientation as a means of educational reform and pedagogical improvement, the early childhood field has not explored sufficiently the desirability and feasibility of, nor the process for, establishing child-based standards and results for younger children. This report presents a synthesis of issues discussed at two issues forums held in 1995 and 1996. Following an introductory chapter, chapter 2 distinguishes between types of results and purposes of results as they apply to young children. Chapter 3 examines the desirability of child-based results in early childhood education in terms of the impact on teachers' practice and children's experiences, public understanding, funding, and the relationship of early care and education to other services. Chapter 4 identifies the following five conditions necessary for child-based results to be feasible: broad participation in the identification of results; identification of appropriate results; clarity concerning which children to include; appropriate measurement of results; and linking of child-based results to efforts to improve the lives of children. Chapter 5 outlines steps to advance a results-based approach: increase public consciousness and participation; plan strategically; identify and choose results carefully; develop appropriate, cost-effective approaches to assessment and data collection; put theory into practice; explore ways to adequately fund a results approach; and communicate, implement, and evaluate such programs. Five appendices contain the meeting agendas of each of the issues forums and three papers presented at the meetings. (TJQ)
Reproductions supplied by EDRS are the best that can be made from the original document.
U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
This document has been reproduced as received from the person or organization originating it. Minor changes have been made to improve reproduction quality. Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.
Considering Child-Based Results for Young Children
Definitions, Desirability, Feasibility, and Next Steps
Based on Issues Forums on Child-Based Results
sponsored by
The W.K. Kellogg Foundation,
The Carnegie Corporation of New York, and
Quality 2000: Advancing Early Care and Education
Edited by
Sharon Lynn Kagan,
Sharon Rosenkoetter, and
Nancy Cohen
©1997
Yale Bush Center in Child Development and Social Policy
310 Prospect Street
New Haven, CT 06511
Contents
1. INTRODUCTION
Background
Rationale for and Goals of the Issues Forums
2. DEFINITIONS
Defining Different Types of Results
Defining Different Purposes for Assessing Results
Issues Concerning Definitions of Types and Purposes of Results
3. DESIRABILITY
The Impact on Teachers' Practice and Children's Experiences
The Impact on Public Understanding
The Impact on Funding
The Impact on the Relationship of Early Care and Education and Other Services
So...What Do We Conclude About the Desirability of Moving to a Child-Based Results Approach?
4. FEASIBILITY
Only if There Is Broad Participation in the Identification of Results
Only if We Can Identify Appropriate Results
Only if We Are Clear About Which Children to Include
Only if We Measure Results Appropriately
Only if We Link Child-Based Results to Efforts to Improve the Lives of Children
5. SUGGESTED NEXT STEPS
Increase Public Consciousness and Participation
Plan Strategically
Identify and Choose Results Carefully
Develop Appropriate, Cost-Effective Approaches to Assessment and Data Collection
Put Theory Into Practice
Explore Ways to Fund a Results Approach Adequately
Communicate, Implement, and Evaluate
6. REFERENCES
7. APPENDICES
A. June 1-2, 1995 Meeting Agenda and Participants
B. January 24, 1996 Meeting Agenda and Participants
C. Considerations Regarding an Outcomes-Based Orientation, Sheldon H. White
D. Judging Interventions by Their Results, Lisbeth B. Schorr
E. Can We Measure the Outcomes?, John M. Love
Chapter 1. Introduction
While elementary and secondary educators are pursuing results as a means of educational reform and pedagogical improvement, the early childhood field has not explored sufficiently the desirability, feasibility, or process of establishing child-based standards and results for younger children. This report presents a synthesis of issues discussed at two Issues Forums. The first, on Child-Based Results, was held on June 1-2, 1995 in New York City and was attended by 37 scholars and practitioners in early care and education, school reform, and policy development related to children and families (see Appendix A). The second forum, held on January 24, 1996, addressed Next Steps in Advancing Child-Based Results; it consisted of 19 participants with expertise in results efforts (see Appendix B). The Forums were a collaborative effort of the W.K. Kellogg Foundation, Carnegie Corporation of New York, and Quality 2000: Advancing Early Care and Education.
Background
Current challenges facing the development and implementation of child-based results for young children are framed by two factors, each discussed below. The first is the current socio-political context; the second derives from grave concerns regarding the challenges and misuse of results in the past.
THE PRESENT CONTEXT
Increasing dissatisfaction with America's schools and the performance of its graduates has fostered widespread calls for educational reform. Emanating from both the public and private sectors, dissatisfaction with American education is compounded by growing concerns about the rising costs of a system perceived to be inefficient as well as ineffective.
To begin to rectify these educational ills, several movements have taken hold, including school-based management, charter schools, standards specification, and results-driven accountability. The results movement gained considerable public attention through the Goals 2000: Educate America Act and the Elementary and Secondary Education Act, among others. The press for greater accountability has spurred federal and state action, with new standards and results groups being formed to address educational challenges. In some states (Vermont, Minnesota, and Oregon) the results orientation transcends education and extends across the array of human services.
As support for a results orientation in education and human services increases, so has attention to young children and their families. New policy initiatives focus on very young children (e.g., Early Head Start and Healthy Start), preschoolers (e.g., National Governors' Association Action Teams on School Readiness), and on both (e.g., Kids Count, Child Care and Development Fund, and the family support movement). Public schools increasingly support prekindergarten services, while programs such as Head Start and Parents as Teachers receive media consideration and greater funding.
Despite dramatic growth in early care and education and unprecedented calls by the National Education Goals Panel and others to delineate optimal results and chronicle children's progress toward them throughout the nation, the field of early care and education has responded with less than vigorous support. Indeed, calls for a focus on child-based results have met with staunch vocal resistance, as well as more silent pleas for a redirection of effort.
THE PAST CONTEXT
The reluctance of the early care and education field to embrace a results orientation has deep roots, including legitimate concerns regarding test misuse, technical concerns regarding measurement, a historic focus on process, and a lack of agreement regarding what is meant by results.
Evidence of Misuse of Test Data
Early childhood education professionals have criticized the use of tests to mis-label, mis-categorize, and stigmatize children during their earliest days in formal education (Meisels, 1988; Shepard & Smith, 1987), and they have questioned the validity of standardized tests for individual children, especially boys, racial minorities, and preschool children whose primary language is not English. Of great concern to the field of early care and education has been the widespread use of "readiness" instruments to screen children for entry to school. This practice has resulted in up to 50 percent of children in some districts delaying school entry or being sent to alternative "transition" classes of unsubstantiated value (Gnezda & Bolig, 1989; Graue, 1993). Given these experiences, there is well-grounded skepticism in the early childhood education community about the potential use and misuse of results.
Concerns About Measurement
Concerns about measurement manifest themselves in two domains: what is measured and how it is measured. In a paper commissioned for the first Forum, White considers some of these concerns (see Appendix C). Regarding what is measured, early educators worry that child results may be narrowly constructed to include only cognitive and pre-academic results, ignoring developmental domains that are crucial to children's success but more difficult to capture in routinized assessment (e.g., socio-emotional development and approaches toward learning). Regarding the measurement of results (the how), some early educators and developmentalists doubt the feasibility of instituting a results approach with young children due to the variability of their behavior and their inexperience in "performing" in testing situations. Because young children's learning is highly episodic, early educators voice concerns regarding the capacity of instruments administered to children on one occasion to capture developmental nuances accurately. Further, they worry that assessments may not give racially, ethnically, and linguistically diverse young children appropriate opportunities to display their skills and knowledge. Finally, they challenge the reliability and validity of existing assessment tools, questioning whether such instruments can be suitably altered, or new instruments created, to diminish these concerns.
Focus on Process
Early education historically has emphasized the process of young children's individual learning. Early educators are trained to recognize and work with children's uneven growth as well as the diversity in family values, experiences, and interaction styles that shape early development. Reflecting this individualistic orientation, historic attempts to codify and improve practice in the early childhood education field have focused on the modification of inputs, structural variables, and process variables that enable such individualization: adult-child ratios, group size, and interaction patterns. Routinely, such inputs have been equated with quality, with little call for an examination of the need for a results orientation. There is little press for a movement toward child-based accountability by the early childhood education field, particularly when child-based results could be used to influence critical program funding and policy decisions.
Lack of Clarity of Terms
Finally, discourse on child results to date has been hampered by imprecision in language and scope.
There is limited consensus in early care and education regarding what is meant by terms in common use: goals, benchmarks, results, inputs, indicators, interim indicators, assessment, and testing. Attempts to achieve definitional clarity have been overshadowed by the field's dual concern with the delivery of direct services to children and their families, on the one hand, and with building the infrastructure, on the other. In short, definitional ambiguity is pervasive.
Rationale for and Goals of the Issues Forums
Given this context, and the likelihood of ongoing public pressure for results for young children, it was deemed appropriate to engage scholars, practitioners, and policymakers in a professional conversation. Organizers of the Forums wished to provide an opportunity to take stock of the current status of child-based results for children birth to age eight, and to give voice to the early childhood education community regarding its issues and concerns.
In particular, it seemed important to: (1) clarify definitional distinctions; (2) discern the desirability of moving to a results orientation; (3) determine the feasibility of moving toward a results orientation for young children; and (4) consider next steps regarding a results-based orientation. In contrast, the aim of the Forums was not to consider or define the specific content of results that might be deemed appropriate for young children; that work has been started by others (Love, Aber, & Brooks-Gunn, 1994; Phillips & Love, 1994; Institute for Research on Poverty, 1995). Nor, given the complexity of the issues and the differences of opinion that exist, was it the intent of the Forums to achieve consensus on relevant issues; rather, this was an opportunity for honest reflection and thoughtful debate.
Reflecting these goals and intents, this report is structured around three themes (definitions, desirability, and feasibility), with possible next steps suggested by the participants at the end. The document represents a synthesis of the issues discussed as well as those documented in the literature. It is not intended to represent the consensus of participants, because no such consensus was achieved. Rather, it is intended to reflect the complexity of the issues and the challenges associated with moving toward child-based results for young children. Editors of the document (Sharon L. Kagan, Sharon Rosenkoetter, and Nancy Cohen) have attempted to represent the ideas with fidelity; they alone, however, are responsible for errors. Appendices follow, including meeting agendas, lists of participants, and working papers prepared for the first Forum by Sheldon White, Lisbeth Schorr, and John Love.
Chapter 2. Definitions
This section details two distinct, but related, frameworks that shape the discussion that follows. The first distinguishes among different types of results; the second distinguishes among different purposes of results. A third part of this section discusses issues related to both the types and purposes of results. It should be noted that the definitions offered are not the only way of distinguishing among the types or purposes of results (Bruner, Bell, Brindis, Chang, & Scarbrough, 1993; Young, Gardner, & Coley, 1993; Schorr, 1994); they simply represent one heuristic. Important to note, however, is the pervasive lack of consensus around definitions as well as the need to ground this (and other) discussions of results in a definitional framework. While specification of the various types and purposes of results is critical for clear communication in this discussion, the deliberations of the Forums were designed to focus on Type One Results (what children know and can do, hereafter referred to as "child-based results") and on Purpose Four (accountability).
Defining Different Types of Results
Four types of results have been identified. Each type is discernable and knowable, each demands its own data elements and approaches to collection, and each evokes its own assessment processes and considerations. Each type, though independent and distinct, can be used in concert with others, contingent upon the purposes of the data collection. Together, the types of results form a continuum, with items representing children's performance and behavior at one end and results related to systemic performance at the other. In concert, the four types represent a comprehensive overview of the kinds of information being considered by agencies, localities, and states as they move to a results orientation (Kagan, 1995). Data on what children know and can do can be considered primary results, while secondary results include the contexts in which children develop, such as child and family conditions, service provision and access, and systems capacity.
TYPE ONE RESULTS: WHAT CHILDREN KNOW AND CAN DO
This type of information focuses directly on children's behaviors: what children know and can do. It is synonymous with the term "child-based results," the focus of this document. This type of information must be gathered by observing children directly. It accepts no proxies for behavior, but is a precise and accurate description of children's performance. For young children this includes dimensions related to their motor development, social and emotional development, use of language, cognition and general knowledge, and approaches to learning. To gather this type of information, child behavior is typically recorded intermittently, from more than one data source.
Examples of Type One Results include:
Motor development: Prevalence of children who jump; walk a six-foot balance beam; cut; do an "x"-piece puzzle.

Social and emotional development: Prevalence of children who accept responsibility for their own actions; take turns; form and maintain friendships.

Language usage: Prevalence of children who initiate and sustain conversation; listen to others; recite poems and do fingerplays; repeat a sentence in correct word order; follow a verbal direction containing three steps; tell about a picture when looking at it; name common objects.

Cognition and general knowledge: Prevalence of children who match; sort shapes and colors; identify largest and smallest; demonstrate awareness of cause and effect.

Approaches toward learning: Prevalence of children who take risks; persevere in a chosen activity; demonstrate curiosity; use materials in inventive ways.
TYPE TWO RESULTS: CHILD AND FAMILY CONDITIONS
This type of results focuses on the conditions that surround and encase what children know and can do. Such information may be gathered from reviews of documents, including health records; interviews with family members and service providers; and direct observations of and conversations with children and their families. This type of results assumes that what children know and can do is directly related to their own health status and to the conditions in which they live. Rather than reporting data on individual children, this type of data is generally reported as aggregated prevalences and percentages. Child and family results may be grouped into categories (e.g., child health conditions; family income conditions) with positive and negative indicators in each.
Examples of Type Two Results include:
Child health conditions: Prevalence of children who are born with low birth weights; are fully immunized; have functional limitations due to health conditions; have age-appropriate heights and weights; are in good physical health, with no vision or hearing impairments.

Family income conditions: Prevalence of children who live in poverty; live with two parents or one parent employed.

Family life conditions: Prevalence of children who are born to teen mothers or substance-abusing parents; are abused; live in foster care; have their TV viewing regulated; live in two-parent families; live in low-crime neighborhoods.
TYPE THREE RESULTS: SERVICE PROVISION AND ACCESS
Type Three Results are those that describe the services to which children and families have access. Distinct from behaviors (Type One) or conditions (Type Two), this type focuses on the service provision and access to services that children and their families experience. More than a tally of raw services, this type of results chronicles the real availability of services to all ethnic, racial, and linguistic groups, with data typically reported in prevalences or percentages. Often Type Three Results include information about services by population sub-sets or individuals with particular conditions (e.g., disabilities, pregnancy, employed status). Data for these results are typically collected from record reviews and community and institutional databases.
Examples of Type Three Results include:
Health provision/access: Prevalence of pregnant women who have access to early and continuing prenatal care. Prevalence of children with increased access to prenatal care; with health insurance; who have access to regular vision and hearing screening, to medical care, and to well-child examinations.

Parenting education provision/access: Prevalence of parents who have access to parenting classes and social supports.

Child care/preschool provision/access: Prevalence of low-income (or Limited English Proficiency [LEP] or disabled) children who have access to child care. Prevalence of children who have access to developmentally appropriate child care and education; who have access to before- and after-school care.
TYPE FOUR RESULTS: SYSTEMS CAPACITY
Rather than focusing on the provision of and access to discrete services, as indicated in Type Three Results, Type Four Results accord attention to the way services are linked and function as a system. Type Four Results assume that systemic capacity, efficiency, and integration are related to access and service quality which, in turn, are directly related to children's performance. Far less well developed than the other types, Type Four Results examine service redundancies, omissions, capacities, and efficiencies. Data for this type are collected in the aggregate and typically involve the amalgamation of information across agencies and service providers.
Examples of Type Four Results include:
Systemic efficiency: The degree to which the system uses its resources (e.g., fiscal, human, technical, and technological) efficiently and effectively.

Systemic infrastructure: The degree to which the infrastructure (e.g., training, financing, data gathering) supports efficient and effective service delivery.

Systemic accountability: The degree to which accountability is dispersed across systems; the degree to which agencies build collective accountability.

Systemic cultural sensitivity: The degree to which the system is sensitive and responsive to the needs of ethnically, racially, and linguistically diverse children and families.
Defining Different Purposes for Assessing Results
Different types of results exist because they are needed for different purposes. In some cases, for example, results information is needed by direct service providers for the purpose of enhancing the accuracy and quality of their work; in other instances, results data are needed for the purpose of demonstrating program efficacy; and in still other cases, results data are used to meet the information demands of policymakers and the public at large. Because these demands often exist simultaneously and because no comprehensive data collection strategy has been delineated for young children, the purposes of amassing results data can and do become blurred. To address this confounding of purposes, various experts have proffered different schema (Bruner, Bell, Brindis, Chang, & Scarbrough, 1993; Shepard, 1995). All helpful, these schema have informed the following categorization of purposes for collecting results information. Each of the four purposes is addressed with respect to Type One Results information.
PURPOSE ONE: SCREENING AND EVALUATION
Information on Type One Results can be used for the purpose of locating children with specified characteristics, describing their current level of functioning, and determining their eligibility for intervention services. Typically, large numbers of children are quickly assessed to locate those few who might evidence a certain condition. The few, then, receive more thorough evaluation to learn whether some type of intervention is warranted. Formal and informal observations, checklists, tests, and parent interviews are commonly used measurement approaches. Assessment for screening and evaluation may take place once or repeatedly.

Examples of assessments for this purpose include screening to discern the need for medical intervention (e.g., prescriptions, eyeglasses) or for special education services. In the latter case, the behavior of a single child is studied and compared with the average behavior of other children of the same age and characteristics (e.g., gender, geographic location). Large-scale assessments for this purpose include the Early Periodic Screening, Diagnosis, and Treatment (EPSDT) program, which funds the identification and treatment of health and developmental problems among Medicaid-eligible children, and Child Find, a process to locate children who may be eligible for and benefit from special services, including those covered under the Individuals with Disabilities Education Act (IDEA).
PURPOSE TWO: IMPROVEMENT OF INSTRUCTION
Information on Type One Results can also be used for the purpose of providing feedback to teachers on the instructional process, with the intention of improving pedagogy, aiding in program planning, and creating learning experiences more appropriate to the needs of individual children. For this purpose, teachers may note the behavior of one child or a small group of children, using informal observations, checklists, anecdotal logs, or portfolios. Typically, such assessments occur on an on-going basis, and teachers need training to develop and hone their observation and assessment skills.
Examples of such assessments include teacher observation of children's level of small motor development in order to plan appropriate activities to foster such development, or teacher observation of children's peer preferences and levels of play in order to arrange appropriate groupings of children.
PURPOSE THREE: PROGRAM EVALUATION
Assessment of children's performance and behavior for Purpose Three is undertaken to gauge the impact of a specific program or a particular intervention. The resulting data are likely to be used to guide future program design and funding decisions. For this purpose, the performance of groups of children is of interest, though children are likely to be assessed individually. Typically, such work is carried out by researchers, and data are reported about specific programs and interventions.
Examples of assessment for this purpose include the evaluation of the Parents as Teachers home visiting program or Kentucky's multi-age primary program.
PURPOSE FOUR: ACCOUNTABILITY
Children's knowledge and skills can also be measured for the purpose of informing the public about the collective status of children. For this purpose, the performance of children in classrooms, schools, districts, communities, states, and the nation is of interest; typically, progress is charted over time. For this purpose, groups of children are the unit for study, but not all children within a classroom or service unit will necessarily be assessed; samples of the group may be assessed. Assessment must be relatively time-efficient and the resulting data comparable and capable of aggregation.
Examples of results established for this purpose include parts of the Oregon Benchmarks, a series of results set by the public that service providers, localities, and the state try to achieve and for which they are held accountable. One of these benchmarks is the percentage of children entering kindergarten who meet specific developmental standards for their age; communities are publicly challenged to improve the percentage of children achieving this result. In another example, Kentucky uses child results data to consider its allocations of state education funds. Assessment for accountability purposes usually has high stakes. Information about the findings tends to be broadly disseminated and used for decision-making. The highest stakes occur when recognition, funding, or other resources are directly tied to the reports of child performance.
Issues Concerning Definitions of Types and Purposes of Results
Although the definitions offered do render greater precision for the discussion, they also raise several key questions: What is the difference between inputs and results? Are all four types really results? And what special issues arise when applying these constructs to very young children?
Across service spheres (e.g., education, health, social welfare) debate lingers regarding what constitutes an input or a result, and what distinguishes results from accomplishments. More than a semantic debate, the notion of what constitutes results warrants examination. Under many conditions, particularly in the education domain when speaking about students, "results" refer to what children know and can do, with inputs being the supports, materials, curriculum, pedagogy, and instruction that combine to foster the results. Alternatively, however, Type Two Results (child and family conditions), while a means to student results, also constitute results in their own right. In short, there may be a chain of results (Types Two, Three, or Four) that lead to the ultimate goal of enhanced student performance (Type One). Some designate the results that lead to student results as interim results (Schorr, 1993); some consider them social indicators.
However thorny for children of any age, the questions of what constitutes inputs and results, and how to assess them, are particularly challenging regarding young children. Assessing young children's results is complex because of their episodic development, their dependence on adults and society for supports, and the disjuncture between the manner in which young children demonstrate competence (action and interaction) and more conventional approaches to measurement. As such, ethical questions emerge regarding the legitimacy of basing child results only on what young children know and are able to demonstrate. It is often argued that results for young children must be predicated on multiple types of data; Type One data alone are deemed too narrow an indication of child results. Information from all types, but most particularly Types Two and Three, must be included in an assessment strategy that takes full account of the age and expected abilities of youngsters. That is to say, when considering young children, conceptions of results evidence may need to be broadened to include what, for other age groups, may be considered inputs.
Chapter 3. Desirability
Service providers in early care and education hope to affect results for children: to prevent negative results from occurring and to promote positive results. Early childhood educators are quite used to on-going and informal assessment of young children. So while it might seem that there would be great receptivity toward child-based results in early care and education, this has not always been the case, because of the historical antipathy discussed in the introduction, the lack of training of many early care and education workers, and the lack of infrastructure in the early care and education system to collect results data. In addition, today's discussion of child-based results imposes new levels of rigor, specification, and accountability. Movement to a more formal and systematic child-based results approach would focus sustained public attention on the degree of attainment of specified results. Questions likely to be asked include: How effective are teachers? Curricula? Early care and education programs? Schools? School districts? The amalgamation of services in communities, states, the nation?
Early care and education experts have divergent viewpoints on child-based results. One perspective is that it is unjust to predicate support for early care and education on the basis of results. Like education K-12, early care and education should be considered a moral imperative in a democratic society. This perspective also argues that the potential dangers of a child-based results approach outweigh the possible benefits. This perspective is articulated most emphatically under the following conditions: (a) the younger the children in question; (b) when using results data moves beyond the classroom; (c) when using results data to assess racially, ethnically, and linguistically diverse young children; and (d) when using results data to make high-stakes decisions
about pay, program reimbursement, or other resource allocations. In this view, it may be desirable to assess children's results, particularly for children ages three through eight, for the purposes of screening and evaluating children (Purpose One), improving classroom instruction (Purpose Two), and perhaps for curriculum or program evaluation (Purpose Three). From this perspective, however, there are too many risks and not enough benefits to assess results for younger children and to aggregate data to use for accountability purposes in monitoring or decision-making in the community, state, or nation (Purpose Four).
Another perspective on results for young children is that the challenges of results definition and assessment, even for accountability (Purpose Four), are addressable. This perspective regards the benefits of a results orientation as so attractive as to advocate immediate investments in constructing child-based results and systems for data collection for even very young children. The need for community, state, and national data to guide policy and practice, and even to allocate resources, is emphasized. This perspective argues that even if the early childhood education field delays, states and localities are preparing to assess child results, perhaps without needed advice from persons trained in child development and early education.
Amplifying these arguments, this section categorizes issues and then enumerates possible disadvantages and potential advantages of shifting to a child-based results approach for each. In most cases, specified disadvantages and advantages correspond to one or more of the indicated purposes. This section conveys the tensions surrounding child-based results, while later sections relate ideas for resolving them. In a paper commissioned for this Forum, Schorr reviews potential
benefits of a results-based approach, possible pitfalls, and promising strategies for implementation (see Appendix D).
The Impact on Teachers' Practice and Children's Experiences
What effect, if any, would child-based results have on daily practices affecting children and their teachers? Those who oppose a results-based approach feel that it would direct teachers to teach to the test, thereby limiting their creativity and the spontaneity and flexibility of early education. Advocates for a results-based orientation believe that it would help to guide practice, making it more purposeful and goal driven. These positions are elaborated more fully below.
Potential advantages for teachers' practices and children's experiences:
Teachers would have more information about children's learning and individual differences, allowing teacher practices to address children's individual needs and backgrounds, and expanding practices to address multiple areas of child development rather than just cognitive skills and knowledge (Purposes One, Two, and Four)
Teachers would develop and increase effective instructional approaches, curricula, and services if they know more about what works, have the flexibility to design approaches rather than being required to use prescribed methods, and have precise goals toward which to teach (Purposes Two, Three, and Four)
Teachers would use information on child development to communicate more effectively with family members, to help them support their children's learning (Purposes Two and Four)
Teachers would have increased expectations for all children, particularly ethnically, racially, and linguistically diverse young children, if all children are expected to achieve the same high results (Purpose Four)
Possible disadvantages for teachers' practices and children's experiences:
Teacher practices would become less effective if the results are misleading and do not capture the nature, complexity, and individuality of children's development, particularly for ethnically, racially, and linguistically diverse young children (Purposes Two and Four)
Teacher practices would become more uniform, in the attempt to achieve uniform results; homogeneous services could not meet the unique needs of individual children (Purposes Two and Four)
Communication with family members would become less useful, emphasizing performance on test items rather than overall child development (Purposes One, Two, and Four)
Children likely to test poorly and lower school averages would be retained or their entry to school delayed, particularly minority children and children with special needs or limited English proficiency; as a result, children could be labeled or stigmatized (Purposes One, Two, and Four)
The Impact on Public Understanding
The desirability of a results approach also depends on how the data are interpreted by families and the public. If the results are narrow, trivial, abstract, or culturally bound, then a results approach might hinder public understanding. If the results are meaningful to families and the public, and if they are expressed in everyday language, then a results approach might serve to instruct the broader society about child development.
Potential advantages for public understanding of young children's development:
Public knowledge and general understanding of healthy child development and of developmental problems would increase if results are meaningful and valuable to parents and the public (Purpose Four)
The public would come to understand more about the multi-dimensional nature of child development; understanding of the relationships among children, families, services, and systems of the early years would increase (Purpose Four)
Appropriate assessments would lead to the development of realistic expectations for the development of children and the performance of the programs in which they participate (Purpose Four)
Possible disadvantages for public understanding of young children's development:
The public would draw invalid conclusions about children's abilities from results that are inappropriate for all or some young children. For example, results that focus on cognitive skills minimize the importance of other dimensions of early learning. Another example would be an assessment system that does not reflect how the performance of low-income children may have improved over time (Purpose Four)
The public would think that child development is simpler and more uniform than it is, because it is impossible to capture the complexity and individuality of development in specific results (Purpose Four)
The public would be confused about what helps and hinders learning, if data about families, communities, services, and systems are not collected and presented as the context for child results (Purpose Four)
The public would place far more pressure on young children by developing false, unrealistic expectations for performance (Purpose Four)
The Impact on Funding
Current discussion about the desirability of adopting a results orientation in early care and education occurs within the framework of the devolution of government responsibility, program consolidation, budget cuts, and heightened competition for funds. Those who question a results-based approach are concerned that it will lead to a reduction in funding for services for children and families in general, for early care and education, and for low-income and minority children who may not perform well on tests. Advocates for a results-based approach feel that having dependable data is the only hope for maintaining, and possibly increasing, funding levels and services.
Potential advantages for funding:
Policymakers or government administrators would use results-based data to reallocate funds from marginal and ineffective early care and education programs to effective programs (Purpose Four)
Program administrators would use results-based data to expand more effective programs and improve or eliminate less effective ones (Purpose Four)
Investment would increase in services to low-income and otherwise needy children, in prevention efforts (rather than remediation), and in infrastructure, particularly if the cost savings of this additional funding are documented (Purpose Four)
Possible disadvantages for funding:
Effective early care and education programs would have their funding cut if the chosen results are insignificant, narrow, or insensitive to family and community differences (Purpose Four)
All early care and education programs would have funding cut if results are not met (whether or not these programs are at fault) and if the public loses hope or becomes cynical (Purpose Four)
There would be fewer resources to provide early care and education services to children if funds are diverted to setting, assessing, interpreting, and communicating results (Purpose Four)
What limited infrastructural support currently exists would be threatened because of the desire to keep resources close to the children so that results will improve (Purpose Four)
The Impact on the Relationship of Early Care and Education and Other Services
As society fails to solve the complex human problems in today's communities, there is a trend toward services integration and an increasing acknowledgement that the comprehensive needs of families cannot be met with narrow, categorical services. Those who question taking a results orientation feel that it might foster competition among all social services and economic development; fragmentation among services will grow as competition increases. Proponents of a results orientation believe that it will provide the vehicle for varied social service and economic development efforts to work together toward improving the lives of children and families.
Potential advantages for the relationship of early care and education with other services:
Agencies and programs across service areas would cooperate, collaborate, and possibly combine funds to achieve positive results; planning across social services would be facilitated by results data (Purpose Four)
The gap between early care and education and elementary education would be bridged with the common focus on results, as might the rift between services for children with special needs and education in general (Purpose Four)
Possible disadvantages for the relationship of early care and education to other services:
Early care and education programs would lose resources and attention if other services have better results (Purpose Four)
Ill will and fragmentation among services would increase and collaboration decrease if some services have better results (Purpose Four)
The various service sectors would blame each other for poor results if attribution of results is not demonstrated clearly (Purpose Four)
So ... What Do We Conclude About The Desirability of Moving to a Child-Based Results Approach?
Returning to the possible purposes of a child-based results orientation, there is some consensus in the early childhood field that gathering information for preschool- and primary-aged children for screening and evaluation (Purpose One) and for improvement of instruction (Purpose Two) is advantageous for children and families, provided that the results chosen are both significant and appropriately measured. Many early childhood experts also favor the use of child results for program evaluation (Purpose Three). There is less agreement regarding movement to child-based results for these purposes for children below the preschool years, notably infants and toddlers.
While some first Forum participants strongly supported moving to child-based results for accountability purposes, there was little agreement about the wisdom of doing so for all young children, birth to age eight. Even among those who support such movement, there is greater support for assessing child-based results for monitoring and planning than for using results to allocate resources.
To implement a child-based results approach that avoids the possible disadvantages described above and achieves the advantages enumerated, planners must meet certain necessary conditions.
Implementation of a results orientation is desirable only if critical safeguards are in place during the process of results identification, while planning the assessment, throughout the data collection, and as findings are interpreted and communicated. These "only ifs", or necessary conditions, are discussed in the section that follows.
Chapter 4. Feasibility
Despite disagreement on the desirability of moving to a child-based results approach for accountability purposes, many experts believe that doing so now is more feasible than at any other time in our history. They indicate that certain fields, such as early childhood special education, have focused on child results for more than 20 years and have both good-practice and bad-practice examples to inform the discussion. Further, current research is improving the options for gathering information that is both valid at the time of measurement and meaningful across the developmental course. In papers commissioned for this Forum, White explores some of the limits of existing standardized tests for young children (see Appendix C) and Love presents arguments for the feasibility of a child results approach (see Appendix E). Despite these advances, all are concerned that any shift to a results orientation be made in ways that are developmentally, culturally, and contextually appropriate. Care must also be exercised in the process of developing and interpreting results. Inherently precarious and high stakes, the effort to develop and implement a results approach can only be undertaken under certain conditions. Moving to child-based results is feasible "only if" the following five conditions are met:
"Only If" There Is Broad ParticipationIn The Identification of Results
INCLUDE MANY STAKEHOLDERS IN RESULTS DEVELOPMENT
Results that will be useful to the nation need to be agreed upon by a broad constituency, including parents, policymakers, practitioners, and researchers. Politicians, government administrators, business leaders, and citizens have meaningful contributions to make in the development of results, as do individuals from diverse ethnic, racial, and linguistic backgrounds. The process of building consensus by selecting certain child results that are significant to broad audiences is crucial. Moreover, in developing results, it is important to remember that the input of lay citizens can help assure that results are meaningful and locally appropriate.
WORK WITH PRACTITIONERS AND PARENTS IN PARTICULAR
Given the discomfort of many in the early childhood education community (personnel along with parents) regarding a shift to a results orientation, it is important that conversations regarding child-based results involve practitioners and parents. Personnel in early care and education need to converse with others in the education and human service fields who are already using a results approach, such as early childhood special educators, elementary school teachers, and health care providers, to gather suggestions from their experiences. Parents need to understand clearly the implications of moving to a child-based results orientation. More significantly, both groups know children well; their ideas are critical to constructing effective, appropriate results.
"Only If" We Can IdentifyAppropriate Results
CHOOSE RESULTS THAT ARE SIGNIFICANT FOR CHILD DEVELOPMENT AND THE LIFE COURSE
As a child-based results approach evolves, there may be a tendency to choose results that can be easily measured or to select "quick and dirty" indicators for which measures already exist. In lieu of these sometimes flawed approaches, planners need to consider what results they deem fundamentally important, with major findings from child development influencing the content of the results. Good results have merit because they are important in and of themselves, and because they are linked to longer term life goals. The discussion of life-course results is also valuable because it can unite Americans across racial and ethnic lines; most people want the same major life-course results for their children. Thus, the essential task of results planners is to take the complex constructs that research has demonstrated to be important (e.g., identity formation, achievement motivation, task persistence, establishing peer relationships) and identify the life-course "edge" appropriate to the age group under consideration.
Life-course results might include being ready for school, being able to read, graduating from high school, attending college, holding a job, or avoiding teenage pregnancy and crime. Conceptualized in this way, such results will have utility for parents and policymakers. Moreover, they contextualize the content of the results in a more durable, long-term perspective that is salient across all populations. For these reasons, life-course results should be considered.
CHOOSE RESULTS THAT CROSS MULTIPLE DOMAINS OF CHILD DEVELOPMENT
Basing educational results on subject matter curriculum areas (e.g., science, math), while perhaps appropriate for older children, needs to be examined critically when considering younger children. Learning for young children is less oriented to subject matter facts than to the fostering of basic developmental competence. To that end, the Goal 1 Technical Planning Group of the National Education Goals Panel, building on decades of work by scientists and practitioners, has identified five dimensions of early learning that provide a developmental, rather than a curricular, framework. The dimensions are: physical well-being and motor development; social and emotional development; approaches toward learning; language usage; and cognition and general knowledge (Kagan, Moore, & Bredekamp, 1995). Results for young children need to consider these dimensions, as well as a curricular orientation.
Beyond incorporating a broad-based developmental orientation, results for young children must take into account children's unique learning approaches. Young children, especially, do not learn in compartmentalized categories; they amass knowledge through integrated experiences. Consequently, results for young children must reflect integration across domains and across subject areas. Results for young children should not focus only, for example, on cognitive development, but must emphasize all domains. It is imperative that the domains be considered as a totality, with no "single domain acting as a proxy for the complex interconnectedness of early development and learning" (Kagan, Moore, & Bredekamp, 1995).
Use of multi-dimensional results will also minimize teaching aimed solely at producing high test scores. For example, it is easy to teach a child to label pictures of community helpers (single-dimension skill), but more challenging to teach children how to plan and play with others (multi-dimensional skill).
CHOOSE RESULTS TO WHICH DOLLAR VALUE CAN BE ATTACHED
Given the precarious funding of early care and education and given the persuasiveness of cost-saving data to policymakers, new results should be amenable to cost analysis. To date, in a very limited number of studies, the early care and education field has been particularly successful in demonstrating the cost-effectiveness of early intervention. Recognizing these conditions, new results data must be responsive to policymakers' thirst for additional fiscal information.
CHOOSE RESULTS THAT REFLECT THE REALITIES OF EXISTING COMMUNITIES AND PROGRAMS, YET CAN BE AGGREGATED
Real world circumstances must guide the adoption of results. If schools in a community teach only in English, then an appropriate result must be that children read, write, and speak English. On the other hand, if children in a community are allowed to demonstrate their competence in Spanish, Korean, or English, then the results process must reflect that fact. In communities in which neighborhood culture and school expectations differ, results might reflect children's ability to function successfully in both settings. Selected indicators must not gloss over the differences in "readiness for kindergarten" in the myriad of communities across America, nor can local values be ignored. Nevertheless, if results are to be used to monitor local, state, and national trends, a core group of results must be drawn to allow comparisons among programs, communities, and states.
DETERMINE WHETHER RESULTS SHOULD BE COMPLEX OR SIMPLE
One perspective is that results must reflect the complexity of development; indicators stripped of developmental "richness" are not meaningful. Another view is that worthy child results can focus on a few crucial elements; these would be stage-salient tasks with considerable social validity, such as trust behaviors in infancy or reading at the end of first grade.
The issues of the complexity and "developmental embeddedness" of selected results, along with the intricacies of the assessment methods adopted to measure them, determine the costs of these approaches. One perspective underscores pragmatism in a time of tight budgets; namely, early care and education should move forward to develop quick, low-cost approaches to results assessment. Another view is that simple measures of complex developmental phenomena are not presently possible. Given that fact, the field should honestly explain its position and inform policymakers that adequate time and money must be provided before results assessment can go forward.
In either case, complex or simple, the results should be designed to "tell a story" that increases the public's and policymakers' understanding of child development.
"Only If" We Are Clear About WhichChildren To Include
DETERMINE HOW TO INCLUDE CHILDREN WITH LIMITED ENGLISH PROFICIENCY AND SPECIAL NEEDS
To what extent should children with limited English proficiency be included in results assessments? A predominant view is that all children should be included in such a system. Otherwise, the system would not be national; it would not be just. However, with children whose home language is other than English, and particularly in instances where multiple languages are represented in a classroom or community, assessment technicalities and practical issues exist.
A similar issue arises with regard to young children with special needs. In several states, youngsters with special needs have been "overlooked" in designing results assessment systems, while other states have included them. Recommendations from national special education experts (National Center on Educational Results, 1993), and pioneering results efforts in several states, suggest that most children with disabilities should participate in results evaluations.
"Only If"We Measure ResultsAppropriately
LOOK AT CHANGES FROM BASELINE BEHAVIOR OVER TIME
Because young children's growth is highly episodic and variable, performance cannot be judged at a single point in time, but must be gauged from repeated observations; data must be collected at multiple points in time. Use of measures over time also reveals developmental progress in children whose exposure to early learning opportunities has been restricted and whose baseline and current performance are delayed for their chronological age. In monitoring local programs, states, and the nation, it is essential to realize that improvement in results must be regarded relative to children's starting points. This is particularly important when studying children at risk, who may be making great gains but still have below average results. Results assessments must take starting points into account and must look at change over time.
FOCUS ON FACE VALIDITY, CONSTRUCT VALIDITY, AND CONSEQUENTIAL VALIDITY
The paragraphs above have underscored the importance of results and results measures that "make sense" to parents, the public, and practitioners. This is the concept of face validity. Construct validity is also critical: Do the assessment approaches measure what they are intended to measure? Do they clarify performance consistent with the concept embodied in the results? Consequential validity is also very important: Are results used in valid ways (i.e., to make accurate explanations)? Is the interpretation given consistent with the actual findings?
THINK INVENTIVELY ABOUT ASSESSMENT
A great deal of innovative assessment is taking place nationally, and new results efforts should seek to incorporate this cutting-edge work. Among these efforts, play-based assessment, classroom observation schemes, authentic assessment, descriptive documentation, interviews with teachers and parents, portfolio approaches, and Vygotskian dialogues show promise for providing useful developmental information. While each of these approaches faces unique problems, they share the challenge of aggregating descriptive information in a concise form that can be used in decision making. As such, they provoke thinking about critical issues that need to be faced as results information is generated and made useable.
"Only If"We Link Child-BasedResults to Efforts to ImproveThe Lives of Children
LINK CHILD-BASED RESULTS WITH OTHER INDICATORS
If a purpose of results measurement is to help practitioners, parents, policymakers, and the general public understand the status of young children, and to improve this status, then it is essential that the conditions in which children are living and learning (Type Two) be assessed and related to the child-based results. Moreover, the services that children are receiving, or not receiving (Type Three), and some or all of the elements of the systems that support early care and education must be assessed and related to the child-based results.
Information about multiple types of results should be used to help the public understand the circumstances upon which child results depend. Some early childhood experts are more vocal than others in requiring this linkage in any results system that is developed. More data are already being aggregated for Types Two and Three than for child-based results (Type One), and, at the present time, there is little reporting of child results along with the other types of information.
Chapter 5. Suggested Next Steps
While significant progress has been made in delineating the issues surrounding the use of results in early care and education, the topic remains conceptually complex, practically challenging, and politically sensitive. The suggested next steps that follow build upon preliminary discussion from the first Forum and elaborated discussion at the second Forum. The next steps embody a number of recommendations for advancing a results-based approach, but they are intentionally suggestive, offering domains and strategies for action that need to be honed.
Increase Public Consciousness and Participation
CONSIDER TERMINOLOGY
Within discussions of results, similar terms often convey different meanings to various speakers; alternatively, different terms may convey the same concept. At the same time, particular terms may be politically charged in certain locations with specific audiences, but not in others. In short, there is no clear set of terms with which to conduct the debate. Since language is the vehicle for meaning, foundational concepts need to be clearly stated and mutually agreed upon early in the discussion of child-based results in early care and education.
BROADEN PARTICIPATION IN IDENTIFYING RESULTS AND BUILD CONSENSUS REGARDING THEM AT LOCAL, STATE, AND NATIONAL LEVELS
Worthwhile results that are broadly "owned" can result only from shared construction by parents, teachers, administrators, researchers, policymakers, and the public at large. Building consensus at the local, state, and national levels on desired results that are meaningful to varied audiences is critical to raising public consciousness and supporting an ongoing effort. An intentional focus on including traditionally disenfranchised groups in the process is essential. All consensus-building efforts need to be coordinated by groups considered to be non-biased and legitimate.
ENGAGE THE EARLY CHILDHOOD COMMUNITY
The early childhood community is correctly concerned about the development and implementation of results for very young children. The concerns of this community need to be fully understood and carefully addressed. Forums for early childhood practitioners and researchers need to be held; written materials need to be developed. Above all, time needs to be allowed for support and consensus to emerge.
Plan Strategically
LEARN FROM AND BUILD ON EFFORTS IN CHILD DEVELOPMENT AND ALLIED FIELDS
Given the focus on results creation, construction, and collection throughout the nation, it seems wise to assess and utilize, where appropriate, the data currently being collected for other efforts. In particular, a number of government and foundation projects have developed models for child-based educational results for younger and older children. Data gathering efforts for older children, including the National Assessment of Educational Progress, and for younger children, such as the Early Childhood Longitudinal Study, have struggled with many of the same issues explored in this paper that challenge efforts to move to a child-based results orientation. It would be especially informative to consider how they have, and have not, addressed the necessary conditions (only ifs).
Other sources of guidance are child-based results efforts in related fields, such as child welfare, adoption, foster care, early childhood special education, and child health. Attention to the successes and failures in results approaches across disciplines will be invaluable at the formative stage of efforts in early care and education. Additionally, others in early care and education have pondered these same issues. They include the Goal 1 Technical Planning subgroup and Head Start Research Committees. Efforts should be made to use the expertise and documents from these groups.
CONSIDER AND PREPARE FOR UNINTENDED CONSEQUENCES
Moving to a results orientation will yield some unintended consequences. To the extent possible, efforts to predict such consequences, particularly those that might be negative, are encouraged. In addition, once potential negative consequences are identified, strategies to deal with them need to be developed and implemented.
COORDINATE RESULTS-RELATED EFFORTS
Create a collaborative of organizations engaged in work on results related to early care and education to support and inform one another. Such a group could not only cross-fertilize existing work and minimize duplications, but could also create strategic plans delineating where additional technical, consensual, and political work is needed.
Identify and Choose Results Carefully
IDENTIFY RESULTS BY TYPE
Confusion exists regarding different types of results. To that end, different types of results should be discerned. For results related to children's behavior, a broad national consensus needs to be developed, with ample opportunity for states and locales to tailor their specific results and benchmarks. Such results should be strengths-based and should allow for the reflection of partial achievement. Efforts to specify Type One results might look to special education, which has been using similar results for many years. For results related to child and family conditions (Type Two), access and quality of services (Type Three), and systemic results (Type Four), model results could be developed at the national level for state adoption. All results should evolve and be subject to frequent change as knowledge, social conditions, and values are altered.
IDENTIFY LIFE-COURSE RESULTS
Define major life-course results for older children, and specify the antecedents in early childhood that symbolize important "real life" skills. Tying life-course results to younger children will help to elevate the importance of the early years.
IDENTIFY LINKS AMONG THE RESULTS TYPES
Because results are interrelated, pathways among and between individual results and results types need to be clarified. In particular, linkages need to be made between results related to children's behavior and knowledge (Type One) and the context in which children develop, as expressed in results related to child and family conditions (Type Two), service provisions (Type Three), and systemic integration (Type Four).
FOCUS ON POSITIVE RESULTS
Frequently, particularly regarding child and family conditions (Type Two results), there has been a tendency to chronicle negative results. Efforts should be made to identify and assess positive results (e.g., resilience, protective factors, child well-being) as well.
SELECT IMPORTANT RESULTS TO WHICH TEACHERS CAN AND SHOULD TEACH
The results chosen for young children must be truly important to their future development and achievement, not simply indicators that are easy to measure. When results are salient and when they are evaluated in contextually sensitive ways, teachers can conduct activities that promote the skills that assessments measure.
RESOLVE THE TENSION BETWEEN THE DESIRE OF COMMUNITIES AND STATES TO CUSTOMIZE RESULTS AND THE NEED TO AGGREGATE DATA
It is necessary to identify results that are meaningful to local communities and states to inform the process and assure local ownership of the results. It is also crucial to maintain some consistency across all data such that they are useful, interpretable, and comparable.
AIM HIGH, BUT BE REALISTIC
Child-based results must be selected that create high expectations for all children. Such well-designed results will guide service improvement and policy formation. Care must be taken, however, not to over-reach or idealize the conditions under which children live or the services they can reasonably be expected to receive.
Develop Appropriate, Cost-Effective Approaches to Assessment and Data Collection
DEVELOP MEASUREMENT TECHNIQUES FOR LARGE SAMPLES OF YOUNG CHILDREN
Efficient methods for gathering data from children in a variety of settings must be pioneered, building upon the experiences of multi-site early childhood education studies conducted within the past decade. As noted earlier, promising new techniques (e.g., play-based assessment, classroom checklists for anecdotal records, and Vygotskian dialogue methods) merit exploration because they reflect child development and the context in which it occurs. Narrow, multiple-choice psychometric measures are not acceptable for any of the purposes herein. Richer, more complex data show promise of being useful for multiple purposes. Worthy goals are to collect fewer data and make the best possible use of what is collected, and to give children from diverse backgrounds the opportunity to perform to their optimum capacity.
DEVELOP ACCEPTABLE APPROACHES TO INCLUDE CHILDREN WITH LIMITED ENGLISH PROFICIENCY AND THOSE WITH SPECIAL NEEDS
Several states and organizations have developed approaches to including as many individuals as possible in results assessment. To do this, results measures and assessments need to be translated into multiple languages, appropriate for multiple cultures, and accessible to children with special needs.
Put Theory Into Practice
PROVIDE TECHNICAL ASSISTANCE TO SUPPORT RESULTS DEVELOPMENT AND ASSESSMENT
States are moving rapidly to use child-based results. There is an urgent need to support state and local practitioners as they surge forward. In many cases, states will be pressed to implement a results system well before many of the issues can be addressed fully. To help states in the meantime, information across states should be chronicled and mechanisms established for information sharing among those developing results and related assessment systems.
PROVIDE TECHNICAL ASSISTANCE TO SUPPORT DATA GATHERING AND MEASUREMENT
Presently, many states are enhancing their data gathering and measurement capacities. Not only are states using approaches that differ from one another, but often administrative departments within states differ in their approaches to data gathering and measurement, preventing the aggregation of data germane to specified results. To that end, technical assistance should be provided to foster comparable measurement and assessment approaches across state administrative agencies and perhaps across states.
PILOT WELL-CONSTRUCTED MODELS
Any results system for early care and education must be grounded in accepted child development theory and validated assessment practices. For collecting results-based data for accountability purposes, it is crucial that innovative approaches from a variety of disciplines and audiences be considered. Before policies are drafted, approaches should be pilot-tested at several diverse sites to assure the efficacy of the approach. Expansion should proceed at a manageable pace.
CONTINUE THE DIALOGUE
Convene meetings once or twice a year to review results work being done by districts, states, and individual researchers; to identify exemplary efforts and find ways to share them widely; and to identify gaps in knowledge and plan ways to fill them. Future gatherings might continue to explore divergent viewpoints on difficult issues and spark inventive resolutions.
Explore Ways to Fund a Results Approach Adequately
DETERMINE NECESSARY FUNDING
The collection of some data may require no additional funds. For example, funding for data collection around special education placement is available, and funding for research on instructional approaches may be contained within the development and pilot-testing budgets for such projects. However, monitoring child results for accountability purposes is typically a project that needs specific funds. A necessary first step is to define the scope of the proposed project, noting the complexities and costs inherent in large-scale data collection.
EXPLORE POTENTIAL FUNDING SOURCES
Both start-up and ongoing funding need to be explored, with consideration given to "piggybacking" with other data collection wherever possible. Funding options to be considered include foundations, government agencies, and state agencies.
Communicate, Implement, and Evaluate
CONCENTRATE ON COMMUNICATING RESULTS BROADLY AND EFFECTIVELY
To have impact, results data on young children must be shared with parents, policymakers, business, and the media. Careful consideration must be given to the nature of the data shared and the process for sharing them. Not all data are in a form immediately suitable for use by multiple audiences. Care must be taken to assure effective communication to appropriate audiences.
CONTINUOUSLY EVALUATE AND IMPROVE RESULTS MEASURES AND APPROACHES
Initially, no process of results and assessment will be complete or ideal. Efforts and instruments need to be monitored and evaluated continually. Realistic timelines and comprehensive, sequenced plans should be developed so that results-based efforts are regularly re-examined.
Appendix A: June 1-2, 1995
Meeting Agenda and Participants
FIRST ISSUES FORUM ON CHILD-BASED RESULTS AGENDA

GOAL: To examine the desirability and scientific feasibility of moving toward a child-based results orientation for children birth to age eight.

June 1 and 2, 1995
Carnegie Corporation of New York
437 Madison Avenue, New York, NY
Day One: June 1, 1995
11:00 Buffet brunch available
Session I: Welcome and Overview
Sharon L. Kagan, Chair
12:00 Welcome/Introductions
Goals of the meeting
Background, Rationale, and Definitions
12:45 Considerations Regarding a Results-Based Orientation
Presentation by Lisbeth B. Schorr
1:05 Considerations Regarding a Results-Based Orientation
Presentation by Sheldon White
1:25 General Group Discussion
2:45 Break
Session II: Implications for Children's Development
Michael Levine, Chair
3:00 Participants will be asked to respond to the following questions:
a. What are the necessary characteristics of an assessment process that obtains needed information and also promotes child development? How does the process differ depending upon whether we are assessing for instructional improvement or accountability?
b. Are there special conditions or needs of young children in general, and certain young children in particular, that might exempt them from or prefer them for inclusion in results assessment?
c. How can a results orientation be structured to be fully comprehensive for children, 0-8?
4:00 General Group Discussion
5:00 Adjournment
Session III: Dinner and Roundtable Discussions on Desirability
Valora Washington, Chair
6:30 Cocktails
7:00 Dinner
7:45 Concurrent Roundtable Discussions
Roundtable A: Impact on Classroom Practice and Teacher Preparation
a. What is the potential impact of a results orientation on classroom practice? On professional development?
b. If a results approach were adopted, what special actions should be taken to encourage positive results and prevent negative consequences in classroom practice?
Roundtable B: Impact on Programs
a. How would a movement toward child results impact program quality? Availability?
b. What is the relationship between results and funding?
c. What program policies might be altered (positively or negatively) as a result of a results orientation (e.g., retention)?
Roundtable C: Impact on Families and Communities
a. How will parents and diverse community groups respond to a results orientation (e.g., persons with low income, business leaders, service providers)?
b. What precautions are necessary to prevent misuse of outcome data within a community?
Roundtable D: Impact on Early Childhood Systems
a. Would the benefits and challenges of a results approach touch all segments of the field evenly? Which constituencies/programs would be most affected? How?
b. Would movement toward a results orientation help unite or further divide early care and education?
c. How would a results approach affect monitoring and licensing? Advocacy?
Roundtable E: Impact on State and Federal Policy
a. What are the potential policy consequences of a results orientation in the states? At the national level?
b. If a results approach were to be adopted, what strategies should be implemented at the national and state levels to ensure maximum benefits and constrain harm in the states?
9:00 Adjournment
Day Two: June 2, 1995
Session III: Continued Discussion on Desirability
Valora Washington, Chair
8:00 Continental breakfast
8:30 Reports from Roundtable Discussions
A. Impact on Classroom Practice and Teacher Preparation
B. Impact on Programs
C. Impact on Families and Communities
D. Impact on Early Childhood Systems
E. Impact on State Policy
9:20 General Group Discussion
10:20 Break
Session IV: Scientific Feasibility of a Results Approach
Michael Levine, Chair
10:35 Review of Past Efforts and Scientific Feasibility
Presentation by John Love
10:55 Participants will be asked to address the following questions:
a. What cultural, technical, developmental, and implementation considerations must be kept in mind if a child-based results orientation were to take root?
b. Do we understand the pitfalls and can weovercome them, now?
12:00 General Group Discussion
12:45 Lunch
1:30 Participants will be asked to respond to the following questions:
a. For each age group (infancy, preschool, and primary), is the development and implementation of child-based results technically feasible at this time?
b. What, if any, are the special considerations for your age group?
2:15 General Group Discussion
3:15 Break
Session V: Summary and Conclusions
Sharon L. Kagan, Chair
3:30 Summation: The Desirability and Scientific Feasibility of a Results Approach
Presentation by Barbara Blum
4:00 General Group Discussion: Next Steps
4:30 Adjournment

First Issues Forum on Child-Based Results
June 1-2, 1995
Participants
Larry Aber
National Center for Children in Poverty
Columbia University

Barbara Blum, President
Foundation for Child Development

Sue Bredekamp
National Association for the Education of Young Children

Cynthia Brown
Council of Chief State School Officers

Charles Bruner
Child and Family Policy Center

Bettye Caldwell
Pediatrics/CARE
Arkansas Children's Hospital

Nancy Cohen
Bush Center in Child Development and Social Policy
Yale University

Ellen Galinsky
Families and Work Institute

Sarah Greene
National Head Start Association

Kenji Hakuta
School of Education
Stanford University

Sharon L. Kagan
Bush Center in Child Development and Social Policy
Yale University

Mary Kimmins
Maryland State Department of Education

Luis Laosa
Educational Testing Service

Michael Levine
The Carnegie Corporation of New York

Joan Lombardi
Administration for Children, Youth, and Families
U.S. Department of Health and Human Services

John Love
Mathematica Policy Research, Inc.

Samuel J. Meisels
University of Michigan
Kristin Moore
Child Trends

Frederic Mosher
The Carnegie Corporation of New York

Deborah Phillips
National Research Council

Craig Ramey
Civitan International Research Center
University of Alabama at Birmingham

Sharon Rosenkoetter
Bush Center in Child Development and Social Policy
Yale University

Lisbeth Schorr
Harvard University Working Group on Early Life

Diana Slaughter-Defoe
School of Education and Social Policy
Northwestern University

Robert Slavin
Johns Hopkins University

Valora Washington
The W.K. Kellogg Foundation

David Weikart
High/Scope Educational Research Foundation

Charles E. Wheeler
Walter R. McDonald & Associates, Inc.

Sheldon White
Department of Psychology and Social Relations
Harvard University

Emily Wurtz
Office of Senator Jeff Bingaman

Nicholas Zill
Westat, Inc.
FOUNDATION REPRESENTATIVES
Stacie Goffin
Ewing Marion Kauffman Foundation

Mary Larner
David and Lucile Packard Foundation

Janice Molnar
Ford Foundation
Appendix B: January 24, 1996
Meeting Agenda and Participants

Second Issues Forum:
Next Steps for Child-Based Results
Agenda
Wednesday, January 24, 1996
10:00 am to 4:30 pm
Carnegie Corporation of New York
437 Madison Avenue, New York, NY
Goal: To identify "actionable" next steps for developing a results-oriented approach for accountability in early care and education
10:00 to 10:15 Welcome, Introductions, and Charge to the Group
10:15 to 10:30 Questions and Discussion Concerning First Meeting
10:30 to 12:00 Identifying Results
Is there a need to identify and build consensus on key Type 1 results (what children know and can do), or have such results been adequately identified? Consider this question separately for children age 5, age 3, and younger than age 3. What are the next steps in this area?
Is there a need to identify and build consensus on key Type 2 results (child and family conditions), or have such results been adequately identified? Consider this question separately for children age 5, age 3, and younger than age 3. What are the next steps in this area?
Is there a need to identify and build consensus on key Type 3 results (service provision, access, and quality) for families and children, or have such results been adequately identified? Consider this question separately for children age 5, age 3, and younger than age 3. What are the next steps in this area?
What is meant by Type 4 results (systems capacity)? How do they relate to Type 1, 2, and 3 results? Is it necessary to develop different Type 4 results for children age 5, age 3, and younger than age 3? What are the next steps in this area?
(If time allows) What are the "markers of progress" for results Types 1-4? What are the micro-results that lead to other micro-results? What are the complex relationships among the different types of results? Which results lead to which other results? What are the next steps in this area?
12:00 to 1:00 Assessing Results
Is additional work on instrument development and assessment methods needed for implementing a child-based results approach? If yes, for which types of results and for children of which ages? Are there adequate approaches for assessing some relatively complex results (e.g., nurturing families)? What are the next steps in this area?
Have cost-effective approaches to data gathering and assessment been identified? Is it clear how states can make the best possible use of existing data? What are the next steps in this area?
Is it clear how children with special needs (developmental disabilities, LEP, at-risk) should be included in assessment and data gathering? What are the next steps in this area?
1:00 to 1:30 Lunch
1:30 to 2:30 Developing Comparable Results and Data
How can results be developed and data collected that reflect community/state input and priorities and that are also comparable across communities and states? Should we strive to use the same data collected by communities and states (increasingly for the purpose of higher-stakes accountability such as resource allocation) to make comparisons among communities and states? To generate a picture of children and families across the nation? Alternately, should the goal be separate local, state, and national efforts to define results and collect data? How could separate local, state, and national data collection efforts support one another and minimize duplication? What are the next steps in this area?
2:30 to 3:00 Involving the Early Childhood Education Community
How can the early childhood education community be better integrated into efforts to develop and assess child-based results? What are the next steps in this area?
3:00 to 3:30 Public Relations
How can states and communities develop the political attention and will to overcome special and competing interests and come up with fair and meaningful results and assessment systems for children and families? What are the next steps in this area?
3:30 to 4:30 Finalize Next Steps
Second Issues Forum: Next Steps for Child-Based Results
January 24, 1996
Participants
Barbara Blum
Foundation for Child Development

Barbara Bowman
Erikson Institute

Sue Bredekamp
National Association for the Education of Young Children

Cynthia Brown
Council of Chief State School Officers

Nancy Cohen
Bush Center in Child Development and Social Policy
Yale University

Ellen Galinsky
Families and Work Institute

Eugene Garcia
Graduate School of Education
University of California, Berkeley
Sharon L. Kagan
Bush Center in Child Development and Social Policy
Yale University

Luis Laosa
Educational Testing Service

Michael Levine
Carnegie Corporation of New York

John Love
Mathematica Policy Research, Inc.

Kristin Moore
Child Trends

Michelle Neuman
Bush Center in Child Development and Social Policy
Yale University

Gregg Powell
National Head Start Association

Sharon Rosenkoetter
Associated Colleges of Central Kansas

Jack Shonkoff
Heller Graduate School
Brandeis University

Gary J. Stangler
Missouri Department of Social Services

James Ysseldyke
National Center on Educational Results
University of Minnesota
Appendix C: Considerations Regarding a Results-Based Orientation
Sheldon H. White
Harvard University
Paper presented at the Issues Forum on Child-Based Results, New York City
SECTION 1 Introduction

In the abstract, program assessment using child-based results is highly desirable for the management of early childhood programs. To the extent that a program produces changes in children that are unarguably positive and plainly visible to others, those responsible for the program have less to worry about internally and less to explain to others. The program takes care of itself. The program managers obtain autonomy and flexibility. Supervisors or critics may ask questions about the program's philosophy, methods, or operational strategy, but if the positive benefits of the program are plain to see, all the questions are held at arm's length. They do not disappear; they are tabled, but they have little force in requiring changes in the program. Creative program developers attach considerable importance to the "immunization" produced by face-valid results. I have seen one sophisticated developer of an innovative educational program go to considerable effort to produce a new system of evaluation along with his new educational program. His hope was that his new form of evaluation would place a shield between his unorthodox program and its critics.
Often enough, the results obtained by an early childhood program do not immediately and obviously declare themselves as benefits. Then some kind of estimation of what the program is achieving has to be arrived at by proxies: goodness-of-process indicators, peer reviews, surveys of client satisfaction, and analyses of outcome variables that might be theoretically or argumentatively linked to the possibility of future benefits. Judgments about a program using such catch-as-catch-can indicators may differ. Such indicators may be reasonable and adequate for the everyday management of a program, but they may not be sufficient to answer life-or-death challenges to the program in a political context.
The travails of Head Start are, of course, a perfect illustration of the challenges that confront a program supported by partial and argumentative indicators. Child-based results are inherently difficult for many early childhood programs because the important consequences the programs are intended to influence lie far in the future. Either we look at some short-term proxies for those distant consequences, or else we have to wait a long time before finding out whether or not the program has had an effect. When there are life-or-death issues of accountability (as, for example, when questions about Head Start's validity periodically surge forth in the Congress), we struggle with the choices.
SECTION 2 The State-of-the-Art of Readiness Testing
Since Head Start is generally understood to be a program that helps disadvantaged children do better in school, it would have seemed natural to use tests of school readiness as indicators of the program's effectiveness. Yet such tests have hardly been used in the great volume and variety of Head Start studies. Why would people not use a test so patently and obviously directed towards just what Head Start is supposed to bring about? The technical qualities of the tests are not very good, and this is a reflection of the fact that what the tests are intended to deal with is very poorly understood.
To begin with, traditional readiness tests are not very good in conventional psychometric terms. About a decade ago, in conjunction with my longstanding interest in developmental changes in children in the 5-7 age range, I looked at a number of the major commercial school readiness tests. I was interested in the possibility that there might be some hidden wisdom deep in their construction. Did readiness tests embody insights about the cognitive changes in children near the onset of schooling? Did the subtests of those instruments differentiate out theoretically interesting factors in children's cognitive development? I had to be interested in the tests' predictive power. Could the tests do what they were designed for, assess children's cognitive maturity?
The readiness tests did not look as though they contained much hidden wisdom. They looked like work-samples of things that children are asked to do in the early grades. But the readiness tests' statistics said the predictive power of the tests was so poor as to make theoretical questions about the instruments uninteresting. I did not pursue the analysis at that time, but it was the memory of it that led me to be personally quite skeptical when, a few years ago, I first confronted a declaration of Goal 1 of the Goals 2000 project.
Within the past few weeks, in preparation for today's meeting, I have taken a second look at the readiness and readiness-like tests now on the market. Table 1, which can be found at the end of this paper, gives some statistics on the tests. Fifteen readiness tests were looked at: the Woodcock-Johnson Psycho-Educational Test Battery, the Brigance Diagnostic Inventory of Basic Skills, the Howell Pre-Kindergarten Screening Test, the Brigance K and 1 Screen, the Daberon Screening for School Readiness, the Anton Brenner Developmental Gestalt Test of School Readiness, the Analysis of Readiness Skills, the Metropolitan Readiness Tests, the Clymer-Barrett Readiness Test, the Basic Skills Inventory, the Gesell School Readiness Test, the Cognitive Skills Assessment Battery, the Lollipop Test, the ABC Inventory to Determine Kindergarten and School Readiness, and the McCarthy Screening Test. Table 1 also gives the same information for a second set of eight tests of general intellectual development: the Boehm Test of Basic Concepts, the Bracken Basic Concept Scale, the CIRCUS, the Battelle Developmental Inventory, the DIAL, the WPPSI, the ABC Inventory, and the EARLY. Table 1 gives each test's declared purpose, the age range for which it is intended, and some statistics on reliability, concurrent validity, and predictive validity.
The statistics for this 1995 sample of readiness tests are, on the whole, slightly better than those I remember from the earlier group. Still, predictive validity information is missing or vague for many of the tests. Where the tests do predict school-age performances of children, they do not predict very far into the future, and they predict to psychometric instruments that are themselves of uncertain predictive power. Interestingly, only two of the tests (the Howell Pre-Kindergarten Screening Test and the ABC Inventory) were correlated with teachers' clinical estimations of whether their children were ready for school, with mixed results.
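White's point about predictive power can be made concrete. Predictive validity is conventionally summarized as the correlation between a readiness score and a later criterion, such as first-grade achievement. The following is a minimal sketch: the function name `pearson_r` and all scores are illustrative inventions, not data from the tests surveyed above.

```python
# Illustrative only: compute predictive validity as the Pearson
# correlation between (hypothetical) readiness scores and later
# (hypothetical) first-grade achievement scores.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical kindergarten readiness scores and later achievement scores.
readiness = [12, 15, 9, 20, 14, 17, 11, 18]
first_grade = [48, 55, 44, 60, 50, 52, 49, 58]

print(round(pearson_r(readiness, first_grade), 2))
```

An instrument whose correlation with later achievement hovers near zero, as White describes for many of these tests, says essentially nothing about which children will struggle in school.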
I have no desire to pass off this quick survey of the current readiness tests as a definitive study. It is not. But the quick survey gives a glimpse of the state-of-the-art of our contemporary capacity to build a strong school readiness test, and that survey is not encouraging about the prospects for building a nationwide school readiness screening instrument by the year 2000.
Of course, there are some good reasons for believing that if we give serious, sustained, deliberate effort to building new school readiness assessments, we can do better than those traditional instruments. We know more about the possibilities of testing and assessment, and we have recently begun to modify and diversify our century-old technology of testing. We know more about child development than we used to. And we have accumulated some greater understanding about when and how research data are brought into use in the policy process.
SECTION 3 Recent Changes in Our Understanding of Child Development
The fundamental reason why traditional readiness tests have not worked well is that we have never had a very clear idea what "readiness" is to begin with. The "readiness" issue arose together with practical efforts to achieve "readiness" testing in the late 1920s and early 1930s. Compulsory education laws were being passed in all the American states, and children were pouring into the schools. Some children were visibly less ready to do business in the first grade; teachers could see that. But there was no literature in the 1920s, and there is no literature now, to discuss exactly where or how "readiness" might be constituted in a child.
Two other conceptions that are much like "readiness" exist in the child development literature: Binet and Simon's turn-of-the-century conception of "mental age," built into our contemporary practices of mental testing, and Piaget's notion of "cognitive stages" in childhood, built into many of the "developmentally appropriate" preschool curricula of the present. The Binet-Simon "measuring scale of intelligence" was designed to see if children entering school could profit from regular instruction. Binet and Simon arranged a series of tasks and performances to form an age-scale, and from the score a child obtained on their series they computed a "mental age." Although the Binet-Simon testing procedure has been all but buried under a cloud of subsequent psychometric technology and ideological legendry, the test remains at bottom an instrument for deciding how old a child is mentally. Presumably, there is a uniform path of mental development followed by all children. The goal of the test is to find out how far along the Binet-Simon path the child stands.
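The age-scale logic White describes can be sketched roughly. This is a simplified partial-credit convention of the kind used in later revisions of such scales, not Binet and Simon's exact procedure, and the assumption of five items per age level is hypothetical.

```python
# Simplified age-scale scoring (hypothetical convention, for illustration):
# the "basal age" is the highest age level at which the child passes every
# item; each item passed above that level earns a fraction of a year.

def mental_age(basal_age, items_passed_above, items_per_level=5):
    """Return mental age in years under the simplified convention above."""
    # With 5 items per level, each extra passed item is worth 12/5 = 2.4 months.
    months_credit = items_passed_above * (12 / items_per_level)
    return basal_age + months_credit / 12

# A child with a basal age of 5 who passes 7 scattered items at higher levels:
print(round(mental_age(5, 7), 2))
```

The single number such a scale returns is the point of the exercise: it locates every child on one presumed path of mental growth, which is exactly the uniformitarian assumption White goes on to question.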
A not-dissimilar uniformitarian view of a child's cognitive development came to life in the 1960s, through the enormous influence Jean Piaget had on American developmental psychologists at that time. Many American developmental psychologists who thought about designing and evaluating programs for children in developmental terms conceived of that development as it was construed by Piaget's theory. Child development was cognitive development. All children go through a uniform series of stages of cognitive development, and the important difference between one child and another was the question of how far along Piaget's path each child was. Programs for poor children, it was said, should "close the gap." Some American research on Piagetian theory dwelt on what Piaget called "the American question," the question of whether one could or could not accelerate the movement of the growing child along Piaget's path.
Piagetian theory is on the wane now, and we have come to believe that there is more to the small child's life than marching from one Piagetian stage to another. There is motor development, social and emotional development, the organization of language, the building of general knowledge, and the building of metacognitive insights and strategies. One of the heartening things about the contemporary literature on child development is that all these aspects of child development are being actively studied. I believe we know far more about children's development than we have so far "harvested" for the design of programs and assessment instruments for children. We can make a richer world of programs for children and families with such knowledge.
But we have to proceed with care and with thoughtful and sophisticated efforts towards instrument development. Some of the simplifying assumptions of the past are now slipping away from us: the notion that all there is to child development is cognitive development, and the uniformitarian notion, the idea that all children pursue a common path towards adulthood.
Once upon a time, the heart and soul of early education was the celebration of the diversity of small children. When the Froebelian Kindergartens first came over to the United States, the women who worked in them called themselves "child gardeners." Children differ. One is a tomato, another a carrot, a third a sweetpea, a fourth a cucumber. The task of the child-gardener is to study each child, to see the way it grows and what it needs, and then to offer that child the support or guidance best suited to his or her needs. The labor-intensive vision of how the adult should deal with her small children in a 19th-century Froebelian kindergarten is far behind us now. Our children go to modern kindergartens and grade school classes in which they pursue common schooling: the "basics" of literacy and numeracy we expect will be given to all children. We like uniformitarian visions of the basic processes of child development because they dovetail nicely with the standardization of our classrooms and our expectations. But the reality that every teacher and parent knows is that children are different from one another. If we are going to assess children's readiness in a broad way, we are going to have to address those differences in some meaningful way.
Consider social development, which everyone now agrees is an important aspect of what happens to children in preschools and schools. What if one shape or form of social development that is true for all children does not exist? Children behave differently in social situations. Some are bold, some are shy, some are friendly, some are reserved, some are garrulous, some are taciturn. Most children in preschools and schools manage to solve the problem of finding a comfortable social existence among their peers. They catch hold of some sector of small-fry society, but they do not all do so in the same way. Children are very much like adults in this regard. It is not clear which uniform criteria of social "readiness" for schooling one can apply to all children, boys and girls, who are members of one cultural community. It is even less clear which criteria of social development should be applied to children of different cultural communities in American society.
If we are going to set forth readiness tests for schools that examine a broad spectrum of children's functions and capabilities, I believe we are going to have to confront and deal with the non-standard, idiosyncratic aspects of individual children's development. The community of developmental psychologists now includes strong groups of researchers addressing each of the several major streams of small children's development, and I suspect we can work with such researchers to gradually begin to develop possibilities for broad-scale examinations of children's capabilities and competence in several areas of development. But I see no quick way to carry out this process, and I am quite certain that we cannot smash-and-grab our way past the necessity for entering into it. The research and development processes that will be entailed will extend considerably past the year 2000.
Politicians like to set forth lofty and heroic goals. It is in the nature of leadership that they do so; 'impossible' goals get people's attention, mobilize them, and surprisingly often turn out to be possible after all. Furthermore, the life of high officials nowadays tends to be dull, nasty, brutish, and short. One reason why projects on behalf of children tend to be framed in impossibly short periods of time is the brief time officials in power have to act. Understanding all that, I am still not persuaded that we can or should try to establish a nationwide system of readiness assessment by the year 2000. We have had enough 20th-century experience with the development and use of psychoeducational tests to know that testing is a double-edged sword. It can hurt as well as help.
Tests that teachers and administrators do not respect can be more or less politely subverted or evaded. There can be minor forms of fraud, as when 50 American states all report that their children are above average on the school achievement tests: the famous "Lake Wobegon Effect." Tests can control and limit what teachers do, as when teachers in many American classrooms set aside their best professional judgment and "teach to the test". And some psychoeducational tests, notoriously the descendants of the Binet-Simon instrument for determining children's mental age, can become significant instruments for inter-ethnic politics and ideology. Our experience with psychological test usage to date suggests that we have every reason to be slow and careful in our development of future instruments.
SECTION 4 Recent Changes in Testing

I am optimistic about the possibilities of more sophisticated assessments of children's readiness, given time. A reasonably restrained process of test development can do much to move us towards a greater ability to use child-based results for such purposes as program development and evaluation, student assessments, teacher guidance, policy determination, and a variety of other practical uses. I began my talk today with a consideration of the state of the art of commercial tests for assessing children's readiness. In closing, it might be useful to consider again briefly what is happening in the world of commercial testing. The psychoeducational enterprise traditionally known as "tests and measures" is undergoing a great deal of change today, after a good many decades of being surprisingly stable and resistant to change. The more important changes of the present are the following:
Different Kinds of Testing Are Being Developed for Different Purposes

Traditional psychological tests have always been a little like Henry Ford's Model T. Ford would sell you any color car you wanted so long as it was black. Traditional psychoeducational tests could be used for any purpose that you wanted as long as you used a standardized, forced-choice, norm-referenced instrument. But it is not at all clear that one kind of test is maximally useful for purposes of federal- or state-level accountability, providing guidance to a teacher in his or her classroom, diagnosis of an individual student's strengths or weaknesses, or assessing the pros and cons of a programmatic innovation. Today, we are seeing a differentiation of forms of psychoeducational testing, as differing instruments are being created for different audiences and purposes.
New Technologies of Testing Are Coming Into Use

The differentiation of new forms of testing is being furthered by the emergence of new technologies of testing. The chief modality for psychoeducational testing still seems to be, at the moment, the multiple-choice series of questions addressed with a Number 2 pencil. But a variety of more complex forms of testing are coming forth, ranging from constructed-response testing implemented by paper and pencil or by computer to the evaluation of student work using portfolios or performance observational schemes. In general, the new forms of testing are more expensive to administer and score, but they yield much richer and more complex information about the individuals being tested. We can expect that they will play a substantial role in future early childhood programs.
Testing and Teaching Have Begun to Merge

As tests have become more complex, naturalistic, and 'authentic', they have begun to look more and more like extensions of the ordinary learning activities of children in their schoolrooms. Two interesting things have begun to happen. First, teachers have begun to see the tests as interesting commentaries not only upon the child and his or her capabilities, but upon the substance and methods employed by the teacher. The new modalities encourage reflective teaching. And, more and more, it seems likely that they can become part and parcel of the educational process itself. Instead of existing as a diversion or timeout from school work, the new tests become simply an enriched source of feedback coming out of school activities, available to children and teachers who participate in those ongoing activities.
SECTION 5 Enriching the Mixture of Child-Based Results
The use of child-based results is not an all-or-none thing in program assessment. We can imagine a future development process in which assessments of programs in early childhood can be progressively enriched through the use of more and more child-based results as we develop the capability to envisage them in a reliable and credible way.
We usually think about accountability in top-down terms, because 20th-century discussions of evaluation and accountability have typically arisen within the context of federally managed programs. Higher-order management, providing resources for the early childhood program and answerable for the program in the larger web of government, has to judge what the program is achieving.¹ The program is evaluated, by one means or another. But higher-order management is only one of the parties with an interest in a human services program. Individuals working within the program have an interest, and a real need, to know whether the program is or is not attaining meaningful results. Clients have such an interest; if the program's clients are children, then parents are involved. Professionals and program managers working in the community served by the program have a need to estimate what the program is achieving, in order to come to terms with the possibilities of cooperation or competition with the program.
I believe we should move now to capitalize on the possibilities opened up to us by our studies of child development, by the opening up of new forms of testing and assessment, and by our growing understanding of where and how information is used in program guidance and policy formation. We can do this, if we are able and willing to submit to slow, reflective, collaborative processes of instrument development.
1. To simplify discussion, I am here assuming that a government program has one and only one purpose, well understood by all parties and agreed to by them. But this is not true for a good many government programs that, in the political process, are put forward by coalitions of parties who are directed towards a variety of purposes. Head Start, for example, is generally understood to be a child development program. Historically, however, Head Start was put together by parties with declared major or minor interests in: (a) Civil Rights; (b) community action; (c) the coordination of services for children; and (d) stimulating school reform. Many programs in the human services reflect such coalitions of interests and emphases.
TABLE 1
TECHNICAL ASPECTS OF SCHOOL READINESS TESTS

Lollipop Test (1981)
  Age: Pre-K to 1st
  Reliability: KR20 for whole test = .90
  Concurrent validity: .58 w/ teacher ratings; .86 w/ MRT
  Predictive validity: Lollipop given at end of K; MRT after 1st = .73; MRT after 4th = .40

Brigance K and 1 Screen (1986)
  Age: K to 1st

Boehm Test of Basic Concepts, R. (1986)
  Age: K to 2nd
  Test-retest: K = .88; 1st = .55; 2nd = .66
  Reliability: .62 to .82
  Concurrent validity: .60 w/ PPVT
  Predictive validity: .24 to .64 w/ Comprehensive Test of Basic Skills, California Achievement Test, Iowa Test of Basic Skills

Bracken Basic Concept Scale (1984)
  Age: 2-6 to 7-11
  Reliability: .97 for total test
  Test-retest: .76 to .80
  Concurrent validity: .68 to .88 w/ PPVT, Boehm, MRT, and Token Test

McCarthy Screening Test (1978)
  Age: 2-6 to 8-6
  Test-retest: .32 to .69
  Reliability: .41 to .80
  Concurrent validity: .66 w/ Peabody Individual Achievement Test

Battelle Developmental Inventory (1984)
  Age: 0-6 to 8-0
  Test-retest: > .90
  Inter-rater: > .90
  Concurrent validity: .81 to .95
  Predictive validity: .41 to .60 w/ Stanford-Binet
  References: T.C. II, 72-82

Gesell School Readiness Test
  Age: 4-6 to 9
  Test-retest: .79
  Inter-rater: .87
  Reliability: .84
  Concurrent validity: 83% w/ teacher ratings; .64 w/ Piagetian test battery; .50 w/ Thorndike IQs; .61 w/ Thorndike Mental Ages
  Predictive validity: Gesell in K; .64 w/ Stanford A.T. in 1st

Metropolitan Readiness Tests (1986)
  Age: PreK to 1st
  Test-retest: .62 to .92
  Reliability: .66 to .93
  Predictive validity: Took MRT; 6 months later, .34 to .65 w/ Metropolitan Achievement Test; .47 to .83 w/ Stanford A.T.

Wechsler Preschool and Primary Scale of Intelligence (1967)
  Age: 4 to 6-6
  Test-retest: .86 to .92
  Reliability: .77 to .96
  Concurrent validity: .75 w/ Stanford-Binet; .58 w/ PPVT; .64 w/ Pictorial Test of Intelligence

Wechsler Revised (1989)
  Age: 2-4 to 7-3
  Test-retest: .81
  Reliability: .91 to .96
  Concurrent validity: .74 to .90 w/ WISC-R, Stanford-Binet, and McCarthy

Developmental Indicators for the Assessment of Learning, R. (DIAL) (1984)
  Age: 2 to 6
  Test-retest: .76 to .90
  Inter-rater: .96
  Concurrent validity: .40 w/ Stanford-Binet

Basic School Skills Inventory (1983)
  Age: 4-0 to 7-5
  Reliability: .88 to .92
  Concurrent validity: .22 to .43 w/ teacher ratings
  References: T.C. IV, 68-75

Chicago Early Assessment and Remediation Laboratory (EARLY) (1984)
  Age: 3-0 to 6-0
  Test-retest: .72 to .91
  Inter-rater: .89
  References: ERIC ED 204372

CIRCUS (1979)
  Age: PreK to 1st
  Reliability: .74 to .89
  References: T.C. VII, 102-109

Cognitive Skills Assessment Battery
  Age: PreK to K
  Reliability: .80
  References: T.C. VII, 126-139

Developing Skills Checklist (1990)
  Age: PreK to K; 4 to 6
  Reliability and validity: N/A yet
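The coefficients reported in Table 1 are standard psychometric statistics. As a purely illustrative aside (not part of the original report, and using invented data), two of them can be sketched in a few lines of Python: KR-20, the internal-consistency coefficient reported for the Lollipop Test, and the Pearson correlation, which underlies the test-retest and validity columns.

```python
# Illustrative sketch only: how two reliability statistics from Table 1
# are computed. All data passed to these functions are invented examples.
import statistics

def kr20(item_responses):
    """Kuder-Richardson formula 20: internal consistency of a test
    scored dichotomously (0/1). Rows = examinees, columns = items."""
    k = len(item_responses[0])                     # number of items
    n = len(item_responses)                        # number of examinees
    totals = [sum(row) for row in item_responses]  # total score per examinee
    total_var = statistics.pvariance(totals)       # variance of total scores
    # Sum over items of p*q, where p = proportion passing the item
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_responses) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / total_var)

def pearson_r(x, y):
    """Test-retest reliability (and concurrent/predictive validity) is
    the Pearson correlation between two sets of scores for the same
    examinees, e.g., two administrations of the same test."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den
```

For example, a perfectly consistent pair of administrations yields `pearson_r` of 1.0, and a test whose items largely rise and fall together yields a KR-20 near the .90 reported for the Lollipop Test.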
Appendix D: Judging Interventions by Their Results *
Lisbeth B. Schorr
Harvard University Working Group on Early Life
Paper presented at the
Issues Forum on Child-Based Results
New York City
Section 1: Introduction

I have been asked to discuss how child-based results in early childhood are related to the broader political context in which the shift to results accountability is currently occurring. That political context is defined, in my view, first, by an increasing concern about the state of America's children and families, particularly about escalating rates of violence among ever younger children, of children bearing children, and of youngsters coming of age without the skills or motivation to earn a decent living. Secondly, it is defined by a growing sense that nothing systematic can be done, except, perhaps, for harsh punitive measures, to reverse these trends.
I believe that a shift toward results accountability is an essential strategy in efforts to build on new research and experience to improve results for children and families. Results accountability is not a panacea, but it could be a major step toward improving the conditions in which children grow into adulthood. It could also become a central strategy in a concerted effort to improve results for children growing up in high-risk environments.
BACKGROUND
While much of the current discussion about whether anything works, whether government does anything right, and whether the government that governs least also governs best, is pure rhetoric, the rhetoric hides some real issues that need urgently to be addressed. Among them is how to improve our ability to differentiate what works from what does not. Legislators have to know what works when voting on laws and appropriations, parents want to know whether their child's school is providing an effective education, foundations have to know whether they are supporting a promising strategy, and voters who have given up on compassion want to know what is a good investment.
Until recently, anyone who wanted to know whether tax or philanthropic dollars were being spent for a good purpose was offered one of three unsatisfactory responses: The most traditional response has been to bypass the problems of obtaining information about results and to assume that what mattered were intentions and efforts, institutions and services, resources and spending (Manno, 1994). The family service agency was doing its job if its budget was increasing and its monthly parent education sessions were attended by a specified number of people and were under the supervision of a certified social worker.
A second, more recent, response has been to say that if you really want to know what is working, you have to privatize the function: let the marketplace become the judge of effectiveness, by shifting the school or day care or recreation or mental health program out of the public or non-profit sectors. "Shift the burden of evaluation from the shoulders of professional evaluators to the shoulders of clients, and let them vote with their feet," advises UCLA professor James Q. Wilson (cited in DiIulio, 1994, p. 58). And that is how, presumably, we know preschool programs work: middle-class parents spend money on them.
In a third response, providers of health, education, and social services say "Trust us. What we do is so complex, so hard to document, so hard to judge, and so valuable, in addition to which we are so well intentioned, that you, the public, should support us and our programs without asking for evidence of effectiveness. Don't let the bean counters who record the cost of everything and know the value of nothing interfere with our valiant efforts to get the world's work done."
Since the mid-1980s, in the face of ever-increasing skepticism about the value of public investments in any human services, a fourth answer has emerged. A new breed of social reformer is contending that public support for social investments would be greatly strengthened if citizens, taxpayers, customers, clients, and communities were able to hold the providers of services, supports, and education accountable for achieving the results that citizens value. Many reformers are also coming to see results-based accountability as an important way of increasing program effectiveness by freeing human services from the straitjackets of rigid rules.
Section 2: The Problems That Results-Based Accountability Can Solve
GIVE THE PUBLIC SOME PROOF OF RESULTS
Large numbers of US citizens have a deep sense that they are not getting their money's worth from their governments. The 1995 confirmation hearings on the nomination of Dr. Henry Foster to become Surgeon General featured lengthy and often confused exchanges on the impact of "I Have a Future," the teenage pregnancy prevention program which Dr. Foster founded in Nashville, Tennessee. After much discussion about the meaning of several program evaluations, Senator James Jeffords of Vermont finally stated, in some exasperation, "We're fooling ourselves to think these programs are good because they feel good, when the evidence of impact isn't there." Senator Jeffords is not alone in his sense of frustration. In fact he speaks for an increasing number of government officials and citizens whose faith in social programs will not be restored until they know what, exactly, they are getting for their money, be it tax money or large-scale philanthropy.
Paying attention to results rather than inputs is central to the reinventing government proposals of David Osborne and Vice President Al Gore. But they were hardly the first to preach the results gospel. Two decades before she became head of the Office of Management and Budget in the Clinton Administration, Alice Rivlin was calling for better measures to assess the success of social programs, because public concern with ineffectiveness of human services was running "very high indeed" (Rivlin, 1971, p. 65). She concluded that "all the likely scenarios for improving the effectiveness of education, health, and other social services dramatize the need for better (outcome) measures. No matter who makes the decisions, effective functioning of the system depends on measures of achievement. ... To do better, we must have a way of distinguishing better from worse" (Rivlin, 1971, pp. 140-41, 144).
FREE HUMAN SERVICE PROGRAMS FROM THE STRAITJACKETS OF CENTRALIZED MICROMANAGEMENT AND RIGID REGULATION
Management by results is the best alternative to the top-down, centralized micromanagement that holds people responsible for adhering to rules that are so detailed that they make it impossible for a program or institution to respond to a wide range of urgent needs.
Whereas the bureaucratic paradigm assumes that control can only be exercised by rules, an outcome-oriented organization substitutes "adherence to norms" to fulfill the same function (Barzelay, 1992, pp. 124-5). A commitment to results is essential to this shift, because a clear understanding about purposes and desired results is the basis on which employees will take responsibility for adhering to norms, and will channel their energies into making appropriate adaptations and solving problems. Employee performance improves when employees feel accountable because they believe that their intended work results are consequential for other people (Barzelay, 1992). A results orientation also encourages staff to think less categorically as they become more aware of the connection between what they do and the results they seek.
ENHANCE SOCIETAL AND COMMUNITY CAPACITY TO BE MORE PLANFUL AND MINIMIZE INVESTMENT IN ACTIVITIES THAT DO NOT CONTRIBUTE TO IMPROVED RESULTS
Agreement on a common set of goals and outcome measures makes collaboration easier, and also fuels the momentum for change and helps promote a community-wide "culture of responsibility" for children and families.
Reflecting Alice in Wonderland's insight that if you do not know where you are going, any road will get you there, a focus on results is likely to discourage expenditures of energy, political capital, and funds on empty organizational changes and on ineffective services. The shared commitment to improve results for children is what can make efforts at collaboration and service integration fall into place, not as an end, but as an essential means of working together toward improved results.
FOCUS ATTENTION ON WHETHER INVESTMENTS ARE ADEQUATE TO ACHIEVE THE PROJECTED RESULTS

The new conversation about results may have its most profound effect by injecting a strengthened ethical core into human service systems that currently focus more attention on the fate of agencies and programs than on whether people are actually being helped. The new results focus promises (or threatens, in the eyes of some) to end a conspiracy of silence between funders and program people by exposing the sham in which human service providers, educators, and community organizations are consistently asked to accomplish massive tasks with inadequate resources and inadequate tools. Attention to results forces the question of whether outcome expectations must be scaled down, or interventions and investments scaled up to achieve their intended purpose.
In the past, parent education programs have been funded with the vague expectation that they would somehow reduce the incidence of child abuse, although a few didactic classes have never been shown to change parenting practices among parents at risk of child abuse. Similarly, outreach programs to get pregnant women into prenatal care are expected to reduce the incidence of low birthweight, based on the similarly vague belief that outreach programs are a good thing, without any knowledge of whether the prenatal care that is made more accessible actually provides the services that could be expected to result in a greater number of healthy births.
Especially in circumstances where it will take a critical mass of high quality, comprehensive, intensive, interactive interventions to change results, where effective interventions must be able to impact even widespread despair, hopelessness, and social isolation, funders and program people should resist the temptation to obscure the limitations of so many current efforts. Providers, and even reformers, who are asked to achieve grand results with interventions so paltry that they are in no way commensurate to the task should not obscure the insufficiency of the investment by pleading with funders and evaluators to just document their efforts and not their results because it would not be fair to hold them accountable for real results changes when they are doing the best they can. Evidence that a diluted form of a previously successful intervention is not making an impact is not an argument against results-based accountability. It helps to clarify that dilution regularly transforms effective model efforts into ineffective replications. Recognition that a single circumscribed intervention may not be sufficient to change results is not an argument against results-based accountability. It is an argument for adequate funding of a combined critical mass of promising interventions.
SECTION 3: RESULTS-BASED ACCOUNTABILITY: A FAUSTIAN BARGAIN?
Critics of the push toward results accountability range from the skeptical to the appalled. Commenting on pressures to incorporate results accountability in early childhood programs, Sue Bredekamp, of the National Association for the Education of Young Children, says it is but "one more opportunity and justification to 'blame the victims,' because the children who are in greatest need of services demonstrate the poorest results" (Bredekamp, 1995).
Skeptics see the willingness to be held accountable for achieving specified results as a Faustian bargain, even when they agree that a shift toward results accountability has the potential to solve a lot of serious problems. They believe that human service providers, in their eagerness to obtain more funding and to escape over-regulation, will become unwitting tools in the war against government, the war against the vulnerable, and the war against all public sector activities that are grounded in considerations of morality, ethics, and social justice.
FEARS OF RESULTS ACCOUNTABILITY
Those who resist the push to results accountability have at least six specific fears:
First, knowing that what gets measured gets done, they fear that programs will be distorted. What will get done will be what is easiest to measure and has the most rapid pay-off, rather than what is really important. They point out that most communities have aspirations for their children that greatly exceed the results that are currently measurable, especially when the demand is for quick evidence of success. Will community health clinics raise immunization rates at the cost of cutting back on other kinds of well-child care, or support for chronically ill children? Will preschool programs deprive children of the opportunities for play that stimulates creativity and teaches empathy in order to reserve more time for the flash cards whose mastery shows up on "school readiness" tests?
A second fear is that even effective programs will seem to be accomplishing less than they actually are. If an inner-city consortium gets funding to improve the employability of youngsters coming out of high school, will it be judged a failure if the predictions that produced the funding turn out to be overly optimistic? And will it be judged a failure if results do not change as quickly as the funders had hoped? Will the consortium be held responsible for achieving city-wide improvements in results even though it was able to work with only 150 youngsters and their families?
A third fear arises from the recognition of the complexity of the most promising interventions, and the corollary that in the complex, interactive strategies that are most promising, responsibility for both progress and failure cannot be accurately ascribed. No single agency, acting alone, can achieve most of the significant results. If higher rates of children ready for school depend on the effective contributions of the health system, family support centers, high quality child care, nutrition programs, and Head Start, as well as on informal supports and community activities, will agency accountability be weakened as attention shifts to communitywide accountability efforts? How are agencies to be held accountable for results over which no single agency has control?
A fifth fear is that results accountability will become a shield behind which the few remaining protections and supports for vulnerable children, youth, and families will be destroyed. Especially in this anti-regulation era, rock-bottom safeguards against fraud, abuse, poor services, and discrimination based on race, gender, disability, or ethnic background could be destroyed. The new results orientation could lead to the abandonment of the input and process regulations that now restrict the arbitrary exercise of front-line discretion by powerful institutions against the interests of powerless clients.
A sixth fear is that the results-based accountability that is intended initially to serve such benign functions as creating pressures for reform and sharpening the focus of managers and practitioners on accomplishing their mission rather than preserving their turf will soon lead to such hard-edged consequences as results-based budgeting. The actual allocation of funding based on a program's or agency's performance could ultimately threaten to shut down programs and agencies that cannot provide evidence of their contribution to achieving agreed-upon results, despite the fact that they may be making such a contribution.
Section 4: The Special Case of School Readiness

The first of the national education goals, agreed to initially by the nation's governors and President George Bush, and subsequently endorsed by both President Bill Clinton and the U.S. Congress, was that by the year 2000 all children will start school ready to learn. The importance of this goal is not disputed. Recent research has made it quite clear that brain development occurs earlier, more rapidly, is more vulnerable to environmental influence, and is more lasting than had previously been suspected (Carnegie Task Force on Meeting the Needs of Young Children, 1994). Young children's experiences between birth and school lay the foundations for success in school and in life (National Research Council, 1991). There is evidence that children without a good foundation do not do well at the beginning of school and will not be able to catch up. In addition, the proportion of children in a kindergarten or first grade class who are ready for school learning is a powerful predictor of how much learning will go on in that class.
Despite the unanimity around the importance of school readiness, the prospect of measuring it has aroused deep passions and controversy. This may be because the welcome recognition among policy makers and the public that the early childhood years are such a critical time has led to decidedly unwelcome consequences: the use of standardized tests to make high-stakes decisions about individual children, including labeling some unready for school entry and placing others in special tracks. The result has been that a high proportion of children have a damaging "failure" experience at the very beginning of their academic careers, and that some preschool programs have been driven to "teaching test-taking skills, concentrating on rote learning, memorization, and drill ... rather than on the exploration and experimentation and grasp of basic concepts that are key to later learning ... and that foster confidence, curiosity, and problem solving" (National Research Council, 1991, pp. 2, 10).
These developments have left the early childhood community so traumatized at the possibility of unwittingly promoting further inappropriate testing that many oppose any attempt to assess school readiness by testing or observing individual children, even if the testing is done for the purpose of judging the community's provisions for preparing children for school entry, and not the abilities or capacities of individual children.
There is little dispute about the elements of early experi-ence that contribute to school readiness. They include goodhealth care and nutrition, high quality child care andpreschool experiences, communities that support families,and homes that provide children with the conditions thatdevelop trust, curiosity, self-regulation, the foundations of
literacy and numeracy, and social competence (Action Team on School Readiness, 1992; Boyer, 1991; National Task Force on School Readiness, 1991; Zill, 1995). One school of thought concludes that we know enough about what communities have to do to produce these results that it is possible to measure community capacity to assure school readiness as a proxy for measuring children's school readiness directly. The National Governors' Association, for example, has proposed a list of community capacity indicators for this purpose.
My own view is that community capacity indicators would serve as a reliable proxy only if we knew more than we now do about the precise linkages between inputs and results: between, say, home visiting and infant health, between family support centers and parental competence, or between the elements of good child care and the development of curiosity in toddlers.
However, given the current state of knowledge, measures of community capacity will be informative but not definitive in assessing progress toward universal school readiness. While it is absolutely clear that the extent of school readiness depends on the existence, accessibility, and quality of an array of services, supports, and institutions, it can probably best be discerned by looking directly at samples of children.
The question then becomes: can children's school readiness be determined without doing them any harm? Can it be done in ways that would make it impossible to label or stigmatize individual children? Can it be done in ways that would strengthen preschool programs rather than distort them into "teaching to the test"? Can it be done in ways that "acknowledge the fluid and cumulative nature of development" and that do not result in "blaming children and families for low levels of early learning" (Phillips & Love, 1994)? Can it be done in ways that make clear that if large numbers of children are not ready for school, that is not a child problem but a community problem?
A great deal of work, such as that led by John Love at Mathematica Policy Research, suggests that these questions can now be answered affirmatively (Love, 1995). But a clear consensus in the early childhood community has not yet emerged around this proposition. It may be that assessments of school readiness in the most immediate future will have to rely on approaches that combine observations of samples of children with measures of community capacity, as proposed by Yale Professor Sharon Lynn Kagan in the most recent National Governors' Association Issues Brief (Kagan, 1995).
Section 5: Choosing the Right Results

When communities or states (or the nation, in the case of the education goals) actually agree on results that all the stakeholders consider important and meaningful, a lot of
other things fall into place. Results that have been agreed upon by professionals and clients and other interested parties can become a solid foundation on which new strategies can be hammered out, and flexible responses can be adapted and evolved on the basis of continuing feedback. With results as constants, we can afford to experiment and even disagree over the means: whether and for whom home visiting is more effective than family support centers; whether parent involvement is best achieved by helping parents to read to their children or to help them with their school work, through parent employment as classroom aides, or through parent participation in governance; whether children best learn to read using phonics, whole language, or some other method. By being moored to the ends, it is possible to stay flexible on the means.
MEASURING SUCCESS
Of course, if results are to be the constants, the selection of outcome measures becomes enormously important. For better or worse, what gets measured has a great effect on what gets done. In one way or another, teachers teach to the test, just as social workers pay attention to what the auditors count. So the trick is to devise measures that come as close as possible to actually reflecting what ought to get done. If you want children to learn to reason in math, you go beyond multiplication tables in assessing their performance. If you want children whose eyesight is defective to be treated, you measure the absence of untreated vision defects among first graders, not the number of vision screenings that have been done.
In addition to all the technical considerations that determine the choice of outcome measures (Love, 1995; Moore, 1994), the greatest stumbling block for those actually engaged in moving to outcome accountability has been the difficulty of forging agreement on a set of results considered important, meaningful, and measurable by a wide range of stakeholders, including skeptics. This involves the resolution of two major tensions: (a) between the implicit and explicit purposes of the efforts, and (b) between all that the community wants from the effort and what can be agreed upon and measured.
We have been more opportunistic than planful in this country about measuring success. We grasp for what is easiest to measure rather than what best reflects a program's purpose. Thus the Westinghouse Learning Corporation, under a government contract to assess the effects of the earliest years of Head Start, measured only changes in the IQ of participating children, even though IQ is one of the least malleable of human characteristics, and despite the fact that Head Start aimed to improve the health and nutrition and social skills of participating children, to empower their families, strengthen their communities, and in many other
ways to contribute to their school readiness. Some thirty years later, IQ is still the outcome that gets assessed, because it is such a handy measure. Even today family support programs are evaluated on the basis of their effect on the IQ of participating children.
Hard-edged thinking about purposes and results means asking anew about what is worth doing, and to what end. The results around which it seems to be easiest to get agreement are those around which judgments are seen as least subjective, and where most people agree on the desirable direction of change, even if people do not agree on what they would give up to achieve such change, how to achieve it, or which results are most important (Rivlin, 1971).
But it gets harder once we get beyond a few outcome measures that are readily agreed upon. It means becoming explicit about everything, including multiple or even conflicting goals. The debate has to include the question of whether the provision of jobs for local residents is among the purposes of Head Start, inner city hospitals, schools, child care, or construction projects. Hard-edged thinking about results means acknowledging that while full-day Head Start is likely to pay off in higher rates of school readiness, part of its impact lies in its ability to provide jobs to neighborhood residents, to decrease the social isolation of Head Start parents, and to increase the number of adults who are confident that their children are well taken care of and are therefore better able to pursue job training or employment.
Efforts to define results must also resolve the tensions between all that the community wants from the initiative and what can be agreed upon and measured. These tensions are especially troublesome around results in early childhood and adolescence.
Many leaders in early education, whose hearts and souls are committed to celebrating the diversity of small children (White, 1995), are convinced that whatever outcome measures are selected to document school readiness, they will "mislabel, miscategorize, and stigmatize children" by measuring only narrow cognitive development (Kagan, 1995), and will result in narrow, standardized, cognitively oriented preschool programming.
Having observed many recent efforts to select outcome measures in a wide variety of contexts, I have become convinced that these tensions will be resolved through increasing recognition of three fundamental points:
Communities, and certainly parents, have goals for their children that are more ambitious, more differentiated, and more nuanced than the results that can be agreed upon and measured.
It is possible to maintain a strong commitment to a programmatic orientation that is ambitious, differentiated, and nuanced, while being held accountable for the accomplishment of more modest results.
Results selected for accountability purposes must be persuasive to skeptics, not just to partisans of the programs and policies being assessed.
AMBITIOUS GOALS AND MEASURABLE RESULTS
When communities, parents, practitioners, policy makers, and advocates are asked about their goals and their vision for their children, they talk about wanting all children to grow up in loving, nurturing, and protective families, to be connected to those around them, and to achieve their personal, social, and vocational potential. They talk about wanting youngsters to feel safe, to have a sense of self-worth, a sense of mastery, a sense of belonging, and a sense of personal efficacy, to be socially, academically, and culturally competent, and to have the skills needed for productive employment.
Such goals can become a framework within which outcome measures can be selected for accountability purposes, with the understanding that only some aspects of these goals can currently be measured with widely available data and with outcome measures around which it is possible to gain widespread agreement. There is a direct connection between these goals and the outcome measures used for accountability purposes, but goals and outcome measures serve different purposes. Goals represent what the community is striving for. Outcome measures represent what the community will be held accountable for, by public and private funders and perhaps by higher levels of government. The goals can be general, but the outcome measures must be so specific, the public stake in their attainment so clear, and their validity and reliability so well established, that the community would ultimately be willing to see rewards and penalties, as well as resource allocation decisions, attached to their achievement.
A commitment to more visionary goals is entirely compatible with a commitment to documenting progress toward the achievement of those goals by the use of more modest outcome measures. Of course health is more than the absence of disease, educational attainment is more than not dropping out, nurturing family life is more than the absence of abuse and neglect, economic well-being is more than living above the poverty line, and a thriving community is more than an absence of boarded-up houses and open drug markets (Brandon, 1992). The attainment of modest but measurable results would signify substantial progress toward more ambitious goals.
AMBITIOUS PROGRAMMATIC FOCUS AND MEASURABLE RESULTS
Accountability systems that rely on results that are easily measured and are persuasive in a public policy context do not preclude a much broader programmatic agenda. Strategies to achieve measurable results must of course focus on
child's play as part of child care, on social skills as well as cognitive skills among preschoolers, on youth opportunities as part of youth development, and on caring and connectedness as part of community building.
RESULTS THAT ARE PERSUASIVE TO SKEPTICS, AND THE SAGA OF EDUCATION STANDARDS
The most compelling lesson about the importance of selecting results that are persuasive to skeptics comes out of the recent experience with national efforts to agree on education standards.
In elementary and secondary education, the shift in judging quality from the amount of per-pupil spending to an approach that asks what children are learning has been occurring rapidly and tumultuously. Outcome standards for student learning were blessed in 1989 by the nation's governors at the Charlottesville education summit called by President Bush, and were embraced by many educators, reformers, and advocates as a way of advancing the twin goals of educational excellence and equity (Manno, 1994).
Despite the fact that there was little agreement on how student achievement should be assessed, the Education Commission of the States reported that by 1993 states had developed or implemented outcome-based approaches to education (Manno, 1994). This relatively smooth progression did not last long. By February of 1995, Chester Finn, an early promoter of national education standards, entertained a Brookings Institution conference with a talk entitled "The short, unhappy life of national standards." He declared national education standards dead for the foreseeable future. Whether or not his prediction turns out to be correct, the process by which education standards were so quickly and perhaps fatally wounded is one from which those contemplating results-oriented accountability in other arenas have much to learn.
The most fundamental strategic error on the part of the proponents of results standards, in my view, was overreaching. They failed to make the distinction between what they wanted for their children and what they wanted schools, teachers, and their children to be held accountable for.
Proposed draft standards included such items as "All students understand and appreciate their worth as unique and capable individuals and exhibit self-esteem; all students act through a desire to succeed rather than a fear of failure while recognizing that failure is a part of everyone's experiences" (Manno, 1994). From the perspective of a parent, these were reasonable objectives. But whether these should be the objectives of schools or other public institutions, and whether they could be agreed on and measured, was another question.
Whether the recent push to introduce results standards in education will end up being an impetus or an impediment to reform is not yet known. What is known is that in the results selection process, it is lethal to ignore the distinctions between the goals that communities have for their children and the results that can be agreed upon and measured, between a programmatic orientation that is ambitious, differentiated, and nuanced and accountability for more modest results, and between results that are persuasive only to an initiative's supporters and those that are also persuasive to skeptics.
Section 6: Mismatch Between the Data That Is Needed and the Data Being Collected

There is a severe, and at first blush strange, paucity of data that could help answer urgent questions about how well efforts to change vulnerable lives are actually helping, and about which efforts help a little, which help a lot, and which help not at all.
The large gap between the data that is needed for results accountability and the indicator data currently being collected exists because data collection has been shaped primarily by only two kinds of pressures: those that reflect the need for administrative data for use in managing programs, policies, and institutions; and those that reflect the interests of social scientists, which have focused either on simple indicators that can be monitored for national-level trends, or on complex measures of individual development requiring labor-intensive direct observation and data collection. Attempts by policy makers and advocates for children to make do with what has been made available for these other purposes are increasingly unsatisfactory for purposes of designing interventions, holding intervention efforts accountable, and trying to understand what works to change results.
Data collection that has not been required for administrative and managerial reasons has reflected the interests of social scientists, who tend to think primarily about national trends, economic influences, and naturally occurring social change. As a result, the nation's data tool kit is virtually useless when it comes to efforts to understand the effects of intentional interventions on the well-being of children and families, especially when those interventions are designed to operate at the level of the local community. The data needed by those who must make judgments about what's working, and who are committed to trying to influence current policy debates, is very meager.
Much new work is now needed, and some is beginning to be done, to expand information about the characteristics of children, families, and communities that can be reliably measured, and to make data about results for children and families available in a more timely way and in units
that correspond to areas that are optimal targets of intervention, such as neighborhoods and school catchment areas. Communities and funders (public and private) need to be able to identify long-term and interim outcome measures that can be linked to interventions, and that can help them assess the effects of interventions across, not just within, the domains of health, education, child welfare, juvenile justice, community development, economic development, and job training. These information needs become increasingly urgent with the need to assess the effects of such new federal initiatives as the empowerment zones and enterprise communities, foundation-funded comprehensive community initiatives for children and families, impending changes in the allocation of responsibility between the federal government and the states, and major reforms of such safety-net programs as AFDC, Medicaid, and Food Stamps.
WHO DECIDES?
In determining who selects the results to be achieved, there is much controversy about "top-down" versus "bottom-up" processes. On the one hand, many believe that society has so much at stake in the achievement of a core set of results that political bodies, probably at the state level, should be responsible for identifying the set of results that are to be achieved in a particular jurisdiction. Others believe that "outcome measures imposed from outside a community have no legitimacy in terms of a local consensus-building process ... and cannot mobilize the resources needed to achieve the results sought ..." (Young, Gardner, & Coley, 1993). Charles Bruner of the Iowa Child and Family Policy Center, who has struggled with this issue both as a state legislator and as an advisor to local programs, argues that those charged with achieving results must be involved in the results selection process if it is to be regarded as fair, useful, and legitimate, and if it is to reflect real-life experiences.
There does seem to be increasing agreement that the process of selecting results for accountability purposes, whether it is done by a state legislature or a local collaborative, must have political legitimation. Sid Gardner, of the Center for Collaboration for Children at California State University, Fullerton, believes that the importance of going through a consensus-building process cannot be overestimated, because the selection of outcome measures is not primarily a technical problem but a political one.
It is clear that all of those affected by results-based accountability, whether as legislators representing taxpayers, as providers, or as service beneficiaries or participants, must have a role. All concerned will be able to work more effectively toward common goals if they are able to engage in a consensus-building process, involving both providers and recipients of services, to select the outcome measures they will use or be held accountable by.
Many forms of interactive consultation are possible. For example, when an official state body selects the results, localities may decide or negotiate the numerical value that will represent progress in the achievement of each outcome (e.g., the rate of low birthweight will be reduced by 5% each year, or racial disparities in low birthweight rates will be reduced by 10% each year). But if results are to be used for accountability purposes, and actually carry what David Hornbeck calls hard-edged consequences, it seems reasonable that after extensive participation and consultation, the final decisions be made by bodies at a higher or broader level of governance than those being held accountable.
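Negotiated numerical targets of this kind carry a small but consequential arithmetic choice: a "5% each year" reduction applied to each prior year's rate compounds, so it implies a different trajectory than cutting a flat five percent of the original rate every year. A minimal sketch of the compounding interpretation (the starting rate and all figures here are hypothetical illustrations, not data from this report):

```python
# Hypothetical illustration: project a locality's low-birthweight rate
# under a negotiated target of a 5% reduction each year, applied to the
# prior year's rate (the compounding reading of "5% each year").
# The starting rate of 8.0 per 100 births is invented for the example.

def projected_rates(start_rate: float, annual_cut: float, years: int) -> list[float]:
    """Return the year-by-year target rates, starting with the baseline."""
    rates = [start_rate]
    for _ in range(years):
        # Each year's target is a proportional cut of the previous year's rate.
        rates.append(rates[-1] * (1 - annual_cut))
    return rates

targets = projected_rates(8.0, 0.05, 5)
# The absolute drop shrinks each year, which matters when a state body
# audits whether a locality has met its negotiated milestone.
print([round(r, 2) for r in targets])
```

Under the flat reading, the year-five target would instead be 8.0 minus five times 0.4, a noticeably lower figure; which reading applies is exactly the kind of detail a consultation process would have to pin down.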
Section 7: The Importance of Interim Milestones

The greatest single obstacle to realizing the benefits of a shift to results-based accountability is the lack of interim milestones that could reliably show that reform efforts are on track toward achieving their targets.
Local communities, agencies, and programs that are struggling to reform, improve, or expand their services, to integrate services across helping systems, or to target a wide array of intensive interventions on selected geographic areas are clamoring for ways of finding out in the near term whether their efforts are changing results, or even whether they are going in the right direction and making progress toward long-term results. Funders (public and private), practitioners, managers, and systems reformers are becoming increasingly aware that the most frequently cited lesson from major current reform efforts is that they take so much more time than expected, both to get the initiative under way and to get it to the point where it begins to show an impact on real-world results. They desperately need new tools that would allow them to demonstrate their short-term achievements. They need to be able to get interim information very quickly, often long before a program is "proud" (Campbell, 1987), long before it has had a chance to make an impact on rates of school readiness, child abuse, teenage pregnancy, violence, school success, and employment.
Two kinds of interim measures can predict later results: indicators that attach to children, families, and communities and that are a short-term manifestation of long-term results, and indicators of a community's capacity to achieve the identified long-term results.
Examples of interim measures that are a short-term manifestation of long-term results for individuals, families, and communities include the following:
Receipt of prompt high quality prenatal care is thoughtto raise the chances of a healthy birth.
Children who do not read when they are seven are likely to encounter later troubles at school.
An improvement in school attendance rates is thought topredict an improvement in school achievement rates.
Parents' sense of mastery and social support, and the absence of parental substance abuse, are thought to predict long-term non-recurrence of abuse or neglect.
Knowledge about the connections between measurable indicators of community capacity and long-term results is at a more primitive stage than knowledge about the connections between interim and long-term indicators for children and families. Reliable theories about the linkages between interventions and results, and about the constellation of conditions and interventions that will lead to good results, are scarce. Most are unproven. For example, can a community that is developing strategies to reduce rates of low-weight births assume with confidence that the "enabling conditions" for reaching that outcome are some combination of the capacity (1) to provide family planning services to all persons of child-bearing age, and (2) to provide high quality, responsive prenatal care, nutrition services, and family support to pregnant women?
It is probably not enough to know of the simple existence of certain services, because their quality and how they are made available must be taken into account to link them strongly with results. The distinction among service availability, access, and the nature and quality of the service in accounting for improved results is crucial, and it requires greater understanding and a wider consensus around how to measure the factors that make services effective than now exists (see Charles Bruner's pioneering work).
One connection that most observers consider reasonably well established comes from the early childhood field: a community that is able to offer all of its low-income children and their families Head Start and other high quality comprehensive preschool programs is likely to have a high proportion of children prepared for school learning at the time of school entry.
Perhaps the most tantalizing of recently hypothesized links between interventions and results, ones that could produce some new short-term indicators of community capacity, are those between results for children and families and such indicators of community-level change as a strengthened infrastructure of informal supports, and investments in neighborhood safety and expanded economic opportunity. But there is as yet scant agreement on ways to measure community building, and only modest understanding of the precise connections.
The need for both kinds of short-term indicators that could show movement toward long-term results has long been recognized. It has not been met because the ability to define these interim markers with confidence depends on
having reliable evidence, theories, or at least sturdy hypotheses about the antecedents of major long-term results. Neither social science researchers nor the evaluation industry has really invested in this arena, perhaps because their energies are exhausted by the pursuit of that elusive goal of seeming as scientific as their colleagues in the physical sciences, and because progress in this arena involves a higher ratio of judgment to certainty than most social scientists are comfortable with.
As a society, we now need desperately to make up for lost time. One useful next step would be to systematically examine findings in the recent literature and ongoing experience to provide a more rigorous and deeper understanding of established connections among short-term and long-term results. We need to explore the connections between long-term results on the one hand, and measures of interim individual results and community capacity on the other.
Section 8: Rigorous Thinking About Process Measures

Process measures describe what is going on. (Process measures will also continue to be important in assuring that procedural protections are maintained to guard against fraud, corruption, and inequities or discrimination based on race, gender, disability, or ethnic background.) Process measures are an essential component of understanding the impact of an intervention, though they themselves do not assess impact unless they qualify as interim indicators. Process measures are important in finding out whether a program or intervention has actually been implemented according to plan. (Is the Head Start program in operation for the number of hours its funders expect? Has it enrolled the expected number of children and the expected proportion of eligible children? Has it involved a stipulated percentage of parents?)
Process measures become easily confused with outcome measures and interim measures. Distinguishing among these various indicators is essential to clear thinking about interventions and their consequences. One of the reasons for the current confusion is that the same measure can be an outcome measure, an interim measure, or a process measure, depending on context. If the purpose of the initiative is community building, "community engagement" could be an outcome. If the purpose of the initiative is school readiness and the connection between community engagement and school readiness were reasonably well established, "community engagement" could be an interim measure. If the funder and grantee were to agree to make community engagement an essential component of the intervention, it could be a process measure.
A process measure can be used as an interim indicator if there is a reasonable hypothesis to make the link, as when a
program trains parents as community leaders as part of its efforts to rebuild a community infrastructure. A process measure cannot be used as an interim indicator if there is no basis for linking it to long-term results.
The failure to think clearly about process measures, and about how they relate to what is being proposed or done and what is being accomplished, results in what David Osborne calls "process creep" (Osborne & Gaebler, 1993, p. 350). When process creep occurs, means and ends become confused, and the focus on what actually happens to people as a result of the activity is lost. The formation of a collaborative, or a high degree of participation in a new governance entity, may be the product of a great deal of effort, but it is not evidence of progress toward agreed-upon results unless the rationale that connects these activities to established results is at least explicitly hypothesized, if not proven. The number of children who have been screened for hearing and vision problems is a process indicator. Because screening that isn't followed up with diagnosis and treatment where needed won't reduce the number of children whose vision or hearing is impaired, screening should not be used as an outcome indicator.
But the confusion about process measures is not only conceptual; it is also political. The temptation is ever-present to fall back on using process measures as evidence of progress, even when they meet none of the criteria for outcome measures and there is no basis for linking them to ultimate results. Process measures often become substitutes for outcome measures because they provide comforting evidence of activity; they demonstrate that something is happening.
Typically, both grantmakers and grantees contribute to process creep. It happens in the early stages of program implementation, when everyone involved suddenly becomes afraid that his or her hopes for the project may not be realized. It also happens when funders encounter hostility to outcome accountability (and outcome evaluation) from communities and program people who fear that outcome measurement will not do justice to their underfunded intervention.
In responding to these fears, funders often find it easier to remove or move the goal posts than to strengthen the players. The typical forget-about-the-goal-posts conversation takes place a few months into the implementation phase of a program. The funder says to the grantee something along the following lines: So we gave you the grant in the hope that you would reduce teenage pregnancy and youth violence in this community, and you now say that was really an unrealistic expectation? You may be right. But we do need some hard evidence that our grant is making some sort of difference, so let's see if we can get an evaluator to design an attitude survey that will determine whether you have increased the number of teenagers who think it's a
bad idea to carry a gun and to initiate sex when they're younger than fifteen. Or the evaluators could document how many youngsters come to your meetings and classes. Alternatively, maybe we or you could hire an ethnographer to chronicle what's going on in your program ....
Some of these are useful things to do. It is especially useful to obtain rich descriptions of complex, nuanced interventions. But descriptions of process are most useful when they become part of a systematic inquiry into what the program is accomplishing and why. Descriptions of a process are not a substitute for either outcome-based accountability or outcome-based evaluation.
Section 9: Conclusion

In concluding, I would address those who still harbor grave doubts and a visceral unease about the whole idea of results accountability. Committed practitioners have every reason to ask why they should have to prove the value of their work. They point out that those who would dismantle the safety net and the whole infrastructure of public and nonprofit services and institutions are not arguing efficacy; they are arguing principle. These practitioners, along with parents, community leaders, and other advocates, wish to stand their ground on principle, and say that feeding young children and providing them with a safe and happy place to play is enough justification, that comforting a frightened adolescent needs no further rationale, that every expectant mother is entitled to the highest quality prenatal care, regardless of whether there is a payoff in higher rates of school readiness, employability, or healthy births. Other countries, after all, do not make public support for basic services for children and families contingent on proof of their merit. In France and Germany and Britain and Japan, publicly supported child care and maternal and child health care, paid family leaves, and universal child protective services are taken for granted and require no evidence of effectiveness.
American human service leaders see themselves as part of a tradition of service to the vulnerable whose value is ultimately independent of its effects. They cite Mother Teresa's explanation of her perseverance in the face of the enormity of world poverty: "God has called on me not to be successful, but to be faithful" (Kagan, 1993). They cite Gandhi's teaching that "It is the action, not the fruit of the action, that is important."
My own belief is that the moral underpinnings for social action, especially by government, are not powerful enough today, in the cynical closing years of the twentieth century, to sustain what needs to be done on the scale at which it needs to be done. In this time of pervasive doubt, the magnitude of public investment that is required will be forthcoming only if there is evidence that investments are achieving their purpose and contributing to long-term
goals that are widely shared. And the chances of developing and sustaining the responsive bureaucracies that can support effective programs will also increase to the extent that accountability for results can replace accountability for observing rigid and narrow procedural rules.
*Based on a chapter from a forthcoming book, tentatively titled Disturbing the universe: Strong families, supportive communities, responsive bureaucracies, and how to get there from here.
1. I use the words "results" and "outcomes" interchangeably. I use the word "goals" to refer to results that are desirable but cannot be readily measured or agreed upon.
2. While the bottom line of profit, market performance, and survival is clearly established in private business, there is no similar agreement on success in the public sector.
3. "You're seeing it everywhere," says James R. Fountain Jr., assistant research director at the Governmental Accounting Standards Board, "a growing frustration among taxpayers that they don't know what they're getting for their money" (Osborne & Gaebler, 1993, p. 140).
4. President Bill Clinton's remarks at the signing of the Government Performance and Results Act: "It may seem amazing to say, but like many big organizations, ours is primarily dominated by considerations of input (how much money do we spend on a program, how many people do you have on the staff, what kind of regulations and rules are going to govern it) and much less by output (does this work, is it changing people's lives for the better?)" (From red tape to results, 1993, p. 73).
5. Dr. Rivlin fully recognized the difficulty of coming up with good measures of performance; she called for "a sustained effort to develop performance measures suitable for judging and rewarding effectiveness ... all the strategies (here discussed) for finding better methods of delivering social services depend for their success on improving performance measures" (Rivlin, 1971, p. 144).
6. "Ethical core" is Sid Gardner's phrase (Gardner, 1995). Gardner, along with David Hornbeck, has been the most persistent and effective advocate of the shift to a results orientation.
7. Robert Slavin reports that students who are not reading at the end of first grade are at great risk; they don't catch up later (Slavin, 1995).
8. School readiness can be thought of as "a fixed standard of development sufficient to enable children to fulfill school requirements and to absorb the curriculum content" (Kagan cited in Phillips & Love, 1994). The National Head Start Association says that kindergarten teachers expect that entering students will be able to work both independently and as members of small and large groups, to attend to and finish a task, listen to a story in a group, follow two or three oral directions, take turns and share, care for their belongings, follow simple rules, respect the property of others, and work within the time and space constraints of a school program (National Head Start Association, 1995). Businessman and philanthropist Irving Harris tells the story of Doris Williams, who taught kindergarten in the inner city of Chicago and who told him she could always handle one child who wasn't ready for school. "But when I had two or three who were not ready, the extended attention they demanded meant that the rest of the class was denied the time they had a right to expect from me" (Harris, 1993).
9. The NGA's community capacity indicators (benchmarks) for school readiness include rates of: children in preschool and child care
programs; children in preschool and child care programs meeting prescribed standards; eligible children in Head Start and public preschool programs; communities with family support and education services; school-age parents receiving comprehensive services; children who experience consecutive or multiple out-of-home placements; pregnant women receiving prenatal care during the first trimester; children covered by Early Periodic Screening, Diagnosis, and Treatment (EPSDT) or private health insurance; and eligible participants in the Women, Infants, and Children (WIC) program.
10. In an earlier paper, Love wrote, "Our system (in which assessments are completed on a representative sample of children) avoids labeling children by focusing on aggregate measures for the community."
11. In the spirit of Sarason's conclusion that "Problems are constants, answers are provisional" (Sarason, 1990, p. xii), I would say that we could transform an understanding of our problems into an agreement about goals, and let those be our constants, as we evolve and experiment with our provisional programmatic answers.
12. In this process, it is important to start, as Alice Rivlin (1971) advises, with "indicators that measure movement in the appropriate direction." These include the following: measures of physical health (such as low rates of low birthweight babies, high rates of two-year-olds fully immunized, no untreated vision or hearing problems at school entry, low rates of sexually transmitted diseases); measures of school achievement; measures of perils avoided in adolescence (such as too-early childbearing, arrests for violent crime, suicide, homicide, substance abuse); and measures of productivity and economic well-being (such as rates of productive employment, and rates of families with incomes over the poverty line) (Rivlin, 1971, p. 47).
13. The authors of Outcome Funding are troubled by the fact that a public college they describe was saved from closing by a campaign to preserve the jobs that the university provided in a poor county; they suggest that if that is one of the desired results, 50% of the school's budget should be funded by the state's economic development agency. Similarly, food stamps seem to have been saved from the Gingrich assault in 1995 by agricultural interests, not by concern for the nutrition needs of the poor.
14. Brandon distinguishes between measures of well-being (which he calls positive measures) and measures of progress in reducing problems (which he calls negative measures) to argue in favor of using the former as the best way to approach results accountability. But he recognizes that "the popularity of negative measures (measures of poverty, dysfunction, and illness) reflects the ease of consensus on recognizing that some things are clearly inadequate, without the difficulty of consensus on defining what is adequate" (Brandon, 1992, p. 24).
15. I was at a meeting recently where Marie McCormick cited a study currently using IQ to find out how well a family support program was working.
16. In the vanguard of the work on neighborhood-level indicators are the Foundation for Child Development and researcher Claudia Coulton at Case Western Reserve University.
17. As one dramatic illustration, the coordinator of the external review team commissioned by the Pew Charitable Trusts to review its ambitious proposed Children's Initiative suggested that the problem of obtaining timely results information contributed to the decision to cancel the Initiative. Reflecting back on lessons learned, he wrote that "To build the public and political will to continue, projects must be able to demonstrate results in a credible way. State officials developing The Children's Initiative recognized that data on enhanced results were critical to expansion, but such data were unlikely to be available within the important early years of the project" (Krauskopf, 1994).
18. The evaluation steering committee of the Aspen Roundtable on Comprehensive Community Initiatives has been discussing the usefulness of a "Michelin Guide" to interim indicators, which would assess the degree of confidence with which the hypothesized connection between interim indicators and long-term results measures could be linked, all along the causal chain. The idea would be to distinguish among the connections that seem to be fairly well established, those where the evidence is weaker and the hypothesized connections urgently need to be tested, and those where even promising hypotheses are lacking.
19. Procedural protections will have to be maintained and monitored wherever there is no other way to restrict the arbitrary exercise of front-line discretion by powerful institutions against the interests of powerless clients. Because the present capacity to use outcome measures to judge program effectiveness is still primitive, and because it takes so long for results to improve in response to even the most effective interventions, existing process measures will continue to play a role in holding agencies, communities, and systems accountable during the period of transition. Increasingly, however, as new measures that are more closely and reliably related to results become available to measure initial progress toward ultimate goals, it will be important to continually re-examine the balance between the use of process and outcome measures, so that communities and agencies can make sure they utilize results-based accountability as much as the state of the art allows.
20. People who are responsible for programs, be they teachers, social workers, early childhood people, youth workers, or neighborhood residents and other program participants, often view evaluation research as an "unfriendly act," observed Peter Bell, when he was president of the Edna McConnell Clark Foundation (Bell, 1993).
References
Action Team on School Readiness. (1992). Every child ready for school. Washington, DC: National Governors' Association.
Barzelay, M. (1992). Breaking through bureaucracy: A new vision for managing government. Berkeley, CA: University of California.
Bell, P. (1993). Edna McConnell Clark Foundation report of the president.
Boyer, E. L. (1991). Ready to learn: A mandate for the nation. Princeton, NJ: Carnegie Foundation for the Advancement of Teaching.
Brandon, R. N. (1992). Social accountability: A conceptual framework for measuring child/family well-being. Washington, DC: Center for the Study of Social Policy.
Bredekamp, S. (1995, May 8). Memorandum to S. L. Kagan, M. Levine, & V. Washington.
Campbell, D. T. (1987). Problems for the experimenting society in the interface between evaluation and service providers. In Kagan, S. L., Powell, D. R., Weissbourd, B., & Zigler, E. (Eds.), America's family support programs: Perspectives and prospects. New Haven, CT: Yale University Press.
Carnegie Task Force on Meeting the Needs of Young Children. (1994). Starting points: Meeting the needs of young children. New York: Carnegie Corporation of New York.
DiIulio, J. J. (1994). Deregulating the public service: Can government be improved? Washington, DC: Brookings Institution.
From red tape to results: Creating a government that works better and costs less. (1993, September 7). Washington, DC: National Performance Review.
Gardner, S. (1995, April). Kissing babies and kissing off families. Speech given at Concord, MA.
Harris, I. B. (1993, Spring). "Education: Does it make a difference when you start?" Aspen Institute Quarterly, 5(2), 30-52.
Kagan, S. L. (1993, May). Going-to-scale with a comprehensive services strategy. Presentation at Wingspread conference.
Kagan, S. L. (1995, May). By the bucket: Achieving results for young children. Washington, DC: National Governors' Association.
Krauskopf, J. A. (1994, October). Overcoming obstacles to implementing reform of family and children's services. Paper prepared for the annual meeting of the Association for Public Policy Analysis and Management, Chicago, IL.
Love, J. M. (1995, June). Can we measure the results? Or, as society shifts towards results-based programs and services for young children, can the scientific community provide reliable, valid, and useful child results? Paper prepared for discussion at the Issues Forum on Child-Based Results, sponsored by the Carnegie Corporation of New York.
Manno, B. (1994, June). Outcome-based education: Miracle or plague? Indianapolis, IN: Hudson Institute, Hudson Briefing Paper No. 165.
Moore, K. (1994, November). Criteria for indicators of child well-being. Background paper for the Conference on Indicators of Children's Well-Being, Washington, DC.
National Head Start Association. (1995, May). National Head Start Association statement on child-based results. Washington, DC: Author.
National Research Council, Institute of Medicine. (1991). Improving instruction and assessment in early childhood education. Washington, DC: National Academy Press.
National Task Force on School Readiness. (1991, December). Caring communities: Supporting young children and families. Alexandria, VA: National Association of State Boards of Education.
Osborne, D. & Gaebler, T. (1993). Reinventing government. New York, NY: Penguin.
Phillips, D. & Love, J. (1994). Indicators for school readiness, schooling and child care in early to middle childhood. Paper prepared for the Conference on Indicators of Children's Well-Being, Washington, DC.
Rivlin, A. M. (1971). Systematic thinking for social action. Washington, DC: Brookings Institution.
Sarason, S. B. (1990). The predictable failure of educational reform. San Francisco, CA: Jossey-Bass.
Slavin, R. (1995, May). Three days in May: How to assess first grade reading, and why we must do so. Memorandum to May 1995 Carnegie Issues Forum on Child-Based Results.
White, S. (1995, June). Presentation at Carnegie Corporation Issues Forum on Child-Based Results, New York, NY.
Young, N. K., Gardner, S. L., & Coley, S. M. (1993). Getting to results in integrated service delivery models.
Zill, N. (1995, May). Responses to Issues Forum Meeting questions. Carnegie Issues Forum on Child-Based Results, New York, NY.
Appendix E: Can We Measure the Results?
Or, As Society Shifts Toward Results-Based Programs and Services for Young Children, Can the Scientific Community Provide Reliable, Valid, and Useful Child Outcome Measures?
John M. Love
Mathematica Policy Research, Inc.
Paper presented at the
Issues Forum on Child-Based Results
New York City
Section 1: Introduction
Lisbeth Schorr (1994) has made the case for shifting to results-based accountability as we strive to improve the lives of children and families. I accept the desirability of this shift. But now what? Can we actually measure the results that programs are now clamoring to articulate? I have been asked to consider this question from the perspective of science: the science of child development and the science of measurement.
WHERE THERE'S A WILL THERE'S A WAY
At some level, the answer has to be "yes," because we continually measure a wide range of results in a large number of programs. In fact, there is a long (though some might say checkered) history of using existing instruments to create a body of knowledge on the effectiveness of programs for children. Head Start, for example, has long measured its effectiveness with a variety of child outcome measures (Kresh 1993; McCall 1993; McKey et al. 1985). Studies of other preschool programs, like the Perry Preschool, have influenced public policy in major ways with only minimal questions about the accuracy of the results measures (Barnett 1992; Schweinhart, Barnes, and Weikart 1993). Early intervention studies, like the Infant Health and Development Project (IHDP) (Brooks-Gunn et al. 1994) and the Abecedarian project (Campbell and Ramey 1994; Ramey and Campbell 1991), are currently claiming significant benefits based on existing child outcome instruments. Extant instruments form the basis for our understanding of the benefits of high-quality child care (Helburn et al. 1995; Howes, Phillips, and Whitebook 1992). Furthermore, studies of family support programs also use existing measures to assess results for children (Barnett 1995; Larner 1992; St. Pierre, Layzer, and Barnes 1994; Yoshikawa 1994). Because we are measuring child results, there is at least tacit acceptance of the outcome measures by a sizable group of researchers, program operators, and policymakers. Without this acceptance, we could not have any confidence in the hundreds of studies we rely on for our understanding of programs, their accomplishments, and their impacts on children.
If, at some level, many researchers and program planners believe we can currently measure important results, we should still ask whether we are doing so as well as we should. Could we use better measures? Are we measuring the right results? Or the most important ones? Do we fail to learn about particular kinds of program effects because we lack the instruments? With sufficient resources (that is, time and money) wisely applied, there is no doubt in my mind that the scientific community can answer the feasibility question affirmatively. But, for this paper, I will assume there is no time and no additional money. The sponsors of the forum are interested in what is feasible now, not what is eventually possible!
Schorr (1994) and her colleagues did consider measures of programs' results for children that are suitable for immediate use. Her "start-up list" of possible measures includes many of the usual suspects: lower rates of low birthweight babies, more complete immunizations, lower school dropout rates, and fewer children abused or neglected, to name a few. These are critical results, particularly from a cost-conscious public policy perspective. Conspicuously absent, however, are measures of the developmental results that many early childhood program people (teachers, caregivers, parents, and child development experts) care about: communication skills, thinking ability, self-esteem, school-related knowledge, behavior problems, curiosity, and so forth. These are the multiple dimensions of children's early learning and development that have been so admirably and comprehensively articulated by the technical working group for the first national education goal (Kagan, Moore, and Bredekamp 1995). Although aspects of these developmental domains are commonly measured, as they have been in the studies cited earlier, we could long debate their scientific suitability.
In this paper, I discuss the scientific feasibility of measuring a wide spectrum of child results. All are encompassed by the notion of children's well-being. All are articulated among the goals of many programs that are striving to improve results for children and families. I illustrate measurement of various facets of well-being selected from the typical domains: health and physical development, intellectual or cognitive functioning, language and communication, and social and emotional development.
It may well be that measurement is more feasible in some domains than others. I also consider the possibility that feasibility is a function of the age of the child. I raise the possibility that measurement feasibility depends on other characteristics of the child, such as his or her native language and culture, and special needs or disabilities. Feasibility may vary with the type of measurement as well. Finally, I suggest that the science of psychological measurement is not the only factor influencing the feasibility of adopting a results-based orientation for children's programs. In fact, in spite of the theme of this forum, the scientific feasibility of measuring child results may not be the most important source of apprehension. Before getting to this topic, however, I discuss some of the more salient past and present child-outcome measurement efforts.
Section 2: Past Assessment Efforts: Heritage or Hamstring?
Within the early childhood community, the infamous "Westinghouse" study stands as a hallmark of failure in the science of evaluation: failed design, failed measurement, and failed policy. Not only did this early evaluation concentrate on measuring the intellectual development of Head Start children with an inappropriate test of "IQ" (chosen because it was "the best measure available"), it included noncomparable comparison groups under noncomparable program conditions. Although design flaws, unrealistic expectations, and problems with the media (including premature release of findings to The New York Times) compounded a bad choice of measures, it is often the outcome measure that takes the heat of later condemnation.
If the so-called Westinghouse evaluation set the field back a decade, the long-term benefits shown for graduates of Ypsilanti's Perry Preschool immeasurably advanced the cause of early childhood programs. I don't think there is a single study that has so inclined politicians to vote for additional funds for Head Start. In this case, a randomly assigned control group, long-term follow-up, and results that really matter (getting jobs, staying out of jail, avoiding pregnancy) far outweighed earlier reports of fading IQ gains (Barnett 1992). In fact, the Perry Preschool study, along with the other longitudinal early childhood education studies begun in the 1960s (Lazar, Darlington, Murray, Royce, and Snipper 1982), was instrumental in awakening interest in a wide range of educational, social, and economic results that extend well beyond typical measures of children's development.
Another program from this era was the Brookline Early Education Project (BEEP), distinguished by being an integral part of the public school system and providing services to children and families from birth until entry into kindergarten. Outcome measures included standard developmental
measures (such as the Bayley Scales of Infant Development and the Stanford-Binet); measures of language development, health and physical development, and school achievement; and a detailed classroom-observational measure of children's mastery skills, social skills, and use of time (Hauser-Cram, Pierson, Walker, and Tivnan 1991). The broader range of results measured for BEEP children can be justifiably envied by today's early childhood programs.
This highly selective sampling of early childhood program evaluations from the '60s and '70s shows that the field did not lack for outcome measures. Yet, study after study has begun by declaring that adequate measures do not exist. There are sound scientific and political reasons for not wanting to perpetuate the "psychometrically" strong IQ tests, and the highly program-relevant and sensitive, but labor-intensive, observational measures like those used in the BEEP evaluation cannot easily be used on a large scale.
In the context of these exciting studies, about twenty years ago, I began planning the evaluation of a Head Start demonstration program that had ambitious goals for affecting 4- to 8-year-olds' "social competence." The evaluation team identified four dimensions of social competence in an effort to do justice to Ed Zigler's broad conception of social competence as "an individual's everyday effectiveness in dealing with his environment, . . . his ability to master appropriate formal concepts, to perform well in school, to stay out of trouble with the law, and to relate well to adults and other children" (quoted by Anderson and Messick 1974, p. 283). The early 1970s precursor to the Administration on Children, Youth and Families (the Office of Child Development) simplified the concept to a concern with the child's "everyday effectiveness in dealing with his environment and responsibilities in school and life." The domains we identified were social-emotional development, psychomotor development, language development, cognitive skills, and health and nutrition, not unlike the current dimensions defined by the Goal 1 Technical Planning Group of the National Education Goals Panel (Kagan, Moore, and Bredekamp 1995).
In 1975, we concluded that "no measure that is already fully developed has been found that meets all the specific selection criteria ..." (Love, Wacker, and Meece 1975, p. 3). It is instructive to look at the instrument selection criteria used in that study, because they are similar to those followed in practically every early childhood program evaluation I know of, and are still relevant to the issue facing us today. Fifteen criteria in three areas were used to evaluate potential instruments:
Practical Considerations:
* Available for use by fall 1975 ("immediately")
* Appropriate for use by trained paraprofessionals
* Test format appropriate for ages of children in program
* Scoring procedures appropriate for data processing
* Reasonable testing time for young children

Psychometric Qualities:
* Adequate construct and/or predictive validity
* Adequate test stability and internal consistency
* Culture and/or SES fair
* Representativeness of standardization sample
* Low correlation with index of general information

Relevance to the Program:
* Spans appropriate age range
* Spanish-language adaptation available
* Relevant to program's cognitive and language goals
* Likely to demonstrate program effects
* Used in previous national evaluations or large-scale studies

Each of these criteria represents a factor influencing our answer to the question of the scientific feasibility of measuring policy-relevant child results. In this instance, the primary shortcomings of the instruments were their failure to span the total 4- to 8-year age range of the program population, lack of relevance to program goals, and failure of the test standardization samples to represent fully the geographic or SES features of the population participating in the program.
During this same period, the Office of Education (OE), the middle element in what was then the U.S. Department of Health, Education, and Welfare, was conducting a massive national evaluation of an early elementary (kindergarten through third grade) planned-variation curriculum study called "Follow Through." In addition to conducting its multimillion-dollar national evaluation, OE allowed the curriculum sponsors to conduct their own evaluation activities. A consortium of the institutions sponsoring curricula that today we would call "developmentally appropriate" joined forces to develop measures that would be more sensitive to developmental results than the standardized achievement tests used in the national evaluation.
The High/Scope Follow Through model, for example, developed a procedure for assessing written language that demonstrated that fluency and complexity in the writing of Follow Through students increased as a direct function of the extent to which teachers implemented the High/Scope curriculum (Bond, Smith, and Kittel 1976). The Bank Street College Follow Through model developed a complex observation instrument that showed Follow Through children engaging in more self-initiated communication, expression of thoughts, and peer communication than comparison children (Bowman and Mayer 1976). These and other sponsor evaluation efforts were well-intentioned and in fact did produce useful findings to counterbalance the national evaluation results (Hodges et al. 1980). But the enormous task was too much for the sponsors' paltry evaluation resources and time, and they could not effectively
counter the negative messages of the national evaluation.

In the late 1970s, OCD decided that the sorry state of
measurement feasibility was a major obstacle to obtaining good evaluations of the Head Start program. The agency therefore launched a coordinated effort to develop new instruments. The strategy included contracts to Educational Testing Service and the RAND Corporation to define and map the landscape of what should be measured (Anderson and Messick 1974; Raizen and Bobrow 1974), and two major instrument development projects. In the first, OCD contracted with ETS to create a battery of measures spanning 16 domains in the areas of cognitive, language, and perceptual-motor characteristics of preschool and kindergarten children. Called the CIRCUS battery because of the pictorial theme using clowns, balloons, and animals, the battery offered a special feature, EL CIRCO. This Spanish edition was distinguished by the fact that it was developed in tandem with the English-language version rather than being a translation of a test initially developed only for English-speaking children (Anderson et al. 1974). Today, no one hears about CIRCUS or EL CIRCO.
The second project was the Head Start Measures Project, begun in 1977. In 1980, the project team conducted an extensive review of measures of children's development. We reviewed more than 200 measures, reaching the following conclusion:
Very few measures show content validity as defined by the developmental characteristics of children identified through the input workshops.' Information on construct validity is virtually nonexistent. Moreover, very few measures possess sufficient reliability to warrant confidence in evaluation findings based on their use. (Mediax Associates 1980, p. 34)
Members of the national panel convened for the Head Start Measures Project had equally discouraging observations. The comments of panelists representing three perspectives illustrate the measurement problem that we, and the field, faced:
I have reviewed all of the major studies of Head Start and related programs and all of the instruments used in this research to assess children's social development. I cannot recommend any of these instruments for adoption in this project since in no case did I find satisfactory evidence of the instrument's validity. (Carew 1978, p. 7)
Indeed, most of the scales of motor development available contain so few items per age level . . . that they are unlikely to be discriminating except under circumstances of marked differences between the groups being assessed. (Eichorn 1978, pp. 5-6)
It is clear that evaluators of Head Start have not taken into account, in their selection of measures, the complex issues underlying the identification of goals and objectives of the program. (Laosa 1978, p. 19)
The Head Start Measures Project is the only sustained effort I know of that was single-mindedly devoted to developing better outcome measures for the full spectrum of children's well-being. Researchers across four different institutions tackled instrument development in four domains: (1) health and physical; (2) cognitive; (3) social-emotional; and (4) "applied strategies." (This fourth domain foreshadowed the "approaches toward learning" domain defined by Kagan et al. [1995] for the first national education goal.) After only two years of development work, the government canceled funding for all but the cognitive domain. Unfortunately, the project's dismal outcome, documented by Raver and Zigler (1991), does not make us eager to try again.
Today, we are often hamstrung by failed measurement development attempts and the negative exemplars of major evaluations of early childhood programs. We are told that this history shows it cannot be done. These experiences are at least partly responsible for a climate of measurement avoidance throughout the early childhood community. It is now "common knowledge" that the concept of test validity for young children is an oxymoron. Teachers, administrators, or program evaluators who recommend individual, standardized, controlled assessments of young children are misguided at best and evil at worst. I do not deny that valid concerns exist, or that the practice of testing has involved enormous abuses, but it is both wrongheaded and shortsighted to reject all testing out of hand. Even within this climate, however, positive examples exist. I turn now to some of these to consider what lessons they carry.
Section 3: Recent and Current Assessment Efforts: Enlightened Endeavors or Imprudent Illusion?

In three recent and current activities, researchers have identified child results that are relevant for particular purposes and have proposed useful measurement approaches. Here, we see that different types of measurement procedures are identified; however, constraints on the definition or construction of the measures in each case cloud our ability to focus on the nature of the problem for results-based programs.
CHILD RESULTS IN THE CONTEXT OF COMMUNITY-BASED FAMILY PROGRAMS
In 1993, Mathematica Policy Research undertook the challenge of designing an evaluation of the Pew Charitable Trusts' Children's Initiative that would be as comprehensive and innovative as the Initiative itself. Although we knew the task would not be easy, we were convinced it was feasible. I choose The Children's Initiative evaluation as an example because it illustrates the child results we were convinced could be measured within the context of a community-wide program that also had broad health and family functioning goals. When The Children's Initiative ended, Thornton, Love, and Meckstroth (1994) extended Mathematica's design work to propose a system of measures that would be appropriate for other community programs with outcome goals similar to those of The Children's Initiative.
We developed plans for measures related to results in five broad areas: (1) child and family health; (2) family functioning; (3) child development; (4) school performance; and (5) youth maturation and social integration. Two of the topics under child and family health (incidence of preventable diseases and disabilities, and overall health of children), as well as the areas of child development and school performance, are pertinent to the child-based results theme of this forum. For each outcome that might relate to a goal of a community program in each of these areas, we listed the expected results (both intermediate and long term), the recommended measure for each outcome, and the measurement procedure or data source. We also provided a summary of evidence, when available, on how sensitive the measure is to community interventions, the strengths and liabilities of the measure, and a brief statement as to its policy relevance (see Tables 1 through 4).
As with any set of recommended measures, we developed this list with some restrictions in mind. First, we tried to find measures that are sensitive to changes in the well-being of children brought about by community-wide programs. Second, we wanted measures based on data that local communities would be capable of collecting. Third, we gave priority to measures for which there would be comparative data in national surveys or other studies that could provide a frame of reference for interpreting local data. Finally, our selection was influenced by the potential for making policy-relevant statements about changes in children's status on the measure.
The measures in Tables 1-4 rely on two major data collection strategies: interviews or surveys and administrative records. We did not include controlled, individualized assessments of children, not because of concerns about their scientific feasibility, but because of practical feasibility, primarily the costs of data collection. A major reason for constraining our selection of measures was our desire to make available a system that communities could maintain after the evaluation ended. Communities seldom have the resources for ongoing assessments that are possible with a specially funded formal evaluation. We thought we were proposing a scientifically feasible system of measures, appropriate for the goals (desired results) of The Children's Initiative and suitable for implementation in the context of community-wide programs, but six months after we submitted our preliminary evaluation design, the Trusts decided not to proceed with implementing the Initiative.
I have often wondered how large a role the adequacy of the outcome measures played in the Trusts' decision. Clearly, it was a complex decision in which many factors were weighed. As the coordinator of the external review team commissioned by the Trusts to examine the Initiative objectively, Krauskopf (1994) has reflected on the lessons learned. One of the four areas of concern he articulated was "evaluation, outcome measurement, and accountability." Referring to the outcome focus of the Initiative, Krauskopf noted that "there are not well-agreed upon indicators and measures for three of the Initiative's four outcome areas: child development, family functioning, and school readiness." Krauskopf's central concern in the measurement area, however, makes it clear that the adequacy of the outcome measures is a concern, but that this concern is embedded in a larger matrix of issues that includes evaluation design and the policymaking process:
Because large-scale systems projects must be prepared to justify themselves as they proceed, the absence of solid outcome measurement data and operational management systems is particularly harmful to generating ongoing support. To build the public and political will to continue, projects must be able to demonstrate results in a credible way. State officials developing The Children's Initiative recognized that data on enhanced results were critical to expansion, but such data were unlikely to be available within the important early years of the project. (Emphases added)
In part, we may have failed to demonstrate the scientific feasibility of producing the valid measures outlined in Tables 1-4; certainly, none of the measures is perfect. It must be recognized, however, that the scientific basis for outcome measurement, even had it been extremely strong, would not have been sufficient. Two other factors are also important. First, programs must be able to translate the results of measurement into policy-relevant conclusions about the linkage between program activities and measured results: Did this specific community intervention actually produce the changes that evaluators observed in the performance of children on the measures? Second is the issue of timing. There is almost always a tension between the research and evaluation schedule, which must wait for the program processes to unfold and produce their impacts, and the political agenda, which demands immediate "hard" evidence that the investment is paying off. In this context, as we have seen in the case of Head Start, it is often tempting to blame the measures.
CHILD RESULTS IN THE CONTEXT OF SCHOOL READINESS ASSESSMENT
My second example takes us to a particularly sensitive arena, that of measuring the extent to which the developmental progress of 5-year-old children provides them the wherewithal to succeed in school. The issue of school readiness assessment is rife with debate, the pros and cons of which I will not go into here. Let it suffice to say that the attitude of measurement avoidance is particularly acute when the results of the measurements have the potential for labeling children and leading to critical decisions about their educational trajectories (grade placements, tracking, and so forth). With support from the Pew Charitable Trusts, Larry Aber, Jeanne Brooks-Gunn, and I analyzed the dimensions of children's early learning, development, and abilities as defined by the Goal 1 Technical Planning Group (Kagan et al. 1995) and searched the literature for measures that would be appropriate for communities to use to provide data on important elements of each dimension. The results of this process are summarized in Table 5.
Each of the measures listed in the right-hand column of Table 5 has been used in previous surveys or research and is supported by evidence of its reliability and validity. Also important for the question of feasibility, we judged each measure to be appropriate for diverse groups of children representing the entire spectrum of socioeconomic status, geographic regions, racial-ethnic groups, language groups, and disability status found in this country. Each of the measures was selected because it assesses one or more of the constructs embodied in one of the dimensions of children's development and learning, is appropriate for 5-year-olds, and is available for use with relatively little further adaptation. Two other considerations further constrained our search for appropriate measures. First, we wanted the set of measures, taken as a whole, to be practically feasible. That is, the assessment process should not be so time-consuming or require such extensive training or materials that school or community personnel would have difficulty administering the measures on a relatively large scale. Second, we looked for measures on which we had access to national data for drawing comparisons.
Any application of child results measurement will have a similar set of constraints. We must keep these in mind as we contemplate the scientific feasibility of assessing child results. In other words, the characteristics of the measures are not the only consideration.
Four separate measurement procedures are required to assess all these indicators: (1) a self-administered parent questionnaire; (2) teacher-conducted assessments of children's development; (3) ratings by teachers; and (4) school health records. Feasibility is enhanced by the use of multiple sources of data. For example, both parents and teachers
complete ratings on aspects of children's social and emotional development. Teachers obtain data on children's motor development, approaches toward learning, language usage, and cognition and general knowledge using a standardized screening instrument.
This system of measures is yet to be tried, but it appears to have face validity, if the expressions of community interest that we have received are any indication. We know that the measures exist and are scientifically strong. We don't know the full extent to which (1) their administration will be practical and affordable; or (2) the data they generate will speak to the needs of communities that are concerned about these dimensions of development as possible results of their systems of health care, child care, parent support, and early education.
CHILD RESULTS IN THE CONTEXT OF A SYSTEM OF INDICATORS
Last year, the organizers of a conference on "indicators of children's well-being" commissioned 24 papers, in which prominent researchers described what child and adolescent well-being indicators are feasible to measure. Recommendations were subject to a constraint unlike those seen in the previous examples: the measures must be obtainable from national surveys. Even with this severe restriction, conference participants considered a large number of measures of children's well-being to be both feasible and desirable to collect on a national basis. The measures cover the domains of child health; education; economic security; population, family, and neighborhood; and social development and problem behaviors (Brown 1994). The following partial listing from that conference illustrates what is feasible within the context of national survey data collections for children from birth to 5 years:
Child Health
* Healthy birth index
* Percentage of infants born with congenital anomalies
* Child abuse/neglect rate
* Percentage of children ever experiencing a delay in growth or development
* Percentage of children limited by chronic health conditions
* Percentage of children who regularly use seat belts

Education
* Percentage of 3- to 5-year-olds enrolled in preschool
* Percentage of children ages 3 to 5 who are read to every day by a parent or household member
* Percentage of children over 3 years who ever had learning disabilities

Economic Security
* Percentage of children in poverty
* Percentage of children in families receiving food stamps in past year
* Percentage of children in households where both or only parents are working
* Percentage of children living in inadequate housing

Population, Family, and Neighborhood
* Percentage of children who have moved within the past year
* Percentage of children living in institutions or group quarters
* Percentage of children living in severely distressed neighborhoods

Social Development and Problem Behaviors
* Percentage of children with high rates of behavior problems
Because these measures must be collectable through a national survey, they may be less useful for measuring results in certain types of programs. On the other hand, they may be particularly useful for evaluating community-wide programs like The Children's Initiative and many of the family support programs around the country.
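To make the mechanics concrete: indicators like those listed above reduce, computationally, to (often weighted) percentages over survey records. The sketch below is purely illustrative; the record fields, weights, and values are invented for this example and do not come from any actual national survey file.

```python
# Toy survey microdata; every field name and value here is hypothetical.
records = [
    {"age": 4, "in_poverty": True,  "weight": 1200.0},
    {"age": 2, "in_poverty": False, "weight": 950.0},
    {"age": 5, "in_poverty": True,  "weight": 800.0},
    {"age": 3, "in_poverty": False, "weight": 1050.0},
]

def weighted_pct(rows, predicate):
    """Weighted percentage of rows satisfying predicate."""
    total = sum(r["weight"] for r in rows)
    hits = sum(r["weight"] for r in rows if predicate(r))
    return 100.0 * hits / total

# Indicator: percentage of children under 5 in poverty.
under_5 = [r for r in records if r["age"] <= 5]
print(weighted_pct(under_5, lambda r: r["in_poverty"]))  # 50.0
```

The arithmetic is trivial; the scientific work, of course, lies upstream, in the survey design, the sampling weights, and the validity of the underlying questions.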
Section 4: It All Depends: Tentative Lessons About Influences on Measurement Feasibility

Three different strategies have illustrated what some researchers believe to be the scientific feasibility of measuring child results. To make this discussion very concrete, I now describe four illustrative measures. These examples have practical feasibility, which I took as a threshold requirement for considering any measure useful to this forum. In other words, either they have been used in large-scale studies, where the cost and burden of data collection are important considerations, or I can imagine them being used in such studies. I have selected these four to illustrate that (1) outcome measurements are feasible; (2) a range of features of children's development and well-being can be measured; but that (3) a number of factors influence measurement feasibility. They do not constitute a random sample of measures, but neither do I think they are entirely atypical. I realize that to fully answer the question of this forum, this analysis should be done with a much larger sample of measures. If, however, these four suggest a positive answer, this analysis should provide sufficient grist for our debate.
Table 6 summarizes critical features of the four instruments: (1) Behavior Problems Index (BPI) (Zill 1990); (2) Early Screening Inventory (ESI) (Meisels et al. 1988); (3) MacArthur Communicative Development Inventories (CDI) (Fenson et al. 1993); and (4) Social Skills Rating System (SSRS) (Gresham and Elliott 1990). The results measured by these four instruments include aspects of many of the important dimensions of children's early development and learning: motor, social-emotional, language, and cognition and general knowledge. The instruments span different age ranges, from infancy through the elementary grades. They also represent different measurement approaches: parent reports and ratings (CDI and BPI); teacher ratings (SSRS); and direct individual assessments of children in controlled settings (ESI). All but the CDI have been relatively widely used. The CDI is still undergoing developmental research (Fenson 1993), whereas many large-scale surveys have incorporated the BPI (Love 1994; Zill and Schoenborn 1990). One of the measures (CDI) was explicitly designed to measure results for infants and toddlers; the BPI is probably suitable down to age 2, even though it was initially developed for preschoolers and older children (Brooks-Gunn and Ross 1991). The ESI is designed primarily for the preschool-to-first-grade years, and the SSRS teacher version is designed for elementary school teachers to complete (preschool and secondary versions are also available).
I've indicated that it is not completely fair to draw broad generalizations from this small subset of measures, so I will overgeneralize! Using these four instruments, as well as knowledge about many other measures that is not documented here, I now suggest some of the factors that may determine the scientific feasibility of adopting a results orientation for early childhood programs.
AGE AND OTHER CHARACTERISTICS OF THE CHILD
The older the child, the stronger the measures. But not always! As a general rule, when children become older, they are easier to talk to, they better understand what we ask of them, and they respond more consistently to instructions. If our measures depend on communicating with the child and getting a response in return, then this general rule applies. Standard aptitude tests are often thought to be more valid with older children and adolescents than with babies. When a good multidimensional developmental assessment is created (such as the Bayley Scales of Infant Development), it is rejected for all but the most well-funded evaluations. Notice, however, that such in-depth measures often become the benchmarks against which the validity of newer, more efficient assessments (like the CDI) is judged.
The very widely used Peabody Picture Vocabulary Test (PPVT-R) (Dunn and Dunn 1981) is a good example of an instrument that spans a very wide age range, is practical to administer, but is inappropriate for young children. This test relies on formalized interactions between adult and child, with the adult asking questions (for example, "Which is the ladder?"), and the child being required to respond in very restricted ways by pointing to or giving the letter designation for the correct one out of four pictures. There is ample room for bias to creep in: the adult's pronunciation, the familiarity of the vocabulary within the child's culture, the child's interpretation of the inelegant drawings, and so forth. In spite of these problems, the PPVT has been widely used in program evaluations, like the Infant Health and Development Program (IHDP), and large-scale surveys, like the National Longitudinal Survey of Youth (NLSY). It has a number of enviable psychometric characteristics, including good reliability and predictive validity, yet fails on our criteria of appropriateness for diverse cultural and racial groups (because of vocabulary selection and drawings) and relevance to program goals (since it measures only receptive vocabulary understanding, a very narrow slice of the goals that most programs have for child development). Language has always been a difficult outcome to measure, partly because of its sensitivity to context. So, perhaps the domain of development is more important than the child's age.
DOMAINS OF DEVELOPMENT
For the last 10 or 15 years, Larry Fenson and his colleagues have struggled with ways of measuring the productive language of infants and toddlers. What have they found? That parents are not only a convenient, but a highly reliable, source of information on the vocabulary and syntax of their own children. Parents (mostly mothers) can observe their child's language across multiple settings over time in ways that would be extremely time-consuming and expensive for outside observers to duplicate. Taken together, the infant and toddler scales of the CDI describe the course of language development, from nonverbal gestures in infancy, through early vocabulary usage, to the beginnings of grammatical speech. Interestingly, the growth curves fitted to the parent reports of various language functions closely parallel predictions from language theory (Dale et al. 1989; Fenson et al. 1993). So here we have the case of a strong measure for very young children in a domain that is fraught with measurement difficulties. Should we, perhaps, put more reliance on parent reports?
MEASUREMENT METHODS
The Behavior Problems Index provides another example of the successful use of parent reports of children's behavior. But the Social Skills Rating System has been more successful with its teacher-rating version than with the parent form, at least in terms of the internal consistency of the scales (Gresham and Elliott 1990). On the other hand, the Early Screening Inventory has almost single-handedly erased our usual disdain for developmental screening instruments. The ESI demonstrates not only reliability and validity but also sensitivity, an especially important quality when teachers and other school personnel are concerned with correctly identifying children who should be referred for further diagnosis. It has not been widely used in program evaluations (and, in fact, was not developed for that purpose), but it is certainly a better candidate than many of the traditional screening instruments that have been so used.

In-person assessments, like the ESI, which rely on structured settings that carefully control the demand characteristics for children's responses, generally cost more to administer if they go beyond picture-book multiple-choice formats. Contributing to the psychometric strength of the ESI are certainly the systematic procedures and training programs that ensure consistent administration across teachers and settings. Good parent and teacher rating forms, like the BPI, CDI, and SSRS, also provide systematic instructions for their administration, but that task is easier since direct contact with the child is not required. Thus, it seems that the measurement method is not a determining factor so much as the care with which procedures are established to ensure that the intentions of the instrument developer are carried out.
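The internal consistency invoked in comparing the SSRS forms is conventionally summarized as Cronbach's alpha, computed from the variances of individual items and of the total score. A minimal sketch of that calculation, using made-up ratings rather than actual SSRS items or data:

```python
def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(scores):
    """Cronbach's alpha: scores is a list of respondent rows,
    each row a list of item ratings (at least two items)."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: 5 respondents x 4 items on a 1-5 scale.
ratings = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(ratings), 3))  # 0.965
```

A higher alpha means the items move together, which is the sense in which the teacher form of a rating scale can be said to outperform the parent form on the same construct.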
EVALUATION DESIGN ISSUES
Characteristics of the child, the domains being measured, and how we go about conducting the measurement process are all important considerations. It is not clear to me, however, that our experience suggests useful rules of thumb for selecting the best measure for a particular program's outcome goals. There is simply too much measure-to-measure variation, as seen in the perhaps extreme difference between the PPVT and the CDI. It is time to consider not only the measures themselves, but also how they are used.
Suppose we have one or two scientifically valid and reliable instruments that program stakeholders agree measure results they are trying to achieve. How, then, are the results of our scientifically valid outcome measures to be used? The desirability of moving toward a child-based results orientation is based on the assumption that the results of the measurement process mean something to the various program stakeholders: policymakers, early care and education systems people, families and communities, and program staff. Even with a perfectly valid measure of children's language usage or behavior problems, we cannot interpret the scores unless there is highly convincing evidence that the program had something to do with them. This is not the place for a thorough discussion of evaluation design issues, but we must recognize that even the best outcome measures are useless in the face of our inability to implement evaluation designs that permit unambiguous conclusions about program effects. I submit that problems in evaluation design are much more likely than invalid outcome measures to be the "scientific" basis for impeding movement toward results-based programming for children.
Section 5: Closing Thoughts

Each year about 750,000 children enroll in Head Start. Responding to recent concerns about program quality, ACYF is designing a system of "performance measures" that will be results-oriented. The system will include measures of children's social competence, because the agency is convinced that quality improvements will be reflected in better results for children. Does the field have perfect measures to recommend to ACYF for use next year, or the year after? No. Should we tell ACYF, and indirectly the Congress, that it is impossible to judge the results of enhanced quality? I think not. We know enough now to specify the conditions under which it is scientifically feasible to measure child results of Head Start.
Three months from now, about four million children will enter public and private schools for the first time. Will they have a successful experience? Well, a sizable proportion will not actually enter regular kindergartens because their parents, school system, or private counselors will deem them "unready." Twelve months later, almost 200,000 children will spend a second year in kindergarten because their school doesn't believe they have made enough progress to be successful in first grade. An additional 160,000 or so will also experience delayed entry to first grade because their school system places them in some type of transitional class after the kindergarten year. Many of these decisions (50 percent or more of the cases) will be justified by the results of some type of assessment. We already know better ways to make these decisions, ways that will place children at far lower risk for school failure, later dropout, and so forth (Shepard and Smith 1989). Are any of these outcome measures perfect? No. But each year that we wait for the perfect measure, we diminish the chances for successful schooling for another 350,000 or more children.
Uncounted millions of children participate in family support programs each year at a cost of tens of millions of dollars. There is keen interest in knowing the extent to which these children are better off because of the support and services their families receive. Should we tell policymakers that it is premature to assess these benefits? That programs will just have to proceed without knowing how they affect children's well-being? Again, I say no. And again, we know enough to advise programs on the conditions under which effective results measurement can be conducted, on which dimensions of well-being, and with which measurement procedures.
If it is desirable to shift to results-based accountability in programs for children, it is also scientifically feasible to do so. I am fully aware that this brief paper and my selective review of measures do not solidly bolster this conclusion. Yet, it seems impossible to deny the good measures we have available. In arriving at this conclusion, I want to be clear that an affirmative answer to the feasibility question is not a blanket endorsement of all measures for all children in all programs under all circumstances. The results that are measured, the procedures that are selected, and the conditions under which the assessments are conducted and interpreted all have implications for programs and families, and ultimately for the children who we hope will benefit through our endeavors.
1. The planning phase of the Measures Project included a series of "input workshops," in which project staff met with representatives of Head Start administrators, teachers, and parents in various regions of the country. Conducted like large focus groups, these workshops were designed to ascertain what representatives of the Head Start community (the stakeholders of future program evaluations) believed to be important indicators of children's development.
2. Although Tables 1-4 have their foundation in preliminary recommendations Mathematica Policy Research developed for possible use in the evaluation of The Children's Initiative, we have made a number of modifications since The Children's Initiative ended. No endorsement of these particular results or measures by the Pew Charitable Trusts should be inferred.
I am grateful to Craig Thornton for helping me articulate the complexities of measurement issues in the context of evaluating community-wide initiatives.
3. For the purposes of this discussion of child results measures, we can ignore the first four dimensions in Table 5. They reflect the community conditions that the National Governors' Association identified as supporting children's development and learning and are important for a complete assessment within the framework of the first national education goal (see Love et al. 1994).
4. The conference was sponsored by the Institute for Research on Poverty at the University of Wisconsin; Child Trends, Inc.; the Office of the Assistant Secretary for Planning and Evaluation (U.S. Department of Health and Human Services); the National Institute of Child Health and Human Development; and the Annie E. Casey Foundation.
5. I will not go into the power of random assignment, the virtues of having a control-group counterfactual, or the various quasi-experimental design alternatives available. Hollister and Hill (1995) have recently presented a highly intelligent and readable discussion of these issues with particular reference to evaluating community-wide initiatives.
References
Anderson, Scarvia, and Samuel Messick. "Social Competency in Young Children." Developmental Psychology, vol. 10, 1974, pp. 282-293.

Anderson, S.B., G.A. Bogatz, T.W. Draper, A. Jungeblut, G. Sidwell, W.C. Ward, and A. Yates. CIRCUS Manual and Technical Report (preliminary version). Princeton, NJ: Educational Testing Service, 1974.

Barnett, W. Steven. "Benefits of Compensatory Preschool Education." Journal of Human Resources, vol. 27, 1992, pp. 102-120.

Barnett, W. Steven. "Research on Family Support Programs: Current Status and Future Directions." Paper presented at the biennial meeting of the Society for Research in Child Development, Indianapolis, March 1995.

Bond, James T., Allen G. Smith, and Janice M. Kittel. "Evaluation of Curriculum Implementation and Child Results." Annual Report, Cognitively Oriented Curriculum Project Follow Through. Ypsilanti, MI: High/Scope Educational Research Foundation, 1976.

Bowman, G., and Rochelle Mayer. "The BRACE System of Interaction Analysis: What, Why, and How?" New York: Bank Street College of Education, 1976.

Bradley, R., and B. Caldwell. "The Relation of Infants' Home Environments to Achievement Test Performance in First Grade: A Follow-Up Study." Child Development, vol. 55, 1984, pp. 803-809.

Bradley, R., and B. Caldwell. Home Observation for Measurement of the Environment: Validation Studies of the Preschool Version. Paper presented at the Annual Meeting of the Mid-South Educational Research Association, New Orleans, 1978.

Brooks-Gunn, Jeanne, and Christine Ross. "Evaluating Child Results of the Expanded Child Care Options Demonstration: An Analysis Plan." Princeton, NJ: Mathematica Policy Research, Inc., May 1991.

Brooks-Gunn, J., C. McCarton, P. Casey, M. McCormick, et al. "Early Intervention in Low Birthweight, Premature Infants: Results Through Age 5 From the Infant Health and Development Program." Journal of the American Medical Association, vol. 272, no. 16, 1994, pp. 1257-1262.
Brown, Brett. "Indicators of Children's Well-Being: A Review of Current Indicators Based on Data From the Federal Statistical System." Paper prepared for the conference on Indicators of Children's Well-Being, Bethesda, MD, November 1994.

Campbell, Frances A., and Craig T. Ramey. "Effects of Early Intervention on Intellectual and Academic Achievement: A Follow-Up Study of Children from Low-Income Families." Child Development, vol. 65, 1994, pp. 684-698.

Carew, Jean V. "Head Start Profiles of Program Effects on Children: Position Paper for Mediax ACYF Project." Westport, CT: Mediax Associates, 1978.

Dale, Phillip S., Elizabeth Bates, J. Steven Reznick, and C. Morisset. "The Validity of a Parent Report Instrument of Child Language at 20 Months." Journal of Child Language, vol. 16, 1989, pp. 239-249.

Dunn, L., and L. Dunn. Peabody Picture Vocabulary Test, Revised Edition. Circle Pines, MN: American Guidance Service, 1981.

Edozien, J., B. Switzer, and R. Bryan. "Medicaid Evaluation of the Special Supplemental Food Program for Women, Infants, and Children." The American Journal of Clinical Nutrition, vol. 32, March 1979.

Egbuonu, Lisa, and B. Starfield. "Inadequate Immunization and the Prevention of Communicable Diseases." In The Effectiveness of Medical Care: Validating Clinical Wisdom, edited by Barbara Starfield. Baltimore: Johns Hopkins University Press, 1985.

Eichorn, Dorothy H. "Head Start Profiles of Program Effects on Children: Motor Development." Westport, CT: Mediax Associates, 1978.

Eisen, M., C.A. Donald, J.E. Ware, and R.H. Brook. Conceptualization and Measurement of Health for Children in the Health Insurance Study. Santa Monica, CA: The Rand Corporation, 1980.

Fenson, Larry. "Parent Report: A Neglected Source of Data." Paper presented at the biennial meeting of the Society for Research in Child Development, New Orleans, March 1993.

Fenson, Larry, Phillip S. Dale, J. Steven Reznick, Donna Thal, Elizabeth Bates, Jeffrey P. Hartung, Steve Pethick, and Judy S. Reilly. MacArthur Communicative Development Inventories: User's Guide and Technical Manual. San Diego: Singular Publishing Group, 1993.
Gresham, Frank M., and Stephen N. Elliott. Social Skills Rating System: Manual. Circle Pines, MN: American Guidance Service, 1990.
Hanushek, Eric A. "The Impact of Differential Expenditures on School Performance." Educational Researcher, vol. 18, no. 4, May 1989, pp. 45-51, 62.
Hauser-Cram, Penny, Donald E. Pierson, Deborah K. Walker, and Terrence Tivnan. Early Education in the Public Schools: Lessons from a Comprehensive Birth-to-Kindergarten Program. San Francisco: Jossey-Bass, 1991.
Helburn, Suzanne, et al. "Cost, Quality, and Child Outcomes in Child Care Centers: Executive Summary." Denver: University of Colorado at Denver, 1995.
Hodges, W., A. Branden, R. Feldman, J. Follins, J. Love, R. Sheehan, J. Lumbley, J. Osborn, R.K. Rentfrow, J. Houston, and C. Lee. Follow Through: Forces for Change in the Primary Schools. Ypsilanti, MI: High/Scope Press, 1980.
Hofferth, S.L., A. Brayfield, S.G. Deich, and P. Holcomb. "The National Child Care Survey 1990." Washington, DC: Urban Institute, 1991.
Hollister, Robinson G., and Jennifer Hill. "Problems in the Evaluation of Community-Wide Initiatives." In New Approaches to Evaluating Community Initiatives: Concepts, Methods, and Contexts, edited by James P. Connell, Anne C. Kubisch, Lisbeth B. Schorr, and Carol H. Weiss. Washington, DC: The Aspen Institute, 1995.
Howes, Carollee, Deborah A. Phillips, and Marcy Whitebook. "Thresholds of Quality: Implications for the Social Development of Children in Center-Based Child Care." Child Development, vol. 63, 1992, pp. 449-460.
Infant Health and Development Program. "Enhancing the Outcomes of Low-Birth-Weight, Premature Infants: A Multisite, Randomized Trial." Journal of the American Medical Association, vol. 263, no. 22, June 13, 1990, pp. 3035-3042.
Kagan, Sharon L., Evelyn Moore, and Sue Bredekamp (Eds.). "Reconsidering Children's Early Development and Learning: Toward Common Views and Vocabulary." Washington, DC: National Education Goals Panel, June 1995.
Koblinsky, S.A., M.L. Taylor, and L.W. Douglass. "Relationships Between Maternal, Child, and Shelter Characteristics and Homeless Children's Developmental Status." College Park, MD: University of Maryland, 1995.
Krauskopf, James A. "Overcoming Obstacles to Implementing Reform of Family and Children's Services." Paper presented at the annual meeting of the Association for Public Policy Analysis and Management, Chicago, October 1994.
Kresh, Esther. "The Effects of Head Start: What Do We Know?" Washington, DC: U.S. Department of Health and Human Services, Administration on Children, Youth and Families, 1993.
Laosa, Luis M. "Evaluating the Effects of Head Start on Children's Cognitive Development." Westport, CT: Mediax Associates, 1978.
Larner, Mary. "Realistic Expectations: Review of Evaluation Findings." In Fair Start for Children: Lessons Learned from Seven Demonstration Projects, edited by Mary Larner, Robert Halpern, and Oscar Harkavy. New Haven: Yale University Press, 1992.
Lazar, I., R. Darlington, H. Murray, J. Royce, and A. Snipper. "Lasting Effects of Early Education: A Report from the Consortium for Longitudinal Studies." Monographs of the Society for Research in Child Development, vol. 47, nos. 2-3 (serial no. 195), 1982.
Love, John M. "Indicators of Problem Behaviors and Problems in Early Childhood." Invited paper prepared for the Conference on Indicators of Children's Well-Being, National Institute of Child Health and Human Development, Bethesda, MD, November 17-18, 1994.
Love, John M., J. Lawrence Aber, and Jeanne Brooks-Gunn. "Strategies for Assessing Community Progress Toward Achieving the First National Education Goal." Princeton, NJ: Mathematica Policy Research, Inc., October 1994.
Love, John M., Sally Wacker, and Judith Meece. "A Process Evaluation of Project Developmental Continuity, Interim Report II, Part B: Recommendations for Measuring Program Impact." Ypsilanti, MI: High/Scope Educational Research Foundation, June 1975.
Mallar, Charles D., and Rebecca A. Maynard. "The Effects of Income Maintenance on School Performance and Educational Attainment." In Research in Human Capital and Development: A Research Annual, vol. 2, edited by Ali Khan and Ismail Sirageldin. Greenwich, CT: JAI Press Inc., 1981.
Mathematica Policy Research, Inc. "Survey of Mothers and Children: Mother's Questionnaire for the Interactions and Developmental Processes Study." Princeton, NJ: MPR, 1991.
Mathematica Policy Research, Inc. "Teenage Parent Demonstration 24-Month Follow-Up Questionnaire." Princeton, NJ: MPR, 1993.
Maynard, Rebecca. "Some Aspects of Income Distribution: The Effects of the Rural Income Maintenance Experiment on the School Performance of Children." American Economic Review, vol. 67, no. 1, 1977, pp. 370-374.
Maynard, Rebecca, and Richard Murnane. "The Effects of a Negative Income Tax on School Performance: Results of an Experiment." The Journal of Human Resources, vol. 14, no. 4, 1979, pp. 463-476.
McCall, Robert B. "Head Start: Its Potential, Its Achievements, Its Future: A Briefing Paper for Policymakers." Pittsburgh, PA: University of Pittsburgh, Office of Child Development, 1993.
McCartney, Kathleen, Sandra Scarr, Deborah Phillips, Susan Grajek, and J. Conrad Schwarz. "Environmental Differences Among Day Care Centers and Their Effects on Children's Development." In Day Care: Scientific and Social Policy Issues, edited by Edward F. Zigler and Edmund W. Gordon. Boston: Auburn Press, 1982.
McKey, R.H., L. Condelli, H. Gamson, B. Barrett, C. McConkey, and M. Plantz. The Impact of Head Start on Children, Families, and Communities: Final Report of the Head Start Evaluation, Synthesis, and Utilization Project. Washington, DC: U.S. Department of Health and Human Services, June 1985.
Mediax Associates. "Accept My Profile: Perspectives for Head Start Profiles of Program Effects on Children." Westport, CT: Mediax Associates, 1980.
Meisels, Samuel J., Martha Stone Wiske, Laura W. Henderson, Dorothea B. Marsden, and Kimberly Browning. ESI 4-6: Early Screening Inventory for Four Through Six Year Olds. Revised, third edition. New York: Teachers College Press, 1988.
National Center for Education Statistics. National Household Education Survey, 1993. Washington, DC: NCES, 1993.
National Center for Health Statistics. "Design and Estimation for the National Health Interview Survey, 1985-94." Vital and Health Statistics, series 2, no. 110. Bethesda, MD: U.S. Department of Health and Human Services, Public Health Service, Centers for Disease Control, August 1989.
Olds, David L., and Harriet Kitzman. "Can Home Visitation Improve the Health of Women and Children at Environmental Risk?" Pediatrics, vol. 86, 1990, pp. 108-116.
Peterson, J.L., and N. Zill. "Marital Disruption, Parent-Child Relationships, and Behavioral Problems in Children." Journal of Marriage and the Family, vol. 48, 1986, pp. 295-307.
Pierson, D.E., D.K. Walker, and T. Tivnan. "A School-Based Program from Infancy to Kindergarten for Children and Their Parents." Personnel and Guidance Journal, vol. 62, no. 8, 1984, pp. 448-455.
Raizen, Senta, and S.B. Bobrow. Design for a National Evaluation of Social Competence in Head Start Children. Santa Monica, CA: Rand Corporation, 1974.
Ramey, Craig T., and Frances A. Campbell. "Poverty, Early Childhood Education, and Academic Competence: The Abecedarian Experiment." In Children in Poverty: Child Development and Public Policy, edited by Aletha C. Huston. New York: Cambridge University Press, 1991.
Ramey, Craig T., Donna M. Bryant, and Tanya M. Suarez. "Preschool Compensatory Education and the Modifiability of Intelligence: A Critical Review." In Current Topics in Human Intelligence, edited by D. Detterman. Norwood, NJ: Ablex Publishing Corporation, 1983.
Raver, C. Cybele, and Edward F. Zigler. "Three Steps Forward, Two Steps Back: Head Start and the Measurement of Social Competence." Young Children, vol. 46, no. 4, 1991, pp. 3-8.
Schorr, Lisbeth B. "The Case for Shifting to Results-Based Accountability, with a Start-Up List of Outcome Measures with Annotations." Washington, DC: Center for the Study of Social Policy, Fall 1994.
Schwartz, Donald F., Jeanne Ann Grisso, Carolyn Miles, John H. Holmes, and Rudolph L. Sutton. "An Injury Prevention Program in an Urban African-American Community." American Journal of Public Health, vol. 83, no. 5, May 1993, pp. 675-680.
Schweinhart, Lawrence J., Helen V. Barnes, and David P. Weikart. "Significant Benefits: The High/Scope Perry Preschool Study Through Age 27." Monographs of the High/Scope Educational Research Foundation, no. 10. Ypsilanti, MI: High/Scope Educational Research Foundation, 1993.
Shepard, Lorrie, and M.L. Smith (Eds.). Flunking Grades: Research and Policies on Retention. New York: Falmer Press, 1989.
Starfield, Barbara. The Effectiveness of Medical Care: Validating Clinical Wisdom. Baltimore: Johns Hopkins University Press, 1985, pp. 48-57.
St. Pierre, Robert, Jean Layzer, and Helen Barnes. "Variation in the Design, Cost, and Effectiveness of Two-Generation Programs." Paper prepared for the Eighth Rutgers Invitational Symposium on Education: New Directions for Policy and Research in Early Care and Education, Princeton, NJ, October 1994.
Thornton, Craig, John M. Love, and Alicia L. Meckstroth. "Community-Level Measures for Assessing the Status of Children and Families." Princeton, NJ: Mathematica Policy Research, December 1994.
Yoshikawa, Hirokazu. "Prevention as Cumulative Protection: Effects of Early Family Support and Education on Chronic Delinquency and Its Risks." Psychological Bulletin, vol. 115, 1994, pp. 28-54.
Zill, Nicholas. Behavior Problems Index. Washington, DC: Child Trends, Inc., 1990.
Zill, Nicholas, and Charlotte A. Schoenborn. "Developmental, Learning, and Emotional Problems: Health of Our Nation's Children, United States, 1988." Advance Data from Vital and Health Statistics, No. 190. Hyattsville, MD: National Center for Health Statistics, November 16, 1990.
TABLE 1
HEALTH OUTCOMES AND MEASURES: INCIDENCE OF PREVENTABLE DISEASES AND DISABILITIES IN CHILDREN

Intermediate- and Long-Term Outcomes Expected:
Reduced number of cases of diseases for which immunization is available--pertussis, polio, measles, mumps, or rubella
Decrease in postneonatal mortality rates or in race/ethnicity differential in postneonatal mortality rates
Decrease in preventable infant mortality
Increased use of safety precautions to reduce accidents and unintentional injury

Measures Recommended (Data Source):
Postneonatal mortality rate for all children and by race/ethnicity (birth certificate)
Postneonatal mortality rate > 2/1000 (birth certificate)
Number of children with diseases listed (public health records and/or survey)
Proportion of families using injury prevention: seat belts, ipecac syrup in house, working smoke alarms (survey)

Evidence of Sensitivity to Community Interventions:
Some evidence that postneonatal mortality is responsive to changes in the provision of medical care (Starfield 1985).
Immunization found to decrease the incidence of vaccine-preventable communicable diseases (Egbuonu and Starfield 1985).
An injury prevention program in a poor African American community resulted in increased proportion of homes with functioning smoke detectors, syrup of ipecac, safely stored medications, and reduced electrical and tripping hazards (Schwartz et al. 1993).

Strengths [Liabilities] of the Measures:
[Linked birth and death records files are available from states with a long time lag.]
[Incidence of vaccine-preventable communicable diseases is low and outcomes are thus not appropriate for statistical analysis.]

Policy Relevance:
Reduced infant mortality
Reduced health care costs, improved physical health
Reduced childhood morbidity and mortality

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 2
HEALTH OUTCOMES AND MEASURES: OVERALL HEALTH OF CHILDREN

Intermediate- and Long-Term Outcomes Expected:
More positive parent perceptions of child's health status, with continued improvement
Fewer children with functional limitations due to health conditions, with continued reduction
Fewer children with morbidities or serious morbidities
Increase in proportion of children who are within age-appropriate height and weight norms

Measures Recommended (Data Source):
NHIS question on mother's rating of child's health (survey)
RAND Health Perceptions Survey Scale (survey)
Morbidity Index and Serious Morbidity Index (survey)
Height and weight relative to age norms, percentiles and z-scores (school registration records)

Evidence of Sensitivity to Community Interventions:
Home visit programs improve maternal health and reduce risk of infant health problems (Olds and Kitzman 1990).
No or few effects detected of an early intervention program for low-birthweight, preterm infants (Infant Health and Development Program 1990).
WIC participation by children reduced the percentage of children below the 10th percentile of height and weight for their age (Edozien et al. 1979).

Strengths [Liabilities] of the Measures:
Scale widely used and useful in assessing current health status, changes in health status from treatment, and prediction of mortality
Established scale with high internal consistency and good concurrent validity with other reported measures of health
Combination of individually rare events to give a more sensitive outcome measure
Reliable and valid measures easily available from direct measurements or medical records review (Shonkoff 1992)

Policy Relevance:
Improved overall health of children
Improved functional status of children
Improved physical health of children
Improved health of children

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 3
CHILD DEVELOPMENT OUTCOMES AND MEASURES

Participation by Children in Early Education and Care Programs Before Kindergarten, and Language Development

Intermediate- and Long-Term Outcomes Expected:
Increase in enrollments in early education and care programs, with continued increases
Longer waiting lists for early childhood programs; increased supply of early childhood programs and perhaps shorter waiting lists
Increased levels of receptive, expressive, and productive language, with continued increases

Measures Recommended (Age Range) and Data Source:
Enrollment records (ages 1-5 years); Head Start records, child care system databases [Data may not represent community as a whole.]
National Household Education Survey (NHES) items on types of programs attended, length of attendance, etc. (ages 3-6 years); school registration records; survey
Provider and service worker assessments of adequacy of supply (ages 1-5 years); program records; survey
MacArthur Communicative Development Inventories--CDI--Short Form (ages 1-3 years); survey

Evidence of Sensitivity to Community Interventions:
Cognitive and language development are enhanced by such factors as involvement in quality child care and early education programs and increased levels of stimulation in the home (McCartney et al. 1982; Ramey et al. 1983).

Strengths [Liabilities] of the Measures:
Direct link to outreach activities; obtainable from program records [Some communities may have data in a number of different databases.]
Used in 1993 NHES
Short form highly correlated with full CDI, which has strong construct validity: strong correlations between CDI vocabulary production scores and laboratory measures; mothers' reports of onset of grammatical functions closely parallel language development theory
Age norms available [although may not be applicable to low-income samples]
Used in NIH Infant Day Care Study [Norming sample did not have large numbers of minority group members.] [Limited to 8- to 36-month age range]

Policy Relevance:
Availability and use of child care are associated with parents' ability to work and reduction in welfare dependence.
Participation in early childhood programs expected to improve school readiness.
Language skills are central to school readiness and success in school. Relates to National Education Goal One: domain of language usage.

TABLE 3 (continued)

School-Related Knowledge and Skills, and Social Well-Being

Intermediate- and Long-Term Outcomes Expected:
Increases in school-related knowledge and skills
Improved motor development and coordination
Increased levels of cooperation, assertion, and responsibility, and increased degree of self-control
Lowered levels of headstrong, antisocial, anxious, depressed, and overly dependent behavior

Measures Recommended (Age Range) and Data Source:
NHES items on knowledge of colors, letters, numbers, and writing (ages 3-5); survey
NHES items (ages 5-6); school registration
Early Screening Inventory (ESI) (ages 4-6); school registration
NHES items on buttoning, holding a pencil, writing, and drawing; motor coordination (ages 3-5); survey
Social Skills Rating System--Parent Form (SSRS) (ages 3-5); survey
SSRS Teacher Form (ages 5-6); school registration
Behavior Problems Index--BPI (ages 2-5); survey

Evidence of Sensitivity to Community Interventions:
School-related knowledge is enhanced by preschool program participation and improved family functioning (Barnett 1992).
Preschool programs can enhance school performance (Schweinhart et al. 1993).
Quality child care improves social development (Howes et al. 1992).
Parent ratings on the BPI distinguish children who have and have not received psychological help (Zill 1990); children in families with low levels of conflict experience fewer behavior problems (Peterson and Zill 1986); children with preschool experience have fewer classroom behavior problems (Pierson et al. 1984).

Strengths [Liabilities] of the Measures:
Used in 1993 NHES
Good norms, strong reliability; has companion parent rating form
Good norms and reliability
Standardized on national sample of 5,000 children, racially, ethnically, and socioeconomically mixed
Emphasizes positive behaviors
Designed as parent-report measure
BPI is widely used (JOBS, NLSY-CS, NHIS); strong psychometric properties

Policy Relevance:
School readiness: relates to cognitive and general knowledge
School readiness: relates to physical well-being and motor development
Social skills are important for successful adjustment to kindergarten; relates to social and emotional development and domain of approaches toward learning
Behavior problems can interfere with school readiness and success. Children with fewer behavior problems are less likely to require mental health services (Achenbach et al. 1991). Relates to social and emotional development.

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 4
SCHOOL PERFORMANCE OUTCOMES AND MEASURES

Remediation, Attendance, Grade Progression, and School Achievement

Intermediate- and Long-Term Outcomes Expected:
Decreased proportions of children identified as developmentally delayed at kindergarten entry
Reductions in number of children assigned to special education programs
Improved attendance
Reductions in rates of children retained in grade
Decreased school dropout rates, with continued decreases
Improved basic skills and academic achievement

Measures Recommended (Data Source):
Scores on developmental screening tests (kindergarten registration forms)
Number and proportion of children with special education assignments (school system records)
Days absent/present (school records)
Proportion of students over age for grade (school records)
School dropout rate (school records)
Screening test scores (school system records)
Standardized achievement test scores (school system records)

Evidence of Sensitivity to Community Interventions:
Enrollment in preschool programs and increased levels of intellectual stimulation in the home shown to decrease developmental delays (Ramey et al. 1983) and special education placements (Barnett 1992; Schweinhart et al. 1993).
Improvements in the home environment can improve school attendance (Maynard 1977).
Quality preschool experience reduces rate of grade retentions (Lazar et al. 1982; Schweinhart et al. 1993).
Improvements in the home environment can increase probability of completing high school and increasing educational attainment (Mallar and Maynard 1981). However, rigorously evaluated school-based programs for teenagers have shown little effect.
Cognitive and social performance improved by preschool attendance (see child development outcomes).
School achievement improved by attendance in quality preschool programs (Barnett 1992; Lazar et al. 1982), family background factors (Hanushek 1989), improvements in the home environment (Maynard 1977; Maynard and Murnane 1979; Mallar and Maynard 1981), and infant and preschool home environments (Bradley and Caldwell 1978, 1984).

Strengths [Liabilities] of the Measures:
Inexpensive to collect (if available) [Subject to possible inconsistencies in school registration procedures]
Inexpensive to collect (if available) [Subject to inconsistencies in school records]
Inexpensive to collect [Subject to variation in school retention policies]

Policy Relevance:
Developmental delays are a major source of increased educational costs.
Special education services are very costly.
Poor attendance interferes with school completion.
Grade retentions result in increased education costs.
Dropping out of high school can decrease future earnings potential.
Related to National Education Goals.

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 5
SCHOOL-ENTRY ASSESSMENT STRATEGIES FOR THE READINESS DIMENSIONS OF THE FIRST NATIONAL EDUCATION GOAL
Readiness Indicator -- Measure

Access to High-Quality and Developmentally Appropriate Preschool Programs

Increased enrollments in early care and education programs -- Questions on types of programs attended (Head Start, nursery school, state prekindergarten, center day care, family day care), age of first attendance, and duration of attendance (a)
Improved quality of early care and education programs -- Questions on program's daily and weekly schedule, group size, and child-staff ratio (a)
Increased stability of child care arrangements -- Questions on number of different early care and education settings experienced (a, b)
Increased percentage of high-risk children enrolled in early intervention programs -- Question on enrollment in early intervention programs (c)

Every Parent Will Be a Child's First Teacher and Devote Time Each Day to Helping His or Her Preschool Child Learn

Increased amount of time spent with child in intellectually challenging activities -- Questions on frequency of reading, storytelling, teaching activities, playing games, discussing science or nature, etc. (a)
Increased number of educational materials in the home -- Questions on number of books, games, and other educational materials in the home (a)
Increased regulation of children's television viewing -- Questions on regulation (rules covering content and hours) of children's television viewing (a)
Increased enriching experiences outside the home provided by parents -- Questions on frequency of visits to libraries, museums, zoos, plays, concerts, churches, cultural organizations, and games or sports (a)

Parents Will Have Access to the Training and Support They Need

Increased attendance at parenting classes in the community -- Questions on attendance at classes (d, e)
Increased availability of parenting classes, social clubs, parent groups, counseling opportunities, social service agencies, and other supports -- Questions on knowledge of and access to parenting classes, social clubs, parent groups, counseling opportunities, social service agencies, and other supports (d, e)

Children Will Receive the Nutrition and Health Care Needed to Arrive at School with Healthy Minds and Bodies

Reduced percentage of low-birthweight babies -- Questions on child's weight at birth (f)
Increased access to prenatal care -- Question on number of prenatal care visits (f)
Increased percentage of children receiving regular medical care -- Questions on regular source of routine care (f)
Increased percentage of children receiving regular well-child examinations -- Questions on routine health checkup in past 12 months (f)
Decreased use of emergency room for nonemergency care -- Question on emergency room use (f)
Increased percentage of children receiving immunizations at appropriate ages -- Questions on completed immunizations
Increased percentage of children having private or public health insurance -- Questions on health insurance (f)
Increased percentage of children having regular vision and hearing screenings -- Questions on vision and hearing screenings in past 12 months (f)
Decreased percentage of children having previously undetected vision and hearing problems -- Questions on children referred for treatment (f)
TABLE 5 (continued)
Readiness Indicator -- Measure

Increased percentage of children having completed dental checkup -- Questions on dental visit in past 12 months (f)
Increased percentage of children having nutrition screening before kindergarten entry -- Questions on nutrition screening (f)
Increased percentage of children referred before kindergarten for treatment of mild asthma, tuberculosis, cerebral palsy, mental retardation, autism, or other pervasive developmental disorders -- Questions on treatment referral (f)

Physical Well-Being and Motor Development

Physical Well-Being

Increased overall health status of children -- Child's health rating (a, f)
Decreased percentage of children having functional limitations because of health conditions -- Ratings of child's functional limitations (g)
Reduction in percentage of children with morbidities or serious morbidities -- Child's health rating (h)
Decreased number of hospitalizations -- Questions on hospitalization (f)
Increased percentage of children within age-appropriate height and weight norms -- Items on height and weight relative to age norms (by direct examination or medical record review)

Motor Development

Improved fine-motor development and coordination -- Items on block-building, draw-a-person, and copying forms (i)
Improved gross-motor skills -- Items on gross-motor/body-awareness scale (i)

Social and Emotional Development

Social Development

Increased levels of assertion -- Scores on assertion scale (j)
Decreased levels of aggressive behavior, dependence, and headstrong behavior -- Scores on scales measuring aggressive behavior, dependence, and headstrong behavior (k)
Increased cooperation and ability to help, communicate problems, and follow rules -- Scores on cooperation scale (j)

Emotional Development

Reduced levels of anxiety and depression -- Scores on anxiety and depression scales (k)

Approaches Toward Learning

Self-Control and Self-Regulation

Increased ability to control temper, attend to instructions, and take turns -- Scores on self-control scale (j)

Task Attention

Increased ability to attend to task and to remember auditory material received -- Items on auditory sequential memory scale (i)

Language Usage

Verbal Expression

Increased expressive language, speaking ability, and ability to describe objects -- Items on verbal expression scale (i)
TABLE 5 (continued)
Readiness Indicator -- Measure

Cognition and General Knowledge

Visual Sequential Memory

Increased ability to remember what is seen -- Items on visual sequential memory (i)

Number Concepts

Increased ability to count and to understand simple quantitative concepts -- Items on number concepts (i)

Verbal Reasoning

Increased relational-thinking ability -- Items on verbal reasoning (i)

General Knowledge

Increased school-related knowledge and skills -- Questions on knowledge of colors, letters, numbers, and writing (a)

Family Background, Demographics, and Contextual Variables

Mother's education; father's education; mother's age at birth of first child; household structure; household income; employment status of mother and father; number of residential moves; length of residence in community; child's age and gender; child's disabilities; race/ethnicity of child; child's contacts with father (if father not in home); number and ages of siblings; language(s) spoken in the home; neighborhood characteristics; school characteristics -- Questions selected from various instruments (e)
SOURCE: Love, Aber, and Brooks-Gunn (1994).
a. National Household Education Survey (NHES:93) (NCES 1993).
b. National Child Care Survey 1990 (Hofferth et al. 1991).
c. Interactions and Developmental Processes study questionnaire (MPR 1991).
d. Teenage Parent Demonstration 24-Month Follow-Up questionnaire (MPR 1993).
e. Measures to be selected.
f. National Health Interview Survey (NHIS) Child Health Supplement (NCHS 1989).
g. RAND Health Perceptions Scale (Eisen et al. 1980).
h. Morbidity Index and Serious Morbidity Index (Brooks-Gunn et al. 1994).
i. Early Screening Inventory (ESI) (Meisels et al. 1988).
j. Social Skills Rating System (SSRS) (Gresham and Elliott 1990).
k. Behavior Problems Index (BPI) (Zill 1990).
TABLE 6
CHARACTERISTICS OF FOUR ILLUSTRATIVE CHILD OUTCOME MEASURES

Measures compared: Behavior Problems Index (BPI; Zill 1990); Early Screening Inventory (ESI; Meisels et al. 1988); MacArthur Communicative Development Inventories (CDI; Fenson et al. 1993); Social Skills Rating System, Elementary Teacher Form (SSRS; Gresham and Elliott 1990).

Outcomes measured
  BPI: Behavior problems labeled as headstrong, aggressive, anxious, depressed, and immature/dependent
  ESI: Visual-motor/adaptive (draw-a-person, fine-motor control, eye-hand coordination, visual-sequential memory, reproducing 2- and 3-dimensional visual structures); language and cognition (comprehension, verbal expression, reasoning, counting, remembering auditory sequences); gross-motor/body awareness (large-muscle coordination; balancing, hopping, skipping; imitating body positions from visual cues)
  CDI: Words and Gestures form, for infants (a): phrases, vocabulary comprehension, vocabulary production, early gestures, later gestures. Words and Sentences form, for toddlers: vocabulary production, irregular nouns and verbs, sentence complexity
  SSRS: Cooperation, assertion, self-control, externalizing behaviors, internalizing behaviors, hyperactivity, academic competence

Age range assessed
  BPI: 2-5 years
  ESI: 4-6 years
  CDI: Infant form, 8-16 months; toddler form, 16-30 months
  SSRS: 5-11 years (grades K-6)

Type of assessment
  BPI: Rating by parent
  ESI: Structured individual assessment by teacher
  CDI: Parent report
  SSRS: Rating by teacher

How administered
  BPI: Telephone interview, self-report, or in-person interview
  ESI: Administered by teacher
  CDI: Self-report by parent
  SSRS: Written response
TABLE 6 (continued)

(Measures: BPI = Behavior Problems Index, Zill 1990; ESI = Early Screening Inventory, Meisels et al. 1988; CDI = MacArthur Communicative Development Inventories, Fenson et al. 1993; SSRS = Social Skills Rating System, Elementary Teacher Form, Gresham and Elliott 1990.)

Administration time
  BPI: 10 minutes
  ESI: 15-20 minutes
  CDI: 20-30 minutes
  SSRS: 10 minutes

Standardization and norms
  BPI: Data available on 17,110 children 17 years of age and under from the 1988 National Health Interview Survey of Child Health
  ESI: Normed on national sample of 2,746 children, 44 percent nonwhite
  CDI: Norming sample of 671 infants and 1,142 toddlers in New Haven, Seattle, and San Diego
  SSRS: Norming sample of 5,000 children, geographically, racially, and socioeconomically heterogeneous (956 for the elementary teacher form); included 19 percent handicapped

Previous and/or current use
  BPI: National Longitudinal Survey of Youth Child Supplements (1986, 1988); National Health Interview Survey: Child Health Supplement (1981, 1988); NICHD infant day care study; JOBS child impact study
  ESI: Evaluation of developmental status of homeless children (Koblinsky, Taylor, and Douglas 1995)
  CDI: Currently used in NICHD infant day care study
  SSRS: Head Start-Public School Transition Demonstration evaluation; under consideration for use in the Early Childhood Longitudinal Study (NCES)

Strengths
  BPI: Extensive national data available for comparative use; short and easy to administer
  ESI: Strong predictive validity with the McCarthy Scales of Children's Abilities (Meisels et al. 1993); high interscorer and test-retest reliabilities; refers a high proportion of children actually at risk, and excludes most of the children not at risk (Meisels et al. 1993)
  CDI: Good predictive validity (using observational data as criterion measure); moderate to high internal consistency, depending on scale; good cross-form, cross-age stability during the period of early and more rapid vocabulary expansion
  SSRS: Good psychometrics; norms for handicapped and nonhandicapped children; assesses positive social skills as well as problem behaviors

Liabilities
  BPI: Programs may not like the negativity of focus on problem behaviors
  ESI: Administration time could constitute a significant burden if a teacher is asked to assess multiple children
  CDI: Only 13 percent of the norming sample were minority group members; 78 percent of parents had some college education or a college degree
  SSRS: Administration time could constitute a significant burden if a teacher is asked to rate multiple children
TABLE 6 (continued)

(Measures: BPI = Behavior Problems Index, Zill 1990; ESI = Early Screening Inventory, Meisels et al. 1988; CDI = MacArthur Communicative Development Inventories, Fenson et al. 1993; SSRS = Social Skills Rating System, Elementary Teacher Form, Gresham and Elliott 1990.)

Comments
  BPI: Ratings by teacher or other knowledgeable adult also possible
  ESI: Administration by interview also possible
  CDI: Short form is being field tested; may be suitable for older children who are developmentally delayed

(a) This list represents the subscales of the CDI. Various variables can be generated as outcome measures, e.g., age at which x percent of the sample uses plurals, possessives, progressives, and past tense.
U.S. Department of Education
Office of Educational Research and Improvement (OERI)
Educational Resources Information Center (ERIC)
REPRODUCTION RELEASE
I. DOCUMENT IDENTIFICATION:
(Specific Document)
Title: Considering Child-Based Results for Young Children: Definitions, Desirability, Feasibility, & Next Steps
Author(s): Sharon Lynn Kagan, Sharon Rosenkoetter, and Nancy Cohen
Corporate Source:
Publication Date:
II. REPRODUCTION RELEASE:

In order to disseminate as widely as possible timely and significant materials of interest to the educational community, documents announced in the monthly abstract journal of the ERIC system, Resources in Education (RIE), are usually made available to users in microfiche, reproduced paper copy, and electronic/optical media, and sold through the ERIC Document Reproduction Service (EDRS) or other ERIC vendors. Credit is given to the source of each document, and, if reproduction release is granted, one of the following notices is affixed to the document.

If permission is granted to reproduce and disseminate the identified document, please CHECK ONE of the following two options and sign at the bottom of the page.
Check here for Level 1 Release: Permitting reproduction in microfiche (4" x 6" film) or other ERIC archival media (e.g., electronic or optical) and paper copy. The sample sticker shown below will be affixed to all Level 1 documents:

PERMISSION TO REPRODUCE AND DISSEMINATE THIS MATERIAL HAS BEEN GRANTED BY
TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
Level 1

Check here for Level 2 Release: Permitting reproduction in microfiche (4" x 6" film) or other ERIC archival media (e.g., electronic or optical), but not in paper copy. The sample sticker shown below will be affixed to all Level 2 documents:

PERMISSION TO REPRODUCE AND DISSEMINATE THIS MATERIAL IN OTHER THAN PAPER COPY HAS BEEN GRANTED BY
TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
Level 2

Documents will be processed as indicated provided reproduction quality permits. If permission to reproduce is granted, but neither box is checked, documents will be processed at Level 1.
"I hereby grant to the Educational Resources Information Center (ERIC) nonexclusive permission to reproduce and disseminate this document as indicated above. Reproduction from the ERIC microfiche or electronic/optical media by persons other than ERIC employees and its system contractors requires permission from the copyright holder. Exception is made for non-profit reproduction by libraries and other service agencies to satisfy information needs of educators in response to discrete inquiries."
Signature:
Organization/Address:
Yale University Bush Center in Child Development and Social Policy
310 Prospect Street
New Haven, CT 06511
Printed Name/Position/Title:
Sharon Lynn Kagan / Senior Associate
Telephone:
(203) 432-9931
FAX:
(203) 432-9933
E-Mail Address:
Date: 12/1/97
III. DOCUMENT AVAILABILITY INFORMATION (FROM NON-ERIC SOURCE):
If permission to reproduce is not granted to ERIC, or, if you wish ERIC to cite the availability of the document from another source, please provide the following information regarding the availability of the document. (ERIC will not announce a document unless it is publicly available, and a dependable source can be specified. Contributors should also be aware that ERIC selection criteria are significantly more stringent for documents that cannot be made available through EDRS.)
Publisher/Distributor:
Yale University Bush Center in Child Development and Social Policy
Address:
310 Prospect Street
New Haven, CT 06511
Price:
IV. REFERRAL OF ERIC TO COPYRIGHT/REPRODUCTION RIGHTS HOLDER:
If the right to grant reproduction release is held by someone other than the addressee, please provide the appropriate name and address:
Name:
Address:
V. WHERE TO SEND THIS FORM:
Send this form to the following ERIC Clearinghouse:
KAREN E. SMITH
ACQUISITIONS COORDINATOR
ERIC/EECE
CHILDREN'S RESEARCH CENTER
51 GERTY DRIVE
CHAMPAIGN, ILLINOIS 61820-7469
However, if solicited by the ERIC Facility, or if making an unsolicited contribution to ERIC, return this form (and the document beingcontributed) to:
ERIC Processing and Reference Facility
1100 West Street, 2d Floor
Laurel, Maryland 20707-3598
Telephone: 301-497-4080
Toll Free: 800-799-3742
FAX: 301-953-0263
e-mail: [email protected]
WWW: http://ericfac.piccard.csc.com
(Rev. 6/96)