DOCUMENT RESUME
ED 414 997 PS 025 428
AUTHOR Kagan, Sharon Lynn, Ed.; Rosenkoetter, Sharon, Ed.; Cohen, Nancy, Ed.
TITLE Considering Child-Based Results for Young Children: Definitions, Desirability, Feasibility, and Next Steps.
INSTITUTION Yale Univ., New Haven, CT. Bush Center in Child Development and Social Policy.
PUB DATE 1997-00-00
NOTE 81p.; Based on Issues Forums on Child-Based Outcomes (New York, NY, June 1-2, 1995 and January 24, 1996).
PUB TYPE Information Analyses (070); Reports - Descriptive (141)
EDRS PRICE MF01/PC04 Plus Postage.
DESCRIPTORS *Academic Standards; Educational Change; *Educational Improvement; *Educational Objectives; Feasibility Studies; *Outcome Based Education; *Outcomes of Education; Preschool Education; Program Effectiveness; Theory Practice Relationship
ABSTRACT
While elementary and secondary educators are pursuing a results orientation as a means of educational reform and pedagogical improvement, the early childhood field has not explored sufficiently the desirability and feasibility of, nor the process for, establishing child-based standards and results for younger children. This report presents a synthesis of issues discussed at two issues forums held in 1995 and 1996. Following an introductory chapter, chapter 2 distinguishes between types of results and purposes of results as they apply to young children. Chapter 3 examines the desirability of child-based results in early childhood education in terms of the impact on teachers' practice and children's experiences, public understanding, funding, and the relationship of early care and education to other services. Chapter 4 identifies the following five conditions necessary for child-based results to be feasible: broad participation in the identification of results; identification of appropriate results; clarity concerning which children to include; appropriate measurement of results; and linking of child-based results to efforts to improve the lives of children. Chapter 5 outlines steps to advance a results-based approach: increase public consciousness and participation; plan strategically; identify and choose results carefully; develop appropriate, cost-effective approaches to assessment and data collection; put theory into practice; explore ways to adequately fund a results approach; and communicate, implement, and evaluate such programs. Five appendices contain the meeting agendas of each of the issues forums and three papers presented at the meetings. (TJQ)
Reproductions supplied by EDRS are the best that can be made from the original document.
U.S. DEPARTMENT OF EDUCATION
Office of Educational Research and Improvement
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
This document has been reproduced as received from the person or organization originating it. Minor changes have been made to improve reproduction quality. Points of view or opinions stated in this document do not necessarily represent official OERI position or policy.
Considering Child-Based Results for Young Children
Definitions, Desirability, Feasibility, and Next Steps
Based on Issues Forums on Child-Based Results
sponsored by
The W.K. Kellogg Foundation,
The Carnegie Corporation of New York, and
Quality 2000: Advancing Early Care and Education
Edited by
Sharon Lynn Kagan,
Sharon Rosenkoetter, and
Nancy Cohen
©1997
Yale Bush Center in Child Development and Social Policy
310 Prospect Street
New Haven, CT 06511
Contents
1. INTRODUCTION
Background
Rationale for and Goals of the Issues Forums
2. DEFINITIONS
Defining Different Types of Results
Defining Different Purposes for Assessing Results
Issues Concerning Definitions of Types and Purposes of Results
3. DESIRABILITY
The Impact on Teachers' Practice and Children's Experiences
The Impact on Public Understanding
The Impact on Funding
The Impact on the Relationship of Early Care and Education and Other Services
So...What Do We Conclude About the Desirability of Moving to a Child-Based Results Approach?
4. FEASIBILITY
Only if There Is Broad Participation in the Identification of Results
Only if We Can Identify Appropriate Results
Only if We Are Clear About Which Children to Include
Only if We Measure Results Appropriately
Only if We Link Child-Based Results to Efforts to Improve the Lives of Children
5. SUGGESTED NEXT STEPS
Increase Public Consciousness and Participation
Plan Strategically
Identify and Choose Results Carefully
Develop Appropriate, Cost-Effective Approaches to Assessment and Data Collection
Put Theory Into Practice
Explore Ways to Fund a Results Approach Adequately
Communicate, Implement, and Evaluate
6. REFERENCES
7. APPENDICES
A. June 1-2, 1995 Meeting Agenda and Participants
B. January 24, 1996 Meeting Agenda and Participants
C. Considerations Regarding an Outcomes-Based Orientation, Sheldon H. White
D. Judging Interventions by Their Results, Lisbeth B. Schorr
E. Can We Measure the Outcomes?, John M. Love
Chapter 1. Introduction
While elementary and secondary educators are pursuing results as a means of educational reform and pedagogical improvement, the early childhood field has not explored sufficiently the desirability, feasibility, or process of establishing child-based standards and results for younger children. This report presents a synthesis of issues discussed at two Issues Forums. The first, on Child-Based Results, was held on June 1-2, 1995 in New York City and was attended by 37 scholars and practitioners in early care and education, school reform, and policy development related to children and families (see Appendix A). The second forum, held on January 24, 1996, addressed Next Steps in Advancing Child-Based Results; it consisted of 19 participants with expertise in results efforts (see Appendix B). The Forums were a collaborative effort of the W.K. Kellogg Foundation, Carnegie Corporation of New York, and Quality 2000: Advancing Early Care and Education.
Background
Current challenges facing the development and implementation of child-based results for young children are framed by two factors, each discussed below. The first is the current socio-political context; the second derives from grave concerns regarding the challenges and misuse of results in the past.
THE PRESENT CONTEXT
Increasing dissatisfaction with America's schools and the performance of its graduates has fostered widespread calls for educational reform. Emanating from both the public and private sectors, dissatisfaction with American education is compounded by growing concerns about the rising costs of a system perceived to be inefficient as well as ineffective.
To begin to rectify these educational ills, several movements have taken hold, including school-based management, charter schools, standards specification, and results-driven accountability. The results movement gained considerable public attention through the Goals 2000: Educate America Act and the Elementary and Secondary Education Act, among others. The press for greater accountability has spurred federal and state action, with new standards and results groups being formed to address educational challenges. In some states (Vermont, Minnesota, and Oregon) the results orientation transcends education and extends across the array of human services.
As support for a results orientation in education and human services increases, so has attention to young children and their families. New policy initiatives focus on very young children (e.g., Early Head Start and Healthy Start), preschoolers (e.g., National Governors' Association Action Teams on School Readiness), and on both (e.g., Kids Count, Child Care and Development Fund, and the family support movement). Public schools increasingly support prekindergarten services, while programs such as Head Start and Parents as Teachers receive media consideration and greater funding.
Despite dramatic growth in early care and education and unprecedented calls by the National Education Goals Panel and others to delineate optimal results and chronicle children's progress toward them throughout the nation, the field of early care and education has responded with less than vigorous support. Indeed, calls for a focus on child-based results have met with staunch vocal resistance, as well as more silent pleas for a redirection of effort.
THE PAST CONTEXT
The reluctance of the early care and education field to embrace a results orientation has deep roots, including legitimate concerns regarding test misuse, technical concerns regarding measurement, a historic focus on process, and a lack of agreement regarding what is meant by results.
Evidence of Misuse of Test Data
Early childhood education professionals have criticized the use of tests to mis-label, mis-categorize, and stigmatize children during their earliest days in formal education (Meisels, 1988; Shepard & Smith, 1987), and they have questioned the validity of standardized tests for individual children, especially boys, racial minorities, and preschool children whose primary language is not English. Of great concern to the field of early care and education has been the widespread use of "readiness" instruments to screen children for entry to school. This practice has resulted in up to 50 percent of children in some districts delaying school entry or being sent to alternative "transition" classes of unsubstantiated value (Gnezda & Bolig, 1989; Graue, 1993). Given these experiences, there is well-grounded skepticism in the early childhood education community about the potential use and misuse of results.
Concerns About Measurement
Concerns about measurement manifest themselves in two domains: what is measured and how it is measured. In a paper commissioned for the first Forum, White considers some of these concerns (see Appendix C). Regarding what is measured, early educators worry that child results may be narrowly constructed to include only cognitive and pre-academic results, ignoring developmental domains that are crucial to children's success but more difficult to capture in routinized assessment (e.g., socio-emotional development and approaches toward learning). Regarding the measurement of results (the how), some early educators and developmentalists doubt the feasibility of instituting a results approach with young children due to the variability of their behavior and their inexperience in "performing" in testing situations. Because young children's learning is highly episodic, early educators voice concerns regarding the capacity of instruments administered to children on one occasion to capture developmental nuances accurately. Further, they worry that assessments may not give racially, ethnically, and linguistically diverse young children appropriate opportunities to display their skills and knowledge. Finally, they challenge the reliability and validity of existing assessment tools, questioning whether such instruments can be suitably altered, or new instruments created, to diminish these concerns.
Focus on Process
Early education historically has emphasized the process of young children's individual learning. Early educators are trained to recognize and work with children's uneven growth as well as the diversity in family values, experiences, and interaction styles that shape early development. Reflecting this individualistic orientation, historic attempts to codify and improve practice in the early childhood education field have focused on the modification of inputs, structural variables, and process variables that enable such individualization: adult-child ratios, group size, and interaction patterns. Routinely, such inputs have been equated with quality, with little call for an examination of the need for a results orientation. There is little press for a movement toward child-based accountability by the early childhood education field, particularly when child-based results could be used to influence critical program funding and policy decisions.
Lack of Clarity of Terms
Finally, discourse on child results to date has been hampered by imprecision in language and scope.
There is limited consensus in early care and education regarding what is meant by terms in common use: goals, benchmarks, results, inputs, indicators, interim indicators, assessment, and testing. Attempts to achieve definitional clarity have been overshadowed by the field's dual concern with the delivery of direct services to children and their families, on the one hand, and with building the infrastructure, on the other. In short, definitional ambiguity is pervasive.
Rationale for and Goals of the Issues Forums
Given this context, and the likelihood of ongoing public pressure for results for young children, it was deemed appropriate to engage scholars, practitioners, and policymakers in a professional conversation. Organizers of the Forums wished to provide an opportunity to take stock of the current status of child-based results for children birth to age eight, and to give voice to the early childhood education community regarding its issues and concerns.
In particular, it seemed important to: (1) clarify definitional distinctions; (2) discern the desirability of moving to a results orientation; (3) determine the feasibility of moving toward a results orientation for young children; and (4) consider next steps regarding a results-based orientation. In contrast, the aim of the Forums was not to consider or define the specific content of results that might be deemed appropriate for young children; that work has been started by others (Love, Aber, & Brooks-Gunn, 1994; Phillips & Love, 1994; Institute for Research on Poverty, 1995). Nor, given the complexity of the issues and the differences of opinion that exist, was it the intent of the Forums to achieve consensus on relevant issues; rather, this was an opportunity for honest reflection and thoughtful debate.
Reflecting these goals and intents, this report is structured around three themes (definitions, desirability, and feasibility), with possible next steps suggested by the participants at the end. The document represents a synthesis of the issues discussed as well as those documented in the literature. It is not intended to represent the consensus of participants, because no such consensus was achieved. Rather, it is intended to reflect the complexity of the issues and the challenges associated with moving toward child-based results for young children. Editors of the document (Sharon L. Kagan, Sharon Rosenkoetter, and Nancy Cohen) have attempted to represent the ideas with fidelity; they alone, however, are responsible for errors. Appendices follow, including meeting agendas, lists of participants, and working papers prepared for the first Forum by Sheldon White, Lisbeth Schorr, and John Love.
Chapter 2. Definitions
This section details two distinct, but related, frameworks that shape the discussion that follows. The first distinguishes among different types of results; the second distinguishes among different purposes of results. A third part of this section discusses issues related to both the types and purposes of results. It should be noted that the definitions offered are not the only way of distinguishing among the types or purposes of results (Bruner, Bell, Brindis, Chang, & Scarbrough, 1993; Young, Gardner, & Coley, 1993; Schorr, 1994); they simply represent one heuristic. Important to note, however, is the pervasive lack of consensus around definitions as well as the need to ground this (and other) discussions of results in a definitional framework. While specification of the various types and purposes of results is critical for clear communication in this discussion, the deliberations of the Forums were designed to focus on Type One Results (what children know and can do, hereafter referred to as "child-based results") and on Purpose Four (accountability).
Defining Different Types of Results
Four types of results have been identified. Each type is discernable and knowable, each demands its own data elements and approaches to collection, and each evokes its own assessment processes and considerations. Each type, though independent and distinct, can be used in concert with others, contingent upon the purposes of the data collection. Together, the types of results form a continuum, with items representing children's performance and behavior at one end and results related to systemic performance at the other. In concert, the four types represent a comprehensive overview of the kinds of information being considered by agencies, localities, and states as they move to a results orientation (Kagan, 1995). Data on what children know and can do can be considered primary results, while secondary results include the contexts in which children develop, such as child and family conditions, service provision and access, and systems capacity.
TYPE ONE RESULTS: WHAT CHILDREN KNOW AND CAN DO
This type of information focuses directly on children's behaviors: what children know and can do. It is synonymous with the term "child-based results," the focus of this document. This type of information must be gathered by observing children directly. It accepts no proxies for behavior, but is a precise and accurate description of children's performance. For young children this includes dimensions related to their motor development, social and emotional development, use of language, cognition and general knowledge, and approaches to learning. To gather this type of information, child behavior is typically recorded intermittently, from more than one data source.
Examples of Type One Results include:
Motor development: Prevalence of children who jump; walk a six-foot balance beam; cut; do an "x"-piece puzzle.

Social and emotional development: Prevalence of children who accept responsibility for their own actions; take turns; form and maintain friendships.

Language usage: Prevalence of children who initiate and sustain conversation; listen to others; recite poems and do fingerplays; repeat a sentence in correct word order; follow a verbal direction containing three steps; tell about a picture when looking at it; name common objects.

Cognition and general knowledge: Prevalence of children who match; sort shapes and colors; identify largest and smallest; demonstrate awareness of cause and effect.

Approaches toward learning: Prevalence of children who take risks; persevere in a chosen activity; demonstrate curiosity; use materials in inventive ways.
TYPE TWO RESULTS: CHILD AND FAMILY CONDITIONS
This type of results focuses on the conditions that surround and encase what children know and can do. Such information may be gathered from reviews of documents, including health records; interviews with family members and service providers; and direct observations of and conversations with children and their families. This type of results assumes that what children know and can do is directly related to their own health status and to the conditions in which they live. Rather than reporting data on individual children, this type of data is generally reported as aggregated prevalences and percentages. Child and family results may be grouped into categories (e.g., child health conditions; family income conditions) with positive and negative indicators in each.
Examples of Type Two Results include:
Child health conditions: Prevalence of children who are born with low birth weights; are fully immunized; have functional limitations due to health conditions; have age-appropriate heights and weights; are in good physical health, with no vision or hearing impairments.

Family income conditions: Prevalence of children who live in poverty; live with two parents or one parent employed.

Family life conditions: Prevalence of children who are born to teen mothers or substance-abusing parents; are abused; live in foster care; have their TV viewing regulated; live in two-parent families; live in low-crime neighborhoods.
TYPE THREE RESULTS: SERVICE PROVISION AND ACCESS
Type Three Results are those that describe the services to which children and families have access. Distinct from behaviors (Type One) or conditions (Type Two), this type focuses on the service provision and access to services that children and their families experience. More than a tally of raw services, this type of results chronicles the real availability of services to all ethnic, racial, and linguistic groups, with data typically reported in prevalences or percentages. Often Type Three Results include information about services by population sub-sets or individuals with particular conditions (e.g., disabilities, pregnancy, employed status). Data for these results are typically collected from record reviews and community and institutional databases.
Examples of Type Three Results include:
Health provision/access: Prevalence of pregnant women who have access to early and continuing prenatal care. Prevalence of children with increased access to prenatal care; with health insurance; who have access to regular vision and hearing screening, to medical care, and to well-child examinations.

Parenting education provision/access: Prevalence of parents who have access to parenting classes and social supports.

Child care/preschool provision/access: Prevalence of low-income (or Limited English Proficiency [LEP] or disabled) children who have access to child care. Prevalence of children who have access to developmentally appropriate child care and education; who have access to before- and after-school care.
TYPE FOUR RESULTS: SYSTEMS CAPACITY
Rather than focusing on the provision of and access to discrete services, as indicated in Type Three Results, Type Four Results accord attention to the way services are linked and function as a system. Type Four Results assume that systemic capacity, efficiency, and integration are related to access and service quality which, in turn, are directly related to children's performance. Far less well developed than the other types, Type Four Results examine service redundancies, omissions, capacities, and efficiencies. Data for this type are collected in the aggregate and typically involve the amalgamation of information across agencies and service providers.
Examples of Type Four Results include:
Systemic efficiency: The degree to which the system uses its resources (e.g., fiscal, human, technical, and technological) efficiently and effectively.

Systemic infrastructure: The degree to which the infrastructure (e.g., training, financing, data gathering) supports efficient and effective service delivery.

Systemic accountability: The degree to which accountability is dispersed across systems; the degree to which agencies build collective accountability.

Systemic cultural sensitivity: The degree to which the system is sensitive and responsive to the needs of ethnically, racially, and linguistically diverse children and families.
Defining Different Purposes for Assessing Results
Different types of results exist because they are needed for different purposes. In some cases, for example, results information is needed by direct service providers for the purpose of enhancing the accuracy and quality of their work; in other instances, results data are needed for the purpose of demonstrating program efficacy; and in still other cases, results data are used to meet the information demands of policymakers and the public at large. Because these demands often exist simultaneously and because no comprehensive data collection strategy has been delineated for young children, the purposes of amassing results data can and do become blurred. To address this confounding of purposes, various experts have proffered different schema (Bruner, Bell, Brindis, Chang, & Scarbrough, 1993; Shepard, 1995). All helpful, these schema have informed the following categorization of purposes for collecting results information. Each of the four purposes is addressed with respect to Type One Results information.
PURPOSE ONE: SCREENING AND EVALUATION
Information on Type One Results can be used for the purpose of locating children with specified characteristics, describing their current level of functioning, and determining their eligibility for intervention services. Typically, large numbers of children are quickly assessed to locate those few who might evidence a certain condition. The few, then, receive more thorough evaluation to learn whether some type of intervention is warranted. Formal and informal observations, checklists, tests, and parent interviews are commonly used measurement approaches. Assessment for screening and evaluation may take place once or repeatedly.

Examples of assessments for this purpose include screening to discern the need for medical intervention (e.g., prescriptions, eyeglasses) or for special education services. In the latter case, the behavior of a single child is studied and compared with the average behavior of other children of the same age and characteristics (e.g., gender, geographic location). Large-scale assessments for this purpose include the Early Periodic Screening, Diagnosis, and Treatment (EPSDT) program, which funds the identification and treatment of health and developmental problems among Medicaid-eligible children, and Child Find, a process to locate children who may be eligible for and benefit from special services, including those covered under the Individuals with Disabilities Education Act (IDEA).
PURPOSE TWO: IMPROVEMENT OF INSTRUCTION
Information on Type One Results can also be used for the purpose of providing feedback to teachers on the instructional process, with the intention of improving pedagogy, aiding in program planning, and creating learning experiences more appropriate to the needs of individual children. For this purpose, teachers may note the behavior of one child or a small group of children, using informal observations, checklists, anecdotal logs, or portfolios. Typically, such assessments occur on an on-going basis, and teachers need training to develop and hone their observation and assessment skills.
Examples of such assessments include teacher observation of children's level of small motor development in order to plan appropriate activities to foster such development, or teacher observation of children's peer preferences and levels of play in order to arrange appropriate groupings of children.
PURPOSE THREE: PROGRAM EVALUATION
Assessment of children's performance and behavior for Purpose Three is undertaken to gauge the impact of a specific program or a particular intervention. The resulting data are likely to be used to guide future program design and funding decisions. For this purpose, the performance of groups of children is of interest, though children are likely to be assessed individually. Typically, such work is carried out by researchers, and data are reported about specific programs and interventions.
Examples of assessment for this purpose include the evaluation of the Parents as Teachers home visiting program or Kentucky's multi-age primary program.
PURPOSE FOUR: ACCOUNTABILITY
Children's knowledge and skills can also be measured for the purpose of informing the public about the collective status of children. For this purpose, the performance of children in classrooms, schools, districts, communities, states, and the nation is of interest; typically, progress is charted over time. For this purpose, groups of children are the unit for study, but not all children within a classroom or service unit will necessarily be assessed; samples of the group may be assessed. Assessment must be relatively time-efficient and the resulting data comparable and capable of aggregation.
Examples of results established for this purpose include parts of the Oregon Benchmarks, a series of results set by the public that service providers, localities, and the state try to achieve and for which they are held accountable. One of these benchmarks is the percentage of children entering kindergarten who meet specific developmental standards for their age; communities are publicly challenged to improve the percentage of children achieving this result. In another example, Kentucky uses child results data to consider its allocations of state education funds. Assessment for accountability purposes usually has high stakes. Information about the findings tends to be broadly disseminated and used for decision-making. The highest stakes occur when recognition, funding, or other resources are directly tied to the reports of child performance.
Issues Concerning Definitions of Types and Purposes of Results
Although the definitions offered do render greater precision for the discussion, they also raise several key questions: What is the difference between inputs and results? Are all four types really results? And what special issues arise when applying these constructs to very young children?
Across service spheres (e.g., education, health, social welfare) debate lingers regarding what constitutes an input or a result, and what distinguishes results from accomplishments. More than a semantic debate, the notion of what constitutes results warrants examination. Under many conditions, particularly in the education domain when speaking about students, "results" refer to what children know and can do, with inputs being the supports, materials, curriculum, pedagogy, and instruction that combine to foster the results. Alternatively, however, Type Two Results (child and family conditions), while a means to student results, also constitute results in their own right. In short, there may be a chain of results (Types Two, Three, or Four) that lead to the ultimate goal of enhanced student performance (Type One). Some designate the results that lead to student results as interim results (Schorr, 1993); some consider them social indicators.
However thorny for children of any age, the questions of what constitutes inputs and results, and how to assess them, are particularly challenging regarding young children. Assessing young children's results is complex because of their episodic development, their dependence on adults and society for supports, and the disjuncture between the manner in which young children demonstrate competence (action and interaction) and more conventional approaches to measurement. As such, ethical questions emerge regarding the legitimacy of basing child results only on what young children know and are able to demonstrate. It is often argued that results for young children must be predicated on multiple types of data; Type One data alone are deemed too narrow an indication of child results. Information from all types, but most particularly Types Two and Three, must be included in an assessment strategy that takes full account of the age and expected abilities of youngsters. That is to say, when considering young children, conceptions of results evidence may need to be broadened to include what, for other age groups, may be considered inputs.
Chapter 3. Desirability
Service providers in early care and education hope to affect results for children: to prevent negative results from occurring and to promote positive results. Early childhood educators are quite used to on-going and informal assessment of young children. So while it might seem that there would be great receptivity toward child-based results in early care and education, this has not always been the case, because of the historical antipathy discussed in the introduction, the lack of training of many early care and education workers, and the lack of infrastructure in the early care and education system to collect results data. In addition, today's discussion of child-based results imposes new levels of rigor, specification, and accountability. Movement to a more formal and systematic child-based results approach would focus sustained public attention on the degree of attainment of specified results. Questions likely to be asked include: How effective are teachers? Curricula? Early care and education programs? Schools? School districts? The amalgamation of services in communities, states, the nation?
Early care and education experts have divergent viewpoints on child-based results. One perspective is that it is unjust to predicate support for early care and education on the basis of results. Like education K-12, early care and education should be considered a moral imperative in a democratic society. This perspective also argues that the potential dangers of a child-based results approach outweigh the possible benefits. This perspective is articulated most emphatically under the following conditions: (a) the younger the children in question; (b) when using results data moves beyond the classroom; (c) when using results data to assess racially, ethnically, and linguistically diverse young children; and (d) when using results data to make high-stakes decisions
about pay, program reimbursement, or other resource allocations. In this view, it may be desirable to assess children's results, particularly for children ages three through eight, for the purposes of screening and evaluating children (Purpose One), improving classroom instruction (Purpose Two), and perhaps for curriculum or program evaluation (Purpose Three). From this perspective, however, there are too many risks and not enough benefits to assess results for younger children and to aggregate data to use for accountability purposes in monitoring or decision-making in the community, state, or nation (Purpose Four).
Another perspective on results for young children is that the challenges of results definition and assessment, even for accountability (Purpose Four), are addressable. This perspective regards the benefits of a results orientation as so attractive as to advocate immediate investments in constructing child-based results and systems for data collection for even very young children. The need for community, state, and national data to guide policy and practice, and even to allocate resources, is emphasized. This perspective argues that even if the early childhood education field delays, states and localities are preparing to assess child results, perhaps without needed advice from persons trained in child development and early education.
Amplifying these arguments, this section categorizes issues and then enumerates possible disadvantages and potential advantages of shifting to a child-based results approach for each. In most cases, specified disadvantages and advantages correspond to one or more of the indicated purposes. This section conveys the tensions surrounding child-based results, while later sections relate ideas for resolving them. In a paper commissioned for this Forum, Schorr reviews potential
benefits of a results-based approach, possible pitfalls, and promising strategies for implementation (see Appendix D).
The Impact on Teachers' Practice and Children's Experiences
What effect, if any, would child-based results have on daily practices affecting children and their teachers? Those who oppose a results-based approach feel that it would direct teachers to teach to the test, thereby limiting their creativity and the spontaneity and flexibility of early education. Advocates for a results-based orientation believe that it would help to guide practice, making it more purposeful and goal driven. These positions are elaborated more fully below.
Potential advantages for teachers' practices and children's experiences:
Teachers would have more information about children's learning and individual differences, allowing teacher practices to address children's individual needs and backgrounds, and expanding practices to address multiple areas of child development rather than just cognitive skills and knowledge (Purposes One, Two, and Four)
Teachers would develop and increase effective instructional approaches, curricula, and services if they know more about what works, have the flexibility to design approaches rather than being required to use prescribed methods, and have precise goals toward which to teach (Purposes Two, Three, and Four)
Teachers would use information on child development to communicate more effectively with family members, to help them support their children's learning (Purposes Two and Four)
Teachers would have increased expectations for all children, particularly ethnically, racially, and linguistically diverse young children, if all children are expected to achieve the same high results (Purpose Four)
Possible disadvantages for teachers' practices and children's experiences:
Teacher practices would become less effective if the results are misleading and do not capture the nature, complexity, and individuality of children's development, particularly for ethnically, racially, and linguistically diverse young children (Purposes Two and Four)
Teacher practices would become more uniform, in the attempt to achieve uniform results; homogeneous services could not meet the unique needs of individual children (Purposes Two and Four)
Communication with family members would become less useful, emphasizing performance on test items rather than overall child development (Purposes One, Two, and Four)
Children likely to test poorly and lower school averages would be retained or their entry to school delayed, particularly minority children and children with special needs or limited English proficiency; as a result, children could be labeled or stigmatized (Purposes One, Two, and Four)
The Impact on Public Understanding
The desirability of a results approach also depends on how the data are interpreted by families and the public. If the results are narrow, trivial, abstract, or culturally bound, then a results approach might hinder public understanding. If the results are meaningful to families and the public, and if they are expressed in everyday language, then a results approach might serve to instruct the broader society about child development.
Potential advantages for public understanding of young children's development:
Public knowledge and general understanding of healthy child development and of developmental problems would increase if results are meaningful and valuable to parents and the public (Purpose Four)
The public would come to understand more about the multi-dimensional nature of child development; understanding of the relationships among children, families, services, and systems of the early years would increase (Purpose Four)
Appropriate assessments would lead to the development of realistic expectations for the development of children and the performance of the programs in which they participate (Purpose Four)
Possible disadvantages for public understanding of young children's development:
The public would draw invalid conclusions about children's abilities from results that are inappropriate for all or some young children. For example, results that focus on cognitive skills minimize the importance of other dimensions of early learning. Another example would be an assessment system that does not reflect how the performance of low-income children may have improved over time (Purpose Four)
The public would think that child development is simpler and more uniform than it is, because it is impossible to capture the complexity and individuality of development in specific results (Purpose Four)
The public would be confused about what helps and hinders learning, if data about families, communities, services, and systems are not collected and presented as the context for child results (Purpose Four)
The public would place far more pressure on young children by developing false, unrealistic expectations for performance (Purpose Four)
The Impact on Funding
Current discussion about the desirability of adopting a results orientation in early care and education occurs within the framework of the devolution of government responsibility, program consolidation, budget cuts, and heightened competition for funds. Those who question a results-based approach are concerned that it will lead to a reduction in funding for services for children and families in general, for early care and education, and for low-income and minority children who may not perform well on tests. Advocates for a results-based approach feel that having dependable data is the only hope for maintaining, and possibly increasing, funding levels and services.
Potential advantages for funding:
Policymakers or government administrators would use results-based data to reallocate funds from marginal and ineffective early care and education programs to effective programs (Purpose Four)
Program administrators would use results-based data to expand more effective programs and improve or eliminate less effective ones (Purpose Four)
Investment would increase in services to low-income and otherwise needy children, in prevention efforts (rather than remediation), and in infrastructure, particularly if the cost savings of this additional funding are documented (Purpose Four)
Possible disadvantages for funding:
Effective early care and education programs would have their funding cut if the chosen results are insignificant, narrow, or insensitive to family and community differences (Purpose Four)
All early care and education programs would have funding cut if results are not met (whether or not these programs are at fault) and if the public loses hope or becomes cynical (Purpose Four)
There would be fewer resources to provide early care and education services to children if funds are diverted to setting, assessing, interpreting, and communicating results (Purpose Four)
What limited infrastructural support currently exists would be threatened because of the desire to keep resources close to the children so that results will improve (Purpose Four)
The Impact on the Relationship of Early Care and Education and Other Services
As society fails to solve the complex human problems in today's communities, there is a trend toward services integration and an increasing acknowledgement that the comprehensive needs of families cannot be met with narrow, categorical services. Those who question taking a results orientation feel that it might foster competition among all social services and economic development; fragmentation among services will grow as competition increases. Proponents of a results orientation believe that it will provide the vehicle for varied social service and economic development efforts to work together toward improving the lives of children and families.
Potential advantages for the relationship of early care and education with other services:
Agencies and programs across service areas would cooperate, collaborate, and possibly combine funds to achieve positive results; planning across social services would be facilitated by results data (Purpose Four)
The gap between early care and education and elementary education would be bridged with the common focus on results, as might the rift between services for children with special needs and education in general (Purpose Four)
Possible disadvantages for the relationship of early care and education to other services:
Early care and education programs would lose resources and attention if other services have better results (Purpose Four)
Ill will and fragmentation among services would increase and collaboration decrease if some services have better results (Purpose Four)
The various service sectors would blame each other for poor results if attribution of results is not demonstrated clearly (Purpose Four)
So ... What Do We Conclude About The Desirability of Moving to a Child-Based Results Approach?
Returning to the possible purposes of a child-based results orientation, there is some consensus in the early childhood field that gathering information for preschool- and primary-aged children for screening and evaluation (Purpose One) and for improvement of instruction (Purpose Two) is advantageous for children and families, provided that the results chosen are both significant and appropriately measured. Many early childhood experts also favor the use of child results for program evaluation (Purpose Three). There is less agreement regarding movement to child-based results for these purposes for children below the preschool years, notably infants and toddlers.
While some first Forum participants strongly supported moving to child-based results for accountability purposes, there was little agreement about the wisdom of doing so for all young children, birth to age eight. Even among those who support such movement, there is greater support for assessing child-based results for monitoring and planning than for using results to allocate resources.
To implement a child-based results approach that avoids the possible disadvantages described above and achieves the advantages enumerated, planners must meet certain necessary conditions.
Implementation of a results orientation is desirable only if critical safeguards are in place during the process of results identification, while planning the assessment, throughout the data collection, and as findings are interpreted and communicated. These "only ifs", or necessary conditions, are discussed in the section that follows.
Chapter 4. Feasibility
Despite disagreement on the desirability of moving to a child-based results approach for accountability purposes, many experts believe that doing so now is more feasible than at any other time in our history. They indicate that certain fields, such as early childhood special education, have focused on child results for more than 20 years and have both good-practice and bad-practice examples to inform the discussion. Further, current research is improving the options for gathering information that is both valid at the time of measurement and meaningful across the developmental course. In papers commissioned for this Forum, White explores some of the limits of existing standardized tests for young children (see Appendix C) and Love presents arguments for the feasibility of a child results approach (see Appendix E). Despite these advances, all are concerned that any shift to a results orientation be made in ways that are developmentally, culturally, and contextually appropriate. Care must also be exercised in the process of developing and interpreting results. Inherently precarious and high stakes, the effort to develop and implement a results approach can only be undertaken under certain conditions. Moving to child-based results is feasible "only if" the following five conditions are met:
"Only If" There Is Broad ParticipationIn The Identification of Results
INCLUDE MANY STAKEHOLDERS IN RESULTS DEVELOPMENT
Results that will be useful to the nation need to be agreed upon by a broad constituency, including parents, policymakers, practitioners, and researchers. Politicians, government administrators, business leaders, and citizens have meaningful contributions to make in the development of results, as do individuals from diverse ethnic, racial, and linguistic backgrounds. The process of building consensus by selecting certain child results that are significant to broad audiences is crucial. Moreover, in developing results, it is important to remember that the input of lay citizens can help assure that results are meaningful and locally appropriate.
WORK WITH PRACTITIONERS AND PARENTS IN PARTICULAR
Given the discomfort of many in the early childhood education community (personnel along with parents) regarding a shift to a results orientation, it is important that conversations regarding child-based results involve practitioners and parents. Personnel in early care and education need to converse with others in the education and human service fields who are already using a results approach, such as early childhood special educators, elementary school teachers, and health care providers, to gather suggestions from their experiences. Parents need to understand clearly the implications of moving to a child-based results orientation. More significantly, both groups know children well; their ideas are critical to constructing effective, appropriate results.
"Only If" We Can IdentifyAppropriate Results
CHOOSE RESULTS THAT ARE SIGNIFICANT FOR CHILD DEVELOPMENT AND THE LIFE COURSE
As a child-based results approach evolves, there may be a tendency to choose results that can be easily measured or to select "quick and dirty" indicators for which measures already exist. In lieu of these sometimes flawed approaches, planners need to consider what results they deem fundamentally important, with major findings from child development influencing the content of the results. Good results have merit because they are important in and of themselves, and because they are linked to longer term life goals. The discussion of life-course results is also valuable because it can unite Americans across racial and ethnic lines; most people want the same major life-course results for their children. Thus, the essential task of results planners is to take the complex constructs that research has demonstrated to be important (e.g., identity formation, achievement motivation, task persistence, establishing peer relationships) and identify the life-course "edge" appropriate to the age group under consideration.
Life-course results might include being ready for school, being able to read, graduating from high school, attending college, holding a job, or avoiding teenage pregnancy and crime. Conceptualized in this way, such results will have utility for parents and policymakers. Moreover, they contextualize the content of the results in a more durable, long-term perspective that is salient across all populations. For these reasons, life-course results should be considered.
CHOOSE RESULTS THAT CROSS MULTIPLE DOMAINS OF CHILD DEVELOPMENT
Basing educational results on subject matter curriculum areas (e.g., science, math), while perhaps appropriate for older children, needs to be examined critically when considering younger children. Learning for young children is less oriented to subject matter facts than to the fostering of basic developmental competence. To that end, the Goal 1 Technical Planning Group of the National Education Goals Panel, building on decades of work by scientists and practitioners, has identified five dimensions of early learning that provide a developmental, rather than a curricular, framework. The dimensions are: physical well-being and motor development; social and emotional development; approaches toward learning; language usage; and cognition and general knowledge (Kagan, Moore, & Bredekamp, 1995). Results for young children need to consider these dimensions, as well as a curricular orientation.
Beyond incorporating a broad-based developmental orientation, results for young children must take into account children's unique learning approaches. Young children, especially, do not learn in compartmentalized categories; they amass knowledge through integrated experiences. Consequently, results for young children must reflect integration across domains and across subject areas. Results for young children should not focus only, for example, on cognitive development, but must emphasize all domains. It is imperative that the domains be considered as a totality, with no "single domain acting as a proxy for the complex interconnectedness of early development and learning" (Kagan, Moore, & Bredekamp, 1995).
Use of multi-dimensional results will also minimize teaching aimed solely at producing high test scores. For example, it is easy to teach a child to label pictures of community helpers (single-dimension skill), but more challenging to teach children how to plan and play with others (multi-dimensional skill).
CHOOSE RESULTS TO WHICH DOLLAR VALUE CAN BE ATTACHED
Given the precarious funding of early care and education and given the persuasiveness of cost-saving data to policymakers, new results should be amenable to cost analysis. To date, in a very limited number of studies, the early care and education field has been particularly successful in demonstrating the cost-effectiveness of early intervention. Recognizing these conditions, new results data must be responsive to policymakers' thirst for additional fiscal information.
CHOOSE RESULTS THAT REFLECT THE REALITIES OF EXISTING COMMUNITIES AND PROGRAMS, YET CAN BE AGGREGATED
Real world circumstances must guide the adoption of results. If schools in a community teach only in English, then an appropriate result must be that children read, write, and speak English. On the other hand, if children in a community are allowed to demonstrate their competence in Spanish, Korean, or English, then the results process must reflect that fact. In communities in which neighborhood culture and school expectations differ, results might reflect children's ability to function successfully in both settings. Selected indicators must not gloss over the differences in "readiness for kindergarten" in the myriad of communities across America, nor can local values be ignored. Nevertheless, if results are to be used to monitor local, state, and national trends, a core group of results must be drawn to allow comparisons among programs, communities, and states.
DETERMINE WHETHER RESULTS SHOULD BE COMPLEX OR SIMPLE
One perspective is that results must reflect the complexity of development; indicators stripped of developmental "richness" are not meaningful. Another view is that worthy child results can focus on a few crucial elements; these would be stage-salient tasks with considerable social validity, such as trust behaviors in infancy or reading at the end of first grade.
The issues of the complexity and "developmental embeddedness" of selected results, along with the intricacies of the assessment methods adopted to measure them, determine the costs of these approaches. One perspective underscores pragmatism in a time of tight budgets; namely, early care and education should move forward to develop quick, low-cost approaches to results assessment. Another view is that simple measures of complex developmental phenomena are not presently possible. Given that fact, the field should honestly explain its position and inform policymakers that adequate time and money must be provided before results assessment can go forward.
In either case, complex or simple, the results should be designed to "tell a story" that increases the public's and policymakers' understanding of child development.
"Only If" We Are Clear About WhichChildren To Include
DETERMINE HOW TO INCLUDE CHILDREN WITH LIMITED ENGLISH PROFICIENCY AND SPECIAL NEEDS
To what extent should children with limited English proficiency be included in results assessments? A predominant view is that all children should be included in such a system. Otherwise, the system would not be national; it would not be just. However, with children whose home language is other than English, and particularly in instances where multiple languages are represented in a classroom or community, assessment technicalities and practical issues exist.
A similar issue arises with regard to young children with special needs. In several states, youngsters with special needs have been "overlooked" in designing results assessment systems, while other states have included them. Recommendations from national special education experts (National Center on Educational Results, 1993), and pioneering results efforts in several states, suggest that most children with disabilities should participate in results evaluations.
"Only If"We Measure ResultsAppropriately
LOOK AT CHANGES FROM BASELINE BEHAVIOR OVER TIME
Because young children's growth is highly episodic and variable, performance cannot be judged at a single point in time, but must be gauged from repeated observations; data must be collected at multiple points in time. Use of measures over time also reveals developmental progress in children whose exposure to early learning opportunities has been restricted and whose baseline and current performance are delayed for their chronological age. In monitoring local programs, states, and the nation, it is essential to realize that improvement in results must be regarded relative to children's starting points. This is particularly important when studying children at risk, who may be making great gains but still have below average results. Results assessments must take starting points into account and must look at change over time.
FOCUS ON FACE VALIDITY, CONSTRUCT VALIDITY, AND CONSEQUENTIAL VALIDITY
The paragraphs above have underscored the importance of results and results measures that "make sense" to parents, the public, and practitioners. This is the concept of face validity. Construct validity is also critical: Do the assessment approaches measure what they are intended to measure? Do they clarify performance consistent with the concept embodied in the results? Consequential validity is also very important: Are results used in valid ways (i.e., to make accurate explanations)? Is the interpretation given consistent with the actual findings?
THINK INVENTIVELY ABOUT ASSESSMENT
A great deal of innovative assessment is taking place nationally, and new results efforts should seek to incorporate this cutting-edge work. Among these efforts, play-based assessment, classroom observation schemes, authentic assessment, descriptive documentation, interviews with teachers and parents, portfolio approaches, and Vygotskian dialogues show promise for providing useful developmental information. While each of these approaches faces unique problems, they share the challenge of aggregating descriptive information in a concise form that can be used in decision making. As such, they provoke thinking about critical issues that need to be faced as results information is generated and made useable.
"Only If"We Link Child-BasedResults to Efforts to ImproveThe Lives of Children
LINK CHILD-BASED RESULTS WITH OTHER INDICATORS
If a purpose of results measurement is to help practitioners, parents, policymakers, and the general public understand the status of young children, and to improve this status, then it is essential that the conditions in which children are living and learning (Type Two) be assessed and related to the child-based results. Moreover, the services that children are receiving, or not receiving (Type Three), and some or all of the elements of the systems that support early care and education must be assessed and related to the child-based results.
Information about multiple types of results should be used to help the public understand the circumstances upon which child results depend. Some early childhood experts are more vocal than others in requiring this linkage in any results system that is developed. More data are already being aggregated for Types Two and Three than for child-based results (Type One), and, at the present time, there is little reporting of child results along with the other types of information.
Chapter 5. Suggested Next Steps
While significant progress has been made in delineating the issues surrounding the use of results in early care and education, the topic remains conceptually complex, practically challenging, and politically sensitive. The suggested next steps that follow build upon preliminary discussion from the first Forum and elaborated discussion at the second Forum. The next steps embody a number of recommendations for advancing a results-based approach, but they are intentionally suggestive, offering domains and strategies for action that need to be honed.
Increase Public Consciousness and Participation
CONSIDER TERMINOLOGY
Within discussions of results, similar terms often convey different meanings to various speakers; alternatively, different terms may convey the same concept. At the same time, particular terms may be politically charged in certain locations with specific audiences, but not in others. In short, there is no clear set of terms with which to conduct the debate. Since language is the vehicle for meaning, foundational concepts need to be clearly stated and mutually agreed upon early in the discussion of child-based results in early care and education.
BROADEN PARTICIPATION IN IDENTIFYING RESULTS AND BUILD CONSENSUS REGARDING THEM AT LOCAL, STATE, AND NATIONAL LEVELS
Worthwhile results that are broadly "owned" can result only from shared construction by parents, teachers, administrators, researchers, policymakers, and the public at large. Building consensus at the local, state, and national levels on desired results that are meaningful to varied audiences is critical to raising public consciousness and supporting an ongoing effort. An intentional focus on including traditionally disenfranchised groups in the process is essential. All consensus-building efforts need to be coordinated by groups considered to be non-biased and legitimate.
ENGAGE THE EARLY CHILDHOOD COMMUNITY
The early childhood community is correctly concerned about the development and implementation of results for very young children. The concerns of this community need to be fully understood and carefully addressed. Forums for early childhood practitioners and researchers need to be held; written materials need to be developed. Above all, time needs to be allowed for support and consensus to emerge.
Plan Strategically
LEARN FROM AND BUILD ON EFFORTS IN CHILD DEVELOPMENT AND ALLIED FIELDS
Given the focus on results creation, construction, and collection throughout the nation, it seems wise to assess and utilize, where appropriate, the data currently being collected for other efforts. In particular, a number of government and foundation projects have developed models for child-based educational results for younger and older children. Data gathering efforts for older children, including the National Assessment of Educational Progress, and for younger children, such as the Early Childhood Longitudinal Study, have struggled with many of the same issues explored in this paper that challenge efforts to move to a child-based results orientation. It would be especially informative to consider how they have, and have not, addressed the necessary conditions (only ifs).
Other sources of guidance are child-based results efforts in related fields, such as child welfare, adoption, foster care, early childhood special education, and child health. Attention to the successes and failures in results approaches across disciplines will be invaluable at the formative stage of efforts in early care and education. Additionally, others in early care and education have pondered these same issues. They include the Goal 1 Technical Planning subgroup and Head Start Research Committees. Efforts should be made to use the expertise and documents from these groups.
CONSIDER AND PREPARE FOR UNINTENDED CONSEQUENCES
Moving to a results orientation will yield some unintended consequences. To the extent possible, efforts to predict such consequences, particularly those that might be negative, are encouraged. In addition, once potential negative consequences are identified, strategies to deal with them need to be developed and implemented.
COORDINATE RESULTS-RELATED EFFORTS
Create a collaborative of organizations engaged in work on results related to early care and education to support and inform one another. Such a group could not only cross-fertilize existing work and minimize duplications, but could also create strategic plans delineating where additional technical, consensual, and political work is needed.
Identify and Choose Results Carefully
IDENTIFY RESULTS BY TYPE
Confusion exists regarding different types of results. To that end, different types of results should be discerned. For results related to children's behavior, a broad national consensus needs to be developed, with ample opportunity for states and locales to tailor their specific results and benchmarks. Such results should be strengths-based and should allow for the reflection of partial achievement. Efforts to specify Type One results might look to special education, which has been using similar results for many years. For results related to child and family conditions (Type Two), access and quality of services (Type Three), and systemic results (Type Four), model results could be developed at the national level for state adoption. All results should evolve and be subject to frequent change as knowledge, social conditions, and values are altered.
IDENTIFY LIFE-COURSE RESULTS
Define major life-course results for older children, and specify the antecedents in early childhood that symbolize important "real life" skills. Tying life-course results to younger children will help to elevate the importance of the early years.
IDENTIFY LINKS AMONG THE RESULTS TYPES
Because results are interrelated, pathways among and between individual results and results types need to be clarified. In particular, linkages need to be made between results related to children's behavior and knowledge (Type One) and the context in which children develop, as expressed in results related to child and family conditions (Type Two), service provisions (Type Three), and systemic integration (Type Four).
FOCUS ON POSITIVE RESULTS
Frequently, particularly regarding child and family conditions (Type Two results), there has been a tendency to chronicle negative results. Efforts should be made to identify and assess positive results (e.g., resilience, protective factors, child well-being) as well.
SELECT IMPORTANT RESULTS TO WHICH TEACHERS CAN AND SHOULD TEACH
The results chosen for young children must be truly important to their future development and achievement, not simply indicators that are easy to measure. When results are salient and when they are evaluated in contextually sensitive ways, teachers can conduct activities that promote the skills that assessments measure.
RESOLVE THE TENSION BETWEEN THE DESIRE OF COMMUNITIES AND STATES TO CUSTOMIZE RESULTS AND THE NEED TO AGGREGATE DATA
It is necessary to identify results that are meaningful to local communities and states to inform the process and assure local ownership of the results. It is also crucial to maintain some consistency across all data such that they are useful, interpretable, and comparable.
AIM HIGH, BUT BE REALISTIC
Child-based results must be selected that create high expectations for all children. Such well-designed results will guide service improvement and policy formation. Care must be taken, however, not to over-reach or idealize the conditions under which children live or the services they can reasonably be expected to receive.
Develop Appropriate, Cost-Effective Approaches to Assessment and Data Collection
DEVELOP MEASUREMENT TECHNIQUES FOR LARGE SAMPLES OF YOUNG CHILDREN
Efficient methods for gathering data from children in a variety of settings must be pioneered, building upon the experiences of multi-site early childhood education studies conducted within the past decade. As noted earlier, promising new techniques (e.g., play-based assessment, classroom checklists for anecdotal records, and Vygotskian dialogue methods) merit exploration because they reflect child development and the context in which it occurs. Narrow, multiple-choice psychometric measures are not acceptable for any of the purposes herein. Richer, more complex data show promise of being useful for multiple purposes. Worthy goals are to collect fewer data and make the best possible use of what is collected, and to give children from diverse backgrounds the opportunity to perform to their optimum capacity.
DEVELOP ACCEPTABLE APPROACHES TO INCLUDE CHILDREN WITH LIMITED ENGLISH PROFICIENCY AND THOSE WITH SPECIAL NEEDS
Several states and organizations have developed approaches to including as many individuals as possible in results assessment. To do this, results measures and assessments need to be translated into multiple languages, appropriate for multiple cultures, and accessible to children with special needs.
Put Theory Into Practice
PROVIDE TECHNICAL ASSISTANCE TO SUPPORT RESULTS DEVELOPMENT AND ASSESSMENT
States are moving rapidly to use child-based results. There is an urgent need to support state and local practitioners as they surge forward. In many cases, states will be pressed to implement a results system well before many of the issues can be addressed fully. To help states in the meantime, information across states should be chronicled and mechanisms established for information sharing among those developing results and related assessment systems.
PROVIDE TECHNICAL ASSISTANCE TO SUPPORT DATA GATHERING AND MEASUREMENT
Presently, many states are enhancing their data gathering and measurement capacities. Not only are states using approaches that differ from one another, but often administrative departments within states differ in their approaches to data gathering and measurement, preventing the aggregation of data germane to specified results. To that end, technical assistance should be provided to foster comparable measurement and assessment approaches across state administrative agencies and perhaps across states.
PILOT WELL-CONSTRUCTED MODELS
Any results system for early care and education must be grounded in accepted child development theory and validated assessment practices. For collecting results-based data for accountability purposes, it is crucial that innovative approaches from a variety of disciplines and audiences be considered. Before policies are drafted, approaches should be pilot-tested at several diverse sites to assure the efficacy of the approach. Expansion should proceed at a manageable pace.
CONTINUE THE DIALOGUE
Convene meetings once or twice a year to review results work being done by districts, states, and individual researchers; to identify exemplary efforts and find ways to share them widely; and to identify gaps in knowledge and plan ways to fill them. Future gatherings might continue to explore divergent viewpoints on difficult issues and spark inventive resolutions.
Explore Ways to Fund a Results Approach Adequately
DETERMINE NECESSARY FUNDING
The collection of some data may require no additional funds. For example, funding for data collection around special education placement is available, and funding for research on instructional approaches may be contained within the development and pilot-testing budgets for such projects. However, monitoring child results for accountability purposes is typically a project that needs specific funds. A necessary first step is to define the scope of the proposed project, noting the complexities and costs inherent in large-scale data collection.
EXPLORE POTENTIAL FUNDING SOURCES
Both start-up and ongoing funding need to be explored, with consideration given to "piggybacking" with other data collection wherever possible. Funding options to be considered include foundations, government agencies, and state agencies.
Communicate, Implement, and Evaluate
CONCENTRATE ON COMMUNICATING RESULTS BROADLY AND EFFECTIVELY
To have impact, results data on young children must be shared with parents, policymakers, business, and the media. Careful consideration must be given to the nature of the data shared and the process for sharing them. Not all data are in a form immediately suitable for use by multiple audiences. Care must be taken to assure effective communication to appropriate audiences.
CONTINUOUSLY EVALUATE AND IMPROVE RESULTS MEASURES AND APPROACHES
Initially, no process of results and assessment will be complete or ideal. Efforts and instruments need to be monitored and evaluated continually. Realistic timelines and comprehensive, sequenced plans should be developed so that results-based efforts are regularly re-examined.
Appendix A: June 1-2, 1995
Meeting Agenda and Participants
FIRST ISSUES FORUM ON CHILD-BASED RESULTS AGENDA

GOAL: To examine the desirability and scientific feasibility of moving toward a child-based results orientation for children birth to age eight.

June 1 and 2, 1995
Carnegie Corporation of New York
437 Madison Avenue, New York, NY
Day One: June 1, 1995
11:00 Buffet brunch available
Session I: Welcome and Overview
Sharon L. Kagan, Chair
12:00 Welcome/Introductions
Goals of the meeting
Background, Rationale, and Definitions
12:45 Considerations Regarding a Results-Based Orientation
Presentation by Lisbeth B. Schorr
1:05 Considerations Regarding a Results-Based Orientation
Presentation by Sheldon White
1:25 General Group Discussion
2:45 Break
Session II: Implications for Children's Development
Michael Levine, Chair
3:00 Participants will be asked to respond to the following questions:
a. What are the necessary characteristics of an assessment process that obtains needed information and also promotes child development? How does the process differ depending upon whether we are assessing for instructional improvement or accountability?
b. Are there special conditions or needs of young children in general, and certain young children in particular, that might exempt them from or prefer them for inclusion in results assessment?
c. How can a results orientation be structured to be fully comprehensive for children, 0-8?
4:00 General Group Discussion
5:00 Adjournment
Session III: Dinner and Roundtable Discussions on Desirability
Valora Washington, Chair
6:30 Cocktails
7:00 Dinner
7:45 Concurrent Roundtable Discussions
Roundtable A: Impact on Classroom Practice and Teacher Preparation
a. What is the potential impact of a results orientation on classroom practice? On professional development?
b. If a results approach were adopted, what special actions should be taken to encourage positive results and prevent negative consequences in classroom practice?
Roundtable B: Impact on Programs
a. How would a movement toward child results impact program quality? Availability?
b. What is the relationship between results and funding?
c. What program policies might be altered (positively or negatively) as a result of a results orientation (e.g., retention)?
Roundtable C: Impact on Families and Communities
a. How will parents and diverse community groups respond to a results orientation (e.g., persons with low income, business leaders, service providers)?
b. What precautions are necessary to prevent misuse of outcome data within a community?
Roundtable D: Impact on Early Childhood Systems
a. Would the benefits and challenges of a results approach touch all segments of the field evenly? Which constituencies/programs would be most affected? How?
b. Would movement toward a results orientation help unite or further divide early care and education?
c. How would a results approach affect monitoring and licensing? Advocacy?
Roundtable E: Impact on State and Federal Policy
a. What are the potential policy consequences of a results orientation in the states? At the national level?
b. If a results approach were to be adopted, what strategies should be implemented at the national and state levels to ensure maximum benefits and constrain harm in the states?
9:00 Adjournment
Day Two: June 2, 1995
Session III: Continued Discussion on Desirability
Valora Washington, Chair
8:00 Continental breakfast
8:30 Reports from Roundtable Discussions
A. Impact on Classroom Practice and Teacher Preparation
B. Impact on Programs
C. Impact on Families and Communities
D. Impact on Early Childhood Systems
E. Impact on State Policy
9:20 General Group Discussion
10:20 Break
Session IV: Scientific Feasibility of a Results Approach
Michael Levine, Chair
10:35 Review of Past Efforts and Scientific Feasibility
Presentation by John Love
10:55 Participants will be asked to address the following questions:
a. What cultural, technical, developmental, and implementation considerations must be kept in mind if a child-based results orientation were to take root?
b. Do we understand the pitfalls and can weovercome them, now?
12:00 General Group Discussion
12:45 Lunch
1:30 Participants will be asked to respond to the following questions:
a. For each age group (infancy, preschool, and primary), is the development and implementation of child-based results technically feasible at this time?
b. What, if any, are the special considerations for your age group?
2:15 General Group Discussion
3:15 Break
Session V: Summary and Conclusions
Sharon L. Kagan, Chair
3:30 Summation: The Desirability and Scientific Feasibility of a Results Approach
Presentation by Barbara Blum
4:00 General Group Discussion: Next Steps
4:30 Adjournment

First Issues Forum on Child-Based Results
June 1-2, 1995
Participants
Larry Aber
National Center for Children in Poverty
Columbia University

Barbara Blum, President
Foundation for Child Development

Sue Bredekamp
National Association for the Education of Young Children

Cynthia Brown
Council of Chief State School Officers

Charles Bruner
Child and Family Policy Center

Bettye Caldwell
Pediatrics/CARE
Arkansas Children's Hospital

Nancy Cohen
Bush Center in Child Development and Social Policy
Yale University

Ellen Galinsky
Families and Work Institute

Sarah Greene
National Head Start Association

Kenji Hakuta
School of Education
Stanford University

Sharon L. Kagan
Bush Center in Child Development and Social Policy
Yale University

Mary Kimmins
Maryland State Department of Education

Luis Laosa
Educational Testing Service

Michael Levine
The Carnegie Corporation of New York

Joan Lombardi
Administration for Children, Youth, and Families
U.S. Department of Health and Human Services

John Love
Mathematica Policy Research, Inc.

Samuel J. Meisels
University of Michigan
Kristin Moore
Child Trends

Frederic Mosher
The Carnegie Corporation of New York

Deborah Phillips
National Research Council

Craig Ramey
Civitan International Research Center
University of Alabama at Birmingham

Sharon Rosenkoetter
Bush Center in Child Development and Social Policy
Yale University

Lisbeth Schorr
Harvard University Working Group on Early Life

Diana Slaughter-Defoe
School of Education and Social Policy
Northwestern University

Robert Slavin
Johns Hopkins University

Valora Washington
The W.K. Kellogg Foundation

David Weikart
High/Scope Educational Research Foundation

Charles E. Wheeler
Walter R. McDonald & Associates, Inc.

Sheldon White
Department of Psychology and Social Relations
Harvard University

Emily Wurtz
Office of Senator Jeff Bingaman

Nicholas Zill
Westat, Inc.
FOUNDATION REPRESENTATIVES
Stacie Goffin
Ewing Marion Kauffman Foundation

Mary Larner
David and Lucile Packard Foundation

Janice Molnar
Ford Foundation
Appendix B: January 24, 1996
Meeting Agenda and Participants

Second Issues Forum:
Next Steps for Child-Based Results
Agenda
Wednesday, January 24, 1996
10:00 am to 4:30 pm
Carnegie Corporation of New York
437 Madison Avenue, New York, NY
Goal: To identify "actionable" next steps for developing a results-oriented approach for accountability in early care and education
10:00 to 10:15 Welcome, Introductions, and Charge to the Group
10:15 to 10:30 Questions and Discussion Concerning First Meeting
10:30 to 12:00 Identifying Results
Is there a need to identify and build consensus on key Type 1 results (what children know and can do), or have such results been adequately identified? Consider this question separately for children age 5, age 3, and younger than age 3. What are the next steps in this area?
Is there a need to identify and build consensus on key Type 2 results (child and family conditions), or have such results been adequately identified? Consider this question separately for children age 5, age 3, and younger than age 3. What are the next steps in this area?
Is there a need to identify and build consensus on key Type 3 results (service provision, access, and quality) for families and children, or have such results been adequately identified? Consider this question separately for children age 5, age 3, and younger than age 3. What are the next steps in this area?
What is meant by Type 4 results (systems capacity)? How do they relate to Type 1, 2, and 3 results? Is it necessary to develop different Type 4 results for children age 5, age 3, and younger than age 3? What are the next steps in this area?
(If time allows) What are the "markers of progress" for results Types 1-4? What are the micro-results that lead to other micro-results? What are the complex relationships among the different types of results? Which results lead to which other results? What are the next steps in this area?
12:00 to 1:00 Assessing Results
Is additional work on instrument development and assessment methods needed for implementing a child-based results approach? If yes, for which types of results and for children of which ages? Are there adequate approaches for assessing some relatively complex results (e.g., nurturing families)? What are the next steps in this area?
Have cost-effective approaches to data gathering and assessment been identified? Is it clear how states can make the best possible use of existing data? What are the next steps in this area?
Is it clear how children with special needs (developmental disabilities, LEP, at-risk) should be included in assessment and data gathering? What are the next steps in this area?
1:00 to 1:30 Lunch
1:30 to 2:30 Developing Comparable Results and Data
How can results be developed and data collected that reflect community/state input and priorities and that are also comparable across communities and states? Should we strive to use the same data collected by communities and states (increasingly for the purpose of higher-stakes accountability such as resource allocation) to make comparisons among communities and states? To generate a picture of children and families across the nation? Alternately, should the goal be separate local, state, and national efforts to define results and collect data? How could separate local, state, and national data collection efforts support one another and minimize duplication? What are the next steps in this area?
2:30 to 3:00 Involving the Early Childhood Education Community
How can the early childhood education community be better integrated into efforts to develop and assess child-based results? What are the next steps in this area?
3:00 to 3:30 Public Relations
How can states and communities develop the political attention and will to overcome special and competing interests and come up with fair and meaningful results and assessment systems for children and families? What are the next steps in this area?
3:30 to 4:30 Finalize Next Steps
Second Issues Forum: Next Steps for Child-Based Results
January 24, 1996
Participants
Barbara Blum
Foundation for Child Development

Barbara Bowman
Erikson Institute

Sue Bredekamp
National Association for the Education of Young Children

Cynthia Brown
Council of Chief State School Officers

Nancy Cohen
Bush Center in Child Development and Social Policy
Yale University

Ellen Galinsky
Families and Work Institute

Eugene Garcia
Graduate School of Education
University of California, Berkeley
Sharon L. Kagan
Bush Center in Child Development and Social Policy
Yale University

Luis Laosa
Educational Testing Service

Michael Levine
Carnegie Corporation of New York

John Love
Mathematica Policy Research, Inc.

Kristin Moore
Child Trends

Michelle Neuman
Bush Center in Child Development and Social Policy
Yale University

Gregg Powell
National Head Start Association

Sharon Rosenkoetter
Associated Colleges of Central Kansas

Jack Shonkoff
Heller Graduate School
Brandeis University

Gary J. Stangler
Missouri Department of Social Services

James Ysseldyke
National Center on Educational Results
University of Minnesota
Appendix C: Considerations Regarding a Results-Based Orientation
Sheldon H. White
Harvard University
Paper presented at the Issues Forum on Child-Based Results, New York City
SECTION 1 Introduction

In the abstract, program assessment using child-based results is highly desirable for the management of early childhood programs. To the extent that a program produces changes in children that are unarguably positive and plainly visible to others, those responsible for the program have less to worry about internally and less to explain to others. The program takes care of itself. The program managers obtain autonomy and flexibility. Supervisors or critics may ask questions about the program's philosophy, methods, or operational strategy, but if the positive benefits of the program are plain to see, all the questions are held at arm's length. They do not disappear; they are tabled, but they have little force in requiring changes in the program. Creative program developers attach considerable importance to the "immunization" produced by face-valid results. I have seen one sophisticated developer of an innovative educational program go to considerable effort to produce a new system of evaluation along with his new educational program. His hope was that his new form of evaluation would place a shield between his unorthodox program and its critics.
Often enough, the results obtained by an early childhood program do not immediately and obviously declare themselves as benefits. Then some kind of estimation of what the program is achieving has to be arrived at by proxies: goodness-of-process indicators, peer reviews, surveys of client satisfaction, and analyses of outcome variables that might be theoretically or argumentatively linked to the possibility of future benefits. Judgments about a program using such catch-as-catch-can indicators may differ. Such indicators may be reasonable and adequate for the everyday management of a program, but they may not be sufficient to answer life-or-death challenges to the program in a political context.
The travails of Head Start are, of course, a perfect illustration of the challenges that confront a program supported by partial and argumentative indicators. Child-based results are inherently difficult for many early childhood programs because the important consequences the programs are intended to influence lie far in the future. Either we look at some short-term proxies for those distant consequences, or else we have to wait a long time before finding out whether or not the program has had an effect. When there are life-or-death issues of accountability (as, for example, when questions about Head Start's validity periodically surge forth in the Congress), we struggle with the choices.
SECTION 2 The State-of-the-Art of Readiness Testing
Since Head Start is generally understood to be a program that helps disadvantaged children do better in school, it would have seemed natural to use tests of school readiness as indicators of the program's effectiveness. Yet such tests have hardly been used in the great volume and variety of Head Start studies. Why would people not use a test so patently and obviously directed towards just what Head Start is supposed to bring about? The technical qualities of the tests are not very good, and this is a reflection of the fact that what the tests are intended to deal with is very poorly understood.
To begin with, traditional readiness tests are not very good in conventional psychometric terms. About a decade ago, in conjunction with my longstanding interest in developmental changes in children in the 5-7 age range, I looked at a number of the major commercial school readiness tests. I was interested in the possibility that there might be some hidden wisdom deep in their construction. Did readiness tests embody insights about the cognitive changes in children near the onset of schooling? Did the subtests of those instruments differentiate out theoretically interesting factors in children's cognitive development? I had to be interested in the tests' predictive power. Could the tests do what they were designed for, assess children's cognitive maturity?
The readiness tests did not look as though they contained much hidden wisdom. They looked like work-samples of things that children are asked to do in the early grades. But the readiness tests' statistics said the predictive power of the tests was so poor as to make theoretical questions about the instruments uninteresting. I did not pursue the analysis at that time, but it was the memory of it that led me to be personally quite skeptical when, a few years ago, I first confronted a declaration of Goal 1 of the Goals 2000 project.
Within the past few weeks, in preparation for today's meeting, I have taken a second look at the readiness and readiness-like tests now on the market. Table 1, which can be found at the end of this paper, gives some statistics on the tests. Fifteen readiness tests were looked at: the Woodcock-Johnson Psycho-Educational Test Battery, the Brigance Diagnostic Inventory of Basic Skills, the Howell Pre-Kindergarten Screening Test, the Brigance K and 1 Screen, the Daberon Screening for School Readiness, the Anton Brenner Developmental Gestalt Test of School Readiness, the Analysis of Readiness Skills, the Metropolitan Readiness Tests, the Clymer-Barrett Readiness Test, the Basic Skills Inventory, the Gesell School Readiness Test, the Cognitive Skills Assessment Battery, the Lollipop Test, the ABC Inventory to Determine Kindergarten and School Readiness, and the McCarthy Screening Test. Table 1 also gives the same information for a second set of eight tests of general intellectual development: the Boehm Test of Basic Concepts, the Bracken Basic Concept Scale, the CIRCUS, the Battelle Developmental Inventory, the DIAL, the WPPSI, the ABC Inventory, and the EARLY. Table 1 gives each test's declared purpose, the age range for which it is intended, and some statistics on reliability, concurrent validity, and predictive validity.
The statistics for this 1995 sample of readiness tests are, on the whole, slightly better than those I remember from the earlier group. Still, predictive validity information is missing or vague for many of the tests. Where the tests do predict school-age performances of children, they do not predict very far into the future, and they predict to psychometric instruments that are themselves of uncertain predictive power. Interestingly, only two of the tests (the Howell Pre-Kindergarten Screening Test and the ABC Inventory) were correlated with teachers' clinical estimations of whether their children were ready for school, with mixed results.
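White's point about predictive power can be made concrete. Predictive validity is conventionally summarized as the correlation between a readiness score and a later criterion, such as first-grade achievement. The following is a minimal sketch: the function name `pearson_r` and all scores are illustrative inventions, not data from the tests surveyed above.

```python
# Illustrative only: compute predictive validity as the Pearson
# correlation between (hypothetical) readiness scores and later
# (hypothetical) first-grade achievement scores.
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical kindergarten readiness scores and later achievement scores.
readiness = [12, 15, 9, 20, 14, 17, 11, 18]
first_grade = [48, 55, 44, 60, 50, 52, 49, 58]

print(round(pearson_r(readiness, first_grade), 2))
```

An instrument whose correlation with later achievement hovers near zero, as White describes for many of these tests, says essentially nothing about which children will struggle in school.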
I have no desire to pass off this quick survey of the current readiness tests as a definitive study. It is not. But the quick survey gives a glimpse of the state-of-the-art of our contemporary capacity to build a strong school readiness test, and that survey is not encouraging about the prospects for building a nationwide school readiness screening instrument by the year 2000.
Of course, there are some good reasons for believing that if we give serious, sustained, deliberate effort to building new school readiness assessments, we can do better than those traditional instruments. We know more about the possibilities of testing and assessment, and we have recently begun to modify and diversify our century-old technology of testing. We know more about child development than we used to. And we have accumulated some greater understanding about when and how research data are brought into use in the policy process.
SECTION 3 Recent Changes in Our Understanding of Child Development
The fundamental reason why traditional readiness tests have not worked well is that we have never had a very clear idea what "readiness" is to begin with. The "readiness" issue arose together with practical efforts to achieve "readiness" testing in the late 1920s and early 1930s. Compulsory education laws were being passed in all the American states, and children were pouring into the schools. Some children were visibly less ready to do business in the first grade; teachers could see that. But there was no literature in the 1920s, and there is no literature now, to discuss exactly where or how "readiness" might be constituted in a child.
Two other conceptions that are much like "readiness" exist in the child development literature: Binet and Simon's turn-of-the-century conception of "mental age," built into our contemporary practices of mental testing, and Piaget's notion of "cognitive stages" in childhood, built into many of the "developmentally appropriate" preschool curricula of the present. The Binet-Simon "measuring scale of intelligence" was designed to see if children entering school could profit from regular instruction. Binet and Simon arranged a series of tasks and performances to form an age-scale, and from the score a child obtained on their series they computed a "mental age." Although the Binet-Simon testing procedure has been all but buried under a cloud of subsequent psychometric technology and ideological legendry, the test remains at bottom an instrument for deciding how old a child is mentally. Presumably, there is a uniform path of mental development followed by all children. The goal of the test is to find out how far along the Binet-Simon path the child stands.
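The age-scale logic White describes can be sketched roughly. This is a simplified partial-credit convention of the kind used in later revisions of such scales, not Binet and Simon's exact procedure, and the assumption of five items per age level is hypothetical.

```python
# Simplified age-scale scoring (hypothetical convention, for illustration):
# the "basal age" is the highest age level at which the child passes every
# item; each item passed above that level earns a fraction of a year.

def mental_age(basal_age, items_passed_above, items_per_level=5):
    """Return mental age in years under the simplified convention above."""
    # With 5 items per level, each extra passed item is worth 12/5 = 2.4 months.
    months_credit = items_passed_above * (12 / items_per_level)
    return basal_age + months_credit / 12

# A child with a basal age of 5 who passes 7 scattered items at higher levels:
print(round(mental_age(5, 7), 2))
```

The single number such a scale returns is the point of the exercise: it locates every child on one presumed path of mental growth, which is exactly the uniformitarian assumption White goes on to question.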
A not-dissimilar uniformitarian view of a child's cognitive development came to life in the 1960s, through the enormous influence Jean Piaget had on American developmental psychologists at that time. Many American developmental psychologists who thought about designing and evaluating programs for children in developmental terms conceived of that development as it was construed by Piaget's theory. Child development was cognitive development. All children go through a uniform series of stages of cognitive development, and the important difference between one child and another was the question of how far along Piaget's path each child was. Programs for poor children, it was said, should "close the gap." Some American research on Piagetian theory dwelt on what Piaget called "the American question," the question of whether one could or could not accelerate the movement of the growing child along Piaget's path.
Piagetian theory is on the wane now, and we have come to believe that there is more to the small child's life than marching from one Piagetian stage to another. There is motor development, social and emotional development, the organization of language, the building of general knowledge, and the building of metacognitive insights and strategies. One of the heartening things about the contemporary literature on child development is that all these aspects of child development are being actively studied. I believe we know far more about children's development than we have so far "harvested" for the design of programs and assessment instruments for children. We can make a richer world of programs for children and families with such knowledge.
But we have to proceed with care and with thoughtful and sophisticated efforts towards instrument development. Some of the simplifying assumptions of the past are now slipping away from us: the notion that all there is to child development is cognitive development, and the uniformitarian notion, the idea that all children pursue a common path towards adulthood.
Once upon a time, the heart and soul of early education was the celebration of the diversity of small children. When the Froebelian Kindergartens first came over to the United States, the women who worked in them called themselves "child gardeners." Children differ. One is a tomato, another a carrot, a third a sweetpea, a fourth a cucumber. The task of the child-gardener is to study each child, to see the way it grows and what it needs, and then to offer that child the support or guidance best suited to his or her needs. The labor-intensive vision of how the adult should deal with her small children in a 19th-century Froebelian kindergarten is far behind us now. Our children go to modern kindergartens and grade school classes in which they pursue common schooling: the "basics" of literacy and numeracy we expect will be given to all children. We like uniformitarian visions of the basic processes of child development because they dovetail nicely with the standardization of our classrooms and our expectations. But the reality that every teacher and parent knows is that children are different from one another. If we are going to assess children's readiness in a broad way, we are going to have to address those differences in some meaningful way.
Consider social development, which everyone now agrees is an important aspect of what happens to children in preschools and schools. What if one shape or form of social development that is true for all children does not exist? Children behave differently in social situations. Some are bold, some are shy, some are friendly, some are reserved, some are garrulous, some are taciturn. Most children in preschools and schools manage to solve the problem of finding a comfortable social existence among their peers. They catch hold of some sector of small-fry society, but they do not all do so in the same way. Children are very much like adults in this regard. It is not clear which uniform criteria of social "readiness" for schooling one can apply to all children, boys and girls, who are members of one cultural community. It is even less clear which criteria of social development should be applied to children of different cultural communities in American society.
If we are going to set forth readiness tests for schools that examine a broad spectrum of children's functions and capabilities, I believe we are going to have to confront and deal with the non-standard, idiosyncratic aspects of individual children's development. The community of developmental psychologists now includes strong groups of researchers addressing each of the several major streams of small children's development, and I suspect we can work with such researchers to gradually begin to develop possibilities for broad-scale examinations of children's capabilities and competence in several areas of development. But I see no quick way to carry out this process, and I am quite certain that we cannot smash-and-grab our way past the necessity for entering into it. The research and development processes that will be entailed will extend considerably past the year 2000.
Politicians like to set forth lofty and heroic goals. It is in the nature of leadership that they do so; 'impossible' goals get people's attention, mobilize them, and surprisingly often turn out to be possible after all. Furthermore, the life of high officials nowadays tends to be dull, nasty, brutish, and short. One reason why projects on behalf of children tend to be framed in impossibly short periods of time is the brief time officials in power have to act. Understanding all that, I am still not persuaded that we can or should try to establish a nationwide system of readiness assessment by the year 2000. We have had enough 20th-century experience with the development and use of psychoeducational tests to know that testing is a double-edged sword. It can hurt as well as help.
Tests that teachers and administrators do not respect can be more or less politely subverted or evaded. There can be minor forms of fraud, as when 50 American states all report that their children are above average on the school achievement tests: the famous "Lake Wobegon Effect." Tests can control and limit what teachers do, as when teachers in many American classrooms set aside their best professional judgment and "teach to the test". And some psychoeducational tests, notoriously the descendants of the Binet-Simon instrument for determining children's mental age, can become significant instruments for inter-ethnic politics and ideology. Our experience with psychological test usage to date suggests that we have every reason to be slow and careful in our development of future instruments.
SECTION 4 Recent Changes in Testing

I am optimistic about the possibilities of more sophisticated assessments of children's readiness, given time. A reasonably restrained process of test development can do much to move us towards a greater ability to use child-based results for such purposes as program development and evaluation, student assessments, teacher guidance, policy determination, and a variety of other practical uses. I began my talk today with a consideration of the state of the art of commercial tests for assessing children's readiness. In closing, it might be useful to consider again briefly what is happening in the world of commercial testing. The psychoeducational enterprise traditionally known as "tests and measures" is undergoing a great deal of change today, after a good many decades of being surprisingly stable and resistant to change. The more important changes of the present are the following:
Different Kinds of Testing Are Being Developed for Different Purposes

Traditional psychological tests have always been a little like Henry Ford's Model T. Ford would sell you any color car you wanted so long as it was black. Traditional psychoeducational tests could be used for any purpose that you wanted as long as you used a standardized, forced-choice, norm-referenced instrument. But it is not at all clear that one kind of test is maximally useful for purposes of federal- or state-level accountability, providing guidance to a teacher in his or her classroom, diagnosis of an individual student's strengths or weaknesses, or assessing the pros and cons of a programmatic innovation. Today, we are seeing a differentiation of forms of psychoeducational testing, as differing instruments are being created for different audiences and purposes.
New Technologies of Testing Are Coming Into Use

The differentiation of new forms of testing is being furthered by the emergence of new technologies of testing. The chief modality for psychoeducational testing still seems to be, at the moment, the multiple-choice series of questions addressed with a Number 2 pencil. But a variety of more complex forms of testing are coming forth, ranging from constructed-response testing implemented by paper and pencil or by computer to the evaluation of student work using portfolios or performance observational schemes. In general, the new forms of testing are more expensive to administer and score, but they yield much richer and more complex information about the individuals being tested. We can expect that they will play a substantial role in future early childhood programs.
Testing and Teaching Have Begun to Merge

As tests have become more complex, naturalistic, and 'authentic', they have begun to look more and more like extensions of the ordinary learning activities of children in their schoolrooms. Two interesting things have begun to happen. First, teachers have begun to see the tests as interesting commentaries not only upon the child and his or her capabilities, but upon the substance and methods employed by the teacher. The new modalities encourage reflective teaching. And, more and more, it seems likely that they can become part and parcel of the educational process itself. Instead of existing as a diversion or timeout from school work, the new tests become simply an enriched source of feedback coming out of school activities, available to children and teachers who participate in those ongoing activities.
SECTION 5 Enriching the Mixture of Child-Based Results
The use of child-based results is not an all-or-none thing in program assessment. We can imagine a future development process in which assessments of programs in early childhood can be progressively enriched through the use of more and more child-based results as we develop the capability to envisage them in a reliable and credible way.
We usually think about accountability in top-down terms, because 20th-century discussions of evaluation and accountability have typically arisen within the context of federally managed programs. Higher-order management, providing resources for the early childhood program and answerable for the program in the larger web of government, has to judge what the program is achieving.¹ The program is evaluated, by one means or another. But higher-order management is only one of the parties with an interest in a human services program. Individuals working within the program have an interest, and a real need, to know whether the program is or is not attaining meaningful results. Clients have such an interest; if the program's clients are children, then parents are involved. Professionals and program managers working in the community served by the program have a need to estimate what the program is achieving, in order to come to terms with the possibilities of cooperation or competition with the program.
I believe we should move now to capitalize on the possibilities opened up to us by our studies of child development, by the opening up of new forms of testing and assessment, and by our growing understanding of where and how information is used in program guidance and policy formation. We can do this, if we are able and willing to submit to slow, reflective, collaborative processes of instrument development.
1. To simplify discussion, I am here assuming that a government program has one and only one purpose, well understood by all parties and agreed to by them. But this is not true for a good many government programs that, in the political process, are put forward by coalitions of parties who are directed towards a variety of purposes. Head Start, for example, is generally understood to be a child development program. Historically, however, Head Start was put together by parties with declared major or minor interests in: (a) Civil Rights; (b) community action; (c) the coordination of services for children; and (d) stimulating school reform. Many programs in the human services reflect such coalitions of interests and emphases.
TABLE 1
TECHNICAL ASPECTS OF SCHOOL READINESS TESTS

Lollipop Test (1981)
  Age: Pre-K to 1st
  Reliability: KR20 for whole test = .90
  Concurrent validity: .58 w/ teacher ratings; .86 w/ MRT
  Predictive validity: Lollipop given at end of K; MRT after 1st = .73; MRT after 4th = .40

Brigance K and 1 Screen (1986)
  Age: K to 1st

Boehm Test of Basic Concepts, R. (1986)
  Age: K to 2nd
  Test-retest: K = .88; 1st = .55; 2nd = .66
  Reliability: .62 to .82
  Concurrent validity: .60 w/ PPVT
  Predictive validity: .24 to .64 w/ Comprehensive Test of Basic Skills, California Achievement Test, Iowa Test of Basic Skills

Bracken Basic Concept Scale (1984)
  Age: 2-6 to 7-11
  Reliability: .97 for total test
  Test-retest: .76 to .80
  Concurrent validity: .68 to .88 w/ PPVT, Boehm, MRT, and Token Test

McCarthy Screening Test (1978)
  Age: 2-6 to 8-6
  Test-retest: .32 to .69
  Reliability: .41 to .80
  Concurrent validity: .66 w/ Peabody Individual Achievement Test

Battelle Developmental Inventory (1984)
  Age: 0-6 to 8-0
  Test-retest: > .90
  Inter-rater: > .90
  Concurrent validity: .81 to .95
  Predictive validity: .41 to .60 w/ Stanford-Binet
  References: T.C. II, 72-82

Gesell School Readiness Test
  Age: 4-6 to 9
  Test-retest: .79
  Inter-rater: .87
  Reliability: .84
  Concurrent validity: 83% w/ teacher ratings; .64 w/ Piagetian test battery; .50 w/ Thorndike IQs; .61 w/ Thorndike Mental Ages
  Predictive validity: Gesell in K; .64 w/ Stanford A.T. in 1st

Metropolitan Readiness Tests (1986)
  Age: PreK to 1st
  Test-retest: .62 to .92
  Reliability: .66 to .93
  Predictive validity: Took MRT; 6 months later, .34 to .65 w/ Metropolitan Achievement Test; .47 to .83 w/ Stanford A.T.

Wechsler Preschool and Primary Scale of Intelligence (1967)
  Age: 4 to 6-6
  Test-retest: .86 to .92
  Reliability: .77 to .96
  Concurrent validity: .75 w/ Stanford-Binet; .58 w/ PPVT; .64 w/ Pictorial Test of Intelligence

Wechsler Revised (1989)
  Age: 2-4 to 7-3
  Test-retest: .81
  Reliability: .91 to .96
  Concurrent validity: .74 to .90 w/ WISC-R, Stanford-Binet, and McCarthy

Developmental Indicators for the Assessment of Learning, R. (DIAL) (1984)
  Age: 2 to 6
  Test-retest: .76 to .90
  Inter-rater: .96
  Concurrent validity: .40 w/ Stanford-Binet

Basic School Skills Inventory (1983)
  Age: 4-0 to 7-5
  Reliability: .88 to .92
  Concurrent validity: .22 to .43 w/ teacher ratings
  References: T.C. IV, 68-75

Chicago Early Assessment and Remediation Laboratory (EARLY) (1984)
  Age: 3-0 to 6-0
  Test-retest: .72 to .91
  Inter-rater: .89
  References: ERIC ED 204372

CIRCUS (1979)
  Age: PreK to 1st
  Reliability: .74 to .89
  References: T.C. VII, 102-109

Cognitive Skills Assessment Battery
  Age: PreK to K
  Reliability: .80
  References: T.C. VII, 126-139

Developing Skills Checklist (1990)
  Age: PreK to K; 4 to 6
  Reliability and validity: N/A yet
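The coefficients reported in Table 1 are standard psychometric statistics. As a purely illustrative aside (not part of the original report, and using invented data), two of them can be sketched in a few lines of Python: KR-20, the internal-consistency coefficient reported for the Lollipop Test, and the Pearson correlation, which underlies the test-retest and validity columns.

```python
# Illustrative sketch only: how two reliability statistics from Table 1
# are computed. All data passed to these functions are invented examples.
import statistics

def kr20(item_responses):
    """Kuder-Richardson formula 20: internal consistency of a test
    scored dichotomously (0/1). Rows = examinees, columns = items."""
    k = len(item_responses[0])                     # number of items
    n = len(item_responses)                        # number of examinees
    totals = [sum(row) for row in item_responses]  # total score per examinee
    total_var = statistics.pvariance(totals)       # variance of total scores
    # Sum over items of p*q, where p = proportion passing the item
    pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_responses) / n
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / total_var)

def pearson_r(x, y):
    """Test-retest reliability (and concurrent/predictive validity) is
    the Pearson correlation between two sets of scores for the same
    examinees, e.g., two administrations of the same test."""
    mx, my = statistics.mean(x), statistics.mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x)
           * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den
```

For example, a perfectly consistent pair of administrations yields `pearson_r` of 1.0, and a test whose items largely rise and fall together yields a KR-20 near the .90 reported for the Lollipop Test.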
Appendix D: Judging Interventions by Their Results *
Lisbeth B. Schorr
Harvard University Working Group on Early Life
Paper presented at the
Issues Forum on Child-Based Results
New York City
Section 1: Introduction

I have been asked to discuss how child-based results in early childhood are related to the broader political context in which the shift to results accountability is currently occurring. That political context is defined, in my view, first, by an increasing concern about the state of America's children and families, particularly about escalating rates of violence among ever younger children, of children bearing children, and of youngsters coming of age without the skills or motivation to earn a decent living. Secondly, it is defined by a growing sense that nothing systematic can be done, except, perhaps, for harsh punitive measures, to reverse these trends.
I believe that a shift toward results accountability is an essential strategy in efforts to build on new research and experience to improve results for children and families. Results accountability is not a panacea, but it could be a major step toward improving the conditions in which children grow into adulthood. It could also become a central strategy in a concerted effort to improve results for children growing up in high-risk environments.
BACKGROUND
While much of the current discussion about whether anything works, whether government does anything right, and whether the government that governs least also governs best, is pure rhetoric, the rhetoric hides some real issues that need urgently to be addressed. Among them is how to improve our ability to differentiate what works from what does not. Legislators have to know what works when voting on laws and appropriations, parents want to know whether their child's school is providing an effective education, foundations have to know whether they are supporting a promising strategy, and voters who have given up on compassion want to know what is a good investment.
Until recently, anyone who wanted to know whether tax or philanthropic dollars were being spent for a good purpose was offered one of three unsatisfactory responses: The most traditional response has been to bypass the problems of obtaining information about results and to assume that what mattered were intentions and efforts, institutions and services, resources and spending (Manno, 1994). The family service agency was doing its job if its budget was increasing and its monthly parent education sessions were attended by a specified number of people and were under the supervision of a certified social worker.
A second, more recent, response has been to say that if you really want to know what is working, you have to privatize the function: let the marketplace become the judge of effectiveness, by shifting the school or day care or recreation or mental health program out of the public or non-profit sectors. "Shift the burden of evaluation from the shoulders of professional evaluators to the shoulders of clients, and let them vote with their feet," advises UCLA professor James Q. Wilson (cited in DiIulio, 1994, p. 58). And that is how, presumably, we know preschool programs work: middle-class parents spend money on them.
In a third response, providers of health, education, and social services say "Trust us. What we do is so complex, so hard to document, so hard to judge, and so valuable, in addition to which we are so well intentioned, that you, the public, should support us and our programs without asking for evidence of effectiveness. Don't let the bean counters who record the cost of everything and know the value of nothing interfere with our valiant efforts to get the world's work done."
Since the mid-1980s, in the face of ever-increasing skepticism about the value of public investments in any human services, a fourth answer has emerged. A new breed of social reformer is contending that public support for social investments would be greatly strengthened if citizens, taxpayers, customers, clients, and communities were able to hold the providers of services, supports, and education accountable for achieving the results that citizens value. Many reformers are also coming to see results-based accountability as an important way of increasing program effectiveness by freeing human services from the straitjackets of rigid rules.
Section 2: The Problems That Results-Based Accountability Can Solve
GIVE THE PUBLIC SOME PROOF OF RESULTS
Large numbers of US citizens have a deep sense that they are not getting their money's worth from their governments. The 1995 confirmation hearings on the nomination of Dr. Henry Foster to become Surgeon General featured lengthy and often confused exchanges on the impact of "I Have a Future," the teenage pregnancy prevention program which Dr. Foster founded in Nashville, Tennessee. After much discussion about the meaning of several program evaluations, Senator James Jeffords of Vermont finally stated, in some exasperation, "We're fooling ourselves to think these programs are good because they feel good, when the evidence of impact isn't there." Senator Jeffords is not alone in his sense of frustration. In fact he speaks for an increasing number of government officials and citizens whose faith in social programs will not be restored until they know what, exactly, they are getting for their money, be it tax money or large-scale philanthropy.
Paying attention to results rather than inputs is central to the reinventing government proposals of David Osborne and Vice President Al Gore. But they were hardly the first to preach the results gospel. Two decades before she became head of the Office of Management and Budget in the Clinton Administration, Alice Rivlin was calling for better measures to assess the success of social programs, because public concern with ineffectiveness of human services was running "very high indeed" (Rivlin, 1971, p. 65). She concluded that "all the likely scenarios for improving the effectiveness of education, health, and other social services dramatize the need for better (outcome) measures. No matter who makes the decisions, effective functioning of the system depends on measures of achievement. ... To do better, we must have a way of distinguishing better from worse" (Rivlin, 1971, pp. 140-41, 144).
FREE HUMAN SERVICE PROGRAMS FROM THE STRAITJACKETS OF CENTRALIZED MICROMANAGEMENT AND RIGID REGULATION
Management by results is the best alternative to the top-down, centralized micromanagement that holds people responsible for adhering to rules that are so detailed that they make it impossible for a program or institution to respond to a wide range of urgent needs.
Whereas the bureaucratic paradigm assumes that control can only be exercised by rules, an outcome-oriented organization substitutes "adherence to norms" to fulfill the same function (Barzelay, 1992, pp. 124-5). A commitment to results is essential to this shift, because a clear understanding about purposes and desired results is the basis on which employees will take responsibility for adhering to norms, and will channel their energies into making appropriate adaptations and solving problems. Employee performance improves when employees feel accountable because they believe that their intended work results are consequential for other people (Barzelay, 1992). A results orientation also encourages staff to think less categorically as they become more aware of the connection between what they do and the results they seek.
ENHANCE SOCIETAL AND COMMUNITY CAPACITY TO BE MORE PLANFUL AND MINIMIZE INVESTMENT IN ACTIVITIES THAT DO NOT CONTRIBUTE TO IMPROVED RESULTS
Agreement on a common set of goals and outcome measures makes collaboration easier, and also fuels the momentum for change and helps promote a community-wide "culture of responsibility" for children and families.
Reflecting Alice in Wonderland's insight that if you do not know where you are going, any road will get you there, a focus on results is likely to discourage expenditures of energy, political capital, and funds on empty organizational changes and on ineffective services. The shared commitment to improve results for children is what can make efforts at collaboration and service integration fall into place, not as an end, but as an essential means of working together toward improved results.
FOCUS ATTENTION ON WHETHER INVESTMENTS ARE ADEQUATE TO ACHIEVE THE PROJECTED RESULTS

The new conversation about results may have its most profound effect by injecting a strengthened ethical core into human service systems that currently focus more attention on the fate of agencies and programs than on whether people are actually being helped. The new results focus promises (or threatens, in the eyes of some) to end a conspiracy of silence between funders and program people by exposing the sham in which human service providers, educators, and community organizations are consistently asked to accomplish massive tasks with inadequate resources and inadequate tools. Attention to results forces the question of whether outcome expectations must be scaled down, or interventions and investments scaled up to achieve their intended purpose.
In the past, parent education programs have been funded with the vague expectation that they would somehow reduce the incidence of child abuse, although a few didactic classes have never been shown to change parenting practices among parents at risk of child abuse. Similarly, outreach programs to get pregnant women into prenatal care are expected to reduce the incidence of low birthweight, based on the similarly vague belief that outreach programs are a good thing, without any knowledge of whether the prenatal care that is made more accessible actually provides the services that could be expected to result in a greater number of healthy births.
Especially in circumstances where it will take a critical mass of high quality, comprehensive, intensive, interactive interventions to change results, where effective interventions must be able to impact even widespread despair, hopelessness, and social isolation, funders and program people should resist the temptation to obscure the limitations of so many current efforts. Providers, and even reformers, who are asked to achieve grand results with interventions so paltry that they are in no way commensurate to the task should not obscure the insufficiency of the investment by pleading with funders and evaluators to just document their efforts and not their results because it would not be fair to hold them accountable for real results changes when they are doing the best they can. Evidence that a diluted form of a previously successful intervention is not making an impact is not an argument against results-based accountability. It helps to clarify that dilution regularly transforms effective model efforts into ineffective replications. Recognition that a single circumscribed intervention may not be sufficient to change results is not an argument against results-based accountability. It is an argument for adequate funding of a combined critical mass of promising interventions.
SECTION 3: RESULTS-BASED ACCOUNTABILITY: A FAUSTIAN BARGAIN?
Critics of the push toward results accountability range from the skeptical to the appalled. Commenting on pressures to incorporate results accountability in early childhood programs, Sue Bredekamp, of the National Association for the Education of Young Children, says it is but "one more opportunity and justification to 'blame the victims,' because the children who are in greatest need of services demonstrate the poorest results" (Bredekamp, 1995).
Skeptics see the willingness to be held accountable for achieving specified results as a Faustian bargain, even when they agree that a shift toward results accountability has the potential to solve a lot of serious problems. They believe that human service providers, in their eagerness to obtain more funding and to escape over-regulation, will become unwitting tools in the war against government, the war against the vulnerable, and the war against all public sector activities that are grounded in considerations of morality, ethics, and social justice.
FEARS OF RESULTS ACCOUNTABILITY
Those who resist the push to results accountability have at least six specific fears:
First, knowing that what gets measured gets done, they fear that programs will be distorted. What will get done will be what is easiest to measure and has the most rapid pay-off, rather than what is really important. They point out that most communities have aspirations for their children that greatly exceed the results that are currently measurable, especially when the demand is for quick evidence of success. Will community health clinics raise immunization rates at the cost of cutting back on other kinds of well-child care, or support for chronically ill children? Will preschool programs deprive children of the opportunities for play that stimulates creativity and teaches empathy in order to reserve more time for the flash cards whose mastery shows up on "school readiness" tests?
A second fear is that even effective programs will seem to be accomplishing less than they actually are. If an inner-city consortium gets funding to improve the employability of youngsters coming out of high school, will it be judged a failure if the predictions that produced the funding turn out to be overly optimistic? And will it be judged a failure if results do not change as quickly as the funders had hoped? Will the consortium be held responsible for achieving city-wide improvements in results even though it was able to work with only 150 youngsters and their families?
A third fear arises from the recognition of the complexity of the most promising interventions, and the corollary that in the complex, interactive strategies that are most promising, responsibility for both progress and failure cannot be accurately ascribed. No single agency, acting alone, can achieve most of the significant results. If higher rates of children ready for school depend on the effective contributions of the health system, family support centers, high quality child care, nutrition programs, and Head Start, as well as on informal supports and community activities, will agency accountability be weakened as attention shifts to communitywide accountability efforts? How are agencies to be held accountable for results over which no single agency has control?
A fifth fear is that results accountability will become a shield behind which the few remaining protections and supports for vulnerable children, youth, and families will be destroyed. Especially in this anti-regulation era, rock-bottom safeguards against fraud, abuse, poor services, and discrimination based on race, gender, disability, or ethnic background could be destroyed. The new results orientation could lead to the abandonment of the input and process regulations that now restrict the arbitrary exercise of front-line discretion by powerful institutions against the interests of powerless clients.
A sixth fear is that the results-based accountability that is intended initially to serve such benign functions as creating pressures for reform and sharpening the focus of managers and practitioners on accomplishing their mission rather than preserving their turf will soon lead to such hard-edged consequences as results-based budgeting. The actual allocation of funding based on a program's or agency's performance could ultimately threaten to shut down programs and agencies that cannot provide evidence of their contribution to achieving agreed-upon results, despite the fact that they may be making such a contribution.
Section 4: The Special Case of School Readiness

The first of the national education goals, agreed to initially by the nation's governors and President George Bush, and subsequently endorsed by both President Bill Clinton and the U.S. Congress, was that by the year 2000 all children will start school ready to learn. The importance of this goal is not disputed. Recent research has made it quite clear that brain development occurs earlier, more rapidly, is more vulnerable to environmental influence, and is more lasting than had previously been suspected (Carnegie Task Force on Meeting the Needs of Young Children, 1994). Young children's experiences between birth and school lay the foundations for success in school and in life (National Research Council, 1991). There is evidence that children without a good foundation do not do well at the beginning of school and will not be able to catch up. In addition, the proportion of children in a kindergarten or first grade class who are ready for school learning is a powerful predictor of how much learning will go on in that class.
Despite the unanimity around the importance of school readiness, the prospect of measuring it has aroused deep passions and controversy. This may be because the welcome recognition among policy makers and the public that the early childhood years are such a critical time has led to decidedly unwelcome consequences: the use of standardized tests to make high-stakes decisions about individual children, including labeling some unready for school entry and placing others in special tracks. The result has been that a high proportion of children have a damaging "failure" experience at the very beginning of their academic careers, and that some preschool programs have been driven to "teaching test-taking skills, concentrating on rote learning, memorization, and drill ... rather than on the exploration and experimentation and grasp of basic concepts that are key to later learning ... and that foster confidence, curiosity, and problem solving" (National Research Council, 1991, pp. 2, 10).
These developments have left the early childhood community so traumatized at the possibility of unwittingly promoting further inappropriate testing that many oppose any attempt to assess school readiness by testing or observing individual children, even if the testing is done for the purpose of judging the community's provisions for preparing children for school entry, and not the abilities or capacities of individual children.
There is little dispute about the elements of early experi-ence that contribute to school readiness. They include goodhealth care and nutrition, high quality child care andpreschool experiences, communities that support families,and homes that provide children with the conditions thatdevelop trust, curiosity, self-regulation, the foundations of
literacy and numeracy, and social competence (Action Team on School Readiness, 1992; Boyer, 1991; National Task Force on School Readiness, 1991; Zill, 1995). One school of thought concludes that we know enough about what communities have to do to produce these results that it is possible to measure community capacity to assure school readiness as a proxy for measuring children's school readiness directly. The National Governors' Association, for example, has proposed a list of community capacity indicators for this purpose.
My own view is that community capacity indicators would serve as a reliable proxy only if we knew more than we now do about the precise linkages between inputs and results: between, say, home visiting and infant health, between family support centers and parental competence, or between the elements of good child care and the development of curiosity in toddlers.
However, given the current state of knowledge, measures of community capacity will be informative but not definitive in assessing progress toward universal school readiness. While it is absolutely clear that the extent of school readiness depends on the existence, accessibility, and quality of an array of services, supports, and institutions, it can probably best be discerned by looking directly at samples of children.
The question then becomes: can children's school readiness be determined without doing them any harm? Can it be done in ways that would make it impossible to label or stigmatize individual children? Can it be done in ways that would strengthen preschool programs rather than distort them into "teaching to the test"? Can it be done in ways that "acknowledge the fluid and cumulative nature of development" and that do not result in "blaming children and families for low levels of early learning" (Phillips & Love, 1994)? Can it be done in ways that make clear that if large numbers of children are not ready for school, that is not a child problem but a community problem?
A great deal of work, such as that led by John Love at Mathematica Policy Research, suggests that these questions can now be answered affirmatively (Love, 1995). But a clear consensus in the early childhood community has not yet emerged around this proposition. It may be that assessments of school readiness in the most immediate future will have to rely on approaches that combine observations of samples of children with measures of community capacity, as proposed by Yale Professor Sharon Lynn Kagan in the most recent National Governors' Association Issues Brief (Kagan, 1995).
Section 5: Choosing the Right Results

When communities or states (or the nation, in the case of the education goals) actually agree on results that all the stakeholders consider important and meaningful, a lot of
other things fall into place. Results that have been agreed upon by professionals and clients and other interested parties can become a solid foundation on which new strategies can be hammered out, and flexible responses can be adapted and evolved on the basis of continuing feedback. With results as constants, we can afford to experiment and even disagree over the means: whether and for whom home visiting is more effective than family support centers; whether parent involvement is best achieved by helping parents to read to their children or to help them with their school work, through parent employment as classroom aides, or through parent participation in governance; whether children best learn to read using phonics, whole language, or some other method. By being moored to the ends, it is possible to stay flexible on the means.
MEASURING SUCCESS
Of course, if results are to be the constants, the selection of outcome measures becomes enormously important. For better or worse, what gets measured has a great effect on what gets done. In one way or another, teachers teach to the test, just as social workers pay attention to what the auditors count. So the trick is to devise measures that come as close as possible to actually reflecting what ought to get done. If you want children to learn to reason in math, you go beyond multiplication tables in assessing their performance. If you want children whose eyesight is defective to be treated, you measure the absence of untreated vision defects among first graders, not the number of vision screenings that have been done.
In addition to all the technical considerations that determine the choice of outcome measures (Love, 1995; Moore, 1994), the greatest stumbling block for those actually engaged in moving to outcome accountability has been the difficulty of forging agreement on a set of results considered important, meaningful, and measurable by a wide range of stakeholders, including skeptics. This involves the resolution of two major tensions: (a) between the implicit and explicit purposes of the efforts, and (b) between all that the community wants from the effort and what can be agreed upon and measured.
We have been more opportunistic than planful in this country about measuring success. We grasp for what is easiest to measure rather than what best reflects a program's purpose. Thus the Westinghouse Learning Corporation, under a government contract to assess the effects of the earliest years of Head Start, measured only changes in the IQ of participating children, even though IQ is one of the least malleable of human characteristics, and despite the fact that Head Start aimed to improve the health and nutrition and social skills of participating children, to empower their families, strengthen their communities, and in many other
ways to contribute to their school readiness. Some thirty years later, IQ is still the outcome that gets assessed, because it is such a handy measure. Even today family support programs are evaluated on the basis of their effect on the IQ of participating children.
Hard-edged thinking about purposes and results means asking anew about what is worth doing, and to what end. The results around which it seems to be easiest to get agreement are those around which judgments are seen as least subjective, and where most people agree on the desirable direction of change, even if people do not agree on what they would give up to achieve such change, how to achieve it, or which results are most important (Rivlin, 1971).
But it gets harder once we get beyond a few outcome measures that are readily agreed upon. It means becoming explicit about everything, including multiple or even conflicting goals. The debate has to include the question of whether the provision of jobs for local residents is among the purposes of Head Start, inner city hospitals, schools, child care, or construction projects. Hard-edged thinking about results means acknowledging that while full-day Head Start is likely to pay off in higher rates of school readiness, part of its impact lies in its ability to provide jobs to neighborhood residents, to decrease the social isolation of Head Start parents, and to increase the number of adults who are confident that their children are well taken care of and are therefore better able to pursue job training or employment.
Efforts to define results must also resolve the tensions between all that the community wants from the initiative and what can be agreed upon and measured. These tensions are especially troublesome around results in early childhood and adolescence.
Many leaders in early education, whose hearts and souls are committed to celebrating the diversity of small children (White, 1995), are convinced that whatever outcome measures are selected to document school readiness, they will "mislabel, miscategorize, and stigmatize children" by measuring only narrow cognitive development (Kagan, 1995), and will result in narrow, standardized, cognitively oriented preschool programming.
Having observed many recent efforts to select outcome measures in a wide variety of contexts, I have become convinced that these tensions will be resolved through increasing recognition of three fundamental points:
Communities, and certainly parents, have goals for their children that are more ambitious, more differentiated, and more nuanced than the results that can be agreed upon and measured.
It is possible to maintain a strong commitment to a programmatic orientation that is ambitious, differentiated, and nuanced, while being held accountable for the accomplishment of more modest results.
Results selected for accountability purposes must be persuasive to skeptics, not just to partisans of the programs and policies being assessed.
AMBITIOUS GOALS AND MEASURABLE RESULTS
When communities, parents, practitioners, policy makers, and advocates are asked about their goals and their vision for their children, they talk about wanting all children to grow up in loving, nurturing, and protective families, to be connected to those around them, and to achieve their personal, social, and vocational potential. They talk about wanting youngsters to feel safe, to have a sense of self-worth, a sense of mastery, a sense of belonging, and a sense of personal efficacy, to be socially, academically, and culturally competent, and to have the skills needed for productive employment.
Such goals can become a framework within which outcome measures can be selected for accountability purposes, with the understanding that only some aspects of these goals can currently be measured with widely available data and with outcome measures around which it is possible to gain widespread agreement. There is a direct connection between these goals and the outcome measures used for accountability purposes, but goals and outcome measures serve different purposes. Goals represent what the community is striving for. Outcome measures represent what the community will be held accountable for, by public and private funders and perhaps by higher levels of government. The goals can be general, but the outcome measures must be so specific, the public stake in their attainment so clear, and their validity and reliability so well established, that the community would ultimately be willing to see rewards and penalties, as well as resource allocation decisions, attached to their achievement.
A commitment to more visionary goals is entirely compatible with a commitment to documenting progress toward the achievement of those goals by the use of more modest outcome measures. Of course health is more than the absence of disease, educational attainment is more than not dropping out, nurturing family life is more than the absence of abuse and neglect, economic well-being is more than living above the poverty line, and a thriving community is more than an absence of boarded-up houses and open drug markets (Brandon, 1992). The attainment of modest but measurable results would signify substantial progress toward more ambitious goals.
AMBITIOUS PROGRAMMATIC FOCUS AND MEASURABLE RESULTS
Accountability systems that rely on results that are easily measured and are persuasive in a public policy context do not preclude a much broader programmatic agenda. Strategies to achieve measurable results must of course focus on
child's play as part of child care, on social skills as well as cognitive skills among preschoolers, on youth opportunities as part of youth development, and on caring and connectedness as part of community building.
RESULTS THAT ARE PERSUASIVE TO SKEPTICS, AND THE SAGA OF EDUCATION STANDARDS
The most compelling lesson about the importance of selecting results that are persuasive to skeptics comes out of the recent experience with national efforts to agree on education standards.
In elementary and secondary education, the shift in judging quality from the amount of per-pupil spending to an approach that asks what children are learning has been occurring rapidly and tumultuously. Outcome standards for student learning were blessed in 1989 by the nation's governors at the Charlottesville education summit called by President Bush, and were embraced by many educators, reformers, and advocates as a way of advancing the twin goals of educational excellence and equity (Manno, 1994).
Despite the fact that there was little agreement on how student achievement should be assessed, the Education Commission of the States reported that by 1993 states had developed or implemented outcome-based approaches to education (Manno, 1994). This relatively smooth progression did not last long. By February of 1995, Chester Finn, an early promoter of national education standards, entertained a Brookings Institution conference with a talk entitled "The short, unhappy life of national standards." He declared national education standards dead for the foreseeable future. Whether or not his prediction turns out to be correct, the process by which education standards were so quickly and perhaps fatally wounded is one from which those contemplating results-oriented accountability in other arenas have much to learn.
The most fundamental strategic error on the part of the proponents of results standards, in my view, was overreaching. They failed to make the distinction between what they wanted for their children and what they wanted schools, teachers, and their children to be held accountable for.
Proposed draft standards included such items as "All students understand and appreciate their worth as unique and capable individuals and exhibit self-esteem; all students act through a desire to succeed rather than a fear of failure while recognizing that failure is a part of everyone's experiences" (Manno, 1994). From the perspective of a parent, these were reasonable objectives. But whether these should be the objectives of schools or other public institutions, and whether they could be agreed on and measured, was another question.
Whether the recent push to introduce results standards in education will end up being an impetus or an impediment to reform is not yet known. What is known is that in the results selection process, it is lethal to ignore the distinctions between the goals that communities have for their children and the results that can be agreed upon and measured, between a programmatic orientation that is ambitious, differentiated, and nuanced and accountability for more modest results, and between results that are persuasive only to an initiative's supporters and those that are also persuasive to skeptics.
Section 6: Mismatch Between the Data That Is Needed and the Data Being Collected

There is a severe, and at first blush strange, paucity of data that could help answer urgent questions about how well efforts to change vulnerable lives are actually helping, and about which efforts help a little, which help a lot, and which help not at all.
The large gap between the data that is needed for results accountability and the indicator data currently being collected exists because data collection has been shaped primarily by only two kinds of pressures: those that reflect the need for administrative data for use in managing programs, policies, and institutions; and those that reflect the interests of social scientists, which have focused either on simple indicators that can be monitored for national-level trends, or on complex measures of individual development requiring labor-intensive direct observation and data collection. Attempts by policy makers and advocates for children to make do with what has been made available for these other purposes are increasingly unsatisfactory for purposes of designing interventions, holding intervention efforts accountable, and trying to understand what works to change results.
Data collection that has not been required for administrative and managerial reasons has reflected the interests of social scientists, who tend to think primarily about national trends, economic influences, and naturally occurring social change. As a result, the nation's data tool kit is virtually useless when it comes to efforts to understand the effects of intentional interventions on the well-being of children and families, especially when those interventions are designed to operate at the level of the local community. The data needed by those who must make judgments about what's working, and who are committed to trying to influence current policy debates, is very meager.
Much new work is now needed, and some is beginning to be done, to expand information about the characteristics of children, families, and communities that can be reliably measured, and to make data about results for children and families available in a more timely way and in units
that correspond to areas that are optimal targets of intervention, such as neighborhoods and school catchment areas. Communities and funders (public and private) need to be able to identify long-term and interim outcome measures that can be linked to interventions, and that can help them assess the effects of interventions across, not just within, the domains of health, education, child welfare, juvenile justice, community development, economic development, and job training. These information needs become increasingly urgent with the need to assess the effects of such new federal initiatives as the empowerment zones and enterprise communities, foundation-funded comprehensive community initiatives for children and families, impending changes in the allocation of responsibility between the federal government and the states, and major reforms of such safety-net programs as AFDC, Medicaid, and Food Stamps.
WHO DECIDES?
In determining who selects the results to be achieved, there is much controversy about "top-down" versus "bottom-up" processes. On the one hand, many believe that society has so much at stake in the achievement of a core set of results that political bodies, probably at the state level, should be responsible for identifying the set of results that are to be achieved in a particular jurisdiction. Others believe that "outcome measures imposed from outside a community have no legitimacy in terms of a local consensus-building process ... and cannot mobilize the resources needed to achieve the results sought ..." (Young, Gardner, & Coley, 1993). Charles Bruner of the Iowa Child and Family Policy Center, who has struggled with this issue both as a state legislator and as an advisor to local programs, argues that those charged with achieving results must be involved in the results selection process if it is to be regarded as fair, useful, and legitimate, and if it is to reflect real-life experiences.
There does seem to be increasing agreement that the process of selecting results for accountability purposes, whether it is done by a state legislature or a local collaborative, must have political legitimation. Sid Gardner, of the Center for Collaboration for Children at California State University, Fullerton, believes that the importance of going through a consensus-building process cannot be overestimated, because the selection of outcome measures is not primarily a technical problem but a political one.
It is clear that all of those affected by results-based accountability, whether as legislators representing taxpayers, as providers, or as service beneficiaries or participants, must have a role. All concerned will be able to work more effectively toward common goals if they are able to engage in a consensus-building process, involving both providers and recipients of services, to select the outcome measures they will use or be held accountable by.
Many forms of interactive consultation are possible. For example, when an official state body selects the results, localities may decide or negotiate the numerical value that will represent progress in the achievement of each outcome (e.g., the rate of low birthweight will be reduced by 5% each year, or racial disparities in low birthweight rates will be reduced by 10% each year). But if results are to be used for accountability purposes, and actually carry what David Hornbeck calls hard-edged consequences, it seems reasonable that after extensive participation and consultation, the final decisions be made by bodies at a higher or broader level of governance than those being held accountable.
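Negotiated numerical targets of this kind carry a small but consequential arithmetic choice: a "5% each year" reduction applied to each prior year's rate compounds, so it implies a different trajectory than cutting a flat five percent of the original rate every year. A minimal sketch of the compounding interpretation (the starting rate and all figures here are hypothetical illustrations, not data from this report):

```python
# Hypothetical illustration: project a locality's low-birthweight rate
# under a negotiated target of a 5% reduction each year, applied to the
# prior year's rate (the compounding reading of "5% each year").
# The starting rate of 8.0 per 100 births is invented for the example.

def projected_rates(start_rate: float, annual_cut: float, years: int) -> list[float]:
    """Return the year-by-year target rates, starting with the baseline."""
    rates = [start_rate]
    for _ in range(years):
        # Each year's target is a proportional cut of the previous year's rate.
        rates.append(rates[-1] * (1 - annual_cut))
    return rates

targets = projected_rates(8.0, 0.05, 5)
# The absolute drop shrinks each year, which matters when a state body
# audits whether a locality has met its negotiated milestone.
print([round(r, 2) for r in targets])
```

Under the flat reading, the year-five target would instead be 8.0 minus five times 0.4, a noticeably lower figure; which reading applies is exactly the kind of detail a consultation process would have to pin down.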
Section 7: The Importance of Interim Milestones

The greatest single obstacle to realizing the benefits of a shift to results-based accountability is the lack of interim milestones that could reliably show that reform efforts are on track toward achieving their targets.
Local communities, agencies, and programs that are struggling to reform, improve, or expand their services, to integrate services across helping systems, or to target a wide array of intensive interventions on selected geographic areas are clamoring for ways of finding out in the near term whether their efforts are changing results, or even whether they are going in the right direction and making progress toward long-term results. Funders (public and private), practitioners, managers, and systems reformers are becoming increasingly aware that the most frequently cited lesson from major current reform efforts is that they take so much more time than expected, both to get the initiative under way and to get it to the point where it begins to show an impact on real-world results. They desperately need new tools that would allow them to demonstrate their short-term achievements. They need to be able to get interim information very quickly, often long before a program is "proud" (Campbell, 1987), long before it has had a chance to make an impact on rates of school readiness, child abuse, teenage pregnancy, violence, school success, and employment.
Two kinds of interim measures can predict later results: indicators that attach to children, families, and communities and that are a short-term manifestation of long-term results, and indicators of a community's capacity to achieve the identified long-term results.
Examples of interim measures that are a short-term manifestation of long-term results for individuals, families, and communities include the following:
Receipt of prompt high quality prenatal care is thoughtto raise the chances of a healthy birth.
Children who do not read when they are seven are likely to encounter later troubles at school.
An improvement in school attendance rates is thought topredict an improvement in school achievement rates.
Parents' sense of mastery and social support, and the absence of parental substance abuse, are thought to predict long-term non-recurrence of abuse or neglect.
Knowledge about the connections between measurable indicators of community capacity and long-term results is at a more primitive stage than knowledge about the connections between interim and long-term indicators for children and families. Reliable theories about the linkages between interventions and results, and about the constellation of conditions and interventions that will lead to good results, are scarce. Most are unproven. For example, can a community that is developing strategies to reduce rates of low-weight births assume with confidence that the "enabling conditions" for reaching that outcome are some combination of the capacity (1) to provide family planning services to all persons of child-bearing age, and (2) to provide high quality, responsive prenatal care, nutrition services, and family support to pregnant women?
It is probably not enough to know of the simple existence of certain services, because their quality and how they are made available must be taken into account to link them strongly with results. The distinction among service availability, access, and the nature and quality of the service in accounting for improved results is crucial, and it requires greater understanding and a wider consensus around how to measure the factors that make services effective than now exists (see Charles Bruner's pioneering work).
One connection that most observers consider reasonably well established comes from the early childhood field: a community that is able to offer all of its low-income children and their families Head Start and other high quality comprehensive preschool programs is likely to have a high proportion of children prepared for school learning at the time of school entry.
Perhaps the most tantalizing of recently hypothesized links between interventions and results, ones that could produce some new short-term indicators of community capacity, are those between results for children and families and such indicators of community-level change as a strengthened infrastructure of informal supports, and investments in neighborhood safety and expanded economic opportunity. But there is as yet scant agreement on ways to measure community building, and only modest understanding of the precise connections.
The need for both kinds of short-term indicators that could show movement toward long-term results has long been recognized. It has not been met because the ability to define these interim markers with confidence depends on
having reliable evidence, theories, or at least sturdy hypotheses about the antecedents of major long-term results. Neither social science researchers nor the evaluation industry has really invested in this arena, perhaps because their energies are exhausted by the pursuit of that elusive goal of seeming as scientific as their colleagues in the physical sciences, and because progress in this arena involves a higher ratio of judgment to certainty than most social scientists are comfortable with.
As a society, we now need desperately to make up for lost time. One useful next step would be to systematically examine findings in the recent literature and ongoing experience to provide a more rigorous and deeper understanding of established connections among short-term and long-term results. We need to explore the connections between long-term results on the one hand, and measures of interim individual results and community capacity on the other.
Section 8: Rigorous Thinking About Process Measures

Process measures describe what is going on. (Process measures will also continue to be important in assuring that procedural protections are maintained to guard against fraud, corruption, and inequities or discrimination based on race, gender, disability, or ethnic background.) Process measures are an essential component of understanding the impact of an intervention, though they themselves do not assess impact unless they qualify as interim indicators. Process measures are important in finding out whether a program or intervention has actually been implemented according to plan. (Is the Head Start program in operation for the number of hours its funders expect? Has it enrolled the expected number of children and the expected proportion of eligible children? Has it involved a stipulated percentage of parents?)
Process measures become easily confused with outcome measures and interim measures. Distinguishing among these various indicators is essential to clear thinking about interventions and their consequences. One of the reasons for the current confusion is that the same measure can be an outcome measure, an interim measure, or a process measure, depending on context. If the purpose of the initiative is community building, "community engagement" could be an outcome. If the purpose of the initiative is school readiness and the connection between community engagement and school readiness were reasonably well established, "community engagement" could be an interim measure. If the funder and grantee were to agree to make community engagement an essential component of the intervention, it could be a process measure.
A process measure can be used as an interim indicator if there is a reasonable hypothesis to make the link, as when a
program trains parents as community leaders as part of its efforts to rebuild a community infrastructure. A process measure cannot be used as an interim indicator if there is no basis for linking it to long-term results.
The failure to think clearly about process measures, and about how they relate to what is being proposed or done and what is being accomplished, results in what David Osborne calls "process creep" (Osborne & Gaebler, 1993, p. 350). When process creep occurs, means and ends become confused, and the focus on what actually happens to people as a result of the activity is lost. The formation of a collaborative, or a high degree of participation in a new governance entity, may be the product of a great deal of effort, but it is not evidence of progress toward agreed-upon results unless the rationale that connects these activities to established results is at least explicitly hypothesized, if not proven. The number of children who have been screened for hearing and vision problems is a process indicator. Because screening that isn't followed up with diagnosis and treatment where needed won't reduce the number of children whose vision or hearing is impaired, screening should not be used as an outcome indicator.
But the confusion about process measures is not only conceptual; it is also political. The temptation is ever-present to fall back on using process measures as evidence of progress, even when they meet none of the criteria for outcome measures and there is no basis for linking them to ultimate results. Process measures often become substitutes for outcome measures because they provide comforting evidence of activity; they demonstrate that something is happening.
Typically, both grantmakers and grantees contribute to process creep. It happens in the early stages of program implementation, when everyone involved suddenly becomes afraid that his or her hopes for the project may not be realized. It also happens when funders encounter hostility to outcome accountability (and outcome evaluation) from communities and program people who fear that outcome measurement will not do justice to their underfunded intervention.
In responding to these fears, funders often find it easier to remove or move the goal posts than to strengthen the players. The typical forget-about-the-goal-posts conversation takes place a few months into the implementation phase of a program. The funder says to the grantee something along the following lines: So we gave you the grant in the hope that you would reduce teenage pregnancy and youth violence in this community, and you now say that was really an unrealistic expectation? You may be right. But we do need some hard evidence that our grant is making some sort of difference, so let's see if we can get an evaluator to design an attitude survey that will determine whether you have increased the number of teenagers who think it's a
bad idea to carry a gun and to initiate sex when they're younger than fifteen. Or the evaluators could document how many youngsters come to your meetings and classes. Alternatively, maybe we or you could hire an ethnographer to chronicle what's going on in your program ....
Some of these are useful things to do. It is especially useful to obtain rich descriptions of complex, nuanced interventions. But descriptions of process are most useful when they become part of a systematic inquiry into what the program is accomplishing and why. Descriptions of a process are not a substitute for either outcome-based accountability or outcome-based evaluation.
Section 9: Conclusion

In concluding, I would address those who still harbor grave doubts and a visceral unease about the whole idea of results accountability. Committed practitioners have every reason to ask why they should have to prove the value of their work. They point out that those who would dismantle the safety net and the whole infrastructure of public and nonprofit services and institutions are not arguing efficacy; they are arguing principle. These practitioners, along with parents, community leaders, and other advocates, wish to stand their ground on principle, and say that feeding young children and providing them with a safe and happy place to play is enough justification, that comforting a frightened adolescent needs no further rationale, that every expectant mother is entitled to the highest quality prenatal care, regardless of whether there is a payoff in higher rates of school readiness, employability, or healthy births. Other countries, after all, do not make public support for basic services for children and families contingent on proof of their merit. In France and Germany and Britain and Japan, publicly supported child care and maternal and child health care, paid family leaves, and universal child protective services are taken for granted and require no evidence of effectiveness.
American human service leaders see themselves as part of a tradition of service to the vulnerable whose value is ultimately independent of its effects. They cite Mother Teresa's explanation of her perseverance in the face of the enormity of world poverty: "God has called on me not to be successful, but to be faithful" (Kagan, 1993). They cite Gandhi's teaching that "It is the action, not the fruit of the action, that is important."
My own belief is that the moral underpinnings for social action, especially by government, are not powerful enough today, in the cynical closing years of the twentieth century, to sustain what needs to be done on the scale at which it needs to be done. In this time of pervasive doubt, the magnitude of public investment that is required will be forthcoming only if there is evidence that investments are achieving their purpose and contributing to long-term
goals that are widely shared. And the chances of developing and sustaining the responsive bureaucracies that can support effective programs will also increase to the extent that accountability for results can replace accountability for observing rigid and narrow procedural rules.
*Based on a chapter from a forthcoming book, tentatively titled Disturbing the universe: Strong families, supportive communities, responsive bureaucracies, and how to get there from here.
1. I use the words "results" and "outcomes" interchangeably. I use the word "goals" to refer to results that are desirable but cannot be readily measured or agreed upon.
2. While the bottom line of profit, market performance, and survival is clearly established in private business, there is no similar agreement on success in the public sector.
3. "You're seeing it everywhere," says James R. Fountain Jr., assistant research director at the Governmental Accounting Standards Board, "a growing frustration among taxpayers that they don't know what they're getting for their money" (Osborne & Gaebler, 1993, p. 140).
4. President Bill Clinton's remarks at the signing of the Government Performance and Results Act: "It may seem amazing to say, but like many big organizations, ours is primarily dominated by considerations of input (how much money do we spend on a program, how many people do you have on the staff, what kind of regulations and rules are going to govern it) and much less by output (does this work, is it changing people's lives for the better?)" (From red tape to results, 1993, p. 73).
5. Dr. Rivlin fully recognized the difficulty of coming up with good measures of performance; she called for "a sustained effort to develop performance measures suitable for judging and rewarding effectiveness ... all the strategies (here discussed) for finding better methods of delivering social services depend for their success on improving performance measures" (Rivlin, 1971, p. 144).
6. "Ethical core" is Sid Gardner's phrase (Gardner, 1995). Gardner, along with David Hornbeck, has been the most persistent and effective advocate of the shift to a results orientation.
7. Robert Slavin reports that students who are not reading at the end of first grade are at great risk; they don't catch up later (Slavin, 1995).
8. School readiness can be thought of as "a fixed standard of development sufficient to enable children to fulfill school requirements and to absorb the curriculum content" (Kagan cited in Phillips & Love, 1994). The National Head Start Association says that kindergarten teachers expect that entering students will be able to work both independently and as members of small and large groups, to attend to and finish a task, listen to a story in a group, follow two or three oral directions, take turns and share, care for their belongings, follow simple rules, respect the property of others, and work within the time and space constraints of a school program (National Head Start Association, 1995). Businessman and philanthropist Irving Harris tells the story of Doris Williams, who taught kindergarten in the inner city of Chicago and who told him she could always handle one child who wasn't ready for school. "But when I had two or three who were not ready, the extended attention they demanded meant that the rest of the class was denied the time they had a right to expect from me" (Harris, 1993).
9. The NGA's community capacity indicators (benchmarks) for school readiness include rates of: children in preschool and child care
programs; children in preschool and child care programs meeting prescribed standards; eligible children in Head Start and public preschool programs; communities with family support and education services; school-age parents receiving comprehensive services; children who experience consecutive or multiple out-of-home placements; pregnant women receiving prenatal care during the first trimester; children covered by Early Periodic Screening, Diagnosis, and Treatment (EPSDT) or private health insurance; and eligible participants in the Women, Infants, and Children (WIC) program.
10. In an earlier paper, Love wrote, "Our system (in which assessments are completed on a representative sample of children) avoids labeling children by focusing on aggregate measures for the community."
11. In the spirit of Sarason's conclusion that "Problems are constants, answers are provisional" (Sarason, 1990, p. xii), I would say that we could transform an understanding of our problems into an agreement about goals, and let those be our constants, as we evolve and experiment with our provisional programmatic answers.
12. In this process, it is important to start, as Alice Rivlin (1971) advises, with "indicators that measure movement in the appropriate direction." These include the following: measures of physical health (such as low rates of low birthweight babies, high rates of two-year-olds fully immunized, no untreated vision or hearing problems at school entry, low rates of sexually transmitted diseases); measures of school achievement; measures of perils avoided in adolescence (such as too-early childbearing, arrests for violent crime, suicide, homicide, substance abuse); and measures of productivity and economic well-being (such as rates of productive employment, and rates of families with incomes over the poverty line) (Rivlin, 1971, p. 47).
13. The authors of Outcome Funding are troubled by the fact that a public college they describe was saved from closing by a campaign to preserve the jobs that the university provided in a poor county; they suggest that if that is one of the desired results, 50% of the school's budget should be funded by the state's economic development agency. Similarly, food stamps seem to have been saved from the Gingrich assault in 1995 by agricultural interests, not by concern for the nutrition needs of the poor.
14. Brandon distinguishes between measures of well-being (which he calls positive measures) and measures of progress in reducing problems (which he calls negative measures) to argue in favor of using the former as the best way to approach results accountability. But he recognizes that "the popularity of negative measures (measures of poverty, dysfunction, and illness) reflects the ease of consensus on recognizing that some things are clearly inadequate, without the difficulty of consensus on defining what is adequate" (Brandon, 1992, p. 24).
15. I was at a meeting recently where Marie McCormick cited a study currently using IQ to find out how well a family support program was working.
16. In the vanguard of the work on neighborhood-level indicators are the Foundation for Child Development and researcher Claudia Coulton at Case Western Reserve University.
17. As one dramatic illustration, the coordinator of the external review team commissioned by the Pew Charitable Trusts to review its ambitious proposed Children's Initiative suggested that the problem of obtaining timely results information contributed to the decision to cancel the Initiative. Reflecting back on lessons learned, he wrote that "To build the public and political will to continue, projects must be able to demonstrate results in a credible way. State officials developing The Children's Initiative recognized that data on enhanced results were critical to expansion, but such data were unlikely to be available within the important early years of the project" (Krauskopf, 1994).
18. The evaluation steering committee of the Aspen Roundtable on Comprehensive Community Initiatives has been discussing the usefulness of a "Michelin Guide" to interim indicators, which would assess the degree of confidence with which the hypothesized connection between interim indicators and long-term results measures could be linked, all along the causal chain. The idea would be to distinguish among the connections that seem to be fairly well established, those where the evidence is weaker and the hypothesized connections urgently need to be tested, and those where even promising hypotheses are lacking.
19. Procedural protections will have to be maintained and monitored wherever there is no other way to restrict the arbitrary exercise of front-line discretion by powerful institutions against the interests of powerless clients. Because the present capacity to use outcome measures to judge program effectiveness is still primitive, and because it takes so long for results to improve in response to even the most effective interventions, existing process measures will continue to play a role in holding agencies, communities, and systems accountable during the period of transition. Increasingly, however, as new measures that are more closely and reliably related to results become available to measure initial progress toward ultimate goals, it will be important to continually re-examine the balance between the use of process and outcome measures, so that communities and agencies can make sure they utilize results-based accountability as much as the state of the art allows.
20. People who are responsible for programs, be they teachers, social workers, early childhood people, youth workers, or neighborhood residents and other program participants, often view evaluation research as an "unfriendly act," observed Peter Bell, when he was president of the Edna McConnell Clark Foundation (Bell, 1993).
References
Action Team on School Readiness. (1992). Every child ready for school. Washington, DC: National Governors' Association.
Barzelay, M. (1992). Breaking through bureaucracy: A new vision for managing government. Berkeley, CA: University of California.
Bell, P. (1993). Edna McConnell Clark Foundation report of the president.
Boyer, E. L. (1991). Ready to learn: A mandate for the nation. Princeton, NJ: Carnegie Foundation for the Advancement of Teaching.
Brandon, R. N. (1992). Social accountability: A conceptual framework for measuring child/family well-being. Washington, DC: Center for the Study of Social Policy.
Bredekamp, S. (1995, May 8). Memorandum to S. L. Kagan, M. Levine, & V. Washington.
Campbell, D. T. (1987). Problems for the experimenting society in the interface between evaluation and service providers. In Kagan, S. L., Powell, D. R., Weissbourd, B., & Zigler, E. (Eds.), America's family support programs: Perspectives and prospects. New Haven, CT: Yale University Press.
Carnegie Task Force on Meeting the Needs of Young Children. (1994). Starting points: Meeting the needs of young children. New York: Carnegie Corporation of New York.
DiIulio, J. J. (1994). Deregulating the public service: Can government be improved? Washington, DC: Brookings Institution.
From red tape to results: Creating a government that works better and costs less. (1993, September 7). Washington, DC: National Performance Review.
Gardner, S. (1995, April). Kissing babies and kissing off families. Speech given at Concord, MA.
Harris, I. B. (1993, Spring). "Education: Does it make a difference when you start?" Aspen Institute Quarterly, 5(2), 30-52.
Kagan, S. L. (1993, May). Going-to-scale with a comprehensive services strategy. Presentation at Wingspread conference.
Kagan, S. L. (1995, May). By the bucket: Achieving results for young children. Washington, DC: National Governors' Association.
Krauskopf, J. A. (1994, October). Overcoming obstacles to implementing reform of family and children's services. Paper prepared for the annual meeting of the Association for Public Policy Analysis and Management, Chicago, IL.
Love, J. M. (1995, June). Can we measure the results? Or, as society shifts towards results-based programs and services for young children, can the scientific community provide reliable, valid, and useful child results? Paper prepared for discussion at the Issues Forum on Child-Based Results, sponsored by the Carnegie Corporation of New York.
Manno, B. (1994, June). Outcome-based education: Miracle or plague? Indianapolis, IN: Hudson Institute, Hudson Briefing Paper No. 165.
Moore, K. (1994, November). Criteria for indicators of child well-being. Background paper for the Conference on Indicators of Children's Well-Being, Washington, DC.
National Head Start Association. (1995, May). National Head Start Association statement on child-based results. Washington, DC: Author.
National Research Council, Institute of Medicine. (1991). Improving instruction and assessment in early childhood education. Washington, DC: National Academy Press.
National Task Force on School Readiness. (1991, December). Caring communities: Supporting young children and families. Alexandria, VA: National Association of State Boards of Education.
Osborne, D. & Gaebler, T. (1993). Reinventing government. New York, NY: Penguin.
Phillips, D. & Love, J. (1994). Indicators for school readiness, schooling and child care in early to middle childhood. Paper prepared for the Conference on Indicators of Children's Well-Being, Washington, DC.
Rivlin, A. M. (1971). Systematic thinking for social action. Washington, DC: Brookings Institution.
Sarason, S. B. (1990). The predictable failure of educational reform. San Francisco, CA: Jossey-Bass.
Slavin, R. (1995, May). Three days in May: How to assess first grade reading, and why we must do so. Memorandum to May 1995 Carnegie Issues Forum on Child-Based Results.
White, S. (1995, June). Presentation at Carnegie Corporation Issues Forum on Child-Based Results, New York, NY.
Young, N. K., Gardner, S. L., & Coley, S. M. (1993). Getting to results in integrated service delivery models.
Zill, N. (1995, May). Responses to Issues Forum Meeting questions. Carnegie Issues Forum on Child-Based Results, New York, NY.
Appendix E: Can We Measure the Results?
Or, As Society Shifts Toward Results-Based Programs and Services for Young Children, Can the Scientific Community Provide Reliable, Valid, and Useful Child Outcome Measures?
John M. Love
Mathematica Policy Research, Inc.
Paper presented at the
Issues Forum on Child-Based Results
New York City
Section 1: Introduction
Lisbeth Schorr (1994) has made the case for shifting to results-based accountability as we strive to improve the lives of children and families. I accept the desirability of this shift. But now what? Can we actually measure the results that programs are now clamoring to articulate? I have been asked to consider this question from the perspective of science: the science of child development and the science of measurement.
WHERE THERE'S A WILL THERE'S A WAY
At some level, the answer has to be "yes," because we continually measure a wide range of results in a large number of programs. In fact, there is a long (though some might say checkered) history of using existing instruments to create a body of knowledge on the effectiveness of programs for children. Head Start, for example, has long measured its effectiveness with a variety of child outcome measures (Kresh 1993; McCall 1993; McKey et al. 1985). Studies of other preschool programs, like the Perry Preschool, have influenced public policy in major ways with only minimal questions about the accuracy of the results measures (Barnett 1992; Schweinhart, Barnes, and Weikart 1993). Early intervention studies, like the Infant Health and Development Project (IHDP) (Brooks-Gunn et al. 1994) and the Abecedarian project (Campbell and Ramey 1994; Ramey and Campbell 1991), are currently claiming significant benefits based on existing child outcome instruments. Extant instruments form the basis for our understanding of the benefits of high-quality child care (Helburn et al. 1995; Howes, Phillips, and Whitebook 1992). Furthermore, studies of family support programs also use existing measures to assess results for children (Barnett 1995; Larner 1992; St. Pierre, Layzer, and Barnes 1994; Yoshikawa 1994). Because we are measuring child results, there is at least tacit acceptance of the outcome measures by a sizable group of researchers, program operators, and policymakers. Without this acceptance, we could not have any confidence in the hundreds of studies we rely on for our understanding of programs, their accomplishments, and their impacts on children.
If, at some level, many researchers and program planners believe we can currently measure important results, we should still ask whether we are doing so as well as we should. Could we use better measures? Are we measuring the right results? Or the most important ones? Do we fail to learn about particular kinds of program effects because we lack the instruments? With sufficient resources (that is, time and money) wisely applied, there is no doubt in my mind that the scientific community can answer the feasibility question affirmatively. But, for this paper, I will assume there is no time and no additional money. The sponsors of the forum are interested in what is feasible now, not what is eventually possible!
Schorr (1994) and her colleagues did consider measures of programs' results for children that are suitable for immediate use. Her "start-up list" of possible measures includes many of the usual suspects: lower rates of low birthweight babies, more complete immunizations, lower school dropout rates, and fewer children abused or neglected, to name a few. These are critical results, particularly from a cost-conscious public policy perspective. Conspicuously absent, however, are measures of the developmental results that many early childhood program people (teachers, caregivers, parents, and child development experts) care about: communication skills, thinking ability, self-esteem, school-related knowledge, behavior problems, curiosity, and so forth. These are the multiple dimensions of children's early learning and development that have been so admirably and comprehensively articulated by the technical working group for the first national education goal (Kagan, Moore, and Bredekamp 1995). Although aspects of these developmental domains are commonly measured, as they have been in the studies cited earlier, we could long debate their scientific suitability.
In this paper, I discuss the scientific feasibility of measuring a wide spectrum of child results. All are encompassed by the notion of children's well-being. All are articulated among the goals of many programs that are striving to improve results for children and families. I illustrate measurement of various facets of well-being selected from the typical domains: health and physical development, intellectual or cognitive functioning, language and communication, and social and emotional development.
It may well be that measurement is more feasible in some domains than others. I also consider the possibility that feasibility is a function of the age of the child. I raise the possibility that measurement feasibility depends on other characteristics of the child, such as his or her native language and culture, and special needs or disabilities. Feasibility may vary with the type of measurement as well. Finally, I suggest that the science of psychological measurement is not the only factor influencing the feasibility of adopting a results-based orientation for children's programs. In fact, in spite of the theme of this forum, the scientific feasibility of measuring child results may not be the most important source of apprehension. Before getting to this topic, however, I discuss some of the more salient past and present child-outcome measurement efforts.
Section 2: Past Assessment Efforts: Heritage or Hamstring?
Within the early childhood community, the infamous "Westinghouse" study stands as a hallmark of failure in the science of evaluation: failed design, failed measurement, and failed policy. Not only did this early evaluation concentrate on measuring the intellectual development of Head Start children with an inappropriate test of "IQ" (chosen because it was "the best measure available"), it included noncomparable comparison groups under noncomparable program conditions. Although design flaws, unrealistic expectations, and problems with the media (including premature release of findings to The New York Times) compounded a bad choice of measures, it is often the outcome measure that takes the heat of later condemnation.
If the so-called Westinghouse evaluation set the field back a decade, the long-term benefits shown for graduates of Ypsilanti's Perry Preschool immeasurably advanced the cause of early childhood programs. I don't think there is a single study that has so inclined politicians to vote for additional funds for Head Start. In this case, a randomly assigned control group, long-term follow-up, and results that really matter (getting jobs, staying out of jail, avoiding pregnancy) far outweighed earlier reports of fading IQ gains (Barnett 1992). In fact, the Perry Preschool study, along with the other longitudinal early childhood education studies begun in the 1960s (Lazar, Darlington, Murray, Royce, and Snipper 1982), was instrumental in awakening interest in a wide range of educational, social, and economic results that extend well beyond typical measures of children's development.
Another program from this era was the Brookline Early Education Project (BEEP), distinguished by being an integral part of the public school system and providing services to children and families from birth until entry into kindergarten. Outcome measures included standard developmental
measures (such as the Bayley Scales of Infant Development and the Stanford-Binet); measures of language development, health and physical development, and school achievement; and a detailed classroom-observational measure of children's mastery skills, social skills, and use of time (Hauser-Cram, Pierson, Walker, and Tivnan 1991). The broader range of results measured for BEEP children can be justifiably envied by today's early childhood programs.
This highly selective sampling of early childhood program evaluations from the '60s and '70s shows that the field did not lack for outcome measures. Yet, study after study has begun by declaring that adequate measures do not exist. There are sound scientific and political reasons for not wanting to perpetuate the "psychometrically" strong IQ tests, and the highly program-relevant and sensitive, but labor-intensive, observational measures like those used in the BEEP evaluation cannot easily be used on a large scale.
In the context of these exciting studies, about twenty years ago, I began planning the evaluation of a Head Start demonstration program that had ambitious goals for affecting 4- to 8-year-olds' "social competence." The evaluation team identified four dimensions of social competence in an effort to do justice to Ed Zigler's broad conception of social competence as "an individual's everyday effectiveness in dealing with his environment, . . . his ability to master appropriate formal concepts, to perform well in school, to stay out of trouble with the law, and to relate well to adults and other children" (quoted by Anderson and Messick 1974, p. 283). The early 1970s precursor to the Administration on Children, Youth and Families (the Office of Child Development) simplified the concept to a concern with the child's "everyday effectiveness in dealing with his environment and responsibilities in school and life." The domains we identified were social-emotional development, psychomotor development, language development, cognitive skills, and health and nutrition, not unlike the current dimensions defined by the Goal 1 Technical Planning Group of the National Education Goals Panel (Kagan, Moore, and Bredekamp 1995).
In 1975, we concluded that "no measure that is already fully developed has been found that meets all the specific selection criteria ..." (Love, Wacker, and Meece 1975, p. 3). It is instructive to look at the instrument selection criteria used in that study, because they are similar to those followed in practically every early childhood program evaluation I know of, and are still relevant to the issue facing us today. Fifteen criteria in three areas were used to evaluate potential instruments:
Practical Considerations:
* Available for use by fall 1975 ("immediately")
* Appropriate for use by trained paraprofessionals
* Test format appropriate for ages of children in program
* Scoring procedures appropriate for data processing
* Reasonable testing time for young children

Psychometric Qualities:
* Adequate construct and/or predictive validity
* Adequate test stability and internal consistency
* Culture and/or SES fair
* Representativeness of standardization sample
* Low correlation with index of general information

Relevance to the Program:
* Spans appropriate age range
* Spanish-language adaptation available
* Relevant to program's cognitive and language goals
* Likely to demonstrate program effects
* Used in previous national evaluations or large-scale studies

Each of these criteria represents a factor influencing our answer to the question of the scientific feasibility of measuring policy-relevant child results. In this instance, the primary shortcomings of the instruments were their failure to span the total 4- to 8-year age range of the program population, lack of relevance to program goals, and failure of the test standardization samples to represent fully the geographic or SES features of the population participating in the program.
During this same period, the Office of Education (OE), the middle element in what was then the U.S. Department of Health, Education, and Welfare, was conducting a massive national evaluation of an early elementary (kindergarten through third grade) planned-variation curriculum study called "Follow Through." In addition to conducting its multimillion-dollar national evaluation, OE allowed the curriculum sponsors to conduct their own evaluation activities. A consortium of the institutions sponsoring curricula that today we would call "developmentally appropriate" joined forces to develop measures that would be more sensitive to developmental results than the standardized achievement tests used in the national evaluation.
The High/Scope Follow Through model, for example, developed a procedure for assessing written language that demonstrated that fluency and complexity in the writing of Follow Through students increased as a direct function of the extent to which teachers implemented the High/Scope curriculum (Bond, Smith, and Kittel 1976). The Bank Street College Follow Through model developed a complex observation instrument that showed Follow Through children engaging in more self-initiated communication, expression of thoughts, and peer communication than comparison children (Bowman and Mayer 1976). These and other sponsor evaluation efforts were well-intentioned and in fact did produce useful findings to counterbalance the national evaluation results (Hodges et al. 1980). But the enormous task was too much for the sponsors' paltry evaluation resources and time, and they could not effectively
counter the negative messages of the national evaluation.

In the late 1970s, OCD decided that the sorry state of
measurement feasibility was a major obstacle to obtaining good evaluations of the Head Start program. The agency therefore launched a coordinated effort to develop new instruments. The strategy included contracts to Educational Testing Service and the RAND Corporation to define and map the landscape of what should be measured (Anderson and Messick 1974; Raizen and Bobrow 1974), and two major instrument development projects. In the first, OCD contracted with ETS to create a battery of measures spanning 16 domains in the areas of cognitive, language, and perceptual-motor characteristics of preschool and kindergarten children. Called the CIRCUS battery because of the pictorial theme using clowns, balloons, and animals, the battery offered a special feature, EL CIRCO. This Spanish edition was distinguished by the fact that it was developed in tandem with the English-language version rather than being a translation of a test initially developed only for English-speaking children (Anderson et al. 1974). Today, no one hears about CIRCUS or EL CIRCO.
The second project was the Head Start Measures Project, begun in 1977. In 1980, the project team conducted an extensive review of measures of children's development. We reviewed more than 200 measures, reaching the following conclusion:
Very few measures show content validity as defined by the developmental characteristics of children identified through the input workshops.' Information on construct validity is virtually nonexistent. Moreover, very few measures possess sufficient reliability to warrant confidence in evaluation findings based on their use. (Mediax Associates 1980, p. 34)
Members of the national panel convened for the Head Start Measures Project had equally discouraging observations. The comments of panelists representing three perspectives illustrate the measurement problem that we, and the field, faced:
I have reviewed all of the major studies of Head Start and related programs and all of the instruments used in this research to assess children's social development. I cannot recommend any of these instruments for adoption in this project since in no case did I find satisfactory evidence of the instrument's validity. (Carew 1978, p. 7)
Indeed, most of the scales of motor development available contain so few items per age level . . . that they are unlikely to be discriminating except under circumstances of marked differences between the groups being assessed. (Eichorn 1978, pp. 5-6)
It is clear that evaluators of Head Start have not taken into account, in their selection of measures, the complex issues underlying the identification of goals and objectives of the program. (Laosa 1978, p. 19)
The Head Start Measures Project is the only sustained effort I know of that was single-mindedly devoted to developing better outcome measures for the full spectrum of children's well-being. Researchers across four different institutions tackled instrument development in four domains: (1) health and physical; (2) cognitive; (3) social-emotional; and (4) "applied strategies." (This fourth domain foreshadowed the "approaches toward learning" domain defined by Kagan et al. [1995] for the first national education goal.) After only two years of development work, the government canceled funding for all but the cognitive domain. Unfortunately, the project's dismal outcome, documented by Raver and Zigler (1991), does not make us eager to try again.
Today, we are often hamstrung by failed measurement development attempts and the negative exemplars of major evaluations of early childhood programs. We are told that this history shows it cannot be done. These experiences are at least partly responsible for a climate of measurement avoidance throughout the early childhood community. It is now "common knowledge" that the concept of test validity for young children is an oxymoron. Teachers, administrators, or program evaluators who recommend individual, standardized, controlled assessments of young children are misguided at best and evil at worst. I do not deny that valid concerns exist, or that the practice of testing has involved enormous abuses, but it is both wrongheaded and shortsighted to reject all testing out of hand. Even within this climate, however, positive examples exist. I turn now to some of these to consider what lessons they carry.
Section 3: Recent and Current Assessment Efforts: Enlightened Endeavors or Imprudent Illusion?

In three recent and current activities, researchers have identified child results that are relevant for particular purposes and have proposed useful measurement approaches. Here, we see that different types of measurement procedures are identified; however, constraints on the definition or construction of the measures in each case cloud our ability to focus on the nature of the problem for results-based programs.
CHILD RESULTS IN THE CONTEXT OF COMMUNITY-BASED FAMILY PROGRAMS
In 1993, Mathematica Policy Research undertook the challenge of designing an evaluation of the Pew Charitable Trusts' Children's Initiative that would be as comprehensive and innovative as the Initiative itself. Although we knew the task would not be easy, we were convinced it was feasible. I choose The Children's Initiative evaluation as an example because it illustrates the child results we were convinced could be measured within the context of a community-wide program that also had broad health and family functioning goals. When The Children's Initiative ended, Thornton, Love, and Meckstroth (1994) extended Mathematica's design work to propose a system of measures that would be appropriate for other community programs with outcome goals similar to those of The Children's Initiative.
We developed plans for measures related to results in five broad areas: (1) child and family health; (2) family functioning; (3) child development; (4) school performance; and (5) youth maturation and social integration. Two of the topics under child and family health (incidence of preventable diseases and disabilities, and overall health of children), as well as the areas of child development and school performance, are pertinent to the child-based results theme of this forum. For each outcome that might relate to a goal of a community program in each of these areas, we listed the expected results (both intermediate and long term), the recommended measure for each outcome, and the measurement procedure or data source. We also provided a summary of evidence, when available, on how sensitive the measure is to community interventions, the strengths and liabilities of the measure, and a brief statement as to its policy relevance (see Tables 1 through 4).
As with any set of recommended measures, we developed this list with some restrictions in mind. First, we tried to find measures that are sensitive to changes in the well-being of children brought about by community-wide programs. Second, we wanted measures based on data that local communities would be capable of collecting. Third, we gave priority to measures for which there would be comparative data in national surveys or other studies that could provide a frame of reference for interpreting local data. Finally, our selection was influenced by the potential for making policy-relevant statements about changes in children's status on the measure.
The measures in Tables 1-4 rely on two major data collection strategies: interviews or surveys and administrative records. We did not include controlled, individualized assessments of children, not because of concerns about their scientific feasibility, but because of practical feasibility, primarily the costs of data collection. A major reason for constraining our selection of measures was our desire to make available a system that communities could maintain after the evaluation ended. Communities seldom have the resources for ongoing assessments that are possible with a specially funded formal evaluation. We thought we were proposing a scientifically feasible system of measures, appropriate for the goals (desired results) of The Children's Initiative and suitable for implementation in the context of community-wide programs, but six months after we submitted our preliminary evaluation design, the Trusts decided not to proceed with implementing the Initiative.
I have often wondered how large a role the adequacy of the outcome measures played in the Trusts' decision. Clearly, it was a complex decision in which many factors were weighed. As the coordinator of the external review team commissioned by the Trusts to examine the Initiative objectively, Krauskopf (1994) has reflected on the lessons learned. One of the four areas of concern he articulated was "evaluation, outcome measurement, and accountability." Referring to the outcome focus of the Initiative, Krauskopf noted that "there are not well-agreed upon indicators and measures for three of the Initiative's four outcome areas: child development, family functioning, and school readiness." Krauskopf's central concern in the measurement area, however, makes it clear that the adequacy of the outcome measures is a concern, but that this concern is embedded in a larger matrix of issues that includes evaluation design and the policymaking process:
Because large-scale systems projects must be prepared to justify themselves as they proceed, the absence of solid outcome measurement data and operational management systems is particularly harmful to generating ongoing support. To build the public and political will to continue, projects must be able to demonstrate results in a credible way. State officials developing The Children's Initiative recognized that data on enhanced results were critical to expansion, but such data were unlikely to be available within the important early years of the project. (Emphases added)
In part, we may have failed to demonstrate the scientific feasibility of producing the valid measures outlined in Tables 1-4; certainly, none of the measures is perfect. It must be recognized, however, that the scientific basis for outcome measurement, even had it been extremely strong, would not have been sufficient. Two other factors are also important. First, programs must be able to translate the results of measurement into policy-relevant conclusions about the linkage between program activities and measured results: Did this specific community intervention actually produce the changes that evaluators observed in the performance of children on the measures? Second is the issue of timing. There is almost always a tension between the research and evaluation schedule, which must wait for the program processes to unfold and produce their impacts, and the political agenda, which demands immediate "hard" evidence that the investment is paying off. In this context, as we have seen in the case of Head Start, it is often tempting to blame the measures.
CHILD RESULTS IN THE CONTEXT OF SCHOOL READINESS ASSESSMENT
My second example takes us to a particularly sensitive arena, that of measuring the extent to which the developmental progress of 5-year-old children provides them the wherewithal to succeed in school. The issue of school readiness assessment is rife with debate, the pros and cons of which I will not go into here. Let it suffice to say that the attitude of measurement avoidance is particularly acute when the results of the measurements have the potential for labeling children and leading to critical decisions about their educational trajectories (grade placements, tracking, and so forth). With support from the Pew Charitable Trusts, Larry Aber, Jeanne Brooks-Gunn, and I analyzed the dimensions of children's early learning, development, and abilities as defined by the Goal 1 Technical Planning Group (Kagan et al. 1995) and searched the literature for measures that would be appropriate for communities to use to provide data on important elements of each dimension. The results of this process are summarized in Table 5.
Each of the measures listed in the right-hand column of Table 5 has been used in previous surveys or research and is supported by evidence of its reliability and validity. Also important for the question of feasibility, we judged each measure to be appropriate for diverse groups of children representing the entire spectrum of socioeconomic status, geographic regions, racial-ethnic groups, language groups, and disability status found in this country. Each of the measures was selected because it assesses one or more of the constructs embodied in one of the dimensions of children's development and learning, is appropriate for 5-year-olds, and is available for use with relatively little further adaptation. Two other considerations further constrained our search for appropriate measures. First, we wanted the set of measures, taken as a whole, to be practically feasible. That is, the assessment process should not be so time-consuming or require such extensive training or materials that school or community personnel would have difficulty administering the measures on a relatively large scale. Second, we looked for measures on which we had access to national data for drawing comparisons.
Any application of child results measurement will have a similar set of constraints. We must keep these in mind as we contemplate the scientific feasibility of assessing child results. In other words, the characteristics of the measures are not the only consideration.
Four separate measurement procedures are required to assess all these indicators: (1) a self-administered parent questionnaire; (2) teacher-conducted assessments of children's development; (3) ratings by teachers; and (4) school health records. Feasibility is enhanced by the use of multiple sources of data. For example, both parents and teachers
complete ratings on aspects of children's social and emotional development. Teachers obtain data on children's motor development, approaches toward learning, language usage, and cognition and general knowledge using a standardized screening instrument.
This system of measures is yet to be tried, but it appears to have face validity, if the expressions of community interest that we have received are any indication. We know that the measures exist and are scientifically strong. We don't know the full extent to which (1) their administration will be practical and affordable; or (2) the data they generate will speak to the needs of communities that are concerned about these dimensions of development as possible results of their systems of health care, child care, parent support, and early education.
CHILD RESULTS IN THE CONTEXT OF A SYSTEM OF INDICATORS
Last year, the organizers of a conference on "indicators of children's well-being" commissioned 24 papers, in which prominent researchers described what child and adolescent well-being indicators are feasible to measure. Recommendations were subject to a constraint unlike those seen in the previous examples: the measures must be obtainable from national surveys. Even with this severe restriction, conference participants considered a large number of measures of children's well-being to be both feasible and desirable to collect on a national basis. The measures cover the domains of child health; education; economic security; population, family, and neighborhood; and social development and problem behaviors (Brown 1994). The following partial listing from that conference illustrates what is feasible within the context of national survey data collections for children from birth to 5 years:
Child Health
* Healthy birth index
* Percentage of infants born with congenital anomalies
* Child abuse/neglect rate
* Percentage of children ever experiencing a delay in growth or development
* Percentage of children limited by chronic health conditions
* Percentage of children who regularly use seat belts

Education
* Percentage of 3- to 5-year-olds enrolled in preschool
* Percentage of children ages 3 to 5 who are read to every day by a parent or household member
* Percentage of children over 3 years who ever had learning disabilities

Economic Security
* Percentage of children in poverty
* Percentage of children in families receiving food stamps in past year
* Percentage of children in households where both or only parents are working
* Percentage of children living in inadequate housing

Population, Family, and Neighborhood
* Percentage of children who have moved within the past year
* Percentage of children living in institutions or group quarters
* Percentage of children living in severely distressed neighborhoods

Social Development and Problem Behaviors
* Percentage of children with high rates of behavior problems
Because these measures must be collectable through a national survey, they may be less useful for measuring results in certain types of programs. On the other hand, they may be particularly useful for evaluating community-wide programs like The Children's Initiative and many of the family support programs around the country.
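To make the mechanics concrete: indicators like those listed above reduce, computationally, to (often weighted) percentages over survey records. The sketch below is purely illustrative; the record fields, weights, and values are invented for this example and do not come from any actual national survey file.

```python
# Toy survey microdata; every field name and value here is hypothetical.
records = [
    {"age": 4, "in_poverty": True,  "weight": 1200.0},
    {"age": 2, "in_poverty": False, "weight": 950.0},
    {"age": 5, "in_poverty": True,  "weight": 800.0},
    {"age": 3, "in_poverty": False, "weight": 1050.0},
]

def weighted_pct(rows, predicate):
    """Weighted percentage of rows satisfying predicate."""
    total = sum(r["weight"] for r in rows)
    hits = sum(r["weight"] for r in rows if predicate(r))
    return 100.0 * hits / total

# Indicator: percentage of children under 5 in poverty.
under_5 = [r for r in records if r["age"] <= 5]
print(weighted_pct(under_5, lambda r: r["in_poverty"]))  # 50.0
```

The arithmetic is trivial; the scientific work, of course, lies upstream, in the survey design, the sampling weights, and the validity of the underlying questions.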
Section 4: It All Depends: Tentative Lessons About Influences on Measurement Feasibility

Three different strategies have illustrated what some researchers believe to be the scientific feasibility of measuring child results. To make this discussion very concrete, I now describe four illustrative measures. These examples have practical feasibility, which I took as a threshold requirement for considering any measure useful to this forum. In other words, either they have been used in large-scale studies, where the cost and burden of data collection are important considerations, or I can imagine them being used in such studies. I have selected these four to illustrate that (1) outcome measurements are feasible; (2) a range of features of children's development and well-being can be measured; but that (3) a number of factors influence measurement feasibility. They do not constitute a random sample of measures, but neither do I think they are entirely atypical. I realize that to fully answer the question of this forum, this analysis should be done with a much larger sample of measures. If, however, these four suggest a positive answer, this analysis should provide sufficient grist for our debate.
Table 6 summarizes critical features of the four instruments: (1) Behavior Problems Index (BPI) (Zill 1990); (2) Early Screening Inventory (ESI) (Meisels et al. 1988); (3) MacArthur Communicative Development Inventories (CDI) (Fenson et al. 1993); and (4) Social Skills Rating System (SSRS) (Gresham and Elliott 1990). The results measured by these four instruments include aspects of many of the important dimensions of children's early development and learning: motor, social-emotional, language, and cognition and general knowledge. The instruments span different age ranges, from infancy through the elementary grades. They also represent different measurement approaches: parent reports and ratings (CDI and BPI); teacher ratings (SSRS); and direct individual assessments of children in controlled settings (ESI). All but the CDI have been relatively widely used. The CDI is still undergoing developmental research (Fenson 1993), whereas many large-scale surveys have incorporated the BPI (Love 1994; Zill and Schoenborn 1990). One of the measures (CDI) was explicitly designed to measure results for infants and toddlers; the BPI is probably suitable down to age 2, even though it was initially developed for preschoolers and older children (Brooks-Gunn and Ross 1991). The ESI is designed primarily for the preschool-to-first-grade years, and the SSRS teacher version is designed for elementary school teachers to complete (preschool and secondary versions are also available).
I've indicated that it is not completely fair to draw broad generalizations from this small subset of measures, so I will overgeneralize! Using these four instruments, as well as knowledge about many other measures that is not documented here, I now suggest some of the factors that may determine the scientific feasibility of adopting a results orientation for early childhood programs.
AGE AND OTHER CHARACTERISTICS OF THE CHILD
The older the child, the stronger the measures. But not always! As a general rule, when children become older, they are easier to talk to, they better understand what we ask of them, and they respond more consistently to instructions. If our measures depend on communicating with the child and getting a response in return, then this general rule applies. Standard aptitude tests are often thought to be more valid with older children and adolescents than with babies. When a good multidimensional developmental assessment is created (such as the Bayley Scales of Infant Development), it is rejected for all but the most well-funded evaluations. Notice, however, that such in-depth measures often become the benchmarks against which the validity of newer, more efficient assessments (like the CDI) is judged.
The very widely used Peabody Picture Vocabulary Test (PPVT-R) (Dunn and Dunn 1981) is a good example of an instrument that spans a very wide age range, is practical to administer, but is inappropriate for young children. This test relies on formalized interactions between adult and child, with the adult asking questions (for example, "Which is the ladder?"), and the child being required to respond in very restricted ways by pointing to or giving the letter designation for the correct one out of four pictures. There is ample room for bias to creep in: the adult's pronunciation, the familiarity of the vocabulary within the child's culture, the child's interpretation of the inelegant drawings, and so forth. In spite of these problems, the PPVT has been widely used in program evaluations, like the Infant Health and Development Program (IHDP), and large-scale surveys, like the National Longitudinal Survey of Youth (NLSY). It has a number of enviable psychometric characteristics, including good reliability and predictive validity, yet fails on our criteria of appropriateness for diverse cultural and racial groups (because of vocabulary selection and drawings) and relevance to program goals (since it measures only receptive vocabulary understanding, a very narrow slice of the goals that most programs have for child development). Language has always been a difficult outcome to measure, partly because of its sensitivity to context. So, perhaps the domain of development is more important than the child's age.
DOMAINS OF DEVELOPMENT
For the last 10 or 15 years, Larry Fenson and his colleagues have struggled with ways of measuring the productive language of infants and toddlers. What have they found? That parents are not only a convenient, but a highly reliable, source of information on the vocabulary and syntax of their own children. Parents (mostly mothers) can observe their child's language across multiple settings over time in ways that would be extremely time-consuming and expensive for outside observers to duplicate. Taken together, the infant and toddler scales of the CDI describe the course of language development, from nonverbal gestures in infancy, through early vocabulary usage, to the beginnings of grammatical speech. Interestingly, the growth curves fitted to the parent reports of various language functions closely parallel predictions from language theory (Dale et al. 1989; Fenson et al. 1993). So here we have the case of a strong measure for very young children in a domain that is fraught with measurement difficulties. Should we, perhaps, put more reliance on parent reports?
MEASUREMENT METHODS
The Behavior Problems Index provides another example of the successful use of parent reports of children's behavior. But the Social Skills Rating System has been more successful with its teacher-rating version than with the parent form, at least in terms of the internal consistency of the scales (Gresham and Elliott 1990). On the other hand, the Early Screening Inventory has almost single-handedly erased our usual disdain for developmental screening instruments. The ESI demonstrates not only reliability and validity but also sensitivity, an especially important quality when teachers and other school personnel are concerned with correctly identifying children who should be referred for further diagnosis. It has not been widely used in program evaluations (and, in fact, was not developed for that purpose), but it is certainly a better candidate than many of the traditional screening instruments that have been so used.

In-person assessments, like the ESI, which rely on structured settings that carefully control the demand characteristics for children's responses, generally cost more to administer if they go beyond picture-book multiple-choice formats. Contributing to the psychometric strength of the ESI are certainly the systematic procedures and training programs that ensure consistent administration across teachers and settings. Good parent and teacher rating forms, like the BPI, CDI, and SSRS, also provide systematic instructions for their administration, but that task is easier since direct contact with the child is not required. Thus, it seems that the measurement method is not a determining factor so much as the care with which procedures are established to ensure that the intentions of the instrument developer are carried out.
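The internal consistency invoked in comparing the SSRS forms is conventionally summarized as Cronbach's alpha, computed from the variances of individual items and of the total score. A minimal sketch of that calculation, using made-up ratings rather than actual SSRS items or data:

```python
def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(scores):
    """Cronbach's alpha: scores is a list of respondent rows,
    each row a list of item ratings (at least two items)."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical ratings: 5 respondents x 4 items on a 1-5 scale.
ratings = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(ratings), 3))  # 0.965
```

A higher alpha means the items move together, which is the sense in which the teacher form of a rating scale can be said to outperform the parent form on the same construct.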
EVALUATION DESIGN ISSUES
Characteristics of the child, the domains being measured, and how we go about conducting the measurement process are all important considerations. It is not clear to me, however, that our experience suggests useful rules of thumb for selecting the best measure for a particular program's outcome goals. There is simply too much measure-to-measure variation, as seen in the perhaps extreme difference between the PPVT and the CDI. It is time to consider not only the measures themselves, but also how they are used.
Suppose we have one or two scientifically valid and reliable instruments that program stakeholders agree measure results they are trying to achieve. How, then, are the results of our scientifically valid outcome measures to be used? The desirability of moving toward a child-based results orientation is based on the assumption that the results of the measurement process mean something to the various program stakeholders: policymakers, early care and education systems people, families and communities, and program staff. Even with a perfectly valid measure of children's language usage or behavior problems, we cannot interpret the scores unless there is highly convincing evidence that the program had something to do with them. This is not the place for a thorough discussion of evaluation design issues, but we must recognize that even the best outcome measures are useless in the face of our inability to implement evaluation designs that permit unambiguous conclusions about program effects. I submit that problems in evaluation design are much more likely than invalid outcome measures to be the "scientific" basis for impeding movement toward results-based programming for children.
Section 5: Closing Thoughts

Each year about 750,000 children enroll in Head Start. Responding to recent concerns about program quality, ACYF is designing a system of "performance measures" that will be results-oriented. The system will include measures of children's social competence, because the agency is convinced that quality improvements will be reflected in better results for children. Does the field have perfect measures to recommend to ACYF for use next year, or the year after? No. Should we tell ACYF, and indirectly the Congress, that it is impossible to judge the results of enhanced quality? I think not. We know enough now to specify the conditions under which it is scientifically feasible to measure child results of Head Start.
Three months from now, about four million children will enter public and private schools for the first time. Will they have a successful experience? Well, a sizable proportion will not actually enter regular kindergartens because their parents, school system, or private counselors will deem them "unready." Twelve months later, almost 200,000 children will spend a second year in kindergarten because their school doesn't believe they have made enough progress to be successful in first grade. An additional 160,000 or so will also experience delayed entry to first grade because their school system places them in some type of transitional class after the kindergarten year. Many of these decisions (50 percent or more of the cases) will be justified by the results of some type of assessment. We already know better ways to make these decisions, ways that will place children at far lower risk for school failure, later dropout, and so forth (Shepard and Smith 1989). Are any of these outcome measures perfect? No. But each year that we wait for the perfect measure, we diminish the chances for successful schooling for another 350,000 or more children.
Uncounted millions of children participate in family support programs each year at a cost of tens of millions of dollars. There is keen interest in knowing the extent to which these children are better off because of the support and services their families receive. Should we tell policymakers that it is premature to assess these benefits? That programs will just have to proceed without knowing how they affect children's well-being? Again, I say no. And again, we know enough to advise programs on the conditions under which effective results measurement can be conducted, on which dimensions of well-being, and with which measurement procedures.
If it is desirable to shift to results-based accountability in programs for children, it is also scientifically feasible to do so. I am fully aware that this brief paper and my selective review of measures do not solidly bolster this conclusion. Yet, it seems impossible to deny the good measures we have available. In arriving at this conclusion, I want to be clear that an affirmative answer to the feasibility question is not a blanket endorsement of all measures for all children in all programs under all circumstances. The results that are measured, the procedures that are selected, and the conditions under which the assessments are conducted and interpreted all have implications for programs and families, and ultimately for the children who we hope will benefit through our endeavors.
1. The planning phase of the Measures Project included a series of "input workshops," in which project staff met with representatives of Head Start administrators, teachers, and parents in various regions of the country. Conducted like large focus groups, these workshops were designed to ascertain what representatives of the Head Start community (the stakeholders of future program evaluations) believed to be important indicators of children's development.
2. Although Tables 1-4 have their foundation in preliminary recommendations Mathematica Policy Research developed for possible use in the evaluation of The Children's Initiative, we have made a number of modifications since The Children's Initiative ended. No endorsement of these particular results or measures by the Pew Charitable Trusts should be inferred.
I am grateful to Craig Thornton for helping me articulate the complexities of measurement issues in the context of evaluating community-wide initiatives.
3. For the purposes of this discussion of child results measures, we can ignore the first four dimensions in Table 5. They reflect the community conditions that the National Governors' Association identified as supporting children's development and learning and are important for a complete assessment within the framework of the first national education goal (see Love et al. 1994).
4. The conference was sponsored by the Institute for Research on Poverty at the University of Wisconsin; Child Trends, Inc.; the Office of the Assistant Secretary for Planning and Evaluation (U.S. Department of Health and Human Services); the National Institute of Child Health and Human Development; and the Annie E. Casey Foundation.
5. I will not go into the power of random assignment, the virtues of having a control-group counterfactual, or the various quasi-experimental design alternatives available. Hollister and Hill (1995) have recently presented a highly intelligent and readable discussion of these issues with particular reference to evaluating community-wide initiatives.
References
Anderson, Scarvia, and Samuel Messick. "Social Competency in Young Children." Developmental Psychology, vol. 10, 1974, pp. 282-293.

Anderson, S.B., G.A. Bogatz, T.W. Draper, A. Jungeblut, G. Sidwell, W.C. Ward, and A. Yates. CIRCUS Manual and Technical Report (preliminary version). Princeton, NJ: Educational Testing Service, 1974.

Barnett, W. Steven. "Benefits of Compensatory Preschool Education." Journal of Human Resources, vol. 27, 1992, pp. 102-120.

Barnett, W. Steven. "Research on Family Support Programs: Current Status and Future Directions." Paper presented at the biennial meeting of the Society for Research in Child Development, Indianapolis, March 1995.

Bond, James T., Allen G. Smith, and Janice M. Kittel. "Evaluation of Curriculum Implementation and Child Results." Annual Report, Cognitively Oriented Curriculum Project Follow Through. Ypsilanti, MI: High/Scope Educational Research Foundation, 1976.

Bowman, G., and Rochelle Mayer. "The BRACE System of Interaction Analysis: What, Why, and How?" New York: Bank Street College of Education, 1976.

Bradley, R., and B. Caldwell. "The Relation of Infants' Home Environments to Achievement Test Performance in First Grade: A Follow-Up Study." Child Development, vol. 55, 1984, pp. 803-809.

Bradley, R., and B. Caldwell. Home Observation for Measurement of the Environment: Validation Studies of the Preschool Version. Paper presented at the Annual Meeting of the Mid-South Educational Research Association, New Orleans, 1978.

Brooks-Gunn, Jeanne, and Christine Ross. "Evaluating Child Results of the Expanded Child Care Options Demonstration: An Analysis Plan." Princeton, NJ: Mathematica Policy Research, Inc., May 1991.

Brooks-Gunn, J., C. McCarton, P. Casey, M. McCormick, et al. "Early Intervention in Low Birthweight, Premature Infants: Results Through Age 5 From the Infant Health and Development Program." Journal of the American Medical Association, vol. 272, no. 16, 1994, pp. 1257-1262.
Brown, Brett. "Indicators of Children's Well-Being: A Review of Current Indicators Based on Data From the Federal Statistical System." Paper prepared for the conference on Indicators of Children's Well-Being, Bethesda, MD, November 1994.

Campbell, Frances A., and Craig T. Ramey. "Effects of Early Intervention on Intellectual and Academic Achievement: A Follow-Up Study of Children from Low-Income Families." Child Development, vol. 65, 1994, pp. 684-698.

Carew, Jean V. "Head Start Profiles of Program Effects on Children: Position Paper for Mediax ACYF Project." Westport, CT: Mediax Associates, 1978.

Dale, Phillip S., Elizabeth Bates, J. Steven Reznick, and C. Morisset. "The Validity of a Parent Report Instrument of Child Language at 20 Months." Journal of Child Language, vol. 16, 1989, pp. 239-249.

Dunn, L., and L. Dunn. Peabody Picture Vocabulary Test, Revised Edition. Circle Pines, MN: American Guidance Service, 1981.

Edozien, J., B. Switzer, and R. Bryan. "Medicaid Evaluation of the Special Supplemental Food Program for Women, Infants, and Children." The American Journal of Clinical Nutrition, vol. 32, March 1979.

Egbuonu, Lisa, and B. Starfield. "Inadequate Immunization and the Prevention of Communicable Diseases." In The Effectiveness of Medical Care: Validating Clinical Wisdom, edited by Barbara Starfield. Baltimore: Johns Hopkins University Press, 1985.

Eichorn, Dorothy H. "Head Start Profiles of Program Effects on Children: Motor Development." Westport, CT: Mediax Associates, 1978.

Eisen, M., C.A. Donald, J.E. Ware, and R.H. Brook. Conceptualization and Measurement of Health for Children in the Health Insurance Study. Santa Monica, CA: The Rand Corporation, 1980.

Fenson, Larry. "Parent Report: A Neglected Source of Data." Paper presented at the biennial meeting of the Society for Research in Child Development, New Orleans, March 1993.

Fenson, Larry, Phillip S. Dale, J. Steven Reznick, Donna Thal, Elizabeth Bates, Jeffrey P. Hartung, Steve Pethick, and Judy S. Reilly. MacArthur Communicative Development Inventories: User's Guide and Technical Manual. San Diego: Singular Publishing Group, 1993.
Gresham, Frank M., and Stephen N. Elliott. Social Skills Rating System: Manual. Circle Pines, MN: American Guidance Service, 1990.
Hanushek, Eric A. "The Impact of Differential Expenditures on School Performance." Educational Researcher, vol. 18, no. 4, May 1989, pp. 45-51, 62.
Hauser-Cram, Penny, Donald E. Pierson, Deborah K. Walker, and Terrence Tivnan. Early Education in the Public Schools: Lessons from a Comprehensive Birth-to-Kindergarten Program. San Francisco: Jossey-Bass, 1991.
Helburn, Suzanne, et al. "Cost, Quality, and Child Outcomes in Child Care Centers: Executive Summary." Denver: University of Colorado at Denver, 1995.
Hodges, W., A. Branden, R. Feldman, J. Follins, J. Love, R. Sheehan, J. Lumbley, J. Osborn, R.K. Rentfrow, J. Houston, and C. Lee. Follow Through: Forces for Change in the Primary Schools. Ypsilanti, MI: High/Scope Press, 1980.
Hofferth, S.L., A. Brayfield, S.G. Deich, and P. Holcomb. "The National Child Care Survey 1990." Washington, DC: Urban Institute, 1991.
Hollister, Robinson G., and Jennifer Hill. "Problems in the Evaluation of Community-Wide Initiatives." In New Approaches to Evaluating Community Initiatives: Concepts, Methods, and Contexts, edited by James P. Connell, Anne C. Kubisch, Lisbeth B. Schorr, and Carol H. Weiss. Washington, DC: The Aspen Institute, 1995.
Howes, Carollee, Deborah A. Phillips, and Marcy Whitebook. "Thresholds of Quality: Implications for the Social Development of Children in Center-Based Child Care." Child Development, vol. 63, 1992, pp. 449-460.
Infant Health and Development Program. "Enhancing the Outcomes of Low-Birth-Weight, Premature Infants: A Multisite, Randomized Trial." Journal of the American Medical Association, vol. 263, no. 22, June 13, 1990, pp. 3035-3042.
Kagan, Sharon L., Evelyn Moore, and Sue Bredekamp (Eds.). "Reconsidering Children's Early Development and Learning: Toward Common Views and Vocabulary." Washington, DC: National Education Goals Panel, June 1995.
Koblinsky, S.A., M.L. Taylor, and L.W. Douglass. "Relationships Between Maternal, Child, and Shelter Characteristics and Homeless Children's Developmental Status." College Park, MD: University of Maryland, 1995.
Krauskopf, James A. "Overcoming Obstacles to Implementing Reform of Family and Children's Services." Paper presented at the annual meeting of the Association for Public Policy Analysis and Management, Chicago, October 1994.
Kresh, Esther. "The Effects of Head Start: What Do We Know?" Washington, DC: U.S. Department of Health and Human Services, Administration on Children, Youth and Families, 1993.
Laosa, Luis M. "Evaluating the Effects of Head Start on Children's Cognitive Development." Westport, CT: Mediax Associates, 1978.
Larner, Mary. "Realistic Expectations: Review of Evaluation Findings." In Fair Start for Children: Lessons Learned from Seven Demonstration Projects, edited by Mary Larner, Robert Halpern, and Oscar Harkavy. New Haven: Yale University Press, 1992.
Lazar, I., R. Darlington, H. Murray, J. Royce, and A. Snipper. "Lasting Effects of Early Education: A Report from the Consortium for Longitudinal Studies." Monographs of the Society for Research in Child Development, vol. 47, nos. 2-3 (serial no. 195), 1982.
Love, John M. "Indicators of Problem Behaviors and Problems in Early Childhood." Invited paper prepared for the Conference on Indicators of Children's Well-Being, National Institute of Child Health and Human Development, Bethesda, MD, November 17-18, 1994.
Love, John M., J. Lawrence Aber, and Jeanne Brooks-Gunn. "Strategies for Assessing Community Progress Toward Achieving the First National Education Goal." Princeton, NJ: Mathematica Policy Research, Inc., October 1994.
Love, John M., Sally Wacker, and Judith Meece. "A Process Evaluation of Project Developmental Continuity, Interim Report II, Part B: Recommendations for Measuring Program Impact." Ypsilanti, MI: High/Scope Educational Research Foundation, June 1975.
Mallar, Charles D., and Rebecca A. Maynard. "The Effects of Income Maintenance on School Performance and Educational Attainment." In Research in Human Capital and Development: A Research Annual, vol. 2, edited by Ali Khan and Ismail Sirageldin. Greenwich, CT: JAI Press Inc., 1981.
Mathematica Policy Research, Inc. "Survey of Mothers and Children: Mother's Questionnaire for the Interactions and Developmental Processes Study." Princeton, NJ: MPR, 1991.
Mathematica Policy Research, Inc. "Teenage Parent Demonstration 24-Month Follow-Up Questionnaire." Princeton, NJ: MPR, 1993.
Maynard, Rebecca. "Some Aspects of Income Distribution: The Effects of the Rural Income Maintenance Experiment on the School Performance of Children." American Economic Review, vol. 67, no. 1, 1977, pp. 370-374.
Maynard, Rebecca, and Richard Murnane. "The Effects of a Negative Income Tax on School Performance: Results of an Experiment." The Journal of Human Resources, vol. 14, no. 4, 1979, pp. 463-476.
McCall, Robert B. "Head Start: Its Potential, Its Achievements, Its Future: A Briefing Paper for Policymakers." Pittsburgh, PA: University of Pittsburgh, Office of Child Development, 1993.
McCartney, Kathleen, Sandra Scarr, Deborah Phillips, Susan Grajek, and J. Conrad Schwarz. "Environmental Differences Among Day Care Centers and Their Effects on Children's Development." In Day Care: Scientific and Social Policy Issues, edited by Edward F. Zigler and Edmund W. Gordon. Boston: Auburn Press, 1982.
McKey, R.H., L. Condelli, H. Gamson, B. Barrett, C. McConkey, and M. Plantz. The Impact of Head Start on Children, Families, and Communities: Final Report of the Head Start Evaluation, Synthesis, and Utilization Project. Washington, DC: U.S. Department of Health and Human Services, June 1985.
Mediax Associates. "Accept My Profile: Perspectives for Head Start Profiles of Program Effects on Children." Westport, CT: Mediax Associates, 1980.
Meisels, Samuel J., Martha Stone Wiske, Laura W. Henderson, Dorothea B. Marsden, and Kimberly Browning. ESI 4-6: Early Screening Inventory for Four Through Six Year Olds. Revised, third edition. New York: Teachers College Press, 1988.
National Center for Education Statistics. National Household Education Survey, 1993. Washington, DC: NCES, 1993.
National Center for Health Statistics. "Design and Estimation for the National Health Interview Survey, 1985-94." Vital and Health Statistics, series 2, no. 110. Bethesda, MD: U.S. Department of Health and Human Services, Public Health Service, Centers for Disease Control, August 1989.
Olds, David L., and Harriet Kitzman. "Can Home Visitation Improve the Health of Women and Children at Environmental Risk?" Pediatrics, vol. 86, 1990, pp. 108-116.
Peterson, J.L., and N. Zill. "Marital Disruption, Parent-Child Relationships, and Behavioral Problems in Children." Journal of Marriage and the Family, vol. 48, 1986, pp. 295-307.
Pierson, D.E., D.K. Walker, and T. Tivnan. "A School-Based Program from Infancy to Kindergarten for Children and Their Parents." Personnel and Guidance Journal, vol. 62, no. 8, 1984, pp. 448-455.
Raizen, Senta, and S.B. Bobrow. Design for a National Evaluation of Social Competence in Head Start Children. Santa Monica, CA: Rand Corporation, 1974.
Ramey, Craig T., and Frances A. Campbell. "Poverty, Early Childhood Education, and Academic Competence: The Abecedarian Experiment." In Children in Poverty: Child Development and Public Policy, edited by Aletha C. Huston. New York: Cambridge University Press, 1991.
Ramey, Craig T., Donna M. Bryant, and Tanya M. Suarez. "Preschool Compensatory Education and the Modifiability of Intelligence: A Critical Review." In Current Topics in Human Intelligence, edited by D. Detterman. Norwood, NJ: Ablex Publishing Corporation, 1983.
Raver, C. Cybele, and Edward F. Zigler. "Three Steps Forward, Two Steps Back: Head Start and the Measurement of Social Competence." Young Children, vol. 46, no. 4, 1991, pp. 3-8.
Schorr, Lisbeth B. "The Case for Shifting to Results-Based Accountability, with a Start-Up List of Outcome Measures with Annotations." Washington, DC: Center for the Study of Social Policy, Fall 1994.
Schwartz, Donald F., Jeanne Ann Grisso, Carolyn Miles, John H. Holmes, and Rudolph L. Sutton. "An Injury Prevention Program in an Urban African-American Community." American Journal of Public Health, vol. 83, no. 5, May 1993, pp. 675-680.
Schweinhart, Lawrence J., Helen V. Barnes, and David P. Weikart. "Significant Benefits: The High/Scope Perry Preschool Study Through Age 27." Monographs of the High/Scope Educational Research Foundation, no. 10. Ypsilanti, MI: High/Scope Educational Research Foundation, 1993.
Shepard, Lorrie, and M.L. Smith (Eds.). Flunking Grades: Research and Policies on Retention. New York: Falmer Press, 1989.
Starfield, Barbara. The Effectiveness of Medical Care: Validating Clinical Wisdom. Baltimore: Johns Hopkins University Press, 1985, pp. 48-57.
St. Pierre, Robert, Jean Layzer, and Helen Barnes. "Variation in the Design, Cost, and Effectiveness of Two-Generation Programs." Paper prepared for the Eighth Rutgers Invitational Symposium on Education: New Directions for Policy and Research in Early Care and Education, Princeton, NJ, October 1994.
Thornton, Craig, John M. Love, and Alicia L. Meckstroth. "Community-Level Measures for Assessing the Status of Children and Families." Princeton, NJ: Mathematica Policy Research, December 1994.
Yoshikawa, Hirokazu. "Prevention as Cumulative Protection: Effects of Early Family Support and Education on Chronic Delinquency and Its Risks." Psychological Bulletin, vol. 115, 1994, pp. 28-54.
Zill, Nicholas. Behavior Problems Index. Washington, DC: Child Trends, Inc., 1990.
Zill, Nicholas, and Charlotte A. Schoenborn. "Developmental, Learning, and Emotional Problems: Health of Our Nation's Children, United States, 1988." Advance Data from Vital and Health Statistics, No. 190. Hyattsville, MD: National Center for Health Statistics, November 16, 1990.
TABLE 1
HEALTH OUTCOMES AND MEASURES: INCIDENCE OF PREVENTABLE DISEASES AND DISABILITIES IN CHILDREN

Intermediate- and Long-Term Outcomes Expected:
Reduced number of cases of diseases for which immunization is available--pertussis, polio, measles, mumps, or rubella
Decrease in postneonatal mortality rates or in race/ethnicity differential in postneonatal mortality rates
Decrease in preventable infant mortality
Increased use of safety precautions to reduce accidents and unintentional injury

Measures Recommended (Data Source):
Postneonatal mortality rate for all children and by race/ethnicity (birth certificate)
Postneonatal mortality rate > 2/1000 (birth certificate)
Number of children with diseases listed (public health records and/or survey)
Proportion of families using injury prevention: seat belts, ipecac syrup in house, working smoke alarms (survey)

Evidence of Sensitivity to Community Interventions:
Some evidence that postneonatal mortality is responsive to changes in the provision of medical care (Starfield 1985).
Immunization found to decrease the incidence of vaccine-preventable communicable diseases (Egbuonu and Starfield 1985).
An injury prevention program in a poor African American community resulted in increased proportion of homes with functioning smoke detectors, syrup of ipecac, safely stored medications, and reduced electrical and tripping hazards (Schwartz et al. 1993).

Strengths [Liabilities] of the Measures:
[Linked birth and death records files are available from states with a long time lag.]
[Incidence of vaccine-preventable communicable diseases is low and outcomes are thus not appropriate for statistical analysis.]

Policy Relevance:
Reduced infant mortality
Reduced health care costs, improved physical health
Reduced childhood morbidity and mortality

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 2
HEALTH OUTCOMES AND MEASURES: OVERALL HEALTH OF CHILDREN

Intermediate- and Long-Term Outcomes Expected:
More positive parent perceptions of child's health status, with continued improvement
Fewer children with functional limitations due to health conditions, with continued reduction
Fewer children with morbidities or serious morbidities
Increase in proportion of children who are within age-appropriate height and weight norms

Measures Recommended (Data Source):
NHIS question on mother's rating of child's health (survey)
RAND Health Perceptions Survey Scale (survey)
Morbidity Index and Serious Morbidity Index (survey)
Height and weight relative to age norms, percentiles and z-scores (school registration records)

Evidence of Sensitivity to Community Interventions:
Home visit programs improve maternal health and reduce risk of infant health problems (Olds and Kitzman 1990).
No or few effects detected of an early intervention program for low-birthweight, preterm infants (Infant Health and Development Program 1990).
WIC participation by children reduced the percentage of children below the 10th percentile of height and weight for their age (Edozien et al. 1979).

Strengths [Liabilities] of the Measures:
Scale widely used and useful in assessing current health status, changes in health status from treatment, and prediction of mortality
Established scale with high internal consistency and good concurrent validity with other reported measures of health
Combination of individually rare events to give a more sensitive outcome measure
Reliable and valid measures easily available from direct measurements or medical records review (Shonkoff 1992)

Policy Relevance:
Improved overall health of children
Improved functional status of children
Improved physical health of children
Improved health of children

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 3
CHILD DEVELOPMENT OUTCOMES AND MEASURES

Participation by Children in Early Education and Care Programs Before Kindergarten, and Language Development

Intermediate- and Long-Term Outcomes Expected:
Increase in enrollments in early education and care programs, with continued increases
Longer waiting lists for early childhood programs; increased supply of early childhood programs and perhaps shorter waiting lists
Increased levels of receptive, expressive, and productive language, with continued increases

Measures Recommended (Age Range) and Data Source:
Enrollment records (ages 1-5 years); Head Start records, child care system databases [Data may not represent community as a whole.]
National Household Education Survey (NHES) items on types of programs attended, length of attendance, etc. (ages 3-6 years); school registration records; survey
Provider and service worker assessments of adequacy of supply (ages 1-5 years); program records; survey
MacArthur Communicative Development Inventories--CDI--Short Form (ages 1-3 years); survey

Evidence of Sensitivity to Community Interventions:
Cognitive and language development are enhanced by such factors as involvement in quality child care and early education programs and increased levels of stimulation in the home (McCartney et al. 1982; Ramey et al. 1983).

Strengths [Liabilities] of the Measures:
Direct link to outreach activities; obtainable from program records [Some communities may have data in a number of different databases.]
Used in 1993 NHES
Short form highly correlated with full CDI, which has strong construct validity: strong correlations between CDI vocabulary production scores and laboratory measures; mothers' reports of onset of grammatical functions closely parallel language development theory
Age norms available [although may not be applicable to low-income samples]
Used in NIH Infant Day Care Study [Norming sample did not have large numbers of minority group members.] [Limited to 8- to 36-month age range]

Policy Relevance:
Availability and use of child care are associated with parents' ability to work and reduction in welfare dependence.
Participation in early childhood programs expected to improve school readiness.
Language skills are central to school readiness and success in school. Relates to National Education Goal One: domain of language usage.

TABLE 3 (continued)

School-Related Knowledge and Skills, and Social Well-Being

Intermediate- and Long-Term Outcomes Expected:
Increases in school-related knowledge and skills
Improved motor development and coordination
Increased levels of cooperation, assertion, and responsibility, and increased degree of self-control
Lowered levels of headstrong, antisocial, anxious, depressed, and overly dependent behavior

Measures Recommended (Age Range) and Data Source:
NHES items on knowledge of colors, letters, numbers, and writing (ages 3-5); survey
NHES items (ages 5-6); school registration
Early Screening Inventory (ESI) (ages 4-6); school registration
NHES items on buttoning, holding a pencil, writing, and drawing; motor coordination (ages 3-5); survey
Social Skills Rating System--Parent Form (SSRS) (ages 3-5); survey
SSRS Teacher Form (ages 5-6); school registration
Behavior Problems Index--BPI (ages 2-5); survey

Evidence of Sensitivity to Community Interventions:
School-related knowledge is enhanced by preschool program participation and improved family functioning (Barnett 1992).
Preschool programs can enhance school performance (Schweinhart et al. 1993).
Quality child care improves social development (Howes et al. 1992).
Parent ratings on the BPI distinguish children who have and have not received psychological help (Zill 1990); children in families with low levels of conflict experience fewer behavior problems (Peterson and Zill 1986); children with preschool experience have fewer classroom behavior problems (Pierson et al. 1984).

Strengths [Liabilities] of the Measures:
Used in 1993 NHES
Good norms, strong reliability; has companion parent rating form
Good norms and reliability
Standardized on national sample of 5,000 children, racially, ethnically, and socioeconomically mixed
Emphasizes positive behaviors
Designed as parent-report measure
BPI is widely used (JOBS, NLSY-CS, NHIS); strong psychometric properties

Policy Relevance:
School readiness: relates to cognitive and general knowledge
School readiness: relates to physical well-being and motor development
Social skills are important for successful adjustment to kindergarten; relates to social and emotional development and domain of approaches toward learning
Behavior problems can interfere with school readiness and success. Children with fewer behavior problems are less likely to require mental health services (Achenbach et al. 1991). Relates to social and emotional development.

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 4
SCHOOL PERFORMANCE OUTCOMES AND MEASURES

Remediation, Attendance, Grade Progression, and School Achievement

Intermediate- and Long-Term Outcomes Expected:
Decreased proportions of children identified as developmentally delayed at kindergarten entry
Reductions in number of children assigned to special education programs
Improved attendance
Reductions in rates of children retained in grade
Decreased school dropout rates, with continued decreases
Improved basic skills and academic achievement

Measures Recommended (Data Source):
Scores on developmental screening tests (kindergarten registration forms)
Number and proportion of children with special education assignments (school system records)
Days absent/present (school records)
Proportion of students over age for grade (school records)
School dropout rate (school records)
Screening test scores (school system records)
Standardized achievement test scores (school system records)

Evidence of Sensitivity to Community Interventions:
Enrollment in preschool programs and increased levels of intellectual stimulation in the home shown to decrease developmental delays (Ramey et al. 1983) and special education placements (Barnett 1992; Schweinhart et al. 1993).
Improvements in the home environment can improve school attendance (Maynard 1977).
Quality preschool experience reduces rate of grade retentions (Lazar et al. 1982; Schweinhart et al. 1993).
Improvements in the home environment can increase probability of completing high school and increasing educational attainment (Mallar and Maynard 1981). However, rigorously evaluated school-based programs for teenagers have shown little effect.
Cognitive and social performance improved by preschool attendance (see child development outcomes).
School achievement improved by attendance in quality preschool programs (Barnett 1992; Lazar et al. 1982), family background factors (Hanushek 1989), improvements in the home environment (Maynard 1977; Maynard and Murnane 1979; Mallar and Maynard 1981), and infant and preschool home environments (Bradley and Caldwell 1978, 1984).

Strengths [Liabilities] of the Measures:
Inexpensive to collect (if available) [Subject to possible inconsistencies in school registration procedures]
Inexpensive to collect (if available) [Subject to inconsistencies in school records]
Inexpensive to collect [Subject to variation in school retention policies]

Policy Relevance:
Developmental delays are a major source of increased educational costs.
Special education services are very costly.
Poor attendance interferes with school completion.
Grade retentions result in increased education costs.
Dropping out of high school can decrease future earnings potential.
Related to National Education Goals.

SOURCE: Thornton, Love, and Meckstroth (1994).
TABLE 5
SCHOOL-ENTRY ASSESSMENT STRATEGIES FOR THE READINESS DIMENSIONS OF THE FIRST NATIONAL EDUCATION GOAL
Readiness Indicator -- Measure

Access to High-Quality and Developmentally Appropriate Preschool Programs

Increased enrollments in early care and education programs -- Questions on types of programs attended (Head Start, nursery school, state prekindergarten, center day care, family day care), age of first attendance, and duration of attendance (a)
Improved quality of early care and education programs -- Questions on program's daily and weekly schedule, group size, and child-staff ratio (a)
Increased stability of child care arrangements -- Questions on number of different early care and education settings experienced (a, b)
Increased percentage of high-risk children enrolled in early intervention programs -- Question on enrollment in early intervention programs (c)

Every Parent Will Be a Child's First Teacher and Devote Time Each Day to Helping His or Her Preschool Child Learn

Increased amount of time spent with child in intellectually challenging activities -- Questions on frequency of reading, storytelling, teaching activities, playing games, discussing science or nature, etc. (a)
Increased number of educational materials in the home -- Questions on number of books, games, and other educational materials in the home (a)
Increased regulation of children's television viewing -- Questions on regulation (rules covering content and hours) of children's television viewing (a)
Increased enriching experiences outside the home provided by parents -- Questions on frequency of visits to libraries, museums, zoos, plays, concerts, churches, cultural organizations, and games or sports (a)

Parents Will Have Access to the Training and Support They Need

Increased attendance at parenting classes in the community -- Questions on attendance at classes (d, e)
Increased availability of parenting classes, social clubs, parent groups, counseling opportunities, social service agencies, and other supports -- Questions on knowledge of and access to parenting classes, social clubs, parent groups, counseling opportunities, social service agencies, and other supports (d, e)

Children Will Receive the Nutrition and Health Care Needed to Arrive at School with Healthy Minds and Bodies

Reduced percentage of low-birthweight babies -- Questions on child's weight at birth (f)
Increased access to prenatal care -- Question on number of prenatal care visits (f)
Increased percentage of children receiving regular medical care -- Questions on regular source of routine care (f)
Increased percentage of children receiving regular well-child examinations -- Questions on routine health checkup in past 12 months (f)
Decreased use of emergency room for nonemergency care -- Question on emergency room use (f)
Increased percentage of children receiving immunizations at appropriate ages -- Questions on completed immunizations
Increased percentage of children having private or public health insurance -- Questions on health insurance (f)
Increased percentage of children having regular vision and hearing screenings -- Questions on vision and hearing screenings in past 12 months (f)
Decreased percentage of children having previously undetected vision and hearing problems -- Questions on children referred for treatment (f)
TABLE 5 (continued)
Readiness Indicator -- Measure

Increased percentage of children having completed dental checkup -- Questions on dental visit in past 12 months (f)
Increased percentage of children having nutrition screening before kindergarten entry -- Questions on nutrition screening (f)
Increased percentage of children referred before kindergarten for treatment of mild asthma, tuberculosis, cerebral palsy, mental retardation, autism, or other pervasive developmental disorders -- Questions on treatment referral (f)

Physical Well-Being and Motor Development

Physical Well-Being

Increased overall health status of children -- Child's health rating (a, f)
Decreased percentage of children having functional limitations because of health conditions -- Ratings of child's functional limitations (g)
Reduction in percentage of children with morbidities or serious morbidities -- Child's health rating (h)
Decreased number of hospitalizations -- Questions on hospitalization (f)
Increased percentage of children within age-appropriate height and weight norms -- Items on height and weight relative to age norms (by direct examination or medical record review)

Motor Development

Improved fine-motor development and coordination -- Items on block-building, draw-a-person, and copying forms (i)
Improved gross-motor skills -- Items on gross-motor/body-awareness scale (i)

Social and Emotional Development

Social Development

Increased levels of assertion -- Scores on assertion scale (j)
Decreased levels of aggressive behavior, dependence, and headstrong behavior -- Scores on scales measuring aggressive behavior, dependence, and headstrong behavior (k)
Increased cooperation and ability to help, communicate problems, and follow rules -- Scores on cooperation scale (j)

Emotional Development

Reduced levels of anxiety and depression -- Scores on anxiety and depression scales (k)

Approaches Toward Learning

Self-Control and Self-Regulation

Increased ability to control temper, attend to instructions, and take turns -- Scores on self-control scale (j)

Task Attention

Increased ability to attend to task and to remember auditory material received -- Items on auditory sequential memory scale (i)

Language Usage

Verbal Expression

Increased expressive language, speaking ability, and ability to describe objects -- Items on verbal expression scale (i)
TABLE 5 (continued)
Readiness Indicator -- Measure

Cognition and General Knowledge

Visual Sequential Memory

Increased ability to remember what is seen -- Items on visual sequential memory (i)

Number Concepts

Increased ability to count and to understand simple quantitative concepts -- Items on number concepts (i)

Verbal Reasoning

Increased relational-thinking ability -- Items on verbal reasoning (i)

General Knowledge

Increased school-related knowledge and skills -- Questions on knowledge of colors, letters, numbers, and writing (a)

Family Background, Demographics, and Contextual Variables

Mother's education; father's education; mother's age at birth of first child; household structure; household income; employment status of mother and father; number of residential moves; length of residence in community; child's age and gender; child's disabilities; race/ethnicity of child; child's contacts with father (if father not in home); number and ages of siblings; language(s) spoken in the home; neighborhood characteristics; school characteristics -- Questions selected from various instruments (e)
SOURCE: Love, Aber, and Brooks-Gunn (1994).
a. National Household Education Survey (NHES:93) (NCES 1993).
b. National Child Care Survey 1990 (Hofferth et al. 1991).
c. Interactions and Developmental Processes study questionnaire (MPR 1991).
d. Teenage Parent Demonstration 24-Month Follow-Up questionnaire (MPR 1993).
e. Measures to be selected.
f. National Health Interview Survey (NHIS) Child Health Supplement (NCHS 1989).
g. RAND Health Perceptions Scale (Eisen et al. 1980).
h. Morbidity Index and Serious Morbidity Index (Brooks-Gunn et al. 1994).
i. Early Screening Inventory (ESI) (Meisels et al. 1988).
j. Social Skills Rating System (SSRS) (Gresham and Elliott 1990).
k. Behavior Problems Index (BPI) (Zill 1990).
TABLE 6
CHARACTERISTICS OF FOUR ILLUSTRATIVE CHILD OUTCOME MEASURES

Measures compared: Behavior Problems Index (BPI; Zill 1990); Early Screening Inventory (ESI; Meisels et al. 1988); MacArthur Communicative Development Inventories (CDI; Fenson et al. 1993); Social Skills Rating System, Elementary Teacher Form (SSRS; Gresham and Elliott 1990).

Outcomes measured
  BPI: Behavior problems labeled as headstrong, aggressive, anxious, depressed, and immature/dependent
  ESI: Visual-motor/adaptive (draw-a-person, fine-motor control, eye-hand coordination, visual-sequential memory, reproducing 2- and 3-dimensional visual structures); language and cognition (comprehension, verbal expression, reasoning, counting, remembering auditory sequences); gross-motor/body awareness (large-muscle coordination; balancing, hopping, skipping; imitating body positions from visual cues)
  CDI: Words and Gestures form, for infants (a): phrases, vocabulary comprehension, vocabulary production, early gestures, later gestures. Words and Sentences form, for toddlers: vocabulary production, irregular nouns and verbs, sentence complexity
  SSRS: Cooperation, assertion, self-control, externalizing behaviors, internalizing behaviors, hyperactivity, academic competence

Age range assessed
  BPI: 2-5 years
  ESI: 4-6 years
  CDI: Infant form, 8-16 months; toddler form, 16-30 months
  SSRS: 5-11 years (grades K-6)

Type of assessment
  BPI: Rating by parent
  ESI: Structured individual assessment by teacher
  CDI: Parent report
  SSRS: Rating by teacher

How administered
  BPI: Telephone interview, self-report, or in-person interview
  ESI: Administered by teacher
  CDI: Self-report by parent
  SSRS: Written response
TABLE 6 (continued)

(Measures: BPI = Behavior Problems Index, Zill 1990; ESI = Early Screening Inventory, Meisels et al. 1988; CDI = MacArthur Communicative Development Inventories, Fenson et al. 1993; SSRS = Social Skills Rating System, Elementary Teacher Form, Gresham and Elliott 1990.)

Administration time
  BPI: 10 minutes
  ESI: 15-20 minutes
  CDI: 20-30 minutes
  SSRS: 10 minutes

Standardization and norms
  BPI: Data available on 17,110 children 17 years of age and under from the 1988 National Health Interview Survey of Child Health
  ESI: Normed on national sample of 2,746 children, 44 percent nonwhite
  CDI: Norming sample of 671 infants and 1,142 toddlers in New Haven, Seattle, and San Diego
  SSRS: Norming sample of 5,000 children, geographically, racially, and socioeconomically heterogeneous (956 for the elementary teacher form); included 19 percent handicapped

Previous and/or current use
  BPI: National Longitudinal Survey of Youth Child Supplements (1986, 1988); National Health Interview Survey: Child Health Supplement (1981, 1988); NICHD infant day care study; JOBS child impact study
  ESI: Evaluation of developmental status of homeless children (Koblinsky, Taylor, and Douglas 1995)
  CDI: Currently used in NICHD infant day care study
  SSRS: Head Start-Public School Transition Demonstration evaluation; under consideration for use in the Early Childhood Longitudinal Study (NCES)

Strengths
  BPI: Extensive national data available for comparative use; short and easy to administer
  ESI: Strong predictive validity with the McCarthy Scales of Children's Abilities (Meisels et al. 1993); high interscorer and test-retest reliabilities; refers a high proportion of children actually at risk, and excludes most of the children not at risk (Meisels et al. 1993)
  CDI: Good predictive validity (using observational data as criterion measure); moderate to high internal consistency, depending on scale; good cross-form, cross-age stability during the period of early and more rapid vocabulary expansion
  SSRS: Good psychometrics; norms for handicapped and nonhandicapped children; assesses positive social skills as well as problem behaviors

Liabilities
  BPI: Programs may not like the negativity of focus on problem behaviors
  ESI: Administration time could constitute a significant burden if a teacher is asked to assess multiple children
  CDI: Only 13 percent of the norming sample were minority group members; 78 percent of parents had some college education or a college degree
  SSRS: Administration time could constitute a significant burden if a teacher is asked to rate multiple children
TABLE 6 (continued)

(Measures: BPI = Behavior Problems Index, Zill 1990; ESI = Early Screening Inventory, Meisels et al. 1988; CDI = MacArthur Communicative Development Inventories, Fenson et al. 1993; SSRS = Social Skills Rating System, Elementary Teacher Form, Gresham and Elliott 1990.)

Comments
  BPI: Ratings by teacher or other knowledgeable adult also possible
  ESI: Administration by interview also possible
  CDI: Short form is being field tested; may be suitable for older children who are developmentally delayed

(a) This list represents the subscales of the CDI. Various variables can be generated as outcome measures, e.g., age at which x percent of the sample uses plurals, possessives, progressives, and past tense.
U.S. Department of Education
Office of Educational Research and Improvement (OERI)
Educational Resources Information Center (ERIC)
REPRODUCTION RELEASE
I. DOCUMENT IDENTIFICATION:
(Specific Document)
Title: Considering Child-Based Results for Young Children: Definitions, Desirability, Feasibility, & Next Steps
Author(s): Sharon Lynn Kagan, Sharon Rosenkoetter, and Nancy Cohen
Corporate Source:
Publication Date:
II. REPRODUCTION RELEASE:

In order to disseminate as widely as possible timely and significant materials of interest to the educational community, documents announced in the monthly abstract journal of the ERIC system, Resources in Education (RIE), are usually made available to users in microfiche, reproduced paper copy, and electronic/optical media, and sold through the ERIC Document Reproduction Service (EDRS) or other ERIC vendors. Credit is given to the source of each document, and, if reproduction release is granted, one of the following notices is affixed to the document.

If permission is granted to reproduce and disseminate the identified document, please CHECK ONE of the following two options and sign at the bottom of the page.
Check here for Level 1 Release: Permitting reproduction in microfiche (4" x 6" film) or other ERIC archival media (e.g., electronic or optical) and paper copy. The sample sticker shown below will be affixed to all Level 1 documents:

PERMISSION TO REPRODUCE AND DISSEMINATE THIS MATERIAL HAS BEEN GRANTED BY
TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
Level 1

Check here for Level 2 Release: Permitting reproduction in microfiche (4" x 6" film) or other ERIC archival media (e.g., electronic or optical), but not in paper copy. The sample sticker shown below will be affixed to all Level 2 documents:

PERMISSION TO REPRODUCE AND DISSEMINATE THIS MATERIAL IN OTHER THAN PAPER COPY HAS BEEN GRANTED BY
TO THE EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)
Level 2

Documents will be processed as indicated provided reproduction quality permits. If permission to reproduce is granted, but neither box is checked, documents will be processed at Level 1.
"I hereby grant to the Educational Resources Information Center (ERIC) nonexclusive permission to reproduce and disseminate this document as indicated above. Reproduction from the ERIC microfiche or electronic/optical media by persons other than ERIC employees and its system contractors requires permission from the copyright holder. Exception is made for non-profit reproduction by libraries and other service agencies to satisfy information needs of educators in response to discrete inquiries."
Signature:
Organization/Address:
Yale University Bush Center in Child Development and Social Policy
310 Prospect Street
New Haven, CT 06511
Printed Name/Position/Title:
Sharon Lynn Kagan / Senior Associate
Telephone:
(203) 432-9931
FAX:
(203) 432-9933
E-Mail Address:
Date: 12/1/97
III. DOCUMENT AVAILABILITY INFORMATION (FROM NON-ERIC SOURCE):
If permission to reproduce is not granted to ERIC, or, if you wish ERIC to cite the availability of the document from another source, please provide the following information regarding the availability of the document. (ERIC will not announce a document unless it is publicly available, and a dependable source can be specified. Contributors should also be aware that ERIC selection criteria are significantly more stringent for documents that cannot be made available through EDRS.)
Publisher/Distributor:
Yale University Bush Center in Child Development and Social Policy
Address:
310 Prospect Street
New Haven, CT 06511
Price:
IV. REFERRAL OF ERIC TO COPYRIGHT/REPRODUCTION RIGHTS HOLDER:
If the right to grant reproduction release is held by someone other than the addressee, please provide the appropriate name and address:
Name:
Address:
V. WHERE TO SEND THIS FORM:
Send this form to the following ERIC Clearinghouse:
KAREN E. SMITH
ACQUISITIONS COORDINATOR
ERIC/EECE
CHILDREN'S RESEARCH CENTER
51 GERTY DRIVE
CHAMPAIGN, ILLINOIS 61820-7469
However, if solicited by the ERIC Facility, or if making an unsolicited contribution to ERIC, return this form (and the document beingcontributed) to:
ERIC Processing and Reference Facility
1100 West Street, 2d Floor
Laurel, Maryland 20707-3598
Telephone: 301-497-4080
Toll Free: 800-799-3742
FAX: 301-953-0263
e-mail: [email protected]
WWW: http://ericfac.piccard.csc.com
(Rev. 6/96)