© 1996, Carolyn B. Cropper

A GENERALIZABILITY THEORY STUDY OF A SURVEY

INSTRUMENT TO IDENTIFY GIFTED AND TALENTED

STUDENTS: THE LOOKING FOR TRAITS, ATTRIBUTES,

AND BEHAVIORS STUDENT REFERRAL FORM

by

CAROLYN BROWN CROPPER, B.S.C, M.A.

A DISSERTATION

IN

EDUCATIONAL PSYCHOLOGY

Submitted to the Graduate Faculty of Texas Tech University in

Partial Fulfillment of the Requirements for

the Degree of

DOCTOR OF EDUCATION

Approved

Accepted

December, 1996

ACKNOWLEDGMENTS

I would like to express my appreciation to my committee

chair, Dr. Mary Tallent-Runnels, for her support, guidance and

understanding and to the other members of my committee, Dr. Joe

Cornett and Dr. Julie Thomas, for their support, guidance and

understanding.

I would also like to express my appreciation to my family,

Michael D. Cropper and Austin Cropper, for their encouragement and

patience.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF TABLES
CHAPTER
I. INTRODUCTION
    Statement of the Problem
    Purpose of the Study
    Research Questions
    Definition of Terms
    Assumptions and Limitations
II. REVIEW OF THE LITERATURE
    Issues Related to the Identification of Gifted Minority Children: An Overview
    Issues to Consider When Assessing Students From Diverse Backgrounds
        Behavioral Differences
        Language Differences
        Differences in Cognitive Styles
    Survey Instruments
    Teachers as Raters
    Parents as Raters
    Reliability and Validity of Survey Instruments
III. METHODOLOGY
    Participants
        Teachers
        Students
        Parents
    Instrument
    Procedure
    Data Analysis
    Significance of the Study
IV. RESULTS
    Descriptive Statistics
        Ratings by the Classroom Teacher
        Ratings by the Gifted and Talented Program Teacher
        Ratings by the Parents
    Generalizability Theory
    Generalizability Analysis
        Combined Raters
            Combined Ratings For Total Sample
            Combined Ratings For Anglo Sample
            Combined Ratings For African-American Sample
            Combined Ratings For Hispanic Sample
        Teacher Ratings
            Teacher Ratings For Total Sample
            Teacher Ratings For Anglo Sample
            Teacher Ratings For African-American Sample
            Teacher Ratings For Hispanic Sample
        Parent Ratings
            Parent Ratings For Total Sample
            Parent Ratings For Anglo Sample
            Parent Ratings For African-American Sample
            Parent Ratings For Hispanic Sample
    Decision Studies Considerations
V. DISCUSSION
    Generalizability Finding for TABS
    D-Study Findings for TABS
    Teachers As Raters
    Parents As Raters
    Limitations of the Study
    Directions for Future Research Using the TABS Form
    Implications
REFERENCES
APPENDIX
A. THE LOOKING FOR TRAITS, ATTRIBUTES AND BEHAVIORS STUDENT REFERRAL FORM
B. ANOVA SUMMARY TABLES AND GENERALIZABILITY CALCULATIONS FOR ALL RATINGS
C. GENOVA PROGRAM AND SAMPLE DATA


ABSTRACT

Effective ways to identify children from economically

disadvantaged and limited English proficient backgrounds for

participation in programs for the gifted continue to gain much

attention. Numerous instruments have been developed to aid in the

identification process. The Looking for Traits, Attributes and

Behaviors Student Referral Form (TABS) is one instrument designed to

specifically aid in the identification of giftedness in the minority

child by providing information from educators and other individuals

closely associated with the child. However, minimal information

has been published about the validity and reliability of the TABS.

This study investigated the reliability of TABS utilizing

generalizability theory.

Three groups of raters (regular classroom teachers, gifted and

talented program teachers, and parents) completed the TABS for

127 third grade students. The group of parents independently rated

each student on two occasions three months apart.


Results indicated that minimal variance was noted between

the various sources of error. Several sample measurement protocols

were also investigated. Results suggested that multiple raters

provide a more comprehensive view of the student when attempting

to screen for participation in a gifted and talented program and the

TABS form is a valuable instrument for this process.


LIST OF TABLES

1. Ratings by Classroom Teacher, Gifted and Talented Program Teacher and Parent
2. Generalizability Calculations for Combined Ratings, Total Sample
3. Generalizability Calculations for Combined Ratings, Anglo Sample
4. Generalizability Calculations for Combined Ratings, African-American Sample
5. Generalizability Calculations for Combined Ratings, Hispanic Sample
6. Generalizability Calculations for Teacher Ratings, Total Sample
7. Generalizability Calculations for Teacher Ratings, Anglo Sample
8. Generalizability Calculations for Teacher Ratings, African-American Sample
9. Generalizability Calculations for Teacher Ratings, Hispanic Sample
10. Generalizability Calculations for Parent Ratings, Total Sample
11. Generalizability Calculations for Parent Ratings, Anglo Sample
12. Generalizability Calculations for Parent Ratings, African-American Sample
13. Generalizability Calculations for Parent Ratings, Hispanic Sample
14. D-Study: Phi Coefficients for Items Omitted from TABS
15. D-Study: Phi Coefficients of TABS with Parents As Raters and Occasions, Varied
16. D-Study: Phi Coefficients of TABS for Raters and Occasions, Varied
17. ANOVA Summary Table for Combined Ratings, Total Sample
18. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, Total Sample
19. ANOVA Summary Table for Combined Ratings, Anglo Sample
20. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, Anglo Sample
21. ANOVA Summary Table for Combined Ratings, Asian Sample
22. Generalizability Calculations for Combined Ratings, Asian Sample
23. ANOVA Summary Table for Combined Ratings, African-American Sample
24. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, African-American Sample
25. ANOVA Summary Table for Combined Ratings, Hispanic Sample
26. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, Hispanic Sample
27. ANOVA Summary Table for Teacher Ratings, Total Sample
28. Variance Components for Teacher Ratings, Total Sample
29. ANOVA Summary Table for Teacher Ratings, Anglo Sample
30. Variance Components for Teacher Ratings, Anglo Sample
31. ANOVA Summary Table for Teacher Ratings, Asian Sample
32. Generalizability Calculations for Teacher Ratings, Asian Sample
33. ANOVA Summary Table for Teacher Ratings, African-American Sample
34. Variance Components for Teacher Ratings, African-American Sample
35. ANOVA Summary Table for Teacher Ratings, Hispanic Sample
36. Variance Components for Teacher Ratings, Hispanic Sample
37. ANOVA Summary Table for Parent Ratings, Total Sample
38. Variance Components for Parent Ratings, Total Sample
39. ANOVA Summary Table for Parent Ratings, Anglo Sample
40. Variance Components for Parent Ratings, Anglo Sample
41. ANOVA Summary Table for Parent Ratings, Asian Sample
42. Generalizability Calculations for Parent Ratings, Asian Sample
43. ANOVA Summary Table for Parent Ratings, African-American Sample
44. Variance Components for Parent Ratings, African-American Sample
45. ANOVA Summary Table for Parent Ratings, Hispanic Sample
46. Variance Components for Parent Ratings, Hispanic Sample


CHAPTER I

INTRODUCTION

Many students from minority groups and/or economically

disadvantaged families are often denied eligibility for services in

programs for the gifted when only IQ scores are utilized to

determine eligibility. As a result, identified gifted students form a

group that is usually culturally or ethnically homogeneous (Keller,

1990). Zappia (1989) reported that 80% of the enrollment in gifted

programs is Anglo-American with African-Americans, Hispanics and

Asians constituting only 18% of the enrollment. This homogeneity is

not necessarily a problem if the identified students can truly be said

to represent all those who should be included. Many researchers,

however, have strongly asserted that giftedness is proportionally

represented in every ethnic and cultural group and at all

socioeconomic levels (Clark, 1992; Davis & Rimm, 1994; Frasier,

1987; Gallagher, 1994; Kitano & Kirby, 1986; Marland, 1972).

DeHaan and Havighurst (1961) cautioned against relying on test

data alone to identify and select students with gifts. They asserted

that the complex, multidimensional nature of mental abilities

suggests that above-average ability can be described more

adequately as a group of independent factors than as a general

ability expressed by IQ.

Frasier (1990b) cautioned that one reason for

underrepresentation of minority and low socioeconomic students

may be that reliance on conventional identification procedures has

reduced the multifaceted, complex phenomenon called giftedness to

a single-faceted phenomenon—high performance on intelligence

tests. Current views on intelligence and the assessment of

intellectual capacity suggest that intelligence tests provide a

narrow view of an individual's abilities; therefore, a more

comprehensive assessment of ability is needed (Barkan & Bernal,

1991; Bermudez & Rakow, 1990; Hunsaker & Callahan, 1993;

Pendarvis, Howley & Howley, 1990).

Another reason (Frasier, 1990b) for the underrepresentation of

these students in gifted programs is related to the inability of

educators to recognize children's "gifted behaviors." Many children's

gifts and talents go unrecognized, including those of some minority group

children, underachievers, children whose proficiency with the

English language is limited, and highly gifted children. Although not

all the problems of these groups are the same, they do represent the

groups that are most underrepresented in programs for gifted

children (Clark, 1992; Gallagher, 1994).

Giftedness exists in all human groups (Baldwin, 1980) and

minority group students may manifest gifted characteristics

differently from the majority student population (Frasier, 1987;

Renzulli, 1973; Torrance, 1969); therefore, suggestions for finding

ability in culturally diverse students lean heavily toward the

identification of noncognitive skills. Much of the investigation has

centered on creative abilities (Bernal, 1978). Frasier (1990b)

incorporated noncognitive gifted characteristics in the TABS and

presented the TABS as an instrument designed to pay special

attention to the different ways in which children from different

cultures manifest behavioral indicators of giftedness.

Current discussions concerned with the measurement of

intelligent behavior emphasize the use of multiple criteria (Hoge,

1989; Passow & Rudnitski, 1993). Maker and Schiever (1989) noted

two predominant conclusions from the various viewpoints presented:

(1) use multiple assessment procedures, including objective and

subjective data from a variety of sources; and (2) use a case study

approach, in which a variety of assessment data is interpreted in the

context of a student's individual characteristics.

According to Delcourt et al. (1994), solutions to the

underrepresentation of gifted students from minority groups have

taken many forms: (1) nominations from sources other than teachers;

(2) alternative checklists and rating scales; (3) conventional

identification models; (4) culturally sensitive standardized tests;

(5) matrix, culture-specific, quota system, and identification-

through-instruction models. Despite the worthiness of these and

other solutions, the representation of minority students in programs

for the gifted remains low.

In order to increase the minority representation in gifted

programs, many schools have implemented a multifaceted

identification process to identify students for gifted and talented

programs. Often, survey instruments are included in the

multifaceted identification process.

The classroom teacher is the individual in the school setting

who has the most contact with each student and is often asked to

rate behaviors on the survey instrument which are observed within

the school environment (Frasier, 1990a; Keller, 1990). Several

rating scales utilizing classroom teachers as raters are available to

assist in the identification of the gifted and talented student.

Statement of the Problem

The Looking for Traits, Attributes and Behaviors Student

Referral Form (TABS) is a rating scale currently utilized by

numerous school districts in Georgia and Texas (Mary Frasier,

personal communication, May 12, 1996). This survey instrument was

designed to specifically aid in the identification of minority and/or

economically disadvantaged gifted students. Many school districts,

however, are currently using this form throughout their entire

student population as the principal instrument to aid in the

identification of gifted students. The TABS is an instrument with

minimal published research about its inter-rater reliability or

validity. Additional information is needed to determine if this

instrument will provide stable and consistent scores across raters.

If it is found to yield reliable scores, teachers' responses can be

utilized with increased confidence to aid in the identification

process, thereby allowing for the most appropriate placement of the

student.

Purpose of the Study

Additional information is needed to determine if the TABS

will provide stable and consistent scores across raters and

occasions. This study is an attempt to determine the reliability of

TABS scores using more than one rater to identify gifted and

talented elementary students. It will also examine the consistency

of raters' scores over time.

Research Questions

In general, this investigation is intended to answer the

following questions.

1. What is the reliability of the TABS scores using

generalizability theory to consider the facets of students, raters, and

occasions? (G-Study)

2. What is the effect on reliability and measurement precision

of using more or fewer than two raters? (D-Study)

3. What is the effect on reliability and measurement precision

of using more or fewer than two rating occasions? (D-Study)

4. What is the effect on reliability and measurement precision

of using more or fewer than 10 items? (D-Study)

Definitions of Terms

Absolute Decision. This measurement is utilized to index the

absolute level of an individual's performance without regard to how

well or how poorly the individual's peers performed. Attention is

focused on the absolute value of an individual's performance, not

relative standing (Shavelson & Webb, 1991).

Classical True Score Theory. The classical true score theory

examines true scores and error. Classical true score theory

evaluates one source of error in a measurement at a time.

Therefore, numerous reliability coefficients have been developed to

measure such factors as internal consistency reliability and test-

retest reliability. Determining reliability when utilizing the

classical true score theory requires multiple reliability coefficients

for each instrument. The result is that the coefficients are often

different and contradictory (Eason, 1991).

Decision Study or D-Study. The D-study extends the results of

the G-study by placing an emphasis on estimation, use and

interpretation of the generated variance components for decision-


making with well-specified measurement procedures (Brennan,

1983). The D-study includes only facets of interest and varies those

values to determine the optimum number of items, forms, occasions,

or raters to include in the research study to achieve dependable

measurement.
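As a minimal sketch of the kind of calculation a D-study involves, the following Python function computes the generalizability coefficient (relative decisions) and the phi coefficient (absolute decisions) for a fully crossed, random persons x raters x occasions design, using the standard error-variance formulas from the generalizability theory literature. The function name, the dictionary keys, and the variance component values are hypothetical and are not taken from the TABS analysis.

```python
def d_study(vc, n_r, n_o):
    """Return (g_coefficient, phi) for a crossed persons x raters x occasions design.

    vc  : dict of estimated variance components from a G-study
    n_r : number of raters assumed in the D-study
    n_o : number of occasions assumed in the D-study
    """
    # Relative error: only interactions that involve persons.
    rel_err = vc["pr"] / n_r + vc["po"] / n_o + vc["pro_e"] / (n_r * n_o)
    # Absolute error: relative error plus rater, occasion, and rater-by-occasion effects.
    abs_err = rel_err + vc["r"] / n_r + vc["o"] / n_o + vc["ro"] / (n_r * n_o)
    g = vc["p"] / (vc["p"] + rel_err)
    phi = vc["p"] / (vc["p"] + abs_err)
    return g, phi


# Hypothetical variance components (illustration only, not TABS results).
components = {"p": 0.50, "r": 0.05, "o": 0.01,
              "pr": 0.20, "po": 0.04, "ro": 0.00, "pro_e": 0.20}

for n_r in (1, 2, 3):
    for n_o in (1, 2):
        g, phi = d_study(components, n_r, n_o)
        print(f"raters={n_r} occasions={n_o}  G={g:.3f}  Phi={phi:.3f}")
```

Varying `n_r` and `n_o` in this way shows how adding raters or occasions shrinks the error terms, which is the sense in which a D-study searches for a dependable measurement design.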

Facet. Facets are the characteristics of the testing situation

that contain error variance (Thompson, 1989). Characteristics

include the specifics of the measuring situation (test forms,

occasions, raters, etc.).

Generalizability Theory. Generalizability theory, an

alternative to classical true score theory, provides alternative

ways of estimating the respective amounts of variance contributed

by all possible sources of variance that are operative in a given

testing situation. According to generalizability theory, given the

exact same conditions in the universe, the exact same test score

should be obtained (Cronbach, 1972). Generalizability theory

reminds us that a test's reliability is not something that statically

resides within the test. The reliability of a test is a function of the

circumstances under which the test is developed, administered, and

interpreted. The generalizability theory allows the researcher to

consider all sources of error simultaneously. Through the use of

this theory, the researcher is also able to determine inter-rater

reliability as one source of error involving persons, occasions

and/or raters (Marsden, 1993; Thompson, 1989).
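For illustration, assuming a fully crossed, random persons ($p$) by raters ($r$) by occasions ($o$) design (an assumption made here for the example), the simultaneous treatment of error sources can be written as a decomposition of the observed score and of its variance. The symbols are the conventional ones in the generalizability theory literature and are not drawn from this study's tables.

```latex
\begin{align}
X_{pro} &= \mu + \nu_p + \nu_r + \nu_o + \nu_{pr} + \nu_{po} + \nu_{ro} + \nu_{pro,e} \\
\sigma^2(X_{pro}) &= \sigma^2_p + \sigma^2_r + \sigma^2_o
  + \sigma^2_{pr} + \sigma^2_{po} + \sigma^2_{ro} + \sigma^2_{pro,e}
\end{align}
```

Here $\sigma^2_p$ is the variance attributable to differences among students, and every other term is a potential source of measurement error that classical true score theory would have to examine one at a time.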

G-Study. A G-study includes all the facets of

interest in order to obtain estimates of error variance components

associated with a universe of admissible observations (Brennan,

1983).
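The estimation step of a G-study can be sketched for the simplest case, a crossed persons x raters design with one score per cell. The code below uses the usual expected-mean-square equations from a two-way ANOVA without replication; it is an illustrative sketch, not the GENOVA procedure, and the function name and the ratings are hypothetical. Setting negative estimates to zero is one common convention and is an assumption here.

```python
def g_study_p_x_r(scores):
    """Estimate variance components for a crossed persons x raters design.

    scores: list of rows, one row per person, one column per rater.
    """
    n_p, n_r = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_p * n_r)
    p_means = [sum(row) / n_r for row in scores]
    r_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    # Mean squares for persons, raters, and the residual (interaction + error).
    ms_p = n_r * sum((m - grand) ** 2 for m in p_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in r_means) / (n_r - 1)
    ss_res = sum(
        (scores[i][j] - p_means[i] - r_means[j] + grand) ** 2
        for i in range(n_p) for j in range(n_r)
    )
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Solve the expected-mean-square equations; clamp negative estimates at zero.
    return {
        "p": max((ms_p - ms_res) / n_r, 0.0),
        "r": max((ms_r - ms_res) / n_p, 0.0),
        "pr_e": ms_res,
    }


# Hypothetical ratings: 5 students (rows) scored by 3 raters (columns).
ratings = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3], [1, 2, 2]]
print(g_study_p_x_r(ratings))
```

The resulting components are the inputs a D-study would then use to project reliability under different numbers of raters.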

Inter-Rater Reliability. Inter-rater reliability is the form of

reliability that seeks to establish agreement between individuals

who are scoring data pieces. When a measure does not have scaled

response options such as true-false or multiple choice, it is

essential to establish inter-rater reliability. This is done to ensure

that the rating by different individuals remains the same across

cases. For example, if inter-rater reliability has been established in

the scoring of writing samples, one would expect that the scores of


a given piece would be the same in most if not all cases from

different raters. As a result of developing inter-rater reliability,

subjectivity is limited (Johnson, 1985).

Measurement Error. Measurement error is a component of

classical test theory which states that only one source of error can

be estimated at a time.

Relative Decisions. Relative decisions are based only on those

sources of error affecting the relative standing of individuals and

are used to rank order individuals or groups.

Reliability. Reliability refers to the consistency of

measurements (Pagano, 1990) and is generally considered to be

synonymous with dependability or consistency. Reliability refers to

the attribute of consistency in measurement. Test results need to

be dependable; therefore, they should be reproducible, stable

(reliable), and meaningful (valid). Reliability is expressed by a

reliability coefficient or by the standard error of measurement,

which is derived from the reliability coefficient. This consistency

of measurement is also referred to as dependability (Shavelson &


Webb, 1991). Classical true score theory and generalizability

theory are two ways to evaluate reliability.

Validity. Validity refers to a judgment concerning how well a

test measures what it purports to measure (Pagano, 1990).

Assumptions and Limitations

Generalizability theory makes the following assumptions when

considering a set of data.

1. The population is assumed to consist of persons.

2. The universe of admissible observations and the universe of

generalizations involve conditions from the same single facet.

3. The described facets, especially persons, are considered to

be essentially infinite ($N \to \infty$).

4. Generalizability theory makes no assumptions about the

distributional form of the data. This study assumes that the data

set is a representative sample of the universe.

5. The model assumes that the number of items is constant in

each content category or subscale of an instrument.


6. The G and D-study designs are basically the same. This does

not mean that the data are available from two different studies with

similar designs. It refers to the concept that the "estimated random

effects variance components are available from a G-study and ... they

can be used to make inferences to an infinite universe of

generalizations, based on applications of basically the same design"

(Brennan, 1983, p. 55).

GENOVA is a statistical program designed to analyze large sets

of data and consider all identified sources of error along with their

specific interactions. GENOVA, however, will not compute

coefficients for unbalanced data sets and/or unequal groupings. Data

sets containing unequal groupings must be balanced by randomly

selecting and discarding data from the set in order to balance the

set.
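The balancing step described above can be sketched as follows. This is only an illustration of random discarding down to the size of the smallest group; the function name, the record layout, and the group labels are hypothetical and do not reproduce the procedure or data used with GENOVA in this study.

```python
import random


def balance_groups(records, group_key, seed=0):
    """Randomly discard records so every group keeps as many as the smallest group."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    groups = {}
    for rec in records:
        groups.setdefault(rec[group_key], []).append(rec)
    target = min(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        # Keep a random subset of size `target`; the rest are discarded.
        balanced.extend(rng.sample(members, target))
    return balanced


# Hypothetical records: unequal groups of sizes 5, 3, and 2.
students = [{"id": i, "group": g} for i, g in enumerate("AAAAABBBCC")]
balanced = balance_groups(students, "group", seed=1)
print(len(balanced), sorted(rec["group"] for rec in balanced))
```

The cost of this approach is lost data: every group is reduced to the size of the smallest one, which is the limitation the paragraph above notes.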


CHAPTER II

REVIEW OF THE LITERATURE

This review focuses on the issues related to the identification

process of economically disadvantaged and limited English proficient

children for gifted and talented educational programs. Literature

pertaining to assessment related issues which include the reliability

of rating scales in the identification of economically disadvantaged

and limited English proficient gifted children, teachers and parents

as raters, and the use of the TABS is also discussed. Finally, a

rationale for generalizability theory as the preferred method of data

analysis rather than classical true score theory is presented.

Issues Related to the Identification of Gifted Minority Children: An Overview

The identification of children from economically

disadvantaged and limited English proficient backgrounds for

participation in programs for gifted and talented students


continues to pose problems for educators. While research (Baldwin,

1991; Clark, 1992; Davis & Rimm, 1994; Gallagher, 1994b; Kitano &

Kirby, 1986; Marland, 1972) has illustrated the potential for

giftedness in every segment of society, the underrepresentation of

economically disadvantaged and limited English proficient students

in gifted and talented programs exists to this day. Literature

suggested giftedness is a complex, multifaceted phenomenon, yet

more traditional and current practices across the nation define and

look for giftedness through the dominant use of intelligence and

achievement scores (Ford & Harris, 1990; Frasier, 1990a; Hunsaker

& Callahan, 1993; Treffinger & Renzulli, 1986).

Because many definitions and theories of giftedness are

grounded in psychometrics, educators rely heavily on intelligence

and achievement tests alone to decide who is gifted. Since many

minority students often score poorly on traditional intelligence and

achievement tests, many of these students are unlikely to be

identified as gifted (Ford & Harris, 1996).


When intelligence and achievement scores are the only

instruments utilized in the identification process for gifted and

talented individuals, giftedness is considered to be a static and

closed phenomenon and students must "fit" this definition. For

example, in some states, an individual scoring at the 99th percentile

on a standardized measure of mental ability is considered gifted.

Consequently, significant numbers of individuals from culturally

diverse, economically disadvantaged, bilingual, and rural

backgrounds are not placed in gifted programs, not because of lack of

cognitive, motivational, artistic, or creative potential, but because

traditional criteria do not assess the skills, knowledge, or

aptitudes they do possess. The use of intelligence and achievement

scores as the only identification criteria has long been an issue in

gifted education (Frasier, 1991; Frasier & Passow, 1994; Office of

Educational Research and Improvement, 1993).

Research by Gardner (1983) and Sternberg (1985) indicated

that intelligence (e.g., creativity, interpersonal intelligence) cannot

be adequately measured by traditional intelligence and achievement


tests. Gardner (1983) proposed a theory of multiple intelligences

that includes seven relatively independent intelligences:

linguistic, musical, logical-mathematical, spatial, bodily-

kinesthetic, interpersonal, and intrapersonal. He also suggested that

gifted students should be assessed within a framework that

considers the gifted students' cultural and ethnic background and the

quality and quantity of their learning opportunities.

Gardner (1983) proposed utilizing one of the intelligences that

is well developed as an alternative learning mode for other

intelligences not as developed. This use of the multiple

intelligences supporting one another creates a learning environment

through which gifted students can display their talents.

Sternberg (1985) theorized a triarchic concept of intelligence:

the internal world of the student, the external world of the student,

and the interaction between these two worlds on the student's

experience. The internal world is exemplified by analytical thinking;

the external world is exemplified by contextual thinking (strategies)

and the interaction between these two worlds on the student's


experience is exemplified by experiences in insightful ways. The

triarchic theory includes three kinds of mental processes: (a)

metaprocesses, used to plan, monitor, and evaluate one's problem

solving; (b) performance processes, used to carry out the

instructions of the metaprocesses; and (c) knowledge-acquisition

processes, used to figure out how to solve problems.

Issues to Consider When Assessing Students From Diverse Backgrounds

Behavioral Differences

Children from diverse backgrounds exhibit various cultural

differences. Behavior, cognitive style, and learning style should be

considered when evaluating children for giftedness because these

individual differences often work against the child from a diverse

background. For example, cultural deprivation may affect the

development of talent.

Frierson (1965) designed a study to determine any significant

differences between upper and lower status students to determine

the effects of cultural deprivation on talent development. The


students were divided into four groups: (a) upper status gifted

students, (b) lower socioeconomic status gifted students, (c) upper

status average students, and (d) lower socioeconomic status average

students. Frierson concluded that differences between the two

groups of gifted students were clearly associated with differences

in their socioeconomic status.

Delgado-Gaitan and Trueba (1985) observed teachers working

with students in various classroom activities. The teachers were

asked to rank the students according to "worthwhile" activities.

Delgado-Gaitan and Trueba concluded that the teachers participating

in the study did not give students a high rating if the teachers did

not value the particular learning style in which the student was

working.

Learning styles that do not exemplify those represented by the

majority population in classrooms in the United States add to the

perception that children from minority groups are not candidates for

gifted programs. Evidence of characteristics associated with

giftedness may be different in minority children, yet educators are


seldom trained in identifying those behaviors in ways other than the

way the characteristics are observed in the majority culture

(Ramirez, Herold, & Castaneda, 1974).

Language Differences

One of the greatest issues in the assessment of children from

diverse backgrounds for gifted programs is language. Taylor (1990)

suggested that language is a great determiner of the perception of

ability about an individual. Therefore, a lack of knowledge,

sensitivity, or appreciation of diverse communication styles can

result in inappropriate assessment. For children whose first

language is not English, observed scores are often the result of lack

of experience with English rather than lack of comprehension of

ideas and concepts (de Bernard, 1985). An understanding of this may

be useful when assessment processes include writing samples,

standardized intelligence test scores which are verbally loaded,

and/or achievement subtests with strong language dependent

components (Damico, 1985).


Differences in Cognitive Styles

An additional cultural attribute is cognitive style. As a result

of observations made by Tonemah and Brittan (1985), strong tribal

perspectives were associated with the concept of giftedness in the

description of gifted attributes of Native American students.

Characteristics for gifted Native American students were described

as: (a) acquired skills in language, learning, and technological skills;

(b) tribal/cultural understanding referring to their exceptional

knowledge of ceremonies, tribal traditions, and other tribes; (c)

personal/human qualities such as high intelligence,

visionary/inquisitive/intuitive, respectful of elders, and creative

skills; and (d) aesthetic abilities, referring to unusual talents in the

visual and performing arts, and arts based in the Indian culture.

Shade (1991) presented a different view of the cognitive

competencies of African-American students. She concluded that

African-American students appear to have high motoric capabilities

and use visual perception as a way of protecting and orienting

themselves in the environment rather than for gathering


information. African-American students are largely trained to

concentrate more on people and have a preference for affective

materials and a high level of social interaction in their learning

environments.

Ramirez and Castaneda (1974) examined cognitive style and

found that teaching styles used in the classroom may not agree with

the cognitive styles of the students. Ramirez and Castaneda (1974)

observed teachers with different teaching styles. All teachers were

asked to teach the same concept in their respective classrooms

with their own individual teaching styles. The results of their study

suggested that not all individuals learn through the same cognitive

method.

Beyond having implications for classroom practices, the

research by Ramirez and Castaneda (1974) provided implications for

assessment. Observed scores may be skewed if an assessment

instrument requires the use of a particular cognitive style and the

cognitive style of the child is different. Determining the cognitive


style of children may provide a context from which to interpret

standardized test scores.

Survey Instruments

In an effort to address the identification of minority children,

survey instruments were developed to assess the skills, knowledge

and/or aptitudes not assessed in traditional instruments. Some of

the screening tools that have been used successfully by school

districts are the Bella Kranz Multidimensional Screening Device

(Kranz, 1978); the Baldwin Identification Matrix (Baldwin, 1980);

Scales for Rating the Behavioral Characteristics of Superior

Students (Renzulli & Hartman, 1971); GIFTS Talent Identification

Procedures (Perrone & Male, 1981); and the TABS (Frasier, 1990b).

Renzulli and Hartman (1971) developed the Scales for Rating

the Behavioral Characteristics of Superior Students. Raters are

asked to rate the child according to behavioral characteristics. Each

behavioral characteristic contains 8-10 components. A Likert-type

scale of 1-5 (1 = seldom or never, 5 = almost always) focusing on


specific student behaviors is utilized. Scoring sheets and

interpretation materials are included with the instrument.

GIFTS Talent Identification Procedures (Perrone & Male, 1981)

is another screening tool consisting of one or more behavior rating

sheets plus scoring and interpretation materials. Raters are asked

to rate the child only in the areas of talent that they have had the

opportunity to observe. Some of the areas that could be assessed are

mathematics, English, music, science, reading, interpersonal

relations, and art.

Instruments utilized as identification instruments must be

carefully selected. The validity and reliability, the target

population and the limitations of the instrument should be of prime

importance to the educator (Hanson & Linden, 1990). When

checklists and nomination forms are utilized, they should be

sensitive to all reading levels and take into consideration the native

language of the parents. Specific examples and descriptors of how

the characteristics under consideration are exhibited by minority

students need to be understood. It is recommended that teachers and

parents complete the same checklists in order to explore

consistencies or discrepancies in their responses (Ford & Harris,

1996).

Shaklee et al. (1994) suggested that the best way to identify

young gifted and talented minority or economically disadvantaged

gifted students is to base observation and assessment procedures on

universal identifiers of intellectual potential. The display of

potential giftedness does not just occur in school; potential

giftedness is a 24-hour phenomenon. Persons inside and outside the

educational environment should be involved in any process to

identify children with extraordinary gifts and talents.

Many alternative forms of assessment (surveys, portfolios,

oral examinations, open-ended questions, essays) rely heavily on

multiple raters. Multiple raters can improve reliability just as

multiple test items can improve the reliability of standardized

tests. Choosing and training reliable raters can further improve the

reliability and accuracy of instruments that depend on the use of

raters (Chambliss & Melmed, 1990; Foster-Gaitskell & Pratt, 1989;

Hicks, 1988; Houston, Raymond & Svec, 1991; Wright & Piersel,

1992).
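
The gain from adding raters can be illustrated with the Spearman-Brown prophecy formula from classical test theory. The sketch below is not drawn from the studies cited above; it assumes parallel raters, and the function name and the example reliability of .50 are hypothetical.

```python
def spearman_brown(single_rater_reliability: float, n_raters: int) -> float:
    """Projected reliability of the average rating from n parallel raters."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    r = single_rater_reliability
    return (n_raters * r) / (1 + (n_raters - 1) * r)


# Hypothetical example: a single rater with reliability .50,
# then the projected reliability with 2 and 4 raters.
for k in (1, 2, 4):
    print(k, round(spearman_brown(0.50, k), 2))
```

Under these assumptions, the projected reliability rises from .50 with one rater to about .67 with two raters and .80 with four.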

Hicks (1988) evaluated the Parent/Professional Preschool

Performance Profile observational scale through which preschool

teachers and parents evaluate the behavior of disabled or

nondisabled children in natural settings while the children interact

with familiar adults over prolonged periods of time. Developmental

skills and interfering behaviors were the two main categories

observed and rated. After receiving rater training, the participating

teachers and parents completed the scales.

Foster-Gaitskell and Pratt (1989) compared the adaptive

behavior ratings of children with mental retardation by either

parents or teachers using the Adaptive Behavior Scale-School

Edition. All raters were given prior training concerning the

behaviors to be rated. Their findings indicated no significant

differences in parents' and teachers' ratings in the categories

assessed or in the importance ascribed to the behaviors evaluated.

Foster-Gaitskell and Pratt attributed the reliability of the raters to

the prior training.

Teachers as Raters

Teachers play an important role in the identification of

students for educational programs (Epkins, 1993; Milich & Landau,

1988; Pelham, Gnagy, Greenslade, & Milich, 1992). Because they are

able to observe behaviors exhibited in a task-oriented, academic

situation, teachers are often considered appropriate raters of

children's behaviors (Newcorn et al., 1994). Teachers' ability to

make accurate observations is critical in creating a group of

students to be considered for gifted program participation.

However, there has been continuing skepticism about the ability of

teachers to recommend students for an educational program,

especially when they have had no training (Borland, 1978; Clark,

1992; Davis & Rimm, 1994; Gallagher, 1994; Pegnato & Birch, 1959;

Stanley, 1976; Stone & Rosenbaum, 1988). Davis and Rimm (1994)

reminded us that although teacher nominations are widely utilized,

they are among the least reliable and least valid measures used to

identify gifted students.

Teachers' expectations of gifted students are often influenced

by their values and beliefs, thereby significantly influencing their

decisions, including referrals for gifted programs. Utilizing

teachers as primary identifiers of gifted learners carries numerous

implications for the recruitment and retention of minority students,

particularly because many teachers are not substantively prepared in

gifted and multi-cultural education. This lack of preparation and

experience decreases the probability that gifted minority students

will be identified and placed in a gifted program (Ford & Harris,

1996; Hansen & Feldhusen, 1994).

A study conducted by Epkins (1993) compared teacher rating

scales with self-report rating scales of depression, anxiety and

aggression in a sample of elementary school children. While this

study found a significantly high level of agreement between teacher

and self-report ratings for the identification of externalizing

symptoms for both groups, a significant level of agreement for

internalizing symptoms was found for one sample only. Epkins (1993) concluded that a

possible reason for the rater inconsistency might be that many

teachers considered the completion of the rating scale to be too

time consuming. A meta-analysis conducted by Achenbach,

McConaughy, and Howell (1987) concurred with Epkins' findings; low

correspondence between teacher reports and child self-reports

existed in most studies included in the meta-analysis.

Russikoff's study (1994) illustrated that perceptions of raters

can influence their decisions on rating scales. Examinations written

by limited-English-speakers were examined, particularly in the

context in which English writing skills were holistically assessed.

The study revealed a lack of interrater reliability, raters'

perceptions of their role, a reductive approach to scoring, imprecise

criteria for scoring, confusion between inaccurate and non-standard

structures, and clear prejudice based on the fact that the examinee

was a student of English as a Second Language (ESL). Most raters

felt non-native speakers of English should meet the same criteria

for English writing skills as native speakers, and declared that they

graded ESL writers as they would native speakers.

To increase the ability of teachers to accurately identify

giftedness in students, thereby increasing their performance in the

role of a rater, the teachers must be provided with the information

that guides their participation. Frasier (1990c) recommended that

staff development be provided to raters, because many teachers hold

stereotypes about gifted students as only well-behaved and

academically successful students. Often these teachers are unlikely

to refer gifted underachieving students and those students who are

currently misbehaving. Training in gifted education can increase

teachers' understanding, awareness, and competence in recognizing

gifted behaviors (Hansen & Feldhusen, 1994).

Weigle (1994) presented a study on rater training that

involved the analysis of ratings given to English-as-a-Second-

Language compositions by eight inexperienced and eight experienced

raters both before and after rater training. Each essay was read by

two raters, an inexperienced rater and an experienced rater, during

the first (pre-training) section of the study. After the first section

of the study, all raters attended mandatory composition rater

training. Findings indicated that before the training was presented,

all raters as a group differed quite significantly from one another in

terms of severity but after the training, a clear distinction between

raters was no longer visible. The rater differences evened out

somewhat after training across the group. Rater consistency

improved, and rater extremism was reduced. Results of this study

confirmed that rater training cannot make raters into duplicates of

one another, but it can make them more consistent.

Parents as Raters

Parents are also valued as raters (Cornell, 1994; Gilbert,

1994); however, a number of researchers (Barber & Cernik, 1976;

Christensen, Phillips, Glascow, & Johnson, 1983; Eisenstadt, 1994;

Forehand, Wells, McMahon, Griest, & Rogers, 1982; Rickard, Forehand,

Wells, Griest, & McMahon, 1981; Wall & Paradise, 1981; Webster-

Stratton, 1988) have cautioned against overreliance on parents'

perceptions of their children's behaviors and have suggested that

mothers and/or fathers may inaccurately label their children due to

their own personal adjustment problems, including depression,

anxiety, and marital dissatisfaction.

Adults appeared to have different perceptions of a child when

rating the same child on the Barber Scales of Self-Regard (Barber,

1976). Parents of the child disagreed on portions of the scale. The

results of this study indicated that parents had different

perceptions of their child. Barber concluded that, perhaps, both

parents were correct and the child was, in reality, somewhere in

between the levels described by the scale points.

Wall and Paradise (1981) compared mother and teacher reports

on two scales from the Adaptive Behavior Inventory for Children of

the System of Multicultural and Pluralistic Assessment. Results

indicated little agreement between mother and teacher reports.

Mothers tended to provide higher ratings of adaptive behaviors than

did teachers, irrespective of grade level.

Parental perception about the behaviors and characteristics of

the child within the family may differ. Eisenstadt (1994) conducted

a study to investigate interparental agreement of the Eyberg Child

Behavior Inventory. In this study, mothers rated their children's

disruptive behavior as more frequent and more problematic than did

fathers. Eisenstadt suggested that parents receive training before

rating their children.

When parents are utilized as raters, they, too, should receive

training in order to recognize the different traits of giftedness,

learn definitions of giftedness, and gain a thorough understanding

of the identification process (Au & Punfrey, 1993; Kaplan, 1993).

After in-service training, both parents and teachers will be

better prepared to accept the concepts associated with expanded

views of giftedness, understand more accurately those behaviors

indicating gifted potential, and determine a variety of objective

and subjective data sources to be used in identification (Chambliss &

Melmed, 1990; Hansen & Feldhusen, 1994; Williams & Hartlage,

1988).

Reliability and Validity of Survey Instruments

There are numerous threats to the reliability of scores based

on ratings (Sigafoos & Pennell, 1995). Individuals being rated may

not be performing in their usual manner. The situation or task may

not elicit typical behavior or the raters may be unintentionally

distorting the results. Some of the rater effects are:

1. The halo effect. The impressions that an evaluator forms

about an individual on one dimension can influence his or her

impressions of that person on other dimensions. Nisbett and Wilson

(1977), for example, made two videotapes of the same professor. In

the first video, the professor acted in a friendly manner. In the

second video, the professor behaved arrogantly. Students watching

the friendly tape rated the professor more favorably.

2. Stereotyping. The impressions that an evaluator forms

about an entire group can alter his or her impressions about a group

member. In other words, a principal might find a mathematics

teacher to be precise because all mathematics teachers are supposed

to be precise (Hambleton & Powell, 1983).

3. Perception differences. The viewpoints and past

experiences of an evaluator can affect how he or she interprets

behavior. In a classic study, Dearborn and Simon (1958) asked

business executives to identify the major problem described in a

detailed case study. The executives tended to view the problem in

terms of their own departmental functions.

4. Leniency/stringency error. When a rater does not have

enough knowledge to make an objective rating, he or she may

compensate by giving scores that are systematically higher or lower

(Chen, 1993).

5. Scale shrinking. Some raters will not use the end of any

scale (Chen, 1993).

Inter-rater reliability is essential if more than one individual

is to be involved in the scoring of data pieces. Without inter-rater

reliability, data from non-scaled measures is unusable (Marsden,

1993). Frasier (1990b) reminded us that to establish inter-rater

reliability, the following guidelines can be followed:

1. Have raters independently score multiple randomly selected

data pieces.

2. Chart scores on each data piece.

3. Identify the response items on which all raters agree.

4. Form a consensus on the interpretation for scoring.

5. Score another set of randomly selected data pieces.

6. Repeat steps 2 to 5 until each individual is in

agreement on at least 90 percent of the items 90 percent of the

time.

One should expect that the initial data set will have divergent

ratings. As subsequent sets are rated, agreement will increase. If

long periods of time elapse between scoring sessions, it might be

necessary to re-establish inter-rater reliability.
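
The agreement check in the guidelines above can be sketched in a few lines. This is only an illustration under stated assumptions: agreement is taken to mean that all raters gave an identical score on an item, the function names and example scores are hypothetical, and only the per-data-piece part of the 90 percent criterion is covered.

```python
def item_agreement_rate(ratings_by_rater):
    """Fraction of items on which every rater gave the same score.

    ratings_by_rater: one list of item scores per rater, all the same length.
    """
    if not ratings_by_rater:
        raise ValueError("need at least one rater")
    n_items = len(ratings_by_rater[0])
    if n_items == 0 or any(len(r) != n_items for r in ratings_by_rater):
        raise ValueError("every rater must score the same non-empty set of items")
    # zip(*...) groups the scores item by item; a set of size 1 means full agreement.
    agreed = sum(
        1 for item_scores in zip(*ratings_by_rater) if len(set(item_scores)) == 1
    )
    return agreed / n_items


def meets_criterion(ratings_by_rater, threshold=0.90):
    """True when item agreement reaches the 90 percent level named in step 6."""
    return item_agreement_rate(ratings_by_rater) >= threshold


# Hypothetical scores from three raters on one 10-item data piece.
example = [
    [5, 4, 4, 3, 5, 5, 4, 2, 3, 4],
    [5, 4, 4, 3, 5, 5, 4, 2, 3, 4],
    [5, 4, 4, 3, 5, 5, 4, 2, 3, 5],
]
print(item_agreement_rate(example), meets_criterion(example))
```

In this hypothetical data piece the raters disagree only on the last item, so agreement is 9 of 10 items and the criterion is just met. Tracking the rate across successive data pieces would show whether it holds 90 percent of the time.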

Training is needed to establish inter-rater reliability. With

training, inter-rater reliability is established and raters provide

reliable scores (Chambliss & Melmed, 1990; Foster-Gaitskell &

Pratt, 1989; Hanson & Linden, 1990; Houston, Raymond & Svec, 1991;

Wright & Piersel, 1992). Generalizability theory provides the

process to observe inter-rater reliability.

Generalizability theory includes and extends classical test

score theory and is able to estimate the magnitude of the multiple

sources of error simultaneously. Unlike classical test score

analyses, generalizability theory will analyze sources of error

variance and interactions among these sources simultaneously.

Classical test score theory can consider only a single source of error at a

time. Classical test score theory also cannot consider the completely

independent or separate interaction effects of the sources of

measurement error variance.

Generalizability theory consists of two stages. The first

stage, the G-study, generates results that are generalizable to the

population of interest. The second stage, the D-study, is conducted

to determine the most effective protocol to collect data with a

desired degree of reliability (Thompson, 1989). Generalizability

theory provides a powerful method for examining test reliability,

thus providing accurate generalizations.

CHAPTER III

METHODOLOGY

Participants

All data for this study came from information supplied on the

TABS, which was previously requested by the school district. This

study utilized information supplied by teachers and parents.

Teachers

The raters consisted of regular third grade classroom teachers

and third grade gifted and talented program teachers in one school

district. A regular third grade classroom teacher and a third grade

gifted/talented teacher completed the TABS for each student

nominated for the gifted and talented program in the school district.

Since the possibility existed that more than one student in any

particular third grade classroom could be nominated for the gifted

and talented program, each completed student form was not

necessarily from a different classroom teacher or gifted and

talented program teacher. There is a possibility that the same

teacher (either regular classroom teacher and/or gifted and talented

program teacher) completed forms for numerous students. The

researcher did not have access to the students who were rated, the

regular classroom teachers, gifted program teachers or parents who

were the raters.

The regular classroom teachers (N=89) completing the TABS

had between 1 and 20 years of classroom teaching experience (M =

7.9 years, SD = 5.9). Each teacher reported currently having

students identified as gifted in the classroom. The teachers

indicated that they had received training to recognize gifted

characteristics in students through district in-service training and

university coursework.

The gifted and talented program teachers (N=19) had between 4

and 20 years of teaching experience (M = 7.8 years, SD = 5.0)

involving gifted students. Each gifted and talented program teacher

also received training to recognize gifted characteristics in

students through district in-service training and university

coursework.

Every school in the district participated in the nominating

process and each school was considered by the district to be

ethnically and socioeconomically diverse.

Students

One hundred forty-nine third grade students nominated for the

district's gifted and talented program were randomly selected from

the nominated third grade students. Twenty-two forms were

excluded due to insufficient information; therefore, 127 (73 Anglo,

6 Asian, 16 African-American, 32 Hispanic) third grade students (65

males, 59 females) nominated for the district's gifted and talented

program participated in the study.

The required criterion for a student to be nominated for the

gifted/talented program was that the student be in the third grade.

The student could "self-nominate," or any individual (within the

school or outside the school environment) could nominate the

student.

Parents

Information was supplied on the TABS by 127 parents or

guardians (associated with each participating student). Eight hours

of training was provided to parents by a gifted and talented program

teacher. The parents were trained to identify gifted

characteristics in students. The parents received the same training

as the teachers. Each parent completed the TABS first in February

and again in April.

Instrument

The TABS (Frasier, 1990b) was designed to document

behavioral observations in students to identify giftedness in

culturally different groups (Appendix A). Frasier (1990b) stated

that the TABS meets the requirements of best practices through its

focus on diversity in the gifted population and involvement of people

inside and outside the school.

The TABS was generated during years one and two of The

National Research Center on the Gifted and Talented (NRC/GT)

research project at The University of Georgia. This instrument was

developed with the view that giftedness is a construct. The

psychological concept of the TABS is that giftedness is not itself

directly measurable but must be inferred (Bernal, 1978;

Frasier, 1990b; Gardner, 1983; Renzulli, 1973; Torrance, 1969).

With giftedness defined as a construct, its inference is carried

out through the observation and measurements of traits, aptitudes

and behaviors believed to demonstrate giftedness (Frasier, 1990b).

The TABS was eventually added to the Staff Development Model (SDM

Model), also utilized by The University of Georgia as a comprehensive

training model designed to provide educators with background

information on giftedness as a psychological construct.

The TABS is a 10-item instrument using a Likert-type scale of

1-5 (5=strong, 1=weak) focusing on specific student behaviors. The

rater is asked to rate the student being referred for assessment on

each of the following items believed to indicate giftedness (Bernal,

1978; Frasier, 1990b; Gardner, 1983; Renzulli, 1973; Torrance,

1969):

a. communication: unusual ability to communicate (verbally,

nonverbally, physically, artistically, symbolically); uses

particularly apt examples, illustrations, or elaborations;

b. motivation: persistent in pursuing/completing self-selected

tasks (may be culturally influenced) evident in

school or non-school type activities; enthusiastic learner;

has aspirations to be somebody, do something;

c. interests: unusual or advanced interests in a topic or

activity; self-starter; pursues an activity unceasingly;

beyond the group;

d. problem solving ability: unusual ability to devise or adapt

a systematic strategy for solving problems and to change

the strategy if it is not working;

e. memory: already knows; 1-2 repetitions for mastery; has a

wealth of information about school or non-school topics;

pays attention to details; manipulates information;

f. humor: keen sense of humor that may be gentle or hostile;

large accumulation about emotions; heightened capacity for

seeing unusual relationships; unusual emotional depth;

openness to experiences; heightened sensory awareness;

g. inquiry: asks unusual questions for age; plays around with

ideas; extensive exploratory behaviors directed toward

eliciting information about materials, devices or

situations;

h. insight: has exceptional ability to draw inferences;

appears to be a good guesser; is keenly observant;

integrates ideas and disciplines;

i. reasoning: ability to make generalizations; ability to use

metaphors and analogies; can think things through in a

logical manner; critical thinker; ability to think things

through and come up with a plausible answer; and,

j. imagination/creativity: shows exceptional ingenuity in

using everyday materials; is keenly observant; has wild,

seemingly silly ideas; fluent and flexible producer of ideas;

is highly curious.

Over 300 public and non-profit private elementary and

secondary school districts (Frasier, 1990c) representing various

ethnic, demographic and socioeconomic groups throughout the country

have served as major research sites utilizing the TABS. No

published research is available concerning the TABS.

Procedure

Each year the school district requests that the regular

classroom teacher, the gifted and talented program teacher, and the

parent complete the TABS for each student nominated for the gifted

and talented program. Each of the 127 students in this study had 4

TABS forms; one completed by the classroom teacher, one completed

by the gifted and talented program teacher and two completed by the

parent (the two forms completed by the parent were used in this

study to examine the occasion facet). Each teacher completed the

TABS once and each parent completed two ratings of the TABS.

Gathering the second parent rating is the customary procedure by

the school district. The second rating involved the same parents

who participated in the first rating.

Third grade students were nominated for the gifted/talented

program during January. The parents of the nominated students

were invited to participate in an 8-hour in-service training taught

by a gifted and talented program teacher. The training was designed

to familiarize the parents with the characteristics of gifted

students.

The training consisted of films illustrating students

participating in a classroom situation. Discussions were conducted

focusing on the gifted characteristics illustrated by the students in

the film. Assignments and curriculum were briefly discussed,

highlighting higher-level thinking and the products associated with

higher-level thinking. The problems of identifying the culturally

diverse, handicapped or educationally atypical gifted student were

also discussed. Of particular benefit (according to the evaluation

form completed at the end of the session) were the

question-and-answer sessions because many parents seemed to have the same

questions. Upon the completion of the in-service training, each

parent was asked by the school district to complete a TABS for

his/her child.

The regular classroom teacher and the gifted and talented

program teacher were asked to attend the same 8-hour in-service

training as the parents. The training was taught by the same gifted and

talented program teacher who taught the parents and the curriculum

was the same as that presented to the parents.

The regular classroom teacher completed a TABS for each

student nominated from her classroom. The gifted and talented

program teacher observed the nominated student in her respective school

while the student worked in the regular classroom. She completed a

TABS after the classroom observation. There was no time limit

imposed on the teacher's observation time of the nominated student.

The teacher was allowed as much time as necessary to observe the

student in order to complete the TABS.

Data Analysis

Data were analyzed according to generalizability theory using

GENOVA (Brennan, 1992). Generalizability theory contributes to

the generalizability of results of various dimensions or facets and

reports on the quality of the data (Brennan, 1992).

Generalizability theory was selected for this study because, in the

behavioral and social sciences, such as psychology and education,

generalizability theory offers an extensive conceptual framework

and a powerful set of statistical procedures for addressing

numerous measurement issues. To an extent, generalizability theory

can be viewed both as an extension of classical test theory and as

an application of certain analysis of variance procedures to

measurement models involving multiple sources of error.

A G-study is first conducted with the data applicable to the

facets being investigated; the G-study is conducted to estimate

variance components associated with a universe of admissible

observations. These estimated variance components can then be

used to estimate results for various D-study designs and universes

of generalization. For any D-study, one must specify the objects of

measurement, define the universe of generalization, and identify the

sample sizes and structure of a D-study design. The typical

quantities that are estimated for a specified D-study design and

universe of generalization include D-study design variance

components, universe score variance, error variances, and a

generalizability coefficient. Usually the magnitudes of these

quantities will vary for different universes of generalization and

for designs that differ in terms of sample sizes and/or structure

(Brennan, 1992).
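
The estimation of variance components in a G-study can be sketched for the simplest case. This is not the GENOVA procedure or the design used in this study: it assumes a single-facet, fully crossed persons-by-raters design with one score per cell, and the function name and ratings are hypothetical.

```python
def g_study_p_by_r(scores):
    """Estimate variance components for a crossed persons x raters design.

    scores[p][r] is the rating given to person p by rater r (one score per cell).
    Returns estimates for persons, raters, and the residual
    (person-by-rater interaction confounded with error).
    """
    n_p, n_r = len(scores), len(scores[0])
    if n_p < 2 or n_r < 2 or any(len(row) != n_r for row in scores):
        raise ValueError("need a complete table with at least 2 persons and 2 raters")

    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    person_means = [sum(row) / n_r for row in scores]
    rater_means = [sum(row[r] for row in scores) / n_p for r in range(n_r)]

    # Mean squares from the two-way ANOVA without replication.
    ms_p = n_r * sum((m - grand) ** 2 for m in person_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in rater_means) / (n_r - 1)
    ss_res = sum(
        (scores[p][r] - person_means[p] - rater_means[r] + grand) ** 2
        for p in range(n_p)
        for r in range(n_r)
    )
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square equations solved for the components; negative
    # estimates are set to zero here (an assumption of this sketch).
    return {
        "person": max(0.0, (ms_p - ms_res) / n_r),
        "rater": max(0.0, (ms_r - ms_res) / n_p),
        "residual": ms_res,
    }


# Hypothetical ratings: 4 students (rows) by 3 raters (columns).
ratings = [[5, 4, 5], [4, 4, 3], [3, 2, 3], [5, 5, 4]]
print(g_study_p_by_r(ratings))
```

The person component plays the role of universe score variance, while the rater and residual components are the error sources that a D-study would then divide by the planned number of raters.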

Three G studies were examined: (1) teacher ratings, (2) parent

ratings, and (3) combined parent and teacher ratings. Finally, G

studies were conducted on each of the total samples (teacher,

parent, combined teacher and parent) to examine the difference

across ethnic groups (Anglo, Asian, African-American, Hispanic).

The data were analyzed using the GENeralized analysis Of

VAriance system (GENOVA). GENOVA enables the researcher to

isolate major sources of measurement error and to conduct decision

study (D-study) analysis (Shavelson & Webb, 1991). The D-study

attempted to provide the following information about the TABS:

1. The minimum number of raters necessary to maintain the

psychometric integrity of the TABS,

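2. The effect on reliability and measurement precision of

using more or less than two raters,

3. The effect on reliability and measurement precision of

using more or less than two rating occasions, and

4. The effect on reliability and measurement precision of

using more or less than ten items.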

This study examines the Phi coefficient, not the G coefficient.

The Phi coefficient is the index for absolute decisions and the G

coefficient is the index for relative decisions. An absolute decision

(Phi coefficient) is used to index the absolute level of an

individual's performance without regard to how well or poorly the

individual's peers performed. Relative decisions (G coefficient)

are used to rank order individuals or groups; attention is paid to

the standing of an individual relative to others, whereas the Phi

coefficient focuses on the absolute value of an individual's

performance (Shavelson & Webb, 1991).
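
The difference between the two coefficients can be made concrete with a small D-study sketch. It assumes a single-facet persons-by-raters design rather than the multi-facet design analyzed in this study, and the function name and variance components are hypothetical, not values from GENOVA output.

```python
def d_study_p_by_r(var_person, var_rater, var_residual, n_raters):
    """G and Phi coefficients for the mean of n_raters in a persons x raters design."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    if var_person <= 0:
        raise ValueError("var_person must be positive")
    # Relative error ignores the rater main effect; absolute error includes it.
    relative_error = var_residual / n_raters
    absolute_error = (var_rater + var_residual) / n_raters
    g_coefficient = var_person / (var_person + relative_error)
    phi_coefficient = var_person / (var_person + absolute_error)
    return g_coefficient, phi_coefficient


# Hypothetical variance components, varied over the number of raters.
for n in (1, 2, 3, 4):
    g, phi = d_study_p_by_r(
        var_person=0.40, var_rater=0.05, var_residual=0.20, n_raters=n
    )
    print(n, round(g, 3), round(phi, 3))
```

Because absolute error adds the rater main effect to the relative error, Phi can never exceed the G coefficient for the same design, and both rise as raters are added.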

Significance of the Study

The information gathered from the first question being

investigated will contribute to the methods of data collection

utilized in the identification of gifted children. By considering the

facets discussed, the data collected in the data sets will contribute

to information concerning the reliability of the instrument. The

information gained will contribute to the research of instruments

designed to identify giftedness in children.

In particular, information gained from the final two research

questions will contribute to the reliability and measurement

precision of using more or less than two raters and more or less

than two rating occasions.

Information gained from this study will contribute to existing

information for those individuals struggling to find effective ways

to identify children from economically disadvantaged and limited

English proficient backgrounds for participation in programs for the

gifted. Because referrals by educators and those closely associated

with children are traditional first steps in identifying children for

gifted program participation, the knowledge they hold about

giftedness and about the instrument being utilized may have a

profound impact on referral decisions.

CHAPTER IV

RESULTS

This study examined the following questions utilizing the

application of generalizability theory.

1. What is the reliability of the TABS scores using

generalizability theory to consider the facets of students, raters,

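occasions and items?

2. What is the effect on reliability and measurement precision

of using more or less than two raters?

3. What is the effect on reliability and measurement precision

of using more or less than two rating occasions?

4. What is the effect on reliability and measurement precision

of using more or less than ten items?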

Each student was rated once by a regular classroom teacher,

once by a gifted and talented program teacher, and on two

occasions by a parent/guardian. The classroom teacher, gifted

and talented program teacher and parent utilized TABS (Frasier,

1990b) to rate the students. Means and standard deviations of

scores are reported.

Descriptive Statistics

The scores obtained from TABS are reported in raw score

format. The instrument utilizes a Likert-type scale of 1-5

(5=strong, 1=weak) focusing on specific student behaviors. Each

rater was asked to rate the student being nominated for assessment.

The scores ranged from 5=strongest to 1=weakest. Raters consisted

of a classroom teacher, a gifted and talented program

teacher and a parent. In order for GENOVA to generate the results

designed for this study, the data were divided into ethnic groups.

Means and standard deviations of scores for the ethnic groups are

reported in Table 1.

Ratings by the Classroom Teacher

All ethnic groups were rated once by the regular classroom

teacher. As Table 1 shows, the mean for the ratings by the classroom

teacher for the Anglo students is 4.39 with a standard deviation of

.69, the mean for the Asian students is 4.64 with a standard

deviation of .48, the mean for the Hispanic students is 4.24 with a

standard deviation of .73, and the mean for the African-American

students is 4.31 with a standard deviation of .71.

Ratings by the Gifted and Talented Program Teacher

The gifted and talented program teacher also rated all

students. On a range of 1 to 5, the mean for the Anglo students is

4.32 with a standard deviation of 1.65, the mean for the Asian

students is 4.5 with a standard deviation of .61, the mean for the

Hispanic students is 3.96 with a standard deviation of .76, and the

mean for the African-American students is 4.34 with a standard

deviation of .71 (Table 1).

Ratings by the Parents

A parent of each student was asked to rate his/her child on

two different occasions. The mean for the ratings by the parents for

the Anglo students is 4.35 with a standard deviation of .75, the

mean for the Asian students is 4.13 with a standard deviation of .72,

the mean for the Hispanic students is 4.18 with a standard deviation

of .81, and the mean for the African-American students is 4.56 with

a standard deviation of .63 (Table 1).

Generalizability Theory

Data were analyzed utilizing GENOVA. The FORTRAN 11

program for analysis of variance and generalizability statistical

program, GENOVA (Crick & Brennan, 1983), was used for this study.

Complete GENOVA output files for this study are available from the

author. Generalizability theory treats both conceptual and

Table 1. Ratings by Classroom Teacher, Gifted and Talented Program Teacher and Parent.

Mean (M) / Standard Deviation (SD)

                                         Anglo         Asian         African-American   Hispanic
Rater                                    M      SD     M      SD     M      SD          M      SD

Classroom Teacher (1 Occasion)           4.39   .69    4.64   .48    4.31   .71         4.24   .73
Gifted and Talented Program Teacher
  (1 Occasion)                           4.32   1.65   4.51   .61    4.34   .72         3.96   .76
Parent (2 Occasions)                     4.35   .75    4.13   .72    4.56   .63         4.18   .81

57

statistical issues associated with generalizing from a sample of

conditions of measurement to a universe of such conditions. In

particular, generalizability theory emphasizes the estimation, use,

and interpretation of variance components associated with

universes.

Generalizability analyses were conducted to partition

systematic and measurement error sources within the data set of

scores obtained on TABS. Students were considered the object of

measurement. Error variance facets were raters, items, and

occasions, as well as the interaction effects. Both generalizability

and phi coefficients were computed, but only the phi coefficient will

be considered since only absolute decisions are being considered.

The obtained variance partitions were also utilized in decision, or

D-study, analyses to explore the estimated effects on score integrity

of selected changes in the measurement protocol.
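
As an illustrative check, and not part of the original GENOVA analysis, the two coefficients can be recomputed from the summary variances reported in Table 2. The sketch below assumes the standard definitions: the generalizability coefficient divides universe score variance by itself plus relative error variance (lower case delta), while the Phi coefficient uses absolute error variance (upper case delta) instead. The function and variable names are hypothetical.

```python
# Summary variances as reported in Table 2 (combined ratings, total sample).
universe_score_var = 0.07484   # variance of students' universe scores
relative_error_var = 0.06967   # "lower case delta"
absolute_error_var = 0.07184   # "upper case delta"


def generalizability_coefficient(universe_var, relative_var):
    """Coefficient for relative (rank-order) decisions."""
    return universe_var / (universe_var + relative_var)


def phi_coefficient(universe_var, absolute_var):
    """Dependability coefficient for absolute decisions."""
    return universe_var / (universe_var + absolute_var)


g = generalizability_coefficient(universe_score_var, relative_error_var)
phi = phi_coefficient(universe_score_var, absolute_error_var)
print(f"generalizability = {g:.4f}, phi = {phi:.4f}")
```

Under these assumptions the recomputed values agree with the coefficients printed in Table 2 (.51788 and .51022) to about four decimal places. Phi cannot exceed the generalizability coefficient, because absolute error variance also includes the rater and item main effects.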


Generalizability Analysis

Combined Raters

Analyses were generated by GENOVA to evaluate the data

generated for all the raters (classroom teachers, gifted and talented

program teachers, and parents). The following discussion will not

include the Asian group of students because this group (N = 6) was

considered too small to impact the study. Data for the Asian group

are found in Appendix B.

Combined Ratings for Total Sample

Variance components (Table 2) for the main effects of items

and raters are low (.01 and .00, respectively) and are likely positive and

stable due to the low standard errors. The largest variance component

was that for the interaction of students, raters, and items (.24); the

components for students (.07) and items (.01) were smaller. The Phi

coefficient for the combined raters (teachers and parents) was .51 (Table 2).

This value is considered marginal with respect to reliability.


Combined Ratings for Anglo Sample

Data generated by GENOVA for the Anglo sample present low

variance components for raters (.00) and items (.00). The

variance components in Table 3 suggest that the largest source of

variance in the combined ratings for the Anglo student

population was the student, rater, and item interaction (.24).

The Phi coefficient (.54) suggests marginal reliability of scores.

Combined Ratings for African-American Sample

The G-study generated by GENOVA for this population is shown in Table 4.

Variance components, indicated in Table 4, range from .00 to .25.

The highest obtained variance component, that for the interaction of

students, raters, and items, is .25. The Phi coefficient (.25) obtained for the

combined raters in this population of students reflects unreliable

scores.


Table 2. Generalizability Calculations for Combined Ratings, Total Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .07                      1.00              1                  .07         .02
R        .00                      1.00              3                  .00         .00
I        .01                      1.00              8                  .00         .00
SR       .18                      1.00              3                  .01         .00
SI       .03                      1.00              8                  .00         .00
RI       .00                      1.00              24                 .00         .00
SRI      .24                      1.00              24                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .07484     .27357               .01902
Expected Observed Score   .14451     .38015               .01806
Lower Case Delta          .06967     .26396               .00597
Upper Case Delta          .07184     .26804               .00612
Mean                      .00331     .05752

Generalizability Coefficient = .51788 (1.07419)
Phi = .51022 (1.04174)

S = Students; R = Raters; I = Items; SR = Students combined with Raters; SI = Students combined with Items; RI = Raters combined with Items; SRI = Students combined with Raters combined with Items

Table 3. Generalizability Calculations for Combined Ratings, Anglo Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .08                      1.00              1                  .08         .02
R        .00                      1.00              3                  .00         .00
I        .00                      1.00              10                 .00         .00
SR       .16                      1.00              3                  .05         .00
SI       .03                      1.00              10                 .00         .00
RI       .01                      1.00              30                 .00         .00
SRI      .24                      1.00              30                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .07621     .27606               .02409
Expected Observed Score   .13993     .37408               .02300
Lower Case Delta          .06372     .25243               .00715
Upper Case Delta          .06519     .25532               .00716
Mean                      .00338     .05818

Generalizability Coefficient = .54462 (1.19597)
Phi = .53897 (1.16905)

Table 4. Generalizability Calculations for Combined Ratings, African-American Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .03                      1.00              1                  .03         .04
R        .00                      1.00              3                  .00         .00
I        .01                      1.00              10                 .00         .00
SR       .20                      1.00              3                  .01         .00
SI       .02                      1.00              10                 .00         .00
RI       .00                      1.00              30                 .00         .00
SRI      .25                      1.00              30                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .02552     .15974               .03875
Expected Observed Score   .10249     .32014               .03416
Lower Case Delta          .07697     .27744               .01826
Upper Case Delta          .07764     .27863               .01752
Mean                      .00669     .08180

Generalizability Coefficient = .24897 (.33151)
Phi = .24737 (.32868)

Combined Ratings for Hispanic Sample

GENOVA generated the G-study shown in Table 5. This G-study

reports low overall standard errors for the mean scores, ranging

from .00 to .04. The highest obtained variance components are found in

the student/rater effect (.21) and the student/rater/item effect

(.26). The Phi coefficient (.47) indicates questionable scores.

Teacher Ratings

Analyses were also generated by GENOVA to evaluate the data

generated for teacher ratings for the total sample and teacher

ratings with individual ethnic student groups (Anglo, African-American,

Hispanic) completing the TABS.

Table 5. Generalizability Calculations for Combined Ratings, Hispanic Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .07                      1.00              1                  .07         .04
R        .02                      1.00              3                  .00         .00
I        .00                      1.00              10                 .00         .00
SR       .21                      1.00              3                  .07         .01
SI       .03                      1.00              10                 .00         .00
RI       .00                      1.00              30                 .00         .00
SRI      .26                      1.00              30                 .00         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .07736     .27814               .04148
Expected Observed Score   .15870     .39837               .03907
Lower Case Delta          .08134     .28520               .01394
Upper Case Delta          .08696     .29488               .01456
Mean                      .01058     .10284

Generalizability Coefficient = .48746 (.95107)
Phi = .47080 (.88964)

Teacher Ratings for Total Sample

Table 6 reports the analysis of the data for the teacher ratings

for the total sample of the TABS as generated by GENOVA.

GENOVA utilizes the variance components in Table 6 to estimate the

variance components of the G-study universe. Those estimates, as

well as overall generalizability and Phi coefficients, are reported.

The most significant sources of variance found for teacher raters

were the students' main effect (.11) and the two-way interaction

of students and raters (.16). The interaction of all facets (students,

raters, and items) resulted in a variance component of .23. The main

effects and interactions that excluded the student component or the

systematic variance contributed no or negligible variance to the

obtained scores. Student and rater effect comprised the largest

portion of variance among the facets. The Phi coefficient of .51

suggests that teachers as raters need to be examined further. The

standard errors of the mean scores computed by GENOVA, ranging

from .00 to .03, are low relative to the estimated variance

components, which ranged from .00 to .23. Therefore, the variance


components appear to provide stable estimates of teacher rater

observations.

Teacher Ratings for Anglo Sample

GENOVA generated the G-study results shown in Table 7. This table

contains the variance components derived from the algorithm

method and estimated mean square equations for this population.

Standard errors for the mean scores were low and ranged from

.00 to .04. The interaction of all facets (students, raters, items)

produced the most variance (.22) among the possible sources. The

Phi coefficient of .59 suggests that further investigation may be

needed.

The variance components for single observations revealed low

components for raters (.00) and items (.00). Among the main effects,

students contributed the greatest variance (.13); the interaction

between students, raters, and items contributed .22.


Table 6. Generalizability Calculations for Teacher Ratings, Total Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .11                      1.00              1                  .11         .03
R        .00                      1.00              2                  .00         .00
I        .00                      1.00              10                 .00         .00
SR       .16                      1.00              2                  .08         .01
SI       .03                      1.00              10                 .00         .00
RI       .00                      1.00              20                 .00         .00
SRI      .23                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .10727     .32752               .02814
Expected Observed Score   .20369     .45132               .02556
Lower Case Delta          .09642     .31052               .01177
Upper Case Delta          .10164     .31880               .01261
Mean                      .00683     .08266

Generalizability Coefficient = .52663 (1.11253)
Phi = .51349 (1.05544)

Table 7. Generalizability Calculations for Teacher Ratings, Anglo Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .13                      1.00              1                  .13         .04
R        .00                      1.00              2                  .00         .00
I        .00                      1.00              10                 .00         .00
SR       .14                      1.00              2                  .07         .01
SI       .03                      1.00              10                 .00         .00
RI       .00                      1.00              20                 .00         .00
SRI      .22                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .12964     .36006               .03775
Expected Observed Score   .21439     .46302               .03525
Lower Case Delta          .08475     .29111               .01353
Upper Case Delta          .08740     .29564               .01365
Mean                      .00560     .07481

Generalizability Coefficient = .60471 (1.52980)
Phi = .59730 (1.48326)

Teacher Ratings for African-American Sample

Table 8 displays the GENOVA results for the African-American

student population. The rater and item main effects are reported as zero because

negative estimates were generated. The leading source of variance

(.22) was produced by the interaction of all facets (students, raters,

items). A Phi coefficient of .38 was computed for the teacher

ratings for the African-American student sample. This coefficient

further supports the notion that additional research in this area is

needed to establish reliability.

Teacher Ratings for Hispanic Sample

The G-study results (Table 9) for this population obtained

variance components ranging from .00 to .24. The variance

component of .24 was produced by the interaction of all the facets

(students, raters, items). Remaining variance components are likely

positive and stable due to low standard errors. The Phi coefficient

of .22 suggests that additional investigation is needed.


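Table 8. Generalizability Calculations for Teacher Ratings, African-American Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .06                      1.00              1                  .06         .06
R        .00                      1.00              2                  .00         .00
I        .00                      1.00              10                 .00         .00
SR       .17                      1.00              2                  .09         .03
SI       .02                      1.00              10                 .00         .00
RI       .02                      1.00              20                 .00         .00
SRI      .22                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .06093     .24683               .06446
Expected Observed Score   .16066     .40082               .05510
Lower Case Delta          .09973     .31580               .03345
Upper Case Delta          .10060     .31717               .03137
Mean                      .01091     .10444

Generalizability Coefficient = .37923 (.61091)
Phi = .37720 (.60565)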

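Table 9. Generalizability Calculations for Teacher Ratings, Hispanic Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .04                      1.00              1                  .04         .05
R        .03                      1.00              2                  .02         .02
I        .00                      1.00              10                 .00         .00
SR       .24                      1.00              2                  .00         .03
SI       .03                      1.00              10                 .00         .00
RI       .00                      1.00              20                 .00         .00
SRI      .24                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .04191     .20473               .05468
Expected Observed Score   .17794     .42183               .04381
Lower Case Delta          .13603     .36882               .03272
Upper Case Delta          .15210     .39000               .03574
Mean                      .02163     .14707

Generalizability Coefficient = .23554 (.30812)
Phi = .21604 (.27557)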

Parent Ratings

Analyses were also generated by GENOVA to evaluate the data

generated for parent ratings for the total sample and for individual ethnic

student groups (Anglo, African-American, Hispanic) completing the TABS.

Parent Ratings for Total Sample

A G-study for the parents as raters for the TABS was

conducted utilizing GENOVA (Table 10). All students were included

in the data for Table 10 as well as two occasions (ratings) and 10

items.

The main effect for occasions produced the lowest variance

(.00). The leading source of variance (.23) was produced by the

interaction of students, occasions, and items. The standard errors for mean

scores, ranging from .00 to .03, are low. The Phi coefficient of .82

indicates an acceptable measure of parents as raters.


Table 10. Generalizability Calculations for Parent Ratings, Total Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .18                      1.00              1                  .18         .03
O        .00                      1.00              2                  .00         .00
I        .02                      1.00              10                 .00         .00
SO       .03                      1.00              2                  .02         .00
SI       .10                      1.00              10                 .00         .00
OI       .00                      1.00              20                 .00         .00
SOI      .23                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .18124     .42573               .02784
Expected Observed Score   .21982     .46885               .02759
Lower Case Delta          .03857     .19640               .00375
Upper Case Delta          .04067     .20167               .00385
Mean                      .00384     .06196

Generalizability Coefficient = .82451 (4.69846)
Phi = .81673 (4.45655)

Parent Ratings for Anglo Sample

The standard errors for mean scores, as indicated in Table 11,

were low, ranging from .00 to .03, and the remaining variance

components are likely positive and stable due to the low standard

errors. The interaction between students, occasions, and

items produced the most variance (.22) among the possible sources.

A Phi coefficient of .75 suggests reliable scores.

Parent Ratings for African-American Sample

The standard errors for mean scores (Table 12) for the African-American

population for all areas remained low. Occasions (.00) and

items (.00) provided the lowest variance components for a single

observation. The interaction of students, occasions, and items

produced the most variance (.26). The Phi coefficient, or

dependability index, was computed at .68, suggesting that the parent

ratings obtained on the TABS reflect fairly reliable scores.


Table 11. Generalizability Calculations for Parent Ratings, Anglo Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .14                      1.00              1                  .14         .03
O        .00                      1.00              2                  .00         .00
I        .02                      1.00              10                 .00         .00
SO       .04                      1.00              2                  .02         .00
SI       .12                      1.00              10                 .00         .00
OI       .00                      1.00              20                 .00         .00
SOI      .22                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .14229     .37722               .03109
Expected Observed Score   .18616     .43146               .03060
Lower Case Delta          .04387     .20944               .00546
Upper Case Delta          .04734     .21758               .00546
Mean                      .00603     .07764

Generalizability Coefficient = .76436 (3.24385)
Phi = .75035 (3.00557)

Table 12. Generalizability Calculations for Parent Ratings, African-American Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .08                      1.00              1                  .08         .04
O        .00                      1.00              2                  .00         .00
I        .00                      1.00              10                 .00         .00
SO       .04                      1.00              2                  .01         .00
SI       .03                      1.00              10                 .00         .00
OI       .00                      1.00              20                 .00         .00
SOI      .26                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .07602     .27571               .04002
Expected Observed Score   .11191     .33452               .03838
Lower Case Delta          .03589     .18944               .01133
Upper Case Delta          .03653     .19114               .01081
Mean                      .00764     .08741

Generalizability Coefficient = .67931 (2.11823)
Phi = .67541 (2.08079)

Parent Ratings for Hispanic Sample

The most significant information for the Hispanic sample

(Table 13) involved the student main effect (.30) and the interaction of

students, occasions, and items (.23). The student facet also produced

the highest standard error for the mean scores (.08). The remaining

variance components are likely positive and stable due to the low

standard errors. A Phi coefficient of .91 is considered high and

indicates a reliable score.

Decision Studies Considerations

The primary purpose of this study was to assess the reliability

of the TABS and the impact of raters and occasions on overall

reliability. Several D-studies were conducted to determine how the

generalizability and error coefficients would be affected by

different research designs; in particular, an increase or decrease in

the occasion or rater factors.


Table 13. Generalizability Calculations for Parent Ratings, Hispanic Sample.

Variance Components in Terms of G-study Universe (of Admissible Observations) Sizes

         Variance Components      Finite Universe   D-study Sampling   For Mean Scores
Effect   for Single Observation   Corrections       Frequencies        Estimates   Standard Errors
S        .30                      1.00              1                  .30         .08
O        .00                      1.00              2                  .00         .00
I        .02                      1.00              10                 .00         .00
SO       .02                      1.00              2                  .00         .00
SI       .09                      1.00              10                 .00         .00
OI       .00                      1.00              20                 .00         .00
SOI      .23                      1.00              20                 .01         .00

                          Variance   Standard Deviation   Standard Error of Variance
Universe Score            .29856     .54640               .08060
Expected Observed Score   .32671     .57159               .08043
Lower Case Delta          .02816     .16780               .00520
Upper Case Delta          .02981     .17264               .00513
Mean                      .01186     .10890

Generalizability Coefficient = .91382 (10.60357)
Phi = .90923 (10.01682)

While the TABS has a fixed number of items, a D-study was

conducted to evaluate the effect on reliability of omitting items

from the survey instrument. The results of this D-study are found in

Table 14. The impact of incomplete ratings was addressed by

altering the number of items considered in the D-study. Although

the items on TABS are fixed, raters often do not respond to all of the

items. The TABS contains 10 items, and the Phi coefficient remains

near .51 when 8 or more items are completed. The Phi coefficient

achieved when fewer than 8 items were completed was below .49.

The results of two further D-studies (Tables 15 and 16) utilize

the variance components reported by GENOVA to derive the

coefficients for the different measurement protocols. Table 15

illustrates the parents as raters with one and two rating occasions. Table

16 illustrates the combined raters and one rating occasion. Optimum

reliability for the TABS was achieved when the student was rated on

two occasions by parent raters (Table 15). The least desirable

situation examined was one occasion and one rater (Table 16).
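
A D-study projection of this kind can be sketched as follows. The example is an illustration rather than the GENOVA procedure itself: it assumes a fully crossed, random-effects students x raters x items design, uses the two-decimal single-observation components printed in Table 2, and the function name d_study is hypothetical. Because the inputs are rounded, the projected coefficients only approximate those in Table 16.

```python
def d_study(components, n_raters, n_items):
    """Return (generalizability, phi) for mean scores over raters and items.

    Assumes a fully crossed random-effects S x R x I design, with students
    as the object of measurement.
    """
    universe = components["S"]
    # Relative error: interactions that involve students.
    relative_error = (components["SR"] / n_raters
                      + components["SI"] / n_items
                      + components["SRI"] / (n_raters * n_items))
    # Absolute error adds the rater, item, and rater-by-item effects.
    absolute_error = (relative_error
                      + components["R"] / n_raters
                      + components["I"] / n_items
                      + components["RI"] / (n_raters * n_items))
    return (universe / (universe + relative_error),
            universe / (universe + absolute_error))


# Single-observation components as printed (two decimals) in Table 2.
total_sample = {"S": .07, "R": .00, "I": .01,
                "SR": .18, "SI": .03, "RI": .00, "SRI": .24}

for n_raters in (1, 2, 3):
    g, phi = d_study(total_sample, n_raters, n_items=10)
    print(f"{n_raters} rater(s), 10 items: phi = {phi:.2f}")
```

With these rounded inputs the projected Phi rises as raters are added (roughly .25, .40, and .49 for one, two, and three raters with 10 items), the same direction of change reported in Table 16.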


Table 14. D-Study: Phi Coefficients for Items Omitted from TABS.

3 Raters, 1 Observation per Rater, Item Numbers Varied

Items   Phi Coefficient
10      .51
8       .50
5       .47
1       .30

Table 15. D-Study: Phi Coefficients of TABS with Parents As Raters and Occasions Varied

                                  Ethnic Group
Occasions/Raters                  Anglo   African-American   Hispanic
2 Occasions, 1 Rater (Parent)     .75     .68                .90
1 Occasion, 1 Rater (Parent)      .64     .52                .85

Table 16. D-Study: Phi Coefficients of TABS for Raters and Occasions, Varied.

                         Ethnic Groups
Raters/Occasions         Anglo   African-American   Hispanic   Combined
3 Raters, 1 Occasion     .54     .25                .47        .51
2 Raters, 1 Occasion     .44     .18                .37        .44
1 Rater, 1 Occasion      .29     .10                .23        .26

CHAPTER V

DISCUSSION

If we are to become more effective in recognizing gifted

potential in minority, economically disadvantaged and limited

English proficient student populations, then a number of issues must

be addressed. This study has examined one of those issues: the

reliability of an instrument utilized by educators to identify

minority, economically disadvantaged and limited English proficient

students for gifted programs.

This study was designed to test the reliability of the TABS

while simultaneously considering several sources of variance and

computing a reliability coefficient for the students, raters, items,

and occasions. The occasions (test-retest) reliability coefficient

utilizing parents as raters was examined. The utility of teachers as

raters and parents as raters was investigated and reviewed with

respect to previous research. Finally, the limitations of GENOVA and


this study were considered along with implications for future

research using TABS.

Generalizability Finding for TABS

The TABS is designed to document behavioral observations in

order to help guide the search for giftedness in culturally different

groups. Frasier (1990b) stated that TABS meets the requirements

of best practices through its focus on diversity in the gifted

population and involvement of people inside and outside the school

and that TABS should only be utilized after the raters are provided

training to recognize gifted characteristics. The results of these

G-studies yielded coefficients suggesting that raters participating in

this study did not establish inter-rater reliability before completing

the survey instrument. The results of this study also suggest that

the TABS should not be used as the sole identification instrument,

but rather as an adjunct to a comprehensive identification

portfolio.


D-Study Findings for TABS

The results of a D-study revealed increases in the Phi

coefficients when parents rated their children on more than one

occasion. The Phi coefficient for the Hispanic ratings (2 occasions,

1 parent) was .90, indicating a very reliable rating. The lowest Phi

coefficient is found with the ratings (1 occasion, 1 parent) for the

African-American population. The overall Phi coefficients would

indicate that the training provided the parents before they rated

their children was effective.

A second D-study result, reported in Table 16, revealed

appreciable increases in the Phi coefficient when the number of

observations and raters was increased from the original design of

one occasion/one rater (.26) for the combined raters to two

occasions/three raters (.51) for the combined raters. This D-study

suggests that additional raters and occasions increase the Phi

coefficient. While cost and time must be considered when utilizing

any instrument, this D-study suggests that additional raters and

occasions add to the reliability of the instrument. Both coefficients

of this D-study suggest that teachers as raters need to be examined

further.

The Phi coefficients dramatically increased for all groups of

students. The highest Phi coefficient for the African-American

population, however, did not indicate reliability (.25 with 3 raters, 1

occasion). This low Phi coefficient would suggest low reliability

concerning scores within this population of students.

Teachers As Raters

This study supports previous findings that teachers play an

important role in the identification of students for educational

programs (Epkins, 1993; Milich & Landau, 1988; Pelham, Gnagy,

Greenslade, & Milich, 1992). It is important for teachers to be a

part of the selection process because they have data to offer that is

not available to other members of the selection team. However, the

particular beliefs and attitudes of the teacher must be considered

(Pegnato & Birch, 1959), especially when the identification process

takes place at the elementary level (Jacobs, 1971).


Pegnato and Birch (1959) suggested that teachers most often

choose children like themselves as gifted. Whatever the teacher

values will be the criterion for selection. Often the quiet, well-behaved,

well-dressed child who gets good grades is a prime target

for teacher selection. In their study, Pegnato and Birch found that

teachers identified only 45% of the students in their classes who

were cognitively gifted, actually missing 55%. They suggested that

systematic bias may exist among teachers when attempting to

identify giftedness in students.

One way to help teachers know how to identify the

characteristics of gifted learners is to improve their accuracy in

selecting children who demonstrate these characteristics. Providing

teachers with information about the common characteristics found

among gifted children would encourage them to look for those

characteristics they might otherwise miss (Borland, 1978; Gear,

1978).

If teachers cannot recognize gifted characteristics in the

students in their classes, there may be a waste of human potential.

Gear (1978) found that the effectiveness of teacher selection was

improved with a training program. The teachers participating in the

training program were twice as effective as were untrained

teachers. Teachers can improve their efficiency in selecting gifted

students when they are provided training to recognize gifted

behaviors in all groups of children. This is especially important

when attempting to recognize the gifted abilities of children from

economically disadvantaged and limited English proficient

backgrounds. With training, the likelihood exists that gifted

children from underrepresented groups will also be better

recognized (Borland, 1978).

Parents As Raters

This study also found high Phi coefficients (Table 15) in

ratings by parents. This high Phi coefficient may be because parents

are aware of the behavior of their child and can provide information

that is clearly indicative of potential giftedness (Jacobs, 1971).

The Phi coefficient findings in this study clearly indicated that

parents as raters provided reliable data.

Limitations of the Study

The sample consisted of 127 students. Regular classroom

teachers and gifted and talented program teachers served as raters.

The sample size was considered adequate and the data provided an

accurate representation of the TABS results. The sample for this

study was limited to third grade students, thus limiting the results

outside this grade range. Generalizations to other groups should be

considered with caution.

Data for this study were collected from third grade teachers

across the school district, gifted and talented program teachers, and

parents. All raters had received eight hours of gifted and talented

training focusing on the characteristics of gifted and talented

children.


Directions for Future Research Using the TABS Form

The overall reliability of the TABS was examined in this study.

No prior research on the TABS exists. Additional studies

should be conducted to further examine the reliability of this

instrument and its usefulness in the identification of gifted

individuals.

The results obtained in this study suggested that reliability

may be improved if inter-rater reliability is established. Inter-rater

reliability is essential if more than one individual is to be

involved in the rating. Without inter-rater reliability, data are

invalid. Several hours (above those suggested to familiarize raters

with the characteristics of gifted behavior) should be allowed to

establish inter-rater reliability. One should expect that the initial data

set will have divergent ratings. As subsequent sets are rated,

agreement will increase. If long periods elapse between rating

sessions, it is necessary to reestablish inter-rater reliability.


Implications

Information from this survey presents several directions for

related research. First, organize a follow-up of this

study to determine what changes were made in the school district

involved in this study to prepare their teachers and parents to meet

the needs of gifted students. Second, implement a survey of gifted

programs presently utilizing TABS to ascertain the extent of

training received by teachers and parents in order to identify gifted

characteristics in children. Third, implement a statewide survey of

gifted education programs to ascertain their methods of identifying

gifted students. Fourth, determine, from the statewide survey, if a

new identification instrument is needed.

New information on gifted identification will benefit those

individuals struggling to find effective ways to identify children

from economically disadvantaged and limited English proficient

backgrounds for participation in programs for the gifted. Because

referrals by educators and those closely associated with children

are traditional first steps in identifying children for gifted program

participation, the knowledge they hold about giftedness and about

the instrument being utilized may have a profound impact on referral

decisions.


REFERENCES

Achenbach, T. M., McConaughy, S. H., & Howell, C. T. (1987). Child/adolescent behavioral and emotional problems: Implications of cross-informant correlations for situational specificity. Psychological Bulletin, 101, 213-232.

Au, M. L., & Pumfrey, P. D. (1993). Parents' and teachers' expectations of children's attainments: Match or mismatch? British Journal of Special Education, 20, 109-120.

Baldwin, A. Y. (1978). Curriculum and methods: What is the difference? In A. Baldwin, G. Gear, & L. Lucito (Eds.), Educational planning for the gifted: Overcoming cultural, geographic, and socioeconomic barriers (pp. 37-49). Reston, VA: Council for Exceptional Children.

Baldwin, A. Y. (1980). The Baldwin Identification Matrix. Its development and use in programs for the gifted child. Philadelphia, PA: Paper presented at the Convention of the Council for Exceptional Children.

Barber, L. W., & Barton, K. (1971). Useability by raters of the Barber Scales of Self-Regard for Preschool Children. Boston, MA: Paper presented at the Annual Meeting of the Eastern Psychological Association.

Barkan, J. H., & Bernal, E. M. (1991). Gifted education for bilingual and limited English proficient students. Gifted Child Quarterly, 22, 144-147.

Bermudez, A. B., & Rakow, S. J. (1990). Analyzing teachers' perceptions of identification procedures for gifted and talented Hispanic limited English proficient students at-risk. Journal of Educational Issues of Language Minority Students, 7, 21-33.

Bernal, E. (1978). The identification of gifted Chicano children. In A. Baldwin, G. Gear, & L. Lucito (Eds.), Educational planning for the gifted. Reston, VA: Council for Exceptional Children.

Borland, J. (1978). Teacher identification of the gifted: A new look. Journal for the Education of the Gifted, 2, 22-32.

Brennan, R. L. (1983). Elements of generalizability theory. Iowa City, IA: American College of Testing.

Brennan, R. L. (1992). An NCME instructional module on generalizability theory. Instructional Theory in Educational Measurement, 27-34.

Brennan, R. L. (1992). Elements of generalizability theory. Iowa City, IA: American College Testing Program.

Chambliss, C., & Melmed, M. (1990). Attitudinal and behavioral responses toward parent clientele of parent and nonparent child care providers. ERIC No. ED 320689.

Chen, X. (1993). Reliability coefficient and correlation ratio between the observed scores and latent trait. Acta Psychologica Sinica, 25, 395-399.

Christensen, A., Phillips, S., Glasgow, R. E., & Johnson, S. M. (1983). Parental characteristics and interactional dysfunction in families with child behavior problems: A preliminary investigation. Journal of Abnormal Child Psychology, 11, 153-166.

Clark, B. (1992). Growing up gifted: Developing the potential of children at home and at school (4th ed.). New York: Merrill.

Cornell, D. G. (1994). Low incidence of behavior problems among elementary school students in gifted programs. Journal for the Education of the Gifted, 18, 4-19.

Crick, J. E., & Brennan, R. L. (1983). Manual for GENOVA: A GENeralized analysis Of VAriance system (Number 43). Iowa City, IA: American College of Testing.

Cronbach, L. J., Gleser, G. C., Nanda, H., & Rajaratnam, N. (1972). The dependability of measurements. New York: John Wiley & Sons, Inc.

Damico, J. S. (1985). Clinical discourse analysis: A functional approach to language assessment. In C. Simon, Communication skills and classroom success: Assessment of language-learning disabled students (pp. 165-204). San Diego, CA: College-Hill.

Davis, G. A., & Rimm, S. B. (1994). Education of the gifted and talented (3rd ed.). Englewood Cliffs, NJ: Prentice Hall.

Dearborn, D. C., & Simon, H. A. (1958). Selective perception: A note on the departmental identification of executives. Sociometry, 140-148.

de Bernard, A. E. (1985). Why Jose can't get into the gifted class: The bilingual child and standardized reading tests. Roeper Review, 8, 80-82.

DeHaan, R. F., & Havighurst, R. J. (1961). Educating gifted children. Chicago, IL: University of Chicago Press.

Delcourt, M., Loyd, B., Cornell, D., Goldberg, M., & Bland, L. (1994). Evaluation of the effects of programming arrangements on student learning outcomes. (Research Monograph). The University of Virginia, Charlottesville, Virginia, Department of Education.

Delgado-Gaitan, C., & Trueba, H. T. (1985). Ethnographic study of the participant structures in task completion: Reinterpretation of "handicaps" in Mexican children. Learning Disability Quarterly. 8, 67-75.

Eason, S. (1991). Why generalizability theory yields better results than classical test theory: A primer with concrete examples. In B. Thompson (Ed.), Advances in educational research: Substantive findings, methodological developments (pp. 83-98). Greenwich, CT: JAI Press, Inc.

Eisenstadt, T. H. (1994). Interparent agreement on the Eyberg Child Behavior Inventory. Child and Family Behavior Therapy. 16. 21-27.

Epkins, C. C. (1993). A preliminary comparison of teacher ratings and child self-report depression, anxiety, and aggression in inpatient and elementary school samples. Journal of Abnormal Child Psychology. 21. 649-661.

Ford, D. Y., & Harris, J. J. (1990). On discovering the hidden treasure of gifted and talented black children. Roeper Review. 13. 27-32.

Ford, D. Y., & Harris, J. J. (1996). Recruiting and retaining diverse students in gifted education: Pitfalls and promises. Tempo. 16. 8-12.

Forehand, R., Lautenschlager, G. J., Faust, J., & Graziano, W. G. (1986). Parent perceptions and parent-child interactions in clinic-referred children: A preliminary investigation of the effects of maternal depressive moods. Behavior Research and Therapy. 24, 73-75.

Foster-Gaitskell, D., & Pratt, C. (1989). Comparison of parent and teacher ratings of adaptive behavior of children with mental retardation. American Journal on Mental Retardation. 94. 177-181.

Frasier, M. M. (1987). The identification of gifted black students: Developing new perspectives. Journal for the Education of the Gifted. 10. 155-180.

Frasier, M. M. (1990a). An investigation of giftedness in economically disadvantaged and limited English proficient populations. Athens, GA: The National Research Center on the Gifted and Talented Proposal, The University of Georgia.

Frasier, M. M. (1990b). Identifying the gifted: Observation and rating scales. Athens, GA: The Torrance Center for Creative Studies, The University of Georgia.

Frasier, M. M. (1990c). Instruction manual: Using the Frasier talent assessment profile (F-TAP). Athens, GA: The University of Georgia.

Frasier, M. M. (1991). Response to Kitano: The sharing of giftedness between culturally diverse and non-diverse gifted students. Journal for the Education of the Gifted. 15. 20-30.

Frasier, M. M., & Passow, A. H. (1994). Toward a new paradigm for identifying talent potential: Executive summary (Research Monograph 94112). Storrs, CT: University of Connecticut, The National Research Center on the Gifted and Talented.

Frierson, E. C. (1965). Upper and lower status gifted children: A study of differences. Exceptional Children. 32. 83-90.

Gallagher, J. J. (1994). Teaching the gifted child (4th ed.). Boston: Allyn & Bacon.

Gardner, H. (1983). Frames of mind: The theory of multiple intelligences. New York: Basic Books.

Gear, G. (1978). Effects of training on teachers' accuracy in the identification of gifted children. Gifted Child Quarterly. 11. 90-97.

Gilbert, S. L. (1994). Parental and professional agreement in the assessment of children with disabilities: An examination. ERIC No. ED 378705.

Hambleton, R. K., & Powell, S. (1983). A framework for viewing the process of standard setting. Evaluation and the Health Professions. 6. 3-24.

Hanson, J. B., & Feldhusen, J. F. (1994). Comparison of trained and untrained teachers of gifted students. Gifted Child Quarterly. 38, 115-121.

Hanson, J. B., & Linden, L. W. (1990). Selecting instruments for identifying gifted and talented students. Roeper Review. 13. 10-15.

Hicks, J. S. (1988). The five D's replication study. Syosset, NY: Paper presented at Annual Convention of the Council for Exceptional Children.

Hoge, R. D. (1989). An examination of the giftedness construct. Canadian Journal of Education. 14. 6-17.

Houston, W. M., Raymond, M. R., & Svec, J. C. (1991). Adjustments for rater effects. Applied Psychological Measurement. 15. 409-421.

Hunsaker, S. L., & Callahan, C. M. (1993). Evaluation of gifted programs: Current practices. Journal for the Education of the Gifted. 16. 190-200.

Jacobs, J. (1971). Effectiveness of teacher and parent identification of gifted children as a function of school level. Psychology in the Schools. 8. 140-142.

Johnson, S., & Bell, J. (1985). Evaluating and predicting survey efficiency using generalizability theory. Journal of Educational Measurement. 22. 107-119.

Kaplan, J. A. (1993). The co-parenting system: Longitudinal effects for kindergartners of differences between mothers' and fathers' parenting styles. New Orleans, LA: Paper presented at the Biennial Meeting of the Society for Research in Child Development.

Keller, M. (1990). Holistic identification of potentially gifted students: An alternative to the matrix. Instructional Leader. 12, 4-7.

Kitano, M. K., & Kirby, D. F. (1986). Gifted education: A comprehensive view. Boston, MA: Little, Brown.

Kranz, B. (1978). Multi-dimensional screening device for the identification of gifted/talented children. Grand Forks, ND: Bureau of Educational Research and Services, University of North Dakota.

Maker, C. J., & Schiever, S. W. (Eds.). (1989). Critical issues in gifted education: Defensible programs for cultural and ethnic minorities. Austin, TX: Pro-Ed.

Marland, S. P. (1972). Education of the gifted and talented: Report to the Congress of the United States. Washington, D.C.: U.S. Government Printing Office.

Marsden, P. (1993). The reliability of network density and composition measures. Social Networks. 15. 399-421.

Milich, R., & Landau, S. (1988). Teacher ratings of inattention/overactivity and aggression: Cross-validation with classroom observations. Journal of Clinical Child Psychology. 17, 92-97.

Newcorn, J. H., Halperin, J. M., Schwartz, S., Pascualvaca, D., Wolf, L., Schmeidler, J., & Sharma, V. (1994). Parent and teacher ratings of attention-deficit hyperactivity disorder symptoms: Implications for case identification. Developmental and Behavioral Pediatrics. 86-91.

Nisbett, R. E., & Wilson, T. D. (1977). The halo effect: Evidence for the unconscious alteration of judgments. Journal of Personality and Social Psychology. 35. 450-456.

Office of Educational Research and Improvement. (1993). National excellence: A case for developing America's talent. Washington, DC: U.S. Department of Education.

Pagano, R. R. (1990). Understanding statistics in the behavioral sciences. New York: West Publishing Company.

Passow, A. H., & Rudnitski, R. A. (1993). State policies regarding education of the gifted as reflected in legislation and regulation (CRS93302). Storrs, CT: University of Connecticut.

Pegnato, W., & Birch, J. (1959). Locating gifted children in junior high school. Exceptional Children. 25. 300-304.

Pelham, W. E., Gnagy, E. M., Greenslade, K. E., & Milich, R. (1992). Teacher ratings of DSM-III-R symptoms for the disruptive behavior disorders. Journal of the American Academy of Child and Adolescent Psychiatry. 31. 210-218.

Pendarvis, E. D., Howley, A. A., & Howley, C. B. (1990). The abilities of gifted children. Englewood Cliffs, NJ: Prentice Hall.

Perrone, P., & Male, R. (1981). The developmental education and guidance of talented learners. Rockville, MD: Aspen Publications.

Ramirez, M., & Castaneda, A. (1974). Cultural democracy, bicognitive development, and education. New York: Academic Press.

Renzulli, J. (1973). Talent potential in minority group students. Exceptional Children. 39. 437-444.

Renzulli, J., & Hartman, R. (1971). Scale for rating behavioral characteristics of superior students. Exceptional Children. 38. 243-248.

Rickard, K. M., Forehand, R., Wells, K. C., Griest, D. L., & McMahon, R. J. (1981). Factors in referral of children for behavioral treatment: A comparison of mothers of clinic-referred deviant, clinic-referred nondeviant and nonclinic children. Behavior Research and Therapy. 19. 201-205.

Russikoff, K. A. (1994). Hidden expectations: Faculty perceptions of SLA and ESL writing competence. Baltimore, MD: Paper presented at the Annual Meeting of the Teachers of English to Speakers of Other Languages.

Shade, B. J. (1991). African American patterns of cognition. In R. L. Jones (Ed.), Black Psychology (pp. 231-247). Berkeley, CA: Cobb & Henry.

Shaklee, B. D., Barbour, N., Ambrose, R., Rohrer, J., Whitmore, J. R., & Viechnicki, K. J. (1994). Early assessment for exceptional potential in young minority and/or economically disadvantaged students. In C. M. Callahan, C. A. Tomlinson, & P. M. Pizzat (Eds.), Contexts for promise: Noteworthy practices and innovations in the identification of gifted students (pp. 22-42). Charlottesville, VA: University of Virginia, The National Research Center on the Gifted and Talented.

Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. Newbury Park, CA: Sage.

Sigafoos, J., & Pennell, D. (1995). Parent and teacher assessment of receptive and expressive language in preschool children with developmental disabilities. Education and Training in Mental Retardation and Developmental Disabilities. 18. 329-35.

Stanley, J. C. (1976). Tests better finder of great math talent than teachers are. American Psychologist. 31. 313-314.

Sternberg, R. J. (1985). Beyond IQ: A triarchic theory of human intelligence. Cambridge: Cambridge University Press.

Stone, W., & Rosenbaum, J. L. (1988). A comparison of teacher and parent views of autism. Journal of Autism and Developmental Disorders. 18. 403-414.

Taylor, O. L. (1990). Cross-cultural communication: An essential dimension of effective education. Washington, DC: The Mid-Atlantic Equity Center.

Thompson, B. (1989). Why generalizability coefficients are an essential aspect of reliable assessment. Houston, TX: Paper presented at the meeting of the Southwest Educational Research Association.

Tonemah, S. A., & Brittan, M. A. (1985). American Indian gifted and talented assessment model. Norman, OK: American Indian Research and Development.

Torrance, E. P. (1969). Creative positives of disadvantaged children and youth. Gifted Child Quarterly. 13. 71-81.

Treffinger, D., & Renzulli, J. S. (1986). Giftedness as a potential for creative productivity: Transcending IQ scores. Roeper Review. 8, 150-154.

Wall, S. M., & Paradise, L. V. (1981). A comparison of parent and teacher reports of selected adaptive behaviors of children. Journal of School Psychology. 19. 73-77.

Webster-Stratton, C. (1988). Mothers' and fathers' perceptions of child deviance: Roles of parent and child behaviors and parent adjustment. Journal of Counseling and Clinical Psychology. 56. 909-915.

Weigle, S. C. (1994). Using FACETS to model rater training effects. Washington, D.C.: Paper presented at the Language Testing Research Colloquium.

Williams, B. L., & Hartlage, L. C. (1988). Communication and retention of psychoeducational diagnostic information in parent conferences. Atlanta, GA: Paper presented at the Annual Meeting of the American Psychological Association.

Wright, D., & Piersel, W. (1992). Components of variance in behavior ratings from parents and teachers. Journal of Psychoeducational Assessment. 10. 310-318.

Zappia, I. A. (1989). Identification of gifted Hispanic students: A multidimensional view. In C. J. Maker & S. W. Schiever (Eds.), Critical issues in gifted education: Defensible programs for cultural and ethnic minorities. Austin, TX: Pro-Ed.

APPENDIX A

THE LOOKING FOR TRAITS, ATTRIBUTES AND BEHAVIORS

STUDENT REFERRAL FORM

The National Research Center on the Gifted and Talented at The University of Georgia

Looking for Traits, Attributes and Behaviors Student Referral Form

Name of Student:          School:          Grade:

Gender: M          Birthdate:

Student Ethnicity (Circle One): American Indian 1   Asian/Pacific Isl. 2   Black 3   Hispanic 4   White 5

Primary Language Spoken at Home:

Name of Person Completing Form:

(Circle One) Classroom Teacher   PT Teacher   Parent   Other (specify)

Directions: Please rate the student being referred for assessment in each of the following areas. Circle the appropriate number and provide specific example(s) or comment(s) for each trait, attribute or behavior. Specific examples must be given for a rating of 1 or 5. The attached TABs Observation Sheet may assist you in completing this form.

Communication
•unusual ability to communicate (verbally, nonverbally, physically, artistically, symbolically)
•uses particularly apt examples, illustrations, or elaborations

In this area, the student is:

Strong Average Weak

1

Motivation
•persistent in pursuing/completing self-selected tasks (may be culturally influenced); evident in school or non-school type activities
•enthusiastic learner
•has aspirations to be somebody, do something


Strong Average Weak

Interests
•unusual or advanced interests in a topic or activity
•self-starter
•pursues an activity unceasingly
•beyond the group

In this area, the student is:

Strong Average Weak

Problem solving ability
•unusual ability to devise or adapt a systematic strategy for solving problems and to change the strategy if it is not working
•creates new designs
•inventor/innovator


Strong Average Weak

5 4 2 1

Specific example(s)

Memory
•already knows
•1-2 repetitions for mastery
•has a wealth of information about school or non-school topics
•pays attention to details
•manipulates information


Strong Average Weak

5 4 2 1

Specific example(s)

Humor
•keen sense of humor that may be gentle or hostile
•large accumulation of information about emotions
•heightened capacity for seeing unusual relationships
•unusual emotional depth
•openness to experiences
•heightened sensory awareness

In this area, the student is:

Strong Average Weak

Specific example(s)

Inquiry
•asks unusual questions for age
•plays around with ideas
•extensive exploratory behaviors directed toward eliciting information about materials, devices or situations


Strong Average Weak

4

Specific example(s)

Insight
•has exceptional ability to draw inferences
•appears to be a good guesser
•is keenly observant
•integrates ideas and disciplines


Strong Average Weak

Specific example(s)

APPENDIX B

ANOVA SUMMARY TABLES AND

GENERALIZABILITY CALCULATIONS FOR ALL RATINGS

Table 17. ANOVA Summary Table for Combined Ratings, Total Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S              126        70993.16667           546.26667        4.33545
R                2        70463.07795            16.17795        8.08896
I                9        70474.40682            27.50682        3.05631
SR             252        71515.70000           506.35538        2.00935
SI            1134        71389.66667           368.99318         .32539
RI              18        70504.74803            14.16325         .78685
SRI           2268        72481.00000           554.63675         .24455
Mean                      70446.90000
Total         3809                             2034.10000
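The score-effect sums of squares in Table 17 can be recovered from the sums of squares for mean scores by inclusion-exclusion over the facets, and each mean square is a score-effect sum of squares divided by its degrees of freedom. The following is a minimal sketch of that arithmetic, assuming a fully crossed students x raters x items design with 127 students, 3 raters, and 10 items; the names `T` and `score_effect_ss` are illustrative and do not come from GENOVA.

```python
from itertools import combinations

# Sums of squares for mean scores from Table 17; the key "" is the grand mean.
T = {
    "": 70446.90000,
    "S": 70993.16667, "R": 70463.07795, "I": 70474.40682,
    "SR": 71515.70000, "SI": 71389.66667, "RI": 70504.74803,
    "SRI": 72481.00000,
}

# Assumed sample sizes: students, raters, items.
n = {"S": 127, "R": 3, "I": 10}


def score_effect_ss(effect):
    """Inclusion-exclusion over every subset of the facets in `effect`."""
    total = 0.0
    for k in range(len(effect) + 1):
        sign = (-1) ** (len(effect) - k)
        for subset in combinations(effect, k):
            total += sign * T["".join(subset)]
    return total


def degrees_of_freedom(effect):
    """Product of (sample size - 1) over the facets in `effect`."""
    df = 1
    for facet in effect:
        df *= n[facet] - 1
    return df


for effect in ["S", "R", "I", "SR", "SI", "RI", "SRI"]:
    ss = score_effect_ss(effect)
    df = degrees_of_freedom(effect)
    print(f"{effect:4s} df={df:5d} SS={ss:10.5f} MS={ss / df:8.5f}")
```

Under these assumptions the student effect has 126 degrees of freedom, which is the value that reproduces the tabled mean square of 4.33545.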

Table 18. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, Total Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S              126      .0748420      .0748420      .0190240
R                2      .0043601      .0043601      .0045102
I                9      .0057444      .0057444      .0034825
SR             252      .1764798      .1764798      .0178448
SI            1134      .0269473      .0269473      .0051543
RI              18      .0042701      .0042701      .0019601
SRI           2268      .2445488      .2445488      .0072588
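The variance components in Table 18 are consistent with solving the random-model expected-mean-square equations for the students x raters x items design, using the mean squares in Table 17. The sketch below assumes 127 students, 3 raters, and 10 items; the function name `variance_components` is illustrative and is not a GENOVA routine.

```python
# Mean squares from Table 17 (combined ratings, total sample).
ms = {"S": 4.33545, "R": 8.08896, "I": 3.05631,
      "SR": 2.00935, "SI": 0.32539, "RI": 0.78685, "SRI": 0.24455}

# Assumed sample sizes: students, raters, items.
n_s, n_r, n_i = 127, 3, 10


def variance_components(ms, n_s, n_r, n_i):
    """Solve the random-model EMS equations for a crossed s x r x i design."""
    vc = {"SRI": ms["SRI"]}
    vc["SR"] = (ms["SR"] - ms["SRI"]) / n_i
    vc["SI"] = (ms["SI"] - ms["SRI"]) / n_r
    vc["RI"] = (ms["RI"] - ms["SRI"]) / n_s
    vc["S"] = (ms["S"] - ms["SR"] - ms["SI"] + ms["SRI"]) / (n_r * n_i)
    vc["R"] = (ms["R"] - ms["SR"] - ms["RI"] + ms["SRI"]) / (n_s * n_i)
    vc["I"] = (ms["I"] - ms["SI"] - ms["RI"] + ms["SRI"]) / (n_s * n_r)
    return vc


components = variance_components(ms, n_s, n_r, n_i)
for effect in ["S", "R", "I", "SR", "SI", "RI", "SRI"]:
    print(f"{effect:4s} {components[effect]:.7f}")
```

Because the mean squares are entered to five decimal places, the results agree with the tabled components only up to rounding.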

Table 19. ANOVA Summary Table for Combined Ratings, Anglo Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               72        41799.10000           302.25571        4.19800
R                2        41503.98493             7.14064        3.57032
I                9        41516.78995            19.94566        2.21618
SR             144        42069.10000           262.85936        1.82541
SI             648        42028.33333           209.28767         .32297
RI              18        41535.82192            11.89132         .66063
SRI           1296        42617.00000           306.77534         .23671
Mean                      41496.84429
Total         2189                             1120.1557

Table 20. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, Anglo Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               72      .0762106      .0762106      .0240914
R                2      .0018096      .0018096      .0034825
I                9      .0067091      .0067091      .0044201
SR             144      .1588703      .1588703      .0213850
SI             648      .0287551      .0287551      .0067272
RI              18      .0058071      .0058071      .0028646
SRI           1296      .2367094      .2367094      .0092917

Table 21. ANOVA Summary Table for Combined Ratings, Asian Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S                4         2926.46667             4.84000        1.21000
R                2         2929.48000             7.85333        3.92667
I                9         2923.60000             1.97333         .21926
SR               8         2941.20000             6.88000         .86000
SI              36         2942.66667            14.22667         .39519
RI              18         2936.80000             5.34667         .29704
SRI             72         2980.00000            17.25333         .23963
Mean                       2921.62667
Total          149                               58.37333

Table 22. Generalizability Calculations for Combined Ratings, Asian Sample.

          Variance Components   Finite        D-study      Variance Components
          for Single            Universe      Sampling     for Mean Scores
Effect    Observation           Corrections   Frequencies  Estimates  Standard Errors

S         .00                   1.00          1            .00        .02
R         .06                   1.00          3            .02        .02
I         .00                   1.00          10           .00        .00
SR        .06                   1.00          3            .02        .01
SI        .03                   1.00          10           .00        .00
RI        .01                   1.00          30           .00        .00
SRI       .24                   1.00          30           .01        .00
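The mean-score estimates in Table 22 are consistent with dividing each single-observation variance component by its D-study sampling frequency. The sketch below illustrates that step with the rounded values printed in the table; the names `single`, `freq`, and `mean_score` are illustrative, and the error variances computed at the end use the standard relative and absolute definitions for a crossed design rather than figures reported in the table.

```python
# Single-observation variance components and D-study sampling frequencies
# as printed in Table 22 (rounded to two decimal places).
single = {"S": 0.00, "R": 0.06, "I": 0.00, "SR": 0.06,
          "SI": 0.03, "RI": 0.01, "SRI": 0.24}
freq = {"S": 1, "R": 3, "I": 10, "SR": 3, "SI": 10, "RI": 30, "SRI": 30}

# Variance component for mean scores = component / sampling frequency.
mean_score = {effect: single[effect] / freq[effect] for effect in single}

# Relative error variance: interactions that involve students.
relative_error = mean_score["SR"] + mean_score["SI"] + mean_score["SRI"]
# Absolute error variance: every component except the student effect.
absolute_error = sum(v for effect, v in mean_score.items() if effect != "S")

for effect in ["S", "R", "I", "SR", "SI", "RI", "SRI"]:
    print(f"{effect:4s} {mean_score[effect]:.2f}")
print(f"relative error variance: {relative_error:.3f}")
print(f"absolute error variance: {absolute_error:.3f}")
```

With only five students in this sample and inputs rounded to two places, the resulting error variances are rough illustrations only.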

Table 23. ANOVA Summary Table for Combined Ratings, African-American Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               16         9826.23333            49.19608        3.07475
R                2         9781.75882             4.72157        2.36078
I                9         9782.84314             5.80588         .64510
SR              32         9903.10000            72.14510        2.25433
SI             144         9876.33333            44.29412         .30760
RI              18         9794.52941             6.96471         .38693
SRI            288        10033.00000            72.83529         .25290
Mean                       9777.03725
Total          509                              255.96275

Table 24. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, African-American Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               16      .0255174      .0255174      .0387469
R                2     -.0001634     -.0001634      .0103587
I                9      .0039897      .0039897      .0059594
SR              32      .2001634      .2001634      .0547208
SI             144      .0182326      .0182326      .0138933
RI              18      .0078840      .0078840      .0073028
SRI            288      .2529003      .2529003      .0210022

Table 25. ANOVA Summary Table for Combined Ratings, Hispanic Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               31        16441.36667           147.59062        4.76099
R                2        16308.52813            14.75208        7.37604
I                9        16300.36458             6.58854         .73206
SR              62        16602.30000           146.18125        2.35776
SI             279        16542.33333            94.37813         .33827
RI              18        16320.09375             4.97708         .27650
SRI            558        16851.00000           142.75625         .25584
Mean                      16293.77604
Total          959                              557.22396

Table 26. Variance Components for Combined Ratings: Classroom Teachers, Gifted and Talented Program Teachers, and Parents, Hispanic Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               31      .0773596      .0773596      .0414799
R                2      .0156175      .0156175      .0163532
I                9      .0038866      .0038866      .0033935
SR              62      .2101927      .2101927      .0417078
SI             279      .0274791      .0274791      .0107919
RI              18      .0006459      .0006459      .0027739
SRI            558      .2558356      .2558356      .0152891

Table 27. ANOVA Summary Table for Teacher Ratings, Total Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S              125        46657.90000           509.22698        4.07382
R                1        46163.30159            14.62857       14.62857
I                9        46157.18254             8.50952         .94550
SR             125        46906.60000           234.07143        1.87257
SI            1125        46987.00000           320.59048         .28497
RI               9        46176.33333             4.52222         .50247
SRI           1125        47498.00000           257.77778         .22914
Mean                      46148.67302
Total         2519                             1349.32698

Table 28. Variance Components for Teacher Ratings, Total Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S              125      .1072705      .1072705      .0281430
R                1      .0099069      .0099069      .0094829
I                9      .0015365      .0015365      .0018128
SR             125      .1643436      .1643436      .0235189
SI            1125      .0279168      .0279168      .0077020
RI               9      .0021693      .0021693      .0017021
SRI           1125      .2291358      .2291358      .0096526

Table 29. ANOVA Summary Table for Teacher Ratings, Anglo Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               72        27718.50000           308.72192        4.28780
R                1        27414.84384             5.06575        5.06575
I                9        27417.94521             8.16712         .90746
SR              72        27841.80000           118.23425        1.64214
SI             648        27906.00000           179.33288         .27675
RI               9        27426.57534             3.56438         .39604
SRI            648        28178.00000           145.13562         .22497
Mean                      27409.77808
Total         1459                              768.22192

Table 30. Variance Components for Teacher Ratings, Anglo Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               72      .1296444      .1296444      .0377548
R                1      .0044542      .0044542      .0056828
I                9      .0031414      .0031414      .0028949
SR              72      .1418168      .1418168      .0270252
SI             648      .0263868      .0263868      .0098744
RI               9      .0023571      .0023571      .0023196
SRI            648      .2239747      .2239747      .0124239

Table 31. ANOVA Summary Table for Teacher Ratings, Asian Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S                4         2101.70000             4.06000        1.01500
R                1         2098.00000              .36000         .36000
I                9         2100.20000             2.56000         .28444
SR               4         2102.40000              .34000         .08500
SI              36         2117.00000            12.74000         .35389
RI               9         2102.00000             1.44000         .16000
SRI             36         2128.00000             8.86000         .24611
Mean                       2097.64000
Total           99                               30.36000

Table 32. Generalizability Calculations for Teacher Ratings, Asian Sample.

          Variance Components   Finite        D-study      Variance Components
          for Single            Universe      Sampling     for Mean Scores
Effect    Observation           Corrections   Frequencies  Estimates  Standard Errors

S         .03                   1.00          1            .03        .03
R         .00                   1.00          2            .00        .00
I         .00                   1.00          10           .00        .00
SR        .00                   1.00          2            .00        .00
SI        .05                   1.00          10           .00        .00
RI        .00                   1.00          20           .00        .00
SRI       .25                   1.00          20           .01        .00

Table 33. ANOVA Summary Table for Teacher Ratings, African-American Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               15         6042.65000            48.19687        3.21312
R                1         5994.70625              .25313         .25313
I                9         5997.21875             2.76563         .30729
SR              15         6072.10000            29.19687        1.94646
SI             135         6081.50000            36.08436         .26729
RI               9         6001.93750             4.46562         .49618
SRI            135         6145.00000            29.58436         .21914
Mean                       5994.45313
Total          319                              150.54688

Table 34. Variance Components for Teacher Ratings, African-American Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               15      .0609259      .0609259      .0644609
R                1     -.0123148     -.0123148      .0045668
I                9     -.0074074     -.0074074      .0078856
SR              15      .1727315      .1727315      .0668155
SI             135      .0240741      .0240741      .0208810
RI               9      .0173148      .0173148      .0133264
SRI            135      .2191435      .2191435      .0264779

Table 35. ANOVA Summary Table for Teacher Ratings, Hispanic Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               31        10795.05000           110.32344        3.55882
R                1        10697.66563            12.93906       12.93906
I                9        10686.67188             1.94531         .21615
SR              31        10890.30000            82.31094        2.65519
SI             279        10882.50000            85.50469         .30647
RI               9        10701.59375             1.98281         .22031
SRI            279        11047.00000            67.26719         .24110
Mean                      10684.72656
Total          639                              362.27344

Table 36. Variance Components for Teacher Ratings, Hispanic Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               31      .0419131      .0419131      .0419131
R                1      .0322021      .0321371      .0330792
I                9     -.0010865     -.0014113      .0021196
SR              31      .2414091      .2414091      .0653979
SI             279      .0326837      .0326837      .0164486
RI               9     -.0006496     -.0006496      .0030037
SRI            279      .2411010      .2411010      .0203405
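In Table 36 the "Using Algorithm" and "Using EMS Equations" columns differ for the R and I effects, and the RI estimate is negative. The figures are consistent with one reading: the algorithm column carries the negative RI estimate through the remaining equations, while the EMS column replaces it with zero before solving for the main effects. The sketch below tests that reading against the mean squares in Table 35, assuming 32 students, 2 raters, and 10 items. It is an inference from the printed numbers, not a description of GENOVA's documented procedure, and the function name `solve` is illustrative.

```python
# Mean squares from Table 35 (teacher ratings, Hispanic sample).
ms = {"S": 3.55882, "R": 12.93906, "I": 0.21615,
      "SR": 2.65519, "SI": 0.30647, "RI": 0.22031, "SRI": 0.24110}

# Assumed sample sizes: students, raters, items.
n_s, n_r, n_i = 32, 2, 10


def solve(ms, zero_negatives):
    """Estimate variance components for a crossed s x r x i random design.

    When zero_negatives is True, negative interaction estimates are treated
    as zero when they are substituted into the main-effect equations.
    """
    vc = {
        "SRI": ms["SRI"],
        "SR": (ms["SR"] - ms["SRI"]) / n_i,
        "SI": (ms["SI"] - ms["SRI"]) / n_r,
        "RI": (ms["RI"] - ms["SRI"]) / n_s,
    }

    def used(effect):
        value = vc[effect]
        return max(value, 0.0) if zero_negatives else value

    vc["S"] = (ms["S"] - used("SRI") - n_i * used("SR") - n_r * used("SI")) / (n_r * n_i)
    vc["R"] = (ms["R"] - used("SRI") - n_i * used("SR") - n_s * used("RI")) / (n_s * n_i)
    vc["I"] = (ms["I"] - used("SRI") - n_r * used("SI") - n_s * used("RI")) / (n_s * n_r)
    return vc


algorithm = solve(ms, zero_negatives=False)
ems = solve(ms, zero_negatives=True)
for effect in ["S", "R", "I", "SR", "SI", "RI", "SRI"]:
    print(f"{effect:4s} {algorithm[effect]: .7f} {ems[effect]: .7f}")
```

Both columns report the negative RI estimate itself; only the estimates that depend on it change.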

Table 37. ANOVA Summary Table for Parent Ratings, Total Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S              125        47973.60000           549.54286        4.39634
O                1        47424.57143              .51429         .51429
I                9        47475.38095            51.32381        5.70265
SO             125        48046.00000            71.88571         .57509
SI            1125        48504.00000           479.07619         .42585
OI               9        47479.38095             3.48571         .38730
SOI           1125        48838.00000           258.11429         .22943
Mean                      47424.05714
Total         2519                             1413.94286

Table 38. Variance Components for Parent Ratings, Total Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S              125      .1812423      .1812423      .0278386
O                1     -.0001735     -.0001735      .0003627
I                9      .0203132      .0203132      .0096718
SOI           1125      .2294349      .2294349      .0096652

Table 39. ANOVA Summary Table for Parent Ratings, Anglo Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               72        27990.70000           268.06712        3.72315
O                1        27724.78082             2.14795        2.14795
I                9        27758.95890            36.32603        4.03623
SO              72        28039.00000            46.15205         .64100
SI             648        28329.00000           301.97397         .46601
OI               9        27762.95890             1.85205         .20578
SOI            648        28528.00000           148.84795         .22970
Mean                      27722.63288
Total         1459                              805.36712

Table 40. Variance Components for Parent Ratings, Anglo Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               72      .1422924      .1422924      .0310878
O                1      .0020971      .0020643      .0024098
I                9      .0246174      .0244535      .0118050
SO              72      .0411297      .0411297      .0106147
SI             648      .1181526      .1181526      .0144096
OI               9     -.0003277     -.0003277      .0012146
SOI            648      .2297036      .2297036      .0127417

Table 41. ANOVA Summary Table for Parent Ratings, Asian Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S                4         1730.75000            25.06000        6.26500
O                1         1705.78000              .09000         .09000
I                9         1711.50000             5.81000         .64556
SO               4         1733.10000             2.26000         .56500
SI              36         1748.50000            11.94000         .33167
OI               9         1713.00000             1.41000         .15667
SOI             36         1757.00000             4.74000         .13167
Mean                       1705.69000
Total           99                               51.31000

Table 42. Generalizability Calculations for Parent Ratings, Asian Sample.

          Variance Components   Finite        D-study      Variance Components
          for Single            Universe      Sampling     for Mean Scores
Effect    Observation           Corrections   Frequencies  Estimates  Standard Errors

S         .28                   1.00          1            .28        .18
O         .00                   1.00          2            .00        .00
I         .03                   1.00          10           .00        .00
SO        .04                   1.00          2            .02        .02
SI        .10                   1.00          10           .01        .00
OI        .00                   1.00          20           .00        .00
SOI       .13                   1.00          20           .00        .00

Table 43. ANOVA Summary Table for Parent Ratings, African-American Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               15         6612.95000            33.57188        2.23813
O                1         6580.08125              .70312         .70312
I                9         6584.28125             4.90312         .54479
SO              15         6623.30000             9.64688         .64313
SI             135         6663.50000            45.64687         .33812
OI               9         6588.56250             3.57813         .39757
SOI            135         6713.00000            35.57188         .26350
Mean                       6579.37813
Total          319                              133.62188

Table 44. Variance Components for Parent Ratings, African-American Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               15      .0760185      .0760185      .0400206
O                1     -.0004630     -.0004630      .0039922
I                9      .0022685      .0022685      .0091314
SO              15      .0379630      .0379630      .0222876
SI             135      .0373146      .0373146      .0258969
OI               9      .0083796      .0083796      .0107805
SOI            135      .2634954      .2634954      .0318367

Table 45. ANOVA Summary Table for Parent Ratings, Hispanic Sample.

        Degrees of    Sums of Squares     Sums of Squares
Effect     Freedom    for Mean Scores   for Score Effects   Mean Squares

S               31        11399.95000           202.56094        6.53422
O                1        11197.39063              .00156         .00156
I                9        11210.48436            13.09531        1.45503
SO              31        11412.10000            12.14844         .39189
SI             279        11524.50000           111.45469         .39948
OI               9        11213.15625             2.67031         .29670
SOI            279        11603.00000            63.67969         .22824
Mean                      11197.38906
Total          639                              405.61094

Table 46. Variance Components for Parent Ratings, Hispanic Sample.

                          Model Variance Components
        Degrees of         Using     Using EMS      Standard
Effect     Freedom     Algorithm     Equations         Error

S               31      .2985551      .2985551      .0805986
O                1     -.0014337     -.0014337      .0005008
I                9      .0154234      .0154234      .0099123
SO              31      .0163642      .0163642      .0098378
SI             279      .0856183      .0856183      .0194075
OI               9      .0021393      .0021393      .0039991
SOI            279      .2282426      .2282426      .0192557

APPENDIX C

GENOVA PROGRAM AND SAMPLE DATA

COLUMN
         11111111112222222222333333333344444444445555555555666666666677777777778
12345678901234567890123456789012345678901234567890123456789012345678901234567890

GSTUDY      TEACHER RATINGS FOR TOTAL SAMPLE
OPTIONS     NEGATIVE
EFFECT      * S 127
EFFECT      + R 2
EFFECT      + I 10
FORMAT      (10F2.0/10F2.0)
PROCESS
555445544433334434444
453454344334434344434
443344343343444433434
422443444344344444533
(DATA CONTINUES FOR 600 LINES)
COMMENT     1ST SET OF D STUDY CONTROL CARDS
DSTUDY      #1 - S X R X O X I - I - FIXED
DEFFECT     $ S 127
DEFFECT     R 1 2
DEFFECT     I 10
ENDDSTUDY
COMMENT     2ND SET OF D STUDY CONTROL CARDS
DSTUDY      #1 - S X R X O X I - I - FIXED
DEFFECT     $ S
DEFFECT     R
DEFFECT     I 10 9 8 5 1
ENDDSTUDY
COMMENT     3RD SET OF D STUDY CONTROL CARDS
DSTUDY      #1 - S X R X O X I - I - FIXED
DEFFECT     $ S
DEFFECT     R 1 2 3
DEFFECT     I 10
ENDDSTUDY
FINISH
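The D-study control cards vary the number of raters (1, 2, 3) and the number of items (10, 9, 8, 5, 1). As a hedged illustration of what such a sweep computes, the sketch below evaluates a generalizability coefficient for relative decisions over those sample sizes, using the teacher-rating variance components from Table 28. It assumes a two-facet students x raters x items design and leaves out the occasion facet named in the D-study titles. The function `g_coefficient` and its `items_fixed` switch are illustrative, and the fixed-items branch follows the usual textbook treatment of averaging over a fixed facet rather than GENOVA's own output.

```python
# Variance components for teacher ratings, total sample (Table 28).
vc = {"S": 0.1072705, "SR": 0.1643436, "SI": 0.0279168, "SRI": 0.2291358}


def g_coefficient(vc, n_r, n_i, items_fixed=True):
    """Generalizability coefficient for relative decisions, s x r x i design."""
    if items_fixed:
        # Averaging over a fixed item facet moves the student-by-item
        # component into universe-score variance.
        universe = vc["S"] + vc["SI"] / n_i
        relative_error = vc["SR"] / n_r + vc["SRI"] / (n_r * n_i)
    else:
        universe = vc["S"]
        relative_error = (vc["SR"] / n_r + vc["SI"] / n_i
                          + vc["SRI"] / (n_r * n_i))
    return universe / (universe + relative_error)


for n_r in (1, 2, 3):
    for n_i in (10, 9, 8, 5, 1):
        coefficient = g_coefficient(vc, n_r, n_i)
        print(f"raters={n_r} items={n_i:2d} coefficient={coefficient:.3f}")
```

Passing `items_fixed=False` treats items as a random facet instead, which lowers the coefficient because the student-by-item component then counts as error.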
