
Running Head: AN ARGUMENT FOR RASCH ANALYSIS

Measuring Factors Impacting Educator Supply and Demand:

An Argument for Rasch Analysis

Shannon O. Sampson

University of Kentucky

Kelly D. Bradley

University of Kentucky


Abstract

This study employs the Rasch model to analyze 516 responses from U.S. higher

educational institutions to the 2002 Educator Supply and Demand survey administered by

the American Association for Employment in Education. A contention is made that the

Rasch model provides a clearer picture, as compared to the conventional true score

model, by providing both a descriptive summary of the data and an assessment of the

measurement instrument. Key measurement issues identified, including recognizing 17

of the 33 questions as being problematic, suggest caution in interpreting the true score

model results. A content analysis of open-ended responses supports this assertion.

Factors perceived to be influencing teacher supply and demand are also identified.


Introduction

In order to meet the needs of the United States educational system, teacher education

programs call for an accurate account of the status of teaching fields. The American Association

for Employment in Education (AAEE) provides an annual report of educational institutions’

perceptions of educator supply and demand in the United States. The report is constructed based

upon analysis utilizing the true score model of survey items. Rating scale data are often

analyzed utilizing this model, which postulates that every measurement, or observed score, is a

composite of the true ability of the respondent on the measure and random error. It is contended

that employing the Rasch model is a better approach to the analysis. Rasch analysis produces

measures on items and respondents, and the analysis assists in identifying problematic items and

unexpected response patterns. This study utilizes the Rasch model to analyze data collected and

originally analyzed in the 2002 AAEE report. A content analysis of the qualitative comments is

used to support the quantitative results. Findings will benefit administrators of teacher

preparation programs making program decisions to address the needs of the field, teacher

candidates seeking information on the current job market, and education agencies involved with

policy decisions affecting the field. A methodological framework for educational researchers

analyzing rating scale data, including AAEE, is also provided.

Background

Teacher supply and demand

Much attention has been given to teacher retention, recruitment, and quality in the United

States (U.S.), and research has indicated that teacher preparation and qualifications are critical

factors in student achievement (Darling-Hammond, 2000). Educators, politicians, and

researchers have warned of an impending teacher shortage and the subsequent need to hire more


teachers. While teacher retirement, an increasing student population, and new classroom policies

have been commonly cited reasons for the coming teacher shortage, research suggests that a

larger part of the problem is teacher attrition (Ingersoll & Smith, 2003). Ingersoll and Smith

report that after five years of teaching, between 40% and 50% of beginning teachers leave the

profession. As quoted in No Dream Denied (National Commission, 2003), “The ability to create

and maintain a strong professional learning community in a school is limited not by teacher

supply but by high turnover among the teachers who are already there, turnover that is only

aggravated by hiring unqualified or underprepared individuals to replace those who leave” (p. 8).

Ingersoll and Smith state that efforts to recruit more teachers will not solve staffing problems if

nearly half of those teachers leave within a few years. “Pouring more water into the bucket will

not do any good if we do not patch the holes first” (p. 33).

According to Linda Darling-Hammond (2000), teacher distribution is one issue in

meeting the demands for educators. She writes, “enrollment declines are anticipated in most

parts of the Northeast and Midwest, while other states will have stable enrollments. Some states

have a large number of teacher education institutions and regularly produce more teachers.

States may not have developed aggressive recruitment strategies or reciprocity arrangements for

accepting licenses awarded in other states. As a result, it is difficult to get teachers from where

they are prepared to where they are needed” (p. 6), an indication that lack of mobility is a factor

linked to supply of teachers to meet the demand.

Darling-Hammond (2000) asserts shortages in math and science exist “largely because

the knowledge and skills required by teachers command much greater compensation in fields

outside of teaching and because there are inadequate numbers of slots in schools of education to

prepare an adequate supply of teachers in these fields” (p. 8). States and districts are working to


attract teachers to the hardest-to-staff areas by creating alternative pathways into the profession

to attract mid-career professionals and offering financial incentives such as signing bonuses,

student loan forgiveness, housing assistance, and free graduate courses (Hirsch, Koppich, &

Knapp, 2001). Ingersoll (2003) notes that raising teacher salaries may be a way to ‘plug the

holes’, but he suggests that a better alternative may be to address the working conditions identified by

new teachers as reasons for their decision to leave teaching, such as lack of administrative support, poor

student discipline and student motivation, and lack of participation in decision making. In order

for every student to have the opportunity to learn from highly qualified teachers, it is important

for educational researchers to continue to identify areas of need and reasons for shortage and for

educational institutions and educational policy makers to implement changes to address them.

Annual AAEE Educator Supply and Demand reports are designed to identify higher

education institutions’ perceptions on the causes of teacher attrition. It is essential for the report

to paint an accurate picture of supply and demand so users of the report can make best-informed

decisions. AAEE suggests many uses for the report, including developing recruitment strategies

for human resource administrators, evaluating modifications to teacher education programs, and

guiding university students in selecting a major teaching field. The majority of results in the

AAEE report are based upon means constructed through descriptive statistics of rating scale data.

Rasch Model as Opposed to the True Score Model

AAEE reports utilize true score theory in analyzing the survey rating scale data, under the

premise that every measurement is a composite of the true endorsement characteristics of the

respondent on the measure and random error. As noted in Smith (2000), the true score model

has many deficiencies beginning with the issue of sample-dependence between estimates of an

item’s difficulty to endorse and a respondent’s willingness to endorse, making the estimates for


the items depend on the severity of the respondents in the sample. Moreover, the estimates of

item difficulty cannot be directly compared unless the estimates come from the same sample or

assumptions are made about the comparability of the different samples. The true score approach

requires complete records to make comparisons of items. Moreover, a single standard error

of measurement is produced for all the scores, making it inadequate and potentially misleading.
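For reference, the true score model described above can be written compactly. The notation below is the conventional classical test theory notation and is an assumption here, since the AAEE report itself does not print these equations: X is the observed score, T the true score, E random error, s_X the standard deviation of observed scores, and r_XX' the reliability coefficient.

```latex
X = T + E, \qquad \mathrm{SEM} = s_X \sqrt{1 - r_{XX'}}
```

Because the standard error of measurement depends only on sample-level quantities, the same value is attached to every score, which is the limitation noted in the preceding paragraph.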

The Rasch model, named after Georg Rasch and introduced in 1960, addresses many of

the weaknesses of the true score approach. Specific to rating scale data, it connects observations

of respondents and items in a way that indicates the occurrence of an event as probability rather

than certainty and maintains order in that the probability of providing a certain response defines

an order of respondents and items. These circumstances create the probabilistic version of the

scalogram, indicating that a person endorsing a more extreme statement should also endorse all

less extreme statements, and that an easy-to-endorse item is always expected to be rated higher

by any respondent (Wright and Masters, 1982). In contrast to the true score model, parameters in

the Rasch model are neither sample nor test dependent, so missing data are not problematic.

Another improvement is that Rasch measurement produces standard error estimates for each

discrete raw score, allowing for a reliability coefficient to be calculated for the instrument and

the respondents. Persons and items are measured on the same metric, allowing for a probability

expression to be calculated that makes it possible to combine any person’s estimated measure

with any item’s estimated measure to produce expected response values. As well, Rasch analysis

provides estimates for persons and items that are freed from the sampling distribution of the

sample employed, meaning there is no dependence on the particulars of the questionnaire nor of

the sample being measured (Smith, E., 2000; Wright, 1997; Wright and Masters, 1982).
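The probability expression referred to above is not printed in the manuscript. A common formulation for rating scale data is the rating scale form of the Rasch model described in Wright and Masters (1982), and it is assumed here for illustration. In this sketch, B_n is the willingness of respondent n to endorse, D_i is the difficulty of endorsing item i, F_k is the threshold between categories k-1 and k, and P_nik is the probability that respondent n selects category k on item i.

```latex
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k
```

Because B_n and D_i enter the expression on the same logit scale, any person measure can be combined with any item measure to obtain expected response values.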


This study examines the 2002 AAEE Educator Supply and Demand data set. Here the

data are analyzed employing Rasch analysis, utilizing the methodology framework outlined by

Sampson and Bradley (2002). The results and conclusions yield information that enhances the

AAEE report, benefiting education agencies involved with policy decisions affecting the field.

The findings can also be used to improve the AAEE instrument and similar instruments as the

study provides insight into the quality of the measurement instrument and its assessment.

Method

Population and Sample

The data analyzed in this study originated from the 26th annual AAEE survey of Educator

Supply and Demand in the United States. The population was composed of institutions listed in

the Higher Education Directory, including AAEE members and non-members. AAEE assumes

that the opinions and perceptions of university directors of career services and other teacher

education program administrators accurately reflect the job market their students are entering.

Periodically, regional studies of employers are conducted to support this assumption. These

studies have consistently validated the data provided by colleges and universities (AAEE, 2002).

In May of 2002, a selected response pencil-and-paper questionnaire was mailed to 1,267

institutions of higher education. The instrument was mailed to the career services director at

each institution responsible for the planning and placement of graduates in teacher education and

related careers, or to deans and directors of teacher education in universities. In June, a second

mailing was conducted with non-responding institutions. In total, 516 questionnaires were received,

resulting in a 40.7% response rate; 498 questionnaires were useable.

The data were assessed for representativeness of the return sample on variables including

AAEE membership (member versus non-member) and response wave (early returns versus on-time

returns versus late returns). The sample was deemed representative of the population, with

the exception of AAEE members being overrepresented. Subsequently, the sorted AAEE and

non-AAEE members were compared across the 33 factors, with respect to how each impacted

the supply and demand of educators, in order to evaluate whether or not differential responses

could be identified between the two groups. Since the majority of items were not statistically

different (t-tests, α = .05), the responses were aggregated into an overall dataset for the Rasch

analysis, as was done with the original AAEE true score theory analysis.
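The item-by-item group comparison described above can be sketched as follows. This is a minimal illustration, not the procedure actually run for the study: the manuscript does not state which form of t-test was used, so Welch's unequal-variance statistic is assumed, the p-value uses a large-sample normal approximation, and the ratings and item names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a) / na, variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def approx_two_sided_p(t):
    """Large-sample normal approximation to the two-sided p-value."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Hypothetical 1-5 ratings for two items, split by AAEE membership.
ratings = {
    "item_a": {"member": [2, 3, 3, 4, 2, 3, 3, 4],
               "non_member": [3, 3, 2, 4, 3, 2, 3, 4]},
    "item_b": {"member": [1, 2, 2, 1, 2, 1, 2, 2],
               "non_member": [4, 4, 5, 4, 3, 4, 5, 4]},
}

alpha = 0.05
flagged = []  # items on which the two groups appear to respond differently
for item, groups in ratings.items():
    t, df = welch_t(groups["member"], groups["non_member"])
    if approx_two_sided_p(t) < alpha:
        flagged.append(item)

print(flagged)
```

With samples as small as those in this toy example, the t distribution with the returned degrees of freedom would be preferable to the normal approximation; the approximation is used only to keep the sketch free of external dependencies.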

Instrumentation

The selected response pencil-and-paper survey and the subsequent data collection were

developed and conducted by AAEE. A survey questionnaire was chosen as the method of data

collection due to the availability of a sample frame, the quantity of information desired, and the

necessity to obtain information in a timely manner. This study utilizes the portion of the survey

that focuses on teacher education program personnel’s perceptions of the relative weights of

national K-12 and higher education factors on teacher supply and demand for 2002. Using a 5-point

Likert-type scale ranging from “Significant Positive Influence” (1) to “Significant Negative

Influence” (5), respondents were asked to rate the perceived influence of 28 factors on supply and

demand. Supply is defined as “the number of individuals with at least a baccalaureate or higher

degree and other minimal requirements who are willing to supply their services as educators but

are not currently providing education services, new graduates as well as previous graduates,”

and demand is “the total number of positions needed to be filled by certified/licensed (or eligible

for licensure) personnel in an educational setting” (AAEE, 2002). Surveys of school district

human resources directors in three regions of the country were found to correlate highly

with the AAEE study, indicating excellent reliability of the data.


Data Analysis

The study has two components. It begins with a quantitative component that replicates

an analysis used in Sampson and Bradley (2003), applying a one-parameter Item Response

Theory model, commonly known as the Rasch Model, using WINSTEPS software (Wright and

Linacre, 2000 version 3.02). The Rasch model is based upon the difficulty of a set of items,

assuming that item difficulty is the only item characteristic influencing responses (Linacre,

1999). Here, two facets are involved: the instrument’s items and the respondents. From a Rasch

perspective, a respondent’s severity interacts with an item’s difficulty to

produce an observed outcome, the rating assigned (Linacre, 2002).

WINSTEPS produces various tables displaying the raw scores, measures, and ZSTDs for

the factors in the survey and the institutions responding. ZSTDs are mean-square fit statistics

standardized to approximate a theoretical mean 0 and standard deviation 1. INFIT ZSTDs are

sensitive to irregular inlying patterns and OUTFIT ZSTDs are sensitive to unexpected rare

extremes. Items and participating institutions’ responses that do not adequately fit the model

requirements are identified using the ZSTD scores. Item profiles are presented so that, if a person

knows the measure of an institution, a prediction about that institution’s responses for each of the

survey items can be made. Maps of the distribution of the institution measure against the item

measure are provided, and the category probabilities are produced.
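To make the fit statistics concrete, the sketch below computes INFIT and OUTFIT mean-squares for one item and standardizes them with the Wilson-Hilferty cube-root transformation, following the formulas given in Wright and Masters (1982). It is an illustration only: WINSTEPS performs these calculations internally, the function names are hypothetical, categories are indexed from 0 rather than 1, and the model probabilities and responses are made-up values.

```python
def moments(probs):
    """Expected score, variance, and fourth central moment of one response,
    given model probabilities for categories 0..m."""
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    fourth = sum((k - expected) ** 4 * p for k, p in enumerate(probs))
    return expected, variance, fourth


def standardize(mnsq, q):
    """Wilson-Hilferty transformation of a mean-square to an approximate
    unit-normal deviate (ZSTD); q is the model SD of the mean-square."""
    return (mnsq ** (1 / 3) - 1) * (3 / q) + q / 3


def item_fit(observed, prob_vectors):
    """INFIT/OUTFIT mean-squares and ZSTDs for one item across respondents."""
    n = len(observed)
    stats = [moments(p) for p in prob_vectors]
    sq_resid = [(x - e) ** 2 for x, (e, _, _) in zip(observed, stats)]
    sum_w = sum(w for _, w, _ in stats)
    # OUTFIT: unweighted mean of squared standardized residuals.
    outfit = sum(r / w for r, (_, w, _) in zip(sq_resid, stats)) / n
    # INFIT: squared residuals weighted by the model variance.
    infit = sum(sq_resid) / sum_w
    q_outfit = (sum(c / w ** 2 for _, w, c in stats) / n ** 2 - 1 / n) ** 0.5
    q_infit = sum(c - w ** 2 for _, w, c in stats) ** 0.5 / sum_w
    return {
        "outfit_mnsq": outfit,
        "outfit_zstd": standardize(outfit, q_outfit),
        "infit_mnsq": infit,
        "infit_zstd": standardize(infit, q_infit),
    }


# Hypothetical model probabilities for a 5-category item (same for 4 respondents).
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
fit = item_fit([1, 3, 1, 3], [probs] * 4)
print(fit)
```

Mean-squares near 1 and ZSTDs near 0 indicate responses about as variable as the model predicts; large positive ZSTDs signal unexpected responses, and large negative ZSTDs signal responses that are more predictable than the model expects.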

The second component is qualitative, a content analysis of the free-response section of

the survey. Respondents’ comments were coded and then organized under emerging categories

based on the comments, independent of the topics listed in the survey. Assertions were drawn

from the comments within each category, and these were used to support the results of the Rasch

analysis and provide insight into possible revisions that could improve the 2002 AAEE survey.


Results

Thirty-three factors were rated on the 5-point Likert-type scale. Parameters that were

estimated include: 496 institution measures, 33 item measures, and 4 category thresholds relating

to the transition points between the five response categories. Analyses began with consideration

of whether the data fit the model (see Table 1). The INFIT and OUTFIT ZSTDs produced in the

Rasch analysis assist in identifying the items on the instrument that are problematic. If the data

fit the model, or cooperate with Rasch model specifications (Wright, 1994), these statistics will

have a mean of zero and a standard deviation of 1. The response categories OUTFIT ZSTDs, the

standardized unweighted item and person fit statistics, are sensitive to unexpected rare extremes.

If the data fit the model, these statistics are approximately t-statistics, also with an expected

mean of 0 and a standard deviation of 1. The mean OUTFIT ZSTD for the persons is -.5 and the

standard deviation is 2.4, so the mean fit is lower than expected and there is greater variability in

the fit of respondents than expected, indicating that many of the institutions responded in a manner

that was inconsistent. The mean OUTFIT ZSTD for the items is -.1, and the standard deviation is 3.2.

Again, the mean is close to the expected value of zero, but the higher standard deviation suggests

that there are some problematic items in the measurement tool (Wright and Masters, 1982). It

should also be noted that the mean item measure is 0.0 and the mean person measure is -.05.

While the mean for items is always set at 0.0, similar to a standard score, the person mean varies.

Thus, when the person mean is negative, items are generally more difficult to agree with, and

vice versa when the person mean is positive. The mean person measure for these data is only slightly

negative, suggesting that these items were well matched to the perceptions of the sample.

The next overall statistic to review is the separation, or the spread of person positions or

item positions. The real person separation is 2.28 and the real item separation is 5.94. Real


refers to the estimated standard error of the measurements being adjusted for any misfit

encountered in the data. If separation is 1.0 or less, the items may not have enough spread,

suggesting less variability of the persons on the trait or redundancy of items. The real person

separation suggests the rating scale discriminates well between the respondents and the real item

separation suggests that the items are creating a well-defined variable.
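The separation indices reported above can be translated into the more familiar reliability metric. The sketch below uses the standard Rasch relationships R = G^2 / (1 + G^2) for reliability and (4G + 1) / 3 for the number of statistically distinct strata; the function names are illustrative, and the resulting values are derived here rather than taken from the WINSTEPS output.

```python
def reliability_from_separation(separation):
    """Reliability implied by a separation index G: R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


def strata(separation):
    """Number of statistically distinct levels: (4G + 1) / 3."""
    return (4 * separation + 1) / 3


person_separation = 2.28  # real person separation reported in the text
item_separation = 5.94    # real item separation reported in the text

person_reliability = reliability_from_separation(person_separation)
item_reliability = reliability_from_separation(item_separation)

print(f"person reliability: {person_reliability:.2f}")
print(f"item reliability:   {item_reliability:.2f}")
print(f"person strata:      {strata(person_separation):.1f}")
print(f"item strata:        {strata(item_separation):.1f}")
```

Under these relationships, the reported separations correspond to a person reliability of roughly .84 and an item reliability of roughly .97.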

Table 2 indicates how the participants used the response scale, illustrating the steps from

one rating scale category to another. The response scale ranged from 1 = Significant Positive

Influence to 5 = Significant Negative Influence. Observed Count indicates the number of times

the category was selected across all items and persons. Respondents were not likely to endorse a

1 or 5 with only 7% of the responses in the 1 category and 6% in the 5 category. Furthermore, a

1 or 5 response may stand out as differing from the expectation. The mean-square fit statistic,

MNSQ, is always less than 1.2, so it does not appear that any substantial misfit occurs. The step

calibration is expected to increase with category value, and it does. It also shows the steps are

similar in size, with the largest step of 1.62 logits from category 1 to 2. Another way of viewing

these steps is by the use of probability curves (Figure 1). These curves display the likelihood of

category selection, along the y-axis, by the person-minus-item measure, along the x-axis. If the

difference were -1.0, the most likely response would be a 2, closely followed by a 3. If all

categories are utilized, each category value will be the most likely at some point, and no curves

will be inverted.
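The category probability curves can be reproduced from the step calibrations. The following sketch assumes the rating scale form of the Rasch model and uses hypothetical, symmetric thresholds; the actual step calibrations appear in Table 2 and are not repeated here. Category index 0 corresponds to a rating of 1, and index 4 to a rating of 5.

```python
from math import exp


def category_probabilities(person_minus_item, thresholds):
    """Probability of each category 0..m under the rating scale model,
    given the person-minus-item measure and the m step thresholds."""
    cumulative = [0.0]
    for threshold in thresholds:
        cumulative.append(cumulative[-1] + (person_minus_item - threshold))
    weights = [exp(value) for value in cumulative]
    total = sum(weights)
    return [weight / total for weight in weights]


def most_likely_rating(person_minus_item, thresholds):
    """Rating (1-based) with the highest model probability."""
    probs = category_probabilities(person_minus_item, thresholds)
    return probs.index(max(probs)) + 1


def expected_rating(person_minus_item, thresholds):
    """Model-expected rating on the 1-based scale."""
    probs = category_probabilities(person_minus_item, thresholds)
    return sum((k + 1) * p for k, p in enumerate(probs))


# Hypothetical step calibrations in logits (not the values from Table 2).
thresholds = [-1.6, -0.5, 0.5, 1.6]

for difference in (-3.0, -1.0, 0.0, 1.0, 3.0):
    probs = category_probabilities(difference, thresholds)
    print(difference, [round(p, 2) for p in probs],
          most_likely_rating(difference, thresholds))
```

Plotting each category's probability against the person-minus-item measure yields curves of the kind shown in Figure 1; when the thresholds are ordered, every category is the most likely response over some interval of the measure.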

Responses can also be estimated through the use of Figure 2. The institution measures

are located on the horizontal axis. Drawing a vertical line through the institution measure and

identifying the response categories nearest that line reveals the institution’s most likely

responses. Using Figure 2, an institution with a measure of approximately 0 would be expected


to assign a 4 to “increasing our teacher ed. enrollments,” a 3 to “early retirement through state

funding,” and a 2 to “working conditions and school violence.” This chart can be used in

conjunction with actual responses to become aware of idiosyncrasies of the respondents.

WINSTEPS produces a table of items in order of worst to best fitting. While there are no

given rules for acceptable and unacceptable fit, Smith (1992) recommends that standardized infit

or outfit be between -2 and +2 ZSTD. INFIT is weighted by the distance between the person

position and item difficulty and OUTFIT is unweighted and sensitive to outliers, or extreme

unexpected responses. ZSTD values greater than +2 are considered to be “noisy,” meaning that

unexpected or unrelated irregularities exist, and values less than -2 are considered to be “muted,”

and often result from dependence or redundancy among items (Linacre, 2000). Inspection of the

OUTFIT ZSTDs indicates that 17 of the 33 items fall outside the conventional range of -2 to +2. Eight

of the misfitting items, including teacher salaries and early retirement, have an OUTFIT ZSTD

greater than 2, signifying high variability. Nine of the items, including personal career shifts,

federal mandates, and shifts of teachers, have an OUTFIT ZSTD of less than -2, indicating little

variability for the probabilistic model. These items will be discussed in detail below.
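The screening rule applied here can be expressed as a short routine. The sketch below is illustrative only: the four positive OUTFIT ZSTD values are those cited in the Discussion, while the last two entries are hypothetical placeholders, because the ZSTD values for the overfitting items are not reported in the text.

```python
def classify_fit(zstd, cutoff=2.0):
    """Label an item by its standardized fit, using the +/-2 guideline."""
    if zstd > cutoff:
        return "noisy"
    if zstd < -cutoff:
        return "muted"
    return "acceptable"


outfit_zstd = {
    "Teacher Salaries, Supply": 5.7,
    "Mobility of New Graduates, Supply": 5.5,
    "State Mandates, Supply": 2.6,
    "Early Retirement, Demand": 2.4,
    "Hypothetical overfitting item": -2.5,   # placeholder value
    "Hypothetical well-fitting item": 0.3,   # placeholder value
}

# List items from worst to best fitting, as in the WINSTEPS misfit table.
for item, zstd in sorted(outfit_zstd.items(), key=lambda kv: -abs(kv[1])):
    print(f"{item:38s} {zstd:5.1f}  {classify_fit(zstd)}")
```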

Discussion

According to Linacre (2000), poor category wording can lead to noisy and muted outfit.

The review of the institutions’ comments supports the assertion that poor wording is the cause

for the large number of misfits: “I found the previous page to be rather confusing. Perhaps it

should be worded differently in the future,” and “This survey is terribly confusing! The

supply/demand portion could be interpreted in a variety of ways.” Confusing wording is a

probable cause of a large number of respondents being labeled as ‘misfits’, as indicated by the

unexpected response pattern. Even as the wording of the survey seems to be the cause of the


large number of misfits, the items are reviewed for other issues. Eight items are identified as not

fitting well with the rest of the scale, by an OUTFIT ZSTD greater than 2. They are (1) Teacher

Salaries, Supply; (2) Mobility of New Graduates, Supply; (3) State funding, Demand; (4)

Economic Conditions, Supply; (5) State Mandates, Supply; (6) Postponed Retirement, Demand;

(7) Early Retirement, Demand; and (8) State Funding, Supply.

The content analysis of the qualitative response variables provides insight as to why these

might be listed as misfitting items. Comments about teacher salaries vary from state to state,

ranging from, “The most important factor for [our state] is salaries. If teachers made a decent

salary, we would have many more students to choose education as a career,” and “salaries and

working conditions continue to stifle growth in students majoring in education related fields” to

“salaries are excellent in [our state],” “more considering field due to increasing salaries being

offered,” and “education enrollments have increased dramatically in response to the uncertainty

of the economy, and salaries and benefits that are much better than similar occupations in the

private sector.” The variation of responses may be cause for the high OUTFIT ZSTD of 5.7.

Teacher mobility may have an OUTFIT ZSTD of 5.5 due to different interpretations of

supply and demand. Some institutions note there would be a better supply of educators if they

were willing to work in high demand areas such as urban schools. Many institutions note that

lack of mobility is one of the main reasons teachers do not have jobs, while others note that their

graduates often move out of state in order to find jobs. Here, mobility could be interpreted as

either a willingness to be mobile or lack thereof; in which case, answers would be quite different.

Funding was the factor referred to most often in the survey comments. Comments about

local, state, and federal funding suggest these issues may create a misleading picture of educator

demand. One comment reflects many others’, “demand for teachers is very high, but economic


conditions prevent school districts from hiring.” Funding is noted as the cause of certain subjects

being dropped, increasing class sizes, school closures, and fewer teachers being hired. Still, when

examining the survey responses, 51% of respondents listed state funding as having positive

influence on the demand of educators hired. This could be due to different interpretations of

‘positive’ and ‘negative’ influence related to supply and demand, since the comments

consistently communicate the economic situation “is having a calming influence on an otherwise

strong hiring trend.”

The Rasch analysis listed state mandates as misfitting with an OUTFIT ZSTD of 2.6; yet,

the content analysis revealed licensing requirements to have both positive and negative effects on

teacher supply. State tests, which respondents claim “are not an accurate predictor of

knowledge,” “keep minority students from entering fields of education,” and “delay entry into

practice,” would have a negative effect on supply. Still, in states where certification is granted in

grade ranges, the supply is noted as positively affecting certain grades at the expense of others.

Regional differences, as well as the type of state mandate the respondents have in mind when

rating that factor may be the cause of the misfit.

Finally, the effect of early retirement and postponed retirement on demand may have

OUTFIT ZSTDs of 2.4 due to the various reactions toward retirement. While some institutions

allude to deferred retirement reducing spending for new teacher candidates, others appear to have

a less positive reaction to postponed retirement. One institution writes, “Retirees are filling

positions that should only be filled by those qualified. Retirees should ONLY be hired when

others are not available.” Similar to other questions, the confusing interpretation of positive and

negative may be cause for the misfit.


If an item fits substantially better than would be expected, it may be too discriminating,

described as overfitting. In this study, items with an OUTFIT ZSTD of less than -2 are

highlighted and labeled as overfitting, suggesting redundancy. With these data, there are nine

overfitting items: (1) Personal Career Shifts, Supply; (2) Federal mandates, Supply; (3) Distance

Learning Teacher Education, Supply; (4) Federal mandates, Demand; (5) Shifts of teachers,

Demand; (6) Shifts of students, Demand; (7) Federal Funding, Supply; (8) Private Schools/Home

Schooling, Demand; and (9) Foreign-prepared teachers, Supply.

Personal career shifts is most likely redundant of economic conditions, based on the

respondents’ comments: “Dot-com meltdown has brought math/science professionals to

teaching,” “Layoffs from jobs in industry are increasing the supply of teachers,” “Many…are

looking at teaching due to the poor economy and need for stability, and many are reevaluating

their careers and wanting to ‘make a difference’.” The same phenomenon appears to take place

with the item pairs of federal mandates and federal funding and the state and local mandate and

funding factors. In the comments, funding and mandates are noted in general terms, not

distinguishing between local, state and federal levels. The recommendation would be to word

these so they are more clearly distinguishable to the respondents.

Private schools/home schooling, shifts of teachers, and shifts of students each have more

than 50% of responses being 3, no influence, but such a rating seems to be unsound based upon

the comments section. Perhaps these areas are redundant with the general Student Enrollment in

demographic shifts in population. Distance learning teacher education, with 64% of respondents

assigning a 3 rating, and Foreign prepared teachers, with 76% assigning a 3, are never mentioned

in the comments section. It is likely that these areas are not typically used across institutions, in

contrast to the idea they were truly viewed as having no influence for the 2002 year.


WINSTEPS produces a map of the items and respondents plotted against each other

according to the distribution of the measures. In general, gaps in the distribution of the items

indicate that the items are not tapping the variable, which is related to the motivation and

opportunity of entering the education profession. Figure 3 presents item and person distributions

as quite even with relatively few gaps. The left side of the vertical line displays the distribution

of institutions; each ‘#’ represents two respondents and each ‘.’ represents one. ‘M’ marks the

mean for respondents and items, ‘S’ is one standard deviation away from the mean, and ‘T’ is

two standard deviations from the mean. Items are displayed to the right of the vertical

line. The items cover a range of -1 to 1 logits in difficulty, and the respondents fall between -2

and 2.5 logits in willingness to endorse. Those at the upper end of the scale are more willing to

strongly agree, and those at the bottom are less likely. There are numerous respondents above

and below the distribution of the items, suggesting items were not matching respondents’ levels

of perception of the impact of factors very well. The free-response section provides insight into

items to add to future surveys to better define the variable such as tuition costs of education

programs, certification testing requirements, and the media’s influence on teacher supply.

The factors displayed on the map in Figure 3 having the largest impact on decreasing the

supply of educators are school violence and working conditions, followed by teacher salaries and

state mandates. At the other end of the scale, the factors having the largest impact on increasing

the supply of educators are institutions increasing their teacher education enrollments, followed

by personal career shifts, alternative certification, and distance learning teacher education. The

factors shown to have the largest impact on increasing the demand for educators are early

retirement, routine retirement, demographic shifts of limited English proficient students, and


overall student enrollment. Factors decreasing the demand begin with state funding, and move to

local funding, decreasing teacher education enrollments, and state mandates.

To get an overall picture of the categorization, step calibrations can be used to separate

item measures into categories. For teacher supply, school violence and working conditions fall

into the moderate negative influence range. The only factor to fall into the moderate positive

influence range is increasing teacher education enrollments. All other factors fall into the no

influence range. None of the factors falls into the significant positive influence or significant

negative influence range. Looking at educator demand, all the

factors fall within the no influence category of the rating scale. The lower percentages of use of

responses 1 and 5 (7% and 6%, respectively), as displayed in Table 2, indicate that respondents are less

likely to use the extreme categories and that rating 3 (no influence, 34%) is selected most often. This observation, along with the

comments indicating some confusion on how to interpret the survey, may account for a greater

number of neutral responses and less overall willingness to select extreme ratings.
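
The categorization described above can be illustrated with a minimal sketch. It assumes that the zone boundaries reported in the SCORE-TO-MEASURE columns of Table 2 are applied to the difference between an institution measure and a factor measure, and that ratings 1 through 5 correspond to the labels listed in the code; both the function name and the label mapping are assumptions made for illustration, and the exact procedure used in the analysis may differ.

```python
from bisect import bisect_right

# Zone boundaries in logits, copied from the SCORE-TO-MEASURE ZONE columns of Table 2.
ZONE_BOUNDS = [-2.18, -0.64, 0.61, 2.21]

# Assumed mapping of ratings 1-5 to the influence labels used in the text.
LABELS = {
    1: "significant negative influence",
    2: "moderate negative influence",
    3: "no influence",
    4: "moderate positive influence",
    5: "significant positive influence",
}

MEAN_INSTITUTION = -0.05  # mean institution measure, Table 1


def expected_category(institution_measure: float, factor_measure: float) -> int:
    """Return the rating (1-5) whose zone contains institution minus factor."""
    diff = institution_measure - factor_measure
    return bisect_right(ZONE_BOUNDS, diff) + 1


# Extreme and average factor measures taken from Table 1 (MAX, MIN, MEAN).
hardest = expected_category(MEAN_INSTITUTION, 0.85)
easiest = expected_category(MEAN_INSTITUTION, -0.76)
average = expected_category(MEAN_INSTITUTION, 0.00)

for name, cat in [("hardest", hardest), ("easiest", easiest), ("average", average)]:
    print(f"{name} factor: rating {cat} ({LABELS[cat]})")
```

Under these assumptions, an institution at the mean measure is expected to rate the hardest-to-endorse factor in category 2, the easiest in category 4, and an average factor in category 3, which is consistent with no factor reaching either extreme category.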

The results of the Rasch analysis, supported by the content analysis of survey comments,

suggest that the survey results should be interpreted with caution. Moreover, the survey should

be revisited to address concerns brought forth with the Rasch analysis. Doing so would result in

a more stable measurement tool and may yield more meaningful results.

Conclusion

In gauging the impact of factors on supply and demand using ratings, it is presumed that

the respondents have an accurate perception of the field, judge according to reproducible criteria,

and record their ratings accurately in uniformly spaced levels that add up to scores as

good as measures. In fact, as noted in Wright (1997), ratings are no better than responses based

on fluctuating personal criteria that are not always interpreted as intended or recorded correctly,

expressed as ordinal ratings that do not add up to measures. Rasch analysis produces measures, provides

a basis for insight into the validity of the measurement tool, and provides information to allow for

systematic diagnosis of misfit. Based on the results of the study, the 2002 Educator Supply and

Demand survey instrument should be reevaluated prior to future data collection. The survey was

revised in 2000, adding four factors in the area of teaching environment: salaries, benefits,

school violence, and working conditions. The 2002 survey was revised to delineate how each

particular factor would affect supply and/or demand. Given these earlier changes, another revision

would not necessarily affect long-range comparability. In addition, based on the quantitative results

and participants’ written comments, the instrument should be revised to avoid misinterpretation of

the questions and response options.

The true score model produces a descriptive summary based on statistical analysis, but its

measurement capacity is limited, if not non-existent. Quality of the measurement tool should

play a key role in the analysis of the data it produces; however, this is often overlooked. It is

important to begin at the level of measurement and to identify weaknesses that may limit the

reliability and validity of the measures made with the instrument. As indicated in the study,

Rasch analysis tackles many of the deficiencies of the true score model in that it has the capacity

to incorporate missing data, produces validity and reliability measures for person measures and

item calibrations, measures persons and items on the same metric, and is person- and sample-free.

AAEE, along with researchers, organizations, or institutions analyzing similar rating scale

data, will benefit from the results of this study, as it provides a sound methodology for analyzing

such data. The education community will also benefit by receiving better information

collected using a more valid and reliable instrument. In the meantime, although the results of the

study indicate that the majority of the factors have no influence on teacher supply and demand,

the ranking of the factors according to their measures can be used to inform the education

community as it makes important decisions influenced by and addressing supply and demand.

References

American Association for Employment in Education. (1998). Educator supply and

demand in the United States (1997 report). Evanston, IL: Author.

American Association for Employment in Education. (2002). Educator supply

and demand in the United States (2002 report). Evanston, IL: Author.

Andrich, D. (1988). Rasch models for measurement. Sage University Paper Series on

Quantitative Applications in the Social Sciences, series no. 07-068. Beverly

Hills: Sage Publications.

Darling-Hammond, L. (2000). Solving the dilemmas of teacher supply, demand, and

standards: How we can ensure a competent, caring and qualified teacher for

every child. New York: National Commission on Teaching and America's Future.

Hirsch, E., Koppich, J., & Knapp, M. (2001). Revisiting what states are doing to improve

the quality of teaching: an update on patterns and trends. Retrieved March 15,

2004, from Center for the Study of Teaching and Policy Web site:

www.ctpweb.org

Ingersoll, R. (2003). Who controls teachers' work? Power and accountability in

America's schools. Cambridge, MA: Harvard University Press.

Ingersoll, R., & Smith, T. (2003). The wrong solution to teacher shortage. Educational

Leadership, 60(8), 30-33.

Linacre, J. (1999). A User’s Guide to Facets Rasch Measurement Computer Program.

Chicago, IL: MESA Press.

Linacre, J. (2002). Facets, factors, elements and levels [Electronic version]. Rasch

Measurement Transactions, 16(2), 880.

National Commission on Teaching and America's Future. (2003). No dream denied: A

pledge to America's children. Washington, DC: Author.

Sampson, S., & Bradley, K. D. (2003, November). Rasch analysis of educator supply and

demand rating scale data [Electronic Version]. Research Methods Forum.

Available at: http://aom.pace.edu/rmd/2003forum.html

Smith, E., Jr. (2000, June). Rasch Measurement Models. Paper presented at An

Introduction to Rasch Measurement: Theory and Applications, Chicago.

Smith, R. (1992). Application of Rasch measurement. Chicago: MESA Press.

Smith, R. (2000, June). What is measurement? Paper presented at An Introduction to

Rasch Measurement: Theory and Applications, Chicago.

Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago, IL: MESA

Press.

Wright, B. (1997). Fundamental measurement for outcome evaluation [Electronic

version]. Physical Medicine And Rehabilitation: State Of The Art Reviews, 11(2),

261-288. Available at: http://www.rasch.org/memo66.htm

Table 1

Overall Model Fit Information, Separation and Mean Logit

SUMMARY OF 496 MEASURED INSTITUTIONS
+-----------------------------------------------------------------------------+
|           RAW                          MODEL         INFIT        OUTFIT    |
|          SCORE     COUNT     MEASURE   ERROR      MNSQ   ZSTD   MNSQ   ZSTD |
|-----------------------------------------------------------------------------|
| MEAN      86.6      29.1        -.05     .22      1.00    -.5   1.00    -.5 |
| S.D.      22.8       5.9         .64     .07       .66    2.4    .66    2.4 |
| MAX.     149.0      33.0        2.62     .79      4.02    7.6   4.03    7.6 |
| MIN.       5.0       2.0       -1.94     .19       .08   -7.0    .08   -7.0 |
|-----------------------------------------------------------------------------|
| REAL RMSE    .26  ADJ.SD     .59  SEPARATION  2.28  INSTIT RELIABILITY  .84 |
|MODEL RMSE    .23  ADJ.SD     .60  SEPARATION  2.58  INSTIT RELIABILITY  .87 |
| S.E. OF INSTITUTION MEAN = .03                                              |
+-----------------------------------------------------------------------------+

SUMMARY OF 33 MEASURED FACTORS
+-----------------------------------------------------------------------------+
|           RAW                          MODEL         INFIT        OUTFIT    |
|          SCORE     COUNT     MEASURE   ERROR      MNSQ   ZSTD   MNSQ   ZSTD |
|-----------------------------------------------------------------------------|
| MEAN    1302.2     437.6         .00     .05       .99    -.4   1.00    -.1 |
| S.D.     189.7      55.7         .34     .01       .20    3.3    .19    3.2 |
| MAX.    1590.0     487.0         .85     .09      1.40    5.9   1.38    5.7 |
| MIN.     442.0     162.0        -.76     .05       .59   -7.8    .62   -7.0 |
|-----------------------------------------------------------------------------|
| REAL RMSE    .06  ADJ.SD     .34  SEPARATION  5.94  FACTOR RELIABILITY  .97 |
|MODEL RMSE    .06  ADJ.SD     .34  SEPARATION  6.14  FACTOR RELIABILITY  .97 |
| S.E. OF FACTOR MEAN = .06                                                   |
+-----------------------------------------------------------------------------+
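
The separation and reliability values in Table 1 are linked by a standard relationship in Rasch measurement: reliability equals the squared separation divided by one plus the squared separation, and the adjusted standard deviation is the observed standard deviation with the root mean square error removed. The sketch below illustrates this arithmetic with the Table 1 values; the function names are hypothetical, and because the table entries are rounded, the adjusted standard deviation is reproduced only approximately.

```python
import math


def reliability_from_separation(separation: float) -> float:
    """Reliability implied by a separation index G: G^2 / (1 + G^2)."""
    return separation**2 / (1 + separation**2)


def adjusted_sd(observed_sd: float, rmse: float) -> float:
    """Observed spread with measurement error removed: sqrt(SD^2 - RMSE^2)."""
    return math.sqrt(max(observed_sd**2 - rmse**2, 0.0))


# Separation values copied from Table 1 (REAL and MODEL rows).
institution_real = reliability_from_separation(2.28)
institution_model = reliability_from_separation(2.58)
factor_real = reliability_from_separation(5.94)
factor_model = reliability_from_separation(6.14)

# Institution S.D. (.64) and REAL RMSE (.26) from Table 1; inputs are rounded,
# so the result is close to, but not exactly, the reported ADJ.SD of .59.
institution_adj_sd = adjusted_sd(0.64, 0.26)

print(round(institution_real, 2), round(institution_model, 2))
print(round(factor_real, 2), round(factor_model, 2))
```

The much higher separation for factors than for institutions reflects the small standard errors of the factor calibrations, each of which is based on several hundred responses.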

Table 2

Response Scale Use

SUMMARY OF MEASURED STEPS
+------------------------------------------------------------------+
|CATEGORY   OBSERVED|  MEASURE   | COHERENCE|INFIT OUTFIT|  STEP   |
|LABEL SCORE COUNT %|AVRGE EXPECT| M->C C->M| MNSQ   MNSQ|CALIBRATN|
|-------------------+------------+----------+------------+---------|
|  1     1  1151   7| -.72   -.72|  56%   1%| 1.01   1.04|  NONE   |
|  2     2  3394  21| -.38   -.37|  41%  31%| 1.00   1.02|  -1.62  |
|  3     3  5583  34| -.06   -.05|  45%  77%|  .94    .97|   -.71  |
|  4     4  3277  20|  .33    .30|  42%  28%|  .94    .94|    .65  |
|  5     5  1035   6|  .75    .80|  63%   5%| 1.05   1.05|   1.68  |
|-------------------+------------+----------+------------+---------|
|MISSING    1928  12| -.12       |          |            |         |
+------------------------------------------------------------------+

+--------------------------------------------------------+
|CATEGORY    STEP    STEP| SCORE-TO-MEASURE    |THURSTONE|
| LABEL  CALIBRATN   S.E.| AT CAT. ----ZONE----|THRESHOLD|
|------------------------+---------------------+---------|
|    1      NONE         |( -2.95)  -INF  -2.18|         |
|    2     -1.62      .03|  -1.30  -2.18   -.64|  -1.90  |
|    3      -.71      .02|   -.02   -.64    .61|   -.65  |
|    4       .65      .02|   1.29    .61   2.21|    .61  |
|    5      1.68      .03|(  2.99)  2.21  +INF |   1.94  |
+--------------------------------------------------------+

CATEGORY PROBABILITIES: MODES - Step measures at intersections

[Plot: probability of response (vertical axis, .0 to 1.0) for each rating category 1 through 5, plotted against institution [minus] factor measure (horizontal axis, -4 to 4 logits).]

Figure 1. Step Use by Person-Item Measure
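
The curves summarized in Figure 1 can be approximated from the step calibrations in Table 2. The following is a minimal sketch that assumes the common-step rating scale parameterization (Andrich, 1988; Wright & Masters, 1982), in which the probability of each category depends only on the difference between the institution measure and the factor measure; the function name is hypothetical and the output is not taken from the original analysis.

```python
import math

# Step calibrations in logits for moving from rating 1->2, 2->3, 3->4, 4->5 (Table 2).
STEP_CALIBRATIONS = [-1.62, -0.71, 0.65, 1.68]


def category_probabilities(diff: float, steps=STEP_CALIBRATIONS) -> list:
    """Probabilities of ratings 1..5 for diff = institution minus factor measure."""
    # Log numerator of each category: cumulative sum of (diff - step), starting at 0.
    log_numerators = [0.0]
    for tau in steps:
        log_numerators.append(log_numerators[-1] + (diff - tau))
    # Subtract the largest value before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


for diff in (-3.0, -1.62, 0.0, 3.0):
    probs = category_probabilities(diff)
    print(diff, [round(p, 2) for p in probs])
```

Under this model, adjacent categories are equally probable where the difference equals a step calibration, which matches the note in the figure title that step measures lie at the intersections of the curves.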

EXPECTED SCORE: MEAN (":" INDICATES HALF-SCORE POINT)

[Plot: for each factor, the zones of expected ratings 1 : 2 : 3 : 4 : 5 along the measure axis (-4 to 4 logits), with the distribution of institutions and its M, S, and T markers beneath the axis. Factors appear in the following order, top to bottom.]

NUM  FACTOR
 23  SCHOOL VIOLENCE SUPPLY
 24  WORKING CONDITIONS SUPPLY
 21  TEACHER SALARIES SUPPLY
  5  STATE FUNDING DEMAND
 11  STATE MANDATES SUPPLY
  6  LOCAL FUNDING DEMAND
 28  DECREASING OUR TEACHER ED REQU
  2  STATE FUNDING SUPPLY
 26  MOBILITY OF EXPERIENCED TEACHE
 12  FEDERAL MANDATES SUPPLY
  3  LOCAL FUNDING SUPPLY
 13  STATE MANDATES DEMAND
 25  MOBILITY OF NEW GRADUATES SUPP
 32  ECONOMIC CONDITIONS SUPPLY
 19  PRIVATE SCHOOLS/HOME SCHOOLING
 22  TEACHER BENEFITS SUPPLY
 14  FEDERAL MANDATES DEMAND
  7  POSTPONED RETIREMENT DEMAND
 30  FOREIGN-PREPARED TEACHERS SUPP
  4  FEDERAL FUNDING DEMAND
 20  CLASS SIZE DEMAND
 16  SHIFTS OF TEACHERS DEMAND
  1  FEDERAL FUNDING SUPPLY
 17  SHIFTS OF STUDENTS DEMAND
 10  HIRING OF RETIREES SUPPLY
 18  STUDENT ENROLLMENT DEMAND
 31  DISTANCE LEARNING TEACHER EDUC
 15  LIMITED ENGLISH PROFICIENT STU
  8  ROUTINE RETIREMENT DEMAND
 29  ALTERNATIVE CERTIFICATION SUPP
 33  PERSONAL CAREER SHIFTS SUPPLY
  9  EARLY RETIREMENT DEMAND
 27  INCREASING OUR TEACHER ED REQU

Figure 2. Probability Map

MEASURE                      |          MEASURE
<more> ------------ INSTITUT-+- FACTORS ------------ <rare>
   3                         +               3
                             |
                             |
                             |
                          .  |
                             |
                             |
                          .  |
                          .  |
                          .  |
   2                      .  +               2
                             |
                             |
                             |
                         .#  |
                          .  |
                        ###  |
                        .##  |
                          # T|
                        .##  |
   1                    .##  +               1
                       .###  |
                       ####  |  X
                       ####  |T
                ########### S|  X
                    #######  |
                      .####  |  XXX
              .############  |S XX
         ##################  |  XXX
         ##################  |  XXXXXX
   0    ################### M+M XXXX         0
      .####################  |  XX
        .##################  |  XXX
        .##################  |S XX
              .############  |  X
         .#################  |  XXX
              .############  |  X
                     ###### S|T
                       ####  |  X
                     .#####  |
  -1                    .##  +              -1
                        .##  |
                        .##  |
                         .# T|
                          #  |
                         ##  |
                          .  |
                          .  |
                             |
                         ##  |
  -2                         +              -2
                             |
                             |
                             |
                             |
                             |
                             |
                             |
                             |
                             |
  -3                         +              -3
<less> ------------ INSTITUT-+- FACTORS ------------ <frequent>

Figure 3. Item-Person Map