Running Head: AN ARGUMENT FOR RASCH ANALYSIS
Measuring Factors Impacting Educator Supply and Demand:
An Argument for Rasch Analysis
Shannon O. Sampson
University of Kentucky
Kelly D. Bradley
University of Kentucky
Abstract
This study employs the Rasch model to analyze 516 responses from U.S. higher
educational institutions to the 2002 Educator Supply and Demand survey administered by
the American Association for Employment in Education. A contention is made that the
Rasch model provides a clearer picture, as compared to the conventional true score
model, by providing both a descriptive summary of the data and an assessment of the
measurement instrument. The key measurement issues identified, including 17 of the 33
questions flagged as problematic, suggest caution in interpreting the true score
model results. A content analysis of open-ended responses supports this assertion.
Factors perceived to be influencing teacher supply and demand are also identified.
Introduction
In order to meet the needs of the United States educational system, teacher education
programs call for an accurate account of the status of teaching fields. The American Association
for Employment in Education (AAEE) provides an annual report of educational institutions’
perceptions of educator supply and demand in the United States. The report is constructed based
upon analysis utilizing the true score model of survey items. Rating scale data are often
analyzed utilizing this model, which postulates that every measurement, or observed score, is a
composite of the true ability of the respondent on the measure and random error. It is contended
that employing the Rasch model is a better approach to the analysis. Rasch analysis produces
measures on items and respondents, and the analysis assists in identifying problematic items and
unexpected response patterns. This study utilizes the Rasch model to analyze data collected and
originally analyzed in the 2002 AAEE report. A content analysis of the qualitative comments is
used to support the quantitative results. Findings will benefit administrators of teacher
preparation programs making program decisions to address the needs of the field, teacher
candidates seeking information on the current job market, and education agencies involved with
policy decisions affecting the field. A methodological framework for educational researchers
analyzing rating scale data, including AAEE, is also provided.
Background
Teacher supply and demand
Much attention has been given to teacher retention, recruitment, and quality in the United
States (U.S.), and research has indicated that teacher preparation and qualifications are critical
factors in student achievement (Darling-Hammond, 2000). Educators, politicians, and
researchers have warned of an impending teacher shortage and the subsequent need to hire more
teachers. While teacher retirement, an increasing student population, and new classroom policies
have been commonly cited reasons for the coming teacher shortage, research suggests that a
larger part of the problem is teacher attrition (Ingersoll & Smith, 2003). Ingersoll and Smith
report that within the first five years of teaching, between 40% and 50% of beginning teachers leave the
profession. As quoted in No Dream Denied (National Commission, 2003), “The ability to create
and maintain a strong professional learning community in a school is limited not by teacher
supply but by high turnover among the teachers who are already there, turnover that is only
aggravated by hiring unqualified or underprepared individuals to replace those who leave” (p. 8).
Ingersoll and Smith state that efforts to recruit more teachers will not solve staffing problems if
nearly half of those teachers leave within a few years. “Pouring more water into the bucket will
not do any good if we do not patch the holes first” (p. 33).
According to Linda Darling-Hammond (2000), teacher distribution is one issue in
meeting the demands for educators. She writes, “enrollment declines are anticipated in most
parts of the Northeast and Midwest, while other states will have stable enrollments. Some states
have a large number of teacher education institutions and regularly produce more teachers.
States may not have developed aggressive recruitment strategies or reciprocity arrangements for
accepting licenses awarded in other states. As a result, it is difficult to get teachers from where
they are prepared to where they are needed” (p. 6), an indication that lack of mobility is a factor
linked to supply of teachers to meet the demand.
Darling-Hammond (2000) asserts shortages in math and science exist “largely because
the knowledge and skills required by teachers command much greater compensation in fields
outside of teaching and because there are inadequate numbers of slots in schools of education to
prepare an adequate supply of teachers in these fields” (p. 8). States and districts are working to
attract teachers to the hardest-to-staff areas by creating alternative pathways into the profession
to attract mid-career professionals and offering financial incentives such as signing bonuses,
student loan forgiveness, housing assistance, and free graduate courses (Hirsch, Koppich, &
Knapp, 2001). Ingersoll (2003) notes that raising teacher salaries may be a way to ‘plug the
holes’, but he suggests that a better alternative may be to address the working conditions that
new teachers identify as reasons for leaving teaching, such as lack of administrative support, poor
student discipline and student motivation, and lack of participation in decision making. In order
for every student to have the opportunity to learn from highly qualified teachers, it is important
for educational researchers to continue to identify areas of need and reasons for shortage and for
educational institutions and educational policy makers to implement changes to address them.
Annual AAEE Educator Supply and Demand reports are designed to identify higher
education institutions’ perceptions of the causes of teacher attrition. It is essential for the report
to paint an accurate picture of supply and demand so users of the report can make best-informed
decisions. AAEE suggests many uses for the report, including to develop recruitment strategies
by human resource administrators; to evaluate modifications to teacher education programs; and
to guide university students in selecting a major teaching field. The majority of results in the
AAEE report are based upon means constructed through descriptive statistics of rating scale data.
Rasch Model as Opposed to the True Score Model
AAEE reports utilize true score theory in analyzing the survey rating scale data, under the
premise that every measurement is a composite of the true endorsement characteristics of the
respondent on the measure and random error. As noted in Smith (2000), the true score model
has many deficiencies beginning with the issue of sample-dependence between estimates of an
item’s difficulty to endorse and a respondent’s willingness to endorse, making the estimates for
the items depend on the severity of the respondents in the sample. Moreover, the estimates of
item difficulty cannot be directly compared unless the estimates come from the same sample or
assumptions are made about the comparability of the different samples. The true score approach
requires complete records to make comparisons of items, and even more, a single standard error
of measurement is produced for all the scores, making it inadequate and potentially misleading.
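As a brief formal sketch of the true score model described above, the decomposition and the single standard error of measurement it yields can be written as follows. The symbols X (observed score), T (true score), E (random error), and \rho_{XX'} (reliability) are conventional notation assumed here and are not taken from the AAEE report.

```latex
X = T + E, \qquad
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2}, \qquad
\mathrm{SEM} = \sigma_X \sqrt{1 - \rho_{XX'}}
```

Because the SEM depends only on sample-level quantities, the same value is attached to every score, which is the limitation noted above.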
The Rasch model, named after Georg Rasch and introduced in 1960, addresses many of
the weaknesses of the true score approach. Specific to rating scale data, it connects observations
of respondents and items in a way that indicates the occurrence of an event as probability rather
than certainty and maintains order in that the probability of providing a certain response defines
an order of respondents and items. These circumstances create the probabilistic version of the
scalogram, indicating that a person endorsing a more extreme statement should also endorse all
less extreme statements, and that an easy-to-endorse item is always expected to be rated higher
by any respondent (Wright and Masters, 1982). In contrast to the true score model, parameters in
the Rasch model are neither sample nor test dependent, so missing data are not problematic.
Another improvement is that Rasch measurement produces standard error estimates for each
discrete raw score, allowing for a reliability coefficient to be calculated for the instrument and
the respondents. Persons and items are measured on the same metric, allowing for a probability
expression to be calculated that makes it possible to combine any person’s estimated measure
with any item’s estimated measure to produce expected response values. As well, Rasch analysis
provides estimates for persons and items that are freed from the sampling distribution of the
sample employed, meaning there is no dependence on the particulars of the questionnaire nor of
the sample being measured (Smith, E., 2000; Wright, 1997; Wright and Masters, 1982).
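One common way of writing the rating scale form of the Rasch model is sketched below. The notation is assumed here for illustration and is not drawn from the AAEE materials: B_n is the measure of respondent n, D_i is the difficulty of endorsing item i, F_k is the step calibration between categories k-1 and k, and P_{nik} is the probability that respondent n selects category k on item i.

```latex
\ln\!\left(\frac{P_{nik}}{P_{ni(k-1)}}\right) = B_n - D_i - F_k
```

Because B_n and D_i enter on the same logit scale, any person measure can be combined with any item measure to obtain an expected response.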
This study examines the 2002 AAEE Educator Supply and Demand data set. Here the
data are analyzed employing Rasch analysis, utilizing the methodology framework outlined by
Sampson and Bradley (2002). The results and conclusions yield information that enhances the
AAEE report, benefiting education agencies involved with policy decisions affecting the field.
The findings can also be used to improve the AAEE instrument and similar instruments as the
study provides insight into the quality of the measurement instrument and its assessment.
Method
Population and Sample
The data analyzed in this study originated from the 26th annual AAEE survey of Educator
Supply and Demand in the United States. The population was composed of institutions listed in
the Higher Education Directory, including AAEE members and non-members. AAEE assumes
that the opinions and perceptions of university directors of career services and other teacher
education program administrators accurately reflect the job market their students are entering.
Periodically, regional studies of employers are conducted to support this assumption. These
studies have consistently validated the data provided by colleges and universities (AAEE, 2002).
In May of 2002, a selected response pencil-and-paper questionnaire was mailed to 1,267
institutions of higher education. The instrument was mailed to the career services director at
each institution responsible for the planning and placement of graduates in teacher education and
related careers, or to deans and directors of teacher education in universities. In June, a second
mailing was conducted with non-responding institutions. In total, 516 questionnaires were received,
resulting in a 40.7% response rate; 498 questionnaires were usable.
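The response rate arithmetic can be checked with a short sketch. The counts are those given above; the helper name `rate` is hypothetical, and the usable-questionnaire rate is derived here rather than reported in the original study.

```python
# Counts reported in the text
mailed = 1267
received = 516
usable = 498


def rate(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


response_rate = rate(received, mailed)  # the text reports 40.7%
usable_rate = rate(usable, mailed)      # derived here, not reported in the text
print(response_rate, usable_rate)
```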
The data were assessed for representativeness of the return sample on variables including
AAEE membership (member versus non-member) and response wave (early returns versus
on-time returns versus late returns). The sample was deemed representative of the population, with
the exception of AAEE members being overrepresented. Subsequently, the sorted AAEE and
non-AAEE members were compared across the 33 factors, with respect to how each impacted
the supply and demand of educators, in order to evaluate whether or not differential responses
could be identified between the two groups. Since the two groups did not differ significantly on
the majority of items (t-tests at the .05 significance level), the responses were aggregated into an overall dataset for the Rasch
analysis, as was done with the original AAEE true score theory analysis.
Instrumentation
The selected response pencil-and-paper survey and the subsequent data collection were
developed and conducted by AAEE. A survey questionnaire was chosen as the method of data
collection due to the availability of a sample frame, the quantity of information desired, and the
necessity to obtain information in a timely manner. This study utilizes the portion of the survey
that focuses on teacher education program personnel’s perceptions of the relative weights of
national K-12 and higher education factors on teacher supply and demand for 2002. Using a 5-
point Likert-type scale ranging from “Significant Positive Influence”, 1, to “Significant Negative
Influence”, 5, respondents were asked to rate the perceived influence of 28 factors on supply and
demand. Supply is defined as “the number of individuals with at least a baccalaureate or higher
degree and other minimal requirements who are willing to supply their services as educators but
are not currently providing education services, new graduates as well as previous graduates,”
and demand is “the total number of positions needed to be filled by certified/licensed (or eligible
for licensure) personnel in an educational setting” (AAEE, 2002). Correlation studies conducted
by surveying school district human resources directors in three regions of the country were found
to have high correlations with the AAEE study, indicating excellent reliability of the data.
Data Analysis
The study has two components. It begins with a quantitative component that replicates
an analysis used in Sampson and Bradley (2003), applying a one-parameter Item Response
Theory model, commonly known as the Rasch Model, using WINSTEPS software (Wright and
Linacre, 2000 version 3.02). The Rasch model is based upon the difficulty of a set of items,
assuming that item difficulty is the only item characteristic influencing responses (Linacre,
1999). Here, two facets are involved: the instrument’s items and the respondents. From a Rasch
perspective, a respondent’s severity interacts with an item’s difficulty to produce an observed
outcome (Linacre, 2002).
WINSTEPS produces various tables displaying the raw scores, measures, and ZSTDs for
the factors in the survey and the institutions responding. ZSTDs are mean-square fit statistics
standardized to approximate a theoretical mean 0 and standard deviation 1. INFIT ZSTDs are
sensitive to irregular inlying patterns and OUTFIT ZSTDs are sensitive to unexpected rare
extremes. Items and participating institutions’ responses that do not adequately fit the model
requirements are identified using the ZSTD scores. Item profiles are presented so that if a person
knows the measure of an institution a prediction about that institution’s responses for each of the
survey items can be made. Maps of the distribution of the institution measure against the item
measure are provided, and the category probabilities are produced.
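To make the fit statistics concrete, the following is a minimal sketch of how INFIT and OUTFIT mean-squares and their standardized (ZSTD) forms can be computed for one item under a rating scale model. It is an illustration under stated assumptions, not a reproduction of the WINSTEPS implementation: the function names, the step calibrations, the institution measures, and the ratings are all hypothetical, ratings are scored 0 to 4 rather than 1 to 5, and the standardization uses the Wilson-Hilferty cube-root transformation.

```python
import math


def category_probs(theta, delta, taus):
    """Rating scale model probabilities for categories 0..m.

    theta: person measure, delta: item difficulty,
    taus: the m step calibrations, all in logits.
    """
    # Cumulative sums of (theta - delta - tau_k) give the log-numerators.
    log_num = [0.0]
    for tau in taus:
        log_num.append(log_num[-1] + theta - delta - tau)
    peak = max(log_num)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


def moments(probs):
    """Expected score, variance, and fourth central moment of one response."""
    expected = sum(k * p for k, p in enumerate(probs))
    variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
    fourth = sum((k - expected) ** 4 * p for k, p in enumerate(probs))
    return expected, variance, fourth


def zstd(mnsq, q):
    """Cube-root standardization of a mean-square with model SD q."""
    return (mnsq ** (1 / 3) - 1) * (3 / q) + q / 3


def item_fit(responses, thetas, delta, taus):
    """Return (infit MNSQ, infit ZSTD, outfit MNSQ, outfit ZSTD) for one item."""
    n = len(responses)
    sum_sq_resid = sum_var = sum_z2 = 0.0
    q_out_acc = q_in_acc = 0.0
    for x, theta in zip(responses, thetas):
        e, w, c = moments(category_probs(theta, delta, taus))
        sum_sq_resid += (x - e) ** 2   # squared raw residual
        sum_var += w                   # model variance (information)
        sum_z2 += (x - e) ** 2 / w     # squared standardized residual
        q_out_acc += c / w ** 2
        q_in_acc += c - w ** 2
    outfit = sum_z2 / n                # unweighted mean-square
    infit = sum_sq_resid / sum_var     # information-weighted mean-square
    q_outfit = math.sqrt(q_out_acc / n ** 2 - 1 / n)
    q_infit = math.sqrt(q_in_acc) / sum_var
    return infit, zstd(infit, q_infit), outfit, zstd(outfit, q_outfit)


# Hypothetical values for illustration only
taus = [-1.5, -0.5, 0.5, 1.5]          # assumed step calibrations
thetas = [-1.0, -0.5, 0.0, 0.5, 1.0]   # assumed institution measures
responses = [0, 4, 2, 0, 4]            # assumed erratic ratings, scored 0-4
infit, infit_z, outfit, outfit_z = item_fit(responses, thetas, 0.0, taus)
print(infit, infit_z, outfit, outfit_z)
```

The weighting by model variance is what makes INFIT more sensitive to responses from institutions located near the item, while the unweighted OUTFIT reacts more strongly to unexpected responses from institutions far from it.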
The second component is qualitative, a content analysis of the free-response section of
the survey. Respondents’ comments were coded and then organized under emerging categories
based on the comments, independent of the topics listed in the survey. Assertions were drawn
from the comments within each category, and these were used to support the results of the Rasch
analysis and provide insight into possible revisions that could improve the 2002 AAEE survey.
Results
Thirty-three factors were rated on the 5-point Likert-type scale. Parameters that were
estimated include: 496 institution measures, 33 item measures, and 4 category thresholds relating
to the transition points between the five response categories. Analyses began with consideration
of whether the data fit the model (see Table 1). The INFIT and OUTFIT ZSTDs produced in the
Rasch analysis assist in identifying the items on the instrument that are problematic. If the data
fit the model, or cooperate with Rasch model specifications (Wright, 1994), these statistics will
have a mean of zero and a standard deviation of 1. The OUTFIT ZSTDs, the
standardized unweighted item and person fit statistics, are sensitive to unexpected rare extremes.
If the data fit the model, these statistics are approximately t-statistics, also with an expected
mean of 0 and a standard deviation of 1. The mean OUTFIT ZSTD for the persons is -.5 and the
standard deviation is 2.4, so the mean fit is lower than expected and there is greater variability in
the fit of respondents than expected, indicating that many of the institutions responded in a manner
that was inconsistent with model expectations. The mean OUTFIT ZSTD for the items is -.1, and the
standard deviation is 3.2. Again, the mean is close to the expected value of zero, but the higher
standard deviation suggests that there are some problematic items in the measurement tool
(Wright and Masters, 1982). It
should also be noted that the mean item measure is 0.0 and the mean person measure is -.05.
While the mean for items is always set at 0.0, similar to a standard score, the person mean varies.
Thus, when the person mean is negative, items are generally more difficult to agree with, and
vice versa when the person mean is positive. The mean person measure for these data is only slightly
negative, suggesting that these items were well matched to the perceptions of the sample.
The next overall statistic to review is the separation, or the spread of person positions or
item positions. The real person separation is 2.28 and the real item separation is 5.94. Real
refers to the estimated standard error of the measurements being adjusted for any misfit
encountered in the data. If separation is 1.0 or less, the items may not have enough spread,
suggesting less variability of the persons on the trait or redundancy of items. The real person
separation suggests the rating scale discriminates well between the respondents and the real item
separation suggests that the items are creating a well-defined variable.
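Separation indices are often easier to interpret once converted to reliability coefficients. The sketch below applies the standard relationship between a separation index G and reliability, R = G^2 / (1 + G^2), to the two separation values given above. The function name is hypothetical, and the resulting reliabilities are derived here rather than reported in the original analysis.

```python
def reliability_from_separation(separation):
    """Convert a separation index G to a reliability coefficient G^2 / (1 + G^2)."""
    return separation ** 2 / (1 + separation ** 2)


person_reliability = reliability_from_separation(2.28)  # real person separation from the text
item_reliability = reliability_from_separation(5.94)    # real item separation from the text
print(round(person_reliability, 2), round(item_reliability, 2))
```

Under this relationship the separations correspond to reliabilities of roughly .84 for persons and .97 for items.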
Table 2 indicates how the participants used the response scale, illustrating the steps from
one rating scale category to another. The response scale ranged from 1 = Significant Positive
Influence to 5 = Significant Negative Influence. Observed Count indicates the number of times
the category was selected across all items and persons. Respondents were not likely to endorse a
1 or 5, with only 7% of the responses in the 1 category and 6% in the 5 category. Furthermore, a
1 or 5 response may stand out as differing from the expectation. The mean squared estimate,
MNSQ, is always less than 1.2, so it does not appear that any substantial misfit occurs. The step
calibration is expected to increase with category value, and it does. It also shows the steps are
similar in size, with the largest step of 1.62 logits from category 1 to 2. Another way of viewing
these steps is by the use of probability curves (Figure 1). These curves display the likelihood of
category selection, along the y-axis, by the person-minus-item measure, along the x-axis. If the
difference were -1.0, the most likely response would be a 2, closely followed by a 3. If all
categories are utilized, each category value will be the most likely at some point, and no curves
will be inverted.
Responses can also be estimated through the use of Figure 2. The institution measures
are located on the horizontal axis. An institution’s most likely responses can be identified by
drawing a vertical line through the institution measure and noting the response categories
nearest that line. Using Figure 2, an institution with a measure of approximately 0 would be expected
to assign a 4 to “increasing our teacher ed. enrollments,” a 3 to “early retirement through state
funding,” and a 2 to “working conditions and school violence.” This chart can be used in
conjunction with actual responses to become aware of idiosyncrasies of the respondents.
WINSTEPS produces a table of items in order of worst to best fitting. While there are no
given rules for acceptable and unacceptable fit, Smith (1992) recommends that standardized infit
or outfit be between -2 and +2 ZSTD. INFIT is weighted by the distance between the person
position and item difficulty and OUTFIT is unweighted and sensitive to outliers, or extreme
unexpected responses. ZSTD values greater than +2 are considered to be ‘noisy’, meaning that
unexpected or unrelated irregularities exist, and values less than -2 are considered to be “muted,”
and often result from dependence or redundancy among items (Linacre, 2000). Inspection of the
OUTFIT ZSTDs indicates that 17 of the 33 items are outside the traditional cutoff of |2|. Eight
of the misfitting items, including teacher salaries and early retirement, have an OUTFIT ZSTD
greater than 2, signifying high variability. Nine of the items, including personal career shifts,
federal mandates, and shifts of teachers, have an OUTFIT ZSTD of less than -2, indicating little
variability for the probabilistic model. These items will be discussed in detail below.
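The screening rule described above can be sketched in a few lines. The function name `classify_fit` and the labels are assumptions made for illustration; the first four OUTFIT ZSTD values are those cited for these items in the Discussion, while the last two entries are hypothetical placeholders.

```python
def classify_fit(zstd, cutoff=2.0):
    """Label a standardized fit statistic using the conventional +/-2 cutoff."""
    if zstd > cutoff:
        return "noisy"       # more variation than the model expects
    if zstd < -cutoff:
        return "muted"       # less variation than the model expects (overfit)
    return "acceptable"


# The first four values are reported in the text; the last two are
# hypothetical placeholders for illustration only.
outfit_zstd = {
    "Teacher Salaries, Supply": 5.7,
    "Mobility of New Graduates, Supply": 5.5,
    "State Mandates, Supply": 2.6,
    "Early Retirement, Demand": 2.4,
    "Hypothetical overfitting item": -3.1,
    "Hypothetical well-fitting item": 0.4,
}

flags = {item: classify_fit(z) for item, z in outfit_zstd.items()}
for item, label in flags.items():
    print(f"{item}: {label}")
```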
Discussion
According to Linacre (2000), poor category wording can lead to noisy and muted outfit.
The review of the institutions’ comments supports the assertion that poor wording is the cause
for the large number of misfits: “I found the previous page to be rather confusing. Perhaps it
should be worded differently in the future,” and “This survey is terribly confusing! The
supply/demand portion could be interpreted in a variety of ways.” Confusing wording is a
probable cause of a large number of respondents being labeled as ‘misfits’, as indicated by the
unexpected response pattern. Even as the wording of the survey seems to be the cause of the
large number of misfits, the items are reviewed for other issues. Eight items are identified as not
fitting well with the rest of the scale, by an OUTFIT ZSTD greater than 2. They are (1) Teacher
Salaries, Supply; (2) Mobility of New Graduates, Supply; (3) State funding, Demand; (4)
Economic Conditions, Supply; (5) State Mandates, Supply; (6) Postponed Retirement, Demand;
(7) Early Retirement, Demand; and (8) State Funding, Supply.
The content analysis of the qualitative response variables provides insight as to why these
might be listed as misfitting items. Comments about teacher salaries vary from state to state,
ranging from, “The most important factor for [our state] is salaries. If teachers made a decent
salary, we would have many more students to choose education as a career,” and “salaries and
working conditions continue to stifle growth in students majoring in education related fields” to
“salaries are excellent in [our state],” “more considering field due to increasing salaries being
offered,” and “education enrollments have increased dramatically in response to the uncertainty
of the economy, and salaries and benefits that are much better than similar occupations in the
private sector.” The variation in responses may be the cause of the high OUTFIT ZSTD of 5.7.
Teacher mobility may have an OUTFIT ZSTD of 5.5 due to different interpretations of
supply and demand. Some institutions note there would be a better supply of educators if they
were willing to work in high demand areas such as urban schools. Many institutions note that
lack of mobility is one of the main reasons teachers do not have jobs, while others note that their
graduates often move out of state in order to find jobs. Here, mobility could be interpreted as
either a willingness to be mobile or lack thereof; in which case, answers would be quite different.
Funding was the factor referred to most often in the survey comments. Comments about
local, state, and federal funding suggest these issues may create a misleading picture of educator
demand. One comment reflects many others’, “demand for teachers is very high, but economic
conditions prevent school districts from hiring.” Funding is noted as the cause of certain subjects
being dropped, increasing class sizes, school closures, and fewer teachers being hired. Still, when
examining the survey responses, 51% of respondents listed state funding as having positive
influence on the demand of educators hired. This could be due to different interpretations of
‘positive’ and ‘negative’ influence related to supply and demand, since the comments
consistently communicate the economic situation “is having a calming influence on an otherwise
strong hiring trend.”
The Rasch analysis listed state mandates as misfitting with an OUTFIT ZSTD of 2.6; yet,
the content analysis revealed licensing requirements to have both positive and negative effects on
teacher supply. State tests, which respondents claim “are not an accurate predictor of
knowledge,” “keep minority students from entering fields of education,” and “delay entry into
practice,” would have a negative effect on supply. Still, in states where certification is granted in
grade ranges, the supply is noted as positively affecting certain grades at the expense of others.
Regional differences, as well as the type of state mandate the respondents have in mind when
rating that factor may be the cause of the misfit.
Finally, the effect of early retirement and postponed retirement on demand may have
OUTFIT ZSTDs of 2.4 due to the various reactions toward retirement. While some institutions
allude to deferred retirement reducing spending for new teacher candidates, others appear to have
a less positive reaction to postponed retirement. One institution writes, “Retirees are filling
positions that should only be filled by those qualified. Retirees should ONLY be hired when
others are not available.” Similar to other questions, the confusing interpretation of positive and
negative may be the cause of the misfit.
If an item fits substantially better than would be expected, it may be too discriminating,
described as overfitting. In this study, items with an OUTFIT ZSTD of less than -2 are
highlighted and labeled as overfitting, suggesting redundancy. With these data, there are nine
overfitting items: (1) Personal Career Shifts, Supply; (2) Federal mandates, Supply; (3) Distance
Learning Teacher Education, Supply; (4) Federal mandates, Demand; (5) Shifts of teachers,
Demand; (6) Shifts of students, Demand; (7) Federal Funding, Supply; (8) Private Schools/Home
Schooling, Demand; and (9) Foreign-prepared teachers, Supply.
Personal career shifts is most likely redundant of economic conditions, based on the
respondents’ comments: “Dot-com meltdown has brought math/science professionals to
teaching,” “Layoffs from jobs in industry are increasing the supply of teachers,” “Many…are
looking at teaching due to the poor economy and need for stability, and many are reevaluating
their careers and wanting to ‘make a difference’.” The same phenomenon appears to take place
with the item pairs of federal mandates and federal funding and the state and local mandate and
funding factors. In the comments, funding and mandates are noted in general terms, not
distinguishing between local, state and federal levels. The recommendation would be to word
these so they are more clearly distinguishable to the respondents.
Private schools/home schooling, shifts of teachers, and shifts of students each have more
than 50% of responses being 3, no influence, but such a rating seems to be unsound based upon
the comments section. Perhaps these areas are redundant with the general Student Enrollment in
demographic shifts in population. Distance learning teacher education, with 64% of respondents
assigning a 3 rating, and Foreign prepared teachers, with 76% assigning a 3, are never mentioned
in the comments section. It is likely that these areas are not typically used across institutions, in
contrast to the idea they were truly viewed as having no influence for the 2002 year.
WINSTEPS produces a map of the items and respondents plotted against each other
according to the distribution of the measures. In general, gaps in the distribution of the items
indicate that the items are not tapping the variable, which is related to the motivation and
opportunity of entering the education profession. Figure 3 presents item and person distributions
as quite even with relatively few gaps. The left side of the vertical line displays the distribution
of institutions; each ‘#’ represents two respondents and each ‘.’ represents one. ‘M’ marks the
mean for respondents and items, ‘S’ is one standard deviation away from the mean, and ‘T’ is
two standard deviations from the mean. Items are displayed to the right of the vertical
line. The items cover a range of -1 to 1 logits in difficulty, and the respondents fall between -2
and 2.5 logits in willingness to endorse. Those at the upper end of the scale are more willing to
strongly agree, and those at the bottom are less likely. There are numerous respondents above
and below the distribution of the items, suggesting items were not matching respondents’ levels
of perception of the impact of factors very well. The free-response section provides insight into
items to add to future surveys to better define the variable such as tuition costs of education
programs, certification testing requirements, and the media’s influence on teacher supply.
The factors displayed on the map in Figure 3 having the largest impact on decreasing the
supply of educators are school violence and working conditions, followed by teacher salaries and
state mandates. At the other end of the scale, the factors having the largest impact on increasing
the supply of educators are institutions increasing their teacher education enrollments, followed
by personal career shifts, alternative certification, and distance learning teacher education. The
factors shown to have the largest impact on increasing the demand for educators are early
retirement, routine retirement, demographic shifts of limited English proficient students, and
overall student enrollment. Factors decreasing the demand begin with state funding, and move to
local funding, decreasing teacher education enrollments, and state mandates.
To get an overall picture of the categorization, step calibrations can be used to separate
item measures into categories. For teacher supply, school violence and working conditions fall
into the moderate negative influence range. The only factor to fall into the moderate positive
influence range was increasing teacher education enrollments. All other factors fell into the no
influence range. None of the factors fell into the significant positive influence or significant
negative influence range. Looking at educator demand, all the
factors fall within the no influence category of the rating scale. The lower percentages of use of
responses 1 and 5, as displayed in Table 2, indicate respondents are less likely to use the extreme
categories and rating 3 (no influence) is selected most often. This observation, along with the
comments indicating some confusion on how to interpret the survey, may account for a greater
number of neutral responses and less overall willingness to select extreme ratings.
The results of the Rasch analysis, supported by the content analysis of survey comments,
suggest that the survey results should be interpreted with caution. Moreover, the survey should
be revisited to address concerns brought forth with the Rasch analysis. Doing so would result in
a more stable measurement tool and may yield more meaningful results.
Conclusion
In gauging the impact of factors on supply and demand using ratings, it is presumed that
respondents have an accurate perception of the field and judge according to reproducible criteria,
and that their ratings are accurately recorded in terms of uniformly spaced levels that add up to
scores as good as measures. In fact, as noted in Wright (1997), ratings are no better than responses
based on fluctuating personal criteria, not always interpreted as intended or recorded correctly, and
expressed in ordinal ratings, which do not add up to measures. Rasch analysis produces measures, provides
a basis for insight into the validity of the measurement tool and provides information to allow for
systematic diagnosis of misfit. Based on the results of the study, the 2002 Educator Supply and
Demand survey instrument should be reevaluated prior to future data collection. The survey was
revised in 2000, adding four factors in the area of teaching environment: salaries, benefits,
school violence, and working conditions. The 2002 survey was revised to delineate how each
particular factor would affect supply and/or demand. Another revision would not necessarily
affect long-range comparability. In addition, based on the quantitative results and participants'
written comments, the instrument should be revised to avoid misinterpretation of the questions
and response options.
The true-score model produces a descriptive summary based on statistical analysis, but its
measurement capacity is limited, if not non-existent. Quality of the measurement tool should
play a key role in the analysis of the data it produces; however, this is often overlooked. It is
important to begin at the level of measurement and to identify weaknesses that may limit the
reliability and validity of the measures made with the instrument. As indicated in the study,
Rasch analysis addresses many of the deficiencies of the true score model: it has the capacity
to incorporate missing data, produces validity and reliability measures for person measures and
item calibrations, measures persons and items on the same metric, and is person- and sample-free.
AAEE, along with researchers, organizations, or institutions analyzing similar rating scale
data, will benefit from the results of this study, as it provides a sound methodology for analyzing
such data. The education community will also benefit by receiving better information
collected using a more valid and reliable instrument. In the meantime, although the results of the
study indicate that the majority of the factors are perceived to have no influence on teacher supply
and demand, the ranking of the factors according to their measures can be used to inform the education
community as it makes important decisions influenced by and addressing supply and demand.
References
American Association for Employment in Education. (1998). Educator supply and
demand in the United States (1997 report). Evanston, IL: Author.
American Association for Employment in Education. (2002). Educator supply
and demand in the United States (2002 report). Evanston, IL: Author.
Andrich, D. (1988). Rasch models for measurement. Sage University Paper Series on
Quantitative Applications in the Social Sciences, series no. 07-068. Beverly
Hills: Sage Publications.
Darling-Hammond, L. (2000). Solving the dilemmas of teacher supply, demand, and
standards: How we can ensure a competent, caring and qualified teacher for
every child. New York: National Commission on Teaching and America's Future.
Hirsch, E., Koppich, J., & Knapp, M. (2001). Revisiting what states are doing to improve
the quality of teaching: an update on patterns and trends. Retrieved March 15,
2004, from Center for the Study of Teaching and Policy Web site:
www.ctpweb.org
Ingersoll, R. (2003). Who controls teachers' work? Power and accountability in
America's schools. Cambridge, MA: Harvard University Press.
Ingersoll, R., & Smith, T. (2003). The wrong solution to teacher shortage. Educational
Leadership, 60 (8), 30-33.
Linacre, J. (1999). A User’s Guide to Facets Rasch Measurement Computer Program.
Chicago, IL: MESA Press.
Linacre, J. (2002). Facets, factors, elements and levels [Electronic version]. Rasch
Measurement Transactions, 16 (2), 880.
National Commission on Teaching and America's Future. (2003). No dream denied: A
pledge to America's children. Washington, DC: Author.
Sampson, S., & Bradley, K. D. (2003, November). Rasch analysis of educator supply and
demand rating scale data [Electronic Version]. Research Methods Forum.
Available at: http://aom.pace.edu/rmd/2003forum.html
Smith, E., Jr. (2000, June). Rasch Measurement Models. Paper presented at An
Introduction to Rasch Measurement: Theory and Applications, Chicago.
Smith, R. (1992). Application of Rasch measurement. Chicago: MESA Press.
Smith, R. (2000, June). What is measurement? Paper presented at An Introduction to
Rasch Measurement: Theory and Applications, Chicago.
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago, IL: MESA
Press.
Wright, B. (1997). Fundamental measurement for outcome evaluation [Electronic
version]. Physical Medicine And Rehabilitation: State Of The Art Reviews, 11(2),
261-288. Available at: http://www.rasch.org/memo66.htm
Table 1
Overall Model Fit Information, Separation and Mean Logit

SUMMARY OF 496 MEASURED INSTITUTIONS
+-----------------------------------------------------------------------------+
|           RAW                          MODEL      INFIT          OUTFIT     |
|          SCORE     COUNT     MEASURE   ERROR      MNSQ   ZSTD   MNSQ   ZSTD |
|-----------------------------------------------------------------------------|
| MEAN      86.6      29.1        -.05     .22      1.00    -.5   1.00    -.5 |
| S.D.      22.8       5.9         .64     .07       .66    2.4    .66    2.4 |
| MAX.     149.0      33.0        2.62     .79      4.02    7.6   4.03    7.6 |
| MIN.       5.0       2.0       -1.94     .19       .08   -7.0    .08   -7.0 |
|-----------------------------------------------------------------------------|
| REAL RMSE    .26  ADJ.SD     .59  SEPARATION  2.28  INSTIT RELIABILITY  .84 |
|MODEL RMSE    .23  ADJ.SD     .60  SEPARATION  2.58  INSTIT RELIABILITY  .87 |
| S.E. OF INSTITUTION MEAN = .03                                              |
+-----------------------------------------------------------------------------+

SUMMARY OF 33 MEASURED FACTORS
+-----------------------------------------------------------------------------+
|           RAW                          MODEL      INFIT          OUTFIT     |
|          SCORE     COUNT     MEASURE   ERROR      MNSQ   ZSTD   MNSQ   ZSTD |
|-----------------------------------------------------------------------------|
| MEAN    1302.2     437.6         .00     .05       .99    -.4   1.00    -.1 |
| S.D.     189.7      55.7         .34     .01       .20    3.3    .19    3.2 |
| MAX.    1590.0     487.0         .85     .09      1.40    5.9   1.38    5.7 |
| MIN.     442.0     162.0        -.76     .05       .59   -7.8    .62   -7.0 |
|-----------------------------------------------------------------------------|
| REAL RMSE    .06  ADJ.SD     .34  SEPARATION  5.94  FACTOR RELIABILITY  .97 |
|MODEL RMSE    .06  ADJ.SD     .34  SEPARATION  6.14  FACTOR RELIABILITY  .97 |
| S.E. OF FACTOR MEAN = .06                                                   |
+-----------------------------------------------------------------------------+
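The separation and reliability values in Table 1 are linked by a standard relationship in Rasch measurement: separation is the ratio of the adjusted ("true") standard deviation to the root mean square error, and reliability equals the squared separation divided by one plus the squared separation. The sketch below, with hypothetical function names, shows how the tabled reliabilities follow from the tabled separations. Because Table 1 reports rounded values, a separation recomputed from the printed ADJ.SD and RMSE may differ slightly from the printed one.

```python
def separation(adjusted_sd, rmse):
    """Separation index: adjusted (true) standard deviation divided by RMSE."""
    return adjusted_sd / rmse


def reliability_from_separation(sep):
    """Reliability implied by a separation index: G^2 / (1 + G^2)."""
    return sep ** 2 / (1 + sep ** 2)


if __name__ == "__main__":
    # Separation values as printed in Table 1 (REAL and MODEL rows).
    table_1 = {
        "institutions, real": 2.28,
        "institutions, model": 2.58,
        "factors, real": 5.94,
        "factors, model": 6.14,
    }
    for label, sep in table_1.items():
        print(f"{label}: separation {sep:.2f} -> reliability "
              f"{reliability_from_separation(sep):.2f}")
```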
Table 2
Response Scale Use

SUMMARY OF MEASURED STEPS
+------------------------------------------------------------------+
|CATEGORY   OBSERVED|  MEASURE   | COHERENCE|INFIT OUTFIT|  STEP   |
|LABEL SCORE COUNT %|AVRGE EXPECT| M->C C->M|  MNSQ  MNSQ|CALIBRATN|
|-------------------+------------+----------+------------+---------|
|  1   1    1151   7|  -.72  -.72|  56%   1%|  1.01  1.04|  NONE   |
|  2   2    3394  21|  -.38  -.37|  41%  31%|  1.00  1.02|  -1.62  |
|  3   3    5583  34|  -.06  -.05|  45%  77%|   .94   .97|   -.71  |
|  4   4    3277  20|   .33   .30|  42%  28%|   .94   .94|    .65  |
|  5   5    1035   6|   .75   .80|  63%   5%|  1.05  1.05|   1.68  |
|-------------------+------------+----------+------------+---------|
|MISSING    1928  12|  -.12      |          |            |         |
+------------------------------------------------------------------+

+--------------------------------------------------------+
|CATEGORY    STEP   STEP | SCORE-TO-MEASURE    |THURSTONE|
| LABEL   CALIBRATN S.E. | AT CAT. ----ZONE----|THRESHOLD|
|------------------------+---------------------+---------|
|   1      NONE          |( -2.95) -INF   -2.18|         |
|   2      -1.62     .03 |  -1.30  -2.18   -.64|  -1.90  |
|   3       -.71     .02 |   -.02   -.64    .61|   -.65  |
|   4        .65     .02 |   1.29    .61   2.21|    .61  |
|   5       1.68     .03 |(  2.99)  2.21  +INF |   1.94  |
+--------------------------------------------------------+
[Figure: CATEGORY PROBABILITIES: MODES - Step measures at intersections. Vertical axis: PROBABILITY OF RESPONSE (.0 to 1.0); horizontal axis: INSTITUTION [MINUS] FACTOR MEASURE (-4 to 4); one curve for each response category, 1 through 5.]
Figure 1. Step Use by Person-Item Measure
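Category probability curves of the kind shown in Figure 1 can be approximated from the step calibrations in Table 2. The following is a minimal sketch, assuming the Andrich rating scale form of the Rasch model, in which the probability of each category depends on the institution-minus-factor measure and the cumulative step calibrations. The function names are hypothetical, and the sketch is not the routine used by the analysis software.

```python
import math

# Step calibrations for categories 2 through 5, from Table 2.
STEP_CALIBRATIONS = [-1.62, -0.71, 0.65, 1.68]


def category_probabilities(measure_difference, steps=STEP_CALIBRATIONS):
    """Return the probabilities of categories 1..5 under the rating scale model.

    measure_difference is the institution measure minus the factor measure,
    in logits. Each category's log-numerator is the running sum of
    (measure_difference - step); the lowest category has an empty sum of 0.
    """
    log_numerators = [0.0]
    for step in steps:
        log_numerators.append(log_numerators[-1] + (measure_difference - step))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


def most_probable_category(measure_difference):
    """Return the rating (1-5) with the highest modeled probability."""
    probabilities = category_probabilities(measure_difference)
    return probabilities.index(max(probabilities)) + 1


if __name__ == "__main__":
    for difference in (-3.0, -1.0, 0.0, 1.0, 3.0):
        rounded = [round(p, 2) for p in category_probabilities(difference)]
        print(difference, most_probable_category(difference), rounded)
```

In this formulation, the curves for adjacent categories cross where the measure difference equals the corresponding step calibration, which is the property the figure title refers to as step measures at intersections.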
[Figure: EXPECTED SCORE: MEAN (":" INDICATES HALF-SCORE POINT). Horizontal axis from -4 to 4; one row per factor showing the expected score zones for ratings 1 through 5, with the distribution of INSTITUTIONS beneath the axis. Factors appear in the following order, top to bottom.]

NUM  FACTOR
 23  SCHOOL VIOLENCE SUPPLY
 24  WORKING CONDITIONS SUPPLY
 21  TEACHER SALARIES SUPPLY
  5  STATE FUNDING DEMAND
 11  STATE MANDATES SUPPLY
  6  LOCAL FUNDING DEMAND
 28  DECREASING OUR TEACHER ED REQU
  2  STATE FUNDING SUPPLY
 26  MOBILITY OF EXPERIENCED TEACHE
 12  FEDERAL MANDATES SUPPLY
  3  LOCAL FUNDING SUPPLY
 13  STATE MANDATES DEMAND
 25  MOBILITY OF NEW GRADUATES SUPP
 32  ECONOMIC CONDITIONS SUPPLY
 19  PRIVATE SCHOOLS/HOME SCHOOLING
 22  TEACHER BENEFITS SUPPLY
 14  FEDERAL MANDATES DEMAND
  7  POSTPONED RETIREMENT DEMAND
 30  FOREIGN-PREPARED TEACHERS SUPP
  4  FEDERAL FUNDING DEMAND
 20  CLASS SIZE DEMAND
 16  SHIFTS OF TEACHERS DEMAND
  1  FEDERAL FUNDING SUPPLY
 17  SHIFTS OF STUDENTS DEMAND
 10  HIRING OF RETIREES SUPPLY
 18  STUDENT ENROLLMENT DEMAND
 31  DISTANCE LEARNING TEACHER EDUC
 15  LIMITED ENGLISH PROFICIENT STU
  8  ROUTINE RETIREMENT DEMAND
 29  ALTERNATIVE CERTIFICATION SUPP
 33  PERSONAL CAREER SHIFTS SUPPLY
  9  EARLY RETIREMENT DEMAND
 27  INCREASING OUR TEACHER ED REQU

Figure 2. Probability Map
[Figure: map of INSTITUTIONS (left, plotted with "#" and ".") and FACTORS (right, plotted with "X") against a common MEASURE scale from -3 to 3. The top of the scale is labeled <more> for institutions and <rare> for factors; the bottom is labeled <less> and <frequent>. M, S, and T markers appear along the central axis.]
Figure 3. Item-Person Map