estimating spillovers using panel data, with an application to ......natalie goodpaster, analysis...
TRANSCRIPT
Estimating Spillovers using Panel Data, with an Application to
the Classroom∗
Peter Arcidiacono, Duke University
Gigi Foster, University of South Australia
Natalie Goodpaster, Analysis Group, Inc.
Josh Kinsler, University of Rochester
February 12, 2009
Abstract
Obtaining consistent estimates of spillovers in an educational context is hampered
by two issues: selection into peer groups and peer effects emanating from unobservable
characteristics. We develop an algorithm for estimating spillovers using panel data that
addresses both of these problems. The key innovation is to allow the spillover to operate
through the fixed effects of a student’s peers. The only data requirements are multiple
outcomes per student and heterogeneity in the peer group over time. We first show that the
non-linear least squares estimate of the spillover is consistent and asymptotically normal
as N →∞ with T fixed. We then provide an iterative estimation algorithm that is easy to
implement and that converges to the non-linear least squares solution. Using University
of Maryland transcript data, we find statistically significant peer effects on course grades,
particularly in courses of a collaborative nature. We compare our method with traditional
approaches to the estimation of peer effects, and quantify separately the biases associated
with selection and spillovers through peer unobservables.∗We thank Joe Altonji, Pat Bayer, Paul Frijters, John Haltiwanger, Caroline Hoxby, Tom Nechyba, Bruce
Sacerdote, seminar participants at Duke University, University of Georgia, Yale University, and participants at
the 2004 Labour Econometrics Workshop held at the University of Auckland and the NBER’s Higher Education
Workshop for helpful comments, as well as the University of Maryland administrators who have provided us
with the data used in this paper. Shakeeb Khan was particularly helpful. Special thanks go to Chris Giordano
and Bill Spann of the Maryland Office of Institutional Research and Planning. We also thank Graham Gower
for superior research assistance.
1
1 Introduction
The question of how peers affect student achievement underlies many debates in applied
economics. Peer effects are relevant to the estimation of the impact of affirmative action,
school quality, and public school improvement initiatives such as school vouchers, and are
central to more immediate concerns such as how best to group students to maximize learning.
However, despite this wide field of potential relevance, the empirical estimation of spillovers—
whether in the education context or elsewhere— is not straightforward.
There are at least two barriers that must be overcome when estimating spillovers on stu-
dent achievement. The first is the selection problem. When individuals choose their peer
groups, high ability1 students may sort themselves into peer groups with other high ability
students. With ability only partially observable, positive estimates of peer effects may result
even when no peer effects are present because of a positive correlation between the student’s
unobserved ability and the observed ability of his peers. Researchers have undertaken a variety
of estimation strategies to try to overcome the selection problem,2 but significant empirical
problems linger, both because researchers only have access to incomplete measures of ability
and because peer effects may operate differently when peers are chosen rather than assigned.
A second barrier is that spillovers may work in part through characteristics that are not
observed by the econometrician. The importance of peer effects may be significantly under-
stated if the primary channel through which they operate is unobserved. Peer effects through
unobservables has received little attention outside of Altonji et al. (2004). Random assign-
ment is able to circumvent the selection problem but the obstacle to estimation posed by peer
effects through unobservables remains.
We introduce a new algorithm for estimating spillovers using panel data that overcomes
both these obstacles. The key innovation is that the peer effects are captured through a linear
combination of individual fixed effects. Additional types of fixed effects, such as institutional1For ease of exposition we refer to the bundle of individuals’ performance-relevant characteristics as ‘ability.’2One set of papers uses proxy variables to break the link between unobserved and peer ability (Arcidiacono
and Nicholson (2005), Hanushek et al. (2003), and Betts and Morell (1999)). Another set of papers relies on
some form of random assignment (Sacerdote (2001), Zimmerman (2003), Winston and Zimmerman (2003),
Foster (2006), Lehrer and Ding (2007), and Hoxby (2001)). Finally, researchers have tried to circumvent the
endogeneity problem with instrumental variables (Evans et al. (1992)).
2
effects, can also be incorporated into the algorithm. Thus our method, while originally devel-
oped for use in social effects research, can also be used in applications where fixed effects of
large dimension and of potentially multiple types must be estimated.
Constructing the spillover as a linear combination of individual fixed effects results in
a non-linear optimization problem. Estimating individual unobserved heterogeneity in non-
linear panel data models often results in biased estimates of the key parameters of interest—
the incidental parameters problem.3 As N goes to infinity for a fixed T , the estimation error
for the individual effects often does not vanish as the sample size grows, contaminating the
estimates of the parameters of interest.4 We show, however, that the nonlinear least squares
estimate of the spillover parameter is consistent and asymptotically normal as N → ∞ with
T fixed, even though the fixed effects themselves are not consistent.5
While nonlinear least squares yields consistent estimates of the spillover parameters, the
dimensionality of the problem renders nonlinear least squares infeasible. We develop an it-
erative algorithm that, under certain conditions, produces the same estimates as nonlinear
least squares. The algorithm toggles between estimating the individual fixed effects and the
spillover parameters. Each iteration lowers the sum of squared errors, with a fixed point
reached at the nonlinear least squares solution to the full problem.
As a precursor to the exposition of our full spillover model, we show how our iterative
estimator can be used to identify multiple types of fixed effects in contexts where no spillovers
are being estimated. Then, we customize the model for application to peer effects in education,
and estimate the customized model using student-level data from the University of Maryland.
Six semesters of transcript data are available covering the semesters from the spring of 1999
to the fall of 2001. We observe grades for every class each student took over the course of this
period as long as the student lived on campus during any one of the six semesters.
We estimate the model separately for each of three types of courses where ability is allowed3Neyman and Scott (1948) were the first to document the incidental parameters problem.4Hahn and Newey (2004) provide two methods of bias correction: a panel jackknife and an analytical correc-
tion. Woutersen (2002) and Fernandez-Val (2005) consider estimators from bias-corrected moment conditions.
In a similar vein, Arellano and Hahn (2006) and Bester and Hansen (2005) consider bias-correcting the initial
objective function.5Other special cases where the incidental parameters problem does not require a bias correction are Manski
(1987), Honore (1992), and Horowitz and Lee (2004).
3
to vary by course type. We find significant peer effects which vary by course type. A one
standard deviation increase in peer ability yields average returns similar to those from between
a 3 percent and a 11 percent of a standard deviation increase in own ability, depending upon
the course type and specification. The lowest returns are found in math and science with the
highest returns found in the social sciences.
Our model allows us to quantify selection both within and across course types. Within
course types, we compare the amount of selection with respect to observed and unobserved
student ability. To arrive at these measures, we decompose each of the estimated student fixed
effects, or what we label total ability, into an observed and an unobserved component using
typical observed ability measures, such as SAT scores and high school performance. For all
course types we find greater selection on unobserved ability than observed ability. However,
selection is highest when measured using total ability, a reflection of the significant correlation
between peer observed and unobserved ability within a section.6 Because we estimate our
model separately for each course type, we can compare student ability in the primary course
of study to ability in other fields, thereby quantifying selection across course types. We find
strong evidence of both comparative and absolute advantage. On average students select
course types for which they are best suited. However, math and science students show greater
aptitude overall in every course type.
Finally, we compare our peer effect estimates to those that would be obtained using more
conventional methods. In particular, we examine separately the two obstacles present in
traditional peer effects estimation: selection into peer groups and the effect of peer unobserv-
ables. Controlling for selection only, which is what is accomplished using random assignment,
we show that the estimated peer effects are lower than the peer effects obtained using our
method. This is because random assignment techniques rely on incomplete measures of peer
ability. We then re-introduce selection into the model and show that the bias in the peer
effect estimate can be either positive or negative when both issues are present. For humanities
courses, the peer effect estimate continues to be biased downwards since the peer unobserv-
ables problem dominates the selection issue. The opposite is true for math and science courses
where the peer effect estimate using conventional methods is more than four times our original
estimate. The differences across course types is due in part to the much higher correlation6By construction, observed and unobserved ability are orthogonal in the population.
4
between individual observed ability and peer unobserved ability in math and science relative
to the humanities.
The remainder of the paper proceeds as follows. Section 2 discusses our iterative estima-
tion algorithm in the context of some outstanding problems in the applied microeconomics
literature. Section 3 presents the model, its properties, our estimation strategy, and evidence
from Monte Carlo simulations. Section 4 describes the University of Maryland data, and
Section 5 presents the results. Section 6 explores selection within and across course types
and Section 7 illustrates the biases associated with traditional peer effect measures. Section
8 concludes.
2 Estimating Models with Multiple Types of Fixed Effects
An outstanding problem in the applied microeconomic literature is how to estimate models
containing multiple types of fixed effects where each set of fixed effects is of a large dimension.
We begin with this econometric problem since the iterative method we employ solves the issue
of multiple fixed effects en route to estimating spillovers. We then present our full spillover
model in the next section. We focus on two papers in particular: Rivkin et al. (2005) and
Abowd et al. (1999), to illustrate the difficulties in estimating large numbers of fixed effects.
Rivkin et al. (2005) model gains in test scores as a function of the observed characteristics
of the students, Xi, and teacher fixed effects, δj , where i indexes individuals and j indexes
teachers. The change in test scores from time t− 1 to t given that the individual has teacher
j at time t, ∆Y, is then modeled as:
∆Y = αXi + δj + εit (1)
Note that Xi includes characteristics of the students that do not vary over time. However,
Xi may not include the full set of individual characteristics that are relevant for achievement
gains, and the omitted variables may be correlated with the δj ’s due to streaming of students
and/or systematic selection of certain teachers into classrooms with higher- or lower-ability
students. As an alternative, we could estimate the model with both student and teacher fixed
effects:
∆Y = αi + δj + εit (2)
5
However, estimating both sets of fixed effects simultaneously would be infeasible given the
large number of students and teachers in their data.
Abowd et al. (1999) are interested in estimating models that include both firm and worker
fixed effects.The most basic model they are interested in estimating contains just individual
and firm-specific effects regressed on log earnings. Labeling log earnings for individual i in
firm j at time t as Y, the outcome equation is:
Y = αi + δj + εit (3)
A more interesting case occurs when there are tenure effects that vary across firms. For
simplicity, assume that the effects of tenure are linear. Labeling X as the amount of tenure
individual i has in firm j at time t, the outcome equation is:
Y = αi + δj + φjX + εit (4)
where φj is the firm-specific return to tenure. Abowd and Kramarz (1999) recognize that with
over 1 million workers and 500,000 firms, they cannot estimate the above equation directly.
Instead, they consider a number of estimation techniques, none of which results in least squares
estimates of the firm and worker fixed effects without imposing additional assumptions on the
data generating process.
Our approach yields least squares estimates of both firm and worker effects in a compu-
tationally feasible way without imposing any extraneous orthogonality conditions. Consider
the following regression equation which nests the models from both Rivkin et al. (2005) and
Abowd and Kramarz (1999):
Yijt = αi + βXit + δj + γjZ + εijt (5)
Estimating this equation by OLS solves:
minα,γ
N∑
i=1
T∑
t=1
(Yijt − αi − βXit − δj − γjZ)2 (6)
Minimizing this function in one step remains infeasible as a result of the large number of
firms and workers. Instead, we propose an iterative method that yields OLS estimates of the
parameters of interest while circumventing the dimensionality problem. Given starting values
for the α’s and the β’s, the algorithm iterates on two steps with the qth iteration following:
6
• Step 1: Conditional on αq−1 and βq−1, estimate δq and γq by OLS.
• Step 2: Conditional on δq and γq, estimate αq and βq by OLS.
The process continues until the parameters converge. Because the sum of squared errors is
decreased at each step, we will eventually converge to the parameter values that minimize the
least squares problem in (6), regardless of which pair of parameters we guess first to start the
algorithm. The primary advantage of our method in applications such as those described in
this section is that it is capable of estimating extremely large sets of fixed effects in a reasonable
amount of time. The model becomes significantly more complicated when the outcomes are
allowed to depend on functions of the fixed effects themselves. We discuss this enhancement
now.
3 Estimating Spillovers with Panel Data
In this section we present a model and estimation strategy for measuring achievement spillovers
using student fixed effects, building upon the iterative technique outlined in the prior section.
The model is constructed keeping in mind that our application will be measuring peer effects
in college, where we are interested in the interactions that occur within discussion sections in
large classes.
We first consider a case where one’s outcome depends only on one’s own fixed effect as well
as the fixed effects of the other individuals in a defined peer group. We show that it is possible
to obtain consistent estimates of the spillover and that there is a computationally cheap way of
obtaining the solution. Although we focus on the case where the spillover effect is homogeneous
across individuals, all proofs go through when the spillover is allowed to vary provided the
number of parameters governing the spillover effect does not vary with the sample size. We
then show how the model is easily expanded to incorporate correlated effects, highlighting
again the advantages of iterative estimation in solving the dimensionality problem. Finally,
we provide Monte Carlo evidence that the small-sample behavior of our iterative estimator is
also appealing. All proofs are in the appendix.
7
3.1 Baseline Model
Our baseline model has individual i’s outcome at time t, Yit, depending upon his own observed
and unobserved permanent characteristics, Xi and ui, the observed and unobserved permanent
characteristics of each of the other students in his peer group, and a transitory error εit. Define
Iijt as an indicator function that takes on a value of one if individuals i and j are in the
same peer group at time t, and zero otherwise, and let N be the number of students in the
population. Our baseline specification is then:
Yit = Xiβ1 + uiβ2 +N∑
j 6=i
Iijt (Xjγ1 + ujγ2) + εit (7)
The specification in (7) is restrictive along a number of dimensions. In the language of
Manski (1995), there are no endogenous effects as Yjt has no direct effect on Yit for any
j, and there are no correlated effects as there are no variables to capture the commonality
of the environment faced by all members of student i’s time-t peer group. Further, there
are no accumulation effects as spillovers today do not affect outcomes tomorrow. However,
even in this special case estimation is still problematic when peer groups are chosen. In
particular, there may be correlation between ui and the sum of observed peer characteristics,
leading to biased estimates of γ1. Also, we will not be able to capture the peer influence
through unobservables, meaning that γ2 is inestimable without further assumptions. While
random assignment can remove the correlation between ui and observed peer characteristics,
the inability to capture spillovers through unobservables remains.7
We next make one additional assumption: the relevance to outcomes of peer characteristics
is proportional to that of own characteristics, meaning that we can write γ1 and γ2 as
γ1 = γβ1
γ2 = γβ2
This implies, for example, that if two dimensions of an individual’s ability are equally im-
portant in their effect on Yit, then those two dimensions of peer ability will also be equally
important in determining Yit. This same assumption is used in Altonji et al. (2004).7Using random assignment to identify the spillover also disregards the possibility that spillovers operate
differently in selected versus randomized contexts.
8
Now define αi as:
αi = Xiβ1 + uiβ2
We can then rewrite equation (7) as:
Yit = αi +N∑
j 6=i
Iijt(γαj) + εit (8)
An individual’s outcome is then a function of the individual’s fixed effect plus a linear function
of the fixed effects of the other students in the peer group. Using fixed effects in this way
allows us to abstract from many other covariates that may affect student outcomes. All of
the heterogeneity in fixed student characteristics that might affect student outcomes, such as
birth cohort, sex, IQ, or race is captured with this one measure.
Our goal is then to show the properties of the solution to the non-linear least squares
problem:
minα,γ
N∑
i=1
T∑
t=1
Yit − αi −
N∑
j 6=i
Iijt(γαj)
2
(9)
Given the non-linearities present in the problem, one may suspect that for fixed T , the es-
timates of γ’s will be biased as a result of the incidental parameters problem. However, we
show that under standard assumptions this is not the case.
Theorem 1. Under the following assumptions, the estimates of γ from solving the minimiza-
tion problem in (9) are√N consistent and asymptotically normal for fixed T as N → ∞
where N is the number of peer groups:
1. εit⊥εjs ∀ j 6= i, t 6= s
2. εit⊥αj ∀ i, j
3. Var(εit) is finite and constant across i and t
4. Var(αi) is finite
The structure of the proof relies on solving each of the α’s as a function of the data and γ
and then substituting these functions in for the α’s in (9). Minimizing with respect to γ alone
then yields the result.
9
While concentrating out the α’s is useful for proving consistency, the formulas are quite
cumbersome and difficult to calculate. Directly solving (9) is also generally not possible
because of the dimensionality of the problem. Instead, we consider an iterative estimation
strategy that both circumvents the dimensionality problem and yields the same solution as
direct maximization.
To implement the algorithm, consider the first order condition of the nonlinear least squares
problem with respect to αi:
0 =T∑
t=1
Yit − αi −
N∑
j 6=i
Iijt(γαj)
+
N∑
j 6=i
Iijtγ
Yjt − αj −
K∑
k 6=j
Ijkt(γαk)
(10)
Solving for αi yields:
αi =
∑t
[Yit −
∑j 6=i Iijtγαj +
∑j 6=i γ
(Yjt − αj −
∑k 6=i,j Ijktγαk
)]
T +∑
t
∑j 6=i Iijtγ2
(11)
Note that we have pulled out the αi terms from the last term in (10) to derive (11). We
establish in Theorem 2 the conditions under which, given any initial set of α’s, repeatedly
updating the α’s using (11) yields a fixed point.
Theorem 2. Denote f(α) as a function mapping from RN → RN where the ith element of
f(α) is given by the right hand side of (11) ∀i ∈ N . A sufficient condition for f(α) to be a
contraction mapping is that the maximum of γ/M is less than 0.4, where M is the number of
other individuals in each student’s peer group.
The restriction on the maximum value γ is needed to ensure that the feedback effects are
not too strong. With Theorem 2 giving a solution method for the α’s conditional on the γ’s,
our algorithm iterates on estimating the α’s using f(α) (taking the γ’s as given), and then
estimating the γ’s taking the α’s as given. Each of these two steps lowers the sum of squared
errors and, analogous to the estimator in Section 2, converges to the nonlinear least squares
solution. In practice, we have found that the algorithm performs substantially faster if the
α’s are only updated until the sum of squared errors falls before moving on to re-estimating
γ.8 To summarize, the algorithm is started with an initial guess for the α’s and iterates on
two steps until convergence, with the qth iteration given by:8For most iterations of our models updating the α’s just once led to a decrease in the sum of squared errors.
10
• Step 1: Conditional on αq−1, estimate γq by OLS.
• Step 2: Conditional on γq, update αq according to (11).
3.2 Correlated Effects
We now extend the model to allow for correlated effects. If each individual peer group is
exposed to a different environment, it is impossible to separate the correlated effects from the
exogenous effects without further parameterizations. A restriction that is easily imposed for
correlated effects in our context is a component to one’s grade that is course-specific. This is
important as grade inflation and where the curve lies will affect one’s final grade irrespective
of own and peer ability. Denote Iict as an indicator for whether individual i is in course c at
time t where c ∈ [1, C]. Placing course fixed effects into the achievement equation yields:
Yit = αi +N∑
j 6=i
Iijt(γαj) +C∑
c=1
Iictδc + εit (12)
The δc’s are then the course fixed effects used to capture correlated effects since all peer groups
are formed within courses.
The non-linear least squares problem we are now interested in solving takes the following
form:
minα,γ,δ
N∑
i
T∑
t=1
Yit − αi −
N∑
j 6=i
Iijt(γαj)−C∑
c=1
Iictδc
2
(13)
While the consistency of γ is unchanged regardless of whether we include other time-varying
regressors, it is particularly clear here since we can re-write the above expression without δc
by demeaning the dependent variable at the course level.
The first order condition with respect to αi changes to reflect the presence of the course
fixed effects:
αi =
∑t
[Yit −
∑c Iictδc −
∑j 6=i γαj +
∑j 6=i Iijtγ
(Yjt − αj −
∑c Iictδc −
∑k 6=i,j Ijktγαk
)]
T +∑
t
∑j 6=i Iijtγ2
(14)
The updating rule for the α’s then follows directly from (14), once α terms are collected for
each student.
For a given set of α’s and γ’s, the course fixed effects can be calculated according to:
δc =
∑i
∑t Iict
(Yit − αi −
∑j 6=i Iijtγαj
)∑
i
∑t Iict
(15)
11
The estimation strategy is then the same as without correlated effects, with one additional
step. As before, each step of the estimation decreases the sum of squared errors. The algorithm
is set by an initial guess of the α’s and δ’s and then iterates, with the qth iteration following:
• Step 1: Conditional on αq−1 and δq−1, estimate γq by OLS.
• Step 2: Conditional on δq−1 and γq, update αq according to (14).
• Step 3: Conditional on αq and γq, update δq according to (15).
Adding other types of fixed effects simply adds additional steps to the estimation, with each
set of fixed effects updated in an additional step. For example, had data been available on
teaching assistants, it would have been possible for us to control for teaching assistant effects
as well.
3.3 Monte Carlo Simulations
To investigate the properties of our iterative estimator in the presence of correlated effects, we
now run simulations using different assumptions about the empirical setting. In each setting,
the model is simulated using 10,000 students. We simulate the model 100 times under various
states of the world constructed by varying four dimensions of the problem:
1. Observations per student- The number of outcomes observed per student varies across
simulations between 2, 5, and 10. 5 is the maximum number of observations a researcher
may have when analyzing grade school or high school test score data, and 10 observa-
tions is more likely when analyzing grades achieved in university-level courses. More
observations per student implies more accurate measurement of the α’s.
2. Students per peer group- The number of students per peer group varies across simulations
between 2 and 10. 2 is the minimum number of observations required to identify a
spillover in this type of model. 10 students per peer group is in the range of what might
be observed in typical classroom-based data sets.
3. Selection into classes- To show that our estimator solves the selection problem, we
simulate the model under alternative assignment rules. Under random assignment, the
12
average standard deviation of the α’s within a peer group equals the standard deviation
of α in the population. We also simulate the model with selection such that the average
standard deviation of the α’s within a peer group is 75% of the population standard
deviation. The selection level in the non-random assignment simulation corresponds
with the sorting observed in the Maryland transcript data.9
4. Transitory component- The noisier the outcome measure, the noisier the estimates of
the α’s will be. The distribution of the α’s is set at N(0,1). The ε’s are distributed with
mean zero and standard deviation equal to 1.15 or 1.95.
The common group-level shock used to model the correlated effect is not statistically associated
with the abilities of students in the classes. However, students are sorted into classes based
upon ability. Thus, the average standard deviation of abilities within a class is smaller than
standard deviation of abilities in the population.
To verify first that the estimator does not give positive peer effects when no peer effects
are present, Table 1 shows simulation results when the true peer effect, γ, is zero. In all cases,
regardless of the number of observations per student, the level of selection, and the noisiness
of the outcome measure, γ̂ is precisely estimated around zero. There is no evidence of bias
when the true peer effect is zero.
Table 2 documents the model’s performance when the true value of γ is 0.15. Again,
regardless of assignment procedure or section size, γ is centered around the truth. However,
two interesting patterns emerge in the estimates and standard errors of γ. First, γ is more
precisely measured when students are randomly assigned to sections. Selection in this case
can be thought of as occurring at two levels: the class level and the section level. These
results reflect the fact that selection at the class level confounds the estimate of the correlated
effect and reduces the precision of the section-level peer effect estimate. In fact, if selection
occurred only at the section level, the peer effect estimates would be more precise than in the
random assignment case (ceteris paribus), since there would be greater variation in peer ability.
Second, as the peer group size increases, the precision of γ decreases.10 This is again related9In our data, the standard deviation of ability within peer groups is between 70% and 77% of the standard
deviation of ability in the population depending upon the course type.10This can be seen in Table 2 since as the number of observations per student increases, we should naturally
see an increase in precision—yet we do not, because peer group size increases as well, driving standard errors
13
to the variation in peer ability across sections. With smaller section sizes, other things equal,
there is greater variation in peer ability across sections which yields more precise estimates of
the spillover.
4 Data
With the model producing consistent estimates of the spillover parameter and performing well
in our Monte Carlos, we now turn to the data used in estimation. The administrative data
set used in this paper covers all undergraduates observed residing in University of Maryland
on-campus housing during any of the following six academic semesters: Spring 1999, Fall
1999, Spring 2000, Fall 2000, Spring 2001, or Fall 2001. The data set includes students living
off-campus in a given semester as long as they were observed living on campus during at least
one of the six semesters. 90% of University of Maryland entering freshmen live on campus
in their first semester,11 so the data set includes at least 90% of the University of Maryland
undergraduate population who began study sometime in the six-semester period. There is a
less complete representation for upperclassmen, some of whom entered before our observation
period and may not have lived on campus during the period. However, our identification of peer
effects comes from large, multi-section courses in which freshmen predominate.12 A “section”
in the American undergraduate context is a subset of students from an entire course that
meets together formally at least once a week. In these smaller groups, greater communication
and interaction is expected of students. Our section-level spillover captures how the ability of
other students in the same section impacts any given student’s eventual grade in a course.
To generate the student-section-level sample used to estimate our models, we first placed
two major restrictions on the data set: students had to have valid A through F grade informa-
tion for the given section, and they could not be the only student observed in the section that
semester.13 These restrictions yielded a sample of 300,640 student-section observations. We
up. The negative association of peer group size and precision, as well as all other relationships discussed here,
have also been verified in numerous additional Monte Carlo exercises; results are available upon request from
the authors.11This number is taken from publicly-available statistics posted on the University’s web page.12In tests of whether classes that were under-represented had lower estimated peer effects than those with a
complete representation, we found no meaningful differences.13Numeric grade equivalents were assigned as follows: A = 4; B = 3; C = 2; D = 1; and F = 0. Students who
14
then deleted all observations on sections that were not in one of three well-defined academic
subgroups: (1) humanities (86,844 observations), (2) social sciences (77,312 observations), and
(3) hard sciences and mathematics (82,675 observations).14 This left a combined sample of
246,831 student-section observations, representing 18,511 individual students. Sample sizes
are provided in Table 3.
Finally, while our method does not require the presence of observable characteristics about
individuals, the data set to which we apply it does offer an array of observable measures about
each student. Later in the paper, we use the following information about students in further
analysis of our models’ results: SAT math and SAT verbal scores; high school grade point
average; sex; race; whether the student was included in the honors program; whether the
student participated in sports; and whether the student was an in-state student (as opposed
to out-of-state or international).
5 Estimates of Classroom Spillovers
We now turn to our model specifications and estimates. We first describe and estimate a model
that restricts the peer effect such that the spillover only depends upon the mean ability in the
section. The second specification allows the size of the spillover to depend upon one’s own
characteristics. For example, those with high SAT verbal scores may receive higher benefits
from their peers than those with low SAT verbal scores. With the results of the two models in
hand, we then show how predictable ability is given observable measures such as SAT scores,
high school grade point average, and demographics.
withdrew from a course, audited, or received a non-letter grade (such as Pass) were excluded from the sample
due to concerns that they might not have been present during sections and classes. If two separate grades were
recorded for the student for a given section, the highest grade was used.14Excluded courses include those that are generally more vocationally-oriented, but very diverse; for example,
journalism, nutrition and food science, landscape architecture, and library science. Because our model estimates
a homogeneous underlying ability for each course type, we did not include these courses in a separate category
due to our concern that the underlying ability necessary to succeed in them is not sufficiently homogeneous
across the category.
15
5.1 Homogeneous Gamma Model
With c and t indexing courses and sections, we have the same specification as in equation
(12) except that now, with the number of individuals in each section varying, we restrict the
spillover to depend upon the mean fixed effect of the other individuals in the same section of
a course:
Yit = αi +N∑
j 6=i
Iijt
(γαj
Mit
)+
C∑
c=1
Iictδc + εit (16)
Because grades are assigned at the course level, there is a relationship between students who
share a course but are not in the same section that cannot be captured by the section peer
effect, γ. We might expect, for example, that if the course is graded on a curve and the
entire class is extremely able, a mediocre student’s grade may suffer. Similarly, the teacher
of a course may respond to higher average class ability by teaching in a way that enhances
learning and therefore raises grades for at least some portion of the class. By including fixed
effects at the course level, the δc’s, we can make the outcome measure comparable across
classes.
Consistent with the data section, we split courses into three types: humanities, social
sciences, and math and science. A student’s performance in each type of course will differ
according to the particular student’s strengths and weaknesses. Therefore, instead of encap-
sulating all the attributes of a student into one ability measure, we allow students to have
separate ability measures for each course type in which they are enrolled. As noted above, all
courses used in our analysis are classified as belonging to one of the following course types:
humanities, social sciences, or math and science. We estimate an independent ability measure
for each type of course for each student, conditional on the student’s enrollment in at least
one class within that course type.15 Another supporting rationale for the empirical division
into course types is that the amount of interaction, and therefore the size of the peer effect,
may differ by course type.
Adjusting equation (16) to incorporate abilities and peer influence that vary by course15Information as to which courses were assigned to which course types is available from the authors upon
request.
16
type yields:
Yikt = αik +N∑
j 6=i
Iijt
(γαjk
Mit
)+
C∑
c=1
Iictδc + εikt (17)
where k indexes course type. This model is run through our algorithm separately for each type
of course, yielding three sets of peer and class effects estimates, as well as separate student
ability measures for each course type taken. Note that while classes with only one section do
not help in the identification of γ directly, these classes are still useful in identifying the α’s.
Table 4 shows the results from estimating equation (16) for each of the three types of
courses. The results indicate positive and significant section peer effects for all course types.
The magnitudes of the section-level peer effects suggest that peer effects are most important
in the social sciences and least important in math and science. This pattern may reflect the
amount of collaborative work required in each course type as well as the differing amounts of
discussion that occur in the sections.
In order to understand the importance of peer ability relative to own ability, we need to
take into account the differences in variation of peer and own ability. There is likely to be
less variation in peer ability than in own ability as peer ability averages over a cross-section
of students leading to some heterogeneity canceling out. The first two columns of Table 5
show the standard deviation of mean peer ability and the standard deviation of individual
ability, respectively. The third column then shows the fraction of a standard deviation of own
ability that is equivalent, in terms of its effect on grades, to a one-standard-deviation increase
in peer ability. This is calculated by dividing the standard deviation of mean peer ability
by the standard deviation of individual ability and multiplying this number by the estimated
gamma.
The gap evident in the raw marginal effects between math and science and the other
course types is somewhat mitigated because there is relatively more heterogeneity in peer
ability in math and science courses than in humanities or social science courses. A one-
standard-deviation increase in peer ability is shown to be equivalent to a maximum of 9% of
the effect of a one-standard-deviation increase in individual ability in the social sciences, and
to a minimum of 3% of the effect of a one standard deviation increase in individual ability in
math and science courses.
17
5.2 Heterogeneous Gamma Model
The assumption that each student is affected in the same manner by their classmates is restric-
tive and not as interesting from a policy perspective. In particular, the linear-in-means model
implies that, in terms of grades, any winners from reshuffling peers are perfectly balanced
by those who lose from the reshuffling.16 To relax this assumption, we apply our iterative
estimator to an enhanced spillover model in which peer effects are allowed to vary with the
observable characteristics of the student:
Yikt = αik +N∑
j 6=i
Iijtαjk
Mit(Xiγ) +
C∑
c=1
Iictδc + εikt (18)
where X includes whether the student is female, whether the student is Asian or another race,
as well as the student’s SAT math and SAT verbal scores.
Table 6 shows the results from this heterogeneous peer effects model. The qualitative
results for humanities and social sciences are the same. Relative to white males, Asians see
less of a return to peer ability while females and other non-white students see higher returns.
Both SAT math and SAT verbal scores are associated with higher returns to peer ability.
While Asians in math and science again see lower returns to peer ability, females and other
non-whites also see lower returns relative to their white male counterparts. The interaction
of the peer effect with SAT verbal score is once again positive, but the sign on the SAT math
interaction is now negative.
The differences in the SAT interactions across fields suggest that two competing forces
may be at play. First, those with higher test scores may have skills that make them better
able to benefit from their peers. Working against this, however, is that there is more scope
for students to benefit the lower they are in the ability distribution. SAT verbal and math
scores may not be highly correlated with the ability to perform well in the humanities and
social sciences implying that the first effect dominates in these course types. However, the
SAT math score may be highly correlated with the ability to perform well in math and science
classes, leading to the second effect dominating.
Averaging across all individuals within a course type, the relative magnitude of the peer
effects is unchanged from the homogenous gamma model. Peer ability is most important in16See Hoxby and Weignarth (2005) for a discussion.
18
social science courses and least important in math and science courses. However, the overall
magnitude of the peer effects is significantly higher, increasing by an average of over 40%
across course type. Relative to a one-standard-deviation increase in own ability, the effects of
a one-standard-deviation increase in peer ability are also higher in the heterogeneous gamma
model as shown in the fourth column of Table 7. The ratio of the effects of a one-standard-
deviation increase in mean peer ability to a one-standard-deviation increase in own ability
range from a low of 3.5% for math and science to a high of 10.5% in the social sciences.
5.3 Analysis of Ability
Next, we explore the extent to which the fixed effects from our iterative algorithm are pre-
dictable using the observed proxies for ability consistently used in related literature, and to
what extent the fixed effects estimated using the two methods differ. In order to facilitate
this comparison, we use SAT scores, high school performance, and a host of other observable
student attributes as regressors to construct a conglomerate observable measure of ability.
This approach is analogous to the creation of an academic index (as employed by Sacerdote
(2001)). For each course type, we regress our estimated student fixed effects on an array of
previous performance measures and demographics. These results are presented Table 8.
Columns 1 through 3 of Table 8 show results when we use the fixed effects estimated in
our homogeneous gamma model as the dependent variable. The statistical significance and
magnitudes of the coefficients on SAT math and SAT verbal scores vary across the four dif-
ferent course types in predictable ways. For example, SAT math scores are insignificant when
explaining ability in humanities courses, but are a better proxy for math and science ability.
The opposite is true for SAT verbal scores, with higher SAT verbal scores associated with
higher ability in the humanities but uncorrelated with ability in math and science. Consistent
with Arcidiacono (2004), females outperform males across all course types. Conditional on
SAT scores and high school GPA, whites perform better than all other ethnic groups including
Asians.
The second set of columns in Table 8 shows the corresponding results for the heterogeneous
gamma model. Recalling the results found in Table 6 regarding the positive association of
SAT verbal scores with stronger peer effects across all course types, it is unsurprising that the
coefficient on own SAT verbal score is smaller and even negative for some course types when
19
predicting own ability. Previous work such as Arcidiacono (2004), Arcidiacono and Vigdor
(2007), and Arcidiacono et al. (forthcoming) have all found no returns to verbal test scores in
the labor market. Combining these findings with the results presented here suggests that SAT
verbal scores have very little to with ability in the absolute, but rather reflect how capable an
individual is at extracting rents from others.
We find that the R2 for these regressions ranges from 0.13 to 0.36 depending on the course
type and whether we use the homogeneous or heterogeneous gamma model to generate the
individual fixed effects. That these observable characteristics only explain a small portion
of our estimated ability measures suggests the possibility of large biases associated with the
unobserved ability problem when following a selection-on-observables, or random assignment,
approach. With much of peer ability unobserved, we would expect to underestimate the extent
to which outcomes are affected by one’s peers. However, selection into sections may occur
based upon one’s own unobserved ability, which in turn may be correlated with observed peer
ability. This latter effect characterizes the selection problem, and is likely to bias peer effects
estimates upward. While random assignment circumvents this latter problem, a downward
bias remains from not taking into account the impact of unobservable peer attributes on
outcomes.
6 Quantifying Selection
In this section we quantify how much selection is taking place within course types with respect
to both observed and unobserved ability.17 For ease of exposition, we refer to the predicted
values of the regressions in Table 8 as ‘observed ability’ and the residuals of those regressions as
‘unobserved ability’. Thus, by construction, observed and unobserved ability are uncorrelated
at the individual level. However, unobserved individual ability and observed peer ability
will be correlated if students sort by total ability. By decomposing ability into its observed
and unobserved components, we can calculate the correlation between unobserved individual
ability and observed peer ability which is the crux of the selection problem. We also examine
selection across course types. In particular, we can split our student sample by which course
type was chosen the most. Because we estimate separate abilities for each course type, we17See Altonji et al. (2005) for more discussion of selection on observed and unobserved factors.
20
are able to determine whether students choose to take more courses in areas where they are
comparatively more able. In addition to examining comparative advantage, we can also see
whether students who choose particular course types enjoy an absolute performance advantage
by determining if their mean fixed effects are higher on average across all course types.
6.1 Selection Within Course Types
Table 9 provides information by course type on the selection evident with respect to both ob-
served and unobserved ability. We use the underlying ability as estimated by the homogeneous
gamma model and the heterogeneous gamma model, as well as the observed and unobserved
portions of this ability. The first row for every course type shows the section-size-weighted
average of the section-wide standard deviation of the variable in question, across all sections
in the particular course type; the second row for every course type shows the simple standard
deviation of the variable in question across the sample of students taking courses of the given
course type. The third row provides the ratio of the two. The smaller the numbers in the
third row, the tighter is the distribution of the variable within sections relative to the unsorted
distribution, and therefore, the more selection is evident with respect to that variable.
This table allows a direct analysis of the comparative degree of selection evident with
respect to observed versus unobserved ability. The patterns for both the homogeneous speci-
fication and the heterogeneous specification are almost exactly the same. For all three course
types there is more selection on unobserved ability than on observed ability, with higher ratios
of section standard deviations to population standard deviations found for observed ability. In
the social sciences and in particular for the humanities there is more selection on the estimated
α’s as a whole than on either observed ability or unobserved ability separately, with the high-
est levels of selection found in math and science. These patterns are driven by the correlation
between peer observed and unobserved ability. For the homogeneous gamma specification, the
correlation coefficients between peer observed and unobserved ability are 0.03, 0.07, and 0.35
in the humanities, social sciences, and math and science, respectively. The selection on the
estimated α’s in math and science is particularly strong relative to selection on either observed
or unobserved ability, which is consistent with the high correlation between peer observed and
unobserved ability.
With the observed and unobserved ability measures it is also possible to estimate the
21
correlation between unobserved individual ability and observed peer ability, which feeds di-
rectly into the bias associated with the selection problem. The correlation coefficients for
unobserved individual ability and observed peer ability are 0.03, 0.06, and 0.20 for humani-
ties, social sciences, and math and science, respectively. The high correlation coefficient for
math and science suggests that the upward bias associated with peer effect estimation using a
selection on observables approach might be quite large. We test this hypothesis in section 7.
6.2 Selection Across Course Types
Because the vast majority of students are observed in courses of multiple types during their
tenure at Maryland, we obtain multiple estimates of ability for most students. Each estimated
ability measure is specific to courses of a particular type, by virtue of the estimation procedure,
and as such each can be assumed to reflect a skill set that is particularly in demand in that
course type. Calculating the correlations between estimated ability levels illuminates the
extent to which good performance in each of the three course types is driven by similar student
attributes as performance in the other course types, and therefore provides an empirical index
of the academic similarity of course types. For ease of exposition, we focus on the homogeneous
gamma model for the rest of the paper.
Panel A of Table 10 shows the correlation coefficients among estimated ability levels across
the three course types from the homogeneous gamma model. These correlations are created
using estimated abilities from students observed in all course types.18 The correlation coeffi-
cients are all quite large and close together, ranging from 0.65 to 0.69.19 Panel B of Table 10
displays analogous results using only the portion of estimated abilities for each student that
is predictable using our observable variables. The relative relationships amongst observable
abilities by course type are the same, with observable abilities to perform in the humanities
and in math and science being related to a lesser extent than those of the other two course type
pairings (humanities and social sciences, and social sciences and math and science). However,
the strength of the relationships is much stronger across the board with correlation coefficients18Correlations among estimated abilities were also calculated for all students who were observed in each pair
of course types. Similar coefficients resulted.19The pattern of correlations for the heterogenous gamma model is similar, although the correlation coeffi-
cients are slightly lower: 0.64 for social sciences and humanities, 0.62 for social sciences and math and science,
and 0.54 for math and science and humanities.
22
above 0.95 for observed abilities for two out of three pairings. The differences across the pan-
els suggest that ability is much more heterogeneous than can be captured by observed ability
measures.
With this information as background, Table 11 illustrates the degree to which individual
students are observed to be sorting into the types of courses for which they appear, based on
our model, to be best suited. In particular, we label a student as specializing in a particular
course type if the number of courses taken in that course type is higher than the number of
courses taken in either of the other two course types. Panel A of this table displays results
using the estimated abilities from our homogeneous gamma model standardized to a N(0,1)
distribution, and Panel B displays results using only the portion of those estimated abilities
that could be predicted based on observable characteristics, also standardized to N(0,1). The
rows correspond to the sets of students who specialize in humanities, social sciences, and math
and science, respectively. The numbers along the rows gives the mean (normalized) fixed effect
for each of the different course types.
We see two striking patterns in Panel A. First, students who specialize in humanities
courses are estimated to be less able across the board than those who specialize in either
the social sciences or math and science. Students specializing in math and science are the
most able across the board, with higher average fixed effects in each course type. Even more
interesting, students in each specialization group appear to have specialized in the area for
which they are most suited. This can be seen from the fact that the highest number in each row
is that corresponding to the course type specialized in by that sample of students–the highest
number in each row is on the diagonal.20 Panel B of Table 11, where we use only observed
ability to examine selection, also illustrates both absolute and comparative advantage, but
reflects more attenuated distributions of abilities and less heterogeneity than shown in Panel
A.20Paglin and Rufolo (1990) find similar sorting patterns using GRE exam data and transcript data from the
University of Oregon and Oregon State University. They find that students with high math ability tend to take
courses in which there is a high return to this skill.
23
7 Comparing the Method to Conventional Methods
To compare our estimated peer effects to those that would be obtained using conventional
techniques, we first conceptualize the estimation problem as follows. Given a model where the
ability of each student can be decomposed into observed versus unobserved portions, there
are two econometric obstacles to the accurate estimation of the spillover. The first obstacle
is a positive correlation between the student’s own unobserved ability and his peer group’s
observed ability, which also leads in any given sample to a correlation between the peer group’s
observed ability and the peer group’s unobserved ability. This problem leads to an upward
bias of the spillover parameter. The second obstacle is that when only observables are used
to form the peer ability measures, the underlying distribution of peer ability is attenuated,
leading to downward pressure on the estimated impact of a one-standard-deviation change in
peer ability.
To examine the quantitative impact of these two problems separately, we first artificially
eliminate the correlation between the student’s own unobserved ability and the peer group’s
observed ability, by differencing out our estimated individual fixed effects and course effects
from student grades. We regress these adjusted grades on observed peer ability, rather than
total peer ability. This enables us to examine the consequences for estimation of using an
incomplete measure of peer ability in a case where the link between individual unobservables
and peer observables is broken.
Row 2 of Table 12 for each course type presents the results of this first exercise, where
we use total ability (our estimated α’s) for the individual and observed ability for peers.
For comparison, Row 1 of the table for each course type gives the original spillover estimate
produced using our algorithm. Looking at Column 3 of the table, we can see that the estimated
effects of a one-standard-deviation increase in observed peer ability are at most two-thirds the
size of a one-standard-deviation increase in total peer ability. This first-order decrease in effect
magnitude is evident because the impact of unobserved peer ability is not captured in Row 2
except through the correlation between observed and unobserved peer ability.
However, under random assignment, a one-standard-deviation increase in peer ability
would produce even smaller effects than those shown in Row 2 of Table 12 for each course
type. There are two reasons for this. First, peer observed ability and peer unobserved ability
24
are positively correlated in our data, but would not be correlated under random assignment.21
This leads to an upward bias on our estimates of the spillover parameters: the estimated
coefficients in the second row for each course type are all higher than those in the first row.
The largest percentage increase in the peer ability coefficient is for courses in math and sci-
ence, where the correlation between observed and unobserved peer ability is the highest. The
second reason why random assignment will lead to even lower estimates of a one-standard-
deviation increase in peer ability is that random assignment itself leads to less heterogeneity
in mean peer ability across sections when higher ability students choose sections with other
high ability students. This attenuation in the distribution of peer ability means that a one-
standard-deviation increase in peer ability will be smaller under random assignment than in
a self-selected context.
We next investigate what happens when the econometrician additionally assumes that
students select into peer groups only based on observable characteristics (the “selection on
observables” approach). In Row 3 of Table 12, we again use observed peer ability, but now
we use observed ability for the individuals as well rather than the estimated fixed effects. The
positive correlation between unobserved individual ability and observed peer ability biases the
selection-on-observables estimate of the spillover parameter upward. While the estimate of the
spillover parameter is biased upward, the effect of a one-standard-deviation increase in peer
ability may still be smaller because the variance in observed peer ability is smaller than the
variance in peer ability as a whole. As can be seen in the third column, this is indeed the case
for humanities, the course type with the smallest correlation between unobserved individual
ability and observed peer ability. For the social sciences, the selection-on-observables estimate
of a one-standard-deviation increase, although higher than our original estimate, is still closer
than the estimates given in the second row that mitigate the selection problem. However,
for math and science the estimated effect of a one-standard-deviation increase in peer ability
is significantly higher using the selection-on-observables approach than using our algorithm.
This is driven by (1) the high correlation between unobserved individual ability and observed
peer ability in math and science; (2) the fact that observed ability is a greater fraction of21Recall that observed and unobserved ability are uncorrelated by construction at the individual level. Under
random assignment, observed and unobserved peer ability will also be uncorrelated. The positive correlation
in observed and unobserved peer ability in our data stems from student sorting based upon total ability.
25
total ability in math and science; and (3) the fact that the underlying peer effect estimate
from our method is smallest in math and science, which mitigates the underestimation of a
one-standard-deviation increase in peer ability.22
8 Conclusion
Accurate estimation of peer effects in the classroom is plagued by at least two issues, both
of which have to do with ability not being fully observed. First, there is selection into the
peer group which leads to a positive correlation between unobserved individual ability and
observed peer ability. If ignored, this correlation leads to biased upward estimates of peer
effects parameters. On the other hand, underestimation of the effects of peers may result from
ignoring peer effects through unobservables.
We present a new iterative method for estimating educational peer effects that overcomes
both these obstacles. All that is required is that there are multiple observations per student
with the peer group changing over time. We then control for individual effects and allow the
peer effect to operate through a linear combination of the other individual fixed effects. We
show that our estimator is consistent and asymptotically normal for fixed T as N goes to
infinity. We also develop an iterative algorithm that is computationally much cheaper than
direct non-linear least squares minimization yet produces the non-linear squares results upon
convergence. Monte Carlo results suggest that the model performs quite well even when the
number of observations per student is small.
We estimate the model on transcript data from the University of Maryland. Small but
significant peer effects are found, with evidence of heterogeneity by course type. Social science
courses show the largest peer effects, whereas grades in math and science courses rely least on
peer ability and most heavily on a student’s own ability.
Previous efforts to estimate spillover effects in education that do not rely on random
assignment are often plagued by concerns regarding selection on unobservables. The data
suggest that this is a valid issue. Students select into sections based more on unobservable
factors than on observable factors. This leads to correlation in unobserved own ability and
observed and unobserved peer ability that, if ignored, biases the spillover parameter upward.22Indeed, if the spillover parameter were zero, there would be scope for the attenuation effect.
26
There is also much selection across course type. Students sort into course types where their
relative abilities are highest, suggesting comparative advantage is important in the selection
of courses. However, absolute advantage is also present as those who primarily choose math
and science course are more able in both humanities and social sciences than those who choose
to specialize in one of the other ares.
Our method allows us to quantify the effects of both the selection problem and the prob-
lem of not being able to estimate peer effects through unobservables. Our different course
types illustrate how the setting dictates which of these problems is more important. For
math and science courses, the estimated spillover parameter from our model is small. This,
coupled with much selection into math and science courses, leads to estimates from a selection-
on-observables approach that significantly overstates the importance of peers. However, for
humanities courses the estimated spillover parameter is larger than in math and science. Cou-
pled with much less selection than in math and science, the selection-on-observables approach
places similar importance on peers to what is estimated from our model. This is in contrast
to random assignment, which removes the correlation between individual unobserved ability
and peer observed ability but ignores peer effects through unobservables, leading to random
assignment significantly understating the impact of peers on achievement.
There are many avenues to be explored in future research. First, future research will relax
the assumption that an individual’s ability to help others is proportional to an individual’s
own ability to perform well. The individual who asks the clarifying question may be more
useful to others than a smarter individual who remains quiet. It is possible to extend the
model such that spillover ability is treated differently from the ability to perform well for
oneself.
Second, peer effects here are purely transitory. Future work will relax this assumption
by allowing the effect of peer ability in a particular course to decay over time, as well as
to influence performance in other contemporaneous classes. This will help us determine the
long-run impact of peer groups on educational achievement, and may result in higher peer
effect estimates as we account for spillovers beyond the classroom.
Finally, rather than separately estimating ability in each course type, the model could be
expanded to allow a factor structure on ability and allow the returns to the different abilities
to vary by course type. This expansion would allow a better exploitation of large data sets,
27
such as ours, with outcomes that have heterogeneous determinants, since performance on all
outcome measures collectively would be used to estimate individual abilities.
28
References
Abowd, J. M. and Kramarz, F.: 1999, “The Analysis of Labor Markets using Matched
Employer-Employee Data”, in O. C. Ashenfelter and D. Card (eds), Handbook of Labor
Economics, Vol. 3B, Elsevier Science B.V., chapter 40, pp. 2629–2710.
Abowd, J. M., Kramarz, F. and Margolis, D. N.: 1999, “High Wage Workers and High Wage
Firms”, Econometrica 67(2), 251–333.
Altonji, J. G., Elder, T. E. and Taber, C. R.: 2005, “Selection on Observed and Unobserved
Variables: Assessing the Effectiveness of Catholic Schools”, Journal of Political Economy
113(1), 151–184.
Altonji, J. G., Huang, C.-I. and Taber, C. R.: 2004, “Estimating the Cream Skimming Effect
of Private School Vouchers on Public School Students”. Working Paper.
Arcidiacono, P.: 2004, “Ability Sorting and the Returns to College Major”, Journal of Econo-
metrics 121(1-2), 343–375.
Arcidiacono, P., Cooley, J. and Hussey, A.: forthcoming, “The Economic Returns to an MBA”,
International Economic Review .
Arcidiacono, P. and Nicholson, S.: 2005, “Peer Effects in Medical School”, Journal of Public
Economics 89(2-3), 327–350.
Arcidiacono, P. and Vigdor, J.: 2007, “Does the River Spillover? Estimating the Economic
Returns to Attending a Racially Diverse College”. Working Paper.
Arellano, M. and Hahn, J.: 2006, “A Likelihood-based Approximate Solution to the Incidental
Parameter Problem in Dynamic Nonlinear Models with Multiple Effects”. Unpublished
Manuscript.
Bester, C. A. and Hansen, C.: 2005, “A Penalty Function Approach to Bias Reduction in
Non-linear Panel Models with Fixed Effects”. Unpublished Manuscript.
Betts, J. R. and Morell, D.: 1999, “The Determinants of Undergraduate Grade Point Average:
The Relative Importance of Family Background, High School Resources, and Peer Effects”,
Journal of Human Resources 32(2), 268–293.
29
Evans, W. N., Oates, W. E. and Schwab, R. M.: 1992, “Measuring Peer Group Effects: A
Study of Teenage Behavior”, Journal of Political Economy 100(5), 966–991.
Fernandez-Val, I.: 2005, “Estimation of Structural Parameters and Marginal Effects in Binary
Choice Panel Data Models with Fixed Effects”. Unpublished Manuscript.
Foster, G.: 2006, “It’s Not Your Peers, and it’s Not Your Friends: Some progress toward
understanding the peer effect mechanism”, Journal of Public Economics 90, 1455–1475.
Hahn, J. and Newey, W.: 2004, “Jackknife and Analytical Bias Reduction for Nonlinear Panel
Models”, Econometrica 72, 1295–1319.
Hanushek, E. A., Kain, J. F., Markman, J. M. and Rivkin, S. G.: 2003, “Does Peer Ability
Affect Student Achievement?”, Journal of Applied Econometrics 18(5), 527–544.
Honore, B.: 1992, “Trimmed LAD and Least Squares Estimation of Truncated and Censored
Models with Fixed Effects”, Econometrica 60, 533–565.
Horowitz, J. L. and Lee, S.: 2004, “Semiparametric Estimation of a Panel Data Proportional
Hazard Model with Fixed Effects”, Journal of Econometrics 119, 155–198.
Hoxby, C.: 2001, “Peer Effects in the Classroom: Learning from Gender and Race Variation”.
NBER Working Paper 7867.
Hoxby, C. and Weignarth, G.: 2005, “Taking Race Out of the Equation: School Reassignment
and the Structure of Peer Effects”. Working Paper.
Lehrer, S. and Ding, W.: 2007, “Do Peers Affect Student Achievement in China’s Secondary
Schools?”, Review of Economics and Statistics 89(2), 300–312.
Manski, C.: 1995, “Identification Problems in the Social Sciences”, Harvard University Press,
Cambridge, Massachusetts.
Manski, C. F.: 1987, “Semiparametric Analysis of Random Effects Linear Models from Binary
Panel Data”, Econometrica 55, 357–362.
Neyman, J. and Scott, E.: 1948, “Consistent Estimates Based on Partially Consistent Obser-
vations”, Econometrica 16, 1–32.
30
Paglin, M. and Rufolo, A.: 1990, “Heterogeneous Human Capital, Occupational Choice, and
Male-Female Earnings Differences”, Journal of Labor Economics 8(1), 123–144.
Rivkin, S. G., Hanushek, E. A. and Kain, J. F.: 2005, “Teachers, Schools, and Academic
Achievement”, Econometrica 73(2), 417–458.
Sacerdote, B. I.: 2001, “Peer Effects with Random Assignment: Results for Dartmouth Room-
mates”, Quarterly Journal of Economics 116(2), 681–704.
Winston, G. C. and Zimmerman, D. J.: 2003, “Peer Effects in Higher Education”, in C. Hoxby
(ed.), College Decisions: How Students Actually Make Them and How They Could, Univer-
sity of Chicago Press.
Woutersen, T.: 2002, “Robustness Against Incidental Parameters”. Unpublished Manuscript.
Zimmerman, D. J.: 2003, “Peer Effects in Higher Education: Evidence from a Natural Exper-
iment”, Review of Economics and Statistics 85(1), 9–23.
31
Table 1: Monte Carlo Simulations: True Gamma = 0Obs. Per Peer Group Random Assignment Selection
Student Size σε=1.95 σε=1.15 σε=1.95 σε=1.15
2 2 γ 0.0011 -0.0002 -0.0094 -0.0042
(0.035) (0.015) (0.060) (0.023)
R2 0.7044 0.8209 0.7055 0.82
5 10 γ 0.0005 0.0011 0.0039 -0.0055
(0.037) (0.021) (0.061) (0.032)
R2 0.4831 0.6845 0.4815 0.6852
10 10 γ 0.0049 -0.0018 0.0013 -0.0003
(0.026) (0.014) (0.036) (0.015)
R2 0.415 0.644 0.4148 0.6453
Note: The R-squared values reported in this table are those pertaining to the
regression of grades onto the constructed fixed effect values. We alter the random
error added on to the constructed grade for each student in order to manipulate
the amount of variation in performance that is explained by the ability measure.
66
Table 2: Monte Carlo Simulations: True Gamma = .15Obs. Per Peer Group Random Assignment Selection
Student Size σε=1.95 σε=1.15 σε=1.95 σε=1.15
2 2 γ 0.1507 0.1511 0.14 0.1508
(0.034) (0.016) (0.060) (0.024)
R2 0.7055 0.8219 0.7125 0.8277
5 10 γ 0.1499 0.1498 0.1463 0.1489
(0.041) (0.021) (0.059) (0.033)
R2 0.4822 0.6862 0.4943 0.6967
10 10 γ 0.1502 0.148 0.148 0.1524
(0.025) (0.012) (0.036) (0.018)
R2 0.4152 0.644 0.4288 0.6594
Note: The R-squared values reported in this table are those pertaining to the
regression of grades onto the constructed fixed effect values. We alter the random
error added on to the constructed grade for each student in order to manipulate
the amount of variation in performance that is explained by the ability measure.
67
Tab
le3:
Sam
ple
Size
s
S99
F99
S00
F00
S01
F01
TO
TA
L
(1)
Stud
ent-
Sect
ions
27,9
0037
,231
37,1
0945
,991
45,0
5453
,546
246,
831
(2)
Stud
ents
7126
9646
9458
11,7
6011
,393
13,6
6263
,045
(3)
Uni
que
Sect
ions
3095
3388
3408
3628
3543
3754
20,8
16
(4)
Uni
que
Cou
rses
1030
1079
1172
1189
1246
1252
6968
(5)
Sing
le-S
ecti
onC
ours
es63
268
273
375
277
879
543
72
(6)
Stud
ent-
Sect
ions
23,2
0631
,627
30,5
9938
,056
36,2
6043
,853
203,
601
(Onl
yM
ulti
-Sec
tion
Cou
rses
)
Not
e:Fig
ures
repr
esen
tth
eda
tase
taf
ter
appl
ying
the
rest
rict
ions
note
din
the
text
.T
he
unre
stri
cted
data
set
cont
aine
d35
1,94
0st
uden
t-se
ctio
nob
serv
atio
ns.
Row
s3
and
4sh
owth
e
tota
lnum
berof
uniq
uese
ctio
nsan
dco
urse
s,re
spec
tive
ly,i
nw
hich
anyo
nein
the
sam
ple
duri
ng
the
give
nse
mes
ter
was
obse
rved
.
68
Table 4: Peer Effects Results by Course Type: Homogeneous Gamma Model
Humanities Soc.Sci. Math/Sci.
Section peer ability 0.1613 0.1960 0.0483
(0.0007) (0.0008) (0.008)
N 86,844 77,312 82,675
R2 0.6373 0.6321 0.6861
Note: Dependent variable is the grade in the class. Class
fixed effects are estimated in all specifications.
69
Tab
le5:
Stan
dard
Dev
iati
ons
ofE
stim
ated
Abi
lity
and
Mar
gina
lE
ffect
s:H
omog
eneo
usG
amm
aM
odel
Cou
rse
Typ
eSe
ctio
nSt
dD
evPop
ulat
ion
Std
Dev
Mar
gina
lE
ffect
Rat
io
Hum
anit
ies
0.28
230.
6804
0.06
69
Soci
alSc
ienc
e0.
3103
0.71
250.
0853
Mat
han
dSc
ienc
e0.
5952
0.94
980.
0302
Not
e:“S
ecti
onSt
dD
ev”
isth
est
anda
rdde
viat
ion
ofav
erag
epe
erab
ility
acro
ssth
esa
mpl
e
ofst
uden
t-se
ctio
nob
serv
atio
nsof
the
give
nco
urse
type
.“P
opul
atio
nSt
dD
ev”
isth
e
stan
dard
devi
atio
nof
abili
tyac
ross
the
sam
ple
ofst
uden
t-se
ctio
nob
serv
atio
nsin
cour
ses
of
the
give
nco
urse
type
.T
hese
calc
ulat
ions
are
both
base
don
the
fixed
effec
tses
tim
ated
by
the
hom
ogen
eous
gam
ma
mod
el.
“Mar
gina
leff
ect
rati
o”sh
ows
the
rati
oof
(1)
the
effec
t
ongr
ades
from
aon
e-st
anda
rd-d
evia
tion
incr
ease
inpe
erab
ility
to(2
)th
eeff
ect
ongr
ades
from
aon
e-st
anda
rd-d
evia
tion
incr
ease
inow
nab
ility
.
70
Table 6: Peer Effects Results by Course Type: Heterogeneous Gamma Model
Humanities Soc.Sci. Math/Sci.
Section peer ability 0.2058 0.2227 0.0940
(0.0008) (0.0013) (0.0013)
Section peer ability*Female 0.0970 0.0584 -0.0517
(0.0010) (0.0018) (0.0018)
Section peer ability*Asian -0.0098 -0.0347 -0.0346
(0.0016) (0.0026) (0.0022)
Section peer ability*Other Nonwhite 0.0375 0.0252 -0.0420
(0.0014) (0.0026) (0.0026)
Section peer ability*SATm 0.0410 0.0507 -0.0560
(0.0006) (0.0012) (0.0011)
Section peer ability*SATv 0.0222 0.0147 0.0635
(0.0005) (0.0010) (0.0009)
N 86,844 77,312 82,675
R2 0.6376 0.6323 0.6864
Note: Dependent variable is the grade in the class. Class fixed effects are
estimated in all specifications.
71
Tab
le7:
Stan
dard
Dev
iati
ons
ofE
stim
ated
Abi
lity
and
Mar
gina
lE
ffect
s:H
eter
ogen
eous
Gam
ma
Mod
el
Cou
rse
Typ
eSe
ctio
nSt
dD
evPop
ulat
ion
Std
Dev
Avg
Mar
gina
lE
ffect
Mar
gina
lE
ffect
Rat
io
Hum
anit
ies
0.23
830.
6368
0.25
950.
0970
(.06
22)
Soci
alSc
ienc
e0.
2849
0.67
470.
2480
0.10
48
(.05
54)
Mat
han
dSc
ienc
e0.
6028
0.96
600.
0555
0.03
47
(.06
36)
Not
e:“S
ecti
onSt
dD
ev”
isth
est
anda
rdde
viat
ion
ofav
erag
epe
erab
ility
acro
ssth
esa
mpl
eof
stud
ent-
sect
ion
obse
rvat
ions
ofth
egi
ven
cour
sety
pe.
“Pop
ulat
ion
Std
Dev
”is
the
stan
dard
devi
atio
nof
abili
ty
acro
ssth
esa
mpl
eof
stud
ent-
sect
ion
obse
rvat
ions
inco
urse
sof
the
give
nco
urse
type
.T
hese
calc
ulat
ions
are
both
base
don
the
fixed
effec
tses
tim
ated
byth
ehe
tero
gene
ous
gam
ma
mod
el.
Col
umn
3sh
ows
the
mar
gina
leffe
cton
grad
esfr
oma
one-
poin
tin
crea
sein
peer
abili
ty.
“Mar
gina
leffe
ctra
tio”
show
sth
era
tio
of(1
)th
eeff
ect
ongr
ades
from
aon
e-st
anda
rd-d
evia
tion
incr
ease
inpe
erab
ility
to(2
)th
eeff
ect
ongr
ades
from
aon
e-st
anda
rd-d
evia
tion
incr
ease
inow
nab
ility
.T
henu
mbe
rsin
pare
nthe
ses
are
stan
dard
devi
atio
ns
ofth
em
argi
naleff
ects
ofa
one-
poin
tin
crea
sein
peer
abili
tyth
atar
ees
tim
ated
tooc
cur
inth
esa
mpl
e.
72
Table 8: Regression of Fixed Effects on Observed Ability
Homogeneous Gamma Model Heterogenous Gamma Model
Hum. Soc.Sci. Math/Sci. Hum. Soc.Sci. Math/Sci.
SATm 0.00 0.11 0.36 -0.20 -0.04 0.51
(0.01) (0.01) (0.01) (0.01) (0.01) (0.01)
SATv 0.08 0.11 0.00 -0.04 0.06 -0.19
(0.01) (0.01) (0.01) (0.01) (0.01) (0.01)
HS GPA 0.49 0.49 0.73 0.49 0.49 0.72
(0.01) (0.02) (0.02) (0.01) (0.02) (0.02)
Female 0.26 0.17 0.15 -0.15 0.03 0.27
(0.01) (0.01) (0.01) (0.01) (0.01) (0.01)
Black -0.28 -0.23 -0.18 -0.44 -0.29 -0.10
(0.02) (0.02) (0.02) (0.02) (0.02) (0.02)
Hispanic -0.15 -0.19 -0.17 -0.31 -0.25 -0.08
(0.03) (0.04) (0.04) (0.03) (0.04) (0.04)
Asian -0.12 -0.13 -0.10 -0.08 -0.05 -0.02
(0.02) (0.02) (0.02) (0.02) (0.02) (0.02)
Amer. Ind. -0.39 -0.35 -0.31 -0.55 -0.41 -0.22
(0.12) (0.12) (0.16) (0.12) (0.12) (0.16)
Honors 0.14 0.15 0.17 0.14 0.15 0.17
(0.02) (0.02) (0.02) (0.02) (0.02) (0.02)
Sports -0.07 -0.14 0.01 -0.08 -0.14 0.00
(0.02) (0.03) (0.03) (0.02) (0.03) (0.03)
In-state 0.05 0.08 0.12 0.05 0.08 0.12
(0.03) (0.03) (0.03) (0.03) (0.03) (0.04)
N 17,332 15,264 16,077 17,332 15,264 16,077
R2 0.22 0.24 0.34 0.13 0.17 0.36
Note: The dependent variable in columns 1 through 3 is the student-level
fixed effects estimated in the homogeneous gamma model; the dependent
variable in columns 4 through 6 is the student-level fixed effects estimated
in the heterogeneous gamma model. The excluded racial/ethnic category is
white; racial/ethnic categories are mutually exclusive. Standard errors are
robust to heteroskedasticity.
73
Tab
le9:
Sele
ctio
nba
sed
onO
bser
vabl
esan
dE
stim
ated
Abi
lity
Hom
ogen
eous
Gam
ma
Het
erog
eneo
usG
amm
a
Cou
rse
Typ
eα
α̂α
uα
α̂α
u
Hum
anit
ies
Avg
Sect
ion-
leve
lSD
.615
4.3
372
.538
5.5
849
.248
7.5
415
Pop
ulat
ion
SD.8
046
.377
3.7
106
.761
9.2
711
.712
1
Rat
io.7
649
.893
7.7
578
.767
7.9
174
.760
4
Soci
alSc
ienc
e
Avg
Sect
ion-
leve
lSD
.636
5.3
750
.566
7.6
056
.299
4.5
688
Pop
ulat
ion
SD.8
618
.422
8.7
510
.822
6.3
343
.751
7
Rat
io.7
386
.886
9.7
546
.736
2.8
956
.756
7
Mat
han
dSc
ienc
e
Avg
Sect
ion-
leve
lSD
.747
5.5
036
.666
6.7
636
.532
3.6
649
Pop
ulat
ion
SD1.
0567
.616
5.8
583
1.07
19.6
432
.857
5
Rat
io.7
074
.816
9.7
767
.712
4.8
276
.775
4
Not
e:E
ach
set
ofro
ws
corr
espo
nds
tose
ctio
nsin
one
cour
sety
pe;
each
set
of
colu
mns
corr
espo
nds
toon
eve
rsio
nof
the
mod
el(h
omog
eneo
usga
mm
ave
rsus
hete
roge
neou
sga
mm
a).
Var
iabl
esun
der
anal
ysis
appe
arin
the
head
ing
row
:α
isab
ility
ases
tim
ated
byou
rm
odel
,α̂
isob
serv
edab
ility
,an
dα
uis
unob
serv
ed
abili
ty.
“Avg
Sect
ion-
leve
lSD
”is
the
aver
age
(acr
oss
alls
ecti
ons,
and
wei
ghte
dby
sect
ion
size
)of
the
stan
dard
devi
atio
nof
the
vari
able
wit
hin
ase
ctio
n.“P
opul
atio
n
SD”
isth
est
anda
rdde
viat
ion
ofth
eva
riab
lein
the
popu
lati
onof
stud
ents
taki
ng
cour
ses
ofth
egi
ven
cour
sety
pe.
“Rat
io”
isth
era
tio
ofth
efir
stof
thes
eto
the
seco
nd,a
ndsh
owsth
ede
gree
ofse
lect
ion
into
sect
ions
wit
hre
spec
tto
each
vari
able
disp
laye
d.
74
Table 10: Correlations of Estimated Abilities Across Course Types
Panel A: Abilities
Course type Humanities Social Science
Humanities 1.0000
Social Science 0.6875 1.0000
Math and Science 0.6469 0.6776
Panel B: Predicted abilities
Course type Humanities Social Science
Humanities 1.0000
Social Science 0.9643 1.0000
Math and Science 0.8808 0.9593
The abilities used in these correlation matrices are those of
the 12,715 students who took classes in all three course types,
and they are estimated by the homogeneous gamma model.
Panel A displays correlations amongst the full abilities, while
Panel B displays correlations amongst the predicted values
from regressing the estimated abilities from the homogeneous
gamma model on observable variables (as shown in the first
three columns of Table 8).
75
Table 11: Specialization of Students into Course Types By Relative Aptitude
Panel A: Abilities
Humanities Soc.Sci. Math/Sci. N
Humanities-specializers -.08 -.19 -.26 3978
Social Science-specializers .04 .10 -.07 3547
Math and Science-specializers .14 .21 .43 3745
Panel B: Predicted abilities
Humanities Soc.Sci. Math/Sci. N
Humanities-specializers -.04 -.13 -.21 3977
Social Science-specializers -.11 -.10 -.11 3547
Math and Science-specializers .18 .27 .36 3745
In Panel A, each cell shows the mean of the deviations of students’ ability to
perform in the course type of that column (as estimated by our homogeneous
gamma model, and standardized to a normal (0,1) distribution) from the
sample standardized mean of estimated ability across the course type, for
the population of that row. “Specializers” are students who are observed to
take more courses in the given course type than in either of the other two
course types.
76
Table 12: Comparing the Method to Conventional Results: Homogeneous Gamma Model
Own Section peers’ Effect of 1-stddev Own Peers’
ability ability chg in peer ability ability ability
Humanities 1 0.1613 0.0455 Total Total
(–) (0.0007)
1 0.1642 0.0285 Total Observed
(–) (0.0008)
0.9349 0.2502 0.0435 Observed Observed
(0.0076) (0.0077)
Social Science 1 0.1960 0.0608 Total Total
(–) (0.0008)
1 0.2061 0.0370 Total Observed
(–) (0.0009)
0.8518 0.4085 0.0733 Observed Observed
(0.0077) (0.0077)
Math and Science 1 0.0483 0.0287 Total Total
(–) (0.0008)
1 0.0516 0.0193 Total Observed
(–) (.0009)
0.7784 0.3519 0.1313 Observed Observed
(0.0062) (0.0063)
Note: Dependent variable is grade in the class. For models shown in Rows 1 and 2 for each
coursetype, class effects are removed before estimation by demeaning using the ‘true’ values
of the class effect as estimated by our homogeneous gamma model. For Row 3, class fixed
effects are absorbed in estimation. For the purposes of this table, we ignore the sampling
variation in the parameter estimates used to construct our observed ability measures, which
may impact the standard errors reported here. Observations are as in Table 4, although
creating the second and third rows involved dropping observations for which observables
were missing. The maximum number of observations dropped for a course type was 6.
77