
Exploring race and gender differentials in student ratings of instructors:

Lessons from a diverse liberal arts college

Robert L. Moore, Hanna Song Spinosa, James D. Whitney

Occidental College

May 2014


Do the race and gender of instructors and the race and gender composition of their classes make a difference in the ratings instructors receive from their students?

1. The answer matters for (1) social concerns, such as the persistence of discrimination, and (2) practical concerns, such as tenure and promotion.

2. Previous research regarding gender reports conflicting findings: (1) no difference by gender, (2) lower ratings for female instructors, and (3) “same-gender preferences.”

3. Previous research regarding race: (1) hardly any exists, (2) only one study appears in the economics literature, and (3) findings show higher ratings for white instructors.


What differentiates our study from previous research?

1. Explicit focus on both race and gender

2. A more recent dataset, and the largest to date

3. Data sample with relatively high levels of diversity

4. Econometric techniques

5. Race and gender composition of enrolled students

6. Supplemental approaches including a Oaxaca decomposition and panel data estimates from a subset of class sections taught by the same instructor contemporaneously


What do we find in our study?

1. In several cases of student-instructor pairings (for example, white male student ratings of white female instructors, and so on), we find sizeable estimates of race and gender ratings differentials.

2. In only a few cases are the empirical estimates statistically significant after appropriate adjustments for clustering of the data.

3. Overall class-average student ratings in our dataset do not differ enough by instructor race and gender to warrant systematic ratings adjustments for tenure and promotion decisions, but do warrant a general attentiveness to particular teaching situations in which instructor and student demographics might matter.

4. Our clearest findings are cautionary observations regarding the challenges of related empirical research, including (1) clustering of the observations to adjust for heteroskedasticity, (2) demographic heterogeneity on the part of students as well as faculty, and (3) potential control variables that risk omitted variable bias if excluded.

5. Much larger datasets, particularly datasets that span multiple institutions, or data that include the race and gender of individual student evaluators, might yield more robust and consistent results than we have found in our study.
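To illustrate why clustering matters for statistical significance, here is a minimal Python sketch for the simplest possible case, the standard error of a mean. The ratings and instructor labels are hypothetical, and the cluster-robust formula omits finite-sample corrections; it is not the authors' estimation code, only an illustration of the general idea that ratings of the same instructor are not independent observations.

```python
import math

# Hypothetical ratings grouped by instructor (each instructor is one cluster).
ratings_by_instructor = {
    "A": [6.5, 6.4, 6.6, 6.5],
    "B": [5.0, 5.1, 4.9, 5.0],
    "C": [6.0, 6.1, 5.9, 6.0],
}

all_ratings = [r for rs in ratings_by_instructor.values() for r in rs]
n = len(all_ratings)
mean = sum(all_ratings) / n

# Conventional standard error of the mean: treats all n ratings as independent.
sample_var = sum((r - mean) ** 2 for r in all_ratings) / (n - 1)
naive_se = math.sqrt(sample_var / n)

# Cluster-robust standard error (no finite-sample correction): residuals are
# summed within each instructor before squaring, so within-instructor
# correlation is counted instead of being assumed away.
cluster_sums = [sum(r - mean for r in rs) for rs in ratings_by_instructor.values()]
clustered_se = math.sqrt(sum(s ** 2 for s in cluster_sums)) / n

print(f"mean rating:  {mean:.3f}")
print(f"naive SE:     {naive_se:.3f}")
print(f"clustered SE: {clustered_se:.3f}")
```

In this made-up sample the ratings vary far more between instructors than within them, so the clustered standard error is larger than the naive one, which is the pattern that can turn apparently significant estimates into insignificant ones.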


Data and methodology: Our dataset of Occidental College student evaluations is comparatively large and recent

Includes all student evaluations that report an overall student rating of instructor, submitted for full-credit classes (counting for 4 or more units) with enrollments above 5 students

Study: evaluations; classes; instructors; years; institution

Moore, Spinosa, and Whitney (2014): 74,072 evaluations; 4,297 classes; 443 instructors; 2006-12; Occidental College

Smith (2007), Smith and Hamilton (2011): 13,702; 190; 2001-04; U of GA

Centra and Gaubatz (2000): 741; Multiple (20)

Hamermesh and Parker (2005): >16,000 evaluations; 463 classes; 94 instructors; 2000-02; U of TX


Occidental is also noteworthy for the relatively high diversity of both its faculty and students

Race and gender composition of faculty and students, Academic Year 2009-10

Group | Faculty %, Male | Faculty %, Female | Faculty rank* (259), Male | Faculty rank* (259), Female | Student %, Male | Student %, Female | Student rank* (263), Male | Student rank* (263), Female
White | 39.7% | 30.1% | 212 | 215 | 27.9% | 34.3% | 205 | 217
Black | 4.6% | 2.0% | 26 | 83 | 2.7% | 3.2% | 114 | 108
Latino | 4.3% | 7.4% | 12 | 4 | 5.8% | 8.1% | 19 | 22
Native American | 0.0% | 0.0% | 42 | 42 | 0.6% | 0.6% | 27 | 37
Asian, Islanders | 4.1% | 7.8% | 37 | 9 | 6.9% | 9.8% | 7 | 12

Herfindahl Index: Faculty 14, Students 8

* Among all US News national liberal arts colleges with available data.
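For readers unfamiliar with the Herfindahl Index, the Python sketch below computes it as the sum of squared group shares, using the faculty and student percentages reported above. Treating each race-gender cell as a separate category is an assumption made here for illustration; the slides do not say how the categories were defined for the ranking, so the resulting values are not claimed to match the ones behind the reported ranks.

```python
# Shares in percent, in the order: White, Black, Latino, Native American,
# Asian/Islanders, each as (male, female).
faculty = [39.7, 30.1, 4.6, 2.0, 4.3, 7.4, 0.0, 0.0, 4.1, 7.8]
students = [27.9, 34.3, 2.7, 3.2, 5.8, 8.1, 0.6, 0.6, 6.9, 9.8]

def herfindahl(shares_pct):
    """Sum of squared shares; lower values indicate a more even (diverse) mix.

    Shares are renormalized so that rounding in the published percentages
    (for example, a total of 99.9) does not distort the index.
    """
    total = sum(shares_pct)
    return sum((s / total) ** 2 for s in shares_pct)

hhi_faculty = herfindahl(faculty)
hhi_students = herfindahl(students)

print(f"faculty HHI:  {hhi_faculty:.3f}")
print(f"student HHI:  {hhi_students:.3f}")
```

Under this assumed category definition, the student body has the lower index, meaning its composition is spread more evenly across groups than the faculty's.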


Methodology

1. Like Smith (2007), Smith and Hamilton (2011): explicit consideration of race

2. Like Centra and Gaubatz (2000): demographics of enrolled students

3. Like Hamermesh and Parker (2005): controls for non-demographic factors

Basic structure of the regression equations we estimate:

$Q_n = \alpha + \beta X_n + \gamma Z_n + \varepsilon_n$

1. subscript n: a sample observation, by default a class average but in some specified cases an individual student evaluation.

2. $Q$: student rating of instructor (SRI), on a seven-point descending scale (“Overall, the instruction for this course was excellent.”)

3. $\beta X_n$: a vector summation of demographic variables for each observation ($X_n$) multiplied by their corresponding estimated coefficients ($\beta$).

4. $\gamma Z_n$: analogous to $\beta X_n$ but with $Z_n$ denoting non-demographic control variables.

5. $\alpha$ denotes a constant and $\varepsilon_n$ a random error term for observation n.
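As a rough illustration of this regression structure, the Python sketch below covers the special case in which the only regressors are instructor-group dummies and there are no control variables. In that case, least squares weighted by the number of respondents reduces to respondent-weighted group means: the constant is the mean for the omitted group and each coefficient is a group's mean minus that benchmark. The records and the function names are hypothetical, not taken from the study.

```python
from collections import defaultdict

# Hypothetical class-level records:
# (instructor group, class-average rating, number of respondents).
classes = [
    ("WMi", 6.1, 20),
    ("WMi", 5.9, 10),
    ("WFi", 5.8, 15),
    ("WFi", 6.0, 5),
    ("OMi", 5.6, 12),
]

def weighted_group_means(records):
    """Respondent-weighted mean rating for each instructor group."""
    total = defaultdict(float)
    weight = defaultdict(float)
    for group, rating, respondents in records:
        total[group] += rating * respondents
        weight[group] += respondents
    return {g: total[g] / weight[g] for g in total}

def dummy_regression(records, base="WMi"):
    """Constant and dummy coefficients for a dummies-only weighted regression.

    The constant is the weighted mean of the omitted (base) group; each
    coefficient is the differential of a group's weighted mean from it.
    """
    means = weighted_group_means(records)
    constant = means[base]
    coefficients = {g: m - constant for g, m in means.items() if g != base}
    return constant, coefficients

constant, coefficients = dummy_regression(classes)
print(f"constant (WMi): {constant:.3f}")
for group, coef in sorted(coefficients.items()):
    print(f"{group} differential versus WMi: {coef:+.3f}")
```

Once control variables are added, this shortcut no longer applies and the coefficients have to come from a full weighted least squares fit.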


Empirical results A: Student Rating of Instructor (SRI) differentials by instructor race and gender (1) Relative to White Male instructors, other race-gender instructor groups receive lower student ratings, dovetailing with previous research. (2) Most of the ratings differentials are comparatively small, averaging -0.15 ratings points (-0.18 standard deviations), less than half as large as commonly reported in past studies. (3) Nearly all of the estimated differentials become statistically insignificant after adjusting the error terms for clustering of the data.

Table IV.1: Dependent variable: Class-average Student Rating of Instructor (SRI)

Column (1): unit of observation = individual evaluation; estimation method = OLS; standard error adjustment = none; number of observations = 74,072

Column (2): unit of observation = class average; estimation method = WLS (by evaluation count); standard error adjustment = 443 clusters by instructor; number of observations = 4,297

Independent variables | (1) Coef. | (1) Pr(B=0) | (2) Coef. | (2) Pr(B=0)
Constant (White Male instructor (WMi)) | 5.998 | 0.000 | 5.998 | 0.000
Differential versus WMi for: | | | |
White Female instructor (WFi) | -0.100 | 0.000 | -0.100 | 0.294
Underrepresented minority Male instructor (UMi) | -0.044 | 0.019 | -0.044 | 0.747
Underrepresented minority Female instructor (UFi) | -0.085 | 0.000 | -0.085 | 0.560
Other Male instructor (OMi) | -0.330 | 0.000 | -0.330 | 0.009
Other Female instructor (OFi) | -0.186 | 0.000 | -0.186 | 0.268

Note: Total sample class-average SRI, weighted by number of respondents: mean = 5.924, standard deviation = 0.837
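The summary figures quoted for Table IV.1 can be checked with a few lines of arithmetic. The Python sketch below takes the five reported differentials versus White Male instructors, averages them with equal weights (an assumption about how the average was formed), and converts the result to standard-deviation units using the reported sample standard deviation of 0.837.

```python
# Reported differentials versus White Male instructors (Table IV.1).
differentials = {
    "WFi": -0.100,
    "UMi": -0.044,
    "UFi": -0.085,
    "OMi": -0.330,
    "OFi": -0.186,
}
SRI_STD_DEV = 0.837  # reported standard deviation of class-average SRI

# Simple unweighted average of the five differentials.
average_points = sum(differentials.values()) / len(differentials)

# Same average expressed in standard deviations of the rating.
average_sd_units = average_points / SRI_STD_DEV

print(f"average differential: {average_points:.2f} ratings points")
print(f"in standard deviations: {average_sd_units:.2f}")
```

The two printed values round to -0.15 ratings points and -0.18 standard deviations, matching the figures in point (2).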


Results B. Estimating SRI differentials when student demographics vary across classes. Students in our sample are overrepresented in classes taught by instructors that match their own race and gender, and this can compress overall average SRIs by instructor.

Table IV.2: Average class composition of students by gender and ethnicity, overall and by instructor gender-ethnicity subgroup

Panel A: Percentage distribution of course enrollments, total sample

Instructor subgroup | WMs | WFs | UMs | UFs | OMs | OFs | Instructor share of enrollments
WMi | 11.7% | 12.0% | 3.3% | 4.2% | 3.8% | 4.8% | 39.7%
WFi | 7.0% | 9.9% | 2.4% | 3.6% | 2.1% | 3.6% | 28.5%
UMi | 2.0% | 2.4% | 0.9% | 1.4% | 0.7% | 1.0% | 8.5%
UFi | 2.2% | 2.9% | 1.0% | 1.8% | 0.6% | 1.0% | 9.5%
OMi | 1.5% | 1.4% | 0.5% | 0.6% | 0.7% | 0.9% | 5.6%
OFi | 2.1% | 2.2% | 0.7% | 0.9% | 1.1% | 1.3% | 8.3%
Total % of enrollments | 26.5% | 30.8% | 8.8% | 12.4% | 8.9% | 12.5% |

Panel B: Average class race-gender composition by instructor subgroup

Instructor subgroup | WMs | WFs | UMs | UFs | OMs | OFs | Total
WMi | 29.5% | 30.3% | 8.3% | 10.5% | 9.5% | 12.0% | 100.0%
WFi | 24.4% | 34.7% | 8.3% | 12.5% | 7.5% | 12.7% | 100.0%
UMi | 23.8% | 28.5% | 10.6% | 17.0% | 8.3% | 11.8% | 100.0%
UFi | 23.4% | 30.5% | 10.6% | 19.0% | 5.8% | 10.7% | 100.0%
OMi | 27.0% | 25.1% | 9.2% | 10.3% | 13.0% | 15.4% | 100.0%
OFi | 25.2% | 26.8% | 8.9% | 10.7% | 13.0% | 15.5% | 100.0%
Average % of enrollments | 26.5% | 30.8% | 8.8% | 12.4% | 8.9% | 12.5% |

Columns are student subgroups; rows are instructor subgroups.
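The overrepresentation claim can be read directly off Panel B of Table IV.2 by comparing each student subgroup's share in classes taught by its matching instructor subgroup with that subgroup's average share of all enrollments. The Python sketch below does this comparison with the reported percentages; the variable names are illustrative only.

```python
groups = ["WM", "WF", "UM", "UF", "OM", "OF"]

# Panel B: average class composition (percent) by instructor subgroup.
# Rows are instructor subgroups, columns are student subgroups, same order.
panel_b = [
    [29.5, 30.3, 8.3, 10.5, 9.5, 12.0],   # WMi
    [24.4, 34.7, 8.3, 12.5, 7.5, 12.7],   # WFi
    [23.8, 28.5, 10.6, 17.0, 8.3, 11.8],  # UMi
    [23.4, 30.5, 10.6, 19.0, 5.8, 10.7],  # UFi
    [27.0, 25.1, 9.2, 10.3, 13.0, 15.4],  # OMi
    [25.2, 26.8, 8.9, 10.7, 13.0, 15.5],  # OFi
]
# Bottom row of the table: each student subgroup's average share of enrollments.
average_share = [26.5, 30.8, 8.8, 12.4, 8.9, 12.5]

# Ratio above 1 means the subgroup is overrepresented in matching classes.
overrepresentation = {}
for i, group in enumerate(groups):
    own_share = panel_b[i][i]
    overrepresentation[group] = own_share / average_share[i]
    print(f"{group}: {own_share:.1f}% in matching classes vs "
          f"{average_share[i]:.1f}% overall (ratio {overrepresentation[group]:.2f})")
```

Every ratio exceeds one, with the largest gaps for the smaller student subgroups.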


After incorporating student demographics: (1) Within each student subgroup (except for Other Female students), estimated cross-group ratings are lower than own-group ratings in 23 of 25 cases. (2) The sizes of the estimated differentials are typically large compared to previously reported findings, averaging 0.63 ratings points (0.75 standard deviations), although only a few of the estimates are statistically significant.

Table IV.3: Estimated average student rating of instructors (SRI) by race-gender subgroups of instructors and students

Least squares estimation of 4,297 observations of class-average ratings weighted by number of respondents and clustered by 443 instructors. Values in parentheses correspond to Pr(Ho): Ho for Panel A and diagonals of Panel B: Deviation from average = 0. Ho for Panel B off-diagonal estimates: Deviation from column benchmark = 0. Note: Overall sample class-average SRI, weighted by number of respondents: mean = 5.924, standard deviation = 0.837

Panel A: Each cell reports each subgroup's estimated class-average SRI differential versus the overall sample average

Instructor subgroup (weight**) | WMs | WFs | UMs | UFs | OMs | OFs | Enrollment wtd. avg.*
WMi (39.7%) | 0.418 (0.019) | -0.035 (0.851) | 0.313 (0.337) | -0.053 (0.891) | -0.615 (0.122) | -0.007 (0.982) | 0.074 (0.120)
WFi (28.5%) | -0.561 (0.017) | 0.282 (0.033) | 0.380 (0.391) | -0.184 (0.634) | -0.219 (0.620) | 0.168 (0.482) | -0.026 (0.667)
UMi (8.5%) | -0.469 (0.265) | -0.375 (0.319) | 0.923 (0.008) | 0.745 (0.003) | 0.100 (0.934) | 0.131 (0.710) | 0.030 (0.791)
UFi (9.5%) | -0.337 (0.365) | 0.021 (0.938) | 0.184 (0.764) | 0.606 (0.142) | 0.430 (0.521) | -0.927 (0.150) | -0.011 (0.926)
OMi (5.6%) | 0.138 (0.749) | -0.299 (0.541) | -1.014 (0.110) | -1.182 (0.003) | 0.305 (0.446) | -0.284 (0.608) | -0.256 (0.012)
OFi (8.3%) | -0.166 (0.749) | 0.253 (0.409) | -0.003 (0.997) | -0.355 (0.469) | 0.200 (0.624) | -0.813 (0.113) | -0.112 (0.446)
Instructor subgroup weighted** average differential of each student subgroup from the overall sample average | -0.072 | 0.041 | 0.272 | -0.048 | -0.224 | -0.114 |

Panel B: The diagonal entries serve as column benchmarks, and the other cells in each column report differentials versus the column benchmark

Instructor subgroup (weight**) | WMs | WFs | UMs | UFs | OMs | OFs
WMi (39.7%) | 0.418 (0.019) | -0.316 (0.170) | -0.610 (0.195) | -0.659 (0.247) | -0.920 (0.104) | 0.806 (0.181)
WFi (28.5%) | -0.979 (0.001) | 0.282 (0.033) | -0.543 (0.338) | -0.790 (0.166) | -0.523 (0.376) | 0.981 (0.082)
UMi (8.5%) | -0.888 (0.053) | -0.656 (0.099) | 0.923 (0.008) | 0.139 (0.772) | -0.204 (0.874) | 0.944 (0.128)
UFi (9.5%) | -0.755 (0.067) | -0.260 (0.387) | -0.739 (0.292) | 0.606 (0.142) | 0.126 (0.872) | -0.114 (0.890)
OMi (5.6%) | -0.280 (0.548) | -0.581 (0.251) | -1.937 (0.007) | -1.788 (0.002) | 0.305 (0.446) | 0.529 (0.482)
OFi (8.3%) | -0.585 (0.292) | -0.028 (0.932) | -0.926 (0.352) | -0.961 (0.132) | -0.104 (0.855) | -0.813 (0.113)
Instructor subgroup weighted** average differential of each student subgroup from its respective benchmark | -0.813 | -0.337 | -0.712 | -0.722 | -0.560 | 0.761

Columns are student subgroups; rows are instructor subgroups.

*derived from data in Table IV.2 Panel A.

**share of total enrolled students in dataset
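Panel B of Table IV.3 can be reconstructed from Panel A: each off-diagonal cell is the Panel A estimate minus the own-group (diagonal) estimate in the same student column. The Python sketch below rebuilds those differentials from the reported Panel A coefficients and recomputes the two summary figures quoted above. Taking a simple unweighted mean over the 25 cross-group cells, with the Other Female student column excluded, is an assumption about how the 0.63 average was formed; it happens to reproduce the reported figure.

```python
# Panel A coefficients: rows are instructor subgroups, columns are student
# subgroups, both in the order WM, WF, UM, UF, OM, OF.
panel_a = [
    [0.418, -0.035, 0.313, -0.053, -0.615, -0.007],   # WMi
    [-0.561, 0.282, 0.380, -0.184, -0.219, 0.168],    # WFi
    [-0.469, -0.375, 0.923, 0.745, 0.100, 0.131],     # UMi
    [-0.337, 0.021, 0.184, 0.606, 0.430, -0.927],     # UFi
    [0.138, -0.299, -1.014, -1.182, 0.305, -0.284],   # OMi
    [-0.166, 0.253, -0.003, -0.355, 0.200, -0.813],   # OFi
]

def versus_own_group(table):
    """Keep each diagonal as the column benchmark; express every other cell
    as its difference from the benchmark in the same column."""
    size = len(table)
    return [
        [table[i][j] if i == j else table[i][j] - table[j][j] for j in range(size)]
        for i in range(size)
    ]

panel_b = versus_own_group(panel_a)

# Cross-group differentials for the first five student columns (OF excluded).
cross_group = [panel_b[i][j] for j in range(5) for i in range(6) if i != j]
lower_than_own = sum(1 for d in cross_group if d < 0)
average_gap = -sum(cross_group) / len(cross_group)

print(f"cross-group ratings below own-group: {lower_than_own} of {len(cross_group)}")
print(f"average gap: {average_gap:.2f} ratings points "
      f"({average_gap / 0.837:.2f} standard deviations)")
```

Because Panel A is rounded to three decimals, individual rebuilt cells can differ from the published Panel B in the last digit.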


Results C: Controlling for non-demographic factors that are likely to be associated with SRIs

Hamermesh and Parker (2005) included two additional dummy control variables related to courses (Lower division, One-credit course) and four related to instructors (Female, Minority, Non-native English, Tenure track). We included more controls, as listed in the table to the right.

Table IV.4: Descriptive data for non-demographic control variables

Course level (sample frequency): 100-level (dummy) 41%; 200-300 level (excluded) 56%; 400-level (dummy) 3%

Course classification (sample frequency): Cultural Studies Program (dummy) 8%; Arts and Humanities (dummy) 35%; Science (dummy) 26%; Social Sciences (excluded) 30%

Classification of instructor (sample frequency): Part-time adjunct (dummy) 17%; Full-time adjunct (dummy) 21%; Tenure/tenure track (excluded) 62%

Average grade awarded (A=4.00): 3.29

Average expected grade (A=4.00): 3.41

Remaining variables: Total class enrollment; Percent enrolled for Core requirement; New offering by instructor (dummy); Instructor years of experience at Occidental; Years of experience if greater than six; Average student seniority (1=Frosh, 4=Senior); Evaluation response rate; Percent graduate student enrollment; Percent enrolled for Major requirement; Seminar course (dummy)

Remaining sample frequencies and sample averages (the transcript does not preserve which value belongs to which variable): 14%; 18%; 11.3; 18.8; 23.0; 2.4; 0.4%; 30%; 43%; 90%


Results of incorporating non-demographic control variables: (1) a sharp reduction in the sizes and statistical significance of the diagonal entries (own-group differentials) to average 0.24 ratings points (0.28 s.d.), compared to 0.51 ratings points before, with none of the estimates statistically significant (2) a similarly sharp drop in the average size of the estimated ratings differentials within student groups vs. their respective own-group benchmark, from 0.63 ratings points before to 0.32 ratings points (0.38 standard deviations) now.

Table IV.6: Estimated average student rating of instructors (SRI) by race-gender subgroups of instructors and students, controlling for other factors that affect SRIs

Least squares estimation of 4,297 observations of class-average ratings weighted by number of respondents and clustered by 443 instructors. Values in parentheses correspond to Pr(Ho): Ho for Panel A and diagonals of Panel B: Deviation from average = 0. Ho for Panel B off-diagonal estimates: Deviation from column benchmark = 0. Note: Overall sample class-average SRI, weighted by number of respondents: mean = 5.924, standard deviation = 0.837

Panel A: Each cell reports each subgroup's estimated class-average SRI differential versus the overall sample average

Instructor subgroup (weight**) | WMs | WFs | UMs | UFs | OMs | OFs | Enrollment wtd. avg.*
WMi (39.7%) | 0.285 (0.089) | -0.212 (0.181) | 0.585 (0.061) | 0.137 (0.598) | 0.022 (0.949) | 0.165 (0.559) | 0.105 (0.010)
WFi (28.5%) | -0.371 (0.102) | -0.036 (0.771) | 0.564 (0.160) | -0.083 (0.803) | 0.201 (0.589) | 0.392 (0.086) | -0.002 (0.972)
UMi (8.5%) | -0.427 (0.302) | -0.457 (0.118) | 0.458 (0.152) | -0.066 (0.796) | 0.573 (0.608) | 0.594 (0.121) | -0.077 (0.423)
UFi (9.5%) | -0.057 (0.870) | -0.075 (0.783) | -0.008 (0.989) | 0.113 (0.766) | 0.426 (0.441) | -0.683 (0.203) | -0.063 (0.560)
OMi (5.6%) | 0.139 (0.660) | -0.175 (0.693) | -1.273 (0.084) | -1.261 (0.025) | 0.367 (0.349) | -0.248 (0.596) | -0.243 (0.013)
OFi (8.3%) | -0.193 (0.637) | -0.032 (0.904) | -0.115 (0.883) | -0.136 (0.746) | 0.227 (0.524) | -0.831 (0.101) | -0.181 (0.152)
Instructor subgroup weighted** average differential of each student subgroup from the overall sample average | -0.042 | -0.153 | 0.351 | -0.046 | 0.194 | 0.081 |

Panel B: The diagonal entries serve as column benchmarks, and the other cells in each column report differentials versus the column benchmark

Instructor subgroup (weight**) | WMs | WFs | UMs | UFs | OMs | OFs
WMi (39.7%) | 0.285 (0.089) | -0.176 (0.386) | 0.127 (0.780) | 0.024 (0.959) | -0.346 (0.499) | 0.997 (0.084)
WFi (28.5%) | -0.656 (0.029) | -0.036 (0.771) | 0.105 (0.836) | -0.196 (0.694) | -0.167 (0.751) | 1.223 (0.028)
UMi (8.5%) | -0.712 (0.113) | -0.421 (0.167) | 0.458 (0.152) | -0.180 (0.698) | 0.205 (0.863) | 1.426 (0.025)
UFi (9.5%) | -0.342 (0.375) | -0.039 (0.895) | -0.466 (0.451) | 0.113 (0.766) | 0.059 (0.930) | 0.148 (0.843)
OMi (5.6%) | -0.146 (0.681) | -0.140 (0.766) | -1.731 (0.032) | -1.374 (0.045) | 0.367 (0.349) | 0.584 (0.395)
OFi (8.3%) | -0.478 (0.287) | 0.003 (0.991) | -0.573 (0.498) | -0.250 (0.665) | -0.140 (0.790) | -0.831 (0.101)
Instructor subgroup weighted** average differential of each student subgroup from its respective benchmark | -0.543 | -0.164 | -0.117 | -0.176 | -0.184 | 0.994

Columns are student subgroups; rows are instructor subgroups.

*derived from data in Table IV.2 Panel A.

**share of total enrolled students in dataset
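The before-and-after comparison can be retraced from the reported table entries. The Python sketch below averages the own-group (diagonal) estimates before and after adding controls, and averages the cross-group differentials in Panel B of Table IV.6, then converts the results to standard-deviation units. Using simple unweighted means and leaving out the Other Female student column are assumptions about how the summary figures were formed; under those assumptions the reported values are reproduced.

```python
SRI_STD_DEV = 0.837  # reported standard deviation of class-average SRI

# Own-group (diagonal) estimates for WM, WF, UM, UF, OM student columns.
own_group_before = [0.418, 0.282, 0.923, 0.606, 0.305]   # Table IV.3
own_group_after = [0.285, -0.036, 0.458, 0.113, 0.367]   # Table IV.6

# Table IV.6 Panel B off-diagonal entries, one list per student column.
cross_group_after = {
    "WMs": [-0.656, -0.712, -0.342, -0.146, -0.478],
    "WFs": [-0.176, -0.421, -0.039, -0.140, 0.003],
    "UMs": [0.127, 0.105, -0.466, -1.731, -0.573],
    "UFs": [0.024, -0.196, -0.180, -1.374, -0.250],
    "OMs": [-0.346, -0.167, 0.205, 0.059, -0.140],
}

def mean(values):
    return sum(values) / len(values)

own_before = mean(own_group_before)
own_after = mean(own_group_after)

all_cross = [d for column in cross_group_after.values() for d in column]
gap_after = -mean(all_cross)  # sign flipped so a positive gap means lower cross-group ratings

print(f"own-group differential: {own_before:.2f} before, {own_after:.2f} after "
      f"({own_after / SRI_STD_DEV:.2f} s.d.)")
print(f"cross-group gap after controls: {gap_after:.2f} "
      f"({gap_after / SRI_STD_DEV:.2f} s.d.)")
```

The drop from roughly 0.51 to 0.24 for own-group estimates, and to 0.32 for the cross-group gap, is the convergence described in points (1) and (2).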


Except for the outlier case of Other Female students, the arrows in the chart below indicate substantial convergence of the own-group and sub-group ratings when non-demographic factors are included as control variables

[Chart IV.1: Estimated race-gender SRI differentials before and after the addition of non-demographic control variables. Vertical axis: average estimated SRI, from 4.500 to 7.500. Horizontal axis: student demographic group (WMs, WFs, UMs, UFs, OMs, OFs), where 1 = ratings before and 2 = ratings after including non-demographic control variables. Series: WMi, WFi, UMi, UFi, OMi, OFi, and cross-group average.]


Self-reported learning outcomes and, especially, detailed ratings of instructors sharply reduce the size and significance of the estimated coefficients for the base set of non-demographic control variables. (You might also be interested to note here the detailed ratings items that are most strongly associated with the overall rating of instruction: 1. clarity 2. intellectual enthusiasm 3. fulfilling course goals 4. organization 5. responsive to questions)

Table IV.7b: The impact of including self-reported information on estimated coefficients for non-demographic variables

Each cell reports Coef. (Pr(Ho)). Columns: (1) Demographics; (2) + behavior; (3) + outcomes; (4) + ratings.

Included variables | (1) Demographics | (2) + behavior | (3) + outcomes | (4) + ratings
100-level course | -0.087 (0.150) | -0.084 (0.150) | -0.052 (0.135) | 0.006 (0.609)
400-level course | -0.162 (0.051) | -0.230 (0.007) | -0.059 (0.232) | -0.021 (0.233)
Cultural Studies Program course | -0.232 (0.047) | -0.263 (0.024) | -0.180 (0.016) | -0.027 (0.297)
Arts and Humanities course | 0.051 (0.587) | 0.066 (0.468) | 0.024 (0.620) | 0.033 (0.029)
Science course | -0.285 (0.008) | -0.289 (0.005) | -0.174 (0.002) | -0.016 (0.275)
Seminar course | 0.081 (0.241) | 0.061 (0.375) | 0.024 (0.584) | -0.010 (0.495)
Part-time adjunct instructor | -0.334 (0.000) | -0.320 (0.000) | -0.192 (0.000) | -0.049 (0.004)
Full-time adjunct instructor | -0.196 (0.041) | -0.172 (0.067) | -0.102 (0.073) | -0.035 (0.039)
Years of experience at Occidental | 0.011 (0.536) | 0.010 (0.575) | 0.007 (0.507) | 0.003 (0.280)
Years of experience if greater than six | -0.016 (0.290) | -0.015 (0.329) | -0.010 (0.264) | -0.003 (0.323)
New offering by instructor | -0.184 (0.012) | -0.184 (0.009) | -0.088 (0.028) | -0.034 (0.012)
Total enrollment | 0.002 (0.563) | 0.002 (0.478) | 0.003 (0.104) | 0.000 (0.898)
Average student seniority | 0.066 (0.090) | 0.047 (0.215) | -0.023 (0.331) | -0.018 (0.027)
Percent graduate student enrollment | 0.038 (0.925) | -0.116 (0.751) | 0.031 (0.871) | -0.016 (0.814)
Percent of students enrolled for Core requirement | -0.041 (0.003) | -0.025 (0.061) | 0.010 (0.344) | 0.006 (0.400)
Percent of students enrolled for Major requirement | 0.045 (0.000) | 0.018 (0.126) | -0.012 (0.189) | 0.004 (0.444)
Evaluation response rate | 0.708 (0.000) | 0.625 (0.000) | 0.336 (0.000) | 0.073 (0.022)
Average grade awarded | 0.247 (0.002) | 0.228 (0.003) | 0.212 (0.000) | 0.073 (0.000)
Average expected grade | 0.324 (0.000) | 0.286 (0.000) | 0.070 (0.000) | -0.002 (0.669)
Average hours worked outside of class | | 0.010 (0.000) | 0.000 (0.875) | -0.001 (0.162)
Average number of classes missed | | -0.016 (0.000) | 0.003 (0.083) | 0.002 (0.038)
Average rating: discussed course material outside of class | | 0.138 (0.000) | 0.028 (0.000) | -0.003 (0.143)
Average rating: Course contribution to knowledge | | | 0.421 (0.000) | 0.115 (0.000)
Course contribution to skills | | | 0.374 (0.000) | 0.069 (0.000)
Instructor communicated goals | | | | -0.030 (0.000)
Instructor fulfilled goals | | | | 0.162 (0.000)
Course organization | | | | 0.141 (0.000)
Clear assignments | | | | 0.029 (0.000)
Clear grading criteria | | | | 0.030 (0.000)
Helpful feedback | | | | 0.087 (0.000)
Clear explanations of concepts | | | | 0.182 (0.000)
Motivated intellectual enthusiasm | | | | 0.171 (0.000)
Instructor invited individual meetings | | | | 0.023 (0.000)
Instructor invited questions | | | | 0.015 (0.002)
Instructor asked questions | | | | 0.050 (0.000)
Instructor was responsive to questions | | | | 0.133 (0.000)
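The ranking of detailed ratings items mentioned above can be recovered by sorting the coefficients reported in the final specification of Table IV.7b. The Python sketch below does this with the twelve instructor and course ratings items; ranking by coefficient size is used here as a simple proxy for "most strongly associated," which is an interpretation rather than something the slides spell out.

```python
# Coefficients on the detailed ratings items in the "+ ratings" specification.
rating_item_coefficients = {
    "Instructor communicated goals": -0.030,
    "Instructor fulfilled goals": 0.162,
    "Course organization": 0.141,
    "Clear assignments": 0.029,
    "Clear grading criteria": 0.030,
    "Helpful feedback": 0.087,
    "Clear explanations of concepts": 0.182,
    "Motivated intellectual enthusiasm": 0.171,
    "Instructor invited individual meetings": 0.023,
    "Instructor invited questions": 0.015,
    "Instructor asked questions": 0.050,
    "Instructor was responsive to questions": 0.133,
}

# Sort items from largest to smallest coefficient and keep the top five.
ranked_items = sorted(
    rating_item_coefficients,
    key=rating_item_coefficients.get,
    reverse=True,
)
top_five = ranked_items[:5]

for rank, item in enumerate(top_five, start=1):
    print(f"{rank}. {item} ({rating_item_coefficients[item]:+.3f})")
```

The resulting order (clear explanations, intellectual enthusiasm, fulfilled goals, organization, responsiveness to questions) matches the five items listed in the slide text.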