Supplementary Materials for “Social class competence stereotypes are amplified by socially-
signalled economic inequality”
Table of Contents
Pilot Studies
Study 1
    Who Said What? Task
        Who Said What? Pilot.
        Who Said What? Procedure.
        Who Said What? Results.
    Moderator Analyses
Study 2
    Procedure
        Target photos.
        Target group randomization.
        Demographics.
        Target ratings.
        Political/Ideological items.
    Pre-registered Hypotheses
    Hypothesis 5
    Hypothesis 6
    Exploratory Results
        Thermometer warmth and perceived income.
        Perceived income.
        Political/Ideological items.
Study 3
    Procedure
        Target photos and occupations.
        Target group randomization.
        Demographics.
        Target ratings.
Power Sensitivity Analyses
Pilot Studies
We had 149 full-body photographs of Caucasian adults (66 females) each rated for
apparent age and income by 61 U.S.-based adults (no other demographic information was
recorded) recruited via Amazon’s Mechanical Turk (MTurk). We computed mean ratings of
income and age for each photo across all raters. We decided to focus on males for Study 1,
because their mean income ratings showed substantially more variation than females’ (see Figure
1).
Figure 1. Mean income ratings for male and female Caucasian targets in pilot data. Bars indicate
means, lines indicate +/-1 SEM.
Based on photos’ mean ratings, we created four groups, each containing 21 photos:
‘Unequal’ (containing low-, middle-, and high-income targets), ‘Equal-Low’, ‘Equal-Middle’,
and ‘Equal-High’ (containing only low-, middle-, and high-income targets, respectively), with
some targets included in both the Unequal and Equal groups. Each group of targets was
approximately matched on average age (Figure 2). We computed the Gini coefficient, which is
the most common measure of income inequality (Atkinson & Bourguignon, 2014), based on
mean income ratings, for each group (Ginis are also reported in Figure 2).
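As a rough illustration of the Gini computation described above, the following Python sketch applies a standard rank-based formula to a list of photo-level mean income ratings. The function name `gini` and the example values are hypothetical. The source does not state which Gini estimator or software was used, so this is one common definition (without a small-sample correction) rather than the exact procedure.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfect equality).

    Assumption: population form with no small-sample correction;
    the estimator actually used in the study is not stated.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted form: G = 2 * sum(rank * x) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical mean income ratings (in $1000s) for two small target groups
equal_group = [50, 50, 50]
unequal_group = [20, 40, 60, 80]
print(gini(equal_group), gini(unequal_group))
```

A group in which every target has the same mean income rating yields a Gini of 0, and the value grows as ratings spread out, which is the property the Unequal and Equal groups were designed to differ on.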
Figure 2. Income and age ratings of targets in each group. Bars indicate means, lines indicate
+/-1 SEM. Red dots indicate the five targets shared between the Unequal and Equal-Low groups,
purple dots indicate the two targets shared between the Unequal and Equal-Middle groups, and
blue dots indicate the seven targets shared between the Unequal and Equal-High groups.
Pilot studies 1 & 2 tested whether the target groups were perceived as presenting different
levels of economic inequality. In Pilot Study 1, 43 MTurk participants (17 Female, Mage = 33.79,
SDage = 10.36) viewed each target group depicted within a 3 × 7 grid containing each of its 21
target photos and were asked to “imagine each group represents kind of a miniature society.”
Participants were instructed “your task is simply to judge how equally or unequally income is
distributed in each miniature society based on how the individuals look”, and asked: “from 1 to
10, how unequal do you think each group is?” (1 = “Very equal”, 10 = “Very unequal”).
Because each participant rated each of the four target sets (i.e., ratings were nested within
participants), we fit a hierarchical linear model (HLM), regressing perceived inequality ratings
on three dummy variables representing the four target groups, with the Unequal target group set
as the reference, and including random intercepts for participants. Results (see ‘Pilot 1’ results in
Table 1) showed that the Unequal target set was perceived as significantly more unequal than
other target sets.
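In equation form, the Pilot 1 model described above can be sketched as follows. The symbol names are chosen here for illustration and are not taken from the original analysis script.

```latex
\text{inequality}_{ig} = \beta_0
  + \beta_1 \text{EqualLow}_{g}
  + \beta_2 \text{EqualMiddle}_{g}
  + \beta_3 \text{EqualHigh}_{g}
  + \zeta_i + \varepsilon_{ig}
```

Here i indexes participants, g indexes target groups, each dummy equals 1 for its own group and 0 otherwise, and $\zeta_i$ is the participant-level random intercept. Because the Unequal group is the reference, $\beta_0$ is the predicted rating for the Unequal group and each remaining coefficient is a difference from it. Using the Pilot 1 estimates in Table 1, for example, the predicted rating for the Equal-Low group is 7.535 - 2.744 = 4.791.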
Due to a concern that the results of Pilot 1 may have been due to demand characteristics,
we ran another Pilot, but used a between-subjects design less vulnerable to this concern. For
Pilot 2, 169 MTurk participants (64 Female, Mage = 32.17, SDage = 9.94) were randomly
assigned to rate only one of the four target sets via a procedure virtually identical to that of Pilot 1. We
analyzed results via an OLS linear multiple regression, using the same three dummies as Pilot 1
to represent target groups. Results (see ‘Pilot 2’ results in Table 1) again showed that the
Unequal target group was perceived as significantly more unequal than the Equal target groups.
Study 1
Target Rating Procedure
After providing consent, participants were informed “…we will show you pictures of 21
different individuals and ask what you would think of them if you encountered them in your day-
to-day life. When you see each individual, please try to imagine seeing them as you walk down
a street in the city where you live (or a city nearby). Then tell us the immediate impressions you
would form of them” (Figure 3).
Figure 3. Instructions to participants prior to presentation and rating of targets in Study 1.
Participants were randomly assigned to view one of four target groups: Unequal, Equal-
Low, Equal-Middle, and Equal-High. They viewed the targets making up their groups one by
one in a randomized order and rated each target on warmth and competence via 0-100 sliders
(Figure 4). All target images used are available via the project OSF page
(https://osf.io/e4upm/?view_only=fd598f1d2273492484d33a0d50c88050).

Table 1. Models predicting perceived inequality of target groups

                                     Pilot 1                    Pilot 2
                                     β̂ (SE)            p       β̂ (SE)            p
Fixed effects
  Intercept (reference = Unequal)    7.535 (0.32)      <.001    7.581 (0.33)      <.001
  Equal-Low                          -2.744 (0.431)    <.001    -2.371 (0.481)    <.001
  Equal-Middle                       -3.186 (0.431)    <.001    -3.581 (0.463)    <.001
  Equal-High                         -3.163 (0.431)    <.001    -2.672 (0.463)    <.001
Random effects (SD)
  Participant                        0.64
  Residual                           1.998
Figure 4. Example of image rating procedure in Study 1.
Who Said What? Task
As discussed in our main manuscript, the primary purpose of Study 1 was to test for an
effect of exposure to socially-signalled inequality on automatic social class categorization as
measured by the ‘Who Said What’ task (Taylor, Fiske, Etcoff, & Ruderman, 1978). We pre-
registered the following confirmatory hypothesis on the Open Science Framework (OSF) at
https://osf.io/e4upm/?view_only=fd598f1d2273492484d33a0d50c88050: “We expect to find
evidence that individuals who are exposed to groups of people of more heterogeneous apparent
socioeconomic status will display a heightened tendency to categorize others according to their
socioeconomic status.”
Who Said What? Pilot. In the Who Said What task, participants observed eight target
individuals of mixed social class (four upper class, four lower class) identified by name, photo,
and occupation, making brief statements one by one on the topic of ‘fatherhood’, with each
target making three statements (see Figure 5).
Figure 5. Targets (top panels) used in the WSW task, and two example fatherhood statements
(bottom panels).
WSW targets were pre-tested in a pilot study for perceived socioeconomic status (SES).
Thirty-seven participants recruited via MTurk (22 Females, Mage = 39.73, SDage = 13.08) rated
each target on the 10-point MacArthur ladder scale (Adler, Epel, Castellazzo, & Ickovics, 2000).
We fit an HLM regressing perceived SES ratings on an SES group dummy (0 = lower SES, 1 =
upper SES), and included random intercepts for participants. Results showed that the upper SES
target group was perceived as significantly higher in SES than the lower SES target group,
β̂ (SE) = 3.51 (0.13), 95% CI = [3.24, 3.76], t(257.19) = 26.35, p < .001, partial r² = .70 (see
Figure 6).
Figure 6. Perceived SES ratings of targets used in WSW task in Study 1. Bars indicate means,
lines indicate +/-1 SEM.
Who Said What? Procedure. Participants completed the WSW task after viewing and
rating the targets presented in the Unequal, Equal-High, Equal-Middle, or Equal-Low target
groups (see main manuscript for details). Prior to the task, the instructions read “In the next part of
the survey, we are still interested in how you form impressions of others, but this time you will
not just see photos, but will also read statements people have made. We will present 3 statements
from each of the eight volunteers, making 24 total different statements. When you read the
statements, try to form an impression of the volunteers. photos will advance automatically,
without your having to press anything.” Targets appeared one by one in randomized order for ten
seconds each along with a statement they had ostensibly made during a discussion about
fatherhood (see Figure 5). The complete list of statements used is presented in Table 2. Each
target appeared three times, for 24 total photo-statement pairs. Target images were also presented
with either red or blue borders, which were orthogonal to class group (i.e., each colour grouping
had two upper and two lower SES targets). Two different versions of the WSW task were used,
which presented each target with a different first name (names were swapped between high and
low SES targets) and different occupation (upper and lower SES targets retained upper and lower
SES occupations, respectively). Participants were then presented with the 24 statements and
asked to choose from the eight targets “Who made this statement?”
Table 2. Fatherhood statements used in the WSW task in Study 1
No. Statement
1 My dad got custody of me when I was young, and we’ve been close with each other ever since then.
2 I've mostly always gotten along with both my parents, and my mom and I have a great relationship and all, but my dad and I have so much in common, it's always been really special between us.
3 I was raised by my mom and step-dad, with close and frequent contact with my father. He was raising his two step-daughters who saw their mom once a year.
4 I grew up with a step dad and we weren't that close. But bonding with my daughter makes me happy because I've always wanted my daughter to have a good bond with her father.
5 I want my children to be their own individual people, and have the relationship with me and my wife that I didn't have with my parents.
6 After they got divorced, I didn’t see my dad much, but my sister's husband took over where he left off. I think I'm the person I am today because of him.
7 My father and I don’t have much of a relationship. That is just a part of the reason. But if that one thing would change, I think our relationship would change for the better.
8 We don't have much of a relationship now. I went through a phase where I tried to impress him or make him proud of me.
9 Job number one for a father is to protect their children from harm. This can even mean protecting them from themselves. But you have to let them be themselves too.
10 You may not like it, but your daughter will go out on dates, and have friends that you may not like. You have to not only trust your children, but talk to them like they are human beings.
11 I believe that a father’s relationship with his daughter should be more open. I have seen a lot of fathers who struggle to communicate with their daughters.
12 It's stereotypical that most fathers are much closer to their daughters and mothers closer to their sons, oddly. I think my family was like that, anyway.
13 But, you're right. Just because a mother isn't in the picture, that doesn't mean anything at all about how a girl will grow up, and who she will be.
14 It’s important for there to be dialogue among fathers. One good father can make generations of future good fathers and mothers.
15 Really, it's his place to stand up and take some responsibility. I believe if he really wants to see his daughter he'll do what it takes to be in her life.
16 You know, I feel there is a difference between the two. But, there's a point when one has to just give up and realize a man will never be a dad. Unfortunately, some guys just can’t handle the responsibilities.
17 When he is good to you, pay attention to him. And just ignore him when he's bad. Grown men can be very much like grown-up little boys.
18 Action is what makes a father. When you make sacrifices to raise a child, then you can be called a parent.
19 Even a part time parent can make a gigantic difference in a child's life. The child sees his or her own role model in one or the other parent.
20 But we shouldn't hurt our children or put them in the middle of any situation that is not their fault. A child not only deserves, but also has a right to have both parents in their lives.
21 Go to any place where secure guys hang around and there's always men out there looking to help young people get started in whatever endeavour they want.
22 Talk about everything. It's never too early. If they’re twelve or fifteen and you don’t talk often enough, you're already too late.
23 If he's making life a living hell for you and refuses to change to make you happy, no, you don't have a problem. You don't have to love your parents.
24 You can't pick your family. If they treat you badly, well, I wouldn't love them either. If your dad can't make the effort to change for you, you shouldn't either.
Social class categorization was assessed by examining the proportion of all recall errors
made within social class categories (e.g., assigning a statement made by an upper class target to a
different upper class target). We also computed the proportion of all errors made within colour
categories (e.g., assigning a statement made by a red-bordered target to a different red-bordered
target). Following convention (e.g., Weeks & Lupfer, 2004), we adjusted error proportions to
make the expected proportion of each error type 50%.¹ If individuals made more than 50% of
errors within social class categories, they were inferred to have spontaneously categorized the
targets according to social class (Taylor et al., 1978). Conversely, proportions of errors within
border colour groups were interpreted as evidence of categorization according to non-class-related
perceptual features. Both proportions were computed for each participant.
Who Said What? Results. Overall, participants made more within-class errors on the
WSW task than would be expected under random guessing: on average, a proportion of
.53 (SD = 0.15) of errors fell within class categories, which was significantly higher than .5, t(380) =
3.61, p < .001, 95% CI = [.51, .54], Cohen’s d = 0.18 (Figure 7). This suggests participants
automatically categorized targets by social class. However, proportions of within-class errors
were not significantly different between the Unequal (M = 0.52, SD = 0.17), Equal-Low (M =
0.53, SD = 0.14), Equal-Middle (M = 0.52, SD = 0.14), and Equal-High (M = 0.54,
SD = 0.15) conditions. A one-way ANOVA with proportion of within-class errors as the dependent variable and
Condition (Unequal, Equal-Low, Equal-Middle, Equal-High) as the factor was not significant,
F(3,377) = 0.30, p = 0.82, η² = 0.002, and this
remained the case when controlling for proportions of errors within colour groups in an
ANCOVA, F(3,376) = 0.30, p = 0.83, η² = 0.002.
¹ Participants chose from 8 options, and could make four kinds of errors: within-class within-
colour, within-class between-colour, between-class within-colour, and between-class between-
colour. However, if a participant were randomly guessing, they would make fewer within-class
within-colour errors than the other three categories, because one of the within-class within-
colour choices was correct. Prior to calculating the overall proportions of within-class and
within-colour errors, we therefore counted the number of each of these four kinds of error, and
doubled the within-class within-colour error total. This makes the expected proportion of within-
class and within-colour errors under random guessing .5.
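The adjustment described in this footnote can be sketched in Python as follows. The function and argument names are hypothetical, and returning `None` for a participant with no errors is an assumption, since the source does not say how such cases were handled.

```python
def adjusted_error_proportions(wc_wcol, wc_bcol, bc_wcol, bc_bcol):
    """Return (within-class, within-colour) error proportions.

    Arguments are one participant's error counts:
      wc_wcol: within-class, within-colour
      wc_bcol: within-class, between-colour
      bc_wcol: between-class, within-colour
      bc_bcol: between-class, between-colour
    The within-class within-colour count is doubled because only one
    such wrong option exists per statement, versus two for each other type.
    """
    doubled = 2 * wc_wcol
    total = doubled + wc_bcol + bc_wcol + bc_bcol
    if total == 0:
        return None  # assumption: no errors, proportions undefined
    within_class = (doubled + wc_bcol) / total
    within_colour = (doubled + bc_wcol) / total
    return within_class, within_colour


# Error counts in the 1:2:2:2 ratio expected under random guessing
print(adjusted_error_proportions(1, 2, 2, 2))  # (0.5, 0.5)
```

With eight targets split evenly by class and by border colour, each statement has one wrong within-class within-colour option and two wrong options of each other type, so counts in a 1:2:2:2 ratio map to adjusted proportions of .5 for both class and colour.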
Figure 7. Distribution of within-class error proportions in the WSW task in Study 1.
Moderator Analyses
We examined whether the observed effects of socially-signalled inequality on social class
stereotyping were moderated by participants’ socioeconomic status (SES) or political ideology.
The subjective SES MacArthur scale (Adler et al., 2000) and participants’ yearly incomes
correlated strongly (r = 0.59), so were z-scored and averaged to create a composite SES measure,
while political ideology was measured via our ten-point conservatism/liberalism scale, which we
refer to as ‘Liberalism’ due to higher scores indicating higher liberalism. To test the influence of
each moderator, we fitted the following hierarchical linear model:
$$
\begin{aligned}
y_{ij} = {} & \beta_0 + \beta_1 \text{income}_j + \beta_2 \text{income}_j^2 + \beta_3 \text{gini}_i + \beta_4 \text{income}_j \text{gini}_i \\
& + \beta_5 \text{moderator}_i + \beta_6 \text{moderator}_i \text{income}_j + \beta_7 \text{moderator}_i \text{gini}_i \\
& + \beta_8 \text{moderator}_i \text{income}_j \text{gini}_i + \zeta_i + \zeta_j + \varepsilon_{ij}
\end{aligned}
$$
where i indexes participants and j indexes target photos, $y_{ij}$ is the warmth or competence rating
made by participant i to photo j, $\text{income}_j$ is the mean level of perceived annual income
attributed to photo j (in units of thousands), $\text{income}_j^2$ is photo mean perceived income squared,
$\text{gini}_i$ is the Gini index of income inequality of the photos viewed by participant i, $\text{moderator}_i$ is
a participant-level score on either the SES or liberalism moderators, $\zeta_i$ is a participant-level
random intercept adjustment, $\zeta_j$ is a photo-level random intercept adjustment, and $\varepsilon_{ij}$ is the
residual. The key parameter of interest is $\beta_8$, which estimates the three-way interaction between
participants’ SES or liberalism, targets’ income level, and the inequality of the target group
viewed by the participant.
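The composite SES moderator described above (z-scoring the ladder rating and yearly income, then averaging the two) could be computed along these lines. This is a minimal sketch: the function names and example data are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def zscores(xs):
    """Standardize values to mean 0 and (sample) SD 1."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]


def ses_composite(ladder, income):
    """Average the z-scored ladder ratings and z-scored incomes per participant."""
    return [(a + b) / 2 for a, b in zip(zscores(ladder), zscores(income))]


# Hypothetical data for three participants
ladder = [3, 5, 7]       # MacArthur ladder rungs
income = [20, 40, 60]    # yearly income in $1000s
print(ses_composite(ladder, income))
```

Standardizing before averaging puts the ladder (a 10-point scale) and income (dollars) on a common scale, so neither measure dominates the composite.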
Results are presented in Table 3. Of all models fitted, the only three-way interaction term
that appeared potentially important was the three-way interaction between participants’
liberalism, targets’ perceived incomes, and target groups’ inequality, β̂ (SE) = -0.183 (0.056),
95% CI = [-0.292, -0.074], t(579) = 3.267, p = .001, d = -1.65.² This interaction is also visualized
in Figure 8, with the pattern of results suggesting that target group inequality primarily affected
the judgments of warmth of relatively conservative participants, while having little influence on
the judgements of relatively more liberal participants.
Figure 8. The three-way interaction between participants’ liberalism, target group inequality, and
target perceived income in Study 1.
² Standardized effect sizes were computed via the formula $d = 2b\,SD_x / SD_y$, which for small effects is
equivalent to Cohen’s d for binary predictors (Gelman, 2008).
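The formula in this footnote is simple enough to express directly. The sketch below uses a hypothetical function name and made-up inputs purely to show the arithmetic.

```python
def standardized_effect(b, sd_x, sd_y):
    """Standardized effect size d = 2 * b * SD_x / SD_y.

    b    : unstandardized regression coefficient
    sd_x : standard deviation of the predictor
    sd_y : standard deviation of the outcome
    """
    return 2 * b * sd_x / sd_y


# Hypothetical example: a balanced binary predictor has SD_x = 0.5,
# so d reduces to b / SD_y, the group difference in outcome SD units.
print(standardized_effect(b=3.0, sd_x=0.5, sd_y=10.0))
```

Scaling by two standard deviations of the predictor is what makes coefficients on continuous predictors roughly comparable to those on binary predictors.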
Table 3. Results from moderator analyses in Study 1 (models were fit using matched targets only)

Moderator = liberalism
                              Outcome = competence          Outcome = warmth
                              β̂ (SE)              p        β̂ (SE)               p
Fixed effects
  Intercept                   12.528 (11.48)      0.28      46.041 (19.866)      0.034
  Income                      0.81 (0.26)         0.007     0.128 (0.556)        0.821
  Income²                     -0.004 (0.001)      0.028     -0.002 (0.003)       0.459
  Gini                        -52.445 (34.603)    0.13      -111.007 (34.561)    0.001
  Income × Gini               0.88 (0.387)        0.023     1.614 (0.389)        <.001
  Moderator                   -0.801 (1.17)       0.494     -2.744 (1.163)       0.019
  Moderator × Income          0.009 (0.015)       0.564     0.042 (0.015)        0.006
  Moderator × Gini            0.152 (4.863)       0.975     11.585 (4.854)       0.017
  Moderator × Income × Gini   -0.012 (0.056)      0.831     -0.183 (0.056)       0.001
                              SD                  N         SD                   N
Random effects
  Participant                 12.99               380       12.37                380
  Photo                       4.18                14        9.95                 14
  Residual                    14.5                2611      15.81                2611
Model fit (R² a)              0.487                         0.036

Moderator = SES
                              Outcome = competence          Outcome = warmth
                              β̂ (SE)              p        β̂ (SE)               p
Fixed effects
  Intercept                   7.289 (8.357)       0.397     27.679 (18.292)      0.157
  Income                      0.868 (0.239)       0.003     0.41 (0.548)         0.47
  Income²                     -0.004 (0.001)      0.026     -0.002 (0.003)       0.456
  Gini                        -52.682 (12.12)     <.001     -33.374 (12.119)     0.006
  Income × Gini               0.815 (0.136)       <.001     0.385 (0.136)        0.005
  Moderator                   -0.398 (2.924)      0.892     -0.216 (2.914)       0.941
  Moderator × Income          0.015 (0.038)       0.695     0.013 (0.038)        0.74
  Moderator × Gini            4.254 (12.024)      0.724     -1.322 (12.031)      0.913
  Moderator × Income × Gini   -0.069 (0.14)       0.624     0.032 (0.141)        0.819
                              SD                  N         SD                   N
Random effects
  Participant                 12.99               381       12.42                381
  Photo                       4.15                14        9.96                 14
  Residual                    14.5                2618      15.8                 2618
Model fit (R² a)              0.486                         0.034

a R² refers to Nakagawa’s R² (Nakagawa, Johnson, & Schielzeth, 2017)
Study 2
Procedure
Target photos. We selected 150 target photos (30 Black female, 30 White female, 45
Black male, 45 White male) for use in the study. We subjectively chose 50 apparently low-
income, 50 middle-income, and 50 high-income targets, with each income category equally
represented among gender/race subgroups (e.g., among the 30 Black females, 10 each were chosen as
appearing subjectively low-, middle-, and high-income).
Target group randomization. Participants were randomly assigned with varying
probabilities to the same four conditions as Study 1: Unequal (50% probability), Equal-Low
(16.7% probability), Equal-Middle (16.7% probability), and Equal-High (16.7% probability).
However, unlike Study 1, we did not assemble a single target photo group for each condition.
Instead, unique groups of eight photos were randomly chosen for each participant from
condition-specific photo pools. For example, participants in the Equal-Low, Equal-Middle, and
Equal-High groups viewed eight photos chosen randomly from the 50 photos judged to appear
low-income, middle-income, and high-income, respectively. Participants in the Unequal
condition viewed eight photos chosen randomly from all 150 target photos. All target images
used are available via the project OSF page
(https://osf.io/e4upm/?view_only=fd598f1d2273492484d33a0d50c88050).
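The randomization scheme described above might be sketched as follows. All identifiers here are hypothetical (photo names, function name, seed), and two details are assumptions not stated in the source: that the eight photos were drawn without replacement, and that the draw was not stratified by target race or gender.

```python
import random

# Hypothetical photo identifiers: 50 photos per subjectively judged income level.
POOLS = {
    level: [f"{level}_{i:02d}" for i in range(50)]
    for level in ("low", "middle", "high")
}
ALL_PHOTOS = POOLS["low"] + POOLS["middle"] + POOLS["high"]

# Each condition draws from its own pool; Unequal draws from all 150 photos.
CONDITION_POOLS = {
    "Unequal": ALL_PHOTOS,
    "Equal-Low": POOLS["low"],
    "Equal-Middle": POOLS["middle"],
    "Equal-High": POOLS["high"],
}
# Relative weights 3:1:1:1 give 50% / 16.7% / 16.7% / 16.7%.
WEIGHTS = {"Unequal": 3, "Equal-Low": 1, "Equal-Middle": 1, "Equal-High": 1}


def assign_participant(rng):
    """Pick a condition, then sample eight distinct photos from its pool."""
    conditions = list(CONDITION_POOLS)
    weights = [WEIGHTS[c] for c in conditions]
    condition = rng.choices(conditions, weights=weights, k=1)[0]
    photos = rng.sample(CONDITION_POOLS[condition], 8)  # assumption: no repeats
    return condition, photos


rng = random.Random(2024)  # arbitrary seed for reproducibility
condition, photos = assign_participant(rng)
print(condition, photos)
```

Because each participant receives a unique random draw, the inequality of the group a participant sees varies from person to person, even within the same condition.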
Demographics. Participants reported their demographic information as in Study 1
(Figure 9).
Figure 9. Demographics from Study 2
Target ratings. First, participants viewed eight targets one by one without making any
ratings, and were asked to think about what it would be like to encounter each target in day-to-
day life (Figures 10 & 11).
Figure 10. Introduction to target presentation in Study 2.
Figure 11. Example of a target presentation in Study 2.
Following this, the same eight targets were again presented one by one (Figure 12), and
participants were asked to “imagine you saw this person on the street in your local area” and to
rate each target on perceived warmth and competence (“How warm and competent would you
think this individual is?”; two 0-100 sliders), age (“How old would you think they are?”; 5-year
intervals ranging from “less than or 15 years” to “76 years or older”), and income (“What would
you guess their annual income to be?”; $10,000 intervals ranging from “$0-10,000” to “$200,001
or over”). Participants also completed a feeling thermometer measure regarding each target (“If
10 = "warmest feelings" and 0 = "coldest feelings", how warm or cold would you feel towards
this person?”; 0-10 scale).
Figure 12. Example of target ratings procedure from Study 2.
Political/Ideological items. Six items measured political and ideological attitudes
(Figure 13). Two were drawn from the System Justification Scale (Jost & Kay, 2005), two were
drawn from the Multidimensional Class Consciousness Scale (MCCS; Keefer, Goode, & Van
Berkel, 2015), and two were drawn from the Symbolic Racism Scale (SRS; Henry & Sears,
2002). For more information see Supplementary Materials.
Figure 13. Political/ideological items from Study 2.
Pre-registered Hypotheses
In addition to the first four pre-registered hypotheses fully reported in the main
document, we also pre-registered the following hypotheses:
5. This two-way interaction (the interaction between target Gini and mean perceived target
income on warmth ratings) will be moderated by participant political ideology. When
inequality is high and participants are liberal-leaning, high income targets will be
perceived as less warm. Political ideology will not moderate the interaction between
photo set inequality and perceived income on perceived competence.
6. Among non-Black respondents there will be a two-way interaction between target race
and target income over participants’ self-reported feelings of warmth towards the targets.
At low levels of apparent income, feelings of warmth will be greater towards White
targets than Black targets, but this gap will be significantly decreased at higher levels of
apparent income.
Hypothesis 5
To test our fifth hypothesis that participants’ political orientation would moderate the
two-way interaction between photo income and photoset inequality on targets’ perceived
warmth, we extended model 2 from our main manuscript, fitting the following HLM predicting
attributions of warmth:
$$
\begin{aligned}
y_{ij} = {} & \beta_0 + \beta_1 \text{income}_{ij} + \beta_2 \text{income}_{ij}^2 + \beta_3 \text{gini}_i + \beta_4 \text{income}_{ij} \text{gini}_i \\
& + \beta_5 \text{liberalism}_i + \beta_6 \text{income}_{ij} \text{liberalism}_i + \beta_7 \text{gini}_i \text{liberalism}_i \\
& + \beta_8 \text{income}_{ij} \text{gini}_i \text{liberalism}_i + \zeta_i + \zeta_j + \varepsilon_{ij}
\end{aligned}
$$
where $\text{liberalism}_i$ is participants’ political liberalism. Here the key parameter of interest was
$\beta_8$, the coefficient on the three-way interaction $\text{income}_{ij}\,\text{gini}_i\,\text{liberalism}_i$. If results followed
our predictions, the estimated interaction effect $\hat{\beta}_8$ would be negative and significant, with model
predictions showing that high income targets were rated as warmer in more unequal target groups by
more conservative participants, but rated as less warm in more unequal target groups by more liberal
participants. Results are reported in Table 3. The three-way interaction term was not significant
(p = 0.12).
Table 3. Three-way interaction model results from Study 2

                                β̂ (SE)             p
Fixed effects
  (Intercept)                   7.744 (9.942)      0.436
  Income                        1.149 (0.194)      <.001
  Income²                       -0.005 (0.001)     <.001
  Gini                          21.064 (50.522)    0.677
  Liberalism                    3.875 (1.15)       0.001
  Income × Gini                 -0.766 (0.637)     0.229
  Income × Liberalism           -0.058 (0.015)     <.001
  Gini × Liberalism             -5.35 (7.066)      0.449
  Income × Gini × Liberalism    0.139 (0.089)      0.119
                                SD                 N
Random effects
  Participant                   9.67               1150
  Photo                         6.94               150
  Residual                      14.7               9089
Hypothesis 6
As mentioned in the main manuscript, the sixth hypothesis was related to an alternate line
of research, so we do not discuss it here.
Exploratory Results
Thermometer warmth and perceived income. We fit models of forms 1 and 2 in our
main manuscript to predict thermometer ratings of feelings of warmth toward targets. Model 1
showed that thermometer ratings were non-linearly associated with targets’ perceived incomes,
with both perceived income and its square significant at p < .001 (see Table 4) and collectively
explaining 3.0% of variance in thermometer ratings. However, like warmth ratings, the
interaction term between target mean income and target group Gini was positive but not
statistically significant in model 2 (p > 0.1, see Figure 14 and Table 4).
Perceived income. We used a modified Model 1 and 2 (omitting random intercepts of
photos, which were redundant because photos’ mean income ratings were used as a predictor,
and the $\text{income}_j^2$ term, because the relationship between the mean of photos’ income ratings and
the individual income ratings was linear by necessity) to predict income ratings (adjusted to units
of $1000). The interaction effect between target mean income and target group Gini in Model 2
was positive and significant, β̂ (SE) = 1.113 (0.221), 95% CI = [0.680, 1.547], t(8968) = 5.034, p
(unadjusted) < .001, d = -0.30. In concrete terms, the model predicted that in a group with a
Gini of 0.2 compared to a Gini of 0.05, targets with mean perceived incomes of $120,000 would
be rated as having 7.6% higher incomes, while targets with mean perceived incomes of $30,000
would be rated as having 18.1% lower incomes.
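The percentages quoted above can be approximately reproduced from the fixed-effect estimates of Model 2 for perceived income in Table 4. The sketch below is illustrative only: the function names are hypothetical, the random intercepts are set to zero, and the coefficients are the rounded values from the table, so the results agree with the text only to within rounding.

```python
# Rounded Model 2 fixed effects for perceived income, copied from Table 4
B_INTERCEPT = 11.992
B_INCOME = 0.825
B_GINI = -75.325
B_INCOME_X_GINI = 1.113


def predicted_rating(mean_income, gini):
    """Predicted income rating (in $1000s), fixed effects only.

    mean_income: photo mean perceived income in $1000s
    gini: Gini index of the target group viewed
    """
    return (B_INTERCEPT
            + B_INCOME * mean_income
            + B_GINI * gini
            + B_INCOME_X_GINI * mean_income * gini)


def percent_change(mean_income, gini_low=0.05, gini_high=0.2):
    """Percent change in predicted rating moving from gini_low to gini_high."""
    low = predicted_rating(mean_income, gini_low)
    high = predicted_rating(mean_income, gini_high)
    return 100 * (high - low) / low


print(round(percent_change(120), 1))  # high-income target
print(round(percent_change(30), 1))   # low-income target
```

With these rounded coefficients the sketch gives roughly +7.7% for a $120,000 target and roughly -18.2% for a $30,000 target, close to the 7.6% and 18.1% reported above.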
Table 4. Study 2 thermometer warmth and perceived income results

Outcome = thermometer warmth
                        Model 1                      Model 2
                        β̂ (SE)             p        β̂ (SE)             p
Fixed effects
  Intercept             2.195 (0.652)      0.001     2.758 (0.712)      <.001
  Income                0.095 (0.019)      <.001     0.088 (0.02)       <.001
  Income²               -0.001 (0.0001)    <.001     -0.001 (0.0001)    <.001
  Gini                                               -3.314 (1.694)     0.05
  Income × Gini                                      0.035 (0.021)      0.101
                        SD                 N         SD                 N
Random effects
  Participant           1.2                1118      1.2                1118
  Photo                 0.84               150       0.84               150
  Residual              1.64               7556      1.64               7556
                        χ²(df) a           p         χ²(df) a           p
Model comparison        25.387 (2)         <.001     3.855 (2)          0.146
Model fit (R² b)        0.03                         0.031

Outcome = perceived income
                        Model 1                      Model 2
                        β̂ (SE)             p        β̂ (SE)             p
Fixed effects
  Intercept             0.043 (0.936)      0.964     11.992 (2.958)     <.001
  Income                0.999 (0.012)      <.001     0.825 (0.037)      <.001
  Gini                                               -75.325 (18.627)   <.001
  Income × Gini                                      1.113 (0.221)      <.001
                        SD                 N         SD                 N
Random effects
  Participant           16.53              1147      16.56              1147
  Residual              19.22              9093      19.19              9093
                        χ²(df) a           p         χ²(df) a           p
Model comparison        5397.921 (1)       <.001     25.36 (2)          <.001
Model fit (R² b)        0.425                        0.415

a Model comparisons for Model 1 compare its fit to a null model including only random effects; model comparisons for Model 2 compare its fit with the previous Model 1
b R² refers to Nakagawa’s R² (Nakagawa et al., 2017)
Figure 14. Modelled and observed relationships between perceived target income (x axes), target
group Gini (colours) and trait attributions (y axes). Points are coloured by target group Gini.
Political/Ideological items. Participants responded to six items aimed at measuring
political and ideological attitudes. Two items were drawn from the System Justification Scale
(SJS; Jost & Kay, 2005; “Everyone has a fair shot at wealth and happiness”, “Society is set up so
that people usually get what they deserve”; 5-point scale ranging from “1 = Strongly disagree” to
“5 = Strongly agree”). These items correlated strongly (r = .46), so they were averaged to form a
single measure (M = 2.21, SD = 1.03). Two items were drawn from the Conflict subscale of the
Multidimensional Class Consciousness Scale (MCCS; Keefer, Goode, & Van Berkel, 2015; “A
wealth difference between social classes represents an unfair society”, M = 3.8, SD = 1.11;
“For the rich to increase their wealth, they must exploit the poorer classes”, M = 2.89, SD = 1.37;
5-point scale ranging from “1 = Strongly disagree” to “5 = Strongly agree”). The items were not
highly correlated (r = .17), so they were analysed separately. Another two items were drawn from the
Symbolic Racism Scale (SRS; Henry & Sears, 2002) for purposes not directly related to the
present study.
To test if exposure to unequal vs equal target groups affected responses to the
political/ideological survey items we ran OLS regressions using non-nested individual-level data
that predicted participants’ average scores on the SJS items and each of the MCCS items from
their exposure to target set inequality. For each outcome we fit two models, with and without
demographic covariates:
$$y_i = \beta_0 + \beta_1 \mathit{Gini}_i + \varepsilon_i \tag{4}$$

$$y_i = \beta_0 + \beta_1 \mathit{Gini}_i + \beta_2 \mathit{liberalism}_i + \beta_3 \mathit{gendermale}_i + \beta_4 \mathit{genderother}_i + \beta_5 \mathit{SES}_i + \beta_6 \mathit{age}_i + \varepsilon_i \tag{5}$$
where i indexes participants, $y_i$ is the political/ideological survey measure, $\mathit{Gini}_i$ is the level of
inequality displayed in participant i’s target group, $\mathit{liberalism}_i$, $\mathit{SES}_i$, and $\mathit{age}_i$ are participants’
self-reported political liberalism, subjective SES and age, respectively, $\mathit{gendermale}_i$ and
$\mathit{genderother}_i$ are dummy variables indicating participants’ selection of ‘male’ or of ‘other’ for
gender, and $\varepsilon_i$ is the error term.
No models suggested any effects of target group Gini. Estimated effects of target set
Ginis and effect sizes are reported in Table 5.
Table 5
Estimated effects of target group Gini from models predicting political/ideological item responses

| Outcome | Covariates | $\hat{\beta}$ ($SE_{\hat{\beta}}$) for $\mathit{Gini}_i$ | p | $\eta^2$ | N |
|---|---|---|---|---|---|
| SJS | No | -0.707 (0.589) | 0.23 | 0.001 | 1153 |
| SJS | Yes | -0.799 (0.542) | 0.141 | 0.002 | 1149 |
| MCCS1^a | No | -0.618 (0.636) | 0.331 | 0.001 | 1152 |
| MCCS1^a | Yes | -0.614 (0.591) | 0.299 | 0.001 | 1148 |
| MCCS2^b | No | 0.16 (0.787) | 0.839 | $4 \times 10^{-5}$ | 1153 |
| MCCS2^b | Yes | 0.189 (0.779) | 0.809 | $5 \times 10^{-5}$ | 1149 |

^a MCCS1 = “A wealth difference between social classes represents an unfair society”
^b MCCS2 = “For the rich to increase their wealth, they must exploit the poorer classes”
Study 3
Procedure
Target photos and occupations. We selected 60 target photos used in Study 2 (14
females, 26 Black), choosing 20 photos each from the top, middle, and lowest thirds of the
distribution of photos’ mean perceived incomes. High-income targets were assigned relatively
high-income occupations (e.g., Personal Finance Advisor, Lawyer, and Business Manager),
middle-income targets were assigned middle-income occupations (e.g., Teacher, Social Worker),
and low-income targets were assigned relatively low-income occupations (e.g., Janitor, Usher).
We selected high-income, medium-income, and low-income occupations based on income
statistics from the Bureau of Labor Statistics (https://www.bls.gov/oes/tables.htm). A full list of
occupations chosen for each photo and all target photos are available from the project OSF page
(https://osf.io/e4upm/?view_only=fd598f1d2273492484d33a0d50c88050).
Target group randomization. Target group randomization followed the approach of
Study 2, except that participants assigned to the Unequal condition viewed three targets each
from the low-income targets, middle-income targets, and high-income targets. Participants in the
Equal-Low, Equal-Middle, and Equal-High conditions viewed nine photos chosen randomly
from low-income, middle-income, and high-income targets, respectively.
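The composition of each condition's target group can be sketched as follows. This is a hedged illustration only: the photo identifiers are hypothetical, and it assumes photos are drawn uniformly at random without replacement within each income tier, which the text does not state explicitly.

```python
import random

# Hypothetical photo IDs: 20 photos per income tier, as described above.
TIERS = {
    "low": [f"low_{k:02d}" for k in range(20)],
    "middle": [f"middle_{k:02d}" for k in range(20)],
    "high": [f"high_{k:02d}" for k in range(20)],
}

# Maps each equal condition to the single tier its targets come from.
EQUAL_CONDITIONS = {
    "Equal-Low": "low",
    "Equal-Middle": "middle",
    "Equal-High": "high",
}


def draw_target_group(condition, rng=random):
    """Return the nine photo IDs shown to one participant in `condition`."""
    if condition == "Unequal":
        # Three targets from each of the low, middle, and high tiers.
        group = []
        for photos in TIERS.values():
            group.extend(rng.sample(photos, 3))
        return group
    # Nine targets from a single tier.
    return rng.sample(TIERS[EQUAL_CONDITIONS[condition]], 9)


if __name__ == "__main__":
    rng = random.Random(2024)
    for condition in ("Unequal", "Equal-Low", "Equal-Middle", "Equal-High"):
        print(condition, draw_target_group(condition, rng))
```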
Demographics. Participants reported their demographic information on the same
measures as in the first two studies.
Target ratings. Instructions and rating scales largely followed those used in Study 2
(Figure 15 & Figure 16), but after participants rated targets on apparent income, they were asked
to rate deserved levels of income for each target: “Based on their occupation, what do you
consider an appropriate annual income for this person?” To economize on time, we did not
measure apparent age.
Figure 15. Example of an initial target presentation in Study 3.
Figure 16. Example of target rating procedure from Study 3.
Results from models predicting thermometer ratings, perceived incomes, and perceived
deserved incomes are presented in Table 6.
Table 6
Results from Study 3 models predicting thermometer warmth, perceived income, and perceived deserved income.
Outcomes: Warmth = thermometer warmth, Income = perceived income, Deserved = perceived deserved income; M1 = Model 1, M2 = Model 2.

Fixed effects, $\hat{\beta}$ ($SE_{\hat{\beta}}$) and p

| Fixed effect | Warmth M1 | p | Warmth M2 | p | Income M1 | p | Income M2 | p | Deserved M1 | p | Deserved M2 | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | 4.831 (0.759) | <.001 | 4.942 (0.788) | <.001 | 0.991 (1.032) | 0.337 | 1.138 (2.713) | 0.675 | 13.928 (1.11) | <.001 | 14.196 (2.917) | <.001 |
| Income | 0.026 (0.02) | 0.283 | 0.026 (0.02) | 0.2 | 0.987 (0.01) | <.001 | 0.97 (0.032) | <.001 | 0.865 (0.011) | <.001 | 0.867 (0.034) | <.001 |
| Income$^2$ | -0.0002 (0.0001) | 0.232 | -0.0002 (0.0001) | 0.115 | | | | | | | | |
| Gini | | | -0.764 (1.07) | 0.475 | | | 1.172 (13.113) | 0.929 | | | -1.953 (14.114) | 0.89 |
| Income × Gini | | | 0.0001 (0.012) | 0.990 | | | 0.077 (0.143) | 0.660 | | | -0.007 (0.153) | 0.964 |

Random effects, SD and N

| Random effect | Warmth M1 SD | N | Warmth M2 SD | N | Income M1 SD | N | Income M2 SD | N | Deserved M1 SD | N | Deserved M2 SD | N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| id | 1.27 | 729 | 1.26 | 729 | 18.0 | 746 | 18.0 | 746 | 19.63 | 745 | 19.63 | 745 |
| photo | 0.85 | 60 | 0.85 | 60 | | | | | | | | |
| residual | 1.74 | 5410 | 1.74 | 5410 | 21.32 | 6659 | 21.32 | 6659 | 22.23 | 6526 | 22.23 | 6526 |

Model comparison and model fit

| | Warmth M1 | p | Warmth M2 | p | Income M1 | p | Income M2 | p | Deserved M1 | p | Deserved M2 | p |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Model comparison, $\chi^2$(df)^a | 4.865 | 0.088 | 1.536 | 0.464 | 5987.783 | <.001 | 0.961 | 0.618 | 4590 | <.001 | 0.08 | 0.961 |
| Model fit, $R^2$^b | 0.012 | | 0.013 | | 0.581 | | 0.58 | | 0.485 | | 0.485 | |

^a Model comparisons for Model 1 compare its fit to a null model including only random effects; model comparisons for Model 2 compare its fit with the previous Model 1.
^b $R^2$ refers to Nakagawa’s $R^2$ (Nakagawa et al., 2017).
Power Sensitivity Analyses
Sample sizes for each study were chosen to maximize power within pragmatic constraints
imposed by time, human resources, and research funding. However, given growing and justified
interest in issues of statistical power within the psychological sciences (e.g., Fraley & Vazire,
2014), we performed post hoc power sensitivity analyses to assess the power of each study’s
sample size to detect effects of various sizes.
To achieve this, we used an approach incorporating a mixture of bootstrapping and
simulation. To understand this approach, consider the case of a simple post hoc power analysis
performed via bootstrapping. To carry this out, a researcher who has collected N cases and
observed an effect of size $\theta$ needs only to repeatedly resample N cases with replacement from
their data, and re-run their test of the effect in each of the bootstrapped samples. By recording
the proportion of the bootstrapped samples in which the effect is statistically significant, the
researcher can thereby produce an estimate of the power of their sample size N to detect the
effect of interest at its observed effect size $\theta$ (Efron & Tibshirani, 1993). For example, if they
find that the effect is significant in 80% of the bootstrapped samples, it would suggest 80%
power at sample size N to detect the observed effect.
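The simple bootstrapped power analysis just described can be sketched as follows. This is a minimal, single-level illustration rather than the multilevel models used in our studies: it tests a simple-regression slope, uses a normal approximation for significance, and all function names and the demonstration data are hypothetical.

```python
import math
import random


def slope_is_significant(xs, ys, z_crit=1.96):
    """Two-sided test of a simple-regression slope.

    Assumption: a normal approximation (|z| > 1.96) stands in for the exact
    t distribution, which is reasonable only for fairly large samples.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:  # no variation in the predictor: slope is undefined
        return False
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0.0:  # perfect fit in a degenerate resample
        return slope != 0.0
    return abs(slope / se) > z_crit


def bootstrap_power(xs, ys, n=None, n_boot=1000, seed=0):
    """Proportion of bootstrapped samples of size n with a significant slope."""
    rng = random.Random(seed)
    n = len(xs) if n is None else n
    hits = 0
    for _ in range(n_boot):
        # Resample cases (x, y pairs) with replacement from the observed data.
        idx = [rng.randrange(len(xs)) for _ in range(n)]
        hits += slope_is_significant([xs[i] for i in idx], [ys[i] for i in idx])
    return hits / n_boot


def make_demo_data(n=200, slope=0.5, seed=1):
    """Hypothetical observed data: y = slope * x + standard normal noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(0.0, 1.0) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


if __name__ == "__main__":
    xs, ys = make_demo_data()
    print("estimated power at the observed N:", bootstrap_power(xs, ys))
```

The `n` argument lets the same routine draw bootstrapped samples smaller or larger than the observed dataset.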
Via this procedure, the researcher is treating their observed data as the population to
sample from, and manually estimating the power of a statistical test by repeatedly sampling from
that population and running the test many times. A benefit of this approach is that it can provide
a way of estimating power in situations in which a priori power analyses are difficult. In our
case, for example, estimating power sensitivity in this way avoids one of the main difficulties of
conducting power analyses with multilevel data, which is the need to estimate the variance and
distribution of random effects (Lane & Hennes, 2018). Via this bootstrapping method, these
parameters do not have to be assumed; because the observed data is treated as the population to
be sampled from, they are taken directly from the observed variance and distribution of random
effects.
Alone, post hoc estimates yield little new information, because they are more or less
restatements of p values. However, if researchers alter parameters, they can use this procedure to
yield estimates of power at different sample sizes or different effect sizes, which can provide
valuable new information. For example, if the aforementioned researcher wished to estimate
power at a different N to detect their effect $\theta$, they would simply need to take smaller or larger
bootstrapped samples from the observed data.
Altering effect sizes is also relatively easy. This is where the “simulation” part of the
process comes into play, though we are here using the term ‘simulation’ relatively loosely. In
fact, researchers can change the size of effects of interest within their data (and thus within the
‘populations’ they draw bootstrapped samples from) via relatively minor adjustments of
outcome variables that leave their datasets virtually unchanged.
In our case, in which the key effect of interest was the interaction term between target
income and target group inequality on target competence ratings, we altered the effect size of the
interaction via a two-step process. First, we computed the following:
$$\acute{y}_{ij} = y_{ij} - \beta_{interaction}\,\mathit{income}_j\,\mathit{gini}_i$$
where i indexes participants and j indexes target photos, $y_{ij}$ is the observed competence rating of
photo j by participant i, $\acute{y}_{ij}$ is the adjusted competence rating of photo j by participant i,
$\beta_{interaction}$ is the observed interaction slope of targets’ perceived income $\mathit{income}_j$ and target set
inequality $\mathit{gini}_i$, $\mathit{income}_j$ is the mean level of perceived income attributed to photo j, and $\mathit{gini}_i$
is the Gini index of income inequality, computed from photos’ mean perceived incomes, of the
set of photos viewed by participant i. Using the altered outcome score $\acute{y}_{ij}$, the effect size of the
interaction $\mathit{income}_j \times \mathit{gini}_i$ becomes zero. In the second step, we added interaction effects of
different sizes back to each dataset by choosing new values $\delta$ for the $\beta_{interaction}$ term, and
creating a newly altered version of the outcome, like so:
$$\tilde{y}_{ij} = \acute{y}_{ij} + \delta\,\mathit{income}_j\,\mathit{gini}_i$$
where $\tilde{y}_{ij}$ is the newly altered version of the outcome. Re-running the original model on $\tilde{y}_{ij}$
results in the estimated interaction term $\beta_{interaction}$ being equal to $\delta$. At this point, the dataset has still
been changed very little; the variance of random effects is unaffected, and each other slope in a
new model returns identical estimates to the original model. All that has changed is the size of
the estimated interaction term, and with it, the effect size of the interaction effect. We can then
estimate the effect size in the data based on the $\tilde{y}_{ij}$ scores, and estimate power to detect that
effect at our N using the bootstrapping approach described above.
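The two-step adjustment can be illustrated with a small sketch. As a simplification it fits an ordinary least-squares model without random effects, and the simulated ratings, coefficient values, and function names are hypothetical; it is intended only to show that the interaction slope becomes $\delta$ while the other slopes are unchanged.

```python
import random


def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[a] * row[b] for row in X) for b in range(k)]
         + [sum(row[a] * yi for row, yi in zip(X, y))]
         for a in range(k)]
    for col in range(k):
        # Partial pivoting, then eliminate this column from every other row.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fit(incomes, ginis, ys):
    """Coefficients for intercept, income, gini, and income x gini."""
    X = [[1.0, inc, g, inc * g] for inc, g in zip(incomes, ginis)]
    return ols(X, ys)


def set_interaction(incomes, ginis, ys, delta):
    """Return an outcome whose fitted interaction slope equals `delta`."""
    beta_interaction = fit(incomes, ginis, ys)[3]
    # Step 1: remove the observed interaction effect from the outcome.
    y_acute = [y - beta_interaction * inc * g
               for y, inc, g in zip(ys, incomes, ginis)]
    # Step 2: add back an interaction effect of the chosen size.
    return [y + delta * inc * g for y, inc, g in zip(y_acute, incomes, ginis)]


def simulate(seed=42, n_participants=100, n_targets=9):
    """Hypothetical ratings: each participant has one Gini and rates nine targets."""
    rng = random.Random(seed)
    incomes, ginis, ys = [], [], []
    for _ in range(n_participants):
        g = rng.uniform(0.05, 0.2)
        for _ in range(n_targets):
            inc = rng.uniform(30.0, 120.0)  # income in $1000
            incomes.append(inc)
            ginis.append(g)
            ys.append(3.0 + 0.01 * inc - 1.0 * g + 0.05 * inc * g
                      + rng.gauss(0.0, 1.0))
    return incomes, ginis, ys


if __name__ == "__main__":
    incomes, ginis, ys = simulate()
    before = fit(incomes, ginis, ys)
    after = fit(incomes, ginis, set_interaction(incomes, ginis, ys, delta=0.02))
    for name, b, a in zip(["intercept", "income", "gini", "income x gini"],
                          before, after):
        print(f"{name:>14}: before {b:8.4f}  after {a:8.4f}")
```

The adjusted outcome can then be passed to a bootstrapped power routine to estimate power for the chosen $\delta$.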
Using this approach, we used trial and error to create a range of effects $\delta$, from sizes for which our
sample sizes in Studies 2 and 3 had low power to sizes for which they had high power.
These effects were chosen separately for each dataset due to the different sample sizes used. For
each dataset and each simulated effect size, we then ran bootstrapped power analyses by
resampling at each study’s N with replacement 1000 times, and recording the proportion of samples
in which simulated effects reached statistical significance. Results are displayed in Figure 17
below, and discussed in our main manuscript.
Figure 17. Power at each sample’s N to detect effects of different sizes.
References
Adler, N. E., Epel, E. S., Castellazzo, G., & Ickovics, J. R. (2000). Relationship of subjective and
objective social status with psychological and physiological functioning: Preliminary data
in healthy, White women. Health Psychology, 19(6), 586.
Atkinson, A. B., & Bourguignon, F. (2014). Introduction: Income distribution today. In A. B.
Atkinson & F. Bourguignon (Eds.), Handbook of Income Distribution: Volume 2A
(pp. xvii-lxv). Oxford: Elsevier.
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and
powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B
(Methodological), 289-300.
Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap. Chapman & Hall/CRC:
London.
Fraley, R. C., & Vazire, S. (2014). The N-pact factor: Evaluating the quality of empirical
journals with respect to sample size and statistical power. PLoS ONE, 9(10), e109019.
Henry, P. J., & Sears, D. O. (2002). The symbolic racism 2000 scale. Political
Psychology, 23(2), 253-283.
Jost, J. T., & Kay, A. C. (2005). Exposure to benevolent sexism and complementary gender
stereotypes: Consequences for specific and diffuse forms of system justification. Journal
of Personality and Social Psychology, 88(3), 498-509.
Keefer, L. A., Goode, C., & Van Berkel, L. (2015). Toward a psychological study of class
consciousness: Development and validation of a social psychological model. Journal of
Social and Political Psychology, 3(2), 253-290.
Lane, S. P., & Hennes, E. P. (2018). Power struggles: Estimating sample size for multilevel
relationships research. Journal of Social and Personal Relationships, 35(1), 7-31.
Nakagawa, S., Johnson, P. C. D., & Schielzeth, H. (2017). The coefficient of determination $R^2$
and intra-class correlation coefficient from generalized linear mixed-effects models
revisited and expanded. Journal of The Royal Society Interface, 14(134), 20170213.
doi: 10.1098/rsif.2017.0213
Weeks, M., & Lupfer, M. B. (2004). Complicating race: The relationship between prejudice,
race, and social class categorizations. Personality and Social Psychology Bulletin, 30(8),
972-984.