
SIGNALING, STIGMA, AND SILENCE IN SOCIAL LEARNING

ARUN G. CHANDRASEKHAR‡, BENJAMIN GOLUB§, AND HE YANG⋆

Abstract. We study the stigma associated with seeking information as a potential friction in social learning, and discuss its implications for equilibrium network formation and information diffusion. In our model, a Seeker has information relevant to a task, and may choose to rely on that or seek other information from an Advisor. Higher-skill Seekers need help less. Thus, when Seekers want to be perceived as having high skill, they may refrain from asking Advisors questions, in a signaling equilibrium that is essentially the reverse of the one in Spence (1973). To test several predictions of this model, we conduct a 2 × 2 experiment with 1247 Seeker-Advisor pairs across 70 villages in India, where Seekers had three days to choose to retrieve information from Advisors. The first treatment arm varies whether Seekers get initial information randomly or through demonstrating cognitive skill on a test, and the second varies whether a Seeker’s skill is held private or always revealed to the paired Advisor. We find that when information depends on skill, low-skill Seekers have a significantly lower probability (compared to the random treatment) of seeking information when they most need it. This signaling effect is stronger when (i) the Advisor is not a friend of the Seeker or (ii) the Advisor is from a higher caste. We also identify two additional behavioral effects, separate from signaling skill, that arise when a low Seeker score is revealed to the Advisor: a direct “shame” effect, whereby the Seeker dislikes meeting the Advisor in this circumstance; and an “effort signaling” effect, whereby the Seeker is more likely to seek information in the skill-based treatment, where the (known) low score implies a need for help. These effects are particularly pronounced among friends. We argue that these forces, taken together, can stabilize and magnify homophily in communication networks and explain the observed phenomenon of persistent differences in information among physically proximate peers.

Date: This Version: December 11, 2016.

We thank Nageeb Ali, Marcella Alsan, Abhijit Banerjee, Emily Breza, Gabriel Carroll, Pascaline Dupas, Matthew O. Jackson, Navin Kartik, Michael Kremer, Horacio Larreguy, and Melanie Morten as well as participants at the Stanford Junior Faculty Workshop and the Workshop on Information and Social Economics at Caltech. Financial support from the NSF under grants SES-1156182 and SES-1559469, SCID at SIEPR, and IRiSS is gratefully acknowledged. We thank Alokik Mishra, Devika Lakhote, Tithee Mukhopadhyay, and Gowri Nagraj for their excellent research assistance.

‡ Department of Economics, Stanford University; NBER; JPAL.

§ Department of Economics, Harvard University.

⋆ Department of Economics, Harvard University.


1. Introduction

Even a fool, when he holdeth his peace, is counted wise; and he that shutteth his lips is esteemed as a man of understanding.

Proverbs 17:28

Social learning is a key channel through which people gain information about decisions they must make. Recent policy interventions have sought to leverage social learning, for instance through financial and agricultural extension, as well as by broadcasting information (Conley and Udry, 2010; Banerjee et al., 2013; Cole and Fernando, 2014). The typical model of information transmission in the literature on social learning is that, conditional on information being present, it is observed by or communicated to peers with some exogenous probability according to an exogenous network.¹ A corollary of this is that by identifying the right individuals to inform, one can maximize the reach of the information. There is an important way in which this model is unsuited to real social learning. The activation of relationships to transmit information is endogenous: the decisions to seek and to convey information are affected by the properties of the information, its spread to date, and how an act of communication is interpreted. Indeed, the flow of information is often activated by a “pull”: for example, someone comes to know that there is a new technology and asks someone else how best to use it (Conley and Udry, 2010). Decisions of this sort, and the associated distortions, are left out of most social learning models.

In this paper we study one important margin that affects agents’ decision to engage in social learning. In particular, we focus on cases in which an agent’s skill directly affects the quality of initial signals he receives, and hence his valuation of further information. For example, a less skilled farmer may be less able to use a new technology based on his own private information, because he finds it difficult to perform useful experiments. Skill therefore directly decreases the benefit of acquiring more information. Others may realize this in interpreting his information-seeking behavior, and having a reputation for low skill may itself be problematic. If seeking information leads others to revise downward their beliefs about one’s skill, then people may be reluctant to do it, in a reversal of the Spence (1973) view of education as signaling high ability. In other words, the solicitation of information, which is often necessary for the uninformed to learn, may be dampened by this concern of signaling low ability. This situation can be essential to a poverty trap: the very individuals who stand to benefit most from social learning are often the ones who are not known to be of high skill, and these are the ones deterred by stigma from engaging in social learning. Forces such as these can serve as a friction in an information aggregation process, thereby rendering interventions such as agricultural or financial extension less effective than anticipated.

¹ For important exceptions that endogenize some aspect of transmission decisions, see Niehaus (2011a), Acemoglu et al. (2014), Galeotti et al. (2013), and Calvo-Armengol et al. (2015).


This paper makes three contributions. First, we develop a simple model to make precise the nature of the frictions caused by signaling in seeking advice, and their comparative statics. Second, we perform an experiment in rural India bearing out that the frictions in the model are present empirically in a setting where social learning is critical. Third, we discuss the implications for network structure and for policy.

1.1. Model. The first contribution, our model, is designed to make our story above precise and identify how the informational environment determines equilibrium levels of stigma. In essence, the model posits that a Seeker (he) is endowed with information whose quality is privately known to him. The distribution of the quality, and hence the value, of the information depends on the Seeker’s skill, with more skilled Seekers endowed with higher-quality information (in distribution). An Advisor (she) possesses some information useful to the Seeker, and will also update on the Seeker’s skill based on his decision of whether or not to seek her help. The Seeker wants the Advisor to have a higher estimate of his skill. By introducing agent-specific shocks in a specific way, we are able to study the signaling equilibria in a simple framework, which allows precise predictions about who seeks information and how equilibria respond to changes in priors. The key predictions of the model are: (i) under natural assumptions (low-skill types need information more than high-skill types), low-skill types are more likely to seek information; (ii) this phenomenon is less pronounced when the skill of the Seeker is precisely known to the Advisor prior to the stage of information acquisition; (iii) this phenomenon is less pronounced when the need for information is less correlated with skill. The model also permits more sensitive comparative statics as we vary, for example, the importance of reputation.

1.2. Experimental Findings. The core contribution of this paper is an experiment conducted in 70 villages with 1247 total subjects in Karnataka, India. The experiment is randomized at the village level, and is conducted at the Seeker-Advisor pair level, taking place over three days in a given village.

On the first day, a Seeker is told that on the third day, he will have the opportunity to win a mobile phone by figuring out which one of two boxes contains the prize.² In the interim, he has a choice about how to get clues to guess the correct box. He is entitled to draw a certain number of clues, k, before making the guess. He also has an alternative option: between the first and the third day, he can physically visit a particular individual in the village, called an Advisor, to get clues from her. He is told the Advisor’s identity and is told the number of clues she has, k′, which is drawn randomly and is generally greater than k.³

² The top prize is worth about seven times the average daily wage.

³ In about 14% of cases, we had k′ ≤ k; very little seeking occurred (about 3% within this group) when k′ was equal to k, and no seeking occurred when k′ was strictly less than k.


The experiment has a simple 2 × 2 design: {Random, Skill} × {Private, Revealed}. The first arm varies whether k, the number of clues given to the Seeker, is (i) determined randomly (the Random treatment), independent of any attribute of the Seeker, or (ii) proportional to the Seeker’s skill on a cognitive ability test based on the Raven’s Progressive Matrices (the Skill treatment). Importantly, in both arms, the Seeker and the Advisor have both been made familiar with the test, all the rules of the game, and how k is determined.

The second arm varies whether the Seeker’s ability score is held private (the Private treatment) or revealed to the Advisor (the Revealed treatment), irrespective of his seeking decision. Under a classical signaling story fleshed out in our model, the (Skill, Revealed) treatment should encourage seeking relative to the (Skill, Private) treatment, since there is no room to signal ability by the act of seeking or not. There are also other channels, not purely informational in nature, that may deter seeking. For instance, if the Seeker has a raw disutility, which we call shame, of interacting with someone who is commonly known to have recently received adverse information about him, we should see a reduction in seeking when we compare (Random, Revealed) to (Random, Private). We can use the Random treatment as a benchmark to soak up all such effects unrelated to signaling ability; then comparisons to the Skill treatment reveal the signaling effects.

Our main experimental results are as follows. First, we demonstrate a striking decrease in the probability that a low-skilled individual (holding fixed his potential benefit from seeking) seeks information in the Skill treatment. Comparing (Skill, Private) to (Random, Private), we see a 9-13pp decline on a base of 20.3% in the seeking rate.⁴ This is evidence that agents, concerned about signaling their low skill, may shut down information-seeking behavior.

Second, we show that this effect is particularly driven by cases when the Seeker is partnered with an Advisor who is not his friend, based on network data previously collected: an 11-14pp decline on a base of a 17.4% seeking rate for low-skilled Seekers with poor signals. When the Seeker is friends with the Advisor, the decline in the seeking rate is not significantly different from zero. Similarly, we show that the Skill treatment effect is most pronounced when we condition on a pairing between a Seeker and an Advisor of higher caste, suggesting a significant premium on maintaining one’s reputation among those of higher caste.

Third, the Seeker is just as likely to seek in (Skill, Revealed) as in (Random, Private). That is, mechanically revealing a Seeker’s ability to his Advisor, and making that revelation common knowledge, yields a seeking rate that is indistinguishable from the (Random, Private) control. This is consistent with the signaling story: because there is no information about one’s score conveyed by the act of seeking in (Skill, Revealed), seeking happens just as often as when there is no information in the score at all, in (Random, Private).

⁴ When we report ranges, what varies is the specification.


So far, the findings verify that the subjects value extra signals that help them play the game, but are reluctant to reveal adverse information about themselves to gain those signals. We also find behavioral effects that are not explained purely by a skill-signaling story.

Our fourth main finding is a sizable reduction in the seeking rate when comparing (Random, Private) to (Random, Revealed), suggesting a direct “shame” effect. In both cells, the need for information is unrelated to ability, and the only difference facing a Seeker is whether his potential Advisor has recently received adverse information about him (with this revelation being common knowledge between them). The deterrent to seeking in this comparison is on average sizable: a 12pp decrease in the seeking rate on a base of 20.3%. This effect is chiefly driven by one’s social circle. Among friends, the decline in the seeking rate is 16-23pp on a base of 26.8%. Among members of the same caste, the effect is comparable.
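To put the effect sizes reported so far on a common scale, the percentage-point declines can be expressed relative to their base seeking rates. The sketch below performs only that arithmetic on figures quoted in the text; the function name `relative_decline` and the row labels are illustrative and are not taken from the paper.

```python
# Illustrative arithmetic only: converts the percentage-point (pp) declines
# quoted in the text into declines relative to the base seeking rate.
# The function name and the labels are hypothetical.

def relative_decline(pp_change: float, base_rate_pct: float) -> float:
    """Return the decline as a fraction of the base seeking rate."""
    return pp_change / base_rate_pct

# (label, (low pp, high pp), base rate in percent), all quoted in the text.
reported = [
    ("Skill signaling, low-skill Seekers", (9.0, 13.0), 20.3),
    ("Skill signaling, non-friend Advisor", (11.0, 14.0), 17.4),
    ("Shame, all pairs", (12.0, 12.0), 20.3),
    ("Shame, friends", (16.0, 23.0), 26.8),
]

relative = {
    label: tuple(relative_decline(pp, base) for pp in pp_range)
    for label, pp_range, base in reported
}

for label, (low, high) in relative.items():
    print(f"{label}: {low:.0%} to {high:.0%} of the base rate")
```

On these figures, each decline amounts to somewhere between roughly two-fifths and a little over four-fifths of the corresponding base rate.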

Considering again a benchmark model where ability signaling is the only effect, moving another step from (Random, Revealed) to (Skill, Revealed) should have no effect on seeking. As a result, combining this with the fourth finding above, we should expect that going from (Random, Private) to (Skill, Revealed) has a negative effect on seeking. However, recall our third finding, which described that comparison: going from (Random, Private) to (Skill, Revealed) actually yields the same rate of seeking.

Our fifth finding is that moving from (Random, Revealed) to (Skill, Revealed) in fact increases the probability of seeking by about 17pp, which resolves this apparent inconsistency. We call this an “obligation” effect: when the Advisor knows that the Seeker received low clues due to his low skill, the Advisor knows he needs help. Though there is a “shame” cost to seeking in this case, the Seeker is apparently also more inclined to seek when his need to get signals is common knowledge. This may reflect signaling about a different dimension: the inclination to put in effort, or to use a particular relationship when it is needed. While the interpretations we attach to the differences in the fourth and fifth effect are necessarily speculative, together they show there are subtle effects operating that are not predicted by a simple ability-signaling theory.

Taken together, our findings suggest that in the endogenous formation of a communication network in a given social environment, the information revealed about ability by asking questions can yield sizable distortions in terms of who talks to whom and what subgraphs end up being connected in information exchange and aggregation.
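The cell-by-cell comparisons described above can be laid out compactly. The sketch below enumerates the four cells of the 2 × 2 design and records, for each comparison discussed in the text, which arm changes and the reading the text attaches to it. The arm names `clue_assignment` and `score_visibility` and the helper `arms_changed` are hypothetical labels introduced here for illustration.

```python
from itertools import product

# Hypothetical labels for the two treatment arms described in the text.
CLUE_ASSIGNMENT = ("Random", "Skill")        # how the Seeker's k is determined
SCORE_VISIBILITY = ("Private", "Revealed")   # whether the Advisor sees the score

cells = list(product(CLUE_ASSIGNMENT, SCORE_VISIBILITY))

def arms_changed(baseline, treated):
    """Return the names of the arms that differ between two cells."""
    names = ("clue_assignment", "score_visibility")
    return [name for name, b, t in zip(names, baseline, treated) if b != t]

# (baseline cell, comparison cell, reading given in the text)
comparisons = [
    (("Random", "Private"), ("Skill", "Private"), "skill signaling deters seeking"),
    (("Random", "Private"), ("Random", "Revealed"), "direct shame effect"),
    (("Random", "Revealed"), ("Skill", "Revealed"), "obligation effect"),
    (("Random", "Private"), ("Skill", "Revealed"), "no detectable difference"),
]

for baseline, treated, reading in comparisons:
    changed = ", ".join(arms_changed(baseline, treated))
    print(f"{baseline} -> {treated} | changes: {changed} | {reading}")
```

The first three comparisons each change a single arm, which is what lets each be read as isolating one channel; the last changes both arms at once, so it bundles the shame and obligation effects.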

1.3. Implications for network structure and information policy. An implication of our model, confirmed in the empirical results, is that when Advisors have strong priors, there is virtually no skill-signaling effect. This can stabilize and amplify homophily (the tendency of people to interact primarily with those like themselves) in a communication network.

Suppose a Seeker, and in particular his ability, is already familiar to a potential Advisor, for instance because of a shared social environment, family, etc. Then there is no signaling deterrent to information-seeking, making it easier for communication to happen in that relationship and reinforcing the link. On the other hand, if a Seeker is not well-known by a potential Advisor, then the signaling effect will deter communication even if there are no physical obstacles. Thus, the impact of signaling on the seeking decision amplifies the strength of strong ties and the weakness of weak ties. Since weak ties often connect different homophilous subgroups, the signaling effect we have identified can reinforce homophily.

The fact that strong priors turn off the signaling effect can also exacerbate inequality. Suppose that before our game is played, players have opportunities to learn about each other’s ability. Suppose the learning happens via a “good news” process, whereby some can convincingly demonstrate that they very likely have high ability, for example through a farming success that cannot happen through luck alone. Those who have not demonstrated high ability are of uncertain ability. Our theoretical results show that in such a case, those of known skill will seek information, while others may be deterred.

Our findings also have important policy implications for effectively spreading information. Instead of seeding information through different “injection points,” an alternative strategy is to offer a more private learning environment that removes the information frictions caused by stigma concerns. One way to implement this is through a mobile consulting service for production advice, as in the experiment of Cole and Fernando (2014). Farmers can call in to request information without their actions being observed by peers, and therefore the stigma associated with learning is no longer a concern.

1.4. Related Literature. Our paper lies at the intersection of the literature on network formation for information acquisition (e.g., Acemoglu et al. (2014); Galeotti et al. (2013); Calvo-Armengol et al. (2015)) and the literature on signaling concerns in education effort (Fryer and Austen-Smith, 2005; Bursztyn and Jensen, 2015; Bursztyn et al., 2016).

Agents in our setting take an active decision to hear information from a peer. Thus there is both a network formation and a learning aspect. As the action of forming or using a connection to seek information may be perceived as a signal about skill, the communication network is endogenously determined, and we study the associated frictions. Recent work on endogenous network formation has studied equilibrium networks (Galeotti et al., 2013; Calvo-Armengol et al., 2015) and conditions for learning to occur in them (Acemoglu et al., 2014). Our work is complementary in that those models do not feature a signaling component, where the formation or use of links changes the beliefs of a counterparty about an agent’s attributes.

Signaling incentives in information-acquisition or education decisions have been intensively studied since Spence (1973). The classical perspective is that seeking education is a signal (to potential employers, for instance) of a high ability, since education has lower costs for more able types. More recently, researchers have studied signaling concerns that can deter information-seeking. Fryer and Austen-Smith (2005) examine a dual-audience signaling model, where the signal that the labor market appreciates may be one that an agent’s local community penalizes: having a relatively high type from the perspective of the labor market can correspond to having a relatively low “social type,” which is what peers care about. Of course, peers may also reward signaling high ability. Bursztyn and Jensen (2015) and Bursztyn et al. (2016) propose a series of ingenious experiments designed to examine whether peer pressure and perceptions amplify or counteract education incentives; they find that the exact social context matters a great deal for which signaling incentives are operating.

There are two main contrasts between this strand of work and our experiments. First, there is a substantive one: in the literature just discussed, the education in question is coming from outside the peer network, for instance in the form of a test preparation course. The question there is whether peers deter each other from accessing an external helping hand. Our substantive focus is, in contrast, how signaling affects seeking within the network, that is, the formation and activation of peer relationships. Our treatments are designed to identify how much within-network communication is deterred by a signaling concern. Second, and relatedly, the nature of our experimental variation is different. In Bursztyn and Jensen (2015) and Bursztyn et al. (2016), the experimental variation is in whether a student’s demand for education is made public, while holding fixed the way that education demand is related to a student’s academic ability (his need for test preparation help). In contrast, we experimentally turn on and off the signaling concern (by making the need for information a matter of luck or skill), while holding the nature of the information-acquisition activity fixed in all other respects, including visibility. This allows us to assess purely ability-signaling impediments to social learning, holding fixed any other preferences and social norms that may be involved in deciding to go public with one’s need for information.

Our work also relates to the existing empirical literature on social learning and information diffusion (Conley and Udry (2010); Banerjee et al. (2013); Kremer and Miguel (2007); Foster and Rosenzweig (1995); Beaman et al. (2014)). We add to this literature by endogenizing the decision to make an observation, something that is exogenous in most of the standard models. Since this decision is shaped by agents’ attributes and concerns of stigma, our analysis suggests new sources of inefficiency and new externalities that are relevant to deciding whether organic communication will spread information successfully.⁵

⁵ For a different perspective on filtering and negative externalities in information transmission, see Niehaus (2011b).

The remainder of our paper is organized as follows. Section 2 presents a simple framework to highlight the mechanisms we are exploring and how they interact with network structure. Section 3 contains the experimental design, background and sample statistics. The main results are presented in Section 4. We look at how signaling interacts with social structure in Section 5. Section 6 is a discussion of the implications of the results and some modeling extensions suggested by the empirical findings.

2. Model

We propose a signaling model of information acquisition effort. After presenting the framework and main results, we discuss it in the context of other signaling models.

2.1. Environment. There is one active agent, a Seeker. The Seeker either has or does not have an exogenously random opportunity or chance ($C \in \{0, 1\}$) to make a binary investment decision. For example, a decision may concern whether or not to invest, or which technology to invest in.

The Seeker also has a privately known skill (or ability) type, $a \in \{H, L\}$ (high or low), which will be relevant to the value he can get out of the opportunity, with and without help. The Advisor, an audience whose beliefs the Seeker cares about, has a prior about the Seeker’s skill, $P(a = H) = \pi$, and a prior about the Seeker’s investment opportunity, $P(C = 1) = q$.

A Seeker’s ability $a$ is independent of his chance to invest, $C$.⁶ In the experiment, the belief the Advisor holds about $C$ and $a$ before she is informed of the seeking decision is varied by revealing or not revealing the Seeker’s identity and score on the skill test.

If the Seeker does not get the chance to invest, he has no choices. If the Seeker does have the chance to invest, he has a seeking decision, $y \in \{0, 1\}$, which is observed by the Advisor. This, along with his type, affects his expected utility from the opportunity: the investment payoff is a private random variable $V^y_a$. In our experiment, $V^0_a$ is the Seeker’s expected utility from using his own clues, whereas $V^1_a$ is the expected utility from getting the Advisor’s clues. Thus the marginal value of seeking is equal to $V^1_a - V^0_a$. Let $F_a$ be the c.d.f. associated with $V^1_a - V^0_a$, and let $G_a = 1 - F_a$ be the complementary c.d.f.

The Seeker’s utility in the game is determined by the investment payoff and a perception payoff. These enter his utility function additively. The perception payoff is determined by an Advisor’s assessment of the Seeker’s skill based on whether he chooses to seek, $P(a = H \mid y)$.

For concreteness, suppose that the Advisor receives utility $W_a$ from collaborating with the Seeker on some later project if the true skill is $a$. If the Advisor chooses to collaborate, the Seeker receives a deterministic payoff of $\lambda > 0$ which enters his utility additively. The Advisor wants to collaborate if and only if

$$\frac{P(a = H \mid y)}{P(a = L \mid y)} \geq \frac{W_H}{-W_L}. \tag{2.1}$$

⁶ This assumption is for notational simplicity, and is consistent with our experiment. Correlation can be incorporated easily into the formulas below.


The ratio $\frac{W_H}{-W_L}$ corresponds to the relative value of working with a high-skill Seeker compared to the loss of working with a low-skill Seeker, and corresponds to the odds (in the likelihood-ratio sense) required to make working with this person worthwhile. Let $H$ be the c.d.f. of $\frac{W_H}{-W_L}$ and assume it has a positive density supported on the positive reals. The utility of a Seeker of skill $a$ is then

$$U_a(y) = V^y_a + \lambda H\!\left(\frac{P(a = H \mid y)}{P(a = L \mid y)}\right). \tag{2.2}$$

2.2. Basic Analysis. We will study the Bayes-Nash equilibria of our model. We first make a technical assumption:

Assumption 1. For both ability types $a \in \{H, L\}$, the random variable $V^1_a - V^0_a$ has an atomless distribution (i.e., its c.d.f. $F_a$ is continuous) whose support is the positive reals (i.e., $F_a(v) < 1$ for all $v$).

The critical assumption concerns how the marginal value of seeking, $V^1_a - V^0_a$, compares across the two types. Recall that the c.d.f. of this random variable is $F_a$.

Assumption 2. $F_L$ first-order stochastically dominates $F_H$, strictly in the sense that $F_L(v) < F_H(v)$ for every $v$.

In other words, the marginal value to low-skill types from seeking exceeds (in distribution) the marginal value to high-skill types.

Under the assumption of no atoms, it is essentially without loss of generality to assume that the Seeker uses a cutoff strategy: if $C = 1$ (so that there is a choice to make), a Seeker seeks if and only if his value is high enough: $V^1_a - V^0_a \geq v_a$.

Proposition 1 (Equilibrium Existence and Characterization). Under Assumptions 1 and 2, an equilibrium exists and it is in cutoff strategies.⁷ An equilibrium is characterized by a number $v$ so that an agent of ability type $a$ seeks if and only if $V^1_a - V^0_a \geq v$. The cutoff $v$ is the same for all agents and is a solution to

$$\frac{v}{\lambda} = H\!\left(\frac{\pi}{1-\pi}\,\frac{1 - qG_H(v)}{1 - qG_L(v)}\right) - H\!\left(\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)}\right). \tag{2.3}$$

The cutoff is positive, so that $V^1_a > V^0_a$ is necessary (but not sufficient) for an agent to seek.

All proofs are in the appendix.

The reasoning behind this central result is straightforward and intuitive. First, ability type does not enter an agent’s utility function at the time of the seeking decision. It plays a role only in determining the distribution of the gains to seeking $V_a^1 - V_a^0$, whose realization is known at the time of the agent’s decision, so both types will use the same cutoff. Given this, one can compute the odds ratios that the agent’s type is high conditional on not seeking, and conditional on seeking. These are the arguments of H on the right-hand side of (2.3). The quantity on the right-hand side of (2.3) is the decrease induced by seeking in the probability that an Advisor will want to collaborate with the Seeker. The expected loss from seeking is therefore

\[ \lambda \left[ H\!\left( \frac{\pi}{1-\pi} \cdot \frac{1 - q G_H(v)}{1 - q G_L(v)} \right) - H\!\left( \frac{\pi}{1-\pi} \cdot \frac{G_H(v)}{G_L(v)} \right) \right], \]

and the cutoff type v is indifferent between this and the expected value v he will get from the information. The Intermediate Value Theorem shows that (2.3) has a solution, and using the fact that H is increasing along with Assumption 2 shows that any solution must be positive.

[7] Up to differences that occur with zero probability.
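As an illustration of how a cutoff solving (2.3) can be found numerically, here is a minimal sketch. Several ingredients are assumptions made only for this example and are not taken from the paper: the perception payoff is set to H(x) = x / (1 + x), the gains to seeking are unit-variance normals with means mu_H < mu_L (so that Assumption 2 holds), q is treated as a free parameter in (0, 1) entering exactly as in (2.3), and all numerical values are hypothetical.

```python
import math


def survival(v, mu):
    """G_a(v) = P(V1 - V0 >= v), assuming a unit-variance normal with mean mu."""
    return 0.5 * math.erfc((v - mu) / math.sqrt(2.0))


def H(odds):
    """Hypothetical perception payoff: increasing in the odds, bounded by 1."""
    return odds / (1.0 + odds)


def rhs(v, pi, q, mu_H, mu_L):
    """Right-hand side of (2.3) at candidate cutoff v."""
    prior_odds = pi / (1.0 - pi)
    g_h, g_l = survival(v, mu_H), survival(v, mu_L)
    odds_not_seeking = prior_odds * (1.0 - q * g_h) / (1.0 - q * g_l)
    odds_seeking = prior_odds * g_h / g_l
    return H(odds_not_seeking) - H(odds_seeking)


def solve_cutoff(lam, pi, q, mu_H, mu_L, tol=1e-10):
    """Bisection on f(v) = v / lam - rhs(v) over [0, lam].

    With this H the right-hand side is below 1, so f(lam) > 0; under
    Assumption 2, f(0) < 0. Bisection returns one root; it does not
    check whether other roots exist.
    """
    lo, hi = 0.0, lam
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / lam - rhs(mid, pi, q, mu_H, mu_L) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


cutoff = solve_cutoff(lam=1.0, pi=0.5, q=0.9, mu_H=0.0, mu_L=0.5)
print(f"cutoff v = {cutoff:.4f}")
```

Setting mu_H equal to mu_L in this sketch makes the right-hand side vanish, so the computed cutoff collapses to zero, which matches the ability-irrelevance limit discussed later in the section.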

Proposition 2 (Inferences in Equilibrium). Under Assumptions 1 and 2, in any equilibrium of the signaling game, the mass of high-skill agents choosing to seek (y = 1) is strictly smaller than the mass of low-skill agents choosing to seek. Therefore, not seeking signals high ability:

P(a = H | y = 0) > P(a = H | y = 1).

This follows simply from the fact that with any cutoff, as long as it is the same for both types (as must be the case in equilibrium), Assumption 2 guarantees that the mass of high-skill agents choosing to seek is strictly smaller than the corresponding mass of low-skill agents.

To make predictions corresponding to our treatment arms, as well as the effect of pre-existing familiarity between the Seeker and the Advisor, we study some boundary cases of our problem. In particular, we wish to explore how equilibria behave as we vary two important aspects of the environment: making the Advisor’s priors precise, and making the marginal value of seeking unrelated to ability type. In both cases, the signaling effect disappears and both ability types behave the same.

Proposition 3 (Limit Cases: Known Ability or Ability-Irrelevance). Suppose that Assumptions 1 and 2 hold. Take a sequence of games satisfying either

(1) π → 0 or π → 1, fixing all other parameters; or
(2) the distribution $F_H$ converges to $F_L$ in the total variation norm, fixing all other parameters.

For any sequence of equilibria corresponding to those games, the cutoff v converges to 0 and seeking decisions become uninformative:

P(a = H | y = 0) → P(a = H | y = 1).

Finally, we turn to the important issue of uniqueness. In general, our model may feature multiple stable equilibria, reflecting the realistic feature that, despite fundamental parameters being the same in two communities, the same actions may have different meanings due to culture or custom. This is manifested formally in equation (2.3) potentially having multiple solutions[8] v. For our main result we need a set of technical assumptions:

Assumption 3. (1) $G_H(v) / G_L(v)^2 \to \infty$ as $v \to \infty$.
(2) The ratio $G_H(v) / G_L(v)$ has a limit as $v \to \infty$ and monotone first and second derivatives for large enough v.

The first part of this assumption posits that $G_H$ and $G_L$, which both tend to zero at large values of their arguments, don’t tend to zero at very different rates. This comparability assumption is satisfied for many realistic specifications, for example when both distributions are normals or truncated normals. The second part of the assumption says that $G_H(v) / G_L(v)$ has well-behaved derivatives at infinity. Note that under Assumption 2, the limit in (2) of the ratio must be at most 1.

With the additional technical conditions we can establish:

Proposition 4 (Equilibrium Uniqueness). Fix all distributions and parameters except λ. If Assumptions 1, 2, and 3 hold, there exist $\underline{\lambda}, \overline{\lambda} > 0$ such that, whenever $\lambda \notin [\underline{\lambda}, \overline{\lambda}]$, there is a unique equilibrium cutoff v.

Thus, when signaling concerns are very low or very high, there is a unique equilibrium. However, with intermediate signaling concerns, it is possible that there are multiple equilibria, and in general these will be welfare-ranked. We will return to these questions in Section 6.

Proposition 4 is our most technically intricate result; the conditions for equilibrium uniqueness at the extremes are subtle. However, it points out a virtue of the model: to examine the structure of equilibria, and the comparative statics of this set, we can simply look at a picture that plots the two sides of equation (2.3) together. Uniqueness here corresponds to the two curves having a single intersection.
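The "picture" described above can be mimicked numerically by counting how often the two sides of (2.3) cross on a grid. The sketch below does this under illustrative assumptions that are not the paper's: H(x) = x / (1 + x), unit-variance normal gains to seeking with hypothetical means, and hypothetical values of π and q. A grid count can miss crossings that are closer together than the grid spacing, so it is a diagnostic rather than a proof of uniqueness.

```python
import math


def survival(v, mu):
    """G_a(v), assuming a unit-variance normal with mean mu."""
    return 0.5 * math.erfc((v - mu) / math.sqrt(2.0))


def H(odds):
    """Hypothetical perception payoff, increasing and bounded by 1."""
    return odds / (1.0 + odds)


def gap(v, lam, pi=0.5, q=0.9, mu_H=0.0, mu_L=0.5):
    """Left side minus right side of (2.3) at cutoff v."""
    prior_odds = pi / (1.0 - pi)
    g_h, g_l = survival(v, mu_H), survival(v, mu_L)
    right = H(prior_odds * (1.0 - q * g_h) / (1.0 - q * g_l)) - H(prior_odds * g_h / g_l)
    return v / lam - right


def count_crossings(lam, n_grid=2000):
    """Count sign changes of the gap on [0, lam].

    The right side is below 1 for this H, so every crossing lies in (0, lam).
    """
    grid = [lam * i / n_grid for i in range(n_grid + 1)]
    signs = [gap(v, lam) > 0.0 for v in grid]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


crossings = {lam: count_crossings(lam) for lam in (0.05, 1.0, 20.0)}
print(crossings)
```

For small λ the left side v/λ is steep relative to the right side, so the gap is increasing and a single crossing is expected; for other values the count simply reports what the grid finds under these assumed primitives.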

2.3. Discussion and Comparison with Related Models. The basic idea that one might not seek information to avoid signaling low ability is a sort of reversal of the insight of Spence (1973). Indeed, this anti-Spence result can be shown in the original two-type model of Spence (1973). Our model features richer private information, namely the marginal benefit of seeking, $V_a^1 - V_a^0$, whose distribution depends on ability. This has two advantages. First, it generates the realistic feature that all ability types sometimes choose to seek. More importantly, it relates the rates at which they do so to economic fundamentals, namely their costs and benefits. The essence of the solution is equation (2.3), which pins down all types’ incentives and behavior. Looking at this picture illuminates comparative statics and uniqueness issues transparently.

[8] A stable solution has the slope of the right-hand side of (2.3) being between −1 and $\lambda^{-1}$ at the solution.


A key technical point is that we do not make the value of a reputation linear in the probability that one is perceived as a high type, as is often done. Indeed, we give a simple microfoundation of a natural nonlinear form of the perception payoff. The function H, which is involved in this form, plays an important role in our theoretical analysis and in deriving the single equation (2.3) on which our analysis centers.

In Bursztyn et al. (2016), the “cool to be smart” model is related to the phenomenon we study, in that students refrain from seeking education to avoid signaling low ability. But in both the theory and experiments there, ability is automatically revealed when individuals choose to seek. Thus, there is no endogeneity to the information content of a seeking decision, in contrast to this model, where that endogeneity plays the starring role.

2.4. Summary and Implications. We summarize the results of our model as they relate to our experiment. In any equilibrium of the signaling game:

(1) In the (Skill, Private) treatment, Seekers will seek if the clue advantage of seeking exceeds some positive cutoff (Proposition 1).
(2) In the (Skill, Private) treatment, high-ability Seekers, on average, will seek less than low-ability Seekers, and Advisors will update their beliefs, upon observing seeking, in the direction of low ability (Proposition 2).
(3) In the Random or Revealed treatments, or when the Seeker is well-known to the Advisor, high- and low-skill types have similar seeking rates (Proposition 3).
(4) In the (Skill, Private) treatment, both types seek less than in (Random, Private); this also comes from Proposition 3.

In Section 6 we discuss some extensions of the model that the empirical findings suggest, and some welfare comparisons it allows us to make.

3. Experimental Design and Setting

3.1. Setting. We conduct surveys and experiments with 1247 subject pairs in 70 villages in Karnataka, India. The majority of villagers have occupations in agriculture, sericulture, and dairy production.

This is a setting well-suited to our questions for two reasons. First, villagers rely substantially on word-of-mouth social learning to obtain information useful for production, so understanding what obstacles impede social learning is important. Second, as documented below, there are natural social impediments to social learning; the low propensity to learn across caste groups is a major one.

3.2. Design. In every village, 32 participants are selected to be Seekers and 32 are selected to be Advisors. These roles are randomly assigned from the village census and participants are not made aware of the identities of other participants. The experiment is at the pair (that is, Seeker-Advisor) level. We stratify the sample into friend pairs, co-caste pairs, and those belonging to neither class.

The experiment takes place over three days. On Day 1, we approach both Seekers and Advisors and collect demographic data in a baseline survey. We also administer, separately to all participants, a simple test of cognitive ability based on the Raven’s Progressive Matrices. After this, we inform the Seekers that each will have, on Day 3, the opportunity to win a cellphone worth Rs. 1350, or a smaller prize (depending on a dice roll). To win the prize, the Seeker will have to correctly guess which of two boxes contains a prize.[9] A Seeker is told that he will make choices that determine what information he will possess about the correct box before having to guess the box. He is told that there is a particular number, k, of clues that he will be entitled to before drawing; each is an independent draw stating the correct box with probability 0.55. He can forgo this entitlement and instead get k′ > k clues, of the same type, by physically visiting the paired Advisor. The paired Advisor’s clues are usable only by the paired Seeker; in particular, the Advisor cannot use them for anything. In the Advisor baseline survey we also collect data on the Advisors’ ratings of cognitive ability for a random set of 5 Seekers, which includes the Advisor’s own paired Seeker. Our main outcome of interest is the Seeker’s decision of whether to seek information from the Advisor or use his own clue endowment.
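To give a sense of what additional clues are worth in this design, the following sketch computes the probability of guessing the correct box from k independent clues that are each right with probability 0.55. It assumes, purely for illustration, that the Seeker guesses with the majority of his clues and flips a fair coin on ties; the guessing rule and the value of k′ used below are hypothetical and are not specified in the text.

```python
from math import comb


def prob_correct(k, accuracy=0.55):
    """P(correct guess) from k independent clues under majority rule.

    Assumption for this sketch: ties are broken by a fair coin flip.
    """
    total = 0.0
    for n_right in range(k + 1):
        p = comb(k, n_right) * accuracy**n_right * (1.0 - accuracy) ** (k - n_right)
        if 2 * n_right > k:
            total += p
        elif 2 * n_right == k:
            total += 0.5 * p
    return total


def gain_from_seeking(k, k_prime, accuracy=0.55):
    """Increase in P(correct guess) from swapping k own clues for k_prime clues."""
    return prob_correct(k_prime, accuracy) - prob_correct(k, accuracy)


for k in range(1, 6):
    print(k, round(prob_correct(k), 4))
```

Under this guessing rule a Seeker holding few clues gains more from switching to a larger clue set than a Seeker who already holds many, which is the sense in which low clue counts create a greater need to seek.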

On Days 1-3, the Seekers are able to seek information from the Advisors, if they choose to do so. On Day 3, we revisit the village to solicit the Seekers’ guesses, distribute prizes and rewards, and administer endline surveys to both Seekers and Advisors. The Seeker endline survey collects data on whether Seekers approached their Advisors. The Advisor endline survey collects data on any interaction with a Seeker, and an update on their ability ratings of the same 5 random Seekers whose ability they evaluated at baseline.

Our experiment is a simple 2 × 2 design, randomized at the village level.[10] The first main treatment arm varies how the k clues are drawn for the Seekers, i.e., whether they are drawn as random signals (Random treatment) or in proportion to performance on a skill test (Skill treatment). In the Random villages, k is drawn uniformly at random from {1, 2, 3, 4, 5}, while in the Skill villages, k is proportional to the Seeker’s performance on an ability test, the Raven’s Matrices Test. As noted above, the test has been made familiar to both Seekers and Advisors, so that an Advisor can correctly interpret a Seeker’s performance, and this is common knowledge. The motivation for this treatment is that, by comparing individuals of the same skill who received the same number of clues in either condition, the difference in their seeking rates can be attributed to a signaling effect.

[9] The expected value of the package is INR 180.
[10] We randomize at the village, not pair, level to guard against the scenario where players in the same village discuss the rules of the game and get confusing information.


We cross-cut the Random versus Skill design with our second treatment arm: Private versus Revealed. In the Private treatment, we do not directly reveal a Seeker’s test score to the paired Advisor, and the Advisor can only infer the Seeker’s test score by observing whether the Seeker comes for help. In the Revealed treatment, we reveal Seekers’ test scores to their Advisors, independently of their seeking decision. This arm is motivated by the fact that in the classical signaling model, if the Advisor already knows the Seeker’s skill irrespective of the Seeker’s seeking decision, then in principle the Seeker should be just as likely to seek in Skill as in Random. On the other hand, if there is disutility from interacting with an Advisor when it is commonly known that the Advisor considers the Seeker to have low skill, then there should still be a reduction in seeking. We call this “the shame effect.”

An aspect of our exercise that we want to emphasize is that the randomization of treatment, as well as the clustering of errors, is at the village level. This approach has two advantages. First, it allows us to keep the rules of the game homogeneous across a village, especially about what is revealed and to whom, and these rules can be made common knowledge among those playing the game in that community. Fundamentally, our theory and main questions are about a signaling equilibrium, and that relies on a common understanding of the meaning of various actions and the conditions under which one would take them. Having treatments vary within a community would complicate this. For example, if we had Random and Skill both occurring in the same community, the players would face more complex inference problems, since it would not be common knowledge who was assigned to the Random treatment and who was assigned to Skill. Second, there is an issue of whether empirical results are driven by fundamental parameters that are specific to one community. In our model, the key parameter is the importance of reputation for a particular kind of ability. Imagine that we had run our experiment in just one village or other socially contiguous group. Then we would be concerned that we are learning mainly about the parameters of that community, for example the importance of reputation there. By examining a number of different communities, the distribution of parameters is less likely to be driven by idiosyncratic aspects of any one small group, though of course it still depends on the culture in which the experiment takes place.

3.2.1. Measurement of Effects. The design schematic is presented in Figure 1. We can measure the following effects. Condition on a person of low skill who receives the same number of clues in any treatment cell (either as a function of their skill in the Skill treatment or by chance in the Random treatment). This holds fixed the type of the individual and the need for clues.

Only looking at the Private treatments, by comparing Skill and Random, we measure the effect of signaling. This is the pure signaling effect, which reduces the probability that an agent goes to seek due to the fact that by seeking, he signals need and therefore in part is revealing his low skill. Note this includes within it the disutility from interacting with someone when they think the agent is of low ability (that is, the probability that the agent is of a low ability times a shame effect as well). This is our main comparison and the one that we are particularly interested in.

Next, only looking at the Random treatments and by comparing Revealed and Private, we can measure the shame effect. Here the reduction in seeking comes only from the fact that one has to interact with another who is known to know that one has low ability.

Third, we can hold Revealed constant and move from Random to Skill. Observe that there is a difference here, because the Advisor knows that the Seeker’s need for information is higher in Skill since the rules are common knowledge. Therefore, if the Seeker chooses not to seek, by definition the Advisor learns something about the Seeker’s willingness to make effort and utilize social connections.

Finally, we are particularly interested in how the signaling and shame effects vary with aspects of the social network, and we will estimate these effects interacted with friendship or co-caste variables.

3.3. Data. Our data consists of three major parts: experimental data, demographics, and network data. For the experiments, the key variables we are interested in are each Seeker’s score on the Raven’s Matrices Test, the number of clues distributed, the seeking action, as well as Advisors’ perceptions of Seekers’ ability. On the demographics, we use data on each participant’s gender, caste, and the village the participant belongs to. We have previously collected household network data to help us identify whether two subjects are friends,[11] which serves as an indicator of frequency or intensity of social interaction. We also calculate average degrees of various types of links.

3.4. Sample Statistics. Table 1 presents the sample statistics. We have comparable numbers of male and female subjects: 45% of the Seekers and 42% of the Advisors are male.

Though we do have rich jati (subcaste) data, because major divisions occur at broad categories, as is standard we look at caste blocks of General, OBC, Scheduled Caste and Scheduled Tribe. In our analysis we treat General and OBC as upper caste and SC/ST as lower caste. 59% of the Seekers and 55% of the Advisors are General or OBC Caste; the remainder are Scheduled Caste, Scheduled Tribe or Religious Minorities.

We also have a wide distribution of skill as measured by the Raven’s Matrix test, scored out of 15. The mean Seeker score is 9.5 with a standard deviation of 3.2, and similarly the mean is 9.2 (3.3) for Advisors.

[11] This data was collected and analyzed in Banerjee et al. (2016).


Turning to the network data, the subjects have on average 8.6 links overall, as measured by the ALL network. They have 6.7 friendship links, 5.7 informational links, 6.6 within-caste, and 2.1 across-caste links on average. As expected, within-caste links far outnumber across-caste links both in social and informational relations.

Finally, we see that the seeking rate itself is 14%.

3.5. Do People Update Perceptions from the Raven’s Matrix Score? To make sense of our results, we first need to check that subjects do update about ability based on the Raven’s Matrix test. We show that the score on the Raven’s Matrix Test can effectively update people’s beliefs about the test-taker’s skill. Table 2 presents the results. We find that a one standard deviation increase in the Seeker’s test score corresponds to a 0.07 standard deviation increase in the change in rated ability for non-friends, and a 0.15 standard deviation increase for friends. Similarly, a one standard deviation increase in the Seeker’s score corresponds to a 3.3pp increase in the probability that the perception of intelligence increased, on a base of 36.7%. Note that this analysis includes respondent fixed effects, so it compares different individuals whose scores are revealed to the same respondent.

4. Results

Table 3 presents the main regression analysis. The main sample conditions on low-skill Seekers (below median scores) who received low clue counts (below median clues) so the subjects across the different treatment cells are comparable and the only thing that varies is the treatment condition. Given this sample, we estimate:

\[ \text{Seek}_{ijv} = \alpha + \beta_1 \, \text{SkillTreatment}_v + \beta_2 \, \text{RevealTreatment}_v + \beta_3 \, \text{SkillTreatment}_v \times \text{RevealTreatment}_v + \varepsilon_{ijv} \]

for Seeker i with Advisor j in village v. We also add Seeker score, Seeker clue count, and Advisor clue count fixed effects across specifications.
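Without the fixed effects, this specification is saturated in the two treatment dummies, so its OLS coefficients are simple combinations of the four cell means. The sketch below illustrates that mapping on a toy dataset; the data are entirely hypothetical, the helper names are ours, and the sketch ignores the fixed effects and village-level clustering used in the actual tables.

```python
def cell_mean(rows, skill, reveal):
    """Mean seeking rate in one (skill, reveal) treatment cell."""
    cell = [seek for s, r, seek in rows if s == skill and r == reveal]
    return sum(cell) / len(cell)


def saturated_coefficients(rows):
    """Coefficients of the saturated 2 x 2 model, written as cell-mean contrasts."""
    m00 = cell_mean(rows, 0, 0)  # (Random, Private), the control cell
    m10 = cell_mean(rows, 1, 0)  # (Skill, Private)
    m01 = cell_mean(rows, 0, 1)  # (Random, Revealed)
    m11 = cell_mean(rows, 1, 1)  # (Skill, Revealed)
    return {
        "alpha": m00,
        "beta1": m10 - m00,              # Skill effect under Private
        "beta2": m01 - m00,              # Reveal effect under Random
        "beta3": m11 - m10 - m01 + m00,  # interaction (difference in differences)
    }


def make_cell(skill, reveal, n_seek, n_total):
    """Hypothetical observations: (skill dummy, reveal dummy, seek indicator)."""
    return [(skill, reveal, 1)] * n_seek + [(skill, reveal, 0)] * (n_total - n_seek)


# Toy data, not from the experiment.
rows = (
    make_cell(0, 0, 2, 10)
    + make_cell(1, 0, 1, 10)
    + make_cell(0, 1, 1, 10)
    + make_cell(1, 1, 3, 10)
)
coef = saturated_coefficients(rows)
print(coef)
```

In this parameterization the seeking rate in (Skill, Revealed) relative to control is beta1 + beta2 + beta3, which is the quantity to keep in mind when the text compares that cell with (Random, Private).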

4.1. Do Low-skilled Seekers Seek Less in the Skill Treatment? Our main question is whether low-skill Seekers in the (Skill, Private) treatment seek less than they do in the (Random, Private) treatment. In particular, we need to compare low-skill subjects across treatments who are offered the same number of clues (either by random chance in control or deterministically in treatment) since only then do they have the same need to seek.

We focus on $\beta_1$ in the regression above. We find that moving to the (Skill, Private) treatment from (Random, Private) leads to a 13.2pp decline in the probability that the Seeker seeks on a base of 20.3% in the control. The effect remains robust as we add various fixed effects. Relative to column 1, column 2 adds fixed effects for the number of clues that the Advisor randomly received, column 3 additionally adds fixed effects for the Seeker’s clue count, column 4 includes the fixed effects for the Seeker’s score on the test, and column 5 includes all of the above. We find that moving to the (Skill, Private) treatment from (Random, Private) leads to an 8.8pp decline in the probability that the Seeker engages in seeking.

4.2. Revealing Seeker Skill to Advisor. We now look at what happens when we reveal the Seeker’s skill to the Advisor. As discussed above, in the classical story, this should now allow the low-skilled Seekers to resume seeking at their rates in control since there is nothing left to signal. On the other hand, if there is shame in having to engage face-to-face with someone who is known to perceive the Seeker as having low skill, there may still be a reluctance or even a further discouragement of seeking. To see this, we focus on $\beta_2$ and $\beta_3$ in the regression.

If conversing with an Advisor is uncomfortable when it is common knowledge between the Seeker and Advisor that the Seeker has low skill, then $\beta_2 < 0$. We find evidence of such a shame effect: if in the Random treatment one reveals the Seeker’s score to the Advisor, then the Seeker is 12.4pp less likely to seek on a base of 20.3% (column 5).

Next, we consider what happens when we reveal the Seeker’s score to the Advisor in the Skill treatment. Because the prospect of signaling ability has been eliminated, the Seeker no longer needs to worry about the Advisor’s inference about his ability and therefore he should be more likely to seek relative to (Skill, Private). Furthermore, because the Seeker knows that the Advisor now knows that the Seeker is involved in an experiment and has received few clues because of his low ability, if the Seeker does not seek, the Advisor learns that the Seeker is unwilling to pursue information that would be useful. We call this an “obligation” or “willingness signaling” effect, and it leads us to expect $\beta_3 \ge -\beta_1 > 0$. We find that both revealing the score to the Advisor and going from Random to Skill leads to a 4.6pp increase in the probability of seeking relative to (Random, Private), and the difference is not statistically significant.

Taken together, this suggests that the classical story is at play (introducing information about the Seeker’s skill removes reluctance to seek in the Skill treatment) but there are elements beyond this story at work as well. In particular, there is a measurable shame effect. To examine these effects further, in Section 5 we focus on the nature of the relationship between the Seeker and the Advisor.

4.3. Signaling Effect and Inferences in Equilibrium. Recall the predictions of Section 2.4. The second prediction, based on Proposition 2, is essential to the mechanics of the signaling equilibrium: low-skilled types seek more, so that seeking is a signal of low ability.


To check whether this holds in the data, we compare the seeking behavior of low-skill and high-skill types in the (Skill, Private) treatment. We define a high-skill Seeker as one having a Raven’s score above the median (in this case, 10 out of a possible 15); otherwise she is low-skill. In our sample, due to ties, 53.2% of Seekers had high skill. We examine the behavior of both types of Seekers in the Skill treatment. We find that 12.9% of the low-skill Seekers chose to seek (averaging across all clue counts, etc.). Only 7.9% of high-skill Seekers chose to seek. The difference is statistically significant, even clustering at the village level.

Given the base rate (53.2% have high skill) and these seeking rates, we can compute Bayesian posteriors conditional on seeking. Given that someone chooses to seek, the likelihood ratio that she has a high type is

\[ \frac{P(a = H \mid y = 1)}{P(a = L \mid y = 1)} = \frac{\pi}{1-\pi} \cdot \frac{P(y = 1 \mid a = H)}{P(y = 1 \mid a = L)} = \frac{0.532}{0.468} \cdot \frac{0.079}{0.129} = 0.696. \]

Solving for P(a = H | y = 1) we get 41%: conditional on seeking, the probability that one is high-skill falls from the base rate of 53.2% to 41%.

It is also instructive to do a similar computation in the (Random, Private) treatment, where there ought to be no signaling motive. Consistent with the fourth prediction, both types seek more frequently in (Random, Private) as compared to (Skill, Private). Figure 2 presents this graphically. We can compute

\[ \frac{P(a = H \mid y = 1)}{P(a = L \mid y = 1)} = \frac{\pi}{1-\pi} \cdot \frac{P(y = 1 \mid a = H)}{P(y = 1 \mid a = L)} = \frac{0.521}{0.479} \cdot \frac{0.153}{0.170} = 0.98. \]

Here we have P(a = H | y = 1) = 49.5%: conditional on seeking, the probability that one is high-skill essentially stays at 1/2.
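The two posterior calculations above can be reproduced with a few lines of arithmetic. The sketch below uses only the shares and seeking rates quoted in the text; the function name is ours.

```python
def posterior_high_given_seek(share_high, seek_rate_high, seek_rate_low):
    """Return (posterior odds, posterior probability) of high skill given seeking.

    Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.
    """
    prior_odds = share_high / (1.0 - share_high)
    odds = prior_odds * seek_rate_high / seek_rate_low
    return odds, odds / (1.0 + odds)


# (Skill, Private): 53.2% high skill; 7.9% of high and 12.9% of low types seek.
skill_private = posterior_high_given_seek(0.532, 0.079, 0.129)

# (Random, Private): 52.1% high skill; 15.3% of high and 17.0% of low types seek.
random_private = posterior_high_given_seek(0.521, 0.153, 0.170)

print(skill_private, random_private)
```

The conversion from odds to a probability is p = odds / (1 + odds), which is the step summarized as "solving for P(a = H | y = 1)" in the text.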

This confirms that the basic force operating in the theory is present in the data, and gives a sense of the magnitude of the belief updating induced by the signaling.

5. Signaling and Social Structure

We have seen that introducing an ability-signaling motivation in a social learning task reduces the probability that one seeks. In this section we explore how heterogeneity in this motive affects information sharing through different parts of the network. Specifically, when does signaling matter, and when does shame? We might think that the scope for signaling is less among frequent contacts (e.g., friends, members of the same caste) whereas for the same reason shame may matter more there. We regress

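\begin{align*}
\text{Seek}_{ijv} = {} & \alpha + \beta_1 \, \text{SkillTreatment}_v + \beta_2 \, \text{RevealTreatment}_v + \beta_3 \, \text{SkillTreatment}_v \times \text{RevealTreatment}_v \\
& + \delta_0 \, \text{SocialProx}_{ijv} + \delta_1 \, \text{SkillTreatment}_v \times \text{SocialProx}_{ijv} + \delta_2 \, \text{RevealTreatment}_v \times \text{SocialProx}_{ijv} \\
& + \delta_3 \, \text{SkillTreatment}_v \times \text{RevealTreatment}_v \times \text{SocialProx}_{ijv} + \varepsilon_{ijv}
\end{align*}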


for Seeker i with Advisor j in village v. Again we include in some specifications Seeker clue count, Advisor clue count, and Seeker score fixed effects. SocialProx indicates the social proximity between the Seeker and Advisor, here measured either by direct friendship or by whether they are of the same subcaste.

Figures 3 and 4 present our results graphically, by simply plotting the raw data. Consider Figure 3, Panel A. For the low skill, low clue count population, we can see that there is a signaling effect, but no shame effect, among non-friends. There is also a signaling elimination and obligation effect among non-friends. There are no effects for the high skill, high clue count population. Similarly, Figure 3, Panel B turns to friends. Here we are unable to detect a signaling effect, though there is a sizeable shame effect and an obligation effect for the low skill, low clue count population. Again we find no effects for the high skill, high clue count population. Figure 4 paints a similar picture using caste instead of friendship.

5.1. Signaling between Friends. Table 4 presents the results for friend and non-friend pairs. In non-friend pairs we see an 11.4pp decrease in seeking due to introducing Skill relative to (Random, Private) on a base of 17.4%. We also see a decline of about 9.1pp moving to (Random, Revealed). Finally, among non-friend pairs, having both Revealed and Skill treatments eliminates the signaling effect and adds the obligation effect (16.9pp increase), so in total seeking is 3.6pp lower than in (Random, Private), a difference which is not statistically significant.
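As a reading aid, the treatment contrasts implied by the interacted specification can be written out as follows. The first three lines follow directly from the regression equation; the last line plugs in the non-friend point estimates quoted above, under our reading that they correspond to $\beta_1$, $\beta_2$, and $\beta_3$ with SocialProx = 0.

```latex
\begin{align*}
\text{Skill vs. Random (Private):} \quad
  & \beta_1 + \delta_1 \, \text{SocialProx}, \\
\text{Revealed vs. Private (Random):} \quad
  & \beta_2 + \delta_2 \, \text{SocialProx}, \\
\text{(Skill, Revealed) vs. (Random, Private):} \quad
  & (\beta_1 + \beta_2 + \beta_3) + (\delta_1 + \delta_2 + \delta_3) \, \text{SocialProx}, \\
\text{non-friend pairs } (\text{SocialProx} = 0): \quad
  & \beta_1 + \beta_2 + \beta_3 = -11.4 - 9.1 + 16.9 = -3.6 \text{ pp}.
\end{align*}
```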

In friend pairs, we see that under (Random, Private) there is no detectable change in seeking probability relative to non-friend pairs (the point estimate is 7.03pp, not significant). We also see no detectable differential skill effect, where the point estimate is a decline of 2.3pp and not statistically different from zero. Consistent with this is the fact that subjects have more accurate priors about the skill of their friends. When, in an auxiliary experiment, we simply tell A about how B performs on the ability test, A’s rating of B’s ability changes less if the two are friends than if they are not.

Under Random, with friend pairs, moving from Private to Revealed results in a large decline of about 24.8pp, indicating a detectable and significant shame effect amongst friends. When introducing Skill into a friend pair with Revealed, we see a large and significant increase of about 49.6pp. This result indicates that the obligation effect is strong among friend pairs, and there is a strong norm that if you are known to need help, not seeking is perceived negatively.

5.2. Signaling across Caste. Table 5 presents the results of signaling across different castes. In same-caste pairs, we see no detectable skill effect, and a statistically significant shame effect of a 14.5pp decline in seeking relative to 21.9% in the Private treatment. When Skill is introduced into same-caste pairs with the Revealed treatment, we see a large, nearly detectable increase in seeking of about 18.03pp. These results indicate sizeable shame and obligation effects amongst same-caste pairs, but not a strong signaling effect.


We find that under Private and Random, there is no significant difference in seeking between same-caste and different-caste pairs. When Skill is introduced in different-caste pairs under Private, we see a significant signaling effect of about a 15.3pp decline in the seeking rate. We see no detectable shame effect under Random, with a noisy point estimate of −7.8pp. Adding Revealed to different-caste pairs conditional on the Skill treatment results in a noisy 22.4pp increase, suggesting a signaling elimination and obligation effect.

Columns 3 and 4 break it out by caste hierarchy, looking at whether a Seeker is in the same, a higher, or a lower caste than his Advisor. We see that when the Seeker’s caste is considered lower than the Advisor’s caste, he is 26.7pp less likely to seek on a base of 35%; there appears to be a premium on maintaining one’s reputation among those of higher caste. We see no detectable shame effect under Random. We do see evidence of a signaling elimination and obligation effect.

Moreover, although caste is thought to be a major impediment to information sharing, in this setting we show that an impediment appears only in interaction with the Skill treatment, i.e., through a signaling effect. This dwarfs the reluctance of an upper-caste Seeker to seek information from a lower-caste Advisor in Control (which is only a 4pp reduction on a base of 22%). In addition, we find that information exchange is, if anything, stimulated when lower-caste Seekers have the opportunity to seek information from upper-caste Advisors: in (Random, Private), our Control group, they are 12.8pp more likely to do so than same-caste pairs. We find no shame effect when the Seeker’s caste is considered higher than the Advisor’s caste, but when we introduce Skill under the Revealed treatment, we find a large increase in the seeking rate, suggesting strong signaling elimination and obligation effects when the Seeker’s caste is considered higher.

Taken together, we find that signaling matters when the people being interacted with are probably less familiar, while shame matters for higher-frequency interactions. Signaling seems to be particularly important when it is directed at a higher caste; shame seems to be particularly important among members of the same caste.

5.3. High Skill and High Clue Counts. We have presented results for the low skill and low clue count population to keep the sample constant across treatment and identify the signaling effect of interest. In Appendix D, we repeat our results for those with high ability and a high number of clues. Notice that presenting these tables separately is equivalent to a saturated model where we put a dummy for high ability and high clue interacted with every regressor in our main regressions (on the low ability and low number of clues sample).
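The equivalence between split-sample regressions and a fully interacted (saturated) regression can be checked numerically. The following is a minimal sketch on synthetic data, not the experimental data: the variables `treat` and `high`, the coefficients used to generate `y`, and the helper `ols` are all hypothetical. It illustrates equality of point estimates only; standard errors need not coincide.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        A[c] = [a / A[c][c] for a in A[c]]
        for r in range(k):
            if r != c:
                A[r] = [a - A[r][c] * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Synthetic data: a treatment dummy and a "high" subsample dummy.
rng = random.Random(0)
rows = []
for _ in range(400):
    treat = rng.randint(0, 1)
    high = rng.randint(0, 1)
    y = 0.2 - 0.1 * treat + high * (-0.15 + 0.1 * treat) + rng.gauss(0, 0.1)
    rows.append((treat, high, y))

# Separate regressions of y on (1, treat) within each subsample.
low_rows = [(t, y) for t, h, y in rows if h == 0]
high_rows = [(t, y) for t, h, y in rows if h == 1]
b_low = ols([[1.0, t] for t, _ in low_rows], [y for _, y in low_rows])
b_high = ols([[1.0, t] for t, _ in high_rows], [y for _, y in high_rows])

# Pooled regression with the dummy interacted with every regressor.
b_sat = ols([[1.0, t, h, t * h] for t, h, _ in rows], [y for _, _, y in rows])

# Base terms give the low-sample fit; adding interactions gives the high-sample fit.
b_high_from_sat = [b_sat[0] + b_sat[2], b_sat[1] + b_sat[3]]
print(b_low, b_sat[:2])
print(b_high, b_high_from_sat)
```

The base coefficients of the pooled regression reproduce the low-subsample regression, and the base coefficients plus the interaction terms reproduce the high-subsample regression.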

The results are largely consistent with our story. The level of seeking is considerably lower, 5.8% in (Random, Private), as compared to 20.3% for the low ability and low clue count sample. This is unsurprising: the need for signals is less if one has a higher clue count. In fact, in the (Random, Private) treatment, randomly receiving a low number of clues makes a Seeker 16.5pp more likely to seek on a base of 5.4%, but the interaction of low skill and low clues is not significantly different from zero.

Moreover, we see no signaling, shame, or obligation effects (Table D.1). A Seeker is just as likely to seek in (Skill, Private) as in (Random, Private). He is also just as likely to seek in (Random, Private) as in (Random, Revealed). And there is no interaction effect. Turning to the social structure results, again we see no skill, shame, or obligation effect either among strangers or friends, or among co-caste or other-caste members (Tables D.2 and D.3).

Taken together, the results from the high skill and high clue count sample suggest that because these individuals receive more clues, they have less of a need for information and do not seek very often. If they do, it is likely for idiosyncratic reasons. This is largely consistent with our theory.

6. Discussion

This paper looks at how information-seeking can be dampened when people are concerned about the signaling effects that come with it. We develop a simple model to think about how this can affect social learning and then conduct a first experiment to explore this mechanism.

We find evidence in favor of the view that individuals do worry about the signaling aspect in information-seeking, and this can deter social learning. Further, we empirically identify a raw disutility of having to interact with someone who is commonly known to have a low opinion of the agent. The signaling and shame stories act heterogeneously: signaling matters more when agents need to exchange information with strangers or acquaintances, whereas shame matters more with friends.

In addition to our experiment, we also conducted surveys with 122 respondents in four villages not from our sample. Among other things, we asked them about how they get information from their networks on a number of topics (financial products, farming inputs, and health practices). We found that passive learning was considerably more common than active learning. 95% report hearing about it passively from friends and 90% from broadcast media (newspapers, TV, and internet). Meanwhile, only 49% report actively asking friends in this village for this type of information. We also probed whether the subjects felt constraints on seeking information, asking if there were informal limits on the number of times one could approach a village member for information. 88% of the time, agents reported there being a limit on the number of times one could approach a member of their community to ask about farming, health, or financial advice. In fact, in 64% of these cases, the non-seeking was due to not wanting to appear weak or uninformed, which is consistent with stigma and signaling being an important force in this context. Finally, about 70% of respondents say that it is important to talk to others before making a decision, but the survey evidence suggests people are constrained. In fact, about 60% say that they know everything that the best person they can realistically ask for advice knows, and this particular person does not know anything more than them. However, 70% say that the best person to ask for advice does know more information. These individuals, however, are just not approachable.

Taken together, this paints a picture of a world in which information flows casually and, even though there is a recognized need to acquire information, agents may feel constrained and may not seek.

Overall, our findings suggest both theoretical and empirical implications. In terms of theory, they suggest that the incentives to engage in learning that are beyond the scope of the learning itself may matter quite a bit and can intersect in interesting ways with endogenous communication network formation.

There are a few policy implications that this line of research can shed light on. For instance, it shows that there is likely an important role for anonymous query protocols, such as Avaaj Otalo in Gujarat (Cole and Fernando, 2014). Farmers can call in to request information without their questions being observed by their peers, which strips away the signaling concern. Furthermore, any intervention that normalizes questioning and ignorance about a topic can itself amplify learning in the community.

Our empirical findings suggest additional hypotheses and extensions, which may merit theoretical study and lead to additional policy implications.

6.1. Obligation and Signaling Effort. As we discussed in the introduction, the obligation effect is not explained by a pure ability signaling model. However, it is natural if there are multiple dimensions of private type to signal.

Consider, to begin with, a relabeling of our basic model, where ability is known, but there are two types, a Willing type and a Lazy type. All else (i.e., clue counts $k$ and $k'$) equal, the Willing type has a lower cost of seeking information, whereas the Lazy type has a higher cost. This is operationalized just as in our main model, via $V_a^1 - V_a^0$.

Unless it is known that the value of information is substantial, i.e., unless the clue endowment of the Seeker is expected to be low, there should not be much of a signaling effect (a difference of seeking rates between types).

Suppose the clue endowment of the Seeker is expected to be low. Now consider the effect of revealing to the Advisor that $C = 1$, i.e., that the Seeker has a chance to seek. In practice, this corresponds to revealing the Seeker’s identity. Doing this will increase his tendency to seek, because it makes the updating based on not seeking larger for any given cutoff $v$; there is no longer uncertainty about whether he had the opportunity. Intuitively, failing to seek is a sign of the Seeker’s laziness, because the benefits to him of seeking are high.

These forces can explain the obligation effect; Seekers are signaling that they are willing to put in effort, or to use a relationship. What we are calling “willingness” could be interpreted also as sociability, humility, or work ethic. It makes sense that this should be an issue between friends more than between strangers, because friends are those with whom one is likely to have opportunities to collaborate.

If we take the willingness aspect as a private type seriously, then the model to examine becomes richer, with two dimensions of private type: cognitive ability and willingness. The payoff function that accommodates beliefs about both would be correspondingly richer. In our modeling here, we have examined the forces involved when one dimension is known, but considering uncertainty simultaneously about both can lead to interesting extensions.

While there is a reasonable signaling explanation for the “obligation” effect, we do not believe there is one that is as compelling for the “shame” phenomenon, and we are content to treat it as simply an aspect of the agent’s preferences.

6.2. Welfare. Our signaling model is designed to accommodate both the possibility that there is a unique signaling equilibrium and the possibility that there are many of them, for fundamental reasons rather than the spurious ones often arising in signaling models. Recall equation (2.3) in Section 2. A stable equilibrium is a $v$ at which the left-hand side and right-hand side intersect, with the slope of the right-hand side between $-1$ and $\lambda^{-1}$. Whether there is just one such intersection or several depends on the parameter $\lambda$ and the shapes of the functions $H$, $G_L$, and $G_H$.

When there are several stable equilibria, there are interesting welfare implications. Typically, the equilibria will be welfare-ranked, even taking into account the value to the Advisor of learning more about the Seeker’s type. Thus, shifting between equilibria can have substantial welfare consequences.

A shift between equilibria gives a meaning to the intuitive notion of “removing the stigma” of asking questions. Note that this does not mean changing $\lambda$, which is a fundamental parameter describing the value of a good reputation. The kinds of interventions we have in mind for the case of multiple equilibria leave all fundamental parameters fixed. For example, an intervention may involve artificially inducing people to ask questions by giving them prizes for doing so (while leaving it uncertain to others whether these prizes are present). This can change the interpretation of the act of seeking. Once it is changed, even when additional incentives are removed, the equilibrium played may be different.

Thus, the economic fundamentals of the model that determine the shape of the functions in (2.3) determine whether temporary policies of this sort can improve welfare.

6.2.1. Homophily. Here we consider how the stigma of signaling low skill can affect the endogenous communication network differently depending on whether there is homophily or heterophily in society.

For simplicity, suppose that there are two types of agents in the network, each in a group (which can be interpreted as caste) $g \in \{r, b\}$, and that an agent’s category and skill are independent. Consider the following two line networks depicted in Figure 5: (A) shows a homophilous network, where people of the same group tend to be neighbors within the network, and (B) shows a heterophilous network, where group membership alternates across nodes on the line.

Assume that each node $N$ has a prior belief about each of her neighbors’ skill: $\pi_L$ and $\pi_R$ for the neighbor to the left and right, respectively, drawn from a full-support distribution. We further assume that node $N$ has a more accurate prior (i.e., one closer to 0 or 1) about a neighbor who is of the same type than about one of a different type.

In the homophilous network, this results in chains of agents who all have strong priors about each other’s type. Following the same discussion as in the basic model, they are more likely to opt into seeking information from each other. On the other hand, in the heterophilous network neighbors do not have strong priors about each other’s skill, and there is greater scope for signaling across every link in the network. As a result, members of a heterophilous network are less likely to engage in social learning.

Here, type can be a parable for a number of things: for instance, those of the same type may interact more frequently (be friends socially) or perhaps are members of the same caste. What is key is that those of the same identity may carry more information about each other’s ability, for instance simply because they have the opportunity to draw more inferences from a wider range of interactions. The communication choices studied in our model then amplify either homophily or heterophily, reinforcing homophilous links and weakening heterophilous ones.

6.2.2. Inequality. Suppose there are some agents who are known to be of high ability, with $\pi \approx 1$ when they are Seekers. This may happen because they are known to have accomplished difficult tasks. Suppose that other agents’ abilities are not known with high confidence to be either high or low at the prior stage.

In this case, “ordinary” types are deterred from seeking by the signaling effect, whereas our results on precise priors show that known-high-ability individuals are not. This can create a multiplier effect and exacerbate inequalities in information. Those considered very intelligent are permitted to ask questions and are immune to the stigma, while those who need information more protect their reputations.

On the other hand, consider prior beliefs that have a “bad news” structure, where some people are known to be of low ability ($\pi \approx 0$) but others’ abilities are less confidently known. Then the known-low-ability individuals are permitted to ask questions because they don’t have a reputation to worry about, whereas those whose abilities are uncertain are more reluctant.

Thus, whether the signaling effects exacerbate or attenuate inequality depends in a particular way on the structure of prior information.


References

Acemoglu, D., K. Bimpikis, and A. Ozdaglar (2014): “Dynamics of Information Exchange in Endogenous Social Networks,” Theoretical Economics.

Banerjee, A., A. G. Chandrasekhar, E. Duflo, and M. O. Jackson (2013): “The Diffusion of Microfinance,” Science, 341.

Banerjee, A., A. G. Chandrasekhar, E. Duflo, and M. O. Jackson (2016): “Gossip: Identifying Central Individuals in a Social Network.”

Beaman, L., A. BenYishay, J. Magruder, and A. M. Mobarak (2014): “Can Network Theory based Targeting Increase Technology Adoption?”

Bursztyn, L., G. Egorov, and R. Jensen (2016): “Cool to Be Smart or Smart to Be Cool? Understanding Peer Pressure in Education.”

Bursztyn, L. and R. Jensen (2015): “How Does Peer Pressure Affect Educational Investments?” Quarterly Journal of Economics, 130, 1329–1367.

Calvo-Armengol, A., J. de Marti, and A. Prat (2015): “Communication and Influence,” Theoretical Economics, 10, 649–690.

Cole, S. A. and A. N. Fernando (2014): “The value of advice: Evidence from the adoption of agricultural practices,” HBS Working Group Paper.

Conley, T. G. and C. R. Udry (2010): “Learning about a New Technology: Pineapple in Ghana,” The American Economic Review, 35–69.

Foster, A. and M. Rosenzweig (1995): “Learning by Doing and Learning from Others: Human Capital and Technical Change in Agriculture,” Journal of Political Economy, 103, 1176–1209.

Fryer, R. and D. Austen-Smith (2005): “An Economic Analysis of ‘Acting White’.”

Galeotti, A., C. Ghiglino, and F. Squintani (2013): “Strategic information transmission networks,” Journal of Economic Theory, 148, 1751–1769.

Kremer, M. and E. Miguel (2007): “The Illusion of Sustainability,” The Quarterly Journal of Economics, 122, 1007–1065.

Niehaus, P. (2011a): “Filtered Social Learning,” Journal of Political Economy, 119, 686–720.

Niehaus, P. (2011b): “Filtered Social Learning,” Journal of Political Economy, 119, 686–720.

Spence, M. (1973): “Job market signaling,” The Quarterly Journal of Economics, 355–374.


Figures

          Private                                   Revealed
Random    value of info - seeking cost              value of info - seeking cost - shame
Skill     value of info - seeking cost - signaling  value of info - seeking cost - shame + obligation

Figure 1. Experimental design schematic.

[Figure 2: vertical axis, probability of seeking; groups, Low Skilled and High Skilled; series, (Random, Private) and (Skill, Private).]

Figure 2. Probability of seeking by low skilled (below median score) or high skilled (above median score), by treatment. This does not restrict the clue count in any way.


[Figure 3: two panels, (a) Non-friends and (b) Friends. Vertical axis, probability of seeking (0 to 1); horizontal axis, treatment: (Random, Private), (Skill, Private), (Random, Revealed), (Skill, Revealed); series, Low Skilled, Low Clue Count and High Skilled, High Clue Count.]

Figure 3. Probability of seeking plotted by treatment, with standard errors. Two samples are plotted: low skill with low clue count and high skill with high clue count, to hold ability and incentive to seek fixed across treatments.


[Figure 4: two panels, (a) Different caste and (b) Same caste. Vertical axis, probability of seeking (0 to 1); horizontal axis, treatment: (Random, Private), (Skill, Private), (Random, Revealed), (Skill, Revealed); series, Low Skilled, Low Clue Count and High Skilled, High Clue Count.]

Figure 4. Probability of seeking plotted by treatment, with standard errors. Two samples are plotted: low skill with low clue count and high skill with high clue count, to hold ability and incentive to seek fixed across treatments.


Figure 5. (a) Homophilous Network. (b) Heterophilous Network.


Tables

Table 1. Summary Statistics

Panel A: Experimental sample
                              mean    sd
probability of seeking        0.14    (0.35)
test score of advisor         9.17    (3.29)
test score of seeker          9.46    (3.20)
female advisors               0.53    (0.50)
female seekers                0.54    (0.50)
upper caste advisors          0.55    (0.50)
upper caste seekers           0.60    (0.49)

Panel B: Links within/across caste by type
                              mean    sd
all links                     8.62    (5.28)
within-caste all links        6.56    (4.59)
across-caste all links        2.13    (2.72)
social links                  6.66    (3.64)
within-caste social links     5.13    (3.33)
across-caste social links     1.60    (2.13)
information links             5.65    (3.99)
within-caste info links       4.30    (3.41)
across-caste info links       1.42    (2.02)


Table 2. Updating

                             (1)                    (2)                    (3)
VARIABLES                    Increased Perception   Increased Perception   Increased Perception

Standardized Seeker Score    0.0311                 0.0313                 0.0234
                             (0.0142)               (0.0151)               (0.0244)

Observations                 1,097                  1,097                  1,097
R-squared                    0.038                  0.101                  0.472
Mean of dep. var             0.381                  0.381                  0.381

Fixed effects: Village FE (X in one column); Respondent FE (X in one column).

Notes: Standard errors (clustered at the village level) are reported in parentheses.


Table 3. Main Results

                                   (1)        (2)        (3)        (4)        (5)
VARIABLES                          Seeking    Seeking    Seeking    Seeking    Seeking

Skill Treatment                    -0.132     -0.132     -0.0862    -0.130     -0.0883
                                   (0.0492)   (0.0494)   (0.0505)   (0.0494)   (0.0497)
Reveal score to Advisor            -0.154     -0.145     -0.137     -0.146     -0.124
                                   (0.0580)   (0.0593)   (0.0636)   (0.0618)   (0.0668)
Skill × Reveal score to Advisor    0.298      0.305      0.292      0.272      0.258
                                   (0.100)    (0.0978)   (0.101)    (0.103)    (0.101)

Observations                       452        452        452        452        452
R-squared                          0.088      0.103      0.126      0.133      0.158
Random/No Reveal Mean              0.203      0.203      0.203      0.203      0.203

Fixed effects: Advisor clue count FE (X in three columns); Seeker Score FE (X in two columns); Seeker clue count FE (X in two columns).

Notes: Standard errors (clustered at the village level) are reported in parentheses.


Table 4. Friends

                                            (1)        (2)        (3)        (4)        (5)
VARIABLES                                   Seeking    Seeking    Seeking    Seeking    Seeking

Skill Treatment                             -0.133     -0.139     -0.102     -0.137     -0.114
                                            (0.0554)   (0.0549)   (0.0556)   (0.0554)   (0.0545)
Reveal score to Advisor                     -0.0994    -0.0970    -0.103     -0.0958    -0.0910
                                            (0.0661)   (0.0653)   (0.0674)   (0.0679)   (0.0686)
Skill × Reveal score to Advisor             0.174      0.189      0.177      0.163      0.169
                                            (0.0876)   (0.0920)   (0.0875)   (0.0866)   (0.0905)
Friend                                      0.146      0.134      0.0823     0.133      0.0703
                                            (0.140)    (0.140)    (0.143)    (0.136)    (0.136)
Skill × Friend                              -0.00205   0.0150     0.0602     0.0242     0.0909
                                            (0.149)    (0.149)    (0.153)    (0.147)    (0.146)
Reveal score to Advisor × Friend            -0.224     -0.220     -0.156     -0.205     -0.157
                                            (0.167)    (0.169)    (0.167)    (0.168)    (0.165)
Skill × Reveal score to Advisor × Friend    0.471      0.460      0.394      0.415      0.350
                                            (0.222)    (0.228)    (0.228)    (0.227)    (0.229)

Observations                                452        452        452        452        452
R-squared                                   0.124      0.139      0.143      0.168      0.189
Random/No Reveal/Not Friend Mean            0.174      0.174      0.174      0.174      0.174

Fixed effects: Advisor clue count FE (X in three columns); Seeker Score FE (X in two columns); Seeker clue count FE (X in two columns).

Notes: Standard errors (clustered at the village level) are reported in parentheses.


Table 5. Caste

                                                            (1)        (2)        (3)        (4)
VARIABLES                                                   Seeking    Seeking    Seeking    Seeking

Skill Treatment                                             -0.110     -0.0417    -0.110     -0.0433
                                                            (0.0694)   (0.0672)   (0.0678)   (0.0657)
Reveal score to Advisor                                     -0.190     -0.146     -0.188     -0.145
                                                            (0.0792)   (0.0845)   (0.0811)   (0.0856)
Skill × Reveal score to Advisor                             0.276      0.222      0.274      0.226
                                                            (0.130)    (0.134)    (0.130)    (0.134)
Seeker Caste < Advisor Caste                                                      0.100      0.128
                                                                                  (0.123)    (0.0998)
Skill × Seeker Caste < Advisor Caste                                              -0.177     -0.224
                                                                                  (0.142)    (0.116)
Reveal × Seeker Caste < Advisor Caste                                             -0.0355    -0.0459
                                                                                  (0.163)    (0.148)
Skill × Reveal × Seeker Caste < Advisor Caste                                     0.104      0.0939
                                                                                  (0.265)    (0.230)
Seeker Caste > Advisor Caste                                                      -0.0910    -0.0438
                                                                                  (0.0813)   (0.0804)
Skill × Seeker Caste > Advisor Caste                                              0.0133     -0.0439
                                                                                  (0.0979)   (0.0936)
Reveal × Seeker Caste > Advisor Caste                                             0.183      0.136
                                                                                  (0.140)    (0.131)
Skill × Reveal × Seeker Caste > Advisor Caste                                     0.204      0.270
                                                                                  (0.264)    (0.268)
Seeker Caste differs from Advisor Caste                     -0.0257    0.0182
                                                            (0.0798)   (0.0715)
Skill × Seeker Caste differs from Advisor Caste             -0.0519    -0.111
                                                            (0.0953)   (0.0838)
Reveal × Seeker Caste differs from Advisor Caste            0.104      0.0676
                                                            (0.122)    (0.113)
Skill × Reveal × Seeker Caste differs from Advisor Caste    0.0528     0.0800
                                                            (0.210)    (0.194)

Observations                                                452        452        452        452
R-squared                                                   0.096      0.167      0.105      0.176
Random/No Reveal/Same Caste Mean                            0.219      0.219      0.219      0.219

Fixed effects: Advisor clue count FE (X in two columns); Seeker clue count FE (X in two columns); Seeker Score FE (X in two columns).

Notes: Standard errors (clustered at the village level) are reported in parentheses.


Appendix A. Proofs

A.1. Preliminary Results.

Lemma 1. For any $v \in \mathbb{R}$,

$$\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)} < \frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}. \tag{A.1}$$

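Proof. The ratios are well-defined by Assumption 1. By Assumption 2 on first-order stochastic dominance, we have $\frac{G_H(v)}{F_H(v)} < \frac{G_L(v)}{F_L(v)}$, and therefore

$$\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)} < \frac{\pi}{1-\pi}\,\frac{F_H(v)}{F_L(v)}.$$

Also by Assumption 2 on first-order stochastic dominance,

$$\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)} < \frac{\pi}{1-\pi}\cdot\frac{1}{1}.$$

Now, for any positive reals $x, y, z, y', z'$, if we have $x < y/z$ and $x < y'/z'$ then it follows that $x < \frac{qy+(1-q)y'}{qz+(1-q)z'}$. Applying this with $y/z = \frac{\pi F_H(v)}{(1-\pi)F_L(v)}$ and $y'/z' = \frac{\pi}{1-\pi}$, we obtain

$$\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)} < \frac{\pi}{1-\pi}\,\frac{qF_H(v) + (1-q)}{qF_L(v) + (1-q)}.$$

To deduce the desired inequality, use the identity $G_a(v) = 1 - F_a(v)$ to show

$$\frac{\pi}{1-\pi}\,\frac{qF_H(v) + (1-q)}{qF_L(v) + (1-q)} = \frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}. \tag{A.2}$$

This completes the proof.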

A.2. Proof of Proposition 1.

A.2.1. Inferences. First we will use Bayes’ rule to compute the Advisor’s inferences based on the seeking behavior. Note that, by the description of the model, conditional on being of ability $a$, the probabilities of seeking and not seeking are

$$P(y = 1 \mid a) = P(C = 1)\,G_a(v_a) = qG_a(v_a),$$

$$P(y = 0 \mid a) = 1 - P(y = 1 \mid a) = 1 - qG_a(v_a).$$

We now solve for perceptions. By Bayes’ rule,

$$\frac{P(a = H \mid y = 1)}{P(a = L \mid y = 1)} = \frac{\pi}{1-\pi}\,\frac{G_H(v_H)}{G_L(v_L)}$$

and

$$\frac{P(a = H \mid y = 0)}{P(a = L \mid y = 0)} = \frac{\pi}{1-\pi}\,\frac{1-qG_H(v_H)}{1-qG_L(v_L)}.$$
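The odds formulas above can be checked against a direct application of Bayes' rule. The following is a minimal sketch with illustrative numbers only: the values of `pi`, `q`, `g_high`, and `g_low` are assumptions chosen so that the high type seeks less often, and the function names are hypothetical.

```python
def posterior_high(pi, q, g_high, g_low, sought):
    """P(a = H | y) by Bayes' rule.

    g_high and g_low stand for G_H(v_H) and G_L(v_L): the probability that
    a Seeker of each type seeks, given the opportunity (C = 1).
    """
    p_seek_high, p_seek_low = q * g_high, q * g_low
    if sought:
        like_high, like_low = p_seek_high, p_seek_low
    else:
        like_high, like_low = 1 - p_seek_high, 1 - p_seek_low
    return pi * like_high / (pi * like_high + (1 - pi) * like_low)


def odds(p):
    """Convert a probability into odds p / (1 - p)."""
    return p / (1 - p)


# Illustrative values (assumptions, not estimates from the paper).
pi, q, g_high, g_low = 0.5, 0.8, 0.3, 0.6

p_seek = posterior_high(pi, q, g_high, g_low, sought=True)
p_silent = posterior_high(pi, q, g_high, g_low, sought=False)

# Odds implied by the closed-form expressions in the text.
odds_seek_formula = pi / (1 - pi) * g_high / g_low
odds_silent_formula = pi / (1 - pi) * (1 - q * g_high) / (1 - q * g_low)

print(odds(p_seek), odds_seek_formula)
print(odds(p_silent), odds_silent_formula)
print(p_silent > p_seek)
```

With these numbers the Advisor's belief that the Seeker is high-skill is larger after silence than after seeking, which is the direction established in Proposition 2.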


A.2.2. Equilibrium Condition. By the no-atoms assumption, in any equilibrium the type satisfying $V_a^1 - V_a^0 = v_a$ has to be indifferent between seeking and not seeking. We can compute

$$U_a(1) - U_a(0) = V_a^1 - V_a^0 + \lambda\left[H\!\left(\frac{\pi}{1-\pi}\,\frac{G_H(v_H)}{G_L(v_L)}\right) - H\!\left(\frac{\pi}{1-\pi}\,\frac{1-qG_H(v_H)}{1-qG_L(v_L)}\right)\right].$$

The Seeker will want to seek if and only if the expected gain from the improved signal is large enough to offset any disutility from being perceived as a lower type with higher probability. For indifference we require

$$\frac{v_a}{\lambda} = H\!\left(\frac{\pi}{1-\pi}\,\frac{1-qG_H(v_H)}{1-qG_L(v_L)}\right) - H\!\left(\frac{\pi}{1-\pi}\,\frac{G_H(v_H)}{G_L(v_L)}\right).$$

The right-hand side does not depend on $a$. Thus in any equilibrium there is only one cutoff for both skill types, $v = v_H = v_L$, and a cutoff being an equilibrium is equivalent to solving

$$\frac{v}{\lambda} = H\!\left(\frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}\right) - H\!\left(\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)}\right). \tag{2.3}$$

A.2.3. Existence. A $v$ solving (2.3) exists because the right-hand side of (2.3) is a continuous function of $v$ (by Assumption 1) bounded in absolute value by 1 (it is a difference of two numbers in $[0, 1]$) and the function $v \mapsto v/\lambda$ is a continuous function whose values cover the interval $[-1, 1]$, so they must intersect by the Intermediate Value Theorem.
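The existence argument suggests a simple numerical procedure: bracket a sign change of $v/\lambda - R(v)$, where $R$ is the right-hand side of (2.3), and bisect. The sketch below is illustrative only. The functional forms are assumptions made for the example and are not taken from the paper: $H(r) = r/(1+r)$, logistic survival functions $G_a$ with a lower location for the high type (so that $G_H < G_L$), and arbitrary parameter values.

```python
import math


def H(r):
    """Assumed reputational payoff: maps odds r into [0, 1) via r / (1 + r)."""
    return r / (1.0 + r)


def G(v, mu):
    """Assumed logistic survival function G_a(v) = 1 - F_a(v), location mu."""
    return 1.0 / (1.0 + math.exp(v - mu))


def R(v, pi=0.5, q=0.8, mu_high=0.0, mu_low=1.0):
    """Right-hand side of (2.3) under the assumed H, G_H, and G_L."""
    prior_odds = pi / (1.0 - pi)
    g_h, g_l = G(v, mu_high), G(v, mu_low)
    odds_silent = prior_odds * (1.0 - q * g_h) / (1.0 - q * g_l)
    odds_seek = prior_odds * g_h / g_l
    return H(odds_silent) - H(odds_seek)


def solve_cutoff(lam, tol=1e-10):
    """Bisection on f(v) = v / lam - R(v) over [0, lam].

    Since 0 < R < 1 here, f(0) < 0 and f(lam) > 0, so a root is bracketed.
    """
    lo, hi = 0.0, lam
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / lam - R(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


v_star = solve_cutoff(lam=2.0)
print(v_star, v_star / 2.0 - R(v_star))
```

Bisection returns one solution inside the bracket; when (2.3) has several solutions, which one it finds depends on the bracket, so this sketch does not by itself say anything about uniqueness.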

A.2.4. Positivity of Cutoff. Now we argue that any $v$ solving this is positive. Recall equation (2.3):

$$\frac{v}{\lambda} = H\!\left(\frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}\right) - H\!\left(\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)}\right).$$

Since $H$ is an increasing function, Lemma 1 guarantees that the right-hand side of (2.3) is positive, and so any $v$ solving the equation is positive.

A.3. Proof of Proposition 2. Proposition 1 establishes that the cutoff is the same for both types. Lemma 1, along with the expressions about perceptions derived at the beginning of the proof of Proposition 1, establishes that

$$\frac{P(a = H \mid y = 0)}{P(a = L \mid y = 0)} = \frac{\pi}{1-\pi}\,\frac{1-qG_H(v_H)}{1-qG_L(v_L)} > \frac{\pi}{1-\pi}\,\frac{G_H(v_H)}{G_L(v_L)} = \frac{P(a = H \mid y = 1)}{P(a = L \mid y = 1)}.$$

This can be rewritten as

$$\frac{P(a = H \mid y = 0)}{1 - P(a = H \mid y = 0)} > \frac{P(a = H \mid y = 1)}{1 - P(a = H \mid y = 1)}.$$

Since the function $x \mapsto \frac{x}{1-x}$ is strictly increasing, it follows that

$$P(a = H \mid y = 0) > P(a = H \mid y = 1).$$


A.4. Proof of Proposition 3. If it is hypothesis (1) that holds, and the prior becomes precise, then the right-hand side of (2.3) tends pointwise to 0, and so the $v$ solving (2.3) must be very close to 0. Then from (2.3) we have

$$H\!\left(\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)}\right) \approx H\!\left(\frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}\right),$$

and since $H$ is monotone, it follows that

$$\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)} \approx \frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}.$$

Using the formulas in Section A.2.1 above, it follows that

$$\frac{P(a = H \mid y = 1)}{P(a = L \mid y = 1)} - \frac{P(a = H \mid y = 0)}{P(a = L \mid y = 0)} \to 0.$$

In other words,

$$\frac{P(a = H \mid y = 1)}{1 - P(a = H \mid y = 1)} - \frac{P(a = H \mid y = 0)}{1 - P(a = H \mid y = 0)} \to 0.$$

Since the function $x \mapsto \frac{x}{1-x}$ is increasing and continuous, this shows that

$$P(a = H \mid y = 1) - P(a = H \mid y = 0) \to 0.$$

The argument assuming that (2) holds is very similar.
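The shrinking gap between the two posteriors can be seen in a small numerical illustration. This is a simplified sketch, not the argument of the proof: it holds the seeking probabilities fixed at assumed values (`g_high`, `g_low`, and `q` are illustrative) rather than re-solving for the equilibrium cutoff at each prior, and only varies the prior `pi`.

```python
def posterior_gap(pi, q=0.8, g_high=0.3, g_low=0.6):
    """P(a = H | y = 0) - P(a = H | y = 1) at fixed seeking probabilities.

    g_high and g_low are assumed values of G_H(v) and G_L(v); in the model
    they would depend on the equilibrium cutoff, which is held fixed here.
    """
    prior_odds = pi / (1.0 - pi)
    odds_silent = prior_odds * (1.0 - q * g_high) / (1.0 - q * g_low)
    odds_seek = prior_odds * g_high / g_low
    # Convert odds back to probabilities via r / (1 + r).
    return odds_silent / (1.0 + odds_silent) - odds_seek / (1.0 + odds_seek)


# Priors moving from diffuse (0.5) toward precise (near 1 or near 0).
toward_one = [posterior_gap(pi) for pi in (0.5, 0.9, 0.99, 0.999)]
toward_zero = [posterior_gap(pi) for pi in (0.5, 0.1, 0.01, 0.001)]

print(toward_one)
print(toward_zero)
```

In both directions the gap is positive and falls toward zero as the prior becomes precise, so the act of seeking carries little reputational information about a Seeker whose ability is already well known.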

A.5. Proof of Proposition 4. Write condition (2.3) with the right-hand side denoted $R(v)$, so that

$$\frac{v}{\lambda} = R(v) := H\!\left(\frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}\right) - H\!\left(\frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)}\right). \tag{A.3}$$

To show that the equilibrium is unique for large $\lambda$, we will study the intersections of the curves $v \mapsto v/\lambda$ and $R$ for large $\lambda$. In particular, we will show that as $v \to \infty$, the derivative $R'(v)$ converges quickly to zero: so quickly that, the first time that $v/\lambda$ intersects it, $R$ is too shallow from then onward to permit another intersection with the line $v \mapsto v/\lambda$.

To this end, let us first study $R'(v)$ for large $v$. We will show:

Claim 1. For every constant $k > 0$, there is a $\underline{v}$ so that if $v \ge \underline{v}$, we have $R'(v) < 1/(kv)$.

Proof. Define the shorthand $r_0 = \frac{\pi}{1-\pi}\,\frac{1-qG_H(v)}{1-qG_L(v)}$ and $r_1 = \frac{\pi}{1-\pi}\,\frac{G_H(v)}{G_L(v)}$. Differentiating (A.3) we get

$$R'(v) = \frac{\pi}{1-\pi}\left\{ H'(r_0)\,q\left[-\frac{G_H'(v)}{1-qG_L(v)} + \frac{(1-qG_H(v))\,G_L'(v)}{(1-qG_L(v))^2}\right] - H'(r_1)\left[\frac{G_H'(v)}{G_L(v)} - \frac{G_H(v)\,G_L'(v)}{G_L(v)^2}\right]\right\}.$$


Assumption 3(1), along with basic properties of c.d.f.’s, yields that the quantity in the secondsquare brackets dominates the quantity in the first square brackets term by term. Thus, ifwe can show the claim only for the term on the second line, we will be done. This isthe the same as showing the claim for the derivative of the second term in (A.3). Defineρ(v) = GH(v)/GL(v). Then we want to so the claim for

S(v) := − d

dvH(

π

1− πρ(v)).

Now,S(v) = − π

1− πH′(

π

1− πρ(v))ρ′(v).

By Assumption 3(2), ρ̂ := limv→∞ ρ(v) exists; because H has a positive density on thepositive reals, it follows that H ′

1−πρ(v))

converges to the strictly positive quantity H ′(ρ̂)as v → ∞. Thus it suffices to show that for every k there is a v1 so that for all v > v1 wehave ρ′(v) < 1/(kv).

We will now deduce this from Assumption 3(2). The assumption says that for largeenough v, the ratio ρ(v) has with monotone first and second derivatives. Now, fixing anyk, if ρ′(v) > k/v infinitely often, then by monotonicity of the second derivative, it must beabove k/v for all large enough values of v, say above v. In that case, integrating

´∞vρ′(v)

would give infinity, contradicting that ρ(v) ≤ 1 by Assumption 2.

Now suppose, toward a contradiction, that for every λ > 0, there exists a λ > λ so thatthere are multiple intersections between the curves v 7→ v/λ and R. Then for λ = 1, 2, . . ., letthe minimum and maximum values of v at which the two curves intersect be called vλ and vλ,which are distinct. Now, there are two possibilities. First suppose vλ remains bounded alongthis sequence, say by a number b. Then by a compactness argument (using the fact thatthe graph of the continuous function R on the compact interval [0, b] is compact) there is anaccumulation point of the ordered pairs ( (vλ, R(vλ)) )∞λ=1. By continuity of R on the domain[0, b], it follows that this accumulation point has the form (v̂, R(v̂)) for some v̂. Because itis a limit of points of the form (v, v/λ) with v ≤ b and λ arbitrarily large, it must be thatR(v̂) = 0, contradicting the fact that R is positive (by Lemma 1 and strict monotonicity ofH).

So we must be in the other possibility, where the sequence (v̲_λ)_{λ=1}^∞ is unbounded. In that case, by the Mean Value Theorem, for each λ = 1, 2, . . ., there must be a point ṽ_λ ∈ [v̲_λ, v̄_λ] so that R′(ṽ_λ) = 1/λ. Moreover, since R(v) ≤ 1, this ṽ_λ ≤ 1/λ. Together, these facts contradict Claim 1.

So our supposition must be false, and for some λ̲ > 0 it must be the case that if λ > λ̲, there is exactly one v solving v/λ = R(v).


To show that for small enough λ, there is a unique intersection between v ↦ v/λ and R, simply note that in some neighborhood of 0, call it (0, v̄), the derivative of R(v) is uniformly bounded in absolute value as a consequence of the assumptions that H, as well as G_L and G_H, have positive, bounded densities. Thus as long as λ is small enough, 1/λ exceeds the maximum magnitude that R′(v) achieves in the interval (0, v̄). If λ is also so small that v/λ must intersect R between 0 and v̄ (which holds once λ < v̄, because R(v) ≤ 1 forces any intersection to satisfy v ≤ λ), it follows that the two curves v ↦ v/λ and R can cross only once.


Online Appendix: Not for Publication

Appendix B. Experimental Materials

B.1. Skill Test. We present sample questions from the skill test.

Figure B.1. Examples from Skill Test.


[Figure B.2 plot: axes are Test Score (0 to 15) and Clues (1 to 5).]

B.2. Score to Clues. We present the mapping from the skill test score to the number of clues drawn in the Skill treatment.

Figure B.2. Score to Clues Schedule.


Appendix C. Belief Updating

Table C.1. Priors

Dependent variable (all columns): Standardized Perceived Prior

VARIABLES                                         (1)        (2)        (3)

Standardized Seeker Score                      0.0719     0.0269     0.0364
                                             (0.0344)   (0.0379)   (0.0207)
Friend                                          0.822
                                             (0.0993)
Seeker Score × Friend                           0.156
                                             (0.0990)
Know Well                                                  1.304
                                                        (0.0733)
Standardized Seeker Score × Know Well                     0.0733
                                                        (0.0454)
Know Score                                                            0.868
                                                                   (0.0216)
Standardized Seeker Score × Know Score                               0.0286
                                                                   (0.0191)

Observations                                    1,533      1,393      1,393
R-squared                                       0.460      0.700      0.854
Respondent FE                                       X          X          X
Mean of dep. var                                 0.00       0.00       0.00

Notes: Standard errors (clustered at the village level) are reported in parentheses.


Table C.2. Heterogeneous Updating

Dependent variable (all columns): Increased Perception

VARIABLES                                         (1)        (2)

Standardized Seeker Score                      0.0633     0.0424
                                             (0.0286)   (0.0181)
Know Well                                      -0.185
                                             (0.0405)
Standardized Seeker Score × Know Well         -0.0480
                                             (0.0351)
Know Score                                               -0.0975
                                                        (0.0195)
Standardized Seeker Score × Know Score                   -0.0243
                                                        (0.0157)

Observations                                    1,393      1,393
R-squared                                       0.410      0.411
Respondent FE                                       X          X
Mean of dep. var                                0.381      0.381

Notes: Standard errors (clustered at the village level) are reported in parentheses.


Appendix D. High Skill and High Score Sample

Table D.1. Main Results

Dependent variable (all columns): Seeking

VARIABLES                                         (1)        (2)        (3)        (4)        (5)

Skill Treatment                               0.00273    0.00426   -0.00995    0.00874    -0.0145
                                             (0.0214)   (0.0218)   (0.0231)   (0.0222)   (0.0268)
Reveal score to Advisor                       -0.0314    -0.0360    -0.0542    -0.0288    -0.0558
                                             (0.0352)   (0.0372)   (0.0373)   (0.0364)   (0.0416)
Skill × Reveal score to Advisor               -0.0201    -0.0127    0.00558    -0.0196    0.00810
                                             (0.0510)   (0.0495)   (0.0503)   (0.0498)   (0.0510)

Observations                                      484        484        484        484        484
R-squared                                       0.046      0.072      0.082      0.057      0.095
Random/No Reveal Mean                           0.058      0.058      0.058      0.058      0.058

Advisor clue count FE   X X X
Seeker Score FE   X X
Seeker clue count FE   X X

Notes: Standard errors (clustered at the village level) are reported in parentheses.
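As a reading aid for Table D.1, the following minimal sketch adds up the column (1) point estimates to give the implied difference in seeking probability for each treatment cell relative to the Random/No Reveal cell. It assumes only that the specification is linear in the two treatment indicators and their interaction; the function name `implied_difference` is hypothetical and not from the paper. Standard errors for the summed effects cannot be recovered from the table, because the covariances between coefficients are not reported.

```python
# Column (1) point estimates as reported in Table D.1.
BETA_SKILL = 0.00273         # Skill Treatment
BETA_REVEAL = -0.0314        # Reveal score to Advisor
BETA_INTERACTION = -0.0201   # Skill x Reveal score to Advisor


def implied_difference(skill: bool, reveal: bool) -> float:
    """Implied difference in seeking probability relative to the
    Random/No Reveal cell, assuming a linear model with interaction."""
    return (BETA_SKILL * skill
            + BETA_REVEAL * reveal
            + BETA_INTERACTION * (skill and reveal))


if __name__ == "__main__":
    for skill in (False, True):
        for reveal in (False, True):
            arm = "Skill" if skill else "Random"
            info = "Reveal" if reveal else "No Reveal"
            diff = implied_difference(skill, reveal)
            print(f"{arm}/{info}: {diff:+.5f}")
```

For the Skill/Reveal cell the three coefficients are summed, so the implied difference is 0.00273 − 0.0314 − 0.0201 = −0.04877.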


Table D.2. Friends

Dependent variable (all columns): Seeking

VARIABLES                                         (1)        (2)        (3)        (4)        (5)

Skill Treatment                               0.00104    0.00607    -0.0137    0.00779    -0.0150
                                             (0.0252)   (0.0264)   (0.0263)   (0.0254)   (0.0289)
Reveal score to Advisor                       -0.0234    -0.0242    -0.0464    -0.0214    -0.0575
                                             (0.0443)   (0.0464)   (0.0470)   (0.0455)   (0.0523)
Skill × Reveal score to Advisor               -0.0187    -0.0163    0.00738    -0.0156     0.0221
                                             (0.0695)   (0.0660)   (0.0728)   (0.0692)   (0.0701)
Friend                                         0.0550     0.0718     0.0446     0.0573     0.0569
                                             (0.0891)   (0.0881)   (0.0848)   (0.0901)   (0.0807)
Skill × Friend                               0.000425    -0.0160    0.00749   -0.00254   -0.00290
                                             (0.0965)   (0.0955)   (0.0926)   (0.0970)   (0.0882)
Reveal score to Advisor × Friend              -0.0458    -0.0654    -0.0208    -0.0429   -0.00864
                                              (0.147)    (0.141)    (0.146)    (0.147)    (0.138)
Skill × Reveal score to Advisor × Friend     0.000560     0.0251    -0.0359   -0.00982    -0.0460
                                              (0.189)    (0.182)    (0.189)    (0.193)    (0.184)

Observations                                      484        484        484        484        484
R-squared                                       0.052      0.079      0.062      0.063      0.101
Random/No Reveal/Not Friend Mean                0.045      0.045      0.045      0.045      0.045

Advisor clue count FE   X X X
Seeker Score FE   X X
Seeker clue count FE   X X

Notes: Standard errors (clustered at the village level) are reported in parentheses.


Table D.3. Caste

Dependent variable (all columns): Seeking

VARIABLES                                                         (1)        (2)        (3)        (4)

Skill Treatment                                                0.0133    -0.0124     0.0142    -0.0111
                                                             (0.0321)   (0.0372)   (0.0322)   (0.0373)
Reveal score to Advisor                                       -0.0408    -0.0713    -0.0377    -0.0690
                                                             (0.0402)   (0.0467)   (0.0404)   (0.0469)
Skill × Reveal score to Advisor                               0.00757     0.0443    0.00635     0.0432
                                                             (0.0706)   (0.0692)   (0.0717)   (0.0703)
Seeker Caste < Advisor Caste                                   0.0147   -0.00830
                                                             (0.0619)   (0.0704)
Skill × Seeker Caste < Advisor Caste                          -0.0807    -0.0516
                                                             (0.0720)   (0.0779)
Reveal × Seeker Caste < Advisor Caste                         -0.0514    -0.0370
                                                             (0.0701)   (0.0872)
Skill × Reveal × Seeker Caste < Advisor Caste                -0.00574    -0.0331
                                                             (0.0992)    (0.112)
Seeker Caste > Advisor Caste                                  -0.0179    -0.0411
                                                             (0.0707)   (0.0617)
Skill × Seeker Caste > Advisor Caste                         -0.00471    0.00793
                                                             (0.0852)   (0.0782)
Reveal × Seeker Caste > Advisor Caste                          0.0804     0.0893
                                                              (0.121)    (0.100)
Skill × Reveal × Seeker Caste > Advisor Caste                  -0.107     -0.113
                                                              (0.169)    (0.159)
Seeker Caste differs from Advisor Caste                                            -0.00375    -0.0269
                                                                                   (0.0567)   (0.0548)
Skill × Seeker Caste differs from Advisor Caste                                     -0.0369    -0.0170
                                                                                   (0.0670)   (0.0644)
Reveal × Seeker Caste differs from Advisor Caste                                     0.0214     0.0322
                                                                                   (0.0852)   (0.0760)
Skill × Reveal × Seeker Caste differs from Advisor Caste                            -0.0708    -0.0861
                                                                                    (0.120)    (0.113)

Observations                                                      484        484        484        484
R-squared                                                       0.053      0.103      0.057      0.106
Random/No Reveal/Same Caste Mean                                0.060      0.060      0.060      0.060
Advisor clue count FE                                                          X                     X
Seeker clue count FE                                                           X                     X
Seeker Score FE                                                                X                     X

Notes: Standard errors (clustered at the village level) are reported in parentheses.