Firms and Within Country Migration: Evidence from India

Prithwiraj Choudhury Tarun Khanna

Working Paper 14-080

Copyright © 2014, 2018 by Prithwiraj Choudhury and Tarun Khanna

Working papers are in draft form. This working paper is distributed for purposes of comment and discussion only. It may not be reproduced without permission of the copyright holder. Copies of working papers are available from the author.

Prithwiraj Choudhury Harvard Business School

Tarun Khanna Harvard Business School

By PRITHWIRAJ CHOUDHURY AND TARUN KHANNA¹

While prior literature on within-country migration is focused on self-selection models, agents such as firms

can also play an important role in facilitating within country migration. In the face of physical, informational

and social barriers to migration, firms with nationwide hiring practices can benefit from facilitating the

migration of high ability individuals from smaller towns to production centers located in larger cities. We

argue that despite the ex-ante higher search costs associated with hiring from smaller towns, firms can

benefit from such hiring. In this study, we explore the relationship between an employee’s hometown and

her subsequent work productivity. To do so, we exploit unique personnel data for newly hired college

graduates within an Indian technology firm. We leverage the fact that the assignment of an employee to

one of many production centers within the firm is uncorrelated with observable characteristics of the

employee and find that employees hired from smaller towns have higher productivity than their counterparts

from large cities. As a possible explanation of our results, we test for selection and find that employees

hired from smaller towns outperform their large city counterparts in standardized logical tests at the

recruitment stage. We also find that employees hired from smaller towns have lower attrition rates

compared to those hired from large cities. We provide evidence that the extra payoff conferred by higher

productivity and lower attrition rates sufficiently outweigh the extra search costs associated with hiring

from smaller towns. Our secondary results also indicate that other forms of marginalization (e.g., being

from a “lower caste”) are related to higher productivity.

1 Prithwiraj Choudhury: Harvard University. Tarun Khanna: Harvard University. The

authors would like to thank Lauren Cohen, Lawrence Katz, Bill Kerr, Dylan Minor, Pian Shu, Claudia Steinwender; seminar participants at Duke,

Harvard University, the International Economic Association (IEA) meetings in 2011 in Beijing, INSEAD, London School of Economics, the

Seventh International Conference on Migration and Development, University of Oxford, NYU Stern School of Business, Washington University St. Louis, Wharton, the World Bank DECTI seminar and three anonymous reviewers at the Quarterly Journal of Economics for their comments

on a previous draft.

I. Introduction

The influence of firms in shaping human capital and labor markets has been long studied by

economists (Baker et al., 1994; Bartel et al., 2004). However, firms are notably missing from the

literature on migration, especially the literature on within-country migration. To quote Kerr et al.

(2013: 5), ‘firms are mostly absent from the literature on the impact of immigration.’ The authors

also argue that ‘this approach seems quite incomplete for skilled migration’ given that firms play

an active role in the migration of skilled workers, in the context of the US and other countries.

While the migration literature has outlined several self-selection models (Harris and

Todaro 1970; Borjas 1994; Young 2013), there might be physical, social, cultural, legal and

informational barriers to migration and employment opportunities for individuals in developing

countries (Jensen, 2012; Banerjee et al., 2007; Clemens, Ozden and Rapoport, 2015). In the face

of such barriers to migration, nationwide hiring by firms might facilitate efficient within country

migration patterns. Firms with nationwide hiring practices can hire high ability individuals from

smaller towns, move them to production centers in the large cities, and can benefit from such hiring

practices. If employees hired from smaller towns outperform employees recruited from large cities,

this could lead to positive rents for the firm in question, even net of the higher search costs to hire

individuals from smaller towns. This relates to a broader insight in economics that when

information is limited or costly, agents are unable to engage in optimal arbitrage (Jensen, 2007).

However, firms with superior access to information around identifying and hiring high ability

employees in smaller towns might be able to equalize factor input labor.

Despite this, most technology firms in India limit their hiring efforts to large cities. Jensen

(2012) studied the business process outsourcing sector in India and reported that the sector was

strongly geographically concentrated (with 95% of employment focused around seven major

cities), resulting in large, localized increases in economic opportunities.² Such hiring policies

exclude individuals living in smaller towns, thus significantly limiting the pool of eligible

candidates. This leads to the question of why Indian firms do not expand their hiring efforts to

smaller towns.

The policy of Indian firms to predominantly hire from large cities is likely based on an

expectation that this leads to a higher net payoff. Firms may reasonably expect that the talent pool

concentrated in cities is of superior quality, as individuals hired from large cities presumably have

better access to education or training. These firms may also believe that they are capturing the best

high school students from smaller towns who may have moved to a large city to attend college.

Additionally, the theoretical literature in labor economics on job-worker match in the presence

of search frictions (Mortensen and Pissarides 1999; Pissarides 2011) suggests that firms will ex

ante incur higher search costs when hiring individuals from smaller towns. Given this, firms will

not extend their hiring efforts beyond large cities unless there is a higher net payoff associated

with hiring individuals from smaller towns. This net payoff can manifest in two key ways: (1) if

individuals hired from smaller towns have higher productivity compared to individuals hired from

large cities; and/or (2) if the attrition rate for employees hired from smaller towns is lower than

the attrition rate for employees hired from large cities.

In this paper, we study the hiring efforts of a large Indian technology firm that extensively

hires from smaller towns in India, though its production centers are predominantly located in large

cities. Our econometric evidence shows that individuals hired from smaller towns have both higher

productivity and lower attrition rates compared to individuals hired from large cities. Later in the

2 To illustrate this further, we conducted a survey with representatives from 11 randomly selected engineering colleges based in large cities and

smaller towns, and found that most technology firms, especially multinational ones, hire exclusively from colleges located in large cities (see

Appendix D for more information).

paper, we provide evidence that the extra payoff generated from higher productivity and lower

attrition among those hired from smaller towns may outweigh the extra search costs associated

with hiring them. We estimate the net payoff of hiring an employee from a smaller town to be

around 3% of the salary of an entry-level employee freshly hired from college.

To conduct our econometric analysis, we take advantage of a talent allocation protocol within

INDTECH, one of India’s largest software multinationals. This company employs over 120,000

people worldwide and recruits talent from more than 250 colleges in India. Several of these

colleges are located in smaller towns. After four months of induction training, the firm randomly

assigns the individuals it hires from across the country—including from smaller towns—to

projects that are executed in its 10 “development centers” (INDTECH uses the terminology

“development center” to indicate a production center). INDTECH randomizes employee

assignments so that the company’s end customers, which are mostly U.S.-based firms, are

indifferent to the particular INDTECH center that executes their projects and to prevent

sociolinguistic cliques from emerging in INDTECH’s urban production centers. To randomize

employee assignments, INDTECH uses a computer application that is part of the firm’s enterprise

resource planning software.

Note that we do not use the random assignment of employees to production centers as

treatment; instead, we use the random assignment to control for endogeneity concerns in

measuring the relationship between an employee’s prior location (hometown) and subsequent

productivity within the firm. The randomized assignment of individuals to production centers helps

us avoid issues related to assortative matching of individuals to production centers based on

observables, such as hometown and ethnicity.

As an example, randomization implies that employees are not systematically assigned to

production centers close to their hometowns. If this were not the case, employees coming from

smaller towns might be assigned to production centers closer to the smaller towns and farther away

from the larger production centers (such as Bangalore). If indeed employees from smaller towns

were assigned to production centers far from Bangalore, the productivity of employees from

smaller towns would be downward biased because of the distance from the larger production

centers and because they would miss out on agglomeration economies. The econometrician would

not be able to eliminate this bias by simply using production center fixed effects.

To conduct the econometric analyses, we collected unique, anonymized personnel data for

INDTECH’s newly hired college graduates in 2007. The personnel data includes details about the

individual’s school district, college district, caste, production center to which the employee is

assigned, grades received during training, performance ratings, attrition, and whether or not the

employee was fired. We report several results.

Employees hired from smaller towns were more productive than their counterparts from large

cities. We measured productivity using objective performance measures (described in detail later)

and found that graduates hired from smaller towns were 10% more likely to achieve the highest

performance rating compared to graduates hired from large cities. Those hired from smaller towns

also had a lower attrition rate, and were 11.6% less likely to quit the firm within the first four years

compared to those hired from larger cities.

One possible mechanism that could explain this set of results relates to selection. It is plausible

that high-ability individuals living in large cities have a wider range of job opportunities than those

of equal ability from smaller towns. Thus, the sample of employees from smaller towns working

at INDTECH may be composed of a larger proportion of high-ability individuals than the sample

from large cities. This mechanism relates to the literature in economics that documents the social,

economic, informational, and physical barriers to migration and employment for individuals living

outside large cities in developing countries such as India (Banerjee et al. 2007; Munshi and

Rosenzweig 2016; Jensen 2012).

To empirically test this mechanism, we collected data on standardized test scores in logical

and verbal ability during INDTECH recruitment and provide econometric evidence that employees

hired from smaller towns outperform their large city counterparts on standardized logical ability

tests. For the verbal tests, the difference between the two groups was not statistically significant.

Our regression results are further corroborated by field interviews with INDTECH human

resources (HR) managers. These field interviews indicate that graduates from large cities have

several career options at the end of graduation, including joining an MBA program in one of the

large Indian cities or migrating out of the country to join a master’s degree program. In comparison,

individuals from smaller towns have fewer career options. Joining INDTECH, which has a policy

of hiring from a large number of smaller towns, is among the preferred options for such individuals.

Our research is in the spirit of ‘insider econometrics’ (Baker, Gibbs, and Holmstrom 1994;

Bartel, Ichniowski, and Shaw 2004; Bandiera et al., 2005) and considers data from one single firm.

To extend our analyses, we consider survey and secondary data sources to generalize our insights

to the broader Indian labor market. We scrape and use CBSE test score data for 461 schools in our

sample, use two college ranking datasets, conduct two additional surveys (the first survey is a

college level survey and the second survey is an employee level survey of 1,054 employees at an

Indian technology firm different from INDTECH), scrape and use data from 30,131 LinkedIn

profiles of employees at the top three Indian technology firms, and collect CVs of the 593 top

scientists at India’s government owned research laboratories. While our secondary analyses

suggest no statistically significant differences in the broader student population across smaller

towns and larger cities, they indicate variation in firms’ policies of hiring from smaller towns. While

Indian technology firms and government owned research entities are likely to hire individuals

educated in smaller towns, multinational technology firms are more likely to concentrate their

hiring efforts in the larger cities.

Our findings contribute to several streams of the economics literature, notably the literature on

within country migration (Harris and Todaro, 1970; Young 2013; Bryan et al., 2014; Bazzi et al.,

2016; Munshi and Rosenzweig, 2016) and the literature on the barriers to labor market mobility in

developing countries (e.g., Currie and Harrison 1997). Unlike the prior literature on within country

migration, which has focused on rural-to-urban migration, we focus on the role of firms in

facilitating migration of skilled workers from smaller towns to larger cities (i.e., semi-urban to

urban migration). The within country migration literature dates back to the 1970s and the job

search models of Harris and Todaro (1970). These self-selection models explain within-country

migration as rational maximizing behavior where the higher urban wage acted as a regulating

mechanism. However, as Clemens, Ozden and Rapoport (2015) state, there could be geographic,

cultural, legal and physical barriers to migration, and the role of firms in facilitating within-country

migration could be particularly salient in the presence of such barriers.

Finally, though the focus of our study relates to the benefits of firms hiring employees from

smaller towns, we also present econometric evidence that contributes to the literature on

affirmative action in India (Bertrand et al. 2010; Banerjee et al. 2009). Our econometric analyses

indicate that other forms of marginalization (e.g., being from a “lower caste”) are related to higher

productivity. Our marginal probability analysis shows that employees belonging to “lower castes”

are 15% more likely to achieve the highest performance rating than employees belonging to

“higher castes” in the fully specified model. Our results also indicate that employees from “lower

castes” are less likely to be fired than other “higher caste” employees. These findings further

validate recent research on affirmative action and caste discrimination in labor markets in India.

Bertrand et al. (2010) studied affirmative action for “lower caste” groups in engineering colleges

in India. They found that despite poor entrance examination scores, lower caste entrants obtained

positive returns to admission to engineering colleges. Banerjee et al. (2009) found no evidence of

discrimination against non-upper-caste applicants for software jobs. In summary, our results

contribute to the literature in economics (Jensen 2012) that has argued that geographic barriers to

exclusion in the labor market in developing countries must be studied in conjunction with social

barriers (e.g., gender, caste).

The rest of this article is structured as follows. Section II presents the theory that motivates our empirical analyses.

Section III describes our empirical setting and our data. We outline the identification strategy in

Section IV and our results in Section V. Section VI concludes.

II. Theory

In this section, we draw upon two distinct theory literatures in economics to motivate our

empirical analyses. First, we draw upon the literature on job-worker match in the presence of

search frictions in labor economics to predict why productivity and attrition of employees hired

from smaller towns could be systematically different from that of employees hired from large cities

(section A). We then draw upon the literature on barriers to migration in developing countries to

indicate why selection might be a relevant mechanism to explain such differences (section B).

A. Job-Worker Match in Presence of Search Frictions

We build on the literature on job-worker match in the presence of search frictions in labor

economics (Mortensen and Pissarides 1999; Pissarides 2011; and Rogerson, Shimer, and Wright

2004) to illustrate why productivity and attrition of employees hired from smaller towns could be

systematically different from that of employees hired from large cities. As Mortensen and

Pissarides (1999) outline, in labor markets, the central problem is the creation of cooperating

coalitions composed of a job-worker match in the presence of search or matching frictions.

Mortensen and Pissarides (1999: page 2571) define matching friction as “the costly delay in the

process of finding trading partners and determining the terms of trade.” There is also extensive

literature on how search frictions are modeled. In Morgan (1995), search frictions are modeled

using a discrete time model of market segmentation with an explicit search cost c > 0.³

We now elaborate a highly stylized example of the hiring problem faced by a firm such as

INDTECH. Let us assume INDTECH has the choice of hiring two types of workers—those from

smaller towns (S) or those from large cities (L). Following Pissarides (2011), it is conceivable

that the time to find a good match is higher for S than for L. Following Smith (2006), it is also

conceivable that the opportunity time cost of search is higher for S compared to L. Our field

interviews with managers from INDTECH indicate that most of the smaller towns from which

INDTECH hires do not have airports; traveling from the Bangalore headquarters to these smaller

towns often involves trips by rail or road. Hiring from these towns thus incurs both a higher direct

search cost (i.e., cost of travel, food and accommodation) and a higher opportunity cost of staying

away from work for multiple days. In contrast, the large cities from which INDTECH hires are all

accessible to the Bangalore headquarters by direct flights. Both the time to find a match and the

opportunity time cost of search are much lower for hiring from large cities.

3 Pissarides (2011: page 1093) illustrates how the matching function helps model search frictions—“the matching function captures many

features of frictions in labor markets that are not made explicit…it captures the key idea of a good match: it takes time to find a good match, the

length of time it takes varies across workers in unpredictable ways.” Smith (2006) considers the opportunity time cost of search for marriage.

Our interviews also indicate that when hiring from large cities, managers face relatively lower

information asymmetry about the quality of colleges from which they are hiring. However, in

smaller towns, information about the quality of colleges is relatively less available. Thus, hiring

managers must spend time validating the quality of colleges in these smaller towns. In other words,

assuming cS and cL are the flow costs of recruiting a worker from a smaller town and a large city,

respectively, cS > cL.⁴ Again, as Mortensen and Pissarides (1999) indicated, the two-sided search

model with search frictions assumes individual rationality by both agents (i.e., the share of the

value of the match received by each partner after the match must exceed the option of continued

search).

Let y be the output of any match between the job and the worker. In other words, y is the ex

post productivity of the hired worker. Let the job be filled at a rate α. As indicated earlier, c is the

flow cost of recruiting a worker to fill a vacancy. The value of the match shared by the worker is

w, and V is the value of a vacancy to the firm. Following Mortensen and Pissarides (1999) and

Rogerson, Shimer, and Wright (2004), the value of posting a vacancy by a firm is equal to –c +

α[J(π) – V], where π is the firm’s share of the match value. Because π = y – w, the utility of posting a vacancy by a firm can also be written as –c + α[J(y – w) – V].⁵

In our stylized example of hiring either a smaller town worker with flow cost of recruiting cS or

hiring a worker from a large city with flow cost of recruiting cL, if cS > cL, then it follows that yS

is conceivably greater than yL.
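
One way to make this last step explicit is sketched below. The sketch writes α for the job-filling rate and relies on simplifying assumptions that are not stated in the text: both worker types share the same α > 0, the same w, and the same V; J is strictly increasing; and the firm posts vacancies for smaller town workers only if doing so is worth at least as much as posting vacancies for large city workers.

```latex
% Illustrative derivation under the assumptions listed above
% (common \alpha > 0, w and V; J strictly increasing). These are
% assumptions made for this sketch, not claims from the paper.
\begin{align*}
  -c_S + \alpha\,[J(y_S - w) - V] &\ge -c_L + \alpha\,[J(y_L - w) - V] \\
  \alpha\,[J(y_S - w) - J(y_L - w)] &\ge c_S - c_L > 0 \\
  J(y_S - w) > J(y_L - w) &\;\Longrightarrow\; y_S > y_L
\end{align*}
```

If the filling rate were lower for smaller town workers than for large city workers, as the discussion of search time suggests it may be, the required productivity gap would be larger rather than smaller.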

In summary, for a utility-maximizing firm satisfying the condition of individual rationality,

there has to be a higher net payoff of hiring individuals from smaller towns. This can manifest in

two ways: (1) the ex post productivity of the smaller town worker is higher than the ex post

4 Studies (e.g., Albrecht and Axell 1984) within the matching with frictions literature build on differential costs of search for agents.

5 In the literature, this follows from a simple model of job creation that involves matching and bargaining (for details, see Rogerson, Shimer, and

Wright 2004; page 17). This model is also based on transferable utility (i.e., the total value of the match is the sum of the shares received by the

partners). This leads to the firm’s share of the match value, π, being equal to (y-w).

productivity of the worker from the large city; and/or (2) the attrition rate for the smaller town

worker is lower than the attrition rate for the worker from the large city. In other words, a firm

such as INDTECH might be willing to ex ante incur a higher flow cost of recruiting the smaller

town worker under the condition of heterogeneous workers from smaller towns and large cities.⁶

B. Barriers to Within-Country Migration

The assumption that higher productivity workers from smaller towns do not migrate en masse

to the large cities is a necessary precondition for the existence of heterogeneous workers from

smaller towns and large cities. This assumption also underpins the need for firms such as

INDTECH to conduct costly search for higher productivity workers in smaller towns. If, in

equilibrium, all higher productivity workers from smaller towns migrated to large cities, then a

firm such as INDTECH would not have to engage in a time-consuming search for such workers in

smaller towns. To theorize why all high-productivity workers from smaller towns may not migrate

to large cities, we leverage insights from the economics literature that documents social, economic,

informational, and physical barriers to migration and employment in developing countries such as

India.

First, high-ability individuals residing in small towns may be constrained from migrating to

large cities because of informational barriers to migration. Jensen (2012) studied awareness of

business process outsourcing jobs among people residing in locations that were only 50 to 150

kilometers from Delhi and found that awareness of jobs in larger cities and knowledge of how to

access them were very limited.

6 This proposition is related to the literature on heterogeneous agents in the job-worker match with search frictions literature (e.g., Burdett and

Coles 1997). The main insight from this literature is that, given heterogeneous agents, workers and employers will endogenously match in

classes.

Second, individuals from smaller towns and rural areas may not migrate to large cities for

social and economic reasons. Munshi and Rosenzweig (2016) use data from the 2005 India Human

Development Survey to show that caste networks restrict spatial mobility in rural India due to

underlying economic reasons. The authors show that “caste loans” consistently have more

favorable loan terms (e.g., the proportion of zero interest loans and the proportion of loans not

requiring collateral), and individuals in rural India are more likely to participate in caste-based

financial arrangements and are less likely to out-migrate.

There is also literature (Dyson and Moore 1983; Foster and Rosenzweig 2009) exploring the

ways in which considerations such as gender or social status (e.g., caste of the student) could

negatively affect the perceived benefits of investing in education.⁷ On the issue of caste, scholars

such as Banerjee and Somanathan (2007) observe that in India, members of lower castes can be

relatively disadvantaged in their access to public goods, such as education. Under-provisioning in

education might constrain individuals living in smaller towns and rural areas from migrating to

large cities in search of skilled employment.

Finally, in developing countries, employment opportunities for talented individuals in smaller

towns and rural areas might be hindered by a lack of teaching infrastructure and a lack of

committed teachers (Banerjee et al. 2007; Chaudhury et al. 2006; Muralidharan and Sundararaman

2015). Choudhury and Khanna (2012) summarized the physical, informational, and social barriers

to migration and employment for high-ability individuals in remote regions in the context of

developing countries.

7 Dyson and Moore (1983) and Foster and Rosenzweig (2009) outline the practice of “patrilocal exogamy,” where a woman gets married to an

individual from a different village and leaves her parents’ village to live with her husband’s family. This, and similar practices, imply that the returns to investing in girls’ human capital do not accrue to the parents and, as a result, parents have less incentive to invest in the education of

girls.

III. Data

Our empirical setting is one of India’s largest IT firms (INDTECH). The firm has more than

120,000 employees spread over 10 production centers in India, and its IT projects span the globe.

After entry level employees are recruited by INDTECH, they undergo two random assignments.

First, they are randomly assigned to one of three “technological areas”—“.NET,” Java, or

Mainframe—that represent INDTECH’s core business. Based on this assignment, they then

receive four months of related induction training.⁸ Employee training is staggered, and can begin

at any point from May to November of each year. Second, once employees have completed

training, they are randomly assigned to a production center. Each of the 10 production centers at

INDTECH works on projects related to all three technological areas (“.NET,” Java, Mainframe);

thus, entry-level INDTECH employees can be assigned to any production center.

INDTECH’s decision to assign a new hire to one of the three technological areas is

uncorrelated with observable characteristics of the individual. To avoid bias caused by diverse

temporal trends affecting the technologies in which employees are trained, we restricted our data

collection exercise to employees trained in a single area—“.NET”. Focusing on employees trained

in the same technological area enables us to minimize bias resulting from differences in employee

performance due to short-term demand or supply trends in each of the technology areas.

We collected unique data for all entry-level, fresh college graduates recruited in 2007 who had

no prior full-time employment experience. The employees in our sample came from more than

250 colleges across India. In total, we collected data on 1,696 undergraduates hired and assigned

to the .NET technology area in 2007. INDTECH hires about 10,000 undergraduates every year.

8 The .NET Framework (pronounced dot net) is a software framework developed by Microsoft that runs primarily on Microsoft Windows. It includes a large library and provides language interoperability (each language can use code written in other languages) across several

programming languages (http://en.wikipedia.org/wiki/.NET_Framework).

Since we focused only on employees trained in .NET, we collected data on about 17% of the total

entry undergraduates in 2007.

INDTECH trains new employees assigned to a particular technological area in batches of about

100 employees each. For the sample of employees hired in 2007 and assigned to .NET, INDTECH

trained 14 batches with 94 employees each and four batches with 95 employees each. The company

has a corporate training center in the southern Indian city of Mysore with a 337-acre campus, 400

instructors, and 200 classrooms. According to our field interviews, INDTECH spends around

$3,500 per employee to train new college graduates for four months on computer science topics,

such as relational databases, client-server concepts, and programming languages. In addition, as

described earlier, the post-training assignment of employees to a production center is not correlated

to observable employee characteristics.
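
The batch counts above can be checked against the reported sample size with a few lines of arithmetic. The snippet below is only an illustrative consistency check; the variable names are our own, and the figure of 10,000 annual hires is the approximate number given in the text.

```python
# Illustrative consistency check of the sample-size figures reported in the text.
# Variable names are hypothetical; the numbers come from the text itself.

# 14 training batches of 94 employees and 4 batches of 95 employees.
batch_sizes = [94] * 14 + [95] * 4

# Total number of 2007 hires assigned to the .NET technology area.
sample_size = sum(batch_sizes)

# The text reports "about 10,000" undergraduate hires per year (approximate).
approx_annual_hires = 10_000

# Share of the 2007 entry cohort covered by the .NET sample.
sample_share = sample_size / approx_annual_hires

print(f"batches: {len(batch_sizes)}, sample size: {sample_size}")
print(f"share of annual hires: {sample_share:.1%}")
```

The totals agree with the 1,696 employees and the roughly 17% coverage reported for the 2007 cohort.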

The main independent variable of interest was whether or not the employee hailed from a

smaller town (from smaller town). We constructed this variable as follows. First, we obtained

detailed employee resumes, which included the name and location of employees’ primary schools,

high schools, and undergraduate colleges. The INDTECH data for this was available for 93% of

the 2007 batch. In the next step, we classified Indian cities and towns based on the classification

system outlined by the Sixth Pay Commission report of the Government of India.⁹ The

classification system divides India’s cities and towns into three categories, with the largest six

metropolitan areas of Delhi, Mumbai, Bangalore, Chennai, Kolkata, and Hyderabad classified into

the first category, the next largest cities in the second category, and the smallest towns in the

third.¹⁰

9 The government issued a circular on August 29, 2008 to formalize this classification system, and all Indian state-owned entities and government

departments use this classification system to establish the cost of living for employees.

10 Details of this categorization of Indian cities and towns are available at:

http://ccis.nic.in/WriteReadData/CircularPortal/D2/D02ser/11016_2_2008-AIS-II.pdf.

Given this data, we code from smaller town as “1” if the following three conditions are met:

the employee attended (1) primary school in a location outside the largest six metros; (2) high

school in a location outside the largest six metros; and (3) college in a location outside the largest

six metros. Assuming that being from a smaller town is correlated with ex post higher productivity,

this turns out to be the most conservative way of coding the variable from smaller town. In this

definition, employees who went to primary and high school in a smaller town, but attended college

in one of the largest six metros are coded as “0.” Thus, although these employees might have

performed better in school and consequently moved to college in one of the largest six metros,

they are still coded as part of the control group. In robustness checks outlined later, we expand the

classification of large cities to include other large Indian cities such as Pune, Ahmedabad, and

Jaipur. All our empirical results remain robust to reclassification of cities.
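
The coding rule can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to build the dataset: the function name, argument names, and city strings are hypothetical, and missing locations are treated as in the base-case definition described in the robustness checks (the variable is coded "1" only when all three locations are observed and all lie outside the large cities).

```python
# Illustrative sketch only; names below are hypothetical, not from the INDTECH data.
LARGEST_SIX_METROS = frozenset(
    {"Delhi", "Mumbai", "Bangalore", "Chennai", "Kolkata", "Hyderabad"}
)


def from_smaller_town(primary_school_city, high_school_city, college_city,
                      large_cities=LARGEST_SIX_METROS):
    """Return 1 if primary school, high school, and college all lie outside large_cities."""
    locations = (primary_school_city, high_school_city, college_city)
    # Base case: a missing location (None) means the strict definition cannot
    # be verified, so the employee is coded 0.
    if any(city is None for city in locations):
        return 0
    return int(all(city not in large_cities for city in locations))


# Robustness variant: also treat other large Indian cities as "large".
EXPANDED_LARGE_CITIES = LARGEST_SIX_METROS | {"Pune", "Ahmedabad", "Jaipur"}

print(from_smaller_town("Khordha", "Khordha", "Bhubaneshwar"))  # all three outside the metros
print(from_smaller_town("Khordha", "Khordha", "Mumbai"))        # college in a metro
```

Passing EXPANDED_LARGE_CITIES as large_cities corresponds to the reclassification of cities used in the robustness checks.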

Our first dependent variable of interest, performance, measures employee productivity. At the

end of every year, all INDTECH employees who worked on a coding/testing project for at least

nine months in the calendar year receive a performance rating. For the 2007 sample, we collected

performance data for all employees (n=511) who met these criteria and received a performance

rating at the end of 2008.

For new hires, employees’ training schedule affected whether they satisfied the “nine-month

rule.” For instance, in the 2007 sample, employees who started their training after September 2007

would not finish until early 2008. Most of those employees were not assigned to a project prior to

March 2008, making them ineligible to receive a 2008 performance rating. This mitigated the

concern that INDTECH’s decision to deploy an employee to a project depended exclusively on

superior ability, based on observable and/or unobservable characteristics.


Additionally, controlling for the training batch, we did not find any statistically significant

relationship between observables such as the employee’s geographic origin (large city vs. smaller

town) and whether or not the employee received a performance rating at the end of 2008. We also

did not find any statistically significant relationship between other observables such as recruitment

test scores and whether or not the employee received a performance rating at the end of 2008. This

further validates the random assignment protocol.

Field interviews with the head of talent development at INDTECH, a senior manager in HR,

and several employees in the sample also indicate that the performance ratings for entry-level

undergraduates are based on objective measures. These include quality of coding and/or testing

(measured using “mistakes” in the code that are recorded by automated software) and timeliness

and completeness in coding/testing and documentation (also measured using automated software).

Each employee’s manager gives an initial performance rating based on the objective criteria, and

then managers from Human Resources check the rating against the underlying scores (i.e., scores

of coding error rates, coding completeness) to correct any erroneous scores. To quote a senior

human resources manager, “for the first three years, performance evaluation is strictly based on

objective metrics.”

We retrieved our second set of dependent variables—verbal scores and logical scores—for the

2007 sample from a standardized test administered by INDTECH at the recruitment stage. These

variables helped us test for one of the possible underlying mechanisms (i.e., selection) for why

smaller town employees might have different productivity compared to their large city

counterparts.

We also collected data on attrition to construct another dependent variable—quit firm. To code

this variable, we collected data from exit interviews for each employee in our sample who left the


firm by 2011. Additionally, we code control variables to indicate the gender of the employee

(male) and whether or not the employee is from one of the underrepresented scheduled castes or

other lower castes in India (scheduled caste).11 We also control for cumulative grade point average

(cgpa_training) at the end of training. This variable controls for performance during the four-

month induction training, and it is expected to be positively correlated to subsequent performance

within the firm.

Table 1 summarizes descriptive statistics of the personnel and performance data for the entire

sample. Table 2 compares the personnel and performance data for employees from smaller towns

with employees from large cities. The descriptive statistics reported in column (3), Panel C of

Table 2 indicate statistically significant differences in performance ratings for smaller town and

large city employees (difference = .17; p < .01). They also indicate statistically significant

differences in attrition rates for smaller town and large city employees (difference = -.12; p < .01).

Column (3) of Table 2 reports the differences in descriptive statistics where the treatment group

includes from smaller town employees (i.e., employees who went to smaller town schools, high

schools, and colleges). Column (6) of Table 2 reports differences in descriptive statistics where

the treatment group includes employees from smaller town school (i.e., employees who went to

smaller town schools and high schools but attended college in either large cities or smaller towns).

[Insert Table 1 and 2 Here]

IV. Identification Strategy

To recap, our first proposition is that employees hired from smaller towns have higher

productivity than employees hired from large cities. To estimate whether being from a smaller

11. As Banerjee et al. (2009) point out, the term “scheduled castes” comes from the Ninth Schedule of the Indian Constitution, which lists for

each state in India the specific caste groups eligible to benefit from the affirmative action provisions outlined in the Constitution.


town for employee i affects productivity measured using performance ratings, we run the following

specification:

(1) $\mathit{Performance}_i = \beta_0 + \beta_1\,\mathit{all\_smaller\_town}_i + \beta_2\,\mathit{cgpa\_training}_i + \beta_3 L + \beta_4 I_i + \epsilon_i,$

where $\mathit{Performance}_i$ is a measure of productivity at the end of 2008.

We control for cumulative grade point average at the end of training (cgpa_training), fixed

effects for the production center to which the employee is assigned (L),12 and employee

characteristics, including gender, whether the employee is a member of a scheduled caste, and test

scores during the standardized recruitment tests ($I_i$). Given that performance is measured in

normalized bands, we implement the specification using an ordered logit model and robust

standard errors. In the base case, we cluster the standard errors by the location of the production

center to which the employee is assigned. In robustness checks, we also cluster the standard errors

by location of primary school district and use robust standard errors without clustering. Given the

small number of clusters, we also run an OLS model with wild bootstrap and clustered standard

errors (Cameron, Gelbach and Miller, 2008) and additionally run an ordered logit model with

bootstrapped standard errors; all results remain robust.

Our identification strategy exploits a talent allocation protocol in our setting. The preexistence

of a computer-generated talent allocation protocol at INDTECH implies that the production center

fixed effects are arguably uncorrelated with the variable from smaller town or observable

characteristics of the employee. After four months of induction training, the firm randomly assigns

employees to its 10 production centers across the country. This policy ensures that the assignment

of an employee to a particular location within the firm does not correlate with measures of observed

12. INDTECH has 10 production centers in India at the following locations: Bangalore, Hyderabad, Chennai, Mysore, Mangalore, Trivandrum, Pune, Bhubaneshwar, Jaipur, and Chandigarh. As indicated earlier, each entry-level employee is randomly assigned to one of these 10 production

centers. This variable is available for 96% of the 2007 batch.


ability, such as test scores at the end of induction training. Out of the 10 development centers,

seven centers are located near the firm headquarters. The remaining three centers are relatively far

from INDTECH’s headquarters in Bangalore. Appendix A outlines the steps followed by the

computer application that assigns new employees to production centers.

INDTECH’s primary motivation for this talent allocation policy is to ensure that INDTECH’s

end customers are indifferent to the location of the production center that executes their projects.

The secondary motivation of this talent allocation policy is to prevent the emergence of regional

and/or ethnic cliques at the production centers. To quote the head of talent development at

INDTECH, “we do not want all Tamils to join the Chennai center or all Punjabis to join

Chandigarh and start conversing in their regional language rather than in English. If that

happens, both our clients and employees from other parts of the country are affected.”

As described earlier, we do not use the random assignment of employees to production centers

as treatment; instead, we use the random assignment to control for endogeneity concerns in

measuring the relationship between an employee’s prior location (hometown) and subsequent

productivity within the firm. If there is assortative matching of employees to production centers

based on observables such as hometown, then the productivity estimates of smaller town

employees could be downward or upward biased, and using production center fixed effects may

not mitigate the bias, as a few examples will illustrate. In a more conventional setting, it is

conceivable that employees could be systematically assigned development centers close to their

hometowns. If that is the case, then employees coming from smaller towns would be assigned

development centers located in relatively smaller towns (e.g., the centers in Bhubaneshwar, Jaipur,

or Chandigarh), and their performance estimates would be downward biased because of the

distance from the larger production centers (such as Bangalore) and because of missing out on


agglomeration economies.13 The econometrician would not be able to get around the bias

stemming from a systematic allocation of individuals from smaller towns to smaller production

centers closer to their home towns, by simply using production center fixed effects. In the extreme

case, assume all employees coming from smaller towns are assigned to production centers closer

to smaller towns, and not Bangalore. Given agglomeration economies, it is also likely that being

assigned to Bangalore has a positive correlation with productivity. In such a setting, using

production center fixed effects would not alleviate the bias in smaller town employees being

systematically assigned far away from Bangalore. In another example, employees could be

assigned to locations based on measures of ability observable to the managers of the firm but not

to the econometrician. In this case, the highest ability employees might be disproportionately

assigned production centers in and around Bangalore. If this is indeed the case and if observed

measures of ability are not perfectly correlated with actual ability, then any further study of what

drives subsequent productivity is subject to methodological bias. In this case, for example, if there

is a positive correlation between being from a smaller town and unobserved measures of ability

and if employees from smaller towns are systematically assigned production centers close to

Bangalore, their performance estimates could be upward biased. In a more conventional setting,

it is also conceivable that employees might be assigned to locations based on considerations such

as ethnicity. In this example, all employees who are ethnic Kannadigas (i.e., from Karnataka)

might be assigned to the Bangalore technology center, all Tamils (i.e., from Tamil Nadu) to the

Chennai technology center, and all employees from Orissa to the technology center in

Bhubaneshwar. If this were the case and, further, if there were systematic bias in grading

13. The researchers conducted several field interviews at INDTECH to confirm this. A couple of the employees who were interviewed came from

the Khordha and Sundergarh districts in the eastern Indian state of Orissa and had families living in these places. In interviews, they confirmed that if they had been given a choice, they would have selected the Bhubaneshwar, Orissa, technology center of INDTECH. Since they had no

choice in selecting the center, both of these employees were assigned to and continue to work in the Bangalore technology center.


performance toward certain ethnic or social groups within the firm in question, we might get

spurious results in analyzing how employee hometown (which is correlated to production center

location, given the geographic clustering of ethnicities in India) relates to performance ratings.

As an example, there could be systematic positive bias in grading performance for employees

from the south Indian centers (Bangalore and Chennai). In that case, the performance ratings of

employees from the Orissa (Bhubaneshwar) center would be downward biased. Given assortative

matching based on ethnicity and since Orissa has a large number of “smaller towns,” the

econometrician would observe a downward bias in the performance of employees hired from

smaller towns.

Our second proposition is that employees hired from smaller towns exhibit lower attrition rates

than employees hired from large cities. To estimate whether or not being from a smaller town for

employee i affects the probability of attrition, we run the following specification:

(2) $\mathit{Has\_quit}_i = \beta_0 + \beta_1\,\mathit{all\_smaller\_town}_i + \beta_2\,\mathit{cgpa\_training}_i + \beta_3 L + \beta_4 I_i + \epsilon_i,$

where $\mathit{Has\_quit}_i$ is a dummy variable indicating that the employee exited the firm by 2011.

We control for cumulative grade point average at the end of training (cgpa_training), fixed

effects for the production center to which the employee is assigned (L), and employee

characteristics, including gender, whether the employee is a member of a scheduled caste, and scores

during the standardized recruitment tests ($I_i$). Given the dummy variable nature of the dependent

variable, we implement this specification using a logit regression and robust standard errors. In the

base case, we cluster the standard errors by the location of the development center to which the

employee is assigned.

A possible mechanism for the different productivity and attrition rates of employees hired from

smaller towns compared with those hired from large cities relates to selection. It is plausible that


high-ability individuals from large cities have competing career options, making INDTECH a less

attractive option. It is also plausible that INDTECH is able to hire higher-ability individuals from

smaller towns. To estimate whether there is positive selection of employees from smaller towns,

we run the following specification:

(3) $\mathit{Score}_i = \beta_0 + \beta_1\,\mathit{all\_smaller\_town}_i + \beta_2\,\mathit{cgpa\_training}_i + \beta_3 I_i + \epsilon_i,$

where $\mathit{Score}_i$ is the verbal or logical score on the standardized tests at the recruitment stage and $I_i$

indicates a range of individual characteristics, including gender and whether the individual is a

member of a scheduled caste.

We implement this specification using OLS with robust standard errors. It is plausible that

test scores on standardized recruitment tests do not measure a person’s true ability; however, our

interviews suggest that given the coding nature of the job, performance on the logical test was a

key predictor of productivity.

V. Results

A. Smaller Town Employees and Performance

Figure 1 outlines productivity (i.e., 2008 performance ratings) for employees hired from

smaller towns and large cities for the 2007 batch. We run distributional tests to compare the short-

term performance for smaller town and large city employees. We run the two-sample Wilcoxon

rank-sum (Mann-Whitney) test and reject the null hypothesis that the performance data for the two

groups follow the same distribution.

[Insert Figure 1 Here]
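
For readers unfamiliar with the test, the following is a minimal Python sketch of the two-sample Wilcoxon rank-sum (Mann-Whitney) statistic with a normal approximation. It is an illustration under stated assumptions, not the code used for this paper: the ratings are made-up placeholder values, midranks are used for ties (which are common when ratings fall into a few bands), and no continuity correction is applied.

```python
import math
from itertools import groupby


def rank_sum_test(x, y):
    """Two-sample Wilcoxon rank-sum (Mann-Whitney) test, normal approximation.

    Returns (U statistic for x, z score, two-sided p-value). Ties receive
    midranks and the variance is tie-corrected. Undefined (division by zero)
    if every pooled observation has the same value.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering the sample of origin (0 = x, 1 = y).
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])
    rank_sum_x = 0.0
    tie_term = 0.0
    position = 0  # number of observations ranked so far
    for _, group in groupby(pooled, key=lambda t: t[0]):
        group = list(group)
        t = len(group)
        midrank = position + (t + 1) / 2.0  # average of ranks position+1 .. position+t
        rank_sum_x += midrank * sum(1 for _, origin in group if origin == 0)
        tie_term += t ** 3 - t
        position += t
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, z, p_two_sided


# Hypothetical performance bands (higher is better); NOT data from the paper.
smaller_town_ratings = [4, 4, 3, 4, 3, 4, 2, 4]
large_city_ratings = [3, 2, 3, 4, 2, 3, 1, 3]
u, z, p = rank_sum_test(smaller_town_ratings, large_city_ratings)
print(f"U = {u:.1f}, z = {z:.2f}, two-sided p = {p:.3f}")
```

A small p-value leads to rejecting the null hypothesis that the two groups' ratings follow the same distribution.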

Table 3 reports results for the ordered logit regression with robust standard errors described in

specification (1), relating the variable from smaller town to

performance. Although the size of the 2007 batch is 1,696 employees, the sample size of the


regression analyses in Tables 3 and 4 is smaller (N=511 in the fully specified model) due to

exclusions based on the “nine-month work rule” described in Section III.

The estimates of specification (1) in Table 3 indicate a positive and statistically significant

relationship between being from a smaller town and productivity as measured by performance

rating scores. This result remains robust after controlling for test scores at the end of training,

standardized recruitment test scores, whether the employee is from a scheduled caste, gender, and

production center fixed effects.

Among the control variables, as expected, CGPA at the end of training and logical test scores

on the standardized tests during recruitment are highly correlated with performance. Columns (7)

and (9) also indicate a positive and statistically significant relationship between being a member

of a scheduled caste and performance. Marginal effects (additionally reported in Figure 2) indicate

that the predicted probability that an employee hired from a smaller town will achieve the highest

performance rating is 38.5% in the fully specified model. The corresponding predicted probability

that an employee hired from a large city will achieve the highest performance rating is 28.5%.

[Insert Table 3 and Figure 2 Here]
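
As a reminder of how such predicted probabilities arise, an ordered logit with linear index $x_i\beta$ and increasing cutpoints $\kappa_1 < \dots < \kappa_K$ implies $P(y_i \le k) = \Lambda(\kappa_k - x_i\beta)$, where $\Lambda$ is the logistic CDF, so the probability of the highest rating is $1 - \Lambda(\kappa_K - x_i\beta)$. The Python sketch below illustrates this textbook formula only. The cutpoints and index values are hypothetical placeholders, not estimates from Table 3, and the function names are ours.

```python
import math


def logistic_cdf(z):
    """Logistic CDF, Lambda(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb, cutpoints):
    """Category probabilities for an ordered logit with linear index xb.

    cutpoints must be strictly increasing; K cutpoints give K + 1 categories,
    returned from the lowest to the highest rating.
    """
    cumulative = [logistic_cdf(c - xb) for c in cutpoints] + [1.0]
    return [cumulative[0]] + [
        cumulative[k] - cumulative[k - 1] for k in range(1, len(cumulative))
    ]


# Hypothetical values for illustration only (not taken from Table 3).
HYPOTHETICAL_CUTPOINTS = [-1.0, 0.5, 2.0]
index_large_city = 0.0      # baseline linear index
index_smaller_town = 0.5    # baseline plus a hypothetical positive coefficient

p_top_large = ordered_logit_probs(index_large_city, HYPOTHETICAL_CUTPOINTS)[-1]
p_top_small = ordered_logit_probs(index_smaller_town, HYPOTHETICAL_CUTPOINTS)[-1]
print(f"P(top rating | large city)   = {p_top_large:.3f}")
print(f"P(top rating | smaller town) = {p_top_small:.3f}")
```

A positive coefficient on the smaller town indicator shifts the index upward and therefore raises the probability of the highest rating, which is the comparison summarized in Figure 2.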

B. Smaller Town Employees and Attrition

Next we present econometric evidence of a negative and statistically significant relationship

between being from a smaller town and attrition. Here we implement specification (2) and employ

a logit regression with robust standard errors clustered at the production center to which the

employee is assigned. Results are reported in Table 4 and indicate a negative and statistically

significant relationship between being from a smaller town and attrition across all models. The

predicted probability that an employee from a smaller town will quit INDTECH in the first four

years is 34.3%. The corresponding predicted probability that an employee from a large city will


quit in the first four years is 45.9%. Among the control variables, there is a positive and statistically

significant relationship between verbal test scores and attrition.

[Insert Table 4 Here]
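
To put the two predicted probabilities on a common scale, the short Python calculation below converts them into a percentage-point gap and an implied odds ratio. This is back-of-the-envelope arithmetic on the two figures reported in the text, not a coefficient from Table 4; the variable names are ours.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


p_quit_smaller_town = 0.343  # predicted probability reported in the text
p_quit_large_city = 0.459    # predicted probability reported in the text

gap_pp = (p_quit_large_city - p_quit_smaller_town) * 100
implied_odds_ratio = odds(p_quit_smaller_town) / odds(p_quit_large_city)

print(f"gap: {gap_pp:.1f} percentage points")
print(f"implied odds ratio (smaller town vs. large city): {implied_odds_ratio:.2f}")
```

The gap is 11.6 percentage points, and the implied odds of quitting for a smaller town employee are roughly 0.62 times those of a large city employee.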

C. Evidence of Selection—Smaller Town Employees and Standardized Test Scores

As one possible explanation for our result, we tested whether employees hired from smaller

towns have higher scores in standardized tests of verbal and logical ability at the time of

recruitment.14 Figures 3 and 4 outline logical and verbal test scores, respectively, during

recruitment. We also run distributional tests to compare the verbal and logical scores for smaller

town and large city employees. We run the two-sample Wilcoxon rank-sum (Mann-Whitney) test

and reject the null hypothesis that the test score data for the two groups follow the same

distribution.

[Insert Figures 3 and 4 Here]

Tables 5 and 6 present our regression results. Table 5 relates the variable from smaller town to

logical scores, and Table 6 does the same for verbal scores. Here we implement specification (3)

and use OLS with robust standard errors. Although the size of the total cohort is 1,696 employees,

the sample size for Tables 5 and 6 is constrained to 1,229 employees by the availability of data for

the score_logical and score_verbal variables. Table 5 documents a positive and statistically

significant relationship between being from a smaller town and logical test scores at the time of

recruitment. Table 6 indicates no statistically significant relationship between verbal test scores

during recruitment and being from a smaller town.

To quote one of the hiring managers, “The job of hired graduates in the first five years is to

write code and test code. This job requires a strong analytical mind. The graduates work offshore

14. Verbal test scores measure English proficiency.


at the Indian development centers and do not interact directly with the U.S. clients. They only

interact with the U.S.-based program managers (and) only on technical queries. Given this, we

look for analytically smart individuals, and the smaller towns provide a rich pool of such

individuals.”

[Insert Tables 5 and 6 Here]

D. Smaller Town Employees and Getting Fired

So far, our results indicate that employees hired from smaller towns have higher productivity

than employees hired from large cities and additionally have lower attrition rates compared to

employees hired from large cities. These results are in accordance with the theoretical predictions

from the stylized example we outlined earlier.

However, these results raise the question of why INDTECH does not focus all its hiring on

smaller towns. A plausible explanation relates to greater variance in smaller town employees.

There is literature in labor economics on hiring “risky” workers and how firms value risk in new

candidates (Lazear 1995). Variance provides employers an option: risky workers have value

because a better-than-expected worker can be kept and a worse-than-expected worker can be

terminated. It is plausible that a subsample of smaller town employees commit serious work errors

that result in large negative payoffs for INDTECH. If that is indeed the case, INDTECH might be

prompted to terminate a larger fraction of employees hired from smaller towns. Employees hired

from large cities might not commit serious errors and would be less likely to be fired.

To test this hypothesis, we studied whether smaller town employees are more likely to be fired

compared to employees from large cities. We collected data on which employees of the 2007

cohort were fired from 2007 to 2011. As shown in Table 7, there is no evidence that employees

from smaller towns were more likely to be fired than employees from large cities. Additionally,


specifications (5) and (7) of Table 7 indicate that the employees from scheduled castes and

scheduled tribes are less likely to be fired than other “higher caste” employees.

[Insert Table 7 Here]

E. Robustness Checks

One of our most important robustness checks aims to validate the talent allocation protocol

(i.e., validating that INDTECH’s decision to assign an employee to a particular production center

is not correlated with observable employee-level characteristics, including prior performance

during recruitment and training). As shown in Table 8, we found that the decision to allocate an

employee to the largest and most important production center (Bangalore) after induction training

was not correlated with observable employee-level characteristics such as whether the employee is from a smaller

town. Likewise, the decision to assign an employee to Bangalore was neither correlated

with observable measures of prior performance (such as CGPA at the end of training or

standardized test scores at the recruitment stage) nor with gender. These findings validate the

random talent allocation policy underlying our study.

[Insert Table 8 Here]

Given the small number of clusters, in robustness checks we ran an OLS model with wild

bootstrap and clustered standard errors (Cameron, Gelbach and Miller, 2008) and additionally ran

an ordered logit model with bootstrapped standard errors; all results remain robust. We also

clustered the standard errors by location of primary school district and used robust standard errors

without clustering. Again, all results remain robust. In addition to production center fixed effects,

we also controlled for fixed effects for the districts of the primary school, and all results remained

robust.


In additional robustness checks, we dropped the employees who left the firm from 2007 to

2011; the results remained robust. We also relaxed the definition of the from smaller town variable.

In the base case, we had taken the most limiting definition of the variable, coding the variable as

“1” only if there was no missing data for the school, high school, and college location and if all

three of these locations were smaller towns. In robustness checks, we relaxed this limitation of

missing data, and our results remained robust. We get larger coefficient estimates for the from

smaller town variable in Tables 3–8, and the statistical significance remains the same or improves.

We also ran robustness checks using alternate categorization schemes for “large cities” and

“smaller towns” in India, and all results remained robust. In the base case, we classified only the

six largest metropolitan areas of Delhi, Mumbai, Bangalore, Chennai, Kolkata, and Hyderabad as

large cities. In robustness checks, we collected additional secondary data on population and

educational parameters for the top 15 Indian cities based on population. The expanded list of cities

includes the six largest metropolitan areas as well as Pune, Ahmedabad, Jaipur, Patna, Lucknow,

Thiruvananthapuram, Mysore, Bhopal, and Chandigarh. Collectively, these 15 cities account for more than

20% of the total urban population of India in 2007-8 (the top six metros alone accounted for more than 15%

of the urban population of India in 2007-8). In all our iterations, all results reported in Tables 3–8

remained robust. Appendix C has details on the secondary dataset we used for the cluster analysis

of cities and towns.

F. Net Payoff of Hiring from Smaller Towns

Our results raise the question of whether the decision to hire employees from smaller towns

creates economic value for a firm (i.e., whether the higher productivity and lower attrition rates of

employees hired from smaller towns sufficiently offset the ex ante higher search costs of hiring

such individuals). In addition to limitations outlined earlier, an important limitation of our paper


is the absence of detailed employee-level data comparing INDTECH’s costs associated with hiring

individuals from smaller towns versus large cities. Given this, we could not run regressions using

hiring costs. Instead, we estimated the net payoff of hiring individuals from smaller towns based

on a set of assumptions gathered during field interviews.

In the first step, we estimated the dollar value of productivity gains associated with hiring an

employee from a smaller town. We based this analysis on 2008 performance data for the 2007

batch of employees. We used the predicted probabilities of achieving the highest performance

rating for smaller town versus large city employees. Figure 2 indicates that smaller town

employees were 10 percentage points more likely to achieve the highest performance rating than

employees hired from large cities. Our interviews also suggest that compared to those who achieve

the highest performance rating, other employees needed 35% more man-days to correct

coding/testing/documentation errors. This is based on rough calculations with INDTECH HR

managers on error rates and lost man-days due to coding/testing/documentation errors.

INDTECH’s entry-level salaries are about $8,000 per year (at 2013 U.S. Dollar to Rupee exchange

rates). Based on this, we can calculate the productivity gains associated with hiring an employee

from a smaller town to be around $280 per year.

Next, we estimated the incremental search costs of recruiting employees from smaller towns.

Based on discussions with INDTECH’s recruiting managers, we estimated that there is a $21

incremental cost of hiring a remote employee. This is based on several criteria: incremental travel

costs for INDTECH executives involved in hiring from smaller towns, the additional search costs

associated with trips to screen colleges and students from smaller towns, and the larger number of

candidates who need to be interviewed in smaller towns compared to large cities.


Given this, the net payoff of hiring an employee from a smaller town is estimated to be around

$259 per employee per year. This works out to around 3% of the entry-level salary of a new college

graduate employee. These calculations provide only a very rough estimate of the net payoff

associated with hiring from smaller towns. An important limitation of this analysis is that we do

not have an estimate of sunk costs of investments related to hiring from smaller towns.
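
The arithmetic behind these figures can be reproduced with the short Python sketch below. It assumes, as our reading of the calculation, that the productivity gain is the product of the entry-level salary, the 10 percentage point gap in the probability of the highest rating, and the 35% extra man-days needed by other employees; this product matches the reported figure of about $280. The variable names are ours.

```python
ENTRY_SALARY_USD = 8_000          # approximate entry-level salary per year
TOP_RATING_GAP = 0.10             # 38.5% vs. 28.5% predicted probability of the top rating
EXTRA_MAN_DAYS_SHARE = 0.35       # extra man-days needed by employees below the top rating
INCREMENTAL_SEARCH_COST_USD = 21  # incremental cost of hiring from a smaller town

# Assumed reading of the calculation: salary x probability gap x extra man-days.
productivity_gain = ENTRY_SALARY_USD * TOP_RATING_GAP * EXTRA_MAN_DAYS_SHARE
net_payoff = productivity_gain - INCREMENTAL_SEARCH_COST_USD
share_of_salary = net_payoff / ENTRY_SALARY_USD

print(f"productivity gain: ${productivity_gain:.0f} per year")
print(f"net payoff:        ${net_payoff:.0f} per employee per year")
print(f"share of salary:   {share_of_salary:.1%}")
```

Under these assumptions the gain is $280, the net payoff is $259, and the net payoff is about 3.2% of the entry-level salary, consistent with the rounded figures in the text.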

G. Generalizability of Results – Analyses Using Survey and Secondary Data

An important limitation of our empirical analysis is external validity. We study the hiring

efforts of one single firm, and we observe candidates only after they are hired. We do not observe

candidates who were interviewed but not hired, even by this one firm. Because INDTECH’s HR

department does not keep records on the recruitment test scores of candidates who were not hired,

we are unable to observe such data. This leads to the possibility that if other firms implemented

INDTECH’s hiring policy, they may not be able to achieve the same payoff.

To partially work around this data constraint, we collected additional survey and secondary

data to extend the generalizability of our results. In summary, we conduct the

following secondary analyses: (i) a comparison of the broader student population in schools and

colleges located in smaller towns vs. larger cities; and (ii) an analysis of heterogeneity in firm policy on

hiring from smaller towns. While our secondary analyses suggest no statistically significant

differences in the broader student population across smaller towns and larger cities, they indicate

variation in firm policy on hiring from smaller towns. While Indian technology firms and

government-owned research entities are likely to hire individuals educated in smaller towns,

multinational technology firms are more likely to concentrate their hiring efforts in the larger cities.

While none of these secondary results are causal, collectively they help extend our main results in

the context of the broader Indian labor market.


Large Cities and Smaller Towns – Comparison of Broader Student Population

We first collect secondary data to compare the average performance of students from smaller

towns with students from large cities (see Appendix B for more information). In the absence of

existing data on quality of students across schools in India, we created our own dataset to compare

average performance in the secondary schools that the sample of 2007 INDTECH employees

attended.

To do so, we scraped a secondary website to obtain the examination scores achieved by

579,591 students who took the high school leaving (i.e., Grade XII) annual examination conducted

by the Central Board of Secondary Education (CBSE) in 2015.15 After manually correcting for

several typographical errors in the names of schools in both the CBSE and INDTECH samples,

we excluded all schools that were not represented in the INDTECH sample. This narrowed the

data from 5,445 schools to 461. We then computed the average CBSE examination score for each

of these 461 schools and compared scores between the schools located in smaller towns versus

large cities. We did not find any statistically significant differences in average CBSE examination

scores between the two groups of schools.16 However, an important limitation of this analysis is

that we only consider the 461 schools in the INDTECH sample and our results might be affected

by selection bias.
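
The school-level comparison can be sketched as follows in Python. This is an illustration under stated assumptions rather than the code used for the analysis: the records are made-up placeholder values, the function names are ours, and the unequal-variance (Welch) form of the t-statistic is our choice, since the text does not state which variant of the t-test was used.

```python
import math
from collections import defaultdict
from statistics import mean, variance


def school_means(records):
    """records: iterable of (school_id, score). Returns {school_id: mean score}."""
    scores_by_school = defaultdict(list)
    for school_id, score in records:
        scores_by_school[school_id].append(score)
    return {school: mean(scores) for school, scores in scores_by_school.items()}


def welch_t(a, b):
    """Welch two-sample t-statistic and approximate degrees of freedom."""
    var_a, var_b = variance(a) / len(a), variance(b) / len(b)
    t_stat = (mean(a) - mean(b)) / math.sqrt(var_a + var_b)
    dof = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(a) - 1) + var_b ** 2 / (len(b) - 1)
    )
    return t_stat, dof


# Hypothetical student-level records (school_id, score); NOT the CBSE data.
records = [("S1", 74), ("S1", 80), ("S2", 71), ("S2", 79), ("S3", 77),
           ("L1", 78), ("L1", 76), ("L2", 73), ("L2", 81), ("L3", 75)]
smaller_town_schools = {"S1", "S2", "S3"}  # hypothetical classification

means = school_means(records)
small = [m for s, m in means.items() if s in smaller_town_schools]
large = [m for s, m in means.items() if s not in smaller_town_schools]
t_stat, dof = welch_t(small, large)
print(f"t = {t_stat:.3f}, approximate degrees of freedom = {dof:.1f}")
```

Averaging within each school first makes the school, rather than the student, the unit of comparison, which matches the description of comparing average scores across the 461 schools.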

To strengthen this analysis, we created a second dataset with rankings for colleges based in

smaller towns and colleges located in large cities. We merged two publicly available lists from

Dataquest and i3RC, which ranked technology colleges across India, and we used these scores to

15. http://www.thelearningpoint.net/home/examination-results-2015/top-cbse-affiliated-schools---2015-school-wise-result-analysis-in-class-12-

examinations. 16. The average CBSE Grade XII examination scores for the schools located in the smaller towns and large cities were 76.20 and 76.62, respectively, and t-tests indicate no statistically significant differences between the scores of the two groups. Details on the data collection

exercise and results are available in Appendix B.

31

compare the average ranking of smaller town versus large city colleges in our sample.17 Once

again, we did not find any statistically significant differences in average college rankings between

the two groups of colleges.

Hiring from Smaller Town Colleges – Heterogeneity in Firm Policy

Next, we collected data from several primary and secondary sources to shed light on the

hiring of individuals who graduate from smaller town colleges by Indian technology firms,

multinational technology firms and government owned research entities in India. Collectively, the

evidence suggests that while Indian technology firms and government owned research entities are

likely to hire individuals educated in smaller towns, several other firms, including multinational

technology firms, are more likely to concentrate hiring efforts in larger cities.

In the first step, we conducted a survey of engineering colleges in India and found evidence

that many technology firms in India, especially the multinational firms, do not hire from smaller

towns.18 This result is consistent with prior research such as Jensen (2012). As reported in

Appendix D, the results indicate that multinational technology firms predominantly hire from the

large city colleges, whereas INDTECH (and a few other Indian technology firms) hire from both

large city and smaller town colleges. Our survey results also indicate that the mean salaries for

2011 and 2012 are significantly higher for the large city college graduates compared to the smaller

town college graduates. We validated that this difference is statistically significant in a t-test

comparison of means.19

17. The Dataquest survey of Indian technology colleges is available at: http://www.gcet.ac.in/news/t-school_2015.pdf; the i3RC Times engineering college survey is available at: http://www.times-engineering-survey.com/. Websites accessed on 20th January, 2016.

18. To conduct this analysis, we randomly selected 10 large city and 10 smaller town colleges from the list of colleges from which INDTECH hires employees. We then contacted representatives at these colleges via phone and/or e-mail. Eleven of the 20 college representatives agreed to participate in telephone surveys that lasted about 30 minutes each. In these interviews, we asked about the total graduating class size at these colleges, starting salaries for undergraduate engineers in 2011 and 2012, which technology firms (both Indian and foreign multinational) hired from these colleges, and how many students were hired by each firm. The interviews were conducted with either the head of the college or the head of the group that was responsible for organizing recruitment at the college.

19. These findings are in line with the geographic returns to skills segregation assumptions of Young (2013) and Borjas (1994). The large difference in wages for skilled workers graduating from urban and rural colleges is arguably an exposition of the underlying conditions that lead to “refugee sorting” in the Borjas (1994) model of self-selection of migrants.


In the next step, we scraped data from publicly available profiles of employees for the three

largest Indian technology firms (TCS, Infosys and Wipro), available on a social networking site,

LinkedIn. We had to reconcile differences in naming conventions for each of these firms on the

LinkedIn profiles (e.g. Wipro was listed as “Wipro”, “Wipro technologies”, “Wipro ltd”, etc.).

These data were cleaned and coded by the researchers, and we were able to accurately establish the

location of the graduate school of an employee for 30,131 employees in this sample. We coded the

location of the graduate school as located within a top six metropolitan city of India (i.e. New

Delhi, Mumbai, Kolkata, Chennai, Bangalore and Hyderabad) or located elsewhere (i.e. in smaller

towns). The fraction of employees who graduated from colleges located in smaller towns was 0.59,

0.62 and 0.62 respectively for the employees from TCS, Infosys and Wipro.
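
The cleaning and coding steps described above (reconciling firm name variants, then coding each college location as metropolitan or smaller town) might look roughly like the sketch below. It is an illustration only: the prefix table, the function names, and the sample profiles are hypothetical, and the actual coding for the paper was done by the researchers rather than by this procedure. The only detail taken from the text is the list of six metropolitan cities.

```python
import re

# The six metropolitan cities named in the text; any other location is
# coded as a smaller town.
METRO_CITIES = {"new delhi", "mumbai", "kolkata",
                "chennai", "bangalore", "hyderabad"}

# Hypothetical lookup from a lowercase name prefix to a canonical firm label.
FIRM_PREFIXES = {
    "wipro": "Wipro",
    "infosys": "Infosys",
    "tcs": "TCS",
    "tata consultancy": "TCS",
}


def canonical_firm(raw_name):
    """Return the canonical firm for a free-text employer name, else None."""
    cleaned = " ".join(re.sub(r"[^a-z]+", " ", raw_name.lower()).split())
    for prefix, firm in FIRM_PREFIXES.items():
        if cleaned.startswith(prefix):
            return firm
    return None


def is_smaller_town(city):
    """True when the college location is outside the six metropolitan cities."""
    return city.strip().lower() not in METRO_CITIES


def smaller_town_share(profiles):
    """Per firm, the fraction of matched employees educated in a smaller town."""
    totals, small = {}, {}
    for employer, college_city in profiles:
        firm = canonical_firm(employer)
        if firm is None:
            continue  # employer is not one of the firms of interest
        totals[firm] = totals.get(firm, 0) + 1
        small[firm] = small.get(firm, 0) + int(is_smaller_town(college_city))
    return {firm: small[firm] / totals[firm] for firm in totals}


# Made-up profiles: (employer string as typed by the user, college city).
profiles = [
    ("Wipro", "Mysore"),
    ("Wipro technologies", "Bangalore"),
    ("Wipro ltd", "Puri"),
    ("Infosys Limited", "Mumbai"),
    ("INFOSYS", "Coimbatore"),
    ("Tata Consultancy Services", "Nagpur"),
    ("TCS", "Chennai"),
    ("Acme Corp", "Pune"),
]

shares = smaller_town_share(profiles)
print(shares)
```

Prefix matching is a deliberately crude choice here; in practice, ambiguous or misspelled employer strings would still need manual review, as the text indicates.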

To follow up on this, we conducted another survey at one of the top three Indian

technology firms (not INDTECH). We employed a professional survey company that surveyed

1,054 employees at the Bangalore production center of the firm. Of the employees surveyed, 84%

indicated that they had gone to school in a smaller town of India. The employees from smaller

towns had shorter tenure at the firm (difference in means = -1.12 years, t=-4.11) and were more

likely to be employed as a contractor (difference in means = 0.22, t=4.99).

In the final step, we collected the CVs of the top 593 scientists in the government owned

research labs of India. India’s 42 state-owned national laboratories are organized under an

autonomous umbrella organization, The Council of Scientific and Industrial Research (CSIR), and

collectively they have around 12,500 scientific and technical employees. The laboratories,

covering all major scientific and engineering disciplines, were created in the 1940s and 1950s. We

collected the CVs of the top 593 scientists across all 42 laboratories, at the top three tiers of the

organizational hierarchy. We then coded whether or not the individual was educated in a college


located in a smaller town. The mean fraction of top CSIR scientists educated in smaller towns is

0.74, 0.71 and 0.57 respectively, for the top three organizational hierarchy levels of ‘Scientist-F’,

‘Scientist-G’ and ‘Scientist-H’.20

VI. Discussion and Conclusion

Our empirical study relates to the role that firms in developing countries play in hiring

individuals from smaller towns, thus facilitating their migration to urban centers of production,

and training them to perform high-skilled jobs, such as software coding. Given the ex ante higher

costs firms plausibly incur in searching for individuals in smaller towns, an important question is

whether or not firms can benefit from such a hiring strategy. We exploit a talent allocation protocol

within a large Indian technology firm that allows us to isolate the relationship between the prior

location of an employee and that employee’s subsequent productivity.

Our results indicate that employees hired from smaller towns in India have higher productivity

compared to employees hired from large cities. To provide a possible interpretation of our results,

we test for selection. We provide econometric evidence that employees hired from smaller towns

have higher logical test scores at the recruitment stage. Our field interviews further corroborate

this. To quote the head of talent development at INDTECH, “If you are the best student in

Bangalore, you will probably never join INDTECH. Instead, you will join some MBA course in a

large city or go to America. However, if you are the best student in Puri Orissa, a smaller town

with high unemployment levels, and INDTECH visits your college to hire, that is your dream first

job.”

20. Our secondary analyses have the benefit of looking at employees across the organizational hierarchy of these firms (our main analyses look at entry level employees at INDTECH), but they have several limitations. For this broader sample of employees, we are unable to compare the performance of individuals educated in smaller towns vs. larger cities. We also do not know whether the Indian firms hire employees educated in smaller towns directly from their colleges, or whether the individuals migrated to larger cities prior to being hired by these firms.


One possible interpretation of our results is the following: if a firm is unable to hire the best

candidates from large cities, the employees it can recruit from smaller cities are of higher quality

than the employees it is able to hire from larger cities, as evidenced by the higher logical scores

on the tests at recruitment. One might still worry that being from a smaller city might be negatively

correlated with other determinants of future productivity that do not show up in the logic test

scores. The fact that these employees also receive higher performance ratings later suggests that

this is not the case, making a stronger argument for firms to recruit outside the six largest cities, at

least at the margin.21 Our results indicate that employees hired from smaller towns have lower

attrition rates as well. Additionally, we provide evidence that the net payoff of hiring individuals

from smaller towns is positive compared to hiring individuals from large cities (i.e., the gains

achieved from higher productivity and lower attrition rates sufficiently offset the additional search

costs associated with hiring individuals from smaller towns).

Our study contributes to the literature on within-country migration (Harris and Todaro, 1970;

Fields 1975; Young 2013; Bryan, et al., 2014; Colmer 2015; Bazzi et al., 2016; Munshi and

Rosenzweig, 2016). Unlike prior literature that focuses on rural to urban migration, we focus on

semi-urban to urban migration of skilled workers. While the prior literature on within country

migration has focused on self-selection models, Clemens, Özden, and Rapoport (2015) and other

scholars have alluded to barriers to migration that might affect the decisions of individuals to

migrate within countries. In the recent literature, Munshi and Rosenzweig (2016) study the effect

of caste on within-country migration in India. Other recent papers, such as Asher and Novosad

(2016) and Morten and Oliveira (2014) study physical barriers to within country migration and

analyze how the construction of roads eases such barriers and facilitates within-country migration.

21 We thank an anonymous referee at the Quarterly Journal of Economics for making this point.


Our study contributes to this literature and studies the role of firms in facilitating within-country

migration.

Our study has several limitations. First, our data come from a single firm; in this we follow the

tradition of insider econometrics in personnel economics (Baker, Gibbs, and Holmstrom 1994;

Bartel, Ichniowski, and Shaw 2004; Bandiera et al., 2005). Future research should test our central

findings in other settings to corroborate that (1) employees from smaller towns outperform their

large city counterparts and (2) this difference is attributable to selection.

An additional limitation of our results is that although we interpret our results using selection,

we do not have a way to test for alternative and complementary mechanisms such as effort or

motivation for why employees from smaller towns perform better. Employees hired from smaller

towns are moving to large urban production centers, so they could be viewed as undergoing

within-country migration. The literature on migration in economics (Borjas 1994; Chiswick 1978;

Carliner 1980) speaks to such mechanisms. Chiswick (1978) points to the fact that immigrants are often more highly motivated

than residents. Carliner (1980) noted that immigrants choose to work longer and harder than non-

migrants. Existing literature (Bailey, 2005) also explores whether immigrant workers adapt better

to uncertainty.

Another limitation of our study is that, in equilibrium, the gains from hiring smaller town

employees who have higher productivity are likely to disappear as firms set up production centers

in smaller towns over time and/or as the barriers to within-country migration are gradually

overcome. However, here again we borrow from the long-standing wisdom in the within-country

migration literature and posit that “neither migration, nor any equilibrating force is strong enough

to eliminate imbalances instantaneously” (Yap 1976: page 122). As Chandrasekhar, Ghosh, and


Roychowdhury (2006) pointed out, in 2006, the nonurban workforce in India was around 305

million and the IT and business process outsourcing industries employed only slightly more than

one million workers.

Our findings have several policy implications for India, which aspires to enjoy a demographic

dividend in the coming decades. As Chandrasekhar, Ghosh, and Roychowdhury (2006) noted, in

2020, the average Indian will be just 29 years old, compared to 37 in China and the U.S., 45 in

Western Europe, and 48 in Japan. However, aligning economic outcomes with demographic trends

is no easy task, especially in a country with persistent regional inequalities. Deaton and Dreze

(2002) demonstrated that regional disparities in India increased in the 1990s, with southern and

western regions showing much higher rates of growth than northern and eastern regions.

Our own analysis included in Appendix E indicates that northern and eastern states in India

such as Uttar Pradesh, Bihar, and West Bengal have large populations of young people and very

few technology firms. Without within-country migration moving highly skilled young people from

labor surplus smaller towns to the urban centers for employment opportunities, the perceived

demographic dividend could become a “demographic albatross.” To avoid this, government actors

in developing country locations with large pools of unemployed workers may consider subsidizing

firms’ costs incurred by searching for workers, especially for smaller firms. Government actors

might consider organizing employment fairs and/or providing certification to students based in

smaller towns. Future research should explore the effectiveness of such policy instruments in

encouraging firms to hire workers from smaller towns.

Our work also has relevance for developed countries like the United States and has a

philosophical connection to the “moving to opportunity” experiments conducted in Boston by

Katz, Kling, and Liebman (2001). In the past decade or so, labor economists have documented the


polarization of the U.S. labor market (Autor, Katz, and Kearney 2006), and economists have

identified large differences in worker earnings for observationally similar workers based on the

location of the individual. Moretti (2011) documents that the hourly wage of workers located in

metropolitan areas at the top of the wage distribution is more than double the wage of

observationally similar workers located in nonmetropolitan areas.

In conclusion, both in the U.S. and in developing countries, firms with nationwide hiring

practices can play an important role in facilitating within-country migration, by providing

employment opportunities to individuals living in smaller towns. Our results indicate that firms

can benefit from such hiring practices if they are able to hire individuals who have higher

productivity and/or lower attrition than their counterparts from large cities.


References

Albrecht, James W., and Bo Axell. “An equilibrium model of search unemployment.” The Journal of

Political Economy (1984): 824-840.

Autor, D., Lawrence F. Katz, and M. Kearney. “Measuring and interpreting trends in economic

inequality.” AEA Papers and Proceedings. Vol. 96. No. 2. 2006.

Asher, Sam, and Paul Novosad. "Market access and structural transformation: Evidence from rural

roads in India." Job Market Paper, January 11, 2016.

Baker, George, Michael Gibbs, and Bengt Holmstrom, “The Internal Economics of the Firm:

Evidence from Personnel Data,” The Quarterly Journal of Economics, 109 (1994), 881–919.

Bandiera, Oriana, Iwan Barankay, and Imran Rasul. "Social preferences and the response to

incentives: Evidence from personnel data." The Quarterly Journal of Economics (2005): 917-962.

Banerjee, Abhijit, Marianne Bertrand, Saugato Datta, and Sendhil Mullainathan, “Labor Market

Discrimination in Delhi: Evidence from a Field Experiment,” Journal of Comparative

Economics, 37 (2009), 14–27.

Banerjee, Abhijit, Shawn Cole, Esther Duflo, and Leigh Linden, “Remedying Education: Evidence

from Two Randomized Experiments in India,” The Quarterly Journal of Economics, 122 (2007),

1235–1264.

Banerjee, Abhijit, and Rohini Somanathan, “The Political Economy of Public Goods: Some Evidence

from India,” Journal of Development Economics, 82 (2007), 287–314.

Bartel, Ann, Casey Ichniowski, and Kathryn Shaw, “Using ‘Insider Econometrics’ to Study

Productivity,” The American Economic Review, 94 (2004), 217–223.

Bazzi, Samuel, et al. "Skill Transferability, Migration, and Development: Evidence from Population

Resettlement in Indonesia." The American Economic Review 106.9 (2016): 2658-2698.

Bertrand, Marianne, Rema Hanna, and Sendhil Mullainathan, “Affirmative Action in Education:

Evidence from Engineering College Admissions in India,” Journal of Public Economics, 94

(2010), 16–29.

Borjas, George J., “Economics of Immigration,” Journal of Economic Literature, 32 (1994), 1667–

1717.

Bryan, Gharad, Shyamal Chowdhury, and Ahmed Mushfiq Mobarak. "Underinvestment in a

profitable technology: The case of seasonal migration in Bangladesh." Econometrica 82.5 (2014):

1671-1748.

Burdett, Ken, and Melvyn G. Coles. "Marriage and class." The Quarterly Journal of Economics

(1997): 141-168.

Cameron, A. Colin, Jonah B. Gelbach, and Douglas L. Miller. "Bootstrap-based improvements for

inference with clustered errors." The Review of Economics and Statistics 90.3 (2008): 414-427.

Carliner, Geoffrey, “Wages, Earnings and Hours of First, Second and Third Generation American

Males,” Economic Inquiry, 18 (1980), 87–102.

Chandrasekhar, C. P., Jayati Ghosh, and Anamitra Roychowdhury, “The ‘Demographic Dividend’

and Young India’s Economic Future,” Economic and Political Weekly, December 9, 2006.

Chaudhury, Nazmul, Jeffrey Hammer, Michael Kremer, Karthik Muralidharan, and F. Halsey

Rogers, “Missing in Action: Teacher and Health Worker Absence in Developing Countries,”

Journal of Economic Perspectives, 20 (2006), 91–116.


Chiswick, Barry, “The Effect of Americanization on the Earnings of Foreign-Born Men,” Journal of

Political Economy, 86 (1978), 897–921.

Choudhury, Prithwiraj, and Tarun Khanna, “Physical, Social and Informational Barriers to Domestic

Migration in India,” in Institutions and Comparative Economic Development, Masahiko Aoki,

Timur Kuran, and Gerard Roland, eds. (Basingstoke, U.K.: Palgrave Macmillan, 2012).

Clemens, Michael A., “Why Do Programmers Earn More in Houston than Hyderabad? Evidence

from Randomized Processing of U.S. Visas,” American Economic Review, 103 (2013), 198–202.

Clemens, Michael A., Çağlar Özden, and Hillel Rapoport. "Reprint of: Migration and development

research is moving far beyond remittances." World Development 65 (2015): 1-5.

Currie, Janet, and Ann Harrison, “Sharing the Costs: The Impact of Trade Reform on Capital and

Labor in Morocco,” Journal of Labor Economics, 15 (1997), S44–S71.

Deaton, Angus, and Jean Dreze, “Poverty and Inequality in India: A Re-Examination,” Economic

and Political Weekly, September 7, 2002.

Dyson, Tim, and Mick Moore, “On Kinship Structure, Female Autonomy, and Demographic

Behavior in India,” Population and Development Review, 9 (1983), 35–60.

Fields, Gary S. "Rural-urban migration, urban unemployment and underemployment, and job-search

activity in LDCs." Journal of development economics 2.2 (1975): 165-187.

Foster, Andrew, and Mark R. Rosenzweig, “Missing Women, the Marriage Market, and Economic

Growth,” Working Paper, Brown University, 2009.

Grogger, Jeffrey, and Gordon H. Hanson. "Attracting Talent: Location Choices of Foreign-Born

PhDs in the United States." Journal of Labor Economics 33.S1 Part 2 (2015): S5-S38.

Harris, John R., and Michael P. Todaro. "Migration, unemployment and development: a two-sector

analysis." The American economic review 60.1 (1970): 126-142.

Hunt, Jennifer, and Marjolaine Gauthier-Loiselle. 2010. "How Much Does Immigration Boost

Innovation?" American Economic Journal: Macroeconomics, 2(2): 31-56.

Jensen, Robert, “Do Labor Market Opportunities Affect Young Women’s Work and Family

Decisions? Experimental Evidence from India,” The Quarterly Journal of Economics, 127 (2012),

753–792.

Katz, Lawrence F., Jeffrey R. Kling, and Jeffrey B. Liebman, “Moving to Opportunity in Boston:

Early Results of a Randomized Mobility Experiment”, The Quarterly Journal of Economics (2001)

116 (2): 607-654

Kerr, William R., and William F. Lincoln. The supply side of innovation: H-1B visa reforms and US

ethnic invention. No. w15768. National Bureau of Economic Research, 2010.

Kerr, Sari Pekkala, William R. Kerr, and William F. Lincoln. "Skilled immigration and the

employment structures of US firms." Journal of Labor Economics 33.S1 (2015): S147-S186.

Lazear, Edward P, “Hiring Risky Workers,” NBER Working Paper No. w5334, 1995.

Moretti, Enrico, “Local Labor Markets,” Handbook of Labor Economics, 4 (2011), 1237–1313.

Morgan, Peter. 1995. “A Model of Search, Coordination, and Market Segmentation.” Manuscript

(rev.), State Univ. New York Buffalo

Morten, Melanie, and Jaqueline Oliveira. "Migration, roads and labor market integration: Evidence

from a planned capital city." Unpublished Manuscript (2014).


Mortensen, Dale T., and Christopher A. Pissarides, “New Developments in Models of Search in the

Labor Market,” Handbook of Labor Economics, 3 (1999), 2567–2627.

Munshi, Kaivan, and Mark Rosenzweig. "Networks and misallocation: Insurance, migration, and the

rural-urban wage gap." The American Economic Review 106.1 (2016): 46-98.

Muralidharan, Karthik, and Venkatesh Sundararaman, “The Aggregate Effect of School Choice:

Evidence from a two-stage experiment in India”, The Quarterly Journal of Economics, first

published online February 27, 2015 doi:10.1093/qje/qjv013

Pissarides, Christopher A., “Equilibrium in the Labor Market with Search Frictions,” The American

Economic Review, 101 (2011), 1092–1105.

Rogerson, Richard, Robert Shimer, and Randall Wright, “Search-Theoretic Models of the Labor

Market: A Survey,” NBER Working Paper No. w10655, 2004.

Schultz, T. Paul, “Rural-Urban Migration in Colombia,” The Review of Economics and Statistics, 53

(1971), 157–163.

Smith, Lones, “The Marriage Model with Search Frictions,” Journal of Political Economy, 114

(2006), 1124–1144.

Yap, Lorene, “Internal Migration and Economic Development in Brazil,” The Quarterly Journal of

Economics, 90 (1976), 119–137.

Young, Alwyn. "Inequality, the urban-rural gap, and migration." The Quarterly Journal of

Economics (2013): qjt025.

Blom, Andreas, and Hiroshi Saeki. "Employability and skill set of newly

graduated engineers in India." World Bank report (2011).


FIG 1. — Comparison of Performance in 2008

NOTE. — This graphic plots the distributions of performance at the end of 2008 for the 2007 batch and compares performance for the smaller town employees to employees from large cities. Interviews with managers at INDTECH indicate that performance at the end of 2008 for the 2007

batch is measured using two dimensions—error rate in coding/testing and completeness in coding/testing and documentation—and is distributed

across three possible discrete ratings. We also conduct distributional tests to compare the short-term performance for large city and smaller town employees. We run the two-sample Wilcoxon rank-sum (Mann-Whitney) test and reject the null hypothesis that the performance data for the two

groups (tier 1 and tier 2/tier 3) follow the same distribution.

FIG. 2.— Comparison of Predicted Probabilities of Performance Ratings. The predicted probability that an employee hired from a smaller town

will achieve the highest performance rating is 38.5% in the fully specified model. The corresponding predicted probability that an employee hired from a large city will achieve the highest performance rating is 28.5%.

[Bar chart: Predicted Probabilities - Performance Ratings. Horizontal axis: lowest, middle, and highest rating. Series: small town and large cities. Vertical axis: 0.0% to 80.0%.]


FIG. 3. — Comparison of Logical Scores based on Standardized Tests during Hiring

NOTE.— This graphic plots the comparison of logical scores obtained during the standardized recruitment test for the 2007 batch and compares

logical scores for the smaller town recruits to the tier 1 city recruits.

FIG. 4.— Comparison of Verbal Scores based on Standardized Tests during Hiring

NOTE. — This graphic plots the comparison of verbal scores obtained during the standardized recruitment test for the 2007 batch and compares

verbal scores for the smaller town recruits to the tier 1 city recruits.

Table 1

Summary Statistics—Entire sample

                                            Observations    Mean  Std. dev.   Min   Max

Panel A: Employee characteristics
From smaller town (i.e., from smaller town
  school and smaller town college)                 1,275    0.51       0.50     0     1
From smaller town school                           1,302    0.62       0.48     0     1
Scheduled caste                                    1,696    0.51       0.49     0     1
Male                                               1,696    0.65       0.48     0     1

Panel B: Recruitment and training scores
Recruitment test score logical                     1,635    4.93       3.36    -4     9
Recruitment test score verbal                      1,635    4.29       3.98    -8    16
CGPA training                                      1,696    4.49       0.43     2     5

Panel C: Performance and attrition
Performance in 2008                                  676    2.29       0.54     1     3
Has quit                                           1,696    0.42       0.49     0     1
Fired                                              1,696   0.061       0.24     0     1

NOTE. — This table lists summary statistics for the entire sample. The variable “from smaller town” is coded as “1” if the individual went to primary school, high school, and college in a non-tier 1 town in India. The variable “from smaller

town school” is coded as “1” if the individual went to primary school and high school in a non-tier 1 town of India. We

classify Indian towns based on the tier system outlined by the Sixth Pay Commission report of the Government of India (details at: http://ccis.nic.in/WriteReadData/CircularPortal/D2/D02ser/11016_2_2008-AIS-II.pdf). “CGPA training” is the

cumulative grade point average at the end of the training. The logical and verbal scores are from the standardized multiple

choice recruitment tests; the standardized tests include negative penalties for wrong answers. The variable “scheduled caste” is coded as “1” if the employee is a member of one of the scheduled or other backward castes (OBCs).


Table 2

Comparison of Summary Statistics for Employees from Smaller Towns and Employees from Large

Cities

                                  From smaller town                 From smaller town school
                                  = 1       = 0       Difference    = 1       = 0       Difference
                                  (1)       (2)       (3)           (4)       (5)       (6)

Panel A: Employee characteristics
Scheduled caste                   0.50      0.55      -0.05*        0.52      0.53      -0.02
Male                              0.65      0.65      -0.01         0.65      0.66      -0.01

Panel B: Recruitment and training scores
Recruitment test score logical    5.63      4.59      1.04***       5.48      4.47      1.01***
Recruitment test score verbal     4.23      4.53      -0.30         4.13      4.71      -0.58**
CGPA training                     4.49      4.47      0.02          4.48      4.48      0.00

Panel C: Performance and attrition
Performance in 2008               2.40      2.23      0.17***       2.37      2.22      0.14***
Has quit                          0.35      0.47      -0.12***      0.36      0.49      -0.13***
Fired                             0.05      0.06      -0.01         0.06      0.06      0.00

NOTE. — Column (3) of Table II reports the differences in descriptive statistics where the treatment group includes from

smaller town employees (i.e., employees who went to a smaller town school, high school, and college). The variable “from smaller town” is coded as “1” if the individual went to primary school, high school, and college in a non-tier 1 town in India.

Column (6) of Table II reports differences in descriptive statistics where the treatment group includes employees from smaller

town schools (i.e., employees who went to a smaller town school and high school. They might have gone to a college in either a large city or smaller town). The variable “from smaller town school” is coded as “1” if the individual went to primary

school and high school in a non-tier 1 town of India. We classify Indian towns based on the tier system outlined by the Sixth

Pay Commission report of the Government of India (details at: http://ccis.nic.in/WriteReadData/CircularPortal/D2/D02ser/11016_2_2008-AIS-II.pdf). “CGPA training” is the cumulative

grade point average at the end of the training. The logical and verbal scores are from the standardized multiple choice

recruitment tests; the standardized tests include negative penalties for wrong answers. The variable “scheduled caste” is coded as “1” if the employee is a member of one of the scheduled or other backward castes (OBCs).

***p < .01.

**p < .05 *p < .1


Table 3

Performance

                    (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
Variables           Model 1   Model 2   Model 3   Model 4   Model 5   Model 6   Model 7   Model 8   Model 9

from_smaller_town   0.63***   0.64***   0.52***   0.59***   0.69***   0.61***   0.66***   0.63***   0.45***
                    (0.19)    (0.21)    (0.18)    (0.21)    (0.20)    (0.21)    (0.18)    (0.21)    (0.16)
cgpa_training                           2.32***                                                     2.20***
                                        (0.25)                                                      (0.30)
score_logical                                     0.08***             0.07**                        0.08***
                                                  (0.03)              (0.03)                        (0.03)
score_verbal                                                0.03***   0.01                          0.01
                                                            (0.01)    (0.01)                        (0.01)
scheduled_caste                                                                 0.64***             0.76***
                                                                                (0.243)             (0.193)
Male                                                                                      0.22      0.30
                                                                                          (0.20)    (0.20)

Pseudo R2           0.01      0.02      0.10      0.04      0.03      0.04      0.04      0.03      0.12
Observations        540       540       540       511       511       511       540       540       511
Fixed effects       No        Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes

NOTE. — This table implements specification (1) with robust standard errors clustered at the development center. The results are also robust to running an OLS model with wild bootstrap and clustered standard errors (reported in Appendix F).

This table indicates that there is a positive and statistically significant relationship between being from a smaller town and

ex post productivity measured using performance rating scores. This result is robust to controlling for test scores at the end of training, standardized recruitment test scores, whether or not the employee is from a scheduled caste, gender, and

production center fixed effects. Among the control variables, as expected, CGPA at the end of training and logical test scores

during recruitment are highly correlated to performance. Columns (7) and (9) also indicate a positive and statistically significant relationship between being a member of a scheduled caste and performance. Marginal effects analyses indicate

that the predicted probability that an employee hired from a smaller town will achieve the highest performance rating is

38.5% in the fully specified model. The corresponding predicted probability that an employee hired from a large city will achieve the highest performance rating (for performance) is 28.5%.

***p < .01

**p < .05 *p < .1


Table 4

Attrition

Dependent variable: has quit the firm

                    (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)
Variables           Model 1   Model 2   Model 3   Model 4   Model 5   Model 6   Model 7   Model 8   Model 9

from_smaller_town   -0.51***  -0.53***  -0.54***  -0.51***  -0.50***  -0.48***  -0.53***  -0.53***  -0.48***
                    (0.11)    (0.12)    (0.12)    (0.12)    (0.13)    (0.12)    (0.13)    (0.12)    (0.13)
cgpa_training                           -0.48                                                       -0.49
                                        (0.35)                                                      (0.37)
score_logical                                     -0.003              -0.02                         -0.018
                                                  (0.02)              (0.02)                        (0.022)
score_verbal                                                0.03***   0.04***                       0.04***
                                                            (0.01)    (0.01)                        (0.01)
scheduled_caste                                                                 0.07                0.08
                                                                                (0.16)              (0.20)
Male                                                                                      -0.13     -0.09
                                                                                          (0.12)    (0.14)

Pseudo R2           0.01      0.02      0.03      0.02      0.02      0.02      0.02      0.02      0.03
Observations        1,254     1,254     1,254     1,208     1,208     1,208     1,254     1,254     1,208
Fixed effects       No        Yes       Yes       Yes       Yes       Yes       Yes       Yes       Yes

NOTE. — Robust standard errors in parentheses. This table implements specification (2) and employs a logit regression

with robust standard errors clustered at the production center to which the employee is assigned. The results are also robust

to running an OLS model with wild bootstrap and clustered standard errors. Table IV results indicate a negative and statistically significant relationship between being from a smaller town and attrition across all models. The predicted

probability that an employee from a smaller town will quit INDTECH in the first four years is 34.3%. The corresponding

predicted probability that an employee from a large city will quit in the first four years is 45.9%. Among the control variables, there is a positive and statistically significant relationship between verbal test scores and attrition.

***p < .01

**p < .05 *p < .1
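As a rough arithmetic check on the two attrition probabilities quoted in the note to Table 4, the sketch below shifts the large city baseline of 45.9% by a from_smaller_town coefficient on the log-odds scale. It assumes the Model 9 coefficient (-0.48) and ignores the averaging over covariates and production center fixed effects that a full marginal effects analysis would perform, so it is only an approximation. The function names are illustrative and are not taken from the paper.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Probability implied by a log-odds value."""
    return 1.0 / (1.0 + math.exp(-z))

def shifted_probability(baseline_prob, coefficient):
    """Apply a logit coefficient to a baseline probability."""
    return inv_logit(logit(baseline_prob) + coefficient)

large_city_quit = 0.459   # predicted probability reported in the note to Table 4
coef_smaller_town = -0.48  # Model 9 coefficient (assumed to be the relevant model)

smaller_town_quit = shifted_probability(large_city_quit, coef_smaller_town)
print(f"{smaller_town_quit:.3f}")  # prints 0.344
```

The result, about 34.4%, is close to the 34.3% reported for smaller town employees, which suggests the two reported probabilities are consistent with a coefficient of roughly -0.48.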

Table 5

Standardized Logical Scores from Recruitment Test

(1) (2) (3) (4)
Variables Model 1 Model 2 Model 3 Model 4
from_smaller_town 1.04*** 1.04*** 1.04** 1.04***
(0.19) (0.19) (0.19) (0.19)
scheduled_caste 0.16 0.14
(0.19) (0.19)
Male -0.15 -0.13
(0.19) (0.20)
Constant 4.59*** 4.51*** 4.69*** 4.60***
(0.14) (0.17) (0.18) (0.22)
Observations 1,229 1,229 1,229 1,229
R-squared 0.025 0.025 0.025 0.025

NOTE. — Robust standard errors in parentheses. This table implements specification (3) for logical test scores in

standardized tests during recruitment, using OLS with robust standard errors. Though the size of the cohort is 1,696 employees, the sample size for this table is constrained to 1,229 employees by the availability of data for the score_logical variable. Results across all models indicate a positive and statistically significant relationship between logical test scores and being from a smaller town.

***p < .01 **p < .05 *p < .1


Table 6

Standardized Verbal Scores from Recruitment Test

(1) (2) (3) (4)
Variables Model 1 Model 2 Model 3 Model 4
from_smaller_town -0.30 -0.29 -0.30 -0.29
(0.23) (0.23) (0.23) (0.23)
scheduled_caste 0.18 0.17
(0.23) (0.23)
Male 0.02 0.05
(0.23) (0.24)
Constant 4.53*** 4.44*** 4.52*** 4.40***
(0.16) (0.21) (0.22) (0.27)
Observations 1,229 1,229 1,229 1,229
R-squared 0.001 0.002 0.001 0.002

NOTE. — Robust standard errors in parentheses. This table implements specification (3) for verbal test scores on

standardized tests during recruitment, using OLS with robust standard errors. Though the size of the cohort is 1,696 employees, the sample size for this table is constrained to 1,229 employees by the availability of data for the score_verbal variable. Results across all models indicate no statistically significant relationship between verbal test scores and being from a smaller town.

***p < .01 **p < .05 *p < .1

Table 7

Getting Fired

Dependent variable: was the employee fired?

(1) (2) (3) (4) (5) (6) (7)
Variables Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7
from_smaller_town -0.03 -0.11 0.15 0.05 -0.16 -0.03 0.02
(0.35) (0.37) (0.36) (0.33) (0.29) (0.35) (0.32)
cgpa_training -4.69*** -4.81***
(0.42) (0.29)
score_logical -0.08** -0.09**
(0.03) (0.035)
score_verbal -0.04* 0.0002
(0.02) (0.04)
scheduled_caste -1.30*** -1.72***
(0.25) (0.33)
Male -0.10 0.09 0.03
(0.19) (0.19) (0.21)
Pseudo R2 0.03 0.46 0.04 0.03 0.07 0.03 0.50
Observations 1,202 1,202 1,157 1,157 1,202 1,202 1,157
Fixed effects for location Yes Yes Yes Yes Yes Yes Yes

NOTE. — Robust standard errors in parentheses. This table reports results on whether smaller town employees are more likely to be fired than employees from large cities. We collected data on which employees of the 2007 cohort were fired from 2007 to 2011, and the results reported in Table 7 indicate no evidence that employees from smaller towns are more likely to be fired than employees hired from large cities. Additionally, columns (5) and (7) of Table 7 indicate that employees from scheduled castes and scheduled tribes are less likely to be fired than other “higher caste” employees.

***p < .01 **p < .05 *p < .1


Table 8

Validity of the Random Assignment

Dependent variable: whether or not posted to Bangalore

(1) (2) (3) (4) (5) (6)
Variables Model 1 Model 2 Model 3 Model 4 Model 5 Model 6
from_smaller_town -0.16 -0.13 -0.15 -0.17 -0.16 -0.12
(0.14) (0.15) (0.14) (0.14) (0.14) (0.15)
score_logical -0.02 -0.03
(0.02) (0.02)
score_verbal 0.02 0.02
(0.02) (0.02)
cgpa_training 0.30 0.23
(0.20) (0.20)
Male -0.18 -0.22
(0.15) (0.15)
Constant -1.33*** -1.23*** -1.40*** -2.69*** -1.21*** -2.17***
(0.10) (0.14) (0.128) (0.912) (0.13) (0.90)
Observations 1,254 1,208 1,208 1,254 1,254 1,208

NOTE. — Robust standard errors in parentheses. This table reports results to validate the talent allocation protocol (i.e., validating that the production center assignment is not correlated with observable employee-level characteristics, including

prior performance during recruitment and training). Results across all models indicate that the decision to allocate an

employee to the largest and most important production center (located in Bangalore) following induction training is not correlated with observable employee-level characteristics such as being from a smaller town or observable measures of prior

performance (such as CGPA at the end of training or standardized test scores at the recruitment stage). The decision to

allocate an employee to Bangalore is also not correlated with other observable individual characteristics such as gender. This validates the talent allocation policy underlying our study.

***p < .01 **p < .05 *p < .1


APPENDIX

A. Employee Random Assignment Protocol

(Based on interviews and INDTECH internal documents. Part of the text is copied

from INDTECH internal documents.)

INDTECH assigns its software engineer trainees to production centers based on a

computer application called “Talent Planning,” which is part of the firm’s

enterprise resource allocation software system. This application allocates trainees

to a location based on the quarterly manpower budget released by Corporate

Planning.

The “process life cycle steps” are:

• Collating the manpower budget and unit-wise requirements

• Trainee assignment (location)

• Communication with stakeholders

Talent Planning bases the assignment of employees on the following:

• Production center requirements: HR at each production center provides

data on its requirements for trainees trained in various technologies.

• Data from HR stationed at the training location: Two weeks prior to the

completion of training batches, HR at the training location releases data on

which employees are expected to complete training.

The two variables that the Talent Planning team considers while assigning

employees to production centers, using the automated system, are the stream

of training for the trainee and the estimated date of training completion. The prior

background of the employee and the test scores of the employee are not

considered in this decision. INDTECH communicates employee assignments

through a centralized portal.
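The allocation rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not INDTECH's actual Talent Planning application: the record layout, the ordering of production centers, the tie-breaking rule, and all example values are hypothetical. The one property taken from the text is that only the training stream and the estimated completion date enter the decision, while hometown and test scores are carried on the record but never read.

```python
from collections import namedtuple
from datetime import date

# Hypothetical trainee record; hometown and test_score are deliberately unused below.
Trainee = namedtuple("Trainee", "employee_id stream completion_date hometown test_score")

def assign_trainees(trainees, requirements):
    """Assign trainees to centers using only stream and completion date.

    requirements maps center -> {stream: open_slots}. Returns {employee_id: center}.
    """
    remaining = {center: dict(slots) for center, slots in requirements.items()}
    assignments = {}
    # Earliest expected completion first; ties broken by employee id (an assumption).
    for t in sorted(trainees, key=lambda t: (t.completion_date, t.employee_id)):
        for center in sorted(remaining):  # alphabetical center order is an assumption
            if remaining[center].get(t.stream, 0) > 0:
                remaining[center][t.stream] -= 1
                assignments[t.employee_id] = center
                break
    return assignments

# Hypothetical example data.
trainees = [
    Trainee("E1", "java", date(2007, 9, 1), "Bareilly", 7.1),
    Trainee("E2", "java", date(2007, 8, 15), "Mumbai", 5.2),
    Trainee("E3", "dotnet", date(2007, 9, 1), "Guntur", 6.0),
]
requirements = {"Bangalore": {"java": 1}, "Pune": {"java": 1, "dotnet": 1}}
assignments = assign_trainees(trainees, requirements)
print(assignments)  # {'E2': 'Bangalore', 'E1': 'Pune', 'E3': 'Pune'}
```

Because the function never reads `hometown` or `test_score`, replacing those fields with arbitrary values leaves the assignments unchanged, which is the property that Table 8 tests empirically.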


APPENDIX

B. Creating a Database of Scores for CBSE Schools and Rankings for Colleges

In the absence of existing reliable data on the quality of schools in India, we created

our own dataset using student scores on the Grade XII annual Central Board of Secondary

Education (CBSE) examination. The CBSE examination is a standardized school leaving

examination in India. As of 2015, 17,161 schools across the country were affiliated with

the CBSE, but only a fraction of these schools have grades XI and XII. In 2015, students

from approximately 9,000 schools took the CBSE examination.

To compare average scores achieved by students from smaller town versus large city

schools, we scraped a secondary website with CBSE examination marks for 579,591

students across 5,433 CBSE-affiliated schools in India. After manually correcting for

several typographical errors in the names of schools in both the CBSE and INDTECH

samples, we excluded all schools that were not represented in the

INDTECH sample. This narrowed the data from 5,445 schools to 461. We then computed

the average CBSE examination score for each of these 461 schools and compared the

scores of schools located in smaller towns versus large cities. The data collection steps are

outlined below.

In the first step, we examined publicly available secondary data on “The

Learning Point” Web site.22 Next, we copied data for 5,433 schools that had been scraped

by this website from the Grade XII CBSE examination held in March 2015. The

examination scores were collected for 5,433 schools (out of approximately 9,000

schools).23 We then manually corrected school name errors and used Google Maps to

identify school names as necessary. As an example, a candidate who studied at “Kendriya

Vidyalaya, T Nagar” was matched to “Kendriya Vidyalaya, Chennai,” because a Google

Maps search showed that T Nagar is a street in Chennai and there is no other Kendriya

Vidyalaya in Chennai. In the final step, we computed average scores for each school.
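The data collection steps above can be sketched as a short pipeline: correct school names, keep only schools represented in the INDTECH sample, average marks by school, and compare the two groups. This is a hedged illustration rather than the authors' code. The function names and all marks are hypothetical, and the only name correction shown is the Kendriya Vidyalaya example from the text.

```python
from collections import defaultdict
from statistics import mean

# Manual name corrections; this single entry is the example given in the text.
NAME_FIXES = {"Kendriya Vidyalaya, T Nagar": "Kendriya Vidyalaya, Chennai"}

def school_averages(student_records, indtech_schools):
    """Average marks per school, restricted to schools in the INDTECH sample.

    student_records is an iterable of (school_name, marks) pairs.
    """
    marks_by_school = defaultdict(list)
    for school, marks in student_records:
        school = NAME_FIXES.get(school, school)
        if school in indtech_schools:
            marks_by_school[school].append(marks)
    return {school: mean(marks) for school, marks in marks_by_school.items()}

def compare_groups(averages, large_city_schools):
    """Return (mean of smaller town school averages, mean of large city school averages)."""
    large = [avg for s, avg in averages.items() if s in large_city_schools]
    small = [avg for s, avg in averages.items() if s not in large_city_schools]
    return mean(small), mean(large)

# Hypothetical records; the marks are invented for illustration only.
records = [
    ("Kendriya Vidyalaya, T Nagar", 80.0),
    ("Kendriya Vidyalaya, Chennai", 90.0),
    ("Hypothetical School, Bareilly", 70.0),
    ("Hypothetical School, Bareilly", 74.0),
    ("Unmatched School", 99.0),  # dropped: not in the INDTECH sample
]
indtech_schools = {"Kendriya Vidyalaya, Chennai", "Hypothetical School, Bareilly"}
averages = school_averages(records, indtech_schools)
small_mean, large_mean = compare_groups(averages, {"Kendriya Vidyalaya, Chennai"})
print(averages, small_mean, large_mean)
```

Averaging within school first, and then across schools, gives each school equal weight regardless of how many students it sent to the examination.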

22 http://www.thelearningpoint.net/home/examination-results-2015/top-cbse-affiliated-schools---2015-school-wise-result-analysis-in-class-12-examinations.

23 Details at: https://docs.google.com/spreadsheets/d/1U1plN_gpQg-qcmyNsPumrH4PSctLXmnK0Px6SCdlIJk/edit#gid=1732754620.


To augment this analysis, we created a second dataset with rankings for colleges based

in smaller towns and those located in large cities. We merged two publicly available lists

that ranked 226 colleges across India, and compared the average rankings of smaller town

and large city colleges in our sample. To do so, we first searched for the superset of college

rankings for India.24 After reviewing the available lists, we selected the Dataquest ranking

as the most reliable, scrapable, and sufficiently large dataset.25

Dataquest ranks and assigns scores to private and public colleges (i.e., “1” is the

highest score, “2” the second highest score, and so on). To increase the sample size, we

then augmented the Dataquest list with additional colleges from the “Times Ranking”

which was also scrapable and large.26 We then merged the two lists and specified if each

unique ranking was designated by the Times or Dataquest. All websites were accessed on

December 12, 2015.
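The merge of the two ranking lists can be sketched as below. This is one plausible reading of the procedure, not the authors' code: it assumes that a college appearing in both lists keeps its Dataquest rank and that the Times list only contributes colleges missing from Dataquest. College names and ranks are hypothetical.

```python
from statistics import mean

def merge_rankings(dataquest, times):
    """Merge two {college: rank} lists, recording which list supplied each rank.

    Assumption: Dataquest takes precedence when a college appears in both lists.
    """
    merged = {name: {"rank": rank, "source": "Dataquest"} for name, rank in dataquest.items()}
    for name, rank in times.items():
        if name not in merged:
            merged[name] = {"rank": rank, "source": "Times"}
    return merged

def average_rank(merged, colleges):
    """Mean rank over the colleges in a group that appear in the merged list."""
    return mean(merged[c]["rank"] for c in colleges if c in merged)

# Hypothetical inputs for illustration.
dataquest = {"College A": 1, "College B": 5}
times = {"College B": 3, "College C": 10}
merged = merge_rankings(dataquest, times)
smaller_town_avg = average_rank(merged, {"College A", "College C"})
print(merged["College B"], smaller_town_avg)
```

Keeping the source label alongside each rank matters because ranks from two different lists are not on a common scale, so any comparison of averages can be repeated within a single source as a check.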

24 https://en.wikipedia.org/wiki/Indian_engineering_college_rankings,_2015.

25 http://www.gcet.ac.in/news/t-school_2015.pdf.

26 http://www.times-engineering-survey.com.


APPENDIX

C. Secondary Data Collected to Perform Cluster Analysis of Indian Cities and Towns

Notes. In the base case, we used the Government of India classification of cities and towns in India and classified the six

largest metros of Delhi, Mumbai, Bangalore, Chennai, Kolkata, and Hyderabad as large cities. In robustness checks, we

collected additional secondary data on population and educational parameters for the top 15 Indian cities based on population.

The expanded list of cities includes the six largest metros and the following additional cities and towns: Pune, Ahmedabad,

Jaipur, Patna, Lucknow, Thiruvananthapuram, Mysore, Bhopal, and Chandigarh. Collectively, these 15 cities/towns

accounted for more than 20% of the total urban population of India in 2007-8. We used a publicly available dataset called the District

Information System for Education (DISE) that has city- and town-level data for variables related to population and

elementary schools (Grade 1-8) in each “district.” We collated the district-level data to compute the city-/town-level

indicators. The variables included percentage urban population, overall literacy, female literacy, total number of schools,

total enrollment in government-run primary schools, and number of teachers in government-run primary schools. This

dataset was created by the National Institute of Educational Planning and Administration (NIEPA) in India, which initiated

the task of developing a school-based statistical system during 1995 with financial assistance from UNICEF. DISE has data

from 2002-3 to 2013-14. Our data was extracted using Python code from http://www.dise.in/drc.htm.

Using these indicators for 2007-8, we ran a cluster analysis of the top 15 Indian cities and towns and grouped the cities/towns

into either two or three clusters (we used data for 2007-8, given that the INDTECH batch we studied was hired in 2007). We

also ran the cluster analysis in three different ways: based on population indicators alone; based on educational indicators

alone; and based on both population and educational indicators. This led to six iterations of the cluster analysis. In each

iteration of the cluster analysis, we recomputed the variable from_smaller_town based on the cities grouped in the top cluster.

In all our iterations of this robustness check, all results reported in Tables 3-8 remain robust.
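The six iterations described in these notes can be sketched as follows, using the 2007-8 indicators tabulated in this appendix. The notes do not state which clustering algorithm was used, so k-means on standardized indicators is an assumption here, as are the deterministic starting points, the mapping of indicators to the "population" and "education" sets, and the rule that the "top cluster" is the one with the highest mean urban population share. The flags are computed only for the 15 listed cities; in the paper the variable applies to each employee's hometown.

```python
import math

# City: (urban %, primary schools, overall literacy %, female literacy %,
#        enrollment in government primary schools, teachers in government primary schools)
CITIES = {
    "Delhi": (93.2, 4918, 81.7, 74.7, 901168, 25231),
    "Mumbai": (100.0, 3622, 86.8, 81.2, 43285, 1440),
    "Chennai": (100.0, 1504, 85.3, 80.4, 35441, 1064),
    "Bangalore": (73.2, 5220, 78.9, 72.5, 75358, 3368),
    "Hyderabad": (100.0, 3032, 78.8, 73.5, 72244, 2371),
    "Kolkata": (100.0, 2184, 80.9, 77.3, 177302, 5380),
    "Pune": (58.1, 5632, 80.5, 71.9, 162805, 6791),
    "Ahmadabad": (80.2, 2270, 79.5, 70.8, 66431, 1939),
    "Jaipur": (49.4, 7064, 69.9, 55.5, 202244, 5835),
    "Patna": (41.6, 3341, 62.9, 50.8, 288955, 6207),
    "Lucknow": (63.6, 2703, 68.7, 60.5, 198588, 5097),
    "Thiruvananthapuram": (33.8, 971, 89.3, 86.1, 54895, 1705),
    "Mysore": (37.2, 2488, 63.5, 55.8, 49927, 2407),
    "Bhopal": (80.4, 1993, 74.6, 66.4, 112207, 2576),
    "Chandigarh": (89.8, 176, 81.9, 76.5, 11351, 266),
}
# Assumed reading of the three indicator sets named in the notes.
FEATURE_SETS = {"population": [0], "education": [1, 2, 3, 4, 5], "both": [0, 1, 2, 3, 4, 5]}

def standardize(rows):
    """Rescale each column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]

def kmeans(points, k, max_iter=100):
    """Plain k-means with a deterministic farthest-point start (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        centroids.append(list(max(points, key=lambda p: min(math.dist(p, c) for c in centroids))))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties out
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels

def from_smaller_town_flags(feature_idx, k):
    """Flag each city as smaller town (True) unless it falls in the top cluster."""
    names = list(CITIES)
    rows = [[CITIES[n][i] for i in feature_idx] for n in names]
    labels = kmeans(standardize(rows), k)

    def mean_urban(label):
        vals = [CITIES[n][0] for n, lab in zip(names, labels) if lab == label]
        return sum(vals) / len(vals)

    top = max(set(labels), key=mean_urban)  # assumed definition of the top cluster
    return {n: lab != top for n, lab in zip(names, labels)}

# Three indicator sets times two cluster counts gives the six iterations.
runs = {(name, k): from_smaller_town_flags(idx, k)
        for name, idx in FEATURE_SETS.items() for k in (2, 3)}
for key, flags in runs.items():
    print(key, sorted(city for city, small in flags.items() if not small))
```

Each printed line lists the cities placed in the top cluster for one iteration; every other town would be coded as from_smaller_town = 1 before the regressions are re-run.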

                                      Educational indicators used for cluster analysis (2007-8)
                        Percentage    Total number                          Total enrollment -   Teachers in
                        urban         of primary                            government           government
                        population    schools         Overall    Female     schools_primary      schools_primary
#   City                (2007-8)      (grades 1-8)    literacy   literacy
1   Delhi               93.2%         4,918           81.7%      74.7%      901,168              25,231
2   Mumbai              100.0%        3,622           86.8%      81.2%      43,285               1,440
3   Chennai             100.0%        1,504           85.3%      80.4%      35,441               1,064
4   Bangalore           73.2%         5,220           78.9%      72.5%      75,358               3,368
5   Hyderabad           100.0%        3,032           78.8%      73.5%      72,244               2,371
6   Kolkata             100.0%        2,184           80.9%      77.3%      177,302              5,380
7   Pune                58.1%         5,632           80.5%      71.9%      162,805              6,791
8   Ahmadabad           80.2%         2,270           79.5%      70.8%      66,431               1,939
9   Jaipur              49.4%         7,064           69.9%      55.5%      202,244              5,835
10  Patna               41.6%         3,341           62.9%      50.8%      288,955              6,207
11  Lucknow             63.6%         2,703           68.7%      60.5%      198,588              5,097
12  Thiruvananthapuram  33.8%         971             89.3%      86.1%      54,895               1,705
13  Mysore              37.2%         2,488           63.5%      55.8%      49,927               2,407
14  Bhopal              80.4%         1,993           74.6%      66.4%      112,207              2,576
15  Chandigarh          89.8%         176             81.9%      76.5%      11,351               266


APPENDIX

D. Survey of Large City and Smaller Town Engineering Colleges

                                                                  Large city colleges   Smaller town colleges
Average size of graduating class in computer science/IT
(undergraduate and master’s)                                      342                   458
Average percentage of graduating class in computer science/IT
hired by INDTECH (in 2011, 2012)                                  0.17%                 0.06%
Average percentage of graduating class in computer science/IT
hired by multinational technology firms IBM and Cognizant
(in 2011, 2012)                                                   9%                    1%
Mean annual salary (Rupees Lakhs, 2011 and 2012 average)          6.20                  2.70
N                                                                 7                     4

Notes. The researchers randomly selected 10 large city and 10 smaller town engineering colleges from the list of colleges

from which INDTECH hires and contacted the colleges' representatives to participate in a telephone survey. The researchers

were able to conduct interviews with representatives at seven out of the 10 large city colleges. These were: R.V. College of

Engineering, Bangalore; M.S. Ramaiah Institute of Technology, Bangalore; MLR Institute of Technology, Hyderabad;

Muffakham Jah College of Engineering and Technology, Hyderabad; Vasavi College of Engineering, Hyderabad; G.

Narayanamma Institute of Technology & Science (GNITS), Hyderabad; and Gokaraju Rangaraju Institute of Engineering

and Technology, Hyderabad. The researchers were also able to conduct interviews with representatives at four out of the 10

selected smaller town colleges. These included: M.J.P. Rohilkhand University in Bareilly, Uttar Pradesh; Majhighariani

Institute of Technology & Science, Rayagada, Orissa; Bapatla Engineering College, Guntur, Andhra Pradesh; and Jaya

Prakash Narayan College of Engineering, Mahabubnagar, Dharmapur, Telangana. The survey results indicated that the mean

salaries for 2011 and 2012 were higher for individuals hired from the large city colleges than for those hired from

the smaller town colleges, and a t-test comparison of means found this difference to be statistically significant. In

addition, the survey revealed that multinational technology firms predominantly hire from large city colleges, while

INDTECH follows the distinctive policy of hiring from both large city and smaller town colleges. Note that the results from

the survey might be upward biased given the small sample of colleges that participated in the survey. Rs.1 Lakh = Rs.

100,000.


APPENDIX

E. Percentage of Young Population and Technology Firms across Indian States

                        Percentage of 20-34      Percentage of NASSCOM
                        year olds in India       technology firms that are
State                   who live in the state    headquartered in the state
Uttar Pradesh           0.34                     0.08
Maharashtra             0.23                     0.22
West Bengal             0.19                     0.04
Andhra Pradesh          0.18                     0.11
Bihar                   0.17                     0.00
Tamil Nadu              0.15                     0.10
Madhya Pradesh          0.13                     0.00
Karnataka               0.13                     0.20
Gujarat                 0.12                     0.03
Rajasthan               0.12                     0.01
Orissa                  0.09                     0.00
Kerala                  0.08                     0.01
Assam                   0.06                     0.00
Jharkhand               0.06                     0.00
Punjab                  0.06                     0.00
Haryana                 0.05                     0.12
Chattisgarh             0.05                     0.00
Delhi                   0.04                     0.05
Jammu and Kashmir       0.02                     0.00
Uttarakhand             0.02                     0.00
Himachal Pradesh        0.01                     0.00
Tripura                 0.01                     0.00
Manipur                 0.01                     0.00
Meghalaya               0.01                     0.00

Notes. This table tabulates the percentage of population in the age group of 20-34 year olds living in each Indian state and

the corresponding fraction of information technology firms listed with the National Association of Software and Services

Companies (NASSCOM) in India. States with less than 0.01% of the population in the 20-34 age group were dropped from

the table. This data was collected by the researchers.


APPENDIX

F. Replicating Table 3 using wild bootstrap for OLS with clustered standard errors

(1) (2) (3) (4) (5) (6) (7) (8) (9)

VARIABLES Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9

all_small_town 0.17** 0.17** 0.13** 0.16** 0.18*** 0.16** 0.17*** 0.17** 0.11**

(0.05) (0.06) (0.04) (0.05) (0.05) (0.06) (0.04) (0.06) (0.04)

cgpa_training 0.53*** 0.50***

(0.04) (0.06)

score_logical 0.02** 0.02** 0.02**

(0.01) (0.01) (0.01)

score_verbal 0.01*** 0.00 0.00

(0.00) (0.00) (0.00)

scheduled_caste 0.17** 0.18***

(0.06) (0.04)

male 0.05 0.06

(0.05) (0.05)

Constant 2.23*** 2.17*** -0.23 2.06*** 2.10*** 2.05*** 2.06*** 2.14*** -0.34

(0.04) (0.03) (0.21) (0.04) (0.03) (0.03) (0.05) (0.03) (0.27)

Observations 540 540 540 511 511 511 540 540 511

R-squared 0.02 0.04 0.14 0.05 0.04 0.05 0.06 0.04 0.17

Wild p-value 0.008 0.017 0.027 0.097 0.018 0.096 0.013 0.019 0.042

Robust standard errors in parentheses. ***p < .01, **p < .05, *p < .1. This table replicates the results reported in Table 3 using a wild bootstrap for OLS with clustered standard errors. When bootstrapping, Webb random weights were applied to the residuals from the base regression, and 1,000 bootstrap repetitions were performed for each test.
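A wild cluster bootstrap with Webb weights can be sketched as below for a regression with a single regressor. This is an illustrative reconstruction, not the authors' estimation code: it assumes the null hypothesis of a zero slope is imposed when forming residuals, uses a bootstrap-t statistic with a cluster-robust standard error without small-sample correction, and runs on synthetic data generated inside the block. The six-point weights, each drawn with probability 1/6 per cluster, are the standard Webb distribution.

```python
import math
import random

# Webb six-point weights: mean 0 and variance 1, one draw per cluster.
WEBB_WEIGHTS = [-math.sqrt(1.5), -1.0, -math.sqrt(0.5), math.sqrt(0.5), 1.0, math.sqrt(1.5)]

def slope_t_stat(x, y, clusters):
    """OLS slope of y on x (with intercept) and its cluster-robust t statistic."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    xd = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in xd)
    slope = sum(d * yi for d, yi in zip(xd, y)) / sxx
    intercept = y_bar - slope * x_bar
    scores = {}  # per-cluster sums of demeaned x times residual
    for d, xi, yi, g in zip(xd, x, y, clusters):
        scores[g] = scores.get(g, 0.0) + d * (yi - intercept - slope * xi)
    se = math.sqrt(sum(s * s for s in scores.values())) / sxx
    return slope, slope / se

def wild_cluster_bootstrap_p(x, y, clusters, reps=1000, seed=0):
    """Two-sided bootstrap p-value for a zero slope, with the null imposed (an assumption)."""
    rng = random.Random(seed)
    _, t_obs = slope_t_stat(x, y, clusters)
    y_bar = sum(y) / len(y)
    restricted_resid = [yi - y_bar for yi in y]  # residuals from the intercept-only model
    groups = sorted(set(clusters))
    exceed = 0
    for _ in range(reps):
        w = {g: rng.choice(WEBB_WEIGHTS) for g in groups}
        y_star = [y_bar + w[g] * u for u, g in zip(restricted_resid, clusters)]
        _, t_star = slope_t_stat(x, y_star, clusters)
        if abs(t_star) >= abs(t_obs):
            exceed += 1
    return exceed / reps

# Synthetic data for illustration only: 8 clusters of 25 observations each.
data_rng = random.Random(1)
x, y, clusters = [], [], []
for g in range(8):
    center_effect = data_rng.gauss(0, 0.2)
    for _ in range(25):
        xi = 1.0 if data_rng.random() < 0.5 else 0.0
        x.append(xi)
        y.append(2.2 + 0.17 * xi + center_effect + data_rng.gauss(0, 0.6))
        clusters.append(g)

p_value = wild_cluster_bootstrap_p(x, y, clusters)
print(p_value)
```

Drawing one weight per cluster, rather than one per observation, preserves the within-cluster correlation of the residuals in each bootstrap sample, which is the reason this procedure is preferred when the number of clusters is small.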