Download - © Copyright by Somdeep Chatterjee May 2016
_______________
A Dissertation
Presented to
The Faculty of the Department
of Economics
University of Houston
_______________
In Partial Fulfillment
Of the Requirements for the Degree of
Doctor of Philosophy
_______________
By
Somdeep Chatterjee
May, 2016
THREE ESSAYS ON ISSUES IN DEVELOPING ECONOMIES: CREDIT
REFORMS, ELECTIONS AND WAGE DISPERSION
THREE ESSAYS ON ISSUES IN DEVELOPING ECONOMIES: CREDIT
REFORMS, ELECTIONS AND WAGE DISPERSION
_________________________ Somdeep Chatterjee
APPROVED:
_________________________ Aimee Chin, Ph.D. Committee Co-Chair
_________________________ Gergely Ujhelyi, Ph.D.
Committee Co-Chair
_________________________ Chinhui Juhn, Ph.D.
_________________________ Francisco Cantú, Ph.D.
Department of Political Science University of Houston
_________________________ Steven G. Craig, Ph.D. Interim Dean, College of Liberal Arts and Social Sciences Department of Economics
ii
_______________
An Abstract of a Dissertation
Presented to
The Faculty of the Department
of Economics
University of Houston
_______________
In Partial Fulfillment
Of the Requirements for the Degree of
Doctor of Philosophy
_______________
By
Somdeep Chatterjee
May, 2016
THREE ESSAYS ON ISSUES IN DEVELOPING ECONOMIES: CREDIT
REFORMS, ELECTIONS AND WAGE DISPERSION
iv
Abstract
This dissertation presents three essays on some important issues in two develop-
ing economies, India and Ghana. In the first essay I analyze a major agricultural credit
reform in India, known as the Kisan Credit Card (KCC) scheme, which intended to
simplify the process of credit delivery in the agricultural sector. I use plausibly exoge-
nous variation in the reach of the program to find the causal effects of the policy on
agricultural output and technology adoption using a district panel data set. I also use
a household dataset to analyze the effects of differential exposure to this policy on a
wide range of household outcomes. I find evidence of increases in agricultural output
of rice, which is the major crop of the country. I also find that on average the use of
high yielding variety (HYV) seeds increases at the district level providing evidence
of technology adoption. The increase in output at the district level is corroborated by
suggestive increase in sales revenue and output of rice farmers at the household level.
In addition, I find evidence that bank borrowing increased for households due to this
program and the estimated effects on income and production are higher for such a
sub-sample of borrowers.
In the second essay, co-authored with Gergely Ujhelyi and Andrea Szabó, we try
to answer the question as to why people vote? One possibility is that they derive con-
sumption utility from doing so, but isolating this has proven empirically challenging.
In this paper we study a recent natural experiment in India, where legislative elections
have to provide a "None Of The Above" (NOTA) option to voters. Using the fact that
NOTA cannot affect the electoral outcome we show that studying individual voters’
behavior with and without NOTA provides a way to identify various components of
the consumption utility of voting. To address the challenge that individual votes are
v
not observable, we borrow techniques from the Industrial Organization literature to
estimate a structural model of voter demand for candidates and perform counterfac-
tual simulations removing the NOTA option. We complement this with a reduced-
form analysis of NOTA in a difference-in-differences framework, exploiting variation
in the timing of the reform created by the electoral calendar. Using both methods,
we find that NOTA increased turnout. We find minimal substitution between candi-
dates and NOTA, indicating that NOTA votes are cast by new voters who turn out
to vote specifically for this option. This indicates the presence of an option-specific
consumption utility of voting.
In the final essay, I use a matched employer-employee dataset from the Ghanaian
manufacturing sector to analyze earnings dispersion in Ghana from 1992-2003, a pe-
riod post extensive economic reforms. I find that variance of earnings increased from
1992-1998 and decreased thereafter resembling an inverted u-shaped relationship. I
use ANOVA and variance decomposition approaches to understand the underlying
factors that led to such a pattern in earnings inequality. I find that between-firm
factors explain this pattern more than within-firm factors. I also find that the mean
earnings gap between workers above and below the 90th percentile of income distri-
bution can explain majority of the initial surge in inequality (61%) but only explains a
very small fraction of the eventual decline (9%). I run modified Mincerian regressions
and decompose the variance components to find that the decline in earnings inequal-
ity is consistent with decline in variance of firm level earnings whereas variance of
predicted wage from worker characteristics have increased. I also find suggestive evi-
dence of changing patterns of worker-firm sorting which contributes to the decline in
inequality.
vi
Acknowledgements
The role of the Chair of the committee without doubt deserves the most plaudits
for a successful dissertation of a PhD student. I consider myself extremely lucky to
have had not one, but two people serve as co-Chairs of my committee. This not only
doubled my impetus towards achieving this target but also made me doubly more
determined to put my best foot forward towards producing this research output. As a
result I cannot thank Aimee Chin and Gergely Ujhelyi enough for acting as co-chairs
of my committee and guiding me at every step throughout my graduate student life
at the University of Houston. Professor Chin has truly been an academic mentor to
me from whom I could seek help for almost anything related to my career, not just my
research. Prof Ujhelyi on the other hand was the one who first encouraged me to start
thinking about research, right in my first year as a graduate student. The fact that I
started attending the Empirical Microeconomics seminar series every Tuesday right
from then is owed to him.
I must acknowledge the very important role played by the third member of my
dissertation committee, Chinhui Juhn. It was with her that I did my first independent
study in Grad school and the way she walked me through the papers we read together
and discussed every Friday will remain one of the most fruitful experiences of my life.
I must also take a moment to specially thank Janet Kohlhase for being super support-
ive during the lowest point of my days here at U of H. I also thank Bent Sørensen for
encouraging and uplifting me particularly at that time and for being a great Grad Di-
rector to all of us in general. I thank Amber Pozo for her great administrative support
throughout my career at University of Houston.
vii
I would also like to thank Andrew Zuppann for his comments and suggestions
at various times and for giving the chance to present preliminary work in the Brown
Bag sessions and thank all the attendees. I have benefitted greatly from all of their
comments which I will perhaps miss the most, apart from the delicious food! I thank
all the other faculty members of the department of Economics for providing feedback
at various times during seminars. I would be failing in my duty if I do not specially
mention Andrea Szabó for being an amazing co-author. I am grateful to Francisco
Cantú from the Department of Political Science for being the external member of my
dissertation committee. Life at UH would not have been as great had it not been
for my fantastic classmates and colleagues. I thank Chon-Kit for the many useful
discussions. I thank Bocong for being such a great friend and confidante and Subash
for being extremely supportive. For the innumerable memories at UH, within and
outside the department, I must thank Gautham, Vinh, Max, Xuejing, Shoumen, Aritri,
Indrajit, Debashis and so on but I apologize to everyone whose names I cannot take
due to paucity of space.
While one is at grad school, he requires a solid support system from outside and
back home. I was lucky to have one. I cannot express in words the contribution of my
parents, Srabani and Somnath. Everything that I am today is due to them and there
are no other contenders befitting for me to dedicate this thesis to. I must thank my
professors from my alma mater in India, Ajitava Raychaudhuri, Saikat Sinha Ray and
Swapnendu Banerjee for believing in me, my greatest buddy and childhood friend
Sayantan for being there at all times, my closest friend from college Saumik for being
who he is.
Last but in no way the least, I was blessed to have unbelievable grandparents for
viii
super important emotional support and much needed pampering at times. The con-
tribution of Dadai, Chotdadu and Monididi cannot be quantified here. Even though
Mammam could not live to see this day, I am sure she has always been with me from
wherever she is now and would have been the happiest person on earth today.
ix
Contents
1 Effects of Agricultural Credit Reforms on Farming Outcomes: Evidence from
the Kisan Credit Card Program in India. 1
1.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.1 The Kisan Credit Card Program . . . . . . . . . . . . . . . . . . . 6
1.2.2 Conceptual Framework and Related Literature . . . . . . . . . . . 9
1.3 Empirical Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.4 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.5.1 Results using District-Panel Dataset . . . . . . . . . . . . . . . . . 21
Effects on Crop Production . . . . . . . . . . . . . . . . . . . . . . 21
Technology Adoption . . . . . . . . . . . . . . . . . . . . . . . . . 22
Threats to Identification: Check for Pre-Trends . . . . . . . . . . . 23
1.5.2 Do Households Change their Borrowing Patterns? . . . . . . . . . 24
Total Borrowing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Analysis of the Largest Loans . . . . . . . . . . . . . . . . . . . . . 26
1.5.3 Effects on Household Production, Income and Consumption . . . 27
Rice Production and Sales . . . . . . . . . . . . . . . . . . . . . . . 27
Income . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
x
Consumption Expenditures . . . . . . . . . . . . . . . . . . . . . . 29
1.5.4 Falsification Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2 “None of the Above" Votes in India and the Consumption Utility of Voting
(with Gergely Ujhelyi and Andrea Szabó) 46
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.2 The consumption utility of voting . . . . . . . . . . . . . . . . . . . . . . . 51
2.3 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.3.1 The Indian NOTA policy . . . . . . . . . . . . . . . . . . . . . . . . 55
2.3.2 NOTA-like options in other countries . . . . . . . . . . . . . . . . 56
2.3.3 Assembly elections in India . . . . . . . . . . . . . . . . . . . . . . 58
2.4 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.4.1 Samples used for analysis . . . . . . . . . . . . . . . . . . . . . . . 61
2.4.2 Election data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.4.3 Voter demographics . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.5 Patterns in the data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.5.1 NOTA votes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
2.5.2 The effect of NOTA on turnout . . . . . . . . . . . . . . . . . . . . 67
2.6 Estimating the effect of NOTA from a demand system for candidates . . 70
2.6.1 Specification: demand . . . . . . . . . . . . . . . . . . . . . . . . . 71
2.6.2 Specification: supply . . . . . . . . . . . . . . . . . . . . . . . . . . 74
2.6.3 Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
2.6.4 Practical issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
xi
2.6.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Parameter estimates . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Counterfactual analysis: The impact of NOTA . . . . . . . . . . . 82
2.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
2.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
2.9 Appendix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
2.9.1 The correlates of NOTA votes . . . . . . . . . . . . . . . . . . . . . 90
2.9.2 Robustness of the DD estimates . . . . . . . . . . . . . . . . . . . . 91
National elections . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Redistricting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
State-specific events . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3 Firm Ownership and Wage Dispersion: Evidence from Ghana Using Matched
Employer Employee Data 110
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
3.2.1 The Ghanaian Manufacturing Sector . . . . . . . . . . . . . . . . . 113
3.2.2 Related Literature and Conceptual Framework . . . . . . . . . . . 115
3.3 Empirical Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
3.3.1 Analyis of Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
3.3.2 Percentile Analysis: Contribution of Earnings Gap to Variance . . 117
3.3.3 Worker Characteristics and Firm Fixed Effects . . . . . . . . . . . 117
3.4 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
3.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
3.5.1 Within-Firm and Between-Firm Effects . . . . . . . . . . . . . . . 120
xii
3.5.2 Percentile Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
3.5.3 Returns to Schooling and Earnings Dispersion . . . . . . . . . . . 124
3.5.4 Does Predicted Wage Dispersion by Worker and Firm Charac-
teristics Vary by Ownership Structure? . . . . . . . . . . . . . . . 124
3.6 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
3.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
xiii
List of Figures
1.1 Crop Production: Year-specific coefficients for aligneds ·morebanksd . . 38
1.2 HYV Use: Year-specific coefficients for aligneds ·morebanksd . . . . . . . 39
2.1 Constituencies in the merged dataset . . . . . . . . . . . . . . . . . . . . . 102
2.2 Distribution of NOTA vote shares across constituencies . . . . . . . . . . 107
2.3 Impact of NOTA on turnout . . . . . . . . . . . . . . . . . . . . . . . . . . 108
2.4 Impact of NOTA on candidates’ vote shares . . . . . . . . . . . . . . . . . 109
3.1 Share of Firms by Ownership Type . . . . . . . . . . . . . . . . . . . . . . 129
3.2 Time Series Plots of Firm Averages by Ownership Types: Firm Dataset
of 200 Firms over 12 Years . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
3.3 Variances of Log Real Hourly Earnings by Ownership Type . . . . . . . . 136
xiv
List of Tables
1.1 Comparing Means of statewise spread of KCC in 2000 by aligned . . . . 37
1.2 VDSA District Dataset: Effects on Crop Production and HYV Use . . . . 40
1.3 IHDS Dataset: Effects on Borrowing Composition . . . . . . . . . . . . . 41
1.4 IHDS Dataset: Effects on Rice Production and Sales . . . . . . . . . . . . 42
1.5 IHDS Dataset: Effects on Income . . . . . . . . . . . . . . . . . . . . . . . 43
1.6 IHDS Dataset: Effects on Consumption Expenditure . . . . . . . . . . . . 44
1.7 Falsification Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.1 Timeline of events in the study period . . . . . . . . . . . . . . . . . . . . 95
2.2 Summary statistics of the electoral data at the constituency level . . . . . 96
2.3 Candidate characteristics in the panel data . . . . . . . . . . . . . . . . . 97
2.4 Voter demographics at the state level (repeated cross-section) . . . . . . . 97
2.5 Demographic characteristics of the constituencies from the Indian Cen-
sus (panel dataset) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
2.6 The impact of NOTA on turnout, DD estimates . . . . . . . . . . . . . . . 99
2.7 Estimates of the linear parameters of the demand system . . . . . . . . . 100
2.8 Estimates of the nonlinear parameters of the full model . . . . . . . . . . 101
2.9 Impact of NOTA on vote shares by party . . . . . . . . . . . . . . . . . . 103
2.10 The correlates of NOTA votes . . . . . . . . . . . . . . . . . . . . . . . . . 104
xv
2.11 Effect of NOTA on turnout, excluding national election years . . . . . . . 105
2.12 Effect of NOTA on turnout, controlling for redistricting . . . . . . . . . . 105
2.13 Effect of NOTA on turnout, robustness to state-specific events . . . . . . 106
3.1 Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
3.2 Analysis of Variances of Log Real Hourly Earnings of Workers . . . . . . 131
3.3 Variances of Log Real Hourly Earnings of Workers by Ownership Type . 131
3.4 Contribution of Earnings Gap to Variance in Earnings . . . . . . . . . . . 132
3.5 Returns to Schooling and Experience . . . . . . . . . . . . . . . . . . . . . 133
3.6 Variance Decomposition: Estimated Firm Effects and Predicted Wage
from Worker Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . 134
1
Chapter 1
Effects of Agricultural Credit Reforms
on Farming Outcomes: Evidence from
the Kisan Credit Card Program in India.
1.1 Introduction
Providing access to financial resources to the poor continues to be an important pol-
icy prescription in the literature even though the empirical evidence on impacts of
credit constraint relaxations and expansion of credit options for the unconstrained in
developing countries is mixed (Karlan and Murdoch 2009). To design effective poli-
cies that provide or expand access to credit, one would first need to understand the
mechanisms through which credit access helps the poor and also the impact of im-
plementing such a policy on the targeted beneficiaries. To estimate these impacts, the
ideal experiment would be to randomly provide credit products to households and
compare the outcomes of the ones getting access to the ones without access to this
product. The January 2015 issue of AEJ Applied has six papers on this subject using
randomized evaluations.1 These papers find little to no impact of providing access to
finance. Other papers like de Mel, Mckenzie and Woodruff (2008) using experimental
designs find positive impacts.
While randomized evaluations are ideal to identify the causal effects of credit con-
straint relaxations, by design these cater to a relatively small sample of the entire pop-
ulation. Whether a large scale national reform would replicate these findings is impor-
tant to understand. In this paper, I look at a major overhaul in the agricultural credit
delivery process in India in 1998, known as the Kisan Credit Card (KCC) program,
and evaluate the impacts of this policy. The targeted group for this credit reform was
rural agricultural households, generally involved in farming and other related occu-
pations. Ease of delivering agricultural credit, reasonable interest rates and relaxation
of monitoring norms were the key features of this program. Reports from the Planning
Comission of India (2002) suggest that by 2000-01, KCCs constituted almost 71% of the
total production credit disbursement by commercial banks. It was also the dominant
mode of production credit delivery for other banks. The report also suggests that in
the first two years, close to 4 million credit cards were issued with a total disbursal of
credit lines worth 50 bllion INR (1 billion USD approximately).
Although this was a major policy reform, to date there has been little convincing
evidence of the impacts of this program. Chanda (2012) uses post-policy state level
data from 2004-2009 to see if growth in KCC issues lines up with increases in agri-
cultural productivity. There are other government of India commissioned descriptive
reports like the Planning Commission report mentioned above and Samantara (2010).
In this paper, I use a country wide district panel dataset to evaluate the causal effects
1All the 6 papers are cited in the references.
2
of this program on agricultural output and technology adoption. I also use household
data to estimate the impacts of this program on a wide range of outcomes including
income, consumption and borrowing.
The reach of formal financial institutions is not universal in most developing coun-
tries. This is because banks would want to select into richer regions unless they are
administratively required to setup branches in unbanked locations. This makes for-
mal credit markets less accessible to the poor in these areas. The KCC reform therefore
provides an opportunity to add to this literature of how access to formal credit insti-
tutions can help the poor sections of the society in line with Burgess and Pande (2005).
The unique feature of the KCC program was that it catered exclusively to the agricul-
tural sector. Although in this paper, I am not able to distinguish whether the effects
of KCC operate through channels of new access to credit or expansion of credit to the
ones who already had some access. As a result most of my estimations should be
viewed as a bundle of reduced form effects.
This paper takes advantage of rules in implementation of the policy to generate
plausibly exogenous variation in access to this program to identify causal effects of
the reform. The identification strategy relies on variation across three main dimen-
sions. First is the time dimension. The policy was implemented in 1998 and I look at
the outcomes in years before and after the policy. Second, is the political alignment di-
mension, ie, whether the state government is ruled by a party aligned with the central
government in the federal structure of India. Political alignment has been widely re-
garded to be important for policy implementation and performance (see Chibber et al
2004, Iyer and Mani 2012 and Asher and Novosad 2015). The final source of variation
comes from how the rolling out of these credit cards was implemented. The KCCs
3
could only be issued through formal banks and not by any other agency. I use district
level variation in the number of bank branches already setup prior to the policy to
proxy for access to this program.
I propose to identify the causal effect of the policy by the interaction of these three
variables. The effect is identified by looking at the difference in outcomes after and
before the policy in districts with more bank branches over districts with fewer bank
branches in states that are ruled by political parties aligned with the central govern-
ment after controlling for these differences in districts in the states not aligned with
the center. I use pre-policy data to show that these regions were not already different
along the relevant dimensions to provide support to the identifying assumption that
any differences post-policy are attributable to the program.
I find that increased access leads to significantly higher production levels. Rice is
the major crop of India and I find an aggregate increase in production by 88 thousand
tonnes (metric ton) per year on average which is between 1/3 to 1/4 of an increase
compared to the mean. 2 Corresponding to this large change, I find that technology
adoption has also been significant. Crop production area under high yielding variety
(HYV) seeds increased by around 71 thousand hectares at an aggregate level which
is just under a 1/3 increase compared to the mean. This suggests that with increased
access to credit, districts exposed to the program fared significantly better in terms
of porduction and technology adoption. Using houseold data, I corroborate some of
these results. I find suggestive evidence of increases in rice production for farmers
even though estimated imprecisely. I am constrained by the fact that the household
2The Food and Agricultural Organization’s FAOSTAT indicates that in 2012, the value of rice pro-duced in India is over 40 billion US dollars which makes it the most valuable crop of India. Rice hasconsistently been the major crop of India in terms of overall value for years. See - faostat.fao.org
4
data comes only from a sample of farmers and not the universe, unlike the district
panel data described above which contains all rice production in the districts. I find
that revenue from sales of rice is higher for farmers potentially exposed to KCC.
The advantage of using household data is being able to observe borrowing pat-
terns. Using a cross-section of households, I find that households are more likely to
have fewer but larger loans with exposure to KCC. I also find that they are more likely
to have larger bank loan sizes if exposed to KCC. These effects seem to be larger for
those households which report cultivation as their main source of income and for rice
farmers. This is reassuring because most of the production effects observed using the
district data seem to suggest that rice farmers would be most affected by this policy.
I find that even though on average there is no effect on income but farm income
is higher by 129 Indian rupees (USD 2) per month for households whose main in-
come source was cultivation. Compared to the mean, the magnitude of this effect is
almost 25%. This is consistent with the findings on production and sales. With higher
sales, we might expect higher profits, ceteris paribus. I find no overall impacts on con-
sumption expenditure but composition of consumption changes. I find higher daily
expenditures but lower expenditure on consumption like tobacco and beetel leaves.
I do not find any effects on the margin of whether a household is likelier to have a
bank loan in response to the policy. Since I do find that households have a higher bank
loan size conditional on borrowing, this allows me to analyze a sub sample of house-
holds seperately. I look at all the outcomes for only those households who borrow
from banks and I find that all the effects described above are much more pronounced
for this group. Since KCC had to operate through banks, this gives us confidence that
our estimated effects are likely to be mediated through bank borrowing which should
5
include KCC borrowing.
The rest of the paper is organized as follows. Section 2 provides background infor-
mation. Section 3 describes the empirical strategy. Section 4 explains the Data. Section
5 presents results and Section 6 concludes.
1.2 Background
1.2.1 The Kisan Credit Card Program
Agriculture constitues roughly a fifth of the total GDP of India and employs two out
of three Indian workers. In the late nineties, agriculture started opening up to the mar-
ket rather than being limited to subsistence farming. Agricultural credit has played an
important role in developing the market for such produce and help improve the con-
dition of farmers in the country. However, the finance and credit institutions present
in the country prior to 1998 were deemed inefficient by several reports and experts and
as a result the Kisan Credit Card program was envisaged. This scheme was launched
in 1998 and was introduced for the first time in the budget speech of the Finance Min-
ister of India in the parliament. Within a year after its inception around 5 million cards
were issued to farmers. Prior to 1998, the system of agricultural credit delivery was
complicated. A multi agency approach was used where borrowers had to go through
several layers of bureaucracies depending on the purpose of their loans (Samantara
2010). KCC also brought about a revolving credit regime as opposed to the existing
demand loan system (Chanda 2012).
At its inception, the KCC was not a traditional credit card that is commonly used.
The card was a mere documentation for identifying the individual and his credit line
6
with a given bank. It did not have features that allowed payments at merchant outlets.
This also makes the presence of banks an important dimension for identifying the
intensity of reach of the program. The way to use a KCC was to visit the bank branch
in person and withdraw a certain amount of money which could then be used for
purchases. This also ruled out the possibility of banks monitoring the usage of the
loans.
The most important feature of this credit product was the ease of availability of
loan. Some banks laid down rules for eligibility like having title to an acre of irrigated
land. On fulfiling this criterion, the farmer would be eligible for a loan with a bank
without any collateral requirement for an amount upto 50,000 INR (around 1000 USD
back then). The KCC accounts were largely valid for 3 years and repayment time
frames spanned upto a year. On successful replayments and responsible credit use,
these accounts were renewable but the initial approval was given largely without any
bacground checks. As pointed out above, a big difference from existing crop loans
was that the usage of the KCC loans were not monitored whereas most agro-credit
was tied to agricultural use or purchase of inputs, fertilizers etc. So, a farmer could
get a KCC account and use the amount for personal consumption.
In a way KCC provided the best available source of personal credit to poor farmers.
The biggest advantage over microfinance institutions were that KCC was operational
through formal banks and charged a very reasonable interest rate of around 7% per
annum as opposed to as large as 36-40% rates charged by self help group microcredit
institutions. The approval process was also very simple and was a single window
exercise as the only criterion was ownership of an acre of irrigated land. Many banks
have recorded allowance of credit limits in excess of 50,000 INR but in such cases
7
they often asked for collaterals. Therefore larger scale farmers who are financially in
a better off situation were only likely to go for these loans. There was no clause to my
knowledge which restricted large farmers from opening a KCC account.
Samantara (2010) points out that a major reason why KCC was launched was to
integrate the various credit needs of farmers, from personal consumption to festival
expenditure, education, health and agricultural needs, into one comprehensive prod-
uct. Earlier a farmer had to weigh multiple options based on the purpose of his loan.
KCC made it a one stop procedure wherein he could withdraw the requisite amount
and use it for any purpose whatsoever. All the bank cared about was the timely re-
payment and not the usage. This was a major shift from the pre-existing agro-credit
policy in India which was called the Agricultural Credit Delivery System. Under that
system, a multi-product multi-agency approach was adopted. Policy makers in the
country had planned this in a way such that specific needs of farmers could be ad-
dressed by specific credit products. A farmer could go to a bank for purchase of a
particular input and get a loan against that purchase. The idea was more like financ-
ing purchases rather than giving out cash loans. From such a scheme KCC came as
a welcome change which sought to replace the multi-product approach in favor of
a cash credit approach in a single comprehensive product. As might be already evi-
dent from this discussion, KCC was intended to address the short term credit needs of
farmers and not the longer term needs. Since there was no monitoring, one could not
rule out the possibility of withdrawing cash from these accounts and using them for
consumption purposes. At present, Kisan Credit Cards are available as differentiated
products with various banks coming out with various varieties and features.
Overall the Kisan Credit Card program should be viewed as a bundle of reforms in
8
one. It not only aimed to relax credit constraints by making loans available to the ones
constrained prior to 1998 but also provided a source of flexible credit. KCCs could
potentially finance a lot of purchases, not just agricultural inputs and therefore have
wider social consequences. Since KCCs were a source of cheaper credit, one might
also view it as expansion of credit options for the ones already having access to other
forms of credit. Unconstrained farmers may now be attracted to borrow at cheaper
rates and finance their short term credit needs.
1.2.2 Conceptual Framework and Related Literature
To estimate the true causal effects of access to credit one would ideally want to gen-
erate random variation in access to financial institutions. There is a rich literature
comprising of experimental studies along these lines (Angelucci, Karlan and Zinman
2015, Attanasio et al 2015, de Mel, Mckenzie and Woodruff 2008, Augsburg et al 2015,
Banerjee et al 2015, Crepon et al 2015, Tarozzi, Desai and Johnson 2015). Apart from
this there is a quasi-experimental literature which looks at policy reforms in the formal
financial sector to answer a similar question (Burgess and Pande 2005, Banerjee and
Duflo 2014). Government policy reforms are usually not randomly assigned, there-
fore identifying the causal effects of such programs is challenging even though it is
important to understand the mechanisms behind such policies aimed at removal of
borrowing constraints.
Most recent studies on the role of credit access focus largely on this aspect of mech-
anisms of credit delivery (Karlan and Morduch 2009). This paper is the first to objec-
tively evaluate the Kisan Credit Card scheme using a district panel dataset and ex-
tends this literature by looking at this large scale national reform in credit delivery
9
mechanism. In the Indian context, Banerjee and Munshi (2004) and Banerjee and Du-
flo (2014) study the role of credit constraints on firms and businesses. However the
role of credit constraints in agricultural occupations has been little studied till date.
This paper also contributes to the literature by attemtping to fill this void.
An important question that arises here is whether this program should be viewed
as enhanced ‘access’ to agricultural credit or ‘expansion’ of credit to the ones who al-
ready had access to credit? The existence of credit constraints and impediments to
borrowing are major roadblocks in developing economies which is why governments
may want to innovate by reforming the system of credit delivery. If the main objec-
tive is to improve the condition of the poor, one would imagine that removing the
borrowing constraints would be important, or in other words a program like KCC
should have given ‘access’ to credit to the ones who never had the chance to borrow
before. The starting point of the analysis is to understand how we expect credit access
to affect the credit constrained? If KCC relaxed credit constraints and people unable
to borrow elsewhere could now borrow under this program, economic theory and
existing empirical evidence would lead us to expect multiple effects.
First, if households invest in productive assets or the borrowed funds are used to
finance improvements in technology of agricultural production, we expect their agri-
cultural income to be higher. Second, if we aggregate these effects, overall production
of crops should be higher and overall adoption of new technology should also be
higher. Third, composition of consumption may change. Banerjee et al (2015) find
such evidence in a microfinance experiment but the idea is applicable to a broader
10
country wide setting as well because in essence we are thinking of the impact of relax-
ation of credit constraints per se. Finally, since this was a national level formal lend-
ing program, one would expect that with enhanced access informal lending would go
down and be substituted by more formal sector loans.
The flip side however, is that from a lender’s perspective, such a policy may at-
tract poor quality borrowers. This leads to issues of adverse selection. Asubel (1991)
discusses credit card markets in the US and how lowering interest rates are far from
ideal from a bank’s perspective as bad borrowers may select into borrowing at lower
rates. KCC lending was usually at a much lower rate of interest than market rates or
informal lending rates prevalent among microfinance institutions. This would have
meant that the adverse selection issue was likely to be severe under this program.
Also since new borrowers are unlikely to have ever engaged in credit dealings, their
perception about their own future stream of income determining their repaying abil-
ity is likely to be myopic. Melzer (2011) and Bond, Musto and Yilmaz (2009) point
out these problems about ‘misinformed’ borrowers underestimating their future re-
payment commitments.
It is also important to think about potential general equilibrium effects of this pro-
gram. Are there any spillovers? For example, if some farmers get credit cards whereas
others do not, maybe they have a competitive advantage over the ones who did not get
this card and this might lead to perverse welfare implications. Similarly, if KCCs are
very attractive and result in high profits for farmers, this maybe an incentive for non-
farmers to take up agricultural occupations which in turn may affect non-agricultural
sectors in the rural areas.
11
1.3 Empirical Strategy
There are two parts in my empirical strategy. I have the twin objective of evaluating
the overall effects of access to credit on production outcomes on average and also
whether access to credit through such a reform is useful for intended beneficiaries.
To this end, I use two different datasets. The first is a district panel dataset and the
second is a cross-sectional household dataset.
Identifying the causal effects of having a KCC on agricultural outcomes using sur-
vey data is difficult because KCCs were not randomly assigned to households. Also,
using a cross sectional dataset, it is not possible to use time varying access to the
scheme either. 3 To overcome these issues, I propose an identification strategy that re-
lies on plausible exogenous variation in the reach of this program to find causal effects
of the program. Apart from the time dimension (program introduced in 1998) which
provides variation in the access to the program over the span of the data, there are two
different cross sectional dimensions that give us a sense of which regions might have
had more access to these cards after the policy. I use an interaction of these dimesions
to identify effect of the policy.
The KCC program was announced by the Finance Minister of India in his budget
speech in 1998 and the implementation began soon after. The government at the cen-
ter was ruled by the Bharatiya Janata Party (BJP) led National Democratic Alliance
(NDA) coalition. However, not all state governments were run by the NDA coalition.
Since the implementation of this policy required a lot of work at the grass roots in
terms of setting up infrastructure, spreading awareness, nudging banks to implement
3Even though there is no clear idea even in government documents in terms of how these cards wererolled out.
12
this policy and the like, one can understand that the role that state governments and
officials at the village and block levels who are employed by the state governments
would have had an important role to play in the penetration of this policy in those
states. This gives one potential source of variation in the policy. I use an indicator
variable aligned which takes the value 1 if the state in question was ruled by the BJP
or one of its NDA allies in 1998 and 0 otherwise. The idea is that aligned states would
probably have earlier or quicker access to this policy whereas the opposition parties
may choose to be slack in the policy implementation in the states where they are in
power, out of several motives including the fact that they would want the scheme to
be projected as a failure for the ruling coalition and take advantage of this in future
elections themselves.
The first real governmental study on the program outreach was done in 2002 by the
Planning Commission of India. They published a report with tables on the state wise
coverage of Kisan Credit Cards as of March 2000, which is 2 years into the program.
The coverage rates were basically the number of KCCs issued by various banks as a
percentage of total operational land holdings in the concerned state. So this gave an
idea as to how many farmers were potentially reached or covered under the policy
within the first two years of the policy at a state level. If we observe that aligned states
actually were implementing the policy faster than the other states, we might be more
confident about the use of this dimension to identify the effects of the program. Table
1.1 provides supportive evidence. I find that coverage in aligned states is almost 2.5
times the coverage in rest of the states and the difference is statistically significant at
the 99% level of confidence.
13
The second dimension that I bring to this analysis of variation in access is a techni-
cality that the policy had. These credit cards could only be given out through banks.
So it is understandable that areas with more banks are likely to be able to roll out these
cards faster than the ones which are unbanked or have fewer banks. However, there
may be concerns that banks opened up or positioned or repositioned themselves based
on the policy announcement in markets where KCC lending would flourish more. To
account for this issue I use bank data at the baseline year, ie, 1998 and not after the
policy. I use district level existing bank branches data from 1998 to enumerate the
number of branch offices of banks at the time of announcement of the policy. This
gives us another potential exogenous source of variation in the intensity of coverage
of the program. I create the variable bank98 to denote the number of bank branches
in a given district in 1998 and use the indicator variable morebanks which takes the
value 1 for districts with number of banks above the mean of bank98.
Finally, I use the indicator variable I(Y EAR > 1998) to capture the time of expo-
sure to the policy and controlling for pre-existing differences along the above cross
sectional dimensions over time. I run the following regression for district ‘d’ in state
‘s’ at time ‘t’:
Ydst = αs + δt + β1aligneds ·morebanksd · I(Y EAR > 1998) + β2morebanksd
+ β3aligneds ·morebanksd + β4morebanksd · I(Y EAR > 1998)
+ β5aligneds · I(Y EAR > 1998) + γXdst + εdst (1.1)
The coefficient of interest is β1 which captures the causal effect of the policy on
14
outcomes Y . I use state fixed effects captured by αs. Demographic controls at the
household level are included in X . I control for the number of persons in the fam-
ily, number of children, number of married men and women and also the age and
education levels of men and women.
The interpretation of β1 is that it gives us the difference over time (post- and pre-
policy) in Y for households in districts with more banks compared to households
in districts with lesser banks in aligned states after controlling for these same differ-
ences in non-aligned states. The identifying assumption is that the outcome Y would
not have been different for these groups of households had there been no KCC pol-
icy. There is no standard way to validate this assumption and identification always
assumes this, but the panel structure of the data provides an opportunity to check
whether these districts were historically different and already had differential trends
even before the policy. If we find that before the policy, differences in outcomes along
the above dimensions were not different, we gain confidence that the identifying as-
sumption is plausible. I describe a check for this at a later section and find that before
1998 there were indeed no differences in outcomes in these areas.
The fact that prior to the policy, the cross sectional dimensions seem to be similar,
leads us into the cross sectional analysis. The dataset that I use is from 2005 which is a
post-policy year. I still use the above cross sectional dimensions to generate exogenous
variation in access to the policy but do not have the time dimensions anymore. Since
there were no differences in these regions prior to the policy, any difference that I find
for 2005 can be attributed as a causal effect of the program.
Using the household dataset, I therefore propose to run the following regression
for household h in district d and state s:
15
Yhds = αs + θ1(aligned ·morebanks)ds + θ2(morebanks)d + ωXhds + uhds (1.2)
In this specification, θ1 is the causal impact of the policy on outcomes Y . The
identifying assumption here, similar to above, is that in the absence of the policy,
the differences in household outcomes between districts with more and less banks in
aligned states would not have been any different from the differences in household
outcomes in more and less bank districts in non-aligned states.
The main outcome that I look at is crop production. As mentioned earlier, rice is
the major crop of the country in terms of value. I focus primarily on rice production
but also look at the other important crops like wheat and maize. The idea is that
with access to credit, farmers may be able to invest more and increase output. Since
there is an element of investment behavior attached to credit access, I look at the use
of high-yielding variety (HYV) seeds. If farmers would adopt more HYV seeds to
increase their production, this would be evidence of technology adoption. I observe
all of these outcomes at the district level and use the panel dataset to find effects on
these. The cross sectional dataset however has a wide range of other outcomes that
are of interest. I briefly describe some of those below.
If access to credit leads to higher agricultural production, an immediate hypothesis
that follows is, access to credit leads to higher incomes for farmers. I use the household
survey data to test this hypothesis. I also hypothesize that since KCC is a formal
source of credit, this might lead to crowding out of informal lending sources like local
money lenders and employers. I do not observe usage of HYV seeds at the household
level but a feature of the agricultural sector is that most poor farmers are not able
16
to preserve and/or grow seeds for indigenous production. I hypothesize that with
access to credit, farmers become more efficient and will be able to use home grown
seeds as a result. I also look at various measures of consumption to see if household
consumption expenditure changed with exposure to the policy or if composition of
their expenditure on different types of consumption changed.
1.4 Data
District Production Data
The data for this study mainly comes from 2 sources. First, ICRISAT-VDSA database
provides a district panel data set for agricultural outcomes.4 For this analysis I am
only focussing on production of rice, wheat, maize and use of HYV seeds. The data
contains information on total production, total area under production, gross and net
cropped and irrigated areas„ number of markets in district, rainfall etc. Although
the dataset provides data from 1966-2011, I focus on the post-1985 period. This is
because of two reasons. Firstly, the empirical strategy would require that pre-trends
are accounted for among the geographic classifications used to identify the causal
effect of the program. One would be worried that in years long before the policy,
potential treatment and control groups would have had very different trends in out-
comes which would invalidate the analysis. Also, the period before 1986 marks a long
history of political turmoil including the emergency days and war with neighboring
countries. 1986 gives us a reasonable starting point for the analysis and it is at least 12
years before the KCC program began. Secondly, the dataset for the early 60s and 70s
4The ICRISAT has a rich database known as the Village Dynamics of South Asia (VDSA) and makesthis available for 19 major states of India
17
has lots of missing information, so analysis using those years would in any way lead
to lesser power.
Household Survey Data
The second dataset is the Indian Human Development Survey (IHDS)-2005. The
first official release of the survey was in 2008 for a survey they conducted in 1503 In-
dian villages and 971 urban neighborhoods in the year 2005. So, the data in this edition
of the survey is based on respondents interviewed in 2005. It was jointly conducted
by a team from the University of Maryland, USA and the National Council of Applied
Economic Research (NCAER), India. The 2005 survey covered 41,554 households and
compiled responses from two interviews each of which lasted for an approximate du-
ration of one hour. I have a wide range of outcome variables to look at including
income, consumption per capita, asset ownership, loan and debt details etc. I focus
only on the rural sample and exclude the urban households which yields a sample of
26734 households.
Household Crop Data
The IHDS-2005 also surveyed households to collect data at the crop level. There
are multiple households producing multiple crops. As will become clear later, most
of my main results appear to be driven by rice producers. So I merge the household
survey data with the crop files using only those farmers who produce any rice. For my
regressions using this dataset, I focus on the households below the 99th percentile to
exclude some large outliers. In the sample the mean of rice production for a household
is around 25 units measured in tenths of a quintal, the maximum is 2600 which is
18
unusually high. Therefore, I exclude the large outliers who produce above the 99th
percentile, which is 200 units in tenths of a quintal.
Household Data from 1993
To provide support to my identification strategy, described in the following sec-
tion, I do a falsification exercise using a cross section of households from the 1993
Human Development Profile of India (HDPI) which was a household survey and in-
terviewed several households who would later be reinterviewed in the IHDS.
Other Data
My identification strategy also relies on variations across three dimensions, cover-
age of KCCs, number of bank branches in 1998 and political alignment of state govern-
ments with the center as of 1998. I look up media reports and open source information
available online to match whether the political party ruling a state was part of the rul-
ing coalition at the center.5 I use data from the Reserve Bank of India website to list
the number of bank branches and branch offices in each district. I also use data from
the Planning Commission of India publication of 2002 for state level access to KCCs
by number of land holding covered under the scheme in 2000 to support the idea that
political alignment was important in terms of the reach of the program.
Do households own a Kisan Credit Card?5In particular I look up the name of the Chief Minister of the states in 1998 and note down his
political party. Then I check if that political party was part of the ruling coalition at the center, ie,National Democratic Alliance or NDA.
19
The IHDS-2005 includes a question for households on whether anybody in the
family owns a KCC or not. This is only a dummy variable. The ideal scenario to de-
scribe the true causal effect of access to credit on outcomes would be to do a 2SLS
regression by instrumenting for access to credit. So if the KCC program was an instru-
ment for access to credit, then ideally we would want to run a first stage regression
of access to credit on the identifying variables and divide the reduced form estimates
above by the first stage coefficient. However, regressions using this dummy variable
as the dependent variable should not be interpreted as the ‘first-stage’ because of two
main reasons explained as follows.
First, the ideal first stage we have in mind would be actual borrowings and usage
of the credit card and not the mere possession of this card. The only way that enhanced
access to credit through possession of this card would lead to increases in income is
if people actually borrowed using this card. Second, since we have just a single time
point, the year 2005, which is seven years after the policy was implemented, all the
coefficients reported using this dummy variable would be under-estimates of the first
stage coefficients. For example, if a household had the KCC for 7 years, and we believe
it was constrained prior to that, then the coefficients from the reduced form estimates
I report are relevant over a period of time while the household has benefitted from
access to credit. So if for this household we consider a change in some outcome Y , it is
not just an instantaneous rise but an overall change. If we divide this by the first stage
which just takes into account 1 period of time, the potential 2SLS estimate would be
hugely overestimated. So we would either need to multiply the so called first stage
coefficient by the number of years the household had the card for (the information for
which is unavailable) or deflate the reduced form by some factor.
20
Second, the dummy variable for having a KCC is not the perfect proxy for ‘access
to credit’ which would be the main dependent variable in our structural regression
model to do the 2SLS regressions. It is also quite possible that a single household had
multiple KCCs but this would show up as a 1 on the dummy, the same as a household
with just 1 KCC. To avoid these problems, I do not use this as an outcome variable
in my regressions. However, roughly comparing the means of this dummy variable
in areas potentially exposed more to the program to the areas exposed less, I seem to
find a positive difference, but this is merely suggestive and therefore I do not interpret
this as causal. The mean of this dummy variable for the entire sample is around 4%
which makes any estimation using this as a dependent variable less convincing.
1.5 Results
1.5.1 Results using District-Panel Dataset
Effects on Crop Production
Table 1.2 reports results on reduced form effects of credit access through more ex-
posure to KCC program on crop production outcomes. I run regressions using the
specification in equation 1.1 as above and report the coefficients β1 for each outcome.
Rice is by far the major crop of India in terms of value of output. I find from column
1 that annual district production of rice increases by about 88 thousand tonnes with
more exposure to KCC. This is quite a big effect compared to the mean of 285 thousand
tonnes which suggests that impediments to borrowing severely constrain the scale of
production. One possible interpretation of this is while farmers are credit constrained,
21
they can put a smaller area under crop production, use lesser inputs and have little or
no access to advanced production technology. With access to credit, these are less of
problems and as a result we expect to see a surge in production, to the extent that is
found in Table 1.2. 6
One possible concern could be that there are state specific or district specific time
trends that are driving these results. To address this concern, I allow for trends in
the identifying variables in columns (2) and (3) and I find that the point estimate is
robust. In columns 7 and 8, I look at two other crops and do not find any significant
effect of this policy. Again, a reason could be that these crops are much less important
in terms of value and not all states produce these whereas rice is a more universal
crop in a country like India. So, with access to credit, given rice is more profitable in
India, farmers are expected to invest more in rice production. However, it is reassuring
that even though not significant, the point estimates on these are still positive which
suggests an increase in overall production.
Technology Adoption
A possible mediating channel for an increase in rice production could be adoption
of technology. Existing studies have shown that credit constraints are important hin-
drances in adoption of technology (Croppenstedt, Demeke and Meschi 2003). Mukher-
jee (2012) uses Indian household data to show that access to banks leads to better
6India had a major drought in 2002 which affected several rice farmers. Rainfall was about 56%below normal in July and almost 22% less rain was recorded overall (see Bhat 2006). In general thisshould not impact my analysis. However, there maybe concerns that banked districts in aligned statesmight have responded differently in terms of providing support to the agricultural system and thereforeit confounds the estimate somewhat. I find that the point estimates are not very different if we exclude2002 which alleviates these concerns. These results are not reported but are available upon request.
22
adoption of High Yielding Variety (HYV) seeds in production. Since the KCC pro-
gram intended to provide more credit access, it is interesting to examine whether the
relaxing of credit constraints has a similar effect as Mukherjee (2012) on aggregate.
Column 4 in Table 1.2 suggests that overall crop area put under HYV seeds usage
is higher by 71 thousand hectares with exposure to KCC. This is suggestive evidence
that access to credit leads to some technology adoption. As with overall production,
the point estimate here is also robust to linear de-trending as reported in columns 5-6.
These reduced form effects can be viewed as mediating channels for an increase in rice
production.
Threats to Identification: Check for Pre-Trends
The identification strategy would be invalidated in the case of pre-existing differen-
tial trends in the areas plausibly exposed more to KCCs compared to the ones not
exposed as much. One example would be if some districts in aligned states are tra-
ditional strong holds of the political party in the center, those districts may in any
case get preferential treatment historically and the coefficient we are picking up is not
the true causal effect of the policy. To alleviate concerns such as these, in Figure 1.1 I
plot all the β1 coefficients for crop production outcomes by year instead of interacting
with I(Y EAR > 1998). The dotted lines represent 95% confidence intervals. In other
words, instead of using all the previous years as the omitted reference group, I exclude
the year 1986 and compare the year specific effects with respect to this excluded year.
Each point of the graph represents the following object for year t:
23
[(Yaligned,morebanks − Yaligned,lessbanks)− (Ynonaligned,morebanks − Ynonaligned,lessbanks)]t
− [(Yaligned,morebanks − Yaligned,lessbanks)− (Ynonaligned,morebanks − Ynonaligned,lessbanks)]1986
(1.3)
I find that these coefficients, for all the outcomes are not statistically different from
zero prior to the policy year (marked by a vertical line) and for rice production, they
become positive since 1998. These suggest that the areas identified as exposed more to
KCCs were not systematically different from the areas without as much KCC exposure
as per my identification strategy.
In figure 1.2, I perform the same exercise but for HYV area as an outcome. The
coefficient does not jump at 1998 as sharply as for rice production but at least prior to
1998 it is never significant, which supports the identifying assumption somewhat.
1.5.2 Do Households Change their Borrowing Patterns?
In this section I look at the impacts of these agricultural credit reforms on outcomes
related to borrowing and lending. I report reduced form regressions using the house-
hold data. Unfortunately the district panel dataset (VDSA) does not provide any in-
formation on credit and therefore it is not possible to compare these findings at the
district level. So all of the following analyses are based on the cross sectional dataset.
24
Total Borrowing
Panel A of Table 1.3 reports regression results for the outcomes I discuss here. Through-
out the table I report results for all available households and 2 sub categories. First,
columns titled ‘cultivator’ represent those households whose main income source is
cultivation. Second, columns titled ‘Rice’ are for those households who produce any
rice. I find from columns 1-3 that on average there is no effect on whether people ex-
posed to KCC are more likely to borrow. The dependent variable is based on answers
to the survey question of whether the household had any loans in the last 5 years. The
policy was implemented from 1998 and the survey is based on 2004-05, so it is hard to
make conclusive statements about the estimated coefficients, especially because of the
lack of precision. I also do not find any significant effect on total outstanding debt.
The more interesting results come from columns 7 to 12. I find that on average,
households have lesser number of loans in the last 5 years. For every 2 rice farmers,
I estimate 3 fewer loans with exposure to the KCC program. I do not find any evi-
dence of the policy impacting the margin of whether the main creditor is a bank for
the households. I define bank as the main creditor if the largest loan, conditional on
borrowing, comes from a bank. The fact that this margin is unaffected by the pol-
icy allows me to look at effects of the program on a sub sample of households who
borrow from banks. The KCC policy was expected to operate through banks, so the
households who actually borrow from banks are likely to be affected by this program
the most. I look at outcomes like production, consumption and income for this sub
sample of households in the following sections.
25
Analysis of the Largest Loans
In Panel B, I restrict attention only to the largest loans of households in the 5 years
before the survey. Columns 1-6 focus on the largest loan from any source. The rest
of the columns focus on the largest loans if the source is reported to be a bank. I do
not find any difference in interest rates across the board. Although for bank loans, the
negative coefficient (and the lower mean interest rates) are suggestive that the policy
led to availability of cheaper credit because one feature of the reform was to allow
borrowing at lower rates of interest. Again, these estimates are imprecise, so we have
to be cautious with interpreting these.
The effects on loan size are significant. Not only do I find that the average house-
hold increasingly exposed to KCC borrows almost 9 thousand INR more than the
average household less exposed to KCC, but this number is 16 thousand INR for the
average rice farmer. This is with respect to loans from any source. If I restrict the sam-
ple to largest loans coming from banks, these numbers are considerably higher. The
average household borrowing from banks and exposed more to KCC has a largest loan
that is 41 thousand INR bigger in size than the one with less exposure to KCC. These
numbers are very similar for the cultivator and the rice farmer samples. These results
are consistent with theories of expansion of credit as a result of KCC as well as access
to credit. Whether the higher borrowing is because in the counterfactual households
are constrained or due to the fact that loans are now cheaper cannot be seperated with
this exercise though the point estimates on the borrowing margins in panel A sug-
gests that most of the effect is driven by existing borrowers and not new borrowers.
Eitherway, this helps corroborate the findings on production. If borrowing increased,
irrespective of the channel, we would expect more investment and therefore higher
26
production.
1.5.3 Effects on Household Production, Income and Consumption
It is interesting to examine how the higher borrowing estimated above translates into
spending and income. The following sections are devoted to this exercise. I first check
if production and sales increased for rice at the household level, which was the crop
that appeared to have been most affected by the policy in the district analysis. Then I
estimate effects of the program on household income and finally look at consumption
expenditure.
Rice Production and Sales
I use the IHDS crop level data, as described above, to estimate the reduced form ef-
fects of the KCC program on production outcomes. Results are reported in Table 1.4.
Most estimates are imprecise with large standard errors clustered by dsitrict. I restrict
attention to only those households that produce some positive amount of rice. Col-
umn 1 suggests minimal effects on overall household production levels but if I restrict
the sample to only those farmers who sell their output, as in columns 3 and 5, I find
suggestive evidence of large increases in production levels and revenue. The increase
in revenue is almost 40 thousand INR per year.
In columns 2, 4 and 6 I look at these outcomes for the subsample of bank borrowers
only. I find significant increases in production and revenue from sales of rice. This is
consistent with earlier findings of increase in production at the district level and bigger
bank loans. In the coutnerfactual, if households did not have access to larger loans
prior to the introduction of KCC, they may have faced difficutlies in financing their
27
production technology. With KCC they can secure larger formal sector loans which
allows productive investments and that transpires into higher output and revenue.
Income
Another way to corroborate the idea of higher agricultural output with increased ac-
cess to credit is to see if this translates into effects on household outcomes. Increased
agricultural output is only expected to have welfare effects if there is an observable
increase in income of the farmers. Table 1.5 reports results that look at this dimension.
When I restrict the sample to households whose main income source is cultivation
and look at the reduced form policy effects on incomes from their farms, I find in-
comes higher by 129 INR (USD 2) which is about 25% compared to the mean. This is
approximately a 24 INR monthly increase per capita for rice farmers. 7
I also check for non-farm income and find no effects. If the reduced form effects are
operating through enhanced credit access, especially for households with previously
no access to credit, then we would onlyt expect farm incomes to be higher because the
policy was directed towards farming households.
When I restrict the sample to only bank borrowers, which we have now identified
as the group of people most likely to be affected by the policy, I largely find significant
effects on income. Both per-capita income and per-capita farm income is likely to be
higher for these households if exposed to the KCC program more.
7Effects are imprecise as before but if we compare this to the estimated effects on revenue of ricefarmers we can do some rough calculations. A 24 INR per capita increase in income (profits) of ricefarmers would imply a yearly per capita increase in income of 288 INR. The average household hasfive or six members, so this translates to a total annual profit of around 1600 INR. With estimated salesrevenue increases of 40 thousand INR annually, this implies that costs and investments would havebeen higher by around 38 thousand for rice farmers.
28
Consumption Expenditures
I do not find any effect of enhanced credit access on overall consumption expendi-
tures but as reported in Table 1.6, composition of consumption expenditure changes.
Banerjee et al (2015) in their microfinance experiment find that spending categories are
sensitive to credit access and my results are consistent with their findings in a larger
nationwide setting. Similar to their experimental results, I find a decrease in expen-
diture on what is coined as ‘temptation goods’. These are expenses on tobacco, beetel
leaves etc and credit access has been believed to be a ‘disciplining device’ of sorts and
therefore exposure to credit reduces expenditure on these items. I also find an increase
in expenditure on recurring purchases of day to day household items.
The vast health economics literature also predicts that with increases in income,
stress levels decline and as a result consumption of goods like tobacco and alcohol
would go down (Cotti, Dunn and Tefft 2015). If access to credit led to higher produc-
tion and higher income, it is not surprising that consumption expenditures decrease
on temptation goods.
Expenditure on temptation goods is lower by 29 INR per month whereas day to
day spendng is higher by about 11 INR. For cultivators, this effect is 36 INR and and 16
INR respectively and is estimated with greater precision. The effects are even higher
for the sample of households who borrow from banks. One possible explanation con-
sistent with the findings would suggest that with credit constraints being relaxed,
households now can plan out their future spending stream better and spend money
on more productive uses that would be welfare enhancing in the long run whereas
they cut back on less productive consumption like tobacco etc. Even if the effects
are not through relaxation of credit constraints, expansion of credit could also have
29
similar effects.
1.5.4 Falsification Exercise
In this section, I perform a robustness check for my identification strategy and de-
scribe a falsification test. The identifying assumption for my analysis was that any
differences in districts with more banks compared to districts with less banks in 2005
in aligned states is attributable to the KCC program after controlling for trend differ-
ences in these districts using the non-aligned states. However, one maybe worried
that prior to the policy, these areas were already different and what we are picking
up is an existing trend. Figures 1.1 and 1.2 using the district panel data alleviates this
concern but here I present an alternate test using household data.
The cross sectional data is from the 2005 Indian Human Development Survey
(IHDS). A portion of the households interviewed in 2005 were drawn from an ear-
lier survey known as the Human Development Profile of India (HDPI) conducted in
1993. Since HDPI was in a year before the policy, I use the above identification strat-
egy for the households that can be traced back and run the regression equation (1) for
some comparable outcome variables but for the year 1993. If θ1 is the potential effect
of the KCC policy with the 2005 data, then for the same regression with the 1993 data,
we would not expect it to be significantly different from zero. Columns (1) and (2) of
Table 8 report the θ1 coefficients for the above regression for both 1993 and 2005 data
using the comparable Xs and for the comparable Y s. I report the regression results
for per capita income in Table 1.7.
I find that coefficients are systematically higher in 2005 whereas they are never sig-
nificantly different from zero in 1993. This suggests that the regions being compared
30
in my estimation were not different prior to the policy and any difference arising post-
policy may therefore be attributable as a reduced form impact of the program.
In column (3), I repeat the regressions from column (2) using the same sample but
adding other controls as used in the main analysis above. These additional controls
like number of persons in the family, number of children and married persons were
not available in 1993. I find that most of the effects are still pretty much the same as in
column (2) though the point estimates are marginally bigger.
1.6 Conclusions
In this paper I looked at a major agricultural credit reform in India known as the Kisan
Credit Card policy which simplified the functioning of the agricultural credit market.
A stated goal of the policy was to relax credit constraints on the poor. I used plausibly
exogenous geographic variation in the outreach of the program to identify the causal
impact of the policy. I find evidence that the reform led to large scale increases in ag-
gregate agricultural output. Rice, the major crop of India seems to have been the most
affected with a surge in production post-policy. There seems to have been significant
adoption of technology by putting more area under cultivation to use of HYV seeds.
Using a household dataset, I estimated effects of this policy on borrowing compo-
sitions. I find households are likely to have fewer but bigger loans with exposure to
the program. Also, size of the largest loan coming from banks is bigger for households
in areas exposed to KCC. No significant effects are estimated on interest rates.
I further looked at the impacts of this policy on outcomes like consumption, pro-
duction and income. I find suggestive evidence of increase in rice production and sales
31
revenue. I estimate an increase in farm income of around 129 INR (USD 2) monthly
per capita with enhanced credit access. The reduced form effects of the policy further
suggest that credit access acts a potential disciplining device where people spend less
on unproductive consumption and spend more on productive or investment goods.
There is no effect however, on overall consumption expenditure.
I identify households with a bank loan as the most affected category and find that
all estimated effects are much more pronounced for this sub sample of households
suggesting that a mediating channel for the reduced form estimates are bank loans.
Since KCC by design had to operate through banks, this provides confidence about
our identification strategy picking up the effect of the KCC policy.
32
1.7 References
1. Angelucci, M., Karlan, D. and Zinman, J (2015). ‘Microcredit Impacts: Evidence
from a Randomized Microcredit Program Placement Experiment by Comparta-
mos Banco’ American Economic Journal - Applied Economics
2. Asher, S and Novosad, P. (2015). ‘Politics and Local Economic Growth’ Working
Paper - Dartmouth - http://www.dartmouth.edu/ novosad/asher-novosad-politicians.pdf
3. Asubel, L. (1991). ‘The Failure of Competition in the Credit Card Market’ Amer-
ican Economic Review
4. Attanasio, O., Augsburg, B., De Haas, R., Fitzsimons, E. and Harmgart, H. (2015).
‘The Impacts of Microfinance: Evidence from Joint Liability Lending in Mongo-
lia’ American Economic Journal - Applied Economics
5. Augsburg, B., De Haas, R., Harmgart, H., and Meghir, C. (2015). ‘The Impacts of
Microcredit: Evidence from Bosnia and Herzegovina’ American Economic Journal
- Applied Economics
6. Banerjee, A. V. and Duflo, E. (2014). ‘Do firms want to Borrow More? Testing
Credit Constraints using a Directed Lending Program.’ Review of Economic Stud-
ies
7. Banerjee, A. V., Duflo, E., Glennerster, R. and Kinnan, C. (2015) ‘The Miracle
of Microfinance. Evidence from a Randomized Evaluation’. American Economic
Journal - Applied Economics
33
8. Banerjee, A. V. and Munshi, K. (2004). ‘How Efficiently is Capital Allocated?
Evidence from the Knitted Garment Industry in Tirupur.’ Review of Economic
Studies
9. Bhat, G. S (2006). ‘The Indian Drought of 2002 - a sub seasonal Phenomenon?’
Quarterly Journal of the Royal Meteorological Society
10. Bond, P., Musto, D and Yilmaz, B. (2009). ‘Predatory Mortgage Lending’ Journal
of Financial Economics
11. Burgess, R. and Pande, R. (2005) ‘Do Rural Banks matter? Evidence from the
Indian Social Banking Experiment.’ American Economic Review
12. Chanda, A. (2012). ‘Evaluating the Kisan Credit Card Scheme’ International Growth
Center Working Paper 12/0345
13. Chibber, P., Shastri, S and Sisson, R. (2004). ‘Federal Arrangements and the Pro-
vision of Public Goods in India’ Asian Survey
14. Chin, A., Karkoviata, L and Wilcox, N. (2011) ‘Impact of Bank Accounts on Mi-
grant Savings and Remittances: Evidence from a Field Experiment.’ Working
Paper, University of Houston
15. Cotti, C., Dunn, R and Tefft, N. (2015). ‘The Dow is Killing Me: Risky Health
Behaviors and the Stock Market’ Health Economics
16. Crepon, B., Devoto, F., Duflo, E and Pariente, W. (2015) ‘Estimating the Impact of
Microcredit on Those Who Take It Up: Evidence from a Randomized Experiment
in Morocco’ American Economic Journal - Applied Economics
34
17. Croppenstedt, A., Demeke, M. and Meschi, M. (2003) Technology Adoption in
the Presence of Constraints: the Case of Fertilizer Demand in Ethiopia’ Review of
Development Economics
18. De Mel, S., McKenzie, D. and Woodruff, C. (2008). ‘Returns to Capital in Mi-
croenterprises. Evidence from a Field Experiment.’ Quarterly Journal of Economics
19. Iyer, L. and Mani, M (2012). ‘Traveling Agents: Political Change and Bureau-
cratic Turnover in India’ Review of Economics and Statistics
20. Karlan, D. and Murdoch, J (2009). ‘Access to Finance’. Handbook of Development
Economics - Chapter 2
21. Melzer, B. (2011). ‘The Real Costs of Credit Access: Evidence from the Payday
Lending Market’ Quarterly Journal of Economics
22. Mukherjee, S (2012). ‘Access to Formal Banks and Technology Adoption: Evi-
dence from Indian Household Panel Data’ University of Houston - Working Paper
23. O’Donoghue, T. and Rabin, M. (1999). ‘Doing it Now or Later’ American Economic
Review
24. Planning Commission of India (2002). ‘Support from the Banking System: A
case Study of the Kisan Credit Card’ Study Report 146, Socio Economic Research
Division
25. Samantara, Samir (2010). ‘Kisan Credit Card - A Study” Occasional Paper 52,
National Bank of Agriculture and Rural Development, Mumbai
35
26. Tarozzi, A. Desai, J and Johnson, K (2015). ‘The Impacts of Microcredit: Evidence
from Ethiopia’ American Economic Journal - Applied Economics
36
TABLE 1.1: Comparing Means of statewise spread of KCC in 2000 byaligned
Aligned State Not Aligned State ∆(1) (2) (1)-(2)
KCC Coverage (in percentages) 30.596 12.710 17.886
Standard Deviation 17.223 12.617 H0 : ∆ = 0p-value < 0.001
Notes: KCC coverage is obtained from Planning Commission of India reports. It is calculated as thenumber of KCCs issued as percentage of total operational holdings in a given state in the year 2000. Iuse the definition of aligned as described for states aligned in 1998 and use the coverage figures 2 yearson. This table suggests that aligned states had much higher initial growth of the policy which providessupport to the use of aligned as a dimension of identification.
37
FIGURE 1.1: Crop Production: Year-specific coefficients for aligneds ·morebanksd
-100
-50
0
50
100
150
200
250
1986 1991 1996 2001 2006
Maize Production
95% ConfidenceIntervals
-100
-50
0
50
100
150
200
250
1986 1991 1996 2001 2006
Wheat Production
95% ConfidenceIntervals
Notes: The vertical line marks the policy year whereas the dotted line represent 95% confidence intervals. Regressions use the VDSA district panel dataset and control forcrop specific area under production, rainfall, crop specific iirigated area, markets nearby, gross cropped and irrigated areas and state fixed effects.
38
FIGURE 1.2: HYV Use: Year-specific coefficients for aligneds ·morebanksd
Notes: The vertical line marks the policy year whereas the dotted line represent 95% confidence intervals. Regressions use the VDSA district panel dataset and control forrainfall, markets nearby, gross cropped and irrigated areas and state fixed effects.
39
TAB
LE
1.2:
VD
SAD
istr
ictD
atas
et:E
ffec
tson
Cro
pPr
oduc
tion
and
HY
VU
se
Ric
ePr
oduc
tion
Are
aun
der
HY
VSe
eds
Whe
atM
aize
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
aligneds·m
orebanks d·I
(YEAR>
1998
)88
.7**
*89
.2**
*89
.8**
*71
.5**
*72
.7**
*72
.6**
*9.
996.
51(2
1.4)
(21.
5)(2
1.4)
(20.
8)(2
1.2)
(21.
6)(1
6.1)
(6.7
)
Line
arTr
end
inaligned
Yes
Yes
Yes
Yes
Line
arTr
end
inmorebanks
Yes
Yes
R2
0.94
0.94
0.94
0.75
0.75
0.75
0.97
0.83
Obs
erva
tion
s49
9249
9249
9241
7241
7241
7249
7849
92
Mea
nof
Dep
Var
284.
321
3.7
181.
738
.1
Not
es:A
naly
ses
inth
ista
ble
are
base
don
the
dist
rict
pane
ldat
aset
.Eac
hco
lum
npr
esen
tsa
diff
eren
treg
ress
ion.
Prod
ucti
onfig
ures
are
annu
al,u
nits
1000
tonn
esan
dH
YV
area
isin
term
sof
1000
hect
ares
.A
llre
gres
sion
sin
clud
est
ate
and
year
fixed
effe
cts
and
cont
rolf
oral
lthe
doub
lein
tera
ctio
nte
rms
and
base
line
vari
able
saligned
,after
andmorebanks.
Ico
ntro
lfor
the
area
put
unde
rri
cecu
ltiv
atio
nan
dal
soir
riga
tion
spec
ific
tori
ce.A
ddit
iona
lcon
trol
sin
clud
era
infa
ll,gr
oss
crop
ped
and
irri
gate
dar
ea,p
rese
nce
ofm
arke
ts.C
lust
ered
Erro
rsar
eat
the
dist
rict
leve
inpa
rent
hese
s.**
*p<
0.01
**p<
0.05
*p<
0.1
40
TAB
LE
1.3:
IHD
SD
atas
et:E
ffec
tson
Borr
owin
gC
ompo
siti
on
PAN
ELA
IfBo
rrow
sTo
talO
utst
andi
ngD
ebt
Num
ber
ofLo
ans
Mai
nC
redi
tor
isBa
nk
All
Cul
tiva
tor
Ric
eA
llC
ulti
vato
rR
ice
All
Cul
tiva
tor
Ric
eA
llC
ulti
vato
rR
ice
(1)
(2)
(3)
(4)
(5)
6)(7
)(8
)(9
)(1
0)(1
1)(1
2)
aligned·m
orebanks
-0.0
72-0
.059
-0.1
38*
-0.5
14-7
.984
-0.0
27-1
.351
**-0
.998
-1.5
90**
-.039
-.067
-.068
(0.0
49)
(0.0
60)
(0.0
75)
(6.5
89)
(9.4
41)
(9.6
70)
(0.6
51)
(0.6
67)
(0.7
43)
(0.0
23)
(0.0
47)
(0.0
62)
Mea
n0.
458
0.50
80.
509
38.6
3849
.252
34.8
273.
170
3.23
23.
385
0.12
10.
174
0.15
1
Obs
erva
tion
s21
450
7907
5700
1025
040
2126
7810
111
4147
2827
2145
079
0757
00
PAN
ELB
Onl
yLa
rges
tLoa
nsO
nly
Larg
estL
oans
from
Bank
s
Inte
rest
Rat
esLo
anSi
zeIn
tere
stR
ates
Loan
Size
All
Cul
tiva
tor
Ric
eA
llC
ulti
vato
rR
ice
All
Cul
tiva
tor
Ric
eA
llC
ulti
vato
rR
ice
(1)
(2)
(3)
(4)
(5)
6)(7
)(8
)(9
)(1
0)(1
1)(1
2)
aligned·m
orebanks
0.08
00.
014
0.19
29.
414*
**12
.068
16.7
25*
-0.0
37-0
.085
-0.1
1441
.364
***
37.8
17**
44.1
81**
(0.2
39)
(0.2
07)
(0.3
10)
(4.7
59)
(7.6
72)
(10.
197)
(0.0
72)
(0.0
82)
(0.0
82)
(2.9
73)
(14.
576)
(19.
470)
Mea
n2.
105
1.88
62.
157
32.7
1940
.828
32.9
281.
059
1.07
91.
045
62.9
6165
.314
58.9
79
Obs
erva
tion
s10
114
4145
2827
1011
741
5128
7826
9614
3788
026
9614
3788
0
Not
es:E
ach
colu
mn
repr
esen
tsa
diff
eren
treg
ress
ion.
The
sam
ple
inPa
nelB
incl
udes
answ
ers
toqu
esti
ons
abou
tthe
larg
estl
oan
inla
st5
year
sfo
rth
eho
useh
olds
.M
onet
ary
Val
ues
(for
loan
size
and
outs
tand
ing
loan
s)ar
ein
INR
1000
unit
s.C
olum
ns1-
3in
Pane
lAre
port
regr
essi
ons
whe
reth
ede
pend
entv
aria
ble
isa
dum
my
for
whe
ther
the
hous
ehol
dha
san
ybo
rrow
ing
inth
epa
st5
year
s.To
talo
utst
andi
ngde
btis
the
vari
able
for
how
muc
hth
eho
useh
old
curr
entl
yow
esot
hers
cond
itio
nal
onno
n-ze
roou
tsta
ndin
gde
bt.T
henu
mbe
rof
loan
sva
riab
leis
also
wit
hre
spec
tto
num
ber
oflo
ans
inpa
st5
year
s.Th
ede
pend
entv
aria
ble,
Mai
nC
redi
tor
isBa
nk,
isa
dum
my
indi
cati
ngif
the
larg
estl
oan
ofa
borr
ower
com
esfr
oma
bank
and
take
sth
eva
lue
zero
for
borr
ower
sfr
omot
her
sour
ces
asw
ella
sno
nbo
rrow
ers.
The
coef
ficie
nts
repo
rted
are
foraligned·m
orebanks.
All
regr
essi
ons
incl
ude
stat
efix
edef
fect
san
dco
ntro
lfor
base
linemorebanks
vari
able
.A
ddit
iona
ldem
ogra
phic
cont
rols
incl
ude
num
ber
ofpe
rson
sin
each
fam
ily,n
umbe
rof
child
ren
inea
chho
useh
old,
num
ber
ofm
arri
edm
enan
dm
arri
edw
omen
,age
and
educ
atio
nle
vels
ofm
enan
dw
omen
.C
ulti
vato
rre
pres
ents
the
sam
ple
ofho
useh
olds
who
sem
ain
inco
me
sour
ceis
repo
rted
tobe
cult
ivat
ion
and
allie
dag
ricu
ltur
e.R
ice
farm
ers
are
hous
ehol
dsw
hopr
oduc
ea
posi
tive
amou
ntof
rice
.Clu
ster
edSt
anda
rdEr
rors
are
atth
edi
stri
ctle
vel.
***
p<0.
01**
p<0.
05*p<
0.1
41
TAB
LE
1.4:
IHD
SD
atas
et:E
ffec
tson
Ric
ePr
oduc
tion
and
Sale
s
All
Prod
ucer
sIf
Sells
Out
put
Qua
ntit
yQ
uant
ity
Pric
eX
Qua
ntit
y
Full
Sam
ple
Bank
Borr
ower
sFu
llSa
mpl
eBa
nkBo
rrow
ers
Full
Sam
ple
Bank
Borr
ower
s(1
)(2
)(3
)(4
)(5
)(6
)
aligned·m
orebanks
0.91
710
.809
**7.
719
18.9
82**
40.8
810
7.63
**(4
.033
)(5
.299
)(6
.355
)(7
.636
)(3
8.62
)(4
3.12
)
Obs
erva
tion
s71
1899
725
8346
425
8346
4
Mea
n21
.067
25.2
4440
.736
40.3
6523
8.54
228.
67
Not
es:
The
sam
ple
isre
stri
cted
toon
lyri
cefa
rmer
sin
the
IHD
Sda
tase
t.Ea
chco
lum
nre
pres
ents
adi
ffer
entr
egre
ssio
n.A
llre
gres
-si
ons
incl
ude
stat
efix
edef
fect
san
dco
ntro
lfor
base
linemorebanks
vari
able
.A
ddit
iona
ldem
ogra
phic
cont
rols
incl
ude
num
ber
ofpe
rson
sin
each
fam
ily,
num
ber
ofch
ildre
nin
each
hous
ehol
d,nu
mbe
rof
mar
ried
men
and
mar
ried
wom
en,a
gean
ded
ucat
ion
leve
lsof
men
and
wom
enan
dar
eaun
der
rice
prod
ucti
on.T
hede
pend
entv
aria
ble
inco
lum
ns(1
)to
(4)i
sri
cepr
oduc
tion
inte
nths
ofa
quin
tal.
The
depe
nden
tvar
iabl
ein
colu
mn
5is
the
reve
nue
from
sale
ofri
ceco
ndit
iona
lon
selli
ngri
ce.T
heun
its
are
INR
1000
.Ba
nkBo
rrow
ers
repr
esen
tth
esa
mpl
eof
hous
ehol
dsw
hoha
vebo
rrow
edin
the
past
5ye
ars
and
thei
rla
rges
tlo
anco
mes
from
aba
nk.C
lust
ered
stan
dard
erro
rsat
the
dist
rict
leve
lin
pare
nthe
ses.
Num
ber
ofcl
uste
rsis
282.
***
p<0.
01**
p<0.
05*p<
0.1
42
TAB
LE
1.5:
IHD
SD
atas
et:E
ffec
tson
Inco
me
Per
Cap
ita
Inco
me
Per
Cap
ita
Farm
Inco
me
Per
Cap
ita
Non
Farm
Inco
me
All
Cul
tiva
tor
Bank
Borr
ower
All
Cul
tiva
tor
Bank
Borr
ower
sA
llBu
sine
ssPe
rson
(1)
(2)
(3)
(4)
(5)
6)(7
)(8
)
aligned·m
orebanks
-0.0
160.
053
0.32
3*0.
043
0.12
90.
424*
0.03
0.01
2(0
.069
)(0
.141
)(0
.197
)(0
.063
)(0
.140
)(0
.223
)(0
.060
)(0
.159
)
Mea
n0.
715
0.73
10.
951
0.20
70.
476
0.41
10.
463
0.80
8
Obs
erva
tion
s21
117
7634
2634
2029
172
6225
0137
6967
7
Not
es:E
ach
colu
mn
repr
esen
tsa
diff
eren
treg
ress
ion.
The
coef
ficie
nts
repo
rted
are
foraligned·m
orebanks.
All
regr
essi
ons
incl
ude
stat
efix
edef
fect
san
dco
ntro
lfor
base
linemorebanks
vari
able
.Add
itio
nald
emog
raph
icco
ntro
lsin
clud
enu
mbe
rof
pers
ons
inea
chfa
mily
,num
ber
ofch
ildre
nin
each
hous
ehol
d,nu
mbe
rof
mar
ried
men
and
mar
ried
wom
en,a
gean
ded
ucat
ion
leve
lsof
men
and
wom
en.
Inth
ista
ble,
mon
etar
yfig
ures
are
inm
onth
lyIN
R10
00w
hich
isap
prox
mon
thly
USD
20.
The
sam
ple
size
isva
riab
lein
this
tabl
eba
sed
onth
egr
adat
ions
‘ifcu
ltiv
ator
’and
‘ifbu
sine
sspe
rson
’.A
hous
ehol
dis
deno
ted
asa
cult
ivat
orif
the
resp
onde
ntre
port
edth
atth
eir
mai
nin
com
eso
urce
isfr
omcu
ltiv
atio
nor
allie
dac
tivi
ties
.Si
mila
rly
ifth
ere
port
edm
ain
sour
ceof
inco
me
isbu
sine
ss,I
clas
sify
thes
eho
useh
olds
as‘b
usin
ess
pers
ons’
.Ban
kBo
rrow
ers
repr
esen
tthe
sam
ple
ofho
useh
olds
who
have
borr
owed
inth
epa
st5
year
san
dth
eir
larg
estl
oan
com
esfr
oma
bank
.Clu
ster
edSt
anda
rdEr
rors
are
atth
edi
stri
ctle
vel.
***
p<0.
01**
p<0.
05*p<
0.1
43
TAB
LE
1.6:
IHD
SD
atas
et:E
ffec
tson
Con
sum
ptio
nEx
pend
itur
e
Per
Cap
ita
Mon
thly
Con
sH
ouse
hold
Item
sTe
mpt
atio
nG
oods
All
Cul
tiva
tor
Bank
Borr
ower
sA
llC
ulti
vato
rBa
nkBo
rrow
ers
All
Cul
tiva
tor
Bank
Borr
ower
s(1
)(2
)(3
)(4
)(5
)6)
(7)
(8)
(9)
aligned·m
orebanks
-0.0
019
0.00
020.
0084
0.01
130.
0161
*0.
0215
**-0
.028
3***
-0.0
360*
**-0
.011
4(0
.003
7)(0
.006
4)(0
.007
4)(0
.007
3)(0
.009
3)(0
.010
8)(0
.010
3)(0
.012
5)(0
.017
8)
Mea
n0.
0644
0.06
580.
0854
0.07
290.
0791
0.08
770.
0817
0.08
290.
0928
Obs
erva
tion
s21
437
7899
2695
2143
778
9926
9521
437
7899
2695
Not
es:E
ach
colu
mn
repr
esen
tsa
diff
eren
treg
ress
ion.
The
coef
ficie
nts
repo
rted
are
foraligned·m
orebanks.
All
regr
essi
ons
incl
ude
stat
efix
edef
fect
san
dco
ntro
lfor
base
linemorebanks
vari
able
.Add
itio
nald
emog
raph
icco
ntro
lsin
clud
enu
mbe
rof
pers
ons
inea
chfa
mily
,num
ber
ofch
ildre
nin
each
hous
ehol
d,nu
mbe
rof
mar
ried
men
and
mar
ried
wom
en,a
gean
ded
ucat
ion
leve
lsof
men
and
wom
en.M
onet
ary
figur
esar
ein
INR
1000
unit
she
re.T
hete
mpt
atio
ngo
ods
incl
ude
beet
elle
aves
/tob
acco
etc.
Bane
rjee
etal
(201
5)fin
dth
atm
icro
finan
ceas
adi
scip
linin
gde
vice
redu
ces
per
capi
taco
nsum
ptio
nof
thes
ego
ods
byIN
R9.
The
per
capi
tale
quiv
alen
tof
my
findi
ngof
INR
28re
duct
ion
isIN
R4.
The
HH
item
sin
clud
ere
gula
rpu
rcha
ses
like
soap
s,bu
lbs,
buck
ets,
inse
ctic
ides
etc.
Iuse
the
IHD
S20
05sa
mpl
eof
rura
lhou
seho
lds
whi
chha
ve26
734
obse
rvat
ions
.C
ulti
vato
rre
pres
ents
the
sam
ple
ofho
useh
olds
who
sem
ain
inco
me
sour
ceis
repo
rted
tobe
cult
ivat
ion
and
allie
dag
ricu
ltur
e.Ba
nkBo
rrow
ers
repr
esen
tth
esa
mpl
eof
hous
ehol
dsw
hoha
vebo
rrow
edin
the
past
5ye
ars
and
thei
rla
rges
tloa
nco
mes
from
aba
nk.
Clu
ster
edSt
anda
rdEr
rors
are
atth
edi
stri
ctle
vel.
***
p<0.
01**
p<0.
05*p<
0.1
44
TAB
LE
1.7:
Fals
ifica
tion
Exer
cise
1993
2005
2005
Com
para
ble
Con
trol
sA
llC
ontr
ols
(1)
(2)
(3)
Per
Cap
ita
Mon
thly
Inco
me
0.01
90.
094
0.12
7(0
.045
)(0
.125
)(0
.120
)
Per
Cap
ita
Mon
thly
Inco
me
(ifp
rinc
ipal
occu
pati
onis
cult
ivat
ion)
0.03
70.
292
0.32
9(0
.064
)(0
.198
)(0
.191
)
Not
es:I
run
redu
ced
form
regr
essi
ons
follo
win
gth
esa
me
iden
tific
atio
nst
rate
gyas
desc
ribe
din
the
text
for
2005
and
usin
gth
eH
DPI
data
from
1993
.Ius
eth
esa
me
hous
ehol
ds(8
947)
for
both
year
s.Fo
rth
esa
mpl
elo
okin
gon
lyat
cult
ivat
ors,
Ihav
e37
31ho
useh
olds
.In
colu
mns
1an
d2,
Irun
regr
essi
ons
usin
gon
lyth
eco
mpa
rabl
eco
ntro
lsw
hich
are
age,
educ
atio
nof
mal
e,ed
ucat
ion
offe
mal
ean
dst
ate
fixed
effe
cts.
Inco
lum
n3,
Ire-
run
regr
essi
ons
ofco
lum
n2
usin
gth
esa
me
sam
ple
buta
ddin
gth
eco
ntro
lsus
edin
the
orig
inal
anal
ysis
whi
char
eav
aial
able
for
2005
.The
addi
tion
alco
ntro
lsin
colu
mn
3ar
enu
mbe
rof
pers
ons
infa
mily
,num
ber
ofch
ildre
nan
dnu
mbe
rof
mar
ried
mal
esan
dfe
mal
es.
Ion
lyre
port
the
com
para
ble
outc
ome
vari
able
s.C
lust
ered
stan
dard
erro
rsat
the
dist
rict
leve
lin
pare
nthe
ses.
***
p<0.
01**
p<0.
05*p<
0.1
45
46
Chapter 2
“None of the Above" Votes in India and
the Consumption Utility of Voting(with Gergely Ujhelyi and Andrea Szabó)
2.1 Introduction
One possible solution to the “paradox” of why people bother to vote in large elections
is that voting yields consumption utility. Such consumption utility could be derived
from performing one’s civic duty, expressing one’s political views, or participating
in a democracy (Downs, 1957; Riker and Ordeshook, 1968; Brennan and Lomasky,
1993). Empirically distinguishing these consumption motives from each-other and
from other possible goals, such as a desire to affect the electoral outcome, is notori-
ously difficult. In this paper we propose to do this by using data from a natural ex-
periment in Indian elections and estimating a structural model of voter turnout using
techniques from the consumer demand literature.
We propose to identify different components of the consumption utility of voting
by exploiting a natural experiment in the world’s largest democracy. Following a
decision by the Indian Supreme Court, since September 2013 all state and national
elections in the country must offer a “None Of The Above” (NOTA) option to voters.
Votes cast for NOTA are counted (rather than simply discarded as invalid) but do not
affect the outcome of the election (the winner is still the candidate with a plurality of
votes among votes cast for candidates). In the five states that held local elections after
the Supreme Court ruling 1.7 million voters chose NOTA, and in the 2014 national
election 6 million voters voted for this option (representing 1.1% of all votes cast).
While elections with a NOTA-type option have been used elsewhere, none of them
came close to the scale of the Indian experiment.1
Because in the Indian system NOTA votes cannot affect the outcome of the elec-
tion, voters who choose this option must be motivated by a consumption utility to
vote. Such consumption utility can arise from two broad sources. It can be a general
utility obtained from showing up at the polls (such as complying with a social norm
to participate in the election), or it can be a utility specific to the option chosen by the
voter (such as utility derived from expressing one’s views).
Intuitively, we can distinguish between these two types of consumption utility by
asking how a NOTA-voter would have behaved in the absence of the NOTA option.
If without the NOTA option this voter would have voted for one of the candidates,
this is consistent with both a general and an option-specific utility of voting. By con-
trast if without NOTA this voter would have abstained, then the NOTA vote cannot
be explained by a general utility derived from showing up at the polls. Instead, the
voter must be voting for NOTA in order to obtain a utility specific to this option. Thus,
1In most elections the only way for a voter to participate without voting for a candidate is to cast aninvalid vote and these are difficult to distinguish from voting mistakes. In systems where a NOTA-typeoption is explicitly available to voters, it typically has electoral consequences, affecting who gets electedor whether the election has to be repeated. We review these different systems in section 2.3 below.
47
studying the counterfactual behavior of NOTA voters can test apart these two compo-
nents of the consumption utility from voting.
Our empirical work seeks to test whether, following the introduction of NOTA,
new voters showed up at the polls in order to vote for this option. This question
is challenging because it requires making statements about individual voter behavior
in the counterfactual no-NOTA scenario. Because ballots are secret, individual voter
behavior is observed neither with nor without the NOTA option. Instead, it must be
inferred from aggregate data.
To begin, we first ignore individual behavior and study the impact of NOTA on
aggregate turnout in a reduced form framework. This exercise exploits variation in
the effective timing of the NOTA reform created by the Indian electoral calendar: elec-
tions to the states’ legislative assemblies occur at different times in different states.
This allows us to study the impact of NOTA in a difference-in-differences framework
by comparing the change in voter turnout in states not yet affected by the policy to
changes in those that were already affected. From this analysis we estimate that, in the
average electoral district, the introduction of the NOTA policy significantly increased
turnout. This finding survives a variety of robustness checks and the magnitude of
the effect (2-3 percentage points) is similar to the vote share of NOTA observed in the
data.
While suggestive of new voters turning out to vote for NOTA, these aggregate pat-
terns do not provide conclusive evidence because they mask the substitution between
abstention, candidates, and NOTA at the individual level. To study this, we relate
the aggregate voting returns to individual voter behavior using a structural model
of voter demand for candidates. We adapt the BLP model of Berry, Levinsohn and
48
Pakes (1995) from the consumer demand literature, where consumers (voters) choose
between the products (candidates) of firms (parties) in various markets (electoral dis-
tricts). Voters have preferences over observed and unobserved candidate characteris-
tics (including NOTA) and abstention. The model explicitly allows for heterogeneity
in these preferences and links them to the aggregate vote shares we observe in the
data. Estimating the model allows us to recover the parameters of individual voters’
utility functions from this aggregate data. Using the estimates, we study how voters
substitute between choosing NOTA, one of the candidates, and abstention in counter-
factual simulations where the NOTA option is removed.
The results of this analysis indicate that NOTA increased turnout, which is in line
with the aggregate patterns observed in the reduced form exercise. Furthermore we
find that the magnitude of this increase explains virtually all the NOTA votes ob-
served in the data. We find negligible substitution towards NOTA away from the
candidates running for election. These results indicate that most voters who voted for
NOTA would normally abstain. In turn, this provides evidence for the existence of
consumption utility specifically from voting for NOTA. In this context, models that
do not include an option-specific utility of voting would have a hard time explaining
the data.
To the extent that participation in a democracy is valuable, our finding that having
a NOTA option on the ballot can increase voter turnout is relevant in its own right,
and provides support for the arguments of the Indian Supreme Court in introducing
this policy.
Our paper is related to the vast literature on voter turnout, some of which we
review in section 2.2 below. Instead of proposing a new model of turnout, we focus on
49
testing models apart by asking whether the presence of consumption utility specific to
the NOTA option (as opposed to consumption utility from simply showing up at the
polls) is necessary to explain our data. This approach is similar in spirit to Coate and
Conlin (2004) and Coate et al. (2008) who estimate and compare competing structural
models of turnout on data from Texas liquor referenda.
Methodologically, our paper offers a novel way to estimate vote returns in multi-
party elections. Some earlier approaches to this problem (e.g., Glasgow and Alvarez,
2005) have used discrete choice models with individual-level survey data, but such
data is subject to well-known biases in voters’ self-reported behavior (see, e.g., Selb
and Munzert (2013) and the literature cited therein). Other studies use aggregate ad-
ministrative data and purely statistical models to deal with the problem of conducting
“ecological inference” regarding voter preferences (see Cho and Manski (2008) for a
review). By contrast our BLP-based approach combines the advantages of a micro-
founded discrete choice model with those of aggregate administrative data. It allows
for rich heterogeneity in voter tastes for candidate characteristics and, because it is
micro-founded, offers the possibility of conducting counterfactual experiments. In
a different context, Rekkas (2007) also exploits some of these advantages of the BLP
model in her study of campaign expenditures in the 1997 Canadian election. Our pa-
per goes further by using panel data, allowing for heterogeneity in voters’ preferences
driven by demographics as in Nevo (2001), allowing for endogenous candidate choice
by the competing parties, and by conducting counterfactual experiments using the
estimated model.
Finally, our paper relates to previous studies of NOTA-type votes in the political
science literature (reviewed in section 2.3 below). We differ from this literature by
50
using NOTA votes to isolate the consumption utility from voting and by estimating a
structural model that can be used to answer normative questions about the desirability
of having this option on the ballot.
In the rest of the paper, section 2.2 explains how we propose to use NOTA votes
to identify various components of the consumption utility from voting. Section 2.3
describes the NOTA policy, explains how it differs from similar options available to
voters in other countries, and describes the Indian electoral setting we analyze. Sec-
tion 2.4 describes our data and section 2.5 documents the pattern of NOTA votes and
presents a difference-in-differences analysis of the effect of NOTA on turnout. Section
2.6 estimates the structural model and presents the counterfactual results. Section 2.7
concludes.
2.2 The consumption utility of voting
Why people vote is one of the classical questions of economics and political science.
In the “calculus of voting” model (Downs, 1957; Riker and Ordeshook, 1968; Fiorina,
1976), voters consider both instrumental and consumption benefits. They vote for
candidate j if
PjBj − c+ (Uj + U0) > 0 (2.1)
and abstain otherwise (where j = arg maxj′
(Pj′Bj′ + Uj′) is the voter’s preferred can-
didate). The first term in (2.1) is the expected instrumental benefit, where Pj is an
individual’s probability of being pivotal in the election of candidate j and Bj is the
benefit of the candidate winning (without loss of generality, all terms in (2.1) are as-
sumed to be non-negative). The second term represents any direct or opportunity
51
costs from voting. The final term is the consumption utility of voting, which captures
a wide range of factors sometimes referred to as “expressive utility” or “civic duty”:
“1. the satisfaction from compliance with the ethic of voting [...] 2. the satisfaction
from affirming allegiance to the political system [...] 3. the satisfaction from affirm-
ing a partisan preference [...] 4. the satisfaction of deciding, going to the polls, etc.
[...] 5. the satisfaction of affirming one’s efficacy in the political system” (Riker and
Ordeshook, 1968, p28). We separate this consumption utility into two components to
highlight that part of the utility (Uj) may depend on voting for a specific candidate
j (e.g., the satisfaction from expressing partisan support), while part of it (U0) only
depends on showing up at the polls regardless of who one votes for (e.g., satisfaction
from compliance with an ethical norm to vote).
Observing that in large elections the probability Pj of being pivotal is close to 0,
the recent literature seeking to explain turnout within the framework of the calculus
of voting equation (2.1) has followed various routes.2 First, voters could overestimate
Pj . Lab experiments show that, indeed, voters often overestimate the probability that
their vote will matter and suggest that this can explain turnout decisions (Duffy and
Tavits, 2008; Dittman et al., 2014). Relatedly, Ortoleva and Snowberg (2015) show
that turnout is higher in populations with more overconfident voters. Under these
conditions, turnout can be explained even if (Uj + U0) = 0.
A second set of studies present models that create an option-specific utility Uj . In
2Missing from equation (2.1) are instrumental motivations other than those related to winning. Forexample, it is possible that a voter votes in order to signal his preferences to affect the policies chosenafter the election. Or he may vote in order to encourage a candidate to run again in the future. Itis possible to treat such motivations in a strategic setting but the likelihood that a voter’s vote willbe pivotal in affecting policy or encouraging a candidate is likely to be small (see, e.g., Razin (2003)).Here we follow most of the literature in assuming that if such motivations exist, they are sources ofconsumption utility and hence part of Uj .
52
Coate and Conlin (2004) and Feddersen and Sandorini (2006), this utility represents
ethical considerations regarding what would be best for everyone in one’s group. In
other models, such as Shachar and Nalebuff (1999), Uj is created by the mobilization
efforts of political leaders. In Degan and Merlo (2011), Uj includes a psychological
disutility from the possibility of voting for the “wrong” candidate.
Another set of papers focus on the general utility U0 from showing up to vote.
For example, members of a group may observe turnout and draw inferences about
whether an individual is an “ethical type” (Bénabou and Tirole, 2006; Ali and Lin,
2013). Similarly, a voter may vote to avoid a feeling of shame from not having done
his duty, especially if others will ask whether one has voted (Harbaugh, 1996; Blais,
2000; DellaVigna et al., 2015).
While these studies convincingly demonstrate that the proposed models have ex-
planatory power, it is not always clear to what extent these models are necessary to
explain the data.3 In particular, can we rule out that one or both components of the
consumption utility Uj + U0 is 0?
As we explain below, the Indian NOTA policy allows us to study this question by
creating a “None Of The Above” option that voters can vote for but that, by design,
cannot affect the electoral outcome. First, we document that voters actually choose
NOTA. Because PNOTABNOTA = 0, from equation (2.1) a voter who chooses NOTA
must have
UNOTA + U0 > c (2.2)
i.e., there has to be a positive consumption utility of voting.
3Examples of papers that highlight the value of testing models apart include the pair of studies byCoate and Conlin (2004) and Coate, Conlin, and Moro (2008) which explicitly compare several differentmodels of turnout on the same dataset.
53
Second, we ask how a voter who chose NOTA would have voted in the absence of
the NOTA option. If the voter would have abstained, then PjBj − c + (Uj + U0) ≤ 0.
Combining with (2.2), we have
UNOTA > PjBj + Uj,
i.e., there has to be a positive option-specifc utility from voting for NOTA. For exam-
ple, a voter may derive utility from expressing his disapproval of all the candidates.
Conversely, if in the absence of NOTA the voter would have voted for one of the candi-
dates, then it is possible that there is no option-specific utility but U0 > 0 (for example,
a voter may vote to satisfy social pressure while deriving no specific utility from vot-
ing for any of the options on the ballot.). Thus, studying voters’ behavior with and
without NOTA offers a test for the existence of an option-specific utility of voting.
In this way, although not an actual candidate, studying NOTA votes gives us an
opportunity to test both for the existence of a consumption utility from voting and for
the existence of a consumption utility from voting for this particular option.4 The sec-
ond exercise is empirically challenging because it requires making statements about
individual behavior both with and without NOTA while, due to the secret ballot, such
behavior is never observed.4It is difficult to imagine a similar experiment where an actual candidate’s probability of wining
is administratively set to 0. (Having small-party candidates on the ballot who have little chance ofwinning is not the same experiment since voters’ believing that Pj > 0 cannot be ruled out.) Anotherpossibility is to run a lab experiment where Pj is set to 0, but creating an artificial environment wherepeople derive utility from voting may be difficult. In an interesting experiment Shayo and Harel (2012)create variation in P and find evidence that the moral superiority of an alternative affects voters’ be-havior when P is small.
54
2.3 Background
2.3.1 The Indian NOTA policy
In elections where a paper ballot is used, voters can participate without voting for
any of the candidates: they can hand in an empty ballot or otherwise intentionally
invalidate their vote. With the advent of electronic voting machines Indian voters
lost this possibility. In 2004, the citizen’s group People’s Union for Civil Liberties
(PUCL) filed a petition with the Supreme Court to rectify this and give voters the
ability to have their participation recorded without forcing them to vote on any of the
candidates.5 In its 2013 decision, the Supreme Court agreed:
“For democracy to survive, it is essential that the best available men
should be chosen as people’s representatives for proper governance of the
country. This can be best achieved through men of high moral and ethical
values, who win the elections on a positive vote. Thus in a vibrant democ-
racy, the voter must be given an opportunity to choose none of the above
[...] Democracy is all about choice. This choice can be better expressed by
giving the voters an opportunity to verbalize themselves unreservedly and
by imposing least restrictions on their ability to make such a choice. By pro-
viding NOTA button in the Electronic Voting Machines, it will accelerate
5Under the electronic voting machines, the only way for a voter to have his non-vote recorded wasto inform the clerk at the voting booth of his desire to do so. The clerk would then record this on thevoter ledger together with the voter’s thumbprint for identification. The PUCL argued that this wasunconstitutional, violating the secret ballot.
55
the effective political participation in the present state of democratic sys-
tem and the voters in fact will be empowered.” (PUCL vs. Union of India,
2013, p43-44).
Following the Supreme Court’s decision, since September 2013, all state and na-
tional elections in India give voters the option of recording a “None Of The Above”
vote on the voting machine. These votes are counted and reported separately but have
no role in the outcome of the election. In particular, votes cast on NOTA affect neither
the validity nor the winner of an election. Even if NOTA were to receive a majority
of the votes, the winner of the election would be the candidate who received the most
votes among the non-NOTA votes.
The NOTA policy received wide news coverage in both national and local me-
dia. In its decision the Supreme Court directed the Election Commission to undertake
awareness programs to inform the electorate of the new policy, and voter education
programs explicitly focused on explaining this new option to voters. As a result we
expect that most voters would be well-informed about the NOTA policy, including the
fact that NOTA votes would not affect the electoral outcome.6
2.3.2 NOTA-like options in other countries
In most countries voters can effectively cast a “none of the above” vote by intentionally
returning an invalid vote (e.g., leaving the ballot blank, writing on the ballot, or mark-
ing more than one candidate). Because it is typically impossible to know whether such
votes occur intentionally or by mistake, it is difficult to use them to draw conclusions
regarding voters’ intentional behavior (see, e.g., McAllister and Makkai, 1993; Herron6As we discuss in section 2.5, several patterns in the data also support this assumption.
56
and Sekhon, 2005; Power and Garand, 2007; Uggla, 2008; Driscoll and Nelson, 2014).
For some applications, the fact that invalid votes also include voting mistakes will
simply add measurement error to the “true” measure intended to capture negative
votes. In other cases, however, this will have an important impact on the interpreta-
tion of the results. For example, more invalid votes among the less educated can mean
either that these voters are more likely to make mistakes when filling out the ballot, or
they are particularly dissatisfied and intentionally cast invalid votes to express this.7
In some countries, while there is no NOTA option on the ballot, blank votes are
counted separately from invalid votes and are believed to represent a negative vote.
In principle, this system could be equivalent to the Indian NOTA, but in practice the
equivalence is unlikely to be perfect. First, blank votes could still represent voting
mistakes, especially if there is a judgement call to be made about whether a vote is
truly blank when it is being counted (for example, there could be markings on the
side of the ballot, a small dot inside the checkbox, etc.). Fujiwara (2015) finds that
the introduction of voting machines in Brazil reduced both blank and invalid votes
among the less educated, which is consistent with both of these containing voting
mistakes. Second, using the blank vote as an expression of dissatisfaction requires
a shared understanding among voters regarding what the vote represents. Whether
this social norm is operative in a given election is difficult to test. This is illustrated
by the findings of Superti (2015) who studies a set of municipal elections in Spain - a
country where the blank vote is generally understood to mean “none of the above.”
She shows that despite this common understanding, voter dissatisfaction following
7These two interpretations also have different welfare implications regarding the desirability of hav-ing a NOTA option on the ballot. In the first case, NOTA only serves to confuse the less educated; inthe second case, it gives disadvantaged segments of the population a voice.
57
a ban which prevented the Basque nationalist party from contesting an election was
likely expressed through an increase in invalid rather than blank votes.
Another feature that makes India a cleaner case study than other systems for the
analysis of voters’ motivations is the electoral impact of the NOTA vote. Recall that in
India the NOTA vote can never “win,” and due to the first-past-the-post system it has
no impact on the allocation of legislative seats. By contrast in Colombia if the “blank
vote” wins, new elections must be called with the rejected candidates prohibited from
running again. In Spain, while the blank vote can never win, seats are allocated in
a proportional system and a minimum 3% threshold must be reached for a party to
enter parliament. In both of these systems choosing the blank vote as opposed to
choosing one of the parties has immediate electoral consequences, affecting the mix
of candidates eventually elected for office. In the Indian case, NOTA votes cannot be
driven by electoral motivations in the current election.8
2.3.3 Assembly elections in India
We study voters’ behavior under NOTA in the context of Indian state elections. In
the Indian federal system, state governments are responsible for most areas of local
significance, including health care, education, public works, police and security, and
disaster management. State legislative assemblies are elected in single-member elec-
toral districts (called “constituencies”) in a first-past-the-post system. The party or
8A voter’s motivation (with any vote under any system) can always include a desire to affect long-run outcomes, e.g., by signaling his political preferences to the eventual winner in order to affect policy,or by encouraging a candidate to run in future elections. Because a single vote is just as unlikely to bepivotal in affecting these outcomes as it is in affecting who wins, we think that these motivations arebest viewed as alternative sources of the consumption utility derived from voting.
58
coalition that wins the most number of seats in an assembly forms the state govern-
ment headed by a Chief Minister and his council of ministers.9
Table 2.1 shows the timing of state assembly elections in our study period. Elec-
tions are held every 5 years but the electoral calendar varies widely across states. For
example some states held assembly elections in 2007 and 2012 while others in 2008
and 2013; some states always go to the polls in March while others always do so in
November. This variation in the timing of elections creates an important source of
identification for the analysis below.
In most states assembly elections are conducted separately from other elections.
Four states, Andhra Pradesh, Arunachal Pradesh, Odisha and Sikkim, hold elections
simultaneously with national elections. We will exclude these states from the analysis
below.
All state and national elections in India are conducted by the Election Commission
of India under the supervision of the chief election commissioner. Since independence,
the Commission has emerged as a highly regarded institution with a large degree
of autonomy (McMillan, 2010). Election dates are set well in advance and declared
as local holidays to reduce the cost of participation. Polling stations (“booths”) are
spread out throughout each constituency and enlisted voters are assigned to specific
booths. Voters go to their designated booth to cast their vote with their Elector’s Photo
Identification Card.10 Generally these booths are set up in neighboring schools or
9In states that have a bicameral legislature, the system just described applies to the lower house.Members of the upper house are either elected by the lower house or appointed by the Chief Ministeror the Governor (the representative of the federal government in the states).
10Voter Registration is a one time procedure. Except in special cases (such as for convicted criminals),once registered as a voter, a person can vote in all subsequent elections without having to go throughany further registration process. Once registered the voter’s name is on the voters’ list and he or shegets the identification card which needs to be produced at the polling station before being allowed tovote.
59
public buildings within a very small radius of one’s residence. Participation rates in
Indian elections tend to be high. In our state election data, average turnout is 71% and
only 7% of the constituencies had turnout lower than 50%. (By comparison turnout in
US midterm elections is typically around 40%.) The voting age is 18, and the average
constituency has approximately 180 thousand eligible voters.
Since 2004 all voting in India has taken place using electronic voting machines
(EVMs).11 Each candidate running in an election has a separate button assigned to
him on the machine. Next to the button is the symbol identifying the candidate (to
accommodate illiterate voters) and the voter pushes the button to record his vote. A
light illuminates confirming that the vote was successful.12 Under the NOTA policy,
one of the buttons on the machine is assigned to the NOTA option.
In the system of political reservation, some constituencies are designated Sched-
uled Caste (SC) and some Scheduled Tribe (ST). In these, only candidates from the
given caste can run (to win, they must still obtain a plurality of all votes regardless of
voters’ caste). The reserved status of SC and ST constituencies is set at the same time
as the electoral boundaries are drawn. In contrast to local (village) governments, state
elections have no political reservation for women.
The current electoral constituencies were set in April 2008 by a commission work-
ing under the Election Commission (see Table 2.1). This was the first time in over 30
years that electoral redistricting (“delimitation”) took place in India. All constituency
boundaries as well as the reservation status of the constituencies was fixed by the de-
limitation commission in order to reflect population figures of the 2001 census. As
11Electronic voting machines in India were introduced gradually beginning in 1999. Since 2004 allgeneral and state elections are conducted using these machines.
12These machines are simpler to operate than some of the EVMs used in other countries that some-times require a voter to follow written instructions, enter a candidate’s number on a keypad, etc.
60
described below, this redistricting poses challenges for the construction of our dataset
and our empirical strategy.
2.4 Data
2.4.1 Samples used for analysis
Our analysis uses two samples of constituencies: a panel serves as our primary dataset,
and we use a repeated cross-section as a secondary sample.
The instrumental variables used in the structural analysis require panel data, and
our main sample is a constituency level panel dataset of the 6 states that conducted
assembly elections in both 2008 and 2013 under the new electoral boundaries: Kar-
nataka, Chhattisgarh, Rajasthan, Madhya Pradesh, Delhi, and Mizoram (see Table 2.1).
One of these states, Karnataka held elections in both years without a NOTA option,
while the remaining 5 states had a NOTA option in 2013 but not in 2008.
The main obstacle to extending the panel data to more constituencies is the delim-
itation (electoral redistricting). This makes it impossible to include elections before
April 2008 in the panel as there is too little overlap between the old and new con-
stituencies to make constituency-level matching meaningful.13 For example, although
3 other states also held elections in both 2008 and 2013, they did so in February-March
and had their constituency boundaries redrawn between the two elections in April
13Using GIS software we have computed the maximum overlap of each current constituency’s areawith an old constituency. For example, a maximum overlap of 80% indicates that 80% of the currentconstituency’s area came from one constituency, while 20% came from one or more other constituencies.We find that half of the current constituencies have a maximum overlap of 62% or less and a quarter ofthe constituencies have a maximum overlap of 50% or less. This makes it impossible to match electoraldata across constituencies in a meaningful way.
61
2008 so we cannot include these states in the panel. Other states with consistent elec-
toral boundaries in our study period are those holding elections in 2014. However,
2014 was a national election year that made headlines around the world for its un-
usual outcome (the BJP led by Narendra Modi won by a landslide, the first time in 30
years that a single party won a majority of the legislative seats). Because 2014 state
assembly elections took place either simultaneously with or after the national election
(and in the latter case more than a year after the NOTA policy was introduced), the
national election could confound the impact of NOTA in these states. We therefore
decided to exclude these states from the panel analysis.
To obtain more power for a reduced form analysis, we use as a secondary dataset a
repeated cross section of constituencies in 25 states that conducted elections between
2006 and 2014. Like the panel, this dataset excludes the states that held assembly elec-
tions simultaneously with national elections (Andhra Pradesh, Arunachal Pradesh,
Odisha and Sikkim) since turnout considerations in these states are likely to be very
different.14 It also excludes the state of Bihar because its unique election calendar (2005
and 2010) would require earlier data on voter demographics than we have access to.
We next describe the information available in the primary (panel) and secondary
(repeated cross-section) datasets.
2.4.2 Election data
The electoral data comes from the Election Commission of India, which provides in-
formation on assembly elections at the candidate level. For each constituency we
14Since our main goal with the repeated cross section is to increase power, we include the states thatheld elections in 2014 but not simultaneously with the national election. Excluding these states makeslittle difference for the results.
62
know the list of candidates running, their party, age, gender, and caste (General, ST or
SC). We also know the number of votes received by each candidate (including NOTA
for the relevant elections) as well as the number of eligible voters in each constituency.
Table 2.2 shows summary statistics of the electoral data at the constituency level
for the panel and the repeated cross section. In the 6-state panel, for each year there
are 854 constituencies. In 2013, 630 of these constituencies were affected by NOTA
because state elections were held after the Supreme Court decision. In the repeated
cross section, we have a total of 6685 constituency-year observations from 25 states.
1176 of these observations were affected by NOTA in either 2013 or 2014. The average
constituency has approximately 180 thousand eligible voters and 11 candidates com-
peting. The overwhelming majority of candidates are male: the average constituency
has less than one female candidate. The median age of candidates in a constituency
is typically between 38 and 53. Approximately 13% of the constituencies are reserved
for SC and 15% for ST. The average non-reserved constituency has 1.3 SC candidates
and less than 0.5 ST candidates. Average turnout is 71% and the average vote share of
the winning candidate is 45%. Summary statistics of the electoral data in the 6-state
panel and the 25-state repeated cross section tend to be similar.
Table 2.3 shows a more detailed distribution of candidate characteristics in the
panel dataset. Each year we have approximately 10,000 candidates in this dataset.
40% of these candidates run as independents not affiliated with any party.
2.4.3 Voter demographics
Our first source of demographic information is various waves of the National Sample
Survey, conducted by the Indian Ministry of Statistics and Program Implementation
63
since 1950. Each wave contains close to half a million individual surveys covering all
Indian states, and is designed to be representative of the population at the subdistrict
level. We obtained the individual level data and use it to create characteristics of the
voting age population at the state-year or the district-year level for the reduced-form
analysis. Table 2.4 summarizes these variables for the 25 states in the repeated cross
section. We complement this with data on the growth rate of per capita state domestic
product from the Reserve Bank of India.
For the structural exercise, demographic characteristics are needed at the con-
stituency level. We are not aware of any existing dataset with appropriate coverage.
We create the necessary dataset using the 2011 Indian Census by aggregating village-
level information and matching it to constituencies using GIS coordinates. Specifically,
we obtained GIS boundary files for the 2013 electoral constituencies and the 2001 cen-
sus. To use data from the 2011 census, we proceed in two steps. First, we match
sub-districts (“tehsil”) in the 2001 census to the 2011 census using village names.15
Administrative boundaries in India change over time, with tehsils, districts, and even
states splitting up into new units. This step of our matching procedure is based on
the smallest administrative unit available in the census, the village. Second, we match
the 2011 census data to each 2013 electoral constituency using the 2001 sub-district
boundaries. We use area-weighted averages to compute values for constituencies that
overlap several sub-districts.
Of the 854 constituencies, we have constituency boundary files for 850. We were
able to match 723 of these to the sub-district data from the census. The location of
these constituencies is shown on Figure 2.1. Most of the constituencies we lose during
15Sub-districts, called tehsils in most states, are administrative units above the villages and below thedistricts and the states.
64
the matching (70) are in NCT Delhi. We lose this entire state because the census data
is not sufficiently disaggregated. Of the matched constituencies, 520 are affected by
NOTA in 2013 and 203 are not (in the full panel, these numbers are respectively 630
and 224).
Table 2.5 shows the summary statistics of the census data at the constituency level.
The variables include basic demographic characteristics such as gender, caste, literacy,
and employment as well as economic characteristics of the households (infrastructure
and asset ownership).
2.5 Patterns in the data
2.5.1 NOTA votes
The first noteworthy feature of the data is that a positive number of voters voted for
NOTA. Despite the fact that voting for NOTA could not affect the results of the elec-
tion, in the 9/25 states in our data affected by the policy a total of 2.51 million voters
chose this option.16 The distribution of the NOTA vote share is shown on Figure 2.2.
NOTA was chosen by a positive number of voters in every constituency, receiving an
average vote share of 1.5% with a range of 0.1-11%. As a fraction of all eligible vot-
ers (including abstainers) 1% voted for NOTA.17 In the average constituency, NOTA
16While the fact that people voted on an option that could not affect the election might seem sur-prising to some readers, this behavior is not qualitatively different from votes cast on small extra-parliamentary parties, or from voting in an election where voters have no trust in the integrity of theelection and that their vote will actually be counted. For example, in Cantú and García-Ponce (2015),despite having just voted, around 5% of Mexican voters exiting the election booth say that they haveno confidence that “the vote you cast for president will be respected and counted for the final result,”and another 15% state that they have little confidence.
17In the 5 states affected by NOTA in our panel data, the average vote share of NOTA among totalvotes cast (eligible voters) was 1.9% (1.4%).
65
received more votes than 7 of the candidates running for election. In 97 constituen-
cies out of 1176, the vote share of NOTA was larger than the winning margin (the
difference between the vote share of the winner and the runner up).
One consequence of the introduction of NOTA is simply the appearance of another
option on the ballot. A potential concern is that this new option confused some vot-
ers who chose it by mistake. Our findings below on increased turnout are difficult to
reconcile with this interpretation. If NOTA had simply confused voters at the voting
booth, we would not expect to find a positive impact on voter turnout. An alternative
way that voters might be confused is if they mistakenly thought that voting for NOTA
would somehow affect the electoral result (for example, that the election would be
invalid if NOTA obtained a majority). We find this interpretation implausible for two
reasons. First, given the 1.5% actual vote share on NOTA, voting for NOTA to in-
validate the election would have required not just confusion about electoral rules but
also extremely unrealistic expectations about the number of voters planning to vote
for NOTA. Second, if voters mistakenly thought that NOTA would affect elections, we
would expect them to be less likely to vote for NOTA as they gain more experience.
To check for this, we looked at the 2014 general elections, held at the same time in
all states. Some of these states already had experience with NOTA at the assembly
elections in 2013, while others did not. If the use of NOTA in 2013 was due to voter
confusion, we would expect the experienced states to vote for NOTA less than the
inexperienced states. In fact, the opposite is true: in this general election the average
NOTA vote share among the experienced states was 1.28%, compared to 1.09% among
non-experienced states. Voters in states that had more experience with NOTA were
significantly more likely to use it (p = 0.027).
66
Figure 2.2 reveals some heterogeneity in NOTA votes across constituencies. Our
structural analysis below will relate this heterogeneity to voter demographics and
model how different groups of voters choose between the different options on the bal-
lot. As a precursor to this analysis, in the Appendix we run cross-sectional regressions
of the NOTA vote share on a variety of constituency characteristics. We find evidence
of systematic heterogeneity: the NOTA vote share is significantly higher in reserved
constituencies and in constituencies with more illiterate voters, more women, more
ST, and a lower share of rural workers. This may suggest that economically disadvan-
taged and / or politically disenfranchized voters obtain more utility from expressing
themselves by voting for NOTA.
2.5.2 The effect of NOTA on turnout
As discussed in section 2.2, testing for a consumption utility of voting requires infer-
ring the changes in individual voters’ behavior following the introduction of NOTA.
Specifically, did the NOTA policy lead to some voters choosing to vote for NOTA in-
stead of abstaining? While this question is difficult to answer using reduced form
methods, in the Indian case we can use a simple difference-in-differences approach to
answer a related question: did the NOTA policy lead to some voters choosing to vote
instead of abstaining?
While identifying the impact of NOTA on turnout is challenging because the intro-
duction of the policy took place at the same time across India, we can exploit variation
in the Indian electoral calendar for a difference-in-differences analysis. Specifically, we
use the fact that elections to the state assemblies are held at different times in different
67
states (see Table 2.1). For the panel dataset, our specification is the following:
Ycst = α0 + α1NOTAst + α2Xcst + γc + ηt + εcst, (2.3)
where Ycst is turnout in constituency c of state s in year t, NOTAst equals 1 if the
NOTA policy is in place and 0 otherwise, Xcst are control variables, and γc and ηt are
constituency and year fixed effects, respectively. Using states that held elections in
2008 and 2013, the parameter of interest, α1 is identified by comparing the change in
turnout in the states that held elections in both years without NOTA to the change
in turnout in the states that were affected by NOTA in 2013 (but not in 2008). For
the repeated cross-section sample, the specification is identical to (2.3) except that we
replace the constituency fixed effects γc with state fixed effects γs.18
Table 2.6 shows the results from estimating equation (2.3). In column (1), we use
the panel dataset and control for the number of eligible voters in a constituency, state
labor force participation, weekly household earnings, and education (as well as con-
stituency and year fixed effects). The coefficient estimate on NOTA indicates a positive
turnout effect but with the small number of states and state-level variation in the pol-
icy, the estimate is highly imprecise.19 In column (2) we repeat the same specification
for the repeated cross-section, replacing the constituency fixed effects with state fixed
18Recall that the panel includes constituencies from 6 states, 5 of which were affected by NOTA. Therepeated cross section contains constituencies from 25 states, 9 of which were affected by NOTA (5 in2013 and 4 in 2014).
19Because the NOTA policy varies at the state level, inference needs to account for clustering. Giventhe small number of clusters, we obtain the p-value for the effect of NOTA by using a wild bootstrapprocedure as recommended by Cameron and Miller (2015) with the 6-point weight distribution of Webb(2013).
68
effects. The point estimate on NOTA remains similar but the precision improves dras-
tically, indicating a statistically significant turnout effect of 3 percentage points.20 In
column (3) we add as additional controls a dummy for reserved constituencies as well
as the following state-level variables: unemployment, sex ratio, urbanization, and the
growth rate of state per capita net domestic product. The estimated effect of NOTA
remains robust to these additional controls.
The main threat to identification in the regressions presented in Table 2.6 is other
events or policies that may affect changes in turnout between assembly elections held
before and after the introduction of NOTA. In the Appendix, we present a number
of robustness checks: we exclude national election years, control for the extent to
which constituencies were affected by electoral redistricting, and we drop specific
states where political events during our period of studies might potentially confound
the effect of NOTA. We find that our estimates are robust to these alternative samples
and specifications.
Overall, these results indicate that the presence of the NOTA option on the election
ballot increased turnout in the average constituency. It is interesting to note that in
each case the 95% confidence interval around the estimates includes the fraction of
eligible voters who voted for NOTA in the data (1% in the repeated cross-section).
This may indicate that NOTA voters turn out to vote specifically for this option and
would abstain if this option was not present on the ballot. At the same time, our
ability to infer NOTA voters’ counterfactual behavior using this aggregate analysis is
fundamentally limited. Did the NOTA policy lead to some voters choosing to vote for
NOTA instead of abstaining? Our structural analysis below will make such inference
20We obtain similar inference using either standard errors clustered by state or the wild bootstrapprocedure.
69
regarding individual voters’ voting patterns possible.
2.6 Estimating the effect of NOTA from a demand sys-
tem for candidates
In this section we estimate a model of voter choice among candidates, NOTA and
abstention using techniques from the discrete consumer choice literature. Two factors
make this approach particularly useful in the voting context. First, the rules governing
elections imply that several assumptions of the model naturally hold. Second, the
inference problem addressed by the method - inference regarding individual behavior
from aggregate data - is central to any voting application.
While the typical Industrial Organization applications view the static discrete choice
framework as an approximation, the rules governing elections actually make this
model quite realistic in the electoral context. By the nature of elections, voters are
restricted to a discrete choice between voting for a candidate, voting for NOTA, or
abstaining.21 Choices are made simultaneously by all voters in a given race - unlike
consumers, voters cannot adjust the timing of their choice. When making a choice,
the voter has before him a complete list of all available options on the ballot, in con-
trast to a consumer who may not be aware of all available brands of the product he is
considering buying. Electoral competition takes place in markets (electoral districts)
21By contrast, when choosing which product to buy, consumers may purchase a mixture of productseven if they only consume one at any given time.
70
that are administratively defined and where vote shares are fully observed by the re-
searcher.22 Finally, abstention (the “outside good” chosen by somebody who does not
choose any of the alternatives on the ballot) is well defined, and administrative data
on its prevalence is readily available.23
The attractiveness of the estimation framework used here arises from its ability to
deliver inference on individual behavior from aggregate data. In IO, this is an attrac-
tive feature for researchers who have access to market-level data. While for the con-
sumer demand application one could in principle obtain individual data (e.g., from
loyalty programs), in the voting context this is virtually impossible. Due to the secrecy
of the ballot, in most cases administrative data on individual choices simply does not
exist. Because of this, most existing research either relies exclusively on aggregate
analysis, or uses voter survey data to analyze individual behavior. Because voter sur-
vey data is subject to well-known biases, being able to infer individual behavior from
aggregate data is of major importance in this literature.
2.6.1 Specification: demand
Our specification of voter preferences adapts the consumer demand model of Berry,
Levinsohn and Pakes (1995) (BLP).24 Consider a constituency t ∈ {1, ..., T}where vot-
ers can vote for candidates j ∈ {1, ..., J}. If available, we include the NOTA option in
the list of candidates, and we let j = 0 indicate abstention (the outside option). Each
22By contrast, studies of consumer choice have to rely on proxying the true market in which a setof products compete, e.g., based on geographic areas. It is also common in these studies to rely on asample of products and stores, while in the electoral context complete data on the vote shares of allcandidates (including abstention) is readily available.
23By contrast, defining the relevant outside good for, e.g., the market for new cars, requires makingassumptions (is it a used car, public transportation, nothing?).
24See also Nevo (2001) and Petrin (2002).
71
candidate is described by a set of characteristics observed by the researcher and a set
of unobserved characteristics. Besides the candidate’s party, characteristics observed
in our data include gender, age, and caste. Unobserved characteristics include, for
example, the candidate’s experience. Assume that the utility that voter i derives from
voting for candidate j ∈ {1, ..., J} can be specified as
Uijt = βixjt + ξj + ξjt + εijt, (2.4)
where xjt = (x1jt, ..., x
Kjt)′ is a vector of the observed characteristics of party j’s candi-
date, ξj is the average popularity of the party, ξjt captures voters’ common valuation
of unobserved candidate characteristics, and εijt is a stochastic term with mean zero
drawn from a Type-I extreme value distribution (the role of this assumption will be
made clear below). Unless stated otherwise we treat the NOTA option as another can-
didate, including a NOTA indicator to identify characteristics (e.g., gender) which are
only defined for actual candidates.25
Voter preferences for the various candidate characteristics are represented by the
coefficients βi = (β1i , ..., β
Ki ). These vary across individuals based on demographic
variables and unobserved characteristics:
β′i = β + Πdi + Σvi, (2.5)
where di = (d1i , .., d
Di )′ is a vector of “observed” demographic variables, vi = (v1
i , ..., vKi )′
are “unobserved” voter characteristics, and the parameters are in the (K × 1) vector
25For example, if the only candidate characteristic is gender (gj), then xj = [(1− nj)gj , nj ] where njis equal to 1 for NOTA and 0 otherwise.
72
β, the (K ×D) matrix Π, and the (K ×K) scaling matrix Σ. We assume that the vi are
drawn from independent Normal distributions with mean 0. As in most consumer de-
mand applications, “observed” variables are individual characteristics whose empir-
ical distribution is known (from census data), while the distribution of “unobserved”
characteristics has to be assumed. While individual level consumption data is some-
times available, in the voting context, given the secrecy of the ballot, it is generally
impossible to directly match individual characteristics to votes.
To complete the choice set, the utility of the “outside good” (abstention) must be
specified. In consumer demand applications constructing the outside choice typically
involves two sets of assumptions: assumptions about what consumers do when they
don’t purchase a specific product, and assumptions about what constitutes a market.
In the voting context neither of these is necessary, since electoral constituencies are
exogenously given and voters who do not vote necessarily abstain. We let
Ui0t = π0di + σ0v0i + εi0t, (2.6)
which allows for the utility of abstention (hence the cost of voting) to vary by observed
demographics and unobserved voter characteristics. As discussed below, we also in-
clude in (2.4) state and year fixed effects and indicators for whether the constituency
is reserved for SC or ST candidates. Since voter choices will be determined by the dif-
ferences in utilities, including these variables in (2.4) is equivalent to including them
in the specification of the utility of abstention in (2.6). Thus, we are also allowing for
further heterogeneity in voting costs as captured by these variables.
73
Let ξ = (ξ1, ..., ξJ), θ1 = (β, ξ), and θ2 = (Π,Σ), and let θ = (θ1, θ2) represent the
parameters of the model. Substituting (2.5) into (2.4), we can write
Uijt = δjt + µijt + εijt,
where δjt ≡ βxjt+ξj + ξjt and µijt ≡ (Πdi+Σvi)xjt. Voters choose to vote for one of
the candidates (including NOTA) or abstain. This implicitly defines the set of demo-
graphics and unobserved variables for which voter i will choose candidate j:
Ajt(x, δt(θ1), θ2) = {(di,vi, εit)|Uijt ≥ Uilt for l = 0, 1, ..., J} ,
where x are all the candidate characteristics, δt = (δ1t, ..., δJt), and εit = (εi1t, ..., εiJt).
Given the distribution of (di,vi, εit), we can integrate overAjt to obtain the vote shares
sjt(x, δt(θ1), θ2) predicted by the model. Under the assumed Type-I extreme value
distribution for εijt, these are given by
sjt(x, δt(θ1), θ2) =
∫exp [δjt + µijt − µi0t]
1 +∑q≥1
exp [δqt + µiqt − µi0t]dF (di,vi) , (2.7)
where µi0t ≡ π0di + σ0v0i and F (di,vi) denotes the distribution of the voter character-
istics. These predicted vote shares are a function of the data (x), the parameters (θ),
and the unobserved candidate characteristics ξ.
2.6.2 Specification: supply
While some political economy models treat candidates as exogenously given, others,
notably the citizen-candidate literature, emphasize that politician characteristics may
74
emerge endogenously in the political process (Osborne and Slivinski, 1996; Besley
and Coate, 1997). To allow for this possibility while keeping the problem tractable,
we adopt a simple simultaneous-moves specification of the supply of candidates. We
will use this framework to justify the instrumental variables we use in the estimation
below.26
As in the citizen-candidate literature suppose that implemented policies depend
on elected politician’s characteristics and that candidate characteristics emerge en-
dogenously in the political process. In particular, suppose that candidates are cho-
sen by a political party that cares about winning as well as the policy implemented
by the winner. In constituency t, party j’s payoff is given by vjt(xt, st), where xt =
(x1t, ...,xJt) are the characteristics of all candidates running in the election and st =
(s1t, ..., sJt) are the vote shares that determine the winner. Vote shares are determined
by candidates’ characteristics as well as the voter valuations ξjt, as in equation (2.7).
Thus, sjt = sjt(xt, ξt) where ξt = (ξ1t, ..., ξJt) (to simplify the notation, we set ξ = 0).
Given a party’s membership, fielding candidates with some characteristics may be
easier than others. For example, a lower caste party may find it difficult to field general
caste candidates. A simple way to capture this is by supposing that party j faces a
budget constraint m =∑k
qktjxktj ≡ qtjxjt, where m is the budget available to spend
on candidates (assumed constant for simplicity) and qkjt is the “price” of increasing a
candidate’s characteristic k in constituency t. For example, if xk = 1 denotes a general
caste candidate, qkjt may be the extra cost of finding such a candidate and convincing
him to run. Prices will generally depend on such factors as a party’s membership, the
26In the consumer demand literature, it is common to model firms that compete on prices but takeall other product characteristics as exogenously given. In our context, there is no natural separationbetween endogenous and exogenous candidate characteristics so we will treat all characteristics exceptNOTA as potentially endogenous.
75
economic and demographic characteristics of a constituency, the prestige associated
with a political career in the local population, etc. We assume that parties take these
prices as given.
Suppose that parties choose the characteristics of their candidates simultaneously,
after voter valuations ξt have been realized. In a Nash equilibrium, the characteristics
of party j’s candidate will satisfy
x∗jt ∈ arg maxxjt
(vjt(xt, st(xt, ξt))|m = qjtxjt)
or
x∗jt = x∗jt(xt, ξt,qjt). (2.8)
In words, candidate j’s characteristics depend on the characteristics of all candidates
running, voters’ valuation of the unobserved characteristics, and party j’s cost of in-
creasing the various characteristics in the given constituency. This has two implica-
tions. First, the dependence of observed characteristics xjt on voter valuations ξt cre-
ates an endogeneity problem for the estimation of the utility functions (2.4). Second, it
is plausible that the prices qjt for a given party are correlated across constituencies t.
For example, a lower caste party is likely to face a higher price to field a general caste
candidate in all constituencies within a state. This implies that the characteristics of a
given party’s candidates will be correlated across constituencies. As explained below,
this opens the possibility of using candidate characteristics in neighboring constituen-
cies as instrumental variables in the estimation.
76
2.6.3 Estimation
Estimation follows the algorithm proposed by BLP. The idea is to treat the unobserved
characteristics ξ as the econometric error and derive moment conditions that can be
used to estimate the parameters using Generalized Method of Moments (GMM). De-
tailed treatments of the procedure can be found in BLP and Nevo (2000, 2001) so we
only provide a brief summary below.
Consider a dataset with information on candidate characteristics x and actual vote
shares Sjt. BLP show that, for given θ2, it is possible to numerically solve for δt from
the equations sjt(x, δt, θ2) = Sjt, i.e., equating the model-predicted vote shares to those
observed in the data. Using the resulting values of δjt(θ2), we express the unobserved
candidate characteristics as ξjt(θ) = δjt(θ2) − ξj − βxjt. Given our data and with δjt
computed, this is a standard econometric error, which depends nonlinearly on the pa-
rameters of the model. While we do not expect ξjt(θ) to be independent of xjt, we can
find a suitable set of instruments Zjt and use the moment conditions E[ξjt(θ)|Zjt] = 0
to estimate the parameters using GMM. Thus, we find
θ = arg minθ
ξ(θ)′ZW−1Z′ξ(θ),
where ξ(θ) is the vector of errors, Z is the matrix of instruments, and W is the weight-
ing matrix.
To compute the estimate, we use the standard two-step GMM procedure (Greene,
2003, p206). We first set W = Z′Z and compute an initial estimate of the parame-
ters, θ1. We then use this initial estimate to recompute a robust weight matrix W =
1n
n∑j,t
[ξjt(θ1)]2Z′jtZjt, and use this updated weight matrix to compute the final parameter
77
estimates.
In this framework, the need for instrumental variables arises for two reasons. First,
instruments are needed to generate enough moment conditions to identify the nonlin-
ear parameters in voters’ utility functions. Thus, instruments are necessary even if ξjt
and xjt are uncorrelated. Second, instruments are needed because some of the candi-
date characteristics are likely to be endogenous, as suggested by equation (2.8).27 In
the context of consumer demand estimation, where “voters” are the consumers and
“candidates” are the products, it is common to use instruments based on the character-
istics of other products produced by the same firm and the characteristics of products
produced by other firms (e.g., BLP; Nevo, 2001). A natural counterpart in our setting
is to think of firms as the parties that field the candidates. Using this analogy, we
use as instruments the average characteristics of a given party’s candidates in other
constituencies within the state. For example, we create a candidate gender IV for a
particular candidate by taking the fraction of female candidates of the given party
in other constituencies in the state for both 2008 and 2013.28 What makes these in-
struments possible in our case is the variation in the characteristics of a given party’s
candidates across constituencies as each election is contested by a different set of in-
dividuals. This avoids the difficulties that sometimes arise in the consumer demand
literature from insufficient variation in product characteristics across markets (e.g.,
Nevo, 2001).27The endogeneity problem is likely to be present even if candidate characteristics are assumed to be
fixed at the time of the election. We only observe a short list of characteristics xjt, and unobserved char-acteristics that influence ξjt (including experience, voting record, qualifications, physical appearance,etc.) are likely to be correlated with these.
28As usual, variables that enter the utility function and are treated as exogenous serve as their owninstruments. In our case, this includes state, year, and party fixed effects, as well as the NOTA indicatorand its interaction with demographics.
78
Beyond the analogy to product characteristics, a rationale for these instruments
in our case may be given based on the supply of candidates available to each party,
as in section 2.6.2. For a given party, the “price” of increasing a particular candidate
characteristic is likely to be correlated across constituencies due to the characteristics
of the party’s membership, demographic characteristics of the constituency, etc. This
implies that the characteristics of a particular party’s candidates are likely to be corre-
lated across constituencies.
The identifying assumption, expressed in the moment conditions, is that unob-
served voter valuations for a particular candidate are conditionally independent of
these instruments. One case in which this assumption will hold is if, controlling for
party-specific means and demographics, constituency-specific voter valuations ξjt are
independent across constituencies (but may be correlated for a given constituency
over time). This rules out a popularity shock to some of a party’s candidates as would
be caused, e.g., by a regionally coordinated advertising campaign (a campaign rais-
ing the popularity of all candidates would be captured by the party dummies). See
Hausman (1996) and Nevo (2001) for analogous assumptions in the consumer demand
literature. A second case in which the identifying assumption will hold is if parties do
not condition their choice of candidates on the popularity shocks ξjt in (2.8). For ex-
ample a party with an SC base may find it impossible to respond to a popularity shock
by finding a candidate from a different caste in time for the election. In this case, the
mix of candidate characteristics offered by a party would reflect the supply of charac-
teristics in the relevant population, rather than respond to popularity shocks among
voters.
Under either interpretation, (2.8) implies that characteristics of candidates fielded
79
by a given party in different constituencies will only be correlated due to the common
prices qjt. Thus, for a given party, the characteristics of its candidates xjt′ is con-
stituencies t′ 6= t are valid instruments for the characteristics of the candidate running
in constituency t. Since the introduction of the NOTA option took place at the national
level by the Supreme Court, the NOTA characteristic is treated as exogenous.
2.6.4 Practical issues
As described above, parties play an important role in our specification: we estimate
party fixed effects and we define our IVs based on parties. One difficulty arises be-
cause of the presence of many small parties. There are a total of 202 parties in the data,
but half of them field candidates in only 1 of every 40 constituency within a state. A
second, related difficulty is the presence of independent candidates (candidates not
affiliated with any party). There are 6751 of these candidates in the data, but 70% of
them receive less than 1% of the votes in a constituency and only 3% receive more
than 10%. Each of these parties and candidates adds a new parameter that is difficult
to identify due to the small number of constituencies where the party is represented
(in the extreme case of an independent candidate running in only one year, identify-
ing the fixed effect is not possible). To circumvent these difficulties, we create a “Small
Party” category comprising parties fielding candidates in less than a third of the con-
stituencies in any given state and we average all small party candidates’ characteristics
within a constituency (we do this after constructing the instruments so that the indi-
vidual IVs are aggregated also). We also create an “Independent Party” containing all
independent candidates within a state, and aggregate them within constituencies in
the same way. After this aggregation, we are left with a total of 22 parties.
80
We include in the analysis the full list of candidate characteristics available in the
data: gender, caste, age and party. We select the constituency characteristics to be
included based on the variables that indicated significant heterogeneity in voter pref-
erences for NOTA in the regressions in section 2.5. We include percent male, literacy,
percent SC, percent ST, and the share of rural workers.
The BLP algorithm requires numerically solving the integral in (2.7) to obtain the
predicted market shares. We do this in the standard way by drawing individual voters
from the distribution of demographics in each constituency, computing the predicted
individual probabilities of voting for each candidate, and averaging across simula-
tions to obtain the simulator for the integral.
2.6.5 Results
Parameter estimates
We present parameter estimates for different specifications of the above model in Ta-
bles 2.7 and 2.8. First, we set Π and Σ equal to 0 so that voter heterogeneity only enters
through the εijt terms in equation (2.4). This is the conditional Logit model, and we
estimate it with and without instruments in columns (1) and (2) of Table 2.7. We report
coefficient estimates on the candidate characteristics as well as a subset of the control
variables (reserved constituency indicators and dummies for 3 major parties).
In column (3), we keep Π = 0 but allow for random coefficients on all the candi-
date characteristics as well as the outside option (through Σvi in equation (2.5) and σ0
in equation (2.6)). The estimates of Σ are in the first column of Table 2.8. Finally, we
present the full model which allows for both observed and unobserved heterogeneity
in voters’ evaluation of the various candidate characteristics. These estimates are in
81
column (4) of Table 2.7 (β) and the remainder of Table 2.8 (Σ and Π). Moving from col-
umn (1) to column (2) of Table 2.8, we kept those elements of Σ that were statistically
significant.
In Table 2.7, coefficients change substantially between the OLS and IV specifica-
tions, suggesting that instrumenting is important. Tables 2.7 and 2.8 reveal that voters
value candidate characteristics other than party affiliation. For example, older male
candidates tend to receive more votes. Allowing for taste heterogeneity among voters
regarding these characteristics in the full model substantially reduces the estimated
impact of party affiliation on votes (column (4) of Table 2.7). In Table 2.8, areas with
a higher share of ST voters yield more votes for ST candidates (this effect is identified
from variation across non-reserved constituencies). NOTA is a less popular option in
constituencies with more literate voters and in rural areas.
Counterfactual analysis: The impact of NOTA
In this section we use the estimated model to evaluate the impact of introducing
NOTA. We restrict attention to those constituencies in our data that had the NOTA
option available in 2013 and perform a counterfactual experiment where the NOTA
option is removed. We compute new vote shares and turnout rates under this coun-
terfactual scenario, and calculate the impact of NOTA as the difference between the
actual and the counterfactual outcomes.
The estimated impact of NOTA on turnout is shown in Figure 2.3. The average
increase in turnout is above 0.5 percentage points in 74% of the constituencies, with an
average of 1.2 percentage points. This is somewhat lower than the effect we obtained
82
from the reduced form exercise in section 2.5.2, but very close to the 1.4 percent of
eligible voters who voted for NOTA in the data.
The estimated impact of NOTA on individual candidates’ vote shares is shown
in Figure 2.4. The reduction in vote shares is smaller than 0.5 percentage points in
absolute value for 96% of the candidates, with a mean of 0.06 percentage points. This
indicates that substitution from voting for a candidate to voting for NOTA is minimal.
To gauge the impact of NOTA on parties, Table 2.9 aggregates voter choices by
party. For each party in the data, the first column gives the number of candidates and
the second column shows the fraction of the 101.2 million eligible voters who voted
for that party. The third column shows the difference relative to the counterfactual
without NOTA. For the two largest parties, BJP and INC, we estimate a loss in total
votes of around 0.15 percentage points. The change in the votes cast on other parties is
even smaller. By contrast the change in the overall abstention rate (and hence turnout)
is a magnitude larger at 1.2 percentage points. This is similar to the estimated effect
across all constituencies.
It should be emphasized that allowing for the flexible random coefficients spec-
ification above was crucial to obtain these results. The more restrictive Logit model
could not possibly have resulted in these effects. As is well known, the Logit specifica-
tion implies that substitution patterns only depend on observed choice shares. In our
case, this would imply that adding the NOTA option would, by construction, cause
the biggest change in the most popular candidate’s vote share. This is illustrated by
the last column of Table 2.9 which shows the counterfactual implications that would
be obtained from the Logit specification. As can be seen, this would imply that substi-
tution towards NOTA is similar for voters of the two largest parties and for abstaining
83
voters.
The patterns emerging from the demand model are similar to those observed in
the difference-in-differences analysis. The results indicate that NOTA votes are mostly
cast by voters who would have chosen to abstain in the counterfactual without NOTA
and turn out to vote specifically for NOTA. This provides strong evidence that voters
derive positive consumption utility from voting for this option.
2.7 Conclusion
This paper analyzed India’s NOTA policy which gives people the option to partici-
pate in elections and cast a valid “None of the Above” vote without the possibility of
affecting the electoral outcome. Individuals who choose to vote for NOTA but would
abstain otherwise must derive a consumption utility from voting that is specific to this
option. Thus, the NOTA policy makes it possible to test apart various components of
the consumption utility of voting. To address the challenge that, due to the secret bal-
lot, individual choices are not observable, we estimate counterfactual voter behavior
using a structural model and techniques borrowed from the consumer demand liter-
ature. The model allows for rich heterogeneity in voter preferences and relates these
parameters to the aggregate vote returns. In counterfactual simulations, we find that
the NOTA policy resulted in increased turnout. Based on the estimated model, vir-
tually all the NOTA votes observed in the data represent new voters who showed up
specifically to vote for NOTA and who would have abstained in the absence of this op-
tion. These patterns are also supported by the findings from a reduced-form analysis.
Our results show that, in this context, the presence of an option-specific consumption
84
utility from voting is necessary to explain the data. For example, voters may derive
utility from expressing their protest against one or more of the candidates running
for election. In this case, models that do not incorporate an option-specific utility of
voting would have a hard time explaining the data.
To the extent that voter participation is valuable in a democracy, our results suggest
that having a NOTA option on the ballot may be a desirable policy. It creates both
political participation and individual utility.
2.8 References
1. Ali, N., and C. Lin (2013): “Why People Vote: Ethical Motives and Social Incen-
tives,” American Economic Journal: Microeconomics
2. Bénabou, R. J., and J. Tirole. (2006). “Incentives and Prosocial Behavior,” Ameri-
can Economic Review
3. Berry, S., J. Levinsohn, and A. Pakes (1995): “Automobile Prices in Market Equi-
librium,” Econometrica, 63(4)
4. Besley, T., and S. Coate (1997): “An Economic Model of Representative Democ-
racy,” Quarterly Journal of Economics
5. Blais, A. (2000): To vote or not to vote, University of Pittsburgh Press, Pittsburgh,
PA.
6. Brennan, G., and L. Lomasky (1993): Democracy and Decision: The Pure Theory of
Electoral Preference. Cambridge University Press, Cambridge, UK.
85
7. Cameron, A.C., and D.L. Miller (2015): “A Practitioner’s Guide to Cluster-Robust
Inference,” Journal of Human Resources
8. Cantú, F., and O. García-Ponce (2015): “Partisan losers’ effects: Perceptions of
electoral integrity in Mexico,” Electoral Studies
9. Cho, W.K.T., and C.F. Manski (2008): “Cross-Level/Ecological Inference,” in:
J.M. Box-Steffensmeier, H.E. Brady, and D. Collier (eds.) The Oxford Handbook
of Political Methodology, Oxford University Press, Oxford, UK.
10. Coate, S. and M. Conlin (2004): “A Group Rule: Utilitarian Approach to Voter
Turnout: Theory and Evidence,” American Economic Review
11. Coate, S., M. Conlin, and A. Moro (2008): “The performance of pivotal-voter
models in small-scale elections: Evidence from Texas liquor referenda,” Journal
of Public Economics
12. Degan, A. and A. Merlo (2011): “A Structural Model of Turnout and Voting in
Multiple Elections,” Journal of the European Economic Association
13. DellaVigna, S., J.A. List, U. Malmendier, and G. Rao (2015): “Voting to Tell Oth-
ers,” working paper, UC Berkeley.
14. Dittmann, I., D. Kubler, E. Maug, and L. Mechtenberg (2014): “Why Votes Have
Value: Instrumental Voting with Overconfidence and Overestimation of Others’
Errors,” Games and Economic Behavior
15. Downs, A. (1957): An economic theory of democracy, Harper and Row, New York,
NY.
86
16. Driscoll, A., and M.J. Nelson (2014): “Ignorance or Opposition? Blank and
Spoiled Votes in Low-Information, Highly Politicized Environments,” Political
Research Quarterly
17. Duffy, J. and M. Tavits (2008): “Beliefs and Voting Decisions: A Test of the Pivotal
Voter Model,” American Journal of Political Science
18. Feddersen, T. and A. Sandorini (2006): “A Theory of Participation in Elections,”
American Economic Review
19. Fiorina, M. P. (1976): “The Voting Decision: Instrumental and Expressive As-
pects,” Journal of Politics
20. Glasgow, G., and R.M. Alvarez (2005): “Voting behavior and the electoral context
of government formation,” Electoral Studies
21. Greene, W.H. (2002): Econometric Analysis, Fifth Ed., Prentice Hall, Upper Saddle
River, NJ.
22. Harbaugh, W. T. (1996): “If People Vote Because They like to, then Why do so
Many of Them Lie?,” Public Choice
23. Hausman, J. (1996): “Valuation of New Goods Under Perfect and Imperfect
Competition,” in: T. Bresnahan and R. Gordon (eds.): The Economics of New Goods,
Studies in Income and Wealth Vol. 58, National Bureau of Economic Research,
Chicago, IL.
87
24. Herron, M. C. and J. S. Sekhon (2005): “Black Candidates and Black Voters: As-
sessing the Impact of Candidate Race on Uncounted Vote Rates,” Journal of Poli-
tics
25. McAllister, I. and T. Makkai (1993): “Institutions, Society or Protest? Explaining
Invalid Votes in Australian Elections,” Electoral Studies
26. McMillan, A. (2010): “The Election Commission,” in: N.G. Jayal and P.B. Mehta
(eds): The Oxford Companion to Politics in India, Oxford University Press, Oxford,
UK.
27. Nevo, A. (2000): “A Practitioner’s Guide to Estimation of Random Coefficients
Logit Models of Demand,” Journal of Economics and Management Strategy
28. Nevo, A. (2001): “Measuring Market Power in the Ready-to-Eat Cereal Indus-
try,” Econometrica
29. Ortoleva, P. and E. Snowberg (2015): “Overconfidence in Political Behavior,”
American Economic Review
30. Osborne, M., and A. Slivinski (1996): “A Model of Political Competition with
Citizen-Candidates,” Quarterly Journal of Economics
31. Petrin A. (2002): “Quantifying the Benefits of New Products: The Case of the
Minivan,” Journal of Political Economy
32. Power, T. J. and J. C. Garand (2007): “Determinants of invalid voting in Latin
America,” Electoral Studies
88
33. Razin, R. (2003): “Signaling and Election Motivations in a Voting Model with
Common Values and Responsive Candidates,” Econometrica
34. Rekkas, M. (2007): “The Impact of Campaign Spending on Votes in Multiparty
Elections,” Review of Economics and Statistics
35. Riker, H. W. and P. C. Ordeshook (1968): “A Theory of the Calculus of Voting,”
American Political Science Review
36. Selb, P., and S. Munzert (2013): “Voter overrepresentation, vote misreporting,
and turnout bias in postelection surveys,” Electoral Studies
37. Shachar, R., and B. Nalebuff (1999): “Follow the Leader: Theory and Evidence
on Political Participation,” American Economic Review
38. Shayo, M. and A. Harel (2012): “Non-consequentialist voting,” Journal of Eco-
nomic Behavior and Organization
39. Superti, C. (2015): “Vanguard of the Discontents: Blank and Null Voting as So-
phisticated Protest,” working paper, Harvard University.
40. Uggla, F. (2008): “Incompetence, Alienation, or Calculation? Explaining Levels
of Invalid Ballots and Extra-Parliamentary Votes,” Comparative Political Studies
41. Webb, M. (2013): “Reworking Wild Bootstrap Based Inference for Clustered Er-
rors,” working paper, University of Calgary.
89
2.9 Appendix
2.9.1 The correlates of NOTA votes
In this section we investigate the correlation between NOTA votes and constituency
characteristics. We use the same sample as in our structural exercise (the panel dataset
matched to the census data) and run simple cross-sectional regressions on the 520 con-
stituencies that are affected by the NOTA policy in 2013. We include state fixed effects
and, to avoid confounding our estimates by differential turnout across constituencies,
we measure NOTA vote shares as a fraction of total votes cast.29
The results are in Table 2.10. We find substantial heterogeneity in NOTA votes
across constituencies. For example, the NOTA vote share is significantly higher in re-
served constituencies and in constituencies with more illiterate voters, more women,
more ST, and a lower share of rural workers. Each of these patterns is consistent with
a variety of possible explanations. One possible interpretation is that NOTA votes are
higher in more economically disadvantaged populations, reflecting a general dissatis-
faction with elected leaders in these constituencies. Note however that the coefficients
remain unchanged if we add controls for various indicators of infrastructure and eco-
nomic activity in column (2). Another possible interpretation is that NOTA votes come
from politically underrepresented voters, such as women and non-SC or ST voters in
reserved constituencies.
In columns (3) and (4) we add candidate characteristics to the regression. These
estimates are merely suggestive because the list of candidates running will be affected
by their expected vote share, and this could be correlated with the fraction of voters
29Using NOTA votes as a share of eligible voters yields very similar results.
90
choosing NOTA. We find that constituencies with more candidates running result in
lower NOTA vote shares, which is consistent with NOTA reflecting dissatisfaction
with the menu of candidates being offered. We do not find evidence that the presence
of female, SC or ST candidates affects NOTA votes.
2.9.2 Robustness of the DD estimates
This section explores the robustness of the difference-in-difference estimates presented
in section 2.5.2 of the paper to various political events affecting one or more states.
National elections
In our study period, Indian national elections took place in Spring 2009 and 2014.
Recall that we do not include in the analysis the four states that hold their assembly
elections simultaneously with the national election. In the remaining states, because
split-ticket voting (constituencies voting for different parties at the state and national
levels) is common in India, it is ex ante not obvious that events affecting national
turnout, like the wave of support for the BJP in the 2014 national elections, would
affect assembly elections.30 If national elections did have an impact, we would expect
this to be the strongest for states that held assembly elections in October and December
of the national election years. If national elections impacted turnout in the assembly
elections in these states, this has the potential to confound our estimates of the NOTA
policy introduced between the two national election years in September 2013.
In Table 2.11 we exclude the national election years from the sample. Columns
(1) and (2) exclude 2014 and columns (3) and (4) exclude both 2009 and 2014. Odd30Note also that increased support for the BJP would lead to more BJP votes rather than NOTA votes.
91
numbered columns correspond to the specification in column (2) of Table 2.6 with the
basic controls and even numbered columns to column (3) with the extended controls.
We find that all point estimates are similar to, and if anything slightly larger than the
3 percentage points effect we found in Table 2.6. National elections do not appear to
confound the estimates reported in the main text.
Redistricting
Another potential confound is the electoral redistricting that took place in April 2008.
Because elections are held every 5 years and NOTA was introduced in September
2013, none of the states that were affected by NOTA in our period of study were re-
districted, while most states that were not affected by NOTA were redistricted. Thus,
redistricting has the potential to confound our estimates of NOTA.31
To control for this, we create a constituency-level measure of redistricting by us-
ing GIS boundary files to compare constituencies before and after the delimitation.
Our first measure calculates for each current constituency that was redistricted in
our study period the largest area that was part of a single constituency before the
redistricting. For example, a value of 0.8 for this “maximum overlap” measure indi-
cates that 80% of the current constituency’s area was part of a single constituency
pre-delimitation (while the remaining 20% was part of one or more different con-
stituencies). The higher the maximum overlap, the less a constituency was affected
by redistricting. Our second measure, rather than focus on the largest area of over-
lap, uses each overlapping area to create an index of “territorial fractionalization.” If
a constituency overlaps with n pre-delimitation constituencies with s1, ..., sn denoting
31For example, if redistricting lowered turnout, our estimate of NOTA’s effect of turnout would likelybe biased upward.
92
the share of its area falling in each of these, then the fractionalization index is 1−n∑i=1
s2i .
The larger this value, the more the current constituency was affected by redistricting.
Both of these measures are available for 22 states (constituency boundary files are not
available for the states of Assam, Manipur, and Nagaland).
Table 2.12 presents regressions corresponding to specifications (2) and (3) in Ta-
ble 2.6 controlling for these measures of redistricting. The first two columns repeat
columns (2) and (3) in Table 2.6 on the 22 states with available redistricting measures.
Columns (3) and (4) then add the maximum overlap measure and columns (5) and (6)
the territorial fractionalization index. As can be seen, adding either measure of redis-
tricting to the regressions causes very little change in the estimated effect of NOTA.
The estimates also retain their significance, except for column (6) where the standard
error increases just enough to yield a p-value of 0.101.32
State-specific events
Turning to state-specific events that may confound our estimates, we identified four
states where various events may plausibly affect 2013 or 2014 turnout relative to the
previous election (that is, turnout in the with-NOTA election relative to turnout in
the without-NOTA election). In Chhattisgarh, Maoist insurgents conducted terrorist
attacks in 2010 and May 2013, between the 2008 and 2013 elections in this state. In
Jammu & Kashmir, various incidents occurred between its 2008 and 2014 elections,
including a border skirmish in January 2013 between India and Pakistan described by
observers as one of the worst in 10 years. In Delhi, a new anti-corruption party, Aam
Aadmi entered politics in 2012, energized voters, and emerged as the second-largest
32The coefficients on the redistricting measures are never statistically significant.
93
party in the 2013 assembly election. Finally, Maharashtra held its 2009 election a year
after the 2008 terrorist attacks in Mumbai on several hotels and public buildings, and
security concerns may have depressed voter turnout there.
In Table 2.13, we repeat specifications (3) and (4) from Table 2.6 excluding each
of these states one at a time and then all four of them. The results corresponding
to the first specification are in column (1) and column (2) corresponds to the second
specification with the extended set of controls. All these coefficients are close to the
3 percentage point effect found in Table 2.6. The events in these four states do not
appear to drive the estimated effect of NOTA on turnout reported in the main text.
94
TABLE 2.1: Timeline of events in the study period
Year Month State assembly elections Other events2006 4 Assam
5 Kerala, Puducherry, Tamil Nadu,West Bengal
2007 2 Manipur, Punjab, Uttarakhand5 Uttar Pradesh6 Goa12 Gujarat, Himachal Pradesh
2008 2 Tripura3 Meghalaya, Nagaland4 Delimitation5 Karnataka11 Madhya Pradesh, NCT of Delhi12 Chhattisgarh, Jammu & Kashmir,
Mizoram, Rajasthan2009 4 Andhra Pradesh*, Arunachal Pradesh*, National elections
Odisha*, Sikkim*10 Haryana, Maharashtra12 Jharkhand
2010 10 Bihar*2011 4 Assam, Kerala, Puducherry, Tamil Nadu
5 West Bengal2012 1 Manipur, Punjab, Uttarakhand
3 Goa, Uttar Pradesh11 Himachal Pradesh12 Gujarat
2013 2 Meghalaya, Nagaland, Tripura5 Karnataka9 NOTA policy introduced11 Chhattisgarh, Madhya Pradesh12 Mizoram, NCT of Delhi, Rajasthan
2014 4 Andhra Pradesh*, Arunachal Pradesh*, National electionsOdisha*, Sikkim*
10 Haryana, Maharashtra12 Jammu & Kashmir, Jharkhand
Notes: * excluded from the dataset.
95
TABLE 2.2: Summary statistics of the electoral data at the constituencylevel
Variable Obs Mean Std. Dev. 10% 90%A. PanelNumber of candidates 1708 11.32 4.96 6 18Female candidates 1708 0.80 1.00 0 2Median candidate age 1708 44.07 5.45 38 51Eligible voters (1000) 1708 175.54 47.15 139.76 218.02Turnout 1708 0.71 0.09 0.58 0.82Winning vote share 1708 0.44 0.09 0.33 0.55NOTA votes / total votes 630 0.019 0.013 0.006 0.035NOTA votes / eligible voters 630 0.014 0.009 0.004 0.026Non-reserved constituencies:Number of SC candidates 1144 1.31 1.43 0 3Number of ST candidates 1144 0.48 1.04 0 2
B. Repeated cross sectionNumber of candidates 6685 10.54 5.35 5 17Female candidates 6685 0.73 0.97 0 2Median candidate age 6685 44.99 5.78 38 53Eligible voters (1000) 6685 180.75 88.23 41.20 292.90Turnout 6685 0.71 0.13 0.53 0.87Winning vote share 6685 0.45 0.10 0.32 0.56NOTA votes / total votes 1176 0.015 0.116 0.004 0.030NOTA votes / eligible voters 1176 0.010 0.009 0.003 0.022Non-reserved constituencies:Number of SC candidates 4842 1.24 1.56 0 3Number of ST candidates 4842 0.26 0.79 0 1Notes: The panel dataset contains the 2008 and 2013 state assembly elections in the states of Karnataka, NCTDelhi, Mizoram, Rajasthan, Madhya Pradesh, and Chhattisgarh. The repeated cross-section contains all assemblyelections between 2006 and 2014 in 25 states. Turnout is total votes divided by the number of eligible voters.Winning vote share is the winner’s share of all non-NOTA votes. Source: Election Commission of India.
96
TABLE 2.3: Candidate characteristics in the panel data
Variable All 2008 2013Age 44.42 43.98 44.87Female 6.87 6.87 6.88General caste 61.47 62.35 60.63SC 20.79 21.3 20.3ST 14.58 16.35 12.89Nationally recognized party 27.12 27.58 26.68State recognized party 3.89 4.9 2.93Other state-recognized party 8.05 6.9 9.15Unrecognized party 18.77 18.21 19.29Independent 39.01 42.41 35.76NOTA 3.16 - 6.18N 19957 9762 10195Notes: 2008 and 2013 state assembly elections in the states of Karnataka, NCTDelhi, Mizoram, Rajasthan, Madhya Pradesh, and Chhattisgarh. All numbersexcept Mean age are percentages. Source: Election Commission of India.
TABLE 2.4: Voter demographics at the state level (repeated cross-section)
Obs Mean Std. Dev. Min MaxLabor force participation 50 0.58 0.07 0.46 0.75Unemployment rate 50 0.02 0.03 0.00 0.15Real household earnings (Rp per week) 50 1708 665 864 4335Fraction illiterate 50 0.25 0.13 0.04 0.52Fraction primary school or less 50 0.23 0.09 0.13 0.53Female per 1000 male 50 987 72 790 1172Fraction urban 50 0.32 0.18 0.09 0.97State NDP growth rate 50 6.11 4.57 -5.38 24.31Notes: Source for all variables except NDP growth rate: National Sample Survey, rounds 62, 64, 66,68, 71. Individual surveys for respondents above 18 were aggregated to the state level. Householdearnings deflated to 2001 prices using the CPI from the Reserve Bank of India. Source for NDPgrowth rate: Reserve Bank of India.
97
TABLE 2.5: Demographic characteristics of the constituencies from theIndian Census (panel dataset)
Variable Mean Std. Dev. 10% 90%Literate 0.58 0.09 0.47 0.69Percent male 0.51 0.01 0.50 0.53Percent SC 0.18 0.07 0.08 0.26Percent ST 0.29 0.18 0.16 0.58Percent employed 0.46 0.05 0.40 0.52Percent rural workers 0.66 0.17 0.44 0.84Fraction of households with infrastructure:No latrine 0.71 0.22 0.37 0.91Water near premises 0.47 0.10 0.35 0.59Water on premises 0.26 0.16 0.10 0.47Fraction of households owning asset:Car 0.03 0.02 0.01 0.06Computer 0.01 0.01 0.002 0.02Phone 0.55 0.20 0.25 0.78TV 0.37 0.17 0.15 0.61Notes: Source: Indian Census, 2011. Village level census data was merged to assemblyconstituency GIS boundary files as described in the text. N = 723.
98
TABLE 2.6: The impact of NOTA on turnout, DD estimates
(1) (2) (3)NOTA 0.035 0.029** 0.030*s.e. (0.013) (0.016)bootstrap p-value [0.587] [0.036] [0.060]Eligible voters, labor force participation, x x xhh earnings, educationPolitical reservation, unemployment, sex xratio, urbanization, NDP growth rateConstituency FE xState FE x xR2 0.78 0.18 0.19N 1708 6685 6685States 6 25 25Notes: Estimates of the effect of the NOTA policy on turnout from Eqn. (2). All regressions control for time fixedeffects, the log number of eligible voters in a constituency and its square, and the following state-level variables:labor force participation, real weekly household earnings, fraction of illiterates, fraction with primary schoolor less as highest education. Column (3) also controls for reserved constituencies and the following state levelvariables: unemployment, sex ratio, fraction urban, and the growth rate of net domestic state product. Standarderrors clustered by state in parentheses. The bootstrap p-value was computed using a wild bootstrap procedurewith a 6-point weight distribution. ***, **, and * indicate significance at 1, 5, and 10 percent, respectively.
99
TABLE 2.7: Estimates of the linear parameters of the demand system
InstrumentalVariable Random
OLS Logit Logit Coefficients Full modelVariable (1) (2) (3) (4)Female -0.090** -0.153 -1.547* -3.725
(0.044) (0.268) (0.928) (2.992)Age 1.029*** 8.925*** 8.013*** 10.231*
(0.116) (1.423) (2.097) (5.767)SC candidate -0.370*** 0.273 0.377 0.120
(0.047) (0.223) (0.385) (1.266)ST candidate -0.043 0.082 0.143 2.002
(0.067) (0.148) (0.386) (2.878)NOTA 0.269*** -3.439*** -4.263*** -61.830***
(0.074) (0.076) (0.890) (11.052)Reserved SC 0.302*** -0.202 -0.258 -0.140
(0.048) (0.189) (0.360) (0.954)Reserved ST 0.290*** 0.351*** 0.734*** -1.923
(0.064) (0.115) (0.194) (2.688)Party: BJP 2.664*** -5.135*** -4.677*** -1.994
(0.042) (0.728) (0.990) (3.104)Party: INC 2.738*** -5.157*** -4.724*** -2.089
(0.041) (0.745) (1.017) (3.177)Party: BSP 0.325*** -7.039*** -6.802*** -4.348
(0.045) (0.644) (0.895) (2.933)Notes: The table reports estimates of the linear parameters of the model (β). All specifications include afull set of party dummies (three of which are reported in the table) as well as dummies for states, years,and constituency reservation status. Columns (2)-(4) include instrumental variables as described in thetext. Robust standard errors in parentheses. ***, **, and * indicates significance at 1, 5, and 10 percent,respectively.
100
TAB
LE
2.8:
Esti
mat
esof
the
nonl
inea
rpa
ram
eter
sof
the
full
mod
el
Stan
dard
Stan
dard
Dev
iati
ons
Dev
iati
ons
Inte
ract
ions
wit
hD
emog
raph
icV
aria
bles
Var
iabl
e(1
)(2
)Pe
rcen
tSC
Perc
entS
TPe
rcen
tmal
eLi
tera
cyR
ural
Fem
ale
-3.8
73**
-7.0
81*
--
6.21
4-
-(1
.513
)(3
.813
)(6
.498
)A
ge-0
.915
--
--
--
(2.4
53)
SCca
ndid
ate
0.00
3-
0.06
0-
--
-(1
.799
)(2
.188
)ST
cand
idat
e1.
951*
*4.
971*
**-
6.45
2*-
--
(0.8
59)
(1.7
08)
(3.5
56)
NO
TA1.
343
--
-43.
704
-42.
581
-45.
274*
-24.
270*
*(0
.839
)(4
3.32
2)(3
4.23
1)(2
4.27
6)(1
1.30
5)C
onst
ant
-0.0
60-
--
-13
.066
***
6.08
2***
(1.6
50)
(1.8
81)
(1.9
27)
Not
es:
The
tabl
ere
port
ses
tim
ates
ofth
eno
nlin
ear
para
met
ers
ofth
em
odel
(Πan
dΣ
).’S
tand
ard
devi
atio
ns’r
efer
toth
eΣ
para
met
ers
ofth
era
ndom
coef
ficie
nts.
The
spec
ifica
tion
inco
lum
n(1
)co
rres
pond
sto
colu
mn
(3)
inTa
ble
7an
dre
stri
cts
Π=
0.
The
rem
aini
ngco
lum
nsof
the
tabl
eco
rres
pond
toth
esp
ecifi
cati
onin
colu
mn
(4)i
nTa
ble
7.R
obus
tsta
ndar
der
rors
inpa
rent
hese
s.**
*,**
,and
*in
dica
tes
sign
ifica
nce
at1,
5,an
d10
perc
ent,
resp
ecti
vely
.
101
TABLE 2.9: Impact of NOTA on vote shares by party
Change with NOTAN. of Percent of
Party candidates all voters Full model LogitBJP 506 32.90 -0.17 -0.63BSP 498 3.63 -0.02 -0.09BYS 102 0.10 0.00 -0.01CSM 54 0.22 0.00 -0.01GGP 44 0.20 0.00 -0.01INC 518 26.74 -0.14 -0.58Independents 468 4.96 -0.03 -0.11JGP 85 0.06 0.00 0.00MNF 10 0.06 0.00 0.00NPEP 133 1.30 -0.01 -0.02SP 194 0.43 0.00 -0.01ZNP 11 0.02 0.00 0.00Small Party 445 2.70 -0.01 -0.05Abstention 25.11 -1.20 -0.76Notes: The table shows, for each party, the total number of candidates and the correspond-ing share of all voters in the data (out of 101.168 million eligible voters). The last twocolumns are the simulated effects of introducing NOTA in the full random coefficientsspecification as well as in the more restrictive Logit model.
103
TABLE 2.10: The correlates of NOTA votes
(1) (2) (3) (4)Constituency characteristics:Reserved SC 0.005*** 0.005*** 0.002** 0.002**
(0.001) (0.001) (0.001) (0.001)Reserved ST 0.011*** 0.011*** 0.009*** 0.009***
(0.002) (0.001) (0.002) (0.002)Literacy -0.021*** -0.035*** -0.015** -0.026**
(0.006) (0.010) (0.006) (0.010)Size -0.006* -0.008** -0.003 -0.005
(0.003) (0.004) (0.003) (0.004)Percent male -0.215*** -0.230*** -0.133*** -0.190***
(0.031) (0.046) (0.033) (0.043)Percent SC 0.002 0.010 -0.006 0.006
(0.008) (0.009) (0.007) (0.008)Percent ST 0.010*** 0.008* 0.010** 0.007
(0.004) (0.004) (0.004) (0.004)No latrine 0.002 0.004
(0.004) (0.004)Water nearby 0.016** 0.014**
(0.006) (0.006)Water at home 0.011* 0.013**
(0.006) (0.005)Percent employed 0.015 -0.004
(0.011) (0.011)Rural workers -0.019*** -0.015***
(0.005) (0.005)Car ownership 0.023 0.020
(0.037) (0.034)Computer ownership -0.029 0.036
(0.057) (0.052)Phone ownership -0.009 -0.010*
(0.006) (0.005)TV ownership -0.003 -0.007
(0.006) (0.006)Candidate characteristics:Number of candidates -0.001*** -0.001***
(0.000) (0.000)No female -0.001 -0.001
(0.001) (0.001)<15% female -0.000 -0.000
(0.001) (0.001)Median age -0.000 -0.000
(0.000) (0.000)No SC 0.000 -0.000
(0.001) (0.001)<15% SC -0.000 0.000
(0.001) (0.001)No ST -0.000 -0.000
(0.001) (0.001)<10% ST 0.002 0.001
(0.001) (0.002)R2 0.57 0.60 0.63 0.66N 520 520 520 520Notes: The dependent variable is the share of NOTA votes among all votes. Regressions at the con-stituency level for the cross-section of constituencies affected by the NOTA policy in 2013. Mergeddataset: average demographic characteristics are from the census, average candidate characteris-tics are from the Election Commission. All regressions include state fixed effects. Robust standarderrors in parentheses. ***, **, and * indicates significance at 1, 5, and 10 percent, respectively.
104
TABLE 2.11: Effect of NOTA on turnout, excluding national election years
(1) (2) (3) (4)NOTA 0.033** 0.033** 0.030* 0.031*
(0.015) (0.016) (0.015) (0.015)Basic controls x xExtended controls x xExcluded years 2014 2014 2009, 2014 2009, 2014R2 0.18 0.20 0.19 0.21N 6139 6139 5680 5680States 25 25 22 22Notes: Estimates of the effect of the NOTA policy on turnout from Eqn. (2) using the repeatedcross section sample with specific years excluded. All regressions control for state and yearfixed effects, the log number of eligible voters in a constituency and its square, and the follow-ing state-level variables: labor force participation, real weekly household earnings, fractionof illiterates, fraction with primary school or less as highest education. The Extended controlsspecification also controls for reserved constituencies and the following state level variables:unemployment, sex ratio, fraction urban, and the growth rate of net domestic state product.Standard errors clustered by state in parentheses. ***, **, and * indicate significance at 1, 5,and 10 percent, respectively.
TABLE 2.12: Effect of NOTA on turnout, controlling for redistricting
(1) (2) (3) (4) (5) (6)NOTA 0.033** 0.022* 0.031** 0.020* 0.030** 0.020
(0.016) (0.011) (0.015) (0.011) (0.014) (0.012)Basic controls x x xExtended controls x x xR2 0.21 0.23 0.21 0.23 0.21 0.23N 6084 6084 6084 6084 6084 6084States 22 22 22 22 22 22Notes: Estimates of the effect of the NOTA policy on turnout from Eqn. (2) using the repeated cross section sam-ple. Columns (1) and (2) are run on the states with available constituency boundary files. Columns (3) and (4)control for redistricting using the maximum overlap measure and columns (5) and (6) using the territorial frac-tionalization index. All regressions control for state and year fixed effects, the log number of eligible voters in aconstituency and its square, and the following state-level variables: labor force participation, real weekly house-hold earnings, fraction of illiterates, fraction with primary school or less as highest education. Even-numberedcolumns also control for reserved constituencies and the following state level variables: unemployment, sex ratio,fraction urban, and the growth rate of net domestic state product. Standard errors clustered by state in parenthe-ses. ***, **, and * indicate significance at 1, 5, and 10 percent, respectively.
105
TABLE 2.13: Effect of NOTA on turnout, robustness to state-specificevents
Excluded state Effect of NOTA NBasic controls Extended controls
Chhattisgarh 0.025* 0.035 6505(0.014) (0.021)
Maharashtra 0.031** 0.031* 6109(0.015) (0.015)
Delhi 0.029** 0.031* 6545(0.013) (0.016)
Jammu and Kashmir 0.030** 0.031* 6511(0.014) (0.016)
All four 0.028* 0.039* 5615(0.015) (0.019)
Notes: Estimates of the effect of the NOTA policy on turnout from Eqn. (2) using the repeated crosssection sample with specific states excluded. All regressions control for state and year fixed effects, thelog number of eligible voters in a constituency and its square, and the following state-level variables: laborforce participation, real weekly household earnings, fraction of illiterates, fraction with primary school orless as highest education. The Extended controls specification also controls for reserved constituenciesand the following state level variables: unemployment, sex ratio, fraction urban, and the growth rateof net domestic state product. Standard errors clustered by state in parentheses. ***, **, and * indicatesignificance at 1, 5, and 10 percent, respectively.
106
FIGURE 2.2: Distribution of NOTA vote shares across constituencies
Notes: NOTA vote share is measured as a fraction of total votes cast. N = 1176.
107
FIGURE 2.3: Impact of NOTA on turnout
Notes: Distribution of the changes in turnout across constituencies. Mean = 0.0119, median = 0.0100, N= 519.
108
FIGURE 2.4: Impact of NOTA on candidates’ vote shares
Notes: Distribution of the changes in vote shares across candidates. Mean = -0.0006, median = -0.0000,N = 3068.
109
110
Chapter 3
Firm Ownership and Wage Dispersion:
Evidence from Ghana Using Matched
Employer Employee Data
3.1 Introduction
The role of the firm in worker wage dispersion has been little studied empirically.
It is however considered as a very important determinant of wage inequality. The
environment in which a worker is working may have direct and indirect consequences
on his eventual earnings through measured or unmeasured channels. The idea of
unmeasured worker ability on the one hand and pure industry or firm effects on the
other hand are believed to be important factors affecting observed wage dispersion.
Existing empirical work has shown that neither of these explanations can be re-
jected but seperately identifying these effects using longitudinal data is difficult. As
Hamermesh (2008) points out, the effect would be overstated in favor of the group
on which more information is available. So, if information on workers is richer than
that on firms, the estimated effect of workers would look a lot bigger than that of
firms and vice-versa. This is more of a concern if we believe that the worker and
firm characteristics are correlated. Hamermesh notes that it is much better if we have
a matched employer-employee dataset. In this paper, I use matched data from the
Ghanaian manufacturing industry to analyze the role of firms in observed wage dis-
persion among workers. The empirical methodology of this paper closely follows
Barth et al (2014).
Barth et al (2014) is, to the best of my knowledge, the only paper that studies the
role of establishments in explaining wage inequality using US data. I apply their
methodology to study the same question in a developing country setting. Addition-
ally, I look at how patterns of inequality vary by firm ownership type. Specifically,
are private firms owned domestically different from foreign owned firms? Are the
firm effects explaining wage dispersion different based on ownership structure? I also
analyze how trends in earnings dispersion differ by worker skills.
In the Ghanaian manufacturing sector, we see majority of the manufacturing firms
in the private sector. 1This is largely due to the extensive economic reforms that have
been implemented since the mid to late 1980s. This paper also contributes to the litera-
ture in terms of analyzing earnings in the post-reform era in Ghana. I find an unusual
and very interesting pattern of wage inequality in the Ghanaian manufacturing sector
in this period. In the early period, as the reforms were perhaps sinking in, the vari-
ance in worker earnings increased but towards the end of the 1990s, by the time the
reforms had potentially taken full effect, the variance in earnings actually decreased.
However, while this pattern is true for domestic private firms in Ghana, firms with
1This is true in terms of the dataset I use. Census figures may be different nonetheless.
111
other types of ownership did not witness this decline in the later periods and they ex-
perience a secular increase in variance of earnings. Much of this paper is focussed on
analyzing this interesting pattern and estimating what role the firm component might
have played in this.
Analyzing the variance components, I find that this pattern is explained mainly
by fluctuations in the between-firm variation. The variation within-firms has been
steadily increasing. The initial rise in variance and the later fall seems to be due to
variation between firms rather than within firms. It is important to understand if this
pattern can be explained by gains and losses made by the workers at the top of the
earnings distribution. Interestingly, I find that while net gains to workers above the
90th percentile can explain about 61% of the initial surge in variance, it only explains
about 9% of the later fall. This suggests that workers below the 90th percentile must
have made significant gains in the later stages which resulted in closing down of the
earnings gap. This can possibly be attributed to economic reforms, though the empir-
ical analysis of this paper cannot identify if this is a pure causal effect or not.
I further analyze the interaction of worker and firm characteristics in a regression
setting and find that the variance in workers’ predicted wage from observable charac-
teristics have increased over time whereas the variance of the firms’ effect on wages
have declined over time. This pattern is observed for the average firm and on av-
erage for firms owned privately by domestic Ghanaians. For foreign owned firms
however, this is not replicated. The data suggests that the inverted u-shaped pattern
of variances of log real hourly earnings for private firms can mostly be explained by a
decline in the firm specific effect at the latter stages. The correlation between the con-
tribution of individual attributes to wages and the firm effect have declined over time.
112
This suggests that firms have gradually hired workers by observable characteristics,
independent of firm specific earnings, or in other words, their preference towards hir-
ing similar workers within a firm has decreased. I find that for high skilled workers,
inequality increases over time whereas it declines for low skilled workers.
The rest of the paper is organized as follows. Section 2 gives some background in-
formation. The empirical methodology is introduced in Section 3 and Data in Section
4. The main results are in Section 5. In Section 6, I briefly discuss how the estimated
effects are different over time periods and present conclusions in Section 7.
3.2 Background
3.2.1 The Ghanaian Manufacturing Sector
The Ghanaian economy went through economic reforms from 1983-1991. The manu-
facturing sector was believed to be the engine through which employment generation
and long term economic growth would be attained. (See Teal 1998). An immediate
consequence of providing impetus to the manufacturing sector is that demand for
skilled labor would rise. This was no different for Ghana particularly because the two
most flourishing manufacturing sectors in Ghana were wood products and textiles.
Detailed review of the impact of reforms on the Ghanaian economy can be found in
Baah-Nuakoh and Teal (1993). The manufacturing sector has contributed consistently
113
to around 8-10% of the GDP of Ghana ever since the reforms. Even though by stan-
dards expected by the reform, this may look like a small number but the sector em-
ploys over 250,000 people. 2 Apart from wood and textiles, food processing, smelting,
oil refining, cement and pharmaceuticals are other important sectors in manufactur-
ing in Ghana. A rising concern for the sector is the influx of cheap goods from other
economies like China which have massive comparative advantages in manufacturing.
This is not just a concern for the Ghanaian economy but most of the western African
nations. A recent study by Szabo (2014) points out the inefficiencies in the manufac-
turing sector of Ghana. Szabo estimates production functions for the manufacturing
firms and finds that firms mostly use more capital and less labor than the optimal
amount.
In the wake of this situation, it is worthwhile to look at the wage patterns in the
Ghanaian manufacturing sector which has been the center of attention for Ghanaian
policymakers. It remains indeterminate whether the sector has given the economy
the necessary fillip towards economic growth but we can see how much wages have
changed over time in this sector post the reforms and whether the pre dominance
of private owenrship has had significant effects on worker wages or not. Existing
research has pointed out that there is a link between economic growth and wage dis-
persion in African economies (see Agesa et al, 2011 for details in the context of Kenya).
2See Natcomm Report on Ghanaian Manufacturing Sector -http://www.natcomreport.com/ghana/livre/manufacturing.pdf
114
3.2.2 Related Literature and Conceptual Framework
The study of wage dispersion using matched employer employee datasets has gained
popularity in the last decade and a half. Abowd, Kramarz and Margolis (1999) in
their seminal paper introduced the use of a linked employer employee data set to run
a additively separable fixed effects framework using French data to disentangle the
true effect of a firm from that of the worker.
Fafchamps, Soderbom and Benhassine (2008) studied the African manufacturing
sector and analyzed the role of sorting in explaining wage gaps. Not much is known
however in terms of the role of firms in explaining dispersion in such poor countries.
This paper is an attempt to fill this void in the literature. To my knowledge, this is the
first paper to use a matched employer employee dataset to look at wage dispersion by
ownership pattern for a developing country. In particular the idea is to see if private
domestic ownership of firms leads to higher or lower wage dispersion as compared to
other types of ownership like state-owned, foreign or mixed.
The effect of unobserved worker ability and firm components on wage dispersion
maybe very different for firms owned and managed differently. Especially after eco-
nomic reforms, one might expect more FDI to flow in and more foreign firms being
setup. A body of literature analyzes the impact of FDI inflows on wage dispersion
(Saglam and Sayek, 2011; Heyman, Sjoholm and Tingvall, 2011; Almeida, 2007 and
Martins, 2004). The findings generally indicate that foreign firms lead to higher wage
dispersion and also pay higher actual wages. One explanation is that foreign owned
firms hire skilled workers and that leads to dispersion. Also, lower skilled workers
sort to other firms and hence inter- as well as intra-firm dispersion rises.
It is therefore important to understand if firm behavior actually explains more of
115
the dispersion in wages when owned by foreign enterprises. The following empiri-
cal exercise is an attempt to answer this question. If foreign ownership indeed leads
to higher dispersion, then the firm component should explain more of the wage dis-
persion than worker components. Also, if domestic firms are selected by low skilled
workers, then worker components should explain dispersion more than firm effects
for such firms.
3.3 Empirical Methodology
3.3.1 Analyis of Variance
The empirical strategy I adopt is largely descriptive but provides valuable insight. I do
a variance decomposition of log real hourly earnings into within-firm and between-
firm elements. The matched component of my dataset facilitates this exercise. I follow
the method of Barth et al (2014) throughout this paper and analyze the variance com-
ponents as follows:
V ar(lnwif ) = V ar(Elnwif ) + V ar(lnwif − Elnwif ) (3.1)
where lnwif is the log real hourly earnings (before taxes) for individual ‘i’ working at
firm ‘f’. Elnwif represents the expected value of earnings of workers working at firm
‘f’. V ar(Elnwif ) is the between-firm variance and V ar(lnwif − Elnwif ) is the within
firm variance.
116
3.3.2 Percentile Analysis: Contribution of Earnings Gap to Variance
Barth et al (2014) also propose an arithmetic decomposition of the variance in log real
earnings to calculate the contribution of the difference in mean earnings for various
groups of individuals on the actual variance based on the percentile distribution of
earnings. Suppose we want to compare individuals above the p-th percentile to those
below the p-th percentile. Then if E(p) is the mean of log real hourly earnings of the
top p% and E(100−p) is the mean log real hourly earnings for the remaining workers,
then the variance can be decomposed as follows:
V ar(lnw) =p
100·100− p
100·[E(p)−E(100−p)]2+
p
100·V ar(p)+100− p
100·V ar(100−p) (3.2)
For this paper, I will focus on the earnings gap at the 90th percentile. So the above
reduces to:
V ar(lnw) = (0.1) · (0.9) · [E(10)− E(90)]2 + (0.1) · V ar(10) + (0.9) · V ar(90) (3.3)
The contribution of the earnings gap to variances is therefore given by the term
(0.1) · (0.9) · [E(10)− E(90)]2.
3.3.3 Worker Characteristics and Firm Fixed Effects
The third and final strategy of this paper is to run a modified Mincerian regression for
time periods ‘t’ as follows:
117
lnwif = β.Xif + δf + uif (3.4)
The vectorXif includes worker characteristics like hours worked, age, age squared,
experience, experience squared, education, education squared, parents’ experience
and parents’ experience squared for worker i in firm f . Using the methodology de-
scribed in Barth et al (2014), I decompose the variance of (4) as follows:
V ar(lnw) = V ar(κ) + V ar(δ) + 2cov(κ, δ) + V ar(u) (3.5)
In the above, κ = X · β and represents the predicted earnings from observable
worker characteristics. The term V ar(δ) represents the variance of earnings among
firms or the variance of the firm fixed effects. Another statistic of interest is the degree
of how similar workers are within a given firm, ρδ = cov(κ,δ)V ar(κ)
. Barth et al (2014) note
that if ρδ is close to zero, firms are independently hiring workers based on observed
characteristics, independent of firm earnings.
3.4 Data
The dataset for this study is a panel survey of the Ghanaian manufacturing sector
which took place from 1992 to 2003. The survey was the joint venture of Center
for Study of African Economies (CSAE), the University of Oxford, the University of
Ghana, Legon and the Ghana Statistical Office for the Regional Project for Enterprise
Development (RPED) and Ghana Manufacturing Enterprise Survey. The original sam-
ple of firms was drawn randomly from the 1987 census of manufacturing activities of
118
Ghana. From each of the firms a sample of workers were also interviewed in this pe-
riod. This allows matching the two databases to form a linked employer-employee
dataset. 3
Figure 3.1 shows the shares of firms by ownership categories from our sample. It
appears that the majority of firms are privately owned but some have foreign owner-
ship. For the eventual empirical analysis, I am going to break down the sample into
firms owned and managed by private Ghanaian agents, viz, Private Domestic Firms
and firms with some foreign ownership. The latter would include firms partially or
wholly owned by foreign owners. The remainder of the firms are state owned. Table
3.1 shows some descriptive statistics from plant level data. Column (1) reports means
for all firms whereas columns (2) and (3) report means for the other two categories.
All the means are averages across 12 years of data. Figure 3.2 gives the time series
plots of plant level averages of some of these observables by ownership types.
The average firm has about 75 workers whereas this number is much smaller for
private domestic firms (34) and much higher for foreign owned firms (194). The num-
ber of skilled workers as identified by the firms in their responses to the survey ques-
tion is 14 on average for the whole sample whereas for private domestic it is much
smaller at around 7 skilled workers per firm. The average education of workers is
about 10 years across firm types. Foreign owned firms seem to have workers with
higher average age and potential experience compared to the overall average. Most
foreign owned firms are located in the capital city of Accra. I also find that foreign
firms are more likely to export goods compared to private domestic firms. Value of
3The survey yields a dataset which is a balanced panel of 200 firms but a repeated cross section ofworkers. It was conducted over 7 rounds. The first three rounds were annual surveys, the next threecovered two years each and the last covered three years.
119
total output is higher for foreign firms throughout the span of the data.
There is not much difference in the hours worked by the average worker in these
firms although it is slightly less for foreign owned firms. Foreign owned firms also
have higher wage bills. In terms of input usage and capital, the data suggests that on
average 22% of the inputs are imported. This number is somewhat smaller at under
20% for private domestic firms but significantly higher at 33% for foreign owned firms.
In terms of replacement value of plant and machinery, the mean for foreign firms is
above the average and that for private domestic is below the average for all firms.
3.5 Results
3.5.1 Within-Firm and Between-Firm Effects
Figure 3.3 is indicative of how log real hourly earnings varied in Ghana over the years
(1992-2003). A striking feature of this graph is that the variance of earnings seem to re-
semble an inverted u-shape for Ghanaian manufacturing firms. The variances seem to
have increased from 1992 to around 1998 after which the variances show a decline in
trend. This is intriguing because unlike secular increases in earnings inequality which
are common in the developed world, including the US, here I observe a decline after
an initial surge. . One possible reason for this pattern in the Ghanaian manufacturing
sector is the reforms that happened extensively in the late 80s and continued into the
early 90s. It is plausible that the effects of the reforms on the manufacturing sector,
especially for the domestic firms started accruing around the mid nineties. This might
be the reason that earnings inequality appears to have been checked since 1998 and
into the 2000s. A feature of the Ghanaian manufacturing sector was overstaffing. It is
120
believed that post-reform this has reduced a lot.4 This may have also contributed to
paying workers more efficiently and a resultant decline in earnings inequality. In Fig-
ure 3.3, I look at composition of this variance changes for different ownership patterns.
The Ghanaian manufacturing sector is dominated by private firms. So, I restrict my
analysis only to such firms and find that the inverted u-shaped pattern is replicated
only for domestic firms and not for the foreign owned firms.
The rest of this paper delves deeper into this phenomenon and attempts to analyze
the components underlying such patterns of dispersion. The preceeding discussion
first leads us into the following question: are these patterns due to within-firm or
between-firm variation in worker earnings and how do changes in these components
reconcile the overall trends in dispersion? To answer this question, I decompose the
variances following equation 3.1. The results are reported in Table 3.2. I report the total
variance of log real hourly earnings along with the within and between components
for 3 different years,viz, 1992, 1998 and 2003 to illustrate how the change in overall
variances compare to the change in the within and between components. I find that
for all firms, the within-firm variation has increased over time from 0.444 in 1992 to
0.514 in 1998 and 0.748 in 2003. However, the inverted u-shape of the total variance
curve can be explained mainly by the between-firm component of earnings. I find that
in the early period, the between firm component rises from 0.661 to 0.862 but it falls
to 0.319 in the latter half. This suggests that the between firm component seems to
explain the pattern that we observe in the dispersion of earnings.
Looking at changes over time, I find that the total variance increased in the early
period and declined in the latter by a larger magnitude. The change in within variation
4See Ackah, Adjasi and Turkson - http : //www.brookings.edu/ /media/Research/F iles/Papers/2014/11/learning−to− compete/L2CWP18Ackah−Adjasi− and− Turkson.pdf?la = en
121
increased. So what drives this decline is likely to be the change in between variation.
I find that the change in between-firm variance declined from 0.201 to -0.541.
In Table 3.3, I analyze the overall variances in earnings by firm type. There are
some interesting patterns to note. For private domestic firms, variance seems to be
initially flat but declines in the latter period whereas for private foreign firms it seems
to increase in the early period and remain stable in the latter period. If an influx of
foreign investment disproprotionately attracts skilled labor, this may lead to rising
inequality in the other sectior, viz, domestic in our case.5
3.5.2 Percentile Analysis
Given that we observe the inverted u-shaped pattern for variances in log real hourly
earnings, the question of gainers versus losers assumes importance. The purpose of
this exercise is to estimate whether gains to the workers at the top of the earnigns
distribution has contributed to the inequality? For our unique case, this question can
be reframed to accomodate the decline in inequality at the latter stages to see if top
earners have lost out on the gains they made or have the rest of the workers caught
up. I perform the empirical analysis described in the empirical methodology section
above and focus on the decomposition as per equation 3.3.
Table 3.4 presents the relevant results. I break up the analysis into two periods
of time. The surge phase (1992-1998) and the decline phase (1998-2003). In Panel A,
I analyze the surge phase. Column 2 reports statistics from 1992, column 3 reports
5 To investigate this further, I looked at the variances by worker skill and found that the inequalityfor low skilled workers has reduced over time consistently. This is indicative of the fact that as skilledworkers sort into foreign firms, the unskilled (or low skilled) workers end up in domestic firms anddomestic firms have a lower skilled pool to choose from. This may lead to a decline in dispersion forlow skilled workers.
122
statistics from 1998 and column 4 reports the changes from 1992-1998. I find that the
variance of log real hourly earnings increased by 0.271 log points from 1.105 in 1992
to 1.376 in 1998. I calculate the means of log real hourly earnings for two sets of
workers, viz, above the 90th percentile and below the 90th percentile. I find that in
1992, the difference in these means was 1.649 and it increased to 2.138 in 1998. Using
the expression in equation 3.3, I calculate the contribution of these mean differences
to the variance in earnings.
In 1992, the difference in mean log real hourly earnings for workers above and
below the 90th percentile contributed 0.245 log points of the total variation of 1.105.
This number increased to 0.411 in 1998. The increased contribution of these differences
was 0.166 over the surge period which accounts for 61.25% of the total increase in
variance during this time period. This suggests that the increased advantage of the
top earners over the rest had a major contribution to the surge in wage dispersion in
Ghana during 1992-1998.
I replicate the above analysis for the decline phase and report the statistics in Panel
B. Variance in log real hourly earnings declined from 1.376 in 1998 to 1.067 in 2003,
a change of 0.309 log points. The difference in means of log real hourly earnings for
workers above and below the 90th percentile decreased from 2.138 in 1998 to 2.062
in 2003. The contribution to the overall variance changed by 0.028 which is approx-
imately 9.06% of the total decline in variance. So the relative advantage of the top
earners was not entirely wiped out and was not the major reason why the surge in
inequality reversed. This suggests that workers in the rest of the distribtuion of earn-
ings made significant gains in the latter phases which could probably be attributed to
the sinking in of economic reforms.
123
3.5.3 Returns to Schooling and Earnings Dispersion
I run simple Mincerian regressions as described by equation 3.4 with X denoting the
observables like years of schooling, experience and experience squared only. I use
firm fixed effects and run these regressions for two periods of time, 1992-1998 and
1998-2003. OLS estimates of the coefficients are reported in Table 3.5. I find that the
returns to schooling increased over time whereas the marginal effect of experience
seems to have declined over time. Variance of the predicted earnings from education
and experience has increased from 0.109 to 0.133. This is not consistent with the fall in
variance of earnings over time suggesting that the decline in inequality is explained
more by firm factors and residuals and not necessarily by a decline in inequality of
returns to schooling and experience.
3.5.4 Does Predicted Wage Dispersion by Worker and Firm Charac-
teristics Vary by Ownership Structure?
This is the final section of the empirical analysis and I run the regression described
by equation 3.4 using all observable worker characterisitcs as controls. Since I have
identified two distinctive time periods in the data for studying variance in earnings,
I breakup the analysis in this segment by those time periods. So I run equation 3.4
for the surge period and the decline period seperately by pooling together data from
the years in those periods. Results are reported in Table ??. For the average firm,
variance in log real hourly earnings in the surge period was 1.460 and it decreased to
1.318 in the decline period. I decompose the variances following equation 3.5. I find
that the variance in predicted wage due to observed worker characteristics increased
124
from 0.235 log points to 0.331 log points whereas the variance from firm level earn-
ings decreased from 0.595 log points to 0.385 log points. This is consistent with the
idea that firm level characterstics seem to have contribued to the decline in wage in-
equality in Ghana from 1998-2003 rather than worker characteristics. Earlier, I found
that between firm variation seems to explain the inverted u-shaped pattern better than
the within-firm variation and the findings in this section corroborate those. I find that
the variance in the residuals have gone down over time. The sorting factor between
worker-firm, ρδ also decreased over time. This suggests that with time firms have
employed workers based on observed characteristics and independent of firm level
earnings. This also suggests the possibility that positive sorting was on the decline
where higher pay workers now joined lower average earnings firms and that helped
bridge the earnings gap and led to a reduction in inequality.
I am interested to know if the patterns for the average firm is universal among
ownership types or is it specifically driven by a particular organizational structure.
I split the above analysis to seperately study private domestic firms and private for-
eign firms. I find that the average effect is replicated for private domestic firms. The
variance of κ increased from 0.308 to 0.400 whereas the variance of the firm earnings
decreased from 0.387 to 0.232. The sorting factor also decreased. However, for firms
that are partially or wholly owned by foreign private enterprenerus, I find the oppo-
site. Recall that for these firms, we do not observe the inverted u-shaped pattern in
3.3 and there is a secular increase in inequality for these firms. The variance of κ in-
creased from 0.139 to 0.214 and variance of δ increased from 0.134 to 0.441 for these
firms. This suggests that the ownership structure seems to be an important element
that characterizes the unique wage variance structure in Ghana.
125
3.6 Conclusions
This paper uses a matched employer-employee dataset from the Ghanaian manufac-
turing sector to analyze earnings dispersion in Ghana from 1992-2003. The plot of
variances in log real hourly earnings over time indicate two clear periods of surge and
decline in inequality in Ghana. I analyze the variances in earnings from 1992-1998
when inequality increased and find that within and between firm variation increased
during this period. From 1998 to 2003, the between firm variation declined whereas
the within-variation still increased. This suggests that the fall in inequality in the latter
stages was driven by between-firm effects.
I find that the earnings gap between the workers above and below the 90th per-
centile explains about 61% of the rise in inequality from 1992-1998 but it only explains
about 9% of the fall in inequality from 1998-2003. This is suggestive evidence that
top earners did not necessarily lose out their advantage in the latter stages but other
workers made significant gains to wipe out this gap.
Variance in the predicted wages from worker characteristics seem to have risen
consistently whereas variance from firm level earnings have decline in the later pe-
riod. This correlation between the predicted wage by worker characteristics and the
firm earnings have also declined. This leads us to believe that changing patterns of
sorting could be a possible reason for the decline in earnings inequality in Ghana.
Firms appear to have independently hired workers based on their observed traits and
even firms with low average earnings attracted hire pay workers. This meant that
inequality would be on the decline. These patterns are however only observed in pri-
vate domestic firms. Foreign owned firms do not seem to have such effects on earnings
variance.
126
3.7 References
1. Abowd, John M., Robert Creecy (2002). & Francis Kramarz. “Computing Person
and Firm Effects Using Linked Longitudinal Employer-Employee Data.” Cornell
University Working Paper
2. Abowd, John M., Francis Kramarz. & David Margolis (1999). “High Wage Work-
ers and High Wage Firms” Econometrica 67
3. Agesa, Richard U., Jacqueline Agesa (2011). & Andrew Dabalen. “Changes in
Wages, Wage Inequality and the Return to Human Capital Skills in Kenya: 1977-
2005.” Review of Development Economics 15
4. Ahiakpor, Ferdinand. & Raymond Swaray (2015). “Parental expectations and
school enrolment decisions: Evidence from rural Ghana.” Review of Development
Economics 19
5. Almeida, Rita (2007). “The labor market effects of foreign owned firms.” Journal
of International Economics 72
6. Baah-Nuakoh, Amoah. & Francis Teal (1993). “Economic Reform and the Man-
ufacturing Sector in Ghana.” RPED Country Studies Series
7. Barth, Erling., Alex Bryson., James C Davis., & Richard Freeman (2016). “It’s
Where You Work: Increases in Earnings Dispersion across Establishments and
Individuals in the United States.” Journal of Labor Economics
8. Card, David., Ana Rute Cardoso. & Patrick Kline (2013). “Bargaining and the
Gender Wage Gap: A Direct Assessment.” IZA DP 7592
127
9. Card, David., Jorg Heining. & Patrick Kline (2013). “Workplace Heterogeneity
and the Rise of West German Wage Inequality.” Quarterly Journal of Economics
128
10. Data Source Acknowledgement: Regional Project on Enterprise Development &
Ghana Manufacturing Enterprise Survey, Rounds I to VII (12 years: 1992-2003)
11. Fafchamps, Marcel., Mans Soderbom. & Najy Benhassine (2009). “Wage Gaps
and Job Sorting in African Manufacturing.” Journal of African Economics 18
12. Hamermesh, Daniel (2008). “Fun with matched firm-employee Data: Progress
and Road maps.” Labour Economics 15
13. Heyman, Fredrik., Fredrik Sjoholm. & Patrik Gustavsson Tingval (2011). “Multi-
nationals, cross-border acquisitions and wage dispersion.” Canadian Journal of
Economics
14. Martins, Pedro S. “Do Foreign Firms Really Pay Higher Wages? Evidence from
Different Estimators.” IZA DP No. 1388 (2004).
15. Saglam, Bahar Bayraktar & Selin Sayek (2011). “MNEs and wages: The role of
productivity spillovers and imperfect labor markets.” Economic Modelling 28
16. Szabo, Andrea (2014). “Measuring Firm-level Inefficiencies in the Ghanaian
Manufacturing Sector.” Working Paper, University of Houston
17. Teal, Francis (1998). “The Ghanaian Manufacturing Sector 1991-95: Firm Growth,
Productivity and Convergence.” Working Paper, CSAE, University of Oxford
128
TABLE 3.1: Descriptive Statistics
All Firms Private Domestic Any Foreign(1) (2) (3)
Number of Workers 74.50 34.53 194.41
Number of Skilled Workers 13.69 6.85 35.01
Years of Education of Average Worker 10.02 9.65 10.99
Potential Experience of Average Worker 15.26 6.85 35.01
Age of Average Worker (in years) 32.60 30.11 39.32
Weekly Hours Worked by Average Worker 45.90 46.87 43.56
Located in Accra (=1 if Yes) 0.59 0.53 0.70
Log Annual Wage Bills 16.58 15.77 18.68
Log Real Output 17.36 16.69 19.41
Exports(=1 if Yes) 0.33 0.29 0.46
Log Replacement Value of Plants and Machinery 17.67 16.65 20.80
Percentage of Inputs Imported 22.03 19.13 32.56
Observations (Firm X Years) 2400 1464 804
Notes: Data comes from the CSAE-RPED database as discussed in the data section. Any Foreign cor-responds to firms partially or wholly owned by foreigners. Private domestic refers to firms owned andmanaged by private Ghanaian residents entrepreneurs. Total observations are 2400, for private domes-tic it is 1464 and for foreign owned it is 804. Remaining 132 are observations for state firmsAll figuresare means computed from plant level database of 200 firms overall for 12 years. .
130
TABLE 3.2: Analysis of Variances of Log Real Hourly Earnings of Workers
1992 1998 2003(1) (2) (3)
All Firms
Total Variance 1.105 1.376 1.067
Within 0.444 0.514 0.748Between 0.661 0.862 0.319
Decomposing the Change over time 1992-1998 1998-2003
Total Change 0.271 -0.305
Within Change 0.070 0.234Between Change 0.201 -0.541
Data from CSAE-RPED on the Ghanaian Manufacturing Sector from 1992-2003. I matcched the workerand firm files to create the linked dataset based on which I perform the analysis. The arithmetic calcu-lations are based on the method described in Section 3.1 in the text.
TABLE 3.3: Variances of Log Real Hourly Earnings of Workers by Own-ership Type
1992 1998 2003(1) (2) (3)
Total Variance 1.105 1.376 1.067
Firms with Full Domestic Ownership 1.321 1.371 1.151
Firms with Some Foreign Ownership 0.311 0.671 0.834
Data from CSAE-RPED on the Ghanaian Manufacturing Sector from 1992-2003. I matcched the workerand firm files to create the linked dataset based on which I perform the analysis. The arithmetic calcu-lations are based on the method described in Section 3.1 in the text.
131
TABLE 3.4: Contribution of Earnings Gap to Variance in Earnings
(1) (2) (3)
Panel A 1992 1998 Change
Variance (Log Real Hourly Earnings) 1.105 1.376 0.271
Mean (Log Real Hourly Earnings) Above 90th percentile 5.988 6.488Mean (Log Real Hourly Earnings) Below 90th percentile 4.339 4.350Difference in Means (Above - Below 90th Percentile) 1.649 2.138
Contribution of Difference in Means to Variance 0.245 0.411 0.166Percentage of Total Change in Variance 61.25
Panel B 1998 2003 Change
Variance (Log Real Hourly Earnings) 1.376 1.067 -0.309
Mean (Log Real Hourly Earnings) Above 90th percentile 6.488 7.036Mean (Log Real Hourly Earnings) Below 90th percentile 4.350 4.974Difference in Means (Above - Below 90th Percentile) 2.138 2.062
Contribution of Difference in Means to Variance 0.411 0.383 -0.028Percentage of Total Change in Variance 9.06
Data from CSAE-RPED on the Ghanaian Manufacturing Sector from 1992-2003. The contribution of thedifference in means has been calculated as per the description in Section 3.2 in the text. The variance inlog earnings can be arithmetically decomposed following Barth el al (2014) as in equations (2) and (3)to calculate the contribution to variance. The percentage share has been calculated as the change in thecontribution as a share of the total change in the variance.
132
TABLE 3.5: Returns to Schooling and Experience
Dep Var: Log Real Hourly Earnings
1992-1998 1998-2003(1) (2)
Schooling 0.042*** 0.063***(0.008) (0.002)
Experience 0.068*** 0.046***(0.012) (0.004)
Experience2 -0.0013*** -0.0005***(0.0003) (0.0001)
Mean of Dep Var 4.568 4.654
Variance of Dep Var 1.460 1.318
Variance (X · β) 0.109 0.133
Data from CSAE-RPED on the Ghanaian Manufacturing Sector from 1992-2003. I matcched the workerand firm files to create the linked dataset based on which I perform the analysis. Log real hourlyearnings are measured in Ghanaian cedis. All regressions include firm fixed effects. Robust standarderrors reported in parentheses. ***,** and * represent significance at 1%, 5% and 10% respectively.
133
TABLE 3.6: Variance Decomposition: Estimated Firm Effects and Pre-dicted Wage from Worker Characteristics
Surge in Inequality Decline in Inequality
1992-1998 1998-2003
(1) (2)
All Firms
Earnings Dispersion: V ar(lnw) 1.460 1.318Individual Characteristics: V ar(κ) 0.235 0.331Firm Fixed Effect: V ar(δ) 0.595 0.385Cov(κ, δ) 0.175 0.164Sorting: Worker-Firm ρδ 0.745 0.495Residuals: V ar(u) 0.498 0.385
Firms with Private Domestic Ownership
Earnings Dispersion: V ar(lnw) 1.551 1.369Individual Characteristics: V ar(κ) 0.308 0.400Firm Fixed Effect: V ar(δ) 0.387 0.232Cov(κ, δ) 0.124 0.081Sorting: Worker-Firm ρδ 0.403 0.202Residuals: V ar(u) 0.580 0.535
Firms with Any Foreign Ownership
Earnings Dispersion: V ar(lnw) 0.803 0.825Individual Characteristics: V ar(κ) 0.139 0.214Firm Fixed Effect: V ar(δ) 0.134 0.441Cov(κ, δ) 0.036 -0.075Sorting: Worker-Firm ρδ 0.259 -0.350Residuals: V ar(u) 0.808 1.413
Data from CSAE-RPED on the Ghanaian Manufacturing Sector from 1992-2003. Regressions are ex-tensions of mincerian equations and include the following independent variables: age, age squared,education, education squared, experience and its squared and firm fixed effects. Dependent variable islog real hourly earnings, κ is predicted wage from observed worker characteristics, δ is firm effect onearnings and ρδ is the correlation between the worker characteristics and firm effects.
134
FIGURE 3.2: Time Series Plots of Firm Averages by Ownership Types:Firm Dataset of 200 Firms over 12 Years
135