learning about growth and democracy · a learning model of democratization elites, beliefs, and...
TRANSCRIPT
Learning about Growth and Democracy∗
Scott F Abramson†
University of Rochester
Sergio Montero‡
University of Rochester
April 17, 2020
Abstract
We develop and estimate a model of learning that accounts for the observed correlation
between economic development and democracy and for the clustering of democratiza-
tion events. In our model, countries’ own and neighbors’ past experiences shape elites’
beliefs about the effects of democracy on economic growth and their likelihood of
retaining power. These beliefs influence the choice to transition into or out of democ-
racy. We show that learning is crucial to explaining observed transitions since the
mid-twentieth century. Moreover, our model predicts reversals to authoritarianism if
the world experienced a growth shock the size of the Great Depression.
Keywords: democracy, learning, growth, diffusion, structural estimation
∗We thank Tasos Kalandrakis, Didac Queralt, Matt Shum, Erik Snowberg, Jorg Spenkuch, FabrizioZilibotti, and audiences at Kellogg MEDS, Princeton, Vanderbilt, NYU, the 2018 Banff Empirical Mi-croeconomics Workshop, SIOE 2018, APSA 2018, and the 2018 Formal Theory and Comparative PoliticsConference for very helpful comments. Anna Walsdorff provided excellent research assistance.†Department of Political Science, University of Rochester, Rochester, NY 14627. Email:
[email protected].‡Departments of Political Science and Economics, University of Rochester, Rochester, NY 14627. Email:
Introduction
Scholarship on the causes of democracy has sought to understand two empirical patterns: the
strong correlative relationship between levels of material well-being and democracy (Lipset,
1959; Przeworski et al., 2000; Boix and Stokes, 2003; Acemoglu et al., 2008, 2009) and the
spatial and temporal clustering of democratization events (Huntington, 1993; Brinks and
Coppedge, 2006; Gleditsch and Ward, 2006; Ahlquist and Wibbels, 2012; Houle, Kayser and
Xiang, 2016). Existing studies have treated these as distinct objects of inquiry, separately
assessing the impact of domestic and international factors on the propensity to democratize.
In this paper, we develop and estimate a model of elite belief formation that combines both
domestic and systemic features in order to jointly explain the correlation between economic
development and democracy and the clustering of democratic transitions.
We explicitly model the choice by incumbent elites to promote or subvert democracy.1
This choice impacts their likelihood of retaining power both directly and through its effect on
economic growth. Incumbents are uncertain about the relationship between democracy and
growth, and they rely upon worldwide economic history to update their beliefs. We allow
beliefs to be spatially correlated so that incumbents may learn more from the experiences of
more proximate (or similar) countries.2 In accordance with their beliefs, incumbents pursue
democracy or autocracy, seeking to maximize the probability they remain in power. For a
panel of 151 countries, we use data from 1875-1950 to calibrate initial conditions and data
from 1951-2000 to structurally estimate our model.3
To assess the ability of our learning model to explain observed patterns of economic
growth and democracy adoption, we conduct a series of goodness-of-fit and out-of-sample
1We conceive elites broadly as the group or faction in charge of the executive. For an overview of theformal literature that emphasizes the endogenous nature of institutions, see Gehlbach, Sonin and Svolik(2016) and Svolik (2019).
2Geographic distance is highly correlated with various measures of similarity between countries. Ourresults are robust to accounting directly for other types of similarity (see Online Appendix A7).
3Methodologically, our approach is similar to that of recent studies in political science that estimatemodel parameters by maximizing a likelihood derived from the equilibrium conditions of a formal model(Crisman-Cox and Gibilisco, 2018; Ascencio and Rueda, 2019).
1
(2001-2010) prediction exercises that pit our model against a range of reduced-form panel
regressions typical of the approach taken in the existing empirical literature on democrati-
zation (Acemoglu et al., 2008; Boix, 2011). We show that learning from past experiences
is crucial to explaining observed transitions to and from democracy, delivering an improve-
ment in predictive success of over 100% relative to the best-fitting specification that does
not account for learning—even after allowing for other potential channels of diffusion.
The success of our learning model is rooted, first, in our estimates of the political impli-
cations of economic growth. In line with a sizable empirical literature in political economy,
we find that democracies tend to reward incumbents for growth by keeping them in power
(Hibbs, 1977; Alesina, Roubini and Cohen, 1997; Brender and Drazen, 2008). In contrast, we
find that growth tends to be destabilizing in autocracies. That is, our estimates are consis-
tent with the view that rapid economic expansion in autocracies produces actors—a middle
class, for example—able to challenge the group in power (Olson, 1963; Huntington, 1968;
Hirschman and Rothschild, 1973). Together, these results imply that the cross-sectional
correlation between levels of material well-being and democracy is largely driven by elites
who seek to benefit politically from the economic consequences of institutional choice. In
particular, democratic incumbents will subvert democracy when they come to believe that
it does not produce sufficient economic growth to win a fair election, whereas autocratic
incumbents will transition to democracy when expected rates of growth make them more
likely to retain power via election than under continued authoritarian rule.
Importantly, we distinguish transitions of power, where only the identity of the incumbent
changes, from transitions into or out of democracy, where the form of government changes.
Sudden changes in economic conditions, for instance, may lead to generalized political in-
stability regardless of the system of government in place. However, whenever an incumbent
is replaced—be it through election, coup, or revolution—the new group in power again faces
the choice to support or subvert democracy. Our focus is on this choice, not transitions of
power between factions.
2
The second feature of our learning model that underpins its empirical success is its ability
to capture the wave-like nature of democratic transitions. We find that beliefs about the
relationship between democracy and economic growth are highly correlated both temporally
and spatially, which provides a structural interpretation for the observed clustering of transi-
tions. Our estimates indicate there is an approximately 5,000 kilometer radius within which
learning occurs. Outside this distance, virtually no additional information is gleaned. This
advances the literature on the diffusion of democracy, which has struggled to disentangle
competing mechanisms.4
Our methodological approach allows us to conduct counterfactual experiments of three
types, each of which highlights the importance of learning. First, it enables us to study how
systemic shocks to prosperity impact the worldwide prospects for democracy. Specifically,
our model predicts considerable reversals to authoritarianism if the world were hit with a
shock to growth the size of the Great Depression. Second, it allows us to ask retrospective
questions about historical democratization events, such as whether Greece, Portugal, and
Spain would have democratized when they did had western European democracies suffered a
recession in the early 1970s. Third, our model allows us to prospectively explore conditions
that would presently lead countries to transition to or from democracy. Overall, our results
suggest that, ultimately, democracy is a fragile system of government, one which depends
significantly upon its own economic success.
Taken together, our findings contribute to a substantial body of work on modernization
and democracy. Dating to at least the mid-twentieth century, social scientists have debated
whether increases in per capita income have a causal effect on the probability that a state
democratizes.5 We bring to bear two innovations. First, we do not rely on the instrumental-
4Proposed mechanisms include diffusion through international organizations (Pevehouse, 2005), di-rect emulation of neighbors (Gleditsch and Ward, 2006), diffusion through trade and economic exchange(Mansfield, Milner and Rosendorff, 2000), cultural linkages (Wejnert, 2005), and military coercion (Kadera,Crescenzi and Shannon, 2003). On the inability of this literature to empirically falsify any particular mech-anism, see Torfason and Ingram (2010).
5The modernization hypothesis—that higher incomes per capita cause countries to democratize—datesto, at least, Lipset (1959). For evidence in favor, see Londregan and Poole (1996); Barro (1996); Boix andStokes (2003); Boix (2011). Against, see Przeworski and Limongi (1997); Przeworski et al. (2000); Acemoglu
3
variables or reduced-form selection-on-observables identification strategies deployed—with
mixed results—in the existing literature. Rather, we propose an explicit model of the rela-
tionship between economic growth and democracy that we take directly to the data. Second,
we approach modernization as a systemic phenomenon. Indeed, our results suggest that fo-
cusing on the within-country causal impact of economic development is too narrow an object
of inquiry. Through its influence on neighbors’ beliefs, a country’s economic performance af-
fects not only its own likelihood of transitioning to or from democracy but also the prospects
for democracy outside of its borders.
Of course, we are not the first to propose diffusion through learning as a driver of demo-
cratic transitions.6 If direct and consistent measures of beliefs were available over a suffi-
ciently long period and a wide enough set of countries, it would be conceivable to directly
estimate the impact of changing beliefs on democratization.7 Given the current lack of
systematic or reliable data, our paper represents the first attempt at estimating this impact.
We are also not the first to examine the role of learning in policymaking more broadly.8
Our paper is most closely related to Buera, Monge-Naranjo and Primiceri’s (2011) struc-
tural analysis of the impact of learning on the adoption of market-oriented (versus state-
interventionist) policies. We build on their framework to model the worldwide evolution
and diffusion of beliefs about the economic consequences of competing policies. Yet, while
they consider the problem of a welfare-maximizing social planner, we take a political econ-
omy perspective and study institutional design as the outcome of self-interested choices by
power-seeking elites—a natural next step in this line of inquiry.
et al. (2008, 2009).6Dahl (1998); Diamond (2011); Miller (2016).7For work that attempts to gauge beliefs and their consequences for democratization in a subset of
countries, see Almond and Verba (1963); Norris (1999); Chen and Lu (2011).8Primiceri (2006); Garcıa-Jimeno (2016).
4
A Learning Model of Democratization
Elites, Beliefs, and Learning
We consider the decision problem of the decisive group in power in country i at time t.9
This decision-maker faces a choice between autocracy, Di,t = 0, and democracy, Di,t = 1.
The incumbent’s objective is to retain power in period t+ 1. Let Yi,t denote country i’s per
capita GDP in period t, and let yi,t ≡ log(Yi,t) − log(Yi,t−1) denote its growth rate. At the
beginning of period t, the incumbent solves
maxDi,t∈{0,1}
Ei,t−1
[exp(αi + θD=Di,tyi,t −Ki,tDi,t)
1 + exp(αi + θD=Di,tyi,t −Ki,tDi,t)
∣∣∣∣Di,t
], (1)
where the integrand represents the probability of remaining in power in period t+1, and the
expectation is taken with respect to yi,t, as explained below, conditional on the information
available in country i at the conclusion of period t − 1. The integrand is increasing in the
index αi + θD=Di,tyi,t − Ki,tDi,t, where coefficient αi establishes a baseline for country i,
coefficients θD=0 and θD=1 respectively measure the (de)stabilizing effect of GDP growth
on elite turnover under autocracy and democracy, and Ki,t captures the political cost of
democracy to the incumbent, i.e., its direct effect on the likelihood of retaining power.
The incumbent chooses Di,t at the start of period t, forming a subjective forecast of its
effect on GDP growth, yi,t, to solve (1).10 Incumbents believe that the relationship between
GDP growth and democracy takes the form
yi,t = (1−Di,t)βD=0i +Di,tβ
D=1i + εi,t, (2)
where βD=0i and βD=1
i denote country i’s long-run GDP growth rates under autocracy and
9As described below, both in democracies and autocracies we treat this as the political party or faction(where parties do not exist) in control of the executive. Online Appendix A2 provides descriptive evidencethat motivates key modeling choices in what follows.
10We build on the learning framework of Buera, Monge-Naranjo and Primiceri (2011).
5
democracy, respectively, and εi,t is an exogenous shock to growth that is uncorrelated over
time but potentially correlated across countries.11 Specifically, the vector εt ≡ [ε1,t, . . . , εn,t]′
of GDP growth shocks across the n countries of the world is independently and identically
distributed (i.i.d.) over time according to a mean-zero Normal distribution with covariance
matrix Σ, i.e.,
εti.i.d.∼ N(0,Σ).
Incumbents do not know βi ≡ [βD=0i , βD=1
i ]′, but they have perfect knowledge of all other
features of the model, including Ki,t, at the time of their choice.12
The timing of events is as follows. At the end of period t− 1, incumbents collect data on
worldwide GDP growth rates and systems of government, and they update their beliefs about
long-run economic growth under autocracy and democracy accordingly. At the beginning of
period t, incumbents observe Ki,t and decide what system of government, Di,t, to adopt that
period. Growth rates conditional on incumbents’ choices are then realized, which together
determine elite turnover.
Learning. In period t = 0, incumbents start out with a Normal prior over the vector of
unknown long-run GDP growth rates β ≡ [βD=01 , . . . , βD=0
n , βD=11 , . . . , βD=1
n ]′,
β ∼ N(β0, P−10 ), (3)
where β0 and P0 denote, respectively, the prior mean and precision matrix. We assume that
incumbents’ initial beliefs assign no correlation and the same degree of uncertainty to growth
11Studies on the causal impact of democracy on economic growth have highlighted various potentialmechanisms—most notably, how democracy influences redistributive policy, which in turn affects investment(Alesina and Rodrik, 1994; Persson and Tabellini, 1994; Acemoglu et al., 2019). For tractability, we abstractfrom considering these explicitly but allow incumbents to have flexible, country-specific beliefs about theirnet long-run impact.
12A natural question is whether incumbents may also be imperfectly informed about θ = (θD=0, θD=1).However, allowing for such double-sided uncertainty would be analytically and econometrically quite chal-lenging. We find it reasonable to presume incumbents are better informed about their political prospectsthan the macroeconomy—after all, they are professional politicians who typically rely on technocrats toadvise on economic policy.
6
under autocracy and democracy:
P−10 = I2 ⊗ (V RV ),
where V = diag([v1σ1, . . . , vnσn]) is a diagonal matrix whose ith diagonal entry measures
prior uncertainty (standard deviation) about country i’s long-run growth rate under autoc-
racy/democracy, and R is the cross-country prior correlation matrix. Prior uncertainty is
parameterized by {vi}ni=1, normalized by the standard deviation of growth shocks in each
country {σi}ni=1 (the square roots of the diagonal elements of Σ).
Our assumptions yield simple, recursive, Bayesian updating formulas for beliefs in each
period: letting Dt ≡ [D1,t, . . . , Dn,t]′ and yt ≡ [y1,t, . . . , yn,t]
′,
Pt = Pt−1 + D′tΣ−1Dt,
βt = βt−1 + P−1t D′tΣ−1(yt −Dtβt−1),
where Dt ≡ [diag(1−Dt), diag(Dt)] is an n× (2n) matrix such that the ith element of the
vector Dtβt−1 equals (1 − Di,t)βD=0i,t−1 + Di,tβ
D=1i,t−1. The impact of new data on the posterior
mean βt is determined by P−1t D′tΣ−1, which depends on three key factors. First, higher
initial uncertainty in beliefs (higher {vi}ni=1) raises the relative precision of new information,
increasing its impact. Second, higher correlation in growth shocks across countries (off-
diagonal elements of Σ) reduces the informational content of observed growth rates and
slows down learning. Lastly, higher cross-country correlation in initial beliefs (off-diagonal
elements of R) increases belief responsiveness to data from other countries.
We allow incumbents to potentially learn more from neighboring (or more similar) coun-
tries. Letting Zi,j denote a vector that may include various measures of distance (geographic
or otherwise) between countries i and j, we write
Ri,j = exp(−Z ′i,jγ),
7
where γ is constrained to be nonnegative to ensure correlations between 0 and 1.13
Incumbents’ optimal choice. While incumbents observe the political cost of democracy,
Ki,t, prior to choosing Di,t, this cost is unobservable to the researcher. We assume that Ki,t
has the following structure:
Ki,t = fi +X ′i,tξ + κi,t. (4)
Coefficient fi establishes a country-specific baseline, and the control vector Xi,t may include
various observable economic and political characteristics of country i (e.g., lagged per capita
GDP or incumbents’ time in power). Every period, country i also experiences an exogenous
idiosyncratic shock, κi,t, to the political cost of democracy, where
κi,t ∼ N(0, ς2i ).
The volatility of shocks to the political cost of democracy, ςi, is allowed to be country-specific,
but κi,t is assumed to be independently distributed over time and across countries.
As discussed, when choosing Di,t incumbents have perfect knowledge of Ki,t and all
features of the model except for the effect of their choice on GDP growth, yi,t. Together,
(1), (2), and (4) imply that the optimal choice for country i’s incumbent at time t is
Di,t = 1
{Ei,t−1
[exp(αi + θD=1(βD=1
i + εi,t)− fi −X ′i,tξ − κi,t)1 + exp(αi + θD=1(βD=1
i + εi,t)− fi −X ′i,tξ − κi,t)
]
> Ei,t−1
[exp(αi + θD=0(βD=0
i + εi,t))
1 + exp(αi + θD=0(βD=0i + εi,t))
]},
(5)
where the expectations are taken only with respect to βi and εi,t in accordance with the
incumbent’s beliefs at the conclusion of period t− 1.
Opposition groups and strategic experimentation. To conclude the description of
our model, we briefly discuss how we account for non-elite learning and strategic interactions
13This formulation also guarantees the positive definiteness of R (Matern, 1960).
8
between incumbents and potential challengers.
The common prior assumption for incumbents in our model extends to all stakeholders
in each country. Opposition groups (elite or non-elite) observe the same worldwide history
of economic growth and democracy, and they would be faced with solving (1)—in the event
they came to power—using information identical to that available to the incumbent. As a
result, in this shared-learning environment, the identity of the incumbent only matters via
its potential influence on the political cost of democracy.
For tractability, we abstract from explicitly modeling the intricacies of within-country
elite turnover. However, objective (1) can be viewed as describing an equilibrium proba-
bility of staying in power resulting from a richer interaction between the incumbent and
potential challengers, both elite and non-elite. Importantly, we distinguish transitions of
power, where only the identity of the incumbent changes, from transitions into or out of
democracy. Whenever an incumbent is overthrown by a rival elite faction or via a revolution
from below, the new group in power again faces the choice to support or subvert democracy.14
Understanding how this choice by self-interested elites—newly in power or entrenched—is
shaped by the evolution of beliefs about the economic effects of democracy is the focus of
this paper.15
Finally, objective (1) precludes incumbents from adopting a form of government with
negative expected consequences for the purpose of learning from the experience. Incumbents
are myopic and focused only on their immediate survival. Nevertheless, while we do not
explore a fully dynamic version of our model, as it would introduce strategic experimentation
incentives that would render the analysis intractable,16 we believe objective (1) offers a good
14Examples abound of revolutions from below, inspired by ostensibly democratic goals, that failed todeliver on the promise of liberal democracy. For instance, Skocpol’s (1979) three main cases—the French,Chinese, and Russian revolutions—all began as mass revolutionary movements with outwardly democraticmotives, and each nonetheless resulted in dictatorship. More recently, the Arab Spring yielded mixed results.
15An implication of our model is that transitions constitute attempts by incumbent elites to hold on topower. Notably, incumbents in our data retain power 30% and 19% of the time following transitions to andfrom democracy, respectively, despite the instability typically associated with transition periods.
16Bolton and Harris (1999); Bramoullee, Kranton and D’Amours (2014); Mossel, Sly and Tamuz (2015).
9
first-order approximation to optimal behavior by incumbents with longer time horizons.17
Empirical Strategy
Like incumbents in our model, we adopt a Bayesian inference approach to recover the un-
known structural parameters of our model, listed in Table 1.18 Let the vector ϕ collect all
the parameters in Table 1, and let Ii,t be an indicator of whether the incumbent in coun-
try i retained power (Ii,t = 1) or not (Ii,t = 0) at the conclusion of period t. Denote by
W T ≡ {It, yt, Dt, Xt}Tt=1 the set of all data available up to period T , where It ≡ [I1,t . . . , In,t]′
and Xt ≡ [X1,t . . . , Xn,t]′. Our goal is to estimate the true value of ϕ by computing the mode
of the posterior distribution of the model parameters,
p(ϕ|W T ) ∝ L(W T |ϕ)π(ϕ),
given the likelihood of the data, L, and our prior, π. We describe L and π in turn.
Table 1: Model Parameters
{αi}ni=1: baseline incumbent stabilityθD=0: effect of GDP growth on elite turnover under autocracyθD=1: effect of GDP growth on elite turnover under democracy
{βD=0i,0 }ni=1: prior mean of long-run GDP growth rate under autocracy
{βD=1i,0 }ni=1: prior mean of long-run GDP growth rate under democracy{vi}ni=1: prior uncertainty about economic effects of autocracy/democracy
γ: coefficients of cross-country correlation in prior beliefs{fi}ni=1: baseline political cost of democracy
ξ: coefficients of economic/political controls for political cost of democracy{ςi}ni=1: volatility of political cost of democracy
Likelihood of the data. While the structure of our model described thus far specifies
incumbents’ beliefs about how the data are generated as well as their optimal choices given
17See the discussion below on the robustness of our results and Online Appendix A7.18Following Buera, Monge-Naranjo and Primiceri (2011), to reduce the dimensionality of the model we
set Σ equal to its estimated value from the “true” data generating process described in Footnote 36.
10
those beliefs, we have refrained from specifying the “true” data generating process (DGP).
Below, to perform counterfactual experiments, we discuss and specify the true DGP. Next,
we only make one key assumption about the true DGP that simplifies inference about the
model parameters.
We assume that observed outcomes are only affected by actual choices and not by the be-
liefs that led to those choices. That is, transitions of power (It), GDP growth (yt), and other
economic and political characteristics of countries (Xt) are shaped by realized institutions
(Dt), but they are not directly affected by beliefs about the potential effects of transitioning
into or out of democracy. Formally, this assumption implies that the parameters in Table 1
are only involved in the component of the likelihood that describes the conditional proba-
bilities of countries’ observed systems of government. While we relegate a full derivation of
the likelihood to Online Appendix A3, with a slight abuse of notation—using L to denote
arbitrary densities of the data—it can be written as
L(W T |ϕ) ∝T∏t=1
n∏i=1
L(Di,t|Xi,t,Wt−1, ϕ),
where
L(Di,t|Xi,t,Wt−1, ϕ) = Φ
(κi,t(Xi,t,W
t−1, ϕ)
ςi
)Di,t[1− Φ
(κi,t(Xi,t,W
t−1, ϕ)
ςi
)]1−Di,t
, (6)
Φ denotes the standard Normal cumulative distribution function, and κi,t(Xi,t,Wt−1, ϕ) is
the threshold value of κi,t—the realized shock in period t to the political cost of democracy in
country i—that leaves country i’s incumbent indifferent between autocracy and democracy.
Note that (6) resembles the likelihood of a standard binary-choice Probit model. How-
ever, κi,t(Xi,t,Wt−1, ϕ) is a nonlinear function (with no closed-from expression) of the model
parameters and the data up to period t that encodes how each country’s propensity for
democracy evolves with incumbents’ beliefs.
11
Prior. Given the size of our model, we adopt an informative prior, π, to prevent overfitting.
To do so in a principled manner, we calibrate our prior in the way agents in our model would,
allowing the observed past to inform initial beliefs. We use data from 1875-1950 (excluding
the two world wars), a period that immediately precedes our main sample, to set the prior
mean and precision of the model parameters so as to match analogous empirical moments.
For example, we ensure that our prior over incumbents’ initial beliefs about the relationship
between democracy and GDP growth is consistent with average annual growth rates among
autocracies and democracies in the pre-sample period. Similarly, we use pre-sample history
of elite turnover to inform our prior over the parameters describing the likelihood of retaining
power. We describe our prior in full and how it is calibrated in Online Appendix A4.
Estimation and inference. Calculating κi,t(Xi,t,Wt−1, ϕ) to evaluate the likelihood of
the data is computationally quite expensive. To sidestep this burden, we follow the Math-
ematical Programming with Equilibrium Constraints (MPEC) approach of Su and Judd
(2012) to compute our maximum-a-posteriori estimator of ϕ.19 The idea behind this ap-
proach is simple: instead of calculating κi,t(Xi,t,Wt−1, ϕ) at every trial value of ϕ, one treats
each κi,t as an auxiliary parameter and imposes the optimality (or equilibrium) conditions of
the model as feasibility constraints on the log-posterior maximization program. This consid-
erably reduces the computational cost of estimating the model. We describe our estimation
strategy in detail in Online Appendix A5.
Data. For our main analysis, we obtain data from three sources, each measured at the
country-year. First, we obtain data on GDP & Per Capita GDP from Maddison (2010).
The Maddison Project Database provides information on comparative economic growth and
income levels over the very long run. The data give estimates of annual GDP and GDP
per capita between 1875-2008 for all of the independent states in our sample of countries.20
19Standard errors are parametrically bootstrapped.20Online Appendix A1 lists the countries in our sample and describes how changes in borders are handled.
12
We obtain two additional years (2009-2010) of GDP per capita growth rates from the Penn
World Table (Feenstra, Inklaar and Timmer, 2015) for out-of-sample predictions.
Second, we obtain our dichotomous measure of Democracy from Boix, Miller and
Rosato (BMR, 2013). This dataset provides an annual coding of democracy for every country
in the world from 1800 to 2010. If the following three criteria are met, countries are coded
as democratic:
1. The executive is directly or indirectly elected in popular elections and is responsibleeither directly to voters or to a legislature.
2. The legislature (or the executive if elected directly) is chosen in free and fair elections.
3. A majority of adult men has the right to vote.
If any of these criteria are not met, a country is coded as autocratic. While various alternative
measures of democracy have been used in previous studies, the BMR coding is the most
comprehensive and consistent for the period we cover (1875-2010). Nonetheless, our results
are robust to employing alternative codings (see Online Appendix A7).
Finally, for each country-year, we code the Executive Faction from Goemans, Gleditsch
and Chiozza (2009). The Archigos database on leaders describes the date and manner of
entry and exit for the executives of all countries in our sample from 1875-2015. With these
data, we then code, using biographical information, the political party of each executive. If
we cannot identify a political party (nearly all of these cases are military regimes), we identify
the particular faction to which the executive belongs. In combination with the entry/exit
dates, we construct our measure of change in the faction of the executive (elite turnover).
Estimation Results: The Importance of Learning
Before summarizing our structural parameter estimates, we subject our model to a series of
goodness-of-fit and out-of-sample prediction tests that assess its ability to explain observed
patterns of democracy adoption. We consider five alternative specifications of our model
13
that differ in the number of covariates used to characterize the political cost of democracy.
In our baseline specification with no covariates, the political cost of democracy, Ki,t, consists
of simply a country-specific baseline, fi, plus an idiosyncratic shock, κi,t. We then consider
specifications where we successively control for (lagged) log-GDP per capita, the incumbent’s
time in power, (lagged) trade volume as percentage of GDP (Gleditsch, 2002), and years as
democratic (negative when autocratic) to account for consolidation effects (Svolik, 2013).21
Across specifications, we use geographic distance between capitals, Zi,j, to capture the cross-
country correlation in initial beliefs.22
To quantify the importance of learning for our model’s ability to fit the data, we also
estimate a “no-learning” version of our model. For each specification, we constrain beliefs
about long-run growth rates under autocracy and democracy to be constant over time,
thus shutting down the learning mechanism. These no-learning specifications are otherwise
identical to their learning counterparts.
We conduct our model performance tests as follows. With each estimated model, we
compute one-year-ahead forecasts of the choice between autocracy and democracy for each
country-year. That is, conditional on the state of the world at the end of year t−1 as recorded
in our data, we use (5) for each model to predict Di,t worldwide. We produce forecasts for
the in-sample period used to estimate each model (1951-2000) and for 10 additional out-of-
sample years (2001-2010).
In Figure 1, we plot the actual (gray) and predicted percentage of world democracies. In
the top panel, predictions are generated using our baseline specification with no covariates.
We plot predictions with (blue) and without (red) learning for both the in-sample (solid)
and out-of-sample (dashed) periods. In the lower panel, we present the same set of estimates
21In Online Appendix A7, we also consider a specification that allows for an interaction between (lagged)log-GDP per capita and an indicator (lagged) of individual leader turnover as in Treisman (2015). Ourresults are virtually unchanged.
22Geographic distance is highly correlated with other measures of cultural, economic, or political similaritybetween countries (Buera, Monge-Naranjo and Primiceri, 2011). As shown in Online Appendix A7, ourresults are robust to allowing the correlation in initial beliefs to also depend on genetic distance—as measuredby Spolaore and Wacziarg (2009)—and on economic distance in terms of initial levels of development.
14
using our model with two covariates (lagged log-GDP per capita and time in power).
Note the vast improvement in predictive success, both in and out of sample, when we
account for learning. Unsurprisingly, the no-learning specification of our model with no
covariates performs worst as it produces a constant prediction for each country.23 However,
while the inclusion of covariates does markedly improve the accuracy of the no-learning
model, these gains pale in comparison with the role of learning. Indeed, our baseline learning
model with no covariates vastly outperforms any specification that does not account for
learning. As many of the covariates we condition upon are themselves outcomes of the
selection process we model (e.g., per capita GDP or elite turnover), their inclusion should
not yield much improvement in predictive success. Our results confirm this intuition.24
Table 2 provides a numerical summary of our goodness-of-fit tests. Each set of columns
corresponds to a different model specification, with (odd columns) and without (even columns)
learning. The first row gives the percentage of country-year observations each model cor-
rectly predicts. Unsurprisingly, all models perform remarkably well on this dimension. The
reason is that, as transitions into or out of democracy are quite rare (130 total in-sample
events), country fixed effects go a long way in fitting the data. Indeed, our no-learning model
with no covariates (second column), which produces a constant prediction for each country,
has a success rate of almost 90%.
A much harder test—one that is considerably more revealing of the underlying causes of
democracy—is whether a model can correctly predict transitions to and from democracy. In
Table 2, we present two scenarios, assessing each model’s accuracy in predicting transitions
within ±0 years (second row) and ±2 years (third row) of the event. Here, the importance
of learning is striking. Models that do not account for learning perform quite poorly, even
within a 5-year window.25 And, while including additional covariates does increase accuracy,
23The observed temporal variation is an artifact of the changing population of countries in our data.24To show that the success of our model is not an artifact of aggregating predictions across countries,
we provide in Online Appendix A7 nearly identical results disaggregated by four regions of the world: theAmericas, Europe, Africa, and Asia-Oceania.
25The 5-year window predictions for models in the last two columns should be taken with care. Whencontrolling for years as democratic (negative when autocratic) and aggregating over 5 years, transitions in the
15
% W
orld
Dem
ocra
cies
1950 1960 1970 1980 1990 2000 2010
2530
3540
4550
55
No Covariates
Out ofSampleIn Sample
True Value
Model w/ Learning
Model w/ No Learning
% W
orld
Dem
ocra
cies
1950 1960 1970 1980 1990 2000 2010
2530
3540
4550
55
Two CovariatesTime in Power, ln(GDPpc)
Figure 1: Observed versus Predicted Worldwide Prevalence of Democracy
Notes. This figure compares the true proportion of world democracies (gray) with in-sample (solid blue) andout-of-sample (dashed blue) estimates generated by our model. Additionally, we shut down learning in ourmodel and produce both in-sample (solid red) and out-of sample (dashed red) predictions. In the top panel,estimates are generated using our baseline specification with no covariates. In the bottom panel, we controlfor lagged log-GDP per capita and incumbents’ time in power.
16
Table 2: Model Fit
No Covariates One Covariate Two Covariates Three Covariates Four CovariatesLearning No Learning Learning No Learning Learning No Learning Learning No Learning Learning No Learning
Choices 95.2 88.9 95.7 90.4 95.8 91.7 96.4 92.3 97.1 96.3(% correct)
Transitions(% correct)±0 years 9.3 0.0 6.2 1.6 7.0 3.9 11.0 7.1 6.3 0.8±2 years 41.1 0.0 43.4 7.8 48.1 20.2 52.0 23.6 70.1 53.5
Log-likelihood -581.4 -1,390.8 -572.5 -1,176.8 -553.6 -1,061.8 -523.6 -995.1 -490.1 -704.2
Observations 5,925 5,925 5,925 5,925 5,925 5,925 5,845 5,845 5,845 5,845
Notes. This table reports various goodness-of-fit measures. Each set of columns corresponds to a differentspecification of our model, with (odd columns) and without (even columns) learning. Models in the firsttwo columns use only country fixed effects to characterize the political cost of democracy. Models in thethird and fourth columns control for lagged log-GDP per capita. Models in the fifth and sixth columnsadditionally control for incumbents’ time in power. Models in the seventh and eighth columns also controlfor trade volume as a percentage of lagged GDP. Models in the last two columns add years as democratic(negative when autocratic) as a control. For each model, we report the percentage of correctly predictedin-sample system of government choices (first row). We similarly report the percentage of correctly predictedtransitions to or from democracy within a 0-year window (second row) and a 5-year window (third row) ofthe event.
the marginal improvement is negligible. In contrast, turning on the learning mechanism in
our model raises predictive success by over 100% in virtually all scenarios and all specifica-
tions. In fact, our baseline learning model with no covariates outperforms most no-learning
specifications by a similar rate.26
To further benchmark our model, we present in Table 3 results from a series of panel
regressions typical of the approach taken in the existing empirical literature on democrati-
zation. Using linear probability models, we regress our democracy measure against a full set
of country fixed effects, a one-period lag of log-GDP per capita, and various controls. We
exploit both annual data and, as in Acemoglu et al. (2008, 2009) and Boix (2011), five-year
panels, which allow for the inclusion of covariates not available annually.27 As before, with
each specification we produce predictions for every country-period.
data are mechanically picked up by the models and turned into correct predictions. See below for a similarcomment about models with a one-year lag of democracy. Notably, turning on the learning mechanism stilldelivers a sizable improvement in predictive power.
26The only exception is the model in the last column—see Footnote 25.27We also estimate five-year panel versions of our model—see Online Appendix A7—and results are
essentially identical to those of Table 2, which should alleviate concerns about both the myopia of incumbentsin our model and whether an annual time frame is appropriate to study changes in system of government.
17
Table 3: Goodness of Fit of Reduced-Form Models
I II III IV V VI VII VIII IX X XI XII XIII XIV
Choices 89.3 88.6 89.5 88.8 89.3 88.8 77.1 91.3 97.8 91.4 90.4 90.8 97.8 91.6(% correct)
Transitions(% correct)±0 years 0.8 1.5 0.8 6.9 2.3 6.2±2 years 3.1 2.9 3.8 2.9 3.8 1.9 1.2 3.8 96.9 19.4 7.7 14.1 96.9 17.2
log(GDPpc)t−1 0.115∗∗∗ 0.147∗∗∗ 0.116∗∗∗ 0.142∗∗∗ 0.112∗∗∗ 0.142∗∗∗ 0.033 0.032 0.024∗∗∗ 0.102∗∗∗ 0.060∗∗∗ 0.040 0.011 0.022(0.010) (0.026) (0.010) (0.026) (0.011) (0.026) (0.053) (0.078) (0.006) (0.024) (0.014) (0.037) (0.008) (0.034)
Controls :Country Effects X X X X X X X X X X X X X XTime in Power X X X X X X X X X X X XTrade/GDPt−1 X X X X X X X X X XEducation X XLabor Share XDemocracyt−1 X X X XTime Effects X X X X
Observations 5,925 1,076 5,925 1,076 5,866 1,076 649 390 5,866 1,076 5,866 1,076 5,866 1,076Panel Length 1 5 1 5 1 5 5 5 1 5 1 5 1 5
∗p<0.1, ∗∗p<0.05, ∗∗∗p<0.01
Notes. This table gives linear probability estimates of the impact of (lagged) log-GDP per capita on democ-racy. All models include country fixed effects. We successively add in controls for the incumbent’s time inpower, trade as a percentage of (lagged) GDP, average years of schooling from Barro (1999), and labor shareof value added from Rodrik (1999). In columns XI-XIV, we include year fixed effects. In columns IX, X,XIII, and XIV, we include a one-period lag of democracy. We present estimates with annual and five-yearpanels. The first row gives the percentage of country-period observations correctly predicted by each model.For models with annual data, we also report the success rate at predicting transitions within ±0 (secondrow) and ±2 (third row) years of the event. For the five-year panels, we produce success rates at predictingtransitions just within the five-year window (third row).
18
Again, as with our model, it is trivial to correctly predict close to 90% of country-period
observations using country fixed effects. The important departure arises when we compare
predictions of transitions derived from these reduced-form regressions with those generated
by our learning model. Once more, we evaluate these predictions in exact (±0 years) and
five-year windows (±2).
In terms of exact predictions, no reduced-form specification surpasses the predictive suc-
cess of our baseline no-covariates learning model. In the five-year window, our model per-
forms similarly well, save annual panel specifications with a lagged dependent variable, which
correctly predict over 90% of transitions. However, aggregated over five years, the inclusion
of the one-year lagged outcome leads any transition not picked up exactly by the model to
be mechanically transformed into a correct prediction in subsequent years, thus yielding an
extremely high success rate in the five-year window despite a low success rate within a single
year. To further see this, note that, when we introduce a one-period lag in the five-year panel
specifications, they successfully predict less than 20% of transitions, a rate which our model
beats by more than 100%. More importantly, our model provides a structural interpretation
for the observed persistence of systems of government that underpins the predictive success
of these autoregressive specifications.28
Of course, it may be the case that, rather than reflecting the learning process we de-
scribe, our model’s success is simply an artifact of some alternative process of democratic
diffusion that is indirectly picked up by our model’s spatial and temporal flexibility. Exam-
ples of potential mechanisms proposed in the literature include direct emulation of neighbors
(Gleditsch and Ward, 2006), cultural linkages (Wejnert, 2005), and diffusion through trade
(Mansfield, Milner and Rosendorff, 2000). Due to data limitations, direct tests of each of
these mechanisms would be impractical. Nonetheless, in the context of our model, the chan-
28In particular, our model explains the resilience of long-lived democracies in light of recent work thatestablishes a positive causal impact of democracy on economic growth (Acemoglu et al., 2019). Over time,elites in these countries correctly come to believe—with increasing confidence—in democracy’s superioreconomic potential, which solidifies their commitment to democratic government. See the discussion in thefollowing subsection.
19
nel by which these alternative explanations could influence democratic transitions is through
their impact on the political cost of democracy. To evaluate this possibility, we construct
a distance-weighted measure of how democratic each country’s neighborhood is over time,
and we reestimate our baseline model using this measure as a control for direct diffusion
effects. In Online Appendix A7, we find minimal increase in predictive success relative to
our baseline model, which indicates that it is our proposed mechanism of learning about the
economic effects of democracy—and not some alternative process of diffusion—what drives
our results.
In sum, our goodness-of-fit tests quantify and highlight the crucial role of learning in
explaining the dynamics of worldwide democracy adoption, overshadowing the usefulness
of other explanatory variables typically employed in the literature. As discussed, this is
not surprising given that many of these controls are themselves outcomes of the learning
process we model. In light of these results, we hereafter focus our attention on the baseline
no-covariates specification of our learning model.
Structural Parameter Estimates
To understand how our model is fitting the data, we summarize our main parameter es-
timates and discuss their substantive implications.29 We begin with our estimates of the
(de)stabilizing effect of GDP growth on elite turnover under autocracy and democracy. This
question, itself, has been a separate subject of academic inquiry for decades.30 We find that
the impact of growth on the likelihood that the incumbent group retains power indeed dif-
fers across autocracies and democracies.31 The quantitative implications of our estimates are
summarized in Figure 2, which plots the estimated probability (averaged across countries)
that the incumbent remains in power at different rates of GDP growth. In blue, we plot our
estimates under democracy and, in red, our estimates under autocracy.
29We report all coefficient estimates for the baseline specification of our model in Online Appendix A6.30For a summary of the early literature on the topic, see Przeworski et al. (2000) ch. 1.31Specifically, we estimate θD=0 = −4.2213 with a standard error of 2.2209, and θD=1 = 8.8279 with a
standard error of 1.6368.
20
% ∆ GDP
Pr(
Ret
ain
Pow
er)
−50 −40 −30 −20 −10 0 10 20 30 40 50
020
4060
8010
0
Democracies
Autocracies
Figure 2: (De)Stabilizing Effect of GDP Growth on Elite Turnover
Notes. This figure plots the average (across countries) estimated probability of retaining power at differentrates of GDP growth under autocracy (red) and democracy (blue). Observed growth rates between 1951-2000are shown over the horizontal axis.
In line with a substantial empirical literature in political economy, we find that, in democ-
racies, economic growth is stabilizing for elites.32 In other words, the party in government is
more likely to win reelection when growth is high. In contrast, we find that growth is desta-
bilizing in autocracies. This result comports with the view that rapid economic expansion in
non-democracies creates inequalities of expectation or inequalities of outcomes that, in turn,
engender attempts to subvert the political system.33 That is, autocrats are less likely to
remain in power when economic growth produces actors—a middle class, for example—able
to place demands upon and challenge the authority of the group in power.
Coupling these results with learning helps explain the observed cross-sectional correlation
between democracy and per capita GDP. Democracy becomes more appealing to incumbents
as they come to believe that it is conducive to high rates of GDP growth. Conversely,
32Hibbs (1977); Alesina, Roubini and Cohen (1997); Brender and Drazen (2008).33Olson (1963); Huntington (1968); Hirschman and Rothschild (1973).
21
autocracy entails an incentive to suppress economic growth.
Of course, there are notable exceptions. For example, over the past two decades China
has experienced high rates of growth and nevertheless remained autocratic. Similarly, India
for the first four decades of its independence experienced low rates of growth and yet re-
mained democratic. Underlying these prominent cases is the country-specific political cost of
democracy. We recover estimates of the structural parameters describing the baseline polit-
ical cost of democracy for each country, fi, and plot them in Figure 3. Notably, the Chinese
Communist party faces, all else equal, the lowest probability of remaining in power under
democracy. In contrast, Congress at India’s independence had the fourth-highest ex-ante
probability of retaining power under democracy.
Next, in Figure 4, we present estimates of the spatial decay of learning, i.e., the extent
to which the cross-country correlation in prior beliefs depends on the geographic distance
between capitals.34 Consistent with the observed spatial clustering of democratization events
noted in the literature, we find that learning is highly circumscribed geographically. As
shown in Figure 4, at 5,000km (the approximate distance between the U.S. and Ecuador),
the prior correlation of beliefs is only 0.12. At 10,000km (the distance between the U.S.
and the Central African Republic), the potential for learning between countries is virtually
absent. This estimated feature of our model reveals that elites learn from the experiences
of relatively proximate countries, and it underpins the ability of our model to capture the
spatial clustering of transitions to and from democracy.
Finally, to understand the dynamics of democracy adoption, Figure 5 presents our esti-
mates of the evolution of beliefs about the economic effects of democracy. The top panel
plots the evolution of the worldwide distribution of the mean of beliefs about βD=1i − βD=0
i ,
the difference in long-run GDP growth rates under democracy versus autocracy. The bot-
tom panel shows the evolution of worldwide uncertainty (standard deviation) about these
beliefs. The initial state of beliefs in 1951 simply reflects our prior-calibration exercise. In the
34The estimated coefficient is γ = 0.4234 with a standard error of 0.2292.
22
Relative Elite Turnover Under Democracy
BOL
NEW
IND
NTH
DEN
CO
SSLOU
RU
ALBM
ACSALFR
NM
LDU
KRJPNM
ON
GN
BC
ENU
SABR
ALATBU
LM
ZMPAKH
UN
ECU
BENISRAU
LM
ASSAFG
RC
KYRG
RG
MAL
POR
PERH
ON
UZB
ANG
RW
ASPNQ
ATD
OM
SAUC
ON
MEX
ETHN
ICTAZAR
MTKMZAMPANSO
MKU
WG
UA
SUD
CR
OR
US
UAE
INS
YUG
TAWC
AMLBR
LIB
CD
IKENSIE
IRN
BFOEG
YTU
NC
OM
CH
N
RU
MVENSLVC
OL
PHI
FIN
LITTU
RTH
IG
AMTR
IC
HL
UKG
CAN
BOS
NO
RBO
TSW
ZPO
LM
AGESTIR
E
SRI
BELJAMAR
GC
ZRLAOM
AW
ITAG
MY
SWD
TAJAU
SAZEN
IRKZKN
EPU
GA
MLI
LEBC
HA
NIG
BNG
PARSENG
HA
NAM
DR
VG
ABAFGBAHM
AABU
ISW
AR
OK
ZIM
BLRJO
RTO
GIR
QG
UI
EQG
LESM
YAD
RC
DJI
OM
ASIN
MO
RC
AOC
UB
ALGSYRPR
K
●
●
● ●
●
●
●●
●
● ● ●●
● ● ●
● ●● ● ●
●●
●● ● ● ● ●
● ● ● ● ● ● ●
●
● ●●
●● ● ●
●● ● ● ● ● ● ●
●● ● ●
● ●●
● ●● ●
● ● ● ● ● ●● ●
● ●● ●
● ● ● ● ● ●● ● ● ● ●
● ●●
● ● ●●
●● ● ● ● ●
●●
●
●
●● ● ●
●● ● ● ● ● ●
● ● ● ● ●●
● ● ● ●● ● ● ●
● ● ● ● ●
● ● ●●
●● ●
●●
●
●● ●
● ●
●
●
●
HighTurnoverIn Democracy
In DemocracyTurnoverLow
−0.
50.
00.
5
Figure 3: Political Cost of Democracy
Notes. This figure plots our estimates—along with 90% confidence intervals—of the baseline political cost ofdemocracy, fi, across countries. Positive (negative) values imply higher (lower) elite turnover, all else equal,under democracy than autocracy.
23
KM (Thousands)
Cor
rela
tion
of B
elie
fs
0 5 10 15 20
0.0
0.2
0.4
0.6
0.8
1.0
Spatial Decay of Learning
USA−Ecuador
USA−CAR
Figure 4: Cross-Country Correlation in Prior Beliefs
Notes. This figure shows the spatial decay of learning, plotting our estimate of the cross-country correlationin prior beliefs about the economic effects of democracy as a function of geographic distance between capitals.For reference, Ecuador and the Central African Republic are approximately five and ten thousand kilometers,respectively, from the U.S.
24
pre-sample period (1875-1950), democracies grew, on average, about 0.4 percentage points
faster than autocracies. The median of mean initial beliefs in the top panel of the figure is
consistent with this statistic.
While estimated beliefs remain relatively flat for the first three decades of the in-sample
period, in the 1980s and, even more dramatically, in the 1990s there is a sharp expansion of
beliefs in favor of democracy’s superior potential to foster economic growth. This divergence
in beliefs reflects an increasingly large gap between autocracies and democracies in observed
economic performance. Between 1951-1979, little new information was revealed about the
relative ability of democracy to generate growth: on average over this period, autocracies
grew just 0.37 percentage points slower than democracies, roughly identical to the observed
difference in the pre-sample period. By contrast, between 1980-2000, democracies produced,
on average, 1.54 percentage points higher annual growth than autocracies.
This widening gap in economic performance—accelerating through the 1980s—reached
a peak in 1987, when democracies outgrew autocracies by 3.3 percentage points. The re-
sulting discrepancy in observed growth rates from countries’ prior expectations led them to
progressively update their beliefs. Together with our estimates of the (de)stabilizing effects
of GDP growth on elite turnover, this change in worldwide beliefs helps explain our model’s
ability to correctly predict the striking rise in the percentage of world democracies observed
in the same period (Figure 1).35
This process operated through changes in both the mean and precision of beliefs. After
the oil crisis, as democracies came to outperform autocracies, this induced countries to
revise up their estimates of democracy’s superior economic potential, which encouraged
transitions to democracy. As the first set of countries transitioned, democracy became
less rare worldwide, reducing belief uncertainty about its relative economic merits. This
helped solidify the growing consensus, leading to further democratization. The result was
35In our out-of sample period, we observe a decline in the relative performance of democracies, whichgrew just 0.67 percentage points faster than autocracies. As expected, we estimate a concomitant decline inaverage beliefs about the economic benefits of democracy. See Online Appendix A2 for further discussion.
25
1950 1960 1970 1980 1990 2000 2010
−1
01
2
βD=1
−βD
=0
Worldwide Distribution of Beliefs
1950 1960 1970 1980 1990 2000 2010
12
34
56
Distribution of Belief Uncertainty
Sta
ndar
d D
evia
tion
Figure 5: Worldwide Beliefs about Economic Effects of Democracy
Notes. The top panel of this figure plots estimates of the 20th-80th worldwide percentiles of the meanof beliefs about the percentage-point difference in long-run GDP growth rates under democracy versusautocracy. The lower panel shows estimates of the 20th-80th worldwide percentiles of uncertainty (standarddeviation) in these beliefs.
26
the cascade in beliefs between the 1980s and mid-1990s shown in Figure 5 and, ultimately,
the wave of democratization that followed over the same period.
Counterfactuals
Our model allows us to explore counterfactual experiments of three types. First, it enables
us to understand how systemic shocks to prosperity impact the worldwide prospects for
democracy. Second, it allows us to ask retrospective questions about historical democratiza-
tion events. Third, our model allows us to prospectively explore conditions that would lead
current elites to transition to or from democracy. We present results of each type in turn.
To conduct these experiments, it is necessary to specify and estimate the “true” data
generating process. A considerable advantage of using our baseline no-covariates model to
generate these counterfactuals is that only an estimate of the true relationship between
GDP growth and democracy is required. Following Buera, Monge-Naranjo and Primiceri
(2011), we assume that this relationship is described by a hierarchical linear model similar
to (2), which we estimate using all available data between 1875-2000 (excluding the two
world wars).36 This specification is appealing for its flexibility and because it is consistent
with our model of learning in that elites with beliefs (2) and prior (3) would eventually learn
the truth over time.
A Second Great Depression
In the year before the market crash of 1929, 51% of the world’s independent states were
democracies. A year later this proportion dropped to just over 43%. By 1935, the fraction
of countries that remained democratic decreased by another 5%, reaching a low of 36% by
36Specifically, we assume that
yi,t = (1−Di,t)bD=0i +Di,tb
D=1i + ei,t,
et ∼ N(0, S ·Q · S), b ∼ N(b, I2 ⊗ S ·W · S),
where S is a diagonal matrix, Qi,j = exp(−Zi,jζQ), and Wi,j = λ exp(−Zi,jζW ).
27
1938. In Europe, democratic backsliding was even starker. At its theretofore high in 1920,
twenty-six out of twenty-eight European states were democratic, but by 1938 thirteen of
these countries had transitioned away from democracy.
A substantial body of both historical and quantitative research has linked the global
decline of democracy in the inter-war period directly to the economic downturn of the Great
Depression.37 Likewise, both in academic and popular discourse, the global economic reces-
sion of 2008 has been put forward as a contributing factor in the observed wave of recent
democratic breakdowns.38 Next, we provide evidence that systemic economic crises indeed
engender reversals to autocracy and, moreover, highlight how this is driven by changes in
beliefs about the economic effects of democracy.
To that end, we simulate two sorts of crises. First, we generate a “short-deep” counter-
factual crisis where, for our last in-sample year (2000), we simulate a 5.9% average worldwide
contraction of per capita GDP, comparable to the worst year of the Great Depression (1931).
In our second crisis, we construct a “long-shallow” counterfactual condition where we perturb
growth by a smaller amount—1.7% annually (the average contraction between 1929-1933)—
but extend this contraction over a five-year period (2000-2004). We present results where
we concentrate these counterfactual conditions in autocratic and democratic countries, re-
spectively. Under the “autocratic-bias” condition, recessions are twice as deep in autocracies
as in democracies, while keeping the worldwide average contraction consistent with our in-
tervention. Conversely, under the “democratic-bias” condition, recessions are twice as deep
in democracies.
Model estimates of the worldwide percentage of democracies are shown in Figure 6. In the
left-hand panel, we present estimates from the autocratic-bias condition and, in the right-
hand panel, estimates from the democratic-bias condition. In both plots, the short-deep
counterfactual is shown in green, and the long-shallow counterfactual is shown in purple.
Note that both the short-deep and long-shallow counterfactual crises negatively impact the
37Frey and Weck (1983); de Bromhead, Eichengreen and O’Rourke (2013).38Bartels (2013); Armingeon and Guthmann (2014).
28
worldwide prevalence of democracy. While the large single-period decline in growth has a
larger impact in the first year, thereafter the proportion of democracies begins to recover.
In comparison, the smaller but longer crisis has a larger overall impact, with the proportion
of democracies continuing to decline through the duration of the economic contraction and
then recovering more slowly.
% W
orld
Dem
ocra
cies
1950 1960 1970 1980 1990 2000 2010
2530
3540
4550
55
True Value
Model Prediction
Long−Shallow
Short−Deep
Global Economic DepressionAutocratic Bias
% W
orld
Dem
ocra
cies
1950 1960 1970 1980 1990 2000 2010
2530
3540
4550
55
Global Economic DepressionDemocratic Bias
Figure 6: Second Great Depression Counterfactuals
Notes. This figure reports results from two counterfactual exercises. First, we present predictions followinga short-deep crisis of a single year (2000) of 6% average worldwide economic contraction (green). Second, wepresent predictions based on a long-shallow crisis of 5 years (2000-2004) of 2% average annual contraction(purple). We plot the true percentage of world democracies (gray) as well as baseline model predictions(blue). We concentrate the contraction in autocracies in the left-hand panel and in democracies in theright-hand panel.
Both patterns are consistent with our actors learning about the economic merits of democ-
racy. In both counterfactuals, the initial shock forces agents to revise their beliefs. However,
in the long-shallow counterfactual, as would be expected from a continued process of learning,
our agents update their beliefs following each additional negative perturbation of worldwide
growth and continue to select out of democracy accordingly. In contrast, in the short-deep
scenario, we observe a single large drop in the percentage of democracies. Since after the
initial shock to growth there is no “new” information revealed to our agents, the world-
wide percentage of democracies starts to recover. Importantly, comparing effects across the
29
two bias conditions, it is clear that the reduction in world democracies is larger when the
economic contraction is concentrated among democracies. This is consistent with the evo-
lution of beliefs prior to our intervention, as discussed above. When democracies perform
poorly, counter to the prevailing consensus, elites sharply revise their beliefs and select out
of democracy.
The Third Wave
A number of studies highlight the influence of external actors on the prospects for democ-
racy.39 Especially for the early “third-wave” democratization events in Greece, Portugal, and
Spain, the potential for accession to the European Community (EC) has been put forward as
a crucial determinant of their respective transitions.40 Brussels’ requirement that community
members maintain a form of government consistent with liberal democracy, coupled with the
economic benefits of access to the common market, generated an incentive to democratize.
In this section, we show that, rather than serving as an institutional target, much of the im-
pact the EC had upon third-wave democratization operated through its constituent states’
economic performance, which affected beliefs about the economic benefits of democracy in
potential member states.
To show this, we construct a counterfactual wherein we generate a recession in the EC’s
three largest economies, Britain, France, and Germany, of 2% average annual contraction for
the two years preceding the first transition of Greece in 1974.41 Then, to obtain an estimate
of this counterfactual recession’s impact on the transitions of Greece, Portugal, and Spain,
we compare our predicted transitions in this counterfactual world with the truth. Our results
are given in Figure 7.
For Portugal and Spain, the effect of lower growth in Britain, France, and Germany is
considerable, delaying their transitions to democracy by almost 20 years. In contrast, we
39Pevehouse (2005).40Whitehead (1996)41Recessions in our sample last 2 years on average, with an average 2% annual drop in per capita GDP.
30
1980 1990 2000
Greece
Portugal
Spain
1980 1990 2000
Greece
Portugal
Spain
Figure 7: Counterfactual Third Wave
Notes. The top panel shows the true timeline of democratization of Greece, Portugal, and Spain. In thebottom panel, we present the corresponding timeline under our counterfactual scenario of a two-year recession(1972 and 1973) in Britain, France, and Germany of 2% average annual contraction.
find no effect of this recession on Greece’s transition. The reason for this becomes apparent
once we compare the evolution of beliefs in the three countries. We plot in Figure 8, for
each of these cases, our estimates of beliefs under the observed economic conditions (solid)
and under our counterfactual timeline (dashed). Note that for Portugal and Spain there
is a substantial divergence in beliefs between the observed and counterfactual timelines.
Following our counterfactual recession, Portuguese and Spanish beliefs become markedly
less favorable towards the potential for growth under democracy. In contrast, for Greece
this is not the case: there is no substantial difference in beliefs. The reason, according to
our model, is that Greek elites pay little attention to the large, relatively distant Western
economies used to construct our counterfactual scenario. Britain, France, and Germany are
simply too different from Greece to be used as a reference for learning.
Ultimately, our counterfactual experiment suggests that the distinguishing characteristic
of Portugal and Spain, in contrast to, for example, Brazil or Mexico, is not their underlying
propensity for democracy. In terms of their estimated baseline political cost of democracy,
Portugal and Brazil and Spain and Mexico are fairly close.42 Rather, Portugal and Spain
democratized early because they learned from Western Europe, benefiting from proximity
to successful liberal democracies.
42The baseline political cost of democracy for Portugal and Brazil is estimated at 0.01 and -0.14, and interms of rank order they are 71 and 39, respectively. Estimates for Spain and Mexico are 0.06 and 0.12,yielding rank orders of 83 and 93, respectively.
31
−6
−5
−4
−3
−2
Truth
Counterfactual
Spain
−4
−3
−2
−1
0
Portugal
−5
−4
−3
Greece
1975 1980 1985 1990 1995 2000 2005 2010
Beliefs
βD=1
−βD
=0
Figure 8: Counterfactual Third-Wave Beliefs
Notes. We plot estimates of Spain, Portugal, and Greece’s mean beliefs under the true growth rates (solid)and under a counterfactual two-year recession (1972 and 1973) in Britain, France, and Germany of 2%average annual contraction (dashed).
32
Chinese (and North Korean) Democracy
With an eye to contemporary politics, we evaluate the stability of a pair of geopolitically
important autocracies. We explore conditions under which our model predicts China and
North Korea would democratize. We focus first on the Chinese case. Here, we look for the
minimal average growth rate among China’s democratic neighbors, in a five-year economic
expansion, that would result in transitions of two types.43 First, we find the rate of growth
that would result in at least a single year of democracy. Second, we establish the growth
rate that would deliver a permanent transition to democracy.
To obtain a predicted single-year Chinese transition to democracy, we estimate that it
would take five years (2000-2004) of 6.5% average annual growth in China’s democratic
neighbors. After this single year of democracy (2005), our model predicts an immediate re-
versal to autocracy. To obtain a “permanent” transition—that is, a prediction of democracy
until the end of our sample—we estimate that China’s democratic neighbors would have to
grow at an average annual rate of 11% between 2000-2004. In contrast, under the same
set of counterfactual conditions, North Korea would not democratize for any period. North
Korea would democratize for a single year following five years of 16.5% average growth in
its democratic neighbors, and permanently following five years of 20.5% average growth.
To highlight the differences in learning, Figure 9 plots Chinese and North Korean be-
liefs under the “Chinese democracy” scenarios of 6.5% and 11% average growth in their
democratic neighbors. While our intervention increases the perception that democracy out-
performs autocracy in both countries, the change in beliefs in North Korea is too small
to induce a transition.44 These results suggest that the prospects for Chinese and North
Korean democracy are limited. It would take a remarkably large economic boom in Asian
democracies for China to democratize, and an even larger, implausible expansion to generate
43Economic expansions in our sample last five years on average.44This stems from both higher initial skepticism in North Korean beliefs and a smaller effective neigh-
borhood from which to learn. While North Korea primarily learns from South Korea and Japan, Chinaadditionally draws from the experiences of its democratic neighbors to the west and south.
33
the same outcome in North Korea.45
β CH
ND
=1−
β CH
ND
=0
1980 1985 1990 1995 2000 2005 2010
−2
02
46
Chinese Beliefs
11.0%6.5%
β PR
KD
=1−
β PR
KD
=01980 1985 1990 1995 2000 2005 2010
−2
02
46
North Korean Beliefs
Figure 9: Chinese and North Korean Beliefs
Notes. We plot estimates of Chinese and North Korean mean beliefs about the economic impact of democ-racy under observed growth (solid) and under two counterfactual five-year expansions (2000-2004) in theirdemocratic neighbors of 6.5% (dashed) and 11% (dots) average annual growth.
Conclusion
We propose and estimate a learning model of democratization in which incumbent elites rely
on worldwide economic history to update their beliefs about the impact of democracy on
economic growth, which affects their likelihood of retaining power. Our estimates indicate
that growth is stabilizing for incumbents in democracies but destabilizing in autocracies.
Furthermore, we show that learning is highly circumscribed geographically. In combination,
these features allow us to successfully predict, both in (1951-2000) and out of sample (2001-
2010), much of the observed variation in democracy adoption. In particular, our model
45Scholars of Chinese politics frequently argue that Chinese citizens are willing to reward economic growthwith regime survival (Laliberte and Lanteigne, 2008), which suggests there could be considerable heterogene-ity in θ across countries. We explore this to some extent in Online Appendix A7 and find limited evidence.Relatedly, the conventional wisdom is that China will democratize if growth slows down. While an economiccrisis may lead to a transition of power in China, our results caution that a transition to democracy wouldrequire confidence by new elites in its superior economic potential.
34
jointly rationalizes the cross-sectional correlation between income and democracy and the
clustering of transitions to and from democracy.
Rather than an “end of history,” we show that democracy is only as resilient as the
economic performance it engenders. Even in the 1990s, when the success of democratic
systems made such proclamations seem reasonable, we find both substantial variation in
the worldwide distribution of beliefs about the economic consequences of democracy and
substantial uncertainty in these beliefs. As recent history suggests, and as our counterfac-
tual experiments demonstrate, systemic economic crises, particularly those concentrated in
democracies, have the potential to generate waves of autocratic reversals.
Our paper contributes to a sizable literature on democracy and development. Notably,
our results indicate that focusing on the within-country impact of economic development on
democratization is insufficient. Rather, modernization should be understood as a systemic
phenomenon. Because countries learn from each other, a country’s economic performance
affects not only its own likelihood of transitioning to or from democracy but also the prospects
for democracy outside of its borders through its influence on neighbors’ beliefs.
35
References
Acemoglu, Daron, Simon Johnson, James A. Robinson and Pierre Yared. 2008. “Income andDemocracy.” American Economic Review 98(3):808–842.
Acemoglu, Daron, Simon Johnson, James A. Robinson and Pierre Yared. 2009. “Reevaluating themodernization hypothesis.” Journal of Monetary Economics 56:1043–1058.
Acemoglu, Daron, Suresh Naidu, Pascual Restrepo and James A. Robinson. 2019. “DemocracyDoes Cause Growth.” Journal of Political Economy 127(1):47–100.
Ahlquist, John S and Erik Wibbels. 2012. “Riding the Wave: World Trade and Factor-BasedModels of Democratization.” American Journal of Political Science 56(2):447–464.
Alesina, Alberto and Dani Rodrik. 1994. “Distributive Politics and Economic Growth.” QuarterlyJournal of Economics 109(2):465–490.
Alesina, Alberto, Nouriel Roubini and Gerald D. Cohen. 1997. Political Cycles and the Macroecon-omy. Cambridge: MIT Press.
Almond, Gabriel A. and Sidney Verba. 1963. The Civic Culture: Political Attitudes and Democracyin Five Nations. Princeton: Princeton University Press.
Armingeon, Klaus and Kai Guthmann. 2014. “Democracy in crisis? The declining support fornational democracy in European countries, 2007–2011.” European Journal of Political Research53(3):423–442.
Ascencio, Sergio J. and Miguel R. Rueda. 2019. “Partisan Poll Watchers and Electoral Manipula-tion.” American Political Science Review 113(3):727–742.
Barro, Robert J. 1996. “Democracy and Growth.” Journal of Economic Growth 1(1):1–27.
Barro, Robert J. 1999. “Determinants of Democracy.” Journal of Political Economy 107(S6):S158–S183.
Bartels, Larry M. 2013. “Political Effects of the Great Recession.” Annals of the American Academyof Political and Social Science 650(1):47–76.
Boix, Carles. 2011. “Democracy, Development, and the International System.” American PoliticalScience Review 105(04):809–828.
Boix, Carles, Michael Miller and Sebastian Rosato. 2013. “A Complete Data Set of PoliticalRegimes, 1800–2007.” Comparative Political Studies 46(12):1523–1554.
Boix, Carles and Susan C. Stokes. 2003. “Endogenous Democratization.” World Politics 55(4):517–549.
Bolton, Patrick and Christopher Harris. 1999. “Strategic Experimentation.” Econometrica67(2):349–374.
Bramoullee, Yann, Rachel Kranton and Martin D’Amours. 2014. “Strategic Interaction and Net-works.” American Economic Review 104(3):898–930.
36
Brender, Adi and Allan Drazen. 2008. “How Do Budget Deficits and Economic Growth AffectReelection Prospects? Evidence from a Large Panel of Countries.” American Economic Review98(5):2203–2220.
Brinks, Daniel and Michael Coppedge. 2006. “Diffusion Is No Illusion: Neighbor Emulation in theThird Wave of Democracy.” Comparative Political Studies 39(4):463–489.
Buera, Francisco J., Alexander Monge-Naranjo and Giorgio E. Primiceri. 2011. “Learning theWealth of Nations.” Econometrica 79(1):1–45.
Chen, Jie and Chunlong Lu. 2011. “Democratization and the Middle Class in China: The MiddleClass’s Attitudes toward Democracy.” Political Research Quarterly 64(3):705–719.
Crisman-Cox, Casey and Michael Gibilisco. 2018. “Audience Costs and the Dynamics of War andPeace.” American Journal of Political Science 62(3):566–580.
Dahl, Robert A. 1998. On Democracy. New Haven: Yale University Press.
Davison, Anthony C. and David V. Hinkley. 1997. Bootstrap Methods and their Application. InCambridge Series in Statistical and Probabilistic Mathematics. New York: Cambridge UniversityPress.
de Bromhead, Alan, Barry Eichengreen and Kevin H. O’Rourke. 2013. “Political Extremism in the1920s and 1930s: Do German Lessons Generalize?” Journal of Economic History 73(2):371–406.
Diamond, Larry. 2011. “The Impact of the Economic Crisis: Why Democracies Survive.” Journalof Democracy 22(1):17–30.
Dube, Jean-Pierre, Jeremy T. Fox and Che-Lin Su. 2012. “Improving the Numerical Performanceof Static and Dynamic Aggregate Discrete Choice Random Coefficients Demand Estimation.”Econometrica 80(5):2231–2267.
Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer. 2015. “The Next Generation of thePenn World Table.” American Economic Review 105(10):3150–3182.
Frey, Bruno S. and Hannelore Weck. 1983. “A Statistical Study of the Effect of the Great Depressionon Elections: The Weimar Republic, 1930–1933.” Political Behavior 5(4):403–420.
Garcıa-Jimeno, Camilo. 2016. “The Political Economy of Moral Conflict: An Empirical Study ofLearning and Law Enforcement under Prohibition.” Econometrica 84(2):511–570.
Gehlbach, Scott, Konstantin Sonin and Milan W. Svolik. 2016. “Formal Models of NondemocraticPolitics.” Annual Review of Political Science 19:565–584.
Gleditsch, Kristian S. 2002. “Expanded Trade and GDP Data (version 6.0).” Journal of ConflictResolution 46(5):712–724.
Gleditsch, Kristian S. and Michael D Ward. 2006. “Diffusion and the International Context ofDemocratization.” International Organization 60(4):911–933.
Goemans, Henk E, Kristian S. Gleditsch and Giacomo Chiozza. 2009. “Introducing Archigos: ADataset of Political Leaders.” Journal of Peace Research 46(2):269–283.
37
Heiss, Florian and Viktor Winschel. 2008. “Likelihood approximation by numerical integration onsparse grids.” Journal of Econometrics 144(1):62–80.
Hibbs, Douglas A. 1977. “Political Parties and Macroeconomic Policy.” American Political ScienceReview 71(4):1467–1487.
Hirschman, Albert O. and Michael Rothschild. 1973. “The Changing Tolerance for Income Inequal-ity in the Course of Economic Development.” Quarterly Journal of Economics 87(4):544–566.
Houle, Christian, Mark A. Kayser and Jun Xiang. 2016. “Diffusion or Confusion? Clustered Shocksand the Conditional Diffusion of Democracy.” International Organization 70(4):687–726.
Huntington, Samuel P. 1968. Political Order in Changing Societies. New Haven: Yale UniversityPress.
Huntington, Samuel P. 1993. The Third Wave: Democratization in the Late Twentieth Century.Norman: University of Oklahoma Press.
Kadera, Kelly M., Mark J. C. Crescenzi and Megan L. Shannon. 2003. “Democratic Survival, Peace,and War in the International System.” American Journal of Political Science 47(2):234–247.
Laliberte, Andre and Marc Lanteigne, eds. 2008. The Chinese Party-State in the 21st Century:Adaptation and the Reinvention of Legitimacy. New York: Routledge.
Lipset, Seymour M. 1959. “Some Social Requisites of Democracy: Economic Development andPolitical Legitimacy.” American Political Science Review 53(1):69–105.
Londregan, John B. and Keith T. Poole. 1996. “Does High Income Promote Democracy?” WorldPolitics 49(1):1–30.
Maddison, Angus. 2010. “Statistics on World Population, GDP and Per Capita GDP, 1-2008 AD.”URL: http://www.ggdc.net/maddison/oriindex.htm
Mansfield, Edward D., Helen V. Milner and B. Peter Rosendorff. 2000. “Free to Trade: Democracies,Autocracies, and International Trade.” American Political Science Review 94(2):305–321.
Matern, Bertil. 1960. “Spatial variation: Stochastic models and their application to some problemsin forest surveys and other sampling investigations.” Meddelanden fran Statens Skogsforskn-ingsinstitut 49(5):1–144.
Miller, Michael K. 2016. “Democracy by Example? Why Democracy Spreads When the World’sDemocracies Prosper.” Comparative Politics 49(1):83–116.
Mossel, Elchanan, Allan Sly and Omer Tamuz. 2015. “Strategic Learning and the Topology ofSocial Networks.” Econometrica 83(5):1755–1794.
Norris, Pippa, ed. 1999. Critical Citizens: Global Support for Democratic Government. Oxford:Oxford University Press.
Olson, Mancur. 1963. “Rapid Growth as a Destabilizing Force.” Journal of Economic History23(4):529–552.
38
Persson, Torsten and Guido Tabellini. 1994. “Is Inequality Harmful for Growth?” AmericanEconomic Review 84(3):600–621.
Pevehouse, Jon C. 2005. Democracy from Above: Regional Organizations and Democratization.Cambridge: Cambridge University Press.
Primiceri, Giorgio E. 2006. “Why Inflation Rose and Fell: Policymakers’ Beliefs and US PostwarStabilization Policy.” Quarterly Journal of Economics 121(3):867–901.
Przeworski, Adam and Fernando Limongi. 1997. “Modernization: Theories and Facts.” WorldPolitics 49(2):155–183.
Przeworski, Adam, Michael E. Alvarez, Jose A. Cheibub and Fernando Limongi. 2000. Democracyand Development: Political Institutions and Well-Being in the World, 1950-1990. Cambridge:Cambridge University Press.
Rodrik, Dani. 1999. “Democracies Pay Higher Wages.” Quarterly Journal of Economics 114(3):707–738.
Skocpol, Theda. 1979. States and Social Revolutions: A Comparative Analysis of France, Russiaand China. Cambridge: Cambridge University Press.
Spolaore, Enrico and Romain Wacziarg. 2009. “The Diffusion of Development.” Quarterly Journalof Economics 124(2):469–529.
Su, Che-Lin and Kenneth L Judd. 2012. “Constrained Optimization Approaches to Estimation ofStructural Models.” Econometrica 80(5):2213–2230.
Svolik, Milan W. 2013. “Learning to Love Democracy: Electoral Accountability and the Successof Democracy.” American Journal of Political Science 57(3):685–702.
Svolik, Milan W. 2019. “Democracy as an equilibrium: rational choice and formal political theoryin democratization research.” Democratization 26(1):40–60.
Torfason, Magnus Thor and Paul Ingram. 2010. “The global rise of democracy: A network account.”American Sociological Review 75(3):355–377.
Treisman, Daniel. 2015. “Income, Democracy, and Leader Turnover.” American Journal of PoliticalScience 59(4):927–942.
Wejnert, Barbara. 2005. “Diffusion, Development, and Democracy, 1800-1999.” American Socio-logical Review 70(1):53–81.
Whitehead, Laurence, ed. 1996. The International Dimensions of Democratization: Europe andthe Americas. Oxford: Oxford University Press.
39
Online Appendix
A1 List of Countries
Table A1 lists all countries in our main sample used to estimate the baseline specification of
our model.
Table A1: Main Sample
Country Years Notes Country Years Notes
United States 1951-2000 Canada 1951-2000Cuba 1951-2000 Dominican Republic 1951-2000
Jamaica 1964-2000 Trinidad and Tobago 1964-2000Mexico 1951-2000 Guatemala 1951-2000
Honduras 1951-2000 El Salvador 1951-2000Nicaragua 1951-2000 Costa Rica 1951-2000Panama 1951-2000 Colombia 1951-2000
Venezuela 1951-2000 Ecuador 1951-2000Peru 1951-2000 Brazil 1951-2000
Bolivia 1951-2000 Paraguay 1951-2000Chile 1951-2000 Argentina 1951-2000
Uruguay 1951-2000 United Kingdom 1951-2000Ireland 1951-2000 Netherlands 1951-2000Belgium 1951-2000 France 1951-2000
Switzerland 1951-2000 Spain 1951-2000Portugal 1951-2000 Germany 1951-2000 Federal Republic of Germany from 1951-1990.Poland 1951-2000 Austria 1951-2000
Hungary 1951-2000 Czech Republic 1951-2000 Czechoslovakia from 1951-1992.Slovakia 1994-2000 Italy 1951-2000Albania 1951-2000 Macedonia 1993-2000Croatia 1992-2000 Yugoslavia 1951-1991
Bosnia and Herzegovina 1993-2000 Slovenia 1992-2000Greece 1951-2000 Bulgaria 1951-2000
Moldova 1992-2000 Romania 1951-2000Russia 1951-2000 U.S.S.R. from 1951-1991. Estonia 1993-2000Latvia 1992-2000 Lithuania 1992-2000
Ukraine 1992-2000 Belarus 1993-2000Armenia 1993-2000 Georgia 1993-2000
Azerbaijan 1993-2000 Finland 1951-2000Sweden 1951-2000 Norway 1951-2000
Denmark 1951-2000 Republic of Guinea-Bissau 1976-2000Equatorial Guinea 1969-2000 Gambia 1967-2000
Mali 1962-2000 Senegal 1962-2000Benin 1961-2000 Mauritania 1962-2000Niger 1962-2000 Cote D’Ivoire 1962-2000
Republic of Guinea 1960-2000 BurkinaFaso 1962-2000Liberia 1951-2000 Sierra Leone 1963-2000Ghana 1958-2000 Togo 1962-2000
Cameroon 1961-2000 Nigeria 1962-2000Gabon 1962-2000 Central African Republic 1962-2000Chad 1962-2000 Republic of the Congo 1962-2000Zaire 1962-2000 Congo-Leopoldville from 1962-1965, Democratic Republic of Congo from 1997-2000. Uganda 1964-2000Kenya 1965-2000 Tanzania 1963-2000
Burundi 1964-2000 Rwanda 1963-2000Somalia 1962-1991 Djibouti 1979-2000Ethiopia 1994-2000 Angola 1977-2000
Mozambique 1977-2000 Zambia 1966-2000Zimbabwe 1967-2000 Malawi 1966-2000
South Africa 1951-2000 Namibia 1992-2000Lesotho 1968-2000 Botswana 1968-2000
Swaziland 1970-2000 Madagascar 1962-2000Comoro Islands 1977-2000 Mauritius 1970-2000
Morocco 1958-2000 Algeria 1964-2000Tunisia 1951-2000 Libya 1953-2000Sudan 1957-2000 Iran 1951-2000Turkey 1951-2000 Iraq 1951-2000Egypt 1951-2000 Syria 1951-2000
Lebanon 1951-2000 Jordan 1951-2000Israel 1951-2000 Saudi Arabia 1951-2000
Kuwait 1952-2000 Bahrain 1973-2000Qatar 1973-2000 United Arab Emirates 1973-2000Oman 1951-2000 Afghanistan 1951-2000
Turkmenistan 1992-2000 Tajikistan 1993-2000Kyrgyzstan 1992-2000 Uzbekistan 1992-2000Kazakhstan 1992-2000 China 1951-2000
Mongolia 1951-2000 Taiwan 1951-2000North Korea 1951-2000 South Korea 1951-2000
Japan 1951-2000 India 1951-2000Pakistan 1951-2000 Bangladesh 1973-2000Burma 1951-2000 Sri Lanka 1951-2000Nepal 1951-2000 Thailand 1951-2000
Cambodia 1956-2000 Laos 1955-2000Democratic Republic of Vietnam 1951-2000 Socialist Republic of Vietnam from 1976-2000. Malaysia 1959-2000
Singapore 1961-2000 Philippines 1951-2000Indonesia 1951-2000 Australia 1951-2000
New Zealand 1951-2000
i
A2 Preliminary Evidence
To motivate our model, we estimate a pair of reduced-form regressions. The purpose is to
demonstrate two empirical patterns underpinning our structural approach: (i) democracy
adoption systematically covaries with observed differences in economic performance between
neighboring democracies and autocracies—which is suggestive of the learning process we
model—and (ii) economic growth has a differential impact on elite turnover under democracy
versus autocracy.
First, we estimate the following linear probability model via ordinary least squares:
Di,t = φ0 + φ1Di,t−1 + φ2
[yD=1i,t−1 − yD=0
i,t−1]
+ φ3Di,t−1 + ui,t, (A1)
where yD=1i,t−1 and yD=0
i,t−1 are weighted averages of past GDP per capita growth rates among
country i’s democratic and autocratic neighbors, respectively.46 Thus, yD=1i,t−1 − yD=0
i,t−1 proxies
for beliefs in country i about the impact of democracy on economic growth. We vary the size
of i’s effective neighborhood by setting the decay parameter equal to the estimated value from
our structural model (γ = 0.4234) as well as twice (2γ) and half this value (12γ). Furthermore,
we control for Di,t−1, the weighted proportion of democracies in i’s neighborhood.
If φ2 6= 0, then, consistent with our model, democracy adoption systematically covaries
with observed growth differences in democratic versus autocratic neighbors. We evaluate
this hypothesis in Table A2, which shows that, across specifications and spatial weights, the
reduced-form evidence is indeed consistent with our structural estimates: as φ2 > 0, the
likelihood of democracy increases with superior economic performance in democracies.
Second, to motivate incumbents’ objective function in our model, we show that the effect
of GDP growth on their likelihood of retaining power in the subsequent period is heterogenous
46Formally, as in Buera, Monge-Naranjo and Primiceri (2011), we compute yD=1i,t−1 − yD=0
i,t−1 =t−1∑
τ=t−3
(∑j 6=i exp(−γdi,j)yjτDjτ∑j 6=i exp(−γdi,j)Djτ
−∑j 6=i exp(−γdi,j)yjτ (1−Djτ )∑j 6=i exp(−γdi,j)(1−Djτ )
), where di,j is the distance between
countries i and j.
ii
Table A2: Reduced-Form Evidence of Effect of Learning about Growth on Democracy
Dependent variable: Di,t
(I) (II) (III) (IV) (V) (VI) (VII) (VIII) (IX) (X) (XI) (XII)
Decay:12γ γ 2γ
φ1 0.854∗∗∗ 0.851∗∗∗ 0.849∗∗∗ 0.849∗∗∗ 0.854∗∗∗ 0.851∗∗∗ 0.849∗∗∗ 0.849∗∗∗ 0.853∗∗∗ 0.851∗∗∗ 0.849∗∗∗ 0.850∗∗∗
(0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007)
φ2 0.801∗∗∗ 0.527∗∗ 0.556∗∗ 0.527∗∗ 0.419∗∗∗ 0.279∗ 0.289∗ 0.258∗ 0.214∗∗ 0.163∗ 0.161∗ 0.141(0.202) (0.217) (0.217) (0.221) (0.149) (0.154) (0.154) (0.156) (0.097) (0.098) (0.098) (0.098)
φ3 0.121∗∗∗ 0.128∗∗∗ 0.128∗∗∗ 0.133∗∗∗ 0.117∗∗∗ 0.114∗∗∗ 0.115∗∗∗ 0.115∗∗∗ 0.104∗∗∗ 0.096∗∗∗ 0.097∗∗∗ 0.096∗∗∗
(0.025) (0.025) (0.025) (0.026) (0.021) (0.021) (0.021) (0.022) (0.018) (0.018) (0.018) (0.018)
Controls:log(Yi,t−1) No Yes Yes Yes No Yes Yes Yes No Yes Yes YesTradei,t−1/Yi,t−1 No No Yes Yes No No Yes Yes No No Yes YesTime in Power No No No Yes No No No Yes No No No Yes
Observations 5,623 5,623 5,623 5,579 5,623 5,623 5,623 5,579 5,623 5,623 5,623 5,579
∗p<0.1, ∗∗p<0.05, ∗∗∗p<0.01
Notes. Ordinary least squares estimates from regression (A1). All models include country fixed effects.Standard errors in parentheses.
across democracies and autocracies, with its impact in democracies being consistently larger.
Specifically, we estimate the following linear probability model via least squares:
Pr(RetainPoweri,t+1|yi,t, Di,t) = λ0 + λ1yi,t + λ2Di,t + λ3yi,t ×Di,t. (A2)
Of course, given the selection into or out of democracy we model, estimates of λ will be
inconsistent. To address this without imposing additional structure, we use yD=1i,t−1, y
D=0i,t−1, and
Di,t−1 from (A1) to build instruments for the regressors in (A2).47 That is, in line with our
assumption that beliefs have no direct impact on outcomes (other than through institutional
choices), if we correctly measured beliefs, we could obtain consistent estimates of λ via two-
stage least squares. Since we view yD=1i,t−1 and yD=0
i,t−1 only as rough proxies for beliefs, however,
we take the instrumental-variables estimates of λ1 and λ3 with a grain of salt.
Estimates of λ1 and λ1 + λ3 measure the impact of economic growth on elite survival
under autocracy and democracy, respectively, while λ2 gives the difference in baseline survival
under democracy (the average of −fi in our structural model). Ordinary and two-stage least
47We set γ = 0.4234.
iii
Table A3: Heterogenous Impact of Growth on Incumbent Stability
Dependent variable: RetainPoweri,t+1
(I) (II) (III) (IV) (V) (VI) (VII) (VIII)
OLS 2SLS
λ1 0.336∗∗∗ 0.350∗∗∗ 0.353∗∗∗ 0.378∗∗∗ −26.391 −11.995 −13.179 −9.583(0.073) (0.073) (0.074) (0.074) (109.797) (22.195) (26.917) (15.707)
λ2 −0.081∗∗∗ −0.082∗∗∗ −0.083∗∗∗ −0.080∗∗∗ −1.161 −0.500 −0.533 −0.276(0.014) (0.014) (0.014) (0.014) (4.343) (0.731) (0.869) (0.291)
λ3 0.289∗ 0.288∗ 0.286∗ 0.245 91.411 37.182 41.071 27.540(0.168) (0.170) (0.170) (0.179) (378.318) (68.632) (83.795) (45.160)
Controls:log(Yi,t−1) No Yes Yes Yes No Yes Yes YesTradei,t−1/Yi,t−1 No No Yes Yes No No Yes YesTime in Power No No No Yes No No No Yes
Observations 5,966 5,924 5,919 5,845 5,622 5,622 5,622 5,578
∗p<0.1, ∗∗p<0.05, ∗∗∗p<0.01
Notes. Ordinary (OLS) and two-stage least squares (2SLS) estimates from regression (A2). All modelsinclude country fixed effects. Standard errors in parentheses.
squares estimates are presented in Table A3. Across specifications, we find that democracies
consistently reward incumbents for high rates of growth more than autocracies. Moreover,
the two-stage least squares results are consistent with our structural estimates in that growth
is stabilizing for incumbents under democracy but not under autocracy. These estimates are
imprecise, however, given that the instruments are only rough proxies for the learning process
we model.
A2.1 Growth in Democracies Versus Autocracies
Figure A1 illustrates the variation in the data that drives our main results by plotting
the evolution of the difference in growth rates in democracies versus autocracies. The plot
shows that, following the oil crisis, democracies experienced a markedly superior recovery.
This underlies the sharp revision of beliefs and subsequent transitions to democracy that
characterize the most consequential period in our sample.
iv
Figure A1: Difference in Average Growth Rates in Democracies Versus Autocracies
A3 Likelihood of the Data
Recall that W T ≡ {It, yt, Dt, Xt}Tt=1 denotes the set of all data available up to period T .
With a slight abuse of notation—using L to denote arbitrary densities of the data—the
likelihood function can be written as
L(W T |ϕ) =T∏t=1
L(Wt|W t−1, ϕ),
where Wt ≡ {It, yt, Dt, Xt} collects the data generated in period t. As discussed in the paper,
we assume that observed outcomes are only affected by actual choices and not by the beliefs
that led to those choices. That is, transitions of power (It), GDP growth (yt), and other
economic and political characteristics of countries (Xt) are shaped by realized institutions
(Dt), but they are not directly affected by beliefs about the potential effects of transitioning
v
into or out of democracy. Formally, this assumption allows us to write
L(Wt|W t−1, ϕ) = L(It, yt, Dt, Xt|W t−1, ϕ)
= L(It|yt, Dt, Xt,Wt−1)L(yt|Dt, Xt,W
t−1) · · ·
L(Dt|Xt,Wt−1, ϕ)L(Xt|W t−1),
which implies that L(W T |ϕ) ∝∏T
t=1 L(Dt|Xt,Wt−1, ϕ).
To compute L(Dt|Xt,Wt−1, ϕ), notice from (5) that, given (Xi,t,W
t−1, ϕ), there is a
threshold value of κi,t—the realized shock in period t to the political cost of democracy in
country i—such that Di,t = 1 if and only if κi,t falls below the threshold. This threshold
value, denoted κi,t(Xi,t,Wt−1, ϕ), is defined implicitly by
Ei,t−1
[exp(αi + θD=1(βD=1
i + εi,t)− fi −X ′i,tξ − κi,t(Xi,t,Wt−1, ϕ))
1 + exp(αi + θD=1(βD=1i + εi,t)− fi −X ′i,tξ − κi,t(Xi,t,W t−1, ϕ))
]
= Ei,t−1
[exp(αi + θD=0(βD=0
i + εi,t))
1 + exp(αi + θD=0(βD=0i + εi,t))
].
(A3)
Since κi,t is distributed independently across countries, the likelihood can be written as
L(W T |ϕ) ∝T∏t=1
n∏i=1
L(Di,t|Xi,t,Wt−1, ϕ),
where
L(Di,t|Xi,t,Wt−1, ϕ) = Φ
(κi,t(Xi,t,W
t−1, ϕ)
ςi
)Di,t[1− Φ
(κi,t(Xi,t,W
t−1, ϕ)
ςi
)]1−Di,t
and Φ denotes the standard Normal cumulative distribution function.
vi
A4 Prior
We set our prior over the model parameters in Table 1 as follows. We assume that
αii.i.d.∼ N(α, ω2
α),
θD=0,1 i.i.d.∼ N(θ, ω2θ),
βD=0i,0
i.i.d.∼ N(βD=0
0, ω2
β),
βD=1i,0
i.i.d.∼ N(βD=1
0, ω2
β),
vii.i.d.∼ IG(sv, dv),
fii.i.d.∼ N(f, ω2
f ),
ςii.i.d.∼ IG(sς , dς),
γi.i.d.∼ Uniform,
ξi.i.d.∼ Uniform,
where IG(s, d) denotes the Inverse-Gamma distribution with shape parameter s and scale
parameter d. We calibrate our prior using pre-sample data from 1875-1950 (excluding the
two world wars):
• We set βD=0
0= 0.0180 and βD=1
0= 0.0218, the average annual growth rates among
autocracies and democracies, respectively, in the pre-sample period. We then set ωβ =
0.02, allowing for considerable uncertainty about the mean of initial beliefs.
• We select sv = 3 and dv = 0.7423 so that the prior mean and standard deviation of viσi
equal the standard deviation of average growth rates, yi, in the pre-sample period. A
pre-sample estimate of the mean of σi (equal to 0.0531) is obtained from the residuals
of a regression of GDP growth on country and time fixed effects. We then set the prior
mean of vi equal to√
Var(yi)/0.0531 =√
0.0004/0.0531 = 0.3711.
• We set θ = 0 to adopt an agnostic starting point about whether GDP growth has a
vii
stabilizing or destabilizing effect on elite turnover across systems of government, and
we normalize ωθ = 1.
• To adopt an agnostic starting point regarding the political cost of democracy, we set
f = 0. We describe our choice of ωf below.
• To ensure prior belief correlations between 0 and 1, we adopt a flat (improper) prior
over γ ≥ 0. For the political cost of democracy, we center the variables in Xi,t around
their sample means so that Ki,t has an expected value of zero (in line with our agnostic
view of fi), and we adopt a flat (improper) prior over ξ.
• Letting θyi and KDi denote the within-country means of θD=Di,tyi,t and Ki,tDi,t, re-
spectively, and noting that αi+θyi−DKi approximately equals the log-odds of staying
in power in country i, we select α, ωα, and ωf to match the first two moments of these
log-odds across countries in the pre-sample period. Since E(αi + θyi −DKi) = α, we
set α = 1.8432, the average log-odds in the pre-sample period.48 Noting that the vari-
ance of the log-odds among autocracies is approximately equal to ω2α + Var(yi), while
the variance among democracies is approximately equal to ω2α + Var(yi) + ω2
f , we set
ω2α + 0.0005 = 0.722 and ω2
α + 0.0007 + ω2f = 1.0802, so ωα = 0.8494 and ωf = 0.5984.
• Finally, to discourage the model from fitting the data with large (absolute) realizations
of the unobserved political cost shock κi,t, we set sς = 3 and dς = 0.2992 so that ςi has
a prior mean and standard deviation of ωf/4 = 0.1496.
A5 Maximum-A-Posteriori Estimator: MPEC Approach
As noted in the paper, calculating κi,t(Xi,t,Wt−1, ϕ) from (A3) to evaluate the likelihood of
the data is computationally expensive. To avoid this burden, we follow the Mathematical
48For countries that experienced no elite turnover in the pre-sample period, we limit the probability ofstaying in power to equal the maximum among countries with turnover (95%).
viii
Programming with Equilibrium Constraints (MPEC) approach of Su and Judd (2012) to
compute our maximum-a-posteriori (MAP) estimator of ϕ. The idea behind this approach
is simple: instead of calculating κi,t(Xi,t,Wt−1, ϕ) at every trial value of ϕ, one treats each κi,t
as an auxiliary parameter and imposes (A3)—the optimality (or equilibrium) condition of the
model—as a feasibility constraint on the log-posterior maximization program. Accordingly,
we estimate ϕ by solving maxϕ,κ log(p(ϕ,κ|W T )
)subject to the constraint that κi,t satisfies
(A3) for all i and t.
As shown by Su and Judd (2012), MPEC and the standard approach of directly maxi-
mizing log(p(ϕ|W T )
)yield theoretically-identical estimates of ϕ. Computationally, MPEC’s
advantage arises from the fact that modern optimization algorithms do not enforce con-
straints until the final iteration of the search process. Thus, the computationally expensive
condition (A3) is satisfied exactly only once rather than at every trial value of ϕ. More-
over, for this reason, MPEC is robust to sensitivity issues that may arise from not setting
a sufficiently stringent convergence criterion when computing κi,t(Xi,t,Wt−1, ϕ) (Dube, Fox
and Su, 2012). A potential disadvantage is that, by introducing κ as additional parameters,
MPEC increases the size of the optimization problem. However, this concern is mitigated
by the sparsity that results from each auxiliary parameter κi,t entering a single constraint.
To reap the computational benefits of the MPEC approach, it is essential to employ op-
timization software tailor-made to handle large-scale problems—with thousands of variables
and nonlinear constraints. Accordingly, we implement our MPEC-MAP estimator using the
industry-leading software Knitro.49 Due to memory and computational constraints—our
baseline model with no covariates features 8,459 variables—our implementation relies on
Knitro’s Interior/Direct algorithm with their limited-memory quasi-Newton BFGS approx-
imation of the Hessian of the Lagrangian. Nevertheless, we provide exact first derivatives
of the log-posterior and constraints.50 With a 3.0 GHz machine, it takes about 5-6 days to
49https://www.artelys.com/en/optimization-tools/knitro50Knitro offers a derivative-check option—which our implementation passes—to test the code for exact
derivatives against finite-difference approximations.
ix
estimate our model once.
To mitigate concerns about potential local maxima, for each model specification we ran-
domly draw 5 sets of starting values for the optimization algorithm from the prior distribution
of the model parameters described in Appendix A4. We then select, for each specification,
the solution that achieves the highest log-posterior value. Reassuringly, there is very little
divergence in solutions across starting values.
Standard errors for our parameter estimates are parametrically bootstrapped (Davison
and Hinkley, 1997). Due to the considerable computational cost of estimating our model,
we only compute standard errors for our baseline specification with no covariates. This has
the added advantage that only estimates from the true DGP described in Footnote 36 are
necessary to generate bootstrap samples.
A final notable computational challenge is that the integrals in (A3) have no closed-
form solution. Rather than employing a Monte Carlo approximation, which would require
independent draws across all i and t to prevent simulation error from propagating, we rely
on sparse-grid integration as implemented by Heiss and Winschel (2008). This approach is
much more efficient and delivers virtually exact integral computations for integrands that
are well approximated by polynomials—as is the case for the integrals in (A3).51
A6 Coefficient Estimates: Baseline Model
Table 1 lists and describes all the parameters of our model. In our baseline specification
with no covariates, the vector ξ is empty. We report estimates of θ and γ in Footnotes 31
and 34, respectively, and estimates of fi for each country in Figure 3. Table A4 presents all
remaining coefficient estimates for our baseline specification (standard errors in parentheses).
51Our implementation computes exact integrals of fifteenth-degree polynomials.
x
Table A4: Estimates of Country-Specific Parameters αi, βD=0i,0 , βD=1
i,0 , vi, and ςi
Country αi βD=0i,0 βD=1
i,0 vi ςi Country αi βD=0i,0 βD=1
i,0 vi ςi
USA 1.8147 0.0189 0.0284 0.1902 0.0728 CAN 1.8000 0.0191 0.0249 0.1823 0.0723(0.0487) (0.0198) (0.0176) (0.1953) (0.3791) (0.0521) (0.0209) (0.0193) (0.1956) (0.3637)
CUB 1.7449 0.0422 0.0431 0.5696 0.0469 DOM 1.9302 0.0020 0.0005 0.6220 0.0442(0.0743) (0.0327) (0.0320) (0.2930) (0.2320) (0.1091) (0.0436) (0.0359) (0.3591) (0.4087)
JAM 1.8361 0.0185 0.0313 0.5088 0.0737 TRI 1.7974 0.0192 0.0165 0.1946 0.0722(0.0877) (0.0400) (0.0346) (0.2522) (0.3667) (0.0194) (0.0268) (0.0247) (0.1860) (0.3887)
MEX 1.8773 0.0126 0.0206 0.1913 0.0705 GUA 1.9023 0.0453 0.0386 0.2184 0.0598(0.0705) (0.0331) (0.0285) (0.1876) (0.1898) (0.0645) (0.0759) (0.0649) (0.1662) (0.1646)
HON 1.8249 0.0008 0.0082 0.2591 0.0585 SAL 1.8631 -0.0085 -0.0154 0.4512 0.0712(0.0672) (0.0346) (0.0312) (0.1910) (0.1484) (0.3185) (0.0687) (0.0635) (0.2657) (0.2856)
NIC 1.8652 0.0154 0.0264 0.1166 0.0278 COS 1.7512 0.0197 0.0142 0.2037 0.0715(0.0408) (0.0398) (0.0353) (0.1565) (0.1964) (0.0748) (0.0699) (0.0576) (0.1898) (0.3795)
PAN 1.8437 0.0426 0.0089 0.1317 0.1102 COL 2.5650 0.0072 0.0046 1.4453 0.0723(0.0118) (0.0285) (0.0277) (0.1404) (0.3525) (0.4678) (0.1286) (0.1070) (0.4873) (0.4519)
VEN 2.3228 0.0123 0.0106 1.4937 0.0531 ECU 1.4737 0.0385 0.0295 0.4143 0.0423(0.3398) (0.0835) (0.0641) (0.4937) (0.4106) (0.0809) (0.1467) (0.1283) (0.2001) (0.1520)
PER 1.8327 0.0300 0.0037 0.2171 0.0917 BRA 1.8211 0.0214 0.0030 0.1191 0.1059(0.0450) (0.0428) (0.0365) (0.1901) (0.2221) (0.0328) (0.0305) (0.0295) (0.1557) (0.2135)
BOL 1.9907 0.0068 0.0099 1.6876 0.0612 PAR 1.8948 0.0362 0.0208 0.2246 0.0735(2.3220) (0.4624) (0.3649) (0.3540) (0.7489) (0.0795) (0.0757) (0.0645) (0.1882) (0.3755)
CHL 1.4128 0.0289 0.0285 0.5013 0.0357 ARG 1.8439 0.0057 0.0356 0.0735 0.1003(0.0817) (0.0996) (0.0840) (0.2114) (0.2287) (0.0497) (0.0545) (0.0481) (0.1578) (0.2731)
URU 1.8164 0.0322 0.0077 0.2035 0.0365 UKG 1.7922 0.0191 0.0244 0.1782 0.0724(0.1215) (0.0647) (0.0553) (0.1432) (0.1799) (0.0523) (0.0182) (0.0141) (0.1726) (0.3912)
IRE 1.8277 0.0186 0.0243 0.1729 0.0734 NTH 1.6578 0.0202 0.0242 0.1883 0.0711(0.0603) (0.0134) (0.0113) (0.1723) (0.3845) (0.0928) (0.0376) (0.0304) (0.2439) (0.3671)
BEL 1.8362 0.0185 0.0203 0.1847 0.0736 FRN 1.7776 0.0192 0.0260 0.1702 0.0724(0.0610) (0.0259) (0.0218) (0.2549) (0.3996) (0.0623) (0.0282) (0.0232) (0.1777) (0.3995)
SWZ 1.8035 0.0190 0.0250 0.1680 0.0727 SPN 1.8729 0.0161 -0.0132 0.4770 0.0422(0.0418) (0.0208) (0.0164) (0.1565) (0.4014) (0.1007) (0.0575) (0.0466) (0.3326) (0.3714)
POR 1.8777 0.0124 -0.0022 0.3360 0.0505 GMY 1.8419 0.0182 0.0285 0.1457 0.0745(0.1023) (0.0748) (0.0608) (0.2820) (0.3306) (0.0362) (0.0376) (0.0328) (0.2375) (0.3980)
POL 1.9991 0.0175 0.0093 1.0647 0.0546 AUS 1.8432 0.0180 0.0740 0.1156 0.0748(1.5958) (0.4168) (0.3400) (0.3578) (0.5581) (0.0139) (0.0748) (0.0641) (0.2030) (0.3932)
HUN 2.3984 0.0179 0.0195 2.0474 0.0505 CZR 1.8709 -0.0005 -0.0157 0.3155 0.0421(0.4215) (0.1945) (0.1690) (0.7520) (0.2212) (0.0882) (0.0173) (0.0183) (0.2906) (0.1772)
SLO 1.7086 0.0197 0.0244 0.2122 0.0725 ITA 1.8411 0.0182 0.0272 0.1694 0.0743(0.0224) (0.0158) (0.0131) (0.2786) (0.4015) (0.0218) (0.0096) (0.0090) (0.1807) (0.3986)
ALB 1.7750 0.0006 0.0240 0.2592 0.0593 MAC 1.7758 0.0193 0.0222 0.2043 0.0724(0.0518) (0.0172) (0.0132) (0.2334) (0.1494) (0.0143) (0.0089) (0.0088) (0.2129) (0.4087)
CRO 1.8592 0.0180 0.0191 0.2071 0.0644 YUG 1.9095 0.0319 0.0188 0.2142 0.0712(0.0130) (0.0179) (0.0161) (0.2801) (0.2775) (0.0211) (0.0361) (0.0281) (0.2360) (0.4035)
BOS 1.8004 0.0190 0.0230 0.3225 0.0730 SLV 1.5727 0.0205 0.0249 0.2008 0.0718(0.0187) (0.0255) (0.0205) (0.2473) (0.4074) (0.0410) (0.0134) (0.0107) (0.1868) (0.4094)
GRC 1.5500 0.0468 0.0367 0.6486 0.0410 BUL 2.2611 0.0074 0.0222 1.0768 0.0579(0.0769) (0.0774) (0.0720) (0.2902) (0.3054) (0.3696) (0.3163) (0.2659) (0.3649) (0.2674)
MLD 1.7984 0.0191 0.0236 0.1817 0.0727 RUM 2.4243 0.0186 0.0149 6.6647 0.0651(0.0172) (0.0507) (0.0414) (0.1988) (0.3884) (3.2608) (0.1847) (0.0889) (2.0319) (0.6177)
RUS 1.8535 0.0186 0.0146 0.1078 0.1104 EST 1.8239 0.0186 0.0231 0.1796 0.0735(0.0115) (0.0256) (0.0204) (0.1631) (0.2625) (0.0068) (0.0102) (0.0109) (0.2519) (0.4018)
xi
Table A4 (continued)
Country αi βD=0i,0 βD=1
i,0 vi ςi Country αi βD=0i,0 βD=1
i,0 vi ςi
LAT 1.8201 0.0184 0.0204 0.2166 0.0655 LIT 1.7210 0.0196 0.0254 0.2098 0.0722(0.0207) (0.0094) (0.0084) (0.2520) (0.2467) (0.0281) (0.0124) (0.0123) (0.2735) (0.3954)
UKR 1.7982 0.0191 0.0240 0.1907 0.0728 BLR 1.8559 0.0187 0.0204 0.1672 0.0750(0.0149) (0.0122) (0.0118) (0.2818) (0.4079) (0.0058) (0.0185) (0.0162) (0.2341) (0.2946)
ARM 1.8847 0.0165 0.0203 0.1667 0.0730 GRG 1.8432 0.0183 0.0218 0.1925 0.0748(0.0096) (0.0064) (0.0062) (0.2125) (0.3988) (0.0258) (0.0211) (0.0154) (0.2378) (0.4042)
AZE 1.8470 0.0183 0.0217 0.2031 0.0747 FIN 1.7190 0.0197 0.0279 0.1797 0.0713(0.0237) (0.0194) (0.0138) (0.2389) (0.4047) (0.0298) (0.0241) (0.0182) (0.2173) (0.3992)
SWD 1.8431 0.0181 0.0348 0.1434 0.0748 NOR 1.8031 0.0190 0.0222 0.1954 0.0726(0.0191) (0.0102) (0.0099) (0.1754) (0.3970) (0.0701) (0.0256) (0.0208) (0.1995) (0.3838)
DEN 1.6931 0.0200 0.0242 0.2011 0.0713 GNB 1.8194 -0.0055 -0.0042 0.5045 0.0472(0.0786) (0.0494) (0.0404) (0.2254) (0.3822) (0.0177) (0.0130) (0.0144) (0.2605) (0.2337)
EQG 1.9776 0.0191 0.0187 0.2287 0.0722 GAM 1.5020 0.0155 0.0217 0.6286 0.0586(0.1098) (0.0786) (0.0633) (0.2426) (0.4031) (0.0847) (0.0472) (0.0391) (0.2995) (0.1793)
MLI 1.8795 0.0060 0.0008 0.5336 0.0453 SEN 1.9353 0.0363 0.0207 0.1905 0.1028(0.0524) (0.0145) (0.0136) (0.2888) (0.1921) (0.0557) (0.0161) (0.0110) (0.1466) (0.2668)
BEN 1.8362 0.0025 -0.0126 0.3807 0.0561 MAA 1.9058 0.0188 0.0195 0.1744 0.0720(0.0667) (0.0559) (0.0472) (0.2814) (0.1693) (0.0336) (0.0238) (0.0183) (0.1920) (0.4051)
NIR 1.8202 0.0196 0.0266 0.1989 0.0727 CDI 1.8757 0.0190 0.0182 0.1511 0.0708(0.0626) (0.0709) (0.0587) (0.2127) (0.2148) (0.0402) (0.0483) (0.0389) (0.1858) (0.3991)
GUI 1.8766 0.0162 0.0188 0.1989 0.0714 BFO 1.8553 0.0160 0.0175 0.2194 0.0704(0.0273) (0.0144) (0.0118) (0.1909) (0.3986) (0.0269) (0.0140) (0.0094) (0.2435) (0.3962)
LBR 1.8596 0.0028 0.0185 0.1369 0.0709 SIE 1.8233 0.0373 0.0505 0.4757 0.0479(0.0295) (0.0261) (0.0214) (0.2099) (0.4058) (0.1108) (0.0493) (0.0399) (0.2757) (0.2199)
GHA 1.8585 0.0139 0.0084 0.1360 0.0512 TOG 1.9611 0.0276 0.0189 0.1611 0.0721(0.0289) (0.0179) (0.0135) (0.1645) (0.2214) (0.1153) (0.0640) (0.0552) (0.1741) (0.3625)
CAO 1.8599 0.0171 0.0177 0.1792 0.0704 NIG 1.8737 0.0237 0.0192 0.0982 0.1973(0.0206) (0.0184) (0.0154) (0.2056) (0.4004) (0.0525) (0.0230) (0.0201) (0.1822) (0.2475)
GAB 1.9154 0.0159 0.0203 0.1530 0.0732 CEN 1.7973 0.0129 0.0257 0.4532 0.0517(0.0657) (0.0711) (0.0589) (0.1964) (0.3965) (0.1195) (0.0652) (0.0528) (0.2711) (0.2460)
CHA 1.8823 0.0189 0.0212 0.1889 0.0740 CON 1.8664 0.0243 0.0212 0.1445 0.0976(0.0629) (0.0340) (0.0272) (0.1768) (0.3713) (0.0409) (0.0342) (0.0288) (0.1896) (0.1751)
DRC 1.8770 0.0176 0.0185 0.1853 0.0710 UGA 1.8465 0.0165 0.0206 0.1186 0.0982(0.0428) (0.0226) (0.0170) (0.2403) (0.4070) (0.0343) (0.0188) (0.0148) (0.1935) (0.2448)
KEN 1.8534 0.0046 0.0181 0.1584 0.0706 TAZ 1.9018 0.0176 0.0204 0.2016 0.0732(0.0065) (0.0101) (0.0076) (0.2181) (0.4083) (0.0418) (0.0177) (0.0130) (0.2054) (0.3987)
BUI 1.9025 0.0173 0.0193 0.1830 0.0720 RWA 1.8749 0.0185 0.0213 0.2281 0.0741(0.0273) (0.0189) (0.0144) (0.2132) (0.4052) (0.0514) (0.0516) (0.0420) (0.2277) (0.4004)
SOM 1.8143 0.0390 0.0384 0.4018 0.0447 DJI 1.8634 0.0164 0.0183 0.1944 0.0709(0.0661) (0.0447) (0.0351) (0.2292) (0.1870) (0.0228) (0.0089) (0.0078) (0.2686) (0.3988)
ETH 1.8853 0.0175 0.0205 0.1796 0.0733 ANG 1.8744 0.0174 0.0213 0.1840 0.0742(0.0108) (0.0065) (0.0051) (0.2619) (0.4026) (0.0438) (0.0322) (0.0251) (0.1923) (0.3917)
MZM 1.8505 -0.0069 -0.0098 0.2624 0.0446 ZAM 1.9325 0.0172 0.0195 0.1916 0.0727(0.0272) (0.0494) (0.0418) (0.2579) (0.1588) (0.0525) (0.0288) (0.0229) (0.1848) (0.4029)
ZIM 1.9480 0.0205 0.0192 0.1595 0.0724 MAW 1.8417 0.0161 0.0196 0.3374 0.0565(0.0624) (0.0286) (0.0236) (0.1429) (0.3932) (0.0879) (0.0912) (0.0756) (0.2880) (0.2650)
SAF 1.8835 0.0143 -0.0226 0.9136 0.0475 NAM 1.9146 0.0191 0.0204 0.1937 0.0734(0.0427) (0.0511) (0.0448) (0.3972) (0.2298) (0.0398) (0.0261) (0.0206) (0.2140) (0.4067)
LES 2.0190 0.0179 0.0187 0.1879 0.0722 BOT 1.8295 0.0189 0.0053 0.1126 0.0727(0.1155) (0.0834) (0.0691) (0.2240) (0.4024) (0.0359) (0.0209) (0.0218) (0.1892) (0.4041)
SWA 2.0060 0.0293 0.0192 0.1458 0.0726 MAG 1.8638 0.0151 0.0024 0.3599 0.0522(0.1293) (0.0495) (0.0437) (0.1803) (0.4026) (0.0818) (0.0482) (0.0395) (0.2580) (0.1413)
COM 1.8525 0.0464 0.0167 0.1309 0.0695 MAS 1.8395 0.0183 0.0282 0.1617 0.0742(0.0265) (0.0417) (0.0341) (0.1620) (0.4078) (0.0504) (0.0157) (0.0172) (0.1598) (0.4021)
MOR 1.9467 0.0251 0.0179 0.1848 0.0709 ALG 1.8899 0.0245 0.0172 0.1966 0.0702(0.0897) (0.0433) (0.0347) (0.2236) (0.4046) (0.0492) (0.0448) (0.0365) (0.1958) (0.4080)
xii
Table A4 (continued)
Country αi βD=0i,0 βD=1
i,0 vi ςi Country αi βD=0i,0 βD=1
i,0 vi ςi
TUN 1.8613 0.0144 0.0170 0.2899 0.0702 LIB 1.8824 0.0112 0.0185 0.0907 0.0709(0.0253) (0.0323) (0.0258) (0.2818) (0.4061) (0.0239) (0.0120) (0.0095) (0.1662) (0.4058)
SUD 1.8564 0.0204 0.0221 0.1877 0.0834 IRN 1.8658 0.0214 0.0177 0.2123 0.0702(0.0278) (0.0201) (0.0164) (0.2124) (0.1882) (0.0157) (0.0155) (0.0115) (0.2701) (0.3650)
TUR 1.8914 0.0019 -0.0004 0.2253 0.0524 IRQ 1.9082 0.0198 0.0189 0.1784 0.0714(0.0894) (0.0825) (0.0664) (0.2256) (0.2125) (0.0533) (0.0229) (0.0176) (0.2391) (0.4042)
EGY 1.8520 0.0023 0.0172 0.2901 0.0698 SYR 2.1083 0.0087 0.0170 0.3711 0.0707(0.0465) (0.0345) (0.0287) (0.2560) (0.4075) (0.2113) (0.0841) (0.0717) (0.3321) (0.3848)
LEB 1.8962 0.0266 0.0236 0.2089 0.0402 JOR 1.9433 0.0198 0.0191 0.1241 0.0716(0.0939) (0.0622) (0.0552) (0.2392) (0.2457) (0.0733) (0.0393) (0.0329) (0.1834) (0.4037)
ISR 1.8391 0.0183 0.0182 0.1804 0.0740 SAU 1.8854 0.0180 0.0209 0.1885 0.0736(0.0429) (0.0134) (0.0152) (0.1730) (0.3986) (0.0808) (0.0625) (0.0502) (0.2255) (0.3956)
KUW 1.8888 0.0182 0.0192 0.1997 0.0718 BAH 1.8733 0.0063 0.0196 0.1443 0.0723(0.0980) (0.0414) (0.0321) (0.2272) (0.4032) (0.0156) (0.0364) (0.0259) (0.1757) (0.4062)
QAT 1.8708 0.0232 0.0212 0.2552 0.0741 UAE 1.8746 0.0185 0.0189 0.2131 0.0718(0.1227) (0.0518) (0.0411) (0.3129) (0.3987) (0.0445) (0.0203) (0.0133) (0.2240) (0.3665)
OMA 1.8797 0.0193 0.0182 0.1693 0.0707 AFG 1.9012 0.0273 0.0202 0.1375 0.0726(0.0282) (0.0162) (0.0146) (0.2570) (0.4062) (0.0563) (0.0269) (0.0213) (0.1821) (0.3988)
TKM 1.8841 0.0191 0.0199 0.2080 0.0729 TAJ 1.8431 0.0228 0.0218 0.3066 0.0748(0.0212) (0.0156) (0.0113) (0.2607) (0.4040) (0.0487) (0.0321) (0.0247) (0.2275) (0.4002)
KYR 1.8431 0.0171 0.0218 0.1955 0.0748 UZB 1.8608 0.0129 0.0214 0.1608 0.0743(0.0367) (0.0404) (0.0330) (0.2160) (0.4039) (0.0210) (0.0183) (0.0134) (0.2176) (0.4055)
KZK 1.8563 0.0186 0.0215 0.1883 0.0744 CHN 2.0790 0.0168 0.0152 0.1761 0.0698(0.0329) (0.0188) (0.0139) (0.2559) (0.4052) (0.3876) (0.2475) (0.2102) (0.2163) (0.3799)
MON 1.8330 -0.0068 -0.0246 0.1608 0.0279 TAW 1.8480 0.0198 0.0117 0.1921 0.0395(0.0277) (0.0198) (0.0251) (0.2343) (0.1869) (0.0725) (0.0762) (0.0659) (0.2445) (0.1506)
PRK 1.9355 0.0179 0.0161 0.1068 0.0701 ROK 1.8577 0.0225 0.0037 0.2863 0.0733(0.2099) (0.0544) (0.0449) (0.1801) (0.4006) (0.2518) (0.0955) (0.0823) (0.2341) (0.2046)
JPN 1.7171 0.0185 0.0284 0.1619 0.0624 IND 1.6707 0.0206 0.0040 0.1502 0.0699(0.2023) (0.1375) (0.1192) (0.1430) (0.5419) (0.0310) (0.0443) (0.0362) (0.1571) (0.3696)
PAK 1.8169 0.0067 0.0321 0.3833 0.0444 BNG 1.8673 0.0297 -0.0072 1.0305 0.0455(0.0765) (0.0490) (0.0404) (0.2426) (0.1674) (0.0367) (0.0723) (0.0652) (0.3952) (0.1721)
MYA 1.6948 0.0551 0.0344 0.3119 0.0479 SRI 1.7513 0.0412 0.0328 0.2532 0.0290(0.1321) (0.1018) (0.0826) (0.2006) (0.1879) (0.1899) (0.1046) (0.0813) (0.1753) (0.2451)
NEP 1.9384 0.0090 0.0035 0.5470 0.0396 THI 1.8232 -0.0573 0.0006 0.1695 0.0413(0.0592) (0.0718) (0.0582) (0.2928) (0.1939) (0.0237) (0.1572) (0.1165) (0.3165) (0.1885)
CAM 1.8808 0.0237 0.0186 0.3581 0.0716 LAO 1.6121 0.0362 0.0316 0.3543 0.0511(0.2869) (0.1764) (0.1528) (0.2957) (0.3984) (0.6854) (0.3354) (0.2872) (0.2662) (0.2295)
DRV 1.8845 0.0136 0.0204 0.2178 0.0732 MAL 1.8437 0.0210 0.0218 0.2600 0.0748(0.3411) (0.2784) (0.2450) (0.1317) (0.3981) (0.7790) (0.3452) (0.2911) (0.2960) (0.3965)
SIN 1.8514 0.0040 0.0180 0.1237 0.0706 PHI 1.6377 0.0166 0.0257 0.2015 0.0379(0.0352) (0.0632) (0.0491) (0.1237) (0.4075) (0.1145) (0.1574) (0.1436) (0.1361) (0.2380)
INS 1.8459 0.0241 0.0185 0.2275 0.0587 AUL 1.8423 0.0183 0.0212 0.1803 0.0740(0.0790) (0.0538) (0.0504) (0.2990) (0.2023) (0.0404) (0.0722) (0.0734) (0.2334) (0.3995)
NEW 1.5256 0.0210 0.0246 0.1843 0.0696(0.0624) (0.1167) (0.0986) (0.1256) (0.3372)
xiii
A7 Additional Results
In this appendix, we describe in detail various additional results mentioned in the paper.
Model fit by world region. Figure A2 presents goodness-of-fit and out-of-sample pre-
diction results disaggregated by four regions of the world: the Americas, Europe, Africa, and
Asia-Oceania. As in Figure 1, we compare the true proportion of democracies (gray) in each
region with our learning model’s predictions, both in (solid blue) and out of sample (dashed
blue). The top panel presents results from our baseline model with no covariates, and the
bottom panel, from our model with two covariates. Notably, both specifications perform well
at this, or indeed any, level of geographic aggregation. And, while not included in Figure
A2 to avoid clutter, our learning model still significantly outperforms any alternative that
ignores the role of learning.
Direct diffusion of democracy. To evaluate the possibility that our model’s empiri-
cal success is simply an artifact of some alternative process of democratic diffusion that is
indirectly picked up by our model’s spatial and temporal flexibility, we construct a distance-
weighted measure of how democratic each country’s neighborhood is over time, and we
reestimate our baseline model using this measure as a control for direct diffusion effects on
the political cost of democracy. Specifically, we include in Xi,t the weighted average
Di,t−1 =
∑j 6=i exp(−δdi,j)Dj,t−1∑
j 6=i exp(−δdi, j),
where di,j denotes the distance between i and j’s capitals.
To reduce the computational burden, instead of estimating δ—the parameter determin-
ing the size of each country’s effective neighborhood—we consider five scenarios. For our
“medium neighborhood” scenario, we set δ equal to the estimated value of the parameter
governing the spatial decay of learning in our baseline specification. We also consider a
“smaller” and “smallest” neighborhood scenarios, where we increase the value of δ two-fold
xiv
% D
emoc
raci
es
1950 1960 1970 1980 1990 2000 2010
010
2030
Africa
1950 1960 1970 1980 1990 2000 2010
2030
40
Asia−Oceania
% D
emoc
raci
es
1950 1960 1970 1980 1990 2000 2010
5060
7080
9010
0 Europe
1950 1960 1970 1980 1990 2000 2010
4050
6070
8090
Americas
No Covariates
% D
emoc
raci
es
1950 1960 1970 1980 1990 2000 2010
010
2030
Africa
1950 1960 1970 1980 1990 2000 2010
2030
40
Asia−Oceania
% D
emoc
raci
es
1950 1960 1970 1980 1990 2000 2010
5060
7080
9010
0 Europe
1950 1960 1970 1980 1990 2000 2010
4050
6070
8090
Americas
Two Covariates
Figure A2: Observed versus Predicted Prevalence of Democracy by World Region
Notes. By world region, this figure compares the true proportion of democracies (gray) with estimatesgenerated by our learning model for both the in-sample (solid blue) and out-of-sample (dashed blue) periods.The top panel presents results using our baseline specification with no covariates. In the lower panel, wecontrol for lagged log-GDP per capita and incumbents’ time in power.
xv
and five-fold, respectively, and a “larger” and “largest” neighborhood scenarios, where we
divide δ by two and five, respectively. Table A5 presents the results of this exercise, following
the format of Table 2. We find that controlling for direct diffusion provides little increase in
predictive power, which indicates that it is our proposed mechanism of learning about the
economic effects of democracy—and not some alternative process of diffusion—what drives
our results.
Table A5: Direct Diffusion of Democracy
Smallest Neighborhood Smaller Neighborhood Medium Neighborhood Larger Neighborhood Largest NeighborhoodLearning No Learning Learning No Learning Learning No Learning Learning No Learning Learning No Learning
Choices 96.3 92.1 96.0 92.4 96.3 92.3 95.9 92.6 96.7 92.7(% correct)
Transitions(% correct)±0 years 11.6 5.4 12.4 3.9 13.2 4.7 20.2 4.7 18.6 9.3±2 years 41.9 18.6 54.3 20.9 54.3 25.6 50.4 22.5 56.6 22.5
Log-likelihood -536.6 -1,019.1 -539.7 -964.3 -511.4 -931.7 -553.6 -939.3 -481.5 -946.2
Observations 5,925 5,925 5,925 5,925 5,925 5,925 5,925 5,925 5,925 5,925
Notes. From left to right, respectively, models in each “neighborhood” scenario control for direct diffusioneffects on the political cost of democracy using distance weights δ = 2.5, δ = 1, δ = 0.5, δ = 0.25, andδ = 0.1. For each model, we report the percentage of correctly predicted in-sample system of governmentchoices (first row). We similarly report the percentage of correctly predicted transitions to or from democracywithin a 0-year window (second row) and a 5-year window (third row) of the event.
Robustness to alternative measure of democracy and time frame. Next, we explore
the sensitivity of our results to (i) our preferred measure of democracy and (ii) the time frame.
First, as noted in the paper, various alternative measures of democracy have been em-
ployed in the literature. To test the robustness of our results to this feature of the data, we
take Acemoglu et al.’s (2019) preferred measure and reestimate our baseline model.52 Since
the BMR coding is more comprehensive than this alternative measure, to keep the results
as comparable as possible—in particular, to avoid having to modify the time span of the
sample—we use the BMR coding to fill any gaps in Acemoglu et al.’s (2019) data.
Second, to address potential concerns about the myopia of incumbents in our model and
whether an annual time frame is appropriate to study changes in systems of government, we
52To that end, we also reestimate the true DGP (see Footnote 36) to obtain a new estimate of Σ.
xvi
estimate a five-year panel version of our baseline model (and true DGP). Following Acemoglu
et al. (2008), we take the observation of democracy every fifth year as our measure for the
five-year panel.
Table A6 summarizes the results of these robustness exercises, following the format of
Table 2. Our findings are virtually unchanged.
Table A6: Robustness to Democracy Measure and Time Frame
Alternative Democracy Measure Five-Year PanelLearning No Learning Learning No Learning
Choices 95.4 87.5 95.2 85.5(% correct)
Transitions(% correct)±0 years 12.5 0.0±2 years 46.1 0.0 46.6 0.0
Log-likelihood -647.9 -1,555.4 -160.4 -388.1
Observations 5,925 5,925 1,437 1,437
Notes. Models in the first and second columns are estimated using Acemoglu et al.’s (2019) preferred measureof democracy with no covariates. Models in the last two columns are estimated using a five-year panel versionof our data with no covariates. For each model, we report the percentage of correctly predicted in-samplesystem of government choices (first row) and the percentage of correctly predicted transitions to or fromdemocracy within a 0-year window (second row) and a 5-year window (third row) of the event.
Robustness to alternative measures of similarity. While geographic distance is highly
correlated with various dimensions of similarity across countries, we present in Table A7
results from an alternative version of our baseline specification in which we allow the cross-
country correlation in initial beliefs to also depend on genetic distance—as measured by
Spolaore and Wacziarg (2009)—and on economic distance in terms of initial levels of devel-
opment. For the latter, we use a linear trend to impute values of GDP per capita in 1950
for countries without such observations. We then compute the economic distance between
countries i and j as (Yi0 − Yj0)2/Var(Y0). Our results are identical.
xvii
Table A7: Robustness to Similarity Measures
Learning
Choices 95.2(% correct)
Transitions(% correct)±0 years 9.3±2 years 41.1
Log-likelihood -581.4
Observations 5,925
Notes. Results are from a specification with no covariates for the political cost of democracy but threemeasures of similarity to capture cross-country correlation in initial beliefs: geographic distance, geneticdistance, and economic distance. We report the percentage of correctly predicted in-sample system ofgovernment choices (first row) and the percentage of correctly predicted transitions to or from democracywithin a 0-year window (second row) and a 5-year window (third row) of the event.
Income and leader turnover. While we conceive elites broadly as the political party or
faction in power in each country rather than as individual leaders, previous work has high-
lighted the influence individual leader exit can have on democratic transitions. To explore
this, we estimate an alternative specification of our model that uses, as in Treisman (2015),
(lagged) log-GDP per capita, (lagged) leader exit, and their interaction to characterize the
political cost of democracy. As shown in Table A8, and consistent with our main results,
accounting for learning dwarfs the explanatory benefits from such a specification.
Heterogeneous effects of growth on elite turnover. Finally, we explore whether there
is heterogeneity across countries in θ = (θD=0, θD=1), the effects of GDP growth on elite
turnover. Specifically, we estimate a version of our model where we allow θ to vary by region
of the world as in Figure A2. Table A9 reports the corresponding point estimates, and
Table A10 summarizes goodness of fit. Our results provide little evidence of substantively
important heterogeneity.
xviii
Table A8: Income, Leader Turnover, and Democratization
Learning No Learning
Choices 95.8 90.5(% correct)
Transitions(% correct)±0 years 13.3 7.0±2 years 50.0 21.1
Log-likelihood -557.6 -1,150.9
Observations 5,882 5,882
Notes. Results are from a specification that controls for (lagged) log-GDP per capita, (lagged) leaderturnover, and their interaction as in Treisman (2015). We report the percentage of correctly predicted in-sample system of government choices (first row) and the percentage of correctly predicted transitions to orfrom democracy within a 0-year window (second row) and a 5-year window (third row) of the event.
Table A9: Effects of Economic Growth on Elite Turnover by World Region
Region θD=0 θD=1
Africa -1.6859 5.0174Asia-Oceania -4.0284 5.7846
Europe -1.7962 4.6478Americas -1.9665 5.6278
xix
Table A10: Heterogeneous Effects of GDP Growth on Elite Turnover
Learning No Learning
Choices 95.3 88.9(% correct)
Transitions(% correct)±0 years 7.0 0.0±2 years 36.4 0.0
Log-likelihood -614.3 -1,390.8
Observations 5,925 5,925
Notes. Results are from a specification where we allow the effects of GDP growth on elite turnover to varyby world region (Africa, Asia-Oceania, Europe, Americas). We report the percentage of correctly predictedin-sample system of government choices (first row) and the percentage of correctly predicted transitions toor from democracy within a 0-year window (second row) and a 5-year window (third row) of the event.
xx