learning about growth and democracy · a learning model of democratization elites, beliefs, and...

Learning about Growth and Democracy∗

Scott F Abramson†

University of Rochester

Sergio Montero‡

University of Rochester

April 17, 2020

Abstract

We develop and estimate a model of learning that accounts for the observed correlation

between economic development and democracy and for the clustering of democratiza-

tion events. In our model, countries’ own and neighbors’ past experiences shape elites’

beliefs about the effects of democracy on economic growth and their likelihood of

retaining power. These beliefs influence the choice to transition into or out of democ-

racy. We show that learning is crucial to explaining observed transitions since the

mid-twentieth century. Moreover, our model predicts reversals to authoritarianism if

the world experienced a growth shock the size of the Great Depression.

Keywords: democracy, learning, growth, diffusion, structural estimation

∗We thank Tasos Kalandrakis, Didac Queralt, Matt Shum, Erik Snowberg, Jorg Spenkuch, FabrizioZilibotti, and audiences at Kellogg MEDS, Princeton, Vanderbilt, NYU, the 2018 Banff Empirical Mi-croeconomics Workshop, SIOE 2018, APSA 2018, and the 2018 Formal Theory and Comparative PoliticsConference for very helpful comments. Anna Walsdorff provided excellent research assistance.†Department of Political Science, University of Rochester, Rochester, NY 14627. Email:

[email protected].‡Departments of Political Science and Economics, University of Rochester, Rochester, NY 14627. Email:

[email protected].

mailto:[email protected]

mailto:[email protected]

Introduction

Scholarship on the causes of democracy has sought to understand two empirical patterns: the

strong correlative relationship between levels of material well-being and democracy (Lipset,

1959; Przeworski et al., 2000; Boix and Stokes, 2003; Acemoglu et al., 2008, 2009) and the

spatial and temporal clustering of democratization events (Huntington, 1993; Brinks and

Coppedge, 2006; Gleditsch and Ward, 2006; Ahlquist and Wibbels, 2012; Houle, Kayser and

Xiang, 2016). Existing studies have treated these as distinct objects of inquiry, separately

assessing the impact of domestic and international factors on the propensity to democratize.

In this paper, we develop and estimate a model of elite belief formation that combines both

domestic and systemic features in order to jointly explain the correlation between economic

development and democracy and the clustering of democratic transitions.

We explicitly model the choice by incumbent elites to promote or subvert democracy.1

This choice impacts their likelihood of retaining power both directly and through its effect on

economic growth. Incumbents are uncertain about the relationship between democracy and

growth, and they rely upon worldwide economic history to update their beliefs. We allow

beliefs to be spatially correlated so that incumbents may learn more from the experiences of

more proximate (or similar) countries.2 In accordance with their beliefs, incumbents pursue

democracy or autocracy, seeking to maximize the probability they remain in power. For a

panel of 151 countries, we use data from 1875-1950 to calibrate initial conditions and data

from 1951-2000 to structurally estimate our model.3

To assess the ability of our learning model to explain observed patterns of economic

growth and democracy adoption, we conduct a series of goodness-of-fit and out-of-sample

1We conceive elites broadly as the group or faction in charge of the executive. For an overview of theformal literature that emphasizes the endogenous nature of institutions, see Gehlbach, Sonin and Svolik(2016) and Svolik (2019).

2Geographic distance is highly correlated with various measures of similarity between countries. Ourresults are robust to accounting directly for other types of similarity (see Online Appendix A7).

3Methodologically, our approach is similar to that of recent studies in political science that estimatemodel parameters by maximizing a likelihood derived from the equilibrium conditions of a formal model(Crisman-Cox and Gibilisco, 2018; Ascencio and Rueda, 2019).

1

(2001-2010) prediction exercises that pit our model against a range of reduced-form panel

regressions typical of the approach taken in the existing empirical literature on democrati-

zation (Acemoglu et al., 2008; Boix, 2011). We show that learning from past experiences

is crucial to explaining observed transitions to and from democracy, delivering an improve-

ment in predictive success of over 100% relative to the best-fitting specification that does

not account for learning—even after allowing for other potential channels of diffusion.

The success of our learning model is rooted, first, in our estimates of the political impli-

cations of economic growth. In line with a sizable empirical literature in political economy,

we find that democracies tend to reward incumbents for growth by keeping them in power

(Hibbs, 1977; Alesina, Roubini and Cohen, 1997; Brender and Drazen, 2008). In contrast, we

find that growth tends to be destabilizing in autocracies. That is, our estimates are consis-

tent with the view that rapid economic expansion in autocracies produces actors—a middle

class, for example—able to challenge the group in power (Olson, 1963; Huntington, 1968;

Hirschman and Rothschild, 1973). Together, these results imply that the cross-sectional

correlation between levels of material well-being and democracy is largely driven by elites

who seek to benefit politically from the economic consequences of institutional choice. In

particular, democratic incumbents will subvert democracy when they come to believe that

it does not produce sufficient economic growth to win a fair election, whereas autocratic

incumbents will transition to democracy when expected rates of growth make them more

likely to retain power via election than under continued authoritarian rule.

Importantly, we distinguish transitions of power, where only the identity of the incumbent

changes, from transitions into or out of democracy, where the form of government changes.

Sudden changes in economic conditions, for instance, may lead to generalized political in-

stability regardless of the system of government in place. However, whenever an incumbent

is replaced—be it through election, coup, or revolution—the new group in power again faces

the choice to support or subvert democracy. Our focus is on this choice, not transitions of

power between factions.

2

The second feature of our learning model that underpins its empirical success is its ability

to capture the wave-like nature of democratic transitions. We find that beliefs about the

relationship between democracy and economic growth are highly correlated both temporally

and spatially, which provides a structural interpretation for the observed clustering of transi-

tions. Our estimates indicate there is an approximately 5,000 kilometer radius within which

learning occurs. Outside this distance, virtually no additional information is gleaned. This

advances the literature on the diffusion of democracy, which has struggled to disentangle

competing mechanisms.4

Our methodological approach allows us to conduct counterfactual experiments of three

types, each of which highlights the importance of learning. First, it enables us to study how

systemic shocks to prosperity impact the worldwide prospects for democracy. Specifically,

our model predicts considerable reversals to authoritarianism if the world were hit with a

shock to growth the size of the Great Depression. Second, it allows us to ask retrospective

questions about historical democratization events, such as whether Greece, Portugal, and

Spain would have democratized when they did had western European democracies suffered a

recession in the early 1970s. Third, our model allows us to prospectively explore conditions

that would presently lead countries to transition to or from democracy. Overall, our results

suggest that, ultimately, democracy is a fragile system of government, one which depends

significantly upon its own economic success.

Taken together, our findings contribute to a substantial body of work on modernization

and democracy. Dating to at least the mid-twentieth century, social scientists have debated

whether increases in per capita income have a causal effect on the probability that a state

democratizes.5 We bring to bear two innovations. First, we do not rely on the instrumental-

4Proposed mechanisms include diffusion through international organizations (Pevehouse, 2005), di-rect emulation of neighbors (Gleditsch and Ward, 2006), diffusion through trade and economic exchange(Mansfield, Milner and Rosendorff, 2000), cultural linkages (Wejnert, 2005), and military coercion (Kadera,Crescenzi and Shannon, 2003). On the inability of this literature to empirically falsify any particular mech-anism, see Torfason and Ingram (2010).

5The modernization hypothesis—that higher incomes per capita cause countries to democratize—datesto, at least, Lipset (1959). For evidence in favor, see Londregan and Poole (1996); Barro (1996); Boix andStokes (2003); Boix (2011). Against, see Przeworski and Limongi (1997); Przeworski et al. (2000); Acemoglu

3

variables or reduced-form selection-on-observables identification strategies deployed—with

mixed results—in the existing literature. Rather, we propose an explicit model of the rela-

tionship between economic growth and democracy that we take directly to the data. Second,

we approach modernization as a systemic phenomenon. Indeed, our results suggest that fo-

cusing on the within-country causal impact of economic development is too narrow an object

of inquiry. Through its influence on neighbors’ beliefs, a country’s economic performance af-

fects not only its own likelihood of transitioning to or from democracy but also the prospects

for democracy outside of its borders.

Of course, we are not the first to propose diffusion through learning as a driver of demo-

cratic transitions.6 If direct and consistent measures of beliefs were available over a suffi-

ciently long period and a wide enough set of countries, it would be conceivable to directly

estimate the impact of changing beliefs on democratization.7 Given the current lack of

systematic or reliable data, our paper represents the first attempt at estimating this impact.

We are also not the first to examine the role of learning in policymaking more broadly.8

Our paper is most closely related to Buera, Monge-Naranjo and Primiceri’s (2011) struc-

tural analysis of the impact of learning on the adoption of market-oriented (versus state-

interventionist) policies. We build on their framework to model the worldwide evolution

and diffusion of beliefs about the economic consequences of competing policies. Yet, while

they consider the problem of a welfare-maximizing social planner, we take a political econ-

omy perspective and study institutional design as the outcome of self-interested choices by

power-seeking elites—a natural next step in this line of inquiry.

et al. (2008, 2009).6Dahl (1998); Diamond (2011); Miller (2016).7For work that attempts to gauge beliefs and their consequences for democratization in a subset of

countries, see Almond and Verba (1963); Norris (1999); Chen and Lu (2011).8Primiceri (2006); Garcıa-Jimeno (2016).

4

A Learning Model of Democratization

Elites, Beliefs, and Learning

We consider the decision problem of the decisive group in power in country i at time t.9

This decision-maker faces a choice between autocracy, Di,t = 0, and democracy, Di,t = 1.

The incumbent’s objective is to retain power in period t+ 1. Let Yi,t denote country i’s per

capita GDP in period t, and let yi,t ≡ log(Yi,t) − log(Yi,t−1) denote its growth rate. At the

beginning of period t, the incumbent solves

maxDi,t∈{0,1}

Ei,t−1

[exp(αi + θD=Di,tyi,t −Ki,tDi,t)

1 + exp(αi + θD=Di,tyi,t −Ki,tDi,t)

∣∣∣∣Di,t

], (1)

where the integrand represents the probability of remaining in power in period t+1, and the

expectation is taken with respect to yi,t, as explained below, conditional on the information

available in country i at the conclusion of period t − 1. The integrand is increasing in the

index αi + θD=Di,tyi,t − Ki,tDi,t, where coefficient αi establishes a baseline for country i,

coefficients θD=0 and θD=1 respectively measure the (de)stabilizing effect of GDP growth

on elite turnover under autocracy and democracy, and Ki,t captures the political cost of

democracy to the incumbent, i.e., its direct effect on the likelihood of retaining power.

The incumbent chooses Di,t at the start of period t, forming a subjective forecast of its

effect on GDP growth, yi,t, to solve (1).10 Incumbents believe that the relationship between

GDP growth and democracy takes the form

yi,t = (1−Di,t)βD=0i +Di,tβ

D=1i + εi,t, (2)

where βD=0i and βD=1

i denote country i’s long-run GDP growth rates under autocracy and

9As described below, both in democracies and autocracies we treat this as the political party or faction(where parties do not exist) in control of the executive. Online Appendix A2 provides descriptive evidencethat motivates key modeling choices in what follows.

10We build on the learning framework of Buera, Monge-Naranjo and Primiceri (2011).

5

democracy, respectively, and εi,t is an exogenous shock to growth that is uncorrelated over

time but potentially correlated across countries.11 Specifically, the vector εt ≡ [ε1,t, . . . , εn,t]′

of GDP growth shocks across the n countries of the world is independently and identically

distributed (i.i.d.) over time according to a mean-zero Normal distribution with covariance

matrix Σ, i.e.,

εti.i.d.∼ N(0,Σ).

Incumbents do not know βi ≡ [βD=0i , βD=1

i ]′, but they have perfect knowledge of all other

features of the model, including Ki,t, at the time of their choice.12

The timing of events is as follows. At the end of period t− 1, incumbents collect data on

worldwide GDP growth rates and systems of government, and they update their beliefs about

long-run economic growth under autocracy and democracy accordingly. At the beginning of

period t, incumbents observe Ki,t and decide what system of government, Di,t, to adopt that

period. Growth rates conditional on incumbents’ choices are then realized, which together

determine elite turnover.

Learning. In period t = 0, incumbents start out with a Normal prior over the vector of

unknown long-run GDP growth rates β ≡ [βD=01 , . . . , βD=0

n , βD=11 , . . . , βD=1

n ]′,

β ∼ N(β0, P−10 ), (3)

where β0 and P0 denote, respectively, the prior mean and precision matrix. We assume that

incumbents’ initial beliefs assign no correlation and the same degree of uncertainty to growth

11Studies on the causal impact of democracy on economic growth have highlighted various potentialmechanisms—most notably, how democracy influences redistributive policy, which in turn affects investment(Alesina and Rodrik, 1994; Persson and Tabellini, 1994; Acemoglu et al., 2019). For tractability, we abstractfrom considering these explicitly but allow incumbents to have flexible, country-specific beliefs about theirnet long-run impact.

12A natural question is whether incumbents may also be imperfectly informed about θ = (θD=0, θD=1).However, allowing for such double-sided uncertainty would be analytically and econometrically quite chal-lenging. We find it reasonable to presume incumbents are better informed about their political prospectsthan the macroeconomy—after all, they are professional politicians who typically rely on technocrats toadvise on economic policy.

6

under autocracy and democracy:

P−10 = I2 ⊗ (V RV ),

where V = diag([v1σ1, . . . , vnσn]) is a diagonal matrix whose ith diagonal entry measures

prior uncertainty (standard deviation) about country i’s long-run growth rate under autoc-

racy/democracy, and R is the cross-country prior correlation matrix. Prior uncertainty is

parameterized by {vi}ni=1, normalized by the standard deviation of growth shocks in each

country {σi}ni=1 (the square roots of the diagonal elements of Σ).

Our assumptions yield simple, recursive, Bayesian updating formulas for beliefs in each

period: letting Dt ≡ [D1,t, . . . , Dn,t]′ and yt ≡ [y1,t, . . . , yn,t]

′,

Pt = Pt−1 + D′tΣ−1Dt,

βt = βt−1 + P−1t D′tΣ−1(yt −Dtβt−1),

where Dt ≡ [diag(1−Dt), diag(Dt)] is an n× (2n) matrix such that the ith element of the

vector Dtβt−1 equals (1 − Di,t)βD=0i,t−1 + Di,tβ

D=1i,t−1. The impact of new data on the posterior

mean βt is determined by P−1t D′tΣ−1, which depends on three key factors. First, higher

initial uncertainty in beliefs (higher {vi}ni=1) raises the relative precision of new information,

increasing its impact. Second, higher correlation in growth shocks across countries (off-

diagonal elements of Σ) reduces the informational content of observed growth rates and

slows down learning. Lastly, higher cross-country correlation in initial beliefs (off-diagonal

elements of R) increases belief responsiveness to data from other countries.

We allow incumbents to potentially learn more from neighboring (or more similar) coun-

tries. Letting Zi,j denote a vector that may include various measures of distance (geographic

or otherwise) between countries i and j, we write

Ri,j = exp(−Z ′i,jγ),

7

where γ is constrained to be nonnegative to ensure correlations between 0 and 1.13

Incumbents’ optimal choice. While incumbents observe the political cost of democracy,

Ki,t, prior to choosing Di,t, this cost is unobservable to the researcher. We assume that Ki,t

has the following structure:

Ki,t = fi +X ′i,tξ + κi,t. (4)

Coefficient fi establishes a country-specific baseline, and the control vector Xi,t may include

various observable economic and political characteristics of country i (e.g., lagged per capita

GDP or incumbents’ time in power). Every period, country i also experiences an exogenous

idiosyncratic shock, κi,t, to the political cost of democracy, where

κi,t ∼ N(0, ς2i ).

The volatility of shocks to the political cost of democracy, ςi, is allowed to be country-specific,

but κi,t is assumed to be independently distributed over time and across countries.

As discussed, when choosing Di,t incumbents have perfect knowledge of Ki,t and all

features of the model except for the effect of their choice on GDP growth, yi,t. Together,

(1), (2), and (4) imply that the optimal choice for country i’s incumbent at time t is

Di,t = 1

{Ei,t−1

[exp(αi + θD=1(βD=1

i + εi,t)− fi −X ′i,tξ − κi,t)1 + exp(αi + θD=1(βD=1

i + εi,t)− fi −X ′i,tξ − κi,t)

]

> Ei,t−1


i + εi,t))

1 + exp(αi + θD=0(βD=0i + εi,t))

]},

(5)

where the expectations are taken only with respect to βi and εi,t in accordance with the

incumbent’s beliefs at the conclusion of period t− 1.

Opposition groups and strategic experimentation. To conclude the description of

our model, we briefly discuss how we account for non-elite learning and strategic interactions

13This formulation also guarantees the positive definiteness of R (Matern, 1960).

8

between incumbents and potential challengers.

The common prior assumption for incumbents in our model extends to all stakeholders

in each country. Opposition groups (elite or non-elite) observe the same worldwide history

of economic growth and democracy, and they would be faced with solving (1)—in the event

they came to power—using information identical to that available to the incumbent. As a

result, in this shared-learning environment, the identity of the incumbent only matters via

its potential influence on the political cost of democracy.

For tractability, we abstract from explicitly modeling the intricacies of within-country

elite turnover. However, objective (1) can be viewed as describing an equilibrium proba-

bility of staying in power resulting from a richer interaction between the incumbent and

potential challengers, both elite and non-elite. Importantly, we distinguish transitions of

power, where only the identity of the incumbent changes, from transitions into or out of

democracy. Whenever an incumbent is overthrown by a rival elite faction or via a revolution

from below, the new group in power again faces the choice to support or subvert democracy.14

Understanding how this choice by self-interested elites—newly in power or entrenched—is

shaped by the evolution of beliefs about the economic effects of democracy is the focus of

this paper.15

Finally, objective (1) precludes incumbents from adopting a form of government with

negative expected consequences for the purpose of learning from the experience. Incumbents

are myopic and focused only on their immediate survival. Nevertheless, while we do not

explore a fully dynamic version of our model, as it would introduce strategic experimentation

incentives that would render the analysis intractable,16 we believe objective (1) offers a good

14Examples abound of revolutions from below, inspired by ostensibly democratic goals, that failed todeliver on the promise of liberal democracy. For instance, Skocpol’s (1979) three main cases—the French,Chinese, and Russian revolutions—all began as mass revolutionary movements with outwardly democraticmotives, and each nonetheless resulted in dictatorship. More recently, the Arab Spring yielded mixed results.

15An implication of our model is that transitions constitute attempts by incumbent elites to hold on topower. Notably, incumbents in our data retain power 30% and 19% of the time following transitions to andfrom democracy, respectively, despite the instability typically associated with transition periods.

16Bolton and Harris (1999); Bramoullee, Kranton and D’Amours (2014); Mossel, Sly and Tamuz (2015).

9

first-order approximation to optimal behavior by incumbents with longer time horizons.17

Empirical Strategy

Like incumbents in our model, we adopt a Bayesian inference approach to recover the un-

known structural parameters of our model, listed in Table 1.18 Let the vector ϕ collect all

the parameters in Table 1, and let Ii,t be an indicator of whether the incumbent in coun-

try i retained power (Ii,t = 1) or not (Ii,t = 0) at the conclusion of period t. Denote by

W T ≡ {It, yt, Dt, Xt}Tt=1 the set of all data available up to period T , where It ≡ [I1,t . . . , In,t]′

and Xt ≡ [X1,t . . . , Xn,t]′. Our goal is to estimate the true value of ϕ by computing the mode

of the posterior distribution of the model parameters,

p(ϕ|W T ) ∝ L(W T |ϕ)π(ϕ),

given the likelihood of the data, L, and our prior, π. We describe L and π in turn.

Table 1: Model Parameters

{αi}ni=1: baseline incumbent stabilityθD=0: effect of GDP growth on elite turnover under autocracyθD=1: effect of GDP growth on elite turnover under democracy

{βD=0i,0 }ni=1: prior mean of long-run GDP growth rate under autocracy

{βD=1i,0 }ni=1: prior mean of long-run GDP growth rate under democracy{vi}ni=1: prior uncertainty about economic effects of autocracy/democracy

γ: coefficients of cross-country correlation in prior beliefs{fi}ni=1: baseline political cost of democracy

ξ: coefficients of economic/political controls for political cost of democracy{ςi}ni=1: volatility of political cost of democracy

Likelihood of the data. While the structure of our model described thus far specifies

incumbents’ beliefs about how the data are generated as well as their optimal choices given

17See the discussion below on the robustness of our results and Online Appendix A7.18Following Buera, Monge-Naranjo and Primiceri (2011), to reduce the dimensionality of the model we

set Σ equal to its estimated value from the “true” data generating process described in Footnote 36.

10

those beliefs, we have refrained from specifying the “true” data generating process (DGP).

Below, to perform counterfactual experiments, we discuss and specify the true DGP. Next,

we only make one key assumption about the true DGP that simplifies inference about the

model parameters.

We assume that observed outcomes are only affected by actual choices and not by the be-

liefs that led to those choices. That is, transitions of power (It), GDP growth (yt), and other

economic and political characteristics of countries (Xt) are shaped by realized institutions

(Dt), but they are not directly affected by beliefs about the potential effects of transitioning

into or out of democracy. Formally, this assumption implies that the parameters in Table 1

are only involved in the component of the likelihood that describes the conditional proba-

bilities of countries’ observed systems of government. While we relegate a full derivation of

the likelihood to Online Appendix A3, with a slight abuse of notation—using L to denote

arbitrary densities of the data—it can be written as

L(W T |ϕ) ∝T∏t=1

n∏i=1

L(Di,t|Xi,t,Wt−1, ϕ),

where

L(Di,t|Xi,t,Wt−1, ϕ) = Φ

(κi,t(Xi,t,W

t−1, ϕ)

ςi

)Di,t[1− Φ

(κi,t(Xi,t,W

t−1, ϕ)

ςi

)]1−Di,t

, (6)

Φ denotes the standard Normal cumulative distribution function, and κi,t(Xi,t,Wt−1, ϕ) is

the threshold value of κi,t—the realized shock in period t to the political cost of democracy in

country i—that leaves country i’s incumbent indifferent between autocracy and democracy.

Note that (6) resembles the likelihood of a standard binary-choice Probit model. How-

ever, κi,t(Xi,t,Wt−1, ϕ) is a nonlinear function (with no closed-from expression) of the model

parameters and the data up to period t that encodes how each country’s propensity for

democracy evolves with incumbents’ beliefs.

11

Prior. Given the size of our model, we adopt an informative prior, π, to prevent overfitting.

To do so in a principled manner, we calibrate our prior in the way agents in our model would,

allowing the observed past to inform initial beliefs. We use data from 1875-1950 (excluding

the two world wars), a period that immediately precedes our main sample, to set the prior

mean and precision of the model parameters so as to match analogous empirical moments.

For example, we ensure that our prior over incumbents’ initial beliefs about the relationship

between democracy and GDP growth is consistent with average annual growth rates among

autocracies and democracies in the pre-sample period. Similarly, we use pre-sample history

of elite turnover to inform our prior over the parameters describing the likelihood of retaining

power. We describe our prior in full and how it is calibrated in Online Appendix A4.

Estimation and inference. Calculating κi,t(Xi,t,Wt−1, ϕ) to evaluate the likelihood of

the data is computationally quite expensive. To sidestep this burden, we follow the Math-

ematical Programming with Equilibrium Constraints (MPEC) approach of Su and Judd

(2012) to compute our maximum-a-posteriori estimator of ϕ.19 The idea behind this ap-

proach is simple: instead of calculating κi,t(Xi,t,Wt−1, ϕ) at every trial value of ϕ, one treats

each κi,t as an auxiliary parameter and imposes the optimality (or equilibrium) conditions of

the model as feasibility constraints on the log-posterior maximization program. This consid-

erably reduces the computational cost of estimating the model. We describe our estimation

strategy in detail in Online Appendix A5.

Data. For our main analysis, we obtain data from three sources, each measured at the

country-year. First, we obtain data on GDP & Per Capita GDP from Maddison (2010).

The Maddison Project Database provides information on comparative economic growth and

income levels over the very long run. The data give estimates of annual GDP and GDP

per capita between 1875-2008 for all of the independent states in our sample of countries.20

19Standard errors are parametrically bootstrapped.20Online Appendix A1 lists the countries in our sample and describes how changes in borders are handled.

12

We obtain two additional years (2009-2010) of GDP per capita growth rates from the Penn

World Table (Feenstra, Inklaar and Timmer, 2015) for out-of-sample predictions.

Second, we obtain our dichotomous measure of Democracy from Boix, Miller and

Rosato (BMR, 2013). This dataset provides an annual coding of democracy for every country

in the world from 1800 to 2010. If the following three criteria are met, countries are coded

as democratic:

1. The executive is directly or indirectly elected in popular elections and is responsibleeither directly to voters or to a legislature.

2. The legislature (or the executive if elected directly) is chosen in free and fair elections.

3. A majority of adult men has the right to vote.

If any of these criteria are not met, a country is coded as autocratic. While various alternative

measures of democracy have been used in previous studies, the BMR coding is the most

comprehensive and consistent for the period we cover (1875-2010). Nonetheless, our results

are robust to employing alternative codings (see Online Appendix A7).

Finally, for each country-year, we code the Executive Faction from Goemans, Gleditsch

and Chiozza (2009). The Archigos database on leaders describes the date and manner of

entry and exit for the executives of all countries in our sample from 1875-2015. With these

data, we then code, using biographical information, the political party of each executive. If

we cannot identify a political party (nearly all of these cases are military regimes), we identify

the particular faction to which the executive belongs. In combination with the entry/exit

dates, we construct our measure of change in the faction of the executive (elite turnover).

Estimation Results: The Importance of Learning

Before summarizing our structural parameter estimates, we subject our model to a series of

goodness-of-fit and out-of-sample prediction tests that assess its ability to explain observed

patterns of democracy adoption. We consider five alternative specifications of our model

13

that differ in the number of covariates used to characterize the political cost of democracy.

In our baseline specification with no covariates, the political cost of democracy, Ki,t, consists

of simply a country-specific baseline, fi, plus an idiosyncratic shock, κi,t. We then consider

specifications where we successively control for (lagged) log-GDP per capita, the incumbent’s

time in power, (lagged) trade volume as percentage of GDP (Gleditsch, 2002), and years as

democratic (negative when autocratic) to account for consolidation effects (Svolik, 2013).21

Across specifications, we use geographic distance between capitals, Zi,j, to capture the cross-

country correlation in initial beliefs.22

To quantify the importance of learning for our model’s ability to fit the data, we also

estimate a “no-learning” version of our model. For each specification, we constrain beliefs

about long-run growth rates under autocracy and democracy to be constant over time,

thus shutting down the learning mechanism. These no-learning specifications are otherwise

identical to their learning counterparts.

We conduct our model performance tests as follows. With each estimated model, we

compute one-year-ahead forecasts of the choice between autocracy and democracy for each

country-year. That is, conditional on the state of the world at the end of year t−1 as recorded

in our data, we use (5) for each model to predict Di,t worldwide. We produce forecasts for

the in-sample period used to estimate each model (1951-2000) and for 10 additional out-of-

sample years (2001-2010).

In Figure 1, we plot the actual (gray) and predicted percentage of world democracies. In

the top panel, predictions are generated using our baseline specification with no covariates.

We plot predictions with (blue) and without (red) learning for both the in-sample (solid)

and out-of-sample (dashed) periods. In the lower panel, we present the same set of estimates

21In Online Appendix A7, we also consider a specification that allows for an interaction between (lagged)log-GDP per capita and an indicator (lagged) of individual leader turnover as in Treisman (2015). Ourresults are virtually unchanged.

22Geographic distance is highly correlated with other measures of cultural, economic, or political similaritybetween countries (Buera, Monge-Naranjo and Primiceri, 2011). As shown in Online Appendix A7, ourresults are robust to allowing the correlation in initial beliefs to also depend on genetic distance—as measuredby Spolaore and Wacziarg (2009)—and on economic distance in terms of initial levels of development.

14

using our model with two covariates (lagged log-GDP per capita and time in power).

Note the vast improvement in predictive success, both in and out of sample, when we

account for learning. Unsurprisingly, the no-learning specification of our model with no

covariates performs worst as it produces a constant prediction for each country.23 However,

while the inclusion of covariates does markedly improve the accuracy of the no-learning

model, these gains pale in comparison with the role of learning. Indeed, our baseline learning

model with no covariates vastly outperforms any specification that does not account for

learning. As many of the covariates we condition upon are themselves outcomes of the

selection process we model (e.g., per capita GDP or elite turnover), their inclusion should

not yield much improvement in predictive success. Our results confirm this intuition.24

Table 2 provides a numerical summary of our goodness-of-fit tests. Each set of columns

corresponds to a different model specification, with (odd columns) and without (even columns)

learning. The first row gives the percentage of country-year observations each model cor-

rectly predicts. Unsurprisingly, all models perform remarkably well on this dimension. The

reason is that, as transitions into or out of democracy are quite rare (130 total in-sample

events), country fixed effects go a long way in fitting the data. Indeed, our no-learning model

with no covariates (second column), which produces a constant prediction for each country,

has a success rate of almost 90%.

A much harder test—one that is considerably more revealing of the underlying causes of

democracy—is whether a model can correctly predict transitions to and from democracy. In

Table 2, we present two scenarios, assessing each model’s accuracy in predicting transitions

within ±0 years (second row) and ±2 years (third row) of the event. Here, the importance

of learning is striking. Models that do not account for learning perform quite poorly, even

within a 5-year window.25 And, while including additional covariates does increase accuracy,

23The observed temporal variation is an artifact of the changing population of countries in our data.24To show that the success of our model is not an artifact of aggregating predictions across countries,

we provide in Online Appendix A7 nearly identical results disaggregated by four regions of the world: theAmericas, Europe, Africa, and Asia-Oceania.

25The 5-year window predictions for models in the last two columns should be taken with care. Whencontrolling for years as democratic (negative when autocratic) and aggregating over 5 years, transitions in the

15

% W

orld

Dem

ocra

cies

1950 1960 1970 1980 1990 2000 2010

2530

3540

4550

55

No Covariates

Out ofSampleIn Sample

True Value

Model w/ Learning

Model w/ No Learning

% W

orld

Dem

ocra

cies

1950 1960 1970 1980 1990 2000 2010

2530

3540

4550

55

Two CovariatesTime in Power, ln(GDPpc)

Figure 1: Observed versus Predicted Worldwide Prevalence of Democracy

Notes. This figure compares the true proportion of world democracies (gray) with in-sample (solid blue) andout-of-sample (dashed blue) estimates generated by our model. Additionally, we shut down learning in ourmodel and produce both in-sample (solid red) and out-of sample (dashed red) predictions. In the top panel,estimates are generated using our baseline specification with no covariates. In the bottom panel, we controlfor lagged log-GDP per capita and incumbents’ time in power.

16

Table 2: Model Fit

No Covariates One Covariate Two Covariates Three Covariates Four CovariatesLearning No Learning Learning No Learning Learning No Learning Learning No Learning Learning No Learning

Choices 95.2 88.9 95.7 90.4 95.8 91.7 96.4 92.3 97.1 96.3(% correct)

Transitions(% correct)±0 years 9.3 0.0 6.2 1.6 7.0 3.9 11.0 7.1 6.3 0.8±2 years 41.1 0.0 43.4 7.8 48.1 20.2 52.0 23.6 70.1 53.5

Log-likelihood -581.4 -1,390.8 -572.5 -1,176.8 -553.6 -1,061.8 -523.6 -995.1 -490.1 -704.2

Observations 5,925 5,925 5,925 5,925 5,925 5,925 5,845 5,845 5,845 5,845

Notes. This table reports various goodness-of-fit measures. Each set of columns corresponds to a differentspecification of our model, with (odd columns) and without (even columns) learning. Models in the firsttwo columns use only country fixed effects to characterize the political cost of democracy. Models in thethird and fourth columns control for lagged log-GDP per capita. Models in the fifth and sixth columnsadditionally control for incumbents’ time in power. Models in the seventh and eighth columns also controlfor trade volume as a percentage of lagged GDP. Models in the last two columns add years as democratic(negative when autocratic) as a control. For each model, we report the percentage of correctly predictedin-sample system of government choices (first row). We similarly report the percentage of correctly predictedtransitions to or from democracy within a 0-year window (second row) and a 5-year window (third row) ofthe event.

the marginal improvement is negligible. In contrast, turning on the learning mechanism in

our model raises predictive success by over 100% in virtually all scenarios and all specifica-

tions. In fact, our baseline learning model with no covariates outperforms most no-learning

specifications by a similar rate.26

To further benchmark our model, we present in Table 3 results from a series of panel

regressions typical of the approach taken in the existing empirical literature on democrati-

zation. Using linear probability models, we regress our democracy measure against a full set

of country fixed effects, a one-period lag of log-GDP per capita, and various controls. We

exploit both annual data and, as in Acemoglu et al. (2008, 2009) and Boix (2011), five-year

panels, which allow for the inclusion of covariates not available annually.27 As before, with

each specification we produce predictions for every country-period.

data are mechanically picked up by the models and turned into correct predictions. See below for a similarcomment about models with a one-year lag of democracy. Notably, turning on the learning mechanism stilldelivers a sizable improvement in predictive power.

26The only exception is the model in the last column—see Footnote 25.27We also estimate five-year panel versions of our model—see Online Appendix A7—and results are

essentially identical to those of Table 2, which should alleviate concerns about both the myopia of incumbentsin our model and whether an annual time frame is appropriate to study changes in system of government.

17

Table 3: Goodness of Fit of Reduced-Form Models

I II III IV V VI VII VIII IX X XI XII XIII XIV

Choices 89.3 88.6 89.5 88.8 89.3 88.8 77.1 91.3 97.8 91.4 90.4 90.8 97.8 91.6(% correct)

Transitions(% correct)±0 years 0.8 1.5 0.8 6.9 2.3 6.2±2 years 3.1 2.9 3.8 2.9 3.8 1.9 1.2 3.8 96.9 19.4 7.7 14.1 96.9 17.2

log(GDPpc)t−1 0.115∗∗∗ 0.147∗∗∗ 0.116∗∗∗ 0.142∗∗∗ 0.112∗∗∗ 0.142∗∗∗ 0.033 0.032 0.024∗∗∗ 0.102∗∗∗ 0.060∗∗∗ 0.040 0.011 0.022(0.010) (0.026) (0.010) (0.026) (0.011) (0.026) (0.053) (0.078) (0.006) (0.024) (0.014) (0.037) (0.008) (0.034)

Controls :Country Effects X X X X X X X X X X X X X XTime in Power X X X X X X X X X X X XTrade/GDPt−1 X X X X X X X X X XEducation X XLabor Share XDemocracyt−1 X X X XTime Effects X X X X

Observations 5,925 1,076 5,925 1,076 5,866 1,076 649 390 5,866 1,076 5,866 1,076 5,866 1,076Panel Length 1 5 1 5 1 5 5 5 1 5 1 5 1 5

∗p<0.1, ∗∗p<0.05, ∗∗∗p<0.01

Notes. This table gives linear probability estimates of the impact of (lagged) log-GDP per capita on democ-racy. All models include country fixed effects. We successively add in controls for the incumbent’s time inpower, trade as a percentage of (lagged) GDP, average years of schooling from Barro (1999), and labor shareof value added from Rodrik (1999). In columns XI-XIV, we include year fixed effects. In columns IX, X,XIII, and XIV, we include a one-period lag of democracy. We present estimates with annual and five-yearpanels. The first row gives the percentage of country-period observations correctly predicted by each model.For models with annual data, we also report the success rate at predicting transitions within ±0 (secondrow) and ±2 (third row) years of the event. For the five-year panels, we produce success rates at predictingtransitions just within the five-year window (third row).

18

Again, as with our model, it is trivial to correctly predict close to 90% of country-period

observations using country fixed effects. The important departure arises when we compare

predictions of transitions derived from these reduced-form regressions with those generated

by our learning model. Once more, we evaluate these predictions in exact (±0 years) and

five-year windows (±2).

In terms of exact predictions, no reduced-form specification surpasses the predictive suc-

cess of our baseline no-covariates learning model. In the five-year window, our model per-

forms similarly well, save annual panel specifications with a lagged dependent variable, which

correctly predict over 90% of transitions. However, aggregated over five years, the inclusion

of the one-year lagged outcome leads any transition not picked up exactly by the model to

be mechanically transformed into a correct prediction in subsequent years, thus yielding an

extremely high success rate in the five-year window despite a low success rate within a single

year. To further see this, note that, when we introduce a one-period lag in the five-year panel

specifications, they successfully predict less than 20% of transitions, a rate which our model

beats by more than 100%. More importantly, our model provides a structural interpretation

for the observed persistence of systems of government that underpins the predictive success

of these autoregressive specifications.28

Of course, it may be the case that, rather than reflecting the learning process we de-

scribe, our model’s success is simply an artifact of some alternative process of democratic

diffusion that is indirectly picked up by our model’s spatial and temporal flexibility. Exam-

ples of potential mechanisms proposed in the literature include direct emulation of neighbors

(Gleditsch and Ward, 2006), cultural linkages (Wejnert, 2005), and diffusion through trade

(Mansfield, Milner and Rosendorff, 2000). Due to data limitations, direct tests of each of

these mechanisms would be impractical. Nonetheless, in the context of our model, the chan-

28In particular, our model explains the resilience of long-lived democracies in light of recent work thatestablishes a positive causal impact of democracy on economic growth (Acemoglu et al., 2019). Over time,elites in these countries correctly come to believe—with increasing confidence—in democracy’s superioreconomic potential, which solidifies their commitment to democratic government. See the discussion in thefollowing subsection.

19

nel by which these alternative explanations could influence democratic transitions is through

their impact on the political cost of democracy. To evaluate this possibility, we construct

a distance-weighted measure of how democratic each country’s neighborhood is over time,

and we reestimate our baseline model using this measure as a control for direct diffusion

effects. In Online Appendix A7, we find minimal increase in predictive success relative to

our baseline model, which indicates that it is our proposed mechanism of learning about the

economic effects of democracy—and not some alternative process of diffusion—what drives

our results.

In sum, our goodness-of-fit tests quantify and highlight the crucial role of learning in

explaining the dynamics of worldwide democracy adoption, overshadowing the usefulness

of other explanatory variables typically employed in the literature. As discussed, this is

not surprising given that many of these controls are themselves outcomes of the learning

process we model. In light of these results, we hereafter focus our attention on the baseline

no-covariates specification of our learning model.

Structural Parameter Estimates

To understand how our model is fitting the data, we summarize our main parameter es-

timates and discuss their substantive implications.29 We begin with our estimates of the

(de)stabilizing effect of GDP growth on elite turnover under autocracy and democracy. This

question, itself, has been a separate subject of academic inquiry for decades.30 We find that

the impact of growth on the likelihood that the incumbent group retains power indeed dif-

fers across autocracies and democracies.31 The quantitative implications of our estimates are

summarized in Figure 2, which plots the estimated probability (averaged across countries)

that the incumbent remains in power at different rates of GDP growth. In blue, we plot our

estimates under democracy and, in red, our estimates under autocracy.

29We report all coefficient estimates for the baseline specification of our model in Online Appendix A6.30For a summary of the early literature on the topic, see Przeworski et al. (2000) ch. 1.31Specifically, we estimate θD=0 = −4.2213 with a standard error of 2.2209, and θD=1 = 8.8279 with a

standard error of 1.6368.

20

% ∆ GDP

Pr(

Ret

ain

Pow

er)

−50 −40 −30 −20 −10 0 10 20 30 40 50

020

4060

8010

0

Democracies

Autocracies

Figure 2: (De)Stabilizing Effect of GDP Growth on Elite Turnover

Notes. This figure plots the average (across countries) estimated probability of retaining power at differentrates of GDP growth under autocracy (red) and democracy (blue). Observed growth rates between 1951-2000are shown over the horizontal axis.

In line with a substantial empirical literature in political economy, we find that, in democ-

racies, economic growth is stabilizing for elites.32 In other words, the party in government is

more likely to win reelection when growth is high. In contrast, we find that growth is desta-

bilizing in autocracies. This result comports with the view that rapid economic expansion in

non-democracies creates inequalities of expectation or inequalities of outcomes that, in turn,

engender attempts to subvert the political system.33 That is, autocrats are less likely to

remain in power when economic growth produces actors—a middle class, for example—able

to place demands upon and challenge the authority of the group in power.

Coupling these results with learning helps explain the observed cross-sectional correlation

between democracy and per capita GDP. Democracy becomes more appealing to incumbents

as they come to believe that it is conducive to high rates of GDP growth. Conversely,

32Hibbs (1977); Alesina, Roubini and Cohen (1997); Brender and Drazen (2008).33Olson (1963); Huntington (1968); Hirschman and Rothschild (1973).

21

autocracy entails an incentive to suppress economic growth.

Of course, there are notable exceptions. For example, over the past two decades China

has experienced high rates of growth and nevertheless remained autocratic. Similarly, India

for the first four decades of its independence experienced low rates of growth and yet re-

mained democratic. Underlying these prominent cases is the country-specific political cost of

democracy. We recover estimates of the structural parameters describing the baseline polit-

ical cost of democracy for each country, fi, and plot them in Figure 3. Notably, the Chinese

Communist party faces, all else equal, the lowest probability of remaining in power under

democracy. In contrast, Congress at India’s independence had the fourth-highest ex-ante

probability of retaining power under democracy.

Next, in Figure 4, we present estimates of the spatial decay of learning, i.e., the extent

to which the cross-country correlation in prior beliefs depends on the geographic distance

between capitals.34 Consistent with the observed spatial clustering of democratization events

noted in the literature, we find that learning is highly circumscribed geographically. As

shown in Figure 4, at 5,000km (the approximate distance between the U.S. and Ecuador),

the prior correlation of beliefs is only 0.12. At 10,000km (the distance between the U.S.

and the Central African Republic), the potential for learning between countries is virtually

absent. This estimated feature of our model reveals that elites learn from the experiences

of relatively proximate countries, and it underpins the ability of our model to capture the

spatial clustering of transitions to and from democracy.

Finally, to understand the dynamics of democracy adoption, Figure 5 presents our esti-

mates of the evolution of beliefs about the economic effects of democracy. The top panel

plots the evolution of the worldwide distribution of the mean of beliefs about βD=1i − βD=0

i ,

the difference in long-run GDP growth rates under democracy versus autocracy. The bot-

tom panel shows the evolution of worldwide uncertainty (standard deviation) about these

beliefs. The initial state of beliefs in 1951 simply reflects our prior-calibration exercise. In the

34The estimated coefficient is γ = 0.4234 with a standard error of 0.2292.

22

Relative Elite Turnover Under Democracy

BOL

NEW

IND

NTH

DEN

CO

SSLOU

RU

ALBM

ACSALFR

NM

LDU

KRJPNM

ON

GN

BC

ENU

SABR

ALATBU

LM

ZMPAKH

UN

ECU

BENISRAU

LM

ASSAFG

RC

KYRG

RG

MAL

POR

PERH

ON

UZB

ANG

RW

ASPNQ

ATD

OM

SAUC

ON

MEX

ETHN

ICTAZAR

MTKMZAMPANSO

MKU

WG

UA

SUD

CR

OR

US

UAE

INS

YUG

TAWC

AMLBR

LIB

CD

IKENSIE

IRN

BFOEG

YTU

NC

OM

CH

N

RU

MVENSLVC

OL

PHI

FIN

LITTU

RTH

IG

AMTR

IC

HL

UKG

CAN

BOS

NO

RBO

TSW

ZPO

LM

AGESTIR

E

SRI

BELJAMAR

GC

ZRLAOM

AW

ITAG

MY

SWD

TAJAU

SAZEN

IRKZKN

EPU

GA

MLI

LEBC

HA

NIG

BNG

PARSENG

HA

NAM

DR

VG

ABAFGBAHM

AABU

ISW

AR

OK

ZIM

BLRJO

RTO

GIR

QG

UI

EQG

LESM

YAD

RC

DJI

OM

ASIN

MO

RC

AOC

UB

ALGSYRPR

K

●

●

● ●

●

●

●●

●

● ● ●●

● ● ●

● ●● ● ●

●●

●● ● ● ● ●

● ● ● ● ● ● ●

●

● ●●

●● ● ●

●● ● ● ● ● ● ●

●● ● ●

● ●●

● ●● ●

● ● ● ● ● ●● ●

● ●● ●

● ● ● ● ● ●● ● ● ● ●

● ●●

● ● ●●

●● ● ● ● ●

●●

●

●

●● ● ●

●● ● ● ● ● ●

● ● ● ● ●●

● ● ● ●● ● ● ●

● ● ● ● ●

● ● ●●

●● ●

●●

●

●● ●

● ●

●

●

●

HighTurnoverIn Democracy

In DemocracyTurnoverLow

−0.

50.

00.

5

Figure 3: Political Cost of Democracy

Notes. This figure plots our estimates—along with 90% confidence intervals—of the baseline political cost ofdemocracy, fi, across countries. Positive (negative) values imply higher (lower) elite turnover, all else equal,under democracy than autocracy.

23

KM (Thousands)

Cor

rela

tion

of B

elie

fs

0 5 10 15 20

0.0

0.2

0.4

0.6

0.8

1.0

Spatial Decay of Learning

USA−Ecuador

USA−CAR

Figure 4: Cross-Country Correlation in Prior Beliefs

Notes. This figure shows the spatial decay of learning, plotting our estimate of the cross-country correlationin prior beliefs about the economic effects of democracy as a function of geographic distance between capitals.For reference, Ecuador and the Central African Republic are approximately five and ten thousand kilometers,respectively, from the U.S.

24

pre-sample period (1875-1950), democracies grew, on average, about 0.4 percentage points

faster than autocracies. The median of mean initial beliefs in the top panel of the figure is

consistent with this statistic.

While estimated beliefs remain relatively flat for the first three decades of the in-sample

period, in the 1980s and, even more dramatically, in the 1990s there is a sharp expansion of

beliefs in favor of democracy’s superior potential to foster economic growth. This divergence

in beliefs reflects an increasingly large gap between autocracies and democracies in observed

economic performance. Between 1951-1979, little new information was revealed about the

relative ability of democracy to generate growth: on average over this period, autocracies

grew just 0.37 percentage points slower than democracies, roughly identical to the observed

difference in the pre-sample period. By contrast, between 1980-2000, democracies produced,

on average, 1.54 percentage points higher annual growth than autocracies.

This widening gap in economic performance—accelerating through the 1980s—reached

a peak in 1987, when democracies outgrew autocracies by 3.3 percentage points. The re-

sulting discrepancy in observed growth rates from countries’ prior expectations led them to

progressively update their beliefs. Together with our estimates of the (de)stabilizing effects

of GDP growth on elite turnover, this change in worldwide beliefs helps explain our model’s

ability to correctly predict the striking rise in the percentage of world democracies observed

in the same period (Figure 1).35

This process operated through changes in both the mean and precision of beliefs. After

the oil crisis, as democracies came to outperform autocracies, this induced countries to

revise up their estimates of democracy’s superior economic potential, which encouraged

transitions to democracy. As the first set of countries transitioned, democracy became

less rare worldwide, reducing belief uncertainty about its relative economic merits. This

helped solidify the growing consensus, leading to further democratization. The result was

35In our out-of sample period, we observe a decline in the relative performance of democracies, whichgrew just 0.67 percentage points faster than autocracies. As expected, we estimate a concomitant decline inaverage beliefs about the economic benefits of democracy. See Online Appendix A2 for further discussion.

25

1950 1960 1970 1980 1990 2000 2010

−1

01

2

βD=1

−βD

=0

Worldwide Distribution of Beliefs

1950 1960 1970 1980 1990 2000 2010

12

34

56

Distribution of Belief Uncertainty

Sta

ndar

d D

evia

tion

Figure 5: Worldwide Beliefs about Economic Effects of Democracy

Notes. The top panel of this figure plots estimates of the 20th-80th worldwide percentiles of the meanof beliefs about the percentage-point difference in long-run GDP growth rates under democracy versusautocracy. The lower panel shows estimates of the 20th-80th worldwide percentiles of uncertainty (standarddeviation) in these beliefs.

26

the cascade in beliefs between the 1980s and mid-1990s shown in Figure 5 and, ultimately,

the wave of democratization that followed over the same period.

Counterfactuals

Our model allows us to explore counterfactual experiments of three types. First, it enables

us to understand how systemic shocks to prosperity impact the worldwide prospects for

democracy. Second, it allows us to ask retrospective questions about historical democratiza-

tion events. Third, our model allows us to prospectively explore conditions that would lead

current elites to transition to or from democracy. We present results of each type in turn.

To conduct these experiments, it is necessary to specify and estimate the “true” data

generating process. A considerable advantage of using our baseline no-covariates model to

generate these counterfactuals is that only an estimate of the true relationship between

GDP growth and democracy is required. Following Buera, Monge-Naranjo and Primiceri

(2011), we assume that this relationship is described by a hierarchical linear model similar

to (2), which we estimate using all available data between 1875-2000 (excluding the two

world wars).36 This specification is appealing for its flexibility and because it is consistent

with our model of learning in that elites with beliefs (2) and prior (3) would eventually learn

the truth over time.

A Second Great Depression

In the year before the market crash of 1929, 51% of the world’s independent states were

democracies. A year later this proportion dropped to just over 43%. By 1935, the fraction

of countries that remained democratic decreased by another 5%, reaching a low of 36% by

36Specifically, we assume that

yi,t = (1−Di,t)bD=0i +Di,tb

D=1i + ei,t,

et ∼ N(0, S ·Q · S), b ∼ N(b, I2 ⊗ S ·W · S),

where S is a diagonal matrix, Qi,j = exp(−Zi,jζQ), and Wi,j = λ exp(−Zi,jζW ).

27

1938. In Europe, democratic backsliding was even starker. At its theretofore high in 1920,

twenty-six out of twenty-eight European states were democratic, but by 1938 thirteen of

these countries had transitioned away from democracy.

A substantial body of both historical and quantitative research has linked the global

decline of democracy in the inter-war period directly to the economic downturn of the Great

Depression.37 Likewise, both in academic and popular discourse, the global economic reces-

sion of 2008 has been put forward as a contributing factor in the observed wave of recent

democratic breakdowns.38 Next, we provide evidence that systemic economic crises indeed

engender reversals to autocracy and, moreover, highlight how this is driven by changes in

beliefs about the economic effects of democracy.

To that end, we simulate two sorts of crises. First, we generate a “short-deep” counter-

factual crisis where, for our last in-sample year (2000), we simulate a 5.9% average worldwide

contraction of per capita GDP, comparable to the worst year of the Great Depression (1931).

In our second crisis, we construct a “long-shallow” counterfactual condition where we perturb

growth by a smaller amount—1.7% annually (the average contraction between 1929-1933)—

but extend this contraction over a five-year period (2000-2004). We present results where

we concentrate these counterfactual conditions in autocratic and democratic countries, re-

spectively. Under the “autocratic-bias” condition, recessions are twice as deep in autocracies

as in democracies, while keeping the worldwide average contraction consistent with our in-

tervention. Conversely, under the “democratic-bias” condition, recessions are twice as deep

in democracies.

Model estimates of the worldwide percentage of democracies are shown in Figure 6. In the

left-hand panel, we present estimates from the autocratic-bias condition and, in the right-

hand panel, estimates from the democratic-bias condition. In both plots, the short-deep

counterfactual is shown in green, and the long-shallow counterfactual is shown in purple.

Note that both the short-deep and long-shallow counterfactual crises negatively impact the

37Frey and Weck (1983); de Bromhead, Eichengreen and O’Rourke (2013).38Bartels (2013); Armingeon and Guthmann (2014).

28

worldwide prevalence of democracy. While the large single-period decline in growth has a

larger impact in the first year, thereafter the proportion of democracies begins to recover.

In comparison, the smaller but longer crisis has a larger overall impact, with the proportion

of democracies continuing to decline through the duration of the economic contraction and

then recovering more slowly.

% W

orld

Dem

ocra

cies

1950 1960 1970 1980 1990 2000 2010

2530

3540

4550

55

True Value

Model Prediction

Long−Shallow

Short−Deep

Global Economic DepressionAutocratic Bias

% W

orld

Dem

ocra

cies

1950 1960 1970 1980 1990 2000 2010

2530

3540

4550

55

Global Economic DepressionDemocratic Bias

Figure 6: Second Great Depression Counterfactuals

Notes. This figure reports results from two counterfactual exercises. First, we present predictions followinga short-deep crisis of a single year (2000) of 6% average worldwide economic contraction (green). Second, wepresent predictions based on a long-shallow crisis of 5 years (2000-2004) of 2% average annual contraction(purple). We plot the true percentage of world democracies (gray) as well as baseline model predictions(blue). We concentrate the contraction in autocracies in the left-hand panel and in democracies in theright-hand panel.

Both patterns are consistent with our actors learning about the economic merits of democ-

racy. In both counterfactuals, the initial shock forces agents to revise their beliefs. However,

in the long-shallow counterfactual, as would be expected from a continued process of learning,

our agents update their beliefs following each additional negative perturbation of worldwide

growth and continue to select out of democracy accordingly. In contrast, in the short-deep

scenario, we observe a single large drop in the percentage of democracies. Since after the

initial shock to growth there is no “new” information revealed to our agents, the world-

wide percentage of democracies starts to recover. Importantly, comparing effects across the

29

two bias conditions, it is clear that the reduction in world democracies is larger when the

economic contraction is concentrated among democracies. This is consistent with the evo-

lution of beliefs prior to our intervention, as discussed above. When democracies perform

poorly, counter to the prevailing consensus, elites sharply revise their beliefs and select out

of democracy.

The Third Wave

A number of studies highlight the influence of external actors on the prospects for democ-

racy.39 Especially for the early “third-wave” democratization events in Greece, Portugal, and

Spain, the potential for accession to the European Community (EC) has been put forward as

a crucial determinant of their respective transitions.40 Brussels’ requirement that community

members maintain a form of government consistent with liberal democracy, coupled with the

economic benefits of access to the common market, generated an incentive to democratize.

In this section, we show that, rather than serving as an institutional target, much of the im-

pact the EC had upon third-wave democratization operated through its constituent states’

economic performance, which affected beliefs about the economic benefits of democracy in

potential member states.

To show this, we construct a counterfactual wherein we generate a recession in the EC’s

three largest economies, Britain, France, and Germany, of 2% average annual contraction for

the two years preceding the first transition of Greece in 1974.41 Then, to obtain an estimate

of this counterfactual recession’s impact on the transitions of Greece, Portugal, and Spain,

we compare our predicted transitions in this counterfactual world with the truth. Our results

are given in Figure 7.

For Portugal and Spain, the effect of lower growth in Britain, France, and Germany is

considerable, delaying their transitions to democracy by almost 20 years. In contrast, we

39Pevehouse (2005).40Whitehead (1996)41Recessions in our sample last 2 years on average, with an average 2% annual drop in per capita GDP.

30

1980 1990 2000

Greece

Portugal

Spain

1980 1990 2000

Greece

Portugal

Spain

Figure 7: Counterfactual Third Wave

Notes. The top panel shows the true timeline of democratization of Greece, Portugal, and Spain. In thebottom panel, we present the corresponding timeline under our counterfactual scenario of a two-year recession(1972 and 1973) in Britain, France, and Germany of 2% average annual contraction.

find no effect of this recession on Greece’s transition. The reason for this becomes apparent

once we compare the evolution of beliefs in the three countries. We plot in Figure 8, for

each of these cases, our estimates of beliefs under the observed economic conditions (solid)

and under our counterfactual timeline (dashed). Note that for Portugal and Spain there

is a substantial divergence in beliefs between the observed and counterfactual timelines.

Following our counterfactual recession, Portuguese and Spanish beliefs become markedly

less favorable towards the potential for growth under democracy. In contrast, for Greece

this is not the case: there is no substantial difference in beliefs. The reason, according to

our model, is that Greek elites pay little attention to the large, relatively distant Western

economies used to construct our counterfactual scenario. Britain, France, and Germany are

simply too different from Greece to be used as a reference for learning.

Ultimately, our counterfactual experiment suggests that the distinguishing characteristic

of Portugal and Spain, in contrast to, for example, Brazil or Mexico, is not their underlying

propensity for democracy. In terms of their estimated baseline political cost of democracy,

Portugal and Brazil and Spain and Mexico are fairly close.42 Rather, Portugal and Spain

democratized early because they learned from Western Europe, benefiting from proximity

to successful liberal democracies.

42The baseline political cost of democracy for Portugal and Brazil is estimated at 0.01 and -0.14, and interms of rank order they are 71 and 39, respectively. Estimates for Spain and Mexico are 0.06 and 0.12,yielding rank orders of 83 and 93, respectively.

31

−6

−5

−4

−3

−2

Truth

Counterfactual

Spain

−4

−3

−2

−1

0

Portugal

−5

−4

−3

Greece

1975 1980 1985 1990 1995 2000 2005 2010

Beliefs

βD=1

−βD

=0

Figure 8: Counterfactual Third-Wave Beliefs

Notes. We plot estimates of Spain, Portugal, and Greece’s mean beliefs under the true growth rates (solid)and under a counterfactual two-year recession (1972 and 1973) in Britain, France, and Germany of 2%average annual contraction (dashed).

32

Chinese (and North Korean) Democracy

With an eye to contemporary politics, we evaluate the stability of a pair of geopolitically

important autocracies. We explore conditions under which our model predicts China and

North Korea would democratize. We focus first on the Chinese case. Here, we look for the

minimal average growth rate among China’s democratic neighbors, in a five-year economic

expansion, that would result in transitions of two types.43 First, we find the rate of growth

that would result in at least a single year of democracy. Second, we establish the growth

rate that would deliver a permanent transition to democracy.

To obtain a predicted single-year Chinese transition to democracy, we estimate that it

would take five years (2000-2004) of 6.5% average annual growth in China’s democratic

neighbors. After this single year of democracy (2005), our model predicts an immediate re-

versal to autocracy. To obtain a “permanent” transition—that is, a prediction of democracy

until the end of our sample—we estimate that China’s democratic neighbors would have to

grow at an average annual rate of 11% between 2000-2004. In contrast, under the same

set of counterfactual conditions, North Korea would not democratize for any period. North

Korea would democratize for a single year following five years of 16.5% average growth in

its democratic neighbors, and permanently following five years of 20.5% average growth.

To highlight the differences in learning, Figure 9 plots Chinese and North Korean be-

liefs under the “Chinese democracy” scenarios of 6.5% and 11% average growth in their

democratic neighbors. While our intervention increases the perception that democracy out-

performs autocracy in both countries, the change in beliefs in North Korea is too small

to induce a transition.44 These results suggest that the prospects for Chinese and North

Korean democracy are limited. It would take a remarkably large economic boom in Asian

democracies for China to democratize, and an even larger, implausible expansion to generate

43Economic expansions in our sample last five years on average.44This stems from both higher initial skepticism in North Korean beliefs and a smaller effective neigh-

borhood from which to learn. While North Korea primarily learns from South Korea and Japan, Chinaadditionally draws from the experiences of its democratic neighbors to the west and south.

33

the same outcome in North Korea.45

β CH

ND

=1−

β CH

ND

=0

1980 1985 1990 1995 2000 2005 2010

−2

02

46

Chinese Beliefs

11.0%6.5%

β PR

KD

=1−

β PR

KD

=01980 1985 1990 1995 2000 2005 2010

−2

02

46

North Korean Beliefs

Figure 9: Chinese and North Korean Beliefs

Notes. We plot estimates of Chinese and North Korean mean beliefs about the economic impact of democ-racy under observed growth (solid) and under two counterfactual five-year expansions (2000-2004) in theirdemocratic neighbors of 6.5% (dashed) and 11% (dots) average annual growth.

Conclusion

We propose and estimate a learning model of democratization in which incumbent elites rely

on worldwide economic history to update their beliefs about the impact of democracy on

economic growth, which affects their likelihood of retaining power. Our estimates indicate

that growth is stabilizing for incumbents in democracies but destabilizing in autocracies.

Furthermore, we show that learning is highly circumscribed geographically. In combination,

these features allow us to successfully predict, both in (1951-2000) and out of sample (2001-

2010), much of the observed variation in democracy adoption. In particular, our model

45Scholars of Chinese politics frequently argue that Chinese citizens are willing to reward economic growthwith regime survival (Laliberte and Lanteigne, 2008), which suggests there could be considerable heterogene-ity in θ across countries. We explore this to some extent in Online Appendix A7 and find limited evidence.Relatedly, the conventional wisdom is that China will democratize if growth slows down. While an economiccrisis may lead to a transition of power in China, our results caution that a transition to democracy wouldrequire confidence by new elites in its superior economic potential.

34

jointly rationalizes the cross-sectional correlation between income and democracy and the

clustering of transitions to and from democracy.

Rather than an “end of history,” we show that democracy is only as resilient as the

economic performance it engenders. Even in the 1990s, when the success of democratic

systems made such proclamations seem reasonable, we find both substantial variation in

the worldwide distribution of beliefs about the economic consequences of democracy and

substantial uncertainty in these beliefs. As recent history suggests, and as our counterfac-

tual experiments demonstrate, systemic economic crises, particularly those concentrated in

democracies, have the potential to generate waves of autocratic reversals.

Our paper contributes to a sizable literature on democracy and development. Notably,

our results indicate that focusing on the within-country impact of economic development on

democratization is insufficient. Rather, modernization should be understood as a systemic

phenomenon. Because countries learn from each other, a country’s economic performance

affects not only its own likelihood of transitioning to or from democracy but also the prospects

for democracy outside of its borders through its influence on neighbors’ beliefs.

35

References

Acemoglu, Daron, Simon Johnson, James A. Robinson and Pierre Yared. 2008. “Income andDemocracy.” American Economic Review 98(3):808–842.

Acemoglu, Daron, Simon Johnson, James A. Robinson and Pierre Yared. 2009. “Reevaluating themodernization hypothesis.” Journal of Monetary Economics 56:1043–1058.

Acemoglu, Daron, Suresh Naidu, Pascual Restrepo and James A. Robinson. 2019. “DemocracyDoes Cause Growth.” Journal of Political Economy 127(1):47–100.

Ahlquist, John S and Erik Wibbels. 2012. “Riding the Wave: World Trade and Factor-BasedModels of Democratization.” American Journal of Political Science 56(2):447–464.

Alesina, Alberto and Dani Rodrik. 1994. “Distributive Politics and Economic Growth.” QuarterlyJournal of Economics 109(2):465–490.

Alesina, Alberto, Nouriel Roubini and Gerald D. Cohen. 1997. Political Cycles and the Macroecon-omy. Cambridge: MIT Press.

Almond, Gabriel A. and Sidney Verba. 1963. The Civic Culture: Political Attitudes and Democracyin Five Nations. Princeton: Princeton University Press.

Armingeon, Klaus and Kai Guthmann. 2014. “Democracy in crisis? The declining support fornational democracy in European countries, 2007–2011.” European Journal of Political Research53(3):423–442.

Ascencio, Sergio J. and Miguel R. Rueda. 2019. “Partisan Poll Watchers and Electoral Manipula-tion.” American Political Science Review 113(3):727–742.

Barro, Robert J. 1996. “Democracy and Growth.” Journal of Economic Growth 1(1):1–27.

Barro, Robert J. 1999. “Determinants of Democracy.” Journal of Political Economy 107(S6):S158–S183.

Bartels, Larry M. 2013. “Political Effects of the Great Recession.” Annals of the American Academyof Political and Social Science 650(1):47–76.

Boix, Carles. 2011. “Democracy, Development, and the International System.” American PoliticalScience Review 105(04):809–828.

Boix, Carles, Michael Miller and Sebastian Rosato. 2013. “A Complete Data Set of PoliticalRegimes, 1800–2007.” Comparative Political Studies 46(12):1523–1554.

Boix, Carles and Susan C. Stokes. 2003. “Endogenous Democratization.” World Politics 55(4):517–549.

Bolton, Patrick and Christopher Harris. 1999. “Strategic Experimentation.” Econometrica67(2):349–374.

Bramoullee, Yann, Rachel Kranton and Martin D’Amours. 2014. “Strategic Interaction and Net-works.” American Economic Review 104(3):898–930.

36

Brender, Adi and Allan Drazen. 2008. “How Do Budget Deficits and Economic Growth AffectReelection Prospects? Evidence from a Large Panel of Countries.” American Economic Review98(5):2203–2220.

Brinks, Daniel and Michael Coppedge. 2006. “Diffusion Is No Illusion: Neighbor Emulation in theThird Wave of Democracy.” Comparative Political Studies 39(4):463–489.

Buera, Francisco J., Alexander Monge-Naranjo and Giorgio E. Primiceri. 2011. “Learning theWealth of Nations.” Econometrica 79(1):1–45.

Chen, Jie and Chunlong Lu. 2011. “Democratization and the Middle Class in China: The MiddleClass’s Attitudes toward Democracy.” Political Research Quarterly 64(3):705–719.

Crisman-Cox, Casey and Michael Gibilisco. 2018. “Audience Costs and the Dynamics of War andPeace.” American Journal of Political Science 62(3):566–580.

Dahl, Robert A. 1998. On Democracy. New Haven: Yale University Press.

Davison, Anthony C. and David V. Hinkley. 1997. Bootstrap Methods and their Application. InCambridge Series in Statistical and Probabilistic Mathematics. New York: Cambridge UniversityPress.

de Bromhead, Alan, Barry Eichengreen and Kevin H. O’Rourke. 2013. “Political Extremism in the1920s and 1930s: Do German Lessons Generalize?” Journal of Economic History 73(2):371–406.

Diamond, Larry. 2011. “The Impact of the Economic Crisis: Why Democracies Survive.” Journalof Democracy 22(1):17–30.

Dube, Jean-Pierre, Jeremy T. Fox and Che-Lin Su. 2012. “Improving the Numerical Performanceof Static and Dynamic Aggregate Discrete Choice Random Coefficients Demand Estimation.”Econometrica 80(5):2231–2267.

Feenstra, Robert C., Robert Inklaar and Marcel P. Timmer. 2015. “The Next Generation of thePenn World Table.” American Economic Review 105(10):3150–3182.

Frey, Bruno S. and Hannelore Weck. 1983. “A Statistical Study of the Effect of the Great Depressionon Elections: The Weimar Republic, 1930–1933.” Political Behavior 5(4):403–420.

Garcıa-Jimeno, Camilo. 2016. “The Political Economy of Moral Conflict: An Empirical Study ofLearning and Law Enforcement under Prohibition.” Econometrica 84(2):511–570.

Gehlbach, Scott, Konstantin Sonin and Milan W. Svolik. 2016. “Formal Models of NondemocraticPolitics.” Annual Review of Political Science 19:565–584.

Gleditsch, Kristian S. 2002. “Expanded Trade and GDP Data (version 6.0).” Journal of ConflictResolution 46(5):712–724.

Gleditsch, Kristian S. and Michael D Ward. 2006. “Diffusion and the International Context ofDemocratization.” International Organization 60(4):911–933.

Goemans, Henk E, Kristian S. Gleditsch and Giacomo Chiozza. 2009. “Introducing Archigos: ADataset of Political Leaders.” Journal of Peace Research 46(2):269–283.

37

Heiss, Florian and Viktor Winschel. 2008. “Likelihood approximation by numerical integration onsparse grids.” Journal of Econometrics 144(1):62–80.

Hibbs, Douglas A. 1977. “Political Parties and Macroeconomic Policy.” American Political ScienceReview 71(4):1467–1487.

Hirschman, Albert O. and Michael Rothschild. 1973. “The Changing Tolerance for Income Inequal-ity in the Course of Economic Development.” Quarterly Journal of Economics 87(4):544–566.

Houle, Christian, Mark A. Kayser and Jun Xiang. 2016. “Diffusion or Confusion? Clustered Shocksand the Conditional Diffusion of Democracy.” International Organization 70(4):687–726.

Huntington, Samuel P. 1968. Political Order in Changing Societies. New Haven: Yale UniversityPress.

Huntington, Samuel P. 1993. The Third Wave: Democratization in the Late Twentieth Century.Norman: University of Oklahoma Press.

Kadera, Kelly M., Mark J. C. Crescenzi and Megan L. Shannon. 2003. “Democratic Survival, Peace,and War in the International System.” American Journal of Political Science 47(2):234–247.

Laliberte, Andre and Marc Lanteigne, eds. 2008. The Chinese Party-State in the 21st Century:Adaptation and the Reinvention of Legitimacy. New York: Routledge.

Lipset, Seymour M. 1959. “Some Social Requisites of Democracy: Economic Development andPolitical Legitimacy.” American Political Science Review 53(1):69–105.

Londregan, John B. and Keith T. Poole. 1996. “Does High Income Promote Democracy?” WorldPolitics 49(1):1–30.

Maddison, Angus. 2010. “Statistics on World Population, GDP and Per Capita GDP, 1-2008 AD.”URL: http://www.ggdc.net/maddison/oriindex.htm

Mansfield, Edward D., Helen V. Milner and B. Peter Rosendorff. 2000. “Free to Trade: Democracies,Autocracies, and International Trade.” American Political Science Review 94(2):305–321.

Matern, Bertil. 1960. “Spatial variation: Stochastic models and their application to some problemsin forest surveys and other sampling investigations.” Meddelanden fran Statens Skogsforskn-ingsinstitut 49(5):1–144.

Miller, Michael K. 2016. “Democracy by Example? Why Democracy Spreads When the World’sDemocracies Prosper.” Comparative Politics 49(1):83–116.

Mossel, Elchanan, Allan Sly and Omer Tamuz. 2015. “Strategic Learning and the Topology ofSocial Networks.” Econometrica 83(5):1755–1794.

Norris, Pippa, ed. 1999. Critical Citizens: Global Support for Democratic Government. Oxford:Oxford University Press.

Olson, Mancur. 1963. “Rapid Growth as a Destabilizing Force.” Journal of Economic History23(4):529–552.

38

Persson, Torsten and Guido Tabellini. 1994. “Is Inequality Harmful for Growth?” AmericanEconomic Review 84(3):600–621.

Pevehouse, Jon C. 2005. Democracy from Above: Regional Organizations and Democratization.Cambridge: Cambridge University Press.

Primiceri, Giorgio E. 2006. “Why Inflation Rose and Fell: Policymakers’ Beliefs and US PostwarStabilization Policy.” Quarterly Journal of Economics 121(3):867–901.

Przeworski, Adam and Fernando Limongi. 1997. “Modernization: Theories and Facts.” WorldPolitics 49(2):155–183.

Przeworski, Adam, Michael E. Alvarez, Jose A. Cheibub and Fernando Limongi. 2000. Democracyand Development: Political Institutions and Well-Being in the World, 1950-1990. Cambridge:Cambridge University Press.

Rodrik, Dani. 1999. “Democracies Pay Higher Wages.” Quarterly Journal of Economics 114(3):707–738.

Skocpol, Theda. 1979. States and Social Revolutions: A Comparative Analysis of France, Russiaand China. Cambridge: Cambridge University Press.

Spolaore, Enrico and Romain Wacziarg. 2009. “The Diffusion of Development.” Quarterly Journalof Economics 124(2):469–529.

Su, Che-Lin and Kenneth L Judd. 2012. “Constrained Optimization Approaches to Estimation ofStructural Models.” Econometrica 80(5):2213–2230.

Svolik, Milan W. 2013. “Learning to Love Democracy: Electoral Accountability and the Successof Democracy.” American Journal of Political Science 57(3):685–702.

Svolik, Milan W. 2019. “Democracy as an equilibrium: rational choice and formal political theoryin democratization research.” Democratization 26(1):40–60.

Torfason, Magnus Thor and Paul Ingram. 2010. “The global rise of democracy: A network account.”American Sociological Review 75(3):355–377.

Treisman, Daniel. 2015. “Income, Democracy, and Leader Turnover.” American Journal of PoliticalScience 59(4):927–942.

Wejnert, Barbara. 2005. “Diffusion, Development, and Democracy, 1800-1999.” American Socio-logical Review 70(1):53–81.

Whitehead, Laurence, ed. 1996. The International Dimensions of Democratization: Europe andthe Americas. Oxford: Oxford University Press.

39

Online Appendix

A1 List of Countries

Table A1 lists all countries in our main sample used to estimate the baseline specification of

our model.

Table A1: Main Sample

Country Years Notes Country Years Notes

United States 1951-2000 Canada 1951-2000Cuba 1951-2000 Dominican Republic 1951-2000

Jamaica 1964-2000 Trinidad and Tobago 1964-2000Mexico 1951-2000 Guatemala 1951-2000

Honduras 1951-2000 El Salvador 1951-2000Nicaragua 1951-2000 Costa Rica 1951-2000Panama 1951-2000 Colombia 1951-2000

Venezuela 1951-2000 Ecuador 1951-2000Peru 1951-2000 Brazil 1951-2000

Bolivia 1951-2000 Paraguay 1951-2000Chile 1951-2000 Argentina 1951-2000

Uruguay 1951-2000 United Kingdom 1951-2000Ireland 1951-2000 Netherlands 1951-2000Belgium 1951-2000 France 1951-2000

Switzerland 1951-2000 Spain 1951-2000Portugal 1951-2000 Germany 1951-2000 Federal Republic of Germany from 1951-1990.Poland 1951-2000 Austria 1951-2000

Hungary 1951-2000 Czech Republic 1951-2000 Czechoslovakia from 1951-1992.Slovakia 1994-2000 Italy 1951-2000Albania 1951-2000 Macedonia 1993-2000Croatia 1992-2000 Yugoslavia 1951-1991

Bosnia and Herzegovina 1993-2000 Slovenia 1992-2000Greece 1951-2000 Bulgaria 1951-2000

Moldova 1992-2000 Romania 1951-2000Russia 1951-2000 U.S.S.R. from 1951-1991. Estonia 1993-2000Latvia 1992-2000 Lithuania 1992-2000

Ukraine 1992-2000 Belarus 1993-2000Armenia 1993-2000 Georgia 1993-2000

Azerbaijan 1993-2000 Finland 1951-2000Sweden 1951-2000 Norway 1951-2000

Denmark 1951-2000 Republic of Guinea-Bissau 1976-2000Equatorial Guinea 1969-2000 Gambia 1967-2000

Mali 1962-2000 Senegal 1962-2000Benin 1961-2000 Mauritania 1962-2000Niger 1962-2000 Cote D’Ivoire 1962-2000

Republic of Guinea 1960-2000 BurkinaFaso 1962-2000Liberia 1951-2000 Sierra Leone 1963-2000Ghana 1958-2000 Togo 1962-2000

Cameroon 1961-2000 Nigeria 1962-2000Gabon 1962-2000 Central African Republic 1962-2000Chad 1962-2000 Republic of the Congo 1962-2000Zaire 1962-2000 Congo-Leopoldville from 1962-1965, Democratic Republic of Congo from 1997-2000. Uganda 1964-2000Kenya 1965-2000 Tanzania 1963-2000

Burundi 1964-2000 Rwanda 1963-2000Somalia 1962-1991 Djibouti 1979-2000Ethiopia 1994-2000 Angola 1977-2000

Mozambique 1977-2000 Zambia 1966-2000Zimbabwe 1967-2000 Malawi 1966-2000

South Africa 1951-2000 Namibia 1992-2000Lesotho 1968-2000 Botswana 1968-2000

Swaziland 1970-2000 Madagascar 1962-2000Comoro Islands 1977-2000 Mauritius 1970-2000

Morocco 1958-2000 Algeria 1964-2000Tunisia 1951-2000 Libya 1953-2000Sudan 1957-2000 Iran 1951-2000Turkey 1951-2000 Iraq 1951-2000Egypt 1951-2000 Syria 1951-2000

Lebanon 1951-2000 Jordan 1951-2000Israel 1951-2000 Saudi Arabia 1951-2000

Kuwait 1952-2000 Bahrain 1973-2000Qatar 1973-2000 United Arab Emirates 1973-2000Oman 1951-2000 Afghanistan 1951-2000

Turkmenistan 1992-2000 Tajikistan 1993-2000Kyrgyzstan 1992-2000 Uzbekistan 1992-2000Kazakhstan 1992-2000 China 1951-2000

Mongolia 1951-2000 Taiwan 1951-2000North Korea 1951-2000 South Korea 1951-2000

Japan 1951-2000 India 1951-2000Pakistan 1951-2000 Bangladesh 1973-2000Burma 1951-2000 Sri Lanka 1951-2000Nepal 1951-2000 Thailand 1951-2000

Cambodia 1956-2000 Laos 1955-2000Democratic Republic of Vietnam 1951-2000 Socialist Republic of Vietnam from 1976-2000. Malaysia 1959-2000

Singapore 1961-2000 Philippines 1951-2000Indonesia 1951-2000 Australia 1951-2000

New Zealand 1951-2000

i

A2 Preliminary Evidence

To motivate our model, we estimate a pair of reduced-form regressions. The purpose is to

demonstrate two empirical patterns underpinning our structural approach: (i) democracy

adoption systematically covaries with observed differences in economic performance between

neighboring democracies and autocracies—which is suggestive of the learning process we

model—and (ii) economic growth has a differential impact on elite turnover under democracy

versus autocracy.

First, we estimate the following linear probability model via ordinary least squares:

Di,t = φ0 + φ1Di,t−1 + φ2

[yD=1i,t−1 − yD=0

i,t−1]

+ φ3Di,t−1 + ui,t, (A1)

where yD=1i,t−1 and yD=0

i,t−1 are weighted averages of past GDP per capita growth rates among

country i’s democratic and autocratic neighbors, respectively.46 Thus, yD=1i,t−1 − yD=0

i,t−1 proxies

for beliefs in country i about the impact of democracy on economic growth. We vary the size

of i’s effective neighborhood by setting the decay parameter equal to the estimated value from

our structural model (γ = 0.4234) as well as twice (2γ) and half this value (12γ). Furthermore,

we control for Di,t−1, the weighted proportion of democracies in i’s neighborhood.

If φ2 6= 0, then, consistent with our model, democracy adoption systematically covaries

with observed growth differences in democratic versus autocratic neighbors. We evaluate

this hypothesis in Table A2, which shows that, across specifications and spatial weights, the

reduced-form evidence is indeed consistent with our structural estimates: as φ2 > 0, the

likelihood of democracy increases with superior economic performance in democracies.

Second, to motivate incumbents’ objective function in our model, we show that the effect

of GDP growth on their likelihood of retaining power in the subsequent period is heterogenous

46Formally, as in Buera, Monge-Naranjo and Primiceri (2011), we compute yD=1i,t−1 − yD=0

i,t−1 =t−1∑

τ=t−3

(∑j 6=i exp(−γdi,j)yjτDjτ∑j 6=i exp(−γdi,j)Djτ

−∑j 6=i exp(−γdi,j)yjτ (1−Djτ )∑j 6=i exp(−γdi,j)(1−Djτ )

), where di,j is the distance between

countries i and j.

ii

Table A2: Reduced-Form Evidence of Effect of Learning about Growth on Democracy

Dependent variable: Di,t

(I) (II) (III) (IV) (V) (VI) (VII) (VIII) (IX) (X) (XI) (XII)

Decay:12γ γ 2γ

φ1 0.854∗∗∗ 0.851∗∗∗ 0.849∗∗∗ 0.849∗∗∗ 0.854∗∗∗ 0.851∗∗∗ 0.849∗∗∗ 0.849∗∗∗ 0.853∗∗∗ 0.851∗∗∗ 0.849∗∗∗ 0.850∗∗∗

(0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007) (0.007)

φ2 0.801∗∗∗ 0.527∗∗ 0.556∗∗ 0.527∗∗ 0.419∗∗∗ 0.279∗ 0.289∗ 0.258∗ 0.214∗∗ 0.163∗ 0.161∗ 0.141(0.202) (0.217) (0.217) (0.221) (0.149) (0.154) (0.154) (0.156) (0.097) (0.098) (0.098) (0.098)

φ3 0.121∗∗∗ 0.128∗∗∗ 0.128∗∗∗ 0.133∗∗∗ 0.117∗∗∗ 0.114∗∗∗ 0.115∗∗∗ 0.115∗∗∗ 0.104∗∗∗ 0.096∗∗∗ 0.097∗∗∗ 0.096∗∗∗

(0.025) (0.025) (0.025) (0.026) (0.021) (0.021) (0.021) (0.022) (0.018) (0.018) (0.018) (0.018)

Controls:log(Yi,t−1) No Yes Yes Yes No Yes Yes Yes No Yes Yes YesTradei,t−1/Yi,t−1 No No Yes Yes No No Yes Yes No No Yes YesTime in Power No No No Yes No No No Yes No No No Yes

Observations 5,623 5,623 5,623 5,579 5,623 5,623 5,623 5,579 5,623 5,623 5,623 5,579

∗p<0.1, ∗∗p<0.05, ∗∗∗p<0.01

Notes. Ordinary least squares estimates from regression (A1). All models include country fixed effects.Standard errors in parentheses.

across democracies and autocracies, with its impact in democracies being consistently larger.

Specifically, we estimate the following linear probability model via least squares:

Pr(RetainPoweri,t+1|yi,t, Di,t) = λ0 + λ1yi,t + λ2Di,t + λ3yi,t ×Di,t. (A2)

Of course, given the selection into or out of democracy we model, estimates of λ will be

inconsistent. To address this without imposing additional structure, we use yD=1i,t−1, y

D=0i,t−1, and

Di,t−1 from (A1) to build instruments for the regressors in (A2).47 That is, in line with our

assumption that beliefs have no direct impact on outcomes (other than through institutional

choices), if we correctly measured beliefs, we could obtain consistent estimates of λ via two-

stage least squares. Since we view yD=1i,t−1 and yD=0

i,t−1 only as rough proxies for beliefs, however,

we take the instrumental-variables estimates of λ1 and λ3 with a grain of salt.

Estimates of λ1 and λ1 + λ3 measure the impact of economic growth on elite survival

under autocracy and democracy, respectively, while λ2 gives the difference in baseline survival

under democracy (the average of −fi in our structural model). Ordinary and two-stage least

47We set γ = 0.4234.

iii

Table A3: Heterogenous Impact of Growth on Incumbent Stability

Dependent variable: RetainPoweri,t+1

(I) (II) (III) (IV) (V) (VI) (VII) (VIII)

OLS 2SLS

λ1 0.336∗∗∗ 0.350∗∗∗ 0.353∗∗∗ 0.378∗∗∗ −26.391 −11.995 −13.179 −9.583(0.073) (0.073) (0.074) (0.074) (109.797) (22.195) (26.917) (15.707)

λ2 −0.081∗∗∗ −0.082∗∗∗ −0.083∗∗∗ −0.080∗∗∗ −1.161 −0.500 −0.533 −0.276(0.014) (0.014) (0.014) (0.014) (4.343) (0.731) (0.869) (0.291)

λ3 0.289∗ 0.288∗ 0.286∗ 0.245 91.411 37.182 41.071 27.540(0.168) (0.170) (0.170) (0.179) (378.318) (68.632) (83.795) (45.160)

Controls:log(Yi,t−1) No Yes Yes Yes No Yes Yes YesTradei,t−1/Yi,t−1 No No Yes Yes No No Yes YesTime in Power No No No Yes No No No Yes

Observations 5,966 5,924 5,919 5,845 5,622 5,622 5,622 5,578

∗p<0.1, ∗∗p<0.05, ∗∗∗p<0.01

Notes. Ordinary (OLS) and two-stage least squares (2SLS) estimates from regression (A2). All modelsinclude country fixed effects. Standard errors in parentheses.

squares estimates are presented in Table A3. Across specifications, we find that democracies

consistently reward incumbents for high rates of growth more than autocracies. Moreover,

the two-stage least squares results are consistent with our structural estimates in that growth

is stabilizing for incumbents under democracy but not under autocracy. These estimates are

imprecise, however, given that the instruments are only rough proxies for the learning process

we model.

A2.1 Growth in Democracies Versus Autocracies

Figure A1 illustrates the variation in the data that drives our main results by plotting

the evolution of the difference in growth rates in democracies versus autocracies. The plot

shows that, following the oil crisis, democracies experienced a markedly superior recovery.

This underlies the sharp revision of beliefs and subsequent transitions to democracy that

characterize the most consequential period in our sample.

iv

Figure A1: Difference in Average Growth Rates in Democracies Versus Autocracies

A3 Likelihood of the Data

Recall that W T ≡ {It, yt, Dt, Xt}Tt=1 denotes the set of all data available up to period T .

With a slight abuse of notation—using L to denote arbitrary densities of the data—the

likelihood function can be written as

L(W T |ϕ) =T∏t=1

L(Wt|W t−1, ϕ),

where Wt ≡ {It, yt, Dt, Xt} collects the data generated in period t. As discussed in the paper,

we assume that observed outcomes are only affected by actual choices and not by the beliefs

that led to those choices. That is, transitions of power (It), GDP growth (yt), and other

economic and political characteristics of countries (Xt) are shaped by realized institutions

(Dt), but they are not directly affected by beliefs about the potential effects of transitioning

v

into or out of democracy. Formally, this assumption allows us to write

L(Wt|W t−1, ϕ) = L(It, yt, Dt, Xt|W t−1, ϕ)

= L(It|yt, Dt, Xt,Wt−1)L(yt|Dt, Xt,W

t−1) · · ·

L(Dt|Xt,Wt−1, ϕ)L(Xt|W t−1),

which implies that L(W T |ϕ) ∝∏T

t=1 L(Dt|Xt,Wt−1, ϕ).

To compute L(Dt|Xt,Wt−1, ϕ), notice from (5) that, given (Xi,t,W

t−1, ϕ), there is a

threshold value of κi,t—the realized shock in period t to the political cost of democracy in

country i—such that Di,t = 1 if and only if κi,t falls below the threshold. This threshold

value, denoted κi,t(Xi,t,Wt−1, ϕ), is defined implicitly by

Ei,t−1


i + εi,t)− fi −X ′i,tξ − κi,t(Xi,t,Wt−1, ϕ))

1 + exp(αi + θD=1(βD=1i + εi,t)− fi −X ′i,tξ − κi,t(Xi,t,W t−1, ϕ))

]

= Ei,t−1


i + εi,t))

1 + exp(αi + θD=0(βD=0i + εi,t))

].

(A3)

Since κi,t is distributed independently across countries, the likelihood can be written as

L(W T |ϕ) ∝T∏t=1

n∏i=1

L(Di,t|Xi,t,Wt−1, ϕ),

where

L(Di,t|Xi,t,Wt−1, ϕ) = Φ

(κi,t(Xi,t,W

t−1, ϕ)

ςi

)Di,t[1− Φ

(κi,t(Xi,t,W

t−1, ϕ)

ςi

)]1−Di,t

and Φ denotes the standard Normal cumulative distribution function.

vi

A4 Prior

We set our prior over the model parameters in Table 1 as follows. We assume that

αii.i.d.∼ N(α, ω2

α),

θD=0,1 i.i.d.∼ N(θ, ω2θ),

βD=0i,0

i.i.d.∼ N(βD=0

0, ω2

β),

βD=1i,0

i.i.d.∼ N(βD=1

0, ω2

β),

vii.i.d.∼ IG(sv, dv),

fii.i.d.∼ N(f, ω2

f ),

ςii.i.d.∼ IG(sς , dς),

γi.i.d.∼ Uniform,

ξi.i.d.∼ Uniform,

where IG(s, d) denotes the Inverse-Gamma distribution with shape parameter s and scale

parameter d. We calibrate our prior using pre-sample data from 1875-1950 (excluding the

two world wars):

• We set βD=0

0= 0.0180 and βD=1

0= 0.0218, the average annual growth rates among

autocracies and democracies, respectively, in the pre-sample period. We then set ωβ =

0.02, allowing for considerable uncertainty about the mean of initial beliefs.

• We select sv = 3 and dv = 0.7423 so that the prior mean and standard deviation of viσi

equal the standard deviation of average growth rates, yi, in the pre-sample period. A

pre-sample estimate of the mean of σi (equal to 0.0531) is obtained from the residuals

of a regression of GDP growth on country and time fixed effects. We then set the prior

mean of vi equal to√

Var(yi)/0.0531 =√

0.0004/0.0531 = 0.3711.

• We set θ = 0 to adopt an agnostic starting point about whether GDP growth has a

vii

stabilizing or destabilizing effect on elite turnover across systems of government, and

we normalize ωθ = 1.

• To adopt an agnostic starting point regarding the political cost of democracy, we set

f = 0. We describe our choice of ωf below.

• To ensure prior belief correlations between 0 and 1, we adopt a flat (improper) prior

over γ ≥ 0. For the political cost of democracy, we center the variables in Xi,t around

their sample means so that Ki,t has an expected value of zero (in line with our agnostic

view of fi), and we adopt a flat (improper) prior over ξ.

• Letting θyi and KDi denote the within-country means of θD=Di,tyi,t and Ki,tDi,t, re-

spectively, and noting that αi+θyi−DKi approximately equals the log-odds of staying

in power in country i, we select α, ωα, and ωf to match the first two moments of these

log-odds across countries in the pre-sample period. Since E(αi + θyi −DKi) = α, we

set α = 1.8432, the average log-odds in the pre-sample period.48 Noting that the vari-

ance of the log-odds among autocracies is approximately equal to ω2α + Var(yi), while

the variance among democracies is approximately equal to ω2α + Var(yi) + ω2

f , we set

ω2α + 0.0005 = 0.722 and ω2

α + 0.0007 + ω2f = 1.0802, so ωα = 0.8494 and ωf = 0.5984.

• Finally, to discourage the model from fitting the data with large (absolute) realizations

of the unobserved political cost shock κi,t, we set sς = 3 and dς = 0.2992 so that ςi has

a prior mean and standard deviation of ωf/4 = 0.1496.

A5 Maximum-A-Posteriori Estimator: MPEC Approach

As noted in the paper, calculating κi,t(Xi,t,Wt−1, ϕ) from (A3) to evaluate the likelihood of

the data is computationally expensive. To avoid this burden, we follow the Mathematical

48For countries that experienced no elite turnover in the pre-sample period, we limit the probability ofstaying in power to equal the maximum among countries with turnover (95%).

viii

Programming with Equilibrium Constraints (MPEC) approach of Su and Judd (2012) to

compute our maximum-a-posteriori (MAP) estimator of ϕ. The idea behind this approach

is simple: instead of calculating κi,t(Xi,t,Wt−1, ϕ) at every trial value of ϕ, one treats each κi,t

as an auxiliary parameter and imposes (A3)—the optimality (or equilibrium) condition of the

model—as a feasibility constraint on the log-posterior maximization program. Accordingly,

we estimate ϕ by solving maxϕ,κ log(p(ϕ,κ|W T )

)subject to the constraint that κi,t satisfies

(A3) for all i and t.

As shown by Su and Judd (2012), MPEC and the standard approach of directly maxi-

mizing log(p(ϕ|W T )

)yield theoretically-identical estimates of ϕ. Computationally, MPEC’s

advantage arises from the fact that modern optimization algorithms do not enforce con-

straints until the final iteration of the search process. Thus, the computationally expensive

condition (A3) is satisfied exactly only once rather than at every trial value of ϕ. More-

over, for this reason, MPEC is robust to sensitivity issues that may arise from not setting

a sufficiently stringent convergence criterion when computing κi,t(Xi,t,Wt−1, ϕ) (Dube, Fox

and Su, 2012). A potential disadvantage is that, by introducing κ as additional parameters,

MPEC increases the size of the optimization problem. However, this concern is mitigated

by the sparsity that results from each auxiliary parameter κi,t entering a single constraint.

To reap the computational benefits of the MPEC approach, it is essential to employ op-

timization software tailor-made to handle large-scale problems—with thousands of variables

and nonlinear constraints. Accordingly, we implement our MPEC-MAP estimator using the

industry-leading software Knitro.49 Due to memory and computational constraints—our

baseline model with no covariates features 8,459 variables—our implementation relies on

Knitro’s Interior/Direct algorithm with their limited-memory quasi-Newton BFGS approx-

imation of the Hessian of the Lagrangian. Nevertheless, we provide exact first derivatives

of the log-posterior and constraints.50 With a 3.0 GHz machine, it takes about 5-6 days to

49https://www.artelys.com/en/optimization-tools/knitro50Knitro offers a derivative-check option—which our implementation passes—to test the code for exact

derivatives against finite-difference approximations.

ix

https://www.artelys.com/en/optimization-tools/knitro

estimate our model once.

To mitigate concerns about potential local maxima, for each model specification we ran-

domly draw 5 sets of starting values for the optimization algorithm from the prior distribution

of the model parameters described in Appendix A4. We then select, for each specification,

the solution that achieves the highest log-posterior value. Reassuringly, there is very little

divergence in solutions across starting values.

Standard errors for our parameter estimates are parametrically bootstrapped (Davison

and Hinkley, 1997). Due to the considerable computational cost of estimating our model,

we only compute standard errors for our baseline specification with no covariates. This has

the added advantage that only estimates from the true DGP described in Footnote 36 are

necessary to generate bootstrap samples.

A final notable computational challenge is that the integrals in (A3) have no closed-

form solution. Rather than employing a Monte Carlo approximation, which would require

independent draws across all i and t to prevent simulation error from propagating, we rely

on sparse-grid integration as implemented by Heiss and Winschel (2008). This approach is

much more efficient and delivers virtually exact integral computations for integrands that

are well approximated by polynomials—as is the case for the integrals in (A3).51

A6 Coefficient Estimates: Baseline Model

Table 1 lists and describes all the parameters of our model. In our baseline specification

with no covariates, the vector ξ is empty. We report estimates of θ and γ in Footnotes 31

and 34, respectively, and estimates of fi for each country in Figure 3. Table A4 presents all

remaining coefficient estimates for our baseline specification (standard errors in parentheses).

51Our implementation computes exact integrals of fifteenth-degree polynomials.

x

Table A4: Estimates of Country-Specific Parameters αi, βD=0i,0 , βD=1

i,0 , vi, and ςi

Country αi βD=0i,0 βD=1

i,0 vi ςi Country αi βD=0i,0 βD=1

i,0 vi ςi

USA 1.8147 0.0189 0.0284 0.1902 0.0728 CAN 1.8000 0.0191 0.0249 0.1823 0.0723(0.0487) (0.0198) (0.0176) (0.1953) (0.3791) (0.0521) (0.0209) (0.0193) (0.1956) (0.3637)

CUB 1.7449 0.0422 0.0431 0.5696 0.0469 DOM 1.9302 0.0020 0.0005 0.6220 0.0442(0.0743) (0.0327) (0.0320) (0.2930) (0.2320) (0.1091) (0.0436) (0.0359) (0.3591) (0.4087)

JAM 1.8361 0.0185 0.0313 0.5088 0.0737 TRI 1.7974 0.0192 0.0165 0.1946 0.0722(0.0877) (0.0400) (0.0346) (0.2522) (0.3667) (0.0194) (0.0268) (0.0247) (0.1860) (0.3887)

MEX 1.8773 0.0126 0.0206 0.1913 0.0705 GUA 1.9023 0.0453 0.0386 0.2184 0.0598(0.0705) (0.0331) (0.0285) (0.1876) (0.1898) (0.0645) (0.0759) (0.0649) (0.1662) (0.1646)

HON 1.8249 0.0008 0.0082 0.2591 0.0585 SAL 1.8631 -0.0085 -0.0154 0.4512 0.0712(0.0672) (0.0346) (0.0312) (0.1910) (0.1484) (0.3185) (0.0687) (0.0635) (0.2657) (0.2856)

NIC 1.8652 0.0154 0.0264 0.1166 0.0278 COS 1.7512 0.0197 0.0142 0.2037 0.0715(0.0408) (0.0398) (0.0353) (0.1565) (0.1964) (0.0748) (0.0699) (0.0576) (0.1898) (0.3795)

PAN 1.8437 0.0426 0.0089 0.1317 0.1102 COL 2.5650 0.0072 0.0046 1.4453 0.0723(0.0118) (0.0285) (0.0277) (0.1404) (0.3525) (0.4678) (0.1286) (0.1070) (0.4873) (0.4519)

VEN 2.3228 0.0123 0.0106 1.4937 0.0531 ECU 1.4737 0.0385 0.0295 0.4143 0.0423(0.3398) (0.0835) (0.0641) (0.4937) (0.4106) (0.0809) (0.1467) (0.1283) (0.2001) (0.1520)

PER 1.8327 0.0300 0.0037 0.2171 0.0917 BRA 1.8211 0.0214 0.0030 0.1191 0.1059(0.0450) (0.0428) (0.0365) (0.1901) (0.2221) (0.0328) (0.0305) (0.0295) (0.1557) (0.2135)

BOL 1.9907 0.0068 0.0099 1.6876 0.0612 PAR 1.8948 0.0362 0.0208 0.2246 0.0735(2.3220) (0.4624) (0.3649) (0.3540) (0.7489) (0.0795) (0.0757) (0.0645) (0.1882) (0.3755)

CHL 1.4128 0.0289 0.0285 0.5013 0.0357 ARG 1.8439 0.0057 0.0356 0.0735 0.1003(0.0817) (0.0996) (0.0840) (0.2114) (0.2287) (0.0497) (0.0545) (0.0481) (0.1578) (0.2731)

URU 1.8164 0.0322 0.0077 0.2035 0.0365 UKG 1.7922 0.0191 0.0244 0.1782 0.0724(0.1215) (0.0647) (0.0553) (0.1432) (0.1799) (0.0523) (0.0182) (0.0141) (0.1726) (0.3912)

IRE 1.8277 0.0186 0.0243 0.1729 0.0734 NTH 1.6578 0.0202 0.0242 0.1883 0.0711(0.0603) (0.0134) (0.0113) (0.1723) (0.3845) (0.0928) (0.0376) (0.0304) (0.2439) (0.3671)

BEL 1.8362 0.0185 0.0203 0.1847 0.0736 FRN 1.7776 0.0192 0.0260 0.1702 0.0724(0.0610) (0.0259) (0.0218) (0.2549) (0.3996) (0.0623) (0.0282) (0.0232) (0.1777) (0.3995)

SWZ 1.8035 0.0190 0.0250 0.1680 0.0727 SPN 1.8729 0.0161 -0.0132 0.4770 0.0422(0.0418) (0.0208) (0.0164) (0.1565) (0.4014) (0.1007) (0.0575) (0.0466) (0.3326) (0.3714)

POR 1.8777 0.0124 -0.0022 0.3360 0.0505 GMY 1.8419 0.0182 0.0285 0.1457 0.0745(0.1023) (0.0748) (0.0608) (0.2820) (0.3306) (0.0362) (0.0376) (0.0328) (0.2375) (0.3980)

POL 1.9991 0.0175 0.0093 1.0647 0.0546 AUS 1.8432 0.0180 0.0740 0.1156 0.0748(1.5958) (0.4168) (0.3400) (0.3578) (0.5581) (0.0139) (0.0748) (0.0641) (0.2030) (0.3932)

HUN 2.3984 0.0179 0.0195 2.0474 0.0505 CZR 1.8709 -0.0005 -0.0157 0.3155 0.0421(0.4215) (0.1945) (0.1690) (0.7520) (0.2212) (0.0882) (0.0173) (0.0183) (0.2906) (0.1772)

SLO 1.7086 0.0197 0.0244 0.2122 0.0725 ITA 1.8411 0.0182 0.0272 0.1694 0.0743(0.0224) (0.0158) (0.0131) (0.2786) (0.4015) (0.0218) (0.0096) (0.0090) (0.1807) (0.3986)

ALB 1.7750 0.0006 0.0240 0.2592 0.0593 MAC 1.7758 0.0193 0.0222 0.2043 0.0724(0.0518) (0.0172) (0.0132) (0.2334) (0.1494) (0.0143) (0.0089) (0.0088) (0.2129) (0.4087)

CRO 1.8592 0.0180 0.0191 0.2071 0.0644 YUG 1.9095 0.0319 0.0188 0.2142 0.0712(0.0130) (0.0179) (0.0161) (0.2801) (0.2775) (0.0211) (0.0361) (0.0281) (0.2360) (0.4035)

BOS 1.8004 0.0190 0.0230 0.3225 0.0730 SLV 1.5727 0.0205 0.0249 0.2008 0.0718(0.0187) (0.0255) (0.0205) (0.2473) (0.4074) (0.0410) (0.0134) (0.0107) (0.1868) (0.4094)

GRC 1.5500 0.0468 0.0367 0.6486 0.0410 BUL 2.2611 0.0074 0.0222 1.0768 0.0579(0.0769) (0.0774) (0.0720) (0.2902) (0.3054) (0.3696) (0.3163) (0.2659) (0.3649) (0.2674)

MLD 1.7984 0.0191 0.0236 0.1817 0.0727 RUM 2.4243 0.0186 0.0149 6.6647 0.0651(0.0172) (0.0507) (0.0414) (0.1988) (0.3884) (3.2608) (0.1847) (0.0889) (2.0319) (0.6177)

RUS 1.8535 0.0186 0.0146 0.1078 0.1104 EST 1.8239 0.0186 0.0231 0.1796 0.0735(0.0115) (0.0256) (0.0204) (0.1631) (0.2625) (0.0068) (0.0102) (0.0109) (0.2519) (0.4018)

xi

Table A4 (continued)



i,0 vi ςi

LAT 1.8201 0.0184 0.0204 0.2166 0.0655 LIT 1.7210 0.0196 0.0254 0.2098 0.0722(0.0207) (0.0094) (0.0084) (0.2520) (0.2467) (0.0281) (0.0124) (0.0123) (0.2735) (0.3954)

UKR 1.7982 0.0191 0.0240 0.1907 0.0728 BLR 1.8559 0.0187 0.0204 0.1672 0.0750(0.0149) (0.0122) (0.0118) (0.2818) (0.4079) (0.0058) (0.0185) (0.0162) (0.2341) (0.2946)

ARM 1.8847 0.0165 0.0203 0.1667 0.0730 GRG 1.8432 0.0183 0.0218 0.1925 0.0748(0.0096) (0.0064) (0.0062) (0.2125) (0.3988) (0.0258) (0.0211) (0.0154) (0.2378) (0.4042)

AZE 1.8470 0.0183 0.0217 0.2031 0.0747 FIN 1.7190 0.0197 0.0279 0.1797 0.0713(0.0237) (0.0194) (0.0138) (0.2389) (0.4047) (0.0298) (0.0241) (0.0182) (0.2173) (0.3992)

SWD 1.8431 0.0181 0.0348 0.1434 0.0748 NOR 1.8031 0.0190 0.0222 0.1954 0.0726(0.0191) (0.0102) (0.0099) (0.1754) (0.3970) (0.0701) (0.0256) (0.0208) (0.1995) (0.3838)

DEN 1.6931 0.0200 0.0242 0.2011 0.0713 GNB 1.8194 -0.0055 -0.0042 0.5045 0.0472(0.0786) (0.0494) (0.0404) (0.2254) (0.3822) (0.0177) (0.0130) (0.0144) (0.2605) (0.2337)

EQG 1.9776 0.0191 0.0187 0.2287 0.0722 GAM 1.5020 0.0155 0.0217 0.6286 0.0586(0.1098) (0.0786) (0.0633) (0.2426) (0.4031) (0.0847) (0.0472) (0.0391) (0.2995) (0.1793)

MLI 1.8795 0.0060 0.0008 0.5336 0.0453 SEN 1.9353 0.0363 0.0207 0.1905 0.1028(0.0524) (0.0145) (0.0136) (0.2888) (0.1921) (0.0557) (0.0161) (0.0110) (0.1466) (0.2668)

BEN 1.8362 0.0025 -0.0126 0.3807 0.0561 MAA 1.9058 0.0188 0.0195 0.1744 0.0720(0.0667) (0.0559) (0.0472) (0.2814) (0.1693) (0.0336) (0.0238) (0.0183) (0.1920) (0.4051)

NIR 1.8202 0.0196 0.0266 0.1989 0.0727 CDI 1.8757 0.0190 0.0182 0.1511 0.0708(0.0626) (0.0709) (0.0587) (0.2127) (0.2148) (0.0402) (0.0483) (0.0389) (0.1858) (0.3991)

GUI 1.8766 0.0162 0.0188 0.1989 0.0714 BFO 1.8553 0.0160 0.0175 0.2194 0.0704(0.0273) (0.0144) (0.0118) (0.1909) (0.3986) (0.0269) (0.0140) (0.0094) (0.2435) (0.3962)

LBR 1.8596 0.0028 0.0185 0.1369 0.0709 SIE 1.8233 0.0373 0.0505 0.4757 0.0479(0.0295) (0.0261) (0.0214) (0.2099) (0.4058) (0.1108) (0.0493) (0.0399) (0.2757) (0.2199)

GHA 1.8585 0.0139 0.0084 0.1360 0.0512 TOG 1.9611 0.0276 0.0189 0.1611 0.0721(0.0289) (0.0179) (0.0135) (0.1645) (0.2214) (0.1153) (0.0640) (0.0552) (0.1741) (0.3625)

CAO 1.8599 0.0171 0.0177 0.1792 0.0704 NIG 1.8737 0.0237 0.0192 0.0982 0.1973(0.0206) (0.0184) (0.0154) (0.2056) (0.4004) (0.0525) (0.0230) (0.0201) (0.1822) (0.2475)

GAB 1.9154 0.0159 0.0203 0.1530 0.0732 CEN 1.7973 0.0129 0.0257 0.4532 0.0517(0.0657) (0.0711) (0.0589) (0.1964) (0.3965) (0.1195) (0.0652) (0.0528) (0.2711) (0.2460)

CHA 1.8823 0.0189 0.0212 0.1889 0.0740 CON 1.8664 0.0243 0.0212 0.1445 0.0976(0.0629) (0.0340) (0.0272) (0.1768) (0.3713) (0.0409) (0.0342) (0.0288) (0.1896) (0.1751)

DRC 1.8770 0.0176 0.0185 0.1853 0.0710 UGA 1.8465 0.0165 0.0206 0.1186 0.0982(0.0428) (0.0226) (0.0170) (0.2403) (0.4070) (0.0343) (0.0188) (0.0148) (0.1935) (0.2448)

KEN 1.8534 0.0046 0.0181 0.1584 0.0706 TAZ 1.9018 0.0176 0.0204 0.2016 0.0732(0.0065) (0.0101) (0.0076) (0.2181) (0.4083) (0.0418) (0.0177) (0.0130) (0.2054) (0.3987)

BUI 1.9025 0.0173 0.0193 0.1830 0.0720 RWA 1.8749 0.0185 0.0213 0.2281 0.0741(0.0273) (0.0189) (0.0144) (0.2132) (0.4052) (0.0514) (0.0516) (0.0420) (0.2277) (0.4004)

SOM 1.8143 0.0390 0.0384 0.4018 0.0447 DJI 1.8634 0.0164 0.0183 0.1944 0.0709(0.0661) (0.0447) (0.0351) (0.2292) (0.1870) (0.0228) (0.0089) (0.0078) (0.2686) (0.3988)

ETH 1.8853 0.0175 0.0205 0.1796 0.0733 ANG 1.8744 0.0174 0.0213 0.1840 0.0742(0.0108) (0.0065) (0.0051) (0.2619) (0.4026) (0.0438) (0.0322) (0.0251) (0.1923) (0.3917)

MZM 1.8505 -0.0069 -0.0098 0.2624 0.0446 ZAM 1.9325 0.0172 0.0195 0.1916 0.0727(0.0272) (0.0494) (0.0418) (0.2579) (0.1588) (0.0525) (0.0288) (0.0229) (0.1848) (0.4029)

ZIM 1.9480 0.0205 0.0192 0.1595 0.0724 MAW 1.8417 0.0161 0.0196 0.3374 0.0565(0.0624) (0.0286) (0.0236) (0.1429) (0.3932) (0.0879) (0.0912) (0.0756) (0.2880) (0.2650)

SAF 1.8835 0.0143 -0.0226 0.9136 0.0475 NAM 1.9146 0.0191 0.0204 0.1937 0.0734(0.0427) (0.0511) (0.0448) (0.3972) (0.2298) (0.0398) (0.0261) (0.0206) (0.2140) (0.4067)

LES 2.0190 0.0179 0.0187 0.1879 0.0722 BOT 1.8295 0.0189 0.0053 0.1126 0.0727(0.1155) (0.0834) (0.0691) (0.2240) (0.4024) (0.0359) (0.0209) (0.0218) (0.1892) (0.4041)

SWA 2.0060 0.0293 0.0192 0.1458 0.0726 MAG 1.8638 0.0151 0.0024 0.3599 0.0522(0.1293) (0.0495) (0.0437) (0.1803) (0.4026) (0.0818) (0.0482) (0.0395) (0.2580) (0.1413)

COM 1.8525 0.0464 0.0167 0.1309 0.0695 MAS 1.8395 0.0183 0.0282 0.1617 0.0742(0.0265) (0.0417) (0.0341) (0.1620) (0.4078) (0.0504) (0.0157) (0.0172) (0.1598) (0.4021)

MOR 1.9467 0.0251 0.0179 0.1848 0.0709 ALG 1.8899 0.0245 0.0172 0.1966 0.0702(0.0897) (0.0433) (0.0347) (0.2236) (0.4046) (0.0492) (0.0448) (0.0365) (0.1958) (0.4080)

xii

Table A4 (continued)



i,0 vi ςi

TUN 1.8613 0.0144 0.0170 0.2899 0.0702 LIB 1.8824 0.0112 0.0185 0.0907 0.0709(0.0253) (0.0323) (0.0258) (0.2818) (0.4061) (0.0239) (0.0120) (0.0095) (0.1662) (0.4058)

SUD 1.8564 0.0204 0.0221 0.1877 0.0834 IRN 1.8658 0.0214 0.0177 0.2123 0.0702(0.0278) (0.0201) (0.0164) (0.2124) (0.1882) (0.0157) (0.0155) (0.0115) (0.2701) (0.3650)

TUR 1.8914 0.0019 -0.0004 0.2253 0.0524 IRQ 1.9082 0.0198 0.0189 0.1784 0.0714(0.0894) (0.0825) (0.0664) (0.2256) (0.2125) (0.0533) (0.0229) (0.0176) (0.2391) (0.4042)

EGY 1.8520 0.0023 0.0172 0.2901 0.0698 SYR 2.1083 0.0087 0.0170 0.3711 0.0707(0.0465) (0.0345) (0.0287) (0.2560) (0.4075) (0.2113) (0.0841) (0.0717) (0.3321) (0.3848)

LEB 1.8962 0.0266 0.0236 0.2089 0.0402 JOR 1.9433 0.0198 0.0191 0.1241 0.0716(0.0939) (0.0622) (0.0552) (0.2392) (0.2457) (0.0733) (0.0393) (0.0329) (0.1834) (0.4037)

ISR 1.8391 0.0183 0.0182 0.1804 0.0740 SAU 1.8854 0.0180 0.0209 0.1885 0.0736(0.0429) (0.0134) (0.0152) (0.1730) (0.3986) (0.0808) (0.0625) (0.0502) (0.2255) (0.3956)

KUW 1.8888 0.0182 0.0192 0.1997 0.0718 BAH 1.8733 0.0063 0.0196 0.1443 0.0723(0.0980) (0.0414) (0.0321) (0.2272) (0.4032) (0.0156) (0.0364) (0.0259) (0.1757) (0.4062)

QAT 1.8708 0.0232 0.0212 0.2552 0.0741 UAE 1.8746 0.0185 0.0189 0.2131 0.0718(0.1227) (0.0518) (0.0411) (0.3129) (0.3987) (0.0445) (0.0203) (0.0133) (0.2240) (0.3665)

OMA 1.8797 0.0193 0.0182 0.1693 0.0707 AFG 1.9012 0.0273 0.0202 0.1375 0.0726(0.0282) (0.0162) (0.0146) (0.2570) (0.4062) (0.0563) (0.0269) (0.0213) (0.1821) (0.3988)

TKM 1.8841 0.0191 0.0199 0.2080 0.0729 TAJ 1.8431 0.0228 0.0218 0.3066 0.0748(0.0212) (0.0156) (0.0113) (0.2607) (0.4040) (0.0487) (0.0321) (0.0247) (0.2275) (0.4002)

KYR 1.8431 0.0171 0.0218 0.1955 0.0748 UZB 1.8608 0.0129 0.0214 0.1608 0.0743(0.0367) (0.0404) (0.0330) (0.2160) (0.4039) (0.0210) (0.0183) (0.0134) (0.2176) (0.4055)

KZK 1.8563 0.0186 0.0215 0.1883 0.0744 CHN 2.0790 0.0168 0.0152 0.1761 0.0698(0.0329) (0.0188) (0.0139) (0.2559) (0.4052) (0.3876) (0.2475) (0.2102) (0.2163) (0.3799)

MON 1.8330 -0.0068 -0.0246 0.1608 0.0279 TAW 1.8480 0.0198 0.0117 0.1921 0.0395(0.0277) (0.0198) (0.0251) (0.2343) (0.1869) (0.0725) (0.0762) (0.0659) (0.2445) (0.1506)

PRK 1.9355 0.0179 0.0161 0.1068 0.0701 ROK 1.8577 0.0225 0.0037 0.2863 0.0733(0.2099) (0.0544) (0.0449) (0.1801) (0.4006) (0.2518) (0.0955) (0.0823) (0.2341) (0.2046)

JPN 1.7171 0.0185 0.0284 0.1619 0.0624 IND 1.6707 0.0206 0.0040 0.1502 0.0699(0.2023) (0.1375) (0.1192) (0.1430) (0.5419) (0.0310) (0.0443) (0.0362) (0.1571) (0.3696)

PAK 1.8169 0.0067 0.0321 0.3833 0.0444 BNG 1.8673 0.0297 -0.0072 1.0305 0.0455(0.0765) (0.0490) (0.0404) (0.2426) (0.1674) (0.0367) (0.0723) (0.0652) (0.3952) (0.1721)

MYA 1.6948 0.0551 0.0344 0.3119 0.0479 SRI 1.7513 0.0412 0.0328 0.2532 0.0290(0.1321) (0.1018) (0.0826) (0.2006) (0.1879) (0.1899) (0.1046) (0.0813) (0.1753) (0.2451)

NEP 1.9384 0.0090 0.0035 0.5470 0.0396 THI 1.8232 -0.0573 0.0006 0.1695 0.0413(0.0592) (0.0718) (0.0582) (0.2928) (0.1939) (0.0237) (0.1572) (0.1165) (0.3165) (0.1885)

CAM 1.8808 0.0237 0.0186 0.3581 0.0716 LAO 1.6121 0.0362 0.0316 0.3543 0.0511(0.2869) (0.1764) (0.1528) (0.2957) (0.3984) (0.6854) (0.3354) (0.2872) (0.2662) (0.2295)

DRV 1.8845 0.0136 0.0204 0.2178 0.0732 MAL 1.8437 0.0210 0.0218 0.2600 0.0748(0.3411) (0.2784) (0.2450) (0.1317) (0.3981) (0.7790) (0.3452) (0.2911) (0.2960) (0.3965)

SIN 1.8514 0.0040 0.0180 0.1237 0.0706 PHI 1.6377 0.0166 0.0257 0.2015 0.0379(0.0352) (0.0632) (0.0491) (0.1237) (0.4075) (0.1145) (0.1574) (0.1436) (0.1361) (0.2380)

INS 1.8459 0.0241 0.0185 0.2275 0.0587 AUL 1.8423 0.0183 0.0212 0.1803 0.0740(0.0790) (0.0538) (0.0504) (0.2990) (0.2023) (0.0404) (0.0722) (0.0734) (0.2334) (0.3995)

NEW 1.5256 0.0210 0.0246 0.1843 0.0696(0.0624) (0.1167) (0.0986) (0.1256) (0.3372)

xiii

A7 Additional Results

In this appendix, we describe in detail various additional results mentioned in the paper.

Model fit by world region. Figure A2 presents goodness-of-fit and out-of-sample pre-

diction results disaggregated by four regions of the world: the Americas, Europe, Africa, and

Asia-Oceania. As in Figure 1, we compare the true proportion of democracies (gray) in each

region with our learning model’s predictions, both in (solid blue) and out of sample (dashed

blue). The top panel presents results from our baseline model with no covariates, and the

bottom panel, from our model with two covariates. Notably, both specifications perform well

at this, or indeed any, level of geographic aggregation. And, while not included in Figure

A2 to avoid clutter, our learning model still significantly outperforms any alternative that

ignores the role of learning.

Direct diffusion of democracy. To evaluate the possibility that our model’s empiri-

cal success is simply an artifact of some alternative process of democratic diffusion that is

indirectly picked up by our model’s spatial and temporal flexibility, we construct a distance-

weighted measure of how democratic each country’s neighborhood is over time, and we

reestimate our baseline model using this measure as a control for direct diffusion effects on

the political cost of democracy. Specifically, we include in Xi,t the weighted average

Di,t−1 =

∑j 6=i exp(−δdi,j)Dj,t−1∑

j 6=i exp(−δdi, j),

where di,j denotes the distance between i and j’s capitals.

To reduce the computational burden, instead of estimating δ—the parameter determin-

ing the size of each country’s effective neighborhood—we consider five scenarios. For our

“medium neighborhood” scenario, we set δ equal to the estimated value of the parameter

governing the spatial decay of learning in our baseline specification. We also consider a

“smaller” and “smallest” neighborhood scenarios, where we increase the value of δ two-fold

xiv

% D

emoc

raci

es

1950 1960 1970 1980 1990 2000 2010

010

2030

Africa

1950 1960 1970 1980 1990 2000 2010

2030

40

Asia−Oceania

% D

emoc

raci

es

1950 1960 1970 1980 1990 2000 2010

5060

7080

9010

0 Europe

1950 1960 1970 1980 1990 2000 2010

4050

6070

8090

Americas

No Covariates

% D

emoc

raci

es

1950 1960 1970 1980 1990 2000 2010

010

2030

Africa

1950 1960 1970 1980 1990 2000 2010

2030

40

Asia−Oceania

% D

emoc

raci

es

1950 1960 1970 1980 1990 2000 2010

5060

7080

9010

0 Europe

1950 1960 1970 1980 1990 2000 2010

4050

6070

8090

Americas

Two Covariates

Figure A2: Observed versus Predicted Prevalence of Democracy by World Region

Notes. By world region, this figure compares the true proportion of democracies (gray) with estimatesgenerated by our learning model for both the in-sample (solid blue) and out-of-sample (dashed blue) periods.The top panel presents results using our baseline specification with no covariates. In the lower panel, wecontrol for lagged log-GDP per capita and incumbents’ time in power.

xv

and five-fold, respectively, and a “larger” and “largest” neighborhood scenarios, where we

divide δ by two and five, respectively. Table A5 presents the results of this exercise, following

the format of Table 2. We find that controlling for direct diffusion provides little increase in

predictive power, which indicates that it is our proposed mechanism of learning about the

economic effects of democracy—and not some alternative process of diffusion—what drives

our results.

Table A5: Direct Diffusion of Democracy

Smallest Neighborhood Smaller Neighborhood Medium Neighborhood Larger Neighborhood Largest NeighborhoodLearning No Learning Learning No Learning Learning No Learning Learning No Learning Learning No Learning

Choices 96.3 92.1 96.0 92.4 96.3 92.3 95.9 92.6 96.7 92.7(% correct)

Transitions(% correct)±0 years 11.6 5.4 12.4 3.9 13.2 4.7 20.2 4.7 18.6 9.3±2 years 41.9 18.6 54.3 20.9 54.3 25.6 50.4 22.5 56.6 22.5

Log-likelihood -536.6 -1,019.1 -539.7 -964.3 -511.4 -931.7 -553.6 -939.3 -481.5 -946.2

Observations 5,925 5,925 5,925 5,925 5,925 5,925 5,925 5,925 5,925 5,925

Notes. From left to right, respectively, models in each “neighborhood” scenario control for direct diffusioneffects on the political cost of democracy using distance weights δ = 2.5, δ = 1, δ = 0.5, δ = 0.25, andδ = 0.1. For each model, we report the percentage of correctly predicted in-sample system of governmentchoices (first row). We similarly report the percentage of correctly predicted transitions to or from democracywithin a 0-year window (second row) and a 5-year window (third row) of the event.

Robustness to alternative measure of democracy and time frame. Next, we explore

the sensitivity of our results to (i) our preferred measure of democracy and (ii) the time frame.

First, as noted in the paper, various alternative measures of democracy have been em-

ployed in the literature. To test the robustness of our results to this feature of the data, we

take Acemoglu et al.’s (2019) preferred measure and reestimate our baseline model.52 Since

the BMR coding is more comprehensive than this alternative measure, to keep the results

as comparable as possible—in particular, to avoid having to modify the time span of the

sample—we use the BMR coding to fill any gaps in Acemoglu et al.’s (2019) data.

Second, to address potential concerns about the myopia of incumbents in our model and

whether an annual time frame is appropriate to study changes in systems of government, we

52To that end, we also reestimate the true DGP (see Footnote 36) to obtain a new estimate of Σ.

xvi

estimate a five-year panel version of our baseline model (and true DGP). Following Acemoglu

et al. (2008), we take the observation of democracy every fifth year as our measure for the

five-year panel.

Table A6 summarizes the results of these robustness exercises, following the format of

Table 2. Our findings are virtually unchanged.

Table A6: Robustness to Democracy Measure and Time Frame

Alternative Democracy Measure Five-Year PanelLearning No Learning Learning No Learning

Choices 95.4 87.5 95.2 85.5(% correct)

Transitions(% correct)±0 years 12.5 0.0±2 years 46.1 0.0 46.6 0.0

Log-likelihood -647.9 -1,555.4 -160.4 -388.1

Observations 5,925 5,925 1,437 1,437

Notes. Models in the first and second columns are estimated using Acemoglu et al.’s (2019) preferred measureof democracy with no covariates. Models in the last two columns are estimated using a five-year panel versionof our data with no covariates. For each model, we report the percentage of correctly predicted in-samplesystem of government choices (first row) and the percentage of correctly predicted transitions to or fromdemocracy within a 0-year window (second row) and a 5-year window (third row) of the event.

Robustness to alternative measures of similarity. While geographic distance is highly

correlated with various dimensions of similarity across countries, we present in Table A7

results from an alternative version of our baseline specification in which we allow the cross-

country correlation in initial beliefs to also depend on genetic distance—as measured by

Spolaore and Wacziarg (2009)—and on economic distance in terms of initial levels of devel-

opment. For the latter, we use a linear trend to impute values of GDP per capita in 1950

for countries without such observations. We then compute the economic distance between

countries i and j as (Yi0 − Yj0)2/Var(Y0). Our results are identical.

xvii

Table A7: Robustness to Similarity Measures

Learning

Choices 95.2(% correct)

Transitions(% correct)±0 years 9.3±2 years 41.1

Log-likelihood -581.4

Observations 5,925

Notes. Results are from a specification with no covariates for the political cost of democracy but threemeasures of similarity to capture cross-country correlation in initial beliefs: geographic distance, geneticdistance, and economic distance. We report the percentage of correctly predicted in-sample system ofgovernment choices (first row) and the percentage of correctly predicted transitions to or from democracywithin a 0-year window (second row) and a 5-year window (third row) of the event.

Income and leader turnover. While we conceive elites broadly as the political party or

faction in power in each country rather than as individual leaders, previous work has high-

lighted the influence individual leader exit can have on democratic transitions. To explore

this, we estimate an alternative specification of our model that uses, as in Treisman (2015),

(lagged) log-GDP per capita, (lagged) leader exit, and their interaction to characterize the

political cost of democracy. As shown in Table A8, and consistent with our main results,

accounting for learning dwarfs the explanatory benefits from such a specification.

Heterogeneous effects of growth on elite turnover. Finally, we explore whether there

is heterogeneity across countries in θ = (θD=0, θD=1), the effects of GDP growth on elite

turnover. Specifically, we estimate a version of our model where we allow θ to vary by region

of the world as in Figure A2. Table A9 reports the corresponding point estimates, and

Table A10 summarizes goodness of fit. Our results provide little evidence of substantively

important heterogeneity.

xviii

Table A8: Income, Leader Turnover, and Democratization

Learning No Learning

Choices 95.8 90.5(% correct)

Transitions(% correct)±0 years 13.3 7.0±2 years 50.0 21.1

Log-likelihood -557.6 -1,150.9

Observations 5,882 5,882

Notes. Results are from a specification that controls for (lagged) log-GDP per capita, (lagged) leaderturnover, and their interaction as in Treisman (2015). We report the percentage of correctly predicted in-sample system of government choices (first row) and the percentage of correctly predicted transitions to orfrom democracy within a 0-year window (second row) and a 5-year window (third row) of the event.

Table A9: Effects of Economic Growth on Elite Turnover by World Region

Region θD=0 θD=1

Africa -1.6859 5.0174Asia-Oceania -4.0284 5.7846

Europe -1.7962 4.6478Americas -1.9665 5.6278

xix

Table A10: Heterogeneous Effects of GDP Growth on Elite Turnover

Learning No Learning

Choices 95.3 88.9(% correct)

Transitions(% correct)±0 years 7.0 0.0±2 years 36.4 0.0

Log-likelihood -614.3 -1,390.8

Observations 5,925 5,925

Notes. Results are from a specification where we allow the effects of GDP growth on elite turnover to varyby world region (Africa, Asia-Oceania, Europe, Americas). We report the percentage of correctly predictedin-sample system of government choices (first row) and the percentage of correctly predicted transitions toor from democracy within a 0-year window (second row) and a 5-year window (third row) of the event.

xx

learning about growth and democracy · a learning model of democratization elites, beliefs, and...

Documents