The Impact of Venture Capital on the Life Cycles of Startups

Yun Ling∗

September 6, 2016

New Version Coming Soon

Abstract

How do VCs select startups to fund over multiple rounds? To study this

question, I develop a dynamic two-sided matching model of VC funding. Using a

hand-collected database including both VC-funded and non-VC-funded startups,

I estimate the joint determinants of investment selection and the effects of post-

investment influence in a Bayesian framework. The results show that selection

depends on startups’ quality and VCs’ influence – VCs may choose to invest in

a startup with lower quality if their subsequent impact is large. Importantly,

previously funded startups are of higher quality and thus are more likely to get

additional funding. A simulation experiment shows that initial random differ-

ences in startups can magnify significantly under the joint effects of the selection

and influence of VC funding.

Keywords: Venture Capital (VC), Investment, Startups, Initial Public Offering

(IPO), Merger and Acquisition (MA), Dynamic Model, Bayesian Estimation

JEL Classification Numbers: C22, C51, C80, G11, G24, G34, M13

∗The author is from the University of Southern California, Marshall School of Business, 3670 Trousdale Parkway, Suite 308, Los Angeles, CA 90089, Email at: [email protected]. I would like to thank Gordon Phillips, Arthur Korteweg, John Matsusaka, Kenneth Ahern, Christopher Jones, Kevin Murphy, David Mauer, Fernando Zapatero and seminar participants at the University of Southern California for helpful comments and suggestions.

1. Introduction

Venture capital (VC) is essential for early business financing. It is widely believed that

venture capital firms (VCs) have skills in selecting startups1 with great potential and have

profound impact on their subsequent growth. Existing research highlights VCs’ role in

providing value-added services at the post-investment stage (e.g. Sahlman (1990), Lerner

(1995), Hellmann and Puri (2002), Baum and Silverman (2004), Cumming, Fleming and

Suchard (2005), Sørensen (2007), and Bottazzi et al. (2008)). Yet, the actual investment

decision, which requires more of a VC’s skill at the pre-investment stage, is underexplored.

Most importantly, the literature largely ignores the dynamic nature of VCs’ investments,

instead treating the funding decision as a one-shot game.

In this paper, I explore how VCs select startups to fund over time, considering both their

past and future influence on the startups. In particular, I extend a static two-sided matching

model (e.g. Sørensen 2007) to a dynamic setting that involves multiple funding rounds.2

The dynamic setting gives two new insights. First, selection and influence of VC funding are

directly linked over time. Selection depends on startups’ quality3 and VCs’ influence – VCs

may choose to invest in a startup with lower quality if their subsequent impact is expected

to be large. Second, VCs learn from past funding rounds. Thus, previously funded startups

are more likely to get additional funding.

I perform a joint estimation for the determinants of the investment decision and the

effects of post-investment influence. The estimation strategy exploits the implications of

funding selection to identify VCs’ impact. A given startup is less preferred if other startups

are of better quality or expected to benefit more from VCs’ impact. Note that the features

1 A startup is an entrepreneurial firm, before it goes public, gets acquired, or goes out of business.

2 In Sørensen 2007, a two-sided matching model is used and calibrated to separate the effects of selection and influence of VC funding.

3 A startup’s quality is defined as the cumulative return of the startup. It is the logarithm of the unobservable value of the startup. It is also the continuously-compounded return of investing $1 in the startup since inception.

of the other startups are independent of VCs’ impact on the given startup per se. Thus,

they provide exogenous variations to the selection of the given startup that identify the VCs’

influence. As a result, the given startup is either funded by a worse VC syndicate4 or not

funded at all. In the first case, the exogenous variations identify the differential impacts of

VCs across the subsample of funded startups. In the second case, the given startup changes

from being funded to being unfunded. The exogenous variations identify the overall impact

of VCs through a comparison between VC-funded startups and non-VC-funded startups.

In order to construct a control group to estimate VCs’ impact, I hand-collect a novel

database that includes both VC-funded and non-VC-funded startups. The database is from

a leading startup platform called Crunchbase. It provides information for over 290,000

companies (e.g. startups, VCs, incubators, accelerators, etc.) and 310,000 individuals (e.g.

entrepreneurs, venture capitalists, angel investors, etc.) across 176 countries. My final

sample covers 9,303 startups and 2,844 VC syndicates from 1998 to 2014. I then construct

various features for startups, VC syndicates, and the pair-wise matches between them using

both company-level (e.g. office locations, products, categories5, acquisitions, investments,

websites, news, etc.) and people-level information (e.g. educational background, employment

history, founded companies, etc.).

Following funding, the impact of VCs’ investment on a startup’s growth comes from two

sources. First, there is a direct impact that is mostly determined by VC-related charac-

teristics. For instance, funded startups exhibit more growth if they are funded by smaller

and more compatible VC syndicates whose members have cooperated before. Also, both

the presence of alumni ties and a previous funding relationship between a VC syndicate and

a startup increase the startup’s return. Second, VCs’ investments make funded startups

4 VCs usually form a group (called a VC syndicate) to provide funding to a startup. I use VC syndicate, instead of the leading VC, as the unit of investor. This is because only a few funding records contain information on which is the leading VC.

5 A startup’s category is the sub-industry for the startup that is classified by Crunchbase. A startup can have multiple categories.

less dependent on macroeconomic factors as well as on startups’ own quality. For instance, stock

market performance has a larger positive effect on the growth of unfunded startups. Also,

absent VC funding, startups rely more on the talents of their own management teams.

In contrast, VC-funded companies can resort to the funding VC syndicates for human capital

resources.

While the quality of startups incorporates the impact of VCs’ investments in the past,

it also affects VCs’ future investment decisions. A one standard deviation increase in a

startup’s cumulative return6 corresponds to a marginal increase of 6.9% in the probability

of getting funded, relative to a 50% benchmark probability. For funded startups, it also

improves the probability of a successful exit by 49.6% and reduces the probability of a

failure exit by 35.4%. For unfunded startups, the relative increase and decrease are 46.3%

and 45.2% respectively.7

In addition to startups’ quality, VCs take into account a wide range of startup- and VC-

related features to form a subjective expectation of their impact when making the funding

decision. For instance, the expected impact grows with the number of a startup’s products

but declines with the startup’s geographical and categorical span. This indicates a preference

for more productive but focused startups. Also, the expected impact is higher for more

experienced VC syndicates whose members have participated in more funding rounds in the past.

Between a VC syndicate and a startup, the geographical distance decreases the expected

value while the existence of alumni ties and a previous funding relationship increases it. This

might result from a reduction in asymmetric information and agency cost facilitated by

learning through past cooperation.

I then investigate how the expected impact correlates with the growth of funded startups

6 The cumulative return of a startup is the continuously-compounded return of investing $1 in the startup since inception. It is used interchangeably with the startup’s quality.

7 A startup exits the market of seeking venture funding if it goes public, gets acquired, or dies (goes out of business). In the case of going public and getting acquired, the exit is defined as successful; in the case of death, the exit is defined as a failure or unsuccessful.

in the future. A one standard deviation increase in the expected impact corresponds to a

0.11 standard deviation increase in the funded startups’ returns one month ahead. However,

beyond one period, the correlation is insignificant. This might be due to noise in the

imputed return data. Besides observable features, unobservable private information shared

between VCs and startups might also play a role in forming the expectation. Therefore, I

extend the baseline model to allow for the correlation between the unobserved factors driving

the funding decision and the subsequent one-period returns for funded startups. The same

private information that will lead to a 100 basis point increase in a funded startup’s return

next month increases the marginal probability by 28% for the startup to get funded this

month. Overall, the above pieces of evidence collectively verify the conjecture that the pre-

investment funding selection and the post-investment influence of VCs are interdependent

in a dynamic setting.

Lastly, I conduct a simulation experiment to study the implications of this interdepen-

dence over multiple rounds of VC funding. Specifically, I simulate data from two models. The

first model has random funding decisions each period, while the second model has random

funding decisions only in the first period. In order to produce a clean comparison, I assume

that startups have the same features and there is only one VC in the market. The difference

between VC-funded and non-VC-funded startups caused by random selection is insignificant

in the first model. In contrast, in the second model, the initial random selection persists over

time and makes a significant difference between funded and unfunded startups. Previously

VC-funded startups have a 93% chance to receive VC funding in the second model, versus a

50% chance in the first model.

This paper adds to the growing literature that examines the funding decisions of venture

capitalists. I depart from the previous studies (e.g. Stuart and Sørensen (2001), Brander et

al. (2002), Chemmanur et al. (2011), Bengtsson and Hsu (2010), Hochberg, Lindsey, and

Westerfield (2015)) by focusing on the role of VCs’ impact in the selection of startups. It

is the first paper to model the selection and influence of VC funding jointly in a dynamic

setting. Previous studies either consider the selection and influence of funding separately, or

ignore the dynamic perspective of the funding decision.

Two other novelties include the new hand-collected database and the methodology developed

for the joint estimation of the dynamic model. Along with a few other studies (e.g.,

Hellmann and Puri (2002), Hsu (2004)), this paper exploits a database containing a true

control group for the study of VC funding. Most studies that use proprietary databases (e.g.

VentureXpert, VentureOne, Preqin) examine the differential funding incentive or impact

among all-funded startups. Moreover, the new database contains individual characteristics

that facilitate the study of human capital in this area. The estimation strategy adopts

a Bayesian framework in which parameters are estimated by drawing samples from their

distributions, similar to Korteweg and Sørensen (2010).8 This type of methodology can

be applied for the estimation of econometric models that feature strong interaction among

various driving factors.

The paper proceeds as follows. Section 2 presents the economic model and then briefly

discusses the estimation strategy. Section 3 describes the data and the construction of

measures. Section 4 presents the estimation results and conducts post-estimation analy-

sis. Section 5 presents subsample results and explores alternative measures and models for

robustness checks. Section 6 concludes.

8 More generally, the dynamic interdependence among the key variables maps into a Bayesian network for which a parallel Gibbs Sampler is developed for parameter learning. A Bayesian network is a prototype model for the study of the conditional dependencies among random variables.

2. Economic Model

2.1 Economy and Agents

The economy has two types of agents: VC syndicates and startups. With discrete time

periods, the set of VC syndicates is constant and is denoted by I. I assume each individual

VC syndicate, denoted by i, is always there and ready to provide funding all the time. In

contrast, each individual startup, denoted by j, enters the market at birth (i.e., when it is

founded), and exits the market either as a success (e.g. going public or getting acquired) or

as a failure (e.g. going out of business).

I use a list of notations to characterize the different time-varying sets of startups. \(E_t\) is

the set of existing startups in the market by the end of t. It is also the set of existing startups

at the beginning of t + 1. \(N_t\) is the set of newborn startups that enter the market at t. The

entrance occurs at the end of t after the funding decision is made. \(\mathit{IPO/MA}_t\) and \(D_t\) denote

the sets of successful and unsuccessful exits at t, respectively. The exiting startups do not

need more funding, so the exits occur before the funding decision is made. The startups that

remain in the market need to compete for more funding. I call them funding candidates.

However, not all of them manage to get funded at t. The set of funding candidates is \(J_t\).

There are two identities that relate the different sets of startups. First, each of the existing

startups at the beginning of t falls into exactly one of three cases: exit successfully, exit

unsuccessfully, or remain in the economy. Second, the existing startups at the end of t

consist of the newborn startups and the old startups that do not exit (i.e., the funding

candidates). Equivalently, the two identities can be written as follows using the notations

defined above.

\[
E_{t-1} = \mathit{IPO/MA}_t + D_t + J_t \tag{K1}
\]

\[
E_t = J_t + N_t \tag{K2}
\]
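
The two identities can be illustrated with ordinary set operations. The following is a minimal sketch, assuming startups are represented by string identifiers; the function name `advance_period` and the example identifiers are hypothetical and do not come from the paper.

```python
def advance_period(existing_prev, exits_success, exits_failure, newborn):
    """One period of set bookkeeping for identities (K1) and (K2).

    existing_prev  : E_{t-1}, startups existing at the beginning of t
    exits_success  : IPO/MA_t, successful exits at t
    exits_failure  : D_t, unsuccessful exits at t
    newborn        : N_t, startups born at the end of t
    """
    # (K1) requires the exit groups to be disjoint subsets of E_{t-1}.
    if exits_success & exits_failure:
        raise ValueError("a startup cannot exit in two ways at once")
    exited = exits_success | exits_failure
    if not exited <= existing_prev:
        raise ValueError("only existing startups can exit")
    candidates = existing_prev - exited   # J_t, the funding candidates
    existing_now = candidates | newborn   # (K2): E_t = J_t + N_t
    return candidates, existing_now


# Hypothetical example: four existing startups, one IPO/MA, one death, one birth.
E_prev = {"a", "b", "c", "d"}
J_t, E_t = advance_period(E_prev, {"a"}, {"b"}, {"e"})
```

Here the funding candidates are the startups that neither exit successfully nor die, and the newborn startup joins only after the funding decision.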

2.2 Sequence of Events

This subsection gives the sequence of events at time t. It highlights the dynamics of

three key endogenous variables for startups. In the following, I first give the definitions of

the three variables. I then describe the sequence of events, along with the dynamics of the

key variables.

2.2.1 Three Startup Variables

The first variable, denoted by \(r^j_t\), is equal to the cumulative return of a startup j up to

time t. I call it the “growth variable”. By definition, an investment of $1 in the

startup at inception will be worth \(\exp(r^j_t)\) at time t.9 The second variable, denoted by \(s^j_t\), is a

signal perceived by the public on the well-being of a startup j at time t. I call it the “implicit

exit value”. It determines whether a startup exits at time t and, if it exits, in which way

(e.g. IPO/MA, or Death). The evolution of both variables depends on whether a startup is

funded by a VC syndicate in the previous time period. As a result, they reflect the direct

influence of VC funding.

The third variable, denoted by \(v^{ij}_t\), is the expected value created jointly by a VC syndicate

i and a startup j, if they choose to have a funding relationship at time t.10 The expected

9 The cumulative return is continuously compounded to calculate the value of investment (net of dilution).

10 Mathematically, \(v^{ij}_t \equiv E^{ij}_t\left[\, r^j_{t+1} - r^j_t \mid i \text{ is funding } j \text{ from } t \text{ to } t+1 \,\right]\), namely the value added \((r^j_{t+1} - r^j_t)\) from this period to the next period in the common perspective of i and j at time t, given that i is funding j from t to t + 1.

value is subjective and different across all pairs of i and j. Therefore, I call it the “subjective

value added”. An important assumption is that a startup and a VC syndicate in a pair share

a common perspective toward the expected value they would create jointly. Together, the

set of all concurrent expected values will determine the selection of VC funding.

2.2.2 Sequence of Events

[Insert Figure 1 and Figure A5]

A sequence of events occurs at time t, as illustrated in Figure 1.11 First, the growth

variable at time t is determined. It depends on the lagged growth variable at t − 1, the

previous funding relations, and various current features. More specifically, if a startup is

not funded previously, its return depends only on its own features. In contrast, if it is

funded by a VC syndicate previously, its return also depends on the features related to the

VC syndicate. Eq(1) gives the law of motion for the growth variable. The various current

features include: macroeconomic variables \(X_t\), startup features \(X^j_t\), VC syndicate features

\(X^i_t\), and startup-VC syndicate pair features \(X^{ij}_t\). I use \(\phi_{r,y}\) and \(\phi_{r,n}\) to denote the coefficients

associated with relevant features separately for the two cases. The noises are independent

and follow Normal distributions with variances \(\sigma^2_{r,y}\) and \(\sigma^2_{r,n}\).12

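\[
r^j_t - r^j_{t-1} =
\begin{cases}
\left[1, X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \sigma_{r,y}\,\varepsilon^j_t & \text{if } j \text{ is funded by } i \text{ at time } t-1 \\
\left[1, X_t, X^j_t\right]\phi_{r,n} + \sigma_{r,n}\,\varepsilon^j_t & \text{if } j \text{ is unfunded at time } t-1
\end{cases}
\tag{1}
\]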
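
To make the law of motion concrete, the following is a minimal sketch of one draw from Eq(1), assuming scalar features and a standard Normal noise term. The function name `growth_step` and all numerical values are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math
import random


def growth_step(r_prev, features, phi, sigma, rng):
    """Draw r_t given r_{t-1}: r_t - r_{t-1} = [1, features] . phi + sigma * eps."""
    x = [1.0] + list(features)  # the leading 1 multiplies the intercept
    if len(x) != len(phi):
        raise ValueError("phi needs one coefficient per feature plus an intercept")
    drift = sum(xk * pk for xk, pk in zip(x, phi))
    return r_prev + drift + sigma * rng.gauss(0.0, 1.0)


rng = random.Random(0)

# Unfunded case: only macroeconomic and startup features enter (hypothetical values).
r_unfunded = growth_step(0.0, [0.2, 1.0], [0.01, 0.05, 0.02], 0.1, rng)

# Funded case: VC syndicate and pair features enter as well, with their own coefficients.
r_funded = growth_step(0.0, [0.2, 1.0, 0.5, 1.0], [0.01, 0.05, 0.02, 0.03, 0.04], 0.1, rng)

# By definition of the growth variable, $1 invested at inception is now worth exp(r).
value_funded = math.exp(r_funded)
```

The two cases differ only in the length of the feature vector and in the coefficient and noise parameters, which mirrors the two branches of Eq(1).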

Second, given the growth variable, the implicit exit value is determined. It also depends

on a similar set of features according to whether a startup is previously-funded. In addition,

11 The relationship among the three key startup variables is illustrated in Figure A5.

12 The first subscript “r” indicates that the equation’s dependent variable is r. The second subscript, “y” or “n”, indicates the answer (“yes” or “no”) to the question of whether the startup is previously VC-funded.

the implicit exit value also depends on the concurrent growth variable. This assumption is

to capture the intuition that a startup is more likely to have a successful exit, or equiva-

lently the implicit exit value is higher, when its cumulative return is higher. Eq(2) gives

the determinants of the implicit exit value. As before, I use \(\phi_{s,y}\) and \(\phi_{s,n}\) to denote the

coefficients separately for the previously-funded and previously-unfunded cases. The noises

are independent and follow Normal distributions with variances \(\sigma^2_{s,y}\) and \(\sigma^2_{s,n}\).

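\[
s^j_t =
\begin{cases}
\left[1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_{s,y} + \sigma_{s,y}\,\eta^j_t & \text{if } j \text{ is funded by } i \text{ at time } t-1 \\
\left[1, X_t, X^j_t, r^j_t\right]\phi_{s,n} + \sigma_{s,n}\,\eta^j_t & \text{if } j \text{ is unfunded at time } t-1
\end{cases}
\tag{2}
\]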

Third, given the implicit exit value, a startup’s status is determined. For a startup j,

the status denotes whether j exits the market at t and in which way if it exits. The status

takes three values, “IPO/MA”, “Death”, and “Survival”, according to the implicit exit value.

Eq(3) gives the correspondence.13 As a result, the set of existing startups at the beginning of

time t breaks into three groups. IPO/MAt contains the startups with status “IPO/MA”. Dt

contains the startups with status “Death”. Jt contains the startups with status “Survival”.

Recall that the startups in Jt are funding candidates and will compete for VC funding at

time t.

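\[
\text{status}^j_t =
\begin{cases}
\text{“IPO/MA”} & \text{if } \delta \le s^j_t \\
\text{“Survival”} & \text{if } -\delta \le s^j_t < \delta \\
\text{“Death”} & \text{if } s^j_t < -\delta
\end{cases}
\tag{3}
\]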
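
The threshold rule in Eq(3) can be written as a small function. This is a minimal sketch, assuming the “Death” region is every value below −δ, which is what the “Survival” interval −δ ≤ s < δ implies; the function name `status` is hypothetical, and δ = 3 follows footnote 13.

```python
DELTA = 3.0  # threshold value; the paper sets delta to 3 (footnote 13)


def status(s, delta=DELTA):
    """Map an implicit exit value s to a startup status.

    Assumption: "Death" covers s < -delta, the complement of the other two regions.
    """
    if s >= delta:
        return "IPO/MA"
    if s >= -delta:
        return "Survival"
    return "Death"


# Only startups with status "Survival" become funding candidates at time t.
candidates = [j for j, s in {"a": 3.5, "b": 0.2, "c": -4.0}.items()
              if status(s) == "Survival"]
```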

Fourth, given the set of funding candidates, the subjective value added is determined, for

each pair of startup and VC syndicate in the economy. The pair-wise value depends on a set

of startup features, VC syndicate features, and startup-VC syndicate pair features. Again, it

13 I set δ to 3. It could be an arbitrary positive value. This is because the equation for s is unidentified without the specification of δ. The change of δ only shifts and re-scales the distribution of s. Consequently, the estimated coefficients will have the intercept shifted and the other coefficients multiplied by a common factor.

also depends on a startup’s growth variable. Eq(4) gives the determinants of the subjective

value added.14 I use φv to denote the coefficients. The noises are independent and follow

standard Normal distributions.15 Fifth, given the subjective value added, the equilibrium

funding at time t is determined as a two-sided matching between the set of VC syndicates

and the set of funding candidates. I use µt to denote the funding. More details will be given

in the following subsection.

\[
v^{ij}_t = \left[X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_v + \xi^{ij}_t, \quad \text{for all } i \in I,\ j \in J_t \tag{4}
\]

Lastly, after the funding relationship is established, newborn startups enter into the

market at the end of t. Their company value is set to $1, or equivalently their growth

variable is set to zero. Thus, the existing startups at the end of t include newborn startups

and the old startups that have not exited.

2.3 Funding Decision

The funding decision µt lasts for one period from time t to time t+1. I use the two-sided

matching model in Sørensen (2007) as the prototype for the one-period funding decision.

Two-sided means that both the VC syndicates and the startups are active in the search

of a funding relationship. However, a startup can only be matched to one VC syndicate,

while a VC syndicate can be matched with multiple startups. The number of startups a VC

syndicate i funds at t is denoted by \(q^i_t\), and it will be calibrated to equal the actual number

in estimation. As discussed before, the set of subjective value added \(\{v^{ij}_t : i \in I, j \in J_t\}\)

14 The equilibrium funding decision is determined through the comparison of the subjective value added. This will be discussed later. However, a common shift of all concurrent subjective value added will not change the equilibrium funding. Thus, the coefficients associated with the intercept and the macroeconomic variables are unidentified. Therefore, those variables are not included in the equation for v.

15 I set the variance to 1. It can be any arbitrary positive value. Again, this is because the equation for v is unidentified without the specification of it. The change of variance only re-scales the distribution of v and will not change the equilibrium funding decision.

determines the equilibrium matching \(\mu^*_t\). In the following, I first give the utility maximization

problem for both agents in terms of the subjective value added, then discuss the equilibrium.

2.3.1 Preferences and Choice Variable

Given a funding relationship µt, the utility is the sum of subjective value added that an

agent creates. For a startup j, \(U^j_t\) equals \(v^{ij}_t\) if there is a funding relationship between i and

j. For a VC syndicate i, \(U^i_t\) equals the sum of \(v^{ij}_t\) over all startups j with which i has a funding

relationship. The choice variable is the indicator of whether the pair (i, j) is in \(\mu_t\). Both startups and

VC syndicates can propose to the counterparty for the establishment of a pair. However,

the pair (i, j) ends up in µt if and only if both i and j want it to exist. Eq(5) and Eq(6)

define the utility function and state the utility maximization problem.

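\[
U^i_t = \sum_{j:\, ij \in \mu_t} v^{ij}_t, \quad \text{s.t. } \left|\{j : ij \in \mu_t\}\right| \le q^i_t \tag{5}
\]

\[
U^j_t = \sum_{i:\, ij \in \mu_t} v^{ij}_t, \quad \text{s.t. } \left|\{i : ij \in \mu_t\}\right| \le 1 \tag{6}
\]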

There are two assumptions behind the definition of utility. First, for each pair of VC

syndicate and startup, I assume they agree on the subjective value added \(v^{ij}_t\). Second, I

assume that a common fraction, say λ, of the value added goes to the VC syndicate, and

the remaining fraction, 1 − λ, goes to the startup. This assumption allows me to ignore the

differences in bargaining power and agency problems. Thus, the ranking among the set of

expected value added is sufficient to determine the equilibrium funding.

2.3.2 Pairwise Stability and Equilibrium

The equilibrium matching \(\mu^*_t\) is pairwise stable. Pairwise stability means that there exists

no pair that can gain from pairwise deviation. A pairwise deviation would occur for a pair

(i, j) not in \(\mu_t\) if both i and j prefer each other to their current matched counterparties.

As a result, i and j would break up with their existing matched counterparties and form a

new pair (i, j) between them. In the equilibrium matching \(\mu^*_t\), such a pairwise deviation is

not profitable for any pair.
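
The definition of pairwise stability can be checked mechanically. The following is a minimal sketch that lists every blocking pair of a given matching. It relies on two assumptions not stated in the text: an unfunded startup prefers any VC syndicate to staying unfunded, and a VC syndicate with a free slot accepts any startup. The function name `blocking_pairs` and the data layout are hypothetical.

```python
def blocking_pairs(values, match, capacity):
    """Return all pairs (i, j) outside the matching that both sides would prefer.

    values   : dict (i, j) -> v_ij, the common subjective value added
    match    : dict j -> i for funded startups (unfunded startups are absent)
    capacity : dict i -> q_i, the number of startups VC syndicate i can fund
    """
    portfolio = {i: [] for i in capacity}
    for j, i in match.items():
        portfolio[i].append(j)

    blocks = []
    for (i, j), v in values.items():
        if match.get(j) == i:
            continue  # the pair is already in the matching
        # Startup side: j gains if unfunded (assumption) or if v beats its current match.
        j_gains = j not in match or v > values[(match[j], j)]
        # VC side: i gains if it has a free slot (assumption) or v beats its weakest match.
        held = portfolio[i]
        i_gains = len(held) < capacity[i] or (
            bool(held) and v > min(values[(i, k)] for k in held))
        if j_gains and i_gains:
            blocks.append((i, j))
    return blocks


# Hypothetical values for two VC syndicates (A, B) and two startups (x, y).
values = {("A", "x"): 3.0, ("A", "y"): 2.0, ("B", "x"): 1.0, ("B", "y"): 0.5}
capacity = {"A": 1, "B": 1}
stable = blocking_pairs(values, {"x": "A", "y": "B"}, capacity)
unstable = blocking_pairs(values, {"x": "B", "y": "A"}, capacity)
```

In the second matching, A and x prefer each other to their current counterparties, so the pair (A, x) blocks the matching and it is not pairwise stable.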

The equilibrium matching exists and is unique if all \(v^{ij}_t\) are distinct.16 The equilibrium

condition can be characterized by a set of inequalities as in Sørensen (2007). Eq(7) and Eq(8)

give the inequalities. For a pair (i, j) not in \(\mu_t\), \(v^{ij}_t\) does not exceed the larger of the two opportunity

costs of i and j to break up with their matched counterparties. The two opportunity costs

are equal to \(\min_{ij' \in \mu_t} v^{ij'}_t\) and \(v^{\mu_t(j)j}_t\) respectively. Here, I use \(\mu_t(j)\) to denote the matched

VC syndicate for j at t. For a pair (i, j) in \(\mu_t\), \(v^{ij}_t\) is greater than all \(v^{i'j}_t\) for which i′ wants to

deviate to j and all \(v^{ij'}_t\) for which j′ wants to deviate to i. Let \(S_t(i)\) denote the set of j′ that

want to deviate to i, and let \(S_t(j)\) denote the set of i′ that want to deviate to j. Eq(11) and

Eq(12) give the expressions for \(S_t(i)\) and \(S_t(j)\) respectively.

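\[
(i, j) \notin \mu_t \iff v^{ij}_t < \overline{v} \tag{7}
\]

\[
(i, j) \in \mu_t \iff v^{ij}_t \ge \underline{v} \tag{8}
\]

\[
\overline{v} = \max\left[\min_{ij' \in \mu_t} v^{ij'}_t,\ v^{\mu_t(j)j}_t\right] \tag{9}
\]

\[
\underline{v} = \max\left[\max_{j' \in S_t(i)} v^{ij'}_t,\ \max_{i' \in S_t(j)} v^{i'j}_t\right] \tag{10}
\]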

16 This is because i and j share the same perspective of \(v^{ij}_t\). A proof is given in Sørensen (2005). In general, it is not true. The stable matching problem can be solved by the Gale-Shapley algorithm.

\[
S_t(i) = \left\{ j' \in J_t : v^{ij'}_t > v^{\mu_t(j')j'}_t \right\} \tag{11}
\]

\[
S_t(j) = \left\{ i' \in I : v^{i'j}_t > \min_{i'j' \in \mu_t} v^{i'j'}_t \right\} \tag{12}
\]
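
When both sides of every pair share the same value and all values are distinct, the unique pairwise-stable matching can be built greedily: the highest-valued remaining pair can never be outbid by either side, so it must be matched. The following is a minimal sketch of that construction. It is a simpler substitute for the Gale-Shapley algorithm mentioned in footnote 16, not the paper's estimation code, and it assumes that being matched is always preferred to being unmatched. The function name `stable_matching` and the example values are hypothetical.

```python
def stable_matching(values, capacity):
    """Greedy pairwise-stable matching under common, distinct pair values.

    values   : dict (i, j) -> v_ij, the common subjective value added
    capacity : dict i -> q_i, the number of startups VC syndicate i can fund
    Returns a dict j -> i for the funded startups.
    """
    remaining = dict(capacity)
    match = {}
    # Visit pairs from the highest to the lowest value.
    for (i, j), _ in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        if j not in match and remaining.get(i, 0) > 0:
            match[j] = i
            remaining[i] -= 1
    return match


# Hypothetical values for two VC syndicates (A, B) and two startups (x, y).
values = {("A", "x"): 3.0, ("A", "y"): 2.0, ("B", "x"): 1.0, ("B", "y"): 0.5}
mu = stable_matching(values, {"A": 1, "B": 1})

# A common shift of all values leaves the matching unchanged, because only the
# ranking matters. This is why Eq(4) has no intercept or macroeconomic terms.
shifted = {pair: v + 10.0 for pair, v in values.items()}
mu_shifted = stable_matching(shifted, {"A": 1, "B": 1})
```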

2.4 Estimation Strategy

The parameter estimation is performed jointly on the main system of equations Eq(1),

Eq(2), and Eq(4). The parameters include \((\phi_{r,y}, \sigma^2_{r,y})\), \((\phi_{r,n}, \sigma^2_{r,n})\), \((\phi_{s,y}, \sigma^2_{s,y})\), \((\phi_{s,n}, \sigma^2_{s,n})\), and

φv. Eq(1) and Eq(2) give the direct influence of VC funding on startup growth and implicit

exit value. The direct influence is shown in a comparison between the previously-funded and

previously-unfunded startups. Eq(4) describes subjective value added as the determinant of

the selection of VC funding. The selection corresponds to a two-sided matching between VC

syndicates and funding candidates.

Not all variables in the main system of equations are observed. The three dependent

variables are latent or only partially observed. Thus, the estimation involves the imputation

of these variables.17 The observed data is composed of four pieces. The first piece is the

partially observed growth variable. It is observed at birth and exit of a startup, or when

the startup gets VC funding. The second piece is the startup status (e.g. IPO/MA, Death,

Survival) at each time period. It helps the imputation of the implicit exit value. The third

piece is the equilibrium funding. It helps the imputation of the subjective value added. The

last piece includes the macroeconomic variables, and the various features of startups, VC

syndicates, and startup-VC syndicate pairs. They are the independent variables for the three

equations.

I use a Gibbs Sampler to estimate the parameters in a Bayesian framework. In fact, it is

simpler to estimate in this way. First, given the interdependence among the key variables,

17 The imputation relies on the observed data, the last-updated parameter estimates, and most importantly, the interdependence among the three key variables.

it is impossible to use regressions alone to accomplish the joint estimation. Also, the existence

of latent variables makes it impractical to estimate using a GMM/SMM strategy. Therefore,

it is easiest to use the Bayesian framework because it can handle the interdependence in the

presence of latent variables. The Bayesian framework gives tractable posterior distributions

for all the parameters and variables given proper priors. Thus, a tractable algorithm can

be implemented using the Gibbs Sampler. The Gibbs Sampler iterates between parameter

updates and the latent variable imputations. Appendix A gives the detailed algorithm for

the estimation strategy.
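
The alternation between parameter updates and latent variable imputation can be illustrated on a much smaller problem than the paper's. The following is a toy sketch, not the algorithm of Appendix A: it samples the mean and variance of a Normal model in which some observations are latent. The priors here are independent Normal and Inverse-Gamma distributions, a simplification of the joint Normal-Inverse-Gamma prior used in the paper, and every name and hyperparameter value is a hypothetical choice.

```python
import random


def gibbs_normal(y, n_iter=2000, burn_in=500, m0=0.0, s0_sq=100.0,
                 a0=2.0, b0=2.0, seed=0):
    """Gibbs sampler for y_k ~ N(mu, sigma2), where None entries of y are latent.

    Illustrative priors: mu ~ N(m0, s0_sq), sigma2 ~ Inverse-Gamma(a0, b0).
    Returns the posterior draws of (mu, sigma2) kept after burn-in.
    """
    rng = random.Random(seed)
    observed = [v for v in y if v is not None]
    n_missing = len(y) - len(observed)
    n = len(y)
    mu, sigma2 = sum(observed) / len(observed), 1.0
    draws = []
    for it in range(n_iter):
        # Step 1: impute the latent observations given the current parameters.
        imputed = [rng.gauss(mu, sigma2 ** 0.5) for _ in range(n_missing)]
        full = observed + imputed
        # Step 2: draw mu from its Normal full conditional.
        precision = 1.0 / s0_sq + n / sigma2
        mean = (m0 / s0_sq + sum(full) / sigma2) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # Step 3: draw sigma2 from its Inverse-Gamma full conditional.
        # random.gammavariate takes (shape, scale), so the scale is 1 / rate.
        rate = b0 + 0.5 * sum((v - mu) ** 2 for v in full)
        sigma2 = 1.0 / rng.gammavariate(a0 + n / 2.0, 1.0 / rate)
        if it >= burn_in:
            draws.append((mu, sigma2))
    return draws


# Simulated data with true mean 2 and variance 1; every tenth value is latent.
data_rng = random.Random(1)
y = [None if k % 10 == 0 else data_rng.gauss(2.0, 1.0) for k in range(200)]
draws = gibbs_normal(y)
posterior_mean_mu = sum(d[0] for d in draws) / len(draws)
```

Each sweep conditions on the most recent values of everything else, which is the same pattern as iterating between parameter updates and imputations of the three latent variables.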

The distribution assumption is given as follows. I assume that the noises in Eq(1), Eq(2),

and Eq(4) are independent.18 As defined before, they follow Normal distributions with different

variances. I also assume the parameters have conjugate priors. In Eq(1) and Eq(2),

the priors of the joint distributions \((\phi, \sigma^2)\) follow Normal-Inverse-Gamma distributions. In

Eq(4), the prior of \(\phi_v\) follows a Normal distribution. Appendix A gives the detailed assumption

for the priors. For post-estimation analysis, I use the posterior t-statistics for hypothesis

testing.

3. Data and Measure

Most research in venture capital uses proprietary databases (e.g. VentureOne, VentureX-

pert). For the study of VC’s impact on startups, the biggest disadvantage of these databases

is that they lack a control group of non-VC-funded startups. In this paper, I hand-collect a

novel database from a leading startup platform, Crunchbase, that provides information for

both VC-funded and non-VC-funded startups.

Founded in 2007, Crunchbase had 1.5 million unique visitors each month in 2013. By 2014,

it had a record of more than 290,000 companies (e.g. startups, VCs, incubators, accelerators,

18 Later on, in an extended model, the three noises are assumed to be correlated.

etc.) and 310,000 individuals (e.g. entrepreneurs, venture capitalists, angel investors, etc.)

across 176 countries. For companies, a typical record includes founding information, current

status (IPO, acquired, alive, dead), acquisitions and investments history19, funding history20,

products and categories information21, contact information, and company news. There is also

human capital information that relates companies to individuals. The individuals include

founders, angel investors, board members and advisors, and personnel on the current and

past teams of management for a company. For individuals, a typical record includes name,

gender, primary location, employment history, and educational background.

One unique feature about Crunchbase is that it collects information by crowdsourcing.

The advantage is that the database can be built very quickly at an exponential speed with

insiders, especially entrepreneurs, feeding detailed information. Another database that is

built in this way is Wikipedia which has more than 120,000 regular contributors and 12,000

editors by now. As with Wikipedia, the disadvantage of Crunchbase is that it may contain

some inaccurate information. The Crunchbase team combines human and machine reviews

to prevent this.22

To check the credibility, I manually compare the Crunchbase profiles for a subsample of

startups with the information from major business journals and proprietary databases. The

subsample includes 250 startups with successful exits (IPO/MA) and 790 startups that have

funding records from VentureXpert. More specifically, I compare the numbers for the money

raised in IPO, the transaction value for MA, and the money invested in the funding rounds.

Those numbers are similar for the information collected from Crunchbase and from other

sources. In addition, I apply a number of filters to select the startups with the most accurate

information for model estimation. The filtering procedure will be discussed in detail in the

19 The company is the acquirer or investor.

20 The company is the investee.

21 The category is classified by Crunchbase denoting the sub-industry for the company. The categories are not mutually exclusive.

22 For more details, please see https://info.crunchbase.com/about/faqs.

15


https://www.wikipedia.org/

https://www.wikipedia.org/





https://info.crunchbase.com/about/faqs

construction of sample.

Finally, one prevalent concern about any startup database is that it may contain some zombie

companies. A zombie company shows as alive on the record but is actually out of business.

Thus, I need to change the final status from “Survival” to “Death” for zombie companies

in my sample. To do that, I visit each startup’s website in the sample if its final status

is “Survival” according to the Crunchbase profile. It turns out that over 65% of the dead

companies in the sample are detected in this way.

3.1 Sample

The sample period is from 1998 to 2014. I apply a number of filters to select a sample that

is suitable for the study and has the most accurate information. The first set of filters is on

startups. A startup needs to have a birth year equal to or greater than 1998 to be included in

the sample. Moreover, I include a startup only if its Crunchbase profile has website, category, headquarters, founder, and current management team information. This gives

a total number of 29,184 startups.

The second set of filters is on VCs and funding rounds. For VCs, I include only experienced VCs that have participated in at least ten funding rounds in the sample. This restriction reflects the model assumption that VCs are always available to provide funding. This gives a total of 765 VCs. For funding rounds, a filter is applied on the type of funding. It excludes angel-investing, debt-financing, equity- or product-crowdfunding, grant, non-equity-issuance, post-ipo-debt or post-ipo-equity, and secondary-market-investing. A funding round also needs to have available information on investment amount and post-investment valuation to be included in the sample. This gives a total of 21,483 funding rounds.
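The second set of filters can be illustrated with a short sketch. This is only an illustration under stated assumptions, not the code used for the paper: the record fields (`type`, `amount`, `post_valuation`, `investors`) and the function name are hypothetical, and the sketch assumes that round-level filters are applied before counting each VC's rounds, an ordering the text does not specify.

```python
from collections import Counter

# Funding types excluded by the filter, spelled as in the text above.
EXCLUDED_TYPES = {
    "angel-investing", "debt-financing", "equity-crowdfunding",
    "product-crowdfunding", "grant", "non-equity-issuance",
    "post-ipo-debt", "post-ipo-equity", "secondary-market-investing",
}


def filter_rounds(rounds, min_rounds_per_vc=10):
    """Apply the round-level and VC-level filters (hypothetical field names)."""
    # Step 1: drop excluded funding types and rounds with missing amounts.
    kept = [
        r for r in rounds
        if r["type"] not in EXCLUDED_TYPES
        and r.get("amount") is not None
        and r.get("post_valuation") is not None
    ]
    # Step 2: count the surviving rounds each VC participated in.
    counts = Counter(vc for r in kept for vc in r["investors"])
    experienced = {vc for vc, n in counts.items() if n >= min_rounds_per_vc}
    # Step 3: keep only experienced VCs in each round and drop rounds
    # that are left without any investor.
    result = []
    for r in kept:
        investors = [vc for vc in r["investors"] if vc in experienced]
        if investors:
            result.append({**r, "investors": investors})
    return result, experienced
```

Counting rounds after the round-level filters is one possible reading; counting them on the unfiltered data would yield a slightly larger set of experienced VCs.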

The third set of filters is on IPOs and MAs. For IPOs, I delete a record if a company’s market value at IPO cannot be calculated or otherwise obtained from other sources. Fortunately, no record is dropped in this way. For MAs, I delete a record if either the transaction value or the acquired proportion of a company is missing and unavailable from the SDC Platinum Mergers & Acquisitions database. This gives a total of 323 IPO and 1,728 MA records.

Finally, the resulting datasets of startups, VCs, funding rounds, IPOs and MAs need to be consistent with one another. For instance, I delete the whole record of a startup if its corresponding IPO, MA or funding-round record is excluded due to missing information. I also delete the whole record of a VC if all of its funding records have been deleted under the above filters. As a result, VC-funded startups account for a smaller proportion of the sample than of the dataset produced by the first set of filters. To keep the proportion roughly the same, I randomly drop non-VC-funded startups, with a preference for those with the least information in the database.

[Insert Table 1]

Table 1 gives a descriptive summary of the final sample. It contains 9,303 startups and

755 VCs that have formed 2,844 distinct VC syndicates. Among the 9,303 startups, only

2,350 (25.26%) have been funded by VCs. For the VC-funded ones, about 22.47% go public

or get acquired and 17.62% finally die. For the non-VC-funded ones, the proportions are

1.57% and 27.02% respectively. The IPO/MA rate of VC-funded startups is thus about 14 times that of non-VC-funded ones. Regarding the number of rounds for VC-funded startups, both the median and the

mode are 2 rounds per startup. Regarding the size of VC syndicates, the mean and the

median are 2.93 and 3 VCs per syndicate.


3.2 Measures

There are two groups of variables for which I need to construct measures. The first group

consists of startups’ birth, final status (e.g. IPO/MA, Survival, Death), and funding status

(e.g. VC-funded or not, and by which VC syndicate if funded). It is straightforward to

construct these measures as they are directly recorded in the database. They will be used

to update the posterior distributions of latent variables.

The second group consists of the dependent and independent variables in the main system

of equations for estimation (Eq(1), Eq(2), and Eq(4)). The three dependent variables are the

cumulative return, the implicit exit value, and the subjective value added. Among them, the

implicit exit value and the subjective value added are latent variables to be imputed. The

cumulative return is observed sporadically. Based on these observations, interim values are

imputed during estimation. The independent variables include various observed features of

startups and VC syndicates. The following details the construction of the cumulative return

and those various features and Appendix B gives a summary.

3.2.1 Cumulative Return

By definition, the cumulative return is the logarithm of a startup’s valuation. I assume

that all startups have a valuation equal to 1 at birth. Later on, a startup has its valuation

revealed at exit or funding rounds. I estimate the valuation in three ways. First, I calculate

the valuation at exit for startups that finally exit during the sample period. These are the

startups with a final status equal to “IPO/MA” or “Death”. For “IPO/MA” ones, I set the valuation to be the reported market value at IPO or the deal price divided by the percentage acquired in MA. For “Death” ones, I sample the valuation from a triangular distribution with a mode of 0.1. To date the death, I collect additional information to determine the last time each of these startups had any event or news. I then sample the exact time of death from a uniform distribution spanning 6 to 30 months after that time.
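The imputation for dead startups can be sketched as follows. The text gives only the mode (0.1) of the valuation distribution, so the support used here (0 to 1, i.e. between zero and the birth valuation) is an assumption made for illustration, as are the function and argument names.

```python
import random


def sample_death_exit(last_event_month, rng, low=0.0, high=1.0, mode=0.1):
    """Draw a valuation at death and a death date for one dead startup.

    low and high are assumed bounds (not stated in the text); mode=0.1 and
    the 6-to-30-month window follow the description above.
    """
    # Valuation at death: triangular draw with mode 0.1.
    valuation = rng.triangular(low, high, mode)
    # Time of death: 6 to 30 months after the last recorded event or news.
    death_month = last_event_month + rng.uniform(6.0, 30.0)
    return valuation, death_month


rng = random.Random(0)  # fixed seed so the draws are reproducible
valuation, death_month = sample_death_exit(last_event_month=100.0, rng=rng)
```

Because the cumulative return is the logarithm of the valuation, a strictly positive lower bound may be preferable in practice.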

Second, I calculate the valuation at funding rounds for startups that have received VC

funding during the sample period. The valuation is net of the pure money effect of in-

vestment, namely “anti-diluted” as defined in Cochrane (2005) and Korteweg and Sørensen

(2010). More specifically, the valuation can be calculated by compounding the “anti-diluted”

period-to-period return. The period-to-period return from the last funding round to this

funding round is equal to the ratio of the pre-investment value at this round to the post-

investment value at last round.23 As a result, the “anti-diluted” valuation is equal to the

product of the period-to-period returns between neighboring funding rounds. This defini-

tion measures the growth rate of a VC-funded startup by excluding the dollar amount of

investment.
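A small numerical sketch may help fix ideas. It assumes that the pre-investment value equals the post-investment value minus the amount invested, and it takes the anti-diluted value at the first round as an input, since the link between the birth valuation and the first round is given in Appendix B rather than here. The field names and figures are hypothetical.

```python
import math


def anti_diluted_path(rounds, first_value=1.0):
    """Compound period-to-period returns between neighboring funding rounds.

    Each round is a dict with 'amount' (money invested) and 'post'
    (post-investment valuation). first_value is the anti-diluted value at
    the first round, taken as given in this sketch.
    """
    values = [first_value]
    for prev, curr in zip(rounds, rounds[1:]):
        pre_money = curr["post"] - curr["amount"]   # value before new money
        period_return = pre_money / prev["post"]    # growth net of investment
        values.append(values[-1] * period_return)
    return values


# Hypothetical three-round history (amounts in $ million).
rounds = [
    {"amount": 2.0, "post": 10.0},
    {"amount": 5.0, "post": 25.0},
    {"amount": 10.0, "post": 40.0},
]
values = anti_diluted_path(rounds)
cumulative_returns = [math.log(v) for v in values]  # log of anti-diluted value
```

In this example the post-investment valuation quadruples from 10 to 40, but the anti-diluted value rises only by a factor of 2.4, because the 15 of new money raised in the last two rounds is excluded from growth.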

Third, I estimate the valuation for startups whose final status is “Survival” by the end of

the sample period. The estimation is based on current startup performance. In particular, I

visit their websites and use their Crunchbase profiles (e.g. company description, current team

of management, offices, investments and acquisitions24) to evaluate the performance. I then

classify the startups into three groups according to their performance ranked in a decreasing

order. Next, for each group, I find comparable startups with similar features and non-missing

valuations at exit. Finally, I impute the valuation for each group by drawing samples from

a smoothed distribution of the exit values of comparable startups.25 Table 2 Panel A gives

the summary statistics for the observed cumulative return. As implied by the percentile

information, the cumulative return is very dispersed and has a bimodal distribution.

23 Appendix B gives the formula for the “anti-diluted” period-to-period return and company value.
24 The startups are the investors and acquirers.
25 Admittedly, the estimation is subjective. Nevertheless, given the huge variation in startups’ performance, it is better to have a rough estimate than to leave the valuation missing. In the latter case, the distribution of imputed interim returns would be unrealistically flattened. Also, the returns for successful and unsuccessful startups should follow very different distributions.

[Insert Table 2]

3.2.2 Macroeconomic Variables

The macroeconomic variables include the risk-free rate (rf ), the Fama-French three fac-

tors ((rm − rf ), smb, hml), and a proxy for the cost of long-term borrowing (Ybaa − Yus10).

(Ybaa − Yus10) is equal to the spread of the yield on Moody’s seasoned Baa corporate bonds over the yield on the 10-year Treasury bond. Both the risk-free rate and the spread are obtained from the Federal Reserve Bank of St. Louis. The Fama-French three factors are from Ken French’s website. (rm − rf ) is the excess market return over the risk-free rate; smb is the factor

return on the small-minus-big portfolio; and hml is the factor return on the high-minus-low

portfolio. Table 2 Panel B gives the summary statistics.

3.2.3 Startup Features

The startup features are either constant or time-varying. The constant startup features

are along three dimensions: location, category, and product. # locations is the number of cities in which a startup’s headquarters or offices are located. # categories is the number of categories a startup is classified into by Crunchbase. # products is the number of products that a startup has. In addition, I construct a set of dummies 1(LOC) to indicate whether a startup has its headquarters or offices located in a specific place LOC. LOC takes “ny” for

New York, “ca” for California, “ous” for other places in U.S., “ona” for other places in North

America, “as” for Asia, and “eu” for Europe. Table A1 gives the list of top 20 cities and

categories with the most startups.

The time-varying startup features characterize funding history and human capital in-

formation. For funding history, t from last round is the time in years since last funding

round; t2 from last round is the square of it. # rounds is the number of funding rounds


experienced in the past. For human capital information, # startups founded is the number

of companies that the startup founder has built in the past. 1(top20 school) is the dummy

variable indicating whether a startup has people on its management team who graduated

from a top school at a specific time. Table A2 gives the top school list.26 Table 2 Panel B

gives the summary statistics for the startup features.

3.2.4 VC Syndicate Features

The VC syndicate features also include the constant and the time-varying. The constant

features characterize geographical and size information. # of VCs is the number of VCs in a

syndicate and it measures the size of the syndicate. # locations is the number of cities in which at least one member VC of the syndicate has an office or headquarters. Table A3 gives the lists of top 20 cities with the most VCs (Panel B) and with the most VC syndicates (Panel C). A comparison between the two lists shows that cooperation is prevalent among VCs in different locations. For instance, the percentage of VCs that have an office in New York, San Francisco, Menlo Park, and Palo Alto is at most 15%. In contrast, the percentage of VC syndicates that have an office in these places is at least 44%.

The time-varying features characterize investment and human capital information. For

a VC syndicate, 1(cooperated) is a dummy variable indicating whether any member VCs

have cooperated in the past. # categories is the number of categories in which at least one member VC of the syndicate has past investment experience. # rounds is the median number of funding rounds that VC members have participated in before. This variable measures the typical experience of a VC syndicate. Finally, 1(top20 school) is the dummy variable indicating whether a VC syndicate has people who graduated from a top school on its members’ management teams at a specific time. Table A4 gives the top school list. Table 2 Panel C gives the summary statistics for the VC syndicate features.

26 Both lists in Table A2 and Table A4 exclude some of the best schools but include some schools with lower ranks. There are two reasons. First, I give priority to schools that provide the best education in some technical fields (e.g. engineering, biochemistry). Second, I select schools that appear most frequently in the educational background of VC and startup personnel, since these schools should have very strong alumni networks.

3.2.5 Startup-VC Syndicate Pair Features

Last, I construct measures for each pair of startup and VC syndicate. There are

26,457,732 startup-VC syndicate pairs in total. The constant features measure the pairwise geographical distance. Distance is the shortest distance in miles between a startup and a VC syndicate. Based on that, days of travel is the number of (additional) days needed for a round trip. It equals 0 if distance is within 100 miles, 1 if distance is between 100 and 1,000 miles, 2 if distance is between 1,000 and 10,000 miles, and 3 otherwise.27 Around 25% of all pairs have a distance within 100 miles. For comparison, among pairs with a funding relationship, around 75% have a distance within 100 miles.
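The mapping from distance to days of travel is a simple step function. In the sketch below, the function name is hypothetical and the treatment of the exact boundary values (100, 1,000 and 10,000 miles) as belonging to the lower bracket is an assumption, since the text does not say which side they fall on.

```python
def days_of_travel(distance_miles):
    """Map the startup-syndicate distance (miles) to additional travel days."""
    if distance_miles <= 100:
        return 0   # reachable by car within the day
    if distance_miles <= 1_000:
        return 1   # short flight
    if distance_miles <= 10_000:
        return 2   # long flight
    return 3       # intercontinental travel


# Hypothetical distances for four startup-syndicate pairs.
example = [days_of_travel(d) for d in (35, 400, 2_500, 11_000)]
```

Using a coarse step function rather than raw miles keeps the variable on a scale that reflects travel cost, which grows far more slowly than distance itself.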

The time-varying features describe the existence of funding relationship and alumni ties.

For a startup-VC syndicate pair, 1(funding tie) is a dummy variable indicating whether

any VC member has funded the startup in the past. 1(alumni tie) is a dummy variable

indicating whether any VC member and the startup have people who graduated from the same

school. Table 2 Panel D gives the summary statistics for the startup-VC syndicate features.

27 For distance within 100 miles, I assume a one-day round travel by car. For distance between 100 and 1,000 miles, I assume a two-day travel by flight. For distance between 1,000 and 10,000 miles, I assume a three-day travel by flight. For distance greater than 10,000 miles, I assume an intercontinental travel.

4. Estimation Results

4.1 Baseline Estimation

Table 3 presents the joint estimation result for the baseline model. In the table, the first

four columns show the direct influence of VC funding on startup growth and implicit exit

value as a comparison between previously-funded and previously-unfunded startups. Among

them, columns 1 and 2 give the parameters for the law of motion of the growth variable in

Eq(1); columns 3 and 4 give the parameters for the dynamics of the implicit exit value in

Eq(2). The last column shows the determinants of the selection of VC funding. It gives the

parameters for the dynamics of the subjective value added in Eq(4).

[Insert Table 3]

4.1.1 Influence of VC Funding

The influence of VC funding on startup growth comes from two sources. First, there is a

direct impact from the funding VC syndicate features and the pair features. Note that these

features are non-missing only for previously-funded startups. Second, there is an indirect

impact given the presence of the funding VC syndicate. The indirect impact changes the

effects of the macroeconomic variables and the startup features. It is shown as a comparison

between the previously-funded and unfunded startups.

For the direct impact, various VC syndicate and pair features show significance. For

instance, funded startups’ growth is negatively correlated with VC syndicate size (# of VCs)

and positively correlated with the educational background of individual venture capitalists

(1(top20 school)). In addition, past cooperation among any VC members (1(cooperated))

also helps funded startups grow faster. Regarding the pairwise features, both the existence


of alumni ties (1(alumni tie)) and the existence of past funding relations (1(funding tie))

correspond to a higher growth rate for the funded startups. This might result from a reduction in asymmetric information and agency costs, facilitated by learning through social networks or past cooperation.

For the indirect impact, previously-funded startups are less dependent on the macroeco-

nomic conditions. For instance, the cost of alternative funding, indicated by the difference

between the BAA corporate bond yield and the 10-year Treasury yield (Ybaa − Yus10), has a

more significant negative effect on startup growth when the startup is previously-unfunded.

Likewise, startup growth depends more on the Fama French three factors ((rm − rf ), smb,

hml) as well as the risk-free rate (rf ) without funding.

Regarding the startup features, one surprising result is that the effects of startup hu-

man capital features exhibit opposite signs for the previously-funded and unfunded cases.

Founding experience of entrepreneurs (# startups founded) has a positive effect on a startup’s growth when it is not funded. However, the effect becomes negative in the presence of VC

funding. Similarly, the educational background of a startup’s management team (1(top20

school)) promotes growth without funding but impedes growth with funding. The positive

impact of a startup’s human resource seems to be supplanted by the funding VC syndicate’s.

The change in the sign indicates a power struggle between the top executives of the funded

startups and funding VCs.

The influence of VC funding on startup implicit exit value is mainly through its effect on

startup growth. Faster startup growth implies better startup quality (i.e., higher cumulative

return) and corresponds to a higher implicit exit value. As a result, a startup is more likely

to exit through IPO or MA and less likely to go out of business. For the previously-funded

case, a one standard deviation increase in the cumulative return (7.408) is associated with an

increase of 0.207 in the implicit exit value. It corresponds to a 49.6% relative increase in the

probability of IPO/MA and a 35.4% relative decrease in the probability of Death compared


with the benchmark mean values.28 For the previously-unfunded case, the relative increase

and decrease in the probabilities of IPO/MA and Death are 46.3% and 45.2% respectively.
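The probabilities quoted for the previously-funded case can be reproduced from the formula in footnote 28, Φ((s − δ)/σ) for IPO/MA and Φ((−s − δ)/σ) for Death, where s is the implicit exit value. The sketch below plugs in the numbers reported there (mean 0.044, standard deviation 1.307, δ = 3); the function names are hypothetical and the shifted value is taken to be the mean plus 0.207.

```python
import math


def normal_cdf(x):
    """Standard Normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def exit_probabilities(exit_value, sd, delta=3.0):
    """Return (P(IPO/MA), P(Death)) implied by an implicit exit value."""
    p_success = normal_cdf((exit_value - delta) / sd)
    p_death = normal_cdf((-exit_value - delta) / sd)
    return p_success, p_death


base = exit_probabilities(0.044, 1.307)             # benchmark mean value
shifted = exit_probabilities(0.044 + 0.207, 1.307)  # after the 0.207 increase

relative_increase_success = shifted[0] / base[0] - 1.0
relative_decrease_death = 1.0 - shifted[1] / base[1]
```

The benchmark probabilities come out at roughly 1.2% and 1.0%, and the shifted ones at roughly 1.8% and 0.6%, in line with the relative changes reported above up to rounding.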

4.1.2 Selection of VC Funding

The selection of VC funding depends on the subjective value added. One of the determi-

nants for the subjective value added is startup quality. A one standard deviation increase in

the cumulative return (7.172)29 is associated with an increase of 0.244 in the subjective value

added. It corresponds to a marginal increase of 6.9% in the probability of getting funded,

relative to a 50% benchmark probability.30
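The marginal effect on the funding probability follows from the pairwise comparison described in footnote 30: if two otherwise identical pairs have subjective values added with independent unit-variance Normal noise, their difference has standard deviation √2, so a shift of Δ in one of them raises its probability of being the larger from 50% to Φ(Δ/√2). The unit-variance reading is inferred from the footnote's formula, and the function names below are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard Normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pairwise_win_probability(shift):
    """P(v_ij + shift > v_i'j') for two otherwise identical pairs."""
    return normal_cdf(shift / math.sqrt(2.0))


p_quality = pairwise_win_probability(0.244)  # one s.d. of cumulative return
marginal_increase = p_quality - 0.5          # relative to the 50% benchmark
```

With a shift of 0.244 the probability is about 56.9%, that is, a marginal increase of about 6.9 percentage points over the benchmark.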

In addition to startup quality, a wide range of startup- and VC-related features enter into

the consideration of funding selection. Regarding the startup features, the expected value

added grows with the number of a startup’s products (# products) but declines with the

startup’s geographical span (# locations) and categorical span (# categories). This indicates a preference for more productive but focused startups. Also, startups that have experienced more funding rounds are preferred, as they have been selected by other savvy VCs in the past. Interestingly, better startup human capital (# startups founded, 1(top20 school)) lowers the chance of receiving VC funding. This is not surprising given its negative impact on funded startups’ growth.

Regarding the VC-related features, smaller VC syndicates (# of VCs) with more funding experience (# rounds) correspond to a higher expected impact. However, diversity in the categories of funding (# categories) is associated with a lower expected impact. Also, better educational background of venture capitalists (1(top20 school)) is less preferred. Between a pair of VC syndicate and startup, the geographical distance (days of travel) decreases the expected value added while the existence of alumni ties (1(alumni tie)) and previous funding relationship (1(funding tie)) increase it. The expected value added seems to be consistent with the VC syndicate’s impact on funded startups following the funding decision.

28 For the previously-funded case, the mean and standard deviation of the implicit exit value are 0.044 and 1.307. The mean probabilities of IPO/MA and Death are 1.19% and 0.99%. Thus, an increase of 0.207 in the implicit exit value (from 0.044 to 0.251) changes the IPO/MA probability to Φ((0.251 − δ)/1.307) = 1.78%, and changes the Death probability to Φ((−0.251 − δ)/1.307) = 0.64%. Φ is the c.d.f. of the standard Normal. δ equals 3.
29 This is the standard deviation of the cumulative return over the whole sample.
30 The benchmark 50% is the probability that some v_ij is greater than v_i′j′ given the same distribution, assuming all their determinants in Eq(4) are the same. Thus, an increase of 0.244 in v_ij changes this probability to Φ(0.244/√2) = 56.86%. Φ is the c.d.f. of the standard Normal.

4.1.3 Previously Funded vs. Previously Unfunded

The overall effect of VC funding is measured by a comparison of startup growth and

implicit exit value between previously-funded and previously-unfunded startups. It is to

compare the dependent variables in Eq(1) and Eq(2). Table 4 compares the mean values us-

ing different control groups. The different control groups are all of the previously-unfunded

startups and subgroups of them including only comparable ones. I use the Nearest Neighbor

method and the Propensity Score Matching method to choose comparable unfunded star-

tups. Identifying covariates include startup features and macroeconomic variables. Figure 2

compares the distributions.

[Insert Table 4 & Figure 2]

In general, previously-funded startups exhibit faster growth and higher implicit exit value.

For the period return31, the previously-funded group has a raw and trimmed mean both equal

to 3.4%.32 The previously-unfunded group has a raw and trimmed mean equal to 5.1% and

1.7% respectively. Growth of unfunded startups seems to have huge variance. However, on

average, it is smaller than the growth of funded startups after ignoring the outliers.

The mean implicit exit value is 0.044 for previously-funded startups and -0.197 for previously-unfunded ones. It implies a higher unconditional probability of successful exit and a lower probability of failed exit for funded startups. This is consistent with the results on the IPO&MA rate and the Death rate. The two rates are the proportions of startups that exit through IPO/MA or Death this period following the previous funding decision. They represent the conditional probabilities of successful and failed exits given that a startup still exists in the economy. The IPO&MA and Death rates are 0.5% and 0.4% for previously-funded startups. In contrast, they are 0.02% and 0.49% for previously-unfunded startups. A chi-squared statistic of 164.9 shows that the two conditional distributions of successful and failed startups are significantly different across the funded and unfunded groups.

31 Period return is the first difference in the cumulative return. One period is one month.
32 The trimmed mean is the mean of the subsample winsorized between the 1st and 99th percentiles.

The results stay qualitatively the same using comparable unfunded startups as alternative

controls. The difference in the mean trimmed period return is around 5.2% between the

funded and comparable unfunded startups. The difference in the mean implicit exit value is

around 0.2. By and large, VC funding improves the startup return and the chance of successful exit in the following period after controlling for the determinants of funding selection.

4.2 Constrained Estimation

Both startup growth and implicit exit value have separate dynamics across funded and

unfunded startups. One concern is that the separate dynamics might mask the first-order

effects of VC-related features. To address this concern, I impose the constraints such that

funded and unfunded startups have the same dynamics of evolution. In particular, the con-

straints make the coefficients of the common terms the same for the two groups of startups.

The common terms include the intercept, the macroeconomic variables, and the startup

features. Imposing the constraints shuts down the indirect impact of VC funding.

Appendix A3 gives the estimation strategy for the constrained model. Table 5 presents the

constrained estimation results.

[Insert Table 5]


For the selection of VC funding, the coefficients stay almost the same. One exception is

the effect of days of travel. It more than triples (-0.696) compared with the baseline model (-0.210). Now, a one standard deviation decrease in days of travel is associated with a 0.176 increase in

the expected value added. It corresponds to a marginal increase of 5.0% in the probability

of getting funded, compared with a 50% benchmark probability.

For the influence of VC funding, the new coefficients have some interesting patterns.

First, the effects of the VC-related features are similar to those in the baseline model. It

implies that the direct impact of VC funding is not sensitive to the assumption of separate

dynamics for funded and unfunded startups. Second, the effects of the common terms (the

intercept, the macroeconomic variables, and the startup features) are close to those for the

unfunded case in the baseline model. This might be due to the large proportion of unfunded

startups in the panel data.

Moreover, the effect of startup quality on implicit exit value is now much larger. Now,

a one standard deviation increase in the cumulative return corresponds to a 282.5% relative

increase in the probability of IPO/MA and a 76.2% relative decrease in the probability of

Death.33 However, the boost in the coefficient is due to the fact that VC funding works as a

confounding factor that improves startup quality and implicit exit value at the same time.

In fact, it is more appropriate to have different dynamics for funded and unfunded startups

to separate out the funding effect. Therefore, the following analysis is performed on the

estimation results from the baseline model.

4.3 Expected VC Impact & Startup Future

Given the estimation result, I then investigate how the subjective value added lines up with the true value added following funding. That is, within the group of previously-funded startups, I ask whether the expected VCs’ impact correctly reflects startups’ future performance (startup growth, implicit exit value), as a measure of the true value added.

33 The associated increase in the implicit exit value is 0.581. It increases the probability of IPO/MA from 0.40% to 1.53%. It decreases the probability of Death from 0.84% to 0.20%.

4.3.1 Expected VC Impact & Startup Future

For a direct test, I look at the correlations between the subjective value added and startup

future performance. The sample of the test only includes funded startups at each funding

round. Note that venture capitalists have a chance to realize their expected impact only on funded startups. Also, the impact might last for multiple periods. Therefore, I use each

funding round as an observation, to facilitate the study of the impact’s duration. Table 6

presents the results.

[Insert Table 6]

The correlation is positive and significant for startup future performance one-period

ahead. Within the group of funded startups, a one standard deviation increase in the ex-

pected VC impact is associated with a 0.114 standard deviation increase in the startup

return one-period ahead. It also relates to a 0.056 standard deviation increase in the im-

plicit exit value next period, which corresponds to a 15.1% relative increase in the probability

of IPO/MA and a 13.7% relative decrease in the probability of Death. However, the positive

correlation quickly fades away beyond one period. It seems that the expected VC impact is

better actualized in the near future of funded startups.

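$$
\operatorname{cov}\left(v_{ijt},\; r_{j,t+1} - r_{jt}\right)
= \operatorname{cov}\Big(
\left[X_{jt},\, X_{it},\, X_{ijt},\, r_{jt}\right]\phi_v + \xi_{ijt},\;
\left[X_{t+1},\, X_{j,t+1},\, X_{i,t+1},\, X_{ij,t+1}\right]\phi_{r,y} + \sigma_{r,y}\,\varepsilon_{j,t+1}
\Big)
\tag{13}
$$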

The strong positive correlation one period ahead is actually hinted at in the estimation results. To see this, we can write out the covariance between the subjective value added and funded startups’ return next period as in Eq(13). One thing to notice is that a large

proportion of the covariates are slow-moving, with half of them time-invariant. Therefore,

the sign of the correlation depends on whether the signs of the corresponding coefficients in

the estimated parameters (φv and φr,y) are the same. In Table 3, most of these coefficients

take the same signs. For instance, the existence of alumni ties has a positive effect on

both subjective value added and funded startups’ growth. Thus, the source of the positive

covariance is the observable startup and VC-related features.

4.3.2 Selection on Private Information

The above correlation is not accounted for by any private information shared between a

pair of startup and VC syndicate. In fact, the original model lacks a channel for private

information to play a role. This is because all unobservable factors, absorbed by the three

noise terms (ε, η, ξ), are assumed to be independent in the baseline model.

To incorporate the effect of private information, I extend the model by revising the noise

terms as follows. For startups funded at t − 1, the noise term in the growth variable at t

includes an additional term ρr,yξt−1, and the noise term in the implicit exit value includes an

additional term ρs,yξt−1. The coefficients ρr,y and ρs,y are the covariances among the errors.

They give the dependence of the subjective value added on the private information that

will drive the startup growth and the implicit exit value one period ahead. Appendix C2

details the extended model and gives the estimation strategy. Table 7 presents the estimation

results.

[Insert Table 7]

Now, a unit increase in the expected VC impact driven by the private information correlates with a 0.9% increase in startup return following funding. It also correlates with a 0.073 increase in the implicit exit value, which corresponds to a 48.74% relative increase in the probability of IPO/MA and a 90.90% relative decrease in the probability of Death. Therefore, private information could be an additional source of the positive correlation between the expected VC impact and funded startups’ future performance.

4.4 Joint Effect of Selection and Influence

The results of the joint estimation highlight one fact: selection and influence of VC

funding are directly linked over time. On the one hand, following the funding decision, VCs’ influence improves funded startups’ quality and their probabilities of successful exits. On the other hand, the selection of funding in turn depends on the expected VCs’ impact in the future, as well as the startups’ quality resulting from past VCs’ influence if funded

before. A natural question then follows. How large is the joint effect of VC funding, given

the direct linkage of selection and influence over time?

To answer this question, I conduct a simulation experiment to study the joint effect

of funding selection and influence over multiple rounds. More specifically, I compare the

simulation results from two models. In Model 1, I break down the interdependence between

funding selection and influence, by making the funding decision random for all periods. In

Model 2, the funding decision is random only for the first period.

Model 1 : Selection is random for all periods.

Influence follows the same dynamics as in the baseline model.

Model 2 : Selection is random for the first period, then determined in equilibrium.

Influence follows the same dynamics as in the baseline model.

For parameter values, I use the estimation results from Table 3. The dynamics of the


three key variables are the same as in the baseline model. I assume the economy has 100

startups born at time 0, with exactly the same features ex ante. There is only one VC

syndicate, which can fund up to 50 startups each period. I assume all startup- and VC-related features take the mean values of the sample used in the above analysis. Note that there is no variation across startups at the beginning. In both models, the only source of later differences is the randomness in the funding decision. Table 8 compares the simulated results for the two models.
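The two simulated models can be illustrated with a deliberately simplified sketch. It is not the estimated model: the equilibrium matching of Model 2 is replaced by an assumed rule (the single VC funds the 50 startups with the highest cumulative return), and `funded_drift`, `noise_sd`, and `n_periods` are hypothetical values chosen for illustration rather than estimates from Table 3.

```python
import random


def simulate_persistence(random_every_period, n_startups=100, quota=50,
                         n_periods=24, funded_drift=0.05, noise_sd=0.02,
                         seed=0):
    """Return the share of funded startups that are funded again next period.

    All numeric defaults except n_startups and quota are hypothetical.
    """
    rng = random.Random(seed)
    cum_return = [0.0] * n_startups   # identical startups ex ante
    previous = None
    refunded = 0                      # funded last period and again this period
    at_risk = 0                       # funded last period
    for t in range(n_periods):
        if t == 0 or random_every_period:
            # Model 1 (every period) or first period of Model 2: random funding
            funded = set(rng.sample(range(n_startups), quota))
        else:
            # Assumed stand-in for equilibrium selection in Model 2:
            # fund the startups with the highest cumulative return
            ranked = sorted(range(n_startups),
                            key=lambda j: cum_return[j], reverse=True)
            funded = set(ranked[:quota])
        if previous is not None:
            refunded += len(funded & previous)
            at_risk += len(previous)
        for j in range(n_startups):
            drift = funded_drift if j in funded else 0.0
            cum_return[j] += drift + rng.gauss(0.0, noise_sd)
        previous = funded
    return refunded / at_risk


model_1 = simulate_persistence(random_every_period=True)
model_2 = simulate_persistence(random_every_period=False)
print(f"Model 1 refunding share: {model_1:.2f}")
print(f"Model 2 refunding share: {model_2:.2f}")
```

Under these assumptions, the refunding share stays near one half when selection is random in every period and rises well above it once selection responds to past returns, which is the qualitative pattern the simulation experiment is designed to expose.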

[Insert Table 8]

The randomness in the initial funding decision is magnified significantly in Model 2, i.e., under the joint effect of the selection and influence of VC funding. In Model 2, previously funded startups have significantly higher period returns and implicit exit values. In contrast, there is no significant difference between previously funded and unfunded startups in Model 1.

To show how the initial randomness accumulates over time in Model 2, I compare the transition matrices of VC funding for the two models. In Model 1, each startup has a roughly 50% chance of being funded each period, whether or not it was funded previously. In Model 2, however, a previously funded startup has a 93.1% chance of being funded again, whereas the chance is only 4.1% for a previously unfunded startup. Thus, the funded subsample in Model 2 changes little over time: it consists largely of startups that happened to receive funding in the first period and therefore receive the subsequent rounds as well.
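Such a transition matrix can be tabulated from simulated funding histories as in the following sketch, where each history is a list of 0/1 funding indicators per period. The function name and the toy input are illustrative assumptions, not part of the original simulation code.

```python
def funding_transition_matrix(histories):
    """Estimate P(funded now | funded previously) from 0/1 funding histories.

    Row index = previous status, column index = current status
    (0 = unfunded, 1 = funded).
    """
    counts = [[0, 0], [0, 0]]
    for history in histories:
        for prev, cur in zip(history, history[1:]):
            counts[prev][cur] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else float("nan") for c in row])
    return matrix


# Toy histories for three hypothetical startups over four periods
example = [[1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
matrix = funding_transition_matrix(example)
print("P(. | previously unfunded):", matrix[0])
print("P(. | previously funded):  ", matrix[1])
```

The entry `matrix[1][1]` corresponds to the chance that a previously funded startup is funded again, and `matrix[0][1]` to the chance that a previously unfunded startup is funded.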

As a consequence, initial randomness might have a long-term effect given the interdependence between funding selection and influence. To verify this conjecture, I further compare the differences in future performance between funded and unfunded startups for the two models. Funded startups are not very different from unfunded ones in Model 1. In Model 2, however, they consistently outperform over both short and long horizons. To sum up, the joint effect of the selection and influence of VC funding is essential. It lets the impact of an incidental factor accumulate over time until it becomes decisive.

5. Robustness

This section presents the results of some robustness checks. First, to account for the

possible shifts in the parameter estimates, I perform the estimation on two subsamples from

1998 to 2006 and from 2007 to 2014. Second, to curb the potential multicollinearity problem,

I extract factors from the original set of features and estimate the model using these factors.

Third, I construct two alternative models to allow for the autocorrelation in the subjective

value added and a hierarchical structure in startup returns driven by a hidden startup fixed-

effect.

5.1 Subsample Results

I perform the subsample estimation on two periods separately: from 1998 to 2006 and from 2007 to 2014. Both periods feature a flagging economy at the beginning followed by a recovery. The first period captures the dot-com bubble, and the second period spans the 2007-2008 financial crisis. Each subsample contains only the startups born during the corresponding period. For the first subsample, running from 1998 to 2006, I set a startup’s final status to “Survival” if it reaches “IPO/MA” or “Death” only during the second subsample period. Note that even if the model is correctly specified, the parameter estimates might drift due to the evolution of the VC industry and changes in the startup ecosystem. Table 9 gives the estimation results.

[Insert Table 9]

Comparing the two periods, the effect of the cumulative return on the implicit exit value is much stronger in 2007-2014. This holds for both funded and unfunded startups: the coefficient rises from 0.001 to 0.017 in the funded case and from 0.016 to 0.048 in the unfunded case. Thus, the probability of IPO/MA depends more on startups’ quality in 2007-2014, and more so for the unfunded ones.

Another difference relates to the number of locations covered by a VC syndicate. It has a significant effect on the selection of VC funding in 1998-2006, but in 2007-2014 its significance shifts to the influence of VC funding. More specifically, the number of locations has a significant effect on the subjective value added in 1998-2006, and this effect fades away in 2007-2014. In contrast, its effect on startup growth becomes significant in the more recent subsample. This indicates that VCs exert more effort for funded startups in the more recent period, while proximity to startups matters more for the investment decision in the earlier period.

Additionally, there are two distinct differences, one for previously unfunded startups and one for previously funded ones. For the unfunded startups, the period returns rise with the risk-free rate in 2007-2014 but fall with it in 1998-2006. This implies a stronger substitution effect in the later period but a strong income effect in the earlier period. For the funded startups, a top-school graduation dummy has a significant positive effect on startup growth only in 1998-2006. One explanation is that in the earlier period more entrepreneurs had to leave school in order to commit fully to building the startups that received VC funding. Fewer have to do so more recently, owing to advances in communication technology as well as the geographical expansion of the whole VC industry.

By and large, the parameter estimates do not shift much. Most importantly, the cumulative return continues to have a positive effect on the implicit exit value as well as the subjective value added in both subsamples.

5.2 Alternative Features

To curb the potential multicollinearity problem, I apply the principal factor method to

extract common factors separately from the macroeconomic variables, the constant and time-

varying startup features, the constant and time-varying VC features, and the constant and

time-varying pair features.34 Note that I include additional features (e.g. location, category

dummies for the startups, and school, field, degree dummies for the VCs’ and startups’

personnel) but extract fewer factors to keep the total number of factors low. Table 10 Panel

A gives the factor loadings on the original features. The goal of this exercise is to double-

check the interdependence between the selection and influence of VC funding rather than

the effect of any individual feature.

[Insert Table 10]

As shown in Table 10 Panel B, many factors lose significance in their effects on the growth of the funded startups. This results in a larger variance estimate. Both changes might be due to the reduction in the number of covariates. In contrast, the extracted factors still have significant effects on the growth of the unfunded startups and on the subjective value added, which determines the equilibrium funding decision. This might be due to the inclusion of many additional features in the extraction of the factors. The effects of the cumulative return on the implicit exit value and the subjective value added remain positive and significant.

34 The factor loadings are computed using the squared multiple correlations as estimates of the communality. By comparison, the principal component factor uses 1 for the communality for all pairs of variables.

5.3 Alternative Models

5.3.1 AR(1) in Subjective Value Added

One extension of the model is to introduce autocorrelation in the subjective value added. Intuitively, the autocorrelation structure implies that how much a VC syndicate and a startup value the synergy they can create as a team today depends on how much they valued it yesterday. More specifically, I change the dynamics of $v^{ij}_t$ given in Eq(4) to contain an additional term in the lagged value $v^{ij}_{t-1}$.35 Appendix C1 details the model and the estimation strategy.

[Insert Table 11]

Table 11 Panel A gives the parameter estimates. The AR(1) coefficient has an estimate of

0.009 and a t-value of 34.65. The effect is statistically significant but economically small. At

the same time, the estimates barely change for Eq(4) which determines the subjective value

added (e.g. the coefficient for the effect of the cumulative return on the subjective value

added is 0.034 in Table 3 and 0.033 in Table 11 Panel A). Interestingly, the effect of the

cumulative return on the implicit exit value decreases for both funded and unfunded cases

even though Eq(2) remains exactly the same. The reason is that the system of equations

(Eq(1), Eq(2), Eq(4)) needs to be estimated jointly. This again is due to the interdependence

between the selection and influence of VC funding that the model intends to capture.
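To see why a coefficient of 0.009 is economically small, consider a generic AR(1) process in which a one-unit shock to the subjective value added decays geometrically. The sketch below is a stylized illustration with a hypothetical function name; it ignores the other regressors in Eq(4).

```python
def impulse_response(rho, horizons):
    """Effect of a one-unit shock after 0, 1, ..., horizons periods
    in a generic AR(1) process x_t = rho * x_{t-1} + e_t."""
    return [rho ** k for k in range(horizons + 1)]


# AR(1) coefficient reported in Table 11 Panel A
response = impulse_response(0.009, 2)
for k, effect in enumerate(response):
    print(f"{k} period(s) after the shock: {effect:.6f}")
```

With this coefficient, less than 1% of a shock carries over to the next period, so yesterday's valuation has almost no bearing on today's.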

35 Note that the lagged value $v^{ij}_{t-1}$ is not meant to substitute for the effect of 1(funding tie). The latter dummy equals one if any member VC has funded the startup before, whereas $v^{ij}_{t-1}$ is related to the probability that the syndicate as a whole funded the startup in the previous period.

5.3.2 A Hierarchical Model for Startup Returns

It is well known that the startup return distribution has excess kurtosis. Thus, several papers use a Mixture-of-Normals to model the startup return distribution (Ewens 2009, Korteweg and Sørensen 2010). The underlying assumption is that there are different types of startups, e.g., winners, losers, and break-eveners, so the mean expected returns differ across types.36 On the other hand, there might also be a startup fixed-effect that accounts for the inborn differences among startups.

Therefore, I extend the baseline model to allow for a hierarchical structure on the startup returns such that the noise in the period returns is drawn from a Mixture-of-Normals. What is new here is that the probability mixture is a hidden startup fixed-effect that differs across startups. The extended model borrows from the Latent Dirichlet Allocation (LDA) model, which is well known in machine learning. It belongs to the class of hierarchical models documented in Korteweg and Sørensen (2014) and, more generally, to the class of Bayesian network models.

More specifically, the hidden startup fixed-effect is a startup-specific probability mixture. I denote it by $p_j \equiv (p^{(1)}_j, p^{(2)}_j, p^{(3)}_j)$ for startup $j$. The three elements of $p_j$ represent the probabilities that startup $j$ is a winner, loser, and break-evener, respectively. For the funded case, the mean period returns for the three types are $\mu^{(1)}_y$, $\mu^{(2)}_y$, and $\mu^{(3)}_y$; for the unfunded case, they are $\mu^{(1)}_n$, $\mu^{(2)}_n$, and $\mu^{(3)}_n$ instead. Given $p_j$, startup $j$’s period return defined in Eq(C6) follows a Mixture-of-Normals distribution with the probability mixture $p_j$. Appendix C3 details the model and the estimation strategy.

36 The terms to describe the type of startups are borrowed from Ewens 2009.

Table 11 Panel B gives the parameter estimates. Compared with Table 3, the estimates for the old parameters stay qualitatively the same. For the new parameters, the mean growth rates for the previously funded case (i.e., the $\mu_y$’s) for the winners, losers, and break-eveners are 0.011, -0.005, and 0.006, respectively. However, none of them is significantly different from 0. In contrast, the mean growth rates for the previously unfunded case (i.e., the $\mu_n$’s) for the three types are 1.564, -1.055, and 0.056, respectively. The mean returns for the winner and the break-evener types are significantly above 0, while for the loser type the mean return is significantly below 0. The difference across the three types is economically large, and more so for the unfunded case.
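The role of the type mixture can be illustrated with a small sketch that draws period-return noise from a three-component Mixture-of-Normals. The component means are the unfunded-case estimates quoted above; the common component standard deviation (`component_sd`), the example mixture `p_j`, and the function names are assumptions made only for illustration.

```python
import random

# Unfunded-case means for winner, loser, break-evener (Table 11 Panel B)
MU_UNFUNDED = (1.564, -1.055, 0.056)


def mixture_mean(p_j, means):
    """Expected value of the mixture: sum of p_j[k] * means[k]."""
    return sum(p * m for p, m in zip(p_j, means))


def draw_noise(p_j, means, component_sd, rng):
    """Pick a type with probabilities p_j, then draw from that type's normal."""
    k = rng.choices(range(len(means)), weights=p_j)[0]
    return rng.gauss(means[k], component_sd)


rng = random.Random(0)
p_j = (0.1, 0.5, 0.4)            # hypothetical startup-specific mixture
draws = [draw_noise(p_j, MU_UNFUNDED, 0.1, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
print(f"mixture mean: {mixture_mean(p_j, MU_UNFUNDED):.4f}")
print(f"sample mean:  {sample_mean:.4f}")
```

Because each startup carries its own `p_j`, two startups with identical observable features can still have different expected returns, which is the sense in which the mixture acts as a hidden fixed-effect.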

[Insert Figure 3]

Figure 3 plots the distribution of the hidden startup fixed-effect $p_j$. The triangle represents the sample space, and each vertex of the triangle represents a certain type.37 As shown in the figure, a large mass is concentrated on the line between the loser vertex and the break-evener vertex, and only a few startups have a probability greater than 0.5 of becoming a winner. Even the startup with the highest winning probability has a non-zero chance of mediocre performance. The distribution mirrors the fact that most startups end up mediocre or unsuccessful even though the returns for the successful ones are stunningly high.

6. Conclusion

In this paper, I study how VCs select startups to fund over multiple rounds in a dynamic setting. The model I build highlights the interdependence between the funding decision and the funding impact. Using a hand-collected database including both VC-funded and non-VC-funded startups, I develop an estimation strategy that uses a Gibbs sampler to jointly estimate the model parameters in a Bayesian framework.

37 The top vertex represents the winner type; the rightmost vertex represents the loser type; the leftmost vertex represents the break-evener type.

The results show that the selection and influence of VC funding are directly linked over time. The selection of funding depends on both startups’ quality and VCs’ expected influence in the future. Following funding, VCs’ influence improves startups’ quality and their probabilities of successful exits. The results suggest a joint effect of the selection and influence of VCs’ investments in a dynamic setting. Namely, funded startups have better quality, and thus they are more likely to get funded again in the future. A simulation experiment shows that under this joint effect, initial random differences in startups can magnify significantly over time.

This paper illustrates the dynamic aspect of investment, in which post-investment activities are relevant for decision making. Using venture capital as an example, the paper highlights the importance of jointly considering the investee’s value ex ante and the investor’s influence ex post. An item that is invested in frequently becomes a valuable asset to the whole economy because it capitalizes the impact of its past investors. While absent from a static setting, these insights shed light on some fundamental issues of strategic investment behavior.

Figure 1. Sequence of Events

This figure presents the sequence of events that happen at the end of $t$. The startups existing at the beginning of $t$ are $E_{t-1}$. After one period of growth, their growth and implicit exit value become $r_t$ and $s_t$, respectively. The laws of motion for $r_t$ and $s_t$ are given by Eq(1) and Eq(2). If the implicit exit value $s_t \ge \delta$, the startup goes public or gets acquired ($\mathrm{IPO/MA}_t$); if $s_t < -\delta$, the startup goes bankrupt ($D_t$); if $-\delta \le s_t < \delta$, the startup remains in the economy and is ready for another round of competition for VC funding. The remaining startups are called funding candidates ($J_t$). The funding is then determined endogenously as a matching between the group of VC syndicates $I$ and the funding candidates $J_t$. The funding decision is described in Section 2.3. After that, the newborn startups $N_t$ arrive, and the existing startups at the end of $t$ ($E_t$) consist of both $J_t$ and $N_t$.

Panels: Sequence of Events; B. Growth; C. Exit; D. Funding; E. Separate Growth Trajectory

Figure 2. Distribution of Imputed Values

This figure presents the distributions of the imputed values for the latent variables. Panel A gives the distributions for the period-to-period growth $(r^j_t - r^j_{t-1})$ separately for the funded and unfunded cases. Panel B gives the distributions for the implicit exit value $s^j_t$ separately for the funded and unfunded cases. Panel C gives the pairwise subjective value added $v^{ij}_t$ separately for the matched and unmatched pairs of startups and VC syndicates.

Panels: A. Period-to-Period Growth $r_t - r_{t-1}$; B. Implicit Exit Value $s_t$; C. Subjective Value Added $v_t$

Figure 3. Distribution of Type Mixture

This figure plots the distribution of the type mixture in a two-dimensional probability simplex for the extended model with the hierarchical structure in startup returns. Model details are given in Appendix C3. Each point in the simplex characterizes a startup-specific vector $p_j \equiv (p^{(1)}_j, p^{(2)}_j, p^{(3)}_j)$, which satisfies the condition $p^{(1)}_j + p^{(2)}_j + p^{(3)}_j = 1$. I call $p_j$ the startup-specific type mixture. The three elements in $p_j$ represent the probabilities of startup $j$ belonging to the winner, loser, and break-evener types. Therefore, the type mixture is a hidden startup fixed-effect. In the simplex, the top vertex represents the pure winner type, which has $p^{(1)} = 1$ or equivalently $p = (1, 0, 0)$; the leftmost vertex represents the pure break-evener type, which has $p^{(3)} = 1$ or equivalently $p = (0, 0, 1)$; and the rightmost vertex represents the pure loser type, which has $p^{(2)} = 1$ or equivalently $p = (0, 1, 0)$. At the center of gravity, the pink point has $p = (1/3, 1/3, 1/3)$.
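A type mixture can be placed in such a triangle with barycentric coordinates. The sketch below assumes an equilateral triangle with unit side length and the vertex layout described in the caption (winner at the top, break-evener at the bottom left, loser at the bottom right); the coordinate scale and the function name are assumptions, not taken from the figure.

```python
import math

# Assumed vertex positions: winner (top), loser (right), break-evener (left)
VERTICES = ((0.5, math.sqrt(3) / 2), (1.0, 0.0), (0.0, 0.0))


def simplex_to_xy(p):
    """Map p = (p_winner, p_loser, p_break_evener) to 2-D plot coordinates."""
    if abs(sum(p) - 1.0) > 1e-9 or min(p) < 0:
        raise ValueError("p must be non-negative and sum to 1")
    x = sum(weight * vx for weight, (vx, _) in zip(p, VERTICES))
    y = sum(weight * vy for weight, (_, vy) in zip(p, VERTICES))
    return x, y


center = simplex_to_xy((1 / 3, 1 / 3, 1 / 3))
print(center)
```

Points with a small winner probability map close to the bottom edge of the triangle, i.e., the line between the loser and break-evener vertices.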

Figure 4. Model Identification

This figure

Figure A1. Life Expectancy by Final Status

This figure compares the distributions of the life expectancy (in months) for startups that finally go public or get acquired (IPO&MA), go bankrupt (Death), or remain private in business (Survival). The sample covers the period from 1998 to 2014, 204 months in total.

Figure A2. Closest Distance Comparison

This figure compares the closest distance for the matched pairs of startups and VC syndicates and the unmatched pairs of startups and VC syndicates. The closest distance is the minimum distance (in miles) between all the locations where a startup has an office and all the locations where a member VC has an office.

Figure A3. Trace Plot of Key Parameter Estimates

The figure gives the trace plots for the estimates of the key parameters in the model. Panel A gives the estimates for the last elements in $\phi_{s,y}$ (Eq(2)), $\phi_{s,n}$ (Eq(2)), and $\phi_v$ (Eq(4)). They are denoted by $\phi_{s,y,-1}$, $\phi_{s,n,-1}$, and $\phi_{v,-1}$, respectively. They correspond to the effect of the cumulative return $r$ on the implicit exit value $s$ (with and without funding) and the effect of the cumulative return $r$ on the subjective value added $v$ in the baseline model. Panel B gives the estimates for $\rho_{r,y}$ and $\rho_{s,y}$. They correspond to the dependence of the subjective value added on the private information that will drive the startup growth and the implicit exit value one period ahead. The extended model that incorporates these two correlations is given in Appendix C2. Panel C gives the estimates of the AR(1) coefficient for the subjective value added in the extended model that incorporates an autoregressive structure. The AR(1) model is described in Appendix C1. Panels D and E give the estimates for $(\mu^{(1)}_y, \mu^{(2)}_y, \mu^{(3)}_y)$ and $(\mu^{(1)}_n, \mu^{(2)}_n, \mu^{(3)}_n)$ in the hierarchical return model. The new parameters represent the fixed-effects in startup growth for the pure winner, loser, and break-evener types, with and without funding, respectively. The hierarchical return model is described in detail in Appendix C3.

Panels: A. Baseline Model, Effect of Cumulative Return; B. Correlation Model, Dependence of Subjective Value Added; C. Autoregressive Model, AR(1) Parameter for Subjective Value Added; D. Hierarchical Return Model (1), Fixed-Effects in Growth by Type: Funded; E. Hierarchical Return Model (2), Fixed-Effects in Growth by Type: Unfunded

Figure A4. Propensity Score

This figure compares the propensity scores for the funded and unfunded startups. The propensity score (PS) matching method is used for the comparison of the two groups’ mean period returns and status variables in Table 7 Panel B. The covariates for the PS matching include startup-specific variables (e.g. # locations, # categories, # products, # startups founded) and macroeconomic variables (e.g. (Ybaa − Yus10), (rm − rf), smb, hml, and rf).

Figure A5. The Relationship among Key Startup Variables

This figure illustrates the relationship among the three key startup variables: (1) $r^j_t$: cumulative return, (2) $s^j_t$: implicit exit value, and (3) $v^{ij}_t$: subjective value added. Note that the funding decision is determined collectively by the entire set of all pairwise subjective value added.

Table 1. Sample Statistics

This table presents summary statistics for the sample. Panel A gives information on the overall sample. Panel B gives information on the final status of startups that are VC-funded and VC-unfunded. Panel C gives information on the size of VC syndicates. The size refers to the number of VCs in a syndicate. Panel D describes the distribution of the number of funding rounds experienced by the startups in the sample. Panel E describes the distribution of funding by funding rounds. The time period is from January 1998 to December 2014.

A. Sample

# Startups 9,303

# VC Syndicates 2,844

# VCs 755

# Time Periods 204

B. Startup Final Status

Status Death Survival IPO&MA Total

Unfunded 1,879 4,965 109 6,953

Funded 414 1,408 528 2,350

Total 2,293 6,373 637 9,303

C. VC Syndicate

# of VCs 1 2 3 4 5 6 7 8 9 10 11

Freq. 505 839 661 406 234 99 50 25 18 3 4

Percent. 17.76 29.5 23.24 14.28 8.23 3.48 1.76 0.88 0.63 0.11 0.14

D. Startup Funding

# of funding 0 1 2 3 4 5 6 7 8 9 10 11

Freq. 6,953 452 1,015 479 236 96 44 13 11 1 1 2

Percent 74.74 4.86 10.91 5.15 2.54 1.03 0.47 0.14 0.12 0.01 0.01 0.02

E. Funding by Rounds

Rounds of funding 1 2 3 4 5 6 7 8 9 10 11

Freq. 2350 1898 883 404 168 72 28 15 4 3 2

Percent 40.33 32.57 15.15 6.93 2.88 1.24 0.48 0.26 0.07 0.05 0.03

Table 2. Measure Statistics

This table presents summary statistics for the variables in Eq(1), Eq(2) and Eq(4). Panel A gives information on the cumulative return $r$ in Eq(1). Panels B to E give information on the right-hand-side variables $X$ in the three equations. Panel B describes the macroeconomic variables $X_t$. Panel C describes startup-specific features $X^j_t$, separately for constant and time-varying ones. Panel D describes VC syndicate-specific features $X^i_t$. Panel E describes startup-VC syndicate pair-specific features $X^{ij}_t$. There are 204 months, 9,303 startups, and 2,844 VC syndicates. In total, there are 520,715 startup-month observations, 580,176 VC syndicate-month observations, 26,457,732 startup-VC syndicate pairs, and 1,650,020,544 startup-VC syndicate-month observations. The detailed construction of these variables is given in Appendix B.

A. Cumulative Returns

Mean Std Skew Kurt min p5 p10 p25 p50 p75 p90 p95 max

6.58 9.46 0.31 1.23 -8.19 -2.3 -2.3 -2.3 0.53 17.06 18.78 19.59 26.19

B. Macroeconomic Variables

N Mean Std min p25 p50 p75 max

(Ybaa − Yus10) (%) 204 2.62 0.8 1.56 2.11 2.57 2.98 6.01

(rm − rf ) (%) 204 0.48 4.68 -17.23 -2.09 1.19 3.49 11.35

smb (%) 204 0.28 3.56 -16.41 -1.63 0.19 2.28 22.02

hml (%) 204 0.21 3.4 -12.61 -1.49 0.08 1.71 13.89

rf (%) 204 0.18 0.17 0 0.01 0.13 0.37 0.56

C. Startup Features


Constant

# locations 9,303 1.21 1.01 1 1 1 1 66

# categories 9,303 1.85 1.3 1 1 1 2 14

# products 9,303 0.43 1.97 0 0 0 0 134

Time-varying

t from last round 520,715 3.02 3.01 0 0.83 2 4.25 16.92

t2 from last round 520,715 18.14 34.91 0 0.69 4 18.06 286.17

# rounds 520,715 0.32 0.81 0 0 0 0 10

# startups founded 520,715 0.17 0.47 0 0 0 0 5

1(top20 school) 520,715 0.3 0.46 0 0 0 1 1

D. VC Syndicate Features


Constant

# of VCs 2,844 2.93 1.63 1 2 3 4 11

# locations 2,844 6.37 5.40 1 3 5 9 75

Time-varying

1(cooperated) 580,176 0.33 0.47 0 0 0 1 1

# rounds 580,176 5.16 9.65 0 0 1 6 116

# categories 580,176 8.11 12.87 0 0 1 11 87

1(top20 school) 580,176 0.86 0.35 0 1 1 1 1

E. Startup-VC Syndicate Features


Constant

distance 2.65× 107 2893.67 3521.56 0 79.48 1200.26 4313.19 19932.96

[0, 100] 2.65× 107 0.25 0.43 0 0 0 1 1

(100, 1000] 2.65× 107 0.21 0.4 0 0 0 0 1

(1000, max] 2.65× 107 0.54 0.5 0 0 1 1 1

days of travel 2.65× 107 2.29 0.84 0 1 2 2 3

Time-varying

1(funding tie) 1.65× 109 0.61%

1(alumni tie) 1.65× 109 20.76%

Table 3. Estimation Result

This table presents the estimation result for the baseline model described by the system of equations Eq(1), Eq(2) and Eq(4). The first two columns give the estimates for $\phi_{r,y}$ and $\phi_{r,n}$ in Eq(1) for the law of motion of the cumulative return $r$. The third and fourth columns give the estimates for $\phi_{s,y}$ and $\phi_{s,n}$ in Eq(2) for the determinants of the implicit exit value $s$. The last column gives the estimates for $\phi_v$ in Eq(4) for the determinants of the subjective value added $v$. Details of the estimation are given in Appendix A. Numbers in brackets are the t-statistics. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

Influence Selection

Growth (r) Exit Value (s) Value Added (v)

Funded Unfunded Funded Unfunded

[1] [2] [3] [4] [5]

r 0.028 *** 0.063 *** 0.034 ***

(30.78) (40.17) (178.94)

sigma2 0.01 *** 0.092 *** 1.124 *** 1.373 ***

(103.37) (1287.12) (11.79) (28.35)

intercept 0.004 *** 0.004 *** -0.251 -0.447 ***

(2.91) (44.61) (1.35) (3.25)

Macro Variables

(Ybaa − Yus10) -0.001 -0.013 *** 0.005 0.024

(0.90) (197.74) (0.16) (1.02)

(rm − rf ) 0.007 0.032 *** -0.045 0.573

(0.72) (41.62) (0.10) (1.62)

smb -0.023 0.006 *** -1.332 -0.929 *

(0.95) (5.03) (1.40) (1.92)

hml 0.036 * 0.099 *** -0.045 -0.250

(1.67) (48.54) (0.04) (0.54)

rf 0.017 0.196 *** 0.072 0.370 **

(0.21) (57.13) (0.30) (2.48)

Startup Features

# locations 0.003 *** 0.017 *** 0.009 -0.004 -0.853 ***

(5.26) (406.38) (0.43) (0.29) (6.23)

# categories -0.009 *** -0.004 *** -0.007 -0.006 -0.273 ***

(4.90) (171.65) (0.29) (0.75) (9.10)

# products 0.004 *** 0.004 *** 0.007 0.015 0.024 ***

(2.67) (85.13) (0.34) (1.06) (7.71)

t from last round 0.000 0.041

(0.35) (1.66)

t2 from last round -0.040 -0.046 ***

(0.11) (3.00)

# rounds 0.008 0.001 *** 0.226 -0.001 0.022 ***

(0.00) (187.20) (0.05) (1.35) (35.88)

# startups founded -0.005 0.005 *** -0.012 0.013 -0.158 ***

(1.49) (11.48) (0.36) (0.45) (12.09)

1(top20 school) -0.019 *** 0.154 *** 0.005 -0.037 -0.434 ***

(10.38) (372.63) (0.09) (0.91) (35.30)

VC Syndicate Features

# of VCs -0.003 *** 0.006 -0.342 ***

(18.15) (0.22) (10.27)

# locations 0.000 *** 0.002 0.001 ***

(8.26) (0.49) (6.60)

1(cooperated) 0.001 *** 0.002 0.030 ***

(12.59) (0.46) (12.99)

# rounds -0.001 -0.068 0.101 ***

(0.64) (0.53) (15.29)

# categories 0.000 *** -0.001 -0.019 ***

(8.08) (0.59) (11.38)

1(top20 school) 0.039 *** -0.001 -0.647 ***

(18.08) (0.02) (24.72)

Pair Features

days of travel -0.146 0.295 -0.210 ***

(1.66) (0.86) (28.02)

1(funding tie) 0.004 *** -0.251 4.486 ***

(4.02) (1.35) (10.36)

1(alumni tie) 0.003 *** -0.003 0.117 ***

(18.62) (0.24) (35.28)

Table 4. Previously Funded vs. Previously Unfunded

This table compares the means of the period return $(r_{t+1} - r_t)$, the implicit exit value $s_t$, the IPO&MA rate, and the death rate for the previously funded and unfunded startups. Period return* is winsorized at the 1st and 99th percentiles separately for the subsamples of previously funded and unfunded startups. Panel A uses all previously unfunded startups as the control group. Panel B uses only comparable startups identified by the Nearest Neighbor (NN) method (with 1 or 5 neighbors) or the Propensity Score (PS) method (with a probit or logit treatment model). The covariates for the identification of the comparable startups include startup-specific variables (e.g. # locations, # categories, # products, # startups founded) and macroeconomic variables (e.g. (Ybaa − Yus10), (rm − rf), smb, hml, and rf). Numbers in brackets are the t-statistics. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

A. Previously Funded vs. Previously Unfunded

Period return* Period return Exit value IPO&MA rate(%) Death rate (%)

Funded 0.034 0.034 0.044 0.5 0.4

Unfunded -0.017 0.051 -0.197 0.02 0.49

Difference 0.05 *** -0.017 0.241 *** 0.48 *** -0.09

Test (t or chi2) 10.501 0.901 16.232 164.91

B. Previously Funded vs. Comparable Previously Unfunded

Period return* Period return Exit value

Nearest Neighbor (nn1) 0.052 *** -0.039 0.206 ***

(9.62) (0.57) (9.84)

Nearest Neighbor (nn5) 0.052 *** -0.042 0.223 ***

(10.59) (0.81) (12.64)

Propensity Score (probit) 0.051 *** -0.024 0.201 ***

(6.88) (0.58) (9.86)

Propensity Score (logit) 0.051 *** -0.025 0.199 ***

(8.97) (0.79) (9.74)
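The winsorization applied to Period return* can be sketched as follows. The nearest-rank percentile convention and the function name are assumptions; the table note does not state which percentile definition is used.

```python
import math


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values to the [lower_pct, upper_pct] percentile range
    (nearest-rank percentiles)."""
    ordered = sorted(values)
    n = len(ordered)

    def percentile(pct):
        rank = max(1, math.ceil(pct * n / 100))   # 1-based nearest rank
        return ordered[rank - 1]

    low, high = percentile(lower_pct), percentile(upper_pct)
    return [min(max(v, low), high) for v in values]


returns = list(range(1, 201))          # toy data: 1, 2, ..., 200
clipped = winsorize(returns)
print(min(clipped), max(clipped))
```

Following the table note, the function would be applied separately to the previously funded and previously unfunded subsamples.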

Table 5. Constrained Estimation

This table presents the estimation results for the constrained model in which the coefficients associated with the intercept term, the macroeconomic variables, and the startup-specific features are constrained to be the same for funded and unfunded startups. This is equivalent to assuming that funded and unfunded startups have the same laws of motion for their cumulative return $r$ and implicit exit value $s$, and that the VC-related features for the unfunded startups are set to zero. Therefore, Eq(1) and Eq(2) become Eq(A28) and Eq(A29). Details of the estimation are given in Appendix A3. Numbers in brackets are the t-statistics. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

Influence Selection


[1] [2] [3]

r 0.081 *** 0.033 ***

(43.10) (163.69)

sigma2 0.085 *** 1.584 ***

(595.14) (209.15)

intercept 0.052 *** -0.934 ***

(70.07) (7.63)

Macro Variables

(Ybaa − Yus10) 0.052 *** -0.044 **

(70.07) (2.41)

(rm − rf ) -0.011 *** 0.785 ***

(24.03) (2.94)

smb 0.008 *** -2.095 ***

(16.17) (4.06)

hml 0.099 *** -0.085

(6.35) (0.21)

rf 0.226 *** 0.569 ***

(8.44) (5.79)

Startup Features

# locations 0.02 *** 0.01 -0.849 ***

(71.08) (0.63) (6.26)

# categories -0.009 *** 0.000 -0.273 ***

(62.09) (0.02) (8.93)

# products 0.001 *** 0.014 0.023 ***

(26.69) (1.18) (7.16)

t from last round 0.052 ***

(10.81)

t2 from last round -0.026 ***

(8.19)

# rounds 0.002 *** -0.003 0.022 ***

(454.42) (1.28) (44.80)

# found startups -0.003 -0.066 -0.164 ***

(0.30) (1.55) (10.54)

1(top20 school) -0.017 *** 0.042 -0.435 ***

(32.69) (0.62) (34.76)


# of VCs -0.004 *** 0.048 * -0.344 ***

(7.99) (1.67) (10.28)

# locations 0.001 *** 0.006 0.001 ***

(5.37) (1.35) (5.21)

1(cooperated) 0.001 *** 0.005 * 0.03 ***

(9.46) (1.94) (12.58)

# rounds 0.01 *** -0.258 ** 0.105 ***

(2.41) (1.98) (4.76)

# categories 0.000 *** -0.004 -0.02 ***

(3.76) (1.35) (11.43)

1(top20 school) 0.042 *** -0.05 -0.648 ***

(7.11) (0.38) (25.79)

Pair Features

days of travel -0.021 *** 0.591 -0.696 ***

(3.74) (1.51) (12.16)

1(funding tie) -0.119 -0.652 4.52 ***

(0.36) (1.38) (10.13)

1(alumni tie) 0.005 *** -0.024 0.117 ***

(5.27) (1.16) (41.81)

Table 6. Expected VC Impact & Startup Future

This table presents the correlations between the subjective value added $v_t$ and the future period return $(r_{t+\tau} - r_{t+\tau-1})$ as well as the future implicit exit value $s_{t+\tau}$ for startups that are selected to be funded at $t$. The correlations are calculated as $\mathrm{corr}(v^{ij}_t, r^j_{t+\tau} - r^j_{t+\tau-1})$ and $\mathrm{corr}(v^{ij}_t, s^j_{t+\tau})$, with $\tau$ varying from 1 to 12 (months). Numbers in brackets are the t-statistics. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

Period return Exit value

1 0.114 *** 0.056 ***

(30.00) (12.06)

2 0.089 * 0.125 **

(1.78) (2.26)

3 0.059 -0.025

(0.24) (0.04)

4 0.03 0.043

(0.06) (0.12)

5 0.019 -0.029

(0.03) (0.06)

6 0.059 -0.03

(0.23) (0.05)

7 0.065 0.103 **

(0.29) (2.50)

8 0.102 * -0.048

(1.59) (0.14)

9 0.126 ** 0.038

(5.33) (0.08)

10 0.106 * -0.02

(1.63) (0.03)

11 0.105 * 0.061

(1.45) (0.24)

12 0.07 0.07

(0.27) (0.37)

Table 7. Selection on Private Information

This table presents the estimation results for the extended correlation model in which Eq(1) contains an additional term $\rho_{r,y}\xi^{ij}_{t-1}$ for the funded case and Eq(2) contains an additional term $\rho_{s,y}\xi^{ij}_{t-1}$ for the funded case. The estimates for $\rho_{r,y}$ and $\rho_{s,y}$ give the dependence of the subjective value added on the private information that will drive the startup growth and the implicit exit value one period ahead. They are denoted by “rho” in the table. Details of the extended model are given in Appendix C2. As before, the first two columns give the estimates for $\phi_{r,y}$ and $\phi_{r,n}$, the third and fourth columns give the estimates for $\phi_{s,y}$ and $\phi_{s,n}$, and the last column gives the estimate for $\phi_v$. Numbers in brackets are the t-statistics. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

Influence Selection



[1] [2] [3] [4] [5]

rho 0.009 *** 0.073 ***

(42.32) (5.43)

r 0.017 *** 0.044 *** 0.033 ***

(37.90) (77.06) (327.79)

sigma2 0.011 *** 0.092 *** 1.256 *** 1.284 ***

(46.21) (1316.04) (49.91) (127.15)

intercept 0.006 *** 0.005 *** -0.29 -0.464 ***

(7.08) (50.29) (1.28) (4.47)

Macro Variables

(Ybaa − Yus10) -0.001 -0.013 *** 0.008 0.027

(0.99) (183.77) (0.27) (1.19)

(rm − rf ) 0.009 0.026 *** 0.103 0.61

(0.61) (37.22) (0.18) (1.26)

smb -0.026 0.008 *** -1.644 -0.884 *

(1.12) (7.27) (1.54) (1.77)

hml 0.034 0.105 *** -0.095 -0.205

(1.60) (47.28) (0.10) (0.33)

rf 0.013 0.195 *** 0.133 0.391 **

(0.96) (54.45) (0.50) (1.99)

Startup Features

# locations 0.002 *** 0.017 *** 0.007 -0.014 -0.849 ***

(12.89) (440.09) (0.36) (1.43) (6.28)

# categories -0.009 *** -0.004 *** -0.01 0.001 -0.271 ***


(11.70) (175.47) (0.47) (0.10) (8.88)

# products 0.004 *** 0.004 *** 0.013 0.008 0.024 ***

(8.33) (92.83) (0.55) (0.85) (6.15)

t from last round 0.000 -0.238

(0.12) (1.59)

t2 from last round -0.078 -0.046 ***

(0.20) (3.16)

# rounds 0.05 0.001 *** 0.617 -0.001 0.022 ***

(0.20) (184.14) (0.13) (1.19) (40.95)

# found startups -0.004 0.005 *** -0.012 0.004 -0.164 ***

(0.54) (11.75) (0.35) (0.14) (12.10)

1(top20 school) -0.021 *** 0.154 *** -0.003 -0.036 -0.434 ***

(6.20) (403.86) (0.05) (1.41) (30.54)


# of VCs -0.005 *** 0.009 -0.347 ***

(12.52) (0.36) (9.98)

# locations 0.000 *** 0.002 0.001 ***

(3.32) (0.57) (6.84)

1(cooperated) 0.001 *** 0.002 0.03 ***

(60.80) (0.66) (12.53)

# rounds 0.000 -0.012 0.105 ***

(1.27) (0.10) (5.51)

# categories 0.000 *** -0.002 -0.02 ***

(16.51) (0.68) (11.34)

1(top20 school) 0.038 *** -0.011 -0.643 ***

(14.66) (0.17) (23.59)

Pair Features

days of travel -0.029 *** 0.291 -0.703 ***

(2.04) (0.68) (12.70)

1(funding tie) 0.006 *** -0.29 4.484 ***

(7.08) (1.28) (9.75)

1(alumni tie) 0.003 *** -0.005 0.118 ***

(63.38) (0.30) (27.67)


Table 8. Simulation Results: Model 1 vs. Model 2

This table compares the difference in means of the period return $(r_{t+1} - r_t)$, the implicit exit value $s$, the IPO&MA rate, and the death rate for the two models in the simulation. Model 1 assumes random selection of funding for all periods to exclude the chain effect. Model 2 assumes random selection of funding only in the first period. I simulate 100 startups that have ex-ante homogeneous features, which are set to the in-sample means of these features. The startups also have synchronous births and the same cumulative return at the beginning. I assume there is one VC with a funding quota of 50, so each startup gets funded with a probability of 0.5 at any time. Panel A compares the values for previously funded and unfunded startups for the two models. Panel B compares the transition matrix of current funding given previous funding for the two models. Panel C compares the values $\tau$ periods ahead for currently funded and unfunded startups, with $\tau$ ranging from 1 to 10. Numbers in brackets are the t-statistics. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

A. Previously Funded vs. Previously Unfunded

1. Random Selection for All Periods

Period return Exit value IPO&MA rate (%) Death rate (%)

Funded 0.515 0.577 1.327 0.207

Unfunded 0.319 0.326 1.067 0.249

Difference 0.196 0.251 0.26 -0.041

t-stat 0.74 0.61 1.002 0.363

2. Random Selection Only for the First Period


Funded 0.956 0.828 1.886 0.179

Unfunded 0.106 -0.126 0.811 0.516

Difference 0.851 *** 0.954 *** 1.075 *** -0.337 *

t-stat 2.92 2.45 2.456 1.645

B. Transition Matrix

                     1. Random Selection        2. Random Selection
                        for All Periods            Only for the First Period
                     Currently Funded           Currently Funded
Previously Funded    N        Y                 N        Y
N                    0.502    0.498             0.931    0.069
Y                    0.533    0.467             0.041    0.959


C. Future Difference for Initially Funded vs. Unfunded

1. Random Selection for All Periods


1 0.196 0.251 0.26 -0.041

2 0.278 0.082 0.094 -0.015

3 0.175 0.037 0.081 -0.015

4 0.082 0.17 -0.007 0.075

5 0.069 0.024 0.271 0.045

6 0.113 0.247 0.639 *** -0.017

7 -0.026 0.215 0.291 -0.016

8 0.02 -0.06 0.822 *** -0.241 ***

9 0.01 0.336 0.43 -0.209

10 0.034 0.087 0.222 -0.011

2. Random Selection Only for the First Period


1 0.851 *** 0.954 *** 1.075 *** -0.337 *

2 0.906 *** 1.077 *** 1.124 *** -0.349 *

3 0.909 *** 1.214 *** 1.278 *** -0.284

4 0.881 *** 1.052 ** 1.256 *** -0.215

5 0.856 *** 1.023 ** 1.336 *** -0.225

6 0.879 *** 1.35 *** 1.312 *** -0.344

7 0.857 *** 1.218 ** 1.175 *** -0.358

8 1.024 *** 1.34 ** 1.256 *** -0.281

9 1.114 *** 1.677 *** 1.316 *** -0.243

10 1.359 *** 1.549 *** 1.4 *** -0.233


Table 9. Estimation for Subsamples

This table presents the estimation results using two subsamples for the baseline model described by the system of equations Eq(1), Eq(2), and Eq(4), as in Table 3. Panel A and Panel B give the results for 1998-2006 and 2007-2014, respectively. The subsamples only contain startups that are born during those sub-periods. The first two columns give the estimates for $\phi_{r,y}$ and $\phi_{r,n}$. The third and fourth columns give the estimates for $\phi_{s,y}$ and $\phi_{s,n}$. The last column gives the estimates for $\phi_v$. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

A. 1998-2006

Influence Selection



[1] [2] [3] [4] [5]

r 0.001 ** 0.016 *** 0.033 ***

(2.08) (42.55) (123.33)

sigma2 0.011 *** 0.108 *** 1.524 *** 1.473 ***

(58.52) (131.29) (24.17) (61.29)

intercept 0.052 *** 0.028 *** 0.136 -0.264 **

(16.17) (28.04) (0.53) (2.34)

Macro Variables

(Ybaa − Yus10) -0.009 *** -0.051 *** -0.019 0.051

(2.92) (478.79) (0.15) (1.47)

(rm − rf ) 0.056 0.04 *** 0.267 0.093

(0.60) (31.51) (0.15) (0.28)

smb -0.036 -0.074 *** -0.379 0.349

(0.56) (40.50) (0.25) (0.82)

hml -0.055 -0.048 *** 0.114 0.728

(0.39) (15.57) (0.06) (1.56)

rf -0.024 -0.101 *** -0.111 0.072

(0.74) (84.64) (0.25) (0.56)

Startup Features

# locations -0.004 *** 0.013 *** 0.008 0.007 -0.154 ***

(10.00) (473.09) (0.27) (0.86) (9.02)

# categories 0.002 -0.002 *** 0.014 0.003 -0.44 ***

(0.18) (69.35) (0.29) (0.20) (6.06)

# products 0.002 *** 0.006 *** -0.002 -0.001 0.051 ***


(24.60) (265.24) (0.15) (0.13) (10.27)

t from last round 0.000 -0.099

(0.03) (0.53)

t2 from last round -0.067 -0.008

(0.06) (0.27)

# rounds 0.047 0.005 *** 1.013 0.003 0.08 ***

(0.00) (1384.51) (0.08) (0.86) (14.70)

# found startups -0.001 0.051 *** -0.002 -0.013 -0.125 ***

(0.23) (73.39) (0.03) (0.32) (16.06)

1(top20 school) -0.012 *** 0.176 *** 0.026 0.021 -0.465 ***

(3.47) (694.07) (0.22) (0.88) (11.80)


# of VCs -0.009 *** -0.021 -0.319 ***

(13.36) (0.50) (7.60)

# locations 0.000 0.001 0.023 ***

(0.57) (0.06) (12.42)

1(cooperated) 0.002 *** 0.002 0.032 ***

(5.21) (0.14) (7.30)

# rounds -0.001 -0.043 0.34 ***

(0.46) (0.29) (31.20)

# categories 0.001 ** 0.000 -0.03 ***

(2.54) (0.01) (2.84)

1(top20 school) -0.004 -0.151 -0.774 ***

(0.76) (0.71) (49.66)

Pair Features

days of travel 0.004 -0.05 -1.13 ***

(0.82) (0.22) (22.80)

1(funding tie) 0.052 *** 0.136 6.73 ***

(16.62) (0.53) (10.51)

1(alumni tie) 0.003 *** -0.001 0.07 ***

(16.05) (0.06) (6.62)


B. 2007-2014

Influence Selection



[1] [2] [3] [4] [5]

r 0.017 ** 0.048 *** 0.035 ***

(22.78) (56.00) (70.00)

sigma2 0.036 *** 0.159 *** 1.361 *** 1.587 ***

(36.96) (60.03) (36.58) (57.20)

intercept 0.019 *** 0.085 *** -0.117 -0.147

(4.07) (40.93) (0.29) (1.37)

Macro Variables

(Ybaa − Yus10) -0.005 -0.002 *** 0.001 -0.002

(0.20) (6.51) (0.01) (0.08)

(rm − rf ) 0.072 0.008 * 0.138 -0.276

(0.11) (1.93) (0.15) (0.50)

smb -0.094 -0.009 * -0.457 -0.235

(0.14) (1.90) (0.23) (0.27)

hml -0.111 -0.034 *** -0.773 -0.113

(0.16) (8.71) (0.44) (0.17)

rf 0.227 0.261 *** 0.181 -0.041

(0.03) (6.17) (0.15) (0.13)

Startup Features

# locations -0.001 0.049 *** -0.015 0.015 -1.024 ***

(0.41) (400.97) (0.52) (0.66) (8.35)

# categories -0.002 -0.012 *** 0.02 -0.004 -0.27 ***

(0.29) (182.43) (0.71) (0.31) (10.29)

# products 0.000 0.005 *** -0.002 0.004 0.035 ***

(0.02) (63.69) (0.07) (0.40) (18.53)

t from last round 0.001 0.351 *

(0.47) (1.67)

t2 from last round -0.04 -0.11 ***

(0.05) (2.60)

# rounds -0.062 0.011 *** 0.337 0.027 *** 0.13 ***

(0.00) (944.51) (0.03) (3.43) (25.82)

# found startups -0.017 0.016 *** 0.067 0.036 -0.001

(0.63) (25.75) (1.28) (1.29) (0.32)


1(top20 school) 0.007 0.178 *** 0.002 -0.041 -0.28 ***

(0.81) (187.21) (0.02) (1.03) (11.46)


# of VCs 0.001 -0.004 -0.485 ***

(1.22) (0.08) (15.05)

# locations 0.002 *** -0.004 -0.002

(7.79) (0.32) (0.90)

1(cooperated) 0.000 0.003 0.042 ***

(1.10) (0.45) (19.62)

# rounds 0.008 0.037 0.047 ***

(1.33) (0.16) (5.63)

# categories 0.000 -0.002 -0.026 ***

(0.21) (0.35) (16.58)

1(top20 school) 0.004 0.082 -0.458 ***

(0.40) (0.53) (19.78)

Pair Features

days of travel -0.037 *** -0.164 -0.356 ***

(4.13) (0.22) (3.97)

1(funding tie) 0.019 *** -0.117 6.179 ***

(4.57) (0.29) (11.09)

1(alumni tie) 0.001 -0.004 0.065 ***

(0.77) (0.12) (8.29)


Table 10. Estimation with Alternative Features

This table presents the estimation results using the principal factors extracted from the various measures described in Section 3.2 as covariates. Panel A gives the factor loadings separately for the macroeconomic variables, constant and time-varying startup features, VC syndicate features, and pair features. I use $f^t$, $f^j$, $f^{tj}$, $f^i$, $f^{ti}$, $f^{tji}$ to denote these factors. Note that there is no $f^{ji}$ because days (days of round-trip travel between a VC syndicate and a startup) is the only startup-VC time-invariant feature. Panel B gives the estimation result. The first two columns give the estimates for $\phi_{r,y}$ and $\phi_{r,n}$. The third and fourth columns give the estimates for $\phi_{s,y}$ and $\phi_{s,n}$. The last column gives the estimates for $\phi_v$. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

A. Factor Loadings

Macro Variables Pair Features

f t f1 t f2 t Constant

(Ybaa − Yus10) 0.328 -0.277 f ji: days of travel

(rm − rf ) 0.135 0.247 Time-varying

smb 0.204 0.286 f tji f1 tji

hml -0.187 -0.226 1(funding tie) 0.5095

rf -0.355 0.193 1(alumni tie) 0.0431

Startup Features VC Syndicate Features

Constant Constant

f j f1 j f i f1 i

# locations 0.558 # of VCs 0.543

# categories 0.128 # locations 0.849

# products 0.308 Others: location dummies

Others: location/category dummies Time-varying

Time-varying f ti f1 ti f2 ti

f tj f1 tj f2 tj 1(cooperated) 0.112 0.16

# rounds 0.085 0.107 # rounds 0.116 0.23

t from last round -0.387 0.495 # categories 0.127 -0.423

t2 from last round -0.044 0.23 1(top20 school) 0.26 -0.145

# startup founded 0.026 0.008 Others: category/school dummies

1(top20 school) 0.207 0.067 degree/major dummies


B. Parameter Estimates

Influence Selection



[1] [2] [3] [4] [5]

r 0.083 *** 0.077 *** 0.039 ***

(7.85) (97.29) (141.25)

sigma2 0.04 *** 0.096 *** 1.622 *** 0.893 ***

(76.87) (2019.42) (50.34) (222.54)

intercept 0.655 *** -0.025 *** -0.42 0.062

(42.47) (215.12) (0.22) (0.29)

Macro Factors

f1 t -0.005 -0.018 *** -0.189 -0.04 **

(0.17) (82.48) (0.70) (2.00)

f2 t 0.001 0.005 *** -0.274 0.007

(0.03) (15.06) (0.68) (0.28)

Startup Factors

f1 j 0.002 0.014 *** 0.056 0.027 * -0.012 ***

(0.27) (142.44) (0.66) (1.69) (3.54)

f1 tj -0.008 -0.074 *** 0.171 0.894 ** 4.325 ***

(0.72) (412.23) (0.09) (2.54) (9.60)

f2 tj -0.009 -0.038 *** -0.154 0.38 *** 0.91 ***

(0.17) (584.25) (0.04) (4.50) (10.54)

f3 tj 0.008 0.262 *** -0.049 -1.248 ** -5.683 ***

(1.49) (1106.98) (0.06) (2.51) (8.69)

VC Syndicate Factors

f1 i 0.006 0.075 0.11 ***

(0.45) (0.52) (10.65)

f1 ti 0.004 0.066 0.233 ***

(0.17) (0.53) (30.77)

f2 ti -0.006 -0.039 0.337 ***

(0.65) (0.44) (33.17)

Pair Factors

days of travel 0.000 0.000 -0.042 ***

(0.57) (0.65) (23.88)

f1 tji 0.081 *** 0.381 1.096 ***

(49.14) (0.15) (50.04)


Table 11. Estimation for Alternative Models

This table presents the estimation results for the alternative models. Panel A presents the result for the extended model with an AR(1) structure in the subjective value added. The AR(1) coefficient is denoted by "lagged v" in the table. Details of the model are given in Appendix C1. Panel B presents the result for the extended model with a hierarchical structure in startup returns. The fixed effects in startup growth for the pure types of winner, loser, and break-evener are denoted by $\mu^{(1)}_y$, $\mu^{(2)}_y$, $\mu^{(3)}_y$ for the previously funded case and by $\mu^{(1)}_n$, $\mu^{(2)}_n$, $\mu^{(3)}_n$ for the previously unfunded case. Details of the model are given in Appendix C3. The first two columns give the estimates for $\phi_{r,y}$ and $\phi_{r,n}$. The third and fourth columns give the estimates for $\phi_{s,y}$ and $\phi_{s,n}$. The last column gives the estimates for $\phi_v$. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.

A. Autocorrelation in Subjective Value Added

Influence Selection



[1] [2] [3] [4] [5]

lagged v 0.009 ***

(34.65)

r 0.017 *** 0.04 *** 0.033 ***

(34.90) (66.66) (48.87)

sigma2 0.011 *** 0.092 *** 1.385 *** 1.338 ***

(97.84) (773.63) (90.78) (95.24)

intercept 0.004 *** 0.005 *** -0.407 * -0.466 ***

(3.03) (54.83) (1.81) (4.41)

Macro Variables

(Ybaa − Yus10) -0.001 -0.013 *** 0.012 0.029

(0.86) (18.59) (0.40) (1.22)

(rm − rf ) 0.007 0.026 *** 0.104 0.548 *

(0.74) (38.15) (0.21) (1.67)

smb -0.023 0.008 *** -1.78 * -0.721

(0.88) (7.04) (1.82) (1.28)

hml 0.036 0.105 *** -0.085 -0.177

(1.59) (4.68) (0.10) (0.33)

rf 0.017 0.196 *** 0.145 0.381 ***

(0.19) (5.60) (0.58) (3.39)

Startup Features

# locations 0.003 *** 0.017 *** 0.009 -0.002 -0.861 ***


(5.40) (4113.36) (0.44) (0.12) (6.24)

# categories -0.009 *** -0.004 *** -0.01 0.001 -0.282 ***

(4.35) (1622.62) (0.52) (0.07) (8.87)

# products 0.004 *** -0.004 *** 0.004 0.008 0.026 ***

(2.33) (1011.76) (0.16) (0.78) (8.21)


(0.11) (1.74)

t2 from last round -0.047 -0.051 ***

(0.13) (3.53)

# rounds -0.025 0.001 *** 0.135 -0.002 ** 0.023 ***

(0.11) (199.13) (0.03) (1.99) (39.64)

# found startups -0.005 0.005 *** -0.012 0.002 -0.164 ***

(1.55) (122.57) (0.39) (0.05) (11.06)

1(top20 school) -0.019 *** 0.154 *** -0.008 -0.039 -0.438 ***

(10.80) (374.70) (0.16) (1.31) (44.48)


# of VCs -0.003 *** 0.005 -0.314 ***

(18.39) (0.18) (11.45)

# locations 0.000 *** 0.001 0.002 ***

(8.36) (0.35) (5.77)

1(cooperated) 0.001 *** 0.002 0.028 ***

(13.40) (0.59) (14.68)

# rounds -0.001 -0.035 0.107 ***

(0.69) (0.31) (6.35)

# categories 0.000 *** -0.002 -0.019 ***

(7.65) (0.69) (13.13)

1(top20 school) 0.039 *** 0.003 -0.638 ***

(17.30) (0.04) (26.34)

Pair Features

days of travel -0.162 * 0.55 -0.72 ***

(1.76) (1.39) (13.70)

1(funding tie) 0.004 *** 0.407 * 4.198 ***

(4.17) (1.81) (10.15)

1(alumni tie) 0.003 *** -0.002 0.107 ***

(19.84) (0.14) (43.43)


B. Hierarchical Model for Return

Influence Selection



[1] [2] [3] [4] [5]

mu1 0.011 1.564 ***

(1.01) (33.16)

mu2 -0.005 -1.055 ***

(0.45) (65.20)

mu3 0.006 0.056 ***

(0.39) (46.03)

r 0.024 *** 0.057 *** 0.036 ***

(39.92) (57.85) (264.89)

sigma2 0.007 *** 0.105 *** 1.403 *** 1.211 ***

(304.83) (642.93) (28.02) (55.72)

intercept -0.397 *** -0.686 ***

(2.61) (4.40)

Macro Variables

(Ybaa − Yus10) 0.000 -0.004 *** 0.021 0.088 ***

(0.56) (4.44) (0.69) (3.35)

(rm − rf ) 0.017 *** -0.018 *** 0.425 0.465 **

(0.51) (4.52) (0.59) (2.00)

smb 0.013 0.043 *** -0.32 -2.655 ***

(0.21) (4.42) (0.37) (6.61)

hml -0.024 0.022 *** -0.856 -1.032 *

(0.31) (2.59) (1.12) (1.95)

rf 0.015 ** 0.163 *** 0.297 0.564 ***

(2.06) (14.82) (1.19) (5.19)

Startup Features

# locations 0.000 0.023 *** 0.006 -0.015 -0.565 ***

(1.53) (75.70) (0.27) (1.06) (7.60)

# categories -0.007 *** -0.006 *** 0.008 -0.016 -0.305 ***

(18.16) (26.65) (0.43) (1.62) (11.04)

# products 0.002 *** 0.009 *** -0.01 0.022 *** 0.006 ***

(5.33) (66.36) (1.04) (2.68) (15.08)


(1.25) (1.90)


t2 from last round -0.062 -0.062 ***

(0.19) (3.55)

# rounds 0.025 0.002 *** 0.301 -0.002 * 0.022 ***

(0.13) (32.92) (0.08) (1.86) (30.32)

# found startups -0.011 * 0.035 *** 0.044 0.003 -0.146 ***

(1.81) (8.52) (1.00) (0.09) (20.84)

1(top20 school) 0.01 0.138 *** -0.017 -0.085 ** -0.464 ***

(1.61) (268.67) (0.35) (2.55) (25.69)


# of VCs -0.006 *** 0.016 -0.417 ***

(16.03) (0.57) (14.62)

# locations 0.003 *** -0.003 0.032 ***

(2.11) (0.37) (29.53)

1(cooperated) 0.000 * -0.001 0.03 ***

(1.76) (0.42) (16.10)

# rounds -0.007 *** 0.058 0.114 ***

(4.17) (0.36) (5.63)

# categories 0.000 *** 0.001 -0.023 ***

(8.06) (0.51) (17.33)

1(top20 school) 0.004 * 0.048 -0.556 ***

(1.86) (0.63) (55.21)

Pair Features

days of travel -0.014 *** 0.206 -0.913 ***

(5.68) (1.03) (21.89)

1(funding tie) 0.012 -0.397 *** 4.113 ***

(1.52) (2.61) (14.40)

1(alumni tie) 0.001 *** 0.004 0.11 ***

(4.67) (0.33) (24.41)


Table A1. Locations and Categories of Startup

This table presents the location and category summary for the startups in the sample. Panel A counts the number of startups that have an office in California (CA), New York (NY), other locations in the U.S. except California and New York (OUS), other locations in North America except the U.S. (ONA), Asia (AS), and Europe (EU). Panel B counts the number of startups that have an office in the top 20 cities. Panel C counts the number of startups that belong to the top 20 categories. The classification of categories is from TechCrunch.

A. Locations by Area

Area CA NY OUS ONA AS EU

Freq. 2,542 945 2,630 348 1,184 2,085

B. Top 20 Cities C. Top 20 Categories

City Freq. Category Freq.

San Francisco 882 Software 1,561

New York 809 Mobile 1,222

London 480 Advertising 689

Los Angeles 211 Games 505

Chicago 177 Education 351

Palo Alto 173 Consulting 345

Seattle 165 Internet 340

Mountain View 141 Apps 320

Austin 141 Finance 296

Paris 139 Analytics 295

Toronto 123 Technology 263

Boston 113 Search 255

Bangalore 108 Video 247

San Diego 102 Startups 245

Berlin 99 Networking 238

Cambridge 98 Music 217

Tel Aviv 96 Android 210

Santa Monica 95 Fashion 208

Singapore 86 Design 206

San Jose 80 Travel 194


Table A2. Educational Background of Startup Teams

This table presents the educational background for the people associated with the startups in the sample. The total number of people with valid educational background is 15,342. They are either ever-employed or currently-employed by the startups in the sample, including the founders. Panel A counts the number of people who have completed a degree in the top 20 schools. A school is included if it is in the top 20 on the U.S. News rankings or it frequently appears in the educational background of the startups' personnel. Panel B counts the number of people who have completed a degree in the three broad fields of engineering, business & economics, and law & politics. Panel C counts the number of people by their highest degrees.

A. Top School List

School Freq. School Freq.

Stanford 1,073 Dartmouth 128

Harvard 804 Oxford 124

NYU 712 Santa Clara 124

UC Berkeley 578 Cambridge 122

Upenn 494 Brown U 122

MIT 472 Boston U 121

Columbia 384 Caltech 118

Northwestern 310 Galtech 115

UCLA 274 UCSB 113

Cornell 273 Georgetown 108

UT Austin 221 U Colorado Boulder 107

U Tel Aviv 213 U Wisconsin Madison 103

USC 209 San Jose State U 93

Yale 199 U Maryland 92

U Chicago 193 INSEAD 89

Carnegie Mellon U 191 PSU 88

U Illinois 180 UC Davis 79

Duke 163 U Waterloo 78

Princeton 162 U Johns Hopkins 68

U Washington 157

B. Field C. Degree

Field Freq. Degree Freq.

Engineering 7,105 M.S. & M.A. 2,784

Business & Economics 4,967 M.B.A 2,948

Law & Politics 723 Ph.D. 845


Table A3. Locations of VC and Funded Categories

This table presents the location and funding category summary for the VCs in the sample. Panel A counts the number of VCs and VC syndicates that have an office in California (CA), New York (NY), other locations in the U.S. except California and New York (OUS), other locations in North America except the U.S. (ONA), Asia (AS), and Europe (EU). Panel B and Panel C count the number of VCs and VC syndicates that have an office in the top 20 cities. Panel D counts the number of startups by category that have received VC funding. The classification of categories is from TechCrunch.

A. Locations by Area

Area CA NY OUS ONA AS EU

Freq.

VC 343 126 253 21 113 145

VC Syndicate 2,330 1,278 1,650 111 1,278 846

B. Top 20 Cities

by VC

C. Top 20 Cities

by VC Syndicate

D. Top 20 Funded

Categories

City Freq. City Freq. Category Freq.

New York 119 Menlo Park 2,175 Software 215

San Francisco 97 New York 1,806 Advertising 206

Menlo Park 84 San Francisco 1,504 Mobile 117

Palo Alto 82 Palo Alto 1,260 Consulting 77

London 70 Beijing 802 Biotechnology 61

Beijing 39 Shanghai 775 Games 55

Boston 39 Herzliya 676 Education 45

Shanghai 30 London 654 Internet 43

Cambridge 29 Cambridge 639 Finance 40

Paris 23 Mumbai 525 Design 34

Seattle 20 Bangalore 489 Search 29

Herzliya 20 Boston 391 Technology 26

Mumbai 19 New Delhi 348 Analytics 25

Chicago 18 Hong Kong 212 Networking 23

Tokyo 16 Mountain View 172 Apps 23

Hong Kong 15 Waltham 166 Media 23

Bangalore 14 Los Angeles 163 Security 21

Singapore 14 Seattle 158 News 21

Austin 13 Philadelphia 153 Video 21

Toronto 13 Tokyo 141 Services 21


Table A4. Educational Background of VC Teams

This table presents the educational background for the people associated with the VCs in the sample. The total number of people with valid educational background is 11,026. They are either ever-employed or currently-employed by the VCs in the sample, including the founders. Panel A counts the number of people who have completed a degree in the top 20 schools. A school is included if it is in the top 20 on the U.S. News rankings or it frequently appears in the educational background of the VCs' personnel. Panel B counts the number of people who have completed a degree in the three broad fields of engineering, business & economics, and law & politics. Panel C counts the number of people by their highest degrees.

A. Top School List

School Freq. School Freq.

Harvard 1,246 INSEAD 126

Stanford 1,211 U Illinois 120

Upenn 669 Carnegie Mellon U 106

NYU 532 UT Austin 96

MIT 458 U Tel Aviv 96

Columbia 398 U Washington 94

UC Berkeley 366 Caltech 93

U Chicago 322 Boston U 85

Yale 223 Santa Clara 84

Princeton 219 LSE 82

Cornell 207 Boston College 76

UCLA 198 U Notre Dame 74

Dartmouth 182 San Jose State U 68

Duke 176 U Wisconsin Madison 65

Oxford 157 U Waterloo 63

Cambridge 148 Washington U 62

Northwestern 140 Galtech 61

Brown U 132 U Johns Hopkins 57

USC 126

B. Field C. Degree

Field Freq. Degree Freq.

Engineering 4,730 M.S. & M.A. 2,043

Business & Economics 4,272 M.B.A 3,515

Law & Politics 615 Ph.D. 652


Appendix A. Estimation Procedure

A1. Prior Distributions

The prior distribution assumptions are as follows. The parameters in Eq(1) and Eq(2), i.e., $(\phi_{r,y}, \sigma^2_{r,y})$, $(\phi_{r,n}, \sigma^2_{r,n})$, $(\phi_{s,y}, \sigma^2_{s,y})$, and $(\phi_{s,n}, \sigma^2_{s,n})$, have Normal-Inverse-Gamma priors (with unknown variances $\sigma^2$ to be estimated).

\[ \phi_{k,m} \mid \sigma^2_{k,m} \sim N\left(0,\ \sigma^2_{k,m} A^{-1}_{k,m}\right), \qquad \sigma^2_{k,m} \sim \Gamma^{-1}\left(a_{k,m}, b_{k,m}\right) \tag{A1} \]

where $k$ = "r" or "s" denotes the dependent variable, and $m$ = "y" or "n" indicates whether the startup is previously funded.

The parameter in Eq(4), i.e., $\phi_v$, has a Normal prior (with known variance equal to 1).

\[ \phi_v \sim N\left(0,\ A^{-1}_v\right) \tag{A2} \]

Note that the prior means of the φ's are assumed to be zero, so that the null hypothesis is that all coefficients are insignificant. The A's are diagonal matrices with all diagonal elements equal to 1/100, and the a's and b's are set to 2.0 and 1.0, respectively. These prior assumptions form a Bayesian Linear Regression setting, which gives tractable posterior distributions. Please see Korteweg (2013) for a detailed description of the setting as well as the rule for parameter estimation.
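As an illustration of the prior in (A1), the following minimal sketch draws one $(\phi, \sigma^2)$ pair using the stated hyperparameters ($a = 2.0$, $b = 1.0$, $A = I/100$). The function name `draw_from_prior` and the use of NumPy are assumptions made for this example and are not part of the original estimation code.

```python
import numpy as np

def draw_from_prior(dim, a=2.0, b=1.0, precision=0.01, rng=None):
    """Draw (phi, sigma2) from the Normal-Inverse-Gamma prior in (A1).

    Hypothetical helper: `dim` is the number of regressors; `precision`
    is the common diagonal element of A (1/100 in the text).
    """
    rng = np.random.default_rng() if rng is None else rng
    # If g ~ Gamma(shape=a, scale=1/b), then 1/g ~ Inverse-Gamma(a, b).
    sigma2 = 1.0 / rng.gamma(shape=a, scale=1.0 / b)
    # phi | sigma2 ~ N(0, sigma2 * A^{-1}) with A = precision * I.
    cov = (sigma2 / precision) * np.eye(dim)
    phi = rng.multivariate_normal(np.zeros(dim), cov)
    return phi, sigma2
```

With $A = I/100$ the prior standard deviation of each coefficient is $10\sigma$, so the prior is diffuse relative to the coefficient magnitudes reported in the tables.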

A2. Algorithm for Estimation (Baseline Model)

For parameter learning, I develop a parallel Gibbs Sampler to draw from the posterior conditional distributions given the data and the prior distributions. In particular, I factorize the joint posterior distribution into a full set of conditional distributions of (1) the latent variables $r$, $s$, $v$, and (2) the parameters $(\phi_{r,y}, \sigma^2_{r,y})$, $(\phi_{r,n}, \sigma^2_{r,n})$, $(\phi_{s,y}, \sigma^2_{s,y})$, $(\phi_{s,n}, \sigma^2_{s,n})$, and $\phi_v$. The algorithm consists of the following six steps, which are performed iteratively. For initial values, the φ's are set to 0 and the $\sigma^2$'s are set to 1.0.

Steps

1. Impute $r^j_t$ given $s^j_t$, $\{v^{ij}_t, i \in I\}$, parameters, and data

2. Impute $s^j_t$ given $r^j_t$, parameters, and data

3. Impute $v^{ij}_t$ (all together for each $t$) given $\{r^j_t, j \in J_t\}$, the equilibrium condition, parameters, and data

4. Update $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{r,n}, \sigma^2_{r,n})$ given all $r$ and data

5. Update $(\phi_{s,y}, \sigma^2_{s,y})$ and $(\phi_{s,n}, \sigma^2_{s,n})$ given all $s$, $r$, and data

6. Update $\phi_v$ given all $v$, $r$, and data

The following paragraphs describe steps 1 to 6 in detail. I introduce some additional notation to simplify the description.

Notations

• $\phi_r$: equals $\phi_{r,y}$ or $\phi_{r,n}$ for previously funded or unfunded

• $\sigma^2_r$: equals $\sigma^2_{r,y}$ or $\sigma^2_{r,n}$ for previously funded or unfunded

• $\phi_s$: equals $\phi_{s,y}$ or $\phi_{s,n}$ for previously funded or unfunded

• $\sigma^2_s$: equals $\sigma^2_{s,y}$ or $\sigma^2_{s,n}$ for previously funded or unfunded

• $\phi_{s,1}$: the vector of $\phi_s$ except the last element

• $\phi_{s,-1}$: the last element of $\phi_s$

• $\phi_{v,1}$: the vector of $\phi_v$ except the last element

• $\phi_{v,-1}$: the last element of $\phi_v$

• $X^j_t$: equals $[1, X_t, X^j_t, X^i_t, X^{ij}_t]$ if $j$ is funded by $i$ at $t-1$, and equals $[1, X_t, X^j_t]$ if $j$ is not funded at $t-1$

• $X^{ij}_t$: equals $[X^i_t, X^j_t, X^{ij}_t]$ for a pair of $i$ and $j$

• $\mu_t(i) = \{j : ij \in \mu_t\}$

• $\mu_t(j) = \{i : ij \in \mu_t\}$

Impute r

As in Korteweg and Sørensen (2010), $r$ is imputed using an FFBS (forward filtering and backward smoothing) method. Given infrequently observed values of $r$, this method samples interim values given the information that is correlated with these interim values. Note that in Eq(2) and Eq(4), both the implicit exit value $s$ and the subjective value added $v$ depend on $r$. Therefore, $s$ and $v$ generate the information set for the conditional distribution of $r$. I use $m^j_{t|\tau}$ and $P^j_{t|\tau}$ to denote the conditional mean and variance of $r^j_t$ given information generated by $s$ and $v$ up to time $\tau$. The forward and backward steps of the FFBS method are given below.

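• Forward

– Forecast

\[ m^j_{t|t-1} = m^j_{t-1|t-1} + X^j_t \phi_r \tag{A3} \]

\[ P^j_{t|t-1} = P^j_{t-1|t-1} + \sigma^2_r \tag{A4} \]

– Update

\[ b = \begin{bmatrix} \phi_{s,-1} \\ \phi_{v,-1} \end{bmatrix} \tag{A5} \]

\[ e = \begin{bmatrix} s^j_t - X^j_t \phi_{s,1} \\ \sum_i \left( v^{ij}_t - X^{ij}_t \phi_{v,1} \right) / I \end{bmatrix} - b\, m^j_{t|t-1} \tag{A6} \]

\[ K = P^j_{t|t-1}\, b' \left( b\, P^j_{t|t-1}\, b' + \begin{bmatrix} \sigma^2_s & 0 \\ 0 & \sigma^2_v / I \end{bmatrix} \right)^{-1} \tag{A7} \]

\[ m^j_{t|t} = m^j_{t|t-1} + K e \tag{A8} \]

\[ P^j_{t|t} = (1 - K b)\, P^j_{t|t-1} \tag{A9} \]

• Backward

– Given the draw of $r^{j*}_{t+1}$

\[ G = \begin{cases} P^j_{t|t} \left( P^j_{t|t} + \sigma^2_{r,y} \right)^{-1} & \text{if } j \text{ is funded at } t \\ P^j_{t|t} \left( P^j_{t|t} + \sigma^2_{r,n} \right)^{-1} & \text{if } j \text{ is unfunded at } t \end{cases} \tag{A10} \]

\[ M = m^j_{t|t} + G \left( r^{j*}_{t+1} - m^j_{t+1|t} \right) \tag{A11} \]

\[ V = P^j_{t|t} (1 - G) \tag{A12} \]

– Draw $r^{j*}_t \sim N(M, V)$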

Note that given $s$, $v$, and the data, the conditional distributions of $\{r^j_t : 0 \le t \le T\}$ are independent across $j$. Therefore, the FFBS procedure can be performed in a parallel fashion for all startups.
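The forward and backward recursions can be sketched for a single startup as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: it uses one innovation variance `sigma2_r` for all periods (the text switches between $\sigma^2_{r,y}$ and $\sigma^2_{r,n}$ by funding status), it assumes both signals are available in every period, and it takes the covariate-adjusted signals `y` (the bracketed vector in A6) and the drift $X^j_t\phi_r$ as given inputs. The function name `ffbs` and its arguments are hypothetical.

```python
import numpy as np

def ffbs(y, drift, b, R, sigma2_r, m0=0.0, P0=0.0, rng=None):
    """Forward filtering, backward sampling for a scalar latent state r_t.

    y     : (T, 2) signals net of covariates, [s - X phi_s1, mean_i(v - X phi_v1)]
    drift : (T,)   X_t phi_r, the predictable part of r_t - r_{t-1}
    b     : (2,)   loadings [phi_{s,-1}, phi_{v,-1}]
    R     : (2, 2) signal noise covariance, diag(sigma2_s, sigma2_v / I)
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(drift)
    m_pred, P_pred = np.empty(T), np.empty(T)
    m_filt, P_filt = np.empty(T), np.empty(T)
    m_prev, P_prev = m0, P0
    for t in range(T):
        m_pred[t] = m_prev + drift[t]                 # forecast mean
        P_pred[t] = P_prev + sigma2_r                 # forecast variance
        e = y[t] - b * m_pred[t]                      # signal surprise
        S = P_pred[t] * np.outer(b, b) + R            # surprise covariance
        K = (P_pred[t] * b) @ np.linalg.inv(S)        # gain, shape (2,)
        m_filt[t] = m_pred[t] + K @ e                 # filtered mean
        P_filt[t] = (1.0 - K @ b) * P_pred[t]         # filtered variance
        m_prev, P_prev = m_filt[t], P_filt[t]
    # Backward pass: sample r_T, then r_t given the draw of r_{t+1}.
    r = np.empty(T)
    r[-1] = rng.normal(m_filt[-1], np.sqrt(P_filt[-1]))
    for t in range(T - 2, -1, -1):
        G = P_filt[t] / (P_filt[t] + sigma2_r)
        M = m_filt[t] + G * (r[t + 1] - m_pred[t + 1])
        V = P_filt[t] * (1.0 - G)
        r[t] = rng.normal(M, np.sqrt(V))
    return r, m_filt, P_filt
```

Because each call touches only one startup's series, a caller could map `ffbs` over startups in parallel, which is the independence property noted above.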

Impute s

The distribution of $s$ is truncated Normal given $r$ and the startup status (i.e., IPO/MA, Death, or Survival), and it is conditionally independent of $v$. By assumption, VC's appearance also affects the startup's status. Here, I let $\delta = 3$. Using the same notation, $s$ is sampled as follows.

\[ M = [X^j_t, r^j_t]\, \phi_s, \qquad V = \sigma^2_s \tag{A13} \]

• $\text{Status}^j_t$ = IPO/MA: draw $s^j_t \sim N(M, V) \times 1[\delta \le s^j_t]$

• $\text{Status}^j_t$ = Survival: draw $s^j_t \sim N(M, V) \times 1[-\delta \le s^j_t < \delta]$

• $\text{Status}^j_t$ = Death: draw $s^j_t \sim N(M, V) \times 1[s^j_t < -\delta]$

Here, $1[\cdot]$ denotes the indicator function. The imputation of $s$ can be performed in parallel for all startups and for all time periods.
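The status-dependent truncation can be illustrated with an inverse-CDF sampler. The sketch below is an assumption-laden example rather than the author's code: the function names and status labels are hypothetical, and the inverse-CDF approach is only one of several ways to draw from a truncated Normal (it becomes numerically unreliable when the truncation region lies far in the tail of $N(M, V)$).

```python
import random
from statistics import NormalDist

def draw_truncated_normal(mean, var, lower=None, upper=None, rng=random):
    """Draw from N(mean, var) restricted to [lower, upper) by inverse CDF."""
    dist = NormalDist(mean, var ** 0.5)
    lo = dist.cdf(lower) if lower is not None else 0.0
    hi = dist.cdf(upper) if upper is not None else 1.0
    u = lo + (hi - lo) * rng.random()
    # inv_cdf is undefined at exactly 0 or 1, so keep u strictly inside.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)

def draw_exit_value(status, mean, var, delta=3.0, rng=random):
    """Sample s given the startup status, following the three cases above."""
    bounds = {
        "ipo_ma": (delta, None),      # delta <= s
        "survival": (-delta, delta),  # -delta <= s < delta
        "death": (None, -delta),      # s < -delta
    }
    lower, upper = bounds[status]
    return draw_truncated_normal(mean, var, lower, upper, rng)
```

For example, `draw_exit_value("survival", 0.5, 1.5)` returns a value between $-3$ and $3$ under the default $\delta = 3$.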

Impute v

The distribution of $v^{ij}_t$ is also a one-dimensional truncated Normal given $r^j_t$ and $v^{-ij}_t$, which is defined as the collection $\{v^{i'j'}_t : i \ne i' \text{ or } j \ne j'\}$. As in Sørensen (2007), the conditional distribution of $v^{ij}_t$ depends on whether $j$ is matched with $i$ at time $t$. More specifically, $v$ is sampled as follows.

\[ M = [X^{ij}_t, r^j_t]\, \phi_v \tag{A14} \]

• Matched

Draw $v^{ij}_t \sim N(M, 1) \times 1[\underline{v} \le v^{ij}_t]$, where $\underline{v}$ is given in Eq(10).

• Unmatched

Draw $v^{ij}_t \sim N(M, 1) \times 1[v^{ij}_t < \overline{v}]$, where $\overline{v}$ is given in Eq(9).

Again, the imputation of $v$ can be performed in a parallel fashion for all time periods. However, within a specific time period, the $v$'s for all matched pairs need to be imputed first (either in parallel or sequentially), since the values for the unmatched pairs depend on those of the matched pairs.

Given the imputed variables, the parameters can be updated through Bayesian Linear Regression (either with or without the variance update). For each set of parameters, the following specifies the subsample used as well as the dependent and independent variables for the Bayesian Linear Regression. Note that we can still parallelize the procedure, because the matrices used for multiplication in the following equations are summations of small matrices, each of which corresponds to one observation in the subsample for the regression.


Update $\phi_r$ and $\sigma^2_r$

• Update $\phi_{r,y}$ and $\sigma^2_{r,y}$

The subsample includes all previously-matched funding candidates. Let $N_y$ be the size of the subsample. Using the Bayesian Linear Regression rule, we can sample $\phi_{r,y}$ and $\sigma^2_{r,y}$ as follows. Here, $X$ and $y$ represent the stacks of the independent and dependent variables for the previously-matched case in the linear regression in Eq(1).

\[ a = a_{r,y} + N_y / 2 \tag{A15} \]

\[ b = b_{r,y} + \left[ y'y - G' \left( X'X + A_{r,y} \right) G \right] / 2 \tag{A16} \]

\[ G = \left( X'X + A_{r,y} \right)^{-1} X'y \tag{A17} \]

First draw $\sigma^2_{r,y} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{r,y} \mid \sigma^2_{r,y} \sim N\left(G,\ \sigma^2_{r,y} (X'X + A_{r,y})^{-1}\right)$.

• Update $\phi_{r,n}$ and $\sigma^2_{r,n}$

The subsample includes all previously-unmatched funding candidates. Let $N_n$ be the size of the subsample. Using the Bayesian Linear Regression rule, we can sample $\phi_{r,n}$ and $\sigma^2_{r,n}$ as follows. Here, $X$ and $y$ are the stacks of the independent and dependent variables for the previously-unmatched case in the linear regression in Eq(1).

\[ a = a_{r,n} + N_n / 2 \tag{A18} \]

\[ b = b_{r,n} + \left[ y'y - G' \left( X'X + A_{r,n} \right) G \right] / 2 \tag{A19} \]

\[ G = \left( X'X + A_{r,n} \right)^{-1} X'y \tag{A20} \]

First draw $\sigma^2_{r,n} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{r,n} \mid \sigma^2_{r,n} \sim N\left(G,\ \sigma^2_{r,n} (X'X + A_{r,n})^{-1}\right)$.
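The conjugate update for a coefficient vector and its noise variance can be sketched as below. This is a minimal illustration of the standard Normal-Inverse-Gamma posterior with a zero prior mean, in which the scale term is $b_0 + [y'y - G'(X'X + A)G]/2$; the function name `draw_regression_params` and its arguments are assumptions for this example, not the original code.

```python
import numpy as np

def draw_regression_params(X, y, A, a0, b0, rng=None):
    """One Gibbs draw of (phi, sigma2) in a Bayesian linear regression.

    Prior: phi | sigma2 ~ N(0, sigma2 * A^{-1}), sigma2 ~ Inverse-Gamma(a0, b0).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    precision = X.T @ X + A                      # posterior precision (up to sigma2)
    G = np.linalg.solve(precision, X.T @ y)      # posterior mean of phi
    a = a0 + n / 2.0
    b = b0 + 0.5 * (y @ y - G @ precision @ G)   # posterior scale
    # If g ~ Gamma(shape=a, scale=1/b), then 1/g ~ Inverse-Gamma(a, b).
    sigma2 = 1.0 / rng.gamma(shape=a, scale=1.0 / b)
    cov = sigma2 * np.linalg.inv(precision)
    phi = rng.multivariate_normal(G, cov)
    return phi, sigma2
```

Since `X.T @ X` and `X.T @ y` are sums of per-observation terms, they can be accumulated over chunks of the subsample independently and added together before the draw, which is the parallelization noted in the text.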

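Update $\phi_s$ and $\sigma^2_s$

• Update $\phi_{s,y}$ and $\sigma^2_{s,y}$

The subsample is the same as above for the update of $\phi_{r,y}$ and $\sigma^2_{r,y}$. The difference is that $X$ and $y$ now represent the stacks of the independent and dependent variables for the previously-matched case in Eq(2).

\[ a = a_{s,y} + N_y / 2 \tag{A21} \]

\[ b = b_{s,y} + \left[ y'y - G' \left( X'X + A_{s,y} \right) G \right] / 2 \tag{A22} \]

\[ G = \left( X'X + A_{s,y} \right)^{-1} X'y \tag{A23} \]

First draw $\sigma^2_{s,y} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{s,y} \mid \sigma^2_{s,y} \sim N\left(G,\ \sigma^2_{s,y} (X'X + A_{s,y})^{-1}\right)$.

• Update $\phi_{s,n}$ and $\sigma^2_{s,n}$

Again, $X$ and $y$ are now the stacks of the independent and dependent variables for the previously-unmatched case in Eq(2).

\[ a = a_{s,n} + N_n / 2 \tag{A24} \]

\[ b = b_{s,n} + \left[ y'y - G' \left( X'X + A_{s,n} \right) G \right] / 2 \tag{A25} \]

\[ G = \left( X'X + A_{s,n} \right)^{-1} X'y \tag{A26} \]

First draw $\sigma^2_{s,n} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{s,n} \mid \sigma^2_{s,n} \sim N\left(G,\ \sigma^2_{s,n} (X'X + A_{s,n})^{-1}\right)$.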

Update $\phi_v$

The subsample includes all pairs of VCs and funding candidates. The Bayesian Linear Regression now does not include the noise variance. Here, $X$ and $y$ represent the stacks of the independent and dependent variables in Eq(4).

\[ G = \left( X'X + A_v \right)^{-1} X'y \tag{A27} \]

Draw $\phi_v \sim N\left(G,\ (X'X + A_v)^{-1}\right)$.

A3. Algorithm for Constrained Estimation (Baseline Model)

Now I impose the constraint that in Eq(1) and Eq(2), the parameters that are not associated with the VC-related features are the same for funded and unfunded startups. Equivalently, Eq(1) and Eq(2) become the following.³⁸

\[ r^j_t - r^j_{t-1} = [1, X_t, X^j_t, X^i_t, X^{ij}_t]\, \phi_{r,y} + \sigma_{r,y}\, \varepsilon^j_t, \quad \text{for all } j \in E_{t-1} \tag{A28} \]

\[ s^j_t = [1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t]\, \phi_{s,y} + \sigma_{s,y}\, \eta^j_t, \quad \text{for all } j \in E_{t-1} \tag{A29} \]

with $X^i_t = 0$ and $X^{ij}_t = 0$ if $j$ is not funded at $t-1$.

It is straightforward to adapt the algorithm in Appendix A2 to the estimation here. The model does not have $(\phi_{r,n}, \sigma^2_{r,n})$ and $(\phi_{s,n}, \sigma^2_{s,n})$, so all the subsamples in $E_{t-1}$ are used for the estimation of the parameters $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{s,y}, \sigma^2_{s,y})$, with the above augmentation of the independent variables (i.e., $X^i_t = 0$ and $X^{ij}_t = 0$) for the previously-unfunded startups.
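The zero-augmentation of the regressors can be illustrated with a small helper that builds one row of the pooled design matrix. The function name and argument layout are hypothetical and serve only to show how funded and unfunded observations share one coefficient vector.

```python
import numpy as np

def constrained_design_row(x_macro, x_startup, x_vc, x_pair, previously_funded):
    """Build [1, X_t, X^j_t, X^i_t, X^ij_t] with VC-related blocks zeroed
    when the startup was not funded at t-1 (hypothetical helper)."""
    x_vc = np.asarray(x_vc, dtype=float)
    x_pair = np.asarray(x_pair, dtype=float)
    if not previously_funded:
        # No VC syndicate is attached, so VC and pair features drop out.
        x_vc = np.zeros_like(x_vc)
        x_pair = np.zeros_like(x_pair)
    return np.concatenate(([1.0], x_macro, x_startup, x_vc, x_pair))
```

Stacking such rows for all startups in $E_{t-1}$ gives a single regression in which the intercept, macro, and startup coefficients are common to both groups.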

38 Recall that Et−1 represents the set of existing startups at the end of t− 1, or at the beginning of t.


Appendix B. Measure Construction

Dependent Variables

• r: Cumulative Return

• s: Implicit Exit Value

• v: Subjective Value Added

Independent Variables

• Macroeconomic Variables

• Startup Features

• VC Syndicate Features

• Startup - VC Syndicate Pair Features

B1. Cumulative Return

r = cumulative return = log(V)

• For newborn startups: r = log(V) = 0, V = 1.

• For existing startups:

– IPO: r = log(V), with V = market value at IPO.

– MA: r = log(V), with V = acquisition price at MA.

– Death: r = log(V), with V ∼ triangular distribution with a = 0.05, b = 0.1, and c = 0.8.

• At funding round: $r = \log(V) = \sum_t \log\left(V^{PRE}_t / V^{POST}_{t-1}\right)$, so $V = \prod_t \left(V^{PRE}_t / V^{POST}_{t-1}\right)$, where $V^{POST} = I + V^{PRE}$. Here, V is the anti-diluted valuation of the startup at t, and I is the investment amount.
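The funding-round and death cases can be sketched in Python as below. This is a hedged illustration, not the construction code behind the data: the function names and the toy valuations are hypothetical, the first observed round is taken as the baseline (r = 0 there), and the triangular parameters are read as a = lower bound, b = mode, c = upper bound, which is an assumption because the text does not label them.

```python
import math
import random

def cumulative_return(rounds):
    """Cumulative log return across funding rounds.

    rounds: list of (pre_money, investment) tuples in time order.
    Each increment is log(V_pre at t / V_post at t-1), with V_post = I + V_pre.
    """
    r = 0.0
    for (pre_prev, inv_prev), (pre_now, _) in zip(rounds, rounds[1:]):
        post_prev = inv_prev + pre_prev
        r += math.log(pre_now / post_prev)
    return r

def death_return(rng):
    """Draw r = log(V) for a dead startup from a triangular distribution.

    Assumption: 0.05 is the lower bound, 0.1 the mode, 0.8 the upper bound.
    """
    v = rng.triangular(0.05, 0.8, 0.1)   # random.triangular(low, high, mode)
    return math.log(v)

# Toy valuations (hypothetical), in millions: (pre-money, investment).
rounds = [(10.0, 5.0), (30.0, 10.0), (80.0, 20.0)]
r = cumulative_return(rounds)
V = math.exp(r)
r_dead = death_return(random.Random(0))
```

In the toy example the valuation doubles between each post-money and the next pre-money value, so V is 4 even though the headline pre-money valuation rises eightfold; the difference is the dilution from new investment.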

B2. Macroeconomic Variables

• $(Y_{baa} - Y_{us10})$ = (Moody’s seasoned Baa corporate bond yield) − (10-year Treasury bond yield).

• $(r_m - r_f)$ = monthly market excess return over risk-free rate.

• smb = monthly factor return for the small-minus-big portfolio.

• hml = monthly factor return for the high-minus-low portfolio.

• $r_f$ = monthly risk-free rate.


B3. Startup Features

Constant

• # locations = number of cities in which a startup's headquarters or offices are located.

• # categories = number of categories that a startup is classified into.

• # products = number of products that a startup has.

• 1(LOC) = dummy variable indicating whether the startup has its headquarters or offices in LOC, where LOC is

– ca: state of California

– ny: state of New York

– ous: other places in the U.S. outside New York and California

– ona: other places in North America outside the U.S.

– as: Asia

– eu: Europe

Time-varying

• t from last round = time since last funding round in years at a specific time t.

• t² from last round = square of (t from last round) at a specific time t.

• # rounds = number of funding rounds experienced in the past at a specific time t.

• # startups founded = number of companies the founder of the startup has built in the past

prior to a specific time t.

• 1(top20 school) = dummy variable indicating whether the startup at a specific time t has

people on the management team who graduated from a top school. The list of top schools

(for startups) is given in Table A2.

B4. VC Syndicate Features

Constant

• # of VCs = number of VC members in a VC syndicate.

• # locations = number of cities in which at least one member VC of the syndicate has an office or headquarters.

• 1(LOC) = dummy variable indicating whether at least one member VC of the syndicate has its headquarters or offices in LOC, where LOC is defined as above in the startup features.

Time-varying

• 1(cooperated) = dummy variable indicating whether any member VCs have cooperated in

the past prior to a specific time t.


• # categories = number of categories in which at least one member VC of the syndicate has funding experience prior to a specific time t.

• # rounds = median number of funding rounds that VC members have participated in prior to a specific time t. It measures the average experience of the VC syndicate.

• 1(top20 school) = dummy variable indicating whether a VC syndicate at a specific time t has people on its member VCs' management teams who graduated from a top school. The list of top schools (for VCs) is given in Table A4.

B5. Pair Features

Constant

• distance = the closest distance in miles between a startup and a VC syndicate. See Figure A2 for the distributions of distance for all pairs and for funded pairs.

• days of travel = number of days for a round trip between a startup and a VC syndicate using the closest distance defined above, where the number of days equals

– 0: if distance ∈ [0, 100], indicating a round trip by driving within one day

– 1: if distance ∈ [100, 1000], indicating a round trip by flight within two days

– 2: if distance ∈ [1000, 10000], indicating a round trip by flight within three days

– 3: if distance ∈ [10000, ∞), indicating an intercontinental trip
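The mapping from distance to the days-of-travel category can be sketched as a small lookup. The function name `days_of_travel` is hypothetical, and because the listed intervals share their endpoints, the sketch assumes that a distance exactly on a boundary falls into the higher category; the opposite convention would be equally consistent with the text.

```python
from bisect import bisect_right

# Upper edges (miles) of the 0-, 1- and 2-day categories listed above.
THRESHOLDS = [100, 1000, 10000]

def days_of_travel(distance_miles):
    """Map the closest startup-syndicate distance to the travel category 0-3.

    Assumption: a distance exactly on a boundary (100, 1000, 10000) is
    assigned to the higher category, since the listed intervals overlap there.
    """
    if distance_miles < 0:
        raise ValueError("distance must be non-negative")
    return bisect_right(THRESHOLDS, distance_miles)

categories = [days_of_travel(d) for d in (12.0, 100.0, 640.0, 2500.0, 12000.0)]
```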

Time-varying

• 1(funding tie) = dummy variable indicating whether any VC member in the syndicate has

funded the startup prior to some specific time t.

• 1(alumni tie) = dummy variable indicating whether any VC member and the startup have

any alumni ties at some specific time t.


Appendix C. Alternative Models

C1. Autocorrelation in Subjective Value Added

The subjective value added at t might depend on its value at t − 1. Therefore, one extension of the baseline model is to include $v^{ij}_{t-1}$ in the expression of $v^{ij}_t$. With Eq(1) and Eq(2) unchanged, Eq(4) changes to the following.

$$v^{ij}_t = \left[X^i_t, X^j_t, X^{ij}_t, r^j_t, v^{ij}_{t-1}\right]\phi_v + \xi^{ij}_t, \quad \text{for all } i \in I,\ j \in J_t \tag{C1}$$

The new parameter introduced is the last element in $\phi_v$ that is associated with $v^{ij}_{t-1}$. For estimation, there are some small changes in the imputation of v and in the update step of the FFBS method for the imputation of r.

C2. Funding Decision Incorporating Private Information

When making the funding decision, venture capitalists may have some private information on startups that is unobservable to an outside economist. This private information will drive startup growth and implicit exit value at t + 1, and thus is incorporated in the subjective value added at t to make the funding selection.

Therefore, the model can be extended as follows. With Eq(4) unchanged, I modify Eq(1) and Eq(2) as follows for startups that are previously funded at t − 1 to incorporate the “private information” $\xi^{ij}_{t-1}$ in the subjective value added at t − 1.

$$r^j_t - r^j_{t-1} = \left[1, X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \rho_{r,y}\xi^{ij}_{t-1} + \sigma_{r,y}\varepsilon^j_t, \quad \text{if } j \text{ is funded by } i \text{ at } t-1 \tag{C2}$$

$$s^j_t = \left[1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_{s,y} + \rho_{s,y}\xi^{ij}_{t-1} + \sigma_{s,y}\eta^j_t, \quad \text{if } j \text{ is funded by } i \text{ at } t-1 \tag{C3}$$

Note that the difference here is that we now include $\xi^{ij}_{t-1}$ as the last independent variable for both equations. The associated parameters are denoted by $\rho_{r,y}$ and $\rho_{s,y}$ respectively. The correlations between $v^{ij}_{t-1}$ and $r^j_t - r^j_{t-1}$, and between $v^{ij}_{t-1}$ and $s^j_t$ now also capture the private information shared between startups and VC syndicates which is not loaded on publicly observable features.

The changes in the estimation strategy stem from the imputation of r and s for the previously funded case, the update of $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{s,y}, \sigma^2_{s,y})$, and the imputation of v for matched pairs given next-period r and s for those pairs. The following lists the steps for a revised algorithm.


More details are available upon request.

Steps

1. Calculate $\xi^{ij}_t$ given currently imputed $v^{ij}_t$, parameters and data

2. Impute $r^j_t$ given $\xi^{ij}_{t-1}$, $s^j_t$, $\{v^{ij}_t, i \in I\}$, parameters and data

3. Update $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{r,n}, \sigma^2_{r,n})$ given all ξ, r, and data

4. Impute $s^j_t$ given $\xi^{ij}_{t-1}$, $r^j_t$, parameters and data

5. Update $(\phi_{s,y}, \sigma^2_{s,y})$ and $(\phi_{s,n}, \sigma^2_{s,n})$ given all ξ, s, r, and data

6. Calculate $e^{r,j}_t$ and $e^{s,j}_t$ given $\xi^{ij}_{t-1}$, $r^j_t$, $s^j_t$, parameters and data, where

$$e^{r,j}_t \equiv r^j_t - r^j_{t-1} - \left[1, X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \rho_{r,y}\xi^{ij}_{t-1} \tag{C4}$$

$$e^{s,j}_t \equiv s^j_t - \left[1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_{s,y} + \rho_{s,y}\xi^{ij}_{t-1} \tag{C5}$$

7. Impute $v^{ij}_t$ for all matched pairs (all together for each t) given $\{r^j_t, e^{r,j}_{t+1}, e^{s,j}_{t+1}, j \in J_t\}$, the equilibrium condition, parameters, and data. Impute $v^{ij}_t$ for all unmatched pairs as before


C3. Hierarchical Model for Returns

The extended model has a hidden startup fixed-effect $p_j \equiv \left(p^{(1)}_j, p^{(2)}_j, p^{(3)}_j\right)$, which is a startup-specific probability mixture. It consists of the probabilities that a startup belongs to a specific type: winner, loser, and break-evener. The probabilities are denoted by $p^{(1)}_j$, $p^{(2)}_j$, and $p^{(3)}_j$. The period return for a specific type follows a Normal distribution with different means: $\mu^{(1)}_y$, $\mu^{(2)}_y$, and $\mu^{(3)}_y$ for the three types with funding, and $\mu^{(1)}_n$, $\mu^{(2)}_n$, and $\mu^{(3)}_n$ for the three types without funding.

More specifically, Eq(1) changes to the following.

$$r^j_t - r^j_{t-1} = \begin{cases} \gamma^j_{y,t} + \left[X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \sigma_{r,y}\varepsilon^j_t & \text{if } j \text{ is funded by } i \text{ at } t-1 \\ \gamma^j_{n,t} + \left[X_t, X^j_t\right]\phi_{r,n} + \sigma_{r,n}\varepsilon^j_t & \text{if } j \text{ is unfunded at } t-1 \end{cases} \tag{C6}$$

with

$$\gamma^j_{y,t} \equiv \left\{\mu^{(k)}_y : \text{with probability } p^{(k)}_j\right\}, \quad \text{for } k = 1, 2, 3 \tag{C7}$$

$$\gamma^j_{n,t} \equiv \left\{\mu^{(k)}_n : \text{with probability } p^{(k)}_j\right\}, \quad \text{for } k = 1, 2, 3 \tag{C8}$$


Here, $\gamma^j_{y,t}$ and $\gamma^j_{n,t}$ are i.i.d. and follow categorical distributions with the common probability mixture $p_j$. Equivalently, both noises in the period returns $\left(\gamma^j_{y,t} + \sigma_{r,y}\varepsilon^j_t\right)$ and $\left(\gamma^j_{n,t} + \sigma_{r,n}\varepsilon^j_t\right)$ follow Mixture-of-Normals with the common probability mixture $p_j$. I assume the priors of $p_j$ follow a Dirichlet distribution as follows.

$$p_j \sim \text{Dirichlet}(\alpha) \tag{C9}$$

Let $z^j_t$ denote the realized type (winner, loser, break-evener) for a startup j at time t; then $P\left(z^j_t = k\right) = p^{(k)}_j$. Eq(C6) can be re-written as follows.

$$r^j_t - r^j_{t-1} = \begin{cases} \mu^{(z^j_t)}_y + \left[X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \sigma_{r,y}\varepsilon^j_t & \text{if } j \text{ is funded by } i \text{ at } t-1 \\ \mu^{(z^j_t)}_n + \left[X_t, X^j_t\right]\phi_{r,n} + \sigma_{r,n}\varepsilon^j_t & \text{if } j \text{ is unfunded at } t-1 \end{cases} \tag{C10}$$
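To make the data-generating process in Eq(C9) and Eq(C10) concrete, the following Python sketch simulates period returns for one startup in one funding state. It is an illustration under simplifying assumptions: the function names, the type means, the noise scale, and the Dirichlet parameter are hypothetical values, and the regression term is held at a constant `xb` rather than built from the actual features.

```python
import random

def dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]

def simulate_period_returns(T, mu, sigma, alpha, rng, xb=0.0):
    """Simulate T period returns for one startup under one funding state.

    mu: type-specific means (winner, loser, break-evener);
    xb: the regression term [X...]phi, held constant here for simplicity.
    """
    p_j = dirichlet(alpha, rng)                            # prior draw, Eq(C9)
    returns, types = [], []
    for _ in range(T):
        z = rng.choices(range(len(mu)), weights=p_j)[0]    # realized type z_t^j
        returns.append(mu[z] + xb + sigma * rng.gauss(0.0, 1.0))
        types.append(z)
    return p_j, types, returns

rng = random.Random(1)
p_j, types, returns = simulate_period_returns(
    T=20, mu=[0.5, -0.5, 0.0], sigma=0.1, alpha=[1.0, 1.0, 1.0], rng=rng)
```

Because the type is redrawn every period from the same startup-level mixture, a startup with a large winner probability tends to string together high returns without being locked into that type.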

Consequently, the estimation for the extended model includes the imputation of the two additional variables, $p_j$ and $z^j_t$, and the update of the two additional sets of parameters $\mu_y = \left(\mu^{(1)}_y, \mu^{(2)}_y, \mu^{(3)}_y\right)$ and $\mu_n = \left(\mu^{(1)}_n, \mu^{(2)}_n, \mu^{(3)}_n\right)$. The following lists the steps for a revised algorithm.

More details are available upon request.

Steps

1. Impute $r^j_t$ given $s^j_t$, $\{v^{ij}_t, i \in I\}$, $z^j_t$, parameters and data

2. Impute $s^j_t$ given $r^j_t$, parameters and data

3. Impute $v^{ij}_t$ (all together for each t) given $\{r^j_t, j \in J_t\}$, the equilibrium condition, parameters, and data

4. Update $(\mu_y, \phi_{r,y}, \sigma^2_{r,y})$ and $(\mu_n, \phi_{r,n}, \sigma^2_{r,n})$ given all r, z and data

5. Update $(\mu_y, \phi_{s,y}, \sigma^2_{s,y})$ and $(\mu_n, \phi_{s,n}, \sigma^2_{s,n})$ given all s, r, and data

7. Update z given $(\mu_y, \phi_{r,y}, \sigma^2_{r,y})$ and $(\mu_y, \phi_{s,y}, \sigma^2_{s,y})$, all r and data

8. Update $p_j$ given $z^j_t$
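The last two steps can be sketched in Python as follows. This is a simplified illustration rather than the algorithm as implemented: it assumes the type posterior is formed from the return equation alone (the exit-value equation is ignored), it takes the regression residuals as given, and it uses the standard Dirichlet-categorical conjugate update for $p_j$. All function names and toy numbers are hypothetical.

```python
import math
import random

def type_posterior(resid, mu, sigma, p_j):
    """P(z = k | return), proportional to p_j[k] * Normal(resid; mu[k], sigma^2).

    resid: period return minus the regression term [X...]phi.
    """
    logw = [math.log(p) - 0.5 * ((resid - m) / sigma) ** 2
            for p, m in zip(p_j, mu)]
    top = max(logw)                          # subtract max for stability
    w = [math.exp(l - top) for l in logw]
    total = sum(w)
    return [x / total for x in w]

def draw_types(resids, mu, sigma, p_j, rng):
    """Draw one type per period from its categorical posterior."""
    ks = range(len(mu))
    return [rng.choices(ks, weights=type_posterior(e, mu, sigma, p_j))[0]
            for e in resids]

def draw_p(types, alpha, rng):
    """Conjugate update: p_j ~ Dirichlet(alpha + counts of each type)."""
    counts = [sum(1 for z in types if z == k) for k in range(len(alpha))]
    g = [rng.gammavariate(a + c, 1.0) for a, c in zip(alpha, counts)]
    total = sum(g)
    return [x / total for x in g], counts

# Toy inputs (hypothetical): four period residuals for one funded startup.
rng = random.Random(2)
mu, sigma, alpha = [0.5, -0.5, 0.0], 0.1, [1.0, 1.0, 1.0]
p_old = [1 / 3, 1 / 3, 1 / 3]
resids = [0.48, 0.55, -0.02, 0.51]
post = type_posterior(0.48, mu, sigma, p_old)
types = draw_types(resids, mu, sigma, p_old, rng)
p_new, counts = draw_p(types, alpha, rng)
```

The Normal density's normalizing constant is dropped in `type_posterior` because the noise scale is common to the three types within a funding state, so it cancels when the weights are normalized.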


REFERENCES

Aizenman, Joshua, and Jake Kendall, 2008, The Internationalization of Venture Capital and

Private Equity, Journal of Economic Studies, 39(5), 488-511

Amit, Raphael, Werner Antweiler, and James A. Brander, 2002, Venture Capital Syndication:

Improved Venture Selection vs. the Value-added Hypothesis, Journal of Economics and Manage-

ment Strategy, 11(3), 423-452

Baum, Joel, and Brian Silverman, 2004, Picking Winners or Building Them? Alliance, Intellectual,

and Human Capital as Selection Criteria in Venture Financing and Performance of Biotechnology

Startups, Journal of Business Venturing, 19(3), 411-436

Bengtsson, Ola, 2013, Relational Venture Capital Financing of Serial Founders, Journal of Fi-

nancial Intermediation, 22(3), 308-334

Bengtsson, Ola and David Hsu, 2010, How Do Venture Capital Partners Match with Startup

Founders? Unpublished Working Paper

Bernstein, Shai, Xavier Giroud, and Richard Townsend, 2015, The Impact of Venture Capital

Monitoring, Journal of Finance, forthcoming

Blei, David, Andrew Ng, and Michael Jordan, 2003, Latent Dirichlet Allocation, Journal of

Machine Learning Research, 3(4-5), 993-1022

Bottazzi, Laura, Marco Da Rin, and Thomas Hellmann, 2008, Who Are the Active Investors?

Evidence from Venture Capital, Journal of Financial Economics, 89(3), 488-512

Bottazzi, Laura, Marco Da Rin, and Thomas Hellmann, 2011, The Importance of Trust for

Investment: Evidence from Venture Capital, Working Paper

Brander, James A, Raphael Amit and Werner Antweiler, 2002, Venture-Capital Syndication: Im-

proved Venture Selection vs. the Value Added Hypothesis, Journal of Economics and Management

Strategy, 11, 423-452

Carter, Chris K. and Robert J. Kohn, 1994, On Gibbs Sampling for State Space Models,

Biometrika 81, 541-553

Chemmanur, Thomas J., Karthik Krishnan, and Debarshi K. Nandy, 2011, How Does Venture

Capital Financing Improve Efficiency in Private Firms? A Look Beneath the Surface, Review of

Financial Studies, 24(12), 4037-4090


Chen, Henry, Paul Gompers, Anna Kovner, and Josh Lerner, 2010, Buy Local? The Geography

of Successful and Unsuccessful Capital Expansion, Journal of Urban Economics, 67(1)

Cochrane, John, 2005, The Risk and Return of Venture Capital, Journal of Financial Economics,

75(1), 3-52

Cumming, Douglas, Fleming, and Suchard, 2005, Venture Capitalist Value-added Activities,

Fundraising and Drawdowns, Journal of Banking & Finance, 29(2), 295-331

Da Rin, Marco, Thomas Hellmann, and Manju Puri, 2012, A Survey of Venture Capital Research,

George Constantinides, Milton Harris, and Rene Stulz (editors), Handbook of the Economics of

Finance, vol 2, Amsterdam, North Holland

Ewens, Michael, 2009, A New Model of Venture Capital Risk and Return, SSRN Working Paper

Fama, Eugene and Kenneth French, 1995, Size and Book-to-Market Factors in Earnings and

Returns, Journal of Finance, 50, 131-155

Fruhwirth-Schnatter, Sylvia, 1994, Data Augmentation and Dynamic Linear Models, Journal of

Time Series Analysis, 15, 183-202

Fulghieri, Paolo, and Merih Sevilir, 2009, Size and Focus of a Venture Capitalist's Portfolio, The

Review of Financial Studies, 22(11), 4643-4680

Gompers, Paul, and Josh Lerner, 1997, Risk and Reward in Private Equity Investments: The Challenge of Performance Assessment, Journal of Private Equity, 1, 5-12

Gompers, Paul, 1994, The Rise and Fall of Venture Capital, Business and Economic History, 23,

1-26

Gompers, Paul, Anna Kovner, and Josh Lerner, 2009, Specialization and Success: Evidence from

Venture Capital, Journal of Economics and Management Strategy, 18(3), 817-844

Gompers, Paul, Anna Kovner, Josh Lerner, and David Scharfstein, 2010, Performance Persistence

in Entrepreneurship, Journal of Financial Economics, 96, 18-32

Hellmann, Thomas, and Manju Puri, 2000, The Interaction Between Product Market and Financing Strategy: The Role of Venture Capital, The Review of Financial Studies, 13(4), 959-984

Hellmann, Thomas, and Manju Puri, 2002, On the Fundamental Role of Venture Capital, Eco-

nomic Review, published by the Atlanta Federal Reserve Bank, 87, No. 4

Hellmann, Thomas, and Manju Puri, 2002, Venture Capital and the Professionalization of the Startup Firms: Empirical Evidence, Journal of Finance, 57(1), 169-197

Hochberg, Yael V., Alexander Ljungqvist, and Yang Lu, 2010, Networking as a Barrier to Entry

and the Competitive Supply of Venture Capital, Journal of Finance, 65(3), 829-859

Hochberg, Yael V., Alexander Ljungqvist, and Yang Lu, 2007, Whom You Know Matters: Ven-

ture Capital Networks and Investment Performance, Journal of Finance, 62(1), 251-301

Hochberg, Yael V., Laura Anne Lindsey, and Mark M. Westerfield, 2015, Resource Accumula-

tion Through Economic Ties: Evidence from Venture Capital, Journal of Financial Economics,

forthcoming

Hochberg, Yael V., Michael J. Mazzeo, and Ryan C. McDevitt, 2015, Specialization and Competition in the Venture Capital Industry, Review of Industrial Organization, 46(4), 323-347

Hsu, David H., 2004, What Do Entrepreneurs Pay for Venture Capital Affiliation? Journal of

Finance, 59, 1805-1844

Hsu, David H., and Ola Bengtsson, 2010, How Do Venture Capital Partners Match with Startup

Founders? Working Paper

Kaplan, Steven, Mark Klebanov, and Morten Sørensen, 2012, Which CEO Characteristics and

Abilities Matter? Journal of Finance, 67(3), 973-1007

Korteweg, Arthur, and Morten Sørensen, 2010, Risk and Return Characteristics of Venture

Capital-backed Entrepreneurial Companies, Review of Financial Studies, 23(10), 3738-3772

Korteweg, Arthur, Markov Chain Monte Carlo Methods in Corporate Finance. In: P. Damien,

P. Dellaportas, N. Polson, and D. Stephens (Eds.), Bayesian Theory and Applications, Oxford

University Press

Lerner, Josh, 1994, The Syndication of Venture Capital Investments, Financial Management,

23(3), 16-27

Lerner, Josh, 1995, Venture Capitalists and The Oversight of Private Firms, Journal of Finance,

50, 301-318

Nanda, Ramana and Matthew Rhodes-Kropf, 2013, Investment cycles and startup innovation,

Journal of Financial Economics, 110(2), 403-418

Phalippou, Ludovic, 2010, Venture Capital Funds: Flow-Performance Relationships and Perfor-

mance Persistence, Journal of Banking and Finance, 34(3), 568-577


Sahlman, William, 1990, The Structure and Governance of Venture-Capital Organizations, Journal of Financial Economics, 27, 473-521

Sørensen, Morten, 2005, An Economic and Econometric Analysis of Market Sorting with an

Application to Venture Capital, dissertation (Stanford University)

Sørensen, Morten, 2007, How Smart is Smart Money? A Two-Sided Matching Model of Venture Capital, Journal of Finance, 62(6), 2725-2762

Sørensen, Morten, 2008, Learning by Investing: Evidence from Venture Capital, Working Paper

Sorenson, Olav, and Toby Stuart, 2001, Syndication Networks and the Spatial Distribution of

Venture Capital Investments, American Journal of Sociology, 106, 1546-1588

Tykvova, Tereza, 2007, Who Chooses Whom? Syndicate, Skills, and Reputation, Review of

Financial Economics, 16(1), 5-28

Tian, Xuan, 2010, The Causes and Consequences of Venture Capital Stage Financing, Journal

of Financial Economics, 101(2), 132-159

