The Impact of Venture Capital on the Life Cycles of Startups
Yun Ling∗
September 6, 2016
New Version Coming Soon
Abstract
How do VCs select startups to fund over multiple rounds? To study this
question, I develop a dynamic two-sided matching model of VC funding. Using a
hand-collected database including both VC-funded and non-VC-funded startups,
I estimate the joint determinants of investment selection and the effects of post-
investment influence in a Bayesian framework. The results show that selection
depends on startups’ quality and VCs’ influence – VCs may choose to invest in
a startup with lower quality if their subsequent impact is large. Importantly,
previously funded startups are of higher quality and thus are more likely to get
additional funding. A simulation experiment shows that initial random differ-
ences in startups can magnify significantly under the joint effects of the selection
and influence of VC funding.
Keywords: Venture Capital (VC), Investment, Startups, Initial Public Offering
(IPO), Merger and Acquisition (MA), Dynamic Model, Bayesian Estimation
JEL Classification Numbers: C22, C51, C80, G11, G24, G34, M13
∗The author is from the University of Southern California, Marshall School of Business, 3670 Trousdale Parkway, Suite 308, Los Angeles, CA 90089, Email at: [email protected]. I would like to thank Gordon Phillips, Arthur Korteweg, John Matsusaka, Kenneth Ahern, Christopher Jones, Kevin Murphy, David Mauer, Fernando Zapatero and seminar participants at the University of Southern California for helpful comments and suggestions.
1. Introduction
Venture capital (VC) is essential for early business financing. It is widely believed that
venture capital firms (VCs) have skills in selecting startups1 with great potential and have
profound impact on their subsequent growth. Existing research highlights VCs’ role in
providing value-added services at the post-investment stage (e.g. Sahlman (1990), Lerner
(1995), Hellmann and Puri (2002), Baum and Silverman (2004), Cumming, Fleming and
Suchard (2005), Sørensen (2007), and Bottazzi et al. (2008)). Yet, the actual investment
decision, which requires more of a VC’s skill at the pre-investment stage, is underexplored.
Most importantly, the literature largely ignores the dynamic nature of VCs’ investments,
instead treating the funding decision as a one-shot game.
In this paper, I explore how VCs select startups to fund over time, considering both their
past and future influence on the startups. In particular, I extend a static two-sided matching
model (e.g. Sørensen 2007) to a dynamic setting that involves multiple funding rounds.2
The dynamic setting gives two new insights. First, selection and influence of VC funding are
directly linked over time. Selection depends on startups’ quality3 and VCs’ influence – VCs
may choose to invest in a startup with lower quality if their subsequent impact is expected
to be large. Second, VCs learn from past funding rounds. Thus, previously funded startups
are more likely to get additional funding.
I perform a joint estimation for the determinants of the investment decision and the
effects of post-investment influence. The estimation strategy exploits the implications of
funding selection to identify VCs’ impact. A given startup is less preferred if other startups
are of better quality or expected to benefit more from VCs’ impact. Note that the features
1 A startup is an entrepreneurial firm, before it goes public, gets acquired, or goes out of business.
2 In Sørensen 2007, a two-sided matching model is used and calibrated to separate the effects of selection and influence of VC funding.
3 A startup’s quality is defined as the cumulative return of the startup. It is the logarithm of the unobservable value of the startup. It is also the continuously-compounded return of investing $1 in the startup since inception.
of the other startups are independent of VCs’ impact on the given startup per se. Thus,
they provide exogenous variations to the selection of the given startup that identify the VCs’
influence. As a result, the given startup is either funded by a worse VC syndicate4 or not
funded at all. In the first case, the exogenous variations identify the differential impacts of
VCs across the subsample of funded startups. In the second case, the given startup changes
from being funded to being unfunded. The exogenous variations identify the overall impact
of VCs through a comparison between VC-funded startups and non-VC-funded startups.
In order to construct a control group to estimate VCs’ impact, I hand-collect a novel
database that includes both VC-funded and non-VC-funded startups. The database is from
a leading startup platform called Crunchbase. It provides information for over 290,000
companies (e.g. startups, VCs, incubators, accelerators, etc.) and 310,000 individuals (e.g.
entrepreneurs, venture capitalists, angel investors, etc.) across 176 countries. My final
sample covers 9,303 startups and 2,844 VC syndicates from 1998 to 2014. I then construct
various features for startups, VC syndicates, and the pair-wise matches between them using
both company-level (e.g. office locations, products, categories5, acquisitions, investments,
websites, news, etc.) and people-level information (e.g. educational background, employment
history, founded companies, etc.).
Following funding, the impact of VCs’ investment on a startup’s growth comes from two
sources. First, there is a direct impact that is mostly determined by VC-related charac-
teristics. For instance, funded startups exhibit more growth if they are funded by smaller
and more compatible VC syndicates whose members have cooperated before. Also, both
the presence of alumni ties and a previous funding relationship between a VC syndicate and
a startup increase the startup’s return. Second, VCs’ investments make funded startups
4 VCs usually form a group (called a VC syndicate) to provide funding to a startup. I use VC syndicate, instead of the leading VC, as the unit of investor. This is because only a few funding records contain information on which is the leading VC.
5 A startup’s category is the sub-industry for the startup that is classified by Crunchbase. A startup can have multiple categories.
less dependent on macroeconomic factors as well as on startups’ own quality. For instance, stock
market performance has a larger positive effect on the growth of unfunded startups. Also,
absent VC funding, startups rely more on the talents of their own management teams.
In contrast, VC-funded companies can resort to the funding VC syndicates for human capital
resources.
While the quality of startups incorporates the impact of VCs’ investments in the past,
it also affects VCs’ future investment decisions. A one standard deviation increase in a
startup’s cumulative return6 corresponds to a marginal increase of 6.9% in the probability
of getting funded, relative to a 50% benchmark probability. For funded startups, it also
improves the probability of a successful exit by 49.6% and reduces the probability of a
failure exit by 35.4%. For unfunded startups, the relative increase and decrease are 46.3%
and 45.2% respectively.7
In addition to startups’ quality, VCs take into account a wide range of startup- and VC-
related features to form a subjective expectation of their impact when making the funding
decision. For instance, the expected impact grows with the number of a startup’s products
but declines with the startup’s geographical and categorical span. This indicates a preference
for more productive but focused startups. Also, the expected impact is higher for more
experienced VC syndicates whose members have participated in more funding rounds in the past.
Between a VC syndicate and a startup, the geographical distance decreases the expected
value while the existence of alumni ties and previous funding relationship increase it. This
might result from a reduction in asymmetric information and agency cost facilitated by
learning through past cooperation.
I then investigate how the expected impact correlates with the growth of funded startups
6 The cumulative return of a startup is the continuously-compounded return of investing $1 in the startup since inception. It is used interchangeably as the startup’s quality.
7 A startup exits the market of seeking venture funding if it goes public, gets acquired, or dies (goes out of business). In the case of going public and getting acquired, the exit is defined as successful; in the case of death, the exit is defined as a failure or unsuccessful.
in the future. A one standard deviation increase in the expected impact corresponds to a
0.11 standard deviation increase in the funded startups’ returns one month ahead. However,
beyond one period, the correlation is insignificant. This might be due to noise in the
imputed return data. Besides observable features, unobservable private information shared
between VCs and startups might also play a role in forming the expectation. Therefore, I
extend the baseline model to allow for the correlation between the unobserved factors driving
the funding decision and the subsequent one-period returns for funded startups. The same
private information that will lead to a 100 basis point increase in a funded startup’s return
next month increases the marginal probability by 28% for the startup to get funded this
month. Overall, the above pieces of evidence collectively verify the conjecture that the pre-
investment funding selection and the post-investment influence of VCs are interdependent
in a dynamic setting.
Lastly, I conduct a simulation experiment to study the implications of this interdepen-
dence over multiple rounds of VC funding. Specifically, I simulate data from two models. The
first model has random funding decisions each period, while the second model has random
funding decisions only in the first period. In order to produce a clean comparison, I assume
that startups have the same features and there is only one VC in the market. The difference
between VC-funded and non-VC-funded startups caused by random selection is insignificant
in the first model. In contrast, in the second model, the initial random selection persists over
time and makes a significant difference between funded and unfunded startups. Previously
VC-funded startups have a 93% chance to receive VC funding in the second model, versus a
50% chance in the first model.
This paper adds to the growing literature that examines the funding decisions of venture
capitalists. I depart from the previous studies (e.g. Stuart and Sørensen (2001), Brander et
al. (2002), Chemmanur et al. (2011), Bengtsson and Hsu (2010), Hochberg, Lindsey, and
Westerfield (2015)) by focusing on the role of VCs’ impact in the selection of startups. It
is the first paper to model the selection and influence of VC funding jointly in a dynamic
setting. Previous studies either consider the selection and influence of funding separately, or
ignore the dynamic perspective of the funding decision.
Two other novelties include the new hand-collected database and the methodology developed
for the joint estimation of the dynamic model. Along with a few other studies (e.g.,
Hellmann and Puri (2002), Hsu (2004)), this paper exploits a database containing a true
control group for the study of VC funding. Most studies that use proprietary databases (e.g.
VentureXpert, VentureOne, Preqin) examine the differential funding incentive or impact
among all-funded startups. Moreover, the new database contains individual characteristics
that facilitate the study of human capital in this area. The estimation strategy adopts
a Bayesian framework in which parameters are estimated by drawing samples from their
distributions, similar to Korteweg and Sørensen (2010).8 This type of methodology can
be applied for the estimation of econometric models that feature strong interaction among
various driving factors.
The paper proceeds as follows. Section 2 presents the economic model and then briefly
discusses the estimation strategy. Section 3 describes the data and the construction of
measures. Section 4 presents the estimation results and conducts post-estimation analy-
sis. Section 5 presents subsample results and explores alternative measures and models for
robustness checks. Section 6 concludes.
8 More generally, the dynamic interdependence among the key variables maps into a Bayesian network for which a parallel Gibbs Sampler is developed for parameter learning. Bayesian network is a prototype model for the study of the conditional dependencies among random variables.
2. Economic Model
2.1 Economy and Agents
The economy has two types of agents: VC syndicates and startups. With discrete time
periods, the set of VC syndicates is constant and is denoted by I. I assume each individual
VC syndicate, denoted by i, is always present and ready to provide funding. In
contrast, each individual startup, denoted by j, enters the market at birth (i.e., when it is
founded), and exits the market either as a success (e.g. going public or getting acquired) or
as a failure (e.g. going out of business).
I use a list of notations to characterize the different time-varying sets of startups. Et is
the set of existing startups in the market by the end of t. It is also the set of existing startups
at the beginning of t+ 1. Nt is the set of newborn startups that enter the market at t. The
entrance occurs at the end of t after the funding decision is made. IPO/MAt and Dt denote
the sets of successful and unsuccessful exits at t, respectively. The exiting startups do not
need more funding, so the exits occur before the funding decision is made. The startups that
remain in the market need to compete for more funding. I call them funding candidates.
However, not all of them manage to get funded at t. The set of funding candidates is Jt.
There are two identities that relate the different sets of startups. First, each of the existing
startups at the beginning of t falls into exactly one of three cases: exit successfully, exit
unsuccessfully, or remain in the economy. Second, the existing startups at the end of t
consist of the newborn startups and the old startups that do not exit (i.e., the funding
candidates). Equivalently, the two identities can be written as follows using the notations
defined above.
\[ E_{t-1} = \mathrm{IPO/MA}_t + D_t + J_t \tag{K1} \]
\[ E_t = J_t + N_t \tag{K2} \]
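The two identities amount to simple set bookkeeping, where "+" is read as a union of disjoint sets. The following is a minimal sketch in Python with made-up startup labels; the function name `advance_period` and its arguments are illustrative assumptions and not part of the model's notation.

```python
def advance_period(existing_prev, exits_success, exits_failure, newborn):
    """One period of set bookkeeping following identities (K1) and (K2).

    existing_prev : E_{t-1}, startups existing at the beginning of t
    exits_success : IPO/MA_t
    exits_failure : D_t
    newborn       : N_t
    Returns (J_t, E_t).
    """
    exited = exits_success | exits_failure
    # (K1): the three groups are disjoint and together exhaust E_{t-1}.
    assert exits_success.isdisjoint(exits_failure)
    assert exited <= existing_prev
    candidates = existing_prev - exited      # J_t, the funding candidates
    # (K2): startups existing at the end of t are the candidates plus newborns.
    existing_now = candidates | newborn      # E_t
    return candidates, existing_now


# Hypothetical example: four startups exist, one exits by IPO/MA, one dies,
# and one new startup is born at the end of the period.
E0 = {"a", "b", "c", "d"}
J1, E1 = advance_period(E0, {"a"}, {"b"}, {"e"})
```

In this example the funding candidates are `c` and `d`, and the set carried into the next period also contains the newborn `e`.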
2.2 Sequence of Events
This subsection gives the sequence of events at time t. It highlights the dynamics of
three key endogenous variables for startups. In the following, I first give the definitions of
the three variables. I then describe the sequence of events, along with the dynamics of the
key variables.
2.2.1 Three Startup Variables
The first variable, denoted by \(r^j_t\), is equal to the cumulative return of a startup j up to
time t. I call it the “growth variable”. By definition, an investment of $1 in the
startup at inception will be worth \(\exp(r^j_t)\) at time t.9 The second variable, denoted by \(s^j_t\), is a
signal perceived by the public on the well-being of a startup j at time t. I call it the “implicit
exit value”. It determines whether a startup exits at time t and, if it exits, in which way
(e.g. IPO/MA, or Death). The evolution of both variables depends on whether a startup is
funded by a VC syndicate in the previous time period. As a result, they reflect the direct
influence of VC funding.
The third variable, denoted by \(v^{ij}_t\), is the expected value created jointly by a VC syndicate
i and a startup j, if they choose to have a funding relationship at time t.10 The expected
9 The cumulative return is continuously compounded to calculate the value of investment (net of dilution).
10 Mathematically, \(v^{ij}_t \equiv E^{ij}_t\left[\, r^j_{t+1} - r^j_t \mid i \text{ is funding } j \text{ from } t \text{ to } t+1 \,\right]\), namely the value added \((r^j_{t+1} - r^j_t)\) from this period to the next period in the common perspective of i and j at time t, given that i is funding j from t to t+1.
value is subjective and different across all pairs of i and j. Therefore, I call it the “subjective
value added”. An important assumption is that a startup and a VC syndicate in a pair share
a common perspective toward the expected value they would create jointly. Together, the
set of all concurrent expected values will determine the selection of VC funding.
2.2.2 Sequence of Events
[Insert Figure 1 and Figure A5]
A sequence of events occurs at time t, as illustrated in Figure 1.11 First, the growth
variable at time t is determined. It depends on the lagged growth variable at t − 1, the
previous funding relations, and various current features. More specifically, if a startup is
not funded previously, its return depends only on its own features. In contrast, if it is
funded by a VC syndicate previously, its return also depends on the features related to the
VC syndicate. Eq(1) gives the law of motion for the growth variable. The various current
features include: macroeconomic variables \(X_t\), startup features \(X^j_t\), VC syndicate features
\(X^i_t\), and startup-VC syndicate pair features \(X^{ij}_t\). I use \(\phi_{r,y}\) and \(\phi_{r,n}\) to denote the coefficients
associated with relevant features separately for the two cases. The noises are independent
and follow Normal distributions with variances \(\sigma^2_{r,y}\) and \(\sigma^2_{r,n}\).12
\[
r^j_t - r^j_{t-1} =
\begin{cases}
\left[1, X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \sigma_{r,y}\,\varepsilon^j_t & \text{if } j \text{ is funded by } i \text{ at time } t-1 \\
\left[1, X_t, X^j_t\right]\phi_{r,n} + \sigma_{r,n}\,\varepsilon^j_t & \text{if } j \text{ is unfunded at time } t-1
\end{cases}
\tag{1}
\]
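To make the law of motion in Eq(1) concrete, the following sketch draws one step of the growth variable for a single startup. It assumes the noise term is a standard Normal draw scaled by the relevant standard deviation; the function name `growth_step` and all feature and coefficient values are hypothetical and chosen only for illustration.

```python
import numpy as np


def growth_step(r_prev, funded, x_funded, x_unfunded,
                phi_y, phi_n, sigma_y, sigma_n, rng):
    """Draw r_t given r_{t-1} for one startup, following Eq(1).

    x_funded   : [1, X_t, X^j_t, X^i_t, X^ij_t], used if funded at t-1
    x_unfunded : [1, X_t, X^j_t], used if unfunded at t-1
    """
    eps = rng.standard_normal()
    if funded:
        return r_prev + x_funded @ phi_y + sigma_y * eps
    return r_prev + x_unfunded @ phi_n + sigma_n * eps


rng = np.random.default_rng(0)
# Hypothetical inputs: one macro variable and one feature of each type.
x_funded = np.array([1.0, 0.02, 1.0, 1.0, 1.0])
x_unfunded = np.array([1.0, 0.02, 1.0])
phi_y = np.array([0.01, 0.5, 0.1, 0.2, 0.3])
phi_n = np.array([0.0, 1.0, 0.1])

r_next = growth_step(0.0, True, x_funded, x_unfunded,
                     phi_y, phi_n, 0.2, 0.3, rng)
# By the definition of the growth variable, $1 invested at inception
# is worth exp(r) at this point.
value_of_one_dollar = np.exp(r_next)
```

The two branches differ both in the feature vector and in the coefficient and variance parameters, which is how the equation separates previously funded from previously unfunded startups.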
Second, given the growth variable, the implicit exit value is determined. It also depends
on a similar set of features according to whether a startup is previously-funded. In addition,
11 The relationship among the three key startup variables is illustrated in Figure A5.
12 The first subscript “r” indicates that the equation’s dependent variable is r. The second subscript, “y” or “n”, indicates the answer (“yes” or “no”) to the question of whether the startup is previously VC-funded.
the implicit exit value also depends on the concurrent growth variable. This assumption is
to capture the intuition that a startup is more likely to have a successful exit, or equiva-
lently the implicit exit value is higher, when its cumulative return is higher. Eq(2) gives
the determinants of the implicit exit value. As before, I use \(\phi_{s,y}\) and \(\phi_{s,n}\) to denote the
coefficients separately for the previously-funded and previously-unfunded cases. The noises
are independent and follow Normal distributions with variances \(\sigma^2_{s,y}\) and \(\sigma^2_{s,n}\).
\[
s^j_t =
\begin{cases}
\left[1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_{s,y} + \sigma_{s,y}\,\eta^j_t & \text{if } j \text{ is funded by } i \text{ at time } t-1 \\
\left[1, X_t, X^j_t, r^j_t\right]\phi_{s,n} + \sigma_{s,n}\,\eta^j_t & \text{if } j \text{ is unfunded at time } t-1
\end{cases}
\tag{2}
\]
Third, given the implicit exit value, a startup’s status is determined. For a startup j,
the status denotes whether j exits the market at t and in which way if it exits. The status
takes three values, “IPO/MA”, “Death”, and “Survival”, according to the implicit exit value.
Eq(3) gives the correspondence.13 As a result, the set of existing startups at the beginning of
time t breaks into three groups. IPO/MAt contains the startups with status “IPO/MA”. Dt
contains the startups with status “Death”. Jt contains the startups with status “Survival”.
Recall that the startups in Jt are funding candidates and will compete for VC funding at
time t.
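\[
\text{status}^j_t =
\begin{cases}
\text{“IPO/MA”} & \text{if } \delta \le s^j_t \\
\text{“Survival”} & \text{if } -\delta \le s^j_t < \delta \\
\text{“Death”} & \text{if } s^j_t < -\delta
\end{cases}
\tag{3}
\]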
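The mapping from the implicit exit value to a status is a two-threshold rule, so with Normal noise it behaves like an ordered probit. The sketch below reads the Death region as values below -δ, so that the three regions partition the real line, and computes the implied status probabilities for a given conditional mean and standard deviation of the implicit exit value. The helper names are illustrative assumptions; δ = 3 follows the text.

```python
import math


def status_from_signal(s, delta=3.0):
    """Map an implicit exit value s to a status using thresholds -delta and delta."""
    if s >= delta:
        return "IPO/MA"
    if s >= -delta:
        return "Survival"
    return "Death"


def normal_cdf(z):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def status_probabilities(mean, sd, delta=3.0):
    """Status probabilities when s is Normal with the given mean and sd."""
    p_death = normal_cdf((-delta - mean) / sd)
    p_exit = 1.0 - normal_cdf((delta - mean) / sd)
    return {"IPO/MA": p_exit,
            "Survival": 1.0 - p_exit - p_death,
            "Death": p_death}


probs = status_probabilities(mean=0.0, sd=1.0)
```

A higher conditional mean, for example through a higher cumulative return, shifts probability mass from Death toward IPO/MA, which is the intuition the text gives for including the growth variable in Eq(2).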
Fourth, given the set of funding candidates, the subjective value added is determined, for
each pair of startup and VC syndicate in the economy. The pair-wise value depends on a set
of startup features, VC syndicate features, and startup-VC syndicate pair features. Again, it
13 I set δ to 3. It could be any positive value, because the equation for s is unidentified without the specification of δ. A change of δ only shifts and re-scales the distribution of s. Consequently, the estimated intercept is shifted and the other coefficients are multiplied by a common factor.
also depends on a startup’s growth variable. Eq(4) gives the determinants of the subjective
value added.14 I use φv to denote the coefficients. The noises are independent and follow
standard Normal distributions.15 Fifth, given the subjective value added, the equilibrium
funding at time t is determined as a two-sided matching between the set of VC syndicates
and the set of funding candidates. I use µt to denote the funding. More details will be given
in the following subsection.
\[ v^{ij}_t = \left[X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_v + \xi^{ij}_t, \quad \text{for all } i \in I,\ j \in J_t \tag{4} \]
Lastly, after the funding relationship is established, newborn startups enter into the
market at the end of t. Their company value is set to $1, or equivalently their growth
variable is set to zero. Thus, the existing startups at the end of t include newborn startups
and the old startups that have not exited.
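As an illustration of Eq(4), the sketch below builds the full matrix of subjective value added for every VC syndicate and funding candidate. The array layout, the function name `subjective_value_added`, and the numbers are assumptions made for this example; only the structure (startup, VC, and pair features plus the growth variable, with standard Normal noise and no intercept or macroeconomic terms) comes from the text.

```python
import numpy as np


def subjective_value_added(x_startup, x_vc, x_pair, r, phi_v, rng):
    """Return the I-by-J matrix with entries v[i, j] from Eq(4).

    x_startup : (J, a) startup features
    x_vc      : (I, b) VC syndicate features
    x_pair    : (I, J, c) pair features
    r         : (J,) growth variable of each funding candidate
    phi_v     : (a + b + c + 1,) coefficients
    """
    n_vc, n_startup = x_vc.shape[0], x_startup.shape[0]
    features = np.concatenate([
        np.broadcast_to(x_startup[None, :, :],
                        (n_vc, n_startup, x_startup.shape[1])),
        np.broadcast_to(x_vc[:, None, :],
                        (n_vc, n_startup, x_vc.shape[1])),
        x_pair,
        np.broadcast_to(r[None, :, None], (n_vc, n_startup, 1)),
    ], axis=2)
    noise = rng.standard_normal((n_vc, n_startup))   # variance fixed at 1
    return features @ phi_v + noise


# Hypothetical example with 2 VC syndicates and 3 funding candidates.
rng = np.random.default_rng(0)
v = subjective_value_added(
    x_startup=np.ones((3, 1)),
    x_vc=np.ones((2, 1)),
    x_pair=np.zeros((2, 3, 1)),
    r=np.array([0.0, 1.0, 2.0]),
    phi_v=np.array([1.0, 1.0, 1.0, 0.5]),
    rng=rng,
)
```

Only the ranking of the entries of `v` matters for the funding decision, which is why the intercept and the noise variance can be normalized.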
2.3 Funding Decision
The funding decision µt lasts for one period from time t to time t+1. I use the two-sided
matching model in Sørensen (2007) as the prototype for the one-period funding decision.
Two-sided means that both the VC syndicates and the startups are active in the search
of a funding relationship. However, a startup can only be matched to one VC syndicate,
while a VC syndicate can be matched with multiple startups. The number of startups a VC
syndicate i funds at t is denoted by \(q^i_t\), and it will be calibrated to equal the actual number
in estimation. As discussed before, the set of subjective value added \(\{v^{ij}_t : i \in I, j \in J_t\}\)
14 The equilibrium funding decision is determined through the comparison of the subjective value added. This will be discussed later. However, a common shift of all concurrent subjective value added will not change the equilibrium funding. Thus, the coefficients associated with the intercept and the macroeconomic variables are unidentified. Therefore, those variables are not included in the equation for v.
15 I set the variance to 1. It can be any arbitrary positive value. Again, this is because the equation for v is unidentified without the specification of it. The change of variance only re-scales the distribution of v and will not change the equilibrium funding decision.
determines the equilibrium matching µ∗t . In the following, I first give the utility maximization
problem for both agents in terms of the subjective value added, then discuss the equilibrium.
2.3.1 Preferences and Choice Variable
Given a funding relationship µt, the utility is the sum of subjective value added that an
agent creates. For a startup j, \(U^j_t\) equals \(v^{ij}_t\) if there is a funding relationship between i and
j. For a VC syndicate i, \(U^i_t\) equals the sum of \(v^{ij}_t\) over all startups j with which i has a funding
relationship. The choice variable is the indicator whether the pair (i, j) is in \(\mu_t\). Both startups and
VC syndicates can propose to the counterparty for the establishment of a pair. However,
the pair (i, j) ends up in µt if and only if both i and j want it to exist. Eq(5) and Eq(6)
define the utility function and state the utility maximization problem.
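\[ U^i_t = \sum_{j:\, ij \in \mu_t} v^{ij}_t, \quad \text{s.t. } \left|\{j : ij \in \mu_t\}\right| \le q^i_t \tag{5} \]
\[ U^j_t = \sum_{i:\, ij \in \mu_t} v^{ij}_t, \quad \text{s.t. } \left|\{i : ij \in \mu_t\}\right| \le 1 \tag{6} \]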
There are two assumptions behind the definition of utility. First, for each pair of VC
syndicate and startup, I assume they agree with the subjective value added vijt . Second, I
assume that a common fraction, say λ, of the value added goes to a VC syndicate, and
the remaining fraction, 1 − λ, goes to a startup. This assumption allows me to ignore the
differences in bargaining power and agency problems. Thus, the ranking among the set of
expected value added is sufficient to determine the equilibrium funding.
2.3.2 Pairwise Stability and Equilibrium
The equilibrium matching µ∗t is pairwise stable. Pairwise stability means that there exists
no pair that can gain from pairwise deviation. A pairwise deviation would occur for a pair
(i, j) not in µt if both i and j prefer each other to their current matched counterparties.
As a result, i and j would break up with their existing matched counterparties and form a
new pair (i, j) between them. In the equilibrium matching µ∗t , such a pairwise deviation is
not profitable for any pair.
The equilibrium matching exists and is unique if all \(v^{ij}_t\) are distinct.16 The equilibrium
condition can be characterized by a set of inequalities as in Sørensen (2007). Eq(7) and Eq(8)
give the inequalities. For a pair (i, j) not in \(\mu_t\), \(v^{ij}_t\) is below the larger of the two opportunity
costs that i and j would incur by breaking up with their matched counterparties. The two opportunity costs
are equal to \(\min_{j':\, ij' \in \mu_t} v^{ij'}_t\) and \(v^{\mu_t(j)j}_t\) respectively. Here, I use \(\mu_t(j)\) to denote the matched
VC syndicate for j at t. For a pair (i, j) in \(\mu_t\), \(v^{ij}_t\) is greater than all \(v^{i'j}_t\) for which i′ wants to
deviate to j and all \(v^{ij'}_t\) for which j′ wants to deviate to i. Let \(S_t(i)\) denote the set of j′ that
want to deviate to i, and let \(S_t(j)\) denote the set of i′ that want to deviate to j. Eq(11) and
Eq(12) give the expressions for \(S_t(i)\) and \(S_t(j)\) respectively.
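\[ (i, j) \notin \mu_t \iff v^{ij}_t < \overline{v}^{ij}_t \tag{7} \]
\[ (i, j) \in \mu_t \iff v^{ij}_t \ge \underline{v}^{ij}_t \tag{8} \]
\[ \overline{v}^{ij}_t = \max\left[\min_{j':\, ij' \in \mu_t} v^{ij'}_t,\; v^{\mu_t(j)j}_t\right] \tag{9} \]
\[ \underline{v}^{ij}_t = \max\left[\max_{j' \in S_t(i)} v^{ij'}_t,\; \max_{i' \in S_t(j)} v^{i'j}_t\right] \tag{10} \]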
16 This is because i and j share the same perspective of \(v^{ij}_t\). A proof is given in Sørensen (2005). In general, it is not true. The stable matching problem can be solved by the Gale-Shapley algorithm.
\[ S_t(i) = \left\{ j' \in J_t : v^{ij'}_t > v^{\mu_t(j')j'}_t \right\} \tag{11} \]
\[ S_t(j) = \left\{ i' \in I : v^{i'j}_t > \min_{j':\, i'j' \in \mu_t} v^{i'j'}_t \right\} \tag{12} \]
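Because both sides of a pair share the same value, a pairwise stable matching can be found with a simple greedy pass over pairs ranked by value: the highest-valued remaining pair is each other's top choice and therefore must be matched. The sketch below uses this greedy pass rather than Gale-Shapley and is not the paper's estimation code. It assumes distinct values, that every pair is acceptable to both sides, and that capacities are given; the function names and numbers are hypothetical. A second function checks pairwise stability directly by searching for blocking pairs.

```python
def stable_matching(v, capacity):
    """Greedy stable matching under common pair values.

    v        : dict {(i, j): value} shared by VC syndicate i and startup j
    capacity : dict {i: q_i}, number of startups i can fund
    Returns a dict mapping each funded startup j to its VC syndicate i.
    """
    remaining = dict(capacity)
    matched = {}
    ranked = sorted(v.items(), key=lambda kv: kv[1], reverse=True)
    for (i, j), _ in ranked:
        if j not in matched and remaining[i] > 0:
            matched[j] = i
            remaining[i] -= 1
    return matched


def blocking_pairs(v, capacity, matched):
    """List unmatched pairs (i, j) in which both sides gain by deviating."""
    portfolio = {i: [] for i in capacity}
    for j, i in matched.items():
        portfolio[i].append(v[(i, j)])
    blocks = []
    for (i, j), value in v.items():
        if matched.get(j) == i:
            continue
        # j gains if it is unfunded or values i above its current match.
        j_gains = j not in matched or value > v[(matched[j], j)]
        # i gains if it has spare capacity or j beats its worst current match.
        i_gains = len(portfolio[i]) < capacity[i] or (
            len(portfolio[i]) > 0 and value > min(portfolio[i]))
        if j_gains and i_gains:
            blocks.append((i, j))
    return blocks


# Hypothetical example: syndicate A funds two startups, B funds one.
values = {("A", "x"): 5.0, ("A", "y"): 3.0, ("A", "z"): 1.0, ("A", "w"): 0.5,
          ("B", "x"): 4.0, ("B", "y"): 2.0, ("B", "z"): 2.5, ("B", "w"): 0.2}
caps = {"A": 2, "B": 1}
mu = stable_matching(values, caps)
```

In this example startup `w` stays unfunded: neither syndicate values it above its worst current match, which mirrors the inequality for pairs outside the matching.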
2.4 Estimation Strategy
The parameter estimation is performed jointly on the main system of equations Eq(1),
Eq(2), and Eq(4). The parameters include \((\phi_{r,y}, \sigma^2_{r,y})\), \((\phi_{r,n}, \sigma^2_{r,n})\), \((\phi_{s,y}, \sigma^2_{s,y})\), \((\phi_{s,n}, \sigma^2_{s,n})\), and
\(\phi_v\). Eq(1) and Eq(2) give the direct influence of VC funding on startup growth and implicit
exit value. The direct influence is shown in a comparison between the previously-funded and
previously-unfunded startups. Eq(4) describes subjective value added as the determinant of
the selection of VC funding. The selection corresponds to a two-sided matching between VC
syndicates and funding candidates.
Not all variables in the main system of equations are observed. The three dependent
variables are latent or only partially observed. Thus, the estimation involves the imputation
of these variables.17 The observed data is composed of four pieces. The first piece is the
partially observed growth variable. It is observed at birth and exit of a startup, or when
the startup gets VC funding. The second piece is the startup status (e.g. IPO/MA, Death,
Survival) at each time period. It helps the imputation of the implicit exit value. The third
piece is the equilibrium funding. It helps the imputation of the subjective value added. The
last piece includes the macroeconomic variables, and the various features of startups, VC
syndicates, and startup-VC syndicate pairs. They are the independent variables for the three
equations.
I use a Gibbs Sampler to estimate the parameters in a Bayesian framework. In fact, it is
simpler to estimate in this way. First, regarding the interdependence among the key variables,
17 The imputation relies on the observed data, the last-updated parameter estimates, and most importantly, the interdependence among the three key variables.
it is impossible to use regressions alone to accomplish the joint estimation. Also, the existence
of latent variables makes it impractical to estimate using a GMM/SMM strategy. Therefore,
it is easiest to use the Bayesian framework because it can handle the interdependence in the
presence of latent variables. The Bayesian framework gives tractable posterior distributions
for all the parameters and variables given proper priors. Thus, a tractable algorithm can
be implemented using the Gibbs Sampler. The Gibbs Sampler iterates between parameter
updates and the latent variable imputations. Appendix A gives the detailed algorithm for
the estimation strategy.
The distribution assumption is given as follows. I assume that the noises in Eq(1), Eq(2),
and Eq(4) are independent.18 As defined before, they follow Normal distributions with different
variances. I also assume the parameters have conjugate priors. In Eq(1) and Eq(2),
the priors of the joint distributions \((\phi, \sigma^2)\) follow Normal-Inverse-Gamma distributions. In
Eq(4), the prior of \(\phi_v\) follows a Normal distribution. Appendix A gives the detailed assumptions
for the priors. For post-estimation analysis, I use the posterior t-statistics for hypothesis
testing.
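To illustrate what a parameter update with a conjugate Normal-Inverse-Gamma prior looks like, the sketch below implements the textbook posterior for a linear regression with Normal noise and draws one sample of the coefficients and variance. It is not the algorithm of Appendix A: the latent variables are treated as already imputed, and the prior hyperparameters `m0`, `V0`, `a0`, `b0` and all function names are assumptions made for this example. In a Gibbs Sampler, a draw of this kind would alternate with the imputation of the latent variables.

```python
import numpy as np


def nig_posterior(X, y, m0, V0, a0, b0):
    """Posterior hyperparameters for y = X phi + sigma * eps.

    Prior: phi | sigma^2 ~ Normal(m0, sigma^2 V0),
           sigma^2 ~ Inverse-Gamma(a0, b0).
    """
    V0_inv = np.linalg.inv(V0)
    Vn_inv = V0_inv + X.T @ X
    Vn = np.linalg.inv(Vn_inv)
    mn = Vn @ (V0_inv @ m0 + X.T @ y)
    an = a0 + 0.5 * len(y)
    bn = b0 + 0.5 * (y @ y + m0 @ V0_inv @ m0 - mn @ Vn_inv @ mn)
    return mn, Vn, an, bn


def draw_phi_sigma2(X, y, m0, V0, a0, b0, rng):
    """One joint draw of (phi, sigma^2) from the conjugate posterior."""
    mn, Vn, an, bn = nig_posterior(X, y, m0, V0, a0, b0)
    # If G ~ Gamma(shape=an, rate=bn), then 1 / G ~ Inverse-Gamma(an, bn).
    sigma2 = 1.0 / rng.gamma(shape=an, scale=1.0 / bn)
    phi = rng.multivariate_normal(mn, sigma2 * Vn)
    return phi, sigma2


# Hypothetical data: intercept plus one feature, true noise sd of 0.5.
rng = np.random.default_rng(1)
n = 2000
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
phi_true = np.array([0.5, -1.0])
y = X @ phi_true + 0.5 * rng.standard_normal(n)

phi_draw, sigma2_draw = draw_phi_sigma2(
    X, y, m0=np.zeros(2), V0=100.0 * np.eye(2), a0=2.0, b0=1.0, rng=rng)
```

With a diffuse prior and this much simulated data, the draws concentrate near the values used to generate the data, which gives a quick sanity check on the update formulas.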
3. Data and Measure
Most research in venture capital uses proprietary databases (e.g. VentureOne, VentureX-
pert). For the study of VC’s impact on startups, the biggest disadvantage of these databases
is that they lack a control group of non-VC-funded startups. In this paper, I hand-collect a
novel database from a leading startup platform, Crunchbase, that provides information for
both VC-funded and non-VC-funded startups.
Founded in 2007, Crunchbase had 1.5 million unique visitors each month in 2013. By 2014,
it had records of more than 290,000 companies (e.g. startups, VCs, incubators, accelerators,
18 Later on, in an extended model, the three noises are assumed to be correlated.
etc.) and 310,000 individuals (e.g. entrepreneurs, venture capitalists, angel investors, etc.)
across 176 countries. For companies, a typical record includes founding information, current
status (IPO, acquired, alive, dead), acquisitions and investments history19, funding history20,
products and categories information21, contact information, and company news. There is also
human capital information that relates companies to individuals. The individuals include
founders, angel investors, board members and advisors, and personnel on the current and
past teams of management for a company. For individuals, a typical record includes name,
gender, primary location, employment history, and educational background.
One unique feature about Crunchbase is that it collects information by crowdsourcing.
The advantage is that the database can grow very quickly, with
insiders, especially entrepreneurs, feeding detailed information. Another database that is
built in this way is Wikipedia, which has more than 120,000 regular contributors and 12,000
editors by now. As with Wikipedia, the disadvantage of Crunchbase is that it may contain
some inaccurate information. The Crunchbase team combines human and machine reviews
to limit such errors.22
To check the credibility, I manually compare the Crunchbase profiles for a subsample of
startups with the information from major business journals and proprietary databases. The
subsample includes 250 startups with successful exits (IPO/MA) and 790 startups that have
funding records from VentureXpert. More specifically, I compare the numbers for the money
raised in IPO, the transaction value for MA, and the money invested in the funding rounds.
Those numbers are similar for the information collected from Crunchbase and from other
sources. In addition, I apply a number of filters to select the startups with the most accurate
information for model estimation. The filtering procedure will be discussed in detail in the
19 The company is the acquirer or investor.
20 The company is the investee.
21 The category is classified by Crunchbase denoting the sub-industry for the company. The categories are not mutually exclusive.
22 For more details, please see https://info.crunchbase.com/about/faqs.
construction of sample.
Finally, one prevalent concern about any startup database is that it may contain some zombie
companies. A zombie company shows as alive on the record but is actually out of business.
Thus, I need to change the final status from “Survival” to “Death” for zombie companies
in my sample. To do that, I visit each startup’s website in the sample if its final status
is “Survival” according to the Crunchbase profile. It turns out that over 65% of the dead
companies in the sample are detected in this way.
3.1 Sample
The sample period is from 1998 to 2014. I apply a number of filters to select a sample that
is suitable for the study and has the most accurate information. The first set of filters is on
startups. A startup must have a birth year of 1998 or later to be included in
the sample. Moreover, I include a startup only if its Crunchbase profile has website, category,
headquarters, founder, and current management team information. This gives
a total of 29,184 startups.
The second set of filters is on VCs and funding rounds. For VCs, I include only experienced
VCs that have participated in at least ten funding rounds in the sample. This restriction reflects
the model assumption that VCs are always ready to provide funding. This gives a total
of 765 VCs. For funding rounds, a filter is applied on the type of funding. It excludes
angel-investing, debt-financing, equity or product-crowdfunding, grant, non-equity-issuance,
post-ipo-debt or equity, and secondary-market-investing. A funding round also needs to have
information on the investment amount and post-investment valuation to be included
in the sample. This gives a total of 21,483 funding rounds.
The third set of filters is on IPOs and MAs. For IPOs, I delete a record if a company’s
market value at IPO cannot be calculated or otherwise obtained from other sources. Fortunately,
no record is dropped in this way. For MAs, I delete a record if either the transaction
value or the acquired proportion of a company is missing and unavailable from the SDC Platinum
Mergers & Acquisitions database. This gives a total of 323 IPO and 1,728 MA
records.
Finally, the resulting datasets of startups, VCs, funding rounds, IPOs and MAs need to
be consistent with one another. For instance, I delete the whole record of a startup if its
corresponding IPO, MA or funding-round record is excluded due to missing information. I
also delete the whole record of a VC if all of its funding records have been deleted by
the above filters. As a result, VC-funded startups account for a smaller proportion of the
sample than of the dataset resulting from the first set of filters. To keep the proportion
roughly the same, I randomly drop non-VC-funded startups with a preference for those with
the least information in the database.
[Insert Table 1]
Table 1 gives a descriptive summary of the final sample. It contains 9,303 startups and
755 VCs that have formed 2,844 distinct VC syndicates. Among the 9,303 startups, only
2,350 (25.26%) have been funded by VCs. Of the VC-funded ones, about 22.47% go public
or get acquired and 17.62% eventually die. For the non-VC-funded ones, the proportions are
1.57% and 27.02% respectively. The IPO/MA rate of VC-funded startups is thus about 14
times that of non-VC-funded ones. Regarding the number of rounds for VC-funded startups,
both the median and the mode are 2 rounds per startup. Regarding the size of VC syndicates,
the mean and the median are 2.93 and 3 VCs per syndicate.
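As a quick arithmetic check on the proportions quoted above, the following sketch recomputes them from the figures in the text. The variable names are illustrative only, and the ratio is a raw comparison of the two groups rather than an estimate of a causal effect.

```python
# Figures quoted in the text for the final sample (Table 1).
n_startups = 9_303
n_vc_funded = 2_350

# Share of startups in the sample that received VC funding.
share_funded = n_vc_funded / n_startups

# IPO/MA rates (in percent) quoted for the two groups.
ipo_ma_rate_funded = 22.47
ipo_ma_rate_unfunded = 1.57

# Ratio of the two exit rates (descriptive, not causal).
exit_rate_ratio = ipo_ma_rate_funded / ipo_ma_rate_unfunded

print(f"share funded: {share_funded:.2%}")
print(f"IPO/MA rate ratio: {exit_rate_ratio:.1f}")
```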
3.2 Measures
There are two groups of variables for which I need to construct measures. The first group
consists of startups’ birth, final status (e.g. IPO/MA, Survival, Death), and funding status
(e.g. VC-funded or not, and by which VC syndicate if funded). It is straightforward to
construct these measures as they are directly recorded in the database. They will be used
to update the posterior distributions of latent variables.
The second group consists of the dependent and independent variables in the main system
of equations for estimation (Eq(1), Eq(2), and Eq(4)). The three dependent variables are the
cumulative return, the implicit exit value, and the subjective value added. Among them, the
implicit exit value and the subjective value added are latent variables to be imputed. The
cumulative return is observed sporadically. Based on these observations, interim values are
imputed during estimation. The independent variables include various observed features of
startups and VC syndicates. The following subsections detail the construction of the cumulative return
and these features; Appendix B gives a summary.
3.2.1 Cumulative Return
By definition, the cumulative return is the logarithm of a startup’s valuation. I assume
that all startups have a valuation equal to 1 at birth. Later on, a startup has its valuation
revealed at exit or at funding rounds. I estimate the valuation in three ways. First, I calculate
the valuation at exit for startups that exit during the sample period. These are the
startups with a final status equal to “IPO/MA” or “Death”. For “IPO/MA” ones, I set the
valuation to be the reported market value at IPO or the deal price divided by the percentage
acquired in MA. For “Death” ones, I sample the valuation from a triangular distribution with
a mode of 0.1. To determine the exact time of death, I collect additional information on
the last time these dead startups had any events or news. I then sample the exact time of
death from a uniform distribution spanning 6 to 30 months after that time.
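The sampling step for dead startups can be sketched as follows. Only the mode of 0.1 and the 6 to 30 month window come from the text; the support of the triangular distribution (taken here as 0 to 1), the function names, and the example date are assumptions made for illustration.

```python
import random


def sample_death_valuation(rng, low=0.0, high=1.0, mode=0.1):
    """Draw a valuation at death from a triangular distribution.

    The mode (0.1) is from the text; the bounds `low` and `high` are
    assumed here because the text does not report them.
    """
    return rng.triangular(low, high, mode)


def sample_death_time(rng, last_event_month, min_lag=6.0, max_lag=30.0):
    """Draw the time of death (in months) uniformly between 6 and 30
    months after the last observed event or news item."""
    return last_event_month + rng.uniform(min_lag, max_lag)


rng = random.Random(0)  # fixed seed so the draws are reproducible
valuation = sample_death_valuation(rng)
# Hypothetical startup whose last recorded event is in month 48 of its life.
death_month = sample_death_time(rng, last_event_month=48.0)
print(valuation, death_month)
```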
Second, I calculate the valuation at funding rounds for startups that have received VC
funding during the sample period. The valuation is net of the pure money effect of investment,
namely “anti-diluted” as defined in Cochrane (2005) and Korteweg and Sørensen
(2010). More specifically, the valuation can be calculated by compounding the “anti-diluted”
period-to-period return. The period-to-period return from the last funding round to the current
funding round is equal to the ratio of the pre-investment value at the current round to the
post-investment value at the last round.23 As a result, the “anti-diluted” valuation is equal to the
product of the period-to-period returns between neighboring funding rounds. This definition
measures the growth rate of a VC-funded startup excluding the dollar amount of
investment.
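A minimal sketch of this compounding is given below. It assumes that the pre-investment value equals the post-investment value minus the amount invested, and that the first round's return is measured against the birth valuation of 1; the exact formula is in Appendix B, so both points are assumptions of this illustration, as are the function name and the example numbers.

```python
import math


def anti_diluted_valuations(rounds, initial_valuation=1.0):
    """Compound "anti-diluted" period-to-period returns across rounds.

    `rounds` is a chronological list of (investment, post_money) pairs.
    The return between two rounds is the pre-money value at the current
    round divided by the post-money value at the previous round.
    """
    valuations = []
    valuation = initial_valuation
    previous_post_money = initial_valuation  # assumed base for round one
    for investment, post_money in rounds:
        pre_money = post_money - investment  # assumed definition
        period_return = pre_money / previous_post_money
        valuation *= period_return
        valuations.append(valuation)
        previous_post_money = post_money
    return valuations


# Hypothetical startup with three funding rounds (arbitrary money units).
rounds = [(2.0, 10.0), (5.0, 25.0), (10.0, 60.0)]
valuations = anti_diluted_valuations(rounds)
cumulative_returns = [math.log(v) for v in valuations]
print(valuations, cumulative_returns)
```

Because each new investment is subtracted before the ratio is taken, cash injected by the VC does not count as growth.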
Third, I estimate the valuation for startups whose final status is “Survival” by the end of
the sample period. The estimation is based on current startup performance. In particular, I
visit their websites and use their Crunchbase profiles (e.g. company description, current team
of management, offices, investments and acquisitions24) to evaluate the performance. I then
classify the startups into three groups, ranked in decreasing order of performance.
Next, for each group, I find comparable startups with similar features and non-missing
valuations at exit. Finally, I impute the valuation for each group by drawing samples from
a smoothed distribution of the exit values of comparable startups.25 Table 2 Panel A gives
the summary statistics for the observed cumulative return. As implied by the percentile
information, the cumulative return is very dispersed and has a bimodal distribution.
23 Appendix B gives the formula for the “anti-diluted” period-to-period return and company value.
24 The startups are the investors and acquirers.
25 Admittedly, the estimation is subjective. Nevertheless, given the huge variation in startups’ performance, it is better to have a rough estimate than to leave the valuation missing. In the latter case, the distribution of imputed interim returns would be unrealistically flattened. Also, the returns for successful and unsuccessful startups should follow very different distributions.
[Insert Table 2]
3.2.2 Macroeconomic Variables
The macroeconomic variables include the risk-free rate (rf ), the Fama-French three factors
((rm − rf ), smb, hml), and a proxy for the cost of long-term borrowing (Ybaa − Yus10).
(Ybaa − Yus10) is the spread of the yield on Moody’s seasoned Baa corporate bonds
over the yield on the 10-year Treasury bond. Both the risk-free rate and the spread are obtained
from the Federal Reserve Bank of St. Louis. The Fama-French three factors are from Ken
French’s website. (rm − rf ) is the excess market return over the risk-free rate; smb is the factor
return on the small-minus-big portfolio; and hml is the factor return on the high-minus-low
portfolio. Table 2 Panel B gives the summary statistics.
3.2.3 Startup Features
The startup features are either constant or time-varying. The constant startup features
cover three dimensions: location, category, and product. # locations is the number of
cities in which a startup’s headquarters or offices are located. # categories is the number of
categories a startup is classified into by Crunchbase. # products is the number of products
that a startup has. In addition, I construct a set of dummies 1(LOC) to indicate whether
a startup has its headquarters or an office located in a specific place LOC. LOC takes “ny” for
New York, “ca” for California, “ous” for other places in the U.S., “ona” for other places in North
America, “as” for Asia, and “eu” for Europe. Table A1 gives the lists of the top 20 cities and
categories with the most startups.
The time-varying startup features characterize funding history and human capital information.
For funding history, t from last round is the time in years since the last funding
round; t2 from last round is its square. # rounds is the number of funding rounds
experienced in the past. For human capital information, # startups founded is the number
of companies that the startup founder has built in the past. 1(top20 school) is the dummy
variable indicating whether a startup has people on its management team who graduated
from a top school at a specific time. Table A2 gives the top school list.26 Table 2 Panel B
gives the summary statistics for the startup features.
3.2.4 VC Syndicate Features
The VC syndicate features are also either constant or time-varying. The constant
features characterize geographical and size information. # of VCs is the number of VCs in a
syndicate and measures the size of the syndicate. # locations is the number of cities in which
at least one member VC of the syndicate has an office or headquarters.
Table A3 gives the lists of the top 20 cities with the most VCs (Panel B) and with the most
VC syndicates (Panel C). A comparison between the two lists shows that cooperation is
prevalent among VCs in different locations. For instance, the percentage of VCs that have an
office in New York, San Francisco, Menlo Park, and Palo Alto is at most 15%. In contrast,
the percentage of VC syndicates that have an office in these places is at least 44%.
The time-varying features characterize investment and human capital information. For
a VC syndicate, 1(cooperated) is a dummy variable indicating whether any member VCs
have cooperated in the past. # categories is the number of categories in which
at least one member VC of the syndicate has past investment experience. # rounds
is the median number of funding rounds that member VCs have participated in before. This variable
measures the typical experience of a VC syndicate. Finally, 1(top20 school) is the dummy
variable indicating whether a VC syndicate has people who graduated from a top school
on its members’ management teams at a specific time. Table A4 gives the top school list.
Table 2 Panel C gives the summary statistics for the VC syndicate features.
26 Both lists in Table A2 and Table A4 exclude some of the best schools but include some schools with lower ranks. There are two reasons. First, I give priority to schools that provide the best education in some technical fields (e.g. engineering, biochemistry). Second, I select schools that appear most frequently in the educational background of VC and startup personnel, since these schools should have very strong alumni networks.
3.2.5 Startup-VC Syndicate Pair Features
Last, I construct measures for each pair of startup and VC syndicate. There are
26,457,732 startup-VC syndicate pairs in total (9,303 startups × 2,844 VC syndicates). The
constant features measure the pairwise geographical distance. Distance is the closest distance
in miles between a startup and a VC syndicate. Based on that, days of travel is the number of
(additional) days for a round trip. It equals 0 if distance is within 100 miles, 1 if distance is
between 100 and 1,000 miles, 2 if distance is between 1,000 and 10,000 miles, and 3 otherwise.27
About 25% of all pairs have a distance within 100 miles. For comparison, about 75% of the
pairs that have a funding relationship have a distance within 100 miles.
The time-varying features describe the existence of funding relationships and alumni ties.
For a startup-VC syndicate pair, 1(funding tie) is a dummy variable indicating whether
any VC member has funded the startup in the past. 1(alumni tie) is a dummy variable
indicating whether any VC member and the startup have people who graduated from the same
school. Table 2 Panel D gives the summary statistics for the startup-VC syndicate features.
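The pair features defined in this subsection can be sketched as follows. The function names and input formats are hypothetical, and the treatment of distances exactly at the 100, 1,000 and 10,000 mile boundaries (assigned to the lower bucket) is an assumption, since the text does not specify it.

```python
def days_of_travel(distance_miles):
    """Map the closest distance (in miles) to the number of additional
    travel days: 0, 1, 2 or 3, following the buckets in the text."""
    if distance_miles <= 100:
        return 0
    if distance_miles <= 1_000:
        return 1
    if distance_miles <= 10_000:
        return 2
    return 3


def funding_tie(past_funders, syndicate_members):
    """1(funding tie): 1 if any member VC has funded the startup before."""
    return int(bool(set(past_funders) & set(syndicate_members)))


def alumni_tie(startup_schools, member_schools):
    """1(alumni tie): 1 if any member VC shares a school with the startup.

    `startup_schools` is a set of school names; `member_schools` is a
    list with one set of school names per member VC.
    """
    return int(any(startup_schools & schools for schools in member_schools))


# Hypothetical pair: one startup and a two-member syndicate.
print(days_of_travel(350.0))
print(funding_tie(["vc_a"], ["vc_a", "vc_b"]))
print(alumni_tie({"Stanford"}, [{"MIT"}, {"Stanford", "Harvard"}]))
```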
27 For distance within 100 miles, I assume a one-day round travel by car. For distance between 100 and 1,000 miles, I assume a two-day travel by flight. For distance between 1,000 and 10,000 miles, I assume a three-day travel by flight. For distance greater than 10,000 miles, I assume an intercontinental travel.
4. Estimation Results
4.1 Baseline Estimation
Table 3 presents the joint estimation result for the baseline model. In the table, the first
four columns show the direct influence of VC funding on startup growth and implicit exit
value as a comparison between previously-funded and previously-unfunded startups. Among
them, columns 1 and 2 give the parameters for the law of motion of the growth variable in
Eq(1); columns 3 and 4 give the parameters for the dynamics of the implicit exit value in
Eq(2). The last column shows the determinants of the selection of VC funding. It gives the
parameters for the dynamics of the subjective value added in Eq(4).
[Insert Table 3]
4.1.1 Influence of VC Funding
The influence of VC funding on startup growth comes from two sources. First, there is a
direct impact from the funding VC syndicate features and the pair features. Note that these
features are non-missing only for previously-funded startups. Second, there is an indirect
impact given the presence of the funding VC syndicate. The indirect impact changes the
effects of the macroeconomic variables and the startup features. It is shown as a comparison
between the previously-funded and unfunded startups.
For the direct impact, various VC syndicate and pair features are significant. For
instance, funded startups’ growth is negatively correlated with VC syndicate size (# of VCs)
and positively correlated with the educational background of individual venture capitalists
(1(top20 school)). In addition, past cooperation among any VC members (1(cooperated))
also helps funded startups grow faster. Regarding the pairwise features, both the existence
of alumni ties (1(alumni tie)) and the existence of past funding relations (1(funding tie))
correspond to a higher growth rate for the funded startups. This might result from a reduction
in asymmetric information and agency costs, facilitated by learning through social networks or
past cooperation.
For the indirect impact, previously-funded startups are less dependent on the macroeco-
nomic conditions. For instance, the cost of alternative funding, indicated by the difference
between the BAA corporate bond yield and the 10-year Treasury yield (Ybaa − Yus10), has a
more significant negative effect on startup growth when the startup is previously-unfunded.
Likewise, startup growth depends more on the Fama French three factors ((rm − rf ), smb,
hml) as well as the risk-free rate (rf ) without funding.
Regarding the startup features, one surprising result is that the effects of startup human
capital features exhibit opposite signs for the previously-funded and unfunded cases.
Founding experience of entrepreneurs (# startups founded) has a positive effect on a startup’s
growth when it is not funded. However, the effect becomes negative in the presence of VC
funding. Similarly, the educational background of a startup’s management team (1(top20
school)) promotes growth without funding but impedes growth with funding. The positive
impact of a startup’s human capital seems to be supplanted by that of the funding VC syndicate.
The change in sign suggests a power struggle between the top executives of the funded
startups and the funding VCs.
The influence of VC funding on a startup’s implicit exit value operates mainly through its effect on
startup growth. Faster startup growth implies better startup quality (i.e., higher cumulative
return) and corresponds to a higher implicit exit value. As a result, a startup is more likely
to exit through IPO or MA and less likely to go out of business. For the previously-funded
case, a one standard deviation increase in the cumulative return (7.408) is associated with an
increase of 0.207 in the implicit exit value. It corresponds to a 49.6% relative increase in the
probability of IPO/MA and a 35.4% relative decrease in the probability of Death compared
with the benchmark mean values.28 For the previously-unfunded case, the relative increase
and decrease in the probabilities of IPO/MA and Death are 46.3% and 45.2% respectively.
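The probability changes for the previously-funded case can be approximately reproduced with the short sketch below. It reads the accompanying footnote as saying that the implicit exit value is Normal, with IPO/MA occurring above +δ and Death below −δ (δ = 3); this reading and the function name are assumptions of the illustration. The shifted mean is 0.044 + 0.207 = 0.251.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # c.d.f. of the standard Normal
DELTA = 3.0             # threshold reported in the footnote


def exit_probabilities(mean, sd, delta=DELTA):
    """Return (P(IPO/MA), P(Death)) for a Normal implicit exit value,
    assuming IPO/MA above +delta and Death below -delta."""
    p_success = PHI((mean - delta) / sd)
    p_death = PHI((-mean - delta) / sd)
    return p_success, p_death


# Values reported for the previously-funded case.
mean, sd, shift = 0.044, 1.307, 0.207
p0_success, p0_death = exit_probabilities(mean, sd)
p1_success, p1_death = exit_probabilities(mean + shift, sd)

rel_increase = p1_success / p0_success - 1  # relative rise in P(IPO/MA)
rel_decrease = 1 - p1_death / p0_death      # relative fall in P(Death)
print(f"{p0_success:.4f} -> {p1_success:.4f} ({rel_increase:+.1%})")
print(f"{p0_death:.4f} -> {p1_death:.4f} ({-rel_decrease:+.1%})")
```

Small differences from the reported percentages are to be expected, since the inputs used here are rounded.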
4.1.2 Selection of VC Funding
The selection of VC funding depends on the subjective value added. One of the determi-
nants for the subjective value added is startup quality. A one standard deviation increase in
the cumulative return (7.172)29 is associated with an increase of 0.244 in the subjective value
added. It corresponds to a marginal increase of 6.9% in the probability of getting funded,
relative to a 50% benchmark probability.30
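The marginal effect on the funding probability can be checked with a few lines of code. The sketch assumes, following the accompanying footnote, that two otherwise identical pairs are compared and that their noise terms are independent standard Normal, so the difference has variance 2; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def funding_probability(delta_v):
    """Probability that one pair's subjective value added exceeds that of
    an otherwise identical pair after its mean is raised by `delta_v`.

    Assumes independent standard Normal noise for both pairs, so the
    difference of the two noise terms has standard deviation sqrt(2).
    """
    return NormalDist().cdf(delta_v / sqrt(2))


benchmark = funding_probability(0.0)   # identical pairs
shifted = funding_probability(0.244)   # one s.d. increase in cumulative return
print(f"{benchmark:.4f} -> {shifted:.4f}")
```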
In addition to startup quality, a wide range of startup- and VC-related features enter into
the funding selection. Regarding the startup features, the expected value
added grows with the number of a startup’s products (# products) but declines with the
startup’s geographical span (# locations) and categorical span (# categories). This indicates
a preference for more productive but focused startups. Also, startups that have experienced
more funding rounds are preferred, as they have been selected by other savvy VCs in the past.
Interestingly, better startup human capital (# startups founded, 1(top20 school)) lowers the
chance of receiving VC funding. This is not surprising given its negative impact on funded
startups’ growth.
Regarding the VC-related features, smaller VC syndicates (# of VCs) with more funding
experience (# rounds) correspond to a higher expected impact. However, diversity in the
categories of funding (# categories) is associated with a lower expected impact. Also, a better
educational background of venture capitalists (1(top20 school)) is less preferred. Between a
pair of VC syndicate and startup, the geographical distance (days of travel) decreases the
expected value added while the existence of alumni ties (1(alumni tie)) and previous funding
relationship (1(funding tie)) increase it. The expected value added seems to be consistent
with the VC syndicate’s impact on funded startups following the funding decision.
28 For the previously-funded case, the mean and standard deviation of the implicit exit value are 0.044 and 1.307. The mean probabilities of IPO/MA and Death are 1.19% and 0.99%. Thus, an increase of 0.207 in the implicit exit value, from 0.044 to 0.251, changes the IPO/MA probability to Φ((0.251 − δ)/1.307) = 1.78%, and changes the Death probability to Φ((−0.251 − δ)/1.307) = 0.64%. Φ is the c.d.f. of the standard Normal. δ equals 3.
29 This is the standard deviation of the cumulative return over the whole sample.
30 The benchmark 50% is the probability that some $v_{ij}$ is greater than $v_{i'j'}$ given the same distribution, assuming all their determinants in Eq(4) are the same. Thus, an increase of 0.244 in $v_{ij}$ changes this probability to $\Phi(0.244/\sqrt{2}) = 56.86\%$. Φ is the c.d.f. of the standard Normal.
4.1.3 Previously Funded vs. Previously Unfunded
The overall effect of VC funding is measured by comparing startup growth and
implicit exit value between previously-funded and previously-unfunded startups, that is,
by comparing the dependent variables in Eq(1) and Eq(2). Table 4 compares the mean values
using different control groups. The control groups are all of the previously-unfunded
startups and subgroups of them that include only comparable ones. I use the Nearest Neighbor
method and the Propensity Score Matching method to choose comparable unfunded startups.
The covariates used for matching include startup features and macroeconomic variables. Figure 2
compares the distributions.
[Insert Table 4 & Figure 2]
In general, previously-funded startups exhibit faster growth and higher implicit exit value.
For the period return31, the previously-funded group has a raw and a trimmed mean both equal
to 3.4%.32 The previously-unfunded group has a raw and a trimmed mean equal to 5.1% and
1.7% respectively. The growth of unfunded startups seems to have a huge variance. However,
after ignoring the outliers, it is on average smaller than the growth of funded startups.
31 Period return is the first difference in the cumulative return. One period is one month.
32 The trimmed mean is the mean of the subsample winsorized between the 1st and 99th percentiles.
The mean implicit exit value is 0.044 for previously-funded startups and -0.197 for
previously-unfunded ones. It implies a higher unconditional probability of successful exit and a lower
probability of failed exit for funded startups. This is consistent with the results on the IPO&MA
rate and the Death rate. The two rates are the proportions of startups that exit through IPO/MA and
Death this period following the previous funding decision. They represent the conditional
probabilities of successful and failed exits given that a startup still exists in the economy.
The IPO&MA and Death rates are 0.5% and 0.4% for previously-funded startups. In contrast,
they are 0.02% and 0.49% for previously-unfunded startups. A chi-squared statistic of 164.9
shows that the two conditional distributions of successful and failed startups are significantly
different across the funded and unfunded groups.
The results stay qualitatively the same using comparable unfunded startups as alternative
controls. The difference in the mean trimmed period return is around 5.2% between the
funded and comparable unfunded startups. The difference in the mean implicit exit value is
around 0.2. By and large, VC funding improves the startup return and the chance of successful
exit in the following period after controlling for the determinants of funding selection.
4.2 Constrained Estimation
Both startup growth and implicit exit value have separate dynamics for funded and
unfunded startups. One concern is that the separate dynamics might mask the first-order
effects of VC-related features. To address this concern, I impose constraints such that
funded and unfunded startups have the same dynamics. In particular, the constraints
make the coefficients of the common terms the same for the two groups of startups.
The common terms include the intercept, the macroeconomic variables, and the startup
features. Imposing the constraints shuts down the indirect impact of VC funding.
Appendix A3 gives the estimation strategy for the constrained model. Table 5 presents the
constrained estimation results.
[Insert Table 5]
For the selection of VC funding, the coefficients stay almost the same. One exception is
the effect of days of travel, which more than triples in magnitude (-0.696) compared with the
baseline model (-0.210). Now, a one standard deviation decrease in days of travel is associated
with a 0.176 increase in the expected value added. It corresponds to a marginal increase of 5.0%
in the probability of getting funded, compared with a 50% benchmark probability.
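As a rough check of the 5.0% figure, and assuming the same benchmark calculation as for the baseline model (two otherwise identical pairs whose noise terms are independent standard Normal, so that their difference has variance 2), the implied probability is

\[
\Phi\!\left(\frac{0.176}{\sqrt{2}}\right) \approx \Phi(0.124) \approx 0.550,
\]

that is, about 5.0 percentage points above the 50% benchmark.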
For the influence of VC funding, the new coefficients have some interesting patterns.
First, the effects of the VC-related features are similar to those in the baseline model. It
implies that the direct impact of VC funding is not sensitive to the assumption of separate
dynamics for funded and unfunded startups. Second, the effects of the common terms (the
intercept, the macroeconomic variables, and the startup features) are close to those for the
unfunded case in the baseline model. This might be due to the large proportion of unfunded
startups in the panel data.
Moreover, the effect of startup quality on implicit exit value is now much larger. Now,
a one standard deviation increase in the cumulative return corresponds to a 282.5% relative
increase in the probability of IPO/MA and a 76.2% relative decrease in the probability of
Death.33 However, the boost in the coefficient arises because VC funding works as a
confounding factor that improves startup quality and implicit exit value at the same time.
It is therefore more appropriate to have different dynamics for funded and unfunded startups
to separate out the funding effect. Accordingly, the following analysis is performed on the
estimation results from the baseline model.
33 The associated increase in the implicit exit value is 0.581. It increases the probability of IPO/MA from 0.40% to 1.53%. It decreases the probability of Death from 0.84% to 0.20%.
4.3 Expected VC Impact & Startup Future
Given the estimation results, I then investigate how the subjective value added lines up
with the true value added following funding. That is, within the group of previously-funded
startups, I ask whether the expected VC impact correctly reflects startups’ future performance
(startup growth, implicit exit value), taken as a measure of the true value added.
4.3.1 Expected VC Impact & Startup Future
For a direct test, I look at the correlations between the subjective value added and startup
future performance. The test sample only includes funded startups at each funding
round. Note that venture capitalists have a chance to realize their expected impact only on
funded startups. Also, the impact might last for multiple periods. Therefore, I use each
funding round as an observation, to facilitate the study of the impact’s duration. Table 6
presents the results.
[Insert Table 6]
The correlation is positive and significant for startup future performance one-period
ahead. Within the group of funded startups, a one standard deviation increase in the ex-
pected VC impact is associated with a 0.114 standard deviation increase in the startup
return one-period ahead. It also relates to a 0.056 standard deviation increase in the im-
plicit exit value next period, which corresponds to a 15.1% relative increase in the probability
of IPO/MA and a 13.7% relative decrease in the probability of Death. However, the positive
correlation quickly fades away beyond one period. It seems that the expected VC impact is
better actualized in the near future of funded startups.
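\[
\operatorname{cov}\left(v_{ij,t},\; r_{j,t+1} - r_{j,t}\right)
= \operatorname{cov}\left(\left[X_{j,t}, X_{i,t}, X_{ij,t}, r_{j,t}\right]\phi_{v} + \xi_{ij,t},\;
\left[X_{t+1}, X_{j,t+1}, X_{i,t+1}, X_{ij,t+1}\right]\phi_{r,y} + \sigma_{r,y}\,\varepsilon_{j,t+1}\right)
\tag{13}
\]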
The strong positive correlation one period ahead is actually hinted at by the estimation
results. To see this, we can write out the covariance between the subjective value added
and funded startups’ return next period as in Eq(13). One thing to notice is that a large
proportion of the covariates are slow-moving, with half of them time-invariant. Therefore,
the sign of the correlation depends on whether the corresponding coefficients in
the estimated parameters (φv and φr,y) have the same signs. In Table 3, most of these coefficients
take the same signs. For instance, the existence of alumni ties has a positive effect on
both subjective value added and funded startups’ growth. Thus, the source of the positive
covariance is the observable startup and VC-related features.
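As a stylized illustration of this sign argument (a simplification, not the estimated model), consider a single time-invariant covariate $x$ with coefficient $\phi_v^{x}$ in the subjective value added and $\phi_r^{x}$ in the growth equation, and noise terms that are independent of $x$ and of each other. Then

\[
\operatorname{cov}\left(x\,\phi_v^{x} + \xi,\; x\,\phi_r^{x} + \sigma\,\varepsilon\right)
= \phi_v^{x}\,\phi_r^{x}\,\operatorname{var}(x),
\]

which is positive exactly when the two coefficients share the same sign. The symbols $\phi_v^{x}$ and $\phi_r^{x}$ are introduced here only for the illustration.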
4.3.2 Selection on Private Information
The above correlation is not accounted for by any private information shared between a
pair of startup and VC syndicate. In fact, the original model lacks a channel for private
information to play a role. This is because all unobservable factors, absorbed by the three
noise terms (ε, η, ξ), are assumed to be independent in the baseline model.
To incorporate the effect of private information, I extend the model by revising the noise
terms as follows. For startups funded at t − 1, the noise term in the growth variable at t
includes an additional term ρr,yξt−1, and the noise term in the implicit exit value includes an
additional term ρs,yξt−1. The coefficients ρr,y and ρs,y are the covariances among the errors.
They give the dependence of the subjective value added on the private information that
will drive the startup growth and the implicit exit value one period ahead. Appendix C2
details the extended model and gives the estimation strategy. Table 7 presents the estimation
results.
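One way to write the revision described above is sketched below. The tilde notation for the revised noise terms, the subscript layout, and the assignment of ε to the growth variable and η to the implicit exit value are assumptions of this sketch; Appendix C2 gives the actual specification.

\[
\tilde{\varepsilon}_{j,t} = \rho_{r,y}\,\xi_{ij,t-1} + \varepsilon_{j,t},
\qquad
\tilde{\eta}_{j,t} = \rho_{s,y}\,\xi_{ij,t-1} + \eta_{j,t}.
\]

If $\xi$ has unit variance and is independent of $\varepsilon$ and $\eta$, then $\operatorname{cov}(\xi_{ij,t-1}, \tilde{\varepsilon}_{j,t}) = \rho_{r,y}$ and $\operatorname{cov}(\xi_{ij,t-1}, \tilde{\eta}_{j,t}) = \rho_{s,y}$, which is consistent with interpreting the two coefficients as covariances among the errors.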
[Insert Table 7]
Now, a unit increase in the expected VC impact driven by the private information correlates
with a 0.9% increase in startup return following funding. It also correlates with a 0.073
increase in the implicit exit value, which corresponds to a 48.74% relative increase in the
probability of IPO/MA and a 90.90% relative decrease in the probability of Death. Therefore,
private information could be an additional source of the positive correlation
between the expected VC impact and funded startups’ future performance.
4.4 Joint Effect of Selection and Influence
The results of the joint estimation highlight one fact: selection and influence of VC
funding are directly linked over time. On the one hand, following the funding decision, VCs’
influence improves funded startups’ quality and their probabilities of successful exits. On
the other hand, the selection of funding in turn depends on the expected VCs’ impact
in the future, as well as on the startups’ quality resulting from past VCs’ influence if funded
before. A natural question then follows. How large is the joint effect of VC funding, given
the direct linkage of selection and influence over time?
To answer this question, I conduct a simulation experiment to study the joint effect
of funding selection and influence over multiple rounds. More specifically, I compare the
simulation results from two models. In Model 1, I break down the interdependence between
funding selection and influence, by making the funding decision random for all periods. In
Model 2, the funding decision is random only for the first period.
Model 1 : Selection is random for all periods.
Influence follows the same dynamics as in the baseline model.
Model 2 : Selection is random for the first period, then determined in equilibrium.
Influence follows the same dynamics as in the baseline model.
For parameter values, I use the estimation results from Table 3. The dynamics of the
three key variables are the same as in the baseline model. I assume the economy has 100
startups born at time 0, with exactly the same features ex ante. There is only one VC
syndicate, which can fund up to 50 startups each period. I assume all the startup and
VC-related features take the mean values of the sample used in the above analysis. Note
that there is no variation across the startups in the beginning. The only source of
differences later on is the randomness in the funding decision in both models.
Table 8 compares the simulated results for the two models.
[Insert Table 8]
The randomness in the initial funding decision magnifies significantly in Model 2, i.e.,
under the joint effect of selection and influence of VC funding. In Model 2, previously-funded
startups have significantly higher period return and implicit exit value. In contrast, there is
no significant difference between previously-funded and unfunded startups in Model 1.
To show how the initial randomness accumulates over time in Model 2, I compare the
transition matrices of VC funding. In Model 1, each startup has roughly a 50% chance of
being funded each period, regardless of whether it was funded previously. In Model 2,
however, a previously funded startup has a 93.1% chance of being funded again, whereas a
previously unfunded startup has only a 4.1% chance. Thus, the funded subsample in Model 2
changes little over time: it consists largely of startups that happened to receive
funding in the first period and therefore receive the subsequent rounds as well.
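As a rough illustration of this mechanism, the following Python sketch simulates a stripped-down version of the two models and computes the two transition probabilities. It is not the estimated model: the scoring rule, the drift of 0.10 for funded startups, the noise scales, the 24-period horizon, and the function name `simulate` are all assumptions chosen for illustration, and exits (IPO/MA and death) are ignored.

```python
import numpy as np

def simulate(random_all_periods, n=100, quota=50, periods=24, seed=0):
    """Toy experiment. Returns (P(funded | previously funded),
    P(funded | previously unfunded)) pooled over periods 1..periods-1."""
    rng = np.random.default_rng(seed)
    r = np.zeros(n)                     # identical cumulative returns at t = 0
    prev = np.zeros(n, dtype=bool)      # funding status in the previous period
    stay = switch = n_prev_f = n_prev_u = 0
    for t in range(periods):
        if t == 0 or random_all_periods:
            # random selection: draw exactly `quota` startups without replacement
            funded_idx = rng.choice(n, size=quota, replace=False)
        else:
            # ASSUMED score: increases with cumulative return and a prior funding tie
            v = 0.5 * r + 1.0 * prev + rng.normal(0.0, 0.5, n)
            funded_idx = np.argsort(v)[-quota:]   # the quota goes to the top scores
        funded = np.zeros(n, dtype=bool)
        funded[funded_idx] = True
        if t > 0:
            n_prev_f += prev.sum()
            n_prev_u += (~prev).sum()
            stay += (funded & prev).sum()
            switch += (funded & ~prev).sum()
        # influence: funded startups receive a higher drift (ASSUMED values)
        r += np.where(funded, 0.10, 0.00) + rng.normal(0.0, 0.1, n)
        prev = funded
    return stay / n_prev_f, switch / n_prev_u

model1 = simulate(random_all_periods=True)    # selection random in all periods
model2 = simulate(random_all_periods=False)   # selection random in the first period only
```

Under these assumed parameters, `model1` contains two probabilities near one half, while `model2` shows a high probability of repeat funding and a low probability for previously unfunded startups. The exact values depend on the assumptions and are not meant to reproduce the 93.1% and 4.1% reported above.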
As a consequence, initial randomness can have a long-term effect given the interdependence
between funding selection and influence. To verify this conjecture, I further compare
the differences in future performance between funded and unfunded startups for the two
models. Funded startups are not very different from unfunded ones in Model 1. In Model 2,
however, they consistently outperform at both short and long horizons. To sum
up, the joint effect of selection and influence of VC funding is essential: it lets the
impact of an incidental factor accumulate over time until it can become decisive.
5. Robustness
This section presents the results of some robustness checks. First, to account for the
possible shifts in the parameter estimates, I perform the estimation on two subsamples from
1998 to 2006 and from 2007 to 2014. Second, to curb the potential multicollinearity problem,
I extract factors from the original set of features and estimate the model using these factors.
Third, I construct two alternative models to allow for the autocorrelation in the subjective
value added and a hierarchical structure in startup returns driven by a hidden startup fixed-
effect.
5.1 Subsample Results
I perform the subsample estimation on two periods separately: from 1998 to 2006 and
from 2007 to 2014. Both periods feature a flagging economy at the beginning followed by a
recovery. The first period captures the dot-com bubble, and the second period
spans the 2007-2008 financial crisis. Each subsample contains only the startups
born during the corresponding period. For the first subsample, running from 1998 to 2006,
I set a startup's final status to “Survival” if it reaches “IPO/MA” or “Death” only
in the second subsample period. Note that even if the model is correctly specified, there might be
drift in the parameter estimates due to the evolution of the VC industry and changes in the
startup ecosystem. Table 9 gives the estimation results.
[Insert Table 9]
Comparing the two periods, the effect of the cumulative return on the implicit exit value
is much more significant in 2007-2014. It is true for both funded and unfunded startups: the
coefficient jumps from 0.001 to 0.017 in the funded case, and it jumps from 0.016 to 0.048
in the unfunded case. Thus, the probability of IPO/MA depends more on startups’ quality
in 2007-2014 and more so for the unfunded ones.
Another difference is related to the number of locations covered by a VC syndicate. It
has a significant effect on the selection of VC funding in 1998-2006, but the significance shifts
to the influence of VC funding in 2007-2014. More specifically, the number of locations
has a significant effect on the subjective value added in 1998-2006, and the effect fades away
in 2007-2014. In contrast, its effect on startup growth becomes significant in the more recent
subsample. This indicates that VCs exert more effort on funded startups in the more recent
period, while proximity to startups was more important for the investment decision in the earlier period.
Additionally, there are two notable differences, one for previously unfunded startups and
one for previously funded startups. For the unfunded startups, the period returns rise with the risk-free rate in
2007-2014 but fall with it in 1998-2006. This implies a stronger substitution effect in the later
period but a strong income effect in the early period. For the funded startups, a top-school
graduation dummy has a significant positive effect on startup growth only in 1998-2006.
One explanation is that, in the early period, more entrepreneurs had to leave school in
order to commit fully to building the startups that received VC funding. Fewer have to do
so more recently, owing to advances in communication technology and the
geographical expansion of the whole VC industry.
By and large, the parameter estimates do not shift much and, most importantly, the cumulative
return continues to show a positive effect on the implicit exit value as well as the subjective
value added in both subsamples.
5.2 Alternative Features
To curb the potential multicollinearity problem, I apply the principal factor method to
extract common factors separately from the macroeconomic variables, the constant and time-
varying startup features, the constant and time-varying VC features, and the constant and
time-varying pair features.34 Note that I include additional features (e.g. location, category
dummies for the startups, and school, field, degree dummies for the VCs’ and startups’
personnel) but extract fewer factors to keep the total number of factors low. Table 10 Panel
A gives the factor loadings on the original features. The goal of this exercise is to double-
check the interdependence between the selection and influence of VC funding rather than
the effect of any individual feature.
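As a minimal sketch of the extraction step, the non-iterated principal factor method can be written with NumPy as follows: the diagonal of the correlation matrix is replaced by the squared multiple correlations (the communality estimates), and loadings are taken from the leading eigenpairs of this reduced matrix. The data are synthetic, and the function name `principal_factor_loadings`, the choice of two factors, and the absence of any rotation are assumptions for illustration only.

```python
import numpy as np

def principal_factor_loadings(X, n_factors):
    """Principal factor loadings with squared multiple correlations (SMC)
    as communality estimates. X has one row per observation."""
    R = np.corrcoef(X, rowvar=False)               # feature correlation matrix
    smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))    # R^2 of each feature on the others
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, smc)               # communalities on the diagonal
    eigvals, eigvecs = np.linalg.eigh(R_reduced)   # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1][:n_factors]  # indices of the largest eigenvalues
    lam = np.clip(eigvals[order], 0.0, None)       # guard against small negative values
    return eigvecs[:, order] * np.sqrt(lam), smc

# Synthetic example: six observed features driven by two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
weights = rng.normal(size=(2, 6))
X = latent @ weights + rng.normal(scale=0.5, size=(500, 6))
loadings, smc = principal_factor_loadings(X, n_factors=2)
```

Putting 1 on the diagonal instead of `smc` would give the principal component factor variant that the footnote contrasts with this method.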
[Insert Table 10]
As shown in Table 10 Panel B, the significance of many factors goes away for their effects
on the growth of the funded startups. It results in a boost in the variance estimate. Both
changes might be due to the reduction in the number of covariates. In contrast, the extracted
factors still have significant effects on the growth of the unfunded startups and the subjective
value added which determines the equilibrium funding decision. This might be due to the
additional inclusion of many other features for the extraction of the factors. The effects
of the cumulative return on the implicit exit value and the subjective value added remain
positive and significant.
34 The factor loadings are computed using the squared multiple correlations as estimates of the communality. By comparison, the principal component factor uses 1 for the communality for all pairs of variables.
5.3 Alternative Models
5.3.1 AR(1) in Subjective Value Added
One extension of the model is to introduce autocorrelation in the subjective value added.
Intuitively, the autocorrelation structure implies that how much a VC syndicate and a startup
value the synergy they can create as a team today depends on how much they valued it
yesterday. More specifically, I change the dynamics of $v^{ij}_t$ given in Eq(4) to contain an
additional term of the lagged value $v^{ij}_{t-1}$.35 Appendix C1 details the model and the estimation
strategy.
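As a sketch, suppose Eq(4) is linear in the covariates, which is an assumption here because Eq(4) is not reproduced in this section. The AR(1) extension would then read as below, where $\rho_v$ (the AR(1) coefficient), $X^{ij}_t$ (the stacked covariates, including the cumulative return), and $\varepsilon^{ij}_t$ (the noise term) are notation introduced for illustration; the exact specification is the one in Appendix C1.

```latex
v^{ij}_{t} = \phi_v^{\top} X^{ij}_{t} + \rho_v \, v^{ij}_{t-1} + \varepsilon^{ij}_{t}
```

Under this assumed form, setting $\rho_v = 0$ recovers the baseline dynamics.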
[Insert Table 11]
Table 11 Panel A gives the parameter estimates. The AR(1) coefficient has an estimate of
0.009 and a t-value of 34.65. The effect is statistically significant but economically small. At
the same time, the estimates barely change for Eq(4) which determines the subjective value
added (e.g. the coefficient for the effect of the cumulative return on the subjective value
added is 0.034 in Table 3 and 0.033 in Table 11 Panel A). Interestingly, the effect of the
cumulative return on the implicit exit value decreases for both funded and unfunded cases
even though Eq(2) remains exactly the same. The reason is that the system of equations
(Eq(1), Eq(2), Eq(4)) needs to be estimated jointly. This again is due to the interdependence
between the selection and influence of VC funding that the model intends to capture.
35 Note that the introduction of the lagged value $v^{ij}_{t-1}$ is not supposed to substitute for the effect of 1(funding tie). The latter dummy equals one if any member VC has funded the startup before, and $v^{ij}_{t-1}$ is related to the probability that the syndicate as a whole funded the startup in the previous period.
5.3.2 A Hierarchical Model for Startup Returns
It is well known that the startup return distribution has excess kurtosis. Thus, several
papers use a Mixture-of-Normals to model the startup return distribution (Ewens 2009,
Korteweg and Sørensen 2010). The underlying assumption is that there are different types of
startups, e.g., winners, losers, and break-eveners, so the mean expected returns differ
across types.36 On the other hand, there might also be a startup fixed-effect that accounts
for the inborn differences among startups.
Therefore, I extend the baseline model to allow for a hierarchical structure on the startup
returns such that the noise in the period returns is drawn from a Mixture-of-Normals. What is
new here is that the probability mixture is a hidden startup fixed-effect and differs across
startups. The extended model borrows from the Latent Dirichlet Allocation (LDA) model,
which is well known in machine learning. It belongs to the class of hierarchical models
documented in Korteweg and Sørensen (2014) and, more generally, the class of Bayesian
network models.
More specifically, the hidden startup fixed-effect is a startup-specific probability mixture.
I denote it by $p_j \equiv (p_j^{(1)}, p_j^{(2)}, p_j^{(3)})$ for startup j. The three elements of $p_j$ represent the
probabilities that startup j is a winner, loser, and break-evener, respectively. For the funded
case, the mean period returns for the three types are $\mu_y^{(1)}$, $\mu_y^{(2)}$, and $\mu_y^{(3)}$; for the unfunded case,
they are $\mu_n^{(1)}$, $\mu_n^{(2)}$, and $\mu_n^{(3)}$ instead. Given $p_j$, startup j’s period return defined in Eq(C6)
follows a Mixture-of-Normals distribution with the probability mixture $p_j$. Appendix C3
details the model and the estimation strategy.
Table 11 Panel B gives the parameter estimates. Compared with Table 3, the estimates
for the old parameters stay qualitatively the same. For the new parameters, the mean growth
rate for the previously funded case (i.e., µy’s) for the winners, losers, and break-eveners is
36 The terms to describe the type of startups are borrowed from Ewens 2009.
0.011, -0.005, and 0.006 respectively. However, none of them is significantly different from
0. In contrast, the mean growth rate for the previously unfunded case (i.e., µn’s) for the
three types is 1.564, -1.055, and 0.056 respectively. The mean returns for the winner and the
break-evener types are significantly above 0, while for the loser type it is significantly below
0. The difference across the three types is economically big and more so for the unfunded
case.
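To make the hierarchical structure concrete, the sketch below draws one period-return noise term per startup from a startup-specific Mixture-of-Normals. The type means are the unfunded-case values quoted above; the symmetric Dirichlet draw for the mixture $p_j$, the common standard deviation of 0.3, the number of startups, and all names are assumptions for illustration rather than the specification in Appendix C3.

```python
import numpy as np

rng = np.random.default_rng(1)
n_startups = 5

# Mean period returns by type (winner, loser, break-evener), unfunded case,
# as quoted in the text for Table 11 Panel B.
mu_n = np.array([1.564, -1.055, 0.056])
sigma = 0.3   # ASSUMED common standard deviation, not an estimate from the paper

# p[j] is the startup-specific type mixture; a symmetric Dirichlet is an ASSUMPTION.
p = rng.dirichlet(alpha=np.ones(3), size=n_startups)

def draw_period_noise(p_j, mu, sigma, rng):
    """Pick a type with probabilities p_j, then draw from that type's normal."""
    k = rng.choice(3, p=p_j)
    return rng.normal(mu[k], sigma)

draws = np.array([draw_period_noise(p_j, mu_n, sigma, rng) for p_j in p])
```

Each row of `p` is one point in the probability simplex, so two startups with the same covariates can still have different return distributions through their own mixture.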
[Insert Figure 3]
Figure 3 plots the distribution of the hidden startup fixed-effect $p_j$. The triangle
represents the sample space, and each vertex of the triangle represents a certain type.37 As shown
in the figure, there is a large mass concentrated on the line between the loser vertex and
the break-evener vertex, and only a few startups have a probability greater than 0.5 of
becoming a winner. Even the one with the highest winning probability has a non-zero chance
of mediocre performance. The distribution mirrors the fact that most startups end up
mediocre or unsuccessful even though the returns for the successful ones are stunningly high.
6. Conclusion
In this paper, I study how VCs select startups to fund over multiple rounds in a dynamic
setting. The model I build highlights the interdependence between the funding decision and
its impact. Using a hand-collected database including both VC-funded and non-VC-funded
startups, I develop an estimation strategy that uses a Gibbs sampler to jointly estimate
the model parameters in a Bayesian framework.
The results show that selection and influence of VC funding are directly linked over time.
37 The top vertex represents the winner type; the rightmost vertex represents the loser type; the leftmost vertex represents the break-evener type.
The selection of funding depends on both startups’ quality and VCs’ expected influence in the
future. Following funding, VCs’ influence improves startups’ quality and their probabilities
of successful exits. The results suggest a joint effect of selection and influence of VCs’
investments in a dynamic setting. Namely, funded startups have better quality, and thus
they are more likely to get funded again in the future. A simulation experiment shows that
under this joint effect, initial random differences in startups can magnify significantly over
time.
This paper illustrates the dynamic aspect of investment, for which post-investment ac-
tivities are relevant for decision making. Using venture capital as an example, the paper
highlights the importance of a joint consideration of both investee’s value ex-ante and in-
vestor’s influence ex-post. A frequently invested item becomes a valuable asset to the whole
economy as it capitalizes the impacts of its past investors. While absent from a static setting,
these new insights shed light on some fundamental issues of strategic investment behavior.
Figure 1. Sequence of Events
This figure presents the sequence of events that happen at the end of t. Existing startups at the beginning of t are $E_{t-1}$; after one period of growth, their growth and implicit exit value become $r_t$ and $s_t$, respectively. The laws of motion for $r_t$ and $s_t$ are given by Eq(1) and Eq(2). If the implicit exit value $s_t \geq \delta$, the startup goes public or gets acquired ($IPO/MA_t$); if the implicit exit value $s_t < -\delta$, the startup goes bankrupt ($D_t$); if the implicit exit value $-\delta \leq s_t < \delta$, the startup remains in the economy and is ready for another round of competition for VC funding. The remaining ones are called funding candidates ($J_t$). The funding is then determined endogenously as a matching between the group of VC syndicates I and the funding candidates $J_t$. The funding decision is described in Section 2.3. After that, the newborn startups $N_t$ arrive, and the existing startups at the end of t ($E_t$) consist of both $J_t$ and $N_t$.
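The exit rule in this caption can be sketched as a small function. The names `classify_status` and `delta` and the threshold value in the example calls are hypothetical; only the three inequalities come from the caption.

```python
def classify_status(s, delta):
    """Map the implicit exit value s to a status using the threshold delta > 0."""
    if s >= delta:
        return "IPO/MA"      # goes public or gets acquired
    if s < -delta:
        return "Death"       # goes bankrupt
    return "Candidate"       # -delta <= s < delta: competes for funding again

# Example calls with a hypothetical threshold delta = 1.0
statuses = [classify_status(s, 1.0) for s in (1.2, 0.3, -1.0, -1.5)]
```

Note that the boundaries are asymmetric as stated: a value exactly equal to δ counts as an exit, while a value exactly equal to −δ keeps the startup among the funding candidates.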
[Figure 1 panels: Sequence of Events; B. Growth; C. Exit; D. Funding; E. Separate Growth Trajectory]
Figure 2. Distribution of Imputed Values
This figure presents the distributions of the imputed values for the latent variables. Panel A gives the distributions for the period-to-period growth ($r^j_t - r^j_{t-1}$) separately for the funded and unfunded cases. Panel B gives the distributions for the implicit exit value $s^j_t$ separately for the funded and unfunded cases. Panel C gives the pairwise subjective value added $v^{ij}_t$ separately for the matched and unmatched pairs of startups and VC syndicates.
[Figure 2 panels: A. Period-to-Period Growth $r_t - r_{t-1}$; B. Implicit Exit Value $s_t$; C. Subjective Value Added $v_t$]
Figure 3. Distribution of Type Mixture
This figure plots the distribution of the type mixture in a two-dimensional probability simplex for the extended model with the hierarchical structure in startup returns. Model details are given in Appendix C3. Each point in the simplex characterizes a startup-specific vector $p_j \equiv (p_j^{(1)}, p_j^{(2)}, p_j^{(3)})$, which satisfies the condition $p_j^{(1)} + p_j^{(2)} + p_j^{(3)} = 1$. I call $p_j$ the startup-specific type mixture. The three elements in $p_j$ represent the probabilities of startup j belonging to the winner, loser, and break-evener types. Therefore, the type mixture is a hidden startup fixed-effect. In the simplex, the top vertex represents the pure winner type, which has $p^{(1)} = 1$ or equivalently p = (1, 0, 0); the leftmost vertex represents the pure break-evener type, which has $p^{(3)} = 1$ or equivalently p = (0, 0, 1); and the rightmost vertex represents the pure loser type, which has $p^{(2)} = 1$ or equivalently p = (0, 1, 0). At the center of gravity, the pink point has p = (1/3, 1/3, 1/3).
Figure 4. Model Identification
This figure
Figure A1. Life Expectancy by Final Status
This figure compares the distributions of the life expectancy (in months) for startups that finally go public or get acquired (IPO&MA), go bankrupt (Death), or remain private in business (Survival). The sample covers the period from 1998 to 2014, 204 months in total.
Figure A2. Closest Distance Comparison
This figure compares the closest distance for the matched pairs of startups and VC syndicates and the unmatched pairs of startups and VC syndicates. The closest distance is the minimum distance (in miles) between all the locations that a startup has an office and all the locations that a member VC has an office.
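A sketch of the closest-distance measure follows, assuming great-circle (haversine) distance between office coordinates. The source does not state which distance formula is used, so the formula, the Earth radius constant, the function names, and the example coordinates are all illustrative assumptions.

```python
import math
from itertools import product

EARTH_RADIUS_MILES = 3958.8   # mean Earth radius; an assumed constant

def haversine_miles(a, b):
    """Great-circle distance in miles between two (latitude, longitude) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))

def closest_distance(startup_offices, vc_offices):
    """Minimum distance over all pairs of startup offices and member-VC offices."""
    return min(haversine_miles(s, v) for s, v in product(startup_offices, vc_offices))

# Hypothetical example: a startup in Los Angeles, a syndicate with offices
# in San Francisco and New York.
startup = [(34.0522, -118.2437)]
syndicate = [(37.7749, -122.4194), (40.7128, -74.0060)]
d = closest_distance(startup, syndicate)
```

Here `vc_offices` stands for the union of office locations across all member VCs of a syndicate, so a syndicate with one nearby member is treated as close to the startup.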
Figure A3. Trace Plot of Key Parameter Estimates
The figure gives the trace plots for the estimates of the key parameters in the model. Panel A gives the estimates for the last elements in $\phi_{s,y}$ (Eq(2)), $\phi_{s,n}$ (Eq(2)), and $\phi_v$ (Eq(4)). They are denoted by $\phi_{s,y,-1}$, $\phi_{s,n,-1}$, and $\phi_{v,-1}$ respectively. They correspond to the effect of the cumulative return r on the implicit exit value s (with and without funding), and the effect of the cumulative return r on the subjective value added v in the baseline model. Panel B gives the estimates for $\rho_{r,y}$ and $\rho_{s,y}$. They correspond to the dependence of the subjective value added on the private information that will drive the startup growth and the implicit exit value one period ahead. The extended model that incorporates these two correlations is given in Appendix C2. Panel C gives the estimates of the AR(1) coefficient for the subjective value added in the extended model which incorporates an autoregressive structure. The AR(1) model is described in Appendix C1. Panels D and E give the estimates for $(\mu_y^{(1)}, \mu_y^{(2)}, \mu_y^{(3)})$ and $(\mu_n^{(1)}, \mu_n^{(2)}, \mu_n^{(3)})$ in the hierarchical return model. The new parameters represent the fixed-effects in startup growth for the pure types of winner, loser, and break-evener, with and without funding respectively. The hierarchical return model is described in detail in Appendix C3.
[Figure A3 panels: A. Baseline Model: Effect of Cumulative Return; B. Correlation Model: Dependence of Subjective Value Added; C. Autoregressive Model: AR(1) Parameter for Subjective Value Added; D. Hierarchical Return Model (1): Fixed-Effects in Growth by Type, Funded; E. Hierarchical Return Model (2): Fixed-Effects in Growth by Type, Unfunded]
Figure A4. Propensity Score
This figure compares the propensity scores for the funded and unfunded startups. The propensity score (PS) matching method is used for the comparison of the two groups' mean period returns and status variables in Table 4 Panel B. The covariates for the PS matching include startup-specific variables (e.g. # locations, # categories, # products, # startups founded), and macroeconomic variables (e.g. (Ybaa − Yus10), (rm − rf), smb, hml, and rf).
Figure A5. The Relationship among Key Startup Variables
This figure illustrates the relationship among the three key startup variables. The three key startup variables are: (1) $r^j_t$: cumulative return, (2) $s^j_t$: implicit exit value, (3) $v^{ij}_t$: subjective value added. Note that the funding decision is determined collectively by the entire set of all pairwise subjective value added.
Table 1. Sample Statistics
This table presents summary statistics for the sample. Panel A gives information on the overall sample. Panel B gives information on the final status of startups that are VC-funded and VC-unfunded. Panel C gives information on the size of VC syndicates. The size refers to the number of VCs in a syndicate. Panel D describes the distribution of the number of funding rounds experienced by the startups in the sample. Panel E describes the distribution of funding by funding rounds. The time period is from January 1998 to December 2014.
A. Sample
# Startups 9,303
# VC Syndicates 2,844
# VCs 755
# Time Periods 204
B. Startup Final Status
Status Death Survival IPO&MA Total
Unfunded 1,879 4,965 109 6,953
Funded 414 1,408 528 2,350
Total 2,293 6,373 637 9,303
C. VC Syndicate
# of VCs 1 2 3 4 5 6 7 8 9 10 11
Freq. 505 839 661 406 234 99 50 25 18 3 4
Percent. 17.76 29.5 23.24 14.28 8.23 3.48 1.76 0.88 0.63 0.11 0.14
D. Startup Funding
# of funding 0 1 2 3 4 5 6 7 8 9 10 11
Freq. 6,953 452 1,015 479 236 96 44 13 11 1 1 2
Percent 74.74 4.86 10.91 5.15 2.54 1.03 0.47 0.14 0.12 0.01 0.01 0.02
E. Funding by Rounds
Rounds of funding 1 2 3 4 5 6 7 8 9 10 11
Freq. 2350 1898 883 404 168 72 28 15 4 3 2
Percent 40.33 32.57 15.15 6.93 2.88 1.24 0.48 0.26 0.07 0.05 0.03
Table 2. Measure Statistics
This table presents summary statistics for the variables in Eq(1), Eq(2) and Eq(4). Panel A gives information on the cumulative return r in Eq(1). Panel B to Panel E give the information on the right-hand-side variables X in the three equations. Panel B describes the macroeconomic variables $X_t$. Panel C describes startup-specific features $X^j_t$ separately for constant ones and time-varying ones. Panel D describes VC syndicate-specific features $X^i_t$. Panel E describes startup-VC syndicate-pairwise-specific features $X^{ij}_t$. There are 204 months, 9,303 startups, and 2,844 VC syndicates. In total, there are 520,715 startup-month observations, 580,176 VC syndicate-month observations, 26,457,732 startup-VC syndicate pairs, and 1,650,020,544 startup-VC syndicate-month observations. The detailed construction of these variables is given in Appendix B.
A. Cumulative Returns
Mean Std Skew Kurt min p5 p10 p25 p50 p75 p90 p95 max
6.58 9.46 0.31 1.23 -8.19 -2.3 -2.3 -2.3 0.53 17.06 18.78 19.59 26.19
B. Macroeconomic Variables
N Mean Std min p25 p50 p75 max
(Ybaa − Yus10) (%) 204 2.62 0.8 1.56 2.11 2.57 2.98 6.01
(rm − rf ) (%) 204 0.48 4.68 -17.23 -2.09 1.19 3.49 11.35
smb (%) 204 0.28 3.56 -16.41 -1.63 0.19 2.28 22.02
hml (%) 204 0.21 3.4 -12.61 -1.49 0.08 1.71 13.89
rf (%) 204 0.18 0.17 0 0.01 0.13 0.37 0.56
C. Startup Features
N Mean Std min p25 p50 p75 max
Constant
# locations 9,303 1.21 1.01 1 1 1 1 66
# categories 9,303 1.85 1.3 1 1 1 2 14
# products 9,303 0.43 1.97 0 0 0 0 134
Time-varying
t from last round 520,715 3.02 3.01 0 0.83 2 4.25 16.92
t^2 from last round 520,715 18.14 34.91 0 0.69 4 18.06 286.17
# rounds 520,715 0.32 0.81 0 0 0 0 10
# startups founded 520,715 0.17 0.47 0 0 0 0 5
1(top20 school) 520,715 0.3 0.46 0 0 0 1 1
D. VC Syndicate Features
N Mean Std min p25 p50 p75 max
Constant
# of VCs 2,844 2.93 1.63 1 2 3 4 11
# locations 2,844 6.37 5.40 1 3 5 9 75
Time-varying
1(cooperated) 580,176 0.33 0.47 0 0 0 1 1
# rounds 580,176 5.16 9.65 0 0 1 6 116
# categories 580,176 8.11 12.87 0 0 1 11 87
1(top20 school) 580,176 0.86 0.35 0 1 1 1 1
E. Startup-VC Syndicate Features
N Mean Std min p25 p50 p75 max
Constant
distance 2.65 × 10^7 2893.67 3521.56 0 79.48 1200.26 4313.19 19932.96
[0, 100] 2.65 × 10^7 0.25 0.43 0 0 0 1 1
(100, 1000] 2.65 × 10^7 0.21 0.4 0 0 0 0 1
(1000, max] 2.65 × 10^7 0.54 0.5 0 0 1 1 1
days of travel 2.65 × 10^7 2.29 0.84 0 1 2 2 3
Time-varying
1(funding tie) 1.65 × 10^9 0.61%
1(alumni tie) 1.65 × 10^9 20.76%
Table 3. Estimation Result
This table presents the estimation result for the baseline model described by the system of equations Eq(1), Eq(2) and Eq(4). The first two columns give the estimates for $\phi_{r,y}$ and $\phi_{r,n}$ in Eq(1) for the law of motion of the cumulative return r. The third and fourth columns give the estimates for $\phi_{s,y}$ and $\phi_{s,n}$ in Eq(2) for the determinants of the implicit exit value s. The last column gives the estimates for $\phi_v$ in Eq(4) for the determinants of the subjective value added v. Details of the estimation are given in Appendix A. Numbers in the brackets are the t-statistics. Significance at 10%, 5%, and 1% levels are denoted by *, **, and ***.
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
Funded Unfunded Funded Unfunded
[1] [2] [3] [4] [5]
r 0.028 *** 0.063 *** 0.034 ***
(30.78) (40.17) (178.94)
sigma2 0.01 *** 0.092 *** 1.124 *** 1.373 ***
(103.37) (1287.12) (11.79) (28.35)
intercept 0.004 *** 0.004 *** -0.251 -0.447 ***
(2.91) (44.61) (1.35) (3.25)
Macro Variables
(Ybaa − Yus10) -0.001 -0.013 *** 0.005 0.024
(0.90) (197.74) (0.16) (1.02)
(rm − rf ) 0.007 0.032 *** -0.045 0.573
(0.72) (41.62) (0.10) (1.62)
smb -0.023 0.006 *** -1.332 -0.929 *
(0.95) (5.03) (1.40) (1.92)
hml 0.036 * 0.099 *** -0.045 -0.250
(1.67) (48.54) (0.04) (0.54)
rf 0.017 0.196 *** 0.072 0.370 **
(0.21) (57.13) (0.30) (2.48)
Startup Features
# locations 0.003 *** 0.017 *** 0.009 -0.004 -0.853 ***
(5.26) (406.38) (0.43) (0.29) (6.23)
# categories -0.009 *** -0.004 *** -0.007 -0.006 -0.273 ***
(4.90) (171.65) (0.29) (0.75) (9.10)
# products 0.004 *** 0.004 *** 0.007 0.015 0.024 ***
(2.67) (85.13) (0.34) (1.06) (7.71)
t from last round 0.000 0.041
(0.35) (1.66)
t^2 from last round -0.040 -0.046 ***
(0.11) (3.00)
# rounds 0.008 0.001 *** 0.226 -0.001 0.022 ***
(0.00) (187.20) (0.05) (1.35) (35.88)
# startups founded -0.005 0.005 *** -0.012 0.013 -0.158 ***
(1.49) (11.48) (0.36) (0.45) (12.09)
1(top20 school) -0.019 *** 0.154 *** 0.005 -0.037 -0.434 ***
(10.38) (372.63) (0.09) (0.91) (35.30)
VC Syndicate Features
# of VCs -0.003 *** 0.006 -0.342 ***
(18.15) (0.22) (10.27)
# locations 0.000 *** 0.002 0.001 ***
(8.26) (0.49) (6.60)
1(cooperated) 0.001 *** 0.002 0.030 ***
(12.59) (0.46) (12.99)
# rounds -0.001 -0.068 0.101 ***
(0.64) (0.53) (15.29)
# categories 0.000 *** -0.001 -0.019 ***
(8.08) (0.59) (11.38)
1(top20 school) 0.039 *** -0.001 -0.647 ***
(18.08) (0.02) (24.72)
Pair Features
days of travel -0.146 0.295 -0.210 ***
(1.66) (0.86) (28.02)
1(funding tie) 0.004 *** -0.251 4.486 ***
(4.02) (1.35) (10.36)
1(alumni tie) 0.003 *** -0.003 0.117 ***
(18.62) (0.24) (35.28)
Table 4. Previously Funded vs. Previously Unfunded
This table compares the means of the period return ($r_{t+1} - r_t$), the implicit exit value $s_t$, IPO&MA rate and death rate for the previously funded and unfunded startups. Period return* is winsorized between 1 and 99 percentiles separately for the subsamples of previously funded and unfunded startups. Panel A uses all previously unfunded startups as control. Panel B uses only comparable startups identified by the Nearest Neighbor (NN) method (with 1 or 5 neighbors) or the Propensity Score (PS) method (with probit or logit treatment model). The covariates for the identification of the comparable startups include startup-specific variables (e.g. # locations, # categories, # products, # startups founded), and macroeconomic variables (e.g. (Ybaa − Yus10), (rm − rf), smb, hml, and rf). Numbers in the brackets are the t-statistics. Significance at 10%, 5%, and 1% levels are denoted by *, **, and ***.
A. Previously Funded vs. Previously Unfunded
Period return* Period return Exit value IPO&MA rate(%) Death rate (%)
Funded 0.034 0.034 0.044 0.5 0.4
Unfunded -0.017 0.051 -0.197 0.02 0.49
Difference 0.05 *** -0.017 0.241 *** 0.48 *** -0.09
Test (t or chi2) 10.501 0.901 16.232 164.91
B. Previously Funded vs. Comparable Previously Unfunded
Period return* Period return Exit value
Nearest Neighbor (nn1) 0.052 *** -0.039 0.206 ***
(9.62) (0.57) (9.84)
Nearest Neighbor (nn5) 0.052 *** -0.042 0.223 ***
(10.59) (0.81) (12.64)
Propensity Score (probit) 0.051 *** -0.024 0.201 ***
(6.88) (0.58) (9.86)
Propensity Score (logit) 0.051 *** -0.025 0.199 ***
(8.97) (0.79) (9.74)
Table 5. Constrained Estimation
This table presents the estimation results for the constrained model in which the coefficients associated with the intercept term, the macroeconomic variables, and the startup-specific features are constrained to be the same. This is equivalent to the assumption that funded and unfunded startups have the same laws of motion for their cumulative return r and implicit exit value s, and at the same time, the VC-related features for the unfunded startups are set to zero. Therefore, Eq(1) and Eq(2) become Eq(A28) and Eq(A29). Details of the estimation are given in Appendix A3. Numbers in the brackets are the t-statistics. Significance at 10%, 5%, and 1% levels are denoted by *, **, and ***.
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
[1] [2] [3]
r 0.081 *** 0.033 ***
(43.10) (163.69)
sigma2 0.085 *** 1.584 ***
(595.14) (209.15)
intercept 0.052 *** -0.934 ***
(70.07) (7.63)
Macro Variables
(Ybaa − Yus10) 0.052 *** -0.044 **
(70.07) (2.41)
(rm − rf ) -0.011 *** 0.785 ***
(24.03) (2.94)
smb 0.008 *** -2.095 ***
(16.17) (4.06)
hml 0.099 *** -0.085
(6.35) (0.21)
rf 0.226 *** 0.569 ***
(8.44) (5.79)
Startup Features
# locations 0.02 *** 0.01 -0.849 ***
(71.08) (0.63) (6.26)
# categories -0.009 *** 0.000 -0.273 ***
(62.09) (0.02) (8.93)
# products 0.001 *** 0.014 0.023 ***
(26.69) (1.18) (7.16)
t from last round 0.052 ***
(10.81)
t^2 from last round -0.026 ***
(8.19)
# rounds 0.002 *** -0.003 0.022 ***
(454.42) (1.28) (44.80)
# startups founded -0.003 -0.066 -0.164 ***
(0.30) (1.55) (10.54)
1(top20 school) -0.017 *** 0.042 -0.435 ***
(32.69) (0.62) (34.76)
VC Syndicate Features
# of VCs -0.004 *** 0.048 * -0.344 ***
(7.99) (1.67) (10.28)
# locations 0.001 *** 0.006 0.001 ***
(5.37) (1.35) (5.21)
1(cooperated) 0.001 *** 0.005 * 0.03 ***
(9.46) (1.94) (12.58)
# rounds 0.01 *** -0.258 ** 0.105 ***
(2.41) (1.98) (4.76)
# categories 0.000 *** -0.004 -0.02 ***
(3.76) (1.35) (11.43)
1(top20 school) 0.042 *** -0.05 -0.648 ***
(7.11) (0.38) (25.79)
Pair Features
days of travel -0.021 *** 0.591 -0.696 ***
(3.74) (1.51) (12.16)
1(funding tie) -0.119 -0.652 4.52 ***
(0.36) (1.38) (10.13)
1(alumni tie) 0.005 *** -0.024 0.117 ***
(5.27) (1.16) (41.81)
Table 6. Expected VC Impact & Startup Future
This table presents the correlations between the subjective value added $v_t$ and the future period return ($r_{t+\tau} - r_{t+\tau-1}$) as well as the future implicit exit value $s_{t+\tau}$ for startups that are selected to be funded at t. The correlations are calculated as $\mathrm{corr}(v^{ij}_t, r^j_{t+\tau} - r^j_{t+\tau-1})$ and $\mathrm{corr}(v^{ij}_t, s^j_{t+\tau})$, with τ varying from 1 to 12 (months). Numbers in the brackets are the t-statistics. Significance at 10%, 5%, and 1% levels are denoted by *, **, and ***.
Period return Exit value
1 0.114 *** 0.056 ***
(30.00) (12.06)
2 0.089 * 0.125 **
(1.78) (2.26)
3 0.059 -0.025
(0.24) (0.04)
4 0.03 0.043
(0.06) (0.12)
5 0.019 -0.029
(0.03) (0.06)
6 0.059 -0.03
(0.23) (0.05)
7 0.065 0.103 **
(0.29) (2.50)
8 0.102 * -0.048
(1.59) (0.14)
9 0.126 ** 0.038
(5.33) (0.08)
10 0.106 * -0.02
(1.63) (0.03)
11 0.105 * 0.061
(1.45) (0.24)
12 0.07 0.07
(0.27) (0.37)
Table 7. Selection on Private Information
This table presents the estimation results for the extended correlation model in which Eq(1) contains an additional term $\rho_{r,y}\xi^{ij}_{t-1}$ for the funded case and Eq(2) contains an additional term $\rho_{s,y}\xi^{ij}_{t-1}$ for the funded case. The estimates for $\rho_{r,y}$ and $\rho_{s,y}$ give the dependence of the subjective value added on the private information that will drive the startup growth and the implicit exit value one period ahead. They are denoted by "rho" in the table. Details of the extended model are given in Appendix C2. As before, the first two columns give the estimates for $\phi_{r,y}$ and $\phi_{r,n}$, the third and fourth columns give the estimates for $\phi_{s,y}$ and $\phi_{s,n}$, the last column gives the estimate for $\phi_v$. Numbers in the brackets are the t-statistics. Significance at 10%, 5%, and 1% levels are denoted by *, **, and ***.
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
Funded Unfunded Funded Unfunded
[1] [2] [3] [4] [5]
rho 0.009 *** 0.073 ***
(42.32) (5.43)
r 0.017 *** 0.044 *** 0.033 ***
(37.90) (77.06) (327.79)
sigma2 0.011 *** 0.092 *** 1.256 *** 1.284 ***
(46.21) (1316.04) (49.91) (127.15)
intercept 0.006 *** 0.005 *** -0.29 -0.464 ***
(7.08) (50.29) (1.28) (4.47)
Macro Variables
(Ybaa − Yus10) -0.001 -0.013 *** 0.008 0.027
(0.99) (183.77) (0.27) (1.19)
(rm − rf ) 0.009 0.026 *** 0.103 0.61
(0.61) (37.22) (0.18) (1.26)
smb -0.026 0.008 *** -1.644 -0.884 *
(1.12) (7.27) (1.54) (1.77)
hml 0.034 0.105 *** -0.095 -0.205
(1.60) (47.28) (0.10) (0.33)
rf 0.013 0.195 *** 0.133 0.391 **
(0.96) (54.45) (0.50) (1.99)
Startup Features
# locations 0.002 *** 0.017 *** 0.007 -0.014 -0.849 ***
(12.89) (440.09) (0.36) (1.43) (6.28)
# categories -0.009 *** -0.004 *** -0.01 0.001 -0.271 ***
(11.70) (175.47) (0.47) (0.10) (8.88)
# products 0.004 *** 0.004 *** 0.013 0.008 0.024 ***
(8.33) (92.83) (0.55) (0.85) (6.15)
t from last round 0.000 -0.238
(0.12) (1.59)
t^2 from last round -0.078 -0.046 ***
(0.20) (3.16)
# rounds 0.05 0.001 *** 0.617 -0.001 0.022 ***
(0.20) (184.14) (0.13) (1.19) (40.95)
# startups founded -0.004 0.005 *** -0.012 0.004 -0.164 ***
(0.54) (11.75) (0.35) (0.14) (12.10)
1(top20 school) -0.021 *** 0.154 *** -0.003 -0.036 -0.434 ***
(6.20) (403.86) (0.05) (1.41) (30.54)
VC Syndicate Features
# of VCs -0.005 *** 0.009 -0.347 ***
(12.52) (0.36) (9.98)
# locations 0.000 *** 0.002 0.001 ***
(3.32) (0.57) (6.84)
1(cooperated) 0.001 *** 0.002 0.03 ***
(60.80) (0.66) (12.53)
# rounds 0.000 -0.012 0.105 ***
(1.27) (0.10) (5.51)
# categories 0.000 *** -0.002 -0.02 ***
(16.51) (0.68) (11.34)
1(top20 school) 0.038 *** -0.011 -0.643 ***
(14.66) (0.17) (23.59)
Pair Features
days of travel -0.029 *** 0.291 -0.703 ***
(2.04) (0.68) (12.70)
1(funding tie) 0.006 *** -0.29 4.484 ***
(7.08) (1.28) (9.75)
1(alumni tie) 0.003 *** -0.005 0.118 ***
(63.38) (0.30) (27.67)
Table 8. Simulation Results: Model 1 vs. Model 2
This table compares the difference in means of the period return (rt+1 − rt), implicit exit value s, IPO&MA rate, and death rate for the two models in the simulation. Model 1 assumes random selection of funding for all periods to exclude the chain effect. Model 2 assumes random selection of funding only at the first period. I simulate 100 startups that have ex-ante homogeneous features, which are set to the means of these features in the sample. The startups also have synchronous births and the same cumulative return at the beginning. I assume there is one VC that has a funding quota of 50, so each startup gets funded with a probability of 0.5 at any time. Panel A compares the values for previously funded and unfunded startups for the two models. Panel B compares the transition matrix of current funding given previous funding for the two models. Panel C compares the values τ periods ahead for currently funded and unfunded startups, with τ ranging from 1 to 10. Numbers in the brackets are the t-statistics. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.
A. Previously Funded vs. Previously Unfunded
1. Random Selection for All Periods
Period return Exit value IPO&MA rate (%) Death rate (%)
Funded 0.515 0.577 1.327 0.207
Unfunded 0.319 0.326 1.067 0.249
Difference 0.196 0.251 0.26 -0.041
t-stat 0.74 0.61 1.002 0.363
2. Random Selection Only for the First Period
Period return Exit value IPO&MA rate (%) Death rate (%)
Funded 0.956 0.828 1.886 0.179
Unfunded 0.106 -0.126 0.811 0.516
Difference 0.851 *** 0.954 *** 1.075 *** -0.337 *
t-stat 2.92 2.45 2.456 1.645
B. Transition Matrix
1. Random Selection for All Periods 2. Random Selection Only for the First Period
Currently Funded Currently Funded
Previously Funded N Y Previously Funded N Y
N 0.502 0.498 N 0.931 0.069
Y 0.533 0.467 Y 0.041 0.959
C. Future Difference for Initially Funded vs. Unfunded
1. Random Selection for All Periods
Period return Exit value IPO&MA rate (%) Death rate (%)
1 0.196 0.251 0.26 -0.041
2 0.278 0.082 0.094 -0.015
3 0.175 0.037 0.081 -0.015
4 0.082 0.17 -0.007 0.075
5 0.069 0.024 0.271 0.045
6 0.113 0.247 0.639 *** -0.017
7 -0.026 0.215 0.291 -0.016
8 0.02 -0.06 0.822 *** -0.241 ***
9 0.01 0.336 0.43 -0.209
10 0.034 0.087 0.222 -0.011
2. Random Selection Only for the First Period
Period return Exit value IPO&MA rate (%) Death rate (%)
1 0.851 *** 0.954 *** 1.075 *** -0.337 *
2 0.906 *** 1.077 *** 1.124 *** -0.349 *
3 0.909 *** 1.214 *** 1.278 *** -0.284
4 0.881 *** 1.052 ** 1.256 *** -0.215
5 0.856 *** 1.023 ** 1.336 *** -0.225
6 0.879 *** 1.35 *** 1.312 *** -0.344
7 0.857 *** 1.218 ** 1.175 *** -0.358
8 1.024 *** 1.34 ** 1.256 *** -0.281
9 1.114 *** 1.677 *** 1.316 *** -0.243
10 1.359 *** 1.549 *** 1.4 *** -0.233
Table 9. Estimation for Subsamples
This table presents the estimation results using two subsamples for the baseline model described by the system of equations Eq(1), Eq(2), and Eq(4) as in Table 3. Panel A and Panel B give the results for 1998-2006 and 2007-2014 respectively. The subsamples only contain startups that are born during those sub-periods. The first two columns give the estimates for φr,y and φr,n. The third and fourth columns give the estimates for φs,y and φs,n. The last column gives the estimates for φv. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.
A. 1998-2006
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
Funded Unfunded Funded Unfunded
[1] [2] [3] [4] [5]
r 0.001 ** 0.016 *** 0.033 ***
(2.08) (42.55) (123.33)
sigma2 0.011 *** 0.108 *** 1.524 *** 1.473 ***
(58.52) (131.29) (24.17) (61.29)
intercept 0.052 *** 0.028 *** 0.136 -0.264 **
(16.17) (28.04) (0.53) (2.34)
Macro Variables
(Ybaa − Yus10) -0.009 *** -0.051 *** -0.019 0.051
(2.92) (478.79) (0.15) (1.47)
(rm − rf ) 0.056 0.04 *** 0.267 0.093
(0.60) (31.51) (0.15) (0.28)
smb -0.036 -0.074 *** -0.379 0.349
(0.56) (40.50) (0.25) (0.82)
hml -0.055 -0.048 *** 0.114 0.728
(0.39) (15.57) (0.06) (1.56)
rf -0.024 -0.101 *** -0.111 0.072
(0.74) (84.64) (0.25) (0.56)
Startup Features
# locations -0.004 *** 0.013 *** 0.008 0.007 -0.154 ***
(10.00) (473.09) (0.27) (0.86) (9.02)
# categories 0.002 -0.002 *** 0.014 0.003 -0.44 ***
(0.18) (69.35) (0.29) (0.20) (6.06)
# products 0.002 *** 0.006 *** -0.002 -0.001 0.051 ***
(24.60) (265.24) (0.15) (0.13) (10.27)
t from last round 0.000 -0.099
(0.03) (0.53)
t2 from last round -0.067 -0.008
(0.06) (0.27)
# rounds 0.047 0.005 *** 1.013 0.003 0.08 ***
(0.00) (1384.51) (0.08) (0.86) (14.70)
# found startups -0.001 0.051 *** -0.002 -0.013 -0.125 ***
(0.23) (73.39) (0.03) (0.32) (16.06)
1(top20 school) -0.012 *** 0.176 *** 0.026 0.021 -0.465 ***
(3.47) (694.07) (0.22) (0.88) (11.80)
VC Syndicate Features
# of VCs -0.009 *** -0.021 -0.319 ***
(13.36) (0.50) (7.60)
# locations 0.000 0.001 0.023 ***
(0.57) (0.06) (12.42)
1(cooperated) 0.002 *** 0.002 0.032 ***
(5.21) (0.14) (7.30)
# rounds -0.001 -0.043 0.34 ***
(0.46) (0.29) (31.20)
# categories 0.001 ** 0.000 -0.03 ***
(2.54) (0.01) (2.84)
1(top20 school) -0.004 -0.151 -0.774 ***
(0.76) (0.71) (49.66)
Pair Features
days of travel 0.004 -0.05 -1.13 ***
(0.82) (0.22) (22.80)
1(funding tie) 0.052 *** 0.136 6.73 ***
(16.62) (0.53) (10.51)
1(alumni tie) 0.003 *** -0.001 0.07 ***
(16.05) (0.06) (6.62)
B. 2007-2014
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
Funded Unfunded Funded Unfunded
[1] [2] [3] [4] [5]
r 0.017 ** 0.048 *** 0.035 ***
(22.78) (56.00) (70.00)
sigma2 0.036 *** 0.159 *** 1.361 *** 1.587 ***
(36.96) (60.03) (36.58) (57.20)
intercept 0.019 *** 0.085 *** -0.117 -0.147
(4.07) (40.93) (0.29) (1.37)
Macro Variables
(Ybaa − Yus10) -0.005 -0.002 *** 0.001 -0.002
(0.20) (6.51) (0.01) (0.08)
(rm − rf ) 0.072 0.008 * 0.138 -0.276
(0.11) (1.93) (0.15) (0.50)
smb -0.094 -0.009 * -0.457 -0.235
(0.14) (1.90) (0.23) (0.27)
hml -0.111 -0.034 *** -0.773 -0.113
(0.16) (8.71) (0.44) (0.17)
rf 0.227 0.261 *** 0.181 -0.041
(0.03) (6.17) (0.15) (0.13)
Startup Features
# locations -0.001 0.049 *** -0.015 0.015 -1.024 ***
(0.41) (400.97) (0.52) (0.66) (8.35)
# categories -0.002 -0.012 *** 0.02 -0.004 -0.27 ***
(0.29) (182.43) (0.71) (0.31) (10.29)
# products 0.000 0.005 *** -0.002 0.004 0.035 ***
(0.02) (63.69) (0.07) (0.40) (18.53)
t from last round 0.001 0.351 *
(0.47) (1.67)
t2 from last round -0.04 -0.11 ***
(0.05) (2.60)
# rounds -0.062 0.011 *** 0.337 0.027 *** 0.13 ***
(0.00) (944.51) (0.03) (3.43) (25.82)
# found startups -0.017 0.016 *** 0.067 0.036 -0.001
(0.63) (25.75) (1.28) (1.29) (0.32)
1(top20 school) 0.007 0.178 *** 0.002 -0.041 -0.28 ***
(0.81) (187.21) (0.02) (1.03) (11.46)
VC Syndicate Features
# of VCs 0.001 -0.004 -0.485 ***
(1.22) (0.08) (15.05)
# locations 0.002 *** -0.004 -0.002
(7.79) (0.32) (0.90)
1(cooperated) 0.000 0.003 0.042 ***
(1.10) (0.45) (19.62)
# rounds 0.008 0.037 0.047 ***
(1.33) (0.16) (5.63)
# categories 0.000 -0.002 -0.026 ***
(0.21) (0.35) (16.58)
1(top20 school) 0.004 0.082 -0.458 ***
(0.40) (0.53) (19.78)
Pair Features
days of travel -0.037 *** -0.164 -0.356 ***
(4.13) (0.22) (3.97)
1(funding tie) 0.019 *** -0.117 6.179 ***
(4.57) (0.29) (11.09)
1(alumni tie) 0.001 -0.004 0.065 ***
(0.77) (0.12) (8.29)
Table 10. Estimation with Alternative Features
This table presents the estimation results using the principal factors extracted from the various measures described in Section 3.2 as covariates. Panel A gives the factor loadings separately for the macroeconomic variables, constant and time-varying startup features, VC syndicate features, and pair features. I use f t, f j, f tj, f i, f ti, f tji to denote these factors. Note that there is no f ji because days (days of round-trip travel between a VC syndicate and a startup) is the only startup-VC time-invariant feature. Panel B gives the estimation result. The first two columns give the estimates for φr,y and φr,n. The third and fourth columns give the estimates for φs,y and φs,n. The last column gives the estimates for φv. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.
A. Factor Loadings
Macro Variables Pair Features
f t f1 t f2 t Constant
(Ybaa − Yus10) 0.328 -0.277 f ji: days of travel
(rm − rf ) 0.135 0.247 Time-varying
smb 0.204 0.286 f tji f1 tji
hml -0.187 -0.226 1(funding tie) 0.5095
rf -0.355 0.193 1(alumni tie) 0.0431
Startup Features VC Syndicate Features
Constant Constant
f j f1 j f i f1 i
# locations 0.558 # of VCs 0.543
# categories 0.128 # locations 0.849
# products 0.308 Others: location dummies
Others: location/category dummies Time-varying
Time-varying f ti f1 ti f2 ti
f tj f1 tj f2 tj 1(cooperated) 0.112 0.16
# rounds 0.085 0.107 # rounds 0.116 0.23
t from last round -0.387 0.495 # categories 0.127 -0.423
t2 from last round -0.044 0.23 1(top20 school) 0.26 -0.145
# startup founded 0.026 0.008 Others: category/school dummies
1(top20 school) 0.207 0.067 degree/major dummies
B. Parameter Estimates
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
Funded Unfunded Funded Unfunded
[1] [2] [3] [4] [5]
r 0.083 *** 0.077 *** 0.039 ***
(7.85) (97.29) (141.25)
sigma2 0.04 *** 0.096 *** 1.622 *** 0.893 ***
(76.87) (2019.42) (50.34) (222.54)
intercept 0.655 *** -0.025 *** -0.42 0.062
(42.47) (215.12) (0.22) (0.29)
Macro Factors
f1 t -0.005 -0.018 *** -0.189 -0.04 **
(0.17) (82.48) (0.70) (2.00)
f2 t 0.001 0.005 *** -0.274 0.007
(0.03) (15.06) (0.68) (0.28)
Startup Factors
f1 j 0.002 0.014 *** 0.056 0.027 * -0.012 ***
(0.27) (142.44) (0.66) (1.69) (3.54)
f1 tj -0.008 -0.074 *** 0.171 0.894 ** 4.325 ***
(0.72) (412.23) (0.09) (2.54) (9.60)
f2 tj -0.009 -0.038 *** -0.154 0.38 *** 0.91 ***
(0.17) (584.25) (0.04) (4.50) (10.54)
f3 tj 0.008 0.262 *** -0.049 -1.248 ** -5.683 ***
(1.49) (1106.98) (0.06) (2.51) (8.69)
VC Syndicate Factors
f1 i 0.006 0.075 0.11 ***
(0.45) (0.52) (10.65)
f1 ti 0.004 0.066 0.233 ***
(0.17) (0.53) (30.77)
f2 ti -0.006 -0.039 0.337 ***
(0.65) (0.44) (33.17)
Pair Factors
days of travel 0.000 0.000 -0.042 ***
(0.57) (0.65) (23.88)
f1 tji 0.081 *** 0.381 1.096 ***
(49.14) (0.15) (50.04)
Table 11. Estimation for Alternative Models
This table presents the estimation results for the alternative models. Panel A presents the result for the extended model with an AR(1) structure in the subjective value added. The AR(1) coefficient is denoted by "lagged v" in the table. Details of the model are given in Appendix C1. Panel B presents the result for the extended model with a hierarchical structure in startup returns. The fixed effects in startup growth for the pure types of winner, loser, and break-evener are denoted by $\mu_y^{(1)}$, $\mu_y^{(2)}$, $\mu_y^{(3)}$ for the previously funded case and by $\mu_n^{(1)}$, $\mu_n^{(2)}$, $\mu_n^{(3)}$ for the previously unfunded case. Details of the model are given in Appendix C3. The first two columns give the estimates for φr,y and φr,n. The third and fourth columns give the estimates for φs,y and φs,n. The last column gives the estimates for φv. Significance at the 10%, 5%, and 1% levels is denoted by *, **, and ***.
A. Autocorrelation in Subjective Value Added
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
Funded Unfunded Funded Unfunded
[1] [2] [3] [4] [5]
lagged v 0.009 ***
(34.65)
r 0.017 *** 0.04 *** 0.033 ***
(34.90) (66.66) (48.87)
sigma2 0.011 *** 0.092 *** 1.385 *** 1.338 ***
(97.84) (773.63) (90.78) (95.24)
intercept 0.004 *** 0.005 *** -0.407 * -0.466 ***
(3.03) (54.83) (1.81) (4.41)
Macro Variables
(Ybaa − Yus10) -0.001 -0.013 *** 0.012 0.029
(0.86) (18.59) (0.40) (1.22)
(rm − rf ) 0.007 0.026 *** 0.104 0.548 *
(0.74) (38.15) (0.21) (1.67)
smb -0.023 0.008 *** -1.78 * -0.721
(0.88) (7.04) (1.82) (1.28)
hml 0.036 0.105 *** -0.085 -0.177
(1.59) (4.68) (0.10) (0.33)
rf 0.017 0.196 *** 0.145 0.381 ***
(0.19) (5.60) (0.58) (3.39)
Startup Features
# locations 0.003 *** 0.017 *** 0.009 -0.002 -0.861 ***
(5.40) (4113.36) (0.44) (0.12) (6.24)
# categories -0.009 *** -0.004 *** -0.01 0.001 -0.282 ***
(4.35) (1622.62) (0.52) (0.07) (8.87)
# products 0.004 *** -0.004 *** 0.004 0.008 0.026 ***
(2.33) (1011.76) (0.16) (0.78) (8.21)
t from last round 0.000 0.242 *
(0.11) (1.74)
t2 from last round -0.047 -0.051 ***
(0.13) (3.53)
# rounds -0.025 0.001 *** 0.135 -0.002 ** 0.023 ***
(0.11) (199.13) (0.03) (1.99) (39.64)
# found startups -0.005 0.005 *** -0.012 0.002 -0.164 ***
(1.55) (122.57) (0.39) (0.05) (11.06)
1(top20 school) -0.019 *** 0.154 *** -0.008 -0.039 -0.438 ***
(10.80) (374.70) (0.16) (1.31) (44.48)
VC Syndicate Features
# of VCs -0.003 *** 0.005 -0.314 ***
(18.39) (0.18) (11.45)
# locations 0.000 *** 0.001 0.002 ***
(8.36) (0.35) (5.77)
1(cooperated) 0.001 *** 0.002 0.028 ***
(13.40) (0.59) (14.68)
# rounds -0.001 -0.035 0.107 ***
(0.69) (0.31) (6.35)
# categories 0.000 *** -0.002 -0.019 ***
(7.65) (0.69) (13.13)
1(top20 school) 0.039 *** 0.003 -0.638 ***
(17.30) (0.04) (26.34)
Pair Features
days of travel -0.162 * 0.55 -0.72 ***
(1.76) (1.39) (13.70)
1(funding tie) 0.004 *** 0.407 * 4.198 ***
(4.17) (1.81) (10.15)
1(alumni tie) 0.003 *** -0.002 0.107 ***
(19.84) (0.14) (43.43)
B. Hierarchical Model for Return
Influence Selection
Growth (r) Exit Value (s) Value Added (v)
Funded Unfunded Funded Unfunded
[1] [2] [3] [4] [5]
mu1 0.011 1.564 ***
(1.01) (33.16)
mu2 -0.005 -1.055 ***
(0.45) (65.20)
mu3 0.006 0.056 ***
(0.39) (46.03)
r 0.024 *** 0.057 *** 0.036 ***
(39.92) (57.85) (264.89)
sigma2 0.007 *** 0.105 *** 1.403 *** 1.211 ***
(304.83) (642.93) (28.02) (55.72)
intercept -0.397 *** -0.686 ***
(2.61) (4.40)
Macro Variables
(Ybaa − Yus10) 0.000 -0.004 *** 0.021 0.088 ***
(0.56) (4.44) (0.69) (3.35)
(rm − rf ) 0.017 *** -0.018 *** 0.425 0.465 **
(0.51) (4.52) (0.59) (2.00)
smb 0.013 0.043 *** -0.32 -2.655 ***
(0.21) (4.42) (0.37) (6.61)
hml -0.024 0.022 *** -0.856 -1.032 *
(0.31) (2.59) (1.12) (1.95)
rf 0.015 ** 0.163 *** 0.297 0.564 ***
(2.06) (14.82) (1.19) (5.19)
Startup Features
# locations 0.000 0.023 *** 0.006 -0.015 -0.565 ***
(1.53) (75.70) (0.27) (1.06) (7.60)
# categories -0.007 *** -0.006 *** 0.008 -0.016 -0.305 ***
(18.16) (26.65) (0.43) (1.62) (11.04)
# products 0.002 *** 0.009 *** -0.01 0.022 *** 0.006 ***
(5.33) (66.36) (1.04) (2.68) (15.08)
t from last round 0.001 0.339 *
(1.25) (1.90)
t2 from last round -0.062 -0.062 ***
(0.19) (3.55)
# rounds 0.025 0.002 *** 0.301 -0.002 * 0.022 ***
(0.13) (32.92) (0.08) (1.86) (30.32)
# found startups -0.011 * 0.035 *** 0.044 0.003 -0.146 ***
(1.81) (8.52) (1.00) (0.09) (20.84)
1(top20 school) 0.01 0.138 *** -0.017 -0.085 ** -0.464 ***
(1.61) (268.67) (0.35) (2.55) (25.69)
VC Syndicate Features
# of VCs -0.006 *** 0.016 -0.417 ***
(16.03) (0.57) (14.62)
# locations 0.003 *** -0.003 0.032 ***
(2.11) (0.37) (29.53)
1(cooperated) 0.000 * -0.001 0.03 ***
(1.76) (0.42) (16.10)
# rounds -0.007 *** 0.058 0.114 ***
(4.17) (0.36) (5.63)
# categories 0.000 *** 0.001 -0.023 ***
(8.06) (0.51) (17.33)
1(top20 school) 0.004 * 0.048 -0.556 ***
(1.86) (0.63) (55.21)
Pair Features
days of travel -0.014 *** 0.206 -0.913 ***
(5.68) (1.03) (21.89)
1(funding tie) 0.012 -0.397 *** 4.113 ***
(1.52) (2.61) (14.40)
1(alumni tie) 0.001 *** 0.004 0.11 ***
(4.67) (0.33) (24.41)
Table A1. Locations and Categories of Startup
This table presents the location and category summary for the startups in the sample. Panel A counts the number of startups that have an office in California (CA), New York (NY), other locations in the U.S. except California and New York (OUS), other locations in North America except the U.S. (ONA), Asia (AS), and Europe (EU). Panel B counts the number of startups that have an office in the top 20 cities. Panel C counts the number of startups that belong to the top 20 categories. The classification of category is from TechCrunch.
A. Locations by Area
Area CA NY OUS ONA AS EU
Freq. 2,542 945 2,630 348 1,184 2,085
B. Top 20 Cities C. Top 20 Categories
City Freq. Category Freq.
San Francisco 882 Software 1,561
New York 809 Mobile 1,222
London 480 Advertising 689
Los Angeles 211 Games 505
Chicago 177 Education 351
Palo Alto 173 Consulting 345
Seattle 165 Internet 340
Mountain View 141 Apps 320
Austin 141 Finance 296
Paris 139 Analytics 295
Toronto 123 Technology 263
Boston 113 Search 255
Bangalore 108 Video 247
San Diego 102 Startups 245
Berlin 99 Networking 238
Cambridge 98 Music 217
Tel Aviv 96 Android 210
Santa Monica 95 Fashion 208
Singapore 86 Design 206
San Jose 80 Travel 194
Table A2. Educational Background of Startup Teams
This table presents the educational background for the people associated with the startups in the sample. The total number of people with valid educational background is 15,342. They are either ever-employed or currently-employed by the startups in the sample, including the founders. Panel A counts the number of people who have completed a degree in the top 20 schools. A school is included if it is in the top 20 of the U.S. News rankings or if it frequently appears in the educational background of the startups' personnel. Panel B counts the number of people who have completed a degree in the three broad fields of engineering, business & economics, and law & politics. Panel C counts the number of people by their highest degrees.
A. Top School List
School Freq. School Freq.
Stanford 1,073 Dartmouth 128
Harvard 804 Oxford 124
NYU 712 Santa Clara 124
UC Berkeley 578 Cambridge 122
Upenn 494 Brown U 122
MIT 472 Boston U 121
Columbia 384 Caltech 118
Northwestern 310 Galtech 115
UCLA 274 UCSB 113
Cornell 273 Georgetown 108
UT Austin 221 U Colorado Boulder 107
U Tel Aviv 213 U Wisconsin Madison 103
USC 209 San Jose State U 93
Yale 199 U Maryland 92
U Chicago 193 INSEAD 89
Carnegie Mellon U 191 PSU 88
U Illinois 180 UC Davis 79
Duke 163 U Waterloo 78
Princeton 162 U Johns Hopkins 68
U Washington 157
B. Field C. Degree
Field Freq. Degree Freq.
Engineering 7,105 M.S. & M.A. 2,784
Business & Economics 4,967 M.B.A 2,948
Law & Politics 723 Ph.D. 845
Table A3. Locations of VC and Funded Categories
This table presents the location and funding category summary for the VCs in the sample. Panel A counts the number of VCs and VC syndicates that have an office in California (CA), New York (NY), other locations in the U.S. except California and New York (OUS), other locations in North America except the U.S. (ONA), Asia (AS), and Europe (EU). Panel B and Panel C count the number of VCs and VC syndicates that have an office in the top 20 cities. Panel D counts the number of startups by category that have received VC funding. The classification of category is from TechCrunch.
A. Locations by Area
Area CA NY OUS ONA AS EU
Freq.
VC 343 126 253 21 113 145
VC Syndicate 2,330 1,278 1,650 111 1,278 846
B. Top 20 Cities by VC C. Top 20 Cities by VC Syndicate D. Top 20 Funded Categories
City Freq. City Freq. Category Freq.
New York 119 Menlo Park 2,175 Software 215
San Francisco 97 New York 1,806 Advertising 206
Menlo Park 84 San Francisco 1,504 Mobile 117
Palo Alto 82 Palo Alto 1,260 Consulting 77
London 70 Beijing 802 Biotechnology 61
Beijing 39 Shanghai 775 Games 55
Boston 39 Herzliya 676 Education 45
Shanghai 30 London 654 Internet 43
Cambridge 29 Cambridge 639 Finance 40
Paris 23 Mumbai 525 Design 34
Seattle 20 Bangalore 489 Search 29
Herzliya 20 Boston 391 Technology 26
Mumbai 19 New Delhi 348 Analytics 25
Chicago 18 Hong Kong 212 Networking 23
Tokyo 16 Mountain View 172 Apps 23
Hong Kong 15 Waltham 166 Media 23
Bangalore 14 Los Angeles 163 Security 21
Singapore 14 Seattle 158 News 21
Austin 13 Philadelphia 153 Video 21
Toronto 13 Tokyo 141 Services 21
Table A4. Educational Background of VC Teams
This table presents the educational background for the people associated with the VCs in the sample. The total number of people with valid educational background is 11,026. They are either ever-employed or currently-employed by the VCs in the sample, including the founders. Panel A counts the number of people who have completed a degree in the top 20 schools. A school is included if it is in the top 20 of the U.S. News rankings or if it frequently appears in the educational background of the VCs' personnel. Panel B counts the number of people who have completed a degree in the three broad fields of engineering, business & economics, and law & politics. Panel C counts the number of people by their highest degrees.
A. Top School List
School Freq. School Freq.
Harvard 1,246 INSEAD 126
Stanford 1,211 U Illinois 120
Upenn 669 Carnegie Mellon U 106
NYU 532 UT Austin 96
MIT 458 U Tel Aviv 96
Columbia 398 U Washington 94
UC Berkeley 366 Caltech 93
U Chicago 322 Boston U 85
Yale 223 Santa Clara 84
Princeton 219 LSE 82
Cornell 207 Boston College 76
UCLA 198 U Notre Dame 74
Dartmouth 182 San Jose State U 68
Duke 176 U Wisconsin Madison 65
Oxford 157 U Waterloo 63
Cambridge 148 Washington U 62
Northwestern 140 Galtech 61
Brown U 132 U Johns Hopkins 57
USC 126
B. Field C. Degree
Field Freq. Degree Freq.
Engineering 4,730 M.S. & M.A. 2,043
Business & Economics 4,272 M.B.A 3,515
Law & Politics 615 Ph.D. 652
Appendix A. Estimation Procedure
A1. Prior Distributions
The prior distribution assumptions are as follows. The parameters in Eq(1) and Eq(2), i.e., $(\phi_{r,y}, \sigma^2_{r,y})$, $(\phi_{r,n}, \sigma^2_{r,n})$, $(\phi_{s,y}, \sigma^2_{s,y})$, and $(\phi_{s,n}, \sigma^2_{s,n})$, have Normal-Inverse-Gamma priors (with unknown variances $\sigma^2$ to be estimated).
$$\phi_{k,m} \mid \sigma^2_{k,m} \sim N\left(0, \sigma^2_{k,m} A^{-1}_{k,m}\right), \quad \sigma^2_{k,m} \sim \Gamma^{-1}\left(a_{k,m}, b_{k,m}\right) \tag{A1}$$
where k = "r" or "s" denotes the dependent variable, and m = "y" or "n" denotes whether the startup is previously funded.
The parameter in Eq(4), i.e., $\phi_v$, has a Normal prior (with known variance equal to 1).
$$\phi_v \sim N\left(0, A^{-1}_v\right) \tag{A2}$$
Note that the prior means of the φ's are assumed to be zero, so that the null hypothesis is that all coefficients are insignificant. The A's are diagonal matrices with all diagonal elements equal to 1/100, and the a's and b's are set to 2.0 and 1.0. These prior assumptions form a Bayesian Linear Regression setting, which gives tractable posterior distributions. Please see Korteweg (2013) for a detailed description of the setting as well as the rule for parameter estimation.
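As an illustration of the prior in (A1), the following Python sketch draws one coefficient vector and one variance from a Normal-Inverse-Gamma prior with the hyperparameters stated above. It assumes NumPy and assumes that $\Gamma^{-1}(a, b)$ is parameterized with shape a and scale b; the function name and arguments are hypothetical and are not taken from the estimation code used in the paper.

```python
import numpy as np

def draw_nig_prior(k, a=2.0, b=1.0, prior_precision=0.01, rng=None):
    """Draw (phi, sigma2) from a Normal-Inverse-Gamma prior as in (A1).

    k is the number of coefficients. The prior precision matrix is assumed
    to be A = prior_precision * I, i.e. diagonal elements of 1/100.
    """
    rng = np.random.default_rng() if rng is None else rng
    # sigma2 ~ Inverse-Gamma(a, b): reciprocal of a Gamma(shape=a, rate=b) draw.
    sigma2 = 1.0 / rng.gamma(shape=a, scale=1.0 / b)
    # phi | sigma2 ~ N(0, sigma2 * A^{-1}) with A = prior_precision * I.
    cov = (sigma2 / prior_precision) * np.eye(k)
    phi = rng.multivariate_normal(np.zeros(k), cov)
    return phi, sigma2
```

With a = 2.0 and b = 1.0 the prior mean of the variance is b / (a - 1) = 1 under this parameterization, and the prior standard deviation of each coefficient is ten times the noise standard deviation, so the prior is weakly informative.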
A2. Algorithm for Estimation (Baseline Model)
For parameter learning, I develop a parallel Gibbs Sampler to draw from the posterior conditional distributions given the data and the prior distributions. In particular, I factorize the joint posterior distribution into a full set of conditional distributions of (1) the latent variables r, s, v, and (2) the parameters $(\phi_{r,y}, \sigma^2_{r,y})$, $(\phi_{r,n}, \sigma^2_{r,n})$, $(\phi_{s,y}, \sigma^2_{s,y})$, $(\phi_{s,n}, \sigma^2_{s,n})$, and $\phi_v$. The algorithm consists of the following six steps, which are performed iteratively. For initial values, the φ's are set to 0 and the σ2's are set to 1.0.
Steps
1. Impute $r^j_t$ given $s^j_t$, $\{v^{ij}_t, i \in I\}$, parameters, and data
2. Impute $s^j_t$ given $r^j_t$, parameters, and data
3. Impute $v^{ij}_t$ (all together for each t) given $\{r^j_t, j \in J_t\}$, the equilibrium condition, parameters, and data
4. Update $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{r,n}, \sigma^2_{r,n})$ given all r and data
5. Update $(\phi_{s,y}, \sigma^2_{s,y})$ and $(\phi_{s,n}, \sigma^2_{s,n})$ given all s, r, and data
6. Update $\phi_v$ given all v, r, and data
The following paragraphs give the detailed information for the steps 1 to 6. I use some new
notations to simplify the description.
Notations
• $\phi_r$: equals $\phi_{r,y}$ or $\phi_{r,n}$ for previously funded or unfunded
• $\sigma^2_r$: equals $\sigma^2_{r,y}$ or $\sigma^2_{r,n}$ for previously funded or unfunded
• $\phi_s$: equals $\phi_{s,y}$ or $\phi_{s,n}$ for previously funded or unfunded
• $\sigma^2_s$: equals $\sigma^2_{s,y}$ or $\sigma^2_{s,n}$ for previously funded or unfunded
• $\phi_{s,1}$: the vector of $\phi_s$ except the last element
• $\phi_{s,-1}$: the last element of $\phi_s$
• $\phi_{v,1}$: the vector of $\phi_v$ except the last element
• $\phi_{v,-1}$: the last element of $\phi_v$
• $X^j_t$: equals $[1, X_t, X^j_t, X^i_t, X^{ij}_t]$ if j is funded by i at t − 1, and equals $[1, X_t, X^j_t]$ if j is not funded at t − 1
• $X^{ij}_t$: equals $[X^i_t, X^j_t, X^{ij}_t]$ for a pair of i and j
• $\mu_t(i) = \{j : ij \in \mu_t\}$
• $\mu_t(j) = \{i : ij \in \mu_t\}$
Impute r
As in Korteweg and Sørensen (2010), r is imputed using an FFBS (forward filtering and backward smoothing) method. Given infrequent observable values of r, this method samples interim values given the information that is correlated with these interim values. Note that in Eq(2) and Eq(4), both the implicit exit value s and the subjective value added v depend on r. Therefore, s and v generate the information set for the conditional distribution of r. I use $m^j_{t|\tau}$ and $P^j_{t|\tau}$ to denote the conditional mean and variance of $r^j_t$ given the information generated by s and v up to time τ. The forward and backward steps of the FFBS method are given below.
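• Forward
– Forecast
$$m^j_{t|t-1} = m^j_{t-1|t-1} + X^j_t \phi_r \tag{A3}$$
$$P^j_{t|t-1} = P^j_{t-1|t-1} + \sigma^2_r \tag{A4}$$
– Update
$$b = \begin{bmatrix} \phi_{s,-1} \\ \phi_{v,-1} \end{bmatrix} \tag{A5}$$
$$e = \begin{bmatrix} s^j_t - X^j_t \phi_{s,1} \\ \sum_i \left( v^{ij}_t - X^{ij}_t \phi_{v,1} \right) / I \end{bmatrix} - b\, m^j_{t|t-1} \tag{A6}$$
$$K = P^j_{t|t-1} b' \left( b P^j_{t|t-1} b' + \begin{bmatrix} \sigma^2_s & 0 \\ 0 & \sigma^2_v / I \end{bmatrix} \right)^{-1} \tag{A7}$$
$$m^j_{t|t} = m^j_{t|t-1} + K e \tag{A8}$$
$$P^j_{t|t} = (1 - K b) P^j_{t|t-1} \tag{A9}$$
• Backward
– Given the draw of $r^{j*}_{t+1}$
$$G = \begin{cases} P^j_{t|t} \left( P^j_{t|t} + \sigma^2_{r,y} \right)^{-1} & \text{if } j \text{ is funded at } t \\ P^j_{t|t} \left( P^j_{t|t} + \sigma^2_{r,n} \right)^{-1} & \text{if } j \text{ is unfunded at } t \end{cases} \tag{A10}$$
$$M = m^j_{t|t} + G \left( r^{j*}_{t+1} - m^j_{t+1|t} \right) \tag{A11}$$
$$V = P^j_{t|t} (1 - G) \tag{A12}$$
– Draw $r^{j*}_t \sim N(M, V)$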
Note that given s, v, and the data, the conditional distributions of $\{r^j_t : 0 \le t \le T\}$ are independent across j. Therefore, the FFBS procedure can be performed in a parallel fashion for all startups.
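The forward and backward passes can be sketched in Python for a single startup. This is a minimal illustration under stated assumptions rather than the implementation used for the estimates: it assumes NumPy, takes the bracketed signal vector of (A6) as a precomputed input `z`, ignores the periods in which r is directly observed, and uses the standard backward-sampling innovation $r^{j*}_{t+1} - m^j_{t+1|t}$. The function name and all argument names are hypothetical.

```python
import numpy as np

def ffbs(z, drift, q, b, R, m0=0.0, P0=0.0, rng=None):
    """Forward filtering, backward sampling for a scalar random-walk state.

    z     : (T, 2) signals [s - X phi_{s,1}, mean_i(v - X phi_{v,1})]
    drift : (T,)   predictable growth X_t^j phi_r
    q     : (T,)   state variance sigma_r^2 for the step into period t
    b     : (2,)   loadings [phi_{s,-1}, phi_{v,-1}]
    R     : (2, 2) signal noise covariance diag(sigma_s^2, sigma_v^2 / I)
    """
    rng = np.random.default_rng() if rng is None else rng
    T = len(drift)
    m_pred, P_pred = np.empty(T), np.empty(T)
    m_filt, P_filt = np.empty(T), np.empty(T)
    m_prev, P_prev = m0, P0
    for t in range(T):
        # Forecast step.
        m_pred[t] = m_prev + drift[t]
        P_pred[t] = P_prev + q[t]
        # Update step: K is the (length-2) Kalman gain.
        e = z[t] - b * m_pred[t]
        S = P_pred[t] * np.outer(b, b) + R
        K = P_pred[t] * np.linalg.solve(S, b)
        m_filt[t] = m_pred[t] + K @ e
        P_filt[t] = (1.0 - K @ b) * P_pred[t]
        m_prev, P_prev = m_filt[t], P_filt[t]
    # Backward sampling, starting from the last filtered distribution.
    r = np.empty(T)
    r[-1] = rng.normal(m_filt[-1], np.sqrt(P_filt[-1]))
    for t in range(T - 2, -1, -1):
        G = P_filt[t] / (P_filt[t] + q[t + 1])
        M = m_filt[t] + G * (r[t + 1] - m_pred[t + 1])
        V = P_filt[t] * (1.0 - G)
        r[t] = rng.normal(M, np.sqrt(V))
    return r
```

Because each call touches only one startup's series, the function can be mapped over startups with any parallel map, which is the independence property noted above.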
Impute s
The distribution of s is truncated Normal given r and the startup status (i.e., IPO/MA, Death, or Survival), and it is conditionally independent of v. By assumption, the presence of a VC also affects the startup's status. Here, I let δ = 3. Using the same notation, s is sampled as follows.
$$M = \left[ X^j_t, r^j_t \right] \phi_s, \quad V = \sigma^2_s \tag{A13}$$
• $\text{Status}^j_t$ = IPO/MA: draw $s^j_t \sim N(M, V) \times 1[\delta \le s^j_t]$
• $\text{Status}^j_t$ = Survival: draw $s^j_t \sim N(M, V) \times 1[-\delta \le s^j_t < \delta]$
• $\text{Status}^j_t$ = Death: draw $s^j_t \sim N(M, V) \times 1[s^j_t < -\delta]$
Here, 1[.] denotes the indicator function. The imputation of s can be performed in parallel for all startups and for all time periods.
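The three truncated draws can be sketched with the inverse-CDF method, using only the Python standard library. This is an illustrative sketch: the function names, the status labels used as dictionary keys, and the clamping constant are assumptions, and the inverse-CDF method loses accuracy when the truncation region lies far in a tail.

```python
import random
from statistics import NormalDist

def draw_truncated_normal(mean, var, lower, upper, rng=random):
    """Inverse-CDF draw from N(mean, var) restricted to [lower, upper)."""
    dist = NormalDist(mean, var ** 0.5)
    p_lo, p_hi = dist.cdf(lower), dist.cdf(upper)
    u = rng.uniform(p_lo, p_hi)
    # inv_cdf is undefined at exactly 0 and 1, so clamp slightly inside.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return dist.inv_cdf(u)

def impute_exit_value(status, mean, var, delta=3.0, rng=random):
    """Draw s from N(mean, var) truncated to the region implied by status."""
    inf = float("inf")
    bounds = {
        "IPO/MA": (delta, inf),       # delta <= s
        "Survival": (-delta, delta),  # -delta <= s < delta
        "Death": (-inf, -delta),      # s < -delta
    }
    lower, upper = bounds[status]
    return draw_truncated_normal(mean, var, lower, upper, rng)
```

For example, `impute_exit_value("Survival", 0.5, 1.5)` returns a value between -3 and 3, where the mean and variance would come from (A13).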
Impute v
The distribution of $v^{ij}_t$ is also a one-dimensional truncated Normal given $r^j_t$ and $v^{-ij}_t$, which is defined as the collection $\{v^{i'j'}_t : i \ne i' \text{ or } j \ne j'\}$. As in Sørensen (2007), the conditional distribution of $v^{ij}_t$ depends on whether j is matched with i at time t. More specifically, v is sampled as follows.
$$M = \left[ X^{ij}_t, r^j_t \right] \phi_v \tag{A14}$$
• Matched
Draw $v^{ij}_t \sim N(M, 1) \times 1[\underline{v} \le v^{ij}_t]$, where $\underline{v}$ is given in Eq(10).
• Unmatched
Draw $v^{ij}_t \sim N(M, 1) \times 1[v^{ij}_t < \overline{v}]$, where $\overline{v}$ is given in Eq(9).
Again, the imputation of v can be performed in a parallel fashion for all time periods. However, within a specific time period, the v's for all matched pairs need to be imputed first (either in parallel or sequentially), since the values for the unmatched pairs depend on those of the matched pairs.
Given the imputed variables, the parameters can be updated through Bayesian Linear Regres-
sion (either with or without variance update). For different parameters, the following specifies the
subsamples used as well as the dependent and independent variables for the Bayesian Linear Regres-
sion. Note that we can still parallelize the procedure because the matrices used for multiplication
in the following equations are actually summations of small matrices each of which corresponds to
one observation in the subsample for the regression.
Update φr and σ2r
• Update φr,y and σ2r,y
The subsample includes all previously-matched funding candidates. Let $N_y$ be the size of the subsample. Using the Bayesian Linear Regression rule, we can sample $\phi_{r,y}$ and $\sigma^2_{r,y}$ as follows. Here, X and y represent the stacks of the independent and dependent variables for the previously-matched case in the linear regression in Eq(1).
$$a = a_{r,y} + N_y / 2 \tag{A15}$$
$$b = b_{r,y} + \left[ y'y - G' \left( X'X + A_{r,y} \right) G \right] / 2 \tag{A16}$$
$$G = \left( X'X + A_{r,y} \right)^{-1} X'y \tag{A17}$$
First draw $\sigma^2_{r,y} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{r,y} \mid \sigma^2_{r,y} \sim N\left( G, \sigma^2_{r,y} \left( X'X + A_{r,y} \right)^{-1} \right)$.
• Update φr,n and σ2r,n
The subsample includes all previously-unmatched funding candidates. Let $N_n$ be the size of the subsample. Using the Bayesian Linear Regression rule, we can sample $\phi_{r,n}$ and $\sigma^2_{r,n}$ as follows. Here, X and y are the stacks of the independent and dependent variables for the previously-unmatched case in the linear regression in Eq(1).
$$a = a_{r,n} + N_n / 2 \tag{A18}$$
$$b = b_{r,n} + \left[ y'y - G' \left( X'X + A_{r,n} \right) G \right] / 2 \tag{A19}$$
$$G = \left( X'X + A_{r,n} \right)^{-1} X'y \tag{A20}$$
First draw $\sigma^2_{r,n} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{r,n} \mid \sigma^2_{r,n} \sim N\left( G, \sigma^2_{r,n} \left( X'X + A_{r,n} \right)^{-1} \right)$.
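The conjugate update for one subsample can be sketched as follows. The sketch assumes NumPy and the shape/scale parameterization of the Inverse-Gamma distribution, and it computes the scale update with the quadratic form G'(X'X + A)G, which is the standard Normal-Inverse-Gamma result for a zero prior mean. The function name and argument names are hypothetical.

```python
import numpy as np

def blr_update(X, y, A, a0, b0, rng=None):
    """One Gibbs draw of (phi, sigma2) from a Bayesian linear regression.

    X : (N, k) regressors, y : (N,) dependent variable,
    A : (k, k) prior precision, a0 and b0 : Inverse-Gamma prior parameters.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = X.shape[0]
    precision = X.T @ X + A                  # X'X + A
    G = np.linalg.solve(precision, X.T @ y)  # posterior mean of phi
    a_post = a0 + n / 2.0
    b_post = b0 + (y @ y - G @ precision @ G) / 2.0
    # sigma2 ~ Inverse-Gamma(a_post, b_post), drawn as 1 / Gamma.
    sigma2 = 1.0 / rng.gamma(shape=a_post, scale=1.0 / b_post)
    # phi | sigma2 ~ N(G, sigma2 * (X'X + A)^{-1}).
    cov = sigma2 * np.linalg.inv(precision)
    phi = rng.multivariate_normal(G, cov)
    return phi, sigma2
```

Since X'X and X'y are sums of per-observation terms, they can be accumulated over chunks of the subsample in parallel and added together before the solve.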
Update φs and σ2s
• Update φs,y and σ2s,y
The subsample is the same as above for the update of $\phi_{r,y}$ and $\sigma^2_{r,y}$. The difference is that X and y now represent the stacks of the independent and dependent variables for the previously-matched case in Eq(2).
$$a = a_{s,y} + N_y / 2 \tag{A21}$$
$$b = b_{s,y} + \left[ y'y - G' \left( X'X + A_{s,y} \right) G \right] / 2 \tag{A22}$$
$$G = \left( X'X + A_{s,y} \right)^{-1} X'y \tag{A23}$$
First draw $\sigma^2_{s,y} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{s,y} \mid \sigma^2_{s,y} \sim N\left( G, \sigma^2_{s,y} \left( X'X + A_{s,y} \right)^{-1} \right)$.
• Update φs,n and σ2s,n
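Again, X and y are now the stacks of the independent and dependent variables for the previously-unmatched case in Eq(2).
$$a = a_{s,n} + N_n / 2 \tag{A24}$$
$$b = b_{s,n} + \left[ y'y - G' \left( X'X + A_{s,n} \right) G \right] / 2 \tag{A25}$$
$$G = \left( X'X + A_{s,n} \right)^{-1} X'y \tag{A26}$$
First draw $\sigma^2_{s,n} \sim \Gamma^{-1}(a, b)$, then draw $\phi_{s,n} \mid \sigma^2_{s,n} \sim N\left( G, \sigma^2_{s,n} \left( X'X + A_{s,n} \right)^{-1} \right)$.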
Update φv
The subsample includes all pairs of VCs and funding candidates. The Bayesian Linear Regression now does not include the noise variance, which is fixed at 1. Here, X and y represent the stacks of the independent and dependent variables in Eq(4).
$$G = \left( X'X + A_v \right)^{-1} X'y \tag{A27}$$
Draw $\phi_v \sim N\left( G, \left( X'X + A_v \right)^{-1} \right)$.
A3. Algorithm for Constrained Estimation (Baseline Model)
Now I impose the constraint that in Eq(1) and Eq(2), the parameters that are not associated with the VC-related features are the same for funded and unfunded startups. Equivalently, Eq(1) and Eq(2) become the following.38
$$r^j_t - r^j_{t-1} = \left[ 1, X_t, X^j_t, X^i_t, X^{ij}_t \right] \phi_{r,y} + \sigma_{r,y} \varepsilon^j_t, \quad \text{for all } j \in E_{t-1} \tag{A28}$$
$$s^j_t = \left[ 1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t \right] \phi_{s,y} + \sigma_{s,y} \eta^j_t, \quad \text{for all } j \in E_{t-1} \tag{A29}$$
with $X^i_t = 0$ and $X^{ij}_t = 0$ if j is not funded at t − 1.
It is straightforward to change the algorithm in Appendix A2 for the estimation here. The model does not have $(\phi_{r,n}, \sigma^2_{r,n})$ and $(\phi_{s,n}, \sigma^2_{s,n})$, so all the subsamples in $E_{t-1}$ are used for the estimation of the parameters $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{s,y}, \sigma^2_{s,y})$, with the above augmentation of the independent variables (i.e., $X^i_t = 0$ and $X^{ij}_t = 0$) for the previously-unfunded startups.
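The augmentation of the independent variables can be illustrated with a short Python sketch that builds one regressor row for the constrained growth equation. The function and argument names are hypothetical; the only substantive assumption is that previously-unfunded startups receive zeros in the VC syndicate and pair blocks so that all rows share one coefficient vector.

```python
def constrained_row(macro, startup, n_vc, n_pair, vc=None, pair=None):
    """Build the row [1, X_t, X_t^j, X_t^i, X_t^ij] for one startup-period.

    macro, startup : lists of macro and startup features
    n_vc, n_pair   : widths of the VC syndicate and pair feature blocks
    vc, pair       : feature lists if funded at t - 1, or None if unfunded
    """
    # Unfunded startups get zero blocks, so VC-related coefficients drop out.
    vc = list(vc) if vc is not None else [0.0] * n_vc
    pair = list(pair) if pair is not None else [0.0] * n_pair
    if len(vc) != n_vc or len(pair) != n_pair:
        raise ValueError("VC or pair block has the wrong width")
    return [1.0, *macro, *startup, *vc, *pair]
```

Stacking such rows for all startups in the set of existing startups gives a single design matrix, to which the update for the funded-case parameters in Appendix A2 applies unchanged.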
38 Recall that Et−1 represents the set of existing startups at the end of t− 1, or at the beginning of t.
Appendix B. Measure Construction
Dependent Variables
• r: Cumulative Return
• s: Implicit Exit Value
• v: Subjective Value Added
Independent Variables
• Macroeconomic Variables
• Startup Features
• VC Syndicate Features
• Startup - VC Syndicate Pair Features
B1. Cumulative Return
r = cumulative return = log(V )
• For newborn startups: r = log(V ) = 0, V = 1.
• For existing startups:
– IPO: r = log(V ), with V = market value at IPO.
– MA: r = log(V), with V = acquisition price at MA.
– Death: r = log(V), with V ∼ triangular distribution with a = 0.05, b = 0.1, and c = 0.8.
• At funding round: $r = \log(V) = \sum_t \log\left( V^{PRE}_t / V^{POST}_{t-1} \right)$, so $V = \prod_t \left( V^{PRE}_t / V^{POST}_{t-1} \right)$, where $V^{POST} = I + V^{PRE}$. Here, V is the anti-diluted valuation of the startup at t, and I is the investment amount.
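The funding-round formula can be illustrated with a small Python sketch. It assumes that rounds are supplied in time order as (pre-money valuation, investment amount) pairs and that the sum starts at the second round, with the first round's post-money valuation serving as the base; that starting convention and the function name are assumptions made for illustration.

```python
import math

def cumulative_log_return(rounds):
    """Sum log(V_PRE_t / V_POST_{t-1}) over consecutive funding rounds.

    rounds: list of (pre_money, investment) tuples in time order.
    """
    r = 0.0
    prev_post = None
    for pre_money, investment in rounds:
        if prev_post is not None:
            # Step-up from the previous post-money to the current pre-money
            # valuation, which removes the dilution from new investment.
            r += math.log(pre_money / prev_post)
        prev_post = pre_money + investment  # V_POST = I + V_PRE
    return r
```

For instance, rounds of (4, 1), (10, 2), and (24, 6) give post-money valuations of 5 and 12 after the first two rounds, so r = log(10/5) + log(24/12) = log(4).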
B2. Macroeconomic Variables
• (Ybaa − Yus10) = (Moody’s seasoned Baa corporate bond yield) − (10-year Treasury bond
yield).
• (rm − rf ) = monthly market excess return over risk-free rate.
• smb = monthly factor return for the small-minus-big portfolio.
• hml = monthly factor return for the high-minus-low portfolio.
• rf = monthly risk-free rate.
B3. Startup Features
Constant
• # locations = number of cities that a startups headquarter or offices are located in.
• # categories = number of categories that a startup is classified into.
• # products = number of products that a startup has.
• 1(LOC) = dummy variable indicating whether the startup has its headquarters or offices in
LOC, where LOC is
– ca: state of California
– ny: state of New York
– ous: other places in the U.S. except New York and California
– ona: other places in North America except the U.S.
– as: Asia
– eu: Europe
Time-varying
• t from last round = time since last funding round in years at a specific time t.
• t² from last round = square of (t from last round) at a specific time t.
• # rounds = number of funding rounds experienced in the past at a specific time t.
• # startups founded = number of companies the founder of the startup has built in the past
prior to a specific time t.
• 1(top20 school) = dummy variable indicating whether the startup at a specific time t has
people on the management team who graduated from a top school. The list of top schools
(for startups) is given in Table A2.
B4. VC Syndicate Features
Constant
• # of VCs = number of VC members in a VC syndicate.
• # locations = number of cities in which at least one member VC of the syndicate has an
office or headquarters.
• 1(LOC) = dummy variable indicating whether at least one member VC of the syndicate
has its headquarters or offices in LOC, where LOC is defined as in the startup features above.
Time-varying
• 1(cooperated) = dummy variable indicating whether any member VCs have cooperated in
the past prior to a specific time t.
• # categories = number of categories in which at least one member VC of the syndicate
has funding experience prior to a specific time t.
• # rounds = median number of funding rounds that VC members have participated in prior to a
specific time t. It measures the average experience of the VC syndicate.
• 1(top20 school) = dummy variable indicating whether a VC syndicate at a specific time t
has people on its member VCs' management teams who graduated from a top school. The
list of top schools (for VCs) is given in Table A4.
B5. Pair Features
Constant
• distance = the closest distance in miles between a startup and a VC syndicate. See Figure A2
for the distributions of dist for all and funded pairs.
• days of travel = number of days for a round trip between a startup and a VC syndicate
using the closest distance defined above, where the number of days equals
– 0: if distance ∈ [0, 100], indicating a round trip by driving within one day
– 1: if distance ∈ [100, 1000], indicating a round trip by flight within two days
– 2: if distance ∈ [1000, 10000], indicating a round trip by flight within three days
– 3: if distance ∈ [10000, ∞), indicating an intercontinental trip
Time-varying
• 1(funding tie) = dummy variable indicating whether any VC member in the syndicate has
funded the startup prior to some specific time t.
• 1(alumni tie) = dummy variable indicating whether any VC member and the startup have
any alumni ties at some specific time t.
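The days-of-travel measure defined under the constant pair features is a simple step function of distance. The sketch below is one possible encoding; because the printed intervals share their endpoints, it assumes that a boundary value (100, 1000, or 10000 miles) falls into the higher bin. The function name `days_of_travel` is hypothetical.

```python
def days_of_travel(distance_miles):
    """Map the closest startup-syndicate distance (miles) to the 0-3 travel code.

    Assumption: bins are treated as half-open, [0, 100), [100, 1000),
    [1000, 10000), [10000, inf), so a shared endpoint goes to the higher bin.
    """
    if distance_miles < 0:
        raise ValueError("distance must be non-negative")
    if distance_miles < 100:
        return 0  # round trip by car within one day
    if distance_miles < 1000:
        return 1  # round trip by flight within two days
    if distance_miles < 10000:
        return 2  # round trip by flight within three days
    return 3      # intercontinental trip


codes = [days_of_travel(d) for d in (50, 100, 2500, 12000)]
```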
Appendix C. Alternative Models
C1. Autocorrelation in Subjective Value Added
The subjective value added at $t$ might depend on its value at $t-1$. Therefore, one extension
of the baseline model is to include $v^{ij}_{t-1}$ in the expression of $v^{ij}_t$. With Eq(1) and Eq(2) unchanged,
Eq(4) changes to the following.
$$v^{ij}_t = \left[X^i_t, X^j_t, X^{ij}_t, r^j_t, v^{ij}_{t-1}\right]\phi_v + \xi^{ij}_t, \quad \text{for all } i \in I,\ j \in J_t \tag{C1}$$
The new parameter introduced is the last element in $\phi_v$, which is associated with $v^{ij}_{t-1}$. For
estimation, there are some small changes in the imputation of $v$ and in the update step of the
FFBS method for the imputation of $r$.
C2. Funding Decision Incorporating Private Information
When making the funding decision, venture capitalists may have some private information on
startups that is unobservable to an outside economist. This private information will drive startup
growth and implicit exit value at $t+1$, and thus is incorporated in the subjective value added at $t$
to make the funding selection.
Therefore, the model can be extended as follows. With Eq(4) unchanged, I modify Eq(1) and
Eq(2) as follows for startups that were funded at $t-1$, to incorporate the "private
information" $\xi^{ij}_{t-1}$ contained in the subjective value added at $t-1$.
$$r^j_t - r^j_{t-1} = \left[1, X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \rho_{r,y}\,\xi^{ij}_{t-1} + \sigma_{r,y}\,\varepsilon^j_t, \quad \text{if } j \text{ is funded by } i \text{ at } t-1 \tag{C2}$$
$$s^j_t = \left[1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_{s,y} + \rho_{s,y}\,\xi^{ij}_{t-1} + \sigma_{s,y}\,\eta^j_t, \quad \text{if } j \text{ is funded by } i \text{ at } t-1 \tag{C3}$$
Note that the difference here is that we now include $\xi^{ij}_{t-1}$ as the last independent variable in both
equations. The associated parameters are denoted by $\rho_{r,y}$ and $\rho_{s,y}$ respectively. The correlations
between $v^{ij}_{t-1}$ and $r^j_t - r^j_{t-1}$, and between $v^{ij}_{t-1}$ and $s^j_t$, now also capture the private information
shared between startups and VC syndicates that is not loaded on publicly observable features.
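To see how the loading on $\xi^{ij}_{t-1}$ generates this correlation, the following Python sketch simulates only the unobserved parts of the two equations. It is a toy illustration under stated assumptions: covariate terms are dropped, $\xi$ is drawn as standard Normal, and the values of `rho` and `sigma` are hypothetical. In that setting the correlation between $\xi$ and the return shock $\rho\,\xi + \sigma\,\varepsilon$ is $\rho / \sqrt{\rho^2 + \sigma^2}$.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_corr(rho, sigma=0.6, n=5000, seed=0):
    """Correlation between private information xi and the next-period return shock."""
    rng = random.Random(seed)
    xi = [rng.gauss(0.0, 1.0) for _ in range(n)]                  # private information at t-1
    shock = [rho * x + sigma * rng.gauss(0.0, 1.0) for x in xi]   # unexplained return at t
    return pearson(xi, shock)


corr_with_private_info = simulate_corr(rho=0.8)  # theory: 0.8 / sqrt(0.64 + 0.36) = 0.8
corr_without = simulate_corr(rho=0.0)            # theory: 0
```

With `rho = 0` the two series are unrelated, which corresponds to the case where the subjective value added carries no private information about future growth.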
The changes in the estimation strategy stem from the imputation of $r$ and $s$ for the previously
funded case, the update of $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{s,y}, \sigma^2_{s,y})$, and the imputation of $v$ for matched pairs
given next-period $r$ and $s$ for those pairs. The following lists the steps for a revised algorithm.
More details are available upon request.
Steps
1. Calculate $\xi^{ij}_t$ given currently imputed $v^{ij}_t$, parameters and data
2. Impute $r^j_t$ given $\xi^{ij}_{t-1}$, $s^j_t$, $\{v^{ij}_t, i \in I\}$, parameters and data
3. Update $(\phi_{r,y}, \sigma^2_{r,y})$ and $(\phi_{r,n}, \sigma^2_{r,n})$ given all $\xi$, $r$, and data
4. Impute $s^j_t$ given $\xi^{ij}_{t-1}$, $r^j_t$, parameters and data
5. Update $(\phi_{s,y}, \sigma^2_{s,y})$ and $(\phi_{s,n}, \sigma^2_{s,n})$ given all $\xi$, $s$, $r$, and data
6. Calculate $e^{r,j}_t$ and $e^{s,j}_t$ given $\xi^{ij}_{t-1}$, $r^j_t$, $s^j_t$, parameters and data, where
$$e^{r,j}_t \equiv r^j_t - r^j_{t-1} - \left[1, X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \rho_{r,y}\,\xi^{ij}_{t-1} \tag{C4}$$
$$e^{s,j}_t \equiv s^j_t - \left[1, X_t, X^j_t, X^i_t, X^{ij}_t, r^j_t\right]\phi_{s,y} + \rho_{s,y}\,\xi^{ij}_{t-1} \tag{C5}$$
7. Impute $v^{ij}_t$ for all matched pairs (all together for each $t$) given $\{r^j_t, e^{r,j}_{t+1}, e^{s,j}_{t+1}, j \in J_t\}$, the
equilibrium condition, parameters, and data. Impute $v^{ij}_t$ for all unmatched pairs as before
8. Update $\phi_v$ given all $v$, $r$ and data
C3. Hierarchical Model for Returns
The extended model has a hidden startup fixed-effect $p_j \equiv \left(p^{(1)}_j, p^{(2)}_j, p^{(3)}_j\right)$, which is a startup-specific
probability mixture. It consists of the probabilities that a startup belongs to a specific
type: winner, loser, and break-evener. The probabilities are denoted by $p^{(1)}_j$, $p^{(2)}_j$, and $p^{(3)}_j$. The
period return for a specific type follows a Normal distribution with different means: $\mu^{(1)}_y$, $\mu^{(2)}_y$, and
$\mu^{(3)}_y$ for the three types with funding, and $\mu^{(1)}_n$, $\mu^{(2)}_n$, and $\mu^{(3)}_n$ for the three types without funding.
More specifically, Eq(1) changes to the following.
$$
r^j_t - r^j_{t-1} =
\begin{cases}
\gamma^j_{y,t} + \left[X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \sigma_{r,y}\,\varepsilon^j_t & \text{if } j \text{ is funded by } i \text{ at } t-1 \\
\gamma^j_{n,t} + \left[X_t, X^j_t\right]\phi_{r,n} + \sigma_{r,n}\,\varepsilon^j_t & \text{if } j \text{ is unfunded at } t-1
\end{cases}
\tag{C6}
$$
with
$$\gamma^j_{y,t} \equiv \left\{\mu^{(k)}_y : \text{with probability } p^{(k)}_j\right\}, \quad \text{for } k = 1, 2, 3 \tag{C7}$$
$$\gamma^j_{n,t} \equiv \left\{\mu^{(k)}_n : \text{with probability } p^{(k)}_j\right\}, \quad \text{for } k = 1, 2, 3 \tag{C8}$$
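Here, $\gamma^j_{y,t}$ and $\gamma^j_{n,t}$ are i.i.d. and follow categorical distributions with the common probability
mixture $p_j$. Equivalently, both noises in the period returns, $\left(\gamma^j_{y,t} + \sigma_{r,y}\,\varepsilon^j_t\right)$ and $\left(\gamma^j_{n,t} + \sigma_{r,n}\,\varepsilon^j_t\right)$, follow
Mixture-of-Normals with the common probability mixture $p_j$. I assume the prior of $p_j$ follows a
Dirichlet distribution as follows.
$$p_j \sim \text{Dirichlet}(\alpha) \tag{C9}$$
Let $z^j_t$ denote the realized type (winner, loser, break-evener) for a startup $j$ at time $t$; then
$P\left(z^j_t = k\right) = p^{(k)}_j$. Eq(C6) can be re-written as follows.
$$
r^j_t - r^j_{t-1} =
\begin{cases}
\mu^{(z^j_t)}_y + \left[X_t, X^j_t, X^i_t, X^{ij}_t\right]\phi_{r,y} + \sigma_{r,y}\,\varepsilon^j_t & \text{if } j \text{ is funded by } i \text{ at } t-1 \\
\mu^{(z^j_t)}_n + \left[X_t, X^j_t\right]\phi_{r,n} + \sigma_{r,n}\,\varepsilon^j_t & \text{if } j \text{ is unfunded at } t-1
\end{cases}
\tag{C10}
$$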
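The data-generating process in Eq(C9) and Eq(C10) can be illustrated with a short simulation. This is a minimal sketch, not the paper's code: the covariate term is omitted, and the values of `alpha`, `mu`, and `sigma` as well as the function names are hypothetical. The Dirichlet draw uses the standard construction from normalized Gamma variates.

```python
import random


def draw_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma(alpha_k, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def simulate_type_returns(alpha, mu, sigma, n_periods, rng):
    """Simulate period returns for one startup, net of the covariate term.

    p_j is drawn once per startup; the type z_t is redrawn each period from
    Categorical(p_j), and the return is mu[z_t] + sigma * eps_t.
    """
    p_j = draw_dirichlet(alpha, rng)
    types, returns = [], []
    for _ in range(n_periods):
        z = rng.choices(range(len(mu)), weights=p_j)[0]
        types.append(z)
        returns.append(mu[z] + sigma * rng.gauss(0.0, 1.0))
    return p_j, types, returns


rng = random.Random(1)
# Hypothetical means for winner, loser, and break-evener.
p_j, types, returns = simulate_type_returns(
    alpha=[1.0, 1.0, 1.0], mu=[0.5, -0.5, 0.0], sigma=0.2, n_periods=12, rng=rng
)
```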
Consequently, the estimation for the extended model includes the imputation of the two additional
variables, $p_j$ and $z^j_t$, and the update of the two additional sets of parameters
$\mu_y = \left(\mu^{(1)}_y, \mu^{(2)}_y, \mu^{(3)}_y\right)$ and $\mu_n = \left(\mu^{(1)}_n, \mu^{(2)}_n, \mu^{(3)}_n\right)$. The following lists the steps for a revised algorithm.
More details are available upon request.
Steps
1. Impute $r^j_t$ given $s^j_t$, $\{v^{ij}_t, i \in I\}$, $z^j_t$, parameters and data
2. Impute $s^j_t$ given $r^j_t$, parameters and data
3. Impute $v^{ij}_t$ (all together for each $t$) given $\{r^j_t, j \in J_t\}$, the equilibrium condition, parameters,
and data
4. Update $(\mu_y, \phi_{r,y}, \sigma^2_{r,y})$ and $(\mu_n, \phi_{r,n}, \sigma^2_{r,n})$ given all $r$, $z$ and data
5. Update $(\mu_y, \phi_{s,y}, \sigma^2_{s,y})$ and $(\mu_n, \phi_{s,n}, \sigma^2_{s,n})$ given all $s$, $r$, and data
6. Update $\phi_v$ given all $v$, $r$ and data
7. Update $z$ given $(\mu_y, \phi_{r,y}, \sigma^2_{r,y})$ and $(\mu_y, \phi_{s,y}, \sigma^2_{s,y})$, all $r$ and data
8. Update $p_j$ given $z^j_t$
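Steps 7 and 8 can be sketched with the standard conjugate updates for a finite mixture. The paper does not spell these out, so the following is an assumption: the type is drawn with probability proportional to $p^{(k)}_j$ times the Normal density of the return residual around $\mu^{(k)}$, and $p_j$ is drawn from Dirichlet($\alpha$ + type counts). Function names and numbers are hypothetical, and only the funded-case means are used.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of Normal(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def update_types(residuals, p_j, mu, sigma, rng):
    """Step 7 (assumed form): draw z_t with P(z_t = k) proportional to
    p_j[k] * N(residual_t; mu[k], sigma^2), where residual_t is the period
    return net of the covariate term."""
    types = []
    for e in residuals:
        weights = [p * normal_pdf(e, m, sigma) for p, m in zip(p_j, mu)]
        types.append(rng.choices(range(len(mu)), weights=weights)[0])
    return types


def update_mixture(types, alpha, rng):
    """Step 8 (assumed form): draw p_j from Dirichlet(alpha + counts of each type)."""
    counts = [types.count(k) for k in range(len(alpha))]
    gammas = [rng.gammavariate(a + c, 1.0) for a, c in zip(alpha, counts)]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(7)
residuals = [0.48, 0.55, -0.40, 0.02]  # hypothetical return residuals for one startup
new_types = update_types(residuals, p_j=[1 / 3, 1 / 3, 1 / 3],
                         mu=[0.5, -0.5, 0.0], sigma=0.2, rng=rng)
new_p = update_mixture(new_types, alpha=[1.0, 1.0, 1.0], rng=rng)
```

In a full sampler these two draws would be repeated inside each sweep, after the parameter updates in steps 4 to 6.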