A Group and Reputation Model for the
Emergence of Voluntarism in Open Source Development
Lik Mui, Mojdeh Mohtashemi, Sameer Verma
[email protected], [email protected], [email protected]
Abstract
The open source development efforts of the past decade have brought much benefit for the users of
such free software as Apache, GNOME, GNU/Linux, JBoss, among others. Many questions have been
raised about the motivation behind contributions of developers around the world to create useful software
and then freely give it away. Traditional rational agency theory predicts that rational, self-interested individuals are not likely to volunteer contributions to the common good (Olson, 1965; Hardin, 1982). In
contrast, the voluntary actions of the open source developers seem altruistic and defy our intuitive notion
of how a market-driven, capitalistic society works.
Von Hippel and von Krogh (2003), Lerner and Tirole (2000) among others have documented field
observations on possible motives for why top developers around the world would volunteer to participate
in open source projects. In this paper, we develop a multi-agent model of open source development with
a focus on how reputation and social groups influence the emergence of volunteerism for open source
developers. We analytically derive conditions for how voluntary actions among the developers would
emerge to dominate over overt self-interested concerns. We also report simulation results which correlate
well with field observations about voluntary actions in the open source world. For the management
community, results of this paper illustrate how such implicit and indirect incentives as reputation gain and
group identification can have powerful influence on motivating voluntary actions in software
development.
Introduction
Voluntary acts are necessary for activities such as task forces, special interest groups, and many other informal organizational undertakings (Katz, 1964). However, potential volunteers face the following
dilemma: while contributing to the welfare of everyone in a group, volunteers themselves typically incur
costs not shared by the group. A large body of literature has attempted to address the motivation of these
seemingly altruistic, cooperative individuals who volunteer to contribute to the common good freely
(Diekmann, 1985; Murnighan, et al., 1993; Andreoni, 1995; Ostrom, 1997, 2000). Nevertheless, Tullock
(1995), Lichbach (2000), and others have observed a so called “5 Percent Rule” – in a group engaged in
collective action, only a small percentage (about 5%) of these “… are willing to run great risks ... for the
poor or the public good” (Tullock 1995).
In the context of these literatures, the open source phenomenon poses significant challenges: the
Open Source community depends entirely on voluntary actions by its contributors in order to exist. Thus,
the open source development process can be cast in light of a number of sociological dilemmas: social
traps (Diekmann, 1985; Rapoport, 1988), collective action dilemma (Lichbach, 1996; Ostrom, 2000),
irrational behavior (Olson, 1965), etc.
This paper first describes the open source movement and field studies on what motivates
developers in the open source world. We then review the background literature related to voluntary
games and collective actions. The bulk of the paper will be devoted to describing an analytical model for
how voluntary acts by developers can emerge to contribute to the common good of open source software.
This model is based on earlier work on evolution of cooperation from Mohtashemi and Mui (2003). We
then report simulation results based on the model and discuss how these results correlate with empirical
observations on voluntarism in general and the working of the open source community in particular. A
brief conclusion ends this paper.
Background Literature
It is a common belief that the success of an open source project relies on the support from, and
cooperation among, various groups involved in the open source community. Motivations for contribution
have been attributed to altruistic behavior (Raymond, 1999), a gift-giving culture (Bergquist and Ljungberg, 2001), and in some cases, a negative consensus toward large software corporations that sell
proprietary software (Dotzler, 2003).
Explaining the Open Source Incentives
The behavior of open source developers who share code without monetary or organizational incentives has challenged traditional rational agency analyses. Lerner and Tirole (2000) have suggested
two main approaches to address these seemingly irrational behaviors. One is to study career prospects as
an incentive and the other to consider open source participation as a signaling of programming abilities.
These hypotheses are echoed by findings by von Hippel and von Krogh (2003), and open source pioneers
such as Raymond (1999).
“The ‘utility function’ Linux hackers is maximizing is not classically economic, but is
the intangible of their own ego satisfaction and reputation among other hackers.”
-- Raymond, 1999
Participation in open source projects is well acknowledged to have significant reputation benefits.
According to Raymond (1999), successful open source developers create “good reputation among one’s
peers, attention and cooperation from other, … higher status [in the] … exchange economy.”
Von Hippel and von Krogh (2003) have proposed a “private-collective model” for explaining how
the innovation process within the open source world works. In contrast to a collective action model
where free-rider incentives induce participants not to contribute, this model describes private-investment incentives for participants to contribute to the development rather than free-ride:
• Increased diffusion of the innovation, and thus increased innovation-related profits for the innovator, through network effects
• Open source developers inherently obtain private benefits from the contributions. These
private benefits include: learning and enjoyment during the development process, gaining a
sense of ownership and control over their work product, being able to participate in a
community
The traditional model of innovation assumes that the innovator typically invests in innovation based on
an incentive that their efforts will be repaid when that innovation is diffused to the consumers. Innovation
within open source projects defies this incentive structure because by definition, open source projects will
result in freely distributed, downloadable intellectual property. Von Hippel and von Krogh (2003) argue that open source development does not violate the traditional private-investment model of innovation for several reasons:
• Positive network effects which could lead to increased sales of complementary goods
• Private losses associated with freely revealing source code are low
• For developers, since the costs to freely reveal and diffuse their development effort are
low, small rewards “should be sufficient to induce the [voluntary] behavior”
Von Krogh (2002) and Lerner and Tirole (2000) have observed that rewards to the open source
participants include: “elevated reputation among fellow developers, expected reciprocity, and incentives
to build a community.” In addition, open source developers gain other benefits: (1) skills and knowledge usable in their paid jobs, and (2) a sense of fun and control in performing open source development. There are also delayed (signaling) incentives: (1) improved prospects for career options from signaling one’s abilities, (2) possible shares or other monetary interests in commercial open source companies, (3) enhanced chances for access to the venture capital market, and (4) an “ego gratification incentive [that] stems from a desire for peer recognition.”
“Strategic complementarities” is an economic term that can be used to describe why developers are
eager to join popular open source efforts (Lerner and Tirole, 2000). By associating with “fads” in the open source community, developers can send strong signals about their performance to a large “audience” (e.g., peers, the labor market, the venture capital community). Consequently, they are more motivated to produce high-quality work.
An N-person Volunteer’s Dilemma Game
For voluntary actions in general, field experiments have shown that instrumental rewards such as increased money, reduced work time, and recognition encourage voluntary actions (Adam, 1965; Pillutla and Murnighan, 1993; Murnighan, et al., 1993). For studying voluntary actions, an N-person game first
and Munighan 1993; Munighan, et al., 1993). For studying voluntary actions, an N-person game first
described by Diekmann (1985) is often used. (The model to be described in this paper will adapt this N-
person game to the open source development process.)
This game assumes that a volunteer produces a common good but at a personal cost. When
deciding on her action, a volunteer faces a dilemma: she can volunteer to contribute to a common good
but gain less than others in the group, or she can hope for another to volunteer and free-ride on that effort. When there is no volunteer, no one benefits: all lose. The personal dilemma is a direct result of
the incomplete information an individual has on how others would choose.
Let U be the utility for the collective good, and K be the production cost. An N-person volunteer’s game can be summarized by the following payoff matrix, in which each column gives the number of other players who volunteer:

                 0      1      2      …    N - 1
    C           U-K    U-K    U-K    …     U-K
    D            0      U      U     …      U
where C represents “cooperation” by a given individual to volunteer and D denotes “defection” by that
individual to not volunteer (thereby hoping for a free ride).1
If N persons realize the need for a software feature, each person might be inclined to wait for someone else to develop the feature. This is the “free-rider” incentive. The person who develops the
feature incurs a cost K (in terms of opportunity cost, time, and possibly lost wages). Everyone else gains
a utility of U. As the size of the population increases, each agent perceives that they share less
responsibility for the common good – an effect called the “bystander effect” or the “diffusion of responsibility” (Latane and Darley, 1968, 1970).
Rational Egoist
What is the rational behavior for the volunteer’s game?
There exists an asymmetric strong equilibrium with (N-1) free riders and one volunteer who
contributes the voluntary effort (Diekmann, 1985). To achieve this equilibrium in an iterated game, social coordination can ensure that the volunteer gains an expected utility of U - K/N. However, as in many social situations, such coordination is not possible (Darley and Latane, 1968).
Explaining Incentives for Voluntary Actions
In the game theoretic literature, both prisoner’s dilemma (PD) and volunteer’s dilemma arise due to
rational agents using strategies which take advantage of short term gains at the expense of the group’s
gain, and at the expense of future payoffs. The PD game in particular has provided fertile ground for
examining strategic actions by Axelrod (1981, 1984) and many following his work (e.g., Busch and
Reinhardt, 1993). As a highly successful strategy, the Tit-for-Tat (or the “trigger strategy”) can be used
as a reference to explain voluntary actions. The explanation hinges on earlier work by the evolutionary
1 Although “cooperation” has a more general meaning in the evolutionary literature (e.g., Mohtashemi and Mui, 2003; Mui, 2003), this paper treats the act to cooperate in the open source world as a voluntary action toward an open source effort – including bug fixing, development, documentation, etc. “Defection” denotes the opposite meaning of not volunteering toward an open source effort.
biologist Trivers (1971), who observed that many human activities are based on reciprocal altruism. As
de Waal and Luttrell (1988) noted, Trivers’ reciprocal altruism assumes a fair and balanced exchange over the longer term. In order for reciprocal altruism to induce “evolution of cooperation” in the repeated PD game using the Tit-for-Tat strategy, Axelrod (1981, 1984) and others have demonstrated a critical
threshold. This threshold depends on the ratio of payoff benefits to cost and the probability of future
encounters. Cooperation can emerge among rational egoists despite the absence of a central authority.
Another way to interpret this result is that acts of reciprocal altruism are not altruistic at all, but rather calculated acts of self-interest with a long-term payoff horizon. For the open source world, reciprocal
altruism can be used to claim that developers are simply hedging their voluntary contributions on future
reciprocal payments (either in terms of job offers, leadership opportunities, venture funding, etc.).
Yet, there are key differences between volunteering and reciprocal altruism. As Smith (1983)
pointed out, volunteering behaviors usually have more market value to the recipient than they do to the volunteers themselves. Volunteers are motivated to perform such acts based on expectations of psychic and moral benefits (Hoffman 1981; Allen and Rushton, 1983). These authors have suggested that voluntary acts
might be instinctive reactions to seeing others in distress. In addition, Berkowitz and Daniels (1964)
observed that voluntary acts of kindness toward others are often performed without expectation of direct
rewards, only with expectations of “generalized reciprocity” from the social group. Batson (1987) has
further documented through a series of experiments that pro-social behaviors are more likely among
empathetic individuals – such as those among a closely knitted social group. Regan and Totten (1975)
have also found the importance of empathic attitude for inducing prosocial altruistic actions without
enforcing contracts.
The generalized reciprocity notion nevertheless points to voluntary actions that are selfishly
motivated – for either tangible or intangible gains. Research results on voluntary organizations generally support the claim that voluntary actions are correlated with the ratio of individual benefits to cost
(Smith, 1983). Stinson and Stam (1976) have found that volunteers’ actions could be motivated by needs
to improve their skills and employability. People volunteer in order to seek social interactions and
companionship (Sharp, 1978). Murnighan et al. (1993) have documented field findings which suggest
that voluntary actions can be connected to promotion concerns in an organization. In our daily lives,
many of us have probably experienced working in companies which explicitly reward and compliment employees for helping their community, “being good social citizens.” Even for such voluntary actions as
donating blood, Kessler (1975) has reported findings that the amount of blood donated is correlated with demands in donors’ occupations. With these findings, the altruistic basis of many voluntary actions can be easily questioned.
Without explicit rewards, are there other motivators for voluntary actions? Kinship (Hamilton,
1963, 1964) and group-based affiliation (Williams, 1971; Wilson and Sober, 1994) have been suggested
to motivate cooperative behaviors. Kramer (1993) has shown that people are more likely to volunteer
when their organizational identification becomes strong. Strong social bonds can create positive
environments for individuals to participate in voluntary actions toward the common good.
Modeling Open Source Development
The open source community is composed of developers who freely distribute their innovations in
the form of source code. Voluntary actions by these developers contribute to the common good –
software binaries and codes available for all to download.
The literature of voluntary action tells us that volunteers are not entirely altruistic but are motivated
by various tangible and intangible gains. This is also observed among the open source developers:
“The ‘utility function’ Linux hackers is maximizing is not classically economic, but
is the intangible of their own ego satisfaction and reputation among other hackers.”
-- Raymond, 1999
Economists Lerner and Tirole (2000) have noted “elevated reputation among fellow developers” as an important incentive for open source developers. Given these field observations, our model of open source development must incorporate reputation calculations in determining agent actions.
Studies have also indicated that voluntary actions could stem from a socialization need
among empathetic individuals (Sharp, 1978; Batson, 1987), or a calculated decision to participate in
prominent development groups (Lerner and Tirole, 2000). Our model of open source should also account
for group affiliation and cooperation among group members within open source projects.
In the sections to follow, we will present a multi-agent model for open source which has the
following characteristics:
• Players can communicate and inquire about the reputation of their co-players.
• Players belong to social groups as non-overlapping cliques of “core development teams.” Members of a clique cooperate with other members of the same clique on feature or bug-fix requests.
• Information about the reputation 2 of co-players is obtained neither randomly nor globally; rather,
players selectively acquire information from their acquaintance network by taking advantage of the
collective memory of the social group to which they belong.
• Information is not propagated randomly in a population. New information resulting from new
interactions modifies the content of the collective memory of a recipient and is therefore selectively
propagated through the recipient’s acquaintance network.3
• A social network is an evolving dynamic entity. With new interactions, new links are created, which in turn increase the likelihood that any two randomly selected players know each other.
2 Many types of reputation exist: global, observer-based, propagated via n degrees of separation, etc. Reputation in this paper is inferred from all interactions in the acquaintance network which an agent belongs to. Discussions and analyses on these various types of reputation can be found in Mui (2003).
3 The main information propagated includes history of cooperation and defections, number of encounters with a given agent, etc. Mohtashemi and Mui (2003) call these information “social information.”
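One way to realize these ingredients in code is sketched below. The class and function names (`Agent`, `make_population`) are our own, not the paper’s:

```python
import random
from dataclasses import dataclass, field

@dataclass
class Agent:
    strategy: int                                    # k in {-5, ..., 6}
    scores: dict = field(default_factory=dict)       # co-player id -> reputation score
    acquaintances: set = field(default_factory=set)  # ids reachable for reputation inquiry

def make_population(n, group_size):
    """Split n agents into non-overlapping cliques of 'core developers'."""
    agents = [Agent(strategy=random.randint(-5, 6)) for _ in range(n)]
    groups = [list(range(i, i + group_size)) for i in range(0, n, group_size)]
    for members in groups:
        for j in members:  # a clique: every member knows every other member
            agents[j].acquaintances = set(members) - {j}
    return agents, groups
```

The acquaintance sets then grow as rounds are played, which is what makes the social network “an evolving dynamic entity.”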
Mapping Open Source to a Model
The success of an open source project relies on the support from, and cooperation among, the various groups involved in it. Different roles can be approximately categorized into four
groups, including core developers, patch submitters, source code testers, and end-users (Dotzler, 2003).
Core developers are a small group of people who contribute most of the code and control the
software releases. In the case of Apache, a popular web-server project, core developers account for over
80% of the coding (Mockus, et al, 2000). Apache has approximately 25 core developers. These members are the ones who have the right to accept or reject requests for features or bug fixes. Following that group is
a collection of patch submitters, bug testers, and end-users. Although end-users are reported to be the largest group of all, they are usually only interested in using pre-compiled binary software and rarely
provide explicit feedback (or any other contributions) to the projects. Most feedback comes from bug
testers and patch submitters. We will treat the patch submitters and bug testers as one category, the “feature (or bug-fix) requester.”
The interactions between core developers and feature requesters usually happen via conversations on e-mail lists or through more well-defined mechanisms such as bug-reporting systems like Bugzilla (Dotzler, 2003). In most cases, several rounds of interaction lead to an outcome in which certain requests are accepted, while others are rejected, and a new generation or version of the software, typically called a release in the open source world, is produced. It is common for core developers and
feature requesters to change their interaction strategies from one generation to the next – some are more active in some generations than in others.
Based on the game theoretic, open source, and voluntary action literatures, we expect this model to
exhibit some of the following characteristics:
• Voluntary actions to participate in open source activities can emerge for rational, self-interested
agents if these agents can take advantage of reputation information and group identification for
modifying their interaction strategies (Mohtashemi and Mui, 2003).
• Some players are willing to cooperate for the common good – volunteer to develop free software –
even when no explicit payoff is given for such actions (Lerner and Tirole, 2000; Ostrom, 2000).
• As observed by Diekmann (1985), Murnighan, et al., (1993) and others, as the size of a population
increases, players’ willingness to cooperate for the common good decreases due to decreases in
group identification and empathy (Batson, 1987).
• Participation in common goods activities increases the socialization for the players, which acts as a
motivator for more such participation (Sharp, 1978; Batson, 1987).
Modeling Open Source with Social Information
We construct a multi-agent model to capture the essential features of the open source development
process. Variants of this model have been used in the past for modeling evolutionary phenomena (Nowak
and Sigmund, 1998; Mohtashemi and Mui, 2003; Mui, 2003). It can be shown that the stage game
described below is related to the volunteer’s stage game as described earlier (Diekmann, 1985).
Consider a population of n individuals divided into non-overlapping groups of the same size, s.
We use the term “agents” to denote such individuals to emphasize the notion that these individuals are in
fact rational, albeit without explicit payoffs for guiding their actions – as discussed below. Each agent
can assume the role of a developer or a feature or bug-fix requester.
Each group models a set of “core developers” who are working in an open source project. We
model the underlying graph structure of a group as a clique.4 Members of a group are more likely to cooperate with each other on feature (or bug-fix) requests than with a request from a non-member.
Acquaintances of a group include those who have interacted with members of the group – members of the
acquaintance set can evolve over time.
A generation models one complete development cycle for a group of affiliated open source
developers. Every generation consists of a fixed number of rounds, m. In every round, two agents are
4 A clique is a fully connected graph.
selected at random: one as a developer and the other as a feature (or bug-fix) requester. The developer has
the option of cooperating with or defecting on the feature requester. If the developer cooperates, he not only receives no payoff but also incurs a cost of c > 0. The requester receives a benefit value of b (b > c):
b : benefit per round for the requester who receives a feature implementation
c : cost per round to a developer who satisfies a given feature request
At the beginning of each generation, every agent assumes a unique clique of acquaintances. Every agent j possesses a strategy kj ∈ { -5, -4, … 5, 6 } and a reputation score about agent i, sij ∈ { -5, -4, … 4, 5 }.5
kj: agent j’s strategy (cooperate if the recipient’s reputation ≥ k)
sij: reputation score that agent j has about agent i
An agent j cooperates if he is willing to develop a feature as requested by agent i. When an agent has a strategy kj = -5, he is a complete cooperator: he would fulfill any request because any requester’s reputation score would be ≥ kj. On the other hand, when an agent has a strategy kj = 6, he is a complete defector: he would deny any request because every requester’s reputation score would be < kj.6 The private
utility gained by the cooperative agent j is an increase in his reputation – despite an opportunity cost c for
fulfilling a request. Note that the cooperative agent j receives no direct payoff from agent i during a given
round of exchange.
To observe how different strategies affect the fitness of agents, all agents assume a reputation score
of zero at the beginning of a generation.7 A potential developer j cooperates if the reputation score sij of
the recipient i is at least as high as j’s own strategy value (kj). The reputation scores of agents are only
5 The ranges of these values have been successfully used in the past in modeling evolution of cooperation in general in Nowak and Sigmund (1998), Mohtashemi and Mui (2003), and Mui (2003).
6 Because agents with negative strategy kj tend to cooperate while those with positive strategy kj tend to defect, the former will be dubbed “cooperators” while the latter “defectors” in the simulation discussion later.
7 This assumption can be removed to observe faster convergence of strategies toward the more “robust” strategy which is copied by all other agents.
known to and updated for their acquaintances.8 If a potential developer cooperates, his reputation score is
increased by one unit; otherwise it is decreased by one unit. A developer performs one of the two actions,
cooperate or defect.
Developer’s action space: { cooperate, defect }
cooperate: refers to a developer’s action to fulfill a feature or bug-fix request from a requester
defect: refers to a developer’s inability or refusal to fulfill a feature or bug-fix request
from a requester
Payoff matrix:

                   cooperate    defect
    developer         -c           0
    recipient          b           0

Reputation update matrix:

                   cooperate    defect
    developer         +1          -1
    recipient          0           0
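The two matrices translate directly into a one-round update. The following sketch uses our own names and applies exactly the payoff and reputation changes tabulated above:

```python
def play_round(dev_payoff, req_payoff, dev_reputation, cooperate, b, c):
    """Apply the payoff and reputation-update matrices for one round."""
    if cooperate:
        dev_payoff -= c      # the developer bears the cost of fulfilling the request
        req_payoff += b      # the requester receives the benefit b (b > c)
        dev_reputation += 1  # cooperation raises the developer's reputation
    else:
        dev_reputation -= 1  # defection lowers it; neither payoff changes
    return dev_payoff, req_payoff, dev_reputation

# One cooperative round with b = 3, c = 1:
assert play_round(0, 0, 0, True, 3, 1) == (-1, 3, 1)
```

The cooperative developer’s only private gain in the round is the reputation increment; the monetary-style payoff goes entirely to the requester.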
After a developer’s action, the feature requester and members of the developer’s clique update their
perception of the developer’s reputation; the developer will then become a ‘one-way acquaintance’ of the
feature requester, i.e., if in future rounds the requester is chosen as a developer, in addition to her
acquaintances, she will also have access to the previous developer for reputation inquiry about her
opponents. In other words, the developer is being added to the feature requester’s acquaintance set but
not vice versa. This is because the developer gains no information about the feature requester by
volunteering his development effort to fulfill a request. However, the requester would have observed the
developer in action and can hence attest how cooperative the developer has been.9 The aggregate
reputation of a developer depends on his accumulated fulfillments of feature (or bug-fix) requests for a given requester. Intuitively, the more a developer successfully delivers fulfillments for a requester, the more his reputation is enhanced with that requester. Conversely, if such fulfillments are few and far between, if they occur at all, his reputation will suffer.
8 The notion of an acquaintance set is treated dynamically, as discussed earlier.
9 As a simplifying first step, our model assumes that every fulfillment of a request by a developer is completed.
If a potential developer j does not know the reputation score of the feature requester i, he will make
use of the social information available by asking all his acquaintances whether they have ever interacted
as feature requesters against the current recipient i -- i.e., if the current feature requester is a one-way
acquaintance of any one in the developer’s acquaintance set Qj. If no information is learned, the
developer will assume a reputation score sij of zero; otherwise, he assesses the reputation of the requester
by adding up her reputation scores, provided by members of the acquaintance set Qj, and dividing by the
total number of encounters.
Reputation sij of a requester i in Qj = ratio of i’s cooperations to i’s total number of encounters recorded in Qj
Such an averaging scheme has the benefit of transparency during analysis. More sophisticated reputation
estimation techniques have been discussed in Mui (2003).
A potential developer j compares the computed reputation score sij for a potential requester i to his
own strategy kj. j volunteers to fulfill i’s request if sij ≥ kj: this is the decision rule for the open source
developer j. Therefore, the outcome of a round depends on the probability of knowing the requester’s
reputation, which is derived from the collective memory embedded in one’s acquaintance set.
Furthermore, the underlying social structure is itself evolving as agents meet over time, which causes the
probability of knowing a randomly selected requester to increase over the lifetime of a development
generation.
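The reputation inquiry and the decision rule sij ≥ kj might be sketched as below. The helper names and the `records` layout are our own, and the integer score scale is simplified to the cooperation ratio described above, so a discriminating developer (kj = 0) cooperates with strangers by default:

```python
def estimate_reputation(acquaintances_j, records, i):
    """Average i's cooperations over i's recorded encounters, as seen from Q_j.

    `records` maps an acquaintance id to {co-player id: (cooperations, encounters)}.
    Returns 0.0 when nobody in Q_j has ever interacted with i.
    """
    coop = enc = 0
    for a in acquaintances_j:
        if i in records.get(a, {}):
            c_i, e_i = records[a][i]
            coop += c_i
            enc += e_i
    return coop / enc if enc else 0.0

def will_volunteer(k_j, acquaintances_j, records, i):
    """Developer j fulfills i's request iff the estimated score s_ij >= k_j."""
    return estimate_reputation(acquaintances_j, records, i) >= k_j
```

With no information, the estimate defaults to zero, so agents with kj ≤ 0 volunteer for strangers while agents with kj > 0 do not.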
Modeling strategy evolution
To model how voluntary action strategies evolve over time, our model uses a multi-generation
scheme. In this scheme, the n agents disband at the end of a generation and reorganize themselves into new groups of agents (core developers for new projects). Within a generation, each agent
gains payoff benefits (as requester) as well as a cumulative reputation score (as developer). At the end of
a generation, that agent’s sum of payoff benefits determines how many new agents in the new generation
will copy his voluntary action strategy k. The more “fit” a given cooperative strategy, the more new agents will copy that strategy in the new generation. In turn, how fit an agent is in a
generation depends on his accumulated reputation in the community at large.
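The end-of-generation copying step can be implemented as payoff-proportional sampling. This sketch assumes accumulated payoffs serve directly as fitness weights (shifted to be positive), which is one common realization of such replicator-style dynamics:

```python
import random

def next_generation(strategies, payoffs):
    """Copy strategies into the next generation in proportion to accumulated payoff."""
    shift = min(payoffs)
    weights = [p - shift + 1e-9 for p in payoffs]  # fitter agents are copied more often
    return random.choices(strategies, weights=weights, k=len(strategies))
```

Over repeated generations, strategies held by high-payoff agents spread through the population while poorly performing strategies die out.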
Open Source Development Model: an Analysis
For the analysis of the framework presented in the previous section, several additional variables
need to be defined:
A_i : the average number of acquaintances per agent at round i

This variable represents the average network connectivity per agent at round i. Then A_0 is the average initial “core developer” clique size at the beginning of each generation.

q_i : the probability that a potential developer knows the reputation score of the feature requester at round i

In terms of A_i, q_i can be expressed as:

    q_i = A_{i-1} / (n - 1)                                        (1)
i.e., the likelihood of knowing the feature requester at round i is a function of the average number of
acquaintances for the developer from the previous round over n-1 possible acquaintances one can have in
a population of size n. The average network connectivity per agent at round i can be expressed as the
following recurrence:
    A_i = A_{i-1} + (1 - q_i) A_{i-1} / (2n)                       (2)
This relation shows that a new link is created between two agents in every round of the game only if the developer does not know the reputation score of the feature requester, in which case the developer is added to the acquaintance set of the feature requester, resulting in as many as A_{i-1} new links.
Rewriting A_i using equation (1), the following can be derived:

    A_i = A_{i-1} + (1/(2n)) A_{i-1} - (1/(2n(n-1))) A_{i-1}^2     (3)

Substituting X_i = A_i / ((2n+1)(n-1)), equation (3) can be expressed in the following canonical form:

    X_i = (1 + 1/(2n)) X_{i-1} (1 - X_{i-1})                       (4)
which is the familiar logistic equation.
As n → ∞, the growth rate 1 + 1/(2n) → 1. Therefore, for n > 1 the system is stable with the nontrivial fixed point A* = n - 1. This is the maximum network connectivity per agent, i.e., the maximum number of acquaintances an individual can have in a population of size n.
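Iterating the connectivity recurrence, equations (1) and (2), confirms convergence to the fixed point A* = n - 1 numerically. The parameter values here are illustrative, not taken from the paper:

```python
def connectivity(n, A0, rounds):
    """Iterate q_i = A_{i-1}/(n-1) and A_i = A_{i-1} + (1 - q_i) A_{i-1}/(2n)."""
    A = A0
    for _ in range(rounds):
        q = A / (n - 1)                # probability of knowing the requester
        A = A + (1 - q) * A / (2 * n)  # new links form only on unknown requesters
    return A

# The average connectivity saturates at the maximum, n - 1:
n = 50
assert abs(connectivity(n, A0=5, rounds=20000) - (n - 1)) < 1e-6
```

The trajectory is logistic: early rounds add acquaintances almost freely, while later rounds add few because most requesters are already known.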
Now consider a population of size n consisting of two types of agents: defecting developers who never help the feature requester and discriminators who only help agents with good reputation. (Discriminators are those with strategy k = 0.) Let the frequencies of these populations be:
x : the frequency of discriminators
y : the frequency of unconditional defectors
For a discriminating developer, a feature requester has a good reputation, G, if he is known to have
cooperated the last time; otherwise he has a bad reputation, B. We modify this model by imposing a
social structure on acquaintance relations. If discriminating developers do not learn new information
about their feature requesters by asking their acquaintances, they always cooperate. This means that in
the absence of information, defectors can be mistaken for good scorers by discriminators. Therefore at
the beginning of each generation, when no information about agents’ behaviors is available, all agents are
assumed to be good scorers. Let A_x(i) and A_y(i) denote the respective payoffs for discriminators and
defectors at round i. These two quantities can be expressed as:
A_y(i) = \frac{b}{2}\left\{ x\,(1 - q_i) + x\,q_i\,\frac{G_y(i)}{y} \right\}

A_x(i) = \frac{1}{2}\left\{ -c\,(1 - q_i + q_i g_i) + b\,x\,(1 - q_i) + b\,x\,q_i\,\frac{G_x(i)}{x} \right\}
where G_x(i) and G_y(i) are the respective frequencies of good-scoring discriminators and defectors at
round i. These frequencies can be expressed as:

G_x(i) = \frac{1}{2}\left\{ G_x(i-1) + x\,(1 - q_i + q_i g_{i-1}) \right\}

G_y(i) = y\,2^{1-i}
The differential payoff at round i for discriminators, D_x(i) = A_x(i) - A_y(i), is (using x + y = 1):

D_x(i) = \frac{1}{2}\left\{ -c\,(1 - q_i + q_i g_i) + b\,q_i\left(g_i - 2^{1-i}\right) \right\}
Here g_i = G_x(i) + G_y(i) is the proportion of good scorers at round i. Because everyone is assumed
to have a good reputation at the beginning of each generation:
g_1 = 1, \qquad g_i = \frac{1}{2}\left\{ g_{i-1} + x\,(1 - q_i + q_i g_{i-1}) \right\}, \quad 2 \le i \le m
The differential total expected payoff for discriminators is then:

P_x = \frac{1}{2}\left\{ -c\left[ m - \sum_{i=1}^{m} q_i\,(1 - g_i) \right] + b \sum_{i=1}^{m} q_i \left( g_i - 2^{1-i} \right) \right\} \qquad (5)
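The sums in equation (5) can be evaluated numerically by iterating the recurrences for q_i and g_i together. The sketch below is ours, and all parameter values are illustrative assumptions:

```python
# Evaluate P_x from equation (5): iterate q_i (equations (1)-(2)) and the
# g_i recurrence jointly, accumulating the per-round differential payoff.

def discriminator_advantage(n, x, b, c, a0, m):
    """Total differential payoff P_x for discriminators over m rounds."""
    a, g = float(a0), 1.0    # g_1 = 1: all agents start with good reputation
    total = 0.0
    for i in range(1, m + 1):
        q = a / (n - 1)                             # equation (1)
        if i > 1:
            g = 0.5 * (g + x * (1 - q + q * g))     # recurrence for g_i
        total += 0.5 * (-c * (1 - q + q * g) + b * q * (g - 2.0 ** (1 - i)))
        a = a + (a / (2 * n)) * (1 - q)             # equation (2)
    return total

# With no cost of helping (c = 0) the advantage is non-negative; raising c
# lowers it.
print(discriminator_advantage(n=100, x=0.5, b=1.0, c=0.0, a0=3.0, m=1000) > 0)
```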
For P_x > 0, we must have:

c\left\{ m - \sum_{i=1}^{m} q_i\,(1 - g_i) \right\} < b \sum_{i=1}^{m} q_i \left( g_i - 2^{1-i} \right) \qquad (6)
Divide both sides of inequality (6) by m to get:
c\left\{ 1 - \frac{1}{m}\sum_{i=1}^{m} q_i\,(1 - g_i) \right\} < \frac{b}{m} \sum_{i=1}^{m} q_i \left( g_i - 2^{1-i} \right) \qquad (7)
To interpret inequality (7) in an intuitive manner, note that the following ratio is the average per-round
probability of knowing and helping a good-scoring discriminator, i.e., the probability of a
discriminator making the right decision:

\frac{1}{m}\sum_{i=1}^{m} q_i \left( g_i - 2^{1-i} \right)

Similarly, the following ratio is the average per-round probability of knowing and not helping a bad
scorer:

\frac{1}{m}\sum_{i=1}^{m} q_i\,(1 - g_i)
Hence, the term 1 - \frac{1}{m}\sum_{i=1}^{m} q_i\,(1 - g_i) describes the average per-round probability of knowing and
helping a bad scorer, i.e., the probability of a discriminator making the wrong decision. Inequality (7)
then asserts that for discriminators to outperform the defectors, the average per-round benefit received by a
good-scoring discriminator must outweigh the average per-round cost to a discriminator of helping a bad
scorer. Therefore trust pays off if it is based on information and placed upon the trustworthy.
If we rewrite inequality (7) in terms of the cost-to-benefit ratio, we derive that for discriminators to be
evolutionarily stable we must have:

\frac{c}{b} < \frac{\dfrac{1}{m}\sum_{i=1}^{m} q_i \left( g_i - 2^{1-i} \right)}{1 - \dfrac{1}{m}\sum_{i=1}^{m} q_i\,(1 - g_i)} \qquad (8)
That is, the ratio of the probability of knowing and helping a good-scoring discriminator to that of
knowing and helping a bad scorer must exceed the cost-to-benefit ratio.
Note that inequality (8) can be used to illustrate the calculated motive behind voluntary actions as
described by Smith (1983) and Murnighan, et al. (1993), and behind open source developers as
postulated by Lerner and Tirole (2000). The calculation here does not involve explicit benefits to an
agent; rather, it is a function of reputation scores and the probability of knowing other agents in a community.
The voluntary strategy of cooperating with the requester is robust only if the cost-to-benefit condition
on reputation and group affiliation in inequality (8) is satisfied.
A property of the framework outlined here is that the underlying network of trust relations is itself
evolving as players interact. Hence, A_i and q_i change over time. As the game continues, therefore,
the threshold on c/b increases (see inequality (8)). Under such a changing environment the payoff to
cooperative strategies may be negative at the beginning, but over time they may recover and
even outperform the exploiters. This is governed by two parameters in the system: the initial clique
size, A_0, and the number of rounds, m. The effect of varying the initial clique size on the long-term
outcome of the game will be demonstrated in a set of simulations in the next section. But even under a
fixed initial clique size, the probability of knowing the reputation of an opponent through one's
acquaintances increases over time – following equations (2)-(4).
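The threshold on c/b in inequality (8) can be computed under the same recurrences. The function below is our sketch, with illustrative parameter values:

```python
# Compute the right-hand side of inequality (8): the critical cost-to-benefit
# ratio below which discriminators outperform defectors.

def critical_cost_benefit(n, x, a0, m):
    """RHS of inequality (8) for a game of m rounds."""
    a, g = float(a0), 1.0
    right, wrong = 0.0, 0.0
    for i in range(1, m + 1):
        q = a / (n - 1)                           # equation (1)
        if i > 1:
            g = 0.5 * (g + x * (1 - q + q * g))   # recurrence for g_i
        right += q * (g - 2.0 ** (1 - i))   # knowing and helping a good scorer
        wrong += q * (1 - g)                # knowing and not helping a bad scorer
        a = a + (a / (2 * n)) * (1 - q)           # equation (2)
    return (right / m) / (1 - wrong / m)

# The critical ratio is a proper probability ratio, strictly between 0 and 1.
print(0 < critical_cost_benefit(n=100, x=0.5, a0=3.0, m=1000) < 1)
```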
Social Information: Simulation Results
Simulation results in this section are based on the model proposed and analyzed earlier in the
paper. Figure 1 shows that the probability q_i of knowing a randomly selected agent's reputation score
increases with round i. Initially a population of n=100 individuals is divided into non-overlapping "core
developer" cliques of size s=4. As the game continues, agents meet and make new acquaintances.
The average number of acquaintances per player increases with every round of the game, and therefore
the probability of knowing the opponent's reputation also increases. Here a generation consists of 10,000
rounds. The rise in the probability of familiarity is consistent with the analytic result in equation (2) in the
previous section – a trivial result illustrating that socialization increases one's number of acquaintances.
[Figure 1 omitted from transcript: Pr(knowing co-player image score) rising from 0 toward 1 over 9000 rounds.]
Figure 1. Dynamics of acquaintanceship.
[Figure 2 omitted from transcript: four histograms of strategy frequency over k = -5 to 6, for generations t = 1, t = 10, t = 200, and t = 1000.]
Figure 2. Frequency counts (averaged over 50 runs each) for the number of agents with strategy k at generation t. Developer agents choose to cooperate or defect (on feature requests) based on the reputation score of the feature requesters.
Shown in Figure 2 are aggregate results based on 50 simulation runs, each running for t
generations. Every generation consists of a fixed number of agents (n = 100) and a fixed number of
rounds, m = 10n. Under these assumptions, the likelihood of a pair of individuals meeting more than once
is very small – making the need to gather reputation information from one's peers all the more important
for decision making. As mentioned previously, strategy k ranges from -5 to 6, where k = -5 represents
unconditional cooperators, k = 6 represents complete defectors, and k = 0 represents the most discriminating
agents. Agents with strategy k < 0 are more cooperative, while agents with strategy k > 0
are more willing to defect. The reputation scores range from -5 to 5. A potential
developer j volunteers to satisfy a request only if the reputation score s_ij of the feature requester i is at
least as large as j's strategy: s_ij >= k_j. Agents in new generations copy the successful strategies 10 of their
former generation's agents – subject to a copy error rate of 0.001. For developer agents, Figure 2 shows
that as the number of generations increases, those who are willing to volunteer to fulfill development
requests are more likely to have their strategies copied by other agents.
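The decision rule and the end-of-generation imitation step described above can be sketched as follows. The function names are ours, and fitness-proportional copying is our assumption about how "successful strategies" are selected for imitation:

```python
import random

def helps(s_ij, k_j):
    """Developer j volunteers iff the requester's score s_ij >= j's strategy k_j."""
    return s_ij >= k_j

def next_generation(strategies, fitness, copy_error=0.001):
    """New agents copy strategies in proportion to fitness, with rare copy errors."""
    shifted = [f - min(fitness) + 1e-9 for f in fitness]   # make weights positive
    copied = random.choices(strategies, weights=shifted, k=len(strategies))
    # With probability copy_error, a copied strategy mutates to a random k.
    return [random.randint(-5, 6) if random.random() < copy_error else k
            for k in copied]

print(helps(3, 0), helps(-2, 0))  # True False
```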
[Figure 3 omitted from transcript: mean agent reputation (range -6 to 6) plotted against strategy k = -5 to 6.]
Figure 3. A sample run's average agent reputation for each type of interaction strategy (k) at the end of the first generation (t = 1) in Figure 2.
How do the reputation scores for agents with various strategies vary in the experiment depicted in
Figure 2? Figure 3 plots the average reputation score for agents with the same strategy value k at the end
of the first generation – where there are roughly equal proportions of agents with each strategy k (see the
top-left graph in Figure 2).11 The more cooperative agents (with k < 0) have a higher average reputation
than those whose strategies are less cooperative (with k > 0).
Figure 4 shows the results of computer simulations for a population of n individuals with an initial
"core developer" clique size s=4. We sampled the frequency distribution of strategies over 10^6
generations for population sizes n=52, n=100, and n=200. All other parameters are as in the experiment
in Figure 2. As observed by Diekmann (1985), Murnighan, et al. (1993) and others, voluntary actions
among social agents are expected to drop as the population size increases. Figure 4 shows that as the
population size increases, the relative proportion of cooperative agents versus defecting ones decreases.
In other words, in larger populations, a smaller proportion of agents is willing to volunteer to fulfill
development requests.
10 "Success" refers to the fitness of an agent as measured by his individually accumulated benefits (c.f. the previous section's discussion).
11 Comparisons of agents at later generations become less statistically significant as the distribution of strategies becomes increasingly skewed toward the more robust strategies (c.f. Figure 2). This is particularly the case for positive k.
[Figure 4 omitted from transcript: three histograms of strategy frequency over k = -5 to 6, for population sizes n=52, n=100, and n=200.]
Figure 4. Shifting proportions of voluntary agents with varying population sizes (n). Each simulation has clique size s=4.
New generations of agents copy neither the reputation scores of their parents nor their parents'
group structure. They copy only the strategy of their parents, subject to an error rate as mentioned above. At
the beginning of each generation, players are grouped into distinct cliques of the same size. The game is
played for many generations to subject the population to selective pressure. We say that cooperation is
established if the average winning strategy (k) for all individuals at the end of the game is less than or
equal to zero. We find that under our framework, cooperation evolves and is sustained even in larger
populations after the game is played for many generations (c.f. Figure 4). The effect of the initial clique
size (of "core developers"), however, is less discernible in larger populations (see the simulation result for
n=200 in Figure 4).
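The stopping criterion above is simple to state in code; this helper is a direct transcription of the definition:

```python
def cooperation_established(strategies):
    """Cooperation is established if the mean winning strategy k is <= 0."""
    return sum(strategies) / len(strategies) <= 0

print(cooperation_established([-3, -1, 0, 2]))  # True: the mean is -0.5
```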
Figure 5 shows the result of the simulation for a population of 100 individuals for varying initial
clique sizes. With more initial acquaintances, voluntary actions seem to emerge and be sustained faster.
When information is scarce, in the absence of a dynamically evolving network of acquaintances,
cooperators (players with k <= 0) stand less of a chance against defectors. As soon as we allow
channels of information to evolve by letting individuals make new acquaintances as they interact, the
likelihood of disseminating information about players' reputations increases, thereby increasing the
likelihood that discriminators will discriminate rightfully against exploiters, as well as in favor of
cooperators. This is why voluntary actions and cooperators can evolve under this scenario despite the
fact that discriminators are the only ones who can make use of information. The emergence of
voluntary actions in our open source development model is thus a consequence of informed
discrimination.
[Figure 5 omitted from transcript: three histograms of strategy frequency over k = -5 to 6, for initial clique sizes 1, 3, and 5.]
Figure 5. Emergence of cooperation with varying initial clique size. Each simulation has population size n=100.
Discussion
By giving software away for free, open source developers are surely not irrational, nor are they
purely altruistic for the good of humankind. We have explained the seeming altruism of open source
developers in terms of their concern for reputation and group identification. In addition, we have
developed a model illustrating how voluntarism in open source projects can emerge among rational,
self-interested agents. In our model, developers' ability to process complex social information by way of
reputation propagation through acquaintance networks plays a critical role in the emergence of
voluntary actions within open source projects. The topology of the underlying social network is allowed
to be dynamically modified as members change their interaction patterns.
In addition to simulation results, we have analytically derived the conditions for the emergence
of voluntary actions based on reputation inference through acquaintance sets. Our results suggest this
emergence depends on the implicit benefit received by developers (inequality (8)), which must outweigh
the cost of trusting and helping a random feature (or bug-fix) requester.
Observations
Comparing the outcome of our simulations to field observations in the open source world, we
discern a few interesting patterns. We observe a noticeable impact of the number of generations (t) on the
emergence of voluntary actions. A generation models the equivalent of an open source project's version
release. As opposed to closed source software, open source software releases new versions often. In fact,
"release early and release often" has become an unofficial motto of the open source philosophy (Raymond,
1999). This strategy is at odds with traditional methods of development, where release cycles are
often long and far between. For open source projects such as Gaim, we observe releases of over
eighty versions over a period of five years (Sourceforge, 2004). This trend is common to many projects.
From our results, we can hypothesize that a higher number of generations gives members of the
community opportunities to re-align their strategies, either to imitate successful developers or to form
new groups. As the number of generations increases, our simulation results indicate that
voluntary actions are more likely to emerge.
With respect to the influence of population size on the strategy of cooperation, we find that in
most successful projects, such as the Mozilla browser or the Apache web server, the size of the
core developer group is small relative to the overall community. For example, the Mozilla project
has approximately 25 core developers (clique size) and over 400 patch submitters (Dotzler, 2003).
Similarly, Apache has about 25 core developers and over 430 patch submitters. Combined with our
results, these observations on successful open source projects seem to indicate that by maintaining a
smaller population of core developers interacting with a large user population, these projects are able to
sustain the developers' voluntary tendency.
Clique size corresponds to the size of the core developer group in a given open source project. Larger clique sizes
result in faster emergence of volunteering among the developers. Continuing with the example of Gaim,
the project was initiated by one developer and has since grown to a core developer group of twelve
over a period of five years. Gaim is also rated as one of the top SourceForge open source projects
in terms of activity (93rd percentile or above since April 2000) on its mailing list and code repository.
This is evidence of a high degree of cooperation and socialization in the Gaim project community.
Similarly, in the case of PHPmyAdmin, the project began with one developer in 1998 and grew to a team of
seven core developers. Activity for the PHPmyAdmin project has been rated in the 97th percentile or
higher since 2001 (Sourceforge, 2004). These observations lend support to our results on how social
groups (cliques) can lead to the emergence of voluntarism in successful open source projects. By
studying the top ten projects on SourceForge today, we find that most of them have
between three and twenty developers. These core developer group sizes are extremely small compared to the
size of the community (population) itself. We postulate that only by keeping clique sizes small can
open source projects maintain the voluntary nature of developer participation.
Conclusion
The model and analyses presented here challenge the common belief that altruism drives
developers' participation in open source projects (Raymond, 1999). We have developed a multi-agent model
of open source development focusing on how reputation and social groups influence the emergence of
voluntarism among open source developers. We have also analytically derived conditions under which voluntary
actions among rational, self-interested developers can emerge. Our results seem to correlate well with
field observations on the less lofty motivators of open source developers – in particular, reputation and
group involvement (von Hippel and von Krogh, 2003; Lerner and Tirole, 2000; among others). This
paper has illustrated how such implicit and indirect incentives as reputation gain and group identification
can have a powerful influence in motivating voluntary actions in software development.
Clearly, our results pave only a first step toward a more realistic model of the open source
development process. Nevertheless, by explicitly modeling reputation and group involvement, our model
permits a quantitative discussion of how important different model parameters are in sustaining voluntary
actions in open source projects. Future work is needed to extend our model to cover overlapping
generations for projects of different lengths and group sizes, agents with different types of implicit and
delayed payoffs, varying levels of coordination within and across projects, more sophisticated
mechanisms for reputation updating, etc. We would also like to study how the size of the core developer
group influences the emergence and decay of voluntarism in open source projects, and thus the success
rates of these projects.
References
Allen, N. J., J. P. Rushton (1983) “Personality Characteristics of Community Mental Health Volunteers: a
Review,” Journal of Voluntary Action Research, 12, pp. 36-49.
Andreoni, J. (1995) “Cooperation in Public-Goods Experiments: Kindness or Confusion,” The American
Economic Review, 85 (4), pp. 891-904.
Axelrod, R. (1984) The Evolution of Cooperation. New York: Basic Books.
Axelrod, R., W. D. Hamilton (1981) "The Evolution of Cooperation," Science, 211, pp. 1390-1396.
Batson, C. D. (1987) “Prosocial Motivation: Is it ever Altruistic?” in L. Berkowitz (ed.) Advances in
Experimental Social Psychology, 20, New York: Academic Press, pp. 65-122.
Berkowitz, L., L. R. Daniels (1964) “Affecting the Salience of the Social Responsibility Norm: Effects of
Past Help on the Response to Dependency Relationships,” Journal of Abnormal and Social
Psychology, 68, pp. 275-281.
Bergquist, M., J. Ljungberg (2001) "The Power of Gifts: Organizing Social Relationships in Open Source
Communities," Information Systems Journal, 11, pp. 305-320.
Busch, M. L., E. R. Reinhardt (1993) “Nice Strategies in a World of Relative Gains,” Journal of Conflict
Resolution, 37 (September), pp. 427-445.
Darley, J., B. Latane (1968) “Bystander Intervention in Emergencies: Diffusion of Responsibility,”
Journal of Personality and Social Psychology, 10, pp. 215-221.
Diekmann, A. (1985) “The Volunteer Dilemma,” The Journal of Conflict Resolution, 29 (4), pp. 605-610.
Dotzler, A. (2003) “Getting involved with Mozilla: The people, the tools, the process”, LUGoD lecture
series, Davis CA. http://lugod.org/meetings/past/2003.03.04.php
Hamilton, W. D., (1963). “The Evolution of Altruistic Behavior.” American Naturalist. 97: 354-356.
Hamilton, W. D. (1964) "The Genetical Evolution of Social Behavior," J. Theor. Biol., 7, pp. 1-16.
Hardin, R. (1982) Collective Action. Baltimore, MD: Johns Hopkins University Press.
von Hippel, E., G. von Krogh (2003) “Open Source Software and the ‘Private-Collective’ Innovation
Model: Issues for Organization Science,” Organization Science, 14 (2), pp. 209-223.
Hoffman, M. L. (1981) “Is Altruism Part of Human Nature?” Journal of Personality and Social
Psychology, 40, pp. 121-137.
Katz, D. (1964) “The Motivational Basis of Organizational Behavior,” Behavioral Science, 9, pp. 131-
146.
Kessler, R. C. (1975) "A Descriptive Model of Emergency On-Call Blood Donation," Journal of
Voluntary Action Research, 4, pp. 159-171.
Kramer, R. M. (1993) “Cooperation and Organizational Identification,” in J. K. Murnighan (ed.) Social
Psychology in Organizations: Advances in Theory and Research, Englewood Cliffs, NJ: Prentice
Hall, pp. 244-268.
Von Krogh, G. (2002) “The Communal Resource and Information Systems,” Journal of Strategic
Information Systems, 11 (2), pp. 85-107.
Latane, B., J. M. Darley (1970) The Unresponsive Bystander: Why Doesn’t He Help? New York:
Appleton-Century-Crofts.
Lerner, J., J. Tirole (2000) “The Simple Economics of Open Source,” NBER Working Paper No. w7600.
http://www.nber.org/papers/w7600.
Lichbach, M. I. (1996) The Cooperator's Dilemma, Ann Arbor, MI: The University of Michigan Press.
Mockus, A., R. Fielding, J. Herbsleb (2000) "A Case Study of Open Source Software Development:
The Apache Server," Proceedings of the International Conference on Software Engineering (ICSE),
Limerick, Ireland.
Mohtashemi, M., L. Mui (2003) "Evolution of Indirect Reciprocity by Social Information: the Role of
Trust and Reputation in Evolution of Altruism," Journal of Theoretical Biology, 223 (4), pp. 523-531.
Mui, L. (2003) Computational Models of Trust and Reputation: Agents, Evolutionary Games, and Social
Networks, Ph.D. Dissertation, EECS, Massachusetts Institute of Technology (MIT).
Murnighan, J. K., J. W. Kim, A. R. Metzger (1993) “The Volunteer Dilemma,” Administrative Science
Quarterly, 38 (4), pp. 515-538.
Nowak, M. A., K. Sigmund (1998) “Evolution of Indirect Reciprocity by Image Scoring,” Nature,
393:6685, pp. 573-577.
Olson, M. (1965) The Logic of Collective Action: Public Goods and the Theory of Groups. Cambridge,
MA: Harvard University Press.
Ostrom, E. (2000) “Collective Action and the Evolution of Social Norms,” Journal of Economic
Perspective, 14 (3), pp. 137-158.
Rapoport, A. (1988) "Experiments with N-Person Social Traps I: Prisoner's Dilemma, Weak Prisoner's
Dilemma, Volunteer's Dilemma, and Largest Number," The Journal of Conflict Resolution, 32 (3), pp.
457-472.
Raymond, E. (1999) The Cathedral and the Bazaar: Musings on Linux and Open Source by an Accidental
Revolutionary, Sebastopol, CA: O'Reilly.
Regan, D. T., J. Totten (1975) “Empathy and Attribution: Turning Observers into Actors,” Journal of
Personality and Social Psychology, 32, pp. 850-856.
Sharp, E. B. (1978) “Citizen Organizations on Policing Issues and Crime Prevention: Incentives for
Participation,” Journal of Voluntary Action Research, 7, pp. 45-58.
Smith, D. H. (1983) “Altruism, Volunteers, and Volunteerism,” Journal of Voluntary Action Research,
12, pp. 21-36.
Sourceforge.net (2004) “The world's largest Open Source software development website”, Available on
the web at http://sourceforge.net
Stinson, T. F., J. M. Stam (1976) “Toward an Economic Model of Voluntarism: the Case of Participation
in Local Government,” Journal of Voluntary Action Research, 5, pp. 52-60.
Trivers, R. L. (1971) “The Evolution of Reciprocal Altruism,” Quarterly Review of Biology, 46, pp. 35-
57.
Tullock, G. (1995) “Comment: Rationality and Revolution,” Rationality and Society, 7 (January), pp.
116-120.
de Waal, F. B. M., L. M. Luttrell (1988) "Mechanisms of Social Reciprocity in Three Primate Species:
Symmetrical Relationship Characteristics or Cognition?" Ethology and Sociobiology, 9, pp. 101-118.
Williams, G. C. (1971) Group Selection. Chicago: Aldine-Atherton.
Wilson, D. S., E. Sober (1994) "Reintroducing Group Selection to the Human Behavioral Sciences,"
Behavioral and Brain Sciences, 17, pp. 585-654.