Group selection and social preferences∗
Jörgen Weibull† and Marcus Salomonsson‡
Stockholm School of Economics, Box 6501, SE - 113 83 Stockholm, Sweden
April 15, 2005
Abstract
Suppose that a large number of individuals are randomly matched into groups
where each group plays a finite symmetric game. Individuals breed true accord-
ing to their individual material payoffs, but the expected number of surviving
offspring may depend on the material payoff vector to the whole group. We show
that the mean-field equation for the induced population dynamic is equivalent to
the replicator dynamic for a game with payoffs derived from those in the origi-
nal game. We apply this selection dynamic to a number of examples, including
prisoners’ dilemma games, coordination games, hawk-dove games, a prisoners’
dilemma with a punishment option, and common-pool games. For each of these,
we provide conditions under which our selection dynamic leads to other outcomes
than those obtained under the usual replicator dynamic. By way of a revealed-
preference argument, we show how our selection dynamic can explain certain
stable behaviors that are consistent with individuals having social preferences.
Keywords: Group selection, social preferences, altruism, fairness.
∗We thank Milo Bianchi, Olof Leimar and participants in the conference on evolutionary game dynamics at PED, Harvard, November 2004, for comments on an earlier draft of this manuscript. We are also grateful to Bill Sandholm who provided the software we used to construct Figures 4 and 5, available at http://www.ssc.wisc.edu/~whs/. Marcus Salomonsson thanks the Wallenberg Foundation for financial support of his research.
†Corresponding author. E-mail: [email protected]. Phone: +46 8 736 92 04. Fax: + 46 8 3132 07.
‡E-mail address: [email protected].
1 Introduction
One of the longest standing controversies in evolutionary game theory has been the
group selection controversy. The group selection idea, which traces its origins all the
way back to Darwin, essentially says that groups with internal cooperation will be more
successful than other groups, and that this may cause altruistic behaviors – individual
sacrifices for the common good of the group – to survive and in some circumstances
thrive:
“There can be no doubt that a tribe including many members who, from pos-
sessing in a high degree the spirit of patriotism, fidelity, obedience, courage,
and sympathy, were always ready to give aid to each other and to sacri-
fice themselves for the common good, would be victorious over most other
tribes, and this would be natural selection.” (Darwin, 1871, page 166.)
The controversy was long believed to have been finally settled after an exchange
between Wynne-Edwards (1962) and Maynard Smith (1964). The exchange was ignited
by Wynne-Edwards, who argued in favor of group selection. His argumentation was
informal and based on examples. In response, Maynard Smith argued that Wynne-
Edwards’ examples were explicable without reference to group selection, and then went
on to formulate what a more precise model of group selection might look like. Based on
this model sketch, called the haystack model, Maynard Smith dismissed group selection.
In the haystack model, groups are randomly reshuffled at given time intervals. Be-
tween each such reshuffle, a one-shot prisoners’ dilemma game is played recurrently in
every group. A crucial assumption is that the population state in every group converges
to a limit state before groups are reshuffled. The process is thus adiabatic: individual
selection within groups is an order of magnitude faster than group selection. This fea-
ture of the model implies that all cooperators in mixed groups become extinct before
it is time to reshuffle the groups. Only cooperators in groups that consist exclusively
of cooperators will survive. The fact that such groups must be pure, and that they
must stay isolated for long periods of time, led Maynard Smith to conclude that cir-
cumstances for group selection to be effective were so special that group selection was
unlikely to play an important role.
Although this model suggested only that group selection was unlikely,
not that it was impossible, it was viewed as a resounding rejection of the concept. Consequently, after
the exchange between Maynard Smith and Wynne-Edwards, and after a passionate criticism of
the concept by Williams (1966), the group selection idea all but disappeared from the
evolutionary literature.1 When it was mentioned, it was rather as a cautionary tale
of how evolutionary selection does not work. In later years, however, group selection
has had a vivid revival. The literature has in fact become much too large to be fairly
treated here. Surveys of the group selection literature are given in Bergstrom (2002)
and Wilson and Sober (1994), and recent contributions, with extensive discussions, are
given in Kerr and Godfrey-Smith (2002) and Henrich (2003).
The aim of this study is not to provide arguments for or against group selection, but
instead to suggest a parsimonious, operational and simple population selection model
that allows for both individual and group selection, without the adiabatic assumption
of the haystack model. In a nutshell, our model is as follows. A large population of
individuals are randomly matched into groups. The interaction in each group takes the
form of a finite symmetric game. The game can be simple or complex, and may consist
of one or many stages – as, for example, in finitely repeated games. All individuals
play pure strategies in their group. The play of the game in a group results in material
payoffs to the group members. Each individual breeds true, and the expected number
of offspring depends on the individual’s own material payoff. All offspring are subject to
an exogenous hazard, such as infectious diseases, harsh weather conditions, or attacks
from predators. The expected share of survivors among the offspring in a group may
depend on all material payoffs in the group. As a canonical example of this, we will
assume that the expected share of survivors is proportional to the sum of material
payoffs. However, we will also consider other functional forms, such as the minimum or
product of material payoffs in the group. The key assumption in our model is weaker:
all that matters is that the expected number of surviving offspring may depend, in part,
on other group members’ material payoffs. Such dependence seems likely in situations
where groups, and their offspring, stay together for some length of time. We here take
this dependence as a primitive, although this, in its turn, may have arisen from the
interplay between material production and reproduction conditions as well as forms
of social interaction, “institutions”, and habits, of the population under study. For
instance, as humans turned from hunting and gathering to agriculture, the form of
dependence most likely changed.
We show that the mean-field equation for the induced stochastic population process
is identical with the Taylor and Jonker (1978) replicator dynamic for a certain derived
game. The payoffs in this derived game are functions of the vector of individual material
1Wilson (1983) gives a more detailed description of this period, from a proponent’s point of view.
payoffs. Relying on established results for the replicator dynamic, predictions for long-
run population states can then be made. In particular, if the dependence of survival
probabilities on other group members’ material payoffs is sufficiently strong, cooperation
among group members may emerge in the long run.
We illustrate the implications of our approach by way of a number of examples,
including prisoners’ dilemma games, coordination games, hawk-dove games, a prisoners’
dilemma with possibility to punish a defector, and common-pool games. In particular,
for each of these games we provide conditions under which our selection dynamic leads
to other outcomes than those obtained under pure individual selection. As expected,
the effect of group selection is to promote behaviors that benefit the common good for
the group. However, in some games, and for certain survival functions, the effect is too
weak to cause any change of the long-run outcome.
Utility theory in economics is based upon a revealed-preference principle; human
behavior is interpreted as the result of rational choice according to some underlying
binary preference relation over outcomes, or, more generally, lotteries over outcomes.
If choice behaviors meet certain regularity conditions with respect to variations of the
set of alternatives, there exists a utility function for the decision-maker such that his
or her behavior is consistent with the maximization of the expected value of that func-
tion. Such a mathematical representation allows for powerful analysis and prediction
of behaviors in new environments.
By way of a similar revealed-preference argument, here applied to the asymptotic
population behavior under our selection dynamic, we argue that the results of selection,
in some situations, allow for the interpretation that individuals are rational decision-
makers with utility functions given by the payoffs of the derived game, and even that
this rationality and those preferences are common knowledge among all individuals in
the population. If aggregate population behavior converges in our selection dynamic,
then the limit population state will correspond to a symmetric Nash equilibrium of the
derived game. It is then as if individuals, on top of the above-mentioned rationality, had
consistent expectations about each other’s behaviors. Moreover, the payoffs in the derived
game in general depend on all players’ material payoffs, so the revealed preferences are
“other-regarding” or “social” – typically combining a concern for one’s own material
payoff with some concern for the material well-being of others. In this limited sense,
the present model provides an evolutionary underpinning of the hypothesis of game
theoretic rationality combined with social preferences – a common hypothesis in much
of modern “behavioral” game theory. Our approach also suggests a certain class of
social preferences, to the best of our knowledge not studied before, where an individual’s
utility is the product of an individualistic utility function and a social welfare function.
The rest of the paper is organized as follows. The model is formalized in section 2 and
applied to examples in section 3. Section 4 briefly discusses the evolutionary asymmetry
between rewards and punishments. Implications for “as if” rationality and social
preferences are discussed in section 5. Related literature is discussed in section 6, and
section 7 concludes.
2 Model
Consider a finite and symmetric two-player game G with pure strategy set S = {1, 2, ..., m} and payoff matrix Π = (πhk), where πhk is the material payoff to pure strategy h ∈ S
when played against pure strategy k ∈ S. Let u (x, y) denote the expected material
payoff to mixed strategy x ∈ ∆ (S) when played against mixed strategy y ∈ ∆ (S):
\[ u(x, y) = \sum_{h \in S} \sum_{k \in S} x_h \pi_{hk} y_k \tag{1} \]
Suppose this game is played recurrently in randomly matched groups of size 2, drawn
from a finite population in which every individual is “programmed” to play a certain
pure strategy. Let N (t) be the population size at time t, and for each pure strategy
h ∈ S, let Nh (t) be the number of “h-strategists” in the current population. At times
t = 0,∆, 2∆, ..., where ∆ > 0, [N (t) /2] groups of size 2 are randomly formed.2 For
each individual, all matches with others are equally likely.
In each such time period, every group plays the game G once, and each individual
breeds true; all offspring inherit their single parent’s pure strategy.3 The expected
number of surviving offspring to a h-strategist in a group where the other member
plays k ∈ S is ∆ · φ (πhk, πkh), where φ : R² → R+. Hence, φ (πhk, πkh) is the fitness of pure strategy h against pure strategy k. At the end of each period, all surviving
individuals from all groups are brought together, and a fixed fraction ∆ · γ ≥ 0 die, where γ ≥ 0 is the common death rate (this rate turns out to play no role). We have
2Here [x] denotes the integer part of a real number x, the largest integer not exceeding x. If N (t) is odd, then one individual is not assigned to any group. The focus is here on large populations, and it is then immaterial what happens to the left-out individual. For the sake of definiteness, assume that such an individual does not reproduce.
3The time period ∆ may be long, say a year, and the interaction may take the form of a finitely repeated game, say a stage game played each day of the year.
in mind, as a canonical example, multiplicative fitness functions where the first factor
is an increasing function of own material payoff, representing the number of offspring,
and the second is an increasing function of some aggregate of both group members’
material payoffs, representing the survival probability of each offspring in the group.
This model easily generalizes to a single population playing a finite and symmetric
n-player game (see below), and also to n populations, one for each player role, playing
a finite n-player game. In all these cases, a group is defined as a random match between
n individuals. In the first case, the fitness function φ : Rⁿ → R+ is symmetric in the sense that it is invariant under permutations of other players’ payoffs.
2.1 The induced selection dynamic
The mean-field equation for the induced stochastic population process can be derived
as follows. Assume that the population is non-extinct in some period t ∈ {0, ∆, 2∆, ...}. For every pure strategy h ∈ S, let xh (t) denote the population share of h-strategists:
xh (t) = Nh (t) /N (t). For each pure strategy h and an even number N (t) of individuals,
the expected number of h-strategists in the next period is
\[ E\left[N_h(t+\Delta) \mid N_1(t), ..., N_m(t)\right] = \left(1 - \Delta\gamma + \Delta \sum_{k \in S} \frac{N_k(t) - \delta_{hk}}{N(t) - 1}\, \phi(\pi_{hk}, \pi_{kh})\right) N_h(t), \tag{2} \]
where δhk is Kronecker’s delta.4 For N (t) large, we thus have
\[ \frac{E\left[N_h(t+\Delta) \mid N_1(t), ..., N_m(t)\right] - N_h(t)}{\Delta} \approx \left[\sum_{k \in S} x_k(t)\, \phi(\pi_{hk}, \pi_{kh}) - \gamma\right] N_h(t). \tag{3} \]
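As an illustrative check, not part of the original derivation, the following Python sketch evaluates the right-hand side of (2) and compares the implied growth rate with the large-population approximation (3). The function names, the 2 × 2 fitness matrix and the values of ∆ and γ are assumptions chosen only for illustration.

```python
import numpy as np

def expected_next_counts(counts, fitness, delta, gamma):
    """Right-hand side of (2), for an even population size N(t).

    fitness[h, k] stands for phi(pi_hk, pi_kh).
    """
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    m = len(counts)
    # Probability that an h-strategist meets a k-strategist:
    # (N_k - delta_hk) / (N - 1), since nobody is matched with themselves.
    match_prob = (counts[None, :] - np.eye(m)) / (total - 1.0)
    mean_fitness = (match_prob * fitness).sum(axis=1)
    return (1.0 - delta * gamma + delta * mean_fitness) * counts

def approx_growth(counts, fitness, gamma):
    """Right-hand side of (3): [sum_k x_k phi(pi_hk, pi_kh) - gamma] N_h."""
    counts = np.asarray(counts, dtype=float)
    shares = counts / counts.sum()
    return (fitness @ shares - gamma) * counts

# Hypothetical numbers, for illustration only.
counts = np.array([600_000.0, 400_000.0])
fitness = np.array([[8.0, 1.8], [7.2, 2.0]])
delta, gamma = 0.01, 0.5

exact = (expected_next_counts(counts, fitness, delta, gamma) - counts) / delta
print(exact, approx_growth(counts, fitness, gamma))
```

With a population of this size the two growth rates differ only by terms of order 1/N (t), which is the sense in which (3) approximates (2).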
Taking the limit ∆ → 0, and dividing through by N (t), we obtain the mean-field
equation
\[ \dot{x}_h = \left[\tilde{u}(e^h, x) - \tilde{u}(x, x)\right] x_h, \tag{4} \]
4That is, δhk = 1 if h = k, otherwise δhk = 0. For N (t) odd, the denominator is N (t) − 2 instead of N (t) − 1.
where e^h is the unit vector in direction h (that is, e^h_k = 0 for all coordinates k ≠ h and
e^h_h = 1) and ũ : [∆ (S)]² → R+ is defined by
\[ \tilde{u}(x, y) = \sum_{h \in S} \sum_{k \in S} x_h\, \phi(\pi_{hk}, \pi_{kh})\, y_k. \tag{5} \]
The function ũ is the derived payoff function associated with the pure-strategy payoff
matrix Π̃ = (π̃hk), where
\[ \tilde{\pi}_{hk} = \phi(\pi_{hk}, \pi_{kh}). \tag{6} \]
The selection dynamic (4) is thus nothing but the Taylor and Jonker (1978) repli-
cator dynamic for the derived game. In the special case when fitness is a positive
affine function of own material payoff only, φ (πhk, πkh) ≡ α + βπhk for some β > 0,
the selection dynamic (4) is proportional to the standard replicator dynamic, and thus
has identical solution orbits. The present model hence contains the usual model of
individual selection as a special case.
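The construction of the derived game and the special case of affine fitness can be sketched numerically. The Python code below is a minimal illustration under stated assumptions: the function names, the 2 × 2 material payoff matrix, the population state and the affine coefficients (α = 1, β = 2) are all hypothetical and serve only to show that the right-hand side of (4) under affine fitness equals β times the standard replicator right-hand side.

```python
import numpy as np

def derived_matrix(material, phi):
    """Entry (h, k) is phi(pi_hk, pi_kh), as in (6).

    phi must accept arrays and act elementwise.
    """
    material = np.asarray(material, dtype=float)
    return phi(material, material.T)

def replicator_rhs(shares, payoffs):
    """Right-hand side of the replicator dynamic: x_h [u(e^h, x) - u(x, x)]."""
    against_population = payoffs @ shares
    average = shares @ against_population
    return shares * (against_population - average)

# Hypothetical material payoffs and population state.
material = np.array([[2.0, 0.5], [3.0, 1.0]])
x = np.array([0.3, 0.7])

def affine(own, other):
    # Fitness depends on own material payoff only: alpha + beta * own.
    return 1.0 + 2.0 * own

standard = replicator_rhs(x, material)
derived = replicator_rhs(x, derived_matrix(material, affine))
print(derived, 2.0 * standard)
```

Because the two vector fields differ only by the positive factor β, they trace out the same solution orbits at different speeds.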
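2.2 2 × 2 games
Applied to a symmetric 2 × 2 game, our approach gives
\[ \tilde{\Pi} = \begin{pmatrix} \phi(\pi_{11}, \pi_{11}) & \phi(\pi_{12}, \pi_{21}) \\ \phi(\pi_{21}, \pi_{12}) & \phi(\pi_{22}, \pi_{22}) \end{pmatrix}. \tag{7} \]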
The best-reply correspondence, weak and strict dominance, risk dominance, and the
replicator dynamic are all unaffected by the addition or subtraction of a constant to a
column of the payoff matrix (see e.g. Weibull (1995)), so the derived game is equivalent
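in these respects to the normalized derived game
\[ \hat{\Pi} = \begin{pmatrix} \phi(\pi_{11}, \pi_{11}) - \phi(\pi_{21}, \pi_{12}) & 0 \\ 0 & \phi(\pi_{22}, \pi_{22}) - \phi(\pi_{12}, \pi_{21}) \end{pmatrix}. \tag{8} \]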
This game, and hence also the derived game, is a (strict) coordination (CO) game if
both diagonal entries are positive, a (strict) hawk-dove (HD) game if both diagonal
entries are negative, and a (strictly) dominance-solvable (DS) game if the diagonal
entries have opposite signs. In the case of a CO-game, the pure-strategy pair (1, 1)
strictly risk dominates the pure-strategy pair (2, 2) if and only if the first diagonal
entry, π̂11, exceeds the second, π̂22. In the case of a DS-game, the derived game Π̃, but not
necessarily the normalized derived game Π̂, is a prisoners’ dilemma (PD) game if and
only if the dominant pure strategy earns less against itself than the other pure strategy
earns against itself, in terms of derived payoffs. In the opposite case, we call the derived
game Π̃ an efficient dominance-solvable (ED) game.
The replicator dynamic in a generic symmetric 2 × 2 game converges from all initial states. Moreover, the limit point is a best reply to itself (the strategy of a symmetric
Nash equilibrium) if the initial state is interior. Strict CO-games have two attractors
– the whole population playing one of the two pure strategies – and their basins of
attraction are separated by the unique (but unstable) mixed Nash equilibrium strategy.
Strict HD-games have one attractor, the population mix defined by the unique mixed
Nash equilibrium strategy. Strict DS-games, finally, also have a unique attractor – the
whole population playing the dominant pure strategy – irrespective of whether this is
socially efficient or not.
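The classification above can be written as a short routine. The following Python sketch is illustrative only; the function name and the return labels are assumptions, and the input is any symmetric 2 × 2 payoff matrix (material or derived) given as nested lists.

```python
def classify_2x2(payoffs):
    """Classify a symmetric 2x2 game [[p11, p12], [p21, p22]] as CO, HD, PD or ED."""
    (p11, p12), (p21, p22) = payoffs
    d1 = p11 - p21  # first diagonal entry of the normalized game
    d2 = p22 - p12  # second diagonal entry of the normalized game
    if d1 == 0 or d2 == 0:
        return "non-generic"
    if d1 > 0 and d2 > 0:
        return "CO"
    if d1 < 0 and d2 < 0:
        return "HD"
    # Dominance solvable: strategy 1 is dominant if d1 > 0, otherwise strategy 2.
    dominant_self, other_self = (p11, p22) if d1 > 0 else (p22, p11)
    # PD if the dominant strategy earns less against itself than the other
    # strategy earns against itself; otherwise ED.
    return "PD" if dominant_self < other_self else "ED"

print(classify_2x2([[2, 0.5], [3, 1]]))   # PD
print(classify_2x2([[2, 1.5], [1, 1]]))   # ED
print(classify_2x2([[2, 0], [1, 1]]))     # CO
print(classify_2x2([[2, 2], [3, 1]]))     # HD
```

The four test matrices are hypothetical examples, one for each generic game type.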
3 Examples
We illustrate the present selection model by way of a few examples.
3.1 A family of 2 × 2 games
Consider symmetric 2 × 2 games with material payoffs
\[ \Pi = \begin{pmatrix} 2 & a \\ b & 1 \end{pmatrix}, \tag{9} \]
for arbitrary constants a and b. Such a game is an HD-game when a > 1 and b > 2, an
ED-game when a > 1 and b < 2, a CO-game when a < 1 and b < 2, and a PD-game
when a < 1 and b > 2. These conditions cut the (a, b)-plane into four regions, oriented
clockwise around the point (1, 2), see the straight cross in Figure 1 below.
Suppose that fitness is bilinear in own material payoff and in the group’s material
payoff sum. With vi denoting own material payoff and v−i that of the other individual:
φ (vi, v−i) = vi (vi + v−i) . (10)
Such a fitness function arises if the expected number of offspring is proportional to own
payoff and the survival probability of all offspring is proportional to the group’s total
resources. The derived game is then
\[ \tilde{\Pi} = \begin{pmatrix} 8 & a(a+b) \\ b(a+b) & 2 \end{pmatrix}. \tag{11} \]
The two curves in Figure 1 divide the (a, b)-plane into four regions that determine
the nature of the derived game. The regions are oriented in the same way as for
the original game: HD-games being located north-east, CO-games south-west, and
dominance solvable games south-east (ED-games) and north-west (PD-games) of the
two curves’ intersection.
[Figure 1: plot in the (a, b)-plane, with a on the horizontal axis (0 to 2) and b on the vertical axis (0 to 4).]
Figure 1: Parameter combinations (a, b) and the nature of the two games Π (straight
lines) and Π̃ (curves).
We see, in particular, that if the game Π, defined in terms of material payoffs, is
a PD game, then the derived game can be any one of the four generic game types.
Suppose, for example, that a = 0.8 and b = 3. In this case, the long-run population
state is a certain interior state – the mixed strategy in the derived HD-game – for
all interior initial states. As another example, suppose a = 0.6 and b = 2.4. Then the
derived game is a CO-game. Hence, the long-run population state depends on the initial
state. In particular, if there are sufficiently many cooperators in the initial population
state, then defectors will be asymptotically wiped out from the population – although
the game is a PD-game in terms of material payoffs. In the presence of perpetual
random mutations, as modelled in Kandori, Mailath and Rob (1993), the population
process becomes ergodic, and its invariant distribution places virtually all probability
mass on the population state where all individuals cooperate if the mutation rate is
low. This follows from the observation that the (C,C) equilibrium risk dominates the
(D,D) equilibrium in the derived game for these parameter values.5
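The two numerical examples can be reproduced with a few lines of Python. This is a sketch under the bilinear fitness (10); the function names are hypothetical, and the formula for the interior rest point (the share of strategy 1 that makes both strategies earn the same in the normalized game) is a standard property of symmetric 2 × 2 games rather than something stated explicitly above.

```python
def bilinear(own, other):
    """Bilinear fitness (10): own payoff times the group's payoff sum."""
    return own * (own + other)

def normalized_diagonal(a, b, phi=bilinear):
    """Diagonal entries of the normalized derived game for the family (9)."""
    d1 = phi(2.0, 2.0) - phi(b, a)
    d2 = phi(1.0, 1.0) - phi(a, b)
    return d1, d2

def mixed_share_of_strategy_1(d1, d2):
    """Share of strategy 1 in the mixed equilibrium (d1, d2 of equal sign)."""
    return d2 / (d1 + d2)

for a, b in [(0.8, 3.0), (0.6, 2.4)]:
    d1, d2 = normalized_diagonal(a, b)
    print(a, b, d1, d2, mixed_share_of_strategy_1(d1, d2))
```

For a = 0.8 and b = 3 both diagonal entries are negative (an HD-game), and the mixed share is the long-run share of cooperators. For a = 0.6 and b = 2.4 the entries are approximately 0.8 and 0.2 (a CO-game), so the mixed share, about 0.2, marks the boundary between the two basins of attraction.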
3.2 Punishing defectors and rewarding cooperators
There is experimental evidence, see Fehr and Gächter (2002), that human subjects
punish defectors in public-goods provision interactions, even when such punishment
is costly to the punisher. The threat of such punishment enhances cooperation and
hence welfare in interacting groups of human subjects. However, to implement such
punishment violates individual sequential rationality, as applied to the material payoffs
of the game. Can punishment behaviors be explained by the present model? Figure
3 below shows a game-theoretic representation of public-goods provision that allows
cooperators to punish defectors.
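[Figure 3 (game tree): first-stage moves C and D for players 1 and 2; second-stage moves P and N; terminal material payoffs (2, 2), (1, 1), (a, b), (a − c, b − d), (b, a) and (b − d, a − c).]
Figure 3: A two-stage prisoner’s dilemma game with punishment option.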
The first stage of this game is a simultaneous-move prisoners’ dilemma, where each
player chooses C or D, with material payoffs according to (9), for a < 1 and b > 2. In the
5To see this, note that the normalized derived game has diagonal elements 0.8 and 0.2, respectively.
second stage, a player who cooperated in stage one has the option to punish defection
by the other player. The cost of punishing is c > 0 and the effect of punishment is a
reduction of the punished player’s payoff by d > 0.
The unique subgame perfect equilibrium of this extensive-form game is not to punish
– since this reduces the punisher’s material payoff – and hence for both players to
defect in the first stage. In the normal form of this symmetric two-player game, each
player has four pure strategies 1=CN, 2=CP, 3=DN and 4=DP at his or her disposal,
and the payoff matrix of this symmetric game is
\[ \Pi = \begin{pmatrix} 2 & 2 & a & a \\ 2 & 2 & a-c & a-c \\ b & b-d & 1 & 1 \\ b & b-d & 1 & 1 \end{pmatrix}. \]
Not surprisingly, the pure strategy CN weakly dominates strategy CP. However, if the
punishment is not too harsh, d < b − 2, then each of the two behaviorally equivalent strategies DN and DP strictly dominates CN (and hence also CP). In such cases, the
game has a unique Nash equilibrium component, where both players play arbitrary
mixes between DN and DP, and this component attracts all interior solution orbits of
the replicator dynamic. However, if punishment is harsh, d > b − 2, there exists another component of symmetric Nash equilibria: all mixed strategies pCN + (1 − p)CP
with
\[ p \le 1 - (b-2)/d \tag{12} \]
are then best replies to themselves. A large set of initial population states leads asymptotically
to this set; see the solution orbits in Figure 4 below, computed for a = 0.5, b = 3,
c = 0.5 and d = 1.5, and hence p ≤ 1/3, where “D” stands for the sum of the
population shares playing DN and DP.
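The bound (12) can be verified directly from the material payoff matrix. The Python sketch below is illustrative; the function names are assumptions, and strategies are ordered CN, CP, DN, DP as in the text.

```python
import numpy as np

def punishment_game(a, b, c, d):
    """Material payoff matrix with rows and columns ordered CN, CP, DN, DP."""
    return np.array([
        [2.0, 2.0, a, a],
        [2.0, 2.0, a - c, a - c],
        [b, b - d, 1.0, 1.0],
        [b, b - d, 1.0, 1.0],
    ])

def cooperative_mix(p):
    """Mixed strategy p CN + (1 - p) CP."""
    return np.array([p, 1.0 - p, 0.0, 0.0])

def is_best_reply_to_itself(strategy, payoffs, tol=1e-12):
    """True if no pure strategy earns more against `strategy` than it does itself."""
    against = payoffs @ strategy
    return bool(strategy @ against >= against.max() - tol)

game = punishment_game(a=0.5, b=3.0, c=0.5, d=1.5)
for p in (0.2, 0.3, 0.4):
    print(p, is_best_reply_to_itself(cooperative_mix(p), game))
```

For the parameter values of Figure 4 the bound is p ≤ 1/3, so the mixes with p = 0.2 and p = 0.3 are best replies to themselves while the mix with p = 0.4 is not.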
[Figure 4: simplex with vertices CN, CP and D.]
Figure 4: Solution orbits to the replicator dynamic for the game defined in terms of
material payoffs.
All points in the cooperative Nash equilibrium component, except for its end-point,
(1/3)CN + (2/3)CP, are Lyapunov stable (small pushes do not lead far away), but
the component is not an attractor since its end-point is unstable. The reason for the
relative persistence of these equilibria is that near this equilibrium component defections
are rare and hence punishment not very costly (CP gives almost the same expected
payoff as CN). Hence, defectors “learn” that defections are likely to be followed by
punishments, and punishers “learn” that CP is somewhat more costly than CN. These
two adaptations occur at comparable rates in the selection dynamic, and hence the
population state moves back toward the cooperative equilibrium component, except
when the population state is close to the end-point of the equilibrium component.
Similar dynamic phenomena have been observed by Binmore and Samuelson (1999)
in the context of ultimatum bargaining, and by Sethi and Somanathan (1996) for the
tragedy of the commons.
The derived payoff matrix is
\[ \tilde{\Pi} = \begin{pmatrix} \phi(2,2) & \phi(2,2) & \phi(a,b) & \phi(a,b) \\ \phi(2,2) & \phi(2,2) & \phi(a-c,b-d) & \phi(a-c,b-d) \\ \phi(b,a) & \phi(b-d,a-c) & \phi(1,1) & \phi(1,1) \\ \phi(b,a) & \phi(b-d,a-c) & \phi(1,1) & \phi(1,1) \end{pmatrix}. \]
Suppose that the fitness function φ is strictly increasing in both arguments, that
is, that higher material payoff to any one of the group members increases the expected
number of surviving offspring to both members. The pure strategy CP is clearly weakly
dominated by strategy CN also in the derived game. In order to highlight the role of
the punishment option, we henceforth focus on cases where
φ (a, b) < φ (1, 1) and φ (b, a) > φ (2, 2) , (13)
that is, where the derived game would be a PD-game in the absence of the punishment
option.
Strategies DN and DP do not strictly dominate CN and CP in the derived game if
φ (b− d, a− c) ≤ φ (2, 2) , (14)
that is, if the punishment is sufficiently harsh and/or the cost of punishing sufficiently
high. This is, thus, a qualitative difference between the games defined in terms of
material and derived payoffs, respectively.6
Under (13) and (14), all mixed strategies pCN + (1− p)CP with
\[ p \le \frac{\phi(2,2) - \phi(b-d, a-c)}{\phi(b,a) - \phi(b-d, a-c)} \tag{15} \]
are best replies to themselves. Hence, not surprisingly, there is a “cooperative” sym-
metric Nash equilibrium component also in the derived game. Indeed, we would expect
this component to be larger, and attract a larger set of initial population states, than
in the game based on individual material payoffs. It is not difficult to confirm this
conjecture under bilinear fitness (10), granted b > d and a + b > c + d, inequalities
that are met in the above numerical example.7 In this sense, group selection makes
cooperation “more common.” Solution orbits to the replicator dynamic in the derived
6With bilinear fitness (10), conditions (13) and (14) are, for example, met in the numerical example in Figure 4.
7The right-hand side in (15) exceeds that in (12) iff 8d+ (b− 2) (b− d) (c+ d)− 2d (a+ b) > 0.
game are shown in Figure 5 below, based on the bilinear fitness function and the same
parameter values as in the preceding diagram.
[Figure 5: simplex with vertices CN, CP and D.]
Figure 5: Solution orbits to the replicator dynamic for the derived game.
The end-point of the cooperative equilibrium component has moved up from p = 1/3
to p ≈ 0.7 (see footnote 8). We also see that the basin of initial states that tend towards the cooperative component has increased significantly. In the presence of random mutations, however,
the unique outcome in the “ultra long run” is still that everyone defects.
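The comparison between the bounds (12) and (15) for the numerical example can be reproduced as follows. This Python sketch assumes the bilinear fitness (10); the function names are hypothetical.

```python
def bilinear(own, other):
    """Bilinear fitness (10): own payoff times the group's payoff sum."""
    return own * (own + other)

def material_threshold(b, d):
    """Right-hand side of (12)."""
    return 1.0 - (b - 2.0) / d

def derived_threshold(a, b, c, d, phi=bilinear):
    """Right-hand side of (15)."""
    punished = phi(b - d, a - c)
    return (phi(2.0, 2.0) - punished) / (phi(b, a) - punished)

def footnote_7_expression(a, b, c, d):
    """Positive exactly when (15) exceeds (12) under bilinear fitness."""
    return 8 * d + (b - 2) * (b - d) * (c + d) - 2 * d * (a + b)

a, b, c, d = 0.5, 3.0, 0.5, 1.5
print(material_threshold(b, d))            # about 1/3
print(derived_threshold(a, b, c, d))       # about 0.697
print(footnote_7_expression(a, b, c, d))   # positive for these values
```

Conditions (13) and (14) also hold for these values, since φ(a, b) = 1.75 < φ(1, 1) = 2, φ(b, a) = 10.5 > φ(2, 2) = 8, and φ(b − d, a − c) = 2.25 ≤ 8.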
3.3 Defence of a common resource pool
We finally apply our model to an example discussed in Boyd and Richerson (1985).
Consider, thus, a group of n individuals who have a common pool of resources, say
a herd of domestic animals, where each group member owns one n-th of the pool.
The total pool is worth w, thus w/n for each group member. The pool is occasionally
exposed to some hazard – for example a predator or tempest. Group members therefore
take turns to guard it. The guard has a binary choice in case the hazard materializes:
8The right-hand side in (15) becomes (8 − 2.25)/(10.5 − 2.25) ≈ 0.697.
either to defend the common pool at a cost c to him- or herself, and thereby save the
pool, or not to act (at zero cost to him- or herself) in which case the amount d of the
pool will be lost to the group, where 0 < d < w. If d/n < c < d, as we here assume, it
is in the group’s interest that the guard defends the pool: the expected total material
payoff to the group is then w − pc, where p is the probability of the hazard, while the
expected total material payoff to the group is otherwise w − pd. However, the guard
has no material incentive to defend the pool in case the hazard materializes: his or her
material payoff when defending is w/n − c, which is less than (w − d) /n. In other words,
the expected payoff to the group is maximized if every member, when on guard, defends
the resource, but it is individually rational, as defined in terms of own material payoffs,
for the guard not to act in case the hazard materializes.
Such situations are of the type alluded to in the introductory quote by Darwin, and
can be modelled as finite and symmetric (n + 1)-player games, where players i = 1, ..., n
are the group members and player 0 is “nature” who randomly selects one of the group
members as guard, with probability 1/n for each individual to be so selected. Nature
also makes a second random draw, statistically independent from its first draw, namely
whether or not the hazard will materialize. Let p be the hazard probability, where
0 < p ≤ 1. Each personal player thus has available two pure strategies, 0 (no action) and 1 (defense). The expected material payoff to such a player i, under any pure-strategy
profile s = (s1, s2, ..., sn), is
\[ \pi(s_i, s_{-i}) = \begin{cases} (1-p)\, w/n + p\left[(w/n - d/n)/n + (1 - 1/n)\, y_i\right] & \text{for } s_i = 0 \\ (1-p)\, w/n + p\left[(w/n - c)/n + (1 - 1/n)\, y_i\right] & \text{for } s_i = 1 \end{cases}, \tag{16} \]
where yi is the conditionally expected value to member i of his or her share of the
common pool when another group member is on guard, given that the hazard hits:
\[ y_i = \frac{1}{n-1} \left[ \left( n - 1 - \sum_{j \neq i} s_j \right) \frac{w-d}{n} + \frac{w}{n} \sum_{j \neq i} s_j \right]. \]
Under the maintained hypothesis d/n < c < d, pure strategy 0 strictly dominates
pure strategy 1. Hence, as noted by Boyd and Richerson (1985), the standard replicator
dynamic, applied to the game defined in terms of individual material payoffs, asymp-
totically wipes out strategy 1 from the population, from any interior initial population
state. What happens in the present selection dynamic?
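The dominance claim can be checked numerically from (16). The Python sketch below is illustrative only: the function names and the parameter values (n = 5, w = 10, c = 1, d = 4, p = 0.5, chosen so that d/n < c < d) are assumptions.

```python
import itertools

def share_when_other_guards(others, n, w, d):
    """y_i: expected share of member i when another member guards and the hazard hits."""
    defenders = sum(others)
    return ((n - 1 - defenders) * (w - d) / n + defenders * w / n) / (n - 1)

def material_payoff(own, others, n, w, c, d, p):
    """Expected material payoff (16); own is 0 (no action) or 1 (defense)."""
    y = share_when_other_guards(others, n, w, d)
    own_guard = (w / n - c) if own == 1 else (w - d) / n
    return (1 - p) * w / n + p * (own_guard / n + (1 - 1 / n) * y)

# Hypothetical parameters satisfying d/n < c < d.
n, w, c, d, p = 5, 10.0, 1.0, 4.0, 0.5
dominated = all(
    material_payoff(0, others, n, w, c, d, p) > material_payoff(1, others, n, w, c, d, p)
    for others in itertools.product((0, 1), repeat=n - 1)
)
print(dominated)
```

The payoff difference between the two strategies equals p (c − d/n) /n for every profile of the other members, which is positive whenever c > d/n and p > 0.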
In our framework, and focusing on the case n = 2, the material payoff matrix is
\[ \Pi = \frac{1}{2} \begin{pmatrix} (1-p)w + p(w-d) & (1-p)w + p(w - d/2) \\ (1-p)w + p(w - c - d/2) & (1-p)w + p(w-c) \end{pmatrix}. \]
For p = 1, that is, a sure hazard, the derived payoff matrix is
\[ \tilde{\Pi} = \begin{pmatrix} \phi\left[(w-d)/2, (w-d)/2\right] & \phi\left[(w - d/2)/2, (w - c - d/2)/2\right] \\ \phi\left[(w - c - d/2)/2, (w - d/2)/2\right] & \phi\left[(w-c)/2, (w-c)/2\right] \end{pmatrix}. \]
With bilinear fitness (10), defence of the common property, pure strategy 1, is
strictly dominant in the derived game if and only if
\[ (w-d)^2 < \left( w - c - \frac{d}{2} \right) \left( w - \frac{c+d}{2} \right). \tag{17} \]
Figure 7 below shows when this condition is satisfied, in the (c, d)-plane, for w = 2.
The condition is satisfied to the left of the curve. For parameter pairs (c, d) to the left
of the steeper straight line (c = d/2), pure strategy 1 is strictly dominant also in terms
of individual material payoffs.
[Figure 7: plot in the (c, d)-plane, with c on the horizontal axis (0 to 1.5) and d on the vertical axis (0 to 2).]
Figure 7: Parameter combinations (c, d) for which defence of a common property is a
dominant strategy in the derived game.
For parameter pairs to the right of the less steep straight line (c = d), strategy 1 is
socially inefficient: the group’s total material payoff is maximized if the guard takes no
action. Hence, the effect of group selection is to expand the set of parameter combinations
(c, d) for which pure strategy 1 is dominant, from the region to the left of the
steeper straight line to all points to the left of the curve. For parameter combinations
to the left of the curve, our selection dynamic, starting from any interior initial popu-
lation state, will asymptotically wipe out strategy 0 (“no action”) from the population.
Group selection, as modelled here, is then sufficiently strong to asymptotically lead the
population towards the state where all group members defend the common pool, and
hence to the socially efficient outcome: in the long run, individuals behave as if they
were rational and had preferences according to the derived payoff matrix.
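For the case n = 2 and p = 1, strict dominance of defence in the derived game can be checked directly and compared with condition (17). The Python sketch below assumes bilinear fitness (10); the function names and the two parameter pairs (with w = 2) are hypothetical choices for illustration.

```python
def bilinear(own, other):
    """Bilinear fitness (10): own payoff times the group's payoff sum."""
    return own * (own + other)

def material_matrix(w, c, d):
    """Material payoffs for n = 2 and p = 1; index 0 = no action, 1 = defence."""
    return [
        [(w - d) / 2, (w - d / 2) / 2],
        [(w - c - d / 2) / 2, (w - c) / 2],
    ]

def defence_strictly_dominant(w, c, d, phi=bilinear):
    """Direct check: strategy 1 beats strategy 0 against both pure strategies."""
    m = material_matrix(w, c, d)
    derived = [[phi(m[h][k], m[k][h]) for k in range(2)] for h in range(2)]
    return derived[1][0] > derived[0][0] and derived[1][1] > derived[0][1]

def condition_17(w, c, d):
    """Inequality (17)."""
    return (w - d) ** 2 < (w - c - d / 2) * (w - (c + d) / 2)

for c, d in [(0.6, 1.0), (0.9, 1.0)]:
    print(c, d, defence_strictly_dominant(2.0, c, d), condition_17(2.0, c, d))
```

Both pairs satisfy d/2 < c < d, so defence is strictly dominated in material payoffs. In the derived game defence is strictly dominant for (c, d) = (0.6, 1) but not for (0.9, 1), in agreement with (17) at these two points.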
4 The evolutionary logic of rewards and punishments
We here make some general remarks about the rationality of punishments and rewards. Consider,
thus, a behavior strategy profile in a finite extensive-form game and an information
set on the path of this profile (that is, an information set that is reached with
positive probability when the profile is played). A behavior strategy is sequentially ra-
tional (Kreps and Wilson, 1982) at such an information set if its conditionally expected
payoff, conditioned on the induced probabilities at its nodes, cannot be exceeded by
any other behavior strategy, when used from this information set on. If the payoffs in
the game tree are the derived payoffs, as defined here, then Lyapunov stability in our
selection dynamic implies sequential rationality at all information sets that are reached
with positive probability by play.9
Consider now a decision node in a finite extensive-form game where the player
has perfect information about what others have done before his or her move (thus, a
singleton information set) and where each move at the node is immediately followed by
a terminal node in the game tree. Suppose, first, that the player i in question has the
option to punish another player j. Let vi and vj be the two players’ material payoffs if
i chooses not to punish j, and let the payoffs be vi − c and vj − d if i does choose to
punish j, where c, d > 0. In terms of material payoffs, it is thus sequentially rational
not to punish. The same holds true in terms of derived payoffs, if these are increasing
functions of material payoffs, since the derived payoff to player i from punishing j,
φ (vi − c, vj − d), is then lower than the payoff from not punishing, φ (vi, vj). Hence, our
model of group selection does not render punishment sequentially rational. However,
as was shown above, group selection may have significant effects on the set of stable
population states.
9 This follows from the two facts that (a) Lyapunov stability in the replicator dynamic implies Nash equilibrium, and (b) a behavior strategy profile in a finite extensive-form game is a Nash equilibrium iff it prescribes sequentially rational play at all information sets on its path; see van Damme (1987).
Secondly, suppose that player i instead has the option to reward another player j.
Let vi and vj be the two players’ material payoffs if i does not reward j, and let the
payoffs be vi − c and vj + r if i rewards j, where c, r > 0. In terms of material payoffs,
it is sequentially rational for player i not to reward player j. However, rewarding is
sequentially rational in terms of derived payoffs if and only if
φ (vi − c, vj + r) ≥ φ (vi, vj) . (18)
With bilinear fitness (10), this condition is equivalent to
\[ \left(\frac{r}{c} - 1\right)(v_i - c) \ge v_i + v_j. \tag{19} \]
For c < vi, the latter condition requires r > c. In other words, a necessary (but not
sufficient) condition for rewarding to be sequentially rational is that the reward exceeds
the cost of giving it. Figure 8 below illustrates condition (19); rewarding is sequentially
rational for all pairs (r, c) on and above the steep curve, and not below it (the diagram
is drawn for vi = vj = 1). The thin straight line is r = c.
[Figure 8 omitted: plot in the (c, r) plane, with c on the horizontal axis and r on the vertical axis.]
Figure 8: Parameter combinations (c, r) and the condition for rewarding others to be
sequentially rational.
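The equivalence between conditions (18) and (19) can be checked numerically. The sketch below is illustrative only: equation (10) is not reproduced in this section, so the code assumes that the bilinear fitness has the form φ(vi, vj) = vi (vi + vj)/2, a form that is consistent with condition (19). The function names are hypothetical, and exact rational arithmetic is used so that boundary cases are not blurred by rounding.

```python
from fractions import Fraction
from itertools import product


def phi(v_own, v_other):
    """ASSUMED bilinear fitness: own payoff times the pair's average payoff."""
    return v_own * (v_own + v_other) / 2


def reward_rational_18(v_i, v_j, c, r):
    """Condition (18): rewarding weakly raises player i's derived payoff."""
    return phi(v_i - c, v_j + r) >= phi(v_i, v_j)


def reward_rational_19(v_i, v_j, c, r):
    """Condition (19); requires c > 0 because of the division by c."""
    return (r / c - 1) * (v_i - c) >= v_i + v_j


def min_reward(v_i, v_j, c):
    """Smallest r satisfying (19) when 0 < c < v_i, obtained by solving (19) for r."""
    return c + c * (v_i + v_j) / (v_i - c)


# Compare the two conditions on a grid of positive rational parameter values.
grid = [Fraction(k, 4) for k in range(1, 13)]
agree = all(
    reward_rational_18(vi, vj, c, r) == reward_rational_19(vi, vj, c, r)
    for vi, vj, c, r in product(grid, repeat=4)
)
print(agree)

# Threshold reward for vi = vj = 1 (the case drawn in Figure 8) and c = 1/2.
print(min_reward(Fraction(1), Fraction(1), Fraction(1, 2)))
```

Under the assumed form of φ, the threshold for vi = vj = 1 and c = 1/2 is r = 5/2, well above the cost c, in line with the remark that r > c is necessary but not sufficient.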
5 Social preferences
The analysis so far has assumed that individuals do not choose a strategy; they are “pro-
grammed” to a pure strategy from birth. However, the long run of the present model
can be interpreted in terms of rational individual choice and expectations formation
as follows. Let the underlying material payoffs be given by the matrix Π and consider
the derived payoff matrix Π as a representation of rational individuals’ von Neumann-
Morgenstern utilities. We here use the term “rational” in its usual economics sense (pi-
oneered by Savage (1954)), that is, behavior that is consistent with the maximization
of the expected value of some goal function under some subjective probabilistic belief
about the state of the world (here: other players’ actions).10
It is well-known that in finite two-player games, a pure strategy is strictly dom-
inated if and only if it is not optimal against any probability distribution over the
other player’s strategy choice (Pearce, 1984). It is known that the replicator dynamic
asymptotically wipes out all iteratively strictly dominated pure strategies, from any in-
terior initial population state, in all finite games.11 Hence, observed play in the present
model, after the population has evolved over a long time span from an arbitrary interior
initial state, is consistent with the hypothesis that all individuals have von Neumann-
Morgenstern utilities (6), are rational, and that their preferences and rationality are
common knowledge in the population.12
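The elimination of a strictly dominated strategy under the replicator dynamic can be illustrated with a small simulation. The sketch rests on several assumptions that are not taken from this section: the material payoff matrix is a hypothetical prisoners’ dilemma (not the parametrization of section 3.1), the derived payoff is assumed to be φ applied to the pair (own payoff, opponent’s payoff), φ is assumed to have the bilinear form vi (vi + vj)/2, and the dynamic is integrated with a simple Euler scheme.

```python
def phi(v_own, v_other):
    """ASSUMED bilinear fitness; equation (10) is not reproduced here."""
    return v_own * (v_own + v_other) / 2


def derived_matrix(material):
    """ASSUMED derived payoffs: phi applied to (own, opponent) material payoffs."""
    n = len(material)
    return [[phi(material[h][k], material[k][h]) for k in range(n)] for h in range(n)]


def replicator(payoff, x0, t_end=100.0, dt=0.01):
    """Euler integration of the single-population replicator dynamic."""
    x = list(x0)
    m = len(x)
    for _ in range(int(t_end / dt)):
        u = [sum(payoff[h][k] * x[k] for k in range(m)) for h in range(m)]
        avg = sum(x[h] * u[h] for h in range(m))
        x = [x[h] + dt * x[h] * (u[h] - avg) for h in range(m)]
    return x


# Hypothetical prisoners' dilemma, strategy 0 = C and strategy 1 = D,
# with T = 3.5 > R = 3 > P = 0.9 > S = 0.5.
material = [[3.0, 0.5],
            [3.5, 0.9]]
derived = derived_matrix(material)

x_material = replicator(material, [0.99, 0.01])  # starts almost all-C
x_derived = replicator(derived, [0.01, 0.99])    # starts almost all-D
print(x_material, x_derived)
```

In this hypothetical example D strictly dominates C in material payoffs, while C strictly dominates D in the derived payoffs, so the two runs end near opposite pure states.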
Let us consider the example in section 3.1 in this light. Assume a = 0.8 and b = 2.2.
From Figure 1 we deduce that in terms of material payoffs, the game is a prisoners’
dilemma with strategy 1 being C. However, in terms of the derived payoffs strategy C
strictly dominates strategy D. The present selection dynamic thus takes the population
state from any interior state to the state in which everybody plays C. Hence, to an
outside observer who sees this long-run outcome, and who knows the material payoffs
Π, this observation is consistent with the hypothesis that all individuals are rational and
have von Neumann-Morgenstern utilities (6) as detailed in (10). Such preferences are
“social” or “other-regarding”: they depend both on own material payoff and on the other
player’s material payoff. Likewise, the long-run population state in the common-pool
game in section 3.3 is consistent with rational players who have “social” preferences,
granted the parameters satisfy condition (17).
10 von Neumann-Morgenstern utilities are real numbers u (ω) attached to outcomes ω in such a way that the decision maker’s choices among lotteries over outcomes are consistent with maximization of the expected value of u.
11 Akin (1980) showed that all strictly dominated strategies are wiped out in this sense in all finite and symmetric two-player games. This result was generalized to iteratively strictly dominated strategies in arbitrary finite games, for both the Taylor (1979) and Maynard Smith (1982) versions of the multi-population replicator dynamic, by Samuelson and Zhang (1992) and Hofbauer and Weibull (1996).
12 A player who knows other players’ preferences and that other players are rational does not use iteratively strictly dominated strategies.
It is known that if the replicator dynamic converges to some population state from
an interior initial state, then the limit state is a Nash equilibrium.13 Hence, observation
of play after the population has evolved for a long time along some interior convergent
solution trajectory is consistent with the hypothesis that all individuals have preferences
according to (6) and play (approximately) according to the limiting Nash equilibrium.
Consider the public-goods example in section 3.2 in this light, for, say, n = 4 and
λ = 0.7. If the population behavior settles down over time from some interior initial
state, and an observer sees the long-run behavior, then this observation is consistent
with the hypothesis that all individuals have von Neumann-Morgenstern utilities (??),
and play (approximately) the game’s unique symmetric Nash equilibrium, namely to
contribute maximally to the public good. Again, it is as if individuals were rational,
and had “rational” expectations and “social” preferences.
The above observations substantiate the first part of the above claim, namely that
the present model can be viewed as a model of rational choice and, sometimes, Nash
equilibrium play, and they suggest that the so revealed preferences may be social in nature.
In order to substantiate this second claim further, we now examine more closely the
nature of the derived payoffs, bearing in mind that the fitness function φ in its turn may
depend on material production and reproduction conditions as well as forms of social
interaction, “institutions”, and habits, of the population under study. In particular, as
these conditions change, so may the induced social preferences.
Assume, first, that the fitness function φ is bilinear, as specified in equation (10).
The figure below shows a contour map of this function, hence, the indifference curves
of a player with such preferences, with own material payoff on the horizontal axis and
the other player’s material payoff on the vertical. The two straight lines have slope plus
and minus 45 degrees.
13See Nachbar (1990) and Weibull (1995).
[Figure 9 omitted: contour map with own material payoff on the horizontal axis and the other player’s material payoff on the vertical axis.]
Figure 9: Indifference curves induced by the bi-linear fitness function.
We see that the indifference curves are not the vertical ones of homo oeconomicus
– the selfish species studied in most of economics. Indeed, an individual with φ as
his or her utility function prefers the “fair” payoff allocation (3, 3) to one where he/she
gets 4 material payoff units and the other zero. In this sense, individuals have a certain
“preference for fairness.” However, as the negatively sloped “budget line” shows: if
there is a given material payoff sum to be divided (here 6 units), then each individual
prefers to get the whole “pie” for him- or herself. It is as if others’ material well-being
is of some concern, but less so than one’s own, in particular when others are better
off. We also note that along the positively sloped diagonal, where individual material
payoffs are equal, the indifference curves become steeper as the common material payoff
goes up. It is as if equally wealthy individuals care less about each others’ well-being –
are more similar to homo oeconomicus – than equally poor individuals do: a wealthy
person is “more upset” by a marginal transfer to an equally wealthy person than a poor
person is by a marginal transfer to an equally poor person.
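The comparisons in this paragraph can be reproduced with a few lines of arithmetic. The sketch assumes, as a labelled guess, that the bilinear fitness in (10) is φ(own, other) = own · (own + other)/2; this form is not quoted from the paper, but it matches the comparison of (3, 3) with (4, 0) made above. The helper names are hypothetical.

```python
from fractions import Fraction


def phi(own, other):
    """ASSUMED bilinear fitness: own payoff times the pair's average payoff."""
    return own * (own + other) / 2


def indifference_other(own, level):
    """Other's payoff y that solves phi(own, y) = level, for own > 0."""
    return 2 * level / own - own


def transfer_loss(t, eps):
    """Utility lost when eps is transferred to an equally wealthy partner."""
    return phi(t, t) - phi(t - eps, t + eps)


# "Fair" allocation (3, 3) versus the unequal allocation (4, 0).
fair, unequal = phi(3, 3), phi(4, 0)

# Along the budget line own + other = 6, utility equals 3 * own.
budget = [phi(Fraction(x), 6 - Fraction(x)) for x in range(0, 7)]

print(fair, unequal)
print(budget)
print(transfer_loss(Fraction(5), Fraction(1, 10)))
```

Under this assumption the loss from a transfer eps between equally wealthy individuals is exactly eps · t, so it grows with the common payoff level t, which is one way to read the remark that a wealthy person is “more upset” by such a transfer than a poor person.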
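What about other group fitness functions? A more general class of fitness functions
is given by
φ (vi, v−i) = vi · ψ (v) , (20)
for
\[ \psi(v) = \left[ \frac{1}{n} \sum_{j=1}^{n} \left( \frac{v_j - \pi^{-}}{\pi^{+} - \pi^{-}} \right)^{\rho} \right]^{1/\rho}, \tag{21} \]
where ρ ∈ R and
\[ \pi^{-} \le \min_{h,k \in S} \pi_{hk} \quad \text{and} \quad \pi^{+} \ge \max_{h,k \in S} \pi_{hk} \tag{22} \]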
(we assume π− < π+). The function value ψ (v) may be thought of as the survival
probability of an offspring in a group with material payoff vector v = (v1, ..., vn). By
(21), this probability is a symmetric, continuous and strictly increasing function of
the payoff vector v. It belongs to a parametric function family called CES (constant
elasticity of substitution) functions in economics. As is shown in an appendix at the
end of the paper, the function ψ has the following properties:14
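\[ \psi(v) = \frac{1}{\pi^{+} - \pi^{-}} \cdot
\begin{cases}
\frac{1}{n} \sum_{j=1}^{n} (v_j - \pi^{-}) & \text{when } \rho = 1 \\
\left[ \prod_{j=1}^{n} (v_j - \pi^{-}) \right]^{1/n} & \text{when } \rho = 0 \\
\min_{1 \le j \le n} (v_j - \pi^{-}) & \text{when } \rho \to -\infty \\
\max_{1 \le j \le n} (v_j - \pi^{-}) & \text{when } \rho \to +\infty
\end{cases} \tag{23} \]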
In other words, ψ (v) is proportional to the arithmetic average payoff gain in the group
when ρ = 1, to the geometric average payoff gain when ρ = 0, and its limit as ρ → −∞
(ρ → +∞) is the minimal (maximal) payoff gain in the group. These special cases
represent established social welfare functions in economics: ρ = 1 corresponding to
Bentham’s utilitarian welfare function, ρ = 0 to Nash’s bargaining-based welfare function,
and ρ → −∞ to Rawls’ egalitarian welfare function. Moreover, if conditions (22)
are met with equality, then ψ (v) is scale invariant in the sense of being unaffected by
positive affine transformations of material payoffs.15 As a consequence, the solution
orbits to our selection dynamic (4) are then invariant under positive affine transformations
of material payoffs, a form of scale invariance shared with dominance relations
and Nash equilibrium.16
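The special cases in (23) and the scale invariance property can be checked numerically. The following is a minimal sketch of (21), not the authors’ code: the function name `psi`, the parameter names `pi_lo` and `pi_hi` (standing for π− and π+), and the example payoff vector are assumptions made for illustration. All payoff gains are taken to be strictly positive so that the logarithm and negative powers are defined.

```python
import math


def psi(v, rho, pi_lo, pi_hi):
    """CES function (21); v is the group's material payoff vector."""
    gains = [(vj - pi_lo) / (pi_hi - pi_lo) for vj in v]  # normalized payoff gains
    if rho == 0:                   # continuous extension: geometric mean
        return math.exp(sum(math.log(g) for g in gains) / len(gains))
    if rho == -math.inf:           # limit case: minimal gain
        return min(gains)
    if rho == math.inf:            # limit case: maximal gain
        return max(gains)
    return (sum(g ** rho for g in gains) / len(gains)) ** (1 / rho)


# Hypothetical payoff vector and bounds, chosen only for illustration.
v, lo, hi = [1.0, 2.0, 4.0], 0.0, 4.0

arithmetic = psi(v, 1, lo, hi)        # arithmetic mean of the gains
geometric = psi(v, 0, lo, hi)         # geometric mean of the gains
near_zero = psi(v, 1e-6, lo, hi)      # close to the geometric mean
very_negative = psi(v, -200, lo, hi)  # close to the minimal gain
very_positive = psi(v, 200, lo, hi)   # close to the maximal gain

# Positive affine transformation applied to the payoffs and to both bounds.
a, b = 5.0, 3.0
shifted = psi([a + b * vj for vj in v], 1, a + b * lo, a + b * hi)
print(arithmetic, geometric, near_zero, very_negative, very_positive, shifted)
```

The last check transforms the bounds together with the payoffs, which is what happens when the bounds are tied to the extreme entries of the payoff matrix.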
Isoquants for the fitness function φ through the point vi = v−i = 3 are shown in
Figure 10 below, for n = 2, π− = 0, and different values of ρ. The horizontal axis
represents own material payoff, the vertical axis the other player’s material payoff, and
the curves are ordered from left to right according to the ρ-values. The left-most curve
is the indifference curve when ρ → +∞ and the right-most curve when ρ → −∞.
The bold-face curve in the middle is the indifference curve when ρ = 1. The two
straight lines are v−i = vi and v−i = 6 − vi, respectively. We note, in particular, that
in the limit case as ρ → −∞, individuals have the same selfish preferences as homo
oeconomicus whenever the other individual in the group earns a higher material payoff
(the indifference curve is vertical above the diagonal) but prefer the even split over all
other allocations of a given material payoff sum.
14 The right-hand side in (21) is undefined for ρ = 0, but the given expression is the limit as ρ → 0. Hence, this expression is the continuous extension of the definition for ρ ≠ 0 to ρ = 0.
15 More exactly, if each argument vi is replaced by v′i = a + b vi for some a, b ∈ R with b positive, then ψ (v′) = ψ (v).
16 Alternatively, background fitness factors could be incorporated by means of shifting π− downwards by the corresponding amount in material payoffs.
[Figure 10 omitted: own material payoff on the horizontal axis, the other player’s material payoff on the vertical axis.]
Figure 10: Isoquants of (20), when parametrized as in (21), for different values of ρ.
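The two limit cases of the isoquants can be examined directly. The sketch below uses the limit expressions from (23) with n = 2 and π− = 0, and drops the constant factor 1/(π+ − π−), which does not affect the ordering of allocations. The function names and the grid of shares are hypothetical choices for illustration.

```python
def phi_min(own, other):
    """Limit rho -> -inf (n = 2, pi_minus = 0): own payoff times the smaller payoff."""
    return own * min(own, other)


def phi_max(own, other):
    """Limit rho -> +inf (n = 2, pi_minus = 0): own payoff times the larger payoff."""
    return own * max(own, other)


# Above the diagonal (other > own), the rho -> -inf utility does not depend on
# the other's payoff, so the indifference curve through (3, 3) is vertical there.
above_diagonal = [phi_min(3, other) for other in (4, 5, 8)]

# Along the budget line own + other = 6, find the own share with highest utility.
shares = [k / 10 for k in range(0, 61)]
best_min = max(shares, key=lambda x: phi_min(x, 6 - x))
best_max = max(shares, key=lambda x: phi_max(x, 6 - x))
print(above_diagonal, best_min, best_max)
```

On this grid the even split is the best allocation in the ρ → −∞ case, whereas in the ρ → +∞ case the individual prefers the whole sum.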
Preferences represented by (20) and (21) are qualitatively similar to those in Fehr
and Schmidt (1999) and Charness and Rabin (2002). Fehr and Schmidt assume that
an individual’s utility is composed of three terms, where one is own material payoff
and the two others represent “fairness” with regard to those who are better and worse
off, respectively. Charness and Rabin (2002) suggest that much experimental evidence
for human subjects is consistent with individual utility being the sum of own material
payoff and a social welfare function for all participants. The only essential difference
in comparison with the present functions φ is that the latter are multiplicative, not
additive, and, unlike the utility function of Fehr and Schmidt, cannot express spitefulness,
that is, a wish to reduce others’ material payoffs.17
6 Related literature
As mentioned above, the model in Maynard Smith (1964) is adiabatic, with individual
selection working on a faster time scale than group selection. Behavior within each
group thus first converges to a steady state determined by individual selection, before
selection among groups takes place. By contrast, Kerr and Godfrey-Smith (2002) allow
individual and group selection to operate on the same time scale, and discuss individu-
alist and multi-level perspectives on natural selection. In their vocabulary, the present
model is an example of the contextual approach, where individuals are the bearers of
fitness and fitness is sensitive to the context of the individual. They contrast this ap-
proach with what they call the collective approach, where instead collectives (groups)
are fitness-bearing entities in their own right. Their model is not game-theoretic, how-
ever, and their analysis is only remotely related to ours.
Another strand of the literature is the so-called indirect evolutionary approach, pio-
neered by Güth and Yaari (1992).18 In that approach, individuals are randomly drawn
from large populations to play a game defined in terms of material payoffs, just as here.
However, individuals have different preferences, and, by assumption, play some equi-
librium either of the game defined in terms of the drawn individuals’ preferences or in
the game defined by the population distribution of preferences. The drawn individuals
receive material payoffs according to the underlying game. Preferences that result in
higher material payoffs in the corresponding equilibria are selected for. The indirect
evolutionary approach is thus quite distinct from the present. Closest to our approach
among those models is that in Herold (2004), who studies preferences for rewarding al-
truistic behaviors and punishing hostile or selfish behaviors (cf. Section 3.3 above). He
shows that such preferences can survive in the indirect evolutionary approach with ran-
domly formed groups, even when individual preferences are unobservable. His results
do rely, however, on the assumption that individuals can condition on the preference
distribution in their own group, an assumption not made in the present approach.
17 Spiteful social preferences could arise under group selection as modelled here if an increase in others’ material payoffs (for example when these are above one’s own) would reduce the survival probability of own offspring. The symmetry of the survival function ψ would then have to be replaced by symmetry with respect to other group members’ material payoffs.
18 See also Bester and Güth (1998), Huck and Oechssler (1999), Sethi and Somanathan (2001), Güth and Peleg (2001) and Herold (2004).
Kuzmics (2003) develops a model of simultaneous individual and group selection
for symmetric two-player games. Like here, he considers the mean-field equation in a
large population. However, unlike here, the number of groups is finite, so each group is
large when the total population is made large in the mean-field approximation. Groups
may thus be thought of as subpopulations. Individual selection operates within each
subpopulation, but group selection is driven by migration. More exactly, the model op-
erates in continuous time by way of a Poisson arrival process of migration-cum-imitation
opportunities for each individual. When an individual gets such an opportunity, she
migrates to another group with a probability that is increasing in that group’s current
average material payoff. Whether or not she migrated, she then imitates a randomly
drawn individual in her chosen subpopulation, with a higher probability to imitate
an individual with a higher current material payoff. When applied to CO-games, the
long-run prediction is that all individuals play the pure strategy that yields the highest
material payoff. This contrasts with the predictions of the current model, where other
outcomes are possible. The reason for the tendency towards joint payoff maximization
in Kuzmics’ model is migration, which, by assumption, is directed towards subpopula-
tions with high average payoffs. When applied to PD-games, the long-run prediction
diverges. Individual selection in favor of strategy D is counter-acted by migration to
groups where many play C, resulting in perpetual fluctuations in the population state.
Again, the prediction differs from that of our model.
In Vega-Redondo (1996), a finite population of individuals is recurrently and randomly
matched in pairs to play a prisoners’ dilemma. Time is divided into an infinite
sequence of discrete periods, and in each such period, every individual is matched
with another individual. Each matched pair forms a group, and there is both indi-
vidual and group selection at the end of each time period. First, individual selection
takes place: each individual switches to the strategy that yielded the highest payoff
in its group. Thereafter, a mutation may take place: with a very small probability
the selected strategy is replaced by the other pure strategy. Third, group selection
takes place: each group is disbanded with a positive probability, and the members of
disbanded groups switch to the strategies used in those groups that earned the high-
est payoff sum. This defines an ergodic population process, and Vega-Redondo (1996)
shows that if the mutation rate is very small and the population large, its invariant
distribution places virtually all probability mass on the population state in which all
individuals cooperate. This result contrasts with ours, where this is the long-run out-
come only for certain parameter values (see section 3.1). The reason for Vega-Redondo’s
drastic result seems to be that if the population initially is in the pure-C state, and, say,
a pair of individuals mutates to (D,D), then group selection will bring the pair back to
(C,C) as soon as it disbands, and this happens with a probability that is “infinitely”
higher than a mutation. Hence, while Maynard Smith’s haystack model can be said to
be tilted in favor of individual selection by letting this operate at a faster time scale
than group selection, Vega-Redondo’s model can be said to be tilted in favor of group
selection by giving group selection full swing as soon as a single group has mutated to
a “better” strategy profile.
The model in Sjöström and Weitzman (1996), finally, is not game-theoretic but
deals with the same issues as here, in the context of the internal efficiency of competing
firms. There are infinitely many firms and each firm has the same finite number of
workers. Within a firm, each worker is “programmed” to some effort level. All workers
in each firm are paid a wage that depends on their average effort. The resulting utility or
fitness to a worker is the wage minus a “cost” or “disutility” of exerting effort. Workers
also have an outside option (for instance, self employment) that gives them a fixed
utility/fitness. The (owner of the) firm sets the wage so as to keep the workers indifferent
between staying and leaving. Suppose that initially all workers in a firm make the same
effort, and that one worker suddenly mutates to a lower effort level. This worker’s
individual utility/fitness goes up, and by individual selection this becomes the worker’s
new effort level. All other workers in the firm imitate the mutant, so the firm ends up
with all workers exerting the lower effort. Hence, individual mutation-cum-selection
drives efforts down towards zero within each firm. However, this force is counteracted
by a form of firm selection. Now and then a pair of firms is randomly drawn, and
the effort level in the firm with higher profit is “copied” over to the other firm. It is,
thus, as if the more profitable firm takes over the niche of the other firm. Sjöström and
Weitzman (1996) show that the ratio between the individual mutation-cum-selection
rate and the firm selection rate is crucial for the long-run outcome. An increase in
this ratio unambiguously increases efficiency (in the sense of stochastic dominance).
At a first glance, this model may seem similar to the present prisoners’ dilemma and
public-goods games. However, all workers are equally well off in all population states,
while this is not true for players in prisoners’ dilemma games and public-goods provision
games. The key difference is that the firms in Sjöström and Weitzman (1996), when
viewed as groups, are asymmetric, where one group member absorbs all excess payoff.
7 Concluding remarks
The aim of this study was to construct a parsimonious population selection model that
allows for both individual and group selection, without the adiabatic feature of the
haystack model.
Some earlier models have modelled group selection as a pairwise contest between
groups; out of two randomly selected groups, the one with highest group fitness “wins,”
in terms of future population shares. Instead, we have a very large number of groups
that all “compete” with each other, where groups with higher group fitness “win,” in
terms of future population shares, over those with lower group fitness. In this respect,
our approach is similar in spirit to how perfect competition is modelled in economics,
where no individual firm can affect the market conditions of any other firm, while
aggregate firm (and consumer) behavior determines the market conditions of all firms.
We hope to have shown that the analytical power that follows from this approach allows
for straightforward analyses of relevance to evolutionary, experimental and behavioral
game theory.
Our approach calls for many extensions, including selection among multi-level hi-
erarchies. It also calls for a careful analysis of the full stochastic process that arises
when the population is large but finite. Another aspect that deserves more attention
is the fitness function φ, which here is treated as a primitive. This function is a “re-
duced form” representation of group organization and its environment, where group
organization in turn has a technological and a cultural or habitual side, representing
different forms of collective local organization in different natural environments. One
could thus conceive of selection of such group organization forms in different natural
environments, thus rendering the function φ endogenous. However, these topics bring
us outside the scope of the present, more limited, study.
8 Appendix
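We prove the claims in the special case n = 2. Let
\[ f(x_1, x_2, \rho) = \left[ \theta x_1^{\rho} + (1 - \theta) x_2^{\rho} \right]^{1/\rho}, \]
for ρ ≤ 1, x1, x2 > 0 and θ ∈ (0, 1). Applying the Taylor expansion twice for ρ close to
zero, we obtain
\[ \begin{aligned}
\ln f(x_1, x_2, \rho)
&= \ln x_1 + \frac{1}{\rho} \ln \left[ \theta + (1 - \theta) \left( \frac{x_2}{x_1} \right)^{\rho} \right] \\
&= \ln x_1 + \frac{1}{\rho} \ln \left[ \theta + (1 - \theta) \exp \left( \rho \ln \frac{x_2}{x_1} \right) \right] \\
&= \ln x_1 + \frac{1}{\rho} \ln \left[ \theta + (1 - \theta) \left( 1 + \rho \ln \frac{x_2}{x_1} + O(\rho^2) \right) \right] \\
&= \ln x_1 + \frac{1}{\rho} \ln \left[ 1 + (1 - \theta) \rho \ln \frac{x_2}{x_1} + O(\rho^2) \right] \\
&= \ln x_1 + (1 - \theta) \ln \frac{x_2}{x_1} + O(\rho) = \ln \left( x_1^{\theta} x_2^{1 - \theta} \right) + O(\rho),
\end{aligned} \]
proving the claim for ρ = 0. Assume x1 < x2. The claim for ρ → −∞ follows
immediately from
\[ \lim_{\rho \to -\infty} \ln f(x_1, x_2, \rho) = \lim_{\rho \to -\infty} \left( \ln x_1 + \frac{1}{\rho} \ln \left[ \theta + (1 - \theta) \exp \left( \rho \ln \frac{x_2}{x_1} \right) \right] \right) = \ln x_1. \]
Finally, assume x1 > x2. Then
\[ \lim_{\rho \to +\infty} \ln f(x_1, x_2, \rho) = \ln x_1 + \lim_{\rho \to +\infty} \frac{1}{\rho} \ln \left[ \theta + (1 - \theta) \exp \left( \rho \ln \frac{x_2}{x_1} \right) \right] = \ln x_1. \]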
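As a numerical sanity check of the appendix argument, the following sketch evaluates ln f for small and for large |ρ|. The values of x1, x2 and θ are arbitrary illustrative choices, and the function names are hypothetical. The check for ρ → +∞ evaluates the formula outside the stated range ρ ≤ 1, purely as arithmetic.

```python
import math


def ln_f(x1, x2, rho, theta):
    """Logarithm of [theta * x1^rho + (1 - theta) * x2^rho]^(1/rho), rho != 0."""
    return math.log(theta * x1 ** rho + (1 - theta) * x2 ** rho) / rho


def ln_geometric(x1, x2, theta):
    """Logarithm of the weighted geometric mean x1^theta * x2^(1 - theta)."""
    return theta * math.log(x1) + (1 - theta) * math.log(x2)


x1, x2, theta = 2.0, 5.0, 0.3
target = ln_geometric(x1, x2, theta)

# The gap to the geometric mean should shrink roughly in proportion to rho,
# in line with the O(rho) remainder in the expansion.
gap_coarse = abs(ln_f(x1, x2, 1e-2, theta) - target)
gap_fine = abs(ln_f(x1, x2, 1e-3, theta) - target)

# Large |rho| approximates the smaller argument (rho -> -inf)
# or the larger argument (rho -> +inf).
low = math.exp(ln_f(x1, x2, -200.0, theta))
high = math.exp(ln_f(x1, x2, 200.0, theta))
print(gap_coarse, gap_fine, low, high)
```

Reducing ρ by a factor of ten reduces the gap by roughly the same factor, as a first-order remainder would suggest.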
References
Akin, E., 1980. Domination or equilibrium, Mathematical Biosciences 50, pp. 239-50.
Bergstrom, T.C., 2002. Evolution of social behavior: Individual and group selection.
Journal of Economic Perspectives 16, pp. 67-88.
Binmore, K., Samuelson, L., 1999. Evolutionary drift and equilibrium selection. The
Review of Economic Studies 66, pp. 363-393.
Boyd, R., Richerson, P.J., 1985. Culture and the Evolutionary Process. University of
Chicago Press, Chicago.
Charness, G., Rabin, M., 2002. Understanding social preference with simple tests.
Quarterly Journal of Economics 117, pp. 817-869.
Darwin, C., 1871. The Descent of Man and Selection in Relation to Sex. Murray, Lon-
don.
Fehr, E. and Gächter, S., 2002. Altruistic punishment in humans. Nature 415, pp.
137-140.
Fehr, E. and Schmidt, K. M., 1999. A theory of fairness, competition, and cooperation.
The Quarterly Journal of Economics 114, pp. 817-868.
Güth, W., Bester, H., 1998. Is altruism evolutionarily stable? Journal of Economic
Behavior and Organization 34, pp. 193-200.
Güth, W., Yaari, M., 1992. An evolutionary approach to explain reciprocal behavior in a
simple strategic game. In Witt, U. (ed.) Explaining Process and Change: Approaches
to Evolutionary Economics, pp. 23-34. The University of Michigan Press, Ann Arbor.
Güth, W., Peleg, B., 2001. When will payoff maximization survive? Journal of Evolu-
tionary Economics 11, pp. 479-499.
Henrich, J., 2003. Cultural group selection, coevolutionary processes and large-scale
cooperation. Journal of Economic Behavior and Organization 53, pp. 3-35.
Herold, F., 2004. Carrot or stick? The evolution of reciprocal preferences in a haystack
model. Mimeo, Department of Economics, University of Munich.
Hofbauer J., Weibull, J., 1996. Evolutionary selection against dominated strategies,
Journal of Economic Theory 71, pp. 558-573.
Huck, S., Oechssler J., 1999. The indirect evolutionary approach to explaining fair
allocations. Games and Economic Behavior 28, pp. 13-24.
Kandori, M., Mailath, G.J., Rob, R., 1993. Learning, mutation, and long run equilibria
in games. Econometrica 61, pp. 29-56.
Kerr, B., Godfrey-Smith, P., 2002. Individualist and multi-level perspectives on selec-
tion in structured populations, Biology and Philosophy 17, pp. 477-517.
Kreps, D., Wilson, R., 1982. Sequential equilibria. Econometrica 50, pp. 863-894.
Kuzmics, C., 2003. Individual and group selection in symmetric 2-player games. Mimeo,
Kellogg School of Management, Northwestern University.
Maynard Smith, J.M., 1964. Group selection and kin selection. Nature 201, pp. 1145-
1147.
Maynard Smith, J., 1982. Evolution and the Theory of Games. Cambridge. Cambridge
University Press.
Nachbar, J., 1990. "Evolutionary" selection dynamics in games: Convergence and limit
properties. International Journal of Game Theory 19, pp. 59-89.
Pearce, D., 1984. Rationalizable strategic behavior and the problem of perfection,
Econometrica 52, pp. 1029-1050.
Samuelson, L., and Zhang, J., 1992. Evolutionary stability in asymmetric games. Jour-
nal of Economic Theory 57, pp. 363-91.
Savage, L., 1954. The Foundations of Statistics. Dover.
Sethi, R., Somanathan, E., 1996. The evolution of social norms in common property
resource use. American Economic Review 86, pp. 766-88.
Sethi, R., Somanathan, E., 2001. Preference evolution and reciprocity. Journal of Eco-
nomic Theory 97, pp. 273-297.
Sjöström T., Weitzman, M. L., 1996. Competition and the evolution of efficiency. Jour-
nal of Economic Behavior and Organization 30, pp. 25-43.
Taylor, P., 1979. Evolutionary stable strategies with two types of player. Journal of
Applied Probability 16, pp. 76-83.
Taylor, P., Jonker, L., 1978. Evolutionary stable strategies and game dynamics.
Mathematical Biosciences 40, pp. 145-56.
van Damme, E., 1987. Stability and Perfection of Nash Equilibrium. Berlin. Springer
Verlag.
Vega-Redondo, F., 1996. Long-run Cooperation in the one-shot prisoner’s dilemma: A
hierarchic evolutionary approach. Biosystems 37, pp. 39-47.
Weibull, J., 1995. Evolutionary Game Theory, MIT Press, Cambridge.
Williams, G. C., 1966. Adaptation and natural selection. Princeton University Press,
Princeton.
Wilson D.S., 1983. The group selection controversy: History and current status. Annual
Review of Ecology and Systematics 14, pp. 159-187.
Wilson, D.S., Sober, E., 1994. Reintroducing group selection to the human behavioral
sciences. Behavioral and Brain Sciences 17, pp. 585-654.
Wynne-Edwards, V. C., 1962. Animal Dispersion in Relation to Social Behavior. Oliver
and Boyd, Edinburgh.