what does newcomb’s paradox teach us? - arxiv · on that deduction. newcomb’s paradox is that...

arX

iv:0

904.

2540

v3 [

cs.G

T]

30 S

ep 2

010

What does Newcomb’s Paradox Teach us?

Abstract. In Newcomb’s paradox you choose to receive either the contents of a particularclosed box, or the contents of both that closed box and another one. Before you choose though,an antagonist uses a prediction algorithm to deduce your choice, and fills the two boxes basedon that deduction. Newcomb’s paradox is that game theory’s expected utility and dominanceprinciples appear to provide conflicting recommendations for what you should choose. Arecent extension of game theory provides a powerful tool forresolving paradoxes concerninghuman choice, which formulates such paradoxes in terms of Bayes nets. Here we apply thistool to Newcomb’s scenario. We show that the conflicting recommendations in Newcomb’sscenario use different Bayes nets to relate your choice and the algorithm’s prediction. Thesetwo Bayes nets are incompatible. This resolves the paradox:the reason there appears to be twoconflicting recommendations is that the specification of theunderlying Bayes net is open totwo, conflicting interpretations. We then show that the accuracy of the prediction algorithm inNewcomb’s paradox, the focus of much previous work, is irrelevant. We similarly show thatthe utility functions of you and the antagonist are irrelevant. We end by showing that New-comb’s paradox is time-reversal invariant; both the paradox and its resolution are unchangedif the algorithm makes its ‘prediction’after you make your choice rather than before.

Keywords: Newcomb’s paradox, game theory, Bayes net, causality, determinism

c© 2018Kluwer Academic Publishers. Printed in the Netherlands.

newcomb.arxiv.for.synthese.tex; 30/10/2018; 18:04; p.1

http://arxiv.org/abs/0904.2540v3

2

1. Introduction

1.1. Background

Suppose you meet a Wise being (W) who tells you it has put $1,000 in boxA, and either $1 million or nothing in box B. This being tells you to eithertake the contents of box B only, or to take the contents of bothA and B.Suppose further that the being had put the $1 million in box B if a predictionalgorithm used by the being had said that you would take only B. If insteadthe algorithm had predicted you would take both boxes, thenW put nothingin box B.

Presume that due to determinism, there exists a perfectly accurate predic-tion algorithm, and assume that it is this perfect prediction algorithm thatWuses. Suppose further that when you choose which boxes to take, you don’tknow the prediction of that algorithm. What should your choice be?

In Table 1 we present this question as a game theory matrix involving W’sprediction and your choice. Two seemingly logical answers to your questioncontradict each other. The Realist answer is that you shouldtake both boxes,because your choice occurs afterW has already made its prediction, and sinceyou have free will, you’re free to make whatever choice you want, indepen-dent of that prediction thatW made. More precisely, ifW predicted you wouldtake A along with B, then taking both gives you $1,000 rather than nothing.If insteadW predicted you would take only B, then taking both boxes yields$1,001,000, which again is $1000 better than taking only B.

In contrast, the Fearful answer is thatW designed a prediction algorithmwhose answer will match what you do. So you can get $1,000 by taking bothboxes or get $1 million by taking only box B. Therefore you should take onlyB.

This is the conventional formulation of Newcomb’s Paradox,a famouslogical riddle stated by William Newcomb in 1960 (Nozick, 1969; Gard-ner, 1974; Bar-Hillel and Margalit, 1972; Jacobi, 1993; Hunter and Richter,1978; Campbell and Lanning, 1985; Levi, 1982; Geanakoplos,1997; Collins,2001; Piotrowski and Sladkowski, 2002; Burgess, 2004). Newcomb neverpublished the paradox, but had long conversations about it with with philoso-phers and physicists such as Robert Nozick and Martin Kruskal, along withScientific American’s Martin Gardner. Gardner said after his second Scien-tific American column on Newcomb’s paradox appeared that it generatedmore mail than any other column.

One of us (Benford) worked with Newcomb, publishing severalpaperstogether. We often discussed the paradox, which Newcomb invented to testhis own ideas. Newcomb said that he would just take B; why fighta God-like being? However Nozick said, “To almost everyone, it is perfectly clearand obvious what should be done. The difficulty is that these people seem to


Newcomb teaches us 3

divide almost evenly on the problem, with large numbers thinking that theopposing half is just being silly” (Nozick, 1969).

It was Nozick who pointed out that two accepted principles ofgame the-ory appear to conflict in Newcomb’s problem. The expected-utility principle,considering the probability of each outcome, says you should take box Bonly. But the dominance principle argues that if one strategy is always betterthan the other strategies no matter what other players do, then you shouldpick that strategy (Fudenberg and Tirole, 1991; Myerson, 1991; Osborne andRubenstein, 1994).) No matter what box B contains, you are $1000 richer ifyou take both boxes than if you take B only. So the dominance principle saysyou should take both boxes.

Is there really a contradiction? Some philosophers argue that a perfectpredictor implies a time machine, since with such a machine causality isreversed, i.e., for predictions made in the present to be perfectly determinedby events in the future means that the future causes past events.1

But Nozick stated the problem specifically to exclude backward causation(and so time travel), because his formulation demands only that the predic-tions be of high accuracy, not perfect. So arguments about time travel cannotresolve the issue.

Recently an extension to game theory has been introduced that providesa powerful tool for resolving apparent logical paradoxes concerning humanchoice. This tool formulates games in terms of Bayes nets. Here we illustratethis tool by using it to resolve Newcomb’s paradox. In a nutshell, it turns outthat Newcomb’s scenario does not fully specify the Bayes netunderlying thegame you andW are playing. The two “conflicting principles of game the-ory” actually correspond to two different underlying Bayes nets. As we show,those nets are mutually inconsistent. So there is no conflictof game theoryprinciples — simply imprecision in specifying the Bayes netunderlying thegame you andW and playing.

The full extension of game theory is more powerful than needed to resolveNewcomb’s paradox; a smaller extension would suffice. However using thefull extension to analyze a paradox often draws attention toaspects of theparadox that otherwise would escape notice. That is the casehere. We use thefull extension to show that the accuracy of the prediction algorithm in New-comb’s paradox, the focus of much previous work, is irrelevant. In additionwe show that the utility functions of you andW are actually irrelevant; theself-contradiction built in to Newcomb’s scenario has to dowith imprecisionin specification of the underlying Bayes net, and is independent of the utilityfunctions. We also show that Newcomb’s paradox is time-reversal invariant;

1 Interestingly, near when Newcomb devised the paradox, he also coauthored a paper prov-ing that a tachyonic time machine could not be reinterpretedin a way that precludes suchparadoxes (Benford et al., 1970). The issues of time travel and Newcomb-style paradoxes areintertwined.


4

both the paradox and its resolution are unchanged if the algorithm makes its‘prediction’ after you make your choice rather than before.

In the next section we present a simplified version of our fullargument.We then review the recent extension of game theory. In the following sec-tion we use extended game theory to provide our fully detailed resolution ofNewcomb’s paradox. We end with a discussion.

2. Decompositions of the reasoning of Fearful and Realist

There are two players in Newcomb’s paradox: you, and the wisebeingW.In addition, there are two game variables that are central tothe paradox:W’sprediction,g, and the choice you actually make,y. So the player strategies willinvolve the joint probability distribution relating thosetwo variables. Sincethere are only two variables, there are only two ways to decompose that jointprobability distribution into a Bayes net. These two decompositions turn outto correspond to the two recommendations for how to answer Newcomb’squestion, one matching the reasoning of Realist and one matching Fearful.

Defineab as the event thatW predicts you will take both boxes, andb asthe event thatW predicts you will only take boxB. Similarly defineABas theevent that you actually take both boxes, andB as the event that you actuallytake only boxB. So P(g = ab | y = AB) = P(ab | AB) is the probabilitythatW predicts correctly, given that you chooseAB. Similarly P(b | B) is theprobability thatW predicts correctly given that you choose onlyB.

Decision theory says that you should chooseP(y) so as maximize theassociated expected utility. To make that formal, first we express Fearful’sreasoning as a joint probability distribution overg andy:

P(y, g) = P(g | y)P(y)

whereP(g | y) is pre-fixed, byW’s prediction algorithm.Decision theory says that for this decomposition, you should setP(AB)

(and thereforeP(B) = 1− P(AB)) so as to maximize

1000[P(ab | AB)P(AB)] + 1001000[P(b | AB)P(AB)]

+ 0[P(ab | B)P(B)] + 1000000[P(b | B)P(B)].

Provided that the associated distributionsP(ab | AB) andP(b | B) are largeenough — providedW’s prediction algorithm is accurate enough — to achievethis maximization you should setP(B) = 1. In other words, you shouldchoose to take onlyB. This expresses Fearful’s “expected utility” reasoning.

On the other hand, under Realist’s reasoning, you can make your decisionentirely independently ofW’s prediction. This amounts to the assumption that

P(y, g) = P(y | g)P(g)

= P(y)P(g)



where you setP(y) andW setsP(g). For this decomposition, decision theorysays that you should chooseP(AB) to maximize

1000[P(ab)P(AB)] + 1001000[P(b)P(AB)]

+ 0[P(ab)P(B)] + 1000000[P(b)P(B)].

which is achieved by settingP(B) = 0, no matter whatP(g) is. This conclu-sion expresses the dominance principle of game theory.

This foregoing analysis illustrates how the reasoning of Fearful and Re-alist correspond to different decompositions of the joint probability of yourchoice andW’s choice. The simple fact that those two decompositions differis what underlies the resolution of the paradox — you can state Newcomb’sscenario in terms of Fearful’s reasoning, or in terms of Realist’s reasoning,but not both.2

The foregoing analysis, while quite reasonable, does not formally estab-lish that there is no other way to interpret Newcomb’s scenario besides thetwo considered. For example, one might worry that there is a way to mergethe joint probability decompositions of Fearful and Realist, to get a hybriddecomposition that would somehow better captures Newcomb’s scenario. Inaddition, it is not clear if Newcomb’s scenario should be viewed as a decisiontheory problem, in which caseW’s behavior is pre-fixed (as assumed above).Much of the literature instead casts Newcomb’s paradox as a game theoryproblem, to allowW to act as an antagonist (e.g., by givingW a utility func-tion that equals the negative of your utility function). In particular, viewingNewcomb’s scenario as a game raises questions like how the analysis changesif W can choose the accuracy of the prediction algorithm, ratherthan have itbe pre-fixed. Finally, the simplified analysis presented above does not drawattention to subtleties of Newcomb’s scenario like its time-invariance.

To overcome these shortcomings it is necessary to use extended gametheory. That is the topic of the next two sections.

3. Extended game theory

Central to Newcomb’s scenario is a prediction process, and its (in)fallibility.Recent work proves that any given prediction algorithm mustfail on at leastone prediction task (Binder, 2008; Wolpert, 2008; Wolpert,2010). That par-ticular result does not resolve Newcomb’s paradox. Howeverthe proof of thatresult requires an extension of conventional game theory. And as we showbelow, that extension of game theory provides a way to resolve Newcomb’sparadox.

2 We are grateful to an anonymous referee for emphasizing the underlying simplicity ofour analyzing Newcomb’s scenario by decomposing the reasoning of Fearful and Realist intodifferent joint probability distributions.


6

3.1. Extended games

To introduce extended game theory, consider a situation where there is a set ofseveral players,N , together with a set ofgame variables. A (profile) strategyis a set of one or more joint probability distributions over the game variables.Every player has their own set of strategies (i.e., their ownset of sets ofdistributions) that they can choose from. In general we indicate a memberof playeri’s strategy set bySi .

If the intersection of every possible joint choice of strategies by all theplayers is a single joint probability distribution, we say the game isproper.For a proper game, it doesn’t matter what choice each of the players inN

makes; their joint choice always specifies a single joint probability distribu-tion over the game variables. All the games considered in conventional gametheory are proper. For the rest of this subsection, we restrict attention to propergames.

Every player in an extended game whose profile strategy set contains ex-actly one strategy (i.e., one set of probability distributions) is called aNatureplayer. Note that the strategy “chosen” by a Nature player isautomatic.

Every non-Nature playeri has their ownutility function ui which maps anyjoint value of the game variables to a real number. So given any joint choice ofstrategies, in a proper game the associated (unique) joint distribution providesthe expected utility values of the non-Nature players.

To play the game, all the non-Nature players independently choose a strat-egy (i.e., choose a set of joint distributions over the game varaibles) from theirrespective strategy sets. Their doing this fixes the joint strategy, and thereforethe joint distribution over the game variables (if the game is proper), andtherefore the expected utilities of all the non-Nature players.

Game theory is the analysis of what strategies the non-Nature playerswill jointly choose in this situation, under different possible choice-makingprinciples. One such principle is the dominance principle,mentioned above,and another is the expected-utility principle, also mentioned above.

The following example shows how to formulate a game very popular inconventional game theory in terms of extended game theory:

Example 1. Matching pennies.

1. In the conventional formulation of the “Matching Pennies” game, thereare two players, Row and Col, both of whom are non-Nature players.There are also two game variables, XR and XC, with associated valuesxR, xC, both of which can be either0 or 1. To play the game player Rowchooses a probability distribution pR(xR) and player Col independentlychooses a probability distribution pC(xC). Taken together, those two dis-tributions define the joint distribution over the game variables accordingto the rule P(xR, xC) = P(xR)P(xC).



Player Row’s utility for any joint value (xR, xC) formed by sampling P(xR, xC)equals1 if xR = xC, 0 otherwise. Conversely, Player Col’s utility is0if xR = xC, 1 otherwise. So the expected utility of player Row underP(xR, xC) is ∑

xR,xC

pR(xR)pC(xC)δxR,xC ,

where the Kronecker delta functionδa,b equals1 if a = b, and0otherwise.Similarly, the expected utility of player Col is

∑

xR,xC

pR(xR)pC(xC)[1 − δxR,xC ].

2. We now show how to reformulate Matching Pennies as an extended game.Each of player Row’s strategies is specified by a single real number, pR.The strategy SpR associated with the value pR is the set of all joint distri-butions Pr(xR, xC) obeying Pr(xR, xC) = [pRδxR,1 + (1− pR)δxR,0]Pr(xC)]for some distribution Pr(xC). In other words, it is the set of all joint distri-butions P(xR, xC) whose marginal P(xR) is given by pR. Similarly, eachof player Col’s strategies SpC is specified by a single real number, pC.The strategy SpC associated with the value pC is the set of all joint distri-butions Pr(xR, xC) such that Pr(xR, xC) = Pr(xR)[pCδxC,1+ (1− pC)δxC ,0]for some distribution Pr(xR).

In this extended game formulation, to play the game, each player inde-pendently chooses a strategy. Whatever strategy SpR Player Row chooses,and whatever strategy SpC Player Col chooses, there is a unique jointdistribution consistent with their choices,

Pr(xR, xC) = SpR ∩ SpC

= [pRδxR,1 + (1− pR)δxR,0][ pCδxC,1 + (1− pC)δxC ,0]

The expected utilities of the two players are then given by their expectedutilities under SpR ∩ SpC .

A moment’s thought shows that the extended game formulationis identical tothe original formulation in (1).

3.2. Over-played and under-played games

If a game is not proper, then there is a joint strategy,S = {Si : i ∈ N }, suchthat one of two conditions holds:

1. ∩i∈N Si is empty. In this case we say the game isover-played.


8

2. ∩i∈N Si contains more than one distribution over the game variables. Inthis case we say the game isunder-played.

Note that one way to establish that∩i∈N Si = ∅ is to show that the sets ofconditional distributions{Si}, taken together, violate the laws of probability.(This is how we will do it below.) In over-played games, the players are notactually independent; there is a joint strategy choiceS by the players thatis impossible, and therefore it isnot the case that every playeri is free tochoose the associated strategySi without regard for the choices of the players.Accordingly, if one specifies a game theoretic scenario where ∩i∈N Si = ∅

for some joint strategy choiceS, one cannot at the same time presume thatthe players are independent. If you do so, then your specification of the gametheoretic scenario contradicts itself.

Conversely, in under-played games, there is a joint strategy choice by theplayers that does not uniquely determine the distribution over the game vari-ables. Accordingly, if one specifies a game theoretic scenario where∩i∈N Si

contains more than one element for some joint strategy choice S, then theoutcome of the game is not well-defined.

Often a “paradox” involving human choice is a game theoreticscenariothat is over-played, and therefore has a self-contradiction built into it, oris under-played, and therefore not well-defined. To resolveeither type ofparadox one simply requires that the game theoretic scenario be proper. Inparticular, this is how Newcomb’s paradox is resolved.

3.3. Representing extended game theory in terms of Bayes nets

A Bayes netover a set of variables{A, B,C, . . .} is a compact, graphical rep-resentation of a joint probability distribution over thosevariables (Koller andMilch, 2003; Pearl, 2000). Often in extended game theory it is convenient toexpress a player’s profile strategy set in terms of a Bayes net. In particular, ex-pressing strategy sets this way provides a convenient way ofestablishing thata game is (not) proper, and thereby analyzing apparent paradoxes involvinghuman choice.

Formally, any Bayes net contains three parts. The first part is a DirectedAcyclic Graph (DAG) comprising a set of vertices/ nodes,V, and an associ-ated set of directed edges,E ⊆ V × V. There are the same number of nodesin the DAG as there are variablesA, B,C, . . .. A nodeν′ in the DAG such thatan edge leads directly fromν′ into a nodeν is called aparentof ν. The set ofall parents ofν is written as pa(ν). Note that pa(ν) is the empty set for a rootnode.

The second part of a Bayes net is a one-to-one onto functionχ from theset of nodes of the DAG to the set of variablesA, B,C, . . .. Soχ labels each ofthe nodes of the DAG with one and only one of the variablesA, B,C, . . .. Thefinal part of the Bayes net is a set of conditional distributions, one for each



nodeν ∈ V of the DAG. The conditional distribution atν is a specificationof P(χ(ν) | χ[pa(ν)]), whereχ[pa(ν)] is the variables associated with thoseparent nodes.

It is straight-forward to see that any set of all such conditional distributionsin the Bayes net uniquely specifies a joint distribution:

P(A, B,C, . . .) =∏

ν∈V

P(χ(ν) | χ[pa(ν)]) (1)

Conversely, any joint distribution can be expressed as a Bayes net. (Indeed, ingeneral any joint distribution can be expressed as more thanone Bayes net.)

Many games can be very naturally defined in terms of a Bayes net. Let(V,E) be the nodes and edges of the Bayes net. To define the game, thenodesV are partitioned among the players. So playeri is assigned some subsetVi ⊆ V, for all i, j , i, Vi ∩ V j = ∅ and∪iVi = V. A profile strategy ofplayer i is simply the specificationpVi of all the conditional distributions atthe nodes inVi . More precisely, given apVi , the associated profile strategyis all joint distributions represented by a Bayes net with DAG (V,E) thathave the conditional distributionspVi at the nodes inVi, where the remainingconditional distributions are unspecified.

We are guaranteed that any game defined in terms of a Bayes net this wayis proper. In other words,any joint profile strategy of the players correspondsto one and only one joint distribution over the game variables. This is becausea joint strategy fixes the conditional distributions at all nodes inVi for all i— which means it fixes all the conditional distributions. So ajoint strategyuniquely specifies the full Bayes net. Eq. 1 then maps that Bayes net to the(unique) joint distribution over the game variables.

Example 2. In the Matching Pennies scenario, there are two variables, xR

and xC. Therefore V consists of two nodes,νR and νC, corresponding to xRand xC respectively. In addition, the two variables are statistically indepen-dent, i.e., P(xR, xC) = P(xR)P(xC). Therefore in the DAG there are no edgesconnecting the two nodes.

A strategy of player Row is the specification of the conditional distributionP(χ(νR) | χ[pa(νR)]). SinceνR has no parents, this is just the distributionP(χ(νR)). Similarly, a strategy of player Col is just the distribution P(χ(νC)).Given the DAG of the game, plugging into Eq. 1 shows that a joint distributionis determined from a joint strategy by P(χ(νR), χ(νC)) = P(χ(νR))P(χ(νC)).

In contrast, say that Row player had moved first, and there wasa noisysensor making an observation D of that move, and that Col chose their movebased on the value of D. Then the Bayes net would have three nodes,νR, νDand νC, corresponding to the three variables. The nodeνR would have noparents, the nodeνD would have (only)νR as its parent, and the nodeνCwould have (only) D as its parent. We can have Row and Col be non-Nature


10

players, and the sensor be a Nature player. So the Row player would set

P(χ(νR) | χ[pa(νR)]) = P(χ(νR)),

by choosing a distribution from its strategy set. In addition a Nature playerwould set

P(χ(νD) | χ[pa(νD)]) = P(χ(νD) | χ(νR)),

to the single strategy in its strategy set. Finally, the Col player would set

P(χ(νC) | χ[pa(νC)]) = P(χ(νC) | χ(νD)).

by choosing a distribution from its strategy set.

3.4. Conflicting DAG’s

All of the games arising in conventional game theory are formulated as aBayes net game involving a single underlying DAG. Accordingly, all thosegames are proper.

Often in a game theory paradox it is implicitly presumed thatthe game isproper, just like conventional games. In particular, it is often presumed thatthe players are free to choose their strategies independently of one another.However in contrast to the case with games in conventional game theory,often in these paradoxes the strategy sets of the different players are definedin terms of different Bayes net DAGs. Due to this, the presumption of playerindependence might be violated. Showing that this is the case — showingthere are joint choices of conditional distributions by theplayers that arenot simultaneously consistent with any joint distribution— shows that theimplicit premise in the paradox that the players are independent is impos-sible. Resolving such an apparent paradox simply means requiring that theassociated game be stated in terms of a single Bayes net DAG, so that it isproper, and that therefore all premises actually hold.

It is exactly such reasoning that underlies the fallibilityof prediction provenin (Binder, 2008; Wolpert, 2008; Wolpert, 2010). As we show below, thisreasoning also resolves Newcomb’s paradox. Loosely speaking, Newcomb’sscenario involves two Bayes nets, one corresponding to Realist’s reasoning,and one corresponding to Fearful’s reasoning. One can choose one Bayesnet or the other, but to avoid inconsistency, one cannot choose both. In short,Realist and Fearful implicitly formalize Newcomb’s scenario in different, andincompatible ways.

In the next section, we first show how to express the reasoningof Realistand of Fearful in terms of strategy spaces over two different Bayes net DAGs.We then show that those two Bayes net DAGs and associated strategy spacescannot be combined without raising the possibility of inconsistent joint distri-butions between the Bayes nets. This resolves Newcomb’s paradox, without



relying on mathematizations of concepts that are vague and/or controversial(e.g., ‘free will’, ’causality’). In addition, our resolution shows that there isno conflict in Newcomb’s scenario between the dominance principle of gametheory and the expected utility principle. Once one takes care to specify theunderlying Bayes net game, those principles are completelyconsistent withone another.

We end by discussing the implications of our analysis. In particular, ouranalysis formally establishes the (un)importance ofW’s predictive accuracy.It also shows that the utility functions of you andW are irrelevant. Fur-thermore, it provides an illustrative way to look at Newcomb’s scenario byrunning that scenario backwards in time.

4. Resolving Newcomb’s paradox with extended game theory

There are two players in Newcomb’s paradox: you, and the wisebeingW.In addition, there are two game variables that are central tothe paradox:W’sprediction,g, and the choice you actually make,y. So the player strategies willinvolve the joint probability distribution relating thosetwo variables. Sincethere are only two variables, there are only two ways to decompose that jointprobability distribution into a Bayes net. These two decompositions turn outto correspond to the two recommendations for how to answer Newcomb’squestion, one matching the reasoning of Realist and one matching Fearful.

4.1. The first decomposition of the joint probability

The first way to decompose the joint probability is

P(y, g) = P(g | y)P(y) (2)

(where we define the right-hand side to equal 0 for anyy such thatP(y) = 0).Such a decomposition is known as a ‘Bayes net’ having two ‘nodes’ (Pearl,2000; Koller and Milch, 2003). The variabley, and the associated uncondi-tioned distribution,P(y), is identified with the first, ‘parent’ node. The vari-able g, and the associated conditional distribution,P(g | y), is identifiedwith the second, ‘child’ node. The stochastic dependence ofg on y can begraphically illustrated by having a directed edge go from a node labeledy toa node labeledg.

This Bayes net can be used to express Fearful’s reasoning. Fearful inter-prets the statement that ‘W designed a perfectly accurate prediction algo-rithm’ to imply that W has the power to set the conditional distribution inthe child node of the Bayes net,P(g | y), to anything it wants (for ally suchthat P(y) , 0). More precisely, since the algorithm is ‘perfectly accurate’,Fearful presumes thatW chooses to setP(g | y) = δg,y, the Kronecker delta


12

function that equals 1 ifg = y, zero otherwise. So Fearful presumes that thereis nothing you can do that can affect the values ofP(g | y) (for all y such thatP(y) , 0). Instead, you get to choose the unconditioned distribution in theparent node of the Bayes net,P(y). Intuitively, this choice constitutes your‘free will’.

Fearful’s interpretation of Newcomb’s paradox specifies what aspect ofP(y, g) you can choose, and what aspect is instead chosen by W. Thosechoices— P(y) and P(g | y), respectively — are the ‘strategies’ that you andWchoose. It is important to note that these strategies by you and W do notdirectly specify the two variablesy andg. Rather the strategies you andWchoose specify two different distributions which, taken together, specify thefull joint distribution overy and g (Koller and Milch, 2003). This kind ofstrategy contrasts with the kind considered in decision theory (Berger, 1985)or causal nets (Pearl, 2000), where the strategies are direct specifications ofthe variables (which here areg andy).

In both game theory and decision theory, by the axioms of VonNeumannMorgenstern utility theory, your task is to choose the strategy that maximizesyour expected payoff under the associated joint distribution.3 For Fearful,this means choosing theP(y) that maximizes your expected payoff underthe P(y, g) associated with that choice. Given Fearful’s presumptionthat theBayes net of Eq. 2 underlies the game withP(g | y) = δg,y, and that yourstrategy is to set the distribution at the first node, for you to maximize ex-pected payoff your strategy should beP(y) = δy,B. In other words, you shouldchooseB with probability 1. Your doing so results in the joint distributionP(y, g) = δg,yδy,B = δg,Bδy,B, with payoff 1, 000, 000. This is the formaljustification of Fearful’s recommendation.4

Note that since this analysis treats the strategy set ofW as being a singleset of joint distributions, it can be interpreted as either adecision-theoreticscenario or a game-theoretic one, withW being a Nature player.

4.2. The second decomposition of the joint probability

The second way to decompose the joint probability is

P(y, g) = P(y | g)P(g) (3)

3 For simplicity, we make the common assumption that utility is a linear function ofpayoff (Starmer, 2000). So maximizing expected utility is the sameas maximizing expectedpayoff.

4 In Fearful’s Bayes net,y ‘causally influencesg’, to use the language of causal nets (Pearl,2000). To cast this kind of causal relationship in terms of conventional game theory, we wouldhave to replace the single-stage game in Table 1 with a two-stage game in which you first sety, andthen Wsets their strategy, having observedy. This two-stage game is incompatible withNewcomb’s stipulation thatW sets their strategy before you do, not after. This is one of thereasons why it is necessary to use extended game theory rather than conventional game theoryto formalize Fearful’s reasoning.



(where we define the right-hand side to equal 0 for anyg such thatP(g) = 0).In the Bayes net of Eq. 3, the unconditioned distribution identified with theparent node isP(g), and the conditioned distribution identified with the childnode isP(y | g). This Bayes net can be used to express Realist’s reasoning.Realist interprets the statements that ‘your choice occursafterW has alreadymade its prediction’ and ‘when you have to make your choice, you don’t knowwhat that prediction is’ to mean that you can choose any distribution h(y) andthen setP(y | g) to equalh(y) (for all gsuch thatP(g) , 0). This is how Realistinterprets your having ‘free will’. (Note that this is a different interpretationof ‘free will” from the one made by Fearful.) Under this interpretation,Whas no power to affect P(y | g). RatherW gets to set the parent node in theBayes net,P(g). For Realist, this is the distribution that you cannot affect. (Incontrast, in Fearful’s reasoning, you set a non-conditional distribution, and itis the conditional distribution that you cannot affect.)

Realist’s interpretation of Newcomb’s paradox specifies what it is youcan set concerningP(y, g), and what is set byW. Just like under Fearful’sreasoning, under Realist’s reasoning the ‘strategies’ youand W choose donot directly specify the variablesg and y. Rather the strategies of you andW specify two distributions which, taken together, specify the full joint dis-tribution. As before, your task is to choose your strategy — which now ish(y) — to maximize your expected payoff under the associatedP(y, g). GivenRealist’s presumption that the Bayes net of Eq. 3 underlies the game andthat you get to seth, you should chooseh(y) = P(y | g) = δy,AB, i.e., youshould chooseABwith probability 1. Doing this results in the expected payoff1, 000P(g = AB) + 1, 001, 000 P(g = B), which is your maximum expectedpayoff no matter what the values ofP(g = AB) andP(g = B) are. This is theformal justification of Realist’s recommendation.

Note that in Fearful’s interpretation, your strategy is choosing a single realnumber, while W chooses two real numbers. In contrast, in Realist’s interpre-tation, your strategy is still to choose a single real number, but now W alsochooses only a single real number. The different interpretations correspond todifferent games.5

4.3. Combining the decompositions

The original statement of Newcomb’s question is somewhat informal. Asoriginally stated (and commonly analyzed), Newcomb’s question does notspecify whether you are playing the game that Feaful thinks is being played,

5 In Realist’s Bayes net, given the associated restricted possible form ofP(y | g), there isno edge connecting the node forg with the node fory. In other words,P(y,g) = P(y)P(g).This means thatg andy are ‘causally independent’, to use the language of causal nets (Pearl,2000). This causal relationship is consistent with the single-stage game in Table 1, in contrastto the causal relationship of the game played under Fearful’s interpretation, which requires atwo-stage game.


14

or the game that Realist thinks is being played. What happensif we try toformulate Newcomb’s question more formally?

In light of extended game theory, there are three ways we could imaginedoing this. Let{SF

y } be your strategy set in the game Fearful thinks is beingplayed, and{SR

y } be your strategy set in the game Realist thinks is beingplayed. Similarly, let{SF

W} be W’s strategy set in the game Fearful thinksis being played, and{SR

W} be W’s strategy set in the game Realist thinks isbeing played.

1. The most natural way to accommodate the informality of Newcomb’squestion is to define an extended game where your strategy setis thecombination of what it is in Fearful’s games and in Realist’sgame, i.e.,where your strategy set is{SF

y } ∪ {SRy }, and similarly whereW’s strategy

set is{SFW} ∪ {S

RW}.

There are two additional ways it might be possible to merge the games ofFearful and Realist:

2. Have your strategy set be{SFy }, while W’s strategy set is{SR

W}.

3. Have your strategy set be{SRy }, while W’s strategy set is{SF

W}.

Are any of these extended games proper? If not, then the only way ofinterpreting Newcomb’s scenario is as the game Fearful supposes or as thegame Realist supposes. There would be no way to combine thosetwo games.

We start by analyzing scenario [1]. IfP(g | y) is set by W to beδg,y for all ysuch thatP(y) , 0, as under Fearful’s presumption, then your (Realist) choiceof h affectsP(g). In fact, your choice ofh fully specifiesP(g).6 This contra-dicts Realist’s presumption that it isW’s strategy that setsP(g), independentof you. So scenario [1] is impossible; it is over-played, in the terminology ofSec. 3.2.

Now consider scenario [2]. In this scenario, any strategySy specifies thedistribution P(y) only, with P(g | y) being arbitrary. In other words, such astrategy consists of all joint distributionsP(y, g) with the specified marginaldistributionP(y). Similarly, any strategySW consists of all joint distributionsP(y, g) with a specified marginalP(g) and P(y, g) = h(y)P(g) for someh.For any pair of such strategies (Sy,SW), there is a unique joint distributionin Sy ∩ SW: the product distributionP(y)P(g) whereSy setsP(y) andSg setsP(g). However this is just Realist’s extended game; it is not a new extendedgame.

6 For example, make Fearful’s presumption be thatP(g | y) = δg,y for all y such thatP(y) , 0. Given this, if you seth(y) = δy,AB, thenP(g) = δg,AB, and if you seth(y) = δy,B, thenP(g) = δg,B.



We end by presenting two arguments that establish the impossibility ofscenario [3]. First, we will show that ifW setsP(g | y) for Fearful’s Bayesnet appropriately, then some of your strategiesh(y) for Realist’s Bayes netbecome impossible. In other words, for those strategy choices byW and byyou, the joint distribution over the game variables for Realists’s Bayes netcontradicts the joint distribution over the game variablesfor Fearful’s Bayesnet. (This is true for almost anyP(g | y) thatW might choose, and in particularis true even ifW does not predict perfectly.)

To begin, pick anyα ∈ (1/2, 1], and presume thatP(g | y) is set byW tobeαδg,y + (1 − α)(1 − δg,y) (for all y such thatP(y) , 0). So for example,under Fearful’s interpretation, whereW uses a perfectly error-free predictionalgorithm,α = 1. Given this presumption, the only way thatP(y | g) canbeg-independent (for allg such thatP(g) , 0) is if it is one of the two deltafunctions,δy,AB or δy,B. (See the appendix for a formal proof.) This contradictsRealist’s interpretation of the game, under which you can set P(y | g) to anyh(y) you desire.7

Conversely, if you can setP(y | g) to be an arbitraryg-independent distri-bution (as Realist presumes), then what you set it to may affect P(g | y) (inviolation of Fearful’s presumption thatP(g | y) is set exclusively by W). Inother words, if your having ‘free will’ means what it does to Realist, then youhave the power to change the prediction accuracy ofW (!). As an example,if you setP(y = AB | g) = 3/4 for all g’s such thatP(g) , 0, thenP(g | y)cannot equalδg,y. So scenario [3] is also improper; ; it is over-played, in theterminology of Sec. 3.2.

Combining our analyses of scenarios [1] through [3] shows that there aretwo ways to formalize Newcomb’s scenario. You can be free to set P(y) how-ever you want, withP(g | y) set byW, as Fearful presumes,or, as Realistpresumes, you can be free to setP(y | g) to whatever distributionh(y) youwant, withP(g) set byW. It is not possible to play both games simultaneously.

The resolution of Newcomb’s paradox is now immediate. When formal-izing Newcomb’s scenario, we must pick either Realist’s game or Fearful’s.Once we pick the game, the optimal choice of you is perfectly well-defined.The only reason that there appears to be a paradox is that the way Newcomb’squestion is phrased leads one to think that somehow those twogames arebeing combined — and in fact combining the games leads to mathematicalimpossibilities.

7 Note that of the twoδ functions you can choose in this combined decomposition, itisbetter for you to chooseh(y) = δy,B, resulting in a payoff of 1,000, 000. So your optimalresponse to Newcomb’s question for this variant is the same as if you were Fearful.


16

4.4. Discussion

It is important to emphasize that the impossibility of playing both gamessimultaneously arises for almost any error rateα in theP(g | y) chosen byW,i.e., no matter how accuratelyW predicts. This means that the stipulation inNewcomb’s paradox thatW predicts perfectly is a red herring. (Interestingly,Newcomb himself did not insist on such perfect prediction inhis formulationof the paradox, perhaps to avoid the time paradox problems.)The crucialimpossibility implicit in Newcomb’s question is the idea that at the same timeyou can arbitrarily specify ‘your’ distributionP(y | g) and W can arbitrarilyspecify ‘his” distributionP(g | y). In fact, neither of you two can set yourdistribution without possibly affecting the other’s distribution; you andW areinextricably coupled.

We also emphasize that vague concepts (e.g., ‘free will’) orcontrover-sial ones (e.g., ‘causality’) are not relevant to the resolution of Newcomb’sparadox; it is not necessary to introduce mathematizationsof those conceptsto resolve Newcomb’s paradox. The only mathematics needed is standardprobability theory, together with the axioms of VonNeumannMorgensternutility theory.

Another important contribution of our resolution is that itshows that New-comb’s scenario does not establish a conflict between game theory’s dom-inance principle and its expected utility principle, as some have suggested.Indeed, as mentioned, we adopt the standard expected utility axioms of gametheory throughout our analysis. However nowhere do we violate the domi-nance principle.

The key behind our avoiding a conflict between those two principles is ourtaking care to specify what Bayes net underlies the game. Realist’s reasoningappearsto follow the dominance principle, and Fearful’s to follow the princi-ple of maximizing expected utility. (Hence the conflicting answers of Realistand Fearful appear to illustrate a conflict between those twoprinciples.) How-ever Realist is actually following the principle of maximizing expected utilityfor that Bayes net game for which Realist’s answer is correct. In contrast,Realist’s reasoning is an unjustified violation of that principle for the Bayesnet game in which Realist’s answer is incorrect. (In particular, Realist’s an-swer doesnot follow from the dominance principle in that Bayes net game.)It is only by being sloppy in specifying the underlying Bayesnet game that itappears that there is a conflict between the expected utilityprinciple and thedominance principle.

Note also that your utility function is irrelevant to the paradox and itsresolution. The fundamental contradiction built into Newcomb’s scenario in-volves the specification of the strategy sets of you andW. The associatedutility functions never appear.



Another note-worthy point is that no time variable occurs inour analysisof Newcomb’s paradox. Concretely, nothing in our analysis using Bayes netsand associated games requires thatW’s prediction occur before your choice.This lack of a time variable in our analysis means we can assign times tothe events in our analysis any way we please, and the analysisstill holds; theanalysis is time-reversal invariant.

This invariance helps clarify the differences in the assumptions made byRealist and Fearful. The invariance means that both the formal statement ofthe paradox and its resolution are unchanged if the prediction occursafteryour choice rather than (as is conventional) before your choice. In otherwords,W could use data accumulated up to a timeafteryou make your choiceto ‘retroactively predict’ what your choice was. In the extreme case, the ‘pre-diction’ algorithm could even directly observe your choice. All of the mathe-matics introduced above concerning Bayes nets, and possible contradictionsstill holds.

In particular, in this time-reversed version of Newcomb’s scenario, Fearfulwould be concerned thatW canobservehis choice with high accuracy. (That’swhat it means to haveP(g | y) be a delta function whose argument is set bythe value ofy.) Formally, this is exactly the same as the concern of Fearfulin the conventional Newcomb’s scenario thatW canpredict his choice withhigh accuracy.

In contrast, in the time-reversed version of Newcomb’s scenario, Realistwould believe that he can guarantee that his choice is independent of whatW says he chooses. (That’s what it means to haveP(y | g) equal someh(y),independent ofg.) In essence, he assumes that you can completely hide yourchoice fromW, so thatW can only guess randomly what choice you made.Formally, this assumption is exactly the same as the ‘free will’ belief of Real-ist in the conventional Newcomb’s scenario that you can forceW’s predictionto be independent of your choice.

In this time-reversed version of Newcomb’s scenario, the differences be-tween the assumptions of Realist and Fearful are far starkerthan in the con-ventional form of Newcomb’s scenario, as is the fact that those assumptionsare inconsistent. One could use this time-reversed versionto try to arguein favor of one set of assumptions or the other (i.e., argue infavor of oneBayes net game or the other). Our intent instead is to clarifythat Newcomb’squestion does not specify which Bayes net game is played, andtherefore isnot properly posed. As soon as one or the other of the Bayes netgames isspecified, then Newcomb’s question has a unique, correct answer.


18

5. Conclusion

Newcomb’s paradox has been so vexing that it has led some to resort to non-Bayesian probability theory in their attempt to understandit (Gibbard andHarper, 1978; Hunter and Richter, 1978), some to presume that payoff mustsomehow depend on your beliefs as well as what’s under the boxes (Geanako-plos, 1997), and has even even led some to claim that quantum mechanicsis crucial to understanding the paradox (Piotrowski and Sladkowski, 2002).This is all in addition to work on the paradox based on now-discreditedformulations of causality (Jacobi, 1993).

Our analysis shows that the resolution of Newcomb’s paradoxis in factquite simple. Newcomb’s paradox takes two incompatible interpretations ofa question, with two different answers, and makes it seem as though they arethe same interpretation. The lesson of Newcomb’s paradox isjust the ancientverity that one must carefully define all one’s terms.

Acknowledgements We would like to thank Mark Wilber, Charlie Straussand an anonymous referee for helpful comments.



Choose AB Choose B

Predict AB: 1000 0

Predict B: 1, 001, 000 1, 000, 000

Table 1: The payoff to you for the four combinations of your choice andW’sprediction.


20

APPENDIX

In the text, it is claimed that if for someα ∈ (1/2, 1], W can ensure thatP(g | y) = αδg,y + (1 − α)(1 − δg,y) for all y such thatP(y) , 0, and if inadditionP(y | g) is g-independent for allg such thatP(g) , 0, then in factP(y | g) is one of the two delta functions,δy,AB or δy,B. This can be seen justby examining the general 2×2 table of values ofP(g, y) that is consistent withthe first hypothesis, thatP(g | y) = αδg,y + (1− α)(1− δg,y):

y = AB y = B

g = AB: αzAB (1− α)zB

g = B: (1− α)zAB αzB

Table 2: The probabilities of the four combinations of your choicey andW’spredictiong, given thatP(g | y) = αδg,y + (1 − α)(1 − δg,y) for all y withnon-zeroP(y). Both zAB andzB are in [0, 1], and normalization means thatzAB+ zB = 1.

If both zAB , 0 andzB , 0, then neitherP(g = AB) nor P(g = B)equals zero. Under such a condition, the second hypothesis,that P(y | g)is g-independent for allg such thatP(g) , 0, would mean thatP(y | g) is thesame function ofy for bothg’s. This in turn means that

αzAB

(1− α)zB=

(1− α)zAB

αzB(4)

which is impossible given our bounds onα. Accordingly, eitherzAB or zB

equals zero.QED



References

Bar-Hillel, M. and A. Margalit: 1972, ‘Newcomb’s paradox revisited’. British Journal ofPhilosophy of Science23, 295–304.

Benford, G., D. Book, andW. Newcomb: 1970.Physical Review D2, 263.Berger, J. M.: 1985,Statistical Decision theory and Bayesian Analysis. Springer-Verlag.Binder, P.: 2008, ‘Theories of almost everything’.Nature455, 884–885.Burgess, S.: 2004, ‘The Newcomb problem: An unqualified resolution’. Synthese138, 261–

287.Campbell, R. and S. Lanning: 1985,Paradoxes of Rationality and Cooperation: Prisoners’

Dilemma and Newcomb’s Problem. University of British Columbia Press.Collins, J.: 2001, ‘Newcomb’s Problem’. In:International Encyclopedia of the Social and

Behavioral Sciences. Elsevier Science.Fudenberg, D. and J. Tirole: 1991,Game Theory. Cambridge, MA: MIT Press.Gardner, M.: 1974, ‘Mathematical Games’.Scientific Americanp. 102.Geanakoplos, J.: 1997, ‘The Hangman’s Paradox and Newcomb’s Paradox as Psychological

Games’. Yale Cowles Foundation paper, 1128.Gibbard, A. andW. Harper: 1978, ‘Counterfactuals and two kinds of expected utility’. In:

Foundations and applications of decision theory. D. Reidel Publishing.Hunter, D. and R. Richter: 1978, ‘Counterfactuals and Newcomb’s paradox’. Synthese39,

249–261.Jacobi, N.: 1993, ‘Newcomb’s paradox: a realist resolution’. Theory and Decision35, 1–17.Koller, D. and B. Milch: 2003, ‘Multi-agent influence diagrams for representing and solving

games’.Games and Economic Behavior45, 181–221.Levi, I.: 1982, ‘A Note on Newcombmania’.Journal of Philosophy79, 337–342.Myerson, R. B.: 1991,Game theory: Analysis of Conflict. Harvard University Press.Nozick, R.: 1969, ‘Newcomb’s Problem and Two principles of Choice’. In:Essays in Honor

of Carl G. Hempel. p. 115, Synthese: Dordrecht, the Netherland.Osborne, M. and A. Rubenstein: 1994,A Course in Game Theory. Cambridge, MA: MIT

Press.Pearl, J.: 2000,Causality: Models, Reasoning and Inference. Cambridge University Press.Piotrowski, E. W. and J. Sladkowski: 2002, ‘Quantum solution to the Newcomb’s paradox’.

http://ideas.repec.org/p/sla/eakjkl/10.html.Starmer, C.: 2000, ‘Developments in Non-Expected Utility Theory: the Hunt for a Descriptive

Theory of Choice under Risk’.Journal of Economic Literature38, 332–382.Wolpert, D. H.: 2008, ‘Physical limits of inference’.Physica D237, 1257–1281. More recent

version at http://arxiv.org/abs/0708.1362.Wolpert, D. H.: 2010, ‘Inference concerning physical systems’. In: Proceedings of CiE 2010.

Springer.


what does newcomb’s paradox teach us? - arxiv · on that deduction. newcomb’s paradox is that...

Documents