
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Applied Statistics
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/cjas20

The malicious host: a minimax solution of the Monty Hall problem
Jan C. Schuller (a)
(a) Organisation for Research and Treatment of Cancer (EORTC), Av. E. Mounier 83/11, 1200, Brussels, Belgium

Version of record first published: 23 May 2011

To cite this article: Jan C. Schuller (2012): The malicious host: a minimax solution of the Monty Hall problem, Journal of Applied Statistics, 39:1, 215-221

To link to this article: http://dx.doi.org/10.1080/02664763.2011.580337


Journal of Applied Statistics
Vol. 39, No. 1, January 2012, 215–221

The malicious host: a minimax solution of the Monty Hall problem

Jan C. Schuller*

Organisation for Research and Treatment of Cancer (EORTC), Av. E. Mounier 83/11, 1200 Brussels, Belgium

(Received 11 August 2010; final version received 7 April 2011)

The classic solution of the Monty Hall problem tacitly assumes that, after the candidate has made his/her first choice, the host always shows the candidate a losing door not initially chosen by the candidate and then allows the candidate to switch doors. In view of actual TV shows, it seems a more credible assumption that the host may or may not allow switching. Under this assumption, possible strategies for the candidate are discussed, with respect to a minimax solution of the problem. In conclusion, the classic solution does not necessarily provide good guidance for a candidate on a game show. It is argued that the popularity of the problem is due to its incompleteness.

Keywords: Monty Hall problem; conditional probability; minimax; simulation; incompleteness

1. Introduction

The Monty Hall problem, originally posed in a letter to The American Statistician [10], reads as follows.

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice? [9, p. 16]

This problem has gained considerable attention, since it provides a somewhat counterintuitive example of conditional probabilities [1,2,7]. Provided that the car is behind any of the three doors with equal probability and the host always opens a remaining door with a goat, switching wins with p = 2/3 and staying with the first choice wins with p = 1/3. Thus, switching is to the advantage of the candidate. We call this the “classic” solution of the Monty Hall problem.

While reviewing game shows on TV, one might challenge the assumptions that yield the classic solution. This is particularly true for the role of the host. In contemporary TV shows, it is quite common that the host plays all kinds of tricks on the candidate. Suppose the host shows a malicious smile when he/she opens a door. One might well suspect that he/she only opened it because the candidate’s first choice was correct and the host now wants to seduce the candidate to switch and lose. Likewise, the host might show a warm, merciful and assuring expression when he/she opens the door, thereby perhaps indicating that the first choice of the candidate was wrong, in which case switching always wins. Accordingly, we assume that the candidate does not know in advance whether the door with the goat will be opened. That the host might not always offer a switch was discussed early but not further elaborated. We will formalize these cases using Bayes’ theorem and discuss promising rules of action for the candidate, with particular attention to a minimax strategy that aims to minimize the maximum loss. In the context of the Monty Hall problem, the maximum loss corresponds to the maximum probability of not winning the car.

*Email: [email protected]

ISSN 0266-4763 print/ISSN 1360-0532 online © 2012 Taylor & Francis http://dx.doi.org/10.1080/02664763.2011.580337 http://www.tandfonline.com

2. The classic solution

Suppose the car is hidden behind one of the doors with equal probability and behind the other two doors are goats. The candidate chooses a door, which remains closed. Then, the host (who knows what is behind the doors) opens one of the two remaining doors that hides a goat. The host does not open the door chosen by the candidate, nor does he/she open the door with the car. If the host can choose between two doors with goats, he/she does so with equal probability. Thereafter, the candidate has the option to switch to the remaining door or to stay with his/her first choice.

We ask for the probability that the candidate’s first choice was wrong, given that the host opens a door with a goat and thus allows switching.

Notation and definitions:

FR: The candidate’s first choice is correct (car).

¬FR: The candidate’s first choice is wrong (goat).

O: The host opens a door with a goat and the candidate has the option to switch.

P(FR|O): Probability that the candidate’s first choice is right, given that the host opens a door with a goat and thus allows the candidate to switch. This is equivalent to the probability that switching loses and staying wins.

P(¬FR|O): Probability that the candidate’s first choice is wrong, given that the host opens a door with a goat and thus allows the candidate to switch. This is equivalent to the probability that switching wins and staying loses.

P(O|FR): Probability that the host opens a losing door not chosen by the candidate, given the candidate’s first choice was correct. Since it may reflect the host’s malevolence toward the candidate, we denote this conditional probability by m.

P(O|¬FR): Probability that the host opens a losing door not chosen by the candidate, given the candidate’s first choice was wrong. Since it may reflect the host’s benevolence toward the candidate, we denote this conditional probability by b.

In the classic game, the host always gives the hint, thus P(O) = 1, and since P(FR) = 1/3,

$$
P(FR \mid O) = \frac{P(O \mid FR)\,P(FR)}{P(O)} = \frac{1}{3}, \tag{1}
$$

and since P(¬FR) = 2/3, we find

$$
P(\neg FR \mid O) = \frac{P(O \mid \neg FR)\,P(\neg FR)}{P(O)} = \frac{2}{3}, \tag{2}
$$

which is the classic solution: switching is better for the candidate and staying is not advisable.
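The classic result can also be checked empirically. The following is a minimal sketch, not part of the original article: it simulates the classic rule described above (the host always opens a goat door that the candidate did not choose). The function names `play_classic` and `win_rate`, the door numbering, the sample size and the seed are illustrative assumptions.

```python
import random

def play_classic(switch, rng):
    """Play one round of the classic game; return True if the candidate wins."""
    doors = [0, 1, 2]
    car = rng.choice(doors)      # car is behind each door with probability 1/3
    first = rng.choice(doors)    # candidate's first choice
    # The host opens a goat door that is neither the first choice nor the car;
    # if two such doors exist, one is picked with equal probability.
    opened = rng.choice([d for d in doors if d != first and d != car])
    if switch:
        final = next(d for d in doors if d != first and d != opened)
    else:
        final = first
    return final == car

def win_rate(switch, n=100_000, seed=1):
    """Share of n simulated rounds won with the given fixed strategy."""
    rng = random.Random(seed)
    return sum(play_classic(switch, rng) for _ in range(n)) / n

if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

With a large number of rounds, the two estimates should lie close to the values 1/3 and 2/3 of Equations (1) and (2).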

3. The malicious host

According to the classic rule, the host always opens a losing door once the candidate has made his/her first choice. As discussed in Section 1, this rule may not be realistic if we aim to model true TV shows. Thus, we amend this rule as follows.


The host may or may not open a remaining losing door after the candidate made his/her first choice. If the host does open, the candidate has the option to switch. If the host chooses not to open a door, the candidate has to stay with his/her first choice.

Suppose the host opens a losing door with p = 2/3 if the candidate’s first choice was right. This conditional probability, P(O|FR), may reflect the degree of the host’s malevolence toward the candidate and we thus denote it by m = 2/3. Likewise, the host may open a losing door with p = 1/3 if the candidate’s first choice was wrong. This conditional probability, P(O|¬FR), is the host’s degree of benevolence, and we write b = 1/3.

According to the law of total probability:

P(O) = m ∗ P(FR) + b ∗ P(¬FR). (3)

Substituting Equation (3) for P(O) in Equation (1), and since P(O|FR) = m and P(O|¬FR) = b, we get for m = 2/3 and b = 1/3:

$$
P(FR \mid O) = \frac{P(FR)\, m}{P(FR)\, m + P(\neg FR)\, b} \tag{4}
$$

$$
= \frac{1/3 \cdot 2/3}{1/3 \cdot 2/3 + 2/3 \cdot 1/3} \tag{5}
$$

$$
= \frac{1}{2}. \tag{6}
$$

Thus, in contrast to the classic solution, if the host has a malevolence of m = 2/3 and a benevolence of b = 1/3, switching will only win with p = 1/2 instead of p = 2/3.
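This value can be cross-checked by simulating the amended rule directly. The sketch below is an illustration under stated assumptions rather than the author's code: `estimate_posterior`, the sample size and the seed are hypothetical choices. It draws whether the first choice is right (probability 1/3), lets the host open a door with probability m or b accordingly, and reports the share of offered switches in which the first choice was right.

```python
import random

def estimate_posterior(m, b, n=200_000, seed=1):
    """Estimate P(FR|O) by simulation under the amended rule.

    m = P(O|FR) and b = P(O|not FR), as defined in the text.
    """
    rng = random.Random(seed)
    offered = 0              # rounds in which the host opens a door
    offered_and_right = 0    # ... and the first choice was right
    for _ in range(n):
        first_right = rng.random() < 1 / 3    # P(FR) = 1/3
        p_open = m if first_right else b      # host's opening probability
        if rng.random() < p_open:
            offered += 1
            offered_and_right += first_right
    if offered == 0:
        raise ValueError("the host never opened a door; P(FR|O) is undefined")
    return offered_and_right / offered

if __name__ == "__main__":
    print(estimate_posterior(2 / 3, 1 / 3))   # should be close to 1/2
    print(estimate_posterior(1.0, 1.0))       # classic host, close to 1/3
```

The estimate only conditions on rounds in which a switch is offered, which mirrors the conditioning on O in Equation (4).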

If we explore the whole range of 0 ≤ m ≤ 1 and 0 ≤ b ≤ 1, we obtain the contour graph in Figure 1, depicting the respective values of P(FR|O), which is the probability that staying with the first choice wins if a switch is offered. Equation (4) can also be written as P(FR|O) = m/(m + 2b), i.e. for the decision to switch or not, it is sufficient to know the quotient b/m. Switch if b/m > 0.5 and stay if b/m < 0.5; the chances are equal if b/m = 0.5. The classic solution corresponds to the point x in the upper right. Switching is twice as good as staying whenever b = m, along the dotted diagonal from (0, 0) to (1, 1).
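The closed form P(FR|O) = m/(m + 2b) and the resulting decision rule can be written down in a few lines. This is a small sketch using exact rational arithmetic; the function names `posterior_first_right` and `advice` are illustrative and do not come from the article.

```python
from fractions import Fraction

def posterior_first_right(m, b):
    """Return P(FR|O) = m / (m + 2b), i.e. Equation (4) with P(FR) = 1/3."""
    m, b = Fraction(m), Fraction(b)
    if m + 2 * b == 0:
        raise ValueError("the host never opens a door, so P(O) = 0")
    return m / (m + 2 * b)

def advice(m, b):
    """Recommend an action once a switch has been offered."""
    p_stay_wins = posterior_first_right(m, b)
    if p_stay_wins < Fraction(1, 2):
        return "switch"
    if p_stay_wins > Fraction(1, 2):
        return "stay"
    return "indifferent"

if __name__ == "__main__":
    print(posterior_first_right(Fraction(2, 3), Fraction(1, 3)))  # 1/2
    print(advice(1, 1))   # classic host: switch
    print(advice(1, 0))   # "host from hell": stay
```

Using `Fraction` avoids rounding at the boundary b = 0.5m, where the candidate is exactly indifferent.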

4. Strategies for the candidate

In the classic game, the host always opens a losing door, once the candidate made his/her first choice, i.e. m = 1, b = 1. This situation corresponds to the point (1, 1), in the upper right corner of Figure 1, and switching wins with p = 2/3 as it was shown in Section 2.

Clearly, switching will be better than staying if P(FR|O) < 0.5. If we set the left-hand side of Equation (4) to 0.5 and solve for b, we find that b = 0.5m, i.e. switching is better if b > 0.5m. If b = 0.5m, switching is as good as staying, and if b < 0.5m, switching is worse than staying.
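For completeness, the algebra behind this threshold can be spelled out. The short derivation below assumes P(FR) = 1/3 and m + 2b > 0, and uses the form P(FR|O) = m/(m + 2b) of Equation (4):

```latex
\begin{align*}
P(FR \mid O) < \tfrac{1}{2}
  &\iff \frac{m}{m + 2b} < \frac{1}{2} \\
  &\iff 2m < m + 2b \\
  &\iff m < 2b \\
  &\iff b > 0.5\,m .
\end{align*}
```

Replacing each inequality by an equality gives the boundary b = 0.5m, on which switching and staying are equally good.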

Thus, the optimal strategy, switching or staying, is determined by m and b. If these are known, it is a simple calculation that will tell the candidate whether he/she should switch or stay (Figure 1). If m and b are unknown, the candidate could try to estimate them based on the behavior of the host, whether he/she seems trustworthy, etc., thereby bearing in mind that switching can be successful even if the host is mildly malevolent, i.e. as long as b > 0.5m. Hosts on true TV shows are, however, cunning and this might prevent an estimation of m and b.

Figure 1. Contour graph for P(FR|O) for different values of m (horizontal axis) and b (vertical axis). The classic situation corresponds to (1, 1) in the upper right corner (x). If b = m, on the dotted diagonal through x, the host is neutral, and switching wins twice as often as staying. All points right of the diagonal correspond to a rather malicious host, while the points left of it indicate benevolence toward the candidate. The point at y (0, 1) represents the “angelic” host (switching is always possible if ¬FR and always wins), and the point at z (1, 0) the “host from hell” (switching is always possible if FR and always loses). If (m, b) is left of P(FR|O) = 0.5 (dotted line marked with w), switching is to the advantage of the candidate, otherwise not.

One might instead assume that m and b are uniformly distributed over the interval [0, 1]. In this case, since $\int_0^1 0.5m \, dm = 1/4$, the candidate should stay with p = 1/4 and switch with p = 3/4. (Also consider Figure 1: if m and b are uniformly distributed, the point (m, b) would lie in the part of the square ((0, 0), (0, 1), (1, 1), (1, 0)) below the line b = 0.5m, where b < 0.5m, in one-quarter of the cases.)
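The one-quarter figure can be checked numerically. The sketch below assumes, as the area argument implicitly does, that m and b are independent; the name `share_stay_better`, the sample size and the seed are illustrative.

```python
import random

def share_stay_better(n=200_000, seed=1):
    """Estimate P(b < 0.5m) for independent m, b ~ Uniform(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        m = rng.random()
        b = rng.random()
        hits += b < 0.5 * m    # region of the unit square where staying is better
    return hits / n

if __name__ == "__main__":
    print(share_stay_better())   # should be close to 1/4
```

The estimate is the share of the unit square below the line b = 0.5m, which is the triangle with area 1/4 described above.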

If m and b are Beta-distributed and independent, a successful strategy can be developed as follows.

Let b be a random variable following a Beta(b1, b2) distribution with density f_b and cdf F_b, and let m be a random variable following a Beta(m1, m2) distribution with density f_m and cdf F_m. Assume that they are independent.

We ask for the probability that staying is better than switching:

$$
P(b < 0.5m) = \int_0^1 P(b < 0.5m \mid m = x)\, f_m(x)\, dx = \int_0^1 F_b(0.5x)\, f_m(x)\, dx.
$$

This can be approximated by Monte Carlo simulation: simulate n points from Beta(m1, m2) and denote them by x_i, i = 1, . . . , n. Then,

$$
P(b < 0.5m) \approx n^{-1} \sum_{i=1}^{n} F_b(0.5\,x_i),
$$

and the value of F_b(x) is readily available from computer software such as R.
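A simpler variant of this approximation can be sketched with the Python standard library alone. Note the substitution: instead of averaging the cdf F_b(0.5 x_i) as in the formula above, the sketch samples both m and b and counts how often b < 0.5m. Both approaches estimate the same probability, but this plain version is usually noisier. The function name `prob_stay_better`, the sample size and the seed are assumptions made for illustration.

```python
import random

def prob_stay_better(m1, m2, b1, b2, n=200_000, seed=1):
    """Estimate P(b < 0.5m) for independent m ~ Beta(m1, m2), b ~ Beta(b1, b2).

    Plain Monte Carlo: both variables are sampled, so no Beta cdf is needed.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        m = rng.betavariate(m1, m2)
        b = rng.betavariate(b1, b2)
        hits += b < 0.5 * m
    return hits / n

if __name__ == "__main__":
    # Beta(1, 1) is the uniform distribution, so this should be close to 1/4.
    print(prob_stay_better(1, 1, 1, 1))
    # Neutral host with mean 0.9 for both m and b (b1 = m1 = 5, b2 = m2 = 0.556).
    print(prob_stay_better(5, 0.556, 5, 0.556))
```

The probability that switching is better is one minus the returned value, apart from the boundary b = 0.5m, which has probability zero for continuous distributions.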



[Figure 2: two scatter-plot panels with axes m and b, titled “Simulations with Beta-distributed m and b”.]

Figure 2. Simulations of P(FR|O), where m and b are Beta distributed (n = 500). The solid line shows P(FR|O) = 0.5. The broken lines indicate the means of m and b. The left panel corresponds to a neutral host that will mostly open a door, independent of the candidate’s first choice. A simulation with n = 10^6 revealed that in this case switching is better than staying with p ≈ 0.99. m and b are Beta-distributed with a mean of 0.9 (b1 = m1 = 5, b2 = m2 = 0.556). The data in the right panel correspond to a malevolent host. m followed Beta with a mean of 0.7 (m1 = 3, m2 = 1.29) and b followed Beta with a mean of 0.2 (b1 = 1, b2 = 2). A simulation with n = 10^6 suggested that in this case switching is better than staying with p ≈ 0.21.

In Figure 2, two simulations with n = 500 are shown, one for a relatively neutral host (left panel) and one for a rather malevolent one (right panel). If on average m = b = 0.9 (Figure 2, left panel), the host will offer a switch most of the time, independently of the candidate’s first choice, and switching will usually be the better option for the candidate. If on average m = 0.7 and b = 0.2 (Figure 2, right panel), staying will be advantageous in most of the cases.

In addition to the means of m and b, two additional shape parameters must be known in order to implement such a strategy. There are, however, countless other possibilities for distributions of m and b, many of which would demand a different strategy.

The candidate could contemplate the motivation of the host. One important aspect of true TV shows is the amusement of the audience. The TV audience loves shining heroes as well as miserable losers, and the behavior of a successful host will account for this. Also, a consistently hellish or angelic host (i.e. (m, b) = (1, 0) or (m, b) = (0, 1)) would not be as amusing as one with a rather flexible behavior.

There is the limitation of sample size: many discussions of the Monty Hall problem assume that the game is played repeatedly, even indefinitely. If so, the candidate can adopt a flexible strategy, thereby collecting information on m and b and adjusting the strategy accordingly, e.g. switch first and stop doing so if the host turns out to be hellish (very much in the manner known as “tit for tat” or “quid pro quo” in game theory). True TV shows are, however, different and such a game can be expected to be played only once. Thus, the candidate’s first guess should be as good as possible.

In summary: the estimation of m and b is often difficult or impossible. In such cases, the candidate has one safe resort: not to maximize the maximum possible win, which can only be achieved by switching, but rather to minimize the maximum loss, i.e. he/she should implement a minimax strategy. Depending on m and b, switching can be very successful and can even guarantee winning with p = 1 if (m, b) = (0, 1), i.e. if the host always allows switching if the first choice was wrong and never does if the first choice was correct (y in Figure 1). On the other hand, if (m, b) = (1, 0), switching always loses because the host always allows switching if the first choice was right and never does if the first choice was wrong (z in Figure 1). If the candidate does not know m and b and cannot control them, he/she should stay and accept the p = 1/3 chance to win, which is independent of any cunning tricks played by the host.
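The minimax argument can be illustrated with a small calculation. The sketch below rests on assumptions that go beyond the text: it compares only two pure strategies, “stay” (never switch) and “switch” (switch whenever the host offers it, otherwise keep the first choice), and it searches a finite grid of (m, b) values. The names `win_prob`, `max_loss` and `minimax_strategy` are hypothetical. The overall winning probability of “switch” is P(¬FR)·b + P(FR)·(1 − m): the candidate wins if the first choice was wrong and a switch is offered, or if the first choice was right and no switch is offered.

```python
from fractions import Fraction

P_FR = Fraction(1, 3)   # probability that the first choice is right

def win_prob(strategy, m, b):
    """Overall probability of winning the car for a pure strategy."""
    if strategy == "stay":
        return P_FR                     # independent of the host's behavior
    if strategy == "switch":
        # Win if (first wrong and switch offered) or (first right and no offer).
        return (1 - P_FR) * b + P_FR * (1 - m)
    raise ValueError(f"unknown strategy: {strategy}")

def max_loss(strategy, steps=10):
    """Largest probability of not winning over a grid of (m, b) in [0, 1]^2."""
    grid = [Fraction(i, steps) for i in range(steps + 1)]
    return max(1 - win_prob(strategy, m, b) for m in grid for b in grid)

def minimax_strategy():
    """Pure strategy with the smallest maximum loss."""
    return min(("stay", "switch"), key=max_loss)

if __name__ == "__main__":
    for s in ("stay", "switch"):
        print(s, "maximum loss:", max_loss(s))
    print("minimax choice:", minimax_strategy())
```

Under these assumptions, “switch” has a maximum loss of 1, attained at (m, b) = (1, 0), whereas “stay” never loses with probability above 2/3, so the minimax choice is to stay.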


The Monty Hall problem was described as a two-player zero-sum game [8], where the candidate wants to win the car and the host wants to keep it. The present variant of the problem is different in two regards: it is unknown if the host wants to keep the car, and the host does not always allow the candidate to reconsider his/her first choice. Thus, the present variant can be described as a one-player game with a stochastic outcome. The minimax solution of such a game considers a worst-case scenario rather than reasoning in the expectation of a fixed probability density function of m and b. In games with a stochastic outcome, the minimax solution can be too pessimistic if extremely unlikely events are considered (an insurance against unlikely events is likely to be a waste of money) [5].

5. Discussion

The original version of the problem does not specify the rule of the host. This incompleteness was pointed out before, and other solutions, apart from the classic one, were discussed. Among those are the “Monty from hell”, where switching always yields a goat [11], the “Angelic Monty”, where switching always wins the car [3] and other special cases of the present more general model [6,8]. Situations where the host does not know what is behind the doors [4] are not within the framework of this model.

A monograph on the Monty Hall problem [7] acknowledged the incompleteness, but the author concluded that completeness was not necessary, because it was self-evident that the host would act according to the classic rule. The author wrote this in 1992, before the advent of “modern” game shows on European TV. When I recently asked his opinion on the model discussed in the present paper, he agreed and said that today’s cunning host was virtually non-existent at the time of his original writing (personal communication).

The Monty Hall problem still appears in the public media and still causes passionate discussions. This may indicate that it is not only about conditional probability but also has a strong psychological component (some make this assertion, because it is about the conditional probability).

In other words, the Monty Hall problem is a popular brain teaser, not only because it involves conditional probabilities, but also because it is incomplete. Moreover, the incompleteness is subtle and easily neglected. To come up with a solution, the problem-solver must decide which rules to use. This decision is often not made consciously, and a particular rule may seem so obvious that it is adopted without scrutiny. To further illustrate this point: consider the original Monty Hall problem with the following sentence added: “The host acts according to certain rules which are hidden from you”. In this setting, many would perhaps answer “If the rules are not known, I cannot solve the problem”. In this instance, they would possibly become suspicious of the host just because it was pointed out that he/she will act according to certain hidden rules (and they would possibly stay with the first choice).

In conclusion, while the classic Monty Hall problem is an instructive application of conditional probabilities, it does not necessarily provide a guideline on how to act as the candidate of a game show. If the rules of the game are unknown, it is safe to stay with the first choice.

Acknowledgements

Two reviewers gave valuable comments on a previous version of this article. Thanks to Juhyun Park for helpful discussions.

References

[1] E. Barbeau, The problem of the car and goats, College Math. J. 24(2) (1993), pp. 149–154.

[2] L. Gillman, The car and the goats, Amer. Math. Monthly 99(1) (1992), pp. 3–7.

[3] D. Granberg, To Switch or Not to Switch, Chapter Appendix, St Martin’s Press, New York, 1997.

[4] D. Granberg and T.A. Brown, The Monty Hall dilemma, Pers. Soc. Psychol. Bull. 21(7) (1995), pp. 711–729.

[5] H.B. McMahan, Robust planning in domains with stochastic outcomes, adversaries, and partial observability, Dissertation, Carnegie Mellon University, 2006.

[6] P.R. Mueser and D. Granberg, The Monty Hall Dilemma Revisited: Understanding the Interaction of Problem Definition and Decision Making, Experimental, EconWPA, 1999.

[7] G. V. Randow, Das Ziegenproblem. Denken in Wahrscheinlichkeiten, Rowohlt, Reinbeck, 1992.

[8] J.S. Rosenthal, Monty Hall, Monty Fall, Monty Crawl, Math Horiz. September (2008), pp. 5–7.

[9] M. vos Savant, Ask Marilyn, Parade Mag. 9 (1990), p. 16.

[10] S. Selvin, A problem in probability, Amer. Statist. 29(1) (1975), p. 67.

[11] J. Thierny, Behind Monty Hall’s doors: Puzzle, debate and answer? New York Times, 21 July 1991. Available at http://nytimes.com (accessed 26 January 2011).
