Periodicity Domains and the Transit of Venus
Author(s): Andrew J. Simoson
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April 2014), pp. 283-298
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.283


Periodicity Domains and the Transit of Venus

Andrew J. Simoson

Abstract. A transit of Venus occurs when it passes directly between the Earth and the Sun. A straightforward linear algebraic model for the orbits of Earth and Venus, essentially using one parameter, namely, the relative angular velocity $\sigma$ of Venus, is powerful enough to generate respectable transit year predictions. We generalize, allowing $\sigma$ to vary; uncover an algebraic analog for predicting transits; and show that time cycles for transits are what they are because each $\sigma$ is sufficiently close to a suitably simple rational number, which for Venus is $\frac{13}{8}$, and which in turn induces a modulo 8 shuffling of successive transit years by a factor of 3.

1. INTRODUCTION. At least once each year, Venus passes between the Earth and the Sun. Because the orbital planes of Earth and Venus intersect one another at an angle $\mu$, only rarely does it come directly between the Earth and the Sun. On these occasions, the profile of Venus (a transit of Venus across the Sun) can be viewed from Earth. The last transit was in June 2012 and the next one will be in December 2117. Ascertaining the periodicity of the transits is a delicate problem. In particular, relative to Earth’s angular frequency of one rotation per year, Venus makes $\sigma_0 \approx 1.62555$ rotations per year. From this value, how can we deduce the 105-year transit lapse between, say, 2012 and 2117? And in general, as we allow angular velocity $\sigma$ to vary, how does the time lapse between transits change? The answer is surprisingly chaotic. Before showing this, we first give some transit history.

Figure 1. A Venus transit viewed against a spire of the Taj Mahal, June 2012, courtesy of AP Photo/Kevin Frayer

http://dx.doi.org/10.4169/amer.math.monthly.121.04.283
MSC: Primary 11A07; 70F15

2. A LITTLE HISTORY. In 1629, Johannes Kepler predicted a 1631 transit of Venus and estimated the period between transits as 120 years. The first recorded transit observation was in 1639 by Jeremiah Horox and William Crabtree. In 1663, James Gregory realized that careful observations of these transits would enable the scientific community to determine the distance $a$ of one astronomical unit (AU), the distance between the Sun and the Earth, in miles. Up until Kepler’s day, the reigning guess for $a$ had been about five million miles; Kepler, after studying the geocentric parallax of Mars, bumped the value of $a$ to at least 15 million miles. With the advent of the telescope, the guesses improved. In 1716, Edmund Halley predicted that $a$ was “14,000 semi-diameters of the Earth” or about 56 million miles, and championed Gregory’s plan to test his guess [4].

Figure 2. William Crabtree observing a transit; mural at Manchester Town Hall by Ford Madox Brown (1821–1893)

But for Halley, the next transit for Venus was 45 years in the future. Therefore he charged the astronomers of two generations hence to do what he could not. As a recent biographer of these events has written, “even on his death-bed whilst holding a glass of wine in his hand, Halley said, ‘I wish that many observations of this phenomenon might be taken by different persons at separate places’” [11, p. xxiv]. Astronomers of the eighteenth century had two chances to observe, June 1761 and 1769. Many of the colorful adventures of these astronomers as they answered Halley’s call are chronicled in [10] and [11]. As reviewed recently in detail by Teets [9], James Short analyzed transit data from sites as far afield as South Africa and northern Finland, and published his conclusions in the December 1761 issue of the Philosophical Transactions of the Royal Society that $a$ was 93,726,000 miles.

The standard reference for transit dates is Jean Meeus’s tables, spanning 6000 years [5]. Espenak [3], who compiled NASA’s website on transits, names Meeus’s work “an indispensable reference for anyone wishing to do transit calculations.” Danloux-Dumesnils [2] calls Meeus’s original tables [6] “une belle étude.” Much of Meeus’s number crunching is based on “the modern planetary theory VSOP87 of the Bureau des Longitudes of Paris,” [5, p. 1]. Against this standard, we contrast our results.

3. THE MODEL. We assume that the orbits of Earth $E$ and Venus $V$ are circles, with periods of $\tau_e \approx 365.26$ days and $\tau_v \approx 224.70$ days, respectively. By Kepler’s third law of planetary motion, with time $t$ in years and distance in astronomical units (AU), $a^3 = \tau^2$ where $a$ is the semi-major axis of a planet’s elliptical orbit and $\tau$ is its period. Thus, Venus is $\lambda \approx 0.723$ AU from the Sun $S$.
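The arithmetic behind $\lambda$ can be checked in a few lines. The following is a minimal sketch, assuming only the two periods quoted above; the variable names are illustrative and not from the paper. It also evaluates the ratio $\tau_e/\tau_v$, which the text uses as $\sigma_0$.

```python
# Sidereal periods in days, as quoted in the text.
TAU_E = 365.26
TAU_V = 224.70

# Kepler's third law with time in years and distance in AU: a**3 == tau**2.
tau_v_years = TAU_V / TAU_E
lam = tau_v_years ** (2.0 / 3.0)   # Venus's orbital radius in AU

# Relative angular velocity of Venus: revolutions of V per revolution of E.
sigma0 = TAU_E / TAU_V

print(f"lambda = {lam:.4f} AU, sigma0 = {sigma0:.6f}")
```

Both values agree with the rounded figures used in the text to the digits shown there.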

We further assume that $E$’s orbit lies in the $xy$-plane with $S$ at the origin $O$ and that $V$’s orbit lies in a plane through $O$ inclined at angle $\mu \approx 3.39^\circ$ to the $xy$-plane. We call the line between these orbital planes the nexus line or, according to Meeus [5], the line of nodes. The nexus line in Figure 3 is labeled $BC$. A nexus point or node for Venus ($F$ and $G$ in the figure) or for Earth ($B$ and $C$ in the figure) is where the orbit of $V$ or $E$ pierces the orbital plane of $E$ or $V$, respectively. Transits will only occur when $E$ and $V$ are both near $B$ and $F$, respectively, or both near $C$ and $G$. The former transit is called a fall transit because in modern times $E$ is at $B$ in early December; it is also called, according to Meeus, an ascending transit, because as $V$’s profile moves across $S$ from left to right its trajectory rises. The latter transit is called a spring transit because $E$ is at $C$ in early June; it is also called a descending transit, because the corresponding trajectory decreases. $E$’s and $V$’s position at any time is given respectively by $E(t)$ and $V(t)$:

$$E(t) = \begin{bmatrix} \cos(2\pi t) \\ \sin(2\pi t) \\ 0 \end{bmatrix} \quad\text{and}\quad V(t) = \lambda \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\mu & \sin\mu \\ 0 & -\sin\mu & \cos\mu \end{bmatrix} \begin{bmatrix} \cos(2\pi\sigma t) \\ \sin(2\pi\sigma t) \\ 0 \end{bmatrix}, \tag{1}$$

where $\sigma$ is the relative angular velocity of $V$ with respect to $E$. For simplicity, we initially position $V$ and $E$ at their spring nexus points. The value of $\sigma$ for the actual $V$ and $E$ is $\sigma_0 = \tau_e/\tau_v \approx 1.62555$. The $3 \times 3$ matrix in (1) corresponds with a clockwise rotation by $\mu$ about the $x$-axis, so as to be consistent with a descending (spring) transit occurring near nodes (nexus points) $C$ and $G$, where $C = (1, 0, 0)$.
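As a concrete reading of (1), the two position functions can be sketched as follows. The function names `earth` and `venus` and the constant names are assumptions made for illustration; the matrix product in (1) is multiplied out by hand so that only the standard library is needed.

```python
import math

LAMBDA = 0.723                 # Venus's orbital radius in AU, from the text
MU = math.radians(3.39)        # inclination between the two orbital planes
SIGMA0 = 365.26 / 224.70       # relative angular velocity of Venus

def earth(t):
    """E(t) from (1): unit circle in the xy-plane, t in years."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t), 0.0)

def venus(t, sigma=SIGMA0):
    """V(t) from (1): circle of radius LAMBDA tilted by MU about the x-axis."""
    c = math.cos(2 * math.pi * sigma * t)
    s = math.sin(2 * math.pi * sigma * t)
    # Rows of the 3x3 matrix in (1) applied to (c, s, 0), then scaled by LAMBDA.
    return (LAMBDA * c, LAMBDA * math.cos(MU) * s, -LAMBDA * math.sin(MU) * s)

print(earth(0.0), venus(0.0))  # both planets start on the positive x-axis
```

At $t = 0$ both planets sit on the ray through $C = (1, 0, 0)$, matching the initial placement at the spring nexus points.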

A line parametrized by $u$ from $E$ through $V$ at time $t$ is

$$P(u, t) = (V(t) - E(t))\,u + E(t). \tag{2}$$

To find the projection of $V$’s shadow on $S$ as viewed from $E(t)$, an ideal geocentric point in space at $E$’s center, we imagine that $S$ resides within a rotating plane or screen $S(t)$ ever perpendicular to $E(t)$. Figure 3 shows the two orbital planes and $V$’s projection on the screen as viewed from $E$. The plane $S(t)$ of $S$ can be written as

$$X \cdot E(t) = 0 \tag{3}$$

where $X$ is a general point $(x, y, z)$ on the screen. When $E$ and $V$ are on opposite sides of the screen at time $t$, which happens if and only if $E(t) \cdot V(t) < 0$, we take the projection point of $V$ onto the screen as that screen point between the planets.

Figure 3. The screen of the Sun (diagram labels: $E$’s orbit and $V$’s orbit, $E(t)$, $V(t)$, the screen through the Sun at $O$, $V$’s shadow, the axis between the planetary planes, nexus points $B$ and $C$ for $E$, and nexus points $F$ and $G$ for $V$)


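We combine (2) and (3) so as to find the point $X(t)$ where the line intersects the plane. That is, the equations $P(u, t) = X$ and (3) are the following system of four equations with four unknowns $x, y, z, u$, as well as the time variable $t$:

$$\begin{aligned}
x &= (\lambda\cos(2\pi\sigma t) - \cos(2\pi t))\,u + \cos(2\pi t) \\
y &= (\lambda\cos\mu\,\sin(2\pi\sigma t) - \sin(2\pi t))\,u + \sin(2\pi t) \\
z &= -\lambda\sin\mu\,\sin(2\pi\sigma t)\,u \\
0 &= x\cos(2\pi t) + y\sin(2\pi t).
\end{aligned} \tag{4}$$

Writing (4) as a matrix equation gives $A\,\mathbf{X}(t) = \mathbf{E}(t)$, where

$$A = \begin{bmatrix}
1 & 0 & 0 & \cos(2\pi t) - \lambda\cos(2\pi\sigma t) \\
0 & 1 & 0 & \sin(2\pi t) - \lambda\cos\mu\,\sin(2\pi\sigma t) \\
0 & 0 & 1 & \lambda\sin\mu\,\sin(2\pi\sigma t) \\
\cos(2\pi t) & \sin(2\pi t) & 0 & 0
\end{bmatrix} \tag{5}$$

with $\mathbf{X}(t)$ and $\mathbf{E}(t)$ being the respective vectors $(x, y, z, u)$ and $(\cos(2\pi t), \sin(2\pi t), 0, 0)$. For this transformation,

$$\begin{aligned}
\det(A) &= -1 + \lambda\bigl(\cos(2\pi\sigma t)\cos(2\pi t) + \cos\mu\,\sin(2\pi\sigma t)\sin(2\pi t)\bigr) \\
&= -1 + \frac{\lambda}{2}\bigl((1 + \cos\mu)\cos(2\pi(\sigma - 1)t) + (1 - \cos\mu)\cos(2\pi(\sigma + 1)t)\bigr) \\
&\le -1 + \frac{\lambda}{2}\bigl(|1 + \cos\mu| + |1 - \cos\mu|\bigr) = -1 + \lambda < 0.
\end{aligned}$$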

Because the determinant of $A$ is never zero, $\mathbf{X}(t) = A^{-1}\mathbf{E}(t)$. Since it would be convenient to see these points of intersection on a stationary screen rather than the dynamic plane $S(t)$, we clockwise rotate the first two components of $\mathbf{X}(t)$ about the $z$-axis by $2\pi t$ radians. The result of such a transformation is a set of points whose first three components trace $V$’s projection onto the screen of $S$. Finally, since the first component of such points will always be 0, and we are not interested in $u$, we project this set of points so as to obtain their second and third components as ordered pairs, which we index as $W(t) = (W_1(t), W_2(t))$,

$$W(t) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix}
\cos(2\pi t) & \sin(2\pi t) & 0 & 0 \\
-\sin(2\pi t) & \cos(2\pi t) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} A^{-1}\mathbf{E}(t). \tag{6}$$
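The chain from (2) to (6) can be sketched numerically. Instead of inverting $A$, the sketch below solves the last equation of (4) for $u$ directly: substituting (2) into (3) gives $1 + u\,(E \cdot V - 1) = 0$, so $u = 1/(1 - E(t) \cdot V(t))$, which is the same solution the matrix inverse would produce. The function name `shadow` and the returned `same_side` flag are assumptions made for illustration.

```python
import math

LAMBDA = 0.723                 # Venus's orbital radius in AU
MU = math.radians(3.39)        # inclination between the orbital planes
SIGMA0 = 365.26 / 224.70       # relative angular velocity of Venus

def shadow(t, sigma=SIGMA0):
    """Return (W1, W2, same_side) for time t in years, following (1)-(6)."""
    ct, st = math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)
    cv, sv = math.cos(2 * math.pi * sigma * t), math.sin(2 * math.pi * sigma * t)
    e = (ct, st, 0.0)
    v = (LAMBDA * cv, LAMBDA * math.cos(MU) * sv, -LAMBDA * math.sin(MU) * sv)
    dot = sum(a * b for a, b in zip(e, v))
    u = 1.0 / (1.0 - dot)      # well defined because dot <= LAMBDA < 1
    x, y, z = (ei + u * (vi - ei) for ei, vi in zip(e, v))
    w1 = -x * st + y * ct      # second row of the rotation matrix in (6)
    w2 = z                     # third component is unchanged by the rotation
    return w1, w2, dot > 0.0   # dot > 0: V is on Earth's side of the screen

print(shadow(0.0))
```

At $t = 0$ the shadow falls on the center of the screen, as expected when both planets start at their spring nexus points.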

Figure 4. Trajectories of $V$’s shadow on the screen of $S$: (a) a wide screen; (b) zooming in near the Sun, with arcs labeled $T_{113.5}$, $T_{117.5}$, and $T_{121.5}$ (distances in AU)


Figure 4(a) shows the path of $V$’s projection on the screen over 1.5 years. In our model, spring transits only occur near integer years, $n$, and fall transits only occur near half-years, $n + \frac{1}{2}$. Figure 4(b) is a close up of the screen near $S$ over a period of about ten years, displaying three arcs of $V$’s projection. The arc labeled $T_{113.5}$ corresponds with a fall transit near $t = 113.5$ years. The arc $T_{117.5}$ corresponds with $V$ and $E$ being on opposite sides of $S$ near $t = 117.5$; as such, we display the disk of $S$ in front of this arc. The arc $T_{121.5}$ misses the disk of $S$.

4. CONDITIONS FOR A TRANSIT TO OCCUR. In order to find how far from its nexus $V$ may wander and yet be part of a transit across $S$, we project the disk of $S$ through $V$ out to $E$’s orbit, forming a cone as illustrated in Figure 5(a), which displays the situation where the base of the truncated cone is tangent to $E$’s orbit.

Figure 5. Maximum separation from the nexus for a transit: (a) a cone of possible shadows; (b) a linear approximation of orbits (diagram labels: $E$’s orbit, $V$’s orbit, the plane of $V$’s orbit, the disk of the Sun, the base of the truncated cone, points $B$, $C$, $D$, $S$, $V$, and the quantities $\lambda$, $1 - \lambda$, $k(1 - \lambda)$, $\gamma$, $\mu$, $\rho$, $h$)

Let $\rho$ be the radius of this base with center point $D$. To approximate where this extreme position for $V$ occurs, we linearize the orbits of $V$ and $E$, and imagine that they proceed along lines perpendicular to the nexus line $BC$, as illustrated in Figure 5(b). In this figure, we take the distance $SB$ as 1 AU. The distances $SV$ and $SD$ are $k\lambda$ and $k$, where $k$ is a marginally-larger-than-1 deformation factor due to linearization. With $s \approx 0.00465$ AU as the radius of $S$, from similar triangles, we see that

$$\frac{s}{k\lambda} = \frac{\rho}{k(1 - \lambda)}, \tag{7}$$

which gives $\rho = s(1 - \lambda)/\lambda \approx 0.00178$ AU. Furthermore,

$$\sin\mu = \frac{\rho}{h} \quad\text{and}\quad \tan\gamma = h, \tag{8}$$

where $\mu$ is the angle between the two orbital planes, $\gamma$ is the angle between the nexus line and the line between $S$ and $V$, and $h$ is distance $BD$. By (7) and (8),

$$\gamma = \tan^{-1}\!\left(\frac{s(1 - \lambda)}{\lambda\sin\mu}\right) \approx \frac{s(1 - \lambda)}{\lambda\mu} \approx 0.0301, \tag{9}$$

since the arguments of the inverse tangent and sine are so small. Thus, in order to be part of a transit, $V$ may wander no further than about $\lambda\gamma \approx 0.0218$ AU from the nexus. By (9), the lapse of time $L_v$ for $V$ to travel this far from its nexus is

$$L_v \approx \frac{s(1 - \lambda)}{2\pi\lambda\sigma_0\mu} \approx 1.08 \text{ days}. \tag{10}$$


The corresponding maximal time $L_e$ that $E$ may stray from its nexus points and yet take part in a transit is

$$L_e = \frac{\gamma}{2\pi} \approx 42 \text{ hours} < 2 \text{ days}. \tag{11}$$
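The quantities in (7) through (11) follow from a handful of arithmetic steps, sketched below. The constant and variable names are assumptions made for illustration, and converting years to days with 365.26 is also an assumption, chosen to match $\tau_e$.

```python
import math

SUN_RADIUS = 0.00465           # s, radius of the Sun in AU
LAMBDA = 0.723                 # Venus's orbital radius in AU
MU = math.radians(3.39)        # inclination between the orbital planes
SIGMA0 = 365.26 / 224.70       # relative angular velocity of Venus
DAYS_PER_YEAR = 365.26         # assumed conversion factor

rho = SUN_RADIUS * (1 - LAMBDA) / LAMBDA   # from the similar triangles in (7)
h = rho / math.sin(MU)                     # from sin(mu) = rho / h in (8)
gamma = math.atan(h)                       # from tan(gamma) = h, giving (9)

# Time (in years) to sweep the angle gamma, converted to days or hours.
lapse_venus_days = gamma / (2 * math.pi * SIGMA0) * DAYS_PER_YEAR    # (10)
lapse_earth_hours = gamma / (2 * math.pi) * DAYS_PER_YEAR * 24       # (11)

print(f"gamma = {gamma:.4f} rad")
print(f"L_v = {lapse_venus_days:.2f} days, L_e = {lapse_earth_hours:.1f} hours")
```

The exact inverse tangent and the small-angle form in (9) differ only beyond the digits quoted in the text.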

Figure 6. Speed, $\|W'(t)\|$, of $V$’s shadow across the screen of $S$ (horizontal axis: time $t$ in years; vertical axis: speed in AU/yr; the plot marks where a transit occurs and where $V$ and $E$ are on opposite sides of $S$)

Since the speed at which a transit is traced across $S$ is bounded by 10.34 AU/year as indicated by the graph of $\|W'(t)\|$ in Figure 6, then

$$\|W'(t)\| < 10.34 \text{ AU/year} \approx 0.0284 \text{ AU/day} \tag{12}$$

for all $t$. Let $t_0$ be a medial transit time, a time of a spring transit near integer time $n$ or of a fall transit near half-year time $n + \frac{1}{2}$ where $W_1(t_0) = 0$. Since the time between $t_0$ and either $n$ or $n + \frac{1}{2}$ must be at most about 42 hours by (11), then the most that $\|W(n)\|$ or $\|W(n + \frac{1}{2})\|$ can differ from $\|W(t_0)\|$ is approximately

$$(0.0280 \text{ AU/day})(42 \text{ hours}) \approx 0.0496 \text{ AU}$$

by (12). Since $|W_2(t_0)| < s$, then 0.05 AU is about the most that $\|W(n)\|$ or $\|W(n + \frac{1}{2})\|$ can be. Therefore, our litmus test to determine if integer year $n$ or half-year $n + \frac{1}{2}$ is a promising one for a transit is for $V$ and $E$ to be on the same side of $S$ and for

$$\|W(n)\| < 0.05 \quad\text{or}\quad \left\|W\!\left(n + \tfrac{1}{2}\right)\right\| < 0.05. \tag{13}$$

Applying (13) to the integers 0 to 2000 with $\sigma = \sigma_0$, we find the promising years of Table 1.

Table 1. Years at which the spring and fall transits occur

0       113.5           227               (340.5, 348.5)   (454, 462)   575.5   689      802.5   916
1029.5  (1143, 1151)    (1256.5, 1264.5)  1378             1491.5       1605    1718.5   1832    (1945.5, 1953.5)
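The screening step behind Table 1 can be sketched as a loop over integer and half-integer years that applies the litmus test (13). This is only the screening pass; the confirmation by plotting arcs against the disk of $S$ is not reproduced. The names `shadow` and `promising_times` are assumptions, and $u$ is obtained from $u = 1/(1 - E \cdot V)$, which follows from (2) and (3), rather than from the matrix inverse.

```python
import math

LAMBDA = 0.723                 # Venus's orbital radius in AU
MU = math.radians(3.39)        # inclination between the orbital planes
SIGMA0 = 365.26 / 224.70       # relative angular velocity of Venus

def shadow(t, sigma=SIGMA0):
    """Return (W1, W2, same_side) for time t in years."""
    ct, st = math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)
    cv, sv = math.cos(2 * math.pi * sigma * t), math.sin(2 * math.pi * sigma * t)
    e = (ct, st, 0.0)
    v = (LAMBDA * cv, LAMBDA * math.cos(MU) * sv, -LAMBDA * math.sin(MU) * sv)
    dot = sum(a * b for a, b in zip(e, v))
    u = 1.0 / (1.0 - dot)
    x, y, z = (ei + u * (vi - ei) for ei, vi in zip(e, v))
    return -x * st + y * ct, z, dot > 0.0

def promising_times(last_year, sigma=SIGMA0, threshold=0.05):
    """Years n and half-years n + 1/2 in [0, last_year] that pass test (13)."""
    hits = []
    for k in range(2 * last_year + 1):
        t = k / 2
        w1, w2, same_side = shadow(t, sigma)
        if same_side and math.hypot(w1, w2) < threshold:
            hits.append(t)
    return hits

print(promising_times(300))
```

For instance, years 0 and 227 and half-year 113.5 pass the test, while year 8 does not, since by then $V$'s shadow has drifted more than 0.05 AU from the center of the screen.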


Double-checking the dates in Table 1 by graphing the arc $W(t)$ against the disk of $S$ verifies that each of the years or half-years corresponds with a spring or fall transit, respectively, and that these are the only transits during this 2000-year period in our model. As can be seen, the familiar differences 8, 105.5, and 113.5 between successive transit times appear, which is good news for our model. The entries in the table eight years apart have been grouped as ordered pairs; their associated transits are called twins or doubles. For example, spring transits occur in our model in both year 454 and year 462. For a twin transit, we say the transit member whose path across $S$ comes closer to $S$’s center is the dominant transit of the two. If a transit has no twin, it is a singleton transit. As can be seen in Figure 7 of the twin transit $T_{454}$ and $T_{462}$, $T_{462}$ is the dominant member. $T_{227}$ is a singleton. In section 7, we show how to modify our model to simulate actual transit dates.

Figure 7. A twin pair of descending spring transits (arcs $T_{454}$ and $T_{462}$, with $T_{462}$ marked as the dominant twin)

Meanwhile, in looking for a pattern with respect to the data of Table 1, the reader may notice that $W(0)$, $W(802.5)$, and $W(1605)$ are all almost $(0, 0)$. Have we stumbled across a characteristic time period for which the data repeats? To answer, we define the practical period $\mathcal{T}$ of this data as $\mathcal{T} = 1605$ years and argue that a more natural period exists for three reasons.

• It is unclear how $\mathcal{T}$ is related to $\sigma_0$.
• It is unclear how a period of $\mathcal{T}$ explains the time lapse between successive transits.
• Since the time lapse between twin transits is 8 years, it seems likely that $\mathcal{T}$ should somehow be related to 8, but how?

In the next section, we find a natural period and demonstrate that the practical and natural periods are related.

5. RECOGNIZING THE PATTERN. To find a more natural transit period, we focus on spring transits for a season; from Table 1, we drop the fall transit dates, and are left with Table 2. When we refer to the spring transit year $n_j$ from the table, where $j \ge 0$, we mean term-$j$ in row 2 or the dominant transit year if the term is a twin. For example, $n_2 = 462$, as evidenced by Figure 7. Observe that the first eight spring transits comprise a complete residue set modulo 8. Furthermore, $n_j \bmod 8$ just happens to be $3j \bmod 8$, which suggests that the relative motion of the planets induces a linear shuffling of the transit year residues modulo 8. We thus refer to 3 as a shuffling factor.

Table 2. Spring transits

j                  0   1     2            3     4     5              6      7      8
transit year n_j   0   227   (454, 462)   689   916   (1143, 1151)   1378   1605   1832
n_j mod 8          0   3     6            1     4     7              2      5      0
3j mod 8           0   3     6            1     4     7              2      5      0

To help understand this 8-fold dynamic, observe that every eight years both $E$ and $V$ pass each other not far from where they had passed each other eight years before, with $V$ a bit further ahead of $E$ each time. We say that the arc given by $W(n \text{ years} \pm 1 \text{ week})$ is rung-$n$ in a ladder of arcs. As the years go by, these rungs step monotonically upward (or downward) to a climax before reversing their progression, with rung-$8n$ being more or less either above or below rung-$8(n + 1)$ for all integers $n$. Near the spring transit years, neighboring rungs are separated by a distance somewhat more than the radius of $S$, as illustrated in Figures 4(b), 7, and 8; the dots in Figure 8 represent $V$’s projection at $t = -16, -8, 0, 8, 16$ years. With $p = 8$, the approximate distance $d(p)$ between neighboring rungs near transit years is the distance between $W(p)$ and its projection onto $W(0^+)$, where we take $0^+$ as one hour:

$$d(p) = \left\| W(p) - \frac{W(p) \cdot W(0^+)}{W(0^+) \cdot W(0^+)}\, W(0^+) \right\| \approx 0.00672 \text{ AU}. \tag{14}$$

Since $s < d(p) < 2s$, then a sequence of at most two successive rungs may cross the face of $S$, whereas if a rung crosses near the center of $S$, then only one rung in that succession of rungs may correspond to a transit.
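Formula (14) is a point-to-line distance: $W(0^+)$ gives the direction of the rung through the center of the screen, and $d(p)$ is the length of the component of $W(p)$ perpendicular to it. The sketch below assumes a helper `shadow` built from (1) through (6) with $u = 1/(1 - E \cdot V)$; the helper and the name `rung_distance` are illustrative, and the conversion of one hour to years uses 365.26 days per year as an assumption.

```python
import math

LAMBDA = 0.723                     # Venus's orbital radius in AU
MU = math.radians(3.39)            # inclination between the orbital planes
SIGMA0 = 365.26 / 224.70           # relative angular velocity of Venus
SUN_RADIUS = 0.00465               # s, in AU
ONE_HOUR = 1.0 / (365.26 * 24)     # the text's 0+ expressed in years

def shadow(t, sigma=SIGMA0):
    """Return (W1, W2) for time t in years."""
    ct, st = math.cos(2 * math.pi * t), math.sin(2 * math.pi * t)
    cv, sv = math.cos(2 * math.pi * sigma * t), math.sin(2 * math.pi * sigma * t)
    e = (ct, st, 0.0)
    v = (LAMBDA * cv, LAMBDA * math.cos(MU) * sv, -LAMBDA * math.sin(MU) * sv)
    u = 1.0 / (1.0 - sum(a * b for a, b in zip(e, v)))
    x, y, z = (ei + u * (vi - ei) for ei, vi in zip(e, v))
    return -x * st + y * ct, z

def rung_distance(p, sigma=SIGMA0):
    """d(p) from (14): distance from W(p) to the line spanned by W(0+)."""
    wp = shadow(float(p), sigma)
    w0 = shadow(ONE_HOUR, sigma)
    scale = (wp[0] * w0[0] + wp[1] * w0[1]) / (w0[0] ** 2 + w0[1] ** 2)
    return math.hypot(wp[0] - scale * w0[0], wp[1] - scale * w0[1])

d8 = rung_distance(8)
print(f"d(8) = {d8:.5f} AU; s < d(8) < 2s is {SUN_RADIUS < d8 < 2 * SUN_RADIUS}")
```

With $\lambda$ held at 0.723 the result lands close to the value in (14); small differences in the last digit depend on how $\lambda$ and $\mu$ are rounded.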

Figure 8. $V$’s projection as given by $W(t)$ near $t = -16, -8, 0, 8, 16$

When we extend the data as given in Table 2, the data seems to sort itself. That is, plotting $\{(n, W_1(n))\}_{n \ge 0}$ corresponding to the times when $E$ is at its spring nexus point shows a hodge-podge of dots across 100 years in Figure 9(a). Yet, when we look at a longer period of time, the trend is clear. Figure 9(b) displays the data across 2000 years. It appears as if $V$’s projection when sampled at $E$’s spring nexus point lies on one of eight branches through the data, each of which appears to be a uniformly spaced translate of the others.

Figure 9. Horizontal component of $V$’s projection at $E$’s spring nexus over time: (a) a hodge-podge of dots, over 100 years; (b) a better perspective, over 2000 years (horizontal axis: years; vertical axis: AU, from $-1$ to $1$)


By (5) and (6), finding the periodicity present within $\{(n, W_1(n))\}_{n \ge 0}$ is equivalent to finding the periodicity present within $D(\sigma) = \{(n, \sin(2\pi\sigma n))\}_{n \ge 0}$, as $n$ ranges over integer values. Figure 10 shows that when restricted to the years $8n$ where $n$ is an integer, and when adjacent points are connected by line segments, both curves display the same periodicity for $\sigma = \sigma_0$. The curves appear to have a root near $t \approx 917$, but no spring transit occurs at either $912 = 8(114)$ or $920 = 8(115)$ years, because in our model $V$ and $E$ are on opposite sides of $S$ at both times. However, near the next root $t \approx 1834$, a transit occurs at $n = 1832 = 8(229)$ years, but not at 1840, because $V$’s projection falls just outside $S$’s disk in that year.

Figure 10. Paths through $W_1(t)$ and $\sin(2\pi\sigma t)$ when $t = 8n$ years, $\sigma = \sigma_0$ (curves $(8n, W_1(8n))$ and $(8n, \sin(2\pi\sigma(8n)))$; horizontal axis: years; vertical axis: AU)

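Can we find curves $y_j = \sin(\alpha(t - \beta j))$, where $\alpha$ and $\beta$ are real numbers and $j$ is an integer, $0 \le j \le 7$, which characterize $D(\sigma_0)$? That is, we seek a period $T$, with $T = \frac{2\pi}{\alpha}$ and $\beta = \frac{T}{8}$, for which $T$ is near 1834 and where $y_j$ passes through all points on branch-$j$ of $D(\sigma_0)$. Observe that the values of $\sin(2\pi\sigma \cdot 8n)$ and $\sin(2\pi(\sigma - \frac{m}{8}) \cdot 8n)$ agree for all integers $m$. In particular, for the integer $m$ for which $\frac{m}{8}$ is nearest $\sigma$, namely, $m = 13$, we see that defining $\alpha$ and $T$ so that

$$\frac{1}{T} = \frac{\alpha}{2\pi} = \sigma - \frac{13}{8} \approx \frac{365.26}{224.70} - \frac{13}{8} \approx 0.000545171 \tag{15}$$

indeed gives the natural period of $D(\sigma_0)$ as

$$T = \frac{2\pi}{\alpha} \approx \frac{2\pi}{0.00342541} \approx 1834.29 \text{ years}, \tag{16}$$

which means that

$$\beta = \frac{T}{p} = \frac{T}{8} \approx \frac{1834.29}{8} \approx 229.286 \text{ years}. \tag{17}$$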

When we divide the practical period $\mathcal{T} = 1605$ years by 7,

$$\frac{\mathcal{T}}{7} \approx 229.286 \approx \beta.$$

That is,

$$T \approx \frac{8}{7}\,\mathcal{T}.$$

Hence, the practical period of 1605 just happens to be a lucky seven integer multiple of the phase shift in the branches of the natural period.
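The numbers in (15) through (17), and the relation between the practical and natural periods, can be reproduced in a few lines. This is a minimal sketch; the variable names are illustrative, and the nearest multiple of $\frac{1}{8}$ is found by rounding $8\sigma_0$.

```python
import math

SIGMA0 = 365.26 / 224.70       # relative angular velocity of Venus
P = 8                          # apparent periodicity

m = round(SIGMA0 * P)          # integer m with m/8 nearest sigma0
inverse_period = SIGMA0 - m / P                # 1/T, as in (15)
T = 1.0 / inverse_period                       # natural period (16), in years
alpha = 2 * math.pi / T
beta = T / P                                   # phase shift between branches (17)

print(f"m = {m}, 1/T = {inverse_period:.9f}, alpha = {alpha:.8f}")
print(f"T = {T:.2f} years, beta = {beta:.3f} years, 7*beta = {7 * beta:.2f} years")
```

The last printed value shows that seven phase shifts of the natural period make up the practical period of 1605 years.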

To verify the fourth row of Table 2, that 3 is the shuffling factor, observe by Figure 9(b) that $n_j$ is the first component in that point belonging to branch-$j$ of $D(\sigma)$, which is nearest the first nonnegative root of $y = \sin(\alpha(t - \beta j))$, with $0 \le j \le 7$. Hence, for a given $j$, we wish to find the residue $r_j$ of $n_j$ modulo $p$, where $0 \le j \le p - 1$ and $0 \le r_j \le p - 1$, so that

$$\sin(2\pi\sigma t) = \sin(\alpha(t - j\beta)) \tag{18}$$

for all times $t = pn + r_j$, for all integers $n$, with $p = 8$. By the pigeonhole principle, since there are eight branches and eight primitive residues, $r_j$ is unique for each $j$. Furthermore, by the affine nature of the arguments of sine in (18), it is sufficient to show that (18) has a solution for $j = 1$, which means that we must solve

$$\sin(2\pi\sigma(pn + r)) = \sin(\alpha(pn + r - \beta)) \tag{19}$$

for $r$, where $r = r_1$ and $p = 8$. By (15) through (17), (19) becomes

$$\sin\!\left(\alpha(8n + r) + 26\pi n + \frac{(13r)(2\pi)}{8}\right) = \sin\!\left(\alpha(8n + r) - \frac{2\pi}{8}\right).$$

Therefore, solving

$$13r \equiv -1 \pmod 8 \tag{20}$$

gives the unique solution $r = 3$ for (19).

Furthermore, generalizing the above argument demonstrates that the shuffling factor $r$ in (19) remains at $r = 3$ for all $\sigma \ne \frac{13}{8}$ for which

$$\left|\sigma - \frac{13}{8}\right| < \frac{1}{32} = \frac{1}{4p},$$

a range of angular velocities called the periodicity domain of $\frac{13}{8}$. By an interval punctured by $x$, we mean a disconnected set of real numbers $J$ whose union with $\{x\}$ is an interval. Thus, the periodicity domain of $\frac{13}{8}$ is an interval punctured by $\frac{13}{8}$. The reason for excluding $\frac{13}{8}$ from its periodicity domain is that its corresponding $\alpha$ and $\beta$ would be 0 and $\infty$, respectively.

To account for arbitrary relative positions of $E$ and $V$ in their orbits about $S$, we imagine that at time $t = 0$, $V$ is $\delta$ years ahead of its last rendezvous with its spring nexus, while $E$ is at its spring nexus. Each of the branches characterizing $V$’s projection undergoes a phase shift $\varepsilon$, where $\sin(2\pi\sigma(8n + \delta))$ must equal $\sin(\alpha(8n + \varepsilon))$; by (15), one way for this to occur is when $(2\pi\delta)(\frac{13}{8}) = \alpha\varepsilon$, which means that

$$\varepsilon = \frac{q\,\delta\,T}{p} = \frac{13\,\delta\,T}{8},$$

where $p = 8$ and $q = 13$. Therefore, we have an algorithm for characterizing all spring singleton transits and all dominant members of spring twin transits, where $\delta$ is an orbital phase angle shift between $V$ and $E$, $p = 8$ is the apparent periodicity of $D(\sigma)$, $r = 3$ is the shuffling factor among the year residues modulo $p$ as given by (20), and $\frac{q}{p}$ is the rational number close to $\sigma$ as given by (15).
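The congruence (20) has the general form $qr \equiv -1 \pmod p$, which can be solved with a modular inverse. The sketch below uses Python's built-in three-argument `pow` with exponent $-1$ (available from Python 3.8); the function name `shuffling_factor` is an assumption. It also checks the residue pattern of Table 2, using the first-listed year of the twin (1143, 1151), since both twins share the same residue modulo 8.

```python
def shuffling_factor(q, p):
    """Solve q * r = -1 (mod p) for r in 0..p-1, as in (20).

    pow(q, -1, p) raises ValueError when q has no inverse modulo p.
    """
    return (-pow(q, -1, p)) % p

# Spring transit years n_j from Table 2 (462 is the dominant twin of 454).
TABLE_2_YEARS = [0, 227, 462, 689, 916, 1143, 1378, 1605, 1832]

r = shuffling_factor(13, 8)
residues_match = all(n % 8 == (r * j) % 8 for j, n in enumerate(TABLE_2_YEARS))

print(f"r = {r}, Table 2 residues follow r*j mod 8: {residues_match}")
```

The same function applies to any other pair $q$, $p$ with $\gcd(q, p) = 1$.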


The Transit Rule. Let $k$, $n$, and $j$ be integers, $0 \le j < p$. A spring transit occurs at integer year $m$ near time $(k - \frac{q}{p}\delta)T + \beta j$ if and only if $m = pn + (jr \bmod p)$ and $m$ is no further from $(k - \frac{q}{p}\delta)T + \beta j$ than either $m - p$ or $m + p$. If either $m - p$ or $m + p$ is a transit year as well, then $m$ is the dominant member of the twin.

To ascertain whether $m \pm p$ is also a spring transit, simply utilize the decision rule (13).

Example 1. To illustrate the transit rule, let $\delta = 0$, $k = 3$, and $j = 5$. Since $3j \bmod 8 = 7$, we want to find the transit year $m = 8n + 7$ closest to $kT + j\beta \approx 6649.3$. Then $m = 8(830) + 7 = 6647$, while $m + 8 = 8(831) + 7 = 6655$. That is, year 6647 is a singleton transit, while year 6655 is a near-miss, as shown in Figure 11(a).

Figure 11. Checking the transit algorithm: (a) spring transit near $3T + 5\beta$, with arcs $T_{6647}$ and $T_{6655}$; (b) spring transit near $(2 - \frac{13}{80})T + 6\beta$, with arcs $T_{4746}$ and $T_{4754}$

Example 2. This time, let $\delta = 0.1$, $k = 2$, and $j = 6$. Since $3j \bmod 8 = 2$, we want to find the transit year $m = 8n + 2$ closest to $(2 - 0.1(\frac{13}{8}))T + 6\beta \approx 4746.2$. Then $m = 8(593) + 2 = 4746$, while $m + 8 = 8(594) + 2 = 4754$. That is, year 4746 is a singleton transit, while year 4754 is far from being a transit, as shown in Figure 11(b).
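The arithmetic of the transit rule, as used in Examples 1 and 2, can be sketched as a single function: compute $T$, $\beta$, and $r$ from $\sigma$ and $\frac{q}{p}$, form the target time, and pick the integer with the required residue modulo $p$ that lies nearest the target. The function name and signature are assumptions made for illustration, and the sketch does not decide whether the neighboring years $m \pm p$ are twins; the text uses the decision rule (13) for that.

```python
def transit_rule_year(sigma, q, p, k, j, delta=0.0):
    """Return (m, target): the rule's candidate year and the time it is near.

    sigma: relative angular velocity; q/p: nearby rational; k, j: integers
    with 0 <= j < p; delta: phase shift in years.
    """
    T = 1.0 / (sigma - q / p)                 # natural period, as in (15)-(16)
    beta = T / p                              # branch phase shift (17)
    r = (-pow(q, -1, p)) % p                  # shuffling factor from (20)
    target = (k - (q / p) * delta) * T + beta * j
    residue = (j * r) % p
    n = round((target - residue) / p)         # nearest m = p*n + residue
    return p * n + residue, target

SIGMA0 = 365.26 / 224.70
for k, j, delta in [(3, 5, 0.0), (2, 6, 0.1)]:     # Examples 1 and 2
    m, target = transit_rule_year(SIGMA0, 13, 8, k, j, delta)
    print(f"k={k}, j={j}, delta={delta}: target = {target:.1f}, m = {m}")
```

The two printed years are 6647 and 4746, in agreement with Examples 1 and 2.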

As for fall transits, a similar rule applies, except that the eight branches through the data corresponding to time $n + \frac{1}{2}$ are

$$y_j = \sin\!\left(\alpha\left(t - \beta\left(j + \frac{1}{2}\right) + \varepsilon\right)\right).$$

6. VARYING VENUS’S ANGULAR VELOCITY. The key behind the transit rule is recognizing that $D(\sigma_0)$ consists of eight components or branches. Thus we say that the periodicity of $D(\sigma)$ is the integer $p$ if $D(\sigma)$ appears to fall into $p$ branches. To formalize what is meant by appears, for each positive integer $\eta$, we define $N(\eta)$ as the maximal integer $n$ for which $\{\sin(2\pi\sigma\eta j)\}_{j=0}^{n}$ is monotonic. Intuitively, $N(\eta)$ counts the number of rungs from a transit to a climax. We further define the periodicity quotient $Q(\sigma, \eta)$ as

$$Q(\sigma, \eta) = \left\lfloor \frac{N(\eta)}{\eta} \right\rfloor,$$

which gives a measure of normalization among the values of $N(\eta)$. We say that the apparent periodicity of $D(\sigma)$ is $p$, if $Q(\sigma, p)$ appears to approach the maximum of $\{Q(\sigma, \eta) \mid \eta \in \mathbb{Z}^+\}$.
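One literal reading of these definitions can be sketched as follows. The sketch assumes that "monotonic" means strictly monotonic and caps the search at `max_terms` terms; both choices, and the function names, are assumptions rather than details given in the text, and other conventions may give different values for some small $\eta$.

```python
import math

def monotonic_run(sigma, eta, max_terms=100_000):
    """N(eta): largest n with sin(2*pi*sigma*eta*j), j = 0..n, strictly monotonic."""
    previous = 0.0                     # the j = 0 term is sin(0)
    direction = 0                      # +1 rising, -1 falling, 0 not yet known
    for j in range(1, max_terms + 1):
        current = math.sin(2 * math.pi * sigma * eta * j)
        step = current - previous
        if step == 0.0:
            return j - 1               # a repeated value ends a strict run
        sign = 1 if step > 0 else -1
        if direction == 0:
            direction = sign
        elif sign != direction:
            return j - 1               # the sequence has turned around
        previous = current
    return max_terms

def periodicity_quotient(sigma, eta):
    """Q(sigma, eta) = floor(N(eta) / eta)."""
    return monotonic_run(sigma, eta) // eta

SIGMA0 = 365.26 / 224.70
print(periodicity_quotient(SIGMA0, 8), periodicity_quotient(1.6251, 8))
```

Under this reading the two printed quotients are 7 and 39, the values the text reports for $Q(\sigma_0, 8)$ and $Q(1.6251, 8)$.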


Table 3. The periodicity of $D(\sigma_0)$ appears to be 8.

 η   Q(σ0, η)     η   Q(σ0, η)     η   Q(σ0, η)
 1      1        11      0        21      0
 2      0        12      0        22      0
 3      1        13      0        23      0
 4      0        14      0        24      0
 5      0        15      0        25      0
 6      0        16      1        26      0
 7      0        17      0        27      0
 8      7        18      0        28      0
 9      0        19      0        29      0
10      0        20      0        30      0

The first few values of $Q(\sigma_0, \eta)$ are given in Table 3; the nonzero periodicity quotients occur at $\eta = 1, 3, 8$, and 16. When extending this table indefinitely as far as a typical CAS allows, it appears as if $Q(\sigma, \eta) = 0$ for all $\eta > 16$. From such evidence, and since the maximum quotient among this $\eta$ range is 7 and corresponds to $\eta = 8$, we conclude that $D(\sigma_0)$ has apparent periodicity 8.

Let $\nu$ be a number between 0 and 0.5. Observe that $Q(\nu, \eta) = Q(n + \nu, \eta)$ for all integers $n$. Because sine is an odd function, $Q(\nu, \eta) = Q(1 - \nu, \eta)$. Therefore, the only $\sigma$ values for which we need to evaluate $Q(\sigma, \eta)$ are those in the range $0 \le \sigma \le \frac{1}{2}$, or, equivalently, the range $1.5 \le \sigma \le 2$, the reference interval containing $\sigma_0$. Armed with the use of the measure $Q$ we ask, how far may we perturb $\sigma$ from $\sigma_0$ and yet have apparent periodicity remain invariant?

Figure 12. The range of 8-fold apparent periodicity (plot of $Q(\sigma, 8)$ against $\sigma$ for $1.615 \le \sigma \le 1.635$)

If $Q(\sigma, \eta) \ge 3$, we say that $D(\sigma)$ displays significant apparent periodicity $\eta$. From Figure 12, we see that $D(\sigma)$ displays significant apparent periodicity 8 on the interval $(1.6237, 1.6263)$ punctured by $\sigma = \frac{13}{8}$. Plots of $D(1.6237)$ and $D(1.6263)$ are much like Figure 13(a), in which an 8-fold periodicity is less pronounced than in Figure 9(b). As $\sigma$ approaches $\frac{13}{8} = 1.625$, $Q(\sigma, 8)$ goes to $\infty$. For example, $Q(1.6251, 8) = 39$; this strong apparent periodicity 8 is illustrated in the graph of $D(1.6251)$ in Figure 13(b). Of course, when $\sigma = 13/8$, the eight branches collapse into five parallel lines corresponding to the sine values $0, \pm 1, \pm\sqrt{2}/2$, which means that $Q(\frac{13}{8}, 8) = 0$. We therefore say that the domain of significant periodicity for $13/8$ is an interval of angular velocities $\nu$ punctured by $\frac{13}{8}$, for which $Q(\nu, 8)$ is at least 3.


Figure 13. A weak and a strong apparent periodicity 8: (a) $D(1.6263)$; (b) $D(1.6251)$ (horizontal axis: 0 to 2000 years; vertical axis: $-1$ to $1$)

What about periodicity domains for other $\sigma$ values, such as $\sigma = \frac{5}{3}$, $\frac{7}{4}$, $\frac{12}{7}$, $\frac{17}{10}$, and $\frac{19}{11}$, as shown in Figure 14? It should come as no surprise that for $\sigma$ values taken within the significant periodicity domains of these numbers, $D(\sigma)$ will exhibit apparent periodicity of 3, 4, 7, 10, and 11, respectively. For example, the data set $D(1.714)$ in Figure 15(a) shows apparent periodicity 7, and is well within the significant periodicity domain of $\frac{12}{7} \approx 1.71429$.

Figure 14. Domains of periodicity (horizontal axis: $\sigma$ from about 1.62 to 1.80, with domains marked at $13/8$, $5/3$, $17/10$, $12/7$, $19/11$, and $7/4$)

Figure 15. Apparent periodicities 7 and 9: (a) $D(1.714)$, over 2000 years; (b) $\{n, W_1(t)\}_{n \ge 0}$ where $\sigma = \frac{11\sqrt{2}}{10}$, over 12000 years


The next example is an application of the transit rule corresponding to an apparent periodicity other than 8.

Example 3. Let $\sigma = \frac{11\sqrt{2}}{10} \approx 1.55563$. The plot of $D(\sigma)$, Figure 15(b), shows that its apparent periodicity is $p = 9$. Since $\frac{14}{9}$ is that fraction of integers with denominator 9 nearest $\sigma$, the analog of (15) is

$$\sigma - \frac{14}{9} = \frac{\alpha}{2\pi} = \frac{1}{T},$$

which gives $T \approx 12{,}600.3$ years. Solving (19) gives the shuffling factor $r = 7$ rather than 3. Now let $\delta = 0$, $k = 0$, and $j = 5$, which means that we are looking for a transit year with residue $jr \bmod 9 \equiv 8$ near time $5\beta = 5T/9 \approx 7000.17$. Thus, $m = (777)(9) + 8 = 7001$ is a transit year. With this new value of $\sigma$, $V$ has receded from $S$, so the distance $d(9)$ between the rungs has changed to $d(9) \approx 0.0014$ by (14), which means that we have more than twin transits; in fact we have septuplets, as shown in Figure 16(a).

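Figure 16. Transits with $\sigma$ other than $\sigma_0$: (a) a transit family of septuplets ($T_{6974}$, $T_{6983}$, $T_{6992}$, $T_{7001}$, $T_{7010}$, $T_{7019}$, $T_{7028}$), $\sigma = \frac{11\sqrt{2}}{10}$; (b) hunting for a phase angle $\delta$ (the actual June 2012 transit path through $Y$ and $Z$, and the linear model approximation of the June 2012 transit).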

7. A REALITY CHECK. How does our model contrast with reality?

A phenomenon omitted thus far from our transit model is the tendency of objects to rotate, including the orbital planes of $V$ and $E$, a feature called precession. The values $\tau_e$ and $\tau_v$ used to define $\sigma_0$ are the periods of the two planets with respect to the background of the fixed stars. To adapt our model appropriately, we must incorporate slightly different periods, namely, the time it takes for a planet to return to its aphelion. Since $E$ precesses faster than $V$, as time goes on the nexus line rotates and hence spring and fall transits occur later in the year. Because precession rates are tiny compared to $\sigma_0$, we arbitrarily take $\sigma_0 \approx 1.625550000$. Meeus [5, p. 13] predicts that “an almost exactly central transit will take place on 11 July 5900,” that is, a transit through $S$'s center. Thus from 2012 to 5900, the spring transit has become a summer transit, having slipped forward by about 35 days during a lapse of 3888 years. The change in the relative orbital speeds of $V$ and $E$ with respect to the nexus line is therefore

\[ \Delta\sigma \approx \frac{35\,\sigma_0}{3888\,\tau_e} \approx 0.0000397559, \]

so we might try the new angular velocity $\sigma_1 = \sigma_0 - \Delta\sigma \approx 1.625510244$.

Next, we need a phase shift $\delta$ to start our model. From [5, p. 48], the transit of 6 June 2012 crossed $S$'s boundary at $Y \approx 39.45^\circ$ and at $Z \approx 291.4^\circ$ measured counterclockwise from the top of $S$, shown as a dotted line in Figure 16(b). Adjusting (1) and (5) so that the trigonometric arguments $2\pi\sigma t$ are replaced by $2\pi\sigma(t + \delta)$, where $\delta$ is an indeterminate phase shift, and using a search method to find $\delta$ by dynamically plotting $W(t - 2012)$ near $t = 2012$, yields the solid-line transit in Figure 16(b), suggesting that $\delta \approx 0.00102$ is a good match. The two transit lines are non-parallel because $E$'s and $V$'s actual orbits have positive eccentricity. When we apply (13) in this adjusted model for the years from 700 to 3000 AD, we find the promising spring transit Gregorian year possibilities of Table 4. The underlined years indicate a match between our results and Meeus's. Not bad for a linear model. But can we do better?

Table 4. The linear model versus Meeus's model

This linear model: (781, 789), (1024, 1032), 1275, 1518, (1761, 1769), (2004, 2012), 2255, 2498, 2741, (2984, 2992)

Meeus's model: (789, 797), (1032, 1040), (1275, 1283), (1518, 1526), (1761, 1769), (2004, 2012), (2247, 2255), (2490, 2498), (2733, 2741), (2976, 2984)
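One quick way to see where the two rows of Table 4 agree is to intersect them as sets. The sketch below transcribes the years from the table; the list names are illustrative only.

```python
# Years transcribed from Table 4 (twin transits flattened into single years).
linear_model = [781, 789, 1024, 1032, 1275, 1518, 1761, 1769,
                2004, 2012, 2255, 2498, 2741, 2984, 2992]
meeus_model = [789, 797, 1032, 1040, 1275, 1283, 1518, 1526, 1761, 1769,
               2004, 2012, 2247, 2255, 2490, 2498, 2733, 2741, 2976, 2984]

matches = sorted(set(linear_model) & set(meeus_model))    # years found by both models
missed = sorted(set(meeus_model) - set(linear_model))     # Meeus years the linear model lacks
spurious = sorted(set(linear_model) - set(meeus_model))   # linear-model years Meeus lacks

print(matches)
print(missed)
print(spurious)
```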

To do so, we work backward through the transit rule and find a magic angular velocity. Since $\sigma_1$ is within the periodicity domain of $\frac{13}{8}$, the corresponding shuffling factor is $r = 3$. We make use of a second unusual spring transit year, 183 BC, whose corresponding transit Meeus describes as “almost central.” The difference between 5900 AD and 183 BC is 6083 years. Identify $t = 0$ with year 5900. Thus, year 183 BC is referenced by $t = -6083 = 8(-761) + 5$, which means that $5 \equiv 3j \bmod 8$, whose solution is $j = 7$. Using the angular velocity $\sigma_1$ with (15), the associated period is $T_1 \approx 1959.85$. We then solve $kT_1 + \frac{7T_1}{8} = -6083$, getting $k \approx -3.98$. Next, reset $k$ as $k = -4$, and solve $\left(k + \frac{7}{8}\right)T_2 = -6083$, getting $T_2 = \frac{48664}{25}$. By (15),

\[ \sigma_2 = \frac{1}{T_2} + \frac{13}{8} = \frac{25}{48664} + \frac{13}{8} = \frac{9888}{6083} \approx 1.6255137267795495644. \]
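The backward pass through the transit rule is exact rational arithmetic, so it can be replayed with Python's `fractions` module. This is a sketch under the assumptions stated in the text ($r = 3$, period 8, $t = -6083$, $\sigma_1 = 1.625510244$); the variable names are illustrative.

```python
from fractions import Fraction

r, p = 3, 8                      # shuffling factor and periodicity for the domain of 13/8
t = -6083                        # year 183 BC measured from t = 0 at year 5900

residue = t % p                  # Python's % is non-negative here: -6083 = 8*(-761) + 5
j = next(j for j in range(p) if (r * j) % p == residue)   # solve residue = 3j (mod 8)

sigma1 = Fraction("1.625510244")
T1 = 1 / (sigma1 - Fraction(13, 8))          # period from (15)
k_estimate = Fraction(t) / T1 - Fraction(j, p)   # solve k*T1 + (j/8)*T1 = t for k
k = round(k_estimate)                        # reset k to the nearest integer

T2 = Fraction(t) / (k + Fraction(j, p))      # solve (k + j/8)*T2 = t
sigma2 = 1 / T2 + Fraction(13, 8)

print(j, float(T1), float(k_estimate), k, T2, sigma2)
```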

When we generate transits by the transit rule using angular velocity $\sigma_2$ across the years 2000 BC to 4000 AD, we get an exact match with actual spring transits from Meeus's results.

Table 5. Spring transit years, generated by the transit rule

1884 BC 1641 BC 1398 BC 1155 BC 912 BC 669 BC 426 BC 183 BC 60 303

546 789 1032 1275 1518 1761 2004 2247 2490 2733

2984∗ 3227 3470 3713 3956 4199 4442 4685 4928 5171

As can be seen, the difference between successive entries in Table 5 is 243 years, except when passing from 2733 to 2984, the year marked with an asterisk. The match between these two approaches with respect to the recessive partner in twin transits is less spectacular.
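The spacing claim about Table 5 is easy to confirm. The sketch below writes BC years as negative numbers (183 BC as -183), which is an assumption chosen to agree with the statement that 5900 AD and 183 BC are 6083 years apart.

```python
# Table 5, with BC years entered as negative numbers (assumed convention, see above).
years = [-1884, -1641, -1398, -1155, -912, -669, -426, -183, 60, 303,
         546, 789, 1032, 1275, 1518, 1761, 2004, 2247, 2490, 2733,
         2984, 3227, 3470, 3713, 3956, 4199, 4442, 4685, 4928, 5171]

# Consecutive pairs whose gap differs from 243 years.
exceptions = [(a, b, b - a) for a, b in zip(years, years[1:]) if b - a != 243]
print(exceptions)
```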

8. SOME PARTING OBSERVATIONS AND QUESTIONS. What we have shown is that the cycle of transits is the way it is because $V$'s angular velocity $\sigma_0$ is enmeshed within the periodicity domain of $\frac{13}{8}$. This in turn induces a modulo 8 shuffling of successive transit years by a factor of 3, a phenomenon reflected in the 6000-year standard tables of transits generated by Meeus [5], provided we partition transits into two families: spring transits and fall transits, and discard one of the years from each twin transit.

With respect to the notion of periodicity domains, some natural questions arise. Does every $D(\sigma)$ have a well-defined apparent periodicity? For a challenge, try $\sigma = \sqrt{3}$. What happens when $\sigma$ wanders into overlapping periodicity domains? The reality is a war-torn fractal-like dominance landscape foreshadowed in part by Figure 14. As a simple example, $\frac{35}{52} \approx 0.673077$ exerts its 52-ness dominance over its immediate neighbors. Yet, it is well within the dominance of $\frac{2}{3}$; an examination of $D\left(\frac{35}{52}\right)$ shows a clear 3-fold periodicity, and the periodicity quotient $Q\left(\frac{35}{52}, 3\right) = 4$ supports this result. However, $Q\left(\frac{35}{52} - 0.000001, 52\right) = 92$ and a plot of its corresponding data set suggests periodicity 52.

With respect to permanence, in the life cycle of $S$, $S$ slowly loses mass and swells to giant status and so the orbits of the planets recede from $S$, which means that the transit cycle for $V$ may change dramatically. The rational numbers with small integer denominator near $\frac{13}{8}$ in increasing order are

\[ \left\{ \frac{3}{2}, \frac{11}{7}, \frac{8}{5}, \frac{29}{18}, \frac{21}{13}, \frac{13}{8}, \frac{31}{19}, \frac{18}{11}, \frac{23}{14}, \frac{28}{17}, \frac{33}{20}, \frac{5}{3}, \frac{7}{4} \right\}. \]

A billion or two years from now, the natural periodicity of the Venus transit may change from 8 to 13 or 19. Hopefully, humans will yet be here to see.

For an application of the ideas of this paper to the phases of the Moon, see [8]. Just as the transit of Venus involves the periodicity domain of $\frac{13}{8}$, so too the phases of the Moon involve the periodicity domain of another fraction, this time $\frac{235}{19}$.

ACKNOWLEDGMENT. Thanks to Osmo Pekonen for asking me to write a review [7] of [11] which in turn sparked this project.

REFERENCES

1. G. K. Chesterton, Heretics. Reprint of the 1905 edition, Books for Libraries Press, Freeport, NY, 1970.
2. M. Danlous-Dumesnils, Periodicite des passages de Venus, L’Astronomie 91 (1977) 117–127.
3. F. Espenak, Six millenium catalog of Venus transits, NASA, 2013, available at http://eclipse.gsfc.nasa.gov/transit/catalog/VenusCatalog.html.
4. E. Halley, A new method of determining the parallax of the Sun, or his distance from the Earth, in The Abridged Transactions of the Royal Society 6 (1809) 243–249.
5. J. Meeus, Transits. William-Bell Press, Richmond, VA, 1989.
6. J. Meeus, The transits of Venus, 3000 BC to AD 3000, Journal of the British Astronomical Association 68 (1958) 98–108.
7. A. Simoson, A review of [11], Math. Intel. 35 (2013) 84–85.
8. A. Simoson, Bilbo and the last moon of autumn, to appear in Math Horizons.
9. D. A. Teets, Transits of Venus and the astronomical unit, Math. Mag. 76 (2003) 225–348.
10. H. Woolf, The Transits of Venus: A Study of Eighteenth Century Science. Princeton University Press, Princeton, NJ, 1959.
11. A. Wulf, Chasing Venus: the Race to Measure the Heavens. Alfred Knopf Press, New York, 2012.

ANDREW J. SIMOSON is a longtime professor of mathematics at King University. Recently he stumbled upon a pertinent Chesterton quote, “Men take thought and ponder rationalistically touching remote things—things that only theoretically matter, such as the transit of Venus” [1, p. 141].
King University, 1350 King College Road, Bristol, TN
[email protected]


A Drug-Induced Random Walk
Author(s): Daniel J. Velleman
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 299-317
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.299


A Drug-Induced Random Walk

Daniel J. Velleman

Abstract. The label on a bottle of pills says “Take one half pill daily.” A natural way to proceed is as follows: Every day, remove a pill from the bottle at random. If it is a whole pill, break it in half, take one half, and return the other half to the bottle; if it is a half pill, take it. We analyze the history of such a pill bottle.

1. INTRODUCTION. A few years ago our cat Natasha (see Figure 1) began losing weight. We took her to the vet, who did some tests and determined that she had a thyroid condition. He gave us a bottle of pills and told us to give her half a pill every day.

Figure 1. Natasha

The next day we shook a pill out of the bottle, broke it in half, gave her half of the pill, and put the other half back in the bottle. We repeated that procedure for several more days. Eventually, a day came when the pill we shook out of the bottle was one of the half pills we had put back in on one of the previous days. Of course, we just gave her the half pill that day. We continued to follow this procedure until the bottle was empty, and then we started on a new bottle.

The pills solved Natasha’s medical problem; she regained the weight she had lost, and she’s doing fine now. But they created an interesting mathematical problem. The state of the pill bottle on any day can be described by a pair of numbers $(w, h)$, where $w$ is the number of whole pills in the bottle and $h$ is the number of half pills. We will assume that every day a pill is removed from the bottle at random, with each pill being equally likely to be chosen. When a whole pill is removed, it is cut in half and half of it is returned to the bottle; when a half pill is removed, nothing is returned to the bottle. Thus, if the state of the pill bottle on a particular day is $(w, h)$, then with probability $w/(w + h)$ the state on the next day will be $(w - 1, h + 1)$, and with probability $h/(w + h)$ it will be $(w, h - 1)$. This means that the state of the pill bottle executes a random walk in the plane, starting at the point $(w, h) = (n, 0)$, where $n$ is the initial number of pills in the bottle, and ending at $(0, 0)$. Since the bottle contains $2n$ doses of medicine, the walk takes $2n$ steps.

http://dx.doi.org/10.4169/amer.math.monthly.121.04.299
MSC: Primary 60G50, Secondary 65L05

For example, Figure 2 shows a computer simulation of a pill-bottle walk starting with $n = 20$ pills. On the first three days, whole pills are removed from the bottle, and the state of the bottle goes from $(20, 0)$ to $(19, 1)$, $(18, 2)$, and $(17, 3)$. The next day, a half pill is removed, and the state goes to $(17, 2)$. And the walk continues for 36 more steps until it ends at $(0, 0)$.
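A pill-bottle walk of this kind is straightforward to simulate. The following is a minimal sketch, not the program used for the figures; the function name `pill_walk` and the seed are arbitrary choices.

```python
import random

def pill_walk(n, rng):
    """Return the states (w, h) visited by a pill-bottle walk starting at (n, 0)."""
    w, h = n, 0
    path = [(w, h)]
    while w + h > 0:
        # Every piece in the bottle, whole or half, is equally likely to be drawn.
        if rng.random() < w / (w + h):
            w, h = w - 1, h + 1    # whole pill: take one half, return the other half
        else:
            h -= 1                 # half pill: take it, return nothing
        path.append((w, h))
    return path

path = pill_walk(20, random.Random(2014))   # arbitrary seed, for reproducibility
print(len(path) - 1, path[:5], path[-1])
```

Because each step uses up exactly one dose, every run takes $2n$ steps regardless of the random choices; only the route from $(n, 0)$ to $(0, 0)$ varies.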

Figure 2. A pill-bottle walk with $n = 20$ (horizontal axis $w$, vertical axis $h$).

Figure 3 shows simulated walks with $n = 100$, $n = 1000$, and $n = 10000$. It appears that although the walks are random, the overall shapes of the walks are similar, with the shape becoming smoother as $n$ increases. Notice that the scales of the three walks in Figure 3 are different; the first starts at $(100, 0)$, the second at $(1000, 0)$, and the third at $(10000, 0)$. It is only when they are drawn the same size that they look similar. This suggests that we should rescale the walks to a uniform size, independent of $n$. We will therefore switch to a new coordinate system. If we let $x = w/n$ and $y = h/n$, then $x$ represents the fraction of the original $n$ pills that are still whole, and $y$ represents the fraction that have become half pills. Notice that these fractions may add up to less than 1, since some fraction of the pills may have been used up completely.

Figure 3. Walks with $n = 100$ (top left), $n = 1000$ (bottom), and $n = 10000$ (top right) (horizontal axis $w$, vertical axis $h$).


Using the coordinates $(x, y)$ to represent the state of the pill bottle, we get a random walk that starts at $(1, 0)$, ends at $(0, 0)$, and stays in the triangle $x + y \le 1$, $x \ge 0$, $y \ge 0$. When the state is $(x, y)$, it changes as follows:

• with probability $\frac{x}{x+y}$, the state changes to $\left(x - \frac{1}{n}, y + \frac{1}{n}\right)$;

• with probability $\frac{y}{x+y}$, the state changes to $\left(x, y - \frac{1}{n}\right)$.

We will call such a walk an $n$-walk. Increasing $n$ does not make the walk larger, but it makes the steps smaller. Figure 3 suggests that as $n$ increases, the walk approaches a smooth curve. What is this curve?

The limit curve we seek is an example of a scaling limit of a discrete process. Perhaps the best-known example of a scaling limit is Brownian motion, which can also be thought of as the scaling limit of a random walk. For more on Brownian motion and scaling limits, see [5].

We first give an intuitive argument that suggests a possible answer to our question. We will find it helpful to introduce a third variable $t$, standing for time. We set $t = 0$ at the beginning of the walk, and to keep the scales of the variables comparable we will assume that $t$ increases by $1/n$ for each step of the walk. Since the walk consists of $2n$ steps, this means that $t$ will run from 0 to 2. We think of the limit curve as being given by parametric equations

\[ x = f_x(t), \quad y = f_y(t), \quad 0 \le t \le 2, \]

or, in vector notation,

\[ (x, y) = (f_x(t), f_y(t)) = \mathbf{f}(t), \quad 0 \le t \le 2. \]

When the state of an $n$-walk is $(x, y)$, the displacement to the next state is either the vector $(-1/n, 1/n)$, with probability $x/(x + y)$, or $(0, -1/n)$, with probability $y/(x + y)$. Thus, the expected value of the displacement is

\[ \frac{x}{x+y}\left(-\frac{1}{n}, \frac{1}{n}\right) + \frac{y}{x+y}\left(0, -\frac{1}{n}\right) = \frac{1}{n}\left(-\frac{x}{x+y}, \frac{x-y}{x+y}\right). \]

Since $t$ increases by $1/n$ during the step, this suggests that the parametric form of the limit curve might be a solution to the system of differential equations

\[ \frac{dx}{dt} = -\frac{x}{x+y}, \qquad \frac{dy}{dt} = \frac{x-y}{x+y}. \tag{1} \]

To solve this system of equations, we first note that

\[ \frac{dy}{dx} = \frac{dy/dt}{dx/dt} = -\frac{x-y}{x} = -1 + \frac{y}{x}. \]

We will let you check that the curve $y = -x \ln x$ satisfies this equation for $0 < x \le 1$ and passes through the point $(1, 0)$. The graph of this curve is shown in Figure 4, and the similarity to the walks in Figure 3 is striking. Notice that although $\ln 0$ is undefined, $\lim_{x \to 0^+}(x \ln x) = 0$. From now on we consider $0 \ln 0$ to be equal to 0, so that the curve $y = -x \ln x$ includes the point $(0, 0)$.
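For readers who want the check written out, it takes one line. Differentiating $y = -x \ln x$ by the product rule and then rewriting $\ln x$ in terms of $y$ gives:

```latex
\[
\frac{dy}{dx} = \frac{d}{dx}\bigl(-x \ln x\bigr) = -\ln x - 1
             = -1 + \frac{-x \ln x}{x} = -1 + \frac{y}{x},
\qquad
y\big|_{x=1} = -1 \cdot \ln 1 = 0 .
\]
```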


Figure 4. The graph of $y = -x \ln x$.

Substituting $y = -x \ln x$ in the first equation in (1), we get

\[ \frac{dx}{dt} = -\frac{x}{x - x \ln x} = \frac{1}{\ln x - 1}. \]

Separation of variables gives

\[ t = \int (\ln x - 1)\, dx = x \ln x - 2x + C. \]

Since $x = 1$ when $t = 0$, we must have $C = 2$, and therefore

\[ t = x \ln x - 2x + 2. \tag{2} \]

Let $g(x) = x \ln x - 2x + 2$ for $0 \le x \le 1$. (Notice that by our convention that $0 \ln 0 = 0$, we have $g(0) = 2$.) Then $g$ maps $[0, 1]$ onto $[0, 2]$ and is strictly decreasing, so it has an inverse. We define $f_x$ to be the inverse of $g$, which is a strictly decreasing function mapping $[0, 2]$ to $[0, 1]$. Thus, if $0 \le t \le 2$ and $x = f_x(t)$, then $x$ and $t$ satisfy equation (2).¹

Using $y = -x \ln x$, we can rewrite equation (2) as $t = -y - 2x + 2$, or equivalently $y = 2 - 2x - t$. We therefore define

\[ f_y(t) = 2 - 2 f_x(t) - t. \tag{3} \]

We leave it to you to verify that the equation

\[ (x, y) = (f_x(t), f_y(t)) = \mathbf{f}(t), \quad 0 \le t \le 2 \tag{4} \]

parametrizes the curve $y = -x \ln x$ shown in Figure 4, and it satisfies the differential equations (1) for $0 \le t < 2$, where we interpret the derivatives at $t = 0$ as one-sided derivatives. (At $t = 2$, we have $x = y = 0$, and therefore the right-hand sides of the equations in (1) are undefined.) The graphs of $f_x$ and $f_y$ are shown in Figure 5.
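Since $f_x$ is defined only as the inverse of a strictly decreasing function, one simple way to evaluate it numerically is bisection. The sketch below is an illustration under that assumption rather than anything from the paper; the function names and the tolerance are arbitrary.

```python
import math

def g(x):
    """g(x) = x ln x - 2x + 2 on [0, 1], using the convention 0 ln 0 = 0."""
    return 2.0 if x == 0 else x * math.log(x) - 2 * x + 2

def f_x(t, tol=1e-12):
    """Invert g by bisection: the x in [0, 1] with g(x) = t, for 0 <= t <= 2."""
    lo, hi = 0.0, 1.0              # g(0) = 2 and g(1) = 0, with g strictly decreasing
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) > t:
            lo = mid               # g(mid) is too large, so the solution lies to the right
        else:
            hi = mid
    return (lo + hi) / 2

def f_y(t):
    """Definition (3): f_y(t) = 2 - 2 f_x(t) - t."""
    return 2 - 2 * f_x(t) - t

x = f_x(1.0)
print(x, f_y(1.0), -x * math.log(x))   # the last two values should agree
```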

It turns out that an $n$-walk does, indeed, approach the curve (4) as $n$ approaches $\infty$, but the sense in which this is true must be stated carefully. Our main theorem is the following.

¹ Using the Lambert $W$ function $W_{-1}$ (see [1]), we can express $f_x(t)$ explicitly by the equation

\[ f_x(t) = \frac{t - 2}{W_{-1}\left((t - 2)/e^2\right)}. \]

However, we will not have any use for this expression.


Figure 5. The graphs of $x = f_x(t)$ (left) and $y = f_y(t)$ (right).

Theorem 1. Suppose that $\varepsilon > 0$. Let the points on an $n$-walk be $p_0 = (1, 0), p_1, \ldots, p_{2n} = (0, 0)$, and for $0 \le i \le 2n$ let $t_i = i/n$. Then the probability that for every $i$, $\|p_i - \mathbf{f}(t_i)\| < \varepsilon$ approaches 1 as $n \to \infty$. In other words, the $n$-walk converges uniformly in probability to the limit curve.

Two notable features of the limit curve are that the tangent line at $(1, 0)$ has slope $-1$, and the tangent line at the origin is vertical. The first feature makes intuitive sense: early in the walk, almost all of the pills in the bottle are whole pills, so it is likely that several whole pills will be removed before the first half pill is removed. For example, in the walk in Figure 2, three whole pills were removed before the first half pill was removed. When these initial whole pills are removed, the walk will move along the line $y = 1 - x$, which is the tangent line at $(1, 0)$. The second feature seems more surprising: it appears that near the end of the walk, almost all of the pills are half pills, and the walk ends by moving along the line $x = 0$ toward the origin. This suggests two questions.

Question 1. For a bottle of $n$ pills, what is the expected number of whole pills that are removed from the bottle before the first half pill is removed?

Question 2. For a bottle of $n$ pills, what is the expected number of half pills that are removed from the bottle after the last whole pill is removed?

Versions of Question 1 have appeared in the literature before (see, for example, [3, 4, 6, 8]). In the case $n = 365$, it is equivalent to the following version of the birthday problem: If people are chosen at random, one by one, what is the expected number of people with distinct birthdays who will be chosen before the first person who has the same birthday as a previously chosen person? We will give an elementary derivation of the answer to Question 1. In our next theorem, we express the answer in terms of the incomplete gamma function, which is defined as follows:

\[ \Gamma(a, x) = \int_x^\infty t^{a-1} e^{-t}\, dt. \]

Theorem 2. For a bottle of $n$ pills, the expected number of whole pills that are removed from the bottle before the first half pill is removed is

\[ \frac{e^n}{n^{n-1}}\, \Gamma(n, n). \]

As $n \to \infty$, this expected value is asymptotic to

\[ \sqrt{\frac{\pi n}{2}}. \]

The answer to Question 2 was found by Richard Stong.

Theorem 3 (Stong). For a bottle of $n$ pills, the expected number of half pills that are removed from the bottle after the last whole pill is removed is the $n$th harmonic number,

\[ H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}. \]

For example, for a bottle of 100 pills, the expected number of whole pills before the first half pill is

\[ \frac{e^{100}}{100^{99}}\, \Gamma(100, 100) \approx 12.21, \]

and the asymptotic approximation in Theorem 2 is

\[ \sqrt{\frac{100\pi}{2}} \approx 12.53. \]

The expected number of half pills after the last whole pill is

\[ H_{100} \approx 5.19. \]
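These three numbers can be reproduced without evaluating the incomplete gamma function. The sketch below rests on an assumption that is not the paper's derivation: it uses the tail-sum identity $E[N] = \sum_{k \ge 1} P(N \ge k)$, where the first $k$ draws are all whole pills with probability $\prod_{i=0}^{k-1} (n - i)/n$, since after $i$ whole pills have been removed the bottle holds $n - i$ whole pills among $n$ pieces. Function names are illustrative.

```python
import math

def expected_whole_before_first_half(n):
    """Sum over k = 1..n of P(the first k pills drawn are all whole)."""
    total, prob = 0.0, 1.0
    for i in range(n):
        prob *= (n - i) / n     # draw i+1 is whole, given the first i draws were whole
        total += prob
    return total

def harmonic(n):
    """The nth harmonic number H_n."""
    return sum(1 / k for k in range(1, n + 1))

n = 100
print(expected_whole_before_first_half(n))   # compare with 12.21 in the text
print(math.sqrt(math.pi * n / 2))            # compare with 12.53
print(harmonic(n))                           # compare with 5.19
```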

The rest of this paper is devoted to the proofs of Theorems 1–3. We prove Theorem 1 in Section 3, and Theorems 2 and 3 in Section 4. We consider variations on these theorems in Section 5.

2. BACKGROUND FOR PROOF OF THEOREM 1. In preparation for the proof of Theorem 1, we simplify the problem by eliminating one variable. According to definition (3), $f_y(t) = 2 - 2 f_x(t) - t$, so

\[ \mathbf{f}(t) = (f_x(t), 2 - 2 f_x(t) - t) = f_x(t)(1, -2) + (0, 2 - t). \]

A similar equation holds for the points on any $n$-walk. Suppose that after $i$ steps, the $n$-walk is at the point $p_i = (x_i, y_i)$, and let $t_i = i/n$. This means that there are $w_i = n x_i$ whole pills and $h_i = n y_i$ half pills in the bottle. These pills are enough for $2w_i + h_i$ doses of medicine. Since there were $2n$ doses in the bottle originally, and $i$ of those doses have been used up, there must be $2n - i$ doses left. Therefore, $2w_i + h_i = 2n - i$, or equivalently, $h_i = 2n - 2w_i - i$. Dividing through by $n$, we find that

\[ y_i = 2 - 2x_i - t_i, \tag{5} \]

and therefore

\[ p_i = (x_i, 2 - 2x_i - t_i) = x_i(1, -2) + (0, 2 - t_i). \]


It follows that

\[ \|p_i - \mathbf{f}(t_i)\| = \|(x_i - f_x(t_i))(1, -2)\| = |x_i - f_x(t_i)|\sqrt{5}. \]

Thus, to ensure that $p_i$ is close to $\mathbf{f}(t_i)$, it will suffice to ensure that $x_i$ is close to $f_x(t_i)$; we can ignore the $y$-coordinates of $p_i$ and $\mathbf{f}(t_i)$. In other words, to prove Theorem 1 it will suffice to prove the following lemma.

Lemma 4. Suppose that $\varepsilon > 0$. Let the $x$-coordinates of the points on an $n$-walk be $x_0 = 1, x_1, \ldots, x_{2n} = 0$, and for $0 \le i \le 2n$ let $t_i = i/n$. Then the probability that for every $i$, $|x_i - f_x(t_i)| < \varepsilon$ approaches 1 as $n \to \infty$.

In fact, using equations (3) and (5), we can completely eliminate the variable $y$ from the problem. We can describe the $x$-coordinates of the points on an $n$-walk by saying that $x_{i+1}$ is equal to either $x_i - 1/n$ or $x_i$, with the first possibility occurring with probability

\[ \frac{x_i}{x_i + y_i} = \frac{x_i}{x_i + 2 - 2x_i - t_i} = \frac{x_i}{2 - x_i - t_i}. \tag{6} \]

Similarly, if $x = f_x(t)$ and $y = f_y(t)$, then for $0 \le t < 2$,

\[ f_x'(t) = \frac{dx}{dt} = -\frac{x}{x + y} = -\frac{x}{2 - x - t} = -\frac{f_x(t)}{2 - f_x(t) - t}. \tag{7} \]

Thus, we can work entirely with the points $(t_i, x_i)$ and the curve $x = f_x(t)$, both of which lie in the $tx$-plane.

The idea behind our proof of Lemma 4 is straightforward. Let $m$ be a large positive integer, and let $n$ be an integer much larger than $m$. Now consider an $n$-walk, and break the $2n$ steps of the walk into $m$ large blocks of steps. We view the $n$-walk in the $tx$-plane, ignoring the $y$-coordinates. The individual steps of the $n$-walk are random and unpredictable, but the net change in $x$ that results from a large block of steps is more predictable: by the law of large numbers, this net change is likely to be close to its expected value. It will follow that if a block of steps starts at a point $(t, x)$, then the net result of this block of steps is likely to be a small displacement in the $tx$-plane whose slope is close to $-x/(2 - x - t)$. Since $x = f_x(t)$ is a solution to the differential equation $dx/dt = -x/(2 - x - t)$, this means that the steps of the $n$-walk should stay close to the graph of $f_x$.

This proof sketch suggests that our proof will involve ideas related to Euler’s method. Recall that Euler’s method is a numerical method for solving a differential equation of the form $f'(t) = F(t, f(t))$ for $a \le t \le b$, with an initial condition $f(a) = x_0$. Here the function $F$ and the numbers $a$, $b$, and $x_0$ are given, and we want to compute values of $f$. To apply Euler’s method, we choose a positive integer $n$ and a positive step size $h \le (b - a)/n$, let $t_j = a + jh$ for $0 \le j \le n$, and then define $x_j$ recursively by the equation

\[ x_{j+1} = x_j + hF(t_j, x_j), \quad 0 \le j < n. \]

Thus, the displacement from $(t_j, x_j)$ to $(t_{j+1}, x_{j+1})$ has slope $F(t_j, x_j)$. If $h$ is small and $F$ is sufficiently well-behaved, then the points $(t_j, x_j)$ will be close to the graph of $f$.
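As a concrete illustration, here is a minimal sketch of Euler's method applied to $dx/dt = -x/(2 - x - t)$ with $x(0) = 1$. The choices $b = 1.5$, $n = 1500$, and $h = (b - a)/n$ are arbitrary assumptions made for the example. Accuracy is measured through equation (2), $t = x \ln x - 2x + 2$, which the exact solution satisfies.

```python
import math

def euler(F, a, b, x0, n):
    """Euler's method with step h = (b - a)/n; returns the lists of t_j and x_j."""
    h = (b - a) / n
    ts, xs = [a], [x0]
    for j in range(n):
        xs.append(xs[-1] + h * F(ts[-1], xs[-1]))   # x_{j+1} = x_j + h F(t_j, x_j)
        ts.append(a + (j + 1) * h)
    return ts, xs

def F(t, x):
    """Right-hand side of dx/dt = -x/(2 - x - t)."""
    return -x / (2 - x - t)

ts, xs = euler(F, 0.0, 1.5, 1.0, 1500)

# On the exact curve, x ln x - 2x + 2 - t = 0, so this residual measures the drift.
residual = max(abs(x * math.log(x) - 2 * x + 2 - t) for t, x in zip(ts, xs))
print(xs[-1], residual)
```

Stopping at $b = 1.5$ keeps the computation away from $t = 2$, where the right-hand side is undefined.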


We will need to modify Euler’s method slightly, because according to our proof sketch for Lemma 4, the slope of the displacement caused by a block of steps in the $n$-walk starting at $(t, x)$ is likely to be close to $-x/(2 - x - t)$, but not exactly equal to it. We will therefore need a version of Euler’s method in which the slope of the displacement at step $j$ is only approximately equal to $F(t_j, x_j)$.

To make this precise, suppose that $a < b$, $g_1$ and $g_2$ are functions from $[a, b]$ to $\mathbb{R}$, and for all $t \in [a, b]$, $g_1(t) < g_2(t)$. Let

\[ D = \{(t, x) \in \mathbb{R}^2 : a \le t \le b \text{ and } g_1(t) \le x \le g_2(t)\}. \]

Now suppose that $F : D \to \mathbb{R}$ and $f : [a, b] \to \mathbb{R}$, and for all $t \in [a, b]$, $(t, f(t)) \in D$ and

\[ f'(t) = F(t, f(t)), \]

where we interpret $f'(t)$ as a one-sided derivative when $t = a$ or $t = b$. Let $x_0 = f(a)$. We want to use a version of Euler’s method to locate points $(t_j, x_j)$ near the graph of $f$. As before, we will use a positive step size $h \le (b - a)/n$, so for $0 \le j \le n$ we let $t_j = a + jh$. We will assume that for $0 \le j < n$, the slope of the displacement from $(t_j, x_j)$ to $(t_{j+1}, x_{j+1})$ deviates from $F(t_j, x_j)$ by some amount $\delta_j$. Thus, we recursively define

\[ x_{j+1} = x_j + h(F(t_j, x_j) + \delta_j). \]

To ensure that this formula is defined, we assume that for every $j$, $g_1(t_j) \le x_j \le g_2(t_j)$, so that $(t_j, x_j) \in D$.

Lemma 5. In the modified Euler’s method described above, assume that for $0 \le j < n$,

\[ |\delta_j| \le \delta. \]

We also assume that $\partial F/\partial x$ and $f''$ are defined and bounded. Thus, we assume that there are positive constants $C_1$ and $C_2$ such that for all $(t, x) \in D$,

\[ \left|\frac{\partial F}{\partial x}(t, x)\right| \le C_1, \qquad |f''(t)| \le C_2. \]

Then for $0 \le j \le n$,

\[ |x_j - f(t_j)| \le \left(\frac{hC_2}{2C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^j - 1\right). \tag{8} \]

Proof. We proceed by induction on $j$. Clearly, inequality (8) holds when $j = 0$, since both sides are 0. Now suppose that the inequality holds for some $j < n$. By Taylor’s theorem, we can write

\[ f(t_{j+1}) = f(t_j) + h f'(t_j) + \frac{h^2}{2} f''(c_j) \]

for some number $c_j$ between $t_j$ and $t_{j+1}$. And by the mean value theorem, we have

\[ F(t_j, x_j) = F(t_j, f(t_j)) + \frac{\partial F}{\partial x}(t_j, d_j)(x_j - f(t_j)) = f'(t_j) + \frac{\partial F}{\partial x}(t_j, d_j)(x_j - f(t_j)) \]

for some $d_j$ between $x_j$ and $f(t_j)$. Thus,

\[
\begin{aligned}
x_{j+1} - f(t_{j+1}) &= x_j + h(F(t_j, x_j) + \delta_j) - f(t_{j+1}) \\
&= x_j + h\left(f'(t_j) + \frac{\partial F}{\partial x}(t_j, d_j)(x_j - f(t_j)) + \delta_j\right) - \left(f(t_j) + h f'(t_j) + \frac{h^2}{2} f''(c_j)\right) \\
&= (x_j - f(t_j))\left(1 + h\frac{\partial F}{\partial x}(t_j, d_j)\right) + h\delta_j - \frac{h^2}{2} f''(c_j).
\end{aligned}
\]

Next, we take absolute values and apply the bounds given in the statement of the lemma:

\[ |x_{j+1} - f(t_{j+1})| \le |x_j - f(t_j)|(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2}. \]

Finally, we apply the inductive hypothesis to conclude that

\[
\begin{aligned}
|x_{j+1} - f(t_{j+1})| &\le \left(\frac{hC_2}{2C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^j - 1\right)(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2} \\
&= \left(\frac{hC_2}{2C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^{j+1} - 1\right),
\end{aligned}
\]

as required.
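The last equality in the proof is compressed. Writing $A = \frac{hC_2}{2C_1} + \frac{\delta}{C_1}$ (a shorthand introduced here only for this check), the key observation is that $A \cdot C_1 h = \frac{C_2 h^2}{2} + h\delta$, so the two leftover terms are absorbed:

```latex
\[
\begin{aligned}
A\bigl((1 + C_1 h)^j - 1\bigr)(1 + C_1 h) + h\delta + \frac{C_2 h^2}{2}
  &= A\bigl((1 + C_1 h)^{j+1} - (1 + C_1 h)\bigr) + A\,C_1 h \\
  &= A\bigl((1 + C_1 h)^{j+1} - 1\bigr).
\end{aligned}
\]
```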

3. PROOF OF THEOREM 1. To complete the proof of Theorem 1, we return to our proof sketch for Lemma 4. Unfortunately, nailing down the details of this proof sketch is not easy. Nevertheless, in this section we show that, with some care, a proof based on these ideas can be carried out.

Fix $\varepsilon > 0$. We will refer to the region $f_x(t) - \varepsilon < x < f_x(t) + \varepsilon$ in the $tx$-plane as the $\varepsilon$-corridor. To prove Lemma 4, we must show that for large $n$, an $n$-walk is likely to stay entirely inside the $\varepsilon$-corridor. We first determine simple bounds on any $n$-walk. At step $i$ of the walk, by (5) we have

\[ x_i \ge 0, \qquad 2 - 2x_i - t_i = y_i \ge 0, \]

and therefore

\[ 0 \le x_i \le \frac{2 - t_i}{2}. \tag{9} \]

Similar bounds apply to the graph of $f_x$: for $0 \le t \le 2$,

\[ 0 \le f_x(t) \le 1, \qquad 2 - 2 f_x(t) - t = f_y(t) = -f_x(t) \ln(f_x(t)) \ge 0, \]

so

\[ 0 \le f_x(t) \le \frac{2 - t}{2}. \tag{10} \]


These simple bounds already imply that the end of the $n$-walk stays inside the $\varepsilon$-corridor: if $t_i > 2 - 2\varepsilon$, then

\[ 0 \le x_i, f_x(t_i) \le \frac{2 - t_i}{2} < \varepsilon, \]

and therefore

\[ |x_i - f_x(t_i)| < \varepsilon. \]

Thus, we only need to worry about $t_i$ in the interval $[0, 2 - 2\varepsilon]$. In particular, if $\varepsilon > 1$, then there is nothing more to prove, so we can assume now that $\varepsilon \le 1$. By stopping short of $t = 2$, we avoid having to deal with the point $(t, x, y) = (2, 0, 0)$ on the limit curve, where the right-hand sides of the equations in (1) are undefined.

We will find it convenient to go a bit beyond $t = 2 - 2\varepsilon$, so we define

\[ D = \left\{(t, x) \in \mathbb{R}^2 : 0 \le t \le 2 - \varepsilon \text{ and } 0 \le x \le \frac{2 - t}{2}\right\}, \]

and for $(t, x) \in D$ we let

\[ F(t, x) = -\frac{x}{2 - x - t}. \]

Notice that for $(t, x) \in D$,

\[ 2 - x - t \ge 2 - \frac{2 - t}{2} - t = \frac{2 - t}{2} > 0, \tag{11} \]

so $F(t, x)$ is defined.

By (9) and (10), any $n$-walk and the curve $x = f_x(t)$ both stay in the region $D$ up to time $t = 2 - \varepsilon$, and by (7), if $0 \le t \le 2 - \varepsilon$, then $f_x'(t) = F(t, f_x(t))$. Thus, it makes sense to apply Lemma 5 to the functions $F$ and $f_x$ on the region $D$. In preparation for this, we make some observations about these functions. We first note that by (11) and the definition of $D$, for $(t, x) \in D$ we have

\[ 2 - x - t \ge \frac{2 - t}{2} \ge x \ge 0. \]

Since $F(t, x) = -x/(2 - x - t)$, it follows that

\[ -1 \le F(t, x) \le 0, \tag{12} \]

and therefore

\[ |f_x'(t)| = |F(t, f_x(t))| \le 1. \tag{13} \]

Next, we compute

\[ \frac{\partial F}{\partial x}(t, x) = -\frac{2 - t}{(2 - x - t)^2}, \qquad f_x''(t) = \frac{f_x(t)^2}{(2 - f_x(t) - t)^3} = \frac{(F(t, f_x(t)))^2}{2 - f_x(t) - t}. \]
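The formula for $f_x''$ follows from the chain rule applied to $f_x'(t) = F(t, f_x(t))$. As a worked check, write $x = f_x(t)$ and $u = 2 - x - t$ (a shorthand used only here), so that $F = -x/u$ and $f_x' = -x/u$:

```latex
\[
\frac{\partial F}{\partial t} = -\frac{x}{u^2}, \qquad
\frac{\partial F}{\partial x} = -\frac{u + x}{u^2} = -\frac{2 - t}{u^2},
\]
\[
f_x''(t) = \frac{\partial F}{\partial t} + \frac{\partial F}{\partial x}\, f_x'(t)
         = -\frac{x}{u^2} + \frac{(2 - t)\,x}{u^3}
         = \frac{x\bigl((2 - t) - u\bigr)}{u^3}
         = \frac{x^2}{u^3}.
\]
```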


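Thus, if $(t, x) \in D$, then by (11),

\[ \left|\frac{\partial F}{\partial x}(t, x)\right| = \frac{2 - t}{(2 - x - t)^2} \le \frac{2 - t}{((2 - t)/2)^2} = \frac{4}{2 - t} \le \frac{4}{\varepsilon}. \]

Similarly, if $0 \le t \le 2 - \varepsilon$, then

\[ |f_x''(t)| = \frac{(F(t, f_x(t)))^2}{2 - f_x(t) - t} \le \frac{1}{2 - f_x(t) - t} \le \frac{1}{(2 - t)/2} = \frac{2}{2 - t} \le \frac{2}{\varepsilon}. \]

We can therefore use $C_1 = 4/\varepsilon$ and $C_2 = 2/\varepsilon$ in Lemma 5. For reasons that will become clear later, the value we will use for $\delta$ in Lemma 5 is

\[ \delta = \frac{C_1 \varepsilon}{6(e^{2C_1} - 1)}. \tag{14} \]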

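Since the function $F(t, x)$ is uniformly continuous on $D$, we can choose some $\zeta > 0$ such that for any two points $(t_1, x_1), (t_2, x_2) \in D$,

\[ \text{if } |t_1 - t_2| < \zeta \text{ and } |x_1 - x_2| < \zeta, \text{ then } |F(t_1, x_1) - F(t_2, x_2)| < \frac{\delta}{4}. \tag{15} \]

We now choose a positive integer $m$ large enough that

\[ \frac{2}{m} < \frac{\varepsilon}{3}, \qquad \frac{2}{m} < \zeta, \qquad \frac{e^{2C_1} - 1}{2m} < \frac{\varepsilon}{6}. \tag{16} \]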

Again, the reason for this choice will become clear later.

Consider an $n$-walk for any $n \ge m^2$. As in the statement of Lemma 4, let the $x$-coordinates of the points on the walk be $x_0 = 1, x_1, \ldots, x_{2n} = 0$, and for $0 \le i \le 2n$ let $t_i = i/n$. We now divide $2n$ by $m$, getting a quotient $q$ and remainder $r$. In other words,

\[ 2n = mq + r \]

and $0 \le r < m$. Notice that since $n \ge m^2$, we have $q \ge 2m$. We think of the walk as consisting of $m$ blocks of steps, with each block containing $q$ steps, followed by $r$ extra steps at the end. For $0 \le j \le m$, let $(T_j, X_j)$ be the position of the walk after $j$ blocks of steps have been traversed. Thus, $T_j = t_{jq} = jq/n$ and $X_j = x_{jq}$.

Let h = q/n, so that for 0 ≤ j < m,

T j+1 − T j = h,

and note that since x either remains fixed or decreases by 1/n in each step of the walk,

0 ≤ X j − X j+1 ≤q

n= h.

Applying (16), we see that

h =2q

2n=

2q

mq + r≤

2q

mq=

2

m<ε

3,


so

\[ |T_{j+1} - T_j| \le \frac{2}{m} < \frac{\varepsilon}{3}, \qquad |X_{j+1} - X_j| \le \frac{2}{m} < \frac{\varepsilon}{3}. \tag{17} \]

In other words, in the course of a single block of steps, x and t change by less than $\varepsilon/3$.

For $0 \le j < m$, let

\[ \delta_j = \frac{X_{j+1} - X_j}{h} - F(T_j, X_j). \]

Rearranging this definition, this means that

\[ X_{j+1} = X_j + h\,(F(T_j, X_j) + \delta_j). \]

Of course, this is the recurrence in our modified version of Euler's method.

We would now like to apply Lemma 5, but we have no guarantee that $\delta$ will be a bound on the numbers $|\delta_j|$. However, we can show that if $\delta$ is such a bound, then the walk stays in the ε-corridor:

Claim. Suppose that for all $j < m$, if $T_j \le 2 - 2\varepsilon$, then $|\delta_j| \le \delta$. Then the n-walk stays inside the ε-corridor.

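Proof of Claim. Notice that since $q \ge 2m$ and $2/m < \varepsilon/3$,

\[ T_m = t_{mq} = \frac{mq}{n} = \frac{2mq}{2n} = \frac{2mq}{mq + r} > \frac{2mq}{m(q+1)} = 2 - \frac{2}{q+1} > 2 - \frac{2}{2m} > 2 - \frac{\varepsilon}{6} > 2 - 2\varepsilon. \]

Thus, we can let k be the least index such that $T_k > 2 - 2\varepsilon$. Then for all $j < k$, $T_j \le 2 - 2\varepsilon$, and therefore, by assumption, $|\delta_j| \le \delta$. And since $T_{k-1} \le 2 - 2\varepsilon$, by (17) we have

\[ T_k < T_{k-1} + \frac{\varepsilon}{3} \le 2 - 2\varepsilon + \frac{\varepsilon}{3} < 2 - \varepsilon. \]

We can therefore apply Lemma 5 to the points $(T_j, X_j)$ for $0 \le j \le k$ and the functions $F$ and $f_x$ on the region D to conclude that for all such j,

\[ |X_j - f_x(T_j)| \le \left(\frac{h C_2}{2 C_1} + \frac{\delta}{C_1}\right)\left((1 + C_1 h)^j - 1\right). \]

Since $j \le k \le m$ and $h \le 2/m$,

\[ (1 + C_1 h)^j \le \left(1 + \frac{2 C_1}{m}\right)^m < e^{2C_1}, \]

where the last inequality is well known (see, for example, inequality 4.5.13 in [7]). Therefore,

\[ |X_j - f_x(T_j)| < \left(\frac{(2/m)(2/\varepsilon)}{2(4/\varepsilon)} + \frac{\delta}{C_1}\right)(e^{2C_1} - 1) = \frac{e^{2C_1} - 1}{2m} + \frac{\delta(e^{2C_1} - 1)}{C_1}. \]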


By (16) and (14), the last two fractions are both at most $\varepsilon/6$. Thus, we have shown that

\[ |X_j - f_x(T_j)| < \frac{\varepsilon}{3}. \tag{18} \]

This implies that all of the points $(T_j, X_j)$ for $0 \le j \le k$ are in the ε-corridor.

Since $T_k > 2 - 2\varepsilon$, as we observed after (10), all points on the n-walk beyond $(T_k, X_k)$ are also in the ε-corridor. We still need to worry about points on the n-walk in the interiors of the first k blocks. If $(t, x)$ is such a point, then $(t, x)$ occurs between $(T_j, X_j)$ and $(T_{j+1}, X_{j+1})$, for some $j < k$. To see that $(t, x)$ is in the ε-corridor, we compute

\[ |x - f_x(t)| \le |x - X_j| + |X_j - f_x(T_j)| + |f_x(T_j) - f_x(t)|. \]

We now bound each of the terms on the right-hand side. We already know, by (17) and (18), that $|x - X_j| \le |X_{j+1} - X_j| < \varepsilon/3$ and $|X_j - f_x(T_j)| < \varepsilon/3$. For the third term we apply the mean value theorem:

\[ f_x(T_j) - f_x(t) = f_x'(c)(T_j - t), \]

for some c between t and $T_j$. By (13) and (17), we conclude that

\[ |f_x(T_j) - f_x(t)| = |f_x'(c)| \cdot |T_j - t| \le |f_x'(c)| \cdot |T_{j+1} - T_j| < 1 \cdot \frac{\varepsilon}{3} = \frac{\varepsilon}{3}. \]

Putting it all together, we get

\[ |x - f_x(t)| \le |x - X_j| + |X_j - f_x(T_j)| + |f_x(T_j) - f_x(t)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon, \]

so the point $(t, x)$ is in the ε-corridor. We have now shown that all points on the walk are in the ε-corridor, which completes the proof of the claim.

The claim shows that if an n-walk goes outside of the ε-corridor, then there must be some $j < m$ such that $T_j \le 2 - 2\varepsilon$ and $|\delta_j| > \delta$. To complete the proof, we will show that this is unlikely to happen.

Partition $\{(t, x) \in D : t \le 2 - 2\varepsilon\}$ into finitely many disjoint regions $R_1, R_2, \ldots, R_K$, each with diameter less than $\zeta$. By (12) and (15), for each k with $1 \le k \le K$ we can choose a number $r_k$ such that $-1 \le r_k \le 0$ and for every $(t, x) \in R_k$,

\[ |F(t, x) - r_k| < \frac{\delta}{4}. \tag{19} \]

For example, we can take $r_k$ to be $F(t, x)$ for some particular $(t, x) \in R_k$. Notice that the regions $R_k$ and numbers $r_k$ do not depend on n; as $n \to \infty$, $R_k$ and $r_k$ will remain fixed.

We will write $\mathrm{Pr}_n(E)$ to denote the probability that an event E occurs when an n-walk takes place. The claim implies that the probability that an n-walk will leave the ε-corridor is at most

\[ \sum_{j=0}^{m-1} \sum_{k=1}^{K} p_{j,k}(n), \]


where

\[ p_{j,k}(n) = \mathrm{Pr}_n\bigl((T_j, X_j) \in R_k \text{ and } |\delta_j| > \delta\bigr). \]

Thus, it will suffice to show that for each j and k, $\lim_{n\to\infty} p_{j,k}(n) = 0$.

Fix j and k with $0 \le j < m$ and $1 \le k \le K$. The value of $\delta_j$ is determined by the block of steps taken by the n-walk in going from $(T_j, X_j)$ to $(T_{j+1}, X_{j+1})$. The points on this part of the walk are $(t_{jq+i}, x_{jq+i})$ for $0 \le i \le q$. We will refer to the step from $(t_{jq+i}, x_{jq+i})$ to $(t_{jq+i+1}, x_{jq+i+1})$ as step i of this block of the n-walk. Notice that there are q steps in the block, and since q is the quotient when 2n is divided by m and m is fixed, $q \to \infty$ when $n \to \infty$.

Let a be the number of steps in the block in which x decreases by 1/n. In the remaining $q - a$ steps, the value of x does not change, so $X_j - X_{j+1} = a/n$. Therefore, by definition,

\[ \delta_j = \frac{X_{j+1} - X_j}{h} - F(T_j, X_j) = -\frac{a/n}{q/n} - F(T_j, X_j) = -\frac{a}{q} - F(T_j, X_j). \]

Although the value of $p_{j,k}(n)$ does not depend on the precise method by which the steps in this block of the walk are chosen, it will be helpful to specify a method. We will assume that for $0 \le i < q$, random numbers $s_i$ are chosen, independently and uniformly in [0, 1], and then in step i, x decreases by 1/n if

\[ s_i < \frac{x_{jq+i}}{2 - x_{jq+i} - t_{jq+i}} = -F(t_{jq+i}, x_{jq+i}), \]

and x is unchanged otherwise. Of course, according to equation (6), this procedure generates the correct probabilities for the steps of the walk.

Suppose that $(T_j, X_j) \in R_k$. Then by (19), $|F(T_j, X_j) - r_k| < \delta/4$, or in other words

\[ -r_k - \frac{\delta}{4} < -F(T_j, X_j) < -r_k + \frac{\delta}{4}. \tag{20} \]

Also, for $0 \le i < q$, by (17) and (16), $|t_{jq+i} - T_j| \le 2/m$, $|x_{jq+i} - X_j| \le 2/m$, $2/m < \varepsilon/3$, and $2/m < \zeta$. Since $t_{jq+i} \le T_j + 2/m < 2 - 2\varepsilon + \varepsilon/3 < 2 - \varepsilon$, we have $(t_{jq+i}, x_{jq+i}) \in D$, and therefore, by (15), $|F(t_{jq+i}, x_{jq+i}) - F(T_j, X_j)| < \delta/4$. Combining this with $|F(T_j, X_j) - r_k| < \delta/4$, we conclude that $|F(t_{jq+i}, x_{jq+i}) - r_k| < \delta/2$, or in other words

\[ -r_k - \frac{\delta}{2} < -F(t_{jq+i}, x_{jq+i}) < -r_k + \frac{\delta}{2}. \]

Recall that step i is determined by how $s_i$ compares to $-F(t_{jq+i}, x_{jq+i})$. We can now draw the conclusion that if $(T_j, X_j) \in R_k$, then:

(a) if $s_i \le -r_k - \delta/2$, then at step i, x decreases by 1/n;

(b) if $s_i \ge -r_k + \delta/2$, then at step i, x remains unchanged.

We are now ready to show that $\lim_{n\to\infty} p_{j,k}(n) = 0$. By definition,

\[ p_{j,k}(n) = \mathrm{Pr}_n\bigl((T_j, X_j) \in R_k \text{ and } \delta_j > \delta\bigr) + \mathrm{Pr}_n\bigl((T_j, X_j) \in R_k \text{ and } \delta_j < -\delta\bigr). \]


We will show that both of the probabilities on the right-hand side approach 0 as $n \to \infty$.

For the first, suppose that $(T_j, X_j) \in R_k$ and $\delta_j > \delta$. Since $\delta_j = -a/q - F(T_j, X_j)$, by (20) this implies that

\[ \frac{a}{q} < -F(T_j, X_j) - \delta < -r_k - \frac{3\delta}{4}. \]

Now let $a'$ be the number of values of i for which $s_i \le -r_k - \delta/2$. By conclusion (a) above, $a' \le a$, and therefore

\[ 0 \le \frac{a'}{q} \le \frac{a}{q} < -r_k - \frac{3\delta}{4} < -r_k - \frac{\delta}{2} < 1. \]

This is very unlikely to happen. To see why, notice first that for $0 \le i < q$, since $s_i$ is chosen uniformly in [0, 1] and $0 < -r_k - \delta/2 < 1$, the probability that $s_i \le -r_k - \delta/2$ is $-r_k - \delta/2$. And since the $s_i$ are chosen independently, this means that $a'/q$, which is the fraction of values of i for which $s_i \le -r_k - \delta/2$, should be close to $-r_k - \delta/2$. More precisely, by the law of large numbers (see [2, Section VI.4, p. 152]), for any $\alpha > 0$, the probability that $|a'/q - (-r_k - \delta/2)| > \alpha$ must approach 0 as $q \to \infty$. And since $q \to \infty$ as $n \to \infty$, taking $\alpha = \delta/4$ we can conclude that

\[ \lim_{n\to\infty} \mathrm{Pr}_n\left(\frac{a'}{q} < -r_k - \frac{3\delta}{4}\right) = 0. \]

It follows that

\[ \lim_{n\to\infty} \mathrm{Pr}_n\bigl((T_j, X_j) \in R_k \text{ and } \delta_j > \delta\bigr) = 0. \]

The second probability is similar. If $(T_j, X_j) \in R_k$ and $\delta_j < -\delta$, then

\[ \frac{a}{q} > -F(T_j, X_j) + \delta > -r_k + \frac{3\delta}{4}. \]

Now let $a'$ be the number of values of i for which $s_i < -r_k + \delta/2$. This time we use fact (b) above to conclude that $a' \ge a$, so

\[ 1 \ge \frac{a'}{q} \ge \frac{a}{q} > -r_k + \frac{3\delta}{4} > -r_k + \frac{\delta}{2} > 0. \]

Once again, the law of large numbers says that the probability of this event goes to 0 as $n \to \infty$, which completes the proof of Lemma 4 and, therefore, Theorem 1.
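The convergence just proved can be observed numerically. The following is a minimal sketch, not part of the original argument: it simulates one n-walk under the equal-likelihood rule and measures its largest vertical distance from the limiting curve. The function names are hypothetical, and the implicit relation t = 2 - 2x + x ln x used for the curve is an assumption derived here by solving dx/dt = F(t, x) with x(0) = 1, rather than a formula quoted from this section.

```python
import math
import random

def curve_x(t):
    """Approximate f_x(t) by bisection.

    Assumption (derived for this sketch): the solution of
    dx/dt = -x / (2 - x - t) with x(0) = 1 satisfies t = 2 - 2x + x*ln(x).
    """
    def g(x):
        return 2.0 if x <= 0.0 else 2.0 - 2.0 * x + x * math.log(x)

    lo, hi = 0.0, 1.0              # g is decreasing, with g(0) = 2 and g(1) = 0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if g(mid) > t:
            lo = mid               # the root lies to the right of mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def simulate_walk(n, rng):
    """Return x_0, ..., x_{2n} for one n-walk, where x_i = (whole pills) / n."""
    whole, half = n, 0
    xs = [1.0]
    for _ in range(2 * n):
        # Every pill in the bottle is equally likely to be drawn.
        if rng.random() < whole / (whole + half):
            whole -= 1             # a whole pill is cut and half of it goes back
            half += 1
        else:
            half -= 1              # a half pill is used up
        xs.append(whole / n)
    return xs

def max_deviation(n, seed=0):
    """Largest |x_i - f_x(i/n)| over the points of one simulated n-walk."""
    walk = simulate_walk(n, random.Random(seed))
    return max(abs(x - curve_x(i / n)) for i, x in enumerate(walk))
```

Calling `max_deviation(n)` for increasing n gives a rough picture of how the corridor width can shrink; a single seed shows one sample path only, so it illustrates the statement rather than verifying it.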

4. PROOFS OF THEOREMS 2 AND 3. To prove Theorem 2, fix $n > 0$, and let A denote the number of whole pills removed from the bottle before the first half pill. Of course, the first pill removed from the bottle must be a whole pill, and there are n whole pills altogether, so $1 \le A \le n$.

For $1 \le k \le n$, let $X_k = 1$ if the first k pills removed from the bottle are all whole pills, and $X_k = 0$ otherwise. Then we have $A = X_1 + X_2 + \cdots + X_n$, and therefore

\[ E(A) = E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n). \]


The probability that the first pill removed is a whole pill is 1. Once the first whole pill has been removed, the bottle contains $n - 1$ whole pills and 1 half pill, so the probability that the second pill is also a whole pill is $(n - 1)/n$. Similarly, if the first two pills are whole pills, then the probability that the third pill is a whole pill is $(n - 2)/n$. Continuing in this way, we see that for $1 \le k \le n$,

\[ E(X_k) = \Pr(X_k = 1) = 1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{n-k+1}{n} = \frac{n!}{n^k (n-k)!}. \]

Thus,

\[ E(A) = \sum_{k=1}^{n} E(X_k) = \sum_{k=1}^{n} \frac{n!}{n^k (n-k)!}. \]

Reindexing by $j = n - k$, we get

\[ E(A) = \sum_{k=1}^{n} \frac{n!}{n^k (n-k)!} = \sum_{j=0}^{n-1} \frac{n!}{n^{n-j}\, j!} = \frac{n!}{n^n} \sum_{j=0}^{n-1} \frac{n^j}{j!}. \tag{21} \]
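As a sanity check on (21), the sum can be evaluated exactly for small n. The sketch below is illustrative only, with hypothetical function names: it computes E(A) from the product form of E(X_k), compares it with the reindexed right-hand side of (21), and allows a comparison with the value sqrt(pi n / 2) that Theorem 2 gives as the asymptotic growth rate.

```python
import math
from fractions import Fraction

def expected_initial_run(n):
    """E(A) = sum over k = 1..n of n! / (n^k (n-k)!), built term by term."""
    total = Fraction(0)
    term = Fraction(1)                 # k = 1: the first pill is always whole
    for k in range(1, n + 1):
        total += term
        term *= Fraction(n - k, n)     # next factor (n - k)/n of the product
    return total

def expected_initial_run_reindexed(n):
    """Right-hand side of (21): (n! / n^n) * sum over j = 0..n-1 of n^j / j!."""
    partial = sum(Fraction(n**j, math.factorial(j)) for j in range(n))
    return Fraction(math.factorial(n), n**n) * partial

def ratio_to_asymptote(n):
    """E(A) divided by sqrt(pi * n / 2); Theorem 2 says this tends to 1."""
    return float(expected_initial_run(n)) / math.sqrt(math.pi * n / 2)
```

For instance, `expected_initial_run(2)` is 3/2: the second pill is whole with probability 1/2, so A is 1 or 2 with equal probability.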

To relate this formula to the incomplete gamma function, we first evaluate the integral in the definition of the incomplete gamma function. Applying integration by parts k times leads to the formula in the following lemma.

Lemma 6. For every integer $k \ge 0$,

\[ \int t^k e^{-t}\,dt = -\frac{k!}{e^t} \sum_{j=0}^{k} \frac{t^j}{j!} + C. \]

Using this lemma, we find that

\[ \Gamma(n, n) = \int_n^\infty t^{n-1} e^{-t}\,dt = \lim_{N\to\infty} \left[ -\frac{(n-1)!}{e^t} \sum_{j=0}^{n-1} \frac{t^j}{j!} \right]_{t=n}^{t=N} = \frac{(n-1)!}{e^n} \sum_{j=0}^{n-1} \frac{n^j}{j!}. \tag{22} \]

Thus,

\[ \sum_{j=0}^{n-1} \frac{n^j}{j!} = \frac{e^n}{(n-1)!}\, \Gamma(n, n). \]


Substituting into (21), we get

\[ E(A) = \frac{n!}{n^n} \sum_{j=0}^{n-1} \frac{n^j}{j!} = \frac{n!}{n^n} \cdot \frac{e^n}{(n-1)!}\, \Gamma(n, n) = \frac{e^n}{n^{n-1}}\, \Gamma(n, n). \]

This proves the first statement in Theorem 2.

To prove the second statement, about the asymptotic value as $n \to \infty$, we need the following fact.

Lemma 7.

\[ \lim_{n\to\infty} \frac{\Gamma(n, n)}{(n-1)!} = \frac{1}{2}. \]

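Proof. According to inequality 8.10.13 of [7],

\[ \frac{\Gamma(n, n)}{(n-1)!} < \frac{1}{2} < \frac{\Gamma(n+1, n)}{n!}. \tag{23} \]

By Lemma 6 and equation (22),

\[ \Gamma(n+1, n) = \int_n^\infty t^n e^{-t}\,dt = \frac{n!}{e^n} \sum_{j=0}^{n} \frac{n^j}{j!} = n\, \frac{(n-1)!}{e^n} \sum_{j=0}^{n-1} \frac{n^j}{j!} + \frac{n^n}{e^n} = n\, \Gamma(n, n) + \frac{n^n}{e^n}. \]

Substituting into the second half of inequality (23), we get

\[ \frac{1}{2} < \frac{\Gamma(n, n)}{(n-1)!} + \frac{n^n}{e^n n!}, \]

and therefore

\[ \frac{1}{2} - \frac{n^n \sqrt{2\pi n}}{e^n n!} \cdot \frac{1}{\sqrt{2\pi n}} < \frac{\Gamma(n, n)}{(n-1)!} < \frac{1}{2}. \]

By Stirling's formula, $\lim_{n\to\infty} n^n \sqrt{2\pi n}/(e^n n!) = 1$, and the lemma now follows by the squeeze theorem.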

This lemma allows us to determine the asymptotic rate of growth of the expected value of A. The expected length of the initial run of whole pills can be rewritten in the form

\[ E(A) = \frac{e^n}{n^{n-1}}\, \Gamma(n, n) = \sqrt{2\pi n} \cdot \frac{e^n n!}{n^n \sqrt{2\pi n}} \cdot \frac{\Gamma(n, n)}{(n-1)!} \sim \sqrt{2\pi n} \cdot 1 \cdot \frac{1}{2} = \sqrt{\frac{\pi n}{2}}, \]

which completes the proof of Theorem 2.

Finally, we give Stong's proof of Theorem 3. For $1 \le k \le n$, consider the kth whole pill that is removed from the bottle. This pill is cut in half, and half of it is returned to the bottle; we will refer to this half pill as the kth half pill. Let $X_k = 1$ if the kth half pill is removed from the bottle after the last whole pill is removed, and $X_k = 0$ otherwise. Then the expected value we seek is

\[ E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n). \]

After the kth half pill has been returned to the bottle, there are $n - k$ whole pills still in the bottle, and we have $X_k = 1$ if and only if among the set of pills consisting of these $n - k$ remaining whole pills and the kth half pill, the half pill is the last one to be removed from the bottle. Since each pill in this set is equally likely to be chosen at each step, we have

\[ E(X_k) = \mathrm{Pr}_n(X_k = 1) = \frac{1}{n - k + 1}. \]

Therefore the expected number of half pills removed from the bottle after the last whole pill is

\[ E(X_1) + E(X_2) + \cdots + E(X_n) = \frac{1}{n} + \frac{1}{n-1} + \cdots + 1 = H_n. \]
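The value H_n can be cross-checked by brute force for small n. The following sketch is an independent check under the equal-likelihood rule, not Stong's argument: it conditions on the first draw from a bottle holding w whole pills and h half pills, using the fact that once no whole pills remain, every half pill still in the bottle is removed after the last whole pill. The function names are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def expected_trailing_halves(w, h=0):
    """Expected number of half pills removed after the last whole pill,
    starting from w whole pills and h half pills, all equally likely to be drawn."""
    if w == 0:
        return Fraction(h)             # only half pills remain; all h count
    total = w + h
    # Draw a whole pill: it is cut, and one half goes back into the bottle.
    value = Fraction(w, total) * expected_trailing_halves(w - 1, h + 1)
    if h > 0:
        # Draw a half pill: it is used up.
        value += Fraction(h, total) * expected_trailing_halves(w, h - 1)
    return value

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))
```

For n = 2 the recursion gives 3/2 = H_2, matching a direct enumeration: after the first whole pill, the half pill is drawn either before or after the second whole pill, each with probability 1/2.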

5. VARIATIONS. In all of our calculations, we have assumed that when a pill is removed from the bottle, all pills in the bottle are equally likely to be chosen. But since the whole pills are twice as big as the half pills, another natural assumption would be that whole pills are twice as likely to be chosen as half pills. In this section we summarize the results of redoing our calculations with this alternative assumption, leaving the details to the reader.

If whole pills are twice as likely to be chosen as half pills, then the differential equations (1) must be replaced by

\[ \frac{dx}{dt} = -\frac{2x}{2x + y}, \qquad \frac{dy}{dt} = \frac{2x - y}{2x + y}. \]

The solution to this system of equations that passes through the point (1, 0) is

\[ y = 2(\sqrt{x} - x), \qquad x = \frac{(2-t)^2}{4}, \qquad y = \frac{t(2-t)}{2}. \]

Once again, the random walk converges uniformly in probability to this curve as $n \to \infty$.

Surprisingly, in this case the expected number of whole pills removed before the first half pill turns out to be exactly the same as the expected number of half pills removed after the last whole pill. Calculations similar to those in the last section show that both expected values are

\[ \frac{2^{2n}}{\binom{2n}{n}} - 1. \]
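Since the details are left to the reader, a small exact computation may help confirm the stated value. This is a sketch under the weighted assumption (each whole pill has weight 2, each half pill weight 1), with hypothetical function names; it computes both expectations directly and compares them with the closed form for small n.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb

def closed_form(n):
    """The value stated in the text: 2^(2n) / C(2n, n) - 1."""
    return Fraction(4**n, comb(2 * n, n)) - 1

def expected_initial_wholes_weighted(n):
    """Expected number of whole pills drawn before the first half pill."""
    total, term = Fraction(0), Fraction(1)   # the first pill is always whole
    for k in range(1, n + 1):
        total += term
        # After k whole pills: n - k whole (weight 2 each) and k half (weight 1 each).
        term *= Fraction(2 * (n - k), 2 * (n - k) + k)
    return total

@lru_cache(maxsize=None)
def expected_trailing_halves_weighted(w, h=0):
    """Expected number of half pills drawn after the last whole pill."""
    if w == 0:
        return Fraction(h)
    total = 2 * w + h
    value = Fraction(2 * w, total) * expected_trailing_halves_weighted(w - 1, h + 1)
    if h > 0:
        value += Fraction(h, total) * expected_trailing_halves_weighted(w, h - 1)
    return value
```

For n = 2 all three quantities equal 5/3, which can also be verified by hand.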

There is a simple explanation for why these two expected values are equal. The explanation is based on an alternative procedure we could follow to decide which pill to remove from the bottle each day. First, number the pills in a full bottle from 1 to n. Then make a deck of 2n cards numbered from 1 to n, with each number appearing on two cards, and shuffle the deck. Every day, deal a card from the top of the deck, and if the card has the number k on it, then remove pill number k from the bottle. As usual, if the pill is whole, then cut it in half and return half to the bottle.

On any day, if pill number k is still whole, then there will be two cards numbered k in the deck; if half of pill number k has already been taken, then there will be only one card numbered k in the deck; and if pill number k has been used up completely, then there will be no cards numbered k left in the deck. It follows that whole pills will be twice as likely to be chosen as half pills, as required.

If we follow this procedure, then the number of whole pills removed from the bottle before the first half pill is removed will be the same as the number of distinct cards dealt from the top of the deck before the first duplicate card. Similarly, we could determine how many half pills will be removed from the bottle after the last whole pill by dealing cards from the bottom of the deck and counting the number of distinct cards dealt before the first duplicate. It should now be clear by symmetry that the expected values of these two numbers are equal. Indeed, the problem of computing this common expected value is equivalent to the third question addressed in [9], and the answer follows from Theorem 5 of [9].

ACKNOWLEDGMENTS. I would like to thank Richard Stong, Greg Warrington, Rob Benedetto, Tanya Leise, Amy Wagaman, and the anonymous referees for helpful conversations and suggestions. Natasha would like to thank Dr. Michael Katz, D.V.M.

REFERENCES

1. R. M. Corless, G. H. Gonnet, D. E. G. Hare, D. J. Jeffrey, D. E. Knuth, On the Lambert W function, Adv. Comput. Math. 5 (1996) 329–359.

2. W. Feller, An Introduction to Probability Theory and its Applications. Vol. I. Third edition, Wiley, New York, 1968.

3. L. Holst, On birthday, collectors', occupancy and other classical urn problems, Int. Stat. Review 54 (1986) 15–27.

4. M. S. Klamkin, D. J. Newman, Extensions of the birthday surprise, J. Comb. Theory 3 (1967) 279–282.

5. G. F. Lawler, V. Limic, Random Walk: A Modern Introduction. Cambridge Studies in Advanced Mathematics. Vol. 123. Cambridge University Press, Cambridge, 2010.

6. B. McCabe, Matching balls drawn from an urn, Problem E 2263, Solutions by B. C. Arnold and R. J. Dickson, Amer. Math. Monthly 78 (1971) 1022–1024.

7. National Institute of Standards and Technology, Digital Library of Mathematical Functions, March 23, 2012, available at http://dlmf.nist.gov/.

8. P. N. Rathie, P. Zornig, On the birthday problem: Some generalizations and applications, Int. J. Math. Math. Sci. 2003 (2003) 3827–3840.

9. D. J. Velleman, G. S. Warrington, What to expect in a game of memory, Amer. Math. Monthly 120 (2013) 787–805.

DANIEL J. VELLEMAN received his B.A. from Dartmouth College in 1976 and his Ph.D. from the University of Wisconsin–Madison in 1980. He taught at the University of Texas before joining the faculty of Amherst College in 1983. He was the editor of the American Mathematical Monthly from 2007 to 2011. In his spare time he enjoys singing, bicycling, and playing volleyball.
Department of Mathematics, Amherst College, Amherst, MA
[email protected]


Analytical Solution for the Generalized Fermat–Torricelli Problem
Author(s): Alexei Yu. Uteshev
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 318-331
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.318

Accessed: 30/03/2014 17:29


Analytical Solution for the Generalized Fermat–Torricelli Problem

Alexei Yu. Uteshev

Abstract. We present an explicit analytical solution for the problem of minimization of the function

\[ F(x, y) = \sum_{j=1}^{3} m_j \sqrt{(x - x_j)^2 + (y - y_j)^2}, \]

i.e., we find the coordinates of the stationary point and the corresponding critical value as functions of $\{m_j, x_j, y_j\}_{j=1}^{3}$. In addition, we also discuss the inverse problem of finding such values for $m_1, m_2$, and $m_3$ for which the corresponding function F possesses a prescribed position of stationary point.

1. INTRODUCTION. Consider the following problem. Given the coordinates of three noncollinear points $P_1 = (x_1, y_1)$, $P_2 = (x_2, y_2)$, and $P_3 = (x_3, y_3)$ in the plane, find the coordinates of the point $P_* = (x_*, y_*)$ that gives a solution to the optimization problem

\[ \min_{(x,y)\in\mathbb{R}^2} F(x, y) \quad \text{for} \quad F(x, y) = \sum_{j=1}^{3} m_j \sqrt{(x - x_j)^2 + (y - y_j)^2}. \tag{1} \]

Here $m_1, m_2$, and $m_3$ are assumed to be real positive numbers and will be subsequently referred to as weights.

The stated problem, in its particular case of equal weights $m_1 = m_2 = m_3 = 1$, has been known since 1643 as the (classical) Fermat–Torricelli problem. It has a unique solution that coincides either with one of the points $P_1, P_2, P_3$ or with the so-called Fermat or Fermat–Torricelli point [2, 4] of the triangle $P_1P_2P_3$; this point makes an angle of $2\pi/3$ with any two vertices of the triangle.

Generalization of the problem to the case of unequal weights has been investigated since the 19th century. This generalization is known under different names: the Steiner problem, the Weber problem, the problem of railway junction ((Germ.) Problem des Knotenpunktes) [3, 8], the three factory problem [6]. The last two names were inspired by a facility location problem such as the following. Let the cities $P_1, P_2$, and $P_3$ be the sources of iron ore, coal, and water, respectively. To produce one ton of steel, the steel works needs $m_1$ tons of iron, $m_2$ tons of coal, and $m_3$ tons of water. Assuming that the freight charge for a ton-kilometer is independent of the nature of the cargo, find the optimal position for the steel works connected with $P_1, P_2$, and $P_3$ via straight roads so as to minimize the transportation costs.

In the rest of the paper, this problem will be referred to as the generalized Fermat–Torricelli problem. Existence and uniqueness of its solution is guaranteed by the following result [4].

http://dx.doi.org/10.4169/amer.math.monthly.121.04.318
MSC: Primary 51N20


Theorem 1. Denote by $\alpha_1, \alpha_2$, and $\alpha_3$ the corner angles of the triangle $P_1P_2P_3$. If the conditions

\[ \begin{aligned} m_1^2 &< m_2^2 + m_3^2 + 2 m_2 m_3 \cos\alpha_1, \\ m_2^2 &< m_1^2 + m_3^2 + 2 m_1 m_3 \cos\alpha_2, \\ m_3^2 &< m_1^2 + m_2^2 + 2 m_1 m_2 \cos\alpha_3 \end{aligned} \tag{2} \]

are fulfilled, then there exists a unique solution $P_* = (x_*, y_*) \in \mathbb{R}^2$ for the generalized Fermat–Torricelli problem lying inside the triangle $P_1P_2P_3$. This point is a stationary point for the function $F(x, y)$, i.e., a real solution of the system

\[ \sum_{j=1}^{3} \frac{m_j (x - x_j)}{\sqrt{(x - x_j)^2 + (y - y_j)^2}} = 0, \qquad \sum_{j=1}^{3} \frac{m_j (y - y_j)}{\sqrt{(x - x_j)^2 + (y - y_j)^2}} = 0. \tag{3} \]

If any of the conditions (2) are violated, then $F(x, y)$ attains its minimum value at the corresponding vertex of the triangle.

Let us overview some approaches for finding the point $P_*$. Historically, the first approach is geometrical: The point is found as the intersection point of a special construction of lines or circles. For the equal weighted case, Torricelli proved that the circles circumscribing the equilateral triangles constructed on the sides of and outside the triangle $P_1P_2P_3$ intersect at the point $P_*$; for an alternative Simpson construction of $P_*$, see [5]. For the general, i.e., unequal weighted case, see [3, 8].

The second approach is based on the mechanical model (sometimes incorrectly called Polya's mechanical model): A horizontal board is drilled with holes at the points $P_1, P_2$, and $P_3$ (or at the vertices of a triangle similar to $P_1P_2P_3$). Three strings are tied together in a knot at one end, the loose ends are passed through the holes, and are attached to physical weights proportional to $m_1, m_2$, and $m_3$, respectively, below the board. The equilibrium position of the knot yields the solution [3].

The third approach, based on the gradient descent method, originated in the paper [11]; further developments and comments can be found in [7, 9].

The present paper is devoted to the fourth approach, the analytical one. We look for explicit expressions for the coordinates of the stationary point $P_*$ as functions of $\{m_j, x_j, y_j\}_{j=1}^{3}$. Although the existence of such a solution by radicals, i.e., in a finite number of operations like standard arithmetic ones and extraction of (positive integer) roots, is not questioned in any review article on the problem, we failed to find in the literature the constructive and universal version of an algorithm even for the classical, i.e., equal weighted, case.

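2. ALGEBRA.

Theorem 2. Under the conditions (2), the coordinates of the stationary point $(x_*, y_*)$ of the function $F(x, y)$ are as follows:

\[ x_* = \frac{K_1 K_2 K_3}{4|S|\sigma d} \left( \frac{x_1}{K_1} + \frac{x_2}{K_2} + \frac{x_3}{K_3} \right), \qquad y_* = \frac{K_1 K_2 K_3}{4|S|\sigma d} \left( \frac{y_1}{K_1} + \frac{y_2}{K_2} + \frac{y_3}{K_3} \right) \tag{4} \]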


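with

\[ F(x_*, y_*) = \min_{(x,y)\in\mathbb{R}^2} F(x, y) = \sqrt{d}. \]

Here

\[ d = \frac{1}{2\sigma}\left( m_1^2 K_1 + m_2^2 K_2 + m_3^2 K_3 \right), \quad \text{or alternatively,} \tag{5} \]

\[ d = 2|S|\sigma + \frac{1}{2}\left[ m_1^2 (r_{12}^2 + r_{13}^2 - r_{23}^2) + m_2^2 (r_{23}^2 + r_{12}^2 - r_{13}^2) + m_3^2 (r_{13}^2 + r_{23}^2 - r_{12}^2) \right], \tag{6} \]

\[ r_{j\ell} = |P_j P_\ell| = \sqrt{(x_j - x_\ell)^2 + (y_j - y_\ell)^2} \quad \text{for } \{j, \ell\} \subset \{1, 2, 3\}, \]

\[ S = x_1 y_2 + x_2 y_3 + x_3 y_1 - x_1 y_3 - x_3 y_2 - x_2 y_1, \tag{7} \]

\[ \sigma = \frac{1}{2}\sqrt{-m_1^4 - m_2^4 - m_3^4 + 2 m_1^2 m_2^2 + 2 m_1^2 m_3^2 + 2 m_2^2 m_3^2}, \tag{8} \]

and

\[ \begin{aligned} K_1 &= (r_{12}^2 + r_{13}^2 - r_{23}^2)\sigma + (m_2^2 + m_3^2 - m_1^2)|S|, \\ K_2 &= (r_{23}^2 + r_{12}^2 - r_{13}^2)\sigma + (m_1^2 + m_3^2 - m_2^2)|S|, \\ K_3 &= (r_{13}^2 + r_{23}^2 - r_{12}^2)\sigma + (m_1^2 + m_2^2 - m_3^2)|S|. \end{aligned} \tag{9} \]

Proof. First, we establish the validity of the equality

\[ K_1 K_2 + K_1 K_3 + K_2 K_3 = 4\sigma |S| d, \tag{10} \]

and the dual equality

\[ r_{23}^2 K_1 + r_{13}^2 K_2 + r_{12}^2 K_3 = 2|S| d \tag{11} \]

for (5). Second, let us deduce the following relationships:

\[ \sqrt{(x_* - x_j)^2 + (y_* - y_j)^2} = \frac{m_j K_j}{2\sigma\sqrt{d}} \quad \text{for } j \in \{1, 2, 3\}. \tag{12} \]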

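Here is the proof for the case $j = 1$:

\[ \begin{aligned}
(x_* - x_1)^2 + (y_* - y_1)^2
&\overset{(10)}{=} \left( \frac{K_1 K_2 K_3}{4\sigma|S|d} \right)^2 \left[ \left( \frac{x_2}{K_2} + \frac{x_3}{K_3} - \frac{x_1}{K_2} - \frac{x_1}{K_3} \right)^2 + \left( \frac{y_2}{K_2} + \frac{y_3}{K_3} - \frac{y_1}{K_2} - \frac{y_1}{K_3} \right)^2 \right] \\
&= \left( \frac{K_1 K_2 K_3}{4\sigma|S|d} \right)^2 \left[ \frac{(x_2 - x_1)^2 + (y_2 - y_1)^2}{K_2^2} + \frac{(x_3 - x_1)^2 + (y_3 - y_1)^2}{K_3^2} + 2\,\frac{(x_2 - x_1)(x_3 - x_1) + (y_2 - y_1)(y_3 - y_1)}{K_2 K_3} \right] \\
&= \left( \frac{K_1 K_2 K_3}{4\sigma|S|d} \right)^2 \left[ \frac{r_{12}^2}{K_2^2} + \frac{r_{13}^2}{K_3^2} + 2\,\frac{\tfrac{1}{2}(r_{12}^2 + r_{13}^2 - r_{23}^2)}{K_2 K_3} \right] \\
&= \frac{K_1^2}{(4\sigma|S|d)^2} \left[ r_{12}^2 K_3^2 + r_{13}^2 K_2^2 + (r_{12}^2 + r_{13}^2 - r_{23}^2) K_2 K_3 \right] \\
&= \frac{K_1^2}{(4\sigma|S|d)^2} \left[ (r_{12}^2 K_3 + r_{13}^2 K_2)(K_2 + K_3) - r_{23}^2 K_2 K_3 \right] \\
&\overset{(11)}{=} \frac{K_1^2}{(4\sigma|S|d)^2} \left[ (2|S|d - r_{23}^2 K_1)(K_2 + K_3) - r_{23}^2 K_2 K_3 \right] \\
&= \frac{K_1^2}{(4\sigma|S|d)^2} \left[ 2|S|d\,(K_2 + K_3) - r_{23}^2 (K_1 K_2 + K_1 K_3 + K_2 K_3) \right] \\
&\overset{(10)}{=} \frac{K_1^2}{(4\sigma|S|d)^2} \left[ 2|S|d\,(K_2 + K_3) - 4 r_{23}^2 \sigma |S| d \right] \\
&= \frac{2|S|d\,K_1^2}{(4\sigma|S|d)^2} \left[ K_2 + K_3 - 2 r_{23}^2 \sigma \right] \\
&\overset{(9)}{=} \frac{K_1^2}{8|S|d\sigma^2} \left[ 2 m_1^2 |S| \right] = \frac{m_1^2 K_1^2}{4\sigma^2 d}.
\end{aligned} \]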

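Similar arguments hold for $j \in \{2, 3\}$ in (12). To complete the proof of these equalities, it should be additionally verified that the values $K_1, K_2$, and $K_3$ are nonnegative. This will be done in the next section.

To prove the first statement of the theorem, we will utilize the following alternative representation for $x_*$ and $y_*$:

\[ x_* \overset{(10)}{=} \frac{1}{\frac{1}{K_1} + \frac{1}{K_2} + \frac{1}{K_3}} \left( \frac{x_1}{K_1} + \frac{x_2}{K_2} + \frac{x_3}{K_3} \right), \quad \text{and} \quad y_* \overset{(10)}{=} \frac{1}{\frac{1}{K_1} + \frac{1}{K_2} + \frac{1}{K_3}} \left( \frac{y_1}{K_1} + \frac{y_2}{K_2} + \frac{y_3}{K_3} \right). \tag{13} \]

We substitute (4) into the left-hand side of the first equation of (3). The resulting expression can be reduced with the aid of (12) to

\[ \frac{x_* - x_1}{K_1} + \frac{x_* - x_2}{K_2} + \frac{x_* - x_3}{K_3} = x_* \left( \frac{1}{K_1} + \frac{1}{K_2} + \frac{1}{K_3} \right) - \left( \frac{x_1}{K_1} + \frac{x_2}{K_2} + \frac{x_3}{K_3} \right) \overset{(13)}{=} 0. \]

Similar arguments are valid for the second equation from (3). Finally, we compute $F(x_*, y_*)$:

\[ F(x_*, y_*) = \sum_{j=1}^{3} m_j \sqrt{(x_* - x_j)^2 + (y_* - y_j)^2} \overset{(12)}{=} \sum_{j=1}^{3} \frac{m_j^2 K_j}{2\sigma\sqrt{d}} \overset{(5)}{=} \frac{2\sigma d}{2\sigma\sqrt{d}} = \sqrt{d}. \]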


Some test values are provided in Table 1.

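Each entry lists $P_1$, $P_2$, $P_3$, the weights $(m_1, m_2, m_3)$, the point $P_*$, and the value $\sqrt{d}$.

1. $P_1 = (2, 6)$, $P_2 = (1, 1)$, $P_3 = (5, 1)$; $(m_1, m_2, m_3) = (2, 3, 4)$;
   $P_* = \left( \frac{4103 + 1833\sqrt{15}}{2866}, \frac{29523 - 4481\sqrt{15}}{8598} \right) \approx (3.9086, 1.4152)$;
   $\sqrt{d} = 2\sqrt{79 + 15\sqrt{15}} \approx 23.4174$.

2. $P_1 = (2, 6)$, $P_2 = (1, 1)$, $P_3 = (5, 1)$; $(m_1, m_2, m_3) = (3, 5, 4)$;
   $P_* = \left( \frac{751}{485}, \frac{647}{485} \right) \approx (1.5484, 1.3340)$;
   $\sqrt{d} = \sqrt{970} \approx 31.1448$.

3. $P_1 = (0, 0)$, $P_2 = (2, 0)$, $P_3 = (-\sqrt{2}, \sqrt{2})$; $(m_1, m_2, m_3) = (3/2, 2, 2)$;
   $P_* = \left( 1 - \frac{1}{\sqrt{2}} - \frac{3}{\sqrt{110}},\ \frac{1}{\sqrt{2}} - \frac{3}{\sqrt{55}} - \frac{3}{\sqrt{110}} \right) \approx (0.0068, 0.0165)$;
   $\sqrt{d} = \sqrt{32 + \frac{23}{\sqrt{2}} + 3\sqrt{\frac{55}{2}}} \approx 7.9997$.

Table 1.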
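The formulas of Theorem 2 translate directly into code, which makes it easy to reproduce the test values above. The following is a minimal floating-point sketch with hypothetical function names; it assumes that the conditions (2) hold and does not check them, so it is not meant for inputs where the minimum sits at a vertex.

```python
import math

def weighted_fermat_point(points, weights):
    """Evaluate formulas (4)-(9) of Theorem 2; returns ((x*, y*), sqrt(d))."""
    (x1, y1), (x2, y2), (x3, y3) = points
    m1, m2, m3 = weights
    # Squared side lengths r_12^2, r_13^2, r_23^2.
    r12 = (x1 - x2) ** 2 + (y1 - y2) ** 2
    r13 = (x1 - x3) ** 2 + (y1 - y3) ** 2
    r23 = (x2 - x3) ** 2 + (y2 - y3) ** 2
    abs_s = abs(x1 * y2 + x2 * y3 + x3 * y1 - x1 * y3 - x3 * y2 - x2 * y1)  # |S| from (7)
    sigma = 0.5 * math.sqrt(-m1**4 - m2**4 - m3**4
                            + 2 * m1**2 * m2**2 + 2 * m1**2 * m3**2 + 2 * m2**2 * m3**2)  # (8)
    k1 = (r12 + r13 - r23) * sigma + (m2**2 + m3**2 - m1**2) * abs_s  # (9)
    k2 = (r23 + r12 - r13) * sigma + (m1**2 + m3**2 - m2**2) * abs_s
    k3 = (r13 + r23 - r12) * sigma + (m1**2 + m2**2 - m3**2) * abs_s
    d = (m1**2 * k1 + m2**2 * k2 + m3**2 * k3) / (2 * sigma)          # (5)
    scale = k1 * k2 * k3 / (4 * abs_s * sigma * d)
    x_star = scale * (x1 / k1 + x2 / k2 + x3 / k3)                    # (4)
    y_star = scale * (y1 / k1 + y2 / k2 + y3 / k3)
    return (x_star, y_star), math.sqrt(d)

def gradient(points, weights, x, y):
    """Left-hand sides of system (3) evaluated at (x, y)."""
    gx = gy = 0.0
    for (xj, yj), mj in zip(points, weights):
        dist = math.hypot(x - xj, y - yj)
        gx += mj * (x - xj) / dist
        gy += mj * (y - yj) / dist
    return gx, gy
```

For the second row of Table 1, `weighted_fermat_point([(2, 6), (1, 1), (5, 1)], (3, 5, 4))` should agree with (751/485, 647/485) and sqrt(970) up to rounding, and `gradient` evaluated at the returned point should be close to zero.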

3. GEOMETRY. Let us give an interpretation for some constants that appeared in Theorem 2. First, on rewriting (7) in determinantal form

\[ S = \begin{vmatrix} 1 & 1 & 1 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}, \]

we recognize that $|S| = 2 S_{\triangle P_1 P_2 P_3}$, where $S_{\triangle P_1 P_2 P_3}$ stands for the area of triangle $P_1P_2P_3$. As for the constant (8), factorization of the radicand on the right-hand side leads to the form

\[ \sigma = 2 \left[ \frac{m_1 + m_2 + m_3}{2} \left( \frac{m_1 + m_2 + m_3}{2} - m_1 \right) \left( \frac{m_1 + m_2 + m_3}{2} - m_2 \right) \left( \frac{m_1 + m_2 + m_3}{2} - m_3 \right) \right]^{1/2}, \]

which can be treated as the Heron formula for twice the area of a triangle formed by the triple of weights $m_1, m_2$, and $m_3$. Under the restrictions (2), such a triangle exists. Construct this triangle and denote its angles, as shown in Figure 1.

Figure 1. Two triangles generated by the problem (the figure labels the sides $m_1, m_2, m_3$ and the angles $\alpha_1, \alpha_2, \alpha_3$ and $\beta_1, \beta_2, \beta_3$).


The first formula from (9) can thus be represented with the aid of the law of cosines as

\[ K_1 = \sigma|S| \left( \frac{r_{12}^2 + r_{13}^2 - r_{23}^2}{|S|} + \frac{m_2^2 + m_3^2 - m_1^2}{\sigma} \right) = \sigma|S| \left( \frac{2 r_{12} r_{13} \cos\alpha_1}{|S|} + \frac{2 m_2 m_3 \cos\beta_1}{\sigma} \right) = 2\sigma|S| (\cot\alpha_1 + \cot\beta_1). \]

Rewriting the first condition from (2) in the form $\cos\alpha_1 + \cos\beta_1 > 0$, we can conclude that $\cot\alpha_1 + \cot\beta_1 > 0$ and, thus, $K_1 > 0$. In a similar way, the expressions for $K_2$ and $K_3$ can be deduced, and we can establish that, under the restrictions (2), they are both positive. This completes the proof of Theorem 2.

Remark 1. We set the dual generalized Fermat–Torricelli problem. Let the triangle be composed of the sides with the lengths equal to $m_1, m_2$, and $m_3$; let the weights $r_{12}, r_{23}$, and $r_{13}$ be placed in its vertices, as shown in Figure 2.

Figure 2. Dual problem (the figure labels the sides $m_1, m_2, m_3$ and the vertex weights $r_{12}, r_{13}, r_{23}$).

The minimum value for the objective function will be the same as in the direct problem, since (6) is equivalent to

\[ 2|S|\sigma + \frac{1}{2}\left[ r_{12}^2 (m_1^2 + m_2^2 - m_3^2) + r_{13}^2 (m_1^2 + m_3^2 - m_2^2) + r_{23}^2 (m_2^2 + m_3^2 - m_1^2) \right]. \]

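4. CLASSICAL FERMAT–TORRICELLI PROBLEM. Consider now the equal weighted case $m_1 = m_2 = m_3 = 1$.

Theorem 3. Let all the angles of the triangle $P_1P_2P_3$ be less than $2\pi/3$, or, equivalently,

\[ \begin{aligned} r_{12}^2 + r_{13}^2 + r_{12} r_{13} - r_{23}^2 &> 0, \\ r_{23}^2 + r_{12}^2 + r_{12} r_{23} - r_{13}^2 &> 0, \\ r_{13}^2 + r_{23}^2 + r_{13} r_{23} - r_{12}^2 &> 0. \end{aligned} \]

The coordinates of the Fermat–Torricelli point for this triangle are as follows:

\[ x_* = \frac{k_1 k_2 k_3}{2\sqrt{3}\,|S|\,d} \left( \frac{x_1}{k_1} + \frac{x_2}{k_2} + \frac{x_3}{k_3} \right), \qquad y_* = \frac{k_1 k_2 k_3}{2\sqrt{3}\,|S|\,d} \left( \frac{y_1}{k_1} + \frac{y_2}{k_2} + \frac{y_3}{k_3} \right), \tag{14} \]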


with the corresponding minimum value of the objective function

\[ F(x_*, y_*) = \min_{(x,y)\in\mathbb{R}^2} \sum_{j=1}^{3} \sqrt{(x - x_j)^2 + (y - y_j)^2} = \sqrt{d}. \]

Here,

\[ d = \frac{1}{\sqrt{3}} (k_1 + k_2 + k_3) = \frac{r_{12}^2 + r_{13}^2 + r_{23}^2}{2} + \sqrt{3}\,|S| \tag{15} \]

and

\[ \begin{aligned} k_1 &= \frac{\sqrt{3}}{2} (r_{12}^2 + r_{13}^2 - r_{23}^2) + |S|, \\ k_2 &= \frac{\sqrt{3}}{2} (r_{23}^2 + r_{12}^2 - r_{13}^2) + |S|, \\ k_3 &= \frac{\sqrt{3}}{2} (r_{13}^2 + r_{23}^2 - r_{12}^2) + |S|, \end{aligned} \]

with the rest of the parameters coinciding with those from Theorem 2.

It turns out that the right-hand sides of the expressions (14), being represented as rational fractions with respect to $\{x_j, y_j\}_{j=1}^{3}$, can be reduced further to the form where denominators become "area free."

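Corollary. Under the conditions of Theorem 3, the coordinates of the Fermat–Torricelli point are as follows:

$$x_* = \frac{1}{2\sqrt{3}\,d}\left[(x_1+x_2+x_3)|S| + \sqrt{3}\left(x_1r_{23}^2 + x_2r_{13}^2 + x_3r_{12}^2\right) + 3\,\mathrm{sgn}(S)\begin{vmatrix}1 & 1 & 1\\ y_1 & y_2 & y_3\\ x_2x_3+y_2y_3 & x_1x_3+y_1y_3 & x_1x_2+y_1y_2\end{vmatrix}\right], \tag{16}$$

$$y_* = \frac{1}{2\sqrt{3}\,d}\left[(y_1+y_2+y_3)|S| + \sqrt{3}\left(y_1r_{23}^2 + y_2r_{13}^2 + y_3r_{12}^2\right) - 3\,\mathrm{sgn}(S)\begin{vmatrix}1 & 1 & 1\\ x_1 & x_2 & x_3\\ x_2x_3+y_2y_3 & x_1x_3+y_1y_3 & x_1x_2+y_1y_2\end{vmatrix}\right]. \tag{17}$$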
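The area-free formulas (16) and (17) can be checked against the defining property of the Fermat–Torricelli point, namely that every side of the triangle is seen from it at the angle 2π/3. The sketch below is illustrative only: the names `det3`, `fermat_point_area_free`, and `angle_cosines` are hypothetical, and S is again assumed to be the determinant with rows (1, 1, 1), (x1, x2, x3), (y1, y2, y3).

```python
import math


def det3(rows):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = rows
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fermat_point_area_free(p1, p2, p3):
    """Fermat-Torricelli point via formulas (16)-(17); angles must be < 2*pi/3."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    r12 = (x1 - x2) ** 2 + (y1 - y2) ** 2        # squared side lengths
    r13 = (x1 - x3) ** 2 + (y1 - y3) ** 2
    r23 = (x2 - x3) ** 2 + (y2 - y3) ** 2
    s = det3([(1, 1, 1), (x1, x2, x3), (y1, y2, y3)])   # assumed meaning of S
    sgn = math.copysign(1.0, s)
    d = (r12 + r13 + r23) / 2 + math.sqrt(3) * abs(s)   # formula (15)
    # Third row shared by the two determinants in (16) and (17).
    last = (x2 * x3 + y2 * y3, x1 * x3 + y1 * y3, x1 * x2 + y1 * y2)
    det_x = det3([(1, 1, 1), (y1, y2, y3), last])
    det_y = det3([(1, 1, 1), (x1, x2, x3), last])
    scale = 2 * math.sqrt(3) * d
    x = ((x1 + x2 + x3) * abs(s)
         + math.sqrt(3) * (x1 * r23 + x2 * r13 + x3 * r12)
         + 3 * sgn * det_x) / scale
    y = ((y1 + y2 + y3) * abs(s)
         + math.sqrt(3) * (y1 * r23 + y2 * r13 + y3 * r12)
         - 3 * sgn * det_y) / scale
    return x, y


def angle_cosines(p, pts):
    """Cosines of the three angles P_j p P_k under which the sides are seen."""
    out = []
    for j in range(3):
        q, r = pts[j], pts[(j + 1) % 3]
        ux, uy = q[0] - p[0], q[1] - p[1]
        vx, vy = r[0] - p[0], r[1] - p[1]
        out.append((ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy)))
    return out
```

Because sgn(S) multiplies the determinants, listing the vertices clockwise instead of counterclockwise should leave the computed point unchanged.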

Remark 2. The result of the last corollary can be extended to the generalized Fermat–Torricelli problem. Numerators and denominators in the right-hand sides of the formulas (4) can be reduced by the common factor |S|. We do not present the resulting expressions here, since they are inelegantly cumbersome.

Remark 3. One of the referees of the present paper suggested that the author “provide some motivation or insight of how he found the explicit expressions in Theorem 2.”


Frankly speaking, the historical development of the investigation went in the direction opposite to what has been presented up to this point. First, the formulas (16)–(17) were obtained as the solution of a linear system of equations arising from the feature of the Fermat–Torricelli point to make an angle of 2π/3 with any two vertices of the triangle. Next, in a similar way, the formulas mentioned in Remark 2 were obtained for the generalized Fermat–Torricelli problem, i.e., for the coordinates x∗, y∗. Although these formulas looked awful, they permitted us to deduce the explicit expression (6) for the value of minimal distance. Moreover, we noticed the appearance of this value in the expressions for denominators of the formulas for x∗ and y∗. Next, we intended to perform an additional verification of the obtained results via direct substitution into the equations (3). At this moment, the following lucky guess came to mind: the radicand of

$$\sqrt{(x_*-x_j)^2+(y_*-y_j)^2}$$

should be a perfect square! The only remaining trick was to discover the values (9).

5. INVERSE PROBLEM. Given the coordinates of the point P∗ = (x∗, y∗), we wish to find values for the weights m1, m2, and m3 such that the corresponding objective function (1) possesses a minimum point precisely at P∗.

Theorem 4. Let the vertices of the triangle P1P2P3 be counted counterclockwise. Then for the choice

$$m_1^* = |P_*P_1|\cdot\begin{vmatrix}1 & 1 & 1\\ x_* & x_2 & x_3\\ y_* & y_2 & y_3\end{vmatrix},\quad m_2^* = |P_*P_2|\cdot\begin{vmatrix}1 & 1 & 1\\ x_1 & x_* & x_3\\ y_1 & y_* & y_3\end{vmatrix},\quad\text{and}\quad m_3^* = |P_*P_3|\cdot\begin{vmatrix}1 & 1 & 1\\ x_1 & x_2 & x_*\\ y_1 & y_2 & y_*\end{vmatrix} \tag{18}$$

the function

$$F(x,y) = \sum_{j=1}^{3} m_j^*\sqrt{(x-x_j)^2+(y-y_j)^2}$$

has its stationary point at P∗. Provided that the latter is chosen inside the triangle P1P2P3, the values (18) are all positive, and

$$F(x_*,y_*) = \min_{(x,y)\in\mathbb{R}^2}F(x,y) = \begin{vmatrix}1 & 1 & 1 & 1\\ x_* & x_1 & x_2 & x_3\\ y_* & y_1 & y_2 & y_3\\ x_*^2+y_*^2 & x_1^2+y_1^2 & x_2^2+y_2^2 & x_3^2+y_3^2\end{vmatrix}. \tag{19}$$

Proof. Substitute x = x∗, y = y∗ and the values (18) into the left-hand side of the first equation from (3) as follows:


$$(x_*-x_1)\begin{vmatrix}1 & 1 & 1\\ x_* & x_2 & x_3\\ y_* & y_2 & y_3\end{vmatrix} + (x_*-x_2)\begin{vmatrix}1 & 1 & 1\\ x_1 & x_* & x_3\\ y_1 & y_* & y_3\end{vmatrix} + (x_*-x_3)\begin{vmatrix}1 & 1 & 1\\ x_1 & x_2 & x_*\\ y_1 & y_2 & y_*\end{vmatrix}. \tag{20}$$

Represent this combination of the third-order determinants in the form of the fourth-order determinant, namely

$$\begin{vmatrix}1 & 1 & 1 & 1\\ x_* & x_1 & x_2 & x_3\\ y_* & y_1 & y_2 & y_3\\ 0 & x_*-x_1 & x_*-x_2 & x_*-x_3\end{vmatrix}$$

(expansion by its last row coincides with (20)). Now add the second row to the last to obtain the following:

$$\begin{vmatrix}1 & 1 & 1 & 1\\ x_* & x_1 & x_2 & x_3\\ y_* & y_1 & y_2 & y_3\\ x_* & x_* & x_* & x_*\end{vmatrix}.$$

In this determinant, the first row is proportional to the last one; therefore, the determinant equals zero. The second equality from (3) can be verified in a similar manner.

Let us evaluate F(x∗, y∗):

$$F(x_*,y_*) = \left[(x_*-x_1)^2+(y_*-y_1)^2\right]\begin{vmatrix}1 & 1 & 1\\ x_* & x_2 & x_3\\ y_* & y_2 & y_3\end{vmatrix} + \left[(x_*-x_2)^2+(y_*-y_2)^2\right]\begin{vmatrix}1 & 1 & 1\\ x_1 & x_* & x_3\\ y_1 & y_* & y_3\end{vmatrix} + \left[(x_*-x_3)^2+(y_*-y_3)^2\right]\begin{vmatrix}1 & 1 & 1\\ x_1 & x_2 & x_*\\ y_1 & y_2 & y_*\end{vmatrix}.$$

To prove the equality (19), let us split it into the x-part and the y-part. First, keep the x-terms in brackets of the previous formula:

$$(x_*-x_1)^2\begin{vmatrix}1 & 1 & 1\\ x_* & x_2 & x_3\\ y_* & y_2 & y_3\end{vmatrix} + (x_*-x_2)^2\begin{vmatrix}1 & 1 & 1\\ x_1 & x_* & x_3\\ y_1 & y_* & y_3\end{vmatrix} + (x_*-x_3)^2\begin{vmatrix}1 & 1 & 1\\ x_1 & x_2 & x_*\\ y_1 & y_2 & y_*\end{vmatrix}.$$

Similar to the proof of the first part of the theorem, represent this linear combination as the determinant of the fourth order:

$$\begin{vmatrix}1 & 1 & 1 & 1\\ x_* & x_1 & x_2 & x_3\\ y_* & y_1 & y_2 & y_3\\ 0 & (x_*-x_1)^2 & (x_*-x_2)^2 & (x_*-x_3)^2\end{vmatrix}.$$


Multiply the first row by (−x∗²), the second one by 2x∗, and add the obtained rows to the last one:

$$\begin{vmatrix}1 & 1 & 1 & 1\\ x_* & x_1 & x_2 & x_3\\ y_* & y_1 & y_2 & y_3\\ x_*^2 & x_1^2 & x_2^2 & x_3^2\end{vmatrix}. \tag{21}$$

The y-part of the equality (19) can be proven in exactly the same manner with the resulting determinant differing from (21) only in its last row. The linearity of the determinant with respect to its rows completes the proof of (19).
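Both claims of Theorem 4 lend themselves to a quick numerical check. The sketch below is an illustration under stated assumptions, not part of the paper: the helper names are hypothetical, the first equation of (3) is taken to be the vanishing of the gradient of F, and the determinant is computed by a naive Laplace expansion, which is adequate for 3 × 3 and 4 × 4 matrices.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def inverse_weights(p_star, pts):
    """Weights (18): |P* P_j| times the determinant with column j replaced by P*."""
    weights = []
    for j in range(3):
        cols = list(pts)
        cols[j] = p_star
        d = det([[1, 1, 1], [c[0] for c in cols], [c[1] for c in cols]])
        weights.append(math.dist(p_star, pts[j]) * d)
    return weights


def objective(p, pts, weights):
    """F(x, y): weighted sum of distances from p to the vertices."""
    return sum(w * math.dist(p, q) for w, q in zip(weights, pts))


def gradient(p, pts, weights):
    """Gradient of F at p (p must differ from every vertex)."""
    gx = sum(w * (p[0] - q[0]) / math.dist(p, q) for w, q in zip(weights, pts))
    gy = sum(w * (p[1] - q[1]) / math.dist(p, q) for w, q in zip(weights, pts))
    return gx, gy


def value_19(p_star, pts):
    """Right-hand side of (19): the 4x4 determinant."""
    cols = [p_star] + list(pts)
    return det([[1] * 4,
                [c[0] for c in cols],
                [c[1] for c in cols],
                [c[0] ** 2 + c[1] ** 2 for c in cols]])


if __name__ == "__main__":
    pts = [(2.0, 6.0), (1.0, 1.0), (5.0, 1.0)]   # counterclockwise
    p_star = (3.0, 2.0)                          # an interior point
    w = inverse_weights(p_star, pts)
    print(w, gradient(p_star, pts, w))
    print(objective(p_star, pts, w), value_19(p_star, pts))
```

For the interior point (3, 2) the three determinants in (18) are 4, 7, and 9, the gradient vanishes up to rounding, and both `objective` and `value_19` give 148.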

Remark 4. The solution of the inverse problem is determined up to a common positive multiplier, i.e., the solution triple (m1, m2, m3) is defined by the value of the ratio m1 : m2 : m3. (In the language of the facility location problem mentioned in the Introduction, this statement is equivalent to the fact that the optimal position of the steel works is independent of the currency of the state.) Up to this remark, the solution of the inverse problem is unique. We have proven this statement via direct computations starting from formulas (4).

Example 1. Let P1 = (2, 6), P2 = (1, 1), P3 = (5, 1), and

$$P_* = \left(\frac{1}{2866}\left(4103+1833\sqrt{15}\right),\ \frac{1}{8598}\left(29523-4481\sqrt{15}\right)\right).$$

Find the values for the weights m∗1, m∗2, and m∗3 from Theorem 4.

Solution. Formulas (18) give:

$$m_1^* = \frac{2\left(20925-4481\sqrt{15}\right)}{18481401}\sqrt{316380606+35999826\sqrt{15}},$$

$$m_2^* = \frac{2\left(15105-2342\sqrt{15}\right)}{6160467}\sqrt{75400161-9169767\sqrt{15}},$$

and

$$m_3^* = \frac{8\left(-1185+15988\sqrt{15}\right)}{18481401}\sqrt{8335761-2050623\sqrt{15}},$$

with

$$F(x_*,y_*) = \frac{1}{4299}\left(-333980+193436\sqrt{15}\right).$$

Now, compare the obtained result with the one represented in test 1 from Section 2. According to Remark 4, we might expect that

$$m_1^* : m_2^* : m_3^* = 2 : 3 : 4.$$

We leave the verification of this fact as an exercise for the inquisitive reader.
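A floating-point version of this exercise is easy to set up. The sketch below evaluates the weights (18) at the given P∗ directly from the determinants and rescales them so that the first weight equals 2; it is a numerical sanity check under the assumptions of Theorem 4, not a proof, and the helper name `det3` is hypothetical.

```python
import math

P1, P2, P3 = (2.0, 6.0), (1.0, 1.0), (5.0, 1.0)
R15 = math.sqrt(15)
P_STAR = ((4103 + 1833 * R15) / 2866, (29523 - 4481 * R15) / 8598)


def det3(a, b, c):
    """Determinant with rows (1, 1, 1), (a_x, b_x, c_x), (a_y, b_y, c_y)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


# Weights (18): distance to the vertex times the corresponding determinant.
m1 = math.dist(P_STAR, P1) * det3(P_STAR, P2, P3)
m2 = math.dist(P_STAR, P2) * det3(P1, P_STAR, P3)
m3 = math.dist(P_STAR, P3) * det3(P1, P2, P_STAR)

# Rescale so that the first weight equals 2 (allowed by Remark 4).
scaled = [2 * m / m1 for m in (m1, m2, m3)]

# Objective value at P* and the closed form stated in the example.
f_value = (m1 * math.dist(P_STAR, P1) + m2 * math.dist(P_STAR, P2)
           + m3 * math.dist(P_STAR, P3))
f_closed = (-333980 + 193436 * R15) / 4299

if __name__ == "__main__":
    print(scaled)
    print(f_value, f_closed)
```

If the stated coordinates of P∗ are exact, `scaled` should agree with (2, 3, 4) up to rounding error, and `f_value` should agree with `f_closed`.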


The next example originated from the question posed by one of the referees of the present paper: What will happen to the result of Theorem 4 if we take P∗ = Pj?

Example 2. Show how to choose the values for the weights m1, m2, and m3 in order for the point P∗ to coincide with the given point on a side of the triangle from Example 1.

Solution. If we take P∗ = P2, the formulas (18) give zero values for all the weights; however, the “weights” of these zeros are different. To explain this casuistry, take P∗ = P2 + (µ, µ) for the infinitely small µ > 0. For this case, formulas (18) give:

$$m_1^*(\mu) = 4\mu\sqrt{26-12\mu+2\mu^2} = 4\sqrt{26}\,\mu + o(\mu),$$

$$m_2^*(\mu) = 4\sqrt{2}\,\mu\,(5-2\mu),$$

$$m_3^*(\mu) = 4\mu\sqrt{16-8\mu+2\mu^2} = 16\mu + o(\mu).$$

The weight m∗2(µ) dominates over m∗1(µ) and m∗3(µ) when µ → +0. As a matter of fact, the true values of these weights do not influence the position of the point P∗; the latter depends only on the value of the ratio m∗1(µ) : m∗2(µ) : m∗3(µ). Thus, the choice m∗1 = 4√26, m∗2 = 20√2, m∗3 = 16 provides us with P∗ = P2.

Let us now manipulate the weights with the aim of extruding the point P∗ to an internal point of the side P2P3. This manipulation is not trivial, as in the previous case. First, we utilize formulas (18) and then simplify the obtained result with the aid of formulas (4). Finally, the variable weights

$$m_1^*(\mu) = t\mu,\qquad m_2^*(\mu) = 1+\mu,\qquad m_3^*(\mu) = 1-\mu$$

with a fixed t > √104, provide the following asymptotics as µ → +0:

$$P_* \longrightarrow \left(2-\frac{10}{\sqrt{t^2-4}},\ 1\right).$$

Thus, the two “essential” weights m∗2(µ) and m∗3(µ) guarantee delivery of P∗ to the side P2P3, while the negligible weight m∗1(µ) ensures the fine-tuning of this delivery to the particular point within the open line segment P2P⊥. Here P⊥ = (2, 1) is the foot of the altitude of the triangle P1P2P3 through the point P1.
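The limiting position can be made plausible by a short heuristic computation. This is a sketch that bypasses formulas (18) and (4), so it is not the author's route: assume that P∗ = (s, 1 + ε) approaches an interior point (s, 1) of the side P2P3 as µ → +0, and look at the x-component of the stationarity condition for F with the weights tµ, 1 + µ, 1 − µ.

```latex
% As \varepsilon \to +0, the unit vectors from P_2 and P_3 towards P_* have
% x-components tending to +1 and -1, while |P_*P_1| \to \sqrt{(s-2)^2+25}.
\begin{align*}
(1+\mu)\cdot 1 + (1-\mu)\cdot(-1) + t\mu\,\frac{s-2}{\sqrt{(s-2)^2+25}} &= 0,\\
2 + t\,\frac{s-2}{\sqrt{(s-2)^2+25}} &= 0
  \quad\Longrightarrow\quad s < 2,\\
(t^2-4)(s-2)^2 &= 100
  \quad\Longrightarrow\quad s = 2-\frac{10}{\sqrt{t^2-4}}.
\end{align*}
% The point lies to the right of P_2, i.e. s > 1, exactly when
% 10/\sqrt{t^2-4} < 1, which is the stated restriction t > \sqrt{104}.
```

The same computation shows why the limit point always falls between P2 and P⊥: the small weight at P1 can only pull P∗ towards the foot of the altitude, never past it.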

Let us discuss the geometrical meaning of the constants from Theorem 4. The value m∗1 equals twice the product of the distance |P1P∗| by the area of the triangle P∗P2P3. The first statement of the theorem is equivalent to the geometrical equality

$$\overrightarrow{P_*P_1}\cdot S_{\triangle P_*P_2P_3} + \overrightarrow{P_*P_2}\cdot S_{\triangle P_*P_3P_1} + \overrightarrow{P_*P_3}\cdot S_{\triangle P_*P_1P_2} = \overrightarrow{O}.$$

Finally, the constant (19) is connected with

$$h = -\frac{1}{S}\begin{vmatrix}1 & 1 & 1 & 1\\ x_* & x_1 & x_2 & x_3\\ y_* & y_1 & y_2 & y_3\\ x_*^2+y_*^2 & x_1^2+y_1^2 & x_2^2+y_2^2 & x_3^2+y_3^2\end{vmatrix},$$

which is known [10, pp. 251–252] as the power of the point P∗ with respect to the circle through the points P1, P2, and P3 (the circumscribed circle of the triangle). If we denote the circumcenter of the triangle P1P2P3 by C, then


$$h = |CP_*|^2 - |CP_j|^2 \quad\text{for } j\in\{1,2,3\}, \tag{22}$$

and, provided that P∗ lies inside this triangle, the value h is negative.

Results of the present section can evidently be extended to the case of three (and more) dimensions.
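The identity (22) can be illustrated numerically by computing h once from the determinant and once from the circumcenter. The sketch below is an assumption-labeled illustration: the function names are hypothetical, S is taken as the determinant with rows (1, 1, 1), (x1, x2, x3), (y1, y2, y3), and the circumcenter is found by solving the two linear equations that express equal distance to the vertices.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def power_via_determinant(p_star, pts):
    """h = -(1/S) * (4x4 determinant of (19))."""
    s = det([[1, 1, 1], [q[0] for q in pts], [q[1] for q in pts]])
    cols = [p_star] + list(pts)
    d4 = det([[1] * 4,
              [c[0] for c in cols],
              [c[1] for c in cols],
              [c[0] ** 2 + c[1] ** 2 for c in cols]])
    return -d4 / s


def circumcenter(p1, p2, p3):
    """Solve (P2-P1).C = (|P2|^2-|P1|^2)/2 and (P3-P1).C = (|P3|^2-|P1|^2)/2."""
    ax, ay = p2[0] - p1[0], p2[1] - p1[1]
    bx, by = p3[0] - p1[0], p3[1] - p1[1]
    a2 = (p2[0] ** 2 + p2[1] ** 2 - p1[0] ** 2 - p1[1] ** 2) / 2
    b2 = (p3[0] ** 2 + p3[1] ** 2 - p1[0] ** 2 - p1[1] ** 2) / 2
    den = ax * by - ay * bx
    return (a2 * by - ay * b2) / den, (ax * b2 - a2 * bx) / den


def power_via_circumcenter(p_star, pts):
    """|C P*|^2 - |C P_1|^2, the right-hand side of (22)."""
    c = circumcenter(*pts)
    return math.dist(c, p_star) ** 2 - math.dist(c, pts[0]) ** 2
```

For the triangle of Example 1 the circumcenter is (3, 3.2), and for the interior point (3, 2) both functions give h = −7.4.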

Theorem 5. Let the points $\{P_j = (x_j, y_j, z_j)\}_{j=1}^{4}$ be noncoplanar, and be counted in such a manner that the value of the determinant

$$V = \begin{vmatrix}1 & 1 & 1 & 1\\ x_1 & x_2 & x_3 & x_4\\ y_1 & y_2 & y_3 & y_4\\ z_1 & z_2 & z_3 & z_4\end{vmatrix} \tag{23}$$

is positive. Then for the choice

$$\left\{m_j^* = |P_*P_j|\cdot V_j\right\}_{j=1}^{4}, \tag{24}$$

where $V_j$ equals the determinant obtained on replacing the j-th column of (23) by the column $[1, x_*, y_*, z_*]^{\top}$ (here $\top$ denotes transposition), the function

$$F(x,y,z) = \sum_{j=1}^{4} m_j^*\sqrt{(x-x_j)^2+(y-y_j)^2+(z-z_j)^2}$$

has its stationary point at P∗ = (x∗, y∗, z∗). If P∗ lies inside the tetrahedron P1P2P3P4, then the values (24) are all positive, and

$$F(x_*,y_*,z_*) = \min_{(x,y,z)\in\mathbb{R}^3}F(x,y,z) = -\begin{vmatrix}1 & 1 & 1 & 1 & 1\\ x_* & x_1 & x_2 & x_3 & x_4\\ y_* & y_1 & y_2 & y_3 & y_4\\ z_* & z_1 & z_2 & z_3 & z_4\\ x_*^2+y_*^2+z_*^2 & x_1^2+y_1^2+z_1^2 & x_2^2+y_2^2+z_2^2 & x_3^2+y_3^2+z_3^2 & x_4^2+y_4^2+z_4^2\end{vmatrix}. \tag{25}$$

Geometrical meanings of the values appearing in the last theorem are similar to their counterparts from Theorem 4. For instance, the value (23) equals six times the volume of the tetrahedron P1P2P3P4, while the determinant in (25) divided by V is known [10, p. 255] as the power of the point P∗ with respect to the sphere circumscribed about that tetrahedron; it satisfies (22), where C this time stands for the circumcenter of the tetrahedron while j ∈ {1, 2, 3, 4}.
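The three-dimensional statement can be checked in the same spirit as Theorem 4. The following sketch is illustrative only (all names are hypothetical); it builds the weights (24) for a point inside a tetrahedron, evaluates the gradient of F there, and compares F with the negated fifth-order determinant from (25).

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def weights_24(p_star, pts):
    """Weights (24): |P* P_j| * V_j, with column j of (23) replaced by P*."""
    weights = []
    for j in range(4):
        cols = list(pts)
        cols[j] = p_star
        v_j = det([[1, 1, 1, 1]] + [[c[k] for c in cols] for k in range(3)])
        weights.append(math.dist(p_star, pts[j]) * v_j)
    return weights


def objective(p, pts, weights):
    """F(x, y, z): weighted sum of distances from p to the vertices."""
    return sum(w * math.dist(p, q) for w, q in zip(weights, pts))


def gradient(p, pts, weights):
    """Gradient of F at p (p must differ from every vertex)."""
    return tuple(
        sum(w * (p[k] - q[k]) / math.dist(p, q) for w, q in zip(weights, pts))
        for k in range(3)
    )


def value_25(p_star, pts):
    """Right-hand side of (25): minus the 5x5 determinant."""
    cols = [p_star] + list(pts)
    rows = [[1] * 5] + [[c[k] for c in cols] for k in range(3)]
    rows.append([sum(t * t for t in c) for c in cols])
    return -det(rows)


if __name__ == "__main__":
    # Vertices ordered so that the determinant (23) equals +1.
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    p_star = (0.2, 0.3, 0.1)
    w = weights_24(p_star, pts)
    print(gradient(p_star, pts, w))
    print(objective(p_star, pts, w), value_25(p_star, pts))
```

For this tetrahedron the determinants V_j are the barycentric coordinates 0.4, 0.2, 0.3, 0.1 of P∗, and both `objective` and `value_25` evaluate to 0.46.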

6. CONCLUSIONS. An analytical solution for the generalized Fermat–Torricelli problem and its inversion is presented. The three-point case is completely solved using “extended radicals”: in addition to the elementary operations and the extraction of roots, the sign function is utilized in the formulas. The treatment of the multidimensional case with n > 3 points requires further investigation, although some theoretical results like [1] give little reason to hope for a nice solution, e.g., one in extended radicals.


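7. APPENDIX. We prove here the equalities (10) and (11). We have

$$
\begin{aligned}
&K_1K_2 + K_1K_3 + K_2K_3\\
&\quad= \tfrac{1}{2}\left[K_1(K_2+K_3) + K_2(K_1+K_3) + K_3(K_1+K_2)\right]\\
&\quad\overset{(9)}{=} K_1\left(r_{23}^2\sigma + m_1^2|S|\right) + K_2\left(r_{13}^2\sigma + m_2^2|S|\right) + K_3\left(r_{12}^2\sigma + m_3^2|S|\right)\\
&\quad= \sigma^2\left[(r_{12}^2+r_{13}^2-r_{23}^2)r_{23}^2 + (r_{23}^2+r_{12}^2-r_{13}^2)r_{13}^2 + (r_{13}^2+r_{23}^2-r_{12}^2)r_{12}^2\right]\\
&\qquad+ S^2\left[m_1^2(m_2^2+m_3^2-m_1^2) + m_2^2(m_1^2+m_3^2-m_2^2) + m_3^2(m_1^2+m_2^2-m_3^2)\right]\\
&\qquad+ \sigma|S|\Big[m_1^2(r_{12}^2+r_{13}^2-r_{23}^2) + m_2^2(r_{12}^2+r_{23}^2-r_{13}^2) + m_3^2(r_{13}^2+r_{23}^2-r_{12}^2)\\
&\qquad\qquad+ r_{23}^2(m_2^2+m_3^2-m_1^2) + r_{13}^2(m_1^2+m_3^2-m_2^2) + r_{12}^2(m_1^2+m_2^2-m_3^2)\Big]\\
&\quad= 4\sigma^2S^2 + 4\sigma^2S^2 + 2\sigma|S|\left[m_1^2(r_{12}^2+r_{13}^2-r_{23}^2) + m_2^2(r_{12}^2+r_{23}^2-r_{13}^2) + m_3^2(r_{13}^2+r_{23}^2-r_{12}^2)\right].
\end{aligned}
$$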

Here we have utilized (8) and the equality

$$4S^2 = (r_{12}^2+r_{13}^2-r_{23}^2)r_{23}^2 + (r_{13}^2+r_{23}^2-r_{12}^2)r_{12}^2 + (r_{23}^2+r_{12}^2-r_{13}^2)r_{13}^2, \tag{26}$$

which can be verified either directly or with the aid of the Heron formula for the area of a triangle (see Section 3). Reference to the definition (6) of the constant d completes the proof of (10).
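The equality (26) is also easy to confirm numerically for a concrete triangle. A minimal sketch follows, with hypothetical function names and with S assumed to be the determinant with rows (1, 1, 1), (x1, x2, x3), (y1, y2, y3):

```python
def four_s_squared_from_vertices(p1, p2, p3):
    """Left-hand side of (26), with S computed as the 3x3 determinant."""
    s = (p2[0] - p1[0]) * (p3[1] - p1[1]) - (p3[0] - p1[0]) * (p2[1] - p1[1])
    return 4 * s * s


def four_s_squared_from_sides(p1, p2, p3):
    """Right-hand side of (26), built from the squared side lengths only."""
    r12 = (p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2
    r13 = (p1[0] - p3[0]) ** 2 + (p1[1] - p3[1]) ** 2
    r23 = (p2[0] - p3[0]) ** 2 + (p2[1] - p3[1]) ** 2
    return ((r12 + r13 - r23) * r23
            + (r13 + r23 - r12) * r12
            + (r23 + r12 - r13) * r13)


if __name__ == "__main__":
    tri = [(2, 6), (1, 1), (5, 1)]            # triangle of Example 1, S = 20
    print(four_s_squared_from_vertices(*tri), four_s_squared_from_sides(*tri))
```

For the triangle of Example 1 both sides equal 1600, in agreement with S = 20.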

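We now deduce formula (11):

$$
\begin{aligned}
&r_{23}^2K_1 + r_{13}^2K_2 + r_{12}^2K_3\\
&\quad= \sigma\left[(r_{12}^2+r_{13}^2-r_{23}^2)r_{23}^2 + (r_{13}^2+r_{23}^2-r_{12}^2)r_{12}^2 + (r_{23}^2+r_{12}^2-r_{13}^2)r_{13}^2\right]\\
&\qquad+ |S|\left[r_{23}^2(m_2^2+m_3^2-m_1^2) + r_{13}^2(m_1^2+m_3^2-m_2^2) + r_{12}^2(m_1^2+m_2^2-m_3^2)\right]\\
&\quad\overset{(26)}{=} 4\sigma S^2 + |S|\left[m_1^2(r_{12}^2+r_{13}^2-r_{23}^2) + m_2^2(r_{23}^2+r_{12}^2-r_{13}^2) + m_3^2(r_{13}^2+r_{23}^2-r_{12}^2)\right]\\
&\quad\overset{(6)}{=} 2|S|d.
\end{aligned}
$$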

ACKNOWLEDGMENTS. The author thanks the referees and the editor for valuable suggestions that helped to improve the quality of the paper.

REFERENCES

1. C. Bajaj, The algebraic degree of geometric optimization problems, Discrete Comput. Geom. 3 (1988) 177–191.
2. R. Courant, H. Robbins, What is Mathematics? Oxford University Press, London, 1941.
3. F. Dingeldey, Sammlung von Aufgaben zur Anwendung der Differenzial- und Integralrechnung. Erster Teil. Aufgaben zur Anwendung der Differenzialrechnung. Teubner, Leipzig, 1910.
4. Encyclopedia of Mathematics contributors, Fermat–Torricelli problem. Encyclopedia of Mathematics, available at http://www.encyclopediaofmath.org/index.php?title=Fermat-Torricelli_problem&oldid=22419
5. D. Gonzalez Martinez, The Fermat point, available at http://jwilson.coe.uga.edu/EMAT6680Fa10/Gonzalez/Assignment6/THEFERMATPOINT.htm


6. I. Greenberg, R. A. Robertello, The three factory problem, Math. Mag. 38 (1965) 67–72.
7. H. W. Kuhn, A note on Fermat’s problem, Math. Program 4 (1973) 98–107.
8. W. Launhardt, Kommercielle Tracirung der Verkehrswege, Z. f. Architekten u. Ingenieur-Vereinis im Konigreich Hannover, 18 (1872) 516–534.
9. L. M. Ostresh, On the convergence of a class of iterative methods for solving the Weber location problem, Oper. Res. 26 (1978) 597–609.
10. J. V. Uspensky, Theory of Equations. McGraw-Hill, New York, 1948.
11. E. Weiszfeld, Sur le point pour lequel la somme des distances de n points donnes est minimum. Tohoku Math. J. 43 (1937) 355–386.

ALEXEI UTESHEV received his Ph.D. from the Leningrad (St. Petersburg) State University in 1988. His mathematical interests lie in computational algebra and geometry; he also maintains personal educational on-line resources in these areas. He is also interested in history and enjoys cross-country skiing.
Faculty of Applied Mathematics, St. Petersburg State University
Universitetskij pr. 35, 198504, Petrodvorets, St. Petersburg, [email protected]

A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem

The maximum value of a continuous real-valued function f on [a, b] is attained at its largest “lookout point.” We call x in [a, b] a lookout point if, whenever t lies in [a, x), we have f(t) ≤ f(x). The set L of lookout points is closed. Indeed, let xn → x, with xn in L. If t is in [a, x), then eventually t lies in [a, xn), so f(t) ≤ f(xn). By continuity, f(t) ≤ f(x), as desired. We use the fact that a closed, bounded, and nonempty set has a maximum and a minimum. Thus, max(L) exists.

Extreme Value Theorem. If f is a real-valued continuous function on [a, b] then f has a maximum value on [a, b]. In other words, for some c in [a, b], no value attained by f exceeds f(c).

Proof. Letting L = {x in [a, b] such that t in [a, x) implies f(t) ≤ f(x)} and c = max(L), it suffices to show that, given k > f(c), the closed, bounded set Sk = {t in [a, b] such that f(t) ≥ k} is empty, and this follows since, if some d satisfies f(d) ≥ k, then d > c, whence d is not in L, so there exists a t < d for which f(t) > f(d) ≥ k, proving that Sk has no minimum.

—Submitted by Samuel J. Ferguson, University of Iowa

http://dx.doi.org/10.4169/amer.math.monthly.121.04.331
MSC: Primary 26A03, Secondary 26A15


A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem
Author(s): Samuel J. Ferguson
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), p. 331
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.331

Accessed: 30/03/2014 17:29



On the Proof of the Existence of Undominated Strategies in Normal Form Games
Author(s): Martin Kovár, Alena Chernikava
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 332-337
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.332

Accessed: 30/03/2014 17:29


On the Proof of the Existence of Undominated Strategies in Normal Form Games

Martin Kovár and Alena Chernikava

Abstract. In the game theory literature, there are two versions of the proof of the well-known fact that in a normal form game of n persons with compact spaces of strategies and continuous utility functions, the sets of undominated strategies are nonempty. The older one, stated in the first edition of the well-known book by Hervé Moulin, depends on certain, relatively nontrivial results from measure theory, metric topology, and mathematical analysis. The proof is valid only for metrizable topological spaces. The second, revised edition of the same book contains a simplified proof, which is, however, incorrect. The author implicitly assumes that any linearly ordered set contains a cofinal subsequence, which is certainly not true. In this paper we correct, simplify, and generalize the second proof of Moulin by its reformulation in terms of topological convergence of nets. This modified technique also yields a slightly better result than is stated in the original. The assertion now holds for almost compact spaces. The argument used is elementary and easily understandable to non-experts.

1. INTRODUCTION. In a normal form game, assume that the set of all strategies of a player is compact and its associated utility function is continuous. In this paper, we present a slightly improved modification of the well-known result, which ensures the existence of an undominated strategy. Moreover, our result has a new and simpler proof.

The standard and best-known version of the proof is in the first edition of the comprehensive textbook on game theory by Hervé Moulin [8]. It is dependent on a combination of relatively nontrivial results from measure theory, metric topology, and mathematical analysis. In the second, revised edition [9] of the same book, there is a newer, simplified proof using some topological arguments together with Zorn’s Lemma. Unfortunately, Moulin’s second proof is incorrect, since it implicitly uses the invalid argument that every chain (that is, a linearly ordered set) contains a cofinal subsequence. The first uncountable ordinal ω1 is a proper counterexample, which demonstrates that in general this is not true. The mistake itself is not very critical for game theory, since in metric spaces, for which the classical results are usually formulated, the topology is first countable and hence sequences are still sufficient to fully describe the topology by means of convergence. Nevertheless, the mentioned fact itself, noticed by the second author during her study of [9], constitutes an opportunity for a revision of Moulin’s original proof using somewhat finer and a little more general topological arguments.

We will correct, simplify, and somewhat strengthen the proof of Moulin’s result so that it applies under slightly more general topological assumptions than are stated in its original version. Although the central notion that we use to express the phenomena of convergence, namely nets, lies rather aside from the mainstream of current general topology, it has an important advantage. It is understood also by non-topologists because of its similarity to the widespread and well-known convergence theory of sequences in metric spaces. We also keep the simplified formulation of Moulin’s theorem as in [8] and exclude the part regarding the prudent strategies, which is new in [9], but irrelevant with respect to the discussed correctness of Moulin’s proof. A short description of our modification of the proof now follows. To make the sketch clearer, we use compactness instead of almost compactness (in contrast to the full version presented in Section 3). We modify the relation of dominance to induce a preorder on the strategy set of the i-th player. The identity map, restricted to any chain (meaning now a linearly or totally preordered subset), forms a net in a compact topological space, which, hence, has a cluster point. The continuous utility function maps the net as well as its cluster point to R, equipped with the Euclidean topology. Thus the cluster point is an upper bound of the chain, and Zorn’s Lemma finally completes the proof.

http://dx.doi.org/10.4169/amer.math.monthly.121.04.332
MSC: Primary 91A10, Secondary 54D30

2. DEFINITIONS AND NOTATIONS. Before demonstrating the complete proof, we will need to recapitulate some necessary notions from game theory and topology. Recall that an n-person game G in a normal or strategic form is denoted by the 2n-tuple G = (X1, X2, . . . , Xn, u1, u2, . . . , un), where for each i ∈ {1, 2, . . . , n}, Xi is a nonempty set of strategies of the i-th player and $u_i : \prod_{j=1}^{n} X_j \to \mathbb{R}$ is his real-valued utility, or pay-off function. Let i ∈ {1, 2, . . . , n} and let xi, yi ∈ Xi be some strategies of the i-th player. We say that the strategy yi dominates the strategy xi, if the following conditions hold.

(1) For any selection of strategies sk ∈ Xk, where k ∈ {1, 2, . . . , n}, k ≠ i,

$$u_i(s_1, s_2, \ldots, s_{i-1}, x_i, s_{i+1}, \ldots, s_n) \le u_i(s_1, s_2, \ldots, s_{i-1}, y_i, s_{i+1}, \ldots, s_n).$$

(2) For each k ∈ {1, 2, . . . , n}, k ≠ i, there exists some strategy tk ∈ Xk such that

$$u_i(t_1, t_2, \ldots, t_{i-1}, x_i, t_{i+1}, \ldots, t_n) < u_i(t_1, t_2, \ldots, t_{i-1}, y_i, t_{i+1}, \ldots, t_n).$$

The strategy xi ∈ Xi of the i-th player is said to be undominated if there is no strategy yi ∈ Xi that dominates xi. It should be noted that this kind of dominance is sometimes referred to as weak dominance, in opposition to strict dominance, which differs from the above-defined notion at the condition (1) by the strict form < of the inequality. Two strategies xi, yi ∈ Xi are called equivalent, if for any selection of strategies sk ∈ Xk, where k ∈ {1, 2, . . . , n}, k ≠ i, it follows that

$$u_i(s_1, s_2, \ldots, s_{i-1}, x_i, s_{i+1}, \ldots, s_n) = u_i(s_1, s_2, \ldots, s_{i-1}, y_i, s_{i+1}, \ldots, s_n).$$
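For a game with finitely many strategies, the definitions of dominance and of an undominated strategy can be spelled out directly. The sketch below is only an illustration of conditions (1) and (2) in that finite setting, where existence is straightforward; it says nothing about the compact case treated in this paper. The function names and the tiny payoff table are hypothetical.

```python
from itertools import product


def dominates(u_i, strategy_sets, i, x, y):
    """True if strategy y of player i (weakly) dominates strategy x.

    u_i maps a full strategy profile (a tuple) to player i's utility.
    """
    others = [s for k, s in enumerate(strategy_sets) if k != i]
    strictly_better_somewhere = False
    for rest in product(*others):
        with_x = rest[:i] + (x,) + rest[i:]
        with_y = rest[:i] + (y,) + rest[i:]
        if u_i(with_x) > u_i(with_y):       # condition (1) fails
            return False
        if u_i(with_x) < u_i(with_y):       # witness for condition (2)
            strictly_better_somewhere = True
    return strictly_better_somewhere


def undominated(u_i, strategy_sets, i):
    """All strategies of player i that no other strategy dominates."""
    return [x for x in strategy_sets[i]
            if not any(dominates(u_i, strategy_sets, i, x, y)
                       for y in strategy_sets[i])]


if __name__ == "__main__":
    # Hypothetical two-player game; only player 0's payoffs are needed here.
    sets = [["a", "b", "c"], ["l", "r"]]
    table = {("a", "l"): 1, ("a", "r"): 1,
             ("b", "l"): 1, ("b", "r"): 2,
             ("c", "l"): 0, ("c", "r"): 3}
    print(undominated(table.get, sets, 0))
```

In this toy game, strategy b dominates a, while b and c do not dominate each other, so the undominated strategies of player 0 are b and c.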

(For more detail, see, for example, [2, 11].)

A binary relation on a set is called a preorder, if it is reflexive and transitive (and not necessarily antisymmetric). Let A be a nonempty set and ≼ be a preorder on A such that for every x, y ∈ A there exists z ∈ A with x ≼ z and y ≼ z. Then we say that (A, ≼) is a directed set.

Let X be a topological space. A net in X is an arbitrary mapping from a directed set to the space X. A family Φ of nonempty sets is called a filter base if any intersection of two sets belonging to Φ contains a subset from Φ. We say that p ∈ X is a θ-cluster point of a filter base Φ in X, if for every closed neighborhood H of p and every F ∈ Φ, the intersection H ∩ F is nonempty. Similarly, p is a θ-cluster point of a net ϕ(A, ≼), if for each closed neighborhood H of p and for each a ∈ A, there exists b ∈ A, b ≽ a, such that ϕ(b) ∈ H. Taking the ϕ-images of the principal upper sets ↑a = {b | b ∈ A, b ≽ a}, we can easily convert the net ϕ(A, ≼) into a filter base, while the corresponding convergence and θ-convergence notions will remain preserved.


A topological space X is said to be compact, if every filter base (or equivalently, every net) in X has a cluster point. For more detail and other equivalent characterizations of compactness, especially in terms of open covers, we refer the reader to [3]. We also remark that in a modern approach to compactness, motivated by the growing interest of theoretical computer scientists in topology, the Hausdorff separation axiom is no longer assumed as a part of the definition of compactness (see, for example, [15]). A topological space is called almost compact [1] if every open filter base in X has a cluster point. It is clear that every compact topological space is almost compact (but not vice versa).

The real line R we consider is a topological space equipped with the natural, Euclidean topology, generated by all open intervals.

3. MAIN RESULTS. We will start with the following simple example. It illustrates some of the limitations of Moulin’s classical result. The undominated strategies may exist even if the spaces of strategies are not compact.

Example 3.1. Consider a normal form game of two players with the same sets of strategies X1 = X2 = [0, 1) × {0} ∪ {1} × {0, 1, . . .}. Let the corresponding utility functions of the players be

$$u_1 = \frac{x_1}{x_1+x_2}\cdot f(y_2),\quad\text{and}\quad u_2 = \frac{x_2}{x_1+x_2}\cdot g(y_1),$$

where f, g are arbitrary real-valued functions defined on {0} ∪ N. It is easy to see that the pairs (1, n) ∈ Xi, where n ∈ {0, 1, . . .} and i = 1, 2, are equivalent, maximal and undominated strategies of the i-th player. However, although the utility functions ui are continuous, the topology of Xi, induced from the real plane, is not compact. For instance, the sequence {(1, n) | n = 0, 1, 2, . . .} has no cluster point in Xi. Hence, the existence of undominated strategies of the i-th player is not a consequence of Moulin’s theorem.

There are many possible interpretations of the previous example, but probably one of the most important is that there could be a duopolistic competition over market share with patent wars. The first component xi of the strategy (xi, yi) of the i-th player may represent the market share, while the second component yi can be interpreted as obstructions extracting the profit of the player’s opponent, in particular, litigation over patent rights.

For our main theorem and also for a better understanding of some other aspects of the previous example, we will need the following lemma. The contents of the lemma are already known: it is essentially contained in (but rather split between) the book [1] and the paper [14]. Also useful are comments in [5]. We present the result here with a proof, in order to repeat and concentrate some ideas of these resources in one place for the reader’s convenience.

Lemma 3.1. Let (X, τ) be a topological space. The following conditions are equivalent:

(i) (X, τ) is almost compact,

(ii) every filter base in X has a θ-cluster point,

(iii) every net in X has a θ-cluster point,

(iv) every open cover of X has a finite subfamily whose union is dense in X.


Proof. Suppose (i), and let Φ be a filter base in X. The family Ψ = {U | U ∈ τ, there exists F ∈ Φ with F ⊆ U} is an open filter base, and so it has a cluster point, say p ∈ X. Let H be a closed neighborhood of p. We will show that H ∩ F ≠ ∅ for every F ∈ Φ. Suppose conversely that F ⊆ X \ H for some F ∈ Φ. Then X \ H ∈ Ψ and so p ∈ cl(X \ H). But this is not possible, since p ∈ int H and (int H) ∩ (X \ H) = ∅. Hence, (ii) follows.

Consider (ii) and take a net ϕ(A, ≼) in X. The family Φ = {ϕ(↑a) | a ∈ A} is a filter base with a θ-cluster point, say p ∈ X. Let H be a closed neighborhood of p and let a ∈ A. Then H ∩ ϕ(↑a) ≠ ∅, so there is some b ∈ A, b ≽ a, with ϕ(b) ∈ H. It means that p is a θ-cluster point of ϕ(A, ≼) and (iii) holds.

Assume (iii), and take an open cover Ω of X. Let Ω_F be the family of all finite unions of elements of Ω. The family Ω_F is directed by the set inclusion. Suppose that for every U ∈ Ω_F, the set X \ cl U is nonempty, so it contains some element ϕ(U). The net ϕ(Ω_F, ⊆) has a θ-cluster point, say p ∈ X. Since Ω_F is also a cover, there is some V ∈ Ω_F containing p. By the definition of the θ-cluster point, there exists W ∈ Ω_F, W ⊇ V, such that ϕ(W) ∈ cl V. But it also holds that ϕ(W) ∈ X \ cl W, so ∅ ≠ (X \ cl W) ∩ V ⊆ (X \ W) ∩ W, which is not possible. Then some element of Ω_F must be dense in X.

Finally, suppose (iv). Let Ψ be an open filter base in X with no cluster point. Then ⋂{cl U | U ∈ Ψ} = ∅, so Ω = {X \ cl U | U ∈ Ψ} is an open cover of X; and since Ψ is a filter base, it is directed by the inclusion. By (iv), there exists U ∈ Ψ, such that X = cl(X \ cl U). Since X \ U is a closed set containing (X \ cl U), it also contains its closure, and so X \ U = X. But this is not possible according to the fact that a filter base contains only nonempty elements. Therefore, Ψ has a cluster point and (i) now follows.

From the previous lemma also follows the well-known fact that for regular spaces, the compactness and almost compactness coincide. On the other hand, there exists a Hausdorff almost compact space that is not compact, as the reader may check in [1]. Hausdorff almost compact spaces are also known as H-closed spaces (also in [1], or [14]).

Theorem 3.1. Let G = (X_1, X_2, . . . , X_n, u_1, u_2, . . . , u_n) be a normal form game of n players. Suppose that for some i ∈ {1, 2, . . . , n}, X_i is almost compact and the utility function u_i is a continuous, real valued function of the argument x_i ∈ X_i. Then the i-th player has an undominated strategy.

Proof. For two strategies x_i, y_i ∈ X_i we put x_i ≼ y_i (sometimes we will write this relation as y_i ≽ x_i) if they satisfy the condition (1) of the definition of dominance in Section 2. It is easy to see that ≼ is a preorder on X_i. Let L ⊆ X_i be an arbitrary linearly preordered subset of X_i (that is, for every a, b ∈ L, it holds a ≼ b or b ≼ a). Let l be the identity mapping on X_i, restricted to L. Then l is a net in an almost compact topological space X_i, so l has a θ-cluster point p ∈ X_i. By the definition, it means that for every closed neighborhood H of p and every t ∈ L, there exists some s ∈ L, s ≽ t, with l(s) ∈ H.

Now, suppose that the strategies s_k ∈ X_k of the other players, k ≠ i, are arbitrarily chosen, but fixed in this paragraph. We denote u′_i(x_i) = u_i(s_1, s_2, . . . , s_{i−1}, x_i, s_{i+1}, . . . , s_n). Suppose, for a moment, that there exists some t ∈ L with u′_i(p) < u′_i(t). Take c ∈ R such that u′_i(p) < c < u′_i(t). Because of continuity of u′_i, H = (u′_i)^{−1}((−∞, c]) is a closed set in X_i whose interior contains p. Since p is a θ-cluster point of l, there exists s ∈ L, with s ≽ t, such that s = l(s) ∈ H. But


then u′_i(s) ∈ (−∞, c], which is not possible, because the relation s ≽ t means that c < u′_i(t) ≤ u′_i(s). Consequently, p is an upper bound of L. By Zorn’s Lemma, there is a maximal element m in the preordered set (X_i, ≼). This completes the proof, since the strategy, maximal with respect to ≼, cannot be dominated.

Note that Zorn’s Lemma is usually formulated for partially ordered sets. Using preordered sets, its appropriate formulation can be found in [7]. Hence, the maximality of m, which is claimed in our proof, is maximality up to the equivalence of strategies. It means that there may exist another strategy m′ ∈ X_i, different from m (and also “maximal”), on which the utility function u_i has the same values.

Since every compact topological space is almost compact, the classical version of Moulin’s theorem now follows as a corollary. The reader can also compare Theorem 3.1 with other interesting results and techniques known from the game theory literature. For instance, H. Salonen in [12] replaced the continuity of the utility function by its upper semi-continuity. He essentially used a characterization of compactness by the centered collections of sets (in other words, having the finite intersection property, [10]), or filters and filter bases, which are topologically equivalent to nets. A similar technique was also used in [11] for iteratively undominated strategies with the continuous utility function.

Now, let us check the advantage of Theorem 3.1 over its original, classical version.

Example 3.2. Consider the game already described in Example 3.1. Let us define another topology on X_i, where i = 1, 2, by the local base of a general point (x, y) ∈ X_i:

(i) the point (0, 0) has neighborhoods of the form [0, ε) × {0}, 0 < ε < 1,

(ii) for every x ∈ (0, 1), the point (x, 0) has neighborhoods of the form (x − ε, x + ε) × {0}, where 0 < ε < min{x, 1 − x},

(iii) for every n = 0, 1, . . . , the point (1, n) has neighborhoods having the form (1 − ε, 1) × {0} ∪ {(1, n)}, where 0 < ε < 1.

The new topology on X_i is now similar to the Euclidean topology on the unit segment [0, 1], but with one important difference: the right end point of the “segment” is present infinitely many times. The space X_i is T_1, but certainly non-Hausdorff and non-compact. Indeed, denoting Y_n = [0, 1) × {0} ∪ {(1, n)}, the family {Y_n | n = 0, 1, . . .} is an open cover of X_i, having no finite subcover. However, we can show that the new topology is almost compact. Let Ω be an open cover of X_i. The subspace Y_0 = [0, 1] × {0} ⊆ X_i is compact since it is homeomorphic to the unit segment [0, 1], so there exists a finite subfamily {U_1, U_2, . . . , U_k} ⊆ Ω with Y_0 ⊆ ⋃_{j=1}^{k} U_j. Then there is r ∈ {1, 2, . . . , k} such that (1, 0) ∈ U_r. But for every n = 1, 2, . . . , it follows that (1, n) ∈ cl U_r, so the closures of {U_1, U_2, . . . , U_k} cover X_i. By condition (iv) of Lemma 3.1, X_i is almost compact. The utility functions u_i are continuous functions of the argument (x_i, y_i), since they are continuous on the open subspaces Y_n = [0, 1) × {0} ∪ {(1, n)} of X_i, n = 0, 1, . . . , homeomorphic to [0, 1]. Hence, the existence of the undominated strategies now follows from Theorem 3.1. Note that similar spaces as X_i are also known as examples of non-Hausdorff manifolds with some motivation in sheaf theory and mathematical physics (see, for example, [6] or [4]).

ACKNOWLEDGMENTS. The authors are very grateful to both anonymous referees for many valuable suggestions and comments, in particular related to the game-theoretical part of the content of our paper, and to the editor for his assistance with preparation of the final form of the manuscript. The authors are also thankful to Professor V. A. Gorelik from the Dorodnitsyn Computing Center of the Russian Academy of Sciences for his advice on finding an appropriate game-theoretical literature at the initial stage of their work.


This work is supported by a specific research grant FEKT-S-11-2/921 of the Faculty of Electrical Engineering and Communication, Brno University of Technology.

REFERENCES

1. A. Csaszar, General Topology. Akademiai Kiado, Budapest, 1978.
2. D. Fudenberg, J. Tirole, Game Theory. MIT Press, Cambridge, 1991.
3. R. Engelking, General Topology. Heldermann Verlag, Berlin, 1989.
4. M. Heller, L. Pysiak, W. Sasin, Geometry of non-Hausdorff spaces and its significance for physics, J. Math. Phys. 52 (2011) 1–7, available at http://dx.doi.org/10.1063/1.3574352.
5. D. S. Jankovic, θ-regular spaces, Internat. J. Math. Sci. 8 (1985) 615–619, available at http://dx.doi.org/10.1155/S0161171285000667.
6. S. L. Kent, R. A. Mimna, J. K. Tartir, A note on topological properties of non-Hausdorff manifolds, Internat. J. Math. Sci. (2009) 1–4, available at http://dx.doi.org/10.1155/2009/891785.
7. R. E. Megginson, An Introduction to Banach Space Theory. Springer-Verlag, Berlin, 1998.
8. H. Moulin, Theorie des Jeux pour l’Economie et la Politique. Hermann Paris, Collection Methodes, Paris, 1981.
9. H. Moulin, Game Theory for the Social Sciences. Second and revised edition. New York University Press, New York, 1986.
10. J. Nagata, Modern General Topology. North-Holland, Amsterdam, 1974.
11. K. Ritzberger, Foundations of Non-Cooperative Game Theory. Oxford University Press, Oxford, 2002.
12. H. Salonen, On the existence of undominated Nash equilibria in normal form games, Games and Economic Behavior 14 (1996) 208–219, available at http://dx.doi.org/10.1006/game.1996.0049.
13. W. J. Thron, Topological Structures. Holt, Rinehart and Winston, New York, 1966.
14. N. V. Velicko, H-closed topological spaces, Mat. Sb. 70(112) (1966) 98–102 (Russian).
15. S. Vickers, Topology Via Logic. Cambridge University Press, Cambridge, 1989.

MARTIN KOVAR is an associate professor of mathematics at Brno University of Technology (Brno, Czech Republic). He obtained his Ph.D. (1994) from Masaryk University in Brno and his habilitation degree (2006) from Charles University in Prague. He started to study physics at Masaryk University in 1985. Even as a student, he was fascinated by the elegance and beauty of classical topological results. His growing interest in topology finally led to a change in his area of specialization. His main fields of interest are general and applied topology motivated by problems from computer science and physics. However, theoretical and mathematical physics still remain in the extended range of his scientific interests. To date, he has published approximately 30 research articles. He strongly believes that topology is a fascinating mathematical discipline of the future, with an excellent, but so far underused, potential for many applications, including computer science, physics, and modern technologies. He also believes that science is fun and that the liberty of scientific research is one of the greatest values and achievements of humanity and should be carefully protected.
Department of Mathematics, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 8, Brno, 616 00, Czech Republic
[email protected]

ALENA CHERNIKAVA received her master’s degree (2009) at the State University of Minsk in Belarus. She excelled at her studies, and in 2010 she began Ph.D. study at Brno University of Technology in the Czech Republic. Her principal fields of interest are applied topology, formal concept analysis, and game theory. She is an author or co-author of several research papers.
Department of Mathematics, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 8, Brno, 616 00, Czech Republic
[email protected]


An Asymptotic Formula for (1 + 1/x)^x Based on the Partition Function
Author(s): Chao-Ping Chen, Junesang Choi
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 338-343
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.338


An Asymptotic Formula for (1 + 1/x)^x Based on the Partition Function

Chao-Ping Chen and Junesang Choi

Abstract. We present a method to produce estimations of the natural logarithmic constant e, accurate to as many decimal places as we desire. The method is based on an asymptotic formula for (1 + 1/x)^x, which uses the partition function.

In contrast with the continuing fascination with finding as many digits as possible of the decimal approximation of π, few mathematicians seem interested in computing the base e of the natural logarithms to a comparable precision (see [2, 6]). Joost Burgi seems to have formulated the first approximation to e around 1620, obtaining three-decimal-place accuracy (see [3, p. 31], [5], and [6, pp. 26–27]).

One classical computation of e depends upon the well-known limit

\[ e = \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^{x}. \tag{1} \]

Indeed, it is easy to see that the function f(x) := (1 + 1/x)^x increases and is bounded above by 3 on the interval [1, ∞); thus, a larger value of x gives a more accurate approximation to e. For example, f(10^5) = 2.7182 6823 . . . approximates e to four decimal places.

Another classical computation of e uses Isaac Newton’s first version (1620) of what is now known as the Maclaurin series expansion for e^x [2]:

\[ e^{x} = \sum_{j=0}^{\infty} \frac{x^{j}}{j!} = 1 + x + \frac{x^{2}}{2!} + \frac{x^{3}}{3!} + \cdots \quad \text{for all } x \in \mathbb{R}. \tag{2} \]

Setting x = 1 in (2) and choosing a large value of n, we obtain the partial sum

\[ \sum_{j=0}^{n} \frac{1}{j!} = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \cdots + \frac{1}{n!}, \]

which gives a simple, direct approximation to e that is the best way of calculating e to high accuracy [1, 2]. Present numerical values of e are derived using either optimized versions of this Maclaurin series (2) or the continued-fraction expansion approach initiated by Euler [2].
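The two classical computations above can be compared numerically. The following is only a minimal sketch: the function names `limit_approx` and `partial_sum` are illustrative choices, not notation from the paper, and the limit is evaluated in ordinary floating point while the partial sums are kept exact.

```python
import math
from fractions import Fraction


def limit_approx(x: int) -> float:
    """Return (1 + 1/x)**x, the approximation to e suggested by limit (1)."""
    return (1.0 + 1.0 / x) ** x


def partial_sum(n: int) -> Fraction:
    """Return the exact partial sum 1/0! + 1/1! + ... + 1/n! of series (2) at x = 1."""
    total = Fraction(0)
    term = Fraction(1)  # term holds 1/j!, starting with j = 0
    for j in range(n + 1):
        total += term
        term /= (j + 1)  # advance from 1/j! to 1/(j+1)!
    return total


# Absolute errors against the library value of e.
err_limit = abs(math.e - limit_approx(10**5))
err_series = abs(math.e - float(partial_sum(10)))
```

With only eleven terms, the partial sum is already several orders of magnitude closer to e than f(10^5), which matches the remark that the series is the better route to high accuracy.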

A further classical approach to approximating e uses the Maclaurin series expansion of ln(1 + x):

\[ \ln(1 + x) = \sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j}\, x^{j} \quad \text{for } -1 < x \le 1. \tag{3} \]

http://dx.doi.org/10.4169/amer.math.monthly.121.04.338
MSC: Primary 05A17, Secondary 11P81


The only example of this alternative approach (that the authors of [2] have found in the literature) is given by replacing x with 1/x in (3) and then multiplying the resulting series by x to get

\[ x \ln\left(1 + \frac{1}{x}\right) = 1 - \frac{1}{2x} + \frac{1}{3x^{2}} - \frac{1}{4x^{3}} + \frac{1}{5x^{4}} - \frac{1}{6x^{5}} + \frac{1}{7x^{6}} - \cdots \tag{4} \]

for x < −1 or x ≥ 1. By exponentiating each side of (4) and collecting the same powers of 1/x with the help of the Maclaurin series (2) for e^x, we find an approximation to e that has been known by mathematicians and bankers alike since the early seventeenth century (see [2, Eq. (4)]). For x < −1 or x ≥ 1, we obtain

\[ \left(1 + \frac{1}{x}\right)^{x} = e\left(1 - \frac{1}{2x} + \frac{11}{24x^{2}} - \frac{7}{16x^{3}} + \frac{2447}{5760x^{4}} - \frac{959}{2304x^{5}} + \frac{238043}{580608x^{6}} - \cdots\right). \tag{5} \]

Setting, for example, x = 100,000 in the left-hand side of (5) yields an approximation to e that is accurate to four decimal places.

Motivated by this technique, Knox and Brothers [5] (see also Brothers and Knox [2]) present an interesting and useful method that yields a new and more accurate approximation to e by combining two good approximations. We choose to demonstrate one of their many results here (see [2] or [5]). Adding approximation (5) and the approximation obtained by replacing x by −x in (5), and multiplying the resulting identity by 1/2, they obtain the following better approximation to e than that given by (5):

\[ \frac{1}{2}\left[\left(1 + \frac{1}{x}\right)^{x} + \left(1 - \frac{1}{x}\right)^{-x}\right] = e\left(1 + \frac{11}{24x^{2}} + \frac{2447}{5760x^{4}} + \frac{238043}{580608x^{6}} + \cdots\right). \tag{6} \]
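The gain from averaging can be observed directly. The sketch below is an illustration only; the names `plain` and `averaged` and the sample value x = 1000 are assumptions chosen for the demonstration, and the computation uses ordinary floating point.

```python
import math


def plain(x: float) -> float:
    """Left-hand side of (5): (1 + 1/x)**x."""
    return (1.0 + 1.0 / x) ** x


def averaged(x: float) -> float:
    """Left-hand side of (6): the mean of (1 + 1/x)**x and (1 - 1/x)**(-x)."""
    return 0.5 * ((1.0 + 1.0 / x) ** x + (1.0 - 1.0 / x) ** (-x))


x = 1000.0
err_plain = abs(plain(x) - math.e)   # leading error term of (5): e/(2x)
err_avg = abs(averaged(x) - math.e)  # leading error term of (6): 11e/(24x^2)
```

Because the odd powers of 1/x cancel in (6), the error drops from order 1/x to order 1/x^2, which is what the two computed errors reflect.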

Even though we can obtain as many coefficients as we please in the right-hand side of (5) by using Mathematica, here we aim at giving a formula for determining these coefficients. Our formula is based mainly on the partition function (see, e.g., [7, 9]).

For our later use, we introduce the following set of partitions of an integer n ∈ N = N_0 \ {0} := {1, 2, 3, . . .}:

\[ A_n := \left\{(k_1, k_2, \ldots, k_n) \in \mathbb{N}_0^{n} : k_1 + 2k_2 + \cdots + nk_n = n\right\}. \tag{7} \]

In number theory, the partition function p(n) represents the number of possible partitions of n ∈ N (i.e., the number of distinct ways of representing n as a sum of natural numbers, regardless of order). By convention, p(0) = 1 and p(n) = 0 for n a negative integer. For more information on the partition function p(n), please refer to [7] and the references therein. The first several values of the partition function p(n) are (starting with p(0) = 1, see [9]):

1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, . . . .

It is easy to see that the cardinality of the set A_n is equal to the partition function p(n). Now we are ready to present a formula that determines the coefficients a_j in (8), with the help of the partition function, as asserted by the following theorem.
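The claim that A_n has p(n) elements can be checked by brute-force enumeration. This is a minimal sketch; `partition_set` is a hypothetical helper name, and the recursion simply chooses k_n, k_{n-1}, ..., k_1 in turn.

```python
from typing import Iterator, Tuple


def partition_set(n: int) -> Iterator[Tuple[int, ...]]:
    """Yield every (k_1, ..., k_n) of nonnegative integers with
    k_1 + 2*k_2 + ... + n*k_n = n, i.e. the elements of A_n in (7)."""

    def fill(i: int, remaining: int, suffix: Tuple[int, ...]) -> Iterator[Tuple[int, ...]]:
        # Choose k_i for i = n, n-1, ..., 1; `remaining` is what is left of n.
        if i == 0:
            if remaining == 0:
                yield suffix
            return
        for k in range(remaining // i + 1):
            yield from fill(i - 1, remaining - i * k, (k,) + suffix)

    yield from fill(n, n, ())


# Cardinalities of A_0, ..., A_10 (A_0 contains only the empty tuple).
sizes = [len(list(partition_set(n))) for n in range(11)]
```

Here k_i counts how many times the part i occurs, so each tuple encodes exactly one partition of n.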


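Theorem. The following approximation formula holds true:

\[ \left(1 + \frac{1}{x}\right)^{x} = e \sum_{j=0}^{\infty} \frac{a_j}{x^{j}} \quad \text{as } x \to \infty, \tag{8} \]

where the coefficients a_j (j ∈ N) are given by a_0 := 1 and

\[ a_j = (-1)^{j} \sum_{(k_1, k_2, \ldots, k_j) \in A_j} \frac{1}{k_1!\, k_2! \cdots k_j!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} \cdots \left(\frac{1}{j+1}\right)^{k_j}, \tag{9} \]

where the A_j (for j ∈ N) are given in (7).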
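Formula (9) can be evaluated exactly with rational arithmetic. The following is a sketch under the assumption that A_j is enumerated by brute force; the names `partition_set` and `a` are illustrative and not from the paper.

```python
from fractions import Fraction
from math import factorial


def partition_set(n):
    """Elements (k_1, ..., k_n) of A_n from (7), found by brute force."""

    def fill(i, remaining, suffix):
        if i == 0:
            if remaining == 0:
                yield suffix
            return
        for k in range(remaining // i + 1):
            yield from fill(i - 1, remaining - i * k, (k,) + suffix)

    return fill(n, n, ())


def a(j: int) -> Fraction:
    """Coefficient a_j of (8), evaluated exactly from formula (9)."""
    if j == 0:
        return Fraction(1)
    total = Fraction(0)
    for ks in partition_set(j):
        term = Fraction(1)
        for i, k in enumerate(ks, start=1):
            # Factor (1/(i+1))**k_i / k_i! contributed by the index i.
            term *= Fraction(1, (i + 1) ** k) / factorial(k)
        total += term
    return (-1) ** j * total
```

For example, `a(2)` evaluates to 11/24 and `a(3)` to -7/16, in agreement with the coefficients displayed in (5).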

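Proof. In view of the Maclaurin series (3) of ln(1 + x), we can let

\[ x \ln\left(1 + \frac{1}{x}\right) = 1 + \ln\left(1 + \sum_{j=1}^{q} \frac{a_j}{x^{j}}\right) + O(x^{-q-1}) \quad \text{for } x \to \infty \text{ and } q \in \mathbb{N}, \]

where a_1, . . . , a_q are real numbers to be determined. From the fundamental theorem of algebra, we see that there exist unique complex numbers x_1, . . . , x_q such that

\[ 1 + \frac{a_1}{x} + \cdots + \frac{a_q}{x^{q}} = \left(1 + \frac{x_1}{x}\right) \cdots \left(1 + \frac{x_q}{x}\right). \tag{10} \]

By using the following series expansion:

\[ \ln\left(1 + \frac{z}{x}\right) = \sum_{j=1}^{q} \frac{(-1)^{j-1} z^{j}}{j\, x^{j}} + O(x^{-q-1}) \quad \text{for } |z| < |x| \text{ and } x \to \infty, \]

we obtain

\[ \ln\left(1 + \frac{a_1}{x} + \cdots + \frac{a_q}{x^{q}}\right) = \sum_{j=1}^{q} \frac{(-1)^{j-1} S_j}{j\, x^{j}} + O(x^{-q-1}) \quad \text{for } x \to \infty, \tag{11} \]

where

\[ S_j = x_1^{j} + \cdots + x_q^{j} \quad \text{for } j = 1, \ldots, q. \]

Replacing x by 1/x in (3) and multiplying the resulting equation by x, we get

\[ x \ln\left(1 + \frac{1}{x}\right) = 1 - \sum_{j=1}^{q} \frac{(-1)^{j-1}}{(j+1)\, x^{j}} + O(x^{-q-1}) \quad \text{for } x \to \infty. \tag{12} \]

We then find from (11) and (12) that

\[ S_j = -\frac{j}{j+1} \quad \text{for } j = 1, \ldots, q, \tag{13} \]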


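that is,

\[ \begin{aligned} x_1 + \cdots + x_q &= -\frac{1}{2}, \\ x_1^{2} + \cdots + x_q^{2} &= -\frac{2}{3}, \\ &\ \,\vdots \\ x_1^{q} + \cdots + x_q^{q} &= -\frac{q}{q+1}. \end{aligned} \tag{14} \]

Let

\[ P_q(x) = x^{q} + b_1 x^{q-1} + \cdots + b_{q-1} x + b_q \]

be a polynomial with zeros x_1, . . . , x_q satisfying the system of equations (14). So we have

\[ P_q(x) = (x - x_1) \cdots (x - x_q). \tag{15} \]

The Newton formulas (see, e.g., [4] and references therein) give the connection between the coefficients b_j and the power sums S_j:

\[ S_j + S_{j-1} b_1 + S_{j-2} b_2 + \cdots + S_1 b_{j-1} + j b_j = 0 \quad \text{for } j = 1, \ldots, q. \]

It is known [4] that b_j can be expressed in terms of S_j:

\[ b_j = \sum_{(k_1, k_2, \ldots, k_j) \in A_j} \frac{(-1)^{k_1 + k_2 + \cdots + k_j}}{k_1!\, k_2! \cdots k_j!} \left(\frac{S_1}{1}\right)^{k_1} \left(\frac{S_2}{2}\right)^{k_2} \cdots \left(\frac{S_j}{j}\right)^{k_j}, \tag{16} \]

where the A_j (for j ∈ N) are given in (7).

From (15), we obtain

\[ \frac{(-1)^{q}}{x^{q}} P_q(-x) = \left(1 + \frac{x_1}{x}\right) \cdots \left(1 + \frac{x_q}{x}\right). \]

We thus have

\[ 1 + \frac{(-1) b_1}{x} + \frac{(-1)^{2} b_2}{x^{2}} + \cdots + \frac{(-1)^{q-1} b_{q-1}}{x^{q-1}} + \frac{(-1)^{q} b_q}{x^{q}} = \left(1 + \frac{x_1}{x}\right) \cdots \left(1 + \frac{x_q}{x}\right). \tag{17} \]

We see from (10) and (17) that the coefficients a_j are given by

\[ a_j = (-1)^{j} b_j = (-1)^{j} \sum_{(k_1, k_2, \ldots, k_j) \in A_j} \frac{(-1)^{k_1 + k_2 + \cdots + k_j}}{k_1!\, k_2! \cdots k_j!} \left(\frac{S_1}{1}\right)^{k_1} \left(\frac{S_2}{2}\right)^{k_2} \cdots \left(\frac{S_j}{j}\right)^{k_j}, \tag{18} \]

where the S_j are given in (13). Finally, substituting the expression (13) into (18) yields (9). This completes the proof.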
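The proof also suggests a way to compute the coefficients without enumerating partitions: solve the Newton formulas recursively for b_j with S_j = −j/(j + 1), then set a_j = (−1)^j b_j. The sketch below is one possible realization of that idea, not a procedure stated in the paper; the function name and the convention b_0 = 1 are assumptions made for the illustration.

```python
from fractions import Fraction


def coefficients_via_newton(q: int):
    """Return [a_0, a_1, ..., a_q] using S_j = -j/(j+1) from (13),
    the Newton formulas for b_j, and a_j = (-1)**j * b_j from (18)."""
    # S[j] for j = 1..q; index 0 is a placeholder so indices match the text.
    S = [None] + [Fraction(-j, j + 1) for j in range(1, q + 1)]
    b = [Fraction(1)]  # b_0 = 1 (leading coefficient of P_q), an assumed convention
    for j in range(1, q + 1):
        # Newton formula: S_j*b_0 + S_{j-1}*b_1 + ... + S_1*b_{j-1} + j*b_j = 0
        acc = sum(S[j - i] * b[i] for i in range(j))
        b.append(-acc / j)
    return [(-1) ** j * b[j] for j in range(q + 1)]


coeffs = coefficients_via_newton(5)
```

The recursion needs only O(q^2) rational operations, whereas the sum in (9) has p(j) terms for each j.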


Remark. Here we give explicit numerical values of the first few a_j by using the partition set (7) and the formula (9). This shows how easily we can determine the a_j in (9). Obviously,

\[ a_1 = -\sum_{k_1 = 1} \frac{1}{k_1!} \left(\frac{1}{2}\right)^{k_1} = -\frac{1}{2}. \]

For k_1 + 2k_2 = 2, since p(2) = 2, the partition set A_2 in (7) is seen to have two elements:

\[ A_2 = \{(0, 1), (2, 0)\}. \]

From (9), we have

\[ a_2 = \sum_{(k_1, k_2) \in A_2} \frac{1}{k_1!\, k_2!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} = \frac{11}{24}. \]

For k_1 + 2k_2 + 3k_3 = 3, since p(3) = 3, as above, the partition set A_3 in (7) contains three elements:

\[ A_3 = \{(0, 0, 1), (1, 1, 0), (3, 0, 0)\}. \]

We then find from (9) that

\[ a_3 = -\sum_{(k_1, k_2, k_3) \in A_3} \frac{1}{k_1!\, k_2!\, k_3!} \left(\frac{1}{2}\right)^{k_1} \left(\frac{1}{3}\right)^{k_2} \left(\frac{1}{4}\right)^{k_3} = -\frac{7}{16}. \]

Likewise, the partition sets A_4 and A_5 have 5 = p(4) and 7 = p(5) elements, respectively, and so

\[ A_4 = \{(0, 0, 0, 1), (1, 0, 1, 0), (0, 2, 0, 0), (2, 1, 0, 0), (4, 0, 0, 0)\} \]

and

\[ A_5 = \{(0, 0, 0, 0, 1), (1, 0, 0, 1, 0), (0, 1, 1, 0, 0), (2, 0, 1, 0, 0), (1, 2, 0, 0, 0), (3, 1, 0, 0, 0), (5, 0, 0, 0, 0)\}, \]

which yields

\[ a_4 = \frac{2447}{5760} \quad \text{and} \quad a_5 = -\frac{959}{2304}. \]

We note that the explicit numerical values of a_j (for j = 1, 2, 3, 4, 5) here correspond with the coefficients of 1/x^j (for j = 1, 2, 3, 4, 5) in (5), respectively.

By using (8), we find that

\[ \frac{1}{2}\left[\left(1 + \frac{1}{x}\right)^{x} + \left(1 - \frac{1}{x}\right)^{-x}\right] = e \sum_{j=0}^{\infty} \frac{\left(1 + (-1)^{j}\right) a_j}{2x^{j}} \quad \text{for } x \to \infty, \tag{19} \]

where the a_j (for j ∈ N_0) are given in (9).


ACKNOWLEDGMENTS. Thanks to the Editor, Professor Scott Chapman, for his several enduring encouragements to improve the exposition of this note and to the anonymous referees for their constructive comments. Thanks also to Professor Jack R. Quine and Professor Bettye Anne Case of Florida State University for their help in improving the exposition of this note. This research was supported by the Basic Science Research Program through the National Research Foundation of Korea funded by the Ministry of Education, Science and Technology of the Republic of Korea (2012-0002957).

REFERENCES

1. G. Arfken, Mathematical Methods for Physicists. Third edition. Academic Press, New York, 1985.
2. H. J. Brothers, J. A. Knox, New closed-form approximations to the logarithmic constant e, Math. Intelligencer 20 (1998) 25–29.
3. H. T. Davis, Tables of the Mathematical Functions. Vol. I, The Principia Press of Trinity University, San Antonio, Texas, 1963.
4. H. W. Gould, The Girard-Waring power sum formulas for symmetric functions and Fibonacci sequences, Fibonacci Quart. 37 (1999) 135–140.
5. J. A. Knox, H. J. Brothers, Novel series-based approximations to e, College Math. J. 30 (1999) 269–275.
6. E. Maor, e: The Story of a Number. Princeton University Press, Princeton, New Jersey, 1994.
7. Wikipedia contributors, Partition (number theory), Wikipedia, The Free Encyclopedia, available at http://en.wikipedia.org/wiki/Partition_function_(number_theory)#Partition_function.
8. J. Sondow, E. W. Weisstein, “e.” From MathWorld, A Wolfram Web Resource, available at http://mathworld.wolfram.com/e.html.
9. N. J. A. Sloane, a(n) = number of partitions of n (the partition numbers). Maintained by The OEIS Foundation, available at http://oeis.org/A000041.

CHAO-PING CHEN received his Bachelor of Science degree from Henan Normal University (China) in 1986 and his Master of Science degree from Southwest Jiaotong University (China) in 1995. He currently teaches at Henan Polytechnic University (Jiaozuo) in China.
School of Mathematics and Informatics, Henan Polytechnic University, Jiaozuo City 454003, Henan Province, [email protected]

JUNESANG CHOI received his B.A. from Gyeongsang National University (Republic of Korea) in 1981 and his Ph.D. from the Florida State University in 1991. He currently teaches at Dongguk University (Gyeongju) in the Republic of Korea. See http://wwwk.dongguk.ac.kr/~junesang/.
Department of Mathematics, Dongguk University, Gyeongju 780-714, Republic of Korea
[email protected]


Stirling’s Approximation for Central Extended Binomial Coefficients
Author(s): Steffen Eger
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 344-349
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.344


NOTES
Edited by Sergei Tabachnikov

Stirling’s Approximation for Central Extended Binomial Coefficients

Steffen Eger

Abstract. We derive asymptotic formulas for central extended binomial coefficients, which are generalizations of binomial coefficients, using the distribution of the sum of independent discrete uniform random variables with the Central Limit Theorem and a local limit variant.

1. STIRLING’S FORMULA AND CENTRAL BINOMIAL COEFFICIENTS. For a nonnegative integer k, Stirling’s formula,

\[ k! \sim \sqrt{2\pi k} \left(\frac{k}{e}\right)^{k}, \]

where e is Euler’s number, yields an approximation of the central binomial coefficient \(\binom{k}{k/2}\) using \(\binom{k}{m} = \frac{k!}{m!\,(k-m)!}\) as

\[ \binom{k}{k/2} \sim \frac{2^{k+1}}{\sqrt{2\pi k}}, \]

where we write a_k ∼ b_k as short-hand for lim_{k→∞} a_k/b_k = 1. In our current note, we derive asymptotic formulas for central extended binomial, or polynomial, coefficients (cf. [2, 3, 7]). These coefficients appear in the extended binomial triangles (which we also call (l + 1)-nomial, polynomial, or multinomial triangles [8]), which are generalizations of binomial, or Pascal, triangles, where entries in row k are defined as coefficients of the polynomial (1 + x + x^2 + · · · + x^l)^k for l ≥ 0. Our derivation is not based upon asymptotics of factorials, but upon the limiting distribution of the sum of discrete uniform random variables.¹
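The asymptotic equivalence for the central binomial coefficient can be observed numerically. This is a minimal sketch; `central_binomial_ratio` is an illustrative name, k is assumed even, and floating point limits the check to moderate k (roughly k ≤ 1000).

```python
import math


def central_binomial_ratio(k: int) -> float:
    """Ratio of the exact central binomial coefficient C(k, k/2) to the
    Stirling-type estimate 2**(k+1) / sqrt(2*pi*k); k is assumed even."""
    exact = math.comb(k, k // 2)
    estimate = 2 ** (k + 1) / math.sqrt(2 * math.pi * k)
    return exact / estimate


ratio_small = central_binomial_ratio(10)
ratio_large = central_binomial_ratio(1000)
```

The ratio stays below 1 and moves toward 1 as k grows, as the relation a_k ∼ b_k requires.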

2. EXTENDED BINOMIAL TRIANGLES. In generalization to binomial triangles, (l + 1)-nomial triangles, for l ≥ 0, are defined in the following way. Starting with a 1 in row zero, construct an entry in row k, for k ≥ 1, by adding the overlying (l + 1) entries in row (k − 1) (some of these entries are taken as zero if not defined); thereby, row k has (kl + 1) entries. For example, the binomial (l = 1), trinomial (l = 2), and quadrinomial triangles (l = 3) start as follows,

http://dx.doi.org/10.4169/amer.math.monthly.121.04.344
MSC: Primary 11B65, Secondary 11N37; 60G50

¹Throughout, we assume that all fractional values such as x = kl/2 are integral when used in the context of extended binomial coefficients. If this is not the case, then replace respective quantities with their floor, ⌊x⌋, the largest integer less than or equal to x.


Binomial (l = 1):
1
1 1
1 2 1
1 3 3 1

Trinomial (l = 2):
1
1 1 1
1 2 3 2 1
1 3 6 7 6 3 1

Quadrinomial (l = 3):
1
1 1 1 1
1 2 3 4 3 2 1
1 3 6 10 12 12 10 6 3 1
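The construction rule for the triangles translates directly into a short program. The following is a sketch only; `extended_triangle` is a hypothetical name, and entries outside the previous row are treated as zero, as the text prescribes.

```python
def extended_triangle(l: int, rows: int):
    """Return rows 0 .. rows-1 of the (l+1)-nomial triangle.

    Each entry of row k is the sum of the (l+1) overlying entries of
    row k-1, with missing entries treated as zero."""
    triangle = [[1]]
    for k in range(1, rows):
        prev = triangle[-1]
        width = k * l + 1  # row k has k*l + 1 entries
        row = []
        for n in range(width):
            # Add prev[n-l], ..., prev[n], skipping indices outside prev.
            row.append(sum(prev[m] for m in range(n - l, n + 1) if 0 <= m < len(prev)))
        triangle.append(row)
    return triangle


binomial = extended_triangle(1, 4)
trinomial = extended_triangle(2, 4)
quadrinomial = extended_triangle(3, 4)
```

For l = 1 this is the familiar Pascal rule; for larger l the summation window simply widens.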

In the (l + 1)-nomial triangle, the nth entry, for 0 ≤ n ≤ kl in row k, which we denote by \(\binom{k}{n}_{l+1}\), has the following interpretation. It is the coefficient of x^n in the expansion of

\[ (1 + x + x^{2} + \cdots + x^{l})^{k} = \sum_{n=0}^{kl} \binom{k}{n}_{l+1} x^{n}. \tag{1} \]

It has been shown that \(\binom{k}{n}_{l+1}\) denotes the number of restricted integer compositions (for a definition, see, e.g., [9] and many others) of the nonnegative integer n with k parts π_1, . . . , π_k, each from the set {0, 1, . . . , l} (cf. [5]), and allows the following representation,

\[ \binom{k}{n}_{l+1} = \sum_{\substack{k_0 \ge 0, \ldots, k_l \ge 0 \\ k_0 + \cdots + k_l = k \\ 0 \cdot k_0 + 1 \cdot k_1 + \cdots + l \cdot k_l = n}} \binom{k}{k_0, \ldots, k_l}, \tag{2} \]

where \(\binom{k}{k_0, \ldots, k_l}\) is a multinomial coefficient, defined as \(\frac{k!}{k_0! \cdots k_l!}\), for nonnegative integers k_0, . . . , k_l. We can verify representation (2) by noting that for real numbers x_0, . . . , x_l, the multinomial theorem (cf. [15]) states that

\[ (x_0 + x_1 + \cdots + x_l)^{k} = \sum_{\substack{k_0 \ge 0, \ldots, k_l \ge 0 \\ k_0 + \cdots + k_l = k}} \binom{k}{k_0, \ldots, k_l} x_0^{k_0} \cdots x_l^{k_l}. \]

Thus, setting x_i = x^i for i = 0, . . . , l,

\[ (1 + x + x^{2} + \cdots + x^{l})^{k} = \sum_{\substack{k_0 \ge 0, \ldots, k_l \ge 0 \\ k_0 + \cdots + k_l = k}} \binom{k}{k_0, \ldots, k_l} x^{0 \cdot k_0 + \cdots + l \cdot k_l}, \tag{3} \]

so that comparing coefficients of the right-hand sides of (1) and (3) leads to (2).
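Representation (2) can be checked against the polynomial expansion (1) for small parameters. This is a brute-force sketch meant only for small k and l; the helper names `poly_coeffs`, `compositions`, and `via_multinomials` are assumptions introduced for the illustration.

```python
from math import factorial


def poly_coeffs(k: int, l: int):
    """Coefficients of (1 + x + ... + x**l)**k, lowest degree first, as in (1)."""
    coeffs = [1]
    for _ in range(k):
        new = [0] * (len(coeffs) + l)
        for i, c in enumerate(coeffs):
            for s in range(l + 1):
                new[i + s] += c  # multiply by x**s and accumulate
        coeffs = new
    return coeffs


def compositions(total: int, parts: int):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def via_multinomials(k: int, n: int, l: int) -> int:
    """Right-hand side of (2): sum of k!/(k_0! ... k_l!) over all
    k_0 + ... + k_l = k with 0*k_0 + 1*k_1 + ... + l*k_l = n."""
    total = 0
    for ks in compositions(k, l + 1):
        if sum(s * ks[s] for s in range(l + 1)) == n:
            coeff = factorial(k)
            for count in ks:
                coeff //= factorial(count)  # each division is exact
            total += coeff
    return total
```

For instance, with k = 3 and l = 2 the entry for n = 3 is 1 + 6 = 7, coming from (k_0, k_1, k_2) = (0, 3, 0) and (1, 1, 1).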

3. GENERALIZED STIRLING’S APPROXIMATION. Our strategy for deriving approximation formulas for central extended binomial coefficients is as follows. First, we determine the asymptotic distribution of the sum of discrete uniform variables, which we easily find to be a normal distribution by the Central Limit Theorem (CLT). Then, we determine the exact distribution, which turns out to yield the normalized extended binomial coefficients \(\binom{k}{n}_{l+1}\). By relating the density of the asymptotic distribution to the density of the exact distribution (e.g., via a ‘local limit’ argument), we obtain an extended binomial analogue of Stirling’s approximation to central binomial coefficients.

3.1. Step 1: Asymptotic distribution of the sum of discrete uniform variables. Let k be a positive integer and let l be a nonnegative integer. Let X_j, for j = 1, . . . , k,


be identically and independently distributed random draws from the discrete uniform distribution on the set {0, . . . , l}, and let S_k be their sum,

\[ S_k = \sum_{j=1}^{k} X_j. \]

Obviously, by standard moments of the uniform distribution, the mean and variance of each X_j are given by

\[ \mu = E[X_j] = \frac{l}{2}, \quad \text{and} \quad \sigma^{2} = \mathrm{Var}[X_j] = \frac{(l+1)^{2} - 1}{12}. \]

Hence, by independent and identical distribution of X_1, . . . , X_k, and application of the CLT, the random variable \(\sqrt{k}\,(S_k/k - \mu)\) converges, as k → ∞, in distribution to a normally N(0, σ^2) distributed random variable. Recall that convergence in distribution precisely means that the cumulative distribution function of \(\sqrt{k}\,(S_k/k - \mu)\) converges pointwise to the cumulative distribution function of the N(0, σ^2) distribution.

3.2. Step 2: Exact distribution of the sum of discrete uniform random variables. We now determine exactly the probability that $S_k$ takes on the integer value $n$, for $0 \le n \le kl$. To do so, we consider ‘isomorphic copies’ $\mathbf{X}_j$ of $X_j$ (written in boldface), which are independently and identically multinomially distributed with probabilities $p_0 = \cdots = p_l = \frac{1}{l+1}$ of types $0$ to $l$. Each $\mathbf{X}_j = (A_0, \ldots, A_l)$ is vector-valued, with $P[\mathbf{X}_j = (a_0, \ldots, a_l)] = \frac{1}{l+1}$ for nonnegative integers $a_s$ with $a_0 + \cdots + a_l = 1$, where $A_s$ denotes the number of times an event of type $s$, for $s = 0, \ldots, l$, occurs. Then the sum $\mathbf{S}_k = \mathbf{X}_1 + \cdots + \mathbf{X}_k$ represents the outcome of drawing, with replacement, $k$ balls of $l + 1$ different types from a bag, where the probability of drawing type $s = 0, \ldots, l$ is $\frac{1}{l+1}$. Thus, by the standard interpretation of the multinomial distribution, $\mathbf{S}_k$ has density

$$P[\mathbf{S}_k = (a_0, \ldots, a_l)] = P[A_0 = a_0, \ldots, A_l = a_l] = \binom{k}{a_0, \ldots, a_l} \left(\frac{1}{l+1}\right)^k,$$

where $a_0 + \cdots + a_l = k$ for nonnegative integers $a_0, \ldots, a_l$. Then, if $\mathbf{S}_k = (a_0, \ldots, a_l)$, the variable $S_k$ corresponding to $\mathbf{S}_k$ represents the integer $0 \cdot a_0 + \cdots + l \cdot a_l$. Thus, for $n$ such that $0 \le n \le kl$,

$$\begin{aligned}
P[S_k = n] &= \sum_{\substack{a_0 \ge 0, \ldots, a_l \ge 0 \\ a_0 + \cdots + a_l = k \\ 0 \cdot a_0 + \cdots + l \cdot a_l = n}} P[\mathbf{S}_k = (a_0, \ldots, a_l)] \\
&= \left(\frac{1}{l+1}\right)^k \sum_{\substack{a_0 \ge 0, \ldots, a_l \ge 0 \\ a_0 + \cdots + a_l = k \\ 0 \cdot a_0 + \cdots + l \cdot a_l = n}} \binom{k}{a_0, \ldots, a_l} = \left(\frac{1}{l+1}\right)^k \binom{k}{n}_{l+1},
\end{aligned}$$

using representation (2).

An arguably more straightforward derivation of the exact distribution of $S_k$, making use of probability generating functions (pgfs), can be given by noting that the pgf $G_{X_j}(x) = \sum_{n \ge 0} P[X_j = n]\, x^n$ of each $X_j$ is given by

$$G_{X_j}(x) = \frac{1}{l+1} \sum_{n=0}^{l} x^n.$$

Hence, by independence of $X_1, \ldots, X_k$, the pgf of $S_k$ is given by

$$G_{S_k}(x) = G_{X_1}(x) \cdots G_{X_k}(x) = \left(\frac{1}{l+1}\right)^k \left(\sum_{n=0}^{l} x^n\right)^k = \left(\frac{1}{l+1}\right)^k \sum_{n=0}^{kl} \binom{k}{n}_{l+1} x^n.$$

Thus,

$$P[S_k = n] = \frac{G_{S_k}^{(n)}(0)}{n!} = \left(\frac{1}{l+1}\right)^k \frac{n!}{n!} \binom{k}{n}_{l+1} = \left(\frac{1}{l+1}\right)^k \binom{k}{n}_{l+1},$$

where, by $G_X^{(n)}(0)$, we denote the $n$th derivative of $G_X$, evaluated at zero.
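The exact distribution can be confirmed by enumerating all $(l+1)^k$ equally likely draws for small parameters. This is only an illustrative sketch under that brute-force assumption; the helper names `extended_binomial_row` and `exact_distribution` are hypothetical and do not come from the note.

```python
from fractions import Fraction
from itertools import product


def extended_binomial_row(k, l):
    """Coefficients of (1 + x + ... + x^l)^k, built by k successive convolutions."""
    row = [1]
    for _ in range(k):
        row = [sum(row[j] for j in range(max(0, n - l), min(n, len(row) - 1) + 1))
               for n in range(len(row) + l)]
    return row


def exact_distribution(k, l):
    """P[S_k = n] for n = 0..k*l, by enumerating every sequence of k draws from {0..l}."""
    counts = [0] * (k * l + 1)
    for draws in product(range(l + 1), repeat=k):
        counts[sum(draws)] += 1
    return [Fraction(c, (l + 1) ** k) for c in counts]


k, l = 5, 3
formula = [Fraction(c, (l + 1) ** k) for c in extended_binomial_row(k, l)]
assert exact_distribution(k, l) == formula  # matches (1/(l+1))^k times the coefficient
assert sum(formula) == 1                    # the probabilities sum to one
```

Exact rational arithmetic via `Fraction` avoids any rounding question in the comparison.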

3.3. Step 3: Local limit theorem. To derive an asymptotic formula for $\binom{k}{n}_{l+1}$, we would like to make use of the results derived in Steps 1 and 2 above. Ideally, we would like to equate the probability density function of the asymptotic normal distribution of $S_k$ with the exact distribution. However, as mentioned, convergence in distribution, as assured by the CLT, only guarantees pointwise convergence of cumulative distribution functions. In contrast, ‘local limit theorems’ describe how the probability density function of a sum of random variables approaches the normal density function. For integer-valued random variables (also called lattice or arithmetical distributions), Gnedenko and Kolmogorov [10] provide the following result.

Theorem 3.1 (see [10, p. 233]). If $X_1, X_2, \ldots$ are independent lattice random variables with identical distribution with finite mean $\mu$ and variance $\sigma^2$, such that the greatest common divisor of the differences of all the values of $X_j$ taken with positive probability is 1, then

$$\left| \sqrt{k}\, \sigma\, P[S_k = n] - \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(n - k\mu)^2}{2\sigma^2 k}} \right| \to 0$$

uniformly in $n$ as $k \to \infty$.

Since, in our situation, the set of values of each $X_j$ taken with positive probability is $\{0, 1, \ldots, l\}$, the greatest common divisor of the differences is clearly 1. Thus, all assumptions of Theorem 3.1 are satisfied in our case, and hence its conclusion holds. Therefore, the following approximation is suggested for large $k$:

$$\sqrt{k}\, \sigma\, P[S_k = n] \sim \frac{1}{\sqrt{2\pi}}\, e^{-\frac{(n - k\mu)^2}{2\sigma^2 k}}. \tag{4}$$
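The uniform convergence behind (4) can be observed numerically. The sketch below, with the hypothetical helper names `extended_binomial_row` and `local_limit_error`, computes the largest deviation over all $n$ between the two sides of (4) for $l = 4$; it illustrates the statement for a few values of $k$ and proves nothing.

```python
from math import exp, pi, sqrt


def extended_binomial_row(k, l):
    """Coefficients of (1 + x + ... + x^l)^k, built by k successive convolutions."""
    row = [1]
    for _ in range(k):
        row = [sum(row[j] for j in range(max(0, n - l), min(n, len(row) - 1) + 1))
               for n in range(len(row) + l)]
    return row


def local_limit_error(k, l):
    """Maximum over n of the absolute difference between the two sides of (4)."""
    mu = l / 2
    sigma2 = ((l + 1) ** 2 - 1) / 12
    worst = 0.0
    for n, c in enumerate(extended_binomial_row(k, l)):
        exact = sqrt(k * sigma2) * c / (l + 1) ** k
        normal = exp(-((n - k * mu) ** 2) / (2 * sigma2 * k)) / sqrt(2 * pi)
        worst = max(worst, abs(exact - normal))
    return worst


for k in (5, 10, 20, 40):
    print(k, local_limit_error(k, 4))
```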


For $n = k\mu = kl/2$, the argument of the exponential function is zero, and thus

$$\sqrt{k}\, \sigma\, P[S_k = kl/2] \sim \frac{1}{\sqrt{2\pi}}, \quad \text{or equivalently,} \quad P[S_k = kl/2] \sim \frac{1}{\sqrt{2\pi\sigma^2 k}}.$$

Using the exact form for $P[S_k = n]$ from Step 2 above and bringing the normalizing term $(l+1)^k$ to the right-hand side, we hence have

$$\binom{k}{kl/2}_{l+1} \sim \frac{(l+1)^k}{\sqrt{2\pi k\, \frac{(l+1)^2 - 1}{12}}}. \tag{5}$$

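For example, for $l = 1$ (Pascal’s case), $l = 2$, $l = 3$, and $l = 4$, we therefore have the approximations

$$\binom{k}{k/2} \sim \frac{2^{k+1}}{\sqrt{2\pi k}}, \quad \binom{k}{k}_3 \sim \frac{3^k}{\sqrt{\frac{4}{3}\pi k}}, \quad \binom{k}{\frac{3}{2}k}_4 \sim \frac{4^k}{\sqrt{\frac{5}{2}\pi k}}, \quad \text{and} \quad \binom{k}{2k}_5 \sim \frac{5^k}{2\sqrt{\pi k}}.$$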
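As a rough numerical illustration of (5), one can compare the exact central coefficient with the approximation for a moderately large $k$. The names `extended_binomial_row` and `central_approximation` below are hypothetical, and $k = 50$ is an arbitrary choice made so that $kl/2$ is an integer for every $l$.

```python
from math import pi, sqrt


def extended_binomial_row(k, l):
    """Coefficients of (1 + x + ... + x^l)^k, built by k successive convolutions."""
    row = [1]
    for _ in range(k):
        row = [sum(row[j] for j in range(max(0, n - l), min(n, len(row) - 1) + 1))
               for n in range(len(row) + l)]
    return row


def central_approximation(k, l):
    """Right-hand side of (5)."""
    sigma2 = ((l + 1) ** 2 - 1) / 12
    return (l + 1) ** k / sqrt(2 * pi * k * sigma2)


k = 50  # even, so that k*l/2 is an integer for every l
for l in (1, 2, 3, 4):
    exact = extended_binomial_row(k, l)[k * l // 2]
    print(l, exact / central_approximation(k, l))  # ratios close to 1
```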

In Figure 1, we show for $l = 4$ the distributions $P[S_k = n]$ for $k = 5, 10, 20$, and their respective normal approximations. There, we can see the local limit theorem ‘at work’: The exact density function apparently approaches, pointwise, the normal density function.

Figure 1. Distributions $P[S_k = n]$ for $k = 5, 10, 20$ for $l = 4$ fixed, and normal approximations.


4. DISCUSSION. Although extended binomial coefficients, together with their connection to the sum of discrete uniform random variables, go back at least to De Moivre’s Doctrine of Chances [4] and to Euler’s [6] analytical study of the coefficients of polynomial (1), the mathematics community has apparently more or less ignored their systematic study, except for a few recent publications such as [1, 2, 5, 7, 8]. Next, using the CLT (or a local limit variant) to deduce asymptotics of mathematical objects has been suggested, for example, by Walsh [14], who derives Stirling’s formula for factorials by equating the distribution of the sum of Poisson distributed random variables with the normal density. Finally, the asymptotics of both the central binomial ($l = 1$) and the central trinomial coefficients ($l = 2$) seem to be known (e.g., [7, 13]), while the general formula (5) is, to the best of our knowledge, novel. However, Ratsaby [12] derives our general result (4), as an estimate of the number of restricted integer compositions, by application of Cauchy’s coefficient formula to the polynomial (1) and computation of the resulting integral by Laplace’s method for evaluation of integrals. A historical perspective on local versus central limit theorems is provided by McDonald [11].

ACKNOWLEDGMENT. The author would like to thank the anonymous reviewers for helpful comments.

REFERENCES

1. R. C. Bollinger, C. L. Burchard, Lucas’s theorem and some related results for extended Pascal triangles, Amer. Math. Monthly 97 no. 3 (1990) 198–204.

2. C. C. S. Caiado, P. N. Rathie, Polynomial coefficients and distribution of the sum of discrete uniform variables, in Eighth Annual Conference of the Society of Special Functions and their Applications. Edited by A. M. Mathai, M. A. Pathan, K. K. Jose, and J. Jacob, Pala, India, 2007.

3. L. Comtet, Advanced Combinatorics: The Art of Finite and Infinite Expansions. D. Reidel Publishing Company, Dordrecht, 1974.

4. A. De Moivre, The Doctrine of Chances: Or, A Method of Calculating the Probabilities of Events in Play. Reprint of the third (1756) edition. Chelsea, New York, 1967.

5. S. Eger, Restricted weighted integer compositions and extended binomial coefficients, J. Integer Seq., 16 (2013).

6. L. Euler, De evolutione potestatis polynomialis cuiuscunque $(1 + x + x^2 + \cdots)^n$. Nova Acta Academiae Scientarum Imperialis Petropolitinae 12 (1801), available at http://math.dartmouth.edu/~euler/.

7. N.-E. Fahssi, The polynomial triangles revisited (2012), available at http://arxiv.org/abs/1202.0228.

8. D. C. Fielder, C. O. Alford, Pascal’s triangle: Top gun or just one of the gang?, in Applications of Fibonacci Numbers. Edited by G. E. Bergum, A. N. Philippou, A. F. Horadam, Kluwer, Dordrecht, 1991.

9. P. Flajolet, R. Sedgewick, Analytic Combinatorics. Cambridge University Press, Cambridge, 2009.

10. B. V. Gnedenko, A. N. Kolmogorov, Limit Distributions for Sums of Independent Random Variables. Second edition. Addison-Wesley, Cambridge, MA, 1968.

11. D. R. McDonald, The local limit theorem: A historical perspective, JIRSS 4 (2005) 73–86.

12. J. Ratsaby, Estimate of the number of restricted integer-partitions, Appl. Anal. Discrete Math 2 (2008) 222–233.

13. The On-Line Encyclopedia of Integer Sequences, available at http://oeis.org, 2012, Sequence A002426.

14. D. P. Walsh, Equating Poisson and normal probability functions to derive Stirling’s formula, Amer. Statist. 49 (1995) 270–271.

15. E. Weisstein, Multinomial Series—From MathWorld, A Wolfram Web Resource, available at http://mathworld.wolfram.com/MultinomialSeries.html.

Economics Department, Goethe University Frankfurt am Main, [email protected]

Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 350-352
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.350

A New Proof of Stirling’s Formula

Thorsten Neuschel

Abstract. A new, simple proof of Stirling’s formula via the partial fraction expansion for the tangent function is presented.

1. INTRODUCTION. Various proofs of Stirling’s formula

$$n! \sim n^n e^{-n} \sqrt{2\pi n}, \quad \text{as } n \to \infty, \tag{1.1}$$

have been established in the literature since the days of de Moivre and Stirling in 1730 (for a historical exposition see, e.g., [1]). Many of these proofs show that the limit

$$\lim_{n \to \infty} \frac{n!}{n^n e^{-n} \sqrt{n}}$$

exists (for instance, via the Euler–Maclaurin formula) in order to identify this limit by using the asymptotic behavior of the Wallis product, which is the crucial step. We will show that this last, quite wily, step can be replaced by a simple, straightforward computation of the limit using only the partial fraction expansion for the tangent function

$$\pi \tan \pi x = \sum_{\nu=0}^{\infty} \frac{2x}{(\nu + \frac{1}{2})^2 - x^2}. \tag{1.2}$$

This expansion was probably found by Euler by the time Stirling determined his proof via Wallis’ formula, see, e.g., [6, p. 327]. For some alternative elementary proofs of Stirling’s formula see, e.g., [1, 2, 4, 5, 7].

2. PROOF. An application of the well-known Euler–Maclaurin formula in its simplest form (see, e.g., [8, p. 37, (6.21)]) yields

$$\log n! = n \log n - n + 1 + \log \sqrt{n} + \int_0^{n-1} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx.$$

In order to prove (1.1), it is sufficient to show

$$\int_0^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx = \log \sqrt{2\pi} - 1. \tag{2.1}$$

To prove this, we will show directly the identity

$$\int_0^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx = \int_0^{1/2} \left( \frac{8x^2}{1 - 4x^2} - \pi x \tan \pi x \right) dx, \tag{2.2}$$

http://dx.doi.org/10.4169/amer.math.monthly.121.04.350
MSC: Primary 41A60

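where the integral on the right-hand side can be evaluated by elementary calculus. We start our computation with

$$\begin{aligned}
\int_0^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx &= \sum_{\nu=0}^{\infty} \left\{ \int_{\nu}^{\nu + 1/2} \frac{x - \nu - \frac{1}{2}}{1 + x}\, dx + \int_{\nu + 1/2}^{\nu + 1} \frac{x - \nu - \frac{1}{2}}{1 + x}\, dx \right\} \\
&= \sum_{\nu=0}^{\infty} \left\{ \int_0^{1/2} \frac{x - \frac{1}{2}}{1 + \nu + x}\, dx + \int_0^{1/2} \frac{x}{\frac{3}{2} + \nu + x}\, dx \right\}.
\end{aligned}$$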

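By an easy change of variables, we observe that

$$\int_0^{1/2} \frac{x - \frac{1}{2}}{1 + \nu + x}\, dx = -\int_0^{1/2} \frac{x}{\frac{3}{2} + \nu - x}\, dx,$$

so that we obtain

$$\begin{aligned}
\int_0^{\infty} \frac{x - [x] - \frac{1}{2}}{1 + x}\, dx &= \sum_{\nu=0}^{\infty} \int_0^{1/2} \left( \frac{x}{\frac{3}{2} + \nu + x} - \frac{x}{\frac{3}{2} + \nu - x} \right) dx \\
&= \sum_{\nu=0}^{\infty} \int_0^{1/2} \frac{-2x^2}{(\nu + \frac{3}{2})^2 - x^2}\, dx \\
&= \int_0^{1/2} \sum_{\nu=1}^{\infty} \frac{-2x^2}{(\nu + \frac{1}{2})^2 - x^2}\, dx, \tag{2.3}
\end{aligned}$$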

where the interchange of summation and integration is allowed, due to the uniform convergence of the series in (2.3) on the interval $[0, \frac{1}{2}]$. Applying (1.2), we immediately obtain (2.2). At this point of the proof, we have reduced the problem of determining the constant in Stirling’s formula to a simple matter of elementary calculus, as the resulting integral in (2.2) can be evaluated easily. For convenience, we will give some details. For example, using the decomposition

$$\frac{8x^2}{1 - 4x^2} = \frac{1}{1 + 2x} + \frac{1}{1 - 2x} - 2$$

it can be rewritten as

$$\int_0^{1/2} \left( \frac{8x^2}{1 - 4x^2} - \pi x \tan \pi x \right) dx = \log \sqrt{2} - 1 + \int_0^{1/2} \left( \frac{1}{1 - 2x} - \pi x \tan \pi x \right) dx.$$

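Now, by a standard argument involving integration by parts, we observe for $0 < \varepsilon < 1/2$ that

$$\begin{aligned}
\int_0^{\varepsilon} \left( \frac{1}{1 - 2x} - \pi x \tan \pi x \right) dx &= -\frac{1}{2} \log(1 - 2\varepsilon) - \int_0^{\varepsilon} \pi x \tan \pi x \, dx \\
&= \left( \varepsilon - \frac{1}{2} \right) \log \cos(\pi \varepsilon) + \frac{1}{2} \log \frac{\cos(\pi \varepsilon)}{1 - 2\varepsilon} - \int_0^{\varepsilon} \log \cos(\pi x)\, dx.
\end{aligned}$$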


Letting $\varepsilon$ tend to $1/2$, we immediately obtain

$$\int_0^{1/2} \left( \frac{1}{1 - 2x} - \pi x \tan \pi x \right) dx = \log \sqrt{\pi} - \log \sqrt{2} - \int_0^{1/2} \log \cos(\pi x)\, dx.$$

The remaining integral on the right-hand side can be easily evaluated to $-\log \sqrt{2}$ as shown, e.g., in [2], [3]. This computation relies on the fact that its value, say $c$, remains unchanged if $\cos(\pi x)$ is replaced by $\sin(\pi x)$ so that we have (using the double angle formula)

$$\int_0^{1/2} \log \sin(2\pi x)\, dx = \log \sqrt{2} + 2 \int_0^{1/2} \log \sin(\pi x)\, dx.$$

As both integrals in the last equation coincide, we obtain $c = -\log \sqrt{2}$, which completes the proof of (1.1).
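The two closed-form values used in the proof can be sanity-checked by numerical quadrature. The following sketch uses a plain composite midpoint rule (the helper `midpoint` is hypothetical); the midpoint rule is chosen here only because it never evaluates the integrands at $x = 1/2$, where the individual terms are singular.

```python
from math import cos, log, pi, sqrt, tan


def midpoint(f, a, b, n):
    """Composite midpoint rule with n subintervals; f is never evaluated at a or b."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


# Right-hand side of (2.2); the two singular terms cancel as x -> 1/2.
rhs_22 = midpoint(lambda x: 8 * x * x / (1 - 4 * x * x) - pi * x * tan(pi * x),
                  0.0, 0.5, 20000)

# The value c of the log-cosine integral, which has an integrable singularity at 1/2.
c = midpoint(lambda x: log(cos(pi * x)), 0.0, 0.5, 200000)

print(rhs_22, log(sqrt(2 * pi)) - 1)  # both sides of (2.1)
print(c, -log(sqrt(2)))
```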

REFERENCES

1. P. Diaconis, D. Freedman, An elementary proof of Stirling’s formula, Amer. Math. Monthly 93 (1986) 123–125.

2. W. Feller, A direct proof of Stirling’s formula, Amer. Math. Monthly 74 (1967) 1223–1225.

3. W. Feller, Correction to “A direct proof of Stirling’s formula”, Amer. Math. Monthly 75 (1968) 518.

4. R. Michel, On Stirling’s formula, Amer. Math. Monthly 109 (2002) 388–390.

5. J. Patin, A very short proof of Stirling’s formula, Amer. Math. Monthly 96 (1989) 41–42.

6. R. Remmert, Theory of Complex Functions. Springer, New York, 1991.

7. H. Robbins, A remark on Stirling’s formula, Amer. Math. Monthly 62 (1955) 26–29.

8. R. Wong, Asymptotic Approximation of Integrals. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2001.

Department of Mathematics, University of Trier, D-54286 Trier, [email protected]

By 1914, the MONTHLY had outgrown its financial arrangements, and it was Slaught who turned to the American Mathematical Society to adopt the MONTHLY as an official journal. But American mathematics was growing as fast as the MONTHLY, and the Society was already plagued by factional disputes between the Eastern establishment (the Ivy league schools) and the Midwest (led by Chicago). Slaught’s request became a controversy. Should an organization dedicated to promoting mathematical research support a journal like the MONTHLY? Many, especially in the East (led by Osgood), thought it should not, and the AMS voted narrowly to give the MONTHLY a pat on the back rather than money.

A Century of Mathematics: Through the Eyes of the Monthly, Edited by John Ewing. Mathematical Association of America, Washington, DC, 1994, p. 4.

Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 353-354
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.353

Zeta(2) Once Again

Ralph M. Krause

Abstract. This note provides a strikingly efficient evaluation of zeta(2).

An article in the January 2012 MONTHLY [1] proved, in a manner that might have appealed to Euler, his famous result that $\zeta(2) = \sum_{1}^{\infty} 1/k^2 = \pi^2/6$. The argument there suggested the following, which makes the same claim and resembles the sixth in [2]. The Taylor series for $\log(1 + z)$ converges on the unit circle $z = e^{i\theta}$ for $z \ne -1$. Thus

$$\log(1 + z) = \sum_{1}^{\infty} (-1)^{k-1} z^k / k, \tag{1}$$

and

$$\log(1 + z^{-1}) = \sum_{1}^{\infty} (-1)^{k-1} z^{-k} / k. \tag{2}$$

To be convinced that equation (1) holds on the circle of convergence ($z = -1$ excepted) and not merely inside it, we argue thus. For $z$ in the first quadrant, $|1/(1 + z) - \sum_{0}^{N-1} (-z)^k| = |z^N/(1 + z)| \le |z^N|$. Integrating $1/(1 + z) - \sum_{0}^{N-1} (-z)^k$ and $|z^N|$ along a radius from $z = 0$ to $z = e^{i\theta}$ shows that $|\log(1 + z) - \sum_{1}^{N} (-1)^{k-1} z^k / k| < 1/(N + 1)$, establishing (1) on the portion of the unit circle lying in the first quadrant. (This is all we need below, although the preceding argument may be modified easily to use ever larger bounds than $1/(N + 1)$ and prove convergence, though not uniform convergence, on the entire unit circle with the exception of the point $z = -1$.) Equation (2) then follows, as the conjugate of (1).

Subtracting (2) from (1), still for $z = e^{i\theta}$ and $z \ne -1$,

$$i\theta = \log(z) = \sum_{1}^{\infty} (-1)^{k-1} \left[ z^k - z^{-k} \right] / k = 2i \sum_{1}^{\infty} (-1)^{k-1} \sin(k\theta) / k. \tag{3}$$
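The estimate from the preceding paragraph implies that, on $[0, \pi/2]$, the $N$th partial sum of the series in (3), divided by $i$, differs from $\theta$ by less than $2/(N + 1)$. The sketch below checks this on a grid of 51 sample angles; the function name `partial_sum` and the grid are illustrative assumptions, not part of the note.

```python
from math import pi, sin


def partial_sum(theta, N):
    """2 * sum_{k=1}^{N} (-1)^(k-1) * sin(k*theta) / k, the N-th partial sum in (3) over i."""
    return 2 * sum((-1) ** (k - 1) * sin(k * theta) / k for k in range(1, N + 1))


def worst_error(N, samples=50):
    """Largest |theta - partial_sum(theta, N)| over a uniform grid on [0, pi/2]."""
    grid = (j * (pi / 2) / samples for j in range(samples + 1))
    return max(abs(theta - partial_sum(theta, N)) for theta in grid)


for N in (10, 100, 1000):
    print(N, worst_error(N), 2 / (N + 1))  # observed error next to the bound
```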

Nowadays, one might verify that this is a Fourier series and let Parseval’s formula finish the job. Although Euler was perhaps a century too early for Fourier analysis, he would have been willing to integrate the first and last expressions in (3) from $0$ to $\pi/2$ after dividing by $i$. Making free use of the familiar observation that the even terms comprise $1/4$ of the sum of the series of reciprocal squares, we arrive at

$$\frac{\pi^2}{8} = \frac{3}{4} \sum_{1}^{\infty} \frac{1}{k^2}.$$

http://dx.doi.org/10.4169/amer.math.monthly.121.04.353
MSC: Primary 11M06

The brevity here legitimately ignores the fact that the sums in (1)–(3) converge only conditionally. What is essential is that they converge uniformly on the interval of integration $[0, \pi/2]$, and this they do by the estimate made in paragraph 2 above. Thus

$$\left| \int_0^{\pi/2} \left[ \theta - 2 \sum_{1}^{N} (-1)^{k-1} \sin(k\theta)/k \right] d\theta \right| < (\pi/2)\, 2/(N + 1).$$

The result now follows.
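The term-by-term integration can also be carried out numerically. Integrating $\sin(k\theta)/k$ over $[0, \pi/2]$ gives $(1 - \cos(k\pi/2))/k^2$, so the integrated partial sums should approach $\pi^2/8$. The sketch below uses the exact values of $\cos(k\pi/2)$; the names `COS_QUARTER_TURNS` and `integrated_partial_sum` are hypothetical.

```python
from math import pi

COS_QUARTER_TURNS = (1, 0, -1, 0)  # cos(k*pi/2) for k mod 4 = 0, 1, 2, 3


def integrated_partial_sum(N):
    """Integral over [0, pi/2] of 2 * sum_{k=1}^{N} (-1)^(k-1) sin(k*theta)/k, term by term."""
    return 2 * sum((-1) ** (k - 1) * (1 - COS_QUARTER_TURNS[k % 4]) / k ** 2
                   for k in range(1, N + 1))


N = 10000
print(integrated_partial_sum(N), pi ** 2 / 8)            # the integrated identity
print((4 / 3) * integrated_partial_sum(N), pi ** 2 / 6)  # the resulting value of zeta(2)
```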

REFERENCES

1. D. Kalman, M. McKinzie, Another way to sum a series: Generating functions, Euler and the dilog function, Amer. Math. Monthly 119 (2012) 42–51.

2. R. Chapman, Evaluating ζ(2), available at http://www.uam.es/personal_pdi/ciencias/cillerue/Curso/zeta2.pdf.

3208 44th Street, N.W., Washington, DC [email protected]

100 Years Ago in The American Mathematical Monthly
Edited by Vadim Ponomarenko

In recent years several German professors of mathematics have called public attention to the fact that the number of mathematical students at the various German universities is larger than the probable number of mathematical positions in the German schools. According to the Jahresbericht der Deutschen Mathematiker-Vereinigung, 22 (1913), page 369, the number of mathematical students is much smaller during the current year than it has been during recent years. The number of women students of mathematics is, however, still on the increase in the German universities.

All copies of the MONTHLY for January, 1913, have been exhausted. The demand for sample copies was so great for this particular number that the supply was entirely inadequate. Any one who may know of extra copies not belonging to sets will confer a great favor by informing the MANAGING EDITOR.

—Excerpted from “Notes and News,” 21 (1914) 136–138.

Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 355-358
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.355

Polynomials $(x^3 - n)(x^2 + 3)$ Solvable Modulo Any Integer

Andrea M. Hyde, Paul D. Lee, and Blair K. Spearman

Abstract. We give an infinite family of polynomials that are solvable modulo $m$ for every integer $m > 1$, yet have no roots in the rational numbers. Such polynomials are called intersective. Our classification uses only techniques available in an undergraduate course in number theory.

1. INTRODUCTION. Let $f(x)$ be a monic polynomial with integer coefficients. We are interested in those polynomials $f(x)$ that have no root in the rational numbers $\mathbb{Q}$, but do have a root modulo $m$ for all positive integers $m$. Polynomials of this type are called intersective (see Sonn [6, 7]). These polynomials provide counterexamples to the local-global principle. Further information on the local-global principle is available in the book by Gouvea [3, pp. 75–83]. It is known that $f(x)$ cannot be irreducible over $\mathbb{Q}$ since in this case there exist prime numbers $p$ for which

$$f(x) \equiv 0 \pmod{p}$$

is insolvable (see Brandl, Bubboloni, and Hupp [2]). Consequently, an intersective polynomial requires at least two irreducible factors over $\mathbb{Q}$. Berend and Bilu prove a theorem that can be used to establish the intersective property for a given polynomial [1]. They provide the example

$$f(x) = (x^3 - 19)(x^2 + x + 1).$$

The verification of the intersective property, that is, confirming that

$$f(x) \equiv 0 \pmod{m}$$

is solvable for every $m > 1$, may proceed by showing that for each prime $p$ and positive integer $j$, the congruence

$$f(x) \equiv 0 \pmod{p^j}$$

is solvable. General solvability then follows from the Chinese Remainder Theorem. For a given prime $p$, one of the factors of $f(x)$ is proven to be solvable mod $p^j$ for all positive integers $j$. In this note, we propose to use techniques from an undergraduate number theory course to investigate polynomials of the form

$$f(x) = (x^3 - n)(x^2 + x + 1),$$

or equivalently

$$f(x) = (x^3 - n)(x^2 + 3),$$

http://dx.doi.org/10.4169/amer.math.monthly.121.04.355
MSC: Primary 11R09

classifying those that are intersective. The equivalence of these two families is due to the easily established fact that for a given prime $p$, the congruence

$$x^2 + x + 1 \equiv 0 \pmod{p^j}$$

is solvable for all positive integers $j$ if and only if the congruence

$$x^2 + 3 \equiv 0 \pmod{p^j}$$

is solvable for all positive integers $j$. Berend and Bilu [1] state that these polynomials have the least possible degree to be intersective. The explanation of this involves Galois theory and algebraic number theory. Avoiding these more advanced theories, we make use of results from an undergraduate number theory course, particularly Hensel’s Lemma and a refined version of Hensel’s Lemma. Our method not only establishes the intersective property for the polynomials we study, but also enables a characterization of them, showing that they form an infinite set. Before we begin, we impose two simple restrictions on the value of $n$ in our polynomials $(x^3 - n)(x^2 + 3)$, the validity of which can be easily established by the reader. We assume that $n$ is a positive integer and that $n$ is cubefree. Our definition of intersective implies that $n \ne 1$. Our main theorem is the following.

Theorem. Let $n$ be a cubefree positive integer, not equal to 1. Then

$$f(x) = (x^3 - n)(x^2 + 3)$$

is intersective if and only if the prime factors of $n$ are of the form $3k + 1$ and $n \equiv 1 \pmod{9}$.
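A brute-force search cannot prove that a polynomial is intersective, but it can illustrate the theorem by looking for a small modulus with no root. The sketch below is only such an illustration; the function names and the search limit of 300 are arbitrary choices and not part of the note.

```python
def has_root_mod(n, m):
    """True if (x^3 - n)(x^2 + 3) is congruent to 0 modulo m for some residue x."""
    return any(((x ** 3 - n) * (x ** 2 + 3)) % m == 0 for x in range(m))


def smallest_bad_modulus(n, limit):
    """Smallest m in 2..limit with no root modulo m, or None if every such m has a root."""
    for m in range(2, limit + 1):
        if not has_root_mod(n, m):
            return m
    return None


print(smallest_bad_modulus(19, 300))  # 19 is of the form 3k + 1 and 19 = 1 (mod 9)
print(smallest_bad_modulus(7, 300))   # 7 is of the form 3k + 1, but 7 is not 1 (mod 9)
print(smallest_bad_modulus(2, 300))   # 2 is a prime factor of the form 3k + 2
```

For $n = 7$ the search stops at the modulus $9$, and for $n = 2$ at the modulus $8 = 2^3$, in line with the role that the moduli $3^2$ and $p^3$ play in the proof.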

Before giving the proof of this theorem, we state the theory that we require. All of it is available in basic number theory textbooks such as Niven, Zuckerman, and Montgomery [4]. We begin with Hensel’s Lemma.

Hensel’s Lemma (see [4, p. 87]). Suppose that $f(x)$ is a polynomial with integral coefficients. If $f(a) \equiv 0 \pmod{p^j}$ and $f'(a) \not\equiv 0 \pmod{p}$, then there is a unique $t \pmod{p}$ such that $f(a + tp^j) \equiv 0 \pmod{p^{j+1}}$.
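One lifting step can be made explicit: since $f(a + tp^j) \equiv f(a) + tp^j f'(a) \pmod{p^{j+1}}$, the residue $t$ solves $f(a)/p^j + t f'(a) \equiv 0 \pmod{p}$. The sketch below implements this step for the example $x^2 + 3$ with $p = 7$; the function name `hensel_lift` is hypothetical, and the modular inverse via `pow(x, -1, p)` assumes Python 3.8 or later.

```python
def hensel_lift(f, df, a, p, j):
    """Given f(a) = 0 (mod p^j) with f'(a) != 0 (mod p), return a + t*p^j such that
    f(a + t*p^j) = 0 (mod p^(j+1)); t is the unique residue from Hensel's Lemma."""
    pj = p ** j
    assert f(a) % pj == 0 and df(a) % p != 0
    # Solve f(a)/p^j + t*f'(a) = 0 (mod p) for t.
    t = (-(f(a) // pj) * pow(df(a), -1, p)) % p
    return a + t * pj


f = lambda x: x * x + 3
df = lambda x: 2 * x
p, root = 7, 2          # 2^2 + 3 = 7 = 0 (mod 7), and f'(2) = 4 is a unit mod 7
for j in range(1, 6):   # lift from mod 7^1 up to mod 7^6
    root = hensel_lift(f, df, root, p, j)
assert f(root) % p ** 6 == 0
print(root)
```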

If the condition $f'(a) \not\equiv 0 \pmod{p}$ holds, then the root $a$ is called nonsingular. By repeated application of Hensel’s Lemma, a nonsingular root $a$ of $f(x) \equiv 0 \pmod{p}$ may be lifted to a root modulo $p^j$, for $j = 2, 3, \ldots$. We also require a refined version of Hensel’s Lemma which, in the case of a singular root, enables us to lift our solutions modulo arbitrarily high prime powers.

Refined Hensel’s Lemma (see [4, p. 89]). Let $f(x)$ be a polynomial with integral coefficients. Suppose that $f(a) \equiv 0 \pmod{p^j}$, that $p^\tau \,\|\, f'(a)$, and that $j \ge 2\tau + 1$. If $b \equiv a \pmod{p^{j-\tau}}$, then $f(b) \equiv f(a) \pmod{p^j}$ and $p^\tau \,\|\, f'(b)$. Moreover, there is a unique $t \pmod{p}$ such that $f(a + tp^{j-\tau}) \equiv 0 \pmod{p^{j+1}}$.
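The refined lemma can be tried on a singular root. The sketch below takes the example $f(x) = x^3 - 19$ with $p = 3$, where $7^3 = 343 \equiv 19 \pmod{27}$ and $f'(7) = 147$ is exactly divisible by 3, so $\tau = 1$ and $j = 3$. The function name `refined_lift` is hypothetical, and the residue $t$ is simply found by trying all $p$ candidates.

```python
def refined_lift(f, a, p, j, tau):
    """One step of the refined lemma: from f(a) = 0 (mod p^j), with p^tau exactly dividing
    f'(a) and j >= 2*tau + 1, return a + t*p^(j - tau) that is a root mod p^(j+1)."""
    for t in range(p):  # the lemma guarantees that exactly one residue t works
        b = a + t * p ** (j - tau)
        if f(b) % p ** (j + 1) == 0:
            return b
    raise ValueError("the hypotheses of the refined lemma are not satisfied")


n = 19
f = lambda x: x ** 3 - n
p, tau, root = 3, 1, 7     # 7 is a root mod 3^3, and 3 exactly divides f'(7) = 147
for j in range(3, 12):     # lift from mod 3^3 up to mod 3^12
    root = refined_lift(f, root, p, j, tau)
assert f(root) % 3 ** 12 == 0
print(root)
```

Each lifted root stays congruent to 7 modulo 9, so the derivative remains exactly divisible by 3 and the step can be repeated.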

As noted in [4, p. 89], since the hypotheses of the theorem apply with $a$ replaced by $a + tp^{j-\tau}$ and $\pmod{p^j}$ replaced by $\pmod{p^{j+1}}$ but with $\tau$ unchanged, the lifting may be repeated and continues indefinitely. This means that if the polynomial congruence is solvable to a sufficiently high power of $p$ (as defined in the Lemma), then it can be solved to all powers of $p$. We require one more lemma involving power congruences.

Lemma (see [4, p. 101]). If $p$ is a prime and $\gcd(a, p) = 1$, then the congruence $x^t \equiv a \pmod{p}$ has $d = \gcd(t, p - 1)$ solutions if $a^{(p-1)/d} \equiv 1 \pmod{p}$, and no solutions otherwise.
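For small primes the lemma can be compared with an exhaustive count. The following sketch does this for cubes ($t = 3$); the function names `count_solutions` and `predicted_count` are hypothetical.

```python
from math import gcd


def count_solutions(t, a, p):
    """Number of residues x mod p with x^t = a (mod p), found by exhaustive search."""
    return sum(1 for x in range(p) if pow(x, t, p) == a % p)


def predicted_count(t, a, p):
    """Count given by the lemma, for a prime p with gcd(a, p) = 1."""
    d = gcd(t, p - 1)
    return d if pow(a, (p - 1) // d, p) == 1 else 0


for p in (5, 7, 11, 13):
    for a in range(1, p):
        assert count_solutions(3, a, p) == predicted_count(3, a, p)
```

For $p \equiv 2 \pmod{3}$ one gets $d = 1$, so every unit is a cube, while for $p \equiv 1 \pmod{3}$ only a third of the units are cubes.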

We are now ready to prove our theorem.

Proof. Assume that the prime factors of $n$ are of the form $3k + 1$ and $n \equiv 1 \pmod{9}$. Let $p$ be a prime.

Case 1. Suppose that $p \equiv 1 \pmod{3}$. For these primes, $-3$ is a quadratic residue mod $p$ (see [5, p. 440, Exercise 3]), so that the congruence $x^2 + 3 \equiv 0 \pmod{p}$ is solved by some integer $u$, which is clearly not divisible by $p$. Since the derivative of $x^2 + 3$ evaluated at $u$ equals $2u$, which is again not divisible by $p$, we may apply Lemma 1 (Hensel’s Lemma) to conclude that $x^2 + 3 \equiv 0 \pmod{p^j}$ is solvable for all positive integers $j$.

Case 2. Suppose next that $p \equiv 2 \pmod{3}$. Clearly, $p \nmid n$, since $n$ contains only prime factors of the form $3k + 1$. The factor $x^2 + 3$ is insolvable mod $p$, since $-3$ is a quadratic nonresidue (mod $p$). Applying Lemma 3 (the lemma on power congruences) to the factor $x^3 - n$, with $t = 3$, $a = n$, and noting that $d = \gcd(3, p - 1) = 1$, we see that $x^3 - n \equiv 0 \pmod{p}$ is solvable if $n^{p-1} \equiv 1 \pmod{p}$, which is true by Fermat’s Little Theorem. Thus, $x^3 - n \equiv 0 \pmod{p}$ is solved by some integer $u$. Since $p \nmid n$, we see that $p \nmid u$. We may apply Lemma 1 (Hensel’s Lemma) by observing that $p \nmid 3u^2$, the derivative of $x^3 - n$ evaluated at $u$, and conclude that $x^3 - n \equiv 0 \pmod{p^j}$ is solvable for all positive integers $j$.

Case 3. Finally, suppose that $p = 3$. The factor $x^2 + 3$ has one solution mod 3 (namely $x = 0$), but has no solutions mod $3^2$. Therefore, we consider the factor $x^3 - n$. Since $n \equiv 1 \pmod{9}$, we have $n \equiv 1, 10, 19 \pmod{3^3}$, so that $x^3 \equiv n \pmod{3^3}$ is solvable. For solutions, we may choose $x \equiv 1, 4, 7 \pmod{3^3}$, respectively. At the same time, the derivative of $x^3 - n$ evaluated at 1, 4, and 7 is exactly divisible by 3. Applying Lemma 2 (the Refined Hensel’s Lemma) with $j = 3$, $\tau = 1$, and recalling the remark after Lemma 2, we conclude that $x^3 - n \equiv 0 \pmod{3^j}$ is solvable for all positive integers $j$. Hence, by the Chinese Remainder Theorem, $f(x) \equiv 0 \pmod{m}$ is solvable for all positive integers $m$. This establishes the intersective property.

Conversely, suppose that $(x^3 - n)(x^2 + 3)$ is intersective with $n$ cubefree and not equal to 1. Let $p$ be a prime.

Case 1. First, suppose that $p \equiv 2 \pmod{3}$. For such a prime, $x^2 + 3 \equiv 0 \pmod{p}$ is insolvable, since $-3$ is a quadratic nonresidue mod $p$ as noted earlier. Therefore, we must have $x^3 - n \equiv 0 \pmod{p^j}$ solvable for all positive integers $j$. It is easy to show that if $p$ divides $n$, then $p$ must also divide any solution $x$. We then see that the congruence $x^3 \equiv n \pmod{p^3}$ is only solvable if $n$ is divisible by $p^3$, but this is a contradiction since $n$ is cubefree. Hence, $p \nmid n$ and $n$ has no prime factors of the form $3k + 2$.

Case 2. Suppose now that $p = 3$. Since $x^2 + 3 \equiv 0 \pmod{9}$ is insolvable, we must have $x^3 - n \equiv 0 \pmod{3^j}$ solvable for all positive integers $j$. By the same arguments as in Case 1, we require $3 \nmid n$, for otherwise $x^3 - n \equiv 0 \pmod{3^3}$ is insolvable as $n$ is cubefree. Therefore, 3 is not a prime factor of $n$, and thus the prime factors of $n$ must have the form $3k + 1$.

To complete the classification part of our proof, we study the solvability of $x^3 - n \equiv 0 \pmod{3^2}$. The nonzero cubes modulo 9 are congruent to 1 and 8. Combining this with the previously established fact that the prime factors of the positive integer $n$ are of the form $3k + 1$ (so that $n \equiv 1 \pmod{3}$, which rules out $n \equiv 8 \pmod{9}$), we deduce that $n \equiv 1 \pmod{9}$, as required.

We close by giving some examples of intersective polynomials obtained from our theorem.

$$\begin{array}{ll}
\text{Factorization of } n & \text{Intersective polynomial} \\
\hline
37 & (x^3 - 37)(x^2 + 3) \\
7 \cdot 13 & (x^3 - 91)(x^2 + 3) \\
163 & (x^3 - 163)(x^2 + 3) \\
19 \cdot 37 & (x^3 - 703)(x^2 + 3) \\
7^2 \cdot 61 & (x^3 - 2989)(x^2 + 3)
\end{array}$$
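The criterion of the theorem is easy to test mechanically, which gives a way to check the table and to list further admissible values of $n$. The sketch below uses trial division; the names `prime_factorization` and `satisfies_theorem` are hypothetical.

```python
def prime_factorization(n):
    """Map each prime factor of n to its exponent, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def satisfies_theorem(n):
    """Criterion of the theorem: n > 1 is cubefree, every prime factor is 1 (mod 3),
    and n is 1 (mod 9)."""
    factors = prime_factorization(n)
    return (n > 1 and n % 9 == 1
            and all(p % 3 == 1 and e <= 2 for p, e in factors.items()))


assert all(satisfies_theorem(n) for n in (37, 91, 163, 703, 2989))  # the table entries
print([n for n in range(2, 200) if satisfies_theorem(n)])
```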

REFERENCES

1. D. Berend, Y. Bilu, Polynomials with roots modulo every integer, Proc. Amer. Math. Soc. 124 (1996) 1663–1671.

2. R. Brandl, D. Bubboloni, I. Hupp, Polynomials with roots mod p for all primes p, J. Group Theory 4 (2001) 233–239.

3. F. Q. Gouvea, P-adic Numbers, An Introduction. Springer Verlag, New York, 1993.

4. I. Niven, H. S. Zuckerman, H. L. Montgomery, An Introduction to the Theory of Numbers. Fifth edition. John Wiley and Sons, New York, 1995.

5. K. H. Rosen, Elementary Number Theory and its Applications. Sixth edition. Addison-Wesley, Reading, MA, 2011.

6. J. Sonn, Polynomials with roots in $\mathbb{Q}_p$ for all p, Proc. Amer. Math. Soc. 136 (2008) 1955–1960.

7. J. Sonn, Two remarks on the inverse Galois problem for intersective polynomials, J. Theor. Nombres Bordeaux, 21 (2009) 437–439.

Department of Mathematics and Statistics, University of British Columbia’s Okanagan campus, Kelowna, BC, Canada, V1V [email protected]


Macaulay Expansion
Author(s): B. Sury
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 359-360
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.359


Macaulay Expansion

B. Sury

Abstract. Given natural numbers $n$ and $r$, the “greedy” algorithm enables us to obtain an expansion of the integer $n$ as a sum of binomial coefficients in the form $\binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}$. We give an alternate interpretation of this expansion, which also proves its uniqueness in an interesting manner.

The 1996 Iranian mathematical olympiad competition contained the following problem. For natural numbers $n$ and $r$, there is a unique expansion
\[ n = \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1} \]
with each $a_i$ an integer and $a_r > a_{r-1} > \cdots > a_1 \ge 0$.

The existence is fairly easy to prove using the “greedy” algorithm. This expansion is sometimes known as the Macaulay expansion. However, the following alternate interpretation does not seem to be well known; it gives uniqueness in an interesting manner. In what follows, we use the well-known convention that the binomial coefficient $\binom{n}{r}$ is equated to 0 if $n < r$.
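The greedy existence argument mentioned above can be sketched in a few lines. This is an illustrative implementation under our own naming (`macaulay_greedy` is not from the note): for $i = r, r-1, \ldots, 1$ it picks the largest $a_i$ with $\binom{a_i}{i}$ not exceeding what remains of $n$. Python's `math.comb` already follows the convention $\binom{n}{r} = 0$ for $n < r$.

```python
from math import comb


def macaulay_greedy(n, r):
    """Return [a_r, ..., a_1] with n = C(a_r, r) + ... + C(a_1, 1)."""
    digits = []
    for i in range(r, 0, -1):
        a = i - 1                      # C(i - 1, i) = 0, so this always fits
        while comb(a + 1, i) <= n:     # push a as high as the remainder allows
            a += 1
        digits.append(a)
        n -= comb(a, i)                # remainder to be expanded with smaller i
    return digits


if __name__ == "__main__":
    print(macaulay_greedy(12, 3))
    print(macaulay_greedy(74, 3))
```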

For each natural number $r$, denote by $S_r$ the set of all $r$-digit numbers in some base $b$ whose digits are in strictly decreasing order of size. Evidently, $S_r$ is nonempty if and only if $b \ge r$; in this case, $S_r$ has $\binom{b}{r}$ elements. Let us now write the elements of $S_r$ in increasing order. For instance, in base 10, the first few of the 120 members of $S_3$ are:

(2, 1, 0), (3, 1, 0), (3, 2, 0), (3, 2, 1), (4, 1, 0), (4, 2, 0), (4, 2, 1), (4, 3, 0), . . . .

We will prove the following.

Theorem. Given any positive integer $n$, and any base $b$ such that $\binom{b}{r} > n$, the $(n + 1)$-th member of $S_r$ is $(a_r, \ldots, a_2, a_1)$, where $n = \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}$. In particular, for each $n$, the Diophantine equation $\binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1} = n$ has a unique solution in integers $a_r > a_{r-1} > \cdots > a_1 \ge 0$.

Here are a couple of examples to illustrate the theorem.

(i) Let $r = 3$ and $n = 12$. We may take any base $b$ so that $\binom{b}{3} > 12$. For example, $b = 6$ is allowed because $\binom{6}{3} = 20$. Among the 20 members in $S_3$, the 13th member is $(5, 2, 1)$. Note that $\binom{5}{3} + \binom{2}{2} + \binom{1}{1} = 12$.

http://dx.doi.org/10.4169/amer.math.monthly.121.04.359
MSC: Primary 05A10


(ii) Let $r = 3$, $n = 74$. We may take $b = 10$ as $\binom{10}{3} = 120$. The 75th member of $S_3$ is $(8, 6, 3)$. Note that $\binom{8}{3} + \binom{6}{2} + \binom{3}{1} = 74$.
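The theorem and both examples can be checked by listing $S_r$ directly. The sketch below is our own illustration (the helper names `members_of_S` and `binomial_sum` are assumptions): it builds $S_r$ in increasing order and compares the position of each member with the corresponding sum of binomial coefficients.

```python
from itertools import combinations
from math import comb


def members_of_S(r, b):
    """Strictly decreasing r-tuples of base-b digits, listed in increasing order."""
    return sorted(tuple(sorted(c, reverse=True)) for c in combinations(range(b), r))


def binomial_sum(digits):
    """C(a_r, r) + C(a_{r-1}, r-1) + ... + C(a_1, 1) for digits (a_r, ..., a_1)."""
    r = len(digits)
    return sum(comb(a, r - k) for k, a in enumerate(digits))


if __name__ == "__main__":
    S3 = members_of_S(3, 10)
    print(len(S3), S3[:8])
    print(S3[74], binomial_sum(S3[74]))
    # The member at 0-based position n should expand n itself.
    print(all(binomial_sum(m) == n for n, m in enumerate(S3)))
```

Comparing tuples lexicographically agrees with comparing the $r$-digit numbers they represent, since all members have the same length.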

Proof of theorem. First of all, we notice that the number of members in $S_r$ that have first digit $< m$ equals $\binom{m}{r}$; this is because we are choosing $r$ numbers from $\{0, 1, \ldots, m - 1\}$ and arranging them in decreasing order. Now, suppose the $(n + 1)$th member of $S_r$ is
\[ (a_r, a_{r-1}, \ldots, a_1). \]
The number of members of $S_r$ with first digit $< a_r$ is $\binom{a_r}{r}$. The number of members of $S_r$, whose first digit is $a_r$ and which occur before the above member, is the number of members of $S_{r-1}$ occurring prior to $(a_{r-1}, \ldots, a_1)$. Inductively, it is clear that this equals
\[ \binom{a_{r-1}}{r-1} + \cdots + \binom{a_2}{2} + \binom{a_1}{1}. \]
Therefore, the number of members of $S_r$ occurring prior to the $(n + 1)$th member above (which must be $n$) is
\[ \binom{a_r}{r} + \binom{a_{r-1}}{r-1} + \cdots + \binom{a_1}{1}. \]
This proves our result.

Remark. We may proceed in a slightly different direction, if we do not use the first observation in the proof. For any $k$, we can obtain by induction that the number of elements in $S_k$ starting with some $a$ is $\binom{a}{k-1}$. Indeed, to prove this by induction, we use the identity
\[ \binom{n}{r} = \sum_{m=1}^{n-1} \binom{m}{r-1}, \]
which is itself seen by induction on $n$.

ACKNOWLEDGMENTS. We are indebted to the referee for a number of constructive suggestions. In particular, she/he drew attention to a simple way to count something for which we gave a roundabout argument as remarked above. The referee’s suggestions to add some illuminating examples and to make the uniqueness argument transparent are well appreciated.

Stat-Math Unit, Indian Statistical Institute, 8th Mile Mysore Road, Bangalore 560059, [email protected]


Evaluating Lebesgue Integrals Efficiently with the FTC
Author(s): J. J. Koliha
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 361-364
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.361


Evaluating Lebesgue Integrals Efficiently with the FTC

J. J. Koliha

Abstract. This note addresses evaluation of Lebesgue integrals on the real line using the Fundamental Theorem of Calculus, without having to verify that the primitive is absolutely continuous.

The Fundamental Theorem of Calculus (FTC) provides an efficient method for the evaluation of Lebesgue integrals on real intervals, but only if we can find an absolutely continuous primitive (antiderivative) to the integrand. However, checking absolute continuity can be quite difficult. In this note, we give examples of evaluation of integrals that require only continuity of the primitive. Here is a version of Lebesgue’s FTC extended to a possibly unbounded interval.

Lebesgue’s FTC. Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be absolutely continuous on $(a, b)$ and let $F' = f$ almost everywhere on $(a, b)$, where $f : (a, b) \to \mathbb{C}$ is Lebesgue integrable on $(a, b)$. If the one-sided limits $F(a+)$ and $F(b-)$ exist, then
\[ \int_a^b f(t)\,dt = F(b-) - F(a+). \]

It may seem that with the absolute continuity of $F$, the hypothesis that $f$ is Lebesgue integrable is redundant. Alas, no: The notorious function
\[ F(t) := \operatorname{Si}(t) = \int_0^t \frac{\sin x}{x}\,dx, \quad t > 0, \]
shows the error of our ways [2, Example 14.17]. The absolute continuity of $F$ on $(0, \infty)$ follows from the Mean Value Theorem; $F(0+) = 0$ is clear and $F(\infty-) = \pi/2$ is well known. Yet the derivative $F'(x) = f(x) = (\sin x)/x$ is not Lebesgue integrable as
\[ \lim_{t \to \infty} \int_0^t \left| \frac{\sin x}{x} \right| dx = \infty. \]

It is well known that on a compact interval, the integrability of $f$ is indeed redundant (see, for instance, [2, Theorem 14.7]).
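The divergence of $\int_0^t |\sin x / x|\,dx$ is slow, and a rough numerical experiment makes it visible. The following is a sketch under our own assumptions (a plain midpoint rule, with the hypothetical helper `abs_sinc_integral` and an arbitrary step density); it only illustrates the logarithmic growth, roughly $(2/\pi)\log t$, and proves nothing.

```python
from math import sin, pi, log


def abs_sinc_integral(upper, steps_per_unit=200):
    """Midpoint-rule approximation of the integral of |sin x| / x over (0, upper)."""
    n = int(upper * steps_per_unit)
    h = upper / n
    return h * sum(abs(sin((i + 0.5) * h)) / ((i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    for periods in (10, 100, 1000):
        print(periods, abs_sinc_integral(periods * pi))
    # Each tenfold increase of the upper limit adds roughly this much:
    print((2 / pi) * log(10))
```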

The problem with application of Lebesgue’s FTC can be seen in this situation. Suppose we know that $F'(x) = f(x)$ everywhere in $[a, b]$ and that $f$ is Lebesgue integrable on $[a, b]$. Then we have a paradoxical situation of not being able to use Lebesgue’s FTC, since we do not know whether $F$ is absolutely continuous. If we write $G(x) = \int_a^x f(t)\,dt$, we know that $G$ is absolutely continuous, and $(F - G)'(x) = 0$ almost everywhere. However, we cannot conclude that $F - G$ is constant.

http://dx.doi.org/10.4169/amer.math.monthly.121.04.361
MSC: Primary 26A42


In order to overcome this problem, we need to look at a different type of FTC, one which is usually proved by methods outside the theory of Lebesgue integration. A proof that stays strictly within the realm of the Lebesgue theory was given by the author in this MONTHLY [1]. We recall three versions of this theorem, whose proofs can be found in [1] and [2, Chapter 14].

Theorem 1 (see [1]). Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be such that $F'(x) = f(x)$ for all $x \in (a, b)$, where $f : (a, b) \to \mathbb{C}$ is Lebesgue integrable on $(a, b)$. If the one-sided limits $F(a+)$ and $F(b-)$ exist, then
\[ \int_a^b f(t)\,dt = F(b-) - F(a+). \]

Even if we tighten the hypotheses to assume that $F$ has a derivative on a compact interval $[a, b]$ (with one-sided derivatives at the end points), the integrability of $f$ cannot be dropped due to a possible blowout of the positive and negative oscillation of $f$. To see this, define
\[ F(t) = t^2 \cos^2\!\left(\frac{1}{t^2}\right) \text{ if } t \ne 0 \quad\text{and}\quad F(0) = 0, \]
and
\[ f(t) = F'(t) \text{ if } t \ne 0 \quad\text{and}\quad f(0) = 0. \]
But $f$ is not Lebesgue integrable on $[0, 1]$, since $\int_\varepsilon^1 |f(t)|\,dt \to \infty$ as $\varepsilon \to 0+$. (See [2, Example 14.15] for details.)

Theorem 2 (see [1]). Let $-\infty \le a < b \le \infty$. Let $F : (a, b) \to \mathbb{C}$ be continuous on $(a, b)$ and let $F'(x) = f(x)$ nearly everywhere on $(a, b)$, where $f : (a, b) \to \mathbb{C}$ is Lebesgue integrable on $(a, b)$. If the one-sided limits $F(a+)$ and $F(b-)$ exist, then
\[ \int_a^b f(t)\,dt = F(b-) - F(a+). \]

The expression nearly everywhere means ‘everywhere except for a countable set’. If $F$ is continuous on $(a, b)$, $F' = f$ nearly everywhere on $(a, b)$, and the one-sided limits $F(b-)$ and $F(a+)$ exist, then we say that $f$ is Newton integrable on $(a, b)$, and define its Newton integral by
\[ (N)\!\int_a^b f(t)\,dt := F(b-) - F(a+). \]

Theorem 1 enables us to calculate the integral $\int_0^1 t^{-1/2}\,dt$ by observing that $F(t) = 2t^{1/2}$ is a primitive for the integrand $f(t) = t^{-1/2}$ everywhere in $(0, 1)$, that $F(0+) = 0$ and $F(1-) = 2$, but we have to know that $f$ is Lebesgue integrable on $(0, 1)$. For this we can use, for instance, the Monotone Convergence Theorem applied to the truncations $f_n = \min(f, n)$ of $f$. But this does not seem to be the most efficient way to do it; we would like to conclude the integrability of $f$ directly from the existence of the Newton integral. For this we need to consider absolute Newton integrability. We say that a function $f : (a, b) \to \mathbb{C}$ is absolutely Newton integrable if the Newton integral exists for both $f$ and $|f|$ (where $|f|$ is real valued and nonnegative). Here is the desired theorem.

Theorem 3 (see [1]). Let $-\infty \le a < b \le \infty$. Let $f : (a, b) \to \mathbb{C}$ be absolutely Newton integrable on $(a, b)$. Then $f$ is Lebesgue integrable on $(a, b)$, and
\[ \int_a^b f(t)\,dt = (N)\!\int_a^b f(t)\,dt. \]

The readers can hone their skills by evaluating the following integrals using Theorems 1, 2, or 3.

Example 1. Evaluate Lebesgue integrals efficiently:
\[ \text{(i)} \int_0^1 \left( \frac{1}{t^{3/4}} + i \log t \right) dt, \qquad \text{(ii)} \int_1^2 \frac{t - 3i}{t + 2i}\,dt, \qquad\text{and}\qquad \text{(iii)} \int_0^\infty \frac{dt}{(2t + i)^3}. \]

So far, the substantial power hidden in Theorems 2 and 3 has not been fully utilized, namely the fact that the derivative of the continuous function $F$ may exist only nearly everywhere. We illustrate this in the following examples.

Example 2. Let $f : (0, 1) \to \mathbb{C}$ be the function defined by $f(t) = 0$ if $t$ is rational, and $f(t) = \log t + i\,t^{-4/5}$ otherwise. Let $\varphi$ be the characteristic function of $(0, 1) \setminus \mathbb{Q}$. Then $F_1(t) = t \log t - t$ and $F_2(t) = 5t^{1/5}$ are generalized primitives to $f_1(t) = \varphi(t) \log t$ and $f_2(t) = \varphi(t)\,t^{-4/5}$ on $(0, 1)$, respectively. Further, $F_1(1) = -1$, $F_1(0+) = 0$, $F_2(1) = 5$, and $F_2(0) = 0$. Both $f_1$ and $f_2$ are absolutely Newton integrable as they do not change sign on $(0, 1)$. By Theorem 3, $f$ is Lebesgue integrable with $\int_0^1 f = \int_0^1 f_1 + i \int_0^1 f_2 = -1 + 5i$. (Note that by splitting the real and imaginary parts of $f$, we avoided the need for finding a generalized primitive for $|f(t)| = \varphi(t)(\log^2 t + t^{-8/5})^{1/2}$. This is not always the most efficient maneuver; see Example 1 (iii).)

Example 3. Let $f$ be defined on the interval $(0, 1)$ by
\[ f(x) = \frac{1}{\sqrt{(n + 2)\{(n + 1)x - n\}}} \quad \text{if } x_n < x \le x_{n+1},\ n = 0, 1, 2, \ldots, \]
where $x_n = n/(n + 1)$, $n = 0, 1, 2, \ldots$. First sketch a graph of $f$; it reveals infinitely many vertical asymptotes at the points $x_n$, $n = 0, 1, 2, \ldots$, neatly clustering near $x = 1$. On each interval $(x_n, x_{n+1})$, a primitive to $f$ is
\[ F(x) = \frac{2}{(n + 1)\sqrt{n + 2}} \sqrt{(n + 1)x - n} + c_n, \quad x \in (x_n, x_{n+1}). \]

The constants of integration $c_n$ must be chosen wisely to make $F$ continuous on $(0, 1)$. From $F(x_n-) = F(x_n+)$, we obtain $c_n = 2/[n(n + 1)] + c_{n-1}$. Choosing $c_0 = 0$, we get
\[ c_n = 2 \sum_{k=1}^n \frac{1}{k(k + 1)} = 2 \sum_{k=1}^n \left( \frac{1}{k} - \frac{1}{k + 1} \right) = 2 \left( 1 - \frac{1}{n + 1} \right) = \frac{2n}{n + 1}, \]
$n = 0, 1, 2, \ldots$. Setting $F(x_n) = F(x_n-) = F(x_n+)$ for $n = 1, 2, 3, \ldots$, we make $F$ continuous on $(0, 1)$, but the derivatives $F'(x_n)$ fail to exist for $n = 1, 2, 3, \ldots$. As the integrand is nonnegative, its Newton integrability implies absolute Newton integrability. Clearly, $F(0+) = 0$. Further, $F$ is increasing on $(0, 1)$ being continuous there and having a positive derivative nearly everywhere in $(0, 1)$ (see [2, Theorem B25]). Also, $F$ is bounded on $(0, 1)$ as on each interval $(x_n, x_{n+1}]$ we have $F(x_{n+1}) = 2x_{n+1} \le 2$. Hence, the limit $F(1-)$ exists and is equal to $\lim_{n \to \infty} F(x_{n+1}) = 2$. Thus, $\int_0^1 f(x)\,dx = F(1-) - F(0+) = 2$. We note that $f$ is not improperly Riemann integrable.
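The piecewise primitive of Example 3 is easy to tabulate. The sketch below is our own illustration (the helper names `x_n`, `interval_index`, and `F` are assumptions); it evaluates $F$ with $c_n = 2n/(n + 1)$ and checks numerically that $F(x_{n+1}) = 2x_{n+1}$ and that $F$ joins up continuously at the points $x_n$.

```python
from math import sqrt


def x_n(n):
    """Break point x_n = n / (n + 1)."""
    return n / (n + 1)


def interval_index(t):
    """The index n with x_n < t <= x_{n+1}, for 0 < t < 1."""
    n = 0
    while x_n(n + 1) < t:
        n += 1
    return n


def F(t):
    """Continuous generalized primitive from Example 3, with c_n = 2n / (n + 1)."""
    n = interval_index(t)
    inner = max(0.0, (n + 1) * t - n)      # guard against tiny negative round-off
    return 2 * sqrt(inner) / ((n + 1) * sqrt(n + 2)) + 2 * n / (n + 1)


if __name__ == "__main__":
    for n in (0, 1, 5, 50, 500):
        t = x_n(n + 1)
        print(n, F(t), 2 * t)              # the two columns should agree
```

The values $F(x_{n+1})$ increase towards 2, matching $F(1-) = 2$ in the text.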

Example 4. A striking example of a Lebesgue integrable function that is not improperly Riemann integrable and that has a vertical asymptote at each rational point of the interval $[0, 1]$ is given by Richardson in [3, Example 5.44]:
\[ f(x) = \sum_{k=1}^\infty 2^{-k} |x - q_k|^{-1/2}, \]
where $(q_k)$ is a sequence containing all rational numbers in $[0, 1]$. Write $f_k(x) = 2^{-k} |x - q_k|^{-1/2}$ for $x \in [0, 1] \setminus \{q_k\}$, $k = 1, 2, \ldots$. Then $f_k$ is absolutely Newton integrable with a generalized primitive $F_k(x) = 2^{-k+1} \operatorname{sgn}(x - q_k) |x - q_k|^{1/2}$ in $[0, 1]$, and the integral $(N)\!\int_0^1 f_k = F_k(1-) - F_k(0+) = 2^{-k+1}((1 - q_k)^{1/2} + q_k^{1/2})$. By Theorem 3, this is also the Lebesgue integral of $f_k$. We have
\[ \alpha := \sum_{k=1}^\infty \int_0^1 |f_k(t)|\,dt = \sum_{k=1}^\infty 2^{-k+1}((1 - q_k)^{1/2} + q_k^{1/2}) < \infty. \]
By the term-by-term integration of series [2, Theorem 13.35], $f = \sum_k f_k$ converges almost everywhere in $[0, 1]$, is Lebesgue integrable, and $\int_0^1 f(t)\,dt = \alpha$.

ACKNOWLEDGMENT. I would like to thank the referees for their comments, which led to improved presentation of this note.

REFERENCES

1. J. J. Koliha, A fundamental theorem of calculus for Lebesgue integration, Amer. Math. Monthly 113 (2006) 551–555.

2. J. J. Koliha, Metrics, Norms and Integrals: An Introduction to Contemporary Analysis. World Scientific Publishing, Singapore, 2008.

3. L. F. Richardson, Measure and Integration: A Concise Introduction to Real Analysis. John Wiley, New York, 2009.

The University of Melbourne, Melbourne VIC 3010, [email protected]


Problems and Solutions
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 365-372
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.365


PROBLEMS AND SOLUTIONS

Edited by Gerald A. Edgar, Doug Hensley, Douglas B. West with the collaboration of Itshak Borosh, Paul Bracken, Ezra A. Brown, Randall Dougherty, Tamas Erdelyi, Zachary Franco, Christian Friesen, Ira M. Gessel, Laszlo Liptak, Frederick W. Luttmann, Vania Mascioni, Frank B. Miles, Richard Pfiefer, Dave Renfro, Cecil C. Rousseau, Leonard Smiley, Kenneth Stolarsky, Richard Stong, Walter Stromquist, Daniel Ullman, Charles Vanden Eynden, Sam Vandervelde, and Fuzhen Zhang.

Proposed problems and solutions should be sent in duplicate to the MONTHLY problems address on the back of the title page. Proposed problems should never be under submission concurrently to more than one journal. Submitted solutions should arrive before August 31, 2014. Additional information, such as generalizations and references, is welcome. The problem number and the solver’s name and address should appear on each solution. An asterisk (*) after the number of a problem or a part of a problem indicates that no solution is currently available.

PROBLEMS

11768. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca, Cluj-Napoca, Romania. Let $f$ be a bounded continuous function mapping $[0, \infty)$ to itself. Find
\[ \lim_{n \to \infty} n \left( \sqrt[n]{\int_0^\infty f^{n+1}(x) e^{-x}\,dx} - \sqrt[n]{\int_0^\infty f^{n}(x) e^{-x}\,dx} \right). \]

11769. Proposed by Pal Peter Dalyay, Szeged, Hungary. Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ be positive real numbers. Show that
\[ \left( \sum_{j=1}^n \frac{a_j}{b_j} \right)^2 - 2 \sum_{j,k=1}^n \frac{a_j a_k}{(b_j + b_k)^2} \le 2 \left( \sum_{j,k=1}^n \frac{a_j a_k}{b_j + b_k} \sum_{l,m=1}^n \frac{a_l a_m}{(b_l + b_m)^3} \right)^{1/2}. \]

11770. Proposed by Spiros P. Andriopoulos, Third High School of Amaliada, Eleia, Greece. Prove, for real numbers $a, b, x, y$ with $a > b > 1$ and $x > y > 1$, that
\[ \frac{a^x - b^y}{x - y} > \left( \frac{a + b}{2} \right)^{(x+y)/2} \log \left( \frac{a + b}{2} \right). \]

11771. Proposed by D. M. Batinetu-Giurgiu, “Matei Basarab” National College, Bucharest, Romania, and Neculai Stanciu, “George Emil Palade” School, Buzau, Romania. Let $n!! = \prod_{i=0}^{\lfloor (n-1)/2 \rfloor} (n - 2i)$. Find
\[ \lim_{n \to \infty} \left( \sqrt[n]{(2n - 1)!!} \left( \tan \frac{\pi \sqrt[n+1]{(n + 1)!}}{4 \sqrt[n]{n!}} - 1 \right) \right). \]

http://dx.doi.org/10.4169/amer.math.monthly.121.04.365


11772. Proposed by Mircea Merca, University of Craiova, Craiova, Romania. Let $n$ be a positive integer. Prove that the number of integer partitions of $2n + 1$ that do not contain 1 as a part is less than or equal to the number of integer partitions of $2n$ that contain at least one odd part.

11773. Proposed by Moubinool Omarjee, Lycee Henri IV, Paris, France. Given a positive real number $a_0$, let $a_{n+1} = \exp\left( -\sum_{k=0}^n a_k \right)$ for $n \ge 0$. For which values of $b$ does $\sum_{n=0}^\infty (a_n)^b$ converge?

11774. Proposed by Yunus Tuncbilek, Ataturk High School of Science, Istanbul, Turkey and Danny Lee, Herkimer Senior High School, NY, NY. Let $\omega$ be the circumscribed circle of triangle $ABC$. The $A$-mixtilinear incircle of $ABC$ and $\omega$ is the circle that is internally tangent to $\omega$, $AB$, and $AC$, and similarly for $B$ and $C$. Let $A'$, $P_B$, and $P_C$ be the points on $\omega$, $AB$, and $AC$, respectively, at which the $A$-mixtilinear incircle touches. Define $B'$ and $C'$ in the same manner that $A'$ was defined. (See figure.)

[Figure: triangle $ABC$ with the labeled points $P_B$, $P_C$, $A'$, $B'$, $C'$, $O$, and $O_A$.]

Prove that triangles $C'P_BB$ and $CP_CB'$ are similar.

SOLUTIONS

The Lenstra Constant of a Ring

11628 [2012, 162]. Proposed by Jeffrey C. Lagarias and Michael E. Zieve, University of Michigan, Ann Arbor, MI. Define the Lenstra constant $L(R)$ of a commutative ring $R$ to be the size of the largest subset $A$ of $R$ such that $a - b$ is a unit (invertible element) in $R$ for any distinct elements $a, b \in A$. Show that for each positive integer $N$, the Lenstra constant of the ring $\mathbb{Z}(1/N)$ is the least prime that does not divide $N$.

Solution by Mark D. Meyerson, United States Naval Academy, Annapolis, MD. The elements of $\mathbb{Z}(1/N)$ are the numbers of the form $k/N^r$ with $k, r \in \mathbb{Z}$. Let $p_1^{e_1} \cdots p_m^{e_m}$ be the prime factorization of $N$; each $e_i$ is a positive integer. The units in $\mathbb{Z}(1/N)$ are numbers of the form $\pm p_1^{d_1} \cdots p_m^{d_m}$ with each $d_i \in \mathbb{Z}$. Let $p$ be the least prime that does not divide $N$. The set $\{1, \ldots, p\}$ has the property that any difference of two distinct elements is a unit, since any prime factor of such a difference is a prime factor of $N$. Hence, $L(\mathbb{Z}(1/N)) \ge p$.


Now, let $L$ be a subset of $\mathbb{Z}(1/N)$ such that any nonzero difference is a unit, and suppose that $|L| > p$. By deleting extra elements, we may assume $|L| = p + 1$. If we multiply the $p + 1$ elements of $L$ by a sufficiently high power of $N$ to make all the elements integers, the nonzero differences will still be units. However, by the pigeonhole principle, two of the $p + 1$ elements are congruent mod $p$. Their difference is a multiple of $p$ and hence is not a unit. It follows that $L(\mathbb{Z}(1/N)) \le p$. The two inequalities prove that $p$ is the Lenstra constant of this ring.
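The lower-bound half of the argument is easy to test for small $N$. The sketch below is an illustration with names of our own choosing (`least_prime_not_dividing`, `is_unit`): a nonzero integer is a unit in $\mathbb{Z}(1/N)$ exactly when all of its prime factors divide $N$, so we strip common factors with $N$ until nothing more can be removed.

```python
from itertools import combinations
from math import gcd


def least_prime_not_dividing(N):
    """Smallest prime p with p not dividing N (trial division, fine for small N)."""
    p = 2
    while True:
        is_prime = all(p % q for q in range(2, int(p**0.5) + 1))
        if is_prime and N % p != 0:
            return p
        p += 1


def is_unit(k, N):
    """True if the nonzero integer k is a unit in Z(1/N)."""
    k = abs(k)
    if k == 0:
        return False
    g = gcd(k, N)
    while g > 1:          # remove every prime factor that k shares with N
        k //= g
        g = gcd(k, N)
    return k == 1


if __name__ == "__main__":
    for N in (1, 2, 6, 30, 210):
        p = least_prime_not_dividing(N)
        good = all(is_unit(a - b, N) for a, b in combinations(range(1, p + 1), 2))
        print(N, p, good, is_unit(p, N))
```

The last column shows that $p$ itself is never a unit, which is what the pigeonhole step in the upper bound relies on.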

Also solved by P. Budney, N. Caro (Brazil), R. Chapman (U. K.), W. Chengyuan (Singapore), P. P. Dalyay (Hungary), S. Dey (India), D. Fleischman, O. Geupel (Germany), Y. J. Ionin, B. Karaivanov, J. H. Lindsey II, O. Lossers (Netherlands), A. Magidin, G. Martin (Canada), M. A. Prasad (India), F. Richman, J. Riegsecker, K. Schilling, J. H. Smith, J. H. Steelman, R. Stong, M. Tetiva (Romania), Colgate University Problem Solving Group, NSA Problems Group, TCDmath Problems Group (Ireland), Texas State University Problem Solving Group, University of Louisiana at Lafayette Math Club, and the proposers.

Rotatable Quasigroups

11631 [2012, 247–248]. Proposed by Pal Peter Dalyay, Szeged, Hungary. A quasigroup $(Q, *)$ is a set $Q$ together with a binary operation $*$ such that for each $a, b \in Q$ there exist unique $x$ and unique $y$ (which may be equal) such that $ax = b$ and $ya = b$. The Cayley table of a finite quasigroup is its ‘times table’. A quasigroup has property $P$ if each row of the table is a rotation of the first row.

Find all positive integers $n$ for which there exists a quasigroup $(\{1, \ldots, n\}, *)$ with property $P$ in which all elements are idempotent. (For instance, the Cayley table below defines a binary operation on $\{1, \ldots, 5\}$ with property $P$ in which each element is idempotent.)

    *   1  2  3  4  5
    1   1  5  4  3  2
    2   3  2  1  5  4
    3   5  4  3  2  1
    4   2  1  5  4  3
    5   4  3  2  1  5

Solution by Fred Richman, Florida Atlantic University, Boca Raton, FL. Such quasigroups exist if and only if $n$ is odd. Cayley tables are just Latin squares; idempotence requires diagonal $1, \ldots, n$ in order. The table is then determined by its first row and property $P$. The problem is thus to find a permutation of $1, \ldots, n$ as the first row so that the entries in the first column are distinct, since property $P$ then completes a Latin square for the table.

We calculate the first entry in row $1 * k$. This row is a rotation of row 1, and it must have $1 * k$ in column $1 * k$. Also row 1 has $1 * k$ in column $k$, so row 1 is rotated leftward by $k - (1 * k)$ positions to become row $1 * k$. Thus, the first entry in row $1 * k$ is $1 * [k - (1 * k) + 1]$. For these values to be distinct, the values $k - (1 * k)$ must be distinct modulo $n$.

When $n$ is odd, 2 is invertible (modulo $n$). Setting $1 * k \equiv 2 - k$ as in the proposer’s example yields $k - (1 * k) \equiv 2(k - 1)$, and these elements are distinct (modulo $n$). When $n$ is even, the values $k - (1 * k)$ cannot be distinct (modulo $n$) because $\sum_{i=1}^n i = (n + 1)n/2 \equiv n/2 \pmod{n}$ and $\sum_{k=1}^n (k - (1 * k)) = 0$.
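The construction in the solution can be carried out mechanically. The following sketch uses helper names of our own (`table_from_first_row`, `is_latin`, `proposer_first_row`): it rotates a given first row so that each $j$ lands on the diagonal, checks the Latin-square property, and, for a few small even $n$, searches all first rows beginning with 1 to confirm that none works.

```python
from itertools import permutations


def table_from_first_row(first):
    """Rows are rotations of `first`, with j placed in column j of row j."""
    n = len(first)
    table = []
    for j in range(1, n + 1):
        pos = first.index(j)               # 0-based column of j in the first row
        shift = (pos - (j - 1)) % n        # rotate left so that j reaches column j
        table.append(first[shift:] + first[:shift])
    return table


def is_latin(table):
    """True if every row and every column is a permutation of 1..n."""
    n = len(table)
    full = set(range(1, n + 1))
    rows_ok = all(set(row) == full for row in table)
    cols_ok = all({row[c] for row in table} == full for c in range(n))
    return rows_ok and cols_ok


def proposer_first_row(n):
    """First row with 1 * k congruent to 2 - k modulo n, using residues 1..n."""
    return [((1 - k) % n) + 1 for k in range(1, n + 1)]


if __name__ == "__main__":
    for row in table_from_first_row(proposer_first_row(5)):
        print(row)
    for n in (2, 4, 6):
        found = any(is_latin(table_from_first_row([1] + list(p)))
                    for p in permutations(range(2, n + 1)))
        print(n, found)
```

For $n = 5$ this reproduces the Cayley table in the problem statement.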

Editorial comment. When $n$ is odd, one can require even more: There are many idempotent commutative quasigroups on $\mathbb{Z}_n$, such as by putting $(i + j)/2$ in position $(i, j)$, using the uniqueness of the multiplicative inverse of 2. This construction for $n = 2k + 1$ is used in the Bose construction of a Steiner triple system on $6k + 3$ elements (R. C. Bose, On the construction of balanced incomplete block designs, Ann. Eugenics 9 (1939), 353–399).

Also solved by D. Beckwith, R. Chapman (U. K.), S. M. Gagola Jr., O. Geupel (Germany), A. Habil (Syria), E. A. Herman, Y. J. Ionin, B. Karaivanov, J. H. Lindsey II, J. M. Lockhart, O. P. Lossers (Netherlands), C. R. Pranesachar (India), R. E. Prather, J. H. Steelman, R. Stong, J. Wojdylo, Colgate University Problem Solving Group, GCHQ Problem Solving Group (U. K.), TCDmath Problem Group (Ireland), and the proposer.

A Harmonic Identity

11633 [2012, 248]. Proposed by Anthony Sofo, Victoria University, Melbourne, Australia. For real $a$, let $H_n^{(a)} = \sum_{j=1}^n j^{-a}$. Show that for integers $a$, $b$, and $n$ with $a \ge 1$, $b \ge 0$, and $n \ge 1$,
\[ \sum_{k=1}^n \frac{k\left(H_k^2 + H_k^{(2)}\right) + 2(k + b)^a H_k^{(1)} H_{k+b-1}^{(a)}}{k(k + b)^a} = H_{n+b}^{(a)} \left( H_n^2 + H_n^{(2)} \right). \]

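Solution by Subhadip Dey, Bangalore City, Karnataka, India. As in the problem, we use the notation $H_n = H_n^{(1)}$ and $H_0^{(a)} = 0$. Using the identities
\[ H_n^2 + H_n^{(2)} = 2 \sum_{j=1}^n \sum_{i=1}^j \frac{1}{ij} = 2 \sum_{j=1}^n \frac{H_j}{j} \quad\text{and}\quad \sum_{j=1}^n \sum_{i=1}^j a_i b_j = \sum_{i=1}^n \sum_{j=i}^n a_i b_j, \]
the first term on the left side of the identity becomes
\[ \sum_{k=1}^n \frac{H_k^2 + H_k^{(2)}}{(k + b)^a} = 2 \sum_{k=1}^n \sum_{j=1}^k \frac{H_j}{j(k + b)^a} = 2 \sum_{j=1}^n \frac{H_j}{j} \sum_{k=j}^n \frac{1}{(k + b)^a}. \]
Therefore, we compute
\[ \sum_{k=1}^n \frac{H_k^2 + H_k^{(2)}}{(k + b)^a} + 2 \sum_{k=1}^n \frac{H_k H_{k+b-1}^{(a)}}{k} = 2 \sum_{j=1}^n \frac{H_j}{j} \sum_{k=j}^n \frac{1}{(k + b)^a} + 2 \sum_{j=1}^n \frac{H_j}{j} H_{j+b-1}^{(a)} \]
\[ = 2 \sum_{j=1}^n \frac{H_j}{j} \left( \sum_{k=j}^n \frac{1}{(k + b)^a} + H_{j+b-1}^{(a)} \right) = 2 \sum_{j=1}^n \frac{H_j}{j} H_{n+b}^{(a)} = H_{n+b}^{(a)} \left( H_n^2 + H_n^{(2)} \right). \]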
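Since both sides of the identity are rational numbers, it can be confirmed exactly for small parameters. The sketch below is our own check (the names `H`, `lhs`, and `rhs` are assumptions), using exact rational arithmetic from the standard library.

```python
from fractions import Fraction


def H(n, a=1):
    """Generalized harmonic number H_n^{(a)}; the empty sum gives H_0^{(a)} = 0."""
    return sum((Fraction(1, j**a) for j in range(1, n + 1)), Fraction(0))


def lhs(n, a, b):
    """Left side of the identity in Problem 11633."""
    total = Fraction(0)
    for k in range(1, n + 1):
        numerator = k * (H(k) ** 2 + H(k, 2)) + 2 * (k + b) ** a * H(k) * H(k + b - 1, a)
        total += numerator / (k * (k + b) ** a)
    return total


def rhs(n, a, b):
    """Right side of the identity in Problem 11633."""
    return H(n + b, a) * (H(n) ** 2 + H(n, 2))


if __name__ == "__main__":
    for n, a, b in [(1, 1, 0), (3, 2, 1), (5, 3, 2)]:
        print((n, a, b), lhs(n, a, b), rhs(n, a, b))
```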

Editorial comment. Several solvers noted that the identity is valid for all real $a$. E. A. Herman generalized it to
\[ \sum_{k=1}^n \frac{k\left(H_k^p + H_k^{(p)}\right) + Z_{k,p} (k + b)^a H_k H_{k+b-1}^{(a)}}{k(k + b)^a} = H_{n+b}^{(a)} \left( H_n^p + H_n^{(p)} \right), \]
where $p$ is a positive even integer and $Z_{k,p} = \sum_{j=1}^{p-1} \binom{p}{j} H_k^{j-1} \left( -\frac{1}{k} \right)^{p-1-j}$.

Also solved by P. Bracken, R. Chapman (U. K.), P. P. Dalyay (Hungary), E. S. Eyeson, O. Geupel (Germany), E. A. Herman, B. Karaivanov, O. Kouba (Syria), O. P. Lossers (The Netherlands), M. Omarjee (France), C. R. Pranesachar (India), M. A. Prasad (India), J. H. Steelman, R. Stong, R. Tauraso (Italy), GCHQ Problem Solving Group (U. K.), and the proposer.


A Fractional Integral

11637 [2012, 344]. Proposed by Ovidiu Furdui, Technical University of Cluj-Napoca, Cluj, Romania. Let $m \ge 1$ be an integer. Let $\{u\} = u - \lfloor u \rfloor$; the quantity $\{u\}$ is called the fractional part of $u$. Prove that
\[ \int_0^1 \left\{ \frac{1}{x} \right\}^m x^m\,dx = 1 - \frac{1}{m + 1} \sum_{k=1}^m \zeta(k + 1). \]
(Here $\zeta$ denotes the Riemann zeta function.)

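Solution by Patrick J. Fitzsimmons, San Diego, CA. First note that $\left\{ \frac{1}{x} \right\} = \frac{1}{x} - n$ if $\frac{1}{n+1} < x \le \frac{1}{n}$. From this it follows that
\[ \int_0^1 \left\{ \frac{1}{x} \right\}^m x^m\,dx = \sum_{n=1}^\infty \int_{1/(n+1)}^{1/n} (1 - nx)^m\,dx = \sum_{n=1}^\infty \left[ \frac{(1 - nx)^{m+1}}{-n(m + 1)} \right]_{1/(n+1)}^{1/n} = \frac{1}{m + 1} \sum_{n=1}^\infty \frac{1}{n} \left( \frac{1}{n + 1} \right)^{m+1}. \]
On the other hand, with $Z = \sum_{k=1}^m \zeta(k + 1)$, we have
\[ Z = \sum_{k=1}^m \sum_{n=1}^\infty \frac{1}{n^{k+1}} = \sum_{n=1}^\infty \sum_{k=1}^m \frac{1}{n^{k+1}} = m + \sum_{n=2}^\infty \frac{\frac{1}{n^2} - \frac{1}{n^{m+2}}}{1 - \frac{1}{n}} = m + \sum_{n=2}^\infty \frac{1}{n(n - 1)} \left( 1 - \frac{1}{n^m} \right) \]
\[ = m + \sum_{n=2}^\infty \frac{1}{n(n - 1)} - \sum_{n=2}^\infty \frac{1}{(n - 1)n^{m+1}} = m + 1 - \sum_{n=1}^\infty \frac{1}{n(n + 1)^{m+1}}. \]
Thus both sides of the stated identity equal $\frac{1}{m+1} \sum_{n=1}^\infty \frac{1}{n(n+1)^{m+1}}$.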
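The rearrangement of $Z$ also holds for partial sums, which allows an exact check. Truncating every sum over $n$ at $N$ and repeating the steps above gives $Z_N = m + 1 - \frac{1}{N} - S_{N-1}$, where $Z_N$ and $S_N$ are the truncated sums computed below; this finite form is our own restatement, not part of the published solution. The sketch also compares the case $m = 1$ with the known value $\zeta(2) = \pi^2/6$.

```python
from fractions import Fraction
from math import pi


def zeta_partial_sum(m, N):
    """Z_N: the sum over k = 1..m and n = 1..N of 1 / n^(k+1), as an exact fraction."""
    return sum((Fraction(1, n ** (k + 1))
                for k in range(1, m + 1) for n in range(1, N + 1)), Fraction(0))


def series_partial_sum(m, N):
    """S_N: the sum over n = 1..N of 1 / (n (n+1)^(m+1)), as an exact fraction."""
    return sum((Fraction(1, n * (n + 1) ** (m + 1)) for n in range(1, N + 1)), Fraction(0))


if __name__ == "__main__":
    m, N = 3, 10
    print(zeta_partial_sum(m, N) == m + 1 - Fraction(1, N) - series_partial_sum(m, N - 1))
    # Case m = 1: the integral should be close to 1 - zeta(2) / 2.
    approx = sum(1.0 / (n * (n + 1) ** 2) for n in range(1, 2001)) / 2
    print(approx, 1 - pi**2 / 12)
```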

Editorial comment. A similar problem appeared as Problem 1845, Math. Mag., 84 (April 2011), 155–156, and as Problem 11206, this MONTHLY 114 (2007), 928–929. Eugene A. Herman showed for $a > m - 1$ that
\[ \int_0^1 \left\{ \frac{1}{x} \right\}^m x^a\,dx = \frac{1}{a - m + 1} - \frac{1}{m + 1} \sum_{k=1}^m \zeta(a + k - m + 1)\, \frac{\binom{m+1}{m+1-k}}{\binom{a+1}{m+1-k}}. \]

Also solved by T. Amdeberhan, P. J. Anderson (Canada), M. Bataille (France), D. Beckwith, K. N. Boyadzhiev, M. A. Carlton, N. Caro (Brazil), R. Chapman (U. K.), M. W. Coffey, C. Curtis, P. P. Dalyay (Hungary), E. S. Eyeson, D. Fleischman, O. Geupel (Germany), M. L. Glasser, M. Goldenberg & M. Kaplan, D. Gove, G. C. Greubel, J.-P. Grivaux (France), J. A. Grzesik, E. A. Herman, E. Hysnelaj (Australia) & E. Bojaxhiu (Germany), W. Janous (Austria), B. Karaivanov, D. R. Kim (Korea), O. Kouba (Syria), H. Kwong, J. B. Little, O. P. Lossers (Netherlands), I. Mezo (Hungary), U. Milutinovic (Slovenia), J. Minkus, R. Nandan, M. Omarjee (France), P. Perfetti (Italy), T. Perrson & M. P. Sundqvist (Sweden), C. R. Pranesachar (India), M. A. Prasad (India), R. Pratt, V. Sah, J. Schlosberg, N. C. Singer, A. Stenger, R. Stong, R. Tauraso (Italy), D. B. Tyler, J. Vinuesa (Spain), T. Viteam (Uruguay), M. Vowe (Switzerland), A. Witkowski (Poland), J. Zacharias, GCHQ Problem Solving Group (U. K.), Missouri State University Problem Solving Group, NSA Problems Group, TCDmath Problem Group (Ireland), and the proposer.


Independent Triples in a Discrete Probability Space

11643 [2012, 426]. Proposed by Eugen J. Ionascu, Columbus State University, Columbus, GA. Let $r$ be a real number with $0 < r < 1$, and define a discrete probability measure $P$ on $\mathbb{N}$ by $P(k) = (1 - r)r^{k-1}$ for $k \ge 1$. Show that there are uncountably many triples $(A_1, A_2, A_3)$ of subsets of $\mathbb{N}$ that are mutually independent, that is, $P(A_i \cap A_j) = P(A_i)P(A_j)$ for $i \ne j$ and $P(A_1 \cap A_2 \cap A_3) = P(A_1)P(A_2)P(A_3)$.

Solution by Oliver Geupel, Bruhl, NRW, Germany. Let $A_1 = \bigcup_{m \ge 0} \{4m + 1, 4m + 2\}$ and $A_2 = \bigcup_{m \ge 0} \{4m + 1, 4m + 3\}$. For any set $B$ of nonnegative integers, let $A_3 = \bigcup_{m \in B} \{4m + 1, 4m + 2, 4m + 3, 4m + 4\}$. Since $B$ is arbitrary, there are uncountably many such triples.

We show that the events $A_1, A_2, A_3$ are mutually independent. We have
\[ P(A_1) = (1 - r) \sum_{m=0}^\infty (r^{4m} + r^{4m+1}) = (1 - r) \frac{1 + r}{1 - r^4} = \frac{1}{1 + r^2}, \]
\[ P(A_2) = (1 - r) \sum_{m=0}^\infty (r^{4m} + r^{4m+2}) = (1 - r) \frac{1 + r^2}{1 - r^4} = \frac{1}{1 + r}, \quad\text{and} \]
\[ P(A_3) = (1 - r) \sum_{m \in B} (r^{4m} + r^{4m+1} + r^{4m+2} + r^{4m+3}) = (1 - r^4) \sum_{m \in B} r^{4m}. \]
Furthermore,
\[ P(A_1 \cap A_2) = (1 - r) \sum_{m=0}^\infty r^{4m} = \frac{1 - r}{1 - r^4} = P(A_1)P(A_2), \]
\[ P(A_1 \cap A_3) = (1 - r) \sum_{m \in B} (r^{4m} + r^{4m+1}) = (1 - r^2) \sum_{m \in B} r^{4m} = P(A_1)P(A_3), \]
\[ P(A_2 \cap A_3) = (1 - r) \sum_{m \in B} (r^{4m} + r^{4m+2}) = P(A_2)P(A_3), \]
and
\[ P(A_1 \cap A_2 \cap A_3) = (1 - r) \sum_{m \in B} r^{4m} = P(A_1)P(A_2)P(A_3). \]

Editorial comment. Many solvers noted that there are trivial solutions, such as $A_1 = A_2 = \mathbb{N}$ and $A_3$ arbitrary. The solution presented here demonstrates that the sets can be required to be nontrivial.

Solved also by M. Carlton, J. H. Lindsey II, M. D. Meyerson, M. Rajeswari (India), K. Schilling, R. Stong,GCHQ Problem Solving Group (U. K.), and the proposer.
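The independence identities in this solution can be spot-checked numerically. The following is a minimal Python sketch, not part of the published solution: the value of r, the set B, the truncation length, and all function names are illustrative assumptions, and the infinite sums are truncated at 400 terms, which is far below floating-point precision for r = 0.7.

```python
def prob(event, r, terms=400):
    """P(event) for P(k) = (1 - r) * r**(k - 1), truncated to 1 <= k <= terms."""
    return sum((1 - r) * r ** (k - 1) for k in range(1, terms + 1) if event(k))

r = 0.7              # illustrative value in (0, 1)
B = {0, 2, 3, 7}     # illustrative choice; any set of nonnegative integers may be used

def in_A1(k):
    return k % 4 in (1, 2)        # union of {4m+1, 4m+2}

def in_A2(k):
    return k % 4 in (1, 3)        # union of {4m+1, 4m+3}

def in_A3(k):
    return (k - 1) // 4 in B      # union of blocks {4m+1, ..., 4m+4} with m in B

p1, p2, p3 = prob(in_A1, r), prob(in_A2, r), prob(in_A3, r)
p12 = prob(lambda k: in_A1(k) and in_A2(k), r)
p13 = prob(lambda k: in_A1(k) and in_A3(k), r)
p23 = prob(lambda k: in_A2(k) and in_A3(k), r)
p123 = prob(lambda k: in_A1(k) and in_A2(k) and in_A3(k), r)

# Each difference should vanish up to round-off.
print(abs(p12 - p1 * p2), abs(p13 - p1 * p3),
      abs(p23 - p2 * p3), abs(p123 - p1 * p2 * p3))
```

Changing B changes only the third event, which is how the construction produces uncountably many triples.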

Factorable Polynomials

11645 [2012, 427]. Proposed by Christopher J. Hillar, University of California, Berkeley, CA, Lionel Levine, Cornell University, Ithaca, NY, and Darren Rhea, University of California, San Francisco, CA. Determine all positive integers $n$ such that the polynomial $g$ in two variables given by $g(x, y) = 1 + y^2 \sum_{k=1}^{n} x^{2k} + y^4 x^{2n+2}$ factors in $\mathbb{C}[x, y]$.


Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The Netherlands. For $n = 1$, $g$ has $x^2 y^2 - \xi$ as a factor, where $\xi$ is a primitive cube root of unity in $\mathbb{C}$. For $n = 2$, $g$ has $x^2 y^2 + 1$ as a factor. We claim that $g$ does not factor in $\mathbb{C}[x, y]$ when $n \ge 3$. Equivalently, we claim that $h$ does not factor in $\mathbb{C}[x, y]$ when $n \ge 3$, where $h(x, y) = y^4 + y^2 \sum_{k=1}^{n} x^{2k} + x^{2n+2}$.

First, we note that $h$ is a polynomial in $y^2$ over $\mathbb{C}[x]$, so if $h$ has a linear factor, necessarily of the form $y + a(x)$, then $y - a(x)$ is another linear factor and so $y^2 - a(x)^2$ is a quadratic factor of $h$.

If our claim is false, then $h$ factors as a product of two quadratic polynomials in $y$ over $\mathbb{C}[x]$, and such a factorization has the form

$$h(x, y) = \left(y^2 + a(x)y + \lambda x^r\right)\left(y^2 - a(x)y + \lambda^{-1} x^s\right),$$

where $r$ and $s$ are nonnegative integers such that $r + s = 2n + 2$ and $a(x) \in \mathbb{C}[x]$. Inspecting the coefficient of $y$ shows that $\lambda x^r a(x) = \lambda^{-1} x^s a(x)$. Now, $a(x) = 0$ is impossible if $n \ge 3$, as the expression for $h(x, y)$ would not have enough terms. Therefore, $r = s = n + 1$ and $\lambda = \pm 1$.

Let $\sigma$ be the polynomial given by $\sigma(x) = \sum_{k=1}^{n} x^{2k}$. From the coefficient of $y^2$ in $h$, we see that $\sigma = 2\lambda x^{n+1} - a^2(x)$, with $a$ of degree $n$. Writing $a = ib$ and $b = c(x^2) + x\,d(x^2)$ gives

$$\sigma = 2\lambda x^{n+1} + c^2(x^2) + 2x\,c(x^2)\,d(x^2) + x^2 d^2(x^2). \tag{1}$$

If $n$ is even, then equating odd parts in (1) gives $0 = 2\lambda x^{n+1} + 2x\,c(x^2)\,d(x^2)$, whence $c$ and $d$ must be monomials. But then the left side of (1) has $n$ terms while the right side has just two. So $n$ is odd, say $n = 2m + 1$.

In this case, from (1) it follows that $cd = 0$, and since $b(x) = c(x^2) + x\,d(x^2)$ has degree $2m + 1$, it must be $c$ that is 0 so that $b$ can have odd degree. Writing $z = x^2$, we thus have

$$\sum_{k=1}^{2m+1} z^k = z\,d^2(z) + 2\lambda z^{m+1}, \tag{2}$$

where $d$ has the form $d = 1 + \sum_{j=1}^{m-1} d_j z^j + \varepsilon z^m$ with $\varepsilon \in \{-1, 1\}$. The first $m$ terms of $d$ now coincide with those of $(1 - z)^{-1/2}$, so $d_j = (-1)^j \binom{-1/2}{j}$ for $0 \le j \le m - 1$. A similar calculation for the last $m$ coefficients of $d$ shows that $d_{m-j} = \varepsilon d_j$ for $0 \le j \le m - 1$. But that gives contradictory values for $d_1$ when $m \ge 1$, so there is no factorization if $n \ge 3$, as claimed.

Also solved by G. Apostolopoulos (Greece), R. Chapman (U. K.), P. P. Dalyay (Hungary), D. Fleischman,O. Geupel (Germany), M. Goldenberg & M. Kaplan, E. A. Herman, B. Karaivanov, O. Kouba (Syria),J. H. Lindsey II, A. Magidin, M. A. Prasad (India), N. Singer, R. Stong, E. Verriest, and the proposers.
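The two small cases can be confirmed by direct multiplication. The sketch below is an illustration only, with hypothetical helper names and a dictionary representation chosen for this example. It uses the identity u^2 + u + 1 = (u - ω)(u - ω̄) with u = x^2 y^2 and ω = e^{2πi/3} for n = 1, and the product (x^2 y^2 + 1)(x^4 y^2 + 1) for n = 2.

```python
import cmath
from collections import defaultdict

def poly_mul(p, q):
    """Multiply polynomials in x, y stored as {(deg_x, deg_y): coefficient}."""
    out = defaultdict(complex)
    for (i1, j1), c1 in p.items():
        for (i2, j2), c2 in q.items():
            out[(i1 + i2, j1 + j2)] += c1 * c2
    return dict(out)

def g_poly(n):
    """g(x, y) = 1 + y^2 * sum_{k=1..n} x^(2k) + y^4 * x^(2n+2)."""
    poly = {(0, 0): 1, (2 * n + 2, 4): 1}
    for k in range(1, n + 1):
        poly[(2 * k, 2)] = 1
    return poly

def same_poly(p, q, tol=1e-12):
    """Compare coefficient by coefficient, tolerating floating-point round-off."""
    return all(abs(p.get(e, 0) - q.get(e, 0)) < tol for e in set(p) | set(q))

omega = cmath.exp(2j * cmath.pi / 3)   # a primitive cube root of unity

# n = 1: (x^2 y^2 - omega)(x^2 y^2 - conj(omega))
n1_product = poly_mul({(2, 2): 1, (0, 0): -omega},
                      {(2, 2): 1, (0, 0): -omega.conjugate()})
# n = 2: (x^2 y^2 + 1)(x^4 y^2 + 1)
n2_product = poly_mul({(2, 2): 1, (0, 0): 1},
                      {(4, 2): 1, (0, 0): 1})

print(same_poly(n1_product, g_poly(1)))
print(same_poly(n2_product, g_poly(2)))
```

This checks only that the stated factors work for n = 1 and n = 2; it says nothing about irreducibility for n ≥ 3, which is what the argument above establishes.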

A Geometric Inequality

11646 [2012, 427]. Proposed by Pal Peter Dalyay, Szeged, Hungary. Let $ABC$ be an acute triangle, and let $A_1$, $B_1$, $C_1$ be the intersection points of the angle bisectors from $A$, $B$, $C$ to the respective opposite sides. Let $R$ and $r$ be the circumradius and the inradius of $ABC$, and let $R_A$, $R_B$, $R_C$ be the circumradii of the triangles $AC_1B_1$, $BA_1C_1$, and $CA_1B_1$, respectively. Let $H$ be the orthocenter of $ABC$, and let $d_a$, $d_b$, $d_c$ be the distances from $H$ to sides $BC$, $CA$, and $AB$, respectively. Show that

$$2r(R_A + R_B + R_C) \ge R(d_a + d_b + d_c).$$


Solution by Peter Nuesch, Switzerland. Our solution uses Problem 11552 (this MONTHLY, October 2012, p. 702–703). We write $a$, $b$, $c$ for the lengths of the sides of $\triangle ABC$, $s$ for the semi-perimeter, and $\alpha$, $\beta$, $\gamma$ for the measures of the angles. From the definitions, we have

$$AB_1 = \frac{bc}{c + a}, \quad AC_1 = \frac{bc}{a + b}, \quad B_1C_1 = a_1 = 2R_A \sin\alpha.$$

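Using $a = 2R\sin\alpha$, we get $R_A = Ra_1/a$. Thus,

$$R_A + R_B + R_C = R\left(\frac{a_1}{a} + \frac{b_1}{b} + \frac{c_1}{c}\right) \ge R\left(1 + \frac{r}{R}\right) = R + r,$$

where the inequality is Problem 11552. From $d_a = 2R\cos\beta\cos\gamma$, we have

$$d_a + d_b + d_c = 2R(\cos\beta\cos\gamma + \cos\gamma\cos\alpha + \cos\alpha\cos\beta) = \frac{r^2 + s^2 - 4R^2}{2R}.$$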

Note that $(r^2 + s^2 - 4R^2)/2R \le (2r(R + r))/R$, since this is a rearrangement of a Blundon inequality, $s^2 \le 4R^2 + 4Rr + 3r^2$. (This follows from $s^2 \le 2R^2 + 10Rr - r^2 + 2(R - 2r)\sqrt{R(R - 2r)}$, found in W. J. Blundon, Inequalities associated with the triangle, Canad. Math. Bull. 8 (1965) 615–626.)

This proves $2r(R_A + R_B + R_C) \ge 2r(R + r) \ge R(d_a + d_b + d_c)$.

Also solved by B. Karaivanov, J. Zacharias, and the proposer.
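As a sanity check on the chain of inequalities, one can evaluate all three quantities for a specific acute triangle. The following Python sketch is illustrative only: the function name and the sample sides 7, 8, 9 are assumptions, the bisector-foot lengths come from the angle bisector theorem, and each small circumradius is computed from the law of cosines and the law of sines.

```python
import math

def inequality_chain(a, b, c):
    """Return (2r(R_A+R_B+R_C), 2r(R+r), R(d_a+d_b+d_c)) for a triangle with sides a, b, c."""
    s = (a + b + c) / 2
    area = math.sqrt(s * (s - a) * (s - b) * (s - c))   # Heron's formula
    R = a * b * c / (4 * area)                          # circumradius
    r = area / s                                        # inradius
    alpha = math.acos((b * b + c * c - a * a) / (2 * b * c))
    beta = math.acos((a * a + c * c - b * b) / (2 * a * c))
    gamma = math.pi - alpha - beta

    def corner_circumradius(u, v, angle):
        # Triangle with sides u, v enclosing `angle`: third side by the law
        # of cosines, then circumradius by the law of sines.
        third = math.sqrt(u * u + v * v - 2 * u * v * math.cos(angle))
        return third / (2 * math.sin(angle))

    R_A = corner_circumradius(b * c / (c + a), b * c / (a + b), alpha)  # AB1, AC1
    R_B = corner_circumradius(a * c / (b + c), a * c / (a + b), beta)   # BA1, BC1
    R_C = corner_circumradius(a * b / (b + c), a * b / (c + a), gamma)  # CA1, CB1

    d_sum = 2 * R * (math.cos(beta) * math.cos(gamma)
                     + math.cos(gamma) * math.cos(alpha)
                     + math.cos(alpha) * math.cos(beta))
    return 2 * r * (R_A + R_B + R_C), 2 * r * (R + r), R * d_sum

lhs, mid, rhs = inequality_chain(7, 8, 9)   # an acute triangle
print(lhs, mid, rhs)                        # expect lhs >= mid >= rhs
```

For an equilateral triangle all three values coincide, which shows that both inequalities in the chain are sharp.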

A Subset That Is Not Closed

11648 [2012, 427]. Proposed by Moubinool Omarjee, Paris, France. Let $E$ be the set of all continuous, differentiable functions from $(0, 1]$ into $\mathbb{R}$ such that $\int_0^1 t^{1/2} f^2(t)\,dt$ converges. Let $F$ be the set of all $f$ in $E$ such that $\int_0^1 t^{-3/2} f^2(t)\,dt$ and $\int_0^1 t^{1/2} f'(t)^2\,dt$ converge. Equip $E$ with the distance

$$d(f, g) = \left(\int_0^1 t^{1/2} (f - g)^2(t)\,dt\right)^{1/2}$$

to make it a metric space. Is $F$ a closed subset of $E$?

Solution by O. P. Lossers, Eindhoven University of Technology, Eindhoven, The Netherlands. No, $F$ is not closed. Consider $f(t) = t^{1/4}$, so that $f \in E$ but $f \notin F$. Let $\phi$ be a differentiable function with minimum 0 and maximum 1, and such that $\phi(t) = 0$ for $0 < t < 1$ and $\phi(t) = 1$ for $t > 2$. Define $\phi_\varepsilon(t) = \phi(t/\varepsilon)$ for $\varepsilon > 0$. Note that $\phi_\varepsilon f \in F$. Now

$$d(f, \phi_\varepsilon f)^2 = \int_0^1 t^{1/2} \left(1 - \phi(t/\varepsilon)\right)^2 f(t)^2\,dt \le \int_0^{2\varepsilon} t^{1/2} f(t)^2\,dt,$$

which goes to 0 as $\varepsilon$ goes to 0. Hence, $f$ is in the closure of $F$.
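For the specific choice $f(t) = t^{1/4}$, the integrals used above can be written out explicitly. The following worked computation is an added illustration, not part of the published solution:

```latex
\begin{align*}
\int_0^1 t^{1/2} f(t)^2\,dt &= \int_0^1 t\,dt = \tfrac{1}{2} < \infty
  && \text{so } f \in E,\\
\int_0^1 t^{-3/2} f(t)^2\,dt &= \int_0^1 \frac{dt}{t} = \infty
  && \text{so } f \notin F,\\
\int_0^{2\varepsilon} t^{1/2} f(t)^2\,dt &= \int_0^{2\varepsilon} t\,dt = 2\varepsilon^2 \to 0
  && \text{as } \varepsilon \to 0.
\end{align*}
```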

Editorial comment. If the problem statement had said “continuously differentiable” and not just “continuous, differentiable”, then the above argument would in fact show that $F$ is dense in $E$.

Also solved by P. P. Dalyay (Hungary), O. Kouba (Syria), J. H. Lindsey II, R. Stong, and GCHQ Problem Solving Group (U. K.).


Review
Encounters with Chaos and Fractals. 2nd edition. By Denny Gulick. Chapman and Hall/CRC Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95.
Review by: Jeffrey Nunemacher
The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 373-376
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.373

Accessed: 30/03/2014 17:31


REVIEWS
Edited by Jeffrey Nunemacher

Mathematics and Computer Science, Ohio Wesleyan University, Delaware, OH 43015

Encounters with Chaos and Fractals, 2nd edition. By Denny Gulick. Chapman and Hall/CRC Press, Boca Raton, 2012, xvi + 371 pp., ISBN 978-1-58488-517-7, $79.95.

Reviewed by Jeffrey Nunemacher

How can we convince undergraduates that mathematics is as modern and vibrant as physics or biology in these days of the Higgs boson and genome sequencing? Certainly ordinary calculus, although it is intellectually rich, does not do the trick. Since most promising mathematics and science students see it first in high school, the level of excitement that I still remember from seeing it presented relatively rigorously in college many years ago is simply not present today. My candidate for a teachable contemporary mathematical topic that can attract modern students is chaotic dynamical systems, or to give the subject a more enticing name, chaos and fractals. I have taught courses on this subject at a variety of levels from freshman honors to senior capstone. And the text that I have enjoyed using the most (at least for a lower-level version) is the Gulick book, which has recently appeared in a second edition. The new edition offers more material on fractals (three chapters rather than one) and gives expanded coverage of background material and attention to modern algorithms. This second edition is the subject of the current review.

The subject of chaos was invented around the turn of the twentieth century by Poincaré (but named much later by Yorke). He showed that a deterministic system of second-order differential equations modeling a particular three-body solar system can have solutions that display sensitive dependence on initial conditions. Thus, some trajectories simply cannot be predicted with any degree of accuracy over the long term. But the subject did not really take off until the development of software for experimentation and graphics. Once these tools were available and applications to subjects like weather prediction and chemical reactions were discovered, there was incentive to find the correct mathematical framework and to build an appropriate theory. Some chaotic trajectories display fractal behavior, so this modern geometric concept occurs naturally in the study of chaotic systems. Fractals also occur as the limit sets of simple discrete dynamical systems. Take, for instance, the iterated function system (IFS) defined by the three affine mappings of the plane: $T_1(v) = \frac{1}{2}v$, $T_2(v) = \frac{1}{2}v + (1/2, 0)$, $T_3(v) = \frac{1}{2}v + (1/4, \sqrt{3}/4)$. If we start with the origin and iterate this IFS many times, the limit set is the famous Sierpinski gasket, and a good computer image is obtained by using the tenth iterate.
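The iteration just described is short enough to sketch in code. The following Python fragment is an illustrative assumption about how one might implement it (the function names are hypothetical and it is not taken from the book under review): each step applies all three maps to every point of the current set, so the n-th iterate of the origin contains 3^n points.

```python
import math

H = math.sqrt(3) / 4   # vertical offset used by the third map

def apply_ifs(points):
    """Apply T1, T2, T3 to every point and return the union of the images."""
    image = set()
    for x, y in points:
        image.add((x / 2, y / 2))               # T1(v) = v/2
        image.add((x / 2 + 0.5, y / 2))         # T2(v) = v/2 + (1/2, 0)
        image.add((x / 2 + 0.25, y / 2 + H))    # T3(v) = v/2 + (1/4, sqrt(3)/4)
    return image

def sierpinski_iterate(n):
    """Return the n-th iterate of the IFS started from the origin."""
    points = {(0.0, 0.0)}
    for _ in range(n):
        points = apply_ifs(points)
    return points

tenth = sierpinski_iterate(10)
print(len(tenth))   # number of points in the tenth iterate
```

Plotting the points of `tenth` as a scatter plot, with any plotting tool, gives an approximate picture of the gasket.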

It is possible to teach much of this material to motivated students who have a background of only first-semester calculus. Of course, the more mathematics a student knows the better, but a course can be taught with this very minimal prerequisite. Encountered during the course will be some topics from sophomore courses including iteration (discrete mathematics), matrices (linear algebra), the qualitative study of solutions of differential equations, and algorithms employing pseudorandom numbers (programming and statistics). But it can be argued that seeing interesting topics from these courses as they arise naturally is the best possible motivation for then taking the standard courses. Ideas from topology and real analysis, both on the line and in abstract spaces, also come up naturally as the course proceeds.

http://dx.doi.org/10.4169/amer.math.monthly.121.04.373

The subjects of chaos and fractals have been part of the undergraduate mathematical landscape ever since Devaney’s first edition of his attractive book [3] in 1985. A special issue of the College Mathematics Journal (Volume 22, No. 1, January 1991) was devoted to this new topic and discusses how it might fit into the undergraduate curriculum. Let me list aspects of the subject that I find particularly well emphasized in the Gulick book.

1. There are fundamental simple, yet fecund, examples to explore and generalize, e.g., the quadratic mapping in one variable, the Sierpinski gasket, Smale’s Horseshoe mapping in the plane, the Lorenz system of differential equations (which provided the first example of chaos in a real situation), the Mandelbrot set in the complex plane. Most students and some professors do not appreciate how crucial examples are for the development of a mathematical subject. Since most of the mathematics that we teach is quite old, motivating examples are often treated very briefly in the rush to get to theorems. The examples in chaos and fractals are rich and somewhat complicated, and it is natural to linger over them. Thus the subject is a good corrective to standard courses. Gulick does a good job analyzing these examples, starting with easy mathematics but getting to some depth.

2. A variety of tools and theory are useful in exploring these examples, e.g., derivatives and Jacobians, Lyapunov exponents, symbolic dynamics, conjugacy, bifurcation theory.

3. It is nontrivial to arrive at the best definitions on which to base the relevant theory, e.g., strong and weak chaos for function iteration, the Hausdorff metric on the space of compact sets in the plane, and various versions of dimension. Recall the Bourbaki point of view that definitions in mathematics should be carefully constructed (and perhaps difficult) in order to make the theorems easy.

4. There are surprising fundamental theorems, e.g., Sharkovsky’s Theorem about the occurrence of periods in one dimension based on a particular total ordering of the natural numbers, the Stable and Unstable Manifold Theorem, the connectedness of the Mandelbrot set.

5. It is natural to explore the phenomena of both chaos and fractals using computational resources, e.g., to draw bifurcation diagrams, to approximate fractal sets. Chaos is a wonderful subject for exploratory mathematics.

6. Finally, examples from a wide variety of applied areas are available to show the relevance of this subject, e.g., chaotic pendula in mechanics or fractal coastlines in geography.

While there are many excellent texts about these subjects at various levels, I have not found a better book than Gulick’s for a serious course at the honors freshman or sophomore level. There are very elementary books that concentrate on intuitive understanding and visual images, and many others that require a greater depth of mathematical background. One requirement, which to me is important, is that the course (and thus the text) should treat both discrete and continuous dynamical systems. I feel that the richness and applications of the subject can only be seen by studying both types. Excellent books that fail to satisfy this criterion (and which also are too advanced for beginning students) are [1], [7], and [10]. A standard book on fractals and the algorithms to produce them on a computer is [2]. Typically, for my course I use a main text and then also a good expository book of broader scope. In the past I have selected popular books by Gleick [4], Peterson [6], Ruelle [8], or Stewart [9]. Of these, the one I’ve enjoyed using the most is [9]. I’ve also required each student to do an independent project, which can be experimental, computational, or mathematical.

Next, I will briefly discuss the contents of Gulick’s book. The topics are well chosen for the not particularly advanced but still seriously mathematical undergraduate course that I envision. The book begins with two chapters on discrete one-dimensional iteration, the first devoted to simple examples, fixed and periodic points, and bifurcation, and the second focusing on chaotic behavior. Some of the examples are explored in some detail; for example, the study of the logistic family $Q_\mu(x) = \mu x(1 - x)$ requires ten pages. The two most common bifurcations, namely the period-doubling bifurcation and the tangent bifurcation, are studied and explored in examples and problems. The Li-Yorke Theorem, which asserts that if $f$ is continuous on a closed interval $J$ and maps $J$ into itself, then if $f$ has a period-3 point it also has points of all other periods, is proven in detail, while its generalization by Sharkovsky is simply stated and discussed. By the way, the Li-Yorke result first appeared in this MONTHLY in 1975 [5] and is one of the early papers that made the subject of chaos popular. The tools needed in one dimension are the single-variable derivative and a computational system to explore examples of iteration. Chapter 3 generalizes these ideas to two dimensions using simple matrix theory and the Jacobian, and explores two classic examples of chaotic behavior: the Hénon quadratic mapping and Smale’s Horseshoe.

Chapter 4 moves from the discrete setting to continuous dynamical systems, which are defined in terms of first-order differential equations. It generalizes the basic concepts to this setting and explores the pendulum system and the Lorenz system as two examples. No experience in solving differential equations is necessary. The basic idea of a differential equation defining a flow, together with some of the basic properties of the flow, is developed. Continuous dynamical systems require more machinery and sophistication to develop (which is mostly not done in this book). However, the most important applications of chaos to reality lie in this realm. There are also some philosophical points to make about the modeling process. For example, since chaos is a mathematical construct, it can apply to a given mathematical model of reality but never to physical reality itself. Thus no phenomenon can ever be chaotic in the mathematical sense.

The last three chapters of the book concentrate on fractals. Chapter 6 introduces the basic idea of a fractal and discusses self-similarity and various kinds of fractal dimension. It also presents some basic examples, such as the Cantor set, the Sierpinski gasket, and the Hénon attractor. Chapter 7 discusses Barnsley’s Iterated Function Systems using metric spaces and shows how they can be used to generate fractals on a computer. This chapter includes several elegant and useful results, such as the completeness of the collection of compact sets in the plane under the Hausdorff metric. Finally, Chapter 8 studies fractals in the complex plane and introduces Julia sets and the Mandelbrot set. The second edition of the book offers enhanced coverage of fractals beyond what was presented in the first edition.

An appendix in the book presents MATLAB functions to allow the study of iteration empirically and to generate on the computer the classical images associated with chaos and fractals. For instance, there is MATLAB code to produce the bifurcation diagram of the logistic mapping $Q_\mu(x)$ as $\mu$ varies over an interval, to draw the Hénon attractor, and to display Julia sets and the Mandelbrot set. It is a good choice to offer these experimental tools in a commonly available package, since the operation of the code can be understood with minimal effort. However, minor errors in some of the code will cause problems for beginning users of MATLAB. For example, in Program 4 to produce a bifurcation diagram, I found four separate errors: The increments are 0.01, not 0.001 as promised (and 0.001 is necessary to obtain a good picture); a closing parenthesis is needed for the axis command; “hold on” should replace “holdon”; and “m” should replace “n” as an argument for the function Qm. The author intends to correct the errors on an Errata page. It is important for inexperienced users to be able to use the code, since the ability to experiment with examples is one of the most attractive features of this area of mathematics.
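To make concrete what such a program computes, here is a minimal Python sketch of the data behind a bifurcation diagram for the logistic family Q_µ(x) = µx(1 − x). It is not the book's MATLAB Program 4. The function names, the starting point x0 = 0.3, the transient length, the number of retained samples, and the µ-range are all illustrative assumptions; only the increment 0.001 comes from the remark above.

```python
def logistic(mu, x):
    """The logistic map Q_mu(x) = mu * x * (1 - x)."""
    return mu * x * (1 - x)

def attractor_samples(mu, x0=0.3, transient=500, keep=64):
    """Iterate Q_mu from x0, discard a transient, and return the next `keep` values."""
    x = x0
    for _ in range(transient):
        x = logistic(mu, x)
    samples = []
    for _ in range(keep):
        x = logistic(mu, x)
        samples.append(x)
    return samples

def bifurcation_data(mu_min=2.5, mu_max=4.0, step=0.001):
    """Return (mu, x) pairs: for each mu on the grid, the sampled long-run orbit."""
    n_steps = round((mu_max - mu_min) / step)
    data = []
    for i in range(n_steps + 1):
        mu = mu_min + i * step
        data.extend((mu, x) for x in attractor_samples(mu))
    return data

# A fixed point for mu = 2.8 and a 2-cycle for mu = 3.2 (period doubling at mu = 3).
print(sorted({round(x, 3) for x in attractor_samples(2.8)}))
print(sorted({round(x, 3) for x in attractor_samples(3.2)}))
```

Plotting the pairs returned by `bifurcation_data()` as small dots, µ on the horizontal axis, produces the diagram; a coarser step such as 0.01 leaves visible gaps between columns.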

I also found a few small errors in the text and exercises and some imprecise statements, such as Lemma 1 on page 227, which is stated for all increasing continuous functions but applies only to Cantor-like ones. Also, there is a loose statement on page 214 that asserts concepts pertinent to two-dimensional differential equations apply equally well in dimension three. The Poincaré-Bendixson Theorem, which is used in the book, is a rather stark counterexample to this assertion.

Despite these minor errors, I feel that this book is the best text available for a midlevel undergraduate course on chaos and fractals. The choice of topics, readable prose, and level of presentation make it a very attractive book.

REFERENCES

1. K. T. Alligood, T. D. Sauer, J. A. Yorke, Chaos: An Introduction to Dynamical Systems. Springer-Verlag, New York, 1996.
2. M. F. Barnsley, Fractals Everywhere. Second edition, Academic Press, Boston, MA, 1993.
3. R. L. Devaney, An Introduction to Chaotic Dynamical Systems. Second edition, Addison-Wesley, Reading, MA, 1989.
4. J. Gleick, Chaos: The Making of a New Science. Viking, New York, 1988.
5. T. Y. Li, J. A. Yorke, Period three implies chaos, Amer. Math. Monthly 82 (1975) 985–992.
6. I. Peterson, Newton’s Clock: Chaos in the Solar System. W. H. Freeman, New York, 1995.
7. R. C. Robinson, An Introduction to Dynamical Systems. Prentice Hall, Englewood Cliffs, NJ, 2004.
8. D. Ruelle, Chance and Chaos. Princeton University Press, Princeton, NJ, 1993.
9. I. Stewart, Does God Play Dice? The New Mathematics of Chaos. Second edition, Blackwell, Malden, MA, 2002.
10. S. H. Strogatz, Nonlinear Dynamics and Chaos With Applications to Physics, Biology, Chemistry, and Engineering. Addison-Wesley, Reading, MA, 1994.

Ohio Wesleyan University, Delaware, OH [email protected]


Back Matter
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April)
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.bm

Accessed: 30/03/2014 17:31


Illustrated Special Relativity through Its Paradoxes: A Fusion of Linear Algebra, Graphics, and Reality

By John dePillis and José Wudka
Spectrum Series

The text illustrates and resolves several apparent paradoxes of Special Relativity, including the twin paradox and train-and-tunnel paradox. Assuming a minimum of technical prerequisites, the authors introduce inertial frames and use them to explain a variety of phenomena: the nature of simultaneity, the proper way to add velocities, and why faster-than-light travel is impossible. Most of these explanations are contained in the resolution of apparent paradoxes, including some lesser-known ones: the pea-shooter paradox, the bug-and-rivet paradox, and the accommodating universe paradox. The explanation of time and length contraction is especially clear and illuminating.

The roots of Einstein’s work in Maxwell’s lead the authors to devote several chapters to an exposition of Maxwell’s equations. The authors establish that those equations predict a frame-independent speed for the propagation of electromagnetic radiation, a speed that equals that of light. Several chapters are devoted to experiments of Roemer, Fizeau, and de Sitter to measure the speed of light and the Michelson-Morley experiment abolishing the aether.

Throughout, the exposition is thorough, but not overly technical, and often illustrated by cartoons. The volume might be suitable for a one-semester general-education introduction to Special Relativity. It is especially well-suited to self-study by interested laypersons or use as a supplement to a more traditional text.

eISBN 978-1-61444-517-3
2013, 478 pp.
Catalog Code: ISR
PDF Price: $33.00

MATHEMATICAL ASSOCIATION OF AMERICA

New in the MAA eBooks Store


John dePillis & José Wudka
Illustrations and animations by John dePillis

To order, visit www.maa.org/ebooks/ISR.


MATHEMATICAL ASSOCIATION OF AMERICA
1529 Eighteenth St., NW • Washington, DC 20036

Recently Released from the MAA
Distilling Ideas: An Introduction to Mathematical Thinking
By Brian P. Katz and Michael Starbird
MAA Textbooks

Mathematics is not a spectator sport: successful students of mathematics grapple with ideas for themselves. Distilling Ideas presents a carefully designed sequence of exercises and theorem statements that challenge students to create proofs and concepts. As students meet these challenges, they discover strategies of proofs and strategies of thinking beyond mathematics. In other words, Distilling Ideas helps its users to develop the skills, attitudes, and habits of mind of a mathematician and to enjoy the process of distilling and exploring ideas.

Catalog Code: DIMT
ISBN: 978-1-93951-203-1
171 pp., Paperbound, 2013
List Price: $54.00
MAA Member: $45.00

To order, visit maa-store.hostedbywebstore.com or call 800-331-1622.


Front Matter
Source: The American Mathematical Monthly, Vol. 121, No. 4 (April), pp. 281-282
Published by: Mathematical Association of America
Stable URL: http://www.jstor.org/stable/10.4169/amer.math.monthly.121.04.fm

Accessed: 30/03/2014 17:27


THE AMERICAN MATHEMATICAL MONTHLY
VOLUME 121, NO. 4 APRIL 2014

283 Periodicity Domains and the Transit of Venus
Andrew J. Simoson

299 A Drug-Induced Random Walk
Daniel J. Velleman

318 Analytical Solution for the Generalized Fermat–Torricelli Problem
Alexei Yu. Uteshev

332 On the Proof of the Existence of Undominated Strategies in Normal Form Games
Martin Kovar and Alena Chernikava

338 An Asymptotic Formula for $(1 + 1/x)^x$ Based on the Partition Function
Chao-Ping Chen and Junesang Choi

NOTES

344 Stirling’s Approximation for Central Extended Binomial Coefficients
Steffen Eger

350 A New Proof of Stirling’s Formula
Thorsten Neuschel

353 Zeta(2) Once Again
Ralph M. Krause

355 Polynomials $(x^3 - n)(x^2 + 3)$ Solvable Modulo Any Integer
Andrea M. Hyde, Paul D. Lee, and Blair K. Spearman

359 Macaulay Expansion
B. Sury

361 Evaluating Lebesgue Integrals Efficiently with the FTC
J. J. Koliha

365 PROBLEMS AND SOLUTIONS

REVIEWS

373 Encounters with Chaos and Fractals
By Denny Gulick
Jeffrey Nunemacher

MATHBITS

331, A One-Sentence Line-of-Sight Proof of the Extreme Value Theorem

An Official Publication of the Mathematical Association of America


Latest in the MAA Notes Series

Applications of Mathematics in Economics
Warren Page, Editor

Applications of Mathematics in Economics presents an overview of the (qualitative and graphical) methods and perspectives of economists. Its objectives are not intended to teach economics, but rather to give mathematicians a sense of what mathematics is used at the undergraduate level in various parts of economics, and to provide students with the opportunities to apply their mathematics in relevant economics contexts.

The volume’s applications span a broad range of mathematical topics and levels of sophistication. Each article consists of self-contained, stand-alone, expository sections whose problems illustrate what mathematics is used, and how, in that subdiscipline of economics. The problems are intended to be richer and more informative about economics than the economics exercises in most mathematics texts. Since each section is self-contained, instructors can readily use the economics background and worked-out solutions to tailor (simplify or embellish) a section’s problems to their students’ needs. Overall, the volume’s 47 sections contain more than 100 multipart problems. Thus, instructors have ample material to select for classroom uses, homework assignments, and enrichment activities.

eISBN: 9781614443179
Print ISBN: 9780883851920
ebook: $24.00
Print on demand (paperbound): $40.00

MATHEMATICAL ASSOCIATION OF AMERICA

To order go to www.maa.org/ebooks/NTE82


THE AMERICAN MATHEMATICAL MONTHLY
Volume 121, No. 4 April 2014

EDITOR
Scott T. Chapman, Sam Houston State University

NOTES EDITOR
Sergei Tabachnikov, Pennsylvania State University

BOOK REVIEW EDITOR
Jeffrey Nunemacher, Ohio Wesleyan University

PROBLEM SECTION EDITORS
Douglas B. West, University of Illinois
Gerald Edgar, Ohio State University
Doug Hensley, Texas A&M University

ASSOCIATE EDITORS

William Adkins, Louisiana State University
David Aldous, University of California, Berkeley
Elizabeth Allman, University of Alaska, Fairbanks
Jonathan M. Borwein, University of Newcastle
Jason Boynton, North Dakota State University
Edward B. Burger, Southwestern University
Minerva Cordero-Epperson, University of Texas, Arlington
Allan Donsig, University of Nebraska, Lincoln
Michael Dorff, Brigham Young University
Daniela Ferrero, Texas State University
Luis David Garcia-Puente, Sam Houston State University
Sidney Graham, Central Michigan University
Tara Holm, Cornell University
Roger A. Horn, University of Utah
Lea Jenkins, Clemson University
Daniel Krashen, University of Georgia
Ulrich Krause, Universität Bremen
Jeffrey Lawson, Western Carolina University
C. Dwight Lahr, Dartmouth College
Susan Loepp, Williams College
Irina Mitrea, Temple University
Bruce P. Palka, National Science Foundation
Vadim Ponomarenko, San Diego State University
Catherine A. Roberts, College of the Holy Cross
Rachel Roberts, Washington University, St. Louis
Ivelisse M. Rubio, Universidad de Puerto Rico, Rio Piedras
Adriana Salerno, Bates College
Edward Scheinerman, Johns Hopkins University
Anne Shepler, University of North Texas
Susan G. Staples, Texas Christian University
Dennis Stowe, Idaho State University
Daniel Ullman, George Washington University
Daniel Velleman, Amherst College

EDITORIAL ASSISTANT
Bonnie K. Ponce


NOTICE TO AUTHORS

The MONTHLY publishes articles, as well as notes and other features, about mathematics and the profession. Its readers span a broad spectrum of mathematical interests, and include professional mathematicians as well as students of mathematics at all collegiate levels. Authors are invited to submit articles and notes that bring interesting mathematical ideas to a wide audience of MONTHLY readers.

The MONTHLY’s readers expect a high standard of exposition; they expect articles to inform, stimulate, challenge, enlighten, and even entertain. MONTHLY articles are meant to be read, enjoyed, and discussed, rather than just archived. Articles may be expositions of old or new results, historical or biographical essays, speculations or definitive treatments, broad developments, or explorations of a single application. Novelty and generality are far less important than clarity of exposition and broad appeal. Appropriate figures, diagrams, and photographs are encouraged.

Notes are short, sharply focused, and possibly informal. They are often gems that provide a new proof of an old theorem, a novel presentation of a familiar theme, or a lively discussion of a single issue.

Submission of articles, notes, and filler pieces is required via the MONTHLY’s Editorial Manager System. Initial submissions in pdf or LaTeX form can be sent to the Editor Scott Chapman at

http://www.editorialmanager.com/monthly

The Editorial Manager System will cue the author for all required information concerning the paper. Questions concerning submission of papers can be addressed to the Editor at [email protected]. Authors who use LaTeX can find our article/note template at http://www.shsu.edu/~bks006/Monthly.html. This template requires the style file maa-monthly.sty, which can also be downloaded from the same webpage. A formatting document for MONTHLY references can be found at http://www.shsu.edu/~bks006/FormattingReferences.pdf. Follow the link to Electronic Publications Information for authors at http://www.maa.org/pubs/monthly.html for information about figures and files, as well as general editorial guidelines.

Letters to the Editor on any topic are invited. Comments, criticisms, and suggestions for making the MONTHLY more lively, entertaining, and informative can be forwarded to the Editor at [email protected].

The online MONTHLY archive at www.jstor.org is a valuable resource for both authors and readers; it may be searched online in a variety of ways for any specified keyword(s). MAA members whose institutions do not provide JSTOR access may obtain individual access for a modest annual fee; call 800-331-1622.

See the MONTHLY section of MAA Online for current information such as contents of issues and descriptive summaries of forthcoming articles:

http://www.maa.org/

Proposed problems or solutions should be sent to:

DOUG HENSLEY, MONTHLY Problems
Department of Mathematics
Texas A&M University
3368 TAMU
College Station, TX 77843-3368.

In lieu of duplicate hardcopy, authors may submit pdfs to [email protected].

Advertising correspondence should be sent to:

MAA Advertising
1529 Eighteenth St. NW
Washington DC 20036.

Phone: (877) 622-2373, E-mail: [email protected].

Further advertising information can be found online at www.maa.org.

Change of address, missing issue inquiries, and other subscription correspondence can be sent to:

MAA Service Center, [email protected].

All of these are at the address:

The Mathematical Association of America
1529 Eighteenth Street, N.W.
Washington, DC 20036.

Recent copies of the MONTHLY are available for purchase through the MAA Service Center:

[email protected], 1-800-331-1622.

Microfilm Editions are available at: University Microfilms International, Serial Bid coordinator, 300 North Zeeb Road, Ann Arbor, MI 48106.

The AMERICAN MATHEMATICAL MONTHLY (ISSN 0002-9890) is published monthly except bimonthly June-July and August-September by the Mathematical Association of America at 1529 Eighteenth Street, N.W., Washington, DC 20036 and Lancaster, PA, and copyrighted by the Mathematical Association of America (Incorporated), 2014, including rights to this journal issue as a whole and, except where otherwise noted, rights to each individual contribution. Permission to make copies of individual articles, in paper or electronic form, including posting on personal and class web pages, for educational and scientific use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear the following copyright notice: [Copyright the Mathematical Association of America 2014. All rights reserved.] Abstracting, with credit, is permitted. To copy otherwise, or to republish, requires specific permission of the MAA’s Director of Publications and possibly a fee. Periodicals postage paid at Washington, DC, and additional mailing offices. Postmaster: Send address changes to the American Mathematical Monthly, Membership/Subscription Department, MAA, 1529 Eighteenth Street, N.W., Washington, DC, 20036-1385.
