
Number theoretic aspects of complex dynamics

Jun Le Goh, Aditya Vaidyanathan

May 16, 2014

Contents

1 Introduction
  1.1 Overview
  1.2 The linearization problem
  1.3 Number theoretic considerations
2 Hyperbolic Case
  2.1 Koenigs' function
3 Parabolic Case
  3.1 Obstructions to Linearizability
  3.2 Leau-Fatou Flowers
4 Elliptic Case
  4.1 Siegel and Cremer theorems
  4.2 Generic vs. Lebesgue Almost Every
  4.3 Qualitative Siegel-Brjuno-Yoccoz
  4.4 Quantitative Siegel-Brjuno-Yoccoz
  4.5 Yoccoz renormalization and Yoccoz's inequality
5 Appendix

1 Introduction

1.1 Overview

This paper is an extension of a presentation given as part of a graduate complex analysis course, taught by John Hubbard at Cornell University in the spring 2014 semester. The aim of the paper is to survey a few key results in complex dynamics that have ties to number theory. We start by covering the theorems of Koenigs, Siegel and Cremer and then move on to the more substantial Siegel-Brjuno-Yoccoz theorem. While these results have been treated in various texts (such as [Mil06], [Mar06], [CG93]), we hope this paper provides a self-contained exposition of the subject and serves as an introduction to a mathematically, as well as aesthetically, beautiful field.

1.2 The linearization problem

The iteration and properties of complex quadratic polynomials

\[ f_\lambda(z) = z^2 + \lambda z \tag{1} \]

for $\lambda \in \mathbb{C}$ have been extensively studied since the pioneering work of Fatou and Julia in the 1920s. Of importance in this study is the set of all points in the complex plane that have bounded forward orbits, that is

\[ K(f) := \{ z \in \mathbb{C} : \text{the sequence } z, f(z), f^{\circ 2}(z), \ldots \text{ is bounded} \} \]

where $f^{\circ k}$ is $f$ iterated $k$ times. This set is called the filled Julia set and its boundary is called the Julia set, $J(f) = \partial K(f)$. An example of a Julia set is shown in figure 1, and at a glance the picture suggests that the dynamics is quite complicated. While there may be points or regions where complicated dynamics occurs, one can hope that the dynamics around the origin is somewhat simpler (in a sense which we make precise throughout the rest of the paper). This hope comes from the observation that 0 is a fixed point of $f_\lambda : \mathbb{C} \to \mathbb{C}$ and that, near 0, the quadratic term tends to 0 faster than the linear term (in absolute value).

Figure 1: Julia set for $f_\lambda$ with $\lambda = e^{2\pi i \phi}$, where $\phi$ is the golden ratio (this polynomial is known as the golden mean quadratic polynomial).
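The definition of the filled Julia set translates directly into a numerical membership test. The following is a minimal sketch, not taken from the paper: the function name `stays_bounded`, the iteration cap `max_iter` and the default escape radius are illustrative choices. The escape radius $1 + |\lambda|$ is justified by the estimate $|f_\lambda(z)| = |z|\,|z + \lambda| > |z|$ whenever $|z| > 1 + |\lambda|$. A finite number of iterations can only certify escape, so a `True` result is evidence of boundedness rather than proof.

```python
import cmath
import math


def stays_bounded(z, lam, max_iter=200, escape_radius=None):
    """Heuristic test for membership of z in the filled Julia set of
    f(z) = z**2 + lam*z: True if the orbit has not escaped after max_iter steps."""
    if escape_radius is None:
        # If |z| > 1 + |lam| then |z + lam| > 1, so |f(z)| > |z| and the orbit escapes.
        escape_radius = 1.0 + abs(lam)
    for _ in range(max_iter):
        if abs(z) > escape_radius:
            return False
        z = z * z + lam * z
    return True


# Multiplier of figure 1: lam = exp(2*pi*i*phi) with phi the golden ratio.
golden_ratio = (1 + math.sqrt(5)) / 2
lam = cmath.exp(2j * math.pi * golden_ratio)

print(stays_bounded(0j, lam))      # the fixed point 0 never moves
print(stays_bounded(3 + 0j, lam))  # a point outside the escape radius
```

Sampling `stays_bounded` over a grid of starting points gives a rough picture of $K(f_\lambda)$ of the kind shown in figure 1.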

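The simplest possible non-trivial dynamics comes from rotations. Thus we ask whether there exists a domain $D$, containing the origin, and a biholomorphic function $\phi : D \to \mathbb{D}$ (where $\mathbb{D}$ is the unit disc in the complex plane) that makes the following diagram commute

\[
\begin{array}{ccc}
D & \xrightarrow{\;f_\lambda\;} & D \\
{\scriptstyle\phi}\big\downarrow & & \big\downarrow{\scriptstyle\phi} \\
\mathbb{D} & \xrightarrow{\;z \mapsto \lambda z\;} & \mathbb{D}
\end{array}
\]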

We seek a biholomorphic rather than a merely bicontinuous function so that $\phi$ admits a power series expansion: a classical theorem in complex analysis equates holomorphicity with analyticity (see [SS03]). The commutativity of the diagram is equivalent to the following functional equation, called the Schröder functional equation after Ernst Schröder [Sch70], who first studied it in 1870. Note that $\lambda$ in the functional equation stands for the map given by multiplication by $\lambda$.

\[ \phi \circ f_\lambda = \lambda \circ \phi \tag{2} \]

Accordingly we have the following definition.

Definition 1 (Linearization). If $D$ is a domain containing the origin and $\phi : D \to \mathbb{D}$ is a biholomorphic function that satisfies the Schröder functional equation for all $z \in D$, then we say that $\phi$ linearizes $f_\lambda$, or that $\phi$ conjugates $f_\lambda$ to the rotation $z \mapsto \lambda z$.

The next definition classifies the origin according to a dichotomy.

Definition 2 (Siegel Disc/Cremer Point). The maximal domain $D$ for which there exists $\phi$ linearizing $f_\lambda$ is called the Siegel disc of the quadratic. If no such domain exists we say that the origin is a Cremer point.

The origin of these names will become apparent in §4.1, where we present the corresponding theorems of Siegel and Cremer; our focus will be more on Siegel discs than on Cremer points. In figure 1 the white region enclosed by the black outline is the Siegel disc of the quadratic, and the center of this region is the origin in $\mathbb{C}$. The other white regions, which lie in the other so-called Fatou components, are pre-images of the Siegel disc.

In the rest of the paper we concern ourselves with the fundamental question: when is $f_\lambda$ linearizable? Remarkably, the answer depends only on certain arithmetic properties of $\lambda$. An indication of why this is the case is obtained by trying the obvious thing: since we search for an analytic function $\phi$ satisfying (2), we write down a power series expansion for $\phi$ and solve for the coefficients. Hence let

\[ \phi(z) = z + \sum_{i=2}^{\infty} b_i z^i \]

Substituting the above into (2) we get

\[ \lambda z + z^2 + b_2 (\lambda z + z^2)^2 + b_3 (\lambda z + z^2)^3 + \cdots = \lambda z + b_2 \lambda z^2 + b_3 \lambda z^3 + \cdots \]

Comparing term by term we get

\[ b_2 = \frac{1}{\lambda - \lambda^2}, \qquad b_3 = \frac{2 b_2 \lambda}{\lambda - \lambda^3}, \qquad \cdots \]

It is not immediately obvious, but nonetheless true, that every coefficient $b_i$ has $\lambda - \lambda^i$ in its denominator (the denominator may also contain factors $\lambda - \lambda^j$ for $j \neq i$). Hence as long as $\lambda$ is not a (complex) root of unity, we can formally write down a power series expansion for the linearizing coordinate $\phi$, but we are not guaranteed that this power series has non-zero radius of convergence (in standard jargon, $f_\lambda$ is formally conjugate to a rotation). The problem of dealing with these terms in the denominator has a long history and is known as the small divisors problem; it is this problem that first establishes the link to the arithmetic properties of $\lambda$.
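The term-by-term comparison above can be automated. Since $(\lambda z + z^2)^i = z^i (\lambda + z)^i$, the coefficient of $z^k$ in $(\lambda z + z^2)^i$ is $\binom{i}{k-i} \lambda^{2i-k}$, and matching coefficients of $z^k$ in (2) gives the recursion $b_k (\lambda - \lambda^k) = \sum_{i=\lceil k/2 \rceil}^{k-1} b_i \binom{i}{k-i} \lambda^{2i-k}$ with $b_1 = 1$. The sketch below implements this recursion; the function name `schroder_coefficients` is an illustrative choice, and the recursion is our own rearrangement of the computation in the text rather than a formula quoted from the paper.

```python
import cmath
from math import comb


def schroder_coefficients(lam, order):
    """Return [b_1, ..., b_order] for the formal solution phi(z) = z + sum b_i z^i
    of phi(f(z)) = lam * phi(z), where f(z) = lam*z + z**2 and b_1 = 1."""
    b = {1: 1 + 0j}
    for k in range(2, order + 1):
        # Contributions to the z^k coefficient from b_i (lam*z + z^2)^i with i < k.
        rhs = sum(b[i] * comb(i, k - i) * lam ** (2 * i - k)
                  for i in range((k + 1) // 2, k))
        # The small divisor lam - lam^k appears here at every order.
        b[k] = rhs / (lam - lam ** k)
    return [b[k] for k in range(1, order + 1)]


lam = cmath.exp(2j * cmath.pi * 0.3)   # an arbitrary sample multiplier on the unit circle
coeffs = schroder_coefficients(lam, 6)
for k, value in enumerate(coeffs, start=1):
    print(k, abs(value))
```

The division by `lam - lam ** k` makes the small divisors visible: the closer some power of $\lambda$ comes to 1, the larger the corresponding coefficient.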

Notice that the above computation shows that any function $\psi$ satisfying (2) must have $\psi'(0) = 1$. Now we turn to the uniqueness of a linearization. Let $\phi$ and $\psi$ be two functions that solve (2) for the same quadratic polynomial $f_\lambda$. Then $\phi \circ \psi^{-1}$ is well-defined and maps the unit disc to itself while fixing the origin. By the Schwarz Lemma ([Mil06, Lemma 1.2], [SS03, Ch.8 Lemma 2.1]) the map $\phi \circ \psi^{-1}$ must be of the form $z \mapsto e^{2\pi i \theta} z$ for some $\theta \in \mathbb{R}/\mathbb{Z}$. But since $\phi$ and $\psi$ are both linearizations it follows that $\phi'(0) = (\psi^{-1})'(0) = 1$, which via the chain rule means that $(\phi \circ \psi^{-1})'(0) = 1$. Hence $\phi \circ \psi^{-1} = \mathrm{id}$, so that $\phi = \psi$.

Finally, while it may seem that our emphasis on quadratic polynomials is restrictive, many of the proofs that are presented carry over to functions of the form

\[ f_\lambda(z) = \lambda z + z^2 + O(z^3) \]

but are simply messier. Here $O(z^3)$ denotes terms of degree 3 or higher. See [Bra10] for a more general treatment using the language of germs.

1.3 Number theoretic considerations

In light of the claim that the linearizability of $f_\lambda$ is wholly dependent on the number theoretic properties of $\lambda$, we provide a brief review of a few key concepts now rather than interrupting the discussion later, beginning with continued fractions.

Definition 3 (Continued Fraction). A continued fraction is a fraction of the form

\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} \]

specified by the sequence $a_0, a_1, a_2, \ldots$, which may or may not be an infinite sequence (in the sense of having infinitely many non-zero terms). The terms $a_i$ are called elements of the continued fraction, and for our purposes we take $a_i \in \mathbb{N}$ for all $i$.

A compact way of specifying a continued fraction is with the notation $[a_0; a_1, a_2, \ldots]$, which is standard in the literature. It is a theorem that every $\alpha \in \mathbb{R}$ has a unique continued fraction representation and that the fraction is finite if $\alpha \in \mathbb{Q}$ and infinite if $\alpha \in \mathbb{Q}^c$ (for a proof, and an extensive treatment of continued fractions, see [Khi64]). Thus $a_0$ represents the integer part of $\alpha$ and $[a_1, a_2, \ldots]$ represents the fractional part.

Definition 4 (Rational Convergents). The $n$th rational convergent of the continued fraction is the fraction

\[ \frac{p_n}{q_n} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1}}}} = [a_0; a_1, \ldots, a_{n-1}] \]

The theorem tells us that the sequence of rational convergents of $\alpha$ terminates if and only if $\alpha$ is rational, while the sequence is non-terminating for irrational $\alpha$. Finally we introduce some notation that is particular to complex dynamics. For $\theta \in \mathbb{R}/\mathbb{Z}$, let $\theta_n$ denote the fraction

\[ \theta_n = \cfrac{1}{a_{n+1} + \cfrac{1}{a_{n+2} + \cdots}} \tag{3} \]

This notation will reappear in §4.3 - §4.5, particularly when we discuss Yoccoz renormalization.
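For a rational number the elements and convergents can be computed exactly. The sketch below uses Python's `fractions.Fraction`; the function names `continued_fraction` and `convergent` are illustrative, and `convergent(terms, n)` follows the indexing of Definition 4, where the $n$th convergent is $[a_0; a_1, \ldots, a_{n-1}]$. An irrational $\alpha$ would have to be replaced by a sufficiently accurate rational approximation, which this sketch does not attempt.

```python
from fractions import Fraction


def continued_fraction(x, max_terms=50):
    """Elements [a_0, a_1, ...] of the continued fraction of a Fraction x."""
    terms = []
    for _ in range(max_terms):
        a = x.numerator // x.denominator   # integer part (floor)
        terms.append(a)
        remainder = x - a
        if remainder == 0:                 # the expansion of a rational terminates
            break
        x = 1 / remainder
    return terms


def convergent(terms, n):
    """n-th convergent p_n/q_n = [a_0; a_1, ..., a_{n-1}] as an exact Fraction."""
    value = Fraction(terms[n - 1])
    for a in reversed(terms[:n - 1]):
        value = a + 1 / value
    return value


terms = continued_fraction(Fraction(415, 93))
print(terms)
print([convergent(terms, n) for n in range(1, len(terms) + 1)])
```

The last convergent of a finite expansion reproduces the original fraction, which is a convenient sanity check.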

It will be of importance later to classify irrational numbers according to their arithmetic properties. The most natural way to do this is using the Diophantine classification.

Definition 5 (Diophantine Numbers). A number $\theta \in \mathbb{R}/\mathbb{Z}$ is Diophantine of order $\kappa$ if there is a constant $C > 0$ such that for all $p/q \in \mathbb{Q}$

\[ \left| \theta - \frac{p}{q} \right| > \frac{C}{q^\kappa} \]

The set of all such numbers is denoted by $D(\kappa)$.
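The definition can be probed numerically by computing, for each denominator $q$ up to some bound, the quantity $q^\kappa |\theta - p/q|$ with $p$ the integer nearest to $q\theta$, and taking the minimum. The sketch below does this; the function name `diophantine_constant` and the cutoff `max_q` are illustrative choices. A finite search can only suggest, never prove, that a positive constant $C$ exists, and floating point arithmetic limits how large `max_q` can usefully be.

```python
import math


def diophantine_constant(theta, kappa, max_q):
    """Minimum of q**kappa * |theta - p/q| over 1 <= q <= max_q, where p is the
    integer nearest to q*theta (the best numerator for each denominator q)."""
    best = math.inf
    for q in range(1, max_q + 1):
        p = round(q * theta)
        best = min(best, q ** kappa * abs(theta - p / q))
    return best


golden_mean = (math.sqrt(5) - 1) / 2
print(diophantine_constant(golden_mean, 2, 1000))
```

For the golden mean the minimum with $\kappa = 2$ stays bounded away from 0 over the range searched, which is consistent with the golden mean being Diophantine of order 2.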

One property of $D(\kappa)$ is that $D(\kappa) = \emptyset$ for $\kappa < 2$. While the definition allows for non-integer $\kappa$, this statement has a nice interpretation in the integer case $\kappa = 1$, namely that the rational numbers are dense in $\mathbb{R}$ (and also in $\mathbb{R}/\mathbb{Z}$).

Another property of $D(\kappa)$ is that for $\kappa < \eta$ we have $D(\kappa) \subset D(\eta)$. This is because for all $\theta \in D(\kappa)$, there is a constant $C > 0$ such that for all $p/q \in \mathbb{Q}$,

\[ \left| \theta - \frac{p}{q} \right| > \frac{C}{q^\kappa} \geq \frac{C}{q^\eta} \]

In view of this nesting we define the class $D(2+)$ as the intersection

\[ D(2+) = \bigcap_{\kappa > 2} D(\kappa) \]

and this class has full Lebesgue measure in $\mathbb{R}/\mathbb{Z}$.

Finally we introduce a class of irrational numbers specific to complex dynamics. Following [Mar00], [Mar06] we set

\[ \beta_{n-1} = \theta_0 \theta_1 \cdots \theta_{n-1} \]

where $\theta_i = [a_{i+1}, a_{i+2}, \ldots]$, as in equation (3). Define the Brjuno function $B : \mathbb{R}/\mathbb{Z} \to \mathbb{R}^+ \cup \{\infty\}$ as

\[ B(\theta) := \sum_{n=1}^{\infty} \beta_{n-1} \log \frac{1}{\theta_n} \]

Definition 6 (Brjuno Number). An irrational number $\alpha$ is a Brjuno number if $B(\alpha) < \infty$, that is, if its image under the Brjuno function is finite.

The class of all Brjuno numbers is denoted $\mathcal{B}$. Notice that rational numbers are not Brjuno numbers, since $\theta_i = 0$ for some $i$ large enough.
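Partial sums of the Brjuno series are easy to compute because $\theta_{n+1}$ is obtained from $\theta_n$ by the Gauss map $x \mapsto 1/x - \lfloor 1/x \rfloor$, which follows from equation (3). The sketch below is a heuristic floating point computation; the function name `brjuno_partial_sum` and the default number of terms are illustrative. Floating point errors grow under the Gauss map, so only a modest number of terms is meaningful, and a finite partial sum says nothing rigorous about convergence.

```python
import math


def brjuno_partial_sum(theta, terms=30):
    """Partial sum of B(theta) = sum_{n>=1} beta_{n-1} * log(1/theta_n), where
    theta_0 is the fractional part of theta and theta_{n+1} is the Gauss map of theta_n."""
    x = theta - math.floor(theta)          # theta_0
    if x == 0.0:
        return math.inf
    beta = 1.0
    total = 0.0
    for _ in range(terms):
        beta *= x                          # beta_{n-1} = theta_0 * ... * theta_{n-1}
        inverse = 1.0 / x
        x = inverse - math.floor(inverse)  # theta_n
        if x == 0.0:
            # The expansion terminated, as it does for rational theta.
            return math.inf
        total += beta * math.log(1.0 / x)
    return total


golden_mean = (math.sqrt(5) - 1) / 2
print(brjuno_partial_sum(golden_mean))
print(brjuno_partial_sum(0.5))
```

For the golden mean every $\theta_n$ equals the golden mean itself, so the series is geometric and the partial sums can be compared against the closed form $\log(1/g) \cdot g/(1-g)$ with $g$ the golden mean.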

2 Hyperbolic Case

As already mentioned, the claim is that the linearizability of $f_\lambda$ depends only on the properties of $\lambda$. In the field of dynamics, $\lambda$ has its own name.

Definition 7 (Multiplier). Let $g : \mathbb{C} \to \mathbb{C}$ be a holomorphic map and let $w$ be a fixed point of $g$, $g(w) = w$. The multiplier of $w$ is the value $g'(w)$.

Hence $\lambda$ is the multiplier of the origin for $f_\lambda$. We use the following classification for $\lambda$ (or equivalently any multiplier).

Definition 8 (Hyperbolic/Parabolic/Elliptic). For a quadratic polynomial of the form (1), we say that $\lambda$ is

1. hyperbolic if $|\lambda| \neq 0, 1$

2. parabolic if $|\lambda| = 1$ and $\lambda^q = 1$ for some $q \in \mathbb{N}$

3. elliptic if $|\lambda| = 1$ and $\lambda^q \neq 1$ for all $q \in \mathbb{N}$

Accordingly we say that $f_\lambda$ is hyperbolic, parabolic or elliptic.

Another classification could be according to whether the multiplier is attracting ($|\lambda| < 1$), repelling ($|\lambda| > 1$) or indifferent ($|\lambda| = 1$). We group the first two together for reasons that will become clear shortly, and we distinguish whether $\lambda = e^{2\pi i p/q}$ or not, which is a fundamental dichotomy (note that $\lambda = e^{2\pi i p/q}$ means we are in the parabolic case).

In the classical jargon the fixed point 0 is called superattracting for $f_\lambda$ if $\lambda = 0$, as this is the most attracting a fixed point can be. We require $|\lambda| \neq 0$ in the definition of hyperbolicity because if $\lambda = 0$ then $f_\lambda$ is not linearizable, for the following reason. Consider all points in the unit disc $\mathbb{D}$. Under iteration of $f_\lambda$ these points tend to 0 in absolute value, since $f_\lambda$ is the map $z \mapsto z^2$. This scenario cannot happen for a rotation and hence the local dynamics cannot be described as such (refer to §3.1 for more obstructions to linearizability). Hence we exclude the superattracting case for compatibility with the next section.

2.1 Koenigs' function

The question of linearizability is considerably easier in the hyperbolic case than in the other two. In fact, we can build the linearizing function from $f_\lambda$ itself using a construction due to Koenigs ([Koe84]). First we consider $\lambda$ hyperbolic with $|\lambda| < 1$, and set

\[ \varphi(z) = \lim_{n \to \infty} \frac{f_\lambda^{\circ n}(z)}{\lambda^n} \tag{4} \]

The function $\varphi$ above is known alternately as the Koenigs function or the Koenigs intertwining map. The latter name is due to the fact that each term of the limiting sequence represents the operation of iterating $f_\lambda$ forward $n$ times and then iterating the map $z \mapsto \lambda z$ backwards $n$ times, i.e. dividing by $\lambda^n$. This gives a nice geometric interpretation of the intertwining map.

Assuming $\varphi$ is well-defined, let us check that the intertwining map satisfies the desired functional equation. By substitution we have

\[ \varphi(f_\lambda(z)) = \lim_{n \to \infty} \frac{f_\lambda^{\circ (n+1)}(z)}{\lambda^n} = \lambda \lim_{n \to \infty} \frac{f_\lambda^{\circ (n+1)}(z)}{\lambda^{n+1}} = \lambda \varphi(z) \]

It remains to show that the limit exists, and hence that $\varphi$ is well-defined. We do this in the following theorem.

Theorem 1 (Koenigs (1884)). Let $f_\lambda$ be a quadratic polynomial as in (1). If $0 < |\lambda| < 1$ then there exists a region $D_R$ on which the limit in (4) exists, and hence $f_\lambda$ is locally (in $D_R$) conjugate to a rotation by $\lambda$.

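Proof. First let us give a name to the $k$th element of the sequence in the limit:

\[ \varphi_k(z) = \frac{f_\lambda^{\circ k}(z)}{\lambda^k} \]

Now choose $\lambda'$ with $\lambda'^2 < |\lambda| < \lambda' < 1$, and choose $R > 0$ small enough so that for all $z \in \mathbb{C}$ with $|z| < R$ we have

\[ |f_\lambda(z)| < \lambda' |z| \]

and call the set of such $z$ $D_R$. Consider the difference $|\varphi_k(z) - \varphi_{k-1}(z)|$, which is given by

\[ \left| \frac{f_\lambda^{\circ k}(z)}{\lambda^k} - \frac{f_\lambda^{\circ (k-1)}(z)}{\lambda^{k-1}} \right| = \left| \frac{f_\lambda(f_\lambda^{\circ (k-1)}(z))}{\lambda^k} - \frac{f_\lambda^{\circ (k-1)}(z)}{\lambda^{k-1}} \right| \]

But since $f_\lambda(z) = \lambda z + z^2$ we have

\[ = \left| \frac{\lambda f_\lambda^{\circ (k-1)}(z) + (f_\lambda^{\circ (k-1)}(z))^2 - \lambda f_\lambda^{\circ (k-1)}(z)}{\lambda^k} \right| < \frac{R^2 \lambda'^{2k}}{\lambda'^2 |\lambda|^k} \]

where the inequality comes from the bound on $|f_\lambda(z)|$, which gives $|f_\lambda^{\circ (k-1)}(z)| < \lambda'^{k-1} R$. Since $\lambda'^2 < |\lambda|$, the right-hand side is a constant multiple of the $k$th term of a convergent geometric series. Now we write $\varphi_k$ as the telescoping sum

\[ \varphi_k(z) = z + \sum_{i=0}^{k-1} (\varphi_{i+1}(z) - \varphi_i(z)) \]

where $\varphi_0(z) = z$. The modulus of each difference is bounded by the above inequality, so that by the Weierstrass M-test, $\varphi_k$ converges uniformly on $D_R$ as $k \to \infty$. Furthermore, since $\varphi$ is the uniform limit of a sequence of holomorphic functions on a domain with compact closure, it follows that $\varphi$ is holomorphic (see [SS03, Ch.2 Theorem 5.3]).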
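The convergence established in the proof is easy to observe numerically. The sketch below evaluates the approximants $\varphi_n(z) = f_\lambda^{\circ n}(z)/\lambda^n$ and checks the functional equation $\varphi(f_\lambda(z)) = \lambda \varphi(z)$ up to rounding; the function name `koenigs_approx` and the sample values of $\lambda$ and $z$ are illustrative choices, with $z$ taken small enough that $|\lambda| + |z| < 1$.

```python
def koenigs_approx(z, lam, n):
    """n-th approximant phi_n(z) = f^{on}(z) / lam**n of the Koenigs function
    for f(z) = lam*z + z**2, intended for 0 < |lam| < 1 and z near 0."""
    w = z
    for _ in range(n):
        w = lam * w + w * w
    return w / lam ** n


lam = 0.4 + 0.3j          # |lam| = 0.5, an attracting multiplier
z = 0.1 + 0.05j

for n in (5, 10, 20, 40):
    print(n, koenigs_approx(z, lam, n))

# Residual of the functional equation phi(f(z)) = lam * phi(z).
residual = abs(koenigs_approx(lam * z + z * z, lam, 40)
               - lam * koenigs_approx(z, lam, 40))
print(residual)
```

Taking $n$ much larger is not useful in floating point, since both $f_\lambda^{\circ n}(z)$ and $\lambda^n$ eventually underflow.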

While the proof depended strongly on the property that $|\lambda| < 1$, we actually have the following corollary.

Corollary 1. Let $f_\lambda$ be a quadratic polynomial whose fixed point has multiplier $\lambda$ with $|\lambda| > 1$. Then there exists a domain $D'_R$ in which $f_\lambda$ is conjugate to a rotation by $\lambda$.

Proof. If $f_\lambda$ has multiplier of modulus greater than 1 then the local inverse $f_\lambda^{-1}$ has multiplier of modulus less than 1. This follows from a straightforward application of the chain rule (differentiating $f_\lambda \circ f_\lambda^{-1} = \mathrm{id}$). By theorem 1 there exists a holomorphic map $\sigma$ such that

\[ \sigma \circ f_\lambda^{-1} = \lambda^{-1} \sigma \]

Multiplying by $\lambda$ and pre-composing with $f_\lambda$ ensures that the desired functional equation holds:

\[ \lambda \circ \sigma = \sigma \circ f_\lambda \]

This theorem and its corollary give us our first affirmative results for linearization. However, the proof hinged on the fact that $|\lambda| \neq 1$, and the story is more involved for the parabolic and elliptic cases.

3 Parabolic Case

Now let $|\lambda| = 1$ so that we can write $\lambda = e^{2\pi i \theta}$ for $\theta \in \mathbb{R}/\mathbb{Z}$. In certain instances it will be more convenient to work with the angle $\theta$, but it is understood that to each $\theta$ there corresponds an associated $\lambda$.

3.1 Obstructions to Linearizability

If $|\lambda| = 1$ there are some possible obstructions to linearizability:

1. Every neighborhood of 0 contains points which leave that neighborhood.

2. Every neighborhood of 0 contains points which converge to 0.

3. $\theta$ is irrational yet there are periodic orbits in every neighborhood of 0, known as small cycles.

4. $\theta$ is irrational yet there are periodic points in every neighborhood of 0.

All obstructions are dynamical in nature. The dynamics of $f_\lambda$ cannot be locally conjugate to a rotation if 1 or 2 holds, since points undergoing rotation never leave a neighborhood nor do they converge to 0. Condition 3 is an obstruction because the orbits of a rotation by an irrational angle $\theta$ are dense on circles of constant radius; hence if 3 holds for $f_\lambda$, linearization is not possible. The same is true for 4, though the distinction between 3 and 4 is subtle. In condition 3 the entire orbit is contained within a specified neighborhood, which is what is meant by the phrase periodic orbit. In 4 we only require that a point be periodic with some period $T$, i.e. there is a $z$ such that $f_\lambda^{\circ T}(z) = z$, but the points $f_\lambda^{\circ k}(z)$ for $0 < k < T$ may leave the given neighborhood. In particular condition 4 implies either condition 1 or condition 3.

3.2 Leau-Fatou Flowers

Keeping these obstructions in mind we state the main theorem of this section.

Theorem 2 (Leau (1897), Julia (1918), Fatou (1920)). If $\theta$ is rational, then $f_\lambda$ is not linearizable.

In the proof we shall omit a few details, referencing [Mil06] when needed. Rather, we take the view of covering the main ideas and presenting the geometric intuition.

Proof. Since $f_\lambda$ is parabolic we may iterate until the multiplier equals 1; more specifically, if $\lambda = e^{2\pi i p/q}$ we iterate $q$ times. Call the resulting function $f$. Then

\[ f(z) = z + a z^{n+1} + O(z^{n+2}) = z (1 + a z^n + O(z^{n+1})) \]

where the big-O notation means terms of higher degree.

Definition 9 (Attraction/Repulsion Vectors). A complex number $\mathbf{v}$ will be called a repulsion vector if the product $n a \mathbf{v}^n$ is equal to $+1$ and an attraction vector if $n a \mathbf{v}^n = -1$.

The use of boldface $\mathbf{v}$ is simply to indicate that we are thinking of the corresponding point in the complex plane as a vector from the origin. By elementary complex analysis, there are $n$ equally spaced repulsion vectors and $n$ equally spaced attraction vectors. We shall index these by $\mathbf{v}_0, \ldots, \mathbf{v}_{2n-1}$, where $\mathbf{v}_i$ is an attraction vector if $i$ is even and a repulsion vector if $i$ is odd. Let us cover the punctured plane by $2n$ open sets $\{\Delta_i\}$ of angle $2\pi/n$ such that $\Delta_i$ is centered at the vector $\mathbf{v}_i$. Explicitly,

\[ \Delta_i = \{ z \in \mathbb{C} : z = r e^{i\theta} \mathbf{v}_i,\ r > 0,\ |\theta| < \pi/n \} \]

Hence each pairwise intersection $\Delta_i \cap \Delta_j$ is a sector of angle $\pi/n$ for $j = i \pm 1$ and empty otherwise.

Now we make the coordinate change $w = \phi(z) = c/z^n$ with $c = -1/(na)$. The map $\phi$ takes each $\Delta_i$ to a slit $w$-plane, with $\phi(\Delta_i)$ omitting $\mathbb{R}^-$ for $i$ even and $\mathbb{R}^+$ for $i$ odd. To see this notice that each $\Delta_i$ does not contain its bounding rays, which are of the form $r e^{\pm i\pi/n} \mathbf{v}_i$. Under $\phi$ these map to $1/(r^n \cdot n a \mathbf{v}_i^n)$, and by definition $n a \mathbf{v}_i^n = \pm 1$ depending on whether the vector is repelling or attracting.

Now we can set $F = \phi \circ f \circ \phi^{-1}$ so that we have

\[ F(w) = \phi(f(\phi^{-1}(w))) = c \left( \sqrt[n]{c/w}\, \bigl( 1 + ac/w + o(1/w) \bigr) \right)^{-n} \]

as $|w| \to \infty$. Really we should index $F$ by $i$ since we define it sector-wise, so that taking the $n$-th root is well-defined. Doing this, and expanding the above using $nac = -1$, we get

\[ F_i(w) = w \bigl( 1 + ac/w + o(1/w) \bigr)^{-n} = w + 1 + o(1) \]

Figure 2: Diagram of what happens under forward iteration of $f$ in the $z$-plane (right) and of $F_0$ in the $w$-plane (left). Points of the $z$-plane within the dashed lines eventually land in the attracting petal (solid line) and remain there for the rest of the forward orbit.

There is a region $H_R$ in the $w$-plane which successive iterates of $F_i$ map into itself. That is, once points enter this region, they remain there for the remainder of the forward orbit. This is shown schematically in figure 2, where the boundary of $H_R$ is indicated by the solid black line. Under the inverse map $\phi^{-1}$ (which is well defined after the restriction to sectors) the region $H_R$ maps to a so-called attracting petal, $P_i$. Points in $\Delta_i$ (for $i$ even) eventually land in the attracting petal and stay there for the remainder of the forward orbit. Furthermore, since $F_i$ is approximately a translation, all points $h \in H_R$ tend to infinity under forward iteration, which corresponds to points $p \in P_i$ tending to 0 (they can never equal zero since $0 \notin \Delta_i$ for all $i$).

Since a repulsion vector for $f$ is an attraction vector for $f^{-1}$, the opposite statements hold: points will eventually leave the so-called repelling petals, which are simply attracting petals for $f^{-1}$. The situation is shown in figure 3.

Figure 3: Diagram illustrating the attracting petal for $f^{-1}$ and the corresponding map $F_1^{-1}$ in the $w$-plane.

To complete the proof we note that points in the attracting petals converge to 0, so the dynamics satisfy obstruction 2.
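The petal picture can be seen in the simplest parabolic example $f(z) = z + z^2$, for which $a = 1$ and $n = 1$, so that Definition 9 gives the attraction vector $\mathbf{v} = -1$ and the repulsion vector $\mathbf{v} = +1$. The sketch below iterates $f$ from a starting point on each side of the origin; the function name `iterate_parabolic`, the bailout radius and the starting points are illustrative choices and are not taken from the paper.

```python
def iterate_parabolic(z, steps, bailout=2.0):
    """Iterate f(z) = z + z**2 for at most `steps` steps.
    Returns (last point, number of steps performed before |z| exceeded bailout)."""
    for k in range(steps):
        if abs(z) > bailout:
            return z, k
        z = z + z * z
    return z, steps


# Start along the attraction vector v = -1: the orbit creeps towards 0.
z_attr, n_attr = iterate_parabolic(-0.1 + 0j, 1000)
print(abs(z_attr), n_attr)

# Start along the repulsion vector v = +1: the orbit leaves every small neighborhood.
z_rep, n_rep = iterate_parabolic(0.1 + 0j, 1000)
print(abs(z_rep), n_rep)
```

The convergence along the attracting direction is slow, of order $1/k$ after $k$ steps, which matches the description of $F$ as approximately the translation $w \mapsto w + 1$ in the coordinate $w = -1/z$.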

4 Elliptic Case

4.1 Siegel and Cremer theorems

Having established that linearizability is impossible for rational $\theta$, we turn to the case when $|\lambda| = 1$ with irrational angle. Historically the two theorems that first dealt with this case were provided by Cremer and Siegel, with the latter providing the first affirmative statement for the elliptic case.

Theorem 3 (Cremer (1927), Siegel (1942)). Let $\lambda = e^{2\pi i \theta}$ with $\theta \in \mathbb{R}/\mathbb{Z}$ irrational.

1. If $\limsup_{q \to \infty} 1/|\lambda^q - 1|^{1/(2^q - 1)} = \infty$, then $f_\lambda$ is not linearizable.

2. If $1/|\lambda^q - 1|$ is bounded by some polynomial function of $q$, then $f_\lambda$ is linearizable.

The first condition is satisfied if $|\lambda^q - 1|$ is in a sense "too small" for infinitely many $q$. This shows that it does not suffice for $\theta$ to be irrational: for linearizability it is necessary that $\theta$ be "far from rational". It will take the rest of the paper to make precise the phrases in quotation marks. But first, we give a proof of Cremer's theorem.

Proof. We show that $f_\lambda$ has periodic points in every neighborhood of 0. To do this, let us find the periodic points of period $q$:

\[ 0 = f^{\circ q}(z) - z = z^{2^q} + \cdots + (\lambda^q - 1) z = z \prod_{i=1}^{2^q - 1} (z - z_i) \]

The last equality comes from writing the polynomial as a product of its linear factors, counting multiplicity, and noting that 0 has multiplicity 1. Dividing both sides by $z$ gives us

\[ z^{2^q - 1} + \cdots + (\lambda^q - 1) = \prod_{i=1}^{2^q - 1} (z - z_i) \]

Setting $z = 0$ and taking absolute values then yields

\[ \prod_{i=1}^{2^q - 1} |z_i| = |\lambda^q - 1| \]

Note that the right-hand side is non-zero because $\theta$ is irrational. Taking the geometric mean of the moduli of the zeroes, it follows that there must be a $z_i$ such that

\[ 0 < |z_i| \leq |\lambda^q - 1|^{1/(2^q - 1)} \]

with equality if and only if all the non-zero zeroes have equal modulus. By assumption the term on the right tends to 0 along a subsequence of $q \to \infty$, so there are periodic points in every neighborhood of 0. Hence $f_\lambda$ is not linearizable, as it satisfies obstruction 4.
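The two facts about $f^{\circ q}(z) - z$ used in the proof, namely that it is monic of degree $2^q$ and that its linear coefficient is $\lambda^q - 1$, can be checked by composing polynomials explicitly. The sketch below does this with plain coefficient lists (lowest degree first) and then evaluates the bound $|\lambda^q - 1|^{1/(2^q - 1)}$ on the smallest non-zero solution of $f^{\circ q}(z) = z$. The helper names `poly_mul`, `iterate_quadratic` and `cremer_bound` are illustrative; the code verifies coefficients only and does not compute the periodic points themselves.

```python
import cmath
import math


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def iterate_quadratic(lam, q):
    """Coefficients of the q-fold iterate of f(z) = z**2 + lam*z."""
    p = [0j, 1 + 0j]                                  # the identity polynomial z
    for _ in range(q):
        square = poly_mul(p, p)                       # p(z)**2
        padded = p + [0j] * (len(square) - len(p))    # lam*p(z), padded to equal length
        p = [s + lam * c for s, c in zip(square, padded)]
    return p


def cremer_bound(lam, q):
    """The bound |lam**q - 1| ** (1 / (2**q - 1)) from the proof of Cremer's theorem."""
    return abs(lam ** q - 1) ** (1.0 / (2 ** q - 1))


lam = cmath.exp(2j * math.pi * (math.sqrt(5) - 1) / 2)   # golden mean rotation number
q = 5
iterate = iterate_quadratic(lam, q)
print(len(iterate) - 1)               # degree of the q-fold iterate
print(abs(iterate[1] - lam ** q))     # the linear coefficient of the iterate is lam**q
print(cremer_bound(lam, q))
```

Subtracting $z$ from the iterate changes only the linear coefficient, from $\lambda^q$ to $\lambda^q - 1$, which is the constant term that appears after dividing by $z$.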

The proof of Siegel's theorem is a bit more involved, so we omit it for reasons of length. Siegel's original proof involved power series expansions and inductive estimates of the coefficients, and it is almost never repeated. A shorter treatment is provided by [CG93] using basic Kolmogorov-Arnold-Moser theory and some careful estimates.

4.2 Generic vs. Lebesgue Almost Every

The two theorems in the previous section have important corollaries, beginning with a corollary of Cremer's theorem.

Corollary 2. For a generic choice of $\xi \in \mathbb{R}/\mathbb{Z}$, with $\rho = e^{2\pi i \xi}$, the quadratic polynomial $f_\rho$ is not linearizable.

Here the term generic is used in the Baire category sense, i.e. a set is generic if it is a countable intersection of open, dense sets. This corollary can be proved directly, and the proof (in [Bra10]) constructs the required open, dense sets. These sets are centered at rational numbers and are small enough to ensure that even irrational angles in them produce small cycles. On the other hand we have the following corollary of Siegel's theorem.

Corollary 3. For every $\eta \in \mathbb{R}/\mathbb{Z}$ outside a set of Lebesgue measure zero, the quadratic polynomial with multiplier $e^{2\pi i \eta}$ is linearizable.

Thus we have the bizarre fact that for a generic choice of $\lambda$ linearization is not possible, while for Lebesgue almost every $\lambda$ linearization is possible. We highlight this contrast to emphasize that the definition and usefulness of genericity is highly dependent on context.

4.3 Qualitative Siegel-Brjuno-Yoccoz

In this section we prove corollary 3, following an argument due to Yoccoz. We will call this the qualitative Siegel-Brjuno-Yoccoz theorem (QSBY), as it gives us linearizability for a set of full measure but gives no arithmetic condition for membership in this set. We begin with a definition:

Definition 10. A critical point of a holomorphic map is a point at which the derivative vanishes.

The quadratic polynomial $f_\lambda(z) = z^2 + \lambda z$ has critical points at $c = -\lambda/2$ and at $\infty \in \hat{\mathbb{C}}$. That $\infty$ is a critical point follows from the change of coordinates $z \mapsto 1/z = \zeta$ and evaluation of the derivative at $\zeta = 0$. Critical points are important because they present obstructions to conjugacy by preventing invertibility. We will only concern ourselves with the critical point $c$, as we are focused on linearizability about the origin.

Next we define the conformal radius function, which is non-negative, real-valued and takes as argument a multiplier in the closed unit disc $\overline{\mathbb{D}}$.

Definition 11. Define $\sigma : \overline{\mathbb{D}} \to \mathbb{R}^+$ as follows. For $\lambda \in \overline{\mathbb{D}}$ set $\sigma(\lambda)$ to be the largest number such that there exists a linearizing coordinate $\phi_\lambda$ whose inverse $\psi_\lambda = \phi_\lambda^{-1}$, normalized by $\psi_\lambda'(0) = 1$, is defined on the disc $\mathbb{D}_{\sigma(\lambda)}$ of radius $\sigma(\lambda)$. Set $\sigma(\lambda) = 0$ if linearization is not possible.

Notice that from before we have $\sigma(\lambda) > 0$ for $\lambda$ hyperbolic and that $\sigma(0) = 0$. Working directly with the conformal radius function was Yoccoz's first ingenious idea. The second is to work with upper-semicontinuity as opposed to continuity of the function. Continuity is too strong a condition to hope for, since $\sigma(\lambda) = 0$ for $\lambda$ a parabolic or Cremer number, and both of these are dense on the unit circle. We recall the definition of upper- and lower-semicontinuity.

Definition 12. A function $g$ is said to be upper-semicontinuous at a point $z_0$ if

\[ \limsup_{z \to z_0} g(z) \leq g(z_0) \]

On the other hand we say $g$ is lower-semicontinuous at $z_0$ if

\[ g(z_0) \leq \liminf_{z \to z_0} g(z) \]

We provide the definition of lower-semicontinuity for completeness and so that we may remark that a function is continuous if and only if it is both upper- and lower-semicontinuous. Having introduced the notion of upper-semicontinuity, we naturally wish to show that $\sigma$ is upper-semicontinuous. This is the subject of the next lemma ([Mil06, Lemma 11.15]).

Lemma 1. The conformal radius function $\sigma : \overline{\mathbb{D}} \to \mathbb{R}^+$ is bounded and upper-semicontinuous.

Proof. First we show that $\sigma$ is bounded. To do this note that $|f_\lambda(z)| = |z(z + \lambda)| > |z|$ if $|\lambda| \leq 1$ and $|z| > 2$. Thus all points outside of the closed disc of radius 2 tend to infinity under forward iteration of $f_\lambda$. Since this cannot occur for a rotation it follows that for such points linearization is not possible (obstruction 1). Hence $\sigma(\lambda) \leq 2$.

For the second statement, we note that upper-semicontinuity is equivalent to the condition that for all $\sigma_0$, the set $\{z \in \overline{\mathbb{D}} : \sigma(z) \geq \sigma_0\}$ is closed. Let $\{\lambda_k\}$ be a sequence in $\{z \in \overline{\mathbb{D}} : \sigma(z) \geq \sigma_0\}$ which converges to $\lambda \in \overline{\mathbb{D}}$. This gives a sequence of linearizations, which we denote by $\{\psi_{\lambda_k}\}$. By assumption these functions are defined on discs of radii at least $\sigma_0$, so let us restrict them to $\mathbb{D}_{\sigma_0}$ to get $\{\psi_{\lambda_k}|_{\mathbb{D}_{\sigma_0}} : \mathbb{D}_{\sigma_0} \to \mathbb{D}_2\}$. By Montel's theorem (see the appendix in §5), some subsequence of $\{\psi_{\lambda_k}|_{\mathbb{D}_{\sigma_0}}\}$ converges locally uniformly to some $\psi : \mathbb{D}_{\sigma_0} \to \mathbb{D}_2$. By Hurwitz's theorem (§5), $\psi$ is univalent and one may check that $\psi$ linearizes $f_\lambda$. Hence the set is closed, and it follows that $\sigma$ is upper-semicontinuous.

The next lemma relates the critical point to the conformal radius.

Lemma 2. If $|\lambda| < 1$ then $\sigma(\lambda) = |\eta(\lambda)|$, where $\eta(\lambda)$ is holomorphic in the open unit disc.

Proof. Recall that the Koenigs intertwining map was defined as

\[ \phi_\lambda(z) = \lim_{n \to \infty} \frac{f_\lambda^{\circ n}(z)}{\lambda^n} \]

We proved in §2.1 that the limit exists for $0 < |\lambda| < 1$ and that the convergence is locally uniform. In particular this means that $\phi_\lambda(z)$ is a holomorphic function of $\lambda$. Set

\[ \eta(\lambda) = \phi_\lambda(-\lambda/2) \]

that is, $\eta(\lambda)$ is the value of $\phi_\lambda$ at the critical point. As noted earlier, the critical point is the obstruction to linearizability (think of it as a barrier to the growth of the conformal radius), so we have $|\eta(\lambda)| = \sigma(\lambda)$ and $\eta$ is our desired holomorphic function, currently defined on $\mathbb{D} - \{0\}$. However, since $\sigma$ is bounded on $\mathbb{D} - \{0\}$, so is $\eta$, and Riemann's theorem on removable singularities says that 0 is a removable singularity of $\eta$. In particular, $\eta(0) = 0$ by the upper-semicontinuity of $\sigma$, according to the following short argument:

\[ \limsup_{\lambda \to 0,\ |\lambda| < 1} |\eta(\lambda)| = \limsup_{\lambda \to 0,\ |\lambda| < 1} \sigma(\lambda) \leq \sigma(0) = 0 \]

This shows that $\eta$ is holomorphic on the entire unit disc.

The final ingredients for the proof of the theorem are two classical results concerning radial limits.

Theorem 4 (Fatou). For a bounded, non-constant, holomorphic function $\eta : \mathbb{D} \to \mathbb{C}$ the radial limit $\lim_{r \nearrow 1} \eta(r e^{2\pi i \theta})$ exists for Lebesgue almost every $\theta \in \mathbb{R}/\mathbb{Z}$.

Theorem 5 (F. Riesz and M. Riesz). Let $\eta : \mathbb{D} \to \mathbb{C}$ be a non-constant, bounded holomorphic function. Then for any $c \in \mathbb{C}$ the set $\{\theta \in \mathbb{R}/\mathbb{Z} : \lim_{r \nearrow 1} \eta(r e^{2\pi i \theta}) = c\}$ has Lebesgue measure 0.

In particular the combination of Fatou's theorem and the Riesz-Riesz theorem gives the following corollary.

Corollary 4. For a bounded, non-constant, holomorphic function $\eta : \mathbb{D} \to \mathbb{C}$ and for almost every $\theta \in \mathbb{R}/\mathbb{Z}$, $\lim_{r \nearrow 1} \eta(r e^{2\pi i \theta})$ exists and is non-zero.

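Proof of corollary 3. Using the above theorems and lemmas, the proof goes as follows. For almost every $\theta \in \mathbb{R}/\mathbb{Z}$ the limit $\lim_{r \nearrow 1} \eta(r e^{2\pi i \theta})$ exists and is non-zero. For such $\theta$

\[ \lim_{r \nearrow 1} |\eta(r e^{2\pi i \theta})| = \lim_{r \nearrow 1} \sigma(r e^{2\pi i \theta}) \]

exists and is positive. By upper-semicontinuity of $\sigma$ it follows that

\[ \sigma(e^{2\pi i \theta}) \geq \lim_{r \nearrow 1} \sigma(r e^{2\pi i \theta}) > 0 \]

so that $f_\lambda$, with $\lambda = e^{2\pi i \theta}$, has a Siegel disc about the origin.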

4.4 Quantitative Siegel-Brjuno-Yoccoz

In this section we discuss the sharp arithmetic condition on $\lambda$ for linearizability, which is given by the Siegel-Brjuno-Yoccoz theorem. Siegel's name is included because he was the first to prove a sufficient arithmetic condition in the elliptic case. However, the arithmetic condition comes from Brjuno, who in 1971 [Brj71] showed that the class of linearizable angles includes at least the Brjuno numbers. In 1995 Yoccoz [Yoc95] then showed that if the Brjuno condition is not satisfied then linearizability is not possible. The theorem goes as follows.

Theorem 6 (Siegel-Brjuno-Yoccoz). The Brjuno numbers $\theta \in \mathcal{B}$ are precisely those for which $f_{e^{2\pi i \theta}}$ is linearizable, i.e., $f_\lambda$ is linearizable if and only if $\sum_{n=1}^{\infty} \beta_{n-1} \log \frac{1}{\theta_n}$ converges. Furthermore, if the series diverges, then every neighborhood of 0 contains periodic orbits of $f_\lambda$.

The implications of this and the preceding theorems regarding linearizability can be summarized in the following schematic, inspired by Milnor [Mil06].

Figure 4: Illustration of arithmetic classes, ordered by inclusion and separated according to linearizability. Notice the contrast between Lebesgue a.e. vs. generic. [Schematic with labels: D(2), D(2+) (full measure), Diophantine, Brjuno, Yoccoz, Cremer (generic).]

The picture can be interpreted as follows. The vertical line dividing Brjuno from Yoccoz can be thought of as an infinitely large circle, extending the Diophantine class. To the right of the dividing line stands Yoccoz with his proof that stepping outside of the Brjuno class yields non-linearizability. A subset of the non-linearizable angles is given by Cremer's condition, which was earlier shown to be generic. Thus the picture not only describes a complex dynamical ordering of arithmetic classes, but also contains some historical information.

To prove the converse Yoccoz developed a technique known as Yoccoz renormalization, or alternatively as geometric renormalization. It is beyond the scope of this paper to go through the details of his argument, so instead we give a sketch of the proof and try to convey some of the geometric intuition.

4.5 Yoccoz renormalization and Yoccoz's inequality

We present the ideas behind Yoccoz's proof that $f_\lambda$ is linearizable if $\sum_{n=1}^{\infty} \beta_{n-1} \log \frac{1}{\theta_n}$ converges. Henceforth we fix $\theta$ and write $f$ instead of $f_\lambda$.

We begin by observing that linearizability of $f$ is equivalent to the seemingly weaker notion of stability of $f$.

Definition 13. $f$ is stable for $z$ if for all $n$, $f^{\circ n}(z) \in \mathbb{D}$. $f$ is stable at 0 if it is stable for all $z$ in a neighborhood of 0, i.e., there is an open set $D$ about 0 such that for all $n$, we have $f^{\circ n}(D) \subseteq \mathbb{D}$.

Proposition 1. $f$ is stable at 0 if and only if $f$ is linearizable at 0.

Proof. ($\Leftarrow$) is easy. ($\Rightarrow$) Define $D$ to be the connected component about 0 of the interior of the filled Julia set $K(f)$. By the classification of maps on hyperbolic Riemann surfaces ([Mil06, Theorem 5.2]), $D$ is conformally isomorphic to $\mathbb{D}$, say via $\phi : D \to \mathbb{D}$. By the Schwarz lemma, $\phi$ linearizes $f$.

Our task is then to control the long-term orbits of points near 0. A naive approach is to use the mean value inequality and a bound on f′ near 0. That fails because in a neighborhood of 0, |f′| could be sometimes below 1, sometimes equal to 1, and sometimes above 1. This approach is unable to keep track of the different regions, without which one cannot hope to bound the orbits of all points near 0.

Yoccoz successfully controlled the long-term behavior of orbits of points near 0 by employing the technique of renormalization. Renormalization allows us to look at f over larger and larger time-scales, via the first-return map. We now present an oversimplified version of Yoccoz renormalization. Some preliminary remarks:

• We have changed the definitions of the functions and suppressed various constants in the inequalities. The following proof is therefore incorrect. Nevertheless, we have attempted to emphasize the ideas and heuristics which make the proof work.

• ∼ denotes some notion of approximation which is intentionally left undefined.

We begin by lifting f : 𝔻 → 𝔻 via Z ↦ exp(2πiZ) to get F : H → C. This lets us work with translations instead of rotations. This also means that Z ∈ H is stable for F if the F-orbit of Z never drops out of H.

1. Start with F0 = F : H → C. Let l denote the positive imaginary axis, and consider the strip U bounded between l and F0(l). By gluing l and F0(l) via F0, we make U into a Riemann surface V.

2. V may be uniformized to 𝔻∗ (the punctured unit disc). Compose the gluing with the uniformizing map to obtain a map U → 𝔻∗, lift it via Z ↦ exp(2πiZ), and let L : U → H denote the lift which satisfies L(0) = 0. In fact, L extends continuously to the closure of U. Note that if F0(Z) ∼ Z + θ, then L(Z) ∼ Z/θ.

3. Define W to be the union of U and the strip −1 ≤ Re(Z) ≤ 0. Extend L to W as follows: for Z ∈ W, let k be such that $F_0^{\circ k}(Z) \in U$. Then define $L(Z) = L(F_0^{\circ k}(Z)) - k$. In this way, for all Z ∈ W ∩ F−1(W), we have L(F(Z)) = L(Z) + 1.

4. Let T : H → H be translation by 1. Define F1 : H → C by

$$F_1 = s \circ L \circ T^{-1} \circ L^{-1} \circ s^{-1} - a_1,$$

where s(x + iy) = −x + iy. A few remarks are in order:

• Recall that a1 is the first entry of the continued fraction expansion of θ.

• As written, F1 is defined on s(L(U)), because L is defined on W and T−1(U) ⊆ W.

• s is necessary because W is the union of U with a strip before U. One would hope to define F1 = L ◦ T ◦ L−1 − a1. But T(U) ⊈ W, so in that case F1 would not even be defined on L(U).


5. Repeat this process to get F2 : H→ C, F3 : H→ C, etc.

The renormalization construction leads us to consider the continued fraction expansion of θ = [a1, a2, . . . ] and the tails θn = [an+1, an+2, . . . ]. To be precise,

if Fn(Z) ∼ Z + θn, then Fn+1(Z) ∼ Z + θn+1.

Let us “derive” the above relation. First pretend that F0 is in fact equal to its first-order approximation Z ↦ Z + θ. Then L : U → H is Z ↦ Z/θ, so F1 is

$$-\frac{\theta(-Z) - 1}{\theta} - a_1 = Z + \frac{1}{\theta} - a_1 = Z + \theta_1$$

as expected.

How does Yoccoz renormalization help us control the long-term behavior of the orbits of F? Say we have Z0 ∈ H with Im(Z0) ≥ log(1/θ), and we want to determine if the F0-orbit of Z0 escapes H. Take k such that Z0 − k ∈ W, and define Z1 = s(L(Z0 − k)). In the same way, we define Z2, Z3, and so on. Having defined Zn, we only define Zn+1 if Im(Zn) ≥ log(1/θn).
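The first-order computation of F1 above can be checked numerically. The sketch below is only an illustration under the same pretence as the text: it assumes L is exactly Z ↦ Z/θ, and the function name `first_order_F1` and the sample values of θ and Z are our own choices, not part of Yoccoz's construction.

```python
import math


def s(z: complex) -> complex:
    """Reflection s(x + iy) = -x + iy; note that s is its own inverse."""
    return complex(-z.real, z.imag)


def first_order_F1(z: complex, theta: float) -> complex:
    """Evaluate s o L o T^{-1} o L^{-1} o s^{-1} - a1, ASSUMING L(Z) = Z / theta."""
    a1 = math.floor(1 / theta)   # first continued fraction entry of theta
    w = s(z)                     # s^{-1} = s
    w = theta * w                # L^{-1}
    w = w - 1                    # T^{-1}: translation by -1
    w = w / theta                # L
    return s(w) - a1


theta = math.sqrt(2) - 1                     # theta = [2, 2, 2, ...], so a1 = 2
theta1 = 1 / theta - math.floor(1 / theta)   # tail [a2, a3, ...]
z = complex(0.3, 5.0)

# In this idealized model F1 is the translation by theta1.
print(first_order_F1(z, theta) - z, theta1)
```

For this particular θ the tail θ1 coincides with θ itself, since all entries of the continued fraction are equal.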

The reason for requiring Im(Zn) to be large is that the approximation Fn(Z) ∼ Z + θn is better for Z which is higher in H. In particular, one may show that if Im(Z) ≥ log(1/θn), then

|Fn(Z) − Z − θn| ≤ θn/20 and |Fn′(Z) − 1| ≤ 1/20.

In fact,

Theorem 7. The F-orbit of Z never escapes H if and only if Zn is defined for all n.

This theorem is genuine progress toward the stability problem. We mentioned previously that it is difficult to bound the F-orbit {Z, F(Z), F◦2(Z), . . . } because high iterates of F are difficult to control. This theorem allows us to consider {Z0, Z1, . . . } instead. The Zn’s are easier to control because they do not involve high iterates of F or Fn.

It remains to control (Zn)n. For that, we have the following estimates, from which one may see why the series $\sum_{n=0}^{\infty} \beta_{n-1} \log \frac{1}{\theta_n}$ is relevant to the stability of f at 0.

Proposition 2. If Im(Zn) ≥ log(1/θn), then

$$\theta_n \,\mathrm{Im}(Z_{n+1}) \ge \mathrm{Im}(Z_n) - \log \frac{1}{\theta_n}.$$

By induction, we get

Theorem 8. For all n, if Zn+1 is defined, we have

$$\theta_0 \theta_1 \cdots \theta_n \,\mathrm{Im}(Z_{n+1}) \ge \mathrm{Im}(Z) - \sum_{i=0}^{n} \theta_0 \theta_1 \cdots \theta_{i-1} \log \frac{1}{\theta_i}.$$
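The inductive step can be spelled out as follows. This is a sketch that takes Proposition 2 as stated, writes Z = Z0, reads the empty product (the case i = 0) as 1, and uses that every θi is positive. The base case n = 0 is Proposition 2 itself.

```latex
\begin{align*}
\theta_0 \cdots \theta_n \,\mathrm{Im}(Z_{n+1})
  &\ge \theta_0 \cdots \theta_{n-1}
       \left( \mathrm{Im}(Z_n) - \log\frac{1}{\theta_n} \right)
  && \text{(Proposition 2, multiplied by } \theta_0 \cdots \theta_{n-1}\text{)} \\
  &\ge \mathrm{Im}(Z)
       - \sum_{i=0}^{n-1} \theta_0 \cdots \theta_{i-1} \log\frac{1}{\theta_i}
       - \theta_0 \cdots \theta_{n-1} \log\frac{1}{\theta_n}
  && \text{(induction hypothesis for } n-1\text{)} \\
  &= \mathrm{Im}(Z)
       - \sum_{i=0}^{n} \theta_0 \cdots \theta_{i-1} \log\frac{1}{\theta_i}.
\end{align*}
```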


Finally, we may prove the backward direction of the Siegel-Brjuno-Yoccoz theorem. Suppose $\sum_{i=0}^{\infty} \theta_0 \theta_1 \cdots \theta_{i-1} \log(1/\theta_i)$ converges. Then one could choose Z with Im(Z) sufficiently greater than $\sum_{i=0}^{\infty} \theta_0 \theta_1 \cdots \theta_{i-1} \log(1/\theta_i)$, thereby ensuring, by Theorem 8, that Im(Zn+1) ≥ log(1/θn+1) for all n. Therefore Zn+2 is defined for all n (and Z1 is defined because Im(Z) ≥ log(1/θ0), the first term of the series). By Theorem 7, the F-orbit of Z never escapes H, i.e., f is stable at 0. By Proposition 1, f is linearizable at 0 as desired.
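To get a feel for the series in this argument, one can compute its partial sums for a rotation number with known continued fraction entries. The sketch below is illustrative only: the helper names `tails` and `partial_sums` are ours, the continued fraction is truncated to finitely many entries (so later tails are approximations), and floating-point arithmetic is used throughout.

```python
import math


def tails(quotients):
    """Approximate tails theta_0, ..., theta_{N-1} of theta = [a1, ..., aN].

    Uses theta_n = 1 / (a_{n+1} + theta_{n+1}), working backwards and
    ASSUMING the tail beyond the last given entry is 0 (a truncation).
    """
    theta_next = 0.0
    out = []
    for a in reversed(quotients):
        theta_next = 1.0 / (a + theta_next)
        out.append(theta_next)
    out.reverse()
    return out


def partial_sums(thetas):
    """Partial sums of sum_i theta_0 ... theta_{i-1} * log(1 / theta_i)."""
    sums, total, product = [], 0.0, 1.0   # product = theta_0 ... theta_{i-1}
    for t in thetas:
        total += product * math.log(1.0 / t)
        sums.append(total)
        product *= t
    return sums


# Golden mean rotation number: every continued fraction entry equals 1.
golden = tails([1] * 60)
sums = partial_sums(golden[:40])   # keep only tails far from the truncation
print(sums[-1])
```

For the golden mean every tail equals θ, so the series is geometric with ratio θ and the partial sums settle quickly; rotation numbers with rapidly growing entries make the terms log(1/θi) large, which is where convergence can fail.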

5 Appendix

Here we present concise proofs of some classical theorems in complex analysis that were used in the proof of Lemma 1.

Theorem 9 (Montel). Let F be a family of holomorphic self-maps of the unit disc. Every sequence in F has a locally uniformly convergent subsequence.

Proof. Fix 0 < r < 1 and consider F|Dr. For f ∈ F and z ∈ Dr, apply Cauchy’s integral formula along |w − z| = 1 − r:

$$|f'(z)| = \left| \frac{1}{2\pi i} \oint \frac{f(w)}{(w - z)^2}\, dw \right| \le \frac{1}{2\pi} \cdot 2\pi(1 - r) \cdot \frac{1}{(1 - r)^2} = \frac{1}{1 - r}.$$

The bound does not depend on f or z, showing that F|Dr is equicontinuous. By Arzelà-Ascoli, every sequence in F|Dr has a uniformly convergent subsequence. Hence every sequence in F has a locally uniformly convergent subsequence.
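The derivative bound can be spot-checked on a concrete family. The sketch below samples disc automorphisms z ↦ (z − a)/(1 − āz), which are holomorphic self-maps of the unit disc, and compares |f′(z)| with 1/(1 − r) for |z| < r. The choice of family, the function names and the sampling scheme are assumptions made for illustration; a random check of this kind is of course not a proof.

```python
import cmath
import math
import random


def blaschke_derivative(a: complex, z: complex) -> complex:
    """Derivative of z -> (z - a) / (1 - conj(a) z), a self-map of the disc for |a| < 1."""
    return (1 - abs(a) ** 2) / (1 - a.conjugate() * z) ** 2


def worst_ratio(r: float, samples: int = 2000, seed: int = 0) -> float:
    """Largest observed value of |f'(z)| * (1 - r) over sampled maps f and points |z| < r."""
    rng = random.Random(seed)   # fixed seed so the run is reproducible
    worst = 0.0
    for _ in range(samples):
        a = cmath.rect(rng.random() * 0.999, rng.uniform(0, 2 * math.pi))
        z = cmath.rect(rng.random() * r, rng.uniform(0, 2 * math.pi))
        worst = max(worst, abs(blaschke_derivative(a, z)) * (1 - r))
    return worst


# The Cauchy estimate predicts that both values are at most 1.
print(worst_ratio(0.5), worst_ratio(0.9))
```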

Theorem 10 (Hurwitz). Let {ψn} be a sequence of holomorphic self-maps of the unit disc, which converges to ψ locally uniformly. Suppose ψ has a zero of order m at z0. Then there is a disc D(z0, r) such that for all large n, ψn has exactly m zeroes in D(z0, r), and those zeroes converge to z0 as n → ∞.

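Proof. Fix any 0 < r < 1 − |z0| such that ψ has no zero in D(z0, r)\{z0}. Now {ψn} → ψ uniformly on D(z0, r), so

$$\frac{1}{2\pi i} \oint_{|z - z_0| = r} \frac{\psi_n'(z)}{\psi_n(z)}\, dz \to \frac{1}{2\pi i} \oint_{|z - z_0| = r} \frac{\psi'(z)}{\psi(z)}\, dz.$$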

By the argument principle, LHS equals the number of zeroes of ψn in D(z0, r), and RHS equals m. Since both LHS and RHS are always integers, LHS equals RHS for all large n. For such n, ψn has exactly m zeroes in D(z0, r).

In the above, r could have been arbitrarily small, showing that the m-many zeroes of ψn around z0 must converge to z0 as n → ∞.
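The zero count in this proof can be illustrated numerically. The sketch below approximates (1/(2πi)) ∮ ψ′/ψ dz on a circle by the trapezoid rule. The example family ψn(z) = (z² − 1/n²)/2, which converges to ψ(z) = z²/2 (a zero of order 2 at 0), and the names `count_zeros` and `psi_n` are our own assumptions, chosen so that no zero lies on the contour |z| = 0.3.

```python
import cmath
import math


def count_zeros(psi, dpsi, z0: complex, r: float, steps: int = 2000) -> complex:
    """Trapezoid-rule approximation of (1/(2*pi*i)) * contour integral of psi'/psi over |z - z0| = r."""
    total = 0j
    for k in range(steps):
        t = 2 * math.pi * k / steps
        offset = cmath.rect(r, t)                     # r * e^{it}
        dz = 1j * offset * (2 * math.pi / steps)      # derivative of the parametrization times dt
        total += dpsi(z0 + offset) / psi(z0 + offset) * dz
    return total / (2j * math.pi)


def psi_n(n: int):
    """Return psi_n(z) = (z^2 - 1/n^2) / 2 and its derivative z."""
    return (lambda z: (z * z - 1 / n ** 2) / 2), (lambda z: z)


# Zeros of psi_n sit at +-1/n, so they enter the disc |z| < 0.3 once n >= 4.
counts = [round(count_zeros(*psi_n(n), 0j, 0.3).real) for n in (1, 3, 4, 10, 100)]
print(counts)
```

Once n is large enough the count stabilizes at the order of the zero of the limit, which is the content of the theorem for this example.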

Corollary 5. Let {ψn} be a sequence of univalent self-maps of the unit disc, which converges to ψ locally uniformly. Then ψ is either univalent or constant.

Proof. Assuming that ψ is nonconstant, we show that it is univalent.

Fix w and consider z such that ψ(z) = w. Since ψ is nonconstant, z is a zero of finite order of ψ − w. By Hurwitz’s theorem, for all large n, ψn − w has some zeroes in some disc around z, and those zeroes converge to z as n → ∞.

Now since the ψn are univalent, each ψn − w has at most one zero; for large n it has exactly one, which we denote by $\psi_n^{-1}(w)$. By the previous paragraph, for all z such that ψ(z) = w, $\{\psi_n^{-1}(w)\}_n$ converges to z. Therefore there is at most one z such that ψ(z) = w.

References

[Bra10] Filippo Bracci. Local holomorphic dynamics of diffeomorphisms in dimension one, volume 525 of Contemporary Mathematics. American Mathematical Society, 2010.

[Brj71] A.D. Brjuno. Analytic form of differential equations. I, II. Trans. Moscow Math. Soc., 25, 1971.

[CG93] Lennart Carleson and Theodore W. Gamelin. Complex Dynamics. Springer, 1993.

[Khi64] Aleksandr Khinchin. Continued Fractions. The University of Chicago Press, 1964.

[Koe84] Gabriel Koenigs. Recherches sur les intégrales de certaines équations fonctionnelles. Ann. Sci. École Norm. Sup., 1:2–41, 1884.

[Mar00] Stefano Marmi. An introduction to small divisors problems. arXiv, 2000.

[Mar06] Stefano Marmi. From small divisors to Brjuno functions. Online notes, 2006.

[Mil06] John W. Milnor. Dynamics in one complex variable. Number 160 in Annals of Mathematics Studies. Princeton University Press, 3rd edition, 2006.

[Sch70] Ernst Schröder. Über iterirte Funktionen. Mathematische Annalen, 3(2):296–322, 1870.

[SS03] Elias Stein and Rami Shakarchi. Complex Analysis, volume 2 of Princeton Lectures in Analysis. Princeton University Press, 2003.

[Yoc95] Jean-Christophe Yoccoz. Petits diviseurs en dimension 1. Astérisque, (231), 1995.
