INTRODUCTION TO THE THEORY OF RANDOM PERMUTATIONS
Eugenijus Manstavičius
ERASMUS lectures, ELTE, Budapest, 2010
Contents
Introduction

I. ELEMENTARY THEORY
1.1 Cyclic structure of a permutation
1.2 Distribution of the cycle structure vector
1.3 The number of cycles
1.4 Feller's coupling
1.5 Estimation of the total variation distance
1.6 Estimates of conditional probabilities
1.7 Additive functions and the tail probabilities
1.8 Moments of additive functions
1.9 The law of large numbers
1.10 The central limit theorem
1.11 Mean values of multiplicative functions
1.12 The three series theorem
Introduction
A permutation of the set {1, 2, ..., n} = N_n can be understood as a combinatorial word, that is, a sequence of n different letters. Then inversions, descents, monotone subsequences, alternating runs, patterns, et cetera become very interesting objects for deep investigation. References in this direction can be found in the seminal book by M. Bóna [1]. Nevertheless, we will not go along this path.
Our interests lie in permutations as bijective mappings on N_n. From this point of view, a permutation can be visualized as a labelled digraph comprised of disjoint cycles. This exposition just mirrors the fact that each permutation is a product of independent cyclic permutations, or cycles. The decomposition into the product is unique up to the order. As a labelled decomposable combinatorial structure, a permutation is the simplest instance; it hides many secrets to be uncovered, however. Mathematical innovations created for S_n are indispensable for more complicated problems on general classes of combinatorial structures such as abelian partitional complexes [2], also called assemblies [3].
Maybe the strongest motivation to deal with permutations comes from algebra. All bijective mappings on N_n with respect to multiplication comprise the symmetric group S_n of order n!. By Cayley's theorem, each group of order n can be isomorphically embedded into S_n. This fact gives a possibility for computer calculations in finite groups. For large n, this is not a panacea, however. The main obstacle is the great complexity and unclear asymptotic behavior of properties of the very S_n as n → ∞. H. Weyl [4] writes:
This group has a different constitution for each individual number n. The question is whether there are nevertheless some asymptotic uniformities prevailing for large n . . .
Our experience, which is also based on many investigations including those carried out by the Hungarian mathematicians P. Turán, P. Erdős, M. Szalay and others, suggests the answer that the asymptotic uniformities do exist for almost all permutations. That motivates us to start with the phrase: Take a permutation at random!
[1] M. Bóna, Combinatorics of Permutations, Chapman and Hall/CRC Press, Boca Raton, 2004.
[2] D. Foata, La série génératrice exponentielle dans les problèmes d'énumération, Séminaire de Mathématiques Supérieures, Les Presses de l'Université de Montréal, Québec, 1974.
[3] R. Arratia, A. D. Barbour and S. Tavaré, Logarithmic Combinatorial Structures: a Probabilistic Approach, EMS Monographs in Mathematics, EMS Publishing House, Zürich, 2003.
[4] H. Weyl, Philosophy of Mathematics and Natural Science, Princeton University Press, 1949.
I. ELEMENTARY THEORY
1.1 Cyclic structure of a permutation
Recall that the set of permutations σ: N_n → N_n with respect to multiplication comprises the symmetric group, which we denote by S_n. Let

$$\sigma = \kappa_1 \cdots \kappa_w \qquad (1.1)$$

be the canonical decomposition of σ ∈ S_n into the product of independent cycles. Here w = w(σ) denotes their number. For instance,
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 3 & 1 & 4 & 6 & 5 & 7 \end{pmatrix} = (1,2,3)(4)(5,6)(7).$$
There is no need to recall a more detailed definition of the product (1.1); instead, we may imagine the labelled digraph being an unordered collection of w cycles. For the instance above, we have the directed edges:

$$1 \mapsto 2 \mapsto 3 \mapsto 1, \quad 4 \mapsto 4, \quad 5 \mapsto 6 \mapsto 5, \quad 7 \mapsto 7.$$
Let k_j(σ) ≥ 0 be the number of cycles of length j, 1 ≤ j ≤ n, in (1.1). Then

$$k(\sigma) = (k_1(\sigma), \ldots, k_n(\sigma))$$

is called the cycle structure vector. For the instance, k(σ) = (2, 1, 1, 0, 0, 0, 0). Then the set of permutations with the same cycle structure

$$S_n(s) = \{\sigma \in S_n :\ k(\sigma) = s\},$$

where s = (s_1, ..., s_n) ∈ Z_+^n and ℓ(s) := 1s_1 + ⋯ + ns_n = n, agrees with a class of conjugate permutations. Such classes play an important role in algebra. For combinatorial purposes, we need to know their cardinalities. It is worth recalling two simple propositions.
1.1.1 Theorem. If s = (s_1, ..., s_n) ∈ Z_+^n and ℓ(s) = n, then

$$|S_n(s)| = n!\prod_{j=1}^{n}\frac{1}{s_j!\, j^{s_j}}.$$
Proof. According to the vector s, make sj boxes of size j, 1 ≤ j ≤ n:
$$\overbrace{(\cdot)\cdots(\cdot)}^{s_1}\ \ldots\ \overbrace{\underbrace{(\cdot,\ldots,\cdot)}_{j}\cdots\underbrace{(\cdot,\ldots,\cdot)}_{j}}^{s_j}\ \ldots\ \overbrace{\underbrace{(\cdot,\ldots,\cdot)}_{n}\cdots\underbrace{(\cdot,\ldots,\cdot)}_{n}}^{s_n}.$$
Exploit all n! possibilities to distribute the numbers of N_n into the boxes and consider the latter as cycles. In each such case, we get a decomposed permutation with the cycle structure vector s. Since the order of the cycles is irrelevant, we have repeated the same permutation

$$s_1!\cdots s_j!\cdots s_n!$$

times. Moreover, a cycle in a box of size j is repeatedly exposed j times. For this reason, the same permutation has also been repeated

$$1^{s_1}\cdots j^{s_j}\cdots n^{s_n}$$

times. Dividing n! by the last two products, we obtain the number of σ ∈ S_n with the cycle structure vector s.
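The counting argument above is easy to confirm by machine. The following Python sketch (an editorial illustration; the helper names are ours, not from the lectures) enumerates the whole of S_5 and compares the class cardinalities with the formula of Theorem 1.1.1.

```python
from itertools import permutations
from math import factorial
from collections import Counter

def cycle_type(perm):
    """Return s = (s_1, ..., s_n), where s_j counts cycles of length j."""
    n = len(perm)
    seen, s = [False] * n, [0] * n
    for i in range(n):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]          # 0-based: position j is mapped to perm[j]
                length += 1
            s[length - 1] += 1
    return tuple(s)

def class_size(s):
    """|S_n(s)| = n! / prod_j (s_j! * j**s_j), as in Theorem 1.1.1."""
    n = sum(j * sj for j, sj in enumerate(s, start=1))
    denom = 1
    for j, sj in enumerate(s, start=1):
        denom *= factorial(sj) * j ** sj
    return factorial(n) // denom

# Compare the formula with direct counting over the whole of S_5.
counts = Counter(cycle_type(p) for p in permutations(range(5)))
assert all(class_size(s) == c for s, c in counts.items())
```

For example, the class of 5-cycles has cardinality 4! = 24, in agreement with the formula.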
Cauchy's equality. For every n ∈ N,

$$\sum_{\ell(s)=n}\ \prod_{j=1}^{n}\frac{1}{j^{s_j}\, s_j!} = 1,$$

where the summation is extended over s = (s_1, ..., s_n) ∈ Z_+^n such that ℓ(s) = n.
The First Proof. Sum up the cardinalities found in the last theorem over the disjoint classes indexed by s with ℓ(s) = n.
The Second Proof is based on the fundamental property of exponential generating functions corresponding to decomposable combinatorial structures and to the class of their components. In our case, stressing that there are n! permutations of order n and (j − 1)! cycles of length j, we rewrite it as follows:

$$\sum_{n\ge 0}\frac{n!}{n!}\,x^n = \frac{1}{1-x} = \exp\{-\log(1-x)\} = \exp\Big\{\sum_{j\ge 1}\frac{(j-1)!}{j!}\,x^j\Big\}.$$

Now, it suffices to perform simple formal calculations with the right-hand series, which equals

$$\prod_{j\ge 1}\sum_{s\ge 0}\frac{x^{js}}{j^s\, s!} = \sum_{n\ge 0}\Big(\sum_{\ell(s)=n}\prod_{j\le n}\frac{1}{j^{s_j}\, s_j!}\Big)x^n.$$

Comparing the coefficients at x^n, we complete the proof.

With some effort, the cardinalities |S_n(s)| can be examined elementarily.
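Cauchy's equality can likewise be verified numerically with exact rational arithmetic; the editorial sketch below (the helper names are ours) enumerates all cycle structure vectors s with ℓ(s) = n.

```python
from fractions import Fraction
from math import factorial

def cycle_types(n):
    """Yield all s = (s_1, ..., s_n) with 1*s_1 + ... + n*s_n = n."""
    def rec(j, rem, prefix):
        if j > n:
            if rem == 0:
                yield tuple(prefix)
            return
        for sj in range(rem // j + 1):
            yield from rec(j + 1, rem - j * sj, prefix + [sj])
    yield from rec(1, n, [])

def cauchy_sum(n):
    """sum over l(s) = n of prod_j 1/(j**s_j * s_j!), exactly."""
    total = Fraction(0)
    for s in cycle_types(n):
        term = Fraction(1)
        for j, sj in enumerate(s, start=1):
            term /= Fraction(j ** sj * factorial(sj))
        total += term
    return total

assert all(cauchy_sum(n) == 1 for n in range(1, 9))
```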
1.1.2 Theorem. Assume n ≥ 3. Then

$$\max\{|S_n(s)| :\ s\in\mathbb Z_+^n,\ \ell(s) = n\} = \frac{n!}{n-1}.$$

Moreover, the maximum is achieved for the unique vector s = (1, 0, ..., 0, 1, 0).
Proof. Denote

$$\Pi(s) := 1^{s_1} 2^{s_2}\cdots n^{s_n}\, s_1!\, s_2!\cdots s_n!$$

and find its minimum under the extra condition ℓ(s) = n. Set

$$r := r(s) = \max\{j :\ s_j \ge 1\}.$$
Firstly, we claim that a vector s with r ≥ 3 minimizing the product Π(s) satisfies the condition

$$s_2 = \cdots = s_{r-1} = 0. \qquad (1.2)$$

Indeed, assuming the contrary, there would be an index i, 2 ≤ i ≤ r − 1, such that s_i ≥ 1 and

$$n = \sum_{j=1}^{n} j s_j \ge i s_i + r s_r \ge i + r.$$

Consequently, the vector has the (i + r)-th coordinate and we may redistribute two of the summands in n so that

$$i s_i + r s_r = i(s_i - 1) + r(s_r - 1) + (i + r)\cdot 1.$$
Now, instead of the vector s, we can take

$$s^{(1)} := (s_1, \ldots, s_i - 1, \ldots, s_r - 1, \ldots, 1, 0, \ldots, 0),$$

obtained from s by changing the i-th, r-th, and (i + r)-th coordinates to s_i − 1, s_r − 1, and 1, respectively. The new vector also satisfies ℓ(s^{(1)}) = n and
$$\frac{\Pi(s^{(1)})}{\Pi(s)} = \frac{i+r}{i s_i\cdot r s_r} \le \frac{i+r}{ir} = \frac{1}{i} + \frac{1}{r} \le \frac{1}{2} + \frac{1}{3} = \frac{5}{6} < 1.$$

We have succeeded in finding a smaller value of the product. The contradiction proves our claim (1.2). Thus, if r ≥ 3, we may proceed with the vectors minimizing Π(s) and having the form
s = (s1, 0, . . . , 0, sr, 0, . . . , 0) .
Assume now that r ≥ 2. We claim that s_1 = 1. If s_1 ≥ 2, then, as in the proof of the first claim, we can exploit the relations

$$n \ge 1\cdot s_1 + r s_r \ge 2 + r, \qquad 1\cdot s_1 + r s_r = 1\cdot(s_1 - 1) + r(s_r - 1) + (1 + r)\cdot 1.$$
We can take the vector s^{(2)} obtained from s by subtracting 1 from the first and the r-th coordinates and substituting 1 for 0 in the (1 + r)-th place. Then

$$\frac{\Pi(s^{(2)})}{\Pi(s)} = \frac{1+r}{s_1\cdot r s_r} \le \frac{1+r}{2r} = \frac{1}{2}\Big(1 + \frac{1}{r}\Big) \le \frac{3}{4} < 1.$$
We have again a contradiction showing that s_1 = 1 or s_1 = 0. In the second case, we would have the vector

$$s = (0, \ldots, 0, s, 0, \ldots, 0),$$
where s ≥ 1 is in the r-th position. Then rs = n and

$$\Pi(s) = r^s\, s! = (n/s)^s\, s! = n\,(n/s)^{s-1}(s-1)! \ge n > n - 1.$$
This value is greater than the expected minimum. It remains to check the case r ≥ 2 with s_1 = 1. Then

$$s = (1, 0, \ldots, 0, s_r, 0, \ldots, 0).$$

If r = 2, then, of course, there are no zeros between 1 and s_r. The value s_r satisfies the equation 1·s_1 + r s_r = 1 + r s_r = n. If s_r ≥ 2, then 2r < n and we have the possibility to use the (2r)-th coordinate.
Let r ≥ 2 and s_r ≥ 2. Then

$$r s_r + (2r)\cdot 0 = r(s_r - 2) + (2r)\cdot 1.$$
Introduce the vector

$$s^{(3)} = (1, 0, \ldots, 0, s_r - 2, 0, \ldots, 0, 1, 0, \ldots, 0),$$

where s_r − 2 stands in the r-th place and 1 in the (2r)-th one. We see that

$$\frac{\Pi(s^{(3)})}{\Pi(s)} = \frac{(2r)\, r^{s_r-2}(s_r-2)!}{r^{s_r}\, s_r!} = \frac{2}{r\, s_r (s_r - 1)} \le \frac{1}{2} < 1.$$
That is impossible. Consequently, if r ≥ 2, then the vector minimizing the product satisfies s_r = 1. From the equality 1 + r s_r = 1 + r = n, we find r = n − 1. The value of the product equals n − 1.

If r = 1, we obtain the vector s = (s_1, 0, ..., 0) with s_1 = n. For it, the value of the product is n, that is, larger than in the previous case.

The theorem is proved.

Exercise. Find the second and the third largest classes of permutations and the appropriate cycle structure vectors. Which class is the smallest in cardinality?
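For the exercise, it may help to tabulate all class sizes by Theorem 1.1.1. The following editorial sketch (helper names are ours) confirms Theorem 1.1.2 for 3 ≤ n ≤ 8 and can be adapted to locate the second and third largest classes.

```python
from math import factorial

def cycle_types(n):
    """Yield all s = (s_1, ..., s_n) with 1*s_1 + ... + n*s_n = n."""
    def rec(j, rem, prefix):
        if j > n:
            if rem == 0:
                yield tuple(prefix)
            return
        for sj in range(rem // j + 1):
            yield from rec(j + 1, rem - j * sj, prefix + [sj])
    yield from rec(1, n, [])

def class_size(s):
    """|S_n(s)| by Theorem 1.1.1."""
    n = sum(j * sj for j, sj in enumerate(s, start=1))
    denom = 1
    for j, sj in enumerate(s, start=1):
        denom *= factorial(sj) * j ** sj
    return factorial(n) // denom

for n in range(3, 9):
    sizes = {s: class_size(s) for s in cycle_types(n)}
    best = max(sizes, key=sizes.get)
    assert sizes[best] == factorial(n) // (n - 1)
    # unique maximizer: one fixed point and one cycle of length n - 1
    assert best == tuple(1 if j in (1, n - 1) else 0 for j in range(1, n + 1))
    assert list(sizes.values()).count(sizes[best]) == 1
```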
1.2 Distribution of the cycle structure vector
To start a probabilistic theory on permutations, we introduce the measure

$$\nu_n(A) := \frac{|A|}{n!}, \qquad A\subset S_n.$$

From now on, σ ∈ S_n is an elementary event and every mapping f: S_n → R is a random variable (r.v.). Emphasizing the combinatorial content and transgressing the probability tradition, we will write f(σ) instead of f. Later on, we will also use some abstract probability space (Ω, F, P) and r.vs η, ξ, ... defined on it, without the argument ω ∈ Ω.
Firstly, we examine the distribution of the coordinates k_j(σ), 1 ≤ j ≤ n, of the cycle structure vector k(σ). Observe that the relation ℓ(k(σ)) = n, satisfied for each σ ∈ S_n, implies their stochastic dependence.
Recall the folklore hat-check problem asking to find the probability that exactly m, 1 ≤ m ≤ n, of the hats of n gentlemen were returned to their owners after a dinner. It is assumed that each of the possibilities is equally likely. In our notation, the solution to the problem is

$$\nu_n(k_1(\sigma) = m) = \frac{1}{m!}\sum_{s=0}^{n-m}\frac{(-1)^s}{s!} \to \frac{1}{m!}\sum_{s=0}^{\infty}\frac{(-1)^s}{s!} = e^{-1}\,\frac{1^m}{m!},$$

as n → ∞. Here m ∈ Z_+ is a fixed number. To extend this formula, we need the following elementary inclusion-exclusion lemma.
1.2.1 Lemma. Let A_1, ..., A_J ⊂ X be finite sets. The number of elements a ∈ X belonging to exactly m sets A_i, 1 ≤ i ≤ J, equals

$$\Sigma_m - \binom{m+1}{m}\Sigma_{m+1} + \binom{m+2}{m}\Sigma_{m+2} \mp\cdots + (-1)^{J-m}\binom{J}{m}\Sigma_J. \qquad (1.3)$$

Here

$$\Sigma_k = \sum_{1\le i_1<\cdots<i_k\le J}\ \sum_{x\in X}\mathbf 1\{x\in A_{i_1}\cap\cdots\cap A_{i_k}\}$$

and 1{·} denotes the indicator of the set in the braces.
The next result belongs to V.L. Goncharov [?].
1.2.1 Theorem. Let 1 ≤ j ≤ n and m ∈ Z_+. Then

$$\nu_n(k_j(\sigma) = m) = \frac{1}{j^m\, m!}\sum_{0\le s\le n/j - m}\frac{(-1)^s}{j^s\, s!}.$$

Moreover, if j and m are fixed and n → ∞, then

$$\nu_n(k_j(\sigma) = m) = \frac{e^{-1/j}}{j^m\, m!} + o(1).$$
Proof. Let κ_i be an arbitrary cycle of length j with vertices from N_n. Assume now that A_i ⊂ S_n is the subset of permutations containing the cycle κ_i. We have

$$J := n(n-1)\cdots(n-j+1)/j$$

different cycles of length j with vertices in N_n and the same number of the subsets A_i. It remains to find the number of permutations belonging to exactly m of them. The lemma above is at our disposal.
Calculating Σ_k, we observe that, if

$$\sigma\in A_{i_1}\cap\cdots\cap A_{i_k},$$

these k cycles κ_{i_1}, ..., κ_{i_k} must be independent; therefore, this determines jk ≤ n vertices of σ, while the remaining n − jk vertices can be distributed into other cycles arbitrarily. Hence the number of such permutations equals

$$(n - jk)!\ \mathbf 1\{jk\le n\}.$$

This is the value of the inner sum in Σ_k provided that κ_{i_1}, ..., κ_{i_k} are independent. For the other possibilities of indices i_1, ..., i_k, the inner sum is zero.
How many nonzero summands are in Σ_k? As we have noted, these summands correspond to independent cycles κ_{i_1}, ..., κ_{i_k} of length j. To find the number of such families of cycles, we can collect numbers from N_n into jk positions, taking into account their order. For this, there are n(n − 1)⋯(n − jk + 1) possibilities. These numbers must comprise k cycles of length j. Since a cyclic change of the order gives the same cycle, we must divide the latter number by j^k. Moreover, the order of the very cycles is irrelevant; therefore, we have again to divide by k!. Consequently, in the outer sum of Σ_k, there are

$$n(n-1)\cdots(n-jk+1)\,\frac{1}{j^k\, k!}$$

nonzero summands. Hence

$$\Sigma_k = n(n-1)\cdots(n-jk+1)\,\frac{1}{j^k k!}\,(n-jk)!\ \mathbf 1\{jk\le n\} = \frac{n!}{j^k k!}\,\mathbf 1\{jk\le n\}.$$
Inserting these values into formula (1.3), we obtain

$$|\{\sigma\in S_n :\ k_j(\sigma) = m\}| = \sum_{k=m}^{J}(-1)^{k-m}\binom{k}{m}\Sigma_k = \sum_{k=m}^{J}(-1)^{k-m}\binom{k}{m}\frac{n!}{j^k k!}\,\mathbf 1\{jk\le n\} = \frac{n!}{m!}\sum_{m\le k\le n/j}\frac{(-1)^{k-m}}{j^k\,(k-m)!}.$$

The substitution k = m + s and division by n! yield the first claim of Theorem 1.2.1. The second statement is now evident.
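The exact formula of Theorem 1.2.1 can be confronted with direct enumeration. The editorial sketch below (function names are ours) checks all pairs (j, m) for n = 6 with exact rational arithmetic.

```python
from itertools import permutations
from fractions import Fraction
from math import factorial

def count_j_cycles(perm, j):
    """Number of cycles of length j in a 0-based permutation tuple."""
    n = len(perm)
    seen, cnt = [False] * n, 0
    for i in range(n):
        if not seen[i]:
            k, length = i, 0
            while not seen[k]:
                seen[k] = True
                k = perm[k]
                length += 1
            if length == j:
                cnt += 1
    return cnt

def nu_formula(n, j, m):
    """nu_n(k_j = m) = (1/(j^m m!)) * sum_{0 <= s <= n/j - m} (-1)^s/(j^s s!)."""
    total = Fraction(0)
    for s in range(n // j - m + 1):
        total += Fraction((-1) ** s, j ** s * factorial(s))
    return total / (j ** m * factorial(m))

n = 6
perms = list(permutations(range(n)))
for j in range(1, n + 1):
    for m in range(n // j + 1):
        freq = Fraction(sum(1 for p in perms if count_j_cycles(p, j) == m),
                        factorial(n))
        assert freq == nu_formula(n, j, m)
```

For instance, nu_formula(6, 1, 0) returns the derangement probability D_6/6! = 265/720.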
Let ξ_j be a Poisson r.v. with Eξ_j = 1/j. The latter theorem can be rewritten as L(k_j(σ)) → L(ξ_j) as n → ∞, provided that j is fixed. Here and afterwards, L(X) denotes the distribution of a r.v. X. Can we extend this to a multi-dimensional relation? The answer will be given in the sequel. The next relations explain the dependence of the coordinates in k(σ).
1.2.2 Theorem. Let ξ_1, ..., ξ_n be mutually independent Poisson r.vs given on some space, Eξ_j = 1/j for 1 ≤ j ≤ n, ξ = (ξ_1, ..., ξ_n), and ζ := ℓ(ξ). Then

$$\nu_n(k(\sigma) = s) = \mathbf 1\{\ell(s) = n\}\prod_{j=1}^{n}\frac{1}{j^{s_j}\, s_j!} = P(\xi = s\mid\zeta = n).$$
Proof. The first equality follows from Theorem 1.1.1. The proof of the second one is an easy exercise.
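Since P(ξ_j = s_j) = e^{-1/j}(1/j)^{s_j}/s_j!, the common factor e^{-H_n}, H_n = 1 + 1/2 + ⋯ + 1/n, cancels in the conditional probability, so the second equality can be checked with exact rationals. An editorial sketch (helper names are ours):

```python
from itertools import permutations
from collections import Counter
from fractions import Fraction
from math import factorial

def cycle_type(perm):
    """Cycle structure vector of a 0-based permutation tuple."""
    n = len(perm)
    seen, s = [False] * n, [0] * n
    for i in range(n):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            s[length - 1] += 1
    return tuple(s)

def poisson_weight(s):
    """P(xi = s) without the common factor exp(-H_n)."""
    w = Fraction(1)
    for j, sj in enumerate(s, start=1):
        w /= Fraction(j ** sj * factorial(sj))
    return w

n = 6
counts = Counter(cycle_type(p) for p in permutations(range(n)))
z = sum(poisson_weight(s) for s in counts)   # = P(zeta = n) / exp(-H_n)
for s, c in counts.items():
    assert poisson_weight(s) / z == Fraction(c, factorial(n))
assert z == 1                                # Cauchy's equality once more
```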
More useful is the following infinite version of the last theorem.
1.2.3 Theorem. Let 0 < x < 1 and let ξ_1^{(x)}, ..., ξ_n^{(x)}, ... be mutually independent Poisson r.vs given on some space with Eξ_j^{(x)} = x^j/j for j ≥ 1. Denote

$$\xi^{(x)} = (\xi_1^{(x)}, \ldots, \xi_n^{(x)}, \xi_{n+1}^{(x)}, \ldots),$$

$$\zeta^{(x)} := 1\xi_1^{(x)} + \cdots + n\xi_n^{(x)} + (n+1)\xi_{n+1}^{(x)} + \cdots,$$

and

$$k(\sigma) = (k_1(\sigma), \ldots, k_n(\sigma), 0, \ldots).$$

Then

$$\nu_n(k(\sigma) = s) = P(\xi^{(x)} = s\mid\zeta^{(x)} = n) \qquad (1.4)$$

for each s = (s_1, ..., s_n, s_{n+1}, ...) ∈ Z_+^∞.
Proof. Firstly, check that

$$\sum_{j=1}^{\infty}P(\xi_j^{(x)}\ne 0) = \sum_{j=1}^{\infty}\Big(1 - \exp\Big\{-\frac{x^j}{j}\Big\}\Big) < \sum_{j=1}^{\infty}\frac{x^j}{j} < \infty$$

if 0 < x < 1. By the Borel-Cantelli lemma, the series of r.vs defining ζ^{(x)} converges with probability one and determines a r.v. Secondly, we see that only the vectors s with s_j = 0 for j ≥ n + 1 and such that ℓ(s) = n can give nonzero probabilities on both sides of (1.4). In what follows, only such vectors s are taken into account.
Further, we find

$$P(\zeta^{(x)} = n) = P(\xi_{n+1}^{(x)} = \xi_{n+2}^{(x)} = \cdots = 0)\,P(1\xi_1^{(x)} + \cdots + n\xi_n^{(x)} = n)$$

$$= \exp\Big\{-\sum_{j=n+1}^{\infty}\frac{x^j}{j}\Big\}\sum_{\ell(s)=n}\prod_{j=1}^{n}P(\xi_j^{(x)} = s_j)$$

$$= \exp\Big\{-\sum_{j=1}^{\infty}\frac{x^j}{j}\Big\}\sum_{\ell(s)=n}\prod_{j=1}^{n}\frac{x^{j s_j}}{j^{s_j}\, s_j!}$$

$$= (1 - x)\,x^n. \qquad (1.5)$$

In the last step, we applied Cauchy's equality. The summation above has run over s ∈ Z_+^n satisfying ℓ(s) = n. We also have
$$P(\xi^{(x)} = s,\ \zeta^{(x)} = n) = \prod_{j=n+1}^{\infty}P(\xi_j^{(x)} = 0)\prod_{j=1}^{n}P(\xi_j^{(x)} = s_j)$$

$$= \exp\Big\{-\sum_{j=1}^{\infty}\frac{x^j}{j}\Big\}\prod_{j=1}^{n}\frac{x^{j s_j}}{j^{s_j}\, s_j!} = (1-x)\,x^n\prod_{j=1}^{n}\frac{1}{j^{s_j}\, s_j!}.$$
The ratio of the last two probabilities gives the already known value of the frequency on the left-hand side in (1.4). The theorem is proved.

In the sequel, we denote by E_n the mean value with respect to ν_n and use E to denote the mean value in the space (Ω, F, P) in which the r.vs ξ_j^{(x)}, j ≥ 1, are defined.
1.2.4 Theorem. Let 0 < x < 1 and let Ψ: Z_+^∞ → C be a function such that EΨ(ξ^{(x)}) exists. Then

$$E_n\Psi(k(\sigma)) = E\big(\Psi(\xi^{(x)})\mid\zeta^{(x)} = n\big).$$

Moreover,

$$(1-x)^{-1}E\Psi(\xi^{(x)}) = \sum_{n=0}^{\infty}E_n\Psi(k(\sigma))\,x^n. \qquad (1.6)$$
Proof. The first claim follows from (1.4) in the previous theorem. Further, we use the well-known probabilistic equality

$$EX = E(E(X\mid Y)) = \sum_{n=0}^{\infty}E(X\mid Y = n)\,P(Y = n),$$

provided that the mean value EX exists. Here Y is an arbitrary r.v. taking values in Z_+. Applying the latter, we just take Y = ζ^{(x)}, X = Ψ(ξ^{(x)}), and combine (1.5) with the already proved formula.
The mean value EΨ(ξ^{(x)}) is a function defined in the interval (0, 1). By the principle of analytic continuation, one can use (1.6) for complex values x belonging to the open unit disk. This relation is fundamental for calculating the mean values E_nΨ(k(σ)) as Taylor coefficients. We now demonstrate such a possibility.
Set

$$(u)_r = u(u-1)\cdots(u-r+1), \qquad r = 0, 1, \ldots,$$

and recall that the factorial moment of order r of a Poisson r.v. ξ with Eξ = λ is

$$E(\xi)_r = \lambda^r. \qquad (1.7)$$
1.2.5 Theorem. Let r_1, ..., r_s ≥ 0, m := 1r_1 + ⋯ + s r_s, and 1 ≤ s ≤ n be fixed. Then the mixed factorial moment

$$E_{nm} := E_n\big((k_1(\sigma))_{r_1}\cdots(k_s(\sigma))_{r_s}\big) = \mathbf 1\{m\le n\}\prod_{j=1}^{s}\frac{1}{j^{r_j}} = \mathbf 1\{m\le n\}\prod_{j=1}^{s}E(\xi_j)_{r_j}.$$
Proof. If m > n ≥ s, then for some j one of the factors k_j(σ) − i, 0 ≤ i ≤ r_j − 1, vanishes. Hence E_{nm} = 0 as well as 1{m ≤ n} = 0.

Assume m ≤ n and apply the formula in Theorem 1.2.4 for

$$\Psi(k) = (k_1)_{r_1}\cdots(k_s)_{r_s}.$$
We have

$$(1-x)^{-1}E\Big(\prod_{j=1}^{s}(\xi_j^{(x)})_{r_j}\Big) = \sum_{n\ge 0}E_{nm}\,x^n, \qquad 0 < x < 1.$$
Exploiting the independence of the r.vs, from (1.7), we obtain

$$\Big(\prod_{j=1}^{s}\frac{1}{j^{r_j}}\Big)x^m(1 + x + \cdots) = \sum_{n\ge 0}E_{nm}\,x^n,$$

if 0 < x < 1. Comparing the coefficients at x^n, we complete the proof.
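Theorem 1.2.5 is also convenient for a machine check by enumeration; in the editorial sketch below (function names are ours), rs stands for the vector (r_1, ..., r_s).

```python
from itertools import permutations
from fractions import Fraction
from math import factorial

def cycle_counts(perm):
    """(k_1, ..., k_n) for a 0-based permutation tuple."""
    n = len(perm)
    seen, k = [False] * n, [0] * n
    for i in range(n):
        if not seen[i]:
            j, length = i, 0
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            k[length - 1] += 1
    return k

def falling(u, r):
    """(u)_r = u(u-1)...(u-r+1)."""
    out = 1
    for i in range(r):
        out *= u - i
    return out

def mixed_moment(n, rs):
    """E_n[(k_1)_{r_1} ... (k_s)_{r_s}] by direct enumeration of S_n."""
    total = 0
    for p in permutations(range(n)):
        k = cycle_counts(p)
        term = 1
        for j, r in enumerate(rs, start=1):
            term *= falling(k[j - 1], r)
        total += term
    return Fraction(total, factorial(n))

n = 6
for rs in [(1,), (2,), (1, 1), (0, 2), (1, 0, 1), (3, 1), (4, 2)]:
    m = sum(j * r for j, r in enumerate(rs, start=1))
    denom = 1
    for j, r in enumerate(rs, start=1):
        denom *= j ** r
    expected = Fraction(0) if m > n else Fraction(1, denom)
    assert mixed_moment(n, rs) == expected
```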
1.2.1 Corollary. The finite-dimensional distributions of the vector

$$(k_1(\sigma), \ldots, k_n(\sigma), 0, \ldots)$$

converge to those of ξ^{(1)}, which has been defined in Theorem 1.2.3. Moreover, all mixed moments also converge to the appropriate moments of the limit r.vs.

Proof. It suffices to recall that the multi-dimensional Poisson distribution is uniquely determined by its moments. The result follows from the last theorem.
Exercise. Using Theorem 1.2.4, find the sum

$$\sum_{\sigma\in S_n}z^{w(\sigma)}, \qquad z\in\mathbb C.$$
1.3 The number of cycles
As we have remarked, the number of cycles

$$w(\sigma) = k_1(\sigma) + \cdots + k_n(\sigma)$$

is a sum of dependent r.vs; nevertheless, we can examine its value distribution. The two-dimensional sequence

$$c(n, m) := |\{\sigma\in S_n :\ w(\sigma) = m\}|, \qquad 0\le m\le n,$$

where c(0, 0) := 1, had been dealt with long before probabilistic combinatorics started.
1.3.1 Theorem. We have

$$c(n+1, m) = c(n, m-1) + n\,c(n, m)$$

for 1 ≤ m ≤ n.
Proof. We split all permutations defined on N_{n+1} = {1, ..., n, n + 1} into two classes. To the first class we ascribe a permutation if the number n + 1 comprises a cycle of length one in it. The remaining permutations comprise the second class.

The cardinality of the first class coincides with the number of permutations in S_n having m − 1 cycles; therefore, it equals c(n, m − 1).

Each permutation of the second class can be obtained from a σ ∈ S_n having m cycles. For this, we can consecutively insert n + 1 into each of the cycles of σ. We have j positions in a cycle of length j and altogether n positions. Going along this path, from each σ ∈ S_n having m cycles, we construct n permutations of the second class. Consequently, the cardinality of the latter is n c(n, m). Adding the obtained cardinalities, we complete the proof of Theorem 1.3.1.
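The recurrence is easily turned into a table of the numbers c(n, m); the editorial sketch below (helper names are ours) builds it and compares with direct counting over S_n for small n.

```python
from itertools import permutations
from collections import Counter
from math import factorial

def c_table(N):
    """c[n][m] built solely from the recurrence of Theorem 1.3.1."""
    c = [[0] * (N + 1) for _ in range(N + 1)]
    c[0][0] = 1
    for n in range(N):
        for m in range(1, n + 2):
            c[n + 1][m] = c[n][m - 1] + n * c[n][m]
    return c

def count_cycles(perm):
    """w(sigma) for a 0-based permutation tuple."""
    n = len(perm)
    seen, w = [False] * n, 0
    for i in range(n):
        if not seen[i]:
            w += 1
            j = i
            while not seen[j]:
                seen[j] = True
                j = perm[j]
    return w

c = c_table(7)
for n in range(1, 7):
    obs = Counter(count_cycles(p) for p in permutations(range(n)))
    assert all(c[n][m] == obs.get(m, 0) for m in range(n + 1))
assert sum(c[7]) == factorial(7)
```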
Recall that the Stirling numbers of the first kind s(n, m), 0 ≤ m ≤ n, are defined by the equalities

$$(x)_n = \sum_{m=0}^{n}s(n, m)\,x^m. \qquad (1.8)$$

It is an easy exercise to verify that they satisfy the recurrence relation

$$s(n+1, m) = s(n, m-1) - n\,s(n, m), \qquad 1\le m\le n,$$

where s(n, 0) = 0 if n ≥ 1 and s(0, 0) = 1. The substitution of −x in place of x in (1.8) and Theorem 1.3.1 imply the following relation.
1.3.1 Corollary. For 0 ≤ m ≤ n,
$$c(n, m) = (-1)^{n-m}s(n, m) = |s(n, m)|.$$
The asymptotic behavior as n → ∞ of the local probabilities

$$\nu_n(w(\sigma) = m) = \frac{c(n, m)}{n!}$$

is fairly involved. The integral limit theorem, established in 1942 by V.L. Goncharov and presented below, is historically notable.
Let

$$\Phi(x) := \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-u^2/2}\,du$$

be the distribution function of the standard normal law. For brevity, in the proof of the next theorem, we will use the classical asymptotic estimate

$$\binom{z+n-1}{n} = \frac{n^{z-1}}{\Gamma(z)}\big(1 + O(n^{-1})\big), \qquad (1.9)$$

as n → ∞, uniformly in |z| ≤ 1. Here Γ(z) is the Euler gamma function.
Goncharov's Theorem. If n → ∞, then

$$\nu_n\big(w(\sigma) - \log n < x\sqrt{\log n}\big) = \Phi(x) + o(1)$$

uniformly in x ∈ R.
Proof. Firstly, we find the moment generating function

$$\psi_n(z) := \frac{1}{n!}\sum_{\sigma\in S_n}z^{w(\sigma)} = \frac{1}{n!}\sum_{\ell(s)=n}\ \sum_{\sigma\in S_n(s)}z^{w(\sigma)}, \qquad z\in\mathbb C.$$

Here, as previously, s = (s_1, ..., s_n) ∈ Z_+^n. The summands in the inner sum are equal to z^{s_1+⋯+s_n} and their number is |S_n(s)|. Theorem 1.1.1 implies
$$\psi_n(z) = \frac{1}{n!}\sum_{\ell(s)=n}z^{s_1+\cdots+s_n}|S_n(s)| = \sum_{\ell(s)=n}\prod_{j=1}^{n}\Big(\frac{z}{j}\Big)^{s_j}\frac{1}{s_j!}.$$

This recalls the second proof of Cauchy's equality. The same argument now leads to the equality

$$(1-y)^{-z} = \prod_{j=1}^{\infty}\exp\Big\{\frac{z y^j}{j}\Big\} = \prod_{j=1}^{\infty}\sum_{s=0}^{\infty}\frac{(z y^j)^s}{j^s\, s!} = \sum_{n=0}^{\infty}\psi_n(z)\,y^n.$$
Here y is a formal variable. The n-th coefficient of the series on the left-hand side can be found by Newton's formula. Hence, by (1.9),

$$\psi_n(z) = \binom{z+n-1}{n} = \frac{n^{z-1}}{\Gamma(z)}\big(1 + O(n^{-1})\big), \qquad |z|\le 1, \qquad (1.10)$$
as n → ∞.

Now, the characteristic function of the law under investigation is

$$\varphi_n(t) := e^{-it\sqrt{\log n}}\,\psi_n\big(e^{it/\sqrt{\log n}}\big), \qquad t\in\mathbb R.$$

It remains to find the asymptotic formula for each of the factors on the right-hand side. We have

$$\Gamma\big(e^{it/\sqrt{\log n}}\big) = 1 + o(1)$$
uniformly in |t| ≤ T for every T > 0. The asymptotic expansion of the exponential function implies

$$\varphi_n(t) = \exp\Big\{-it\sqrt{\log n} + \big(e^{it/\sqrt{\log n}} - 1\big)\log n\Big\}\big(1 + o(1)\big) = e^{-t^2/2}\big(1 + o(1)\big)$$

as n → ∞, with the same uniformity in t. This completes the proof of Goncharov's theorem.
Observe an interesting relation. The first equality in (1.10) can be rewritten as

$$\psi_n(z) = \Big(1 + \frac{z-1}{n}\Big)\Big(1 + \frac{z-1}{n-1}\Big)\cdots\Big(1 + \frac{z-1}{2}\Big)\,z.$$
This shows that the distribution of w(σ) coincides with that of a sum of independent Bernoulli r.vs η_j, 1 ≤ j ≤ n, defined by

$$P(\eta_j = 1) = 1 - P(\eta_j = 0) = \frac{1}{j}.$$

This raises the suspicion that there exists a probability space carrying both k_j(σ) and η_j, j ≥ 1, where the first r.vs can be expressed in terms of the second ones. This leads to the theory of coupling of probability spaces.
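This observation can be tested exactly: convolving the n Bernoulli laws must reproduce c(n, m)/n!. An editorial sketch with exact rationals (function names are ours):

```python
from fractions import Fraction
from math import factorial

def stirling_row(n):
    """(c(n, 0), ..., c(n, n)) via the recurrence of Theorem 1.3.1."""
    row = [1] + [0] * n                # row of c(0, .)
    for k in range(n):
        new = [0] * (n + 1)
        for m in range(1, k + 2):
            new[m] = row[m - 1] + k * row[m]
        row = new
    return row

def bernoulli_sum_law(n):
    """Exact law of eta_1 + ... + eta_n with P(eta_j = 1) = 1/j."""
    pmf = [Fraction(1)]                # law of the empty sum
    for j in range(1, n + 1):
        p = Fraction(1, j)
        new = [Fraction(0)] * (len(pmf) + 1)
        for m, q in enumerate(pmf):
            new[m] += q * (1 - p)      # eta_j = 0
            new[m + 1] += q * p        # eta_j = 1
        pmf = new
    return pmf

n = 9
row, law = stirling_row(n), bernoulli_sum_law(n)
assert all(law[m] == Fraction(row[m], factorial(n)) for m in range(n + 1))
```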
1.4 Feller’s coupling
Let (Ω, F, P) be a probability space and η_j, j ≥ 1, independent Bernoulli r.vs defined on it. As previously, P(η_j = 1) = 1/j for j ≥ 1. We also assume that the space is rich enough to carry random choices of elements from finite sets, independent of the family η_j, j ≥ 1. In what follows, a number taken at random means that it is taken from an appropriate set with equal probabilities. W. Feller proposed the following probabilistic algorithm to generate a random σ.
Feller's algorithm:
Initialization. Open the permutation σ starting with the cycle containing 1; that is, write

σ := (1

and define the set X_1 := N_n \ {1}.

Step 1. If η_n = 1, then close the just opened cycle and start the next one with the least number of the set X_1; that is, write

σ = (1)(2

and define X_2 := X_1 \ {2}. If η_n = 0, then continue the opened cycle with a number s taken at random from X_1; that is, write

σ = (1, s

and set X_2 := X_1 \ {s}.

Assume that we have done 1 ≤ i < n steps and already have the combination

σ = κ_1 ⋯ κ_j (a, ..., r

where κ_k, 1 ≤ k ≤ j, are independent cycles and the numbers involved from N_n comprise the set A_i. Set X_i := N_n \ A_i.

Step i + 1. If η_{n−i} = 1, then we close the cycle and open the next one with the least number of X_i (say, t); that is, write

σ = κ_1 ⋯ κ_j (a, ..., r)(t

and set X_{i+1} := X_i \ {t}. If η_{n−i} = 0, then we continue the started cycle with a number q taken at random from X_i; that is, write

σ = κ_1 ⋯ κ_j (a, ..., r, q

and set X_{i+1} := X_i \ {q}.

The end. Since P(η_1 = 1) = 1, we close the last opened cycle and return the generated σ.
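The steps above admit a direct transcription into code. In the editorial sketch below, a permutation is returned as a list of its cycles; the function name and the use of Python's random module are our choices, not part of the lectures.

```python
import random

def feller_permutation(n, rng=random):
    """Generate a uniform sigma in S_n as a list of cycles (Feller's algorithm)."""
    cycles = []
    remaining = list(range(2, n + 1))        # X_1 = N_n \ {1}
    cycle = [1]                              # open the cycle containing 1
    for i in range(n - 1):                   # step i+1 uses eta_{n-i}
        if rng.random() < 1.0 / (n - i):     # eta_{n-i} = 1 with probability 1/(n-i)
            cycles.append(cycle)             # close the cycle ...
            t = min(remaining)               # ... and open a new one with the least number
            remaining.remove(t)
            cycle = [t]
        else:                                # eta_{n-i} = 0: extend the open cycle
            q = rng.choice(remaining)        # a number taken at random
            remaining.remove(q)
            cycle.append(q)
    cycles.append(cycle)                     # P(eta_1 = 1) = 1: close the last cycle
    return cycles

rng = random.Random(2010)
sigma = feller_permutation(10, rng)
assert sorted(x for c in sigma for x in c) == list(range(1, 11))
```

Each returned list is read cyclically: [a, b, c] means a ↦ b ↦ c ↦ a.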
1.4.1 Lemma. Feller's algorithm generates each σ ∈ S_n with probability 1/n!.
Proof. Observe that, during the initialization and in each of the n − 1 steps, new numbers from N_n are included into σ. Hence we obtain a permutation in its canonical representation.

The probability of the combination obtained in the first step is 1/n. Assume that the probability of the combination obtained in the i-th step is

$$\frac{1}{n}\cdot\frac{1}{n-1}\cdots\frac{1}{n-i+1}.$$

In different steps, only independent r.vs and events are used; therefore we may multiply the probabilities. In the first case of Step i + 1, the combination is obtained with the probability

$$\frac{1}{n}\cdot\frac{1}{n-1}\cdots\frac{1}{n-i+1}\cdot\frac{1}{n-i},$$

while in the second case, when η_{n−i} = 0, this probability equals

$$\frac{1}{n}\cdot\frac{1}{n-1}\cdots\frac{1}{n-i+1}\cdot\Big(1-\frac{1}{n-i}\Big)\frac{1}{n-i-1}.$$

Consequently, in either of the cases the probability is the same. By the induction principle, we obtain the expected probability 1/n! of σ.
In what follows, we can imagine that the generated σ = σ(ω), where ω ∈ Ω is an elementary event, is defined on the space (Ω, F, P). This allows us to express k_j(σ) in terms of η_i, i ≥ 1. We will denote by

$$X \stackrel{d}{=} Y$$

the coincidence of the distributions of the r.vs X and Y.
1.4.1 Theorem. We have

$$\sum_{j=1}^{n}\eta_j \stackrel{d}{=} \sum_{j=1}^{n}k_j(\sigma) = w(\sigma).$$
Proof. It suffices to observe that, during the generation of σ by Feller's algorithm, a cycle is finished if and only if η_j = 1, 1 ≤ j ≤ n. Hence the sum of these r.vs is equal to the number of cycles w(σ).
1.4.2 Theorem. Define the r.vs

$$X_{nj} := \eta_{n-j+1}(1-\eta_{n-j+2})\cdots(1-\eta_n) + \sum_{i=1}^{n-j}\eta_i(1-\eta_{i+1})\cdots(1-\eta_{i+j-1})\eta_{i+j} \qquad (1.11)$$

on the space (Ω, F, P). Then

$$k_j(\sigma) \stackrel{d}{=} X_{nj}, \qquad 1\le j\le n.$$
Proof. The first opened cycle has length j if and only if (η_n, ..., η_{n−j+1}) = (0, ..., 0, 1). This is equivalent to the equality

$$\eta_{n-j+1}(1-\eta_{n-j+2})\cdots(1-\eta_n) = 1.$$

A later cycle of length j is formed if and only if a run of the values (1, 0, ..., 0, 1), with j − 1 zeros between the ones, occurs in the sequence (η_1, ..., η_n). The summand indexed by i in the sum of (1.11) is one if and only if such a run occurs starting at position i. In other words, the sum counts the cycles of length j starting with the second one.

The theorem is proved.
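A useful sanity check of (1.11), added editorially: for any 0-1 values of η_1, ..., η_n with η_1 = 1 (recall P(η_1 = 1) = 1), the quantities X_{nj} behave exactly like cycle counts, so that Σ_j j·X_{nj} = n and Σ_j X_{nj} = Σ_j η_j.

```python
from itertools import product

def x_nj(eta, j):
    """X_nj of (1.11); eta is a tuple (eta_1, ..., eta_n) of 0-1 values."""
    n = len(eta)
    e = lambda i: eta[i - 1]                 # 1-based access
    val = e(n - j + 1)                       # eta_{n-j+1}(1-eta_{n-j+2})...(1-eta_n)
    for i in range(n - j + 2, n + 1):
        val *= 1 - e(i)
    for i in range(1, n - j + 1):            # runs: eta_i, j-1 zeros, eta_{i+j}
        term = e(i) * e(i + j)
        for t in range(i + 1, i + j):
            term *= 1 - e(t)
        val += term
    return val

n = 6
for tail in product([0, 1], repeat=n - 1):
    eta = (1,) + tail                        # eta_1 = 1 with probability one
    xs = [x_nj(eta, j) for j in range(1, n + 1)]
    assert sum(j * x for j, x in enumerate(xs, start=1)) == n
    assert sum(xs) == sum(eta)
```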
1.4.3 Theorem. Denote

$$X_j = \sum_{i=1}^{\infty}\eta_i(1-\eta_{i+1})\cdots(1-\eta_{i+j-1})\eta_{i+j}.$$

If n → ∞, then P(X_{nj} → X_j) = 1 for each fixed j ∈ N.
Proof. First of all, we check that the r.v. X_j is correctly defined. Indeed,

$$\sum_{i=1}^{\infty}P\big(\eta_i(1-\eta_{i+1})\cdots(1-\eta_{i+j-1})\eta_{i+j}\ne 0\big) \le \sum_{i=1}^{\infty}P(\eta_i = 1,\ \eta_{i+j} = 1) = \sum_{i=1}^{\infty}\frac{1}{i(i+j)} < \infty.$$
Consequently, by the Borel-Cantelli lemma, the infinite series of r.vs converges almost surely (a.s.). Hence

$$R_{nj} := \sum_{i>n-j}\eta_i(1-\eta_{i+1})\cdots(1-\eta_{i+j-1})\eta_{i+j} \to 0 \quad (a.s.).$$

Moreover, if

$$Q_{nj} := \eta_{n-j+1}(1-\eta_{n-j+2})\cdots(1-\eta_n),$$

then

$$P(Q_{nj} = 1) \le P(\eta_{n-j+1} = 1) = \frac{1}{n-j+1} \to 0$$

for every j, as n → ∞. This implies that Q_{nj} → 0 a.s. Altogether,

$$|X_j - X_{nj}| \le R_{nj} + Q_{nj} \to 0 \quad (a.s.)$$
for every fixed j ≥ 1. The theorem is proved.

The next representation of the Poisson r.vs in terms of the Bernoulli ones is interesting from the probabilistic point of view.
1.4.1 Corollary. In the notation above,

$$(\xi_1, \xi_2, \ldots) \stackrel{d}{=} (X_1, X_2, \ldots).$$
Proof. It suffices to apply Theorems 1.2.1, 1.4.2, and 1.4.3.

Exercise. Using Theorem 1.4.1 and the Berry-Esseen bound for the convergence rate in the central limit theorem, find the convergence rate in Goncharov's theorem.
1.5 Estimation of the total variation distance
Recall the definition of the total variation distance between the distributions of r.vs X and Y which are (maybe) defined on different probability spaces (Ω_1, F_1, P_1) and (Ω_2, F_2, P_2). If their values range in a metric space M and 𝓜 is the Borel class of its subsets, then this distance is

$$d(X, Y) := d(\mathcal L(X), \mathcal L(Y)) := \sup_{B\in\mathcal M}|P_1(X\in B) - P_2(Y\in B)|.$$

For discrete r.vs taking values in Z_+^r, the distance equals

$$d(X, Y) = \frac{1}{2}\sum_{s\in\mathbb Z_+^r}|P_1(X = s) - P_2(Y = s)|.$$
It appears that the distributions of the vectors

$$k_r(\sigma) := (k_1(\sigma), \ldots, k_r(\sigma)) \qquad\text{and}\qquad \xi_r := (\xi_1, \ldots, \xi_r)$$

are close if r = o(n) as n → ∞. As previously, ξ_j, 1 ≤ j ≤ r ≤ n, are mutually independent Poisson r.vs with Eξ_j = 1/j. Set

$$d_r(n) := d\big(\mathcal L(k_r(\sigma)), \mathcal L(\xi_r)\big).$$

Seeking an estimate for d_r(n), we involve another metric.
1.5.1 Lemma. Let X_{nj} and X_j be the r.vs defined in the previous section. Then

$$d_r(n) \le \sum_{j\le r}E|X_{nj} - X_j| =: \hat d_r(n).$$
Proof. Denote

$$X_{nr} = (X_{n1}, \ldots, X_{nr}).$$

By Theorem 1.4.2, the distribution of this vector coincides with that of the vector k_r(σ). Moreover, Corollary 1.4.1 allows us to substitute X_r := (X_1, ..., X_r) for ξ_r. Hence

$$d_r(n) = \sup_{A\subset\mathbb Z_+^r}\big|P(X_{nr}\in A) - P(X_r\in A)\big| \le P(X_{nr}\ne X_r) \le P\Big(\sum_{j\le r}|X_{nj} - X_j|\ge 1\Big) \le \hat d_r(n).$$

The lemma is proved. The next result is fundamental for the theory of random permutations.
Fundamental Lemma. If 1 ≤ r ≤ n, then

$$d_r(n) \le \frac{2r}{n-r+1}.$$
Proof. By the just proved lemma, it suffices to deal with the metric $\hat d_r(n)$. In the notation of the previous section, we have

$$X_j - X_{nj} = R_{nj} - Q_{nj} = \sum_{i>n-j}\eta_i(1-\eta_{i+1})\cdots(1-\eta_{i+j-1})\eta_{i+j} - \eta_{n-j+1}(1-\eta_{n-j+2})\cdots(1-\eta_n).$$

Exploiting the independence of the factors and omitting some of them, we obtain

$$E|X_j - X_{nj}| \le \sum_{i>n-j}E\eta_i\,E\eta_{i+j} + E\eta_{n-j+1} = \sum_{i>n-j}\frac{1}{i(i+j)} + \frac{1}{n-j+1}$$

$$= \frac{1}{j}\sum_{i>n-j}\Big(\frac{1}{i} - \frac{1}{i+j}\Big) + \frac{1}{n-j+1} = \frac{1}{j}\sum_{n-j<i\le n}\frac{1}{i} + \frac{1}{n-j+1}$$

$$\le \frac{1}{n-j+1} + \frac{1}{n-j+1} \le \frac{2}{n-r+1}$$

for j ≤ r. Adding these inequalities, we complete the proof.

Of course, the just applied argument does not give an optimal estimate of d_r(n).
Better estimates can be found in [?]. For many purposes, it suffices to have only d_r(n) = o(1) for r = o(n) as n → ∞. Can one extend this range of r?
1.5.1 Theorem. Let n → ∞. The estimate of the total variation distance d_r(n) = o(1) holds if and only if r = o(n).
Proof. The sufficiency is established in the Fundamental Lemma. To prove the necessity, we will use the expression

$$d_r(n) = \sup_{A\subset\mathbb Z_+^r}\big|\nu_n(k_r(\sigma)\in A) - P(\xi_r\in A)\big| = \sup_{A\subset\mathbb Z_+^r}\big|P(\xi_r\in A\mid\zeta = n) - P(\xi_r\in A)\big|.$$

Here, as previously, ζ = 1ξ_1 + ⋯ + nξ_n. It remains to choose a bad set A ⊂ Z_+^r. Take

$$A = \{k_r\in\mathbb Z_+^r :\ 1k_1 + \cdots + rk_r > n\}.$$

The conditional probability now equals zero. Hence
$$d_r(n) \ge P\Big(\sum_{j\le r}j\xi_j > n\Big) \ge P\Big(\frac{r}{2}\sum_{r/2<j\le r}\xi_j > n\Big) = P(Y > 2n/r),$$

where Y is a Poisson r.v. with the parameter

$$\sum_{r/2<j\le r}\frac{1}{j} \sim \log 2, \qquad r\to\infty.$$

If r = 2εn and ε > 0 is fixed, then

$$P(Y > \varepsilon^{-1}) \ge c(\varepsilon) > 0.$$
The theorem is proved.

Summarizing this section, we stress once more that the first coordinates in the cycle structure vector, that is, k_r(σ) with r = o(n), allow an approximation by the vector ξ_r with independent coordinates. What could we do with the remaining components, corresponding to the long cycles? In the next section we include an approach proposed by the author in [?]. The very idea goes back to I.Z. Ruzsa's paper in probabilistic number theory.
1.6 Estimates of conditional probabilities
The purpose of this section is to find upper estimates of the conditional probabilities appearing in the relation

$$\nu_n(k(\sigma)\in A) = P(\xi\in A\mid\ell(\xi) = n), \qquad A\subset\mathbb Z_+^n,$$

which has been proved in Theorem 1.2.2, in terms of unconditional probabilities. Here, as previously, ℓ(k) = 1k_1 + ⋯ + nk_n for k = (k_1, ..., k_n) ∈ Z_+^n. It is essential to consider all components of the vector k(σ). The idea is to extend the set A, but not up to the whole Z_+^n, which would be trivial.

We will exploit the geometric interpretation of the semi-lattice Z_+^n; therefore, it is convenient to examine a probability space (Z_+^n, 𝒵_n, μ), where μ is a probability measure defined on the class 𝒵_n of all subsets of Z_+^n. Later on, μ will be the distribution of the Poissonian vector ξ, and we will return to the space (Ω, F, P) just applying the relation

$$\mu(A) = P(\{\omega :\ \xi(\omega)\in A\}), \qquad A\subset\mathbb Z_+^n. \qquad (1.12)$$
Let us introduce some notation. Set

$$k_i = (k_{i1}, k_{i2}, \ldots, k_{in})\in\mathbb Z_+^n, \qquad i = 1, 2, \ldots,$$

$$\Omega_m := \{k\in\mathbb Z_+^n :\ \ell(k) = m\}, \qquad 0\le m\le n,$$

and

$$e_j := (0, \ldots, 1, \ldots, 0),$$

where 1 stands at the j-th place, j = 1, ..., n. The relation

$$k_{11}k_{21} + k_{12}k_{22} + \cdots + k_{1n}k_{2n} = 0$$

will be denoted by k_1 ⊥ k_2; the very vectors will be called orthogonal. We will write k_1 ≤ k_2 if k_{1j} ≤ k_{2j} for each 1 ≤ j ≤ n. We will use the notation k_1 ∥ k_2 for the two relations:

$$k_1 \le k_2, \qquad k_1 \perp k_2 - k_1.$$
In the last case, we will say that k1 exactly enters into the vector k2.Let us a bit specify the measure µ. Take arbitrary distributions 0 ≤ pj(k) ≤ 1,
where k ≥ 0 and pj(0) + pj(1) + · · · = 1, on Z+ and define µ on Zn+ by
µ(k) := µ(k) =n∏j=1
pj(kj) .
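For experiments, the relations just introduced are easy to implement. The following Python sketch is ours, not part of the text; `ell`, `orth`, `leq`, and `exactly_enters` are hypothetical helper names, and vectors are tuples indexed from $0$ (so position $j$ of the text corresponds to index $j - 1$).

```python
def ell(k):
    """l(k) = 1*k_1 + 2*k_2 + ... + n*k_n."""
    return sum((j + 1) * kj for j, kj in enumerate(k))

def orth(k1, k2):
    """k1 is orthogonal to k2: their supports are disjoint."""
    return all(a * b == 0 for a, b in zip(k1, k2))

def leq(k1, k2):
    """k1 <= k2 coordinatewise."""
    return all(a <= b for a, b in zip(k1, k2))

def exactly_enters(k1, k2):
    """k1 exactly enters k2: k1 <= k2 and k1 is orthogonal to k2 - k1."""
    return leq(k1, k2) and orth(k1, [b - a for a, b in zip(k1, k2)])
```

For instance, $(0,2,0)$ exactly enters $(0,2,1)$, while $(0,1,0)$ does not exactly enter $(0,2,0)$, since the difference is supported on the same coordinate.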
In this notation, our purpose is to estimate the conditional probabilities $\mu(U \,|\, \Omega_n)$ by unconditional ones. Observe that only the case $P_n := \mu(\Omega_n) = o(1)$ as $n \to \infty$ is nontrivial. A delicate extension of an event $U \subset \mathbb{Z}_+^n$ is defined as follows:
\[
V = V(U) := \{k = k_1 + k_2 - k_3\colon\ k_1, k_2, k_3 \in U,\ k_1 \perp k_2 - k_3,\ k_3 \parallel k_2\}.
\]
For an arbitrary event $U \subset \mathbb{Z}_+^n$, set $\overline{U} = \mathbb{Z}_+^n \setminus U$. For brevity, the next result is formulated for the complementary events.
1.6.1 Theorem. Let $n \ge 1$ and let there exist positive constants $c$, $c_1$, $C_1$, and $C_2$ such that

(i) $p_j(0) \ge c$ for all $1 \le j \le n$;

(ii) $\mu(\Omega_m) \le C_1 P_n$ for all $0 \le m \le n - 1$;

(iii) $P_n \ge c_1 n^{-1}$;

(iv) $\displaystyle\sum_{\substack{jk = m \\ k \ge 1}} \frac{p_j(k)}{p_j(0)} \le \frac{C_2}{m}$ for all $1 \le m \le n$.

Then
\[
\mu\big(\overline{V} \,\big|\, \Omega_n\big) \le C\,\mu\big(\overline{U}\big).
\]
Here $C > 0$ is a constant depending only on the constants in the conditions.
We start with the following technical lemma.
1.6.1 Lemma. Let condition (i) be satisfied. Denote $\pi = r e_j$ with arbitrary $r \ge 1$ and $j \le n$, and set $q(\pi) = p_j(r)/p_j(0)$,
\[
Q_n = \{\pi\colon\ \ell(\pi) = jr \le n\},
\]
\[
Q' = \{\pi \in Q_n\colon\ \exists\, k \in U,\ k \perp \pi,\ \pi + k \in U\},
\]
and $Q'' = Q_n \setminus Q'$. If $\mu(\overline{U}) < c^2/32$, then
\[
\sum_{\pi \in Q''} q(\pi) \le 4c^{-1}\mu\big(\overline{U}\big).
\]

Proof. The essence of the claim is rather clear. If the probability $\mu(U)$ is large, then the set $Q'$ is also large; hence the complement $Q''$ should be small. The claim expresses this observation quantitatively.
Let $\pi = r e_j \in Q''$, where $r \ge 1$ and $jr \le n$, and define
\[
W_\pi = \{l = \pi + k\colon\ k \in U,\ k \perp \pi\}.
\]
Since $\pi \notin Q'$, we have $\pi + k \in \overline{U}$ for every $k \in U$ with $k \perp \pi$; hence $W_\pi \subset \overline{U}$. If $\pi \perp k$, then $k_j = 0$. Hence
\[
\mu(\pi + k) = p_j(r)\prod_{\substack{i \le n \\ i \ne j}} p_i(k_i)
= \Big(\frac{p_j(r)}{p_j(0)}\Big)\Big(p_j(0)\prod_{\substack{i \le n \\ i \ne j}} p_i(k_i)\Big)
= q(\pi)\,\mu(k). \tag{1.13}
\]
Consequently,
\[
\mu\big(\overline{U}\big) \ge \mu(W_\pi) = q(\pi)\sum_{k \in U,\ k \perp \pi} \mu(k)
\ge q(\pi)\Big(\sum_{k \in U} \mu(k) - \sum_{k\colon\, k_j \ge 1} \mu(k)\Big)
\]
\[
= q(\pi)\big(1 - \mu(\overline{U}) - (1 - p_j(0))\big)
\ge c\,q(\pi)\big(1 - c/32\big) > q(\pi)\,c/2 \tag{1.14}
\]
and
\[
q(\pi) \le 2c^{-1}\mu\big(\overline{U}\big) \le c/8. \tag{1.15}
\]
If $\pi_1 = r_1 e_i \ne \pi$, $\pi_1 \in Q''$, then $W_\pi \cap W_{\pi_1} = \emptyset$ for $i = j$, and
\[
W_\pi \cap W_{\pi_1} \subset \{l = \pi + \pi_1 + k\colon\ k \in \mathbb{Z}_+^n,\ k \perp \pi,\ k \perp \pi_1\}
\]
for $i \ne j$. Hence, in either case, from (1.13) we obtain
\[
\mu\big(W_\pi \cap W_{\pi_1}\big) \le q(\pi)q(\pi_1)\sum_{k \in \mathbb{Z}_+^n} \mu(k) = q(\pi)q(\pi_1). \tag{1.16}
\]
Take an arbitrary subset $Q \subset Q''$ and denote
\[
W = \bigcup_{\pi \in Q} W_\pi, \qquad a := a(Q) = \sum_{\pi \in Q} q(\pi).
\]
Since $W \subset \overline{U}$, we obtain from (1.14) and (1.16) that
\[
\mu\big(\overline{U}\big) \ge \mu(W) \ge \sum_{\pi \in Q} \mu(W_\pi) - \sum_{\substack{\pi, \pi_1 \in Q \\ \pi \ne \pi_1}} \mu\big(W_\pi \cap W_{\pi_1}\big)
\ge a\Big(\frac{c}{2} - a\Big) \ge \frac{ca}{4}, \tag{1.17}
\]
provided that $a \le c/4$. In particular, if $a(Q'') \le c/4$, then taking $Q = Q''$ in (1.17) we complete the proof.

Assume that $a(Q'') > c/4$ and choose $Q$ to be a maximal subset of $Q''$ such that $a(Q) \le c/4$. Further, take a $\pi'$ from the nonempty set $Q'' \setminus Q$. We have $q(\pi') + a(Q) > c/4$; therefore, (1.17) and (1.15) imply
\[
\mu\big(\overline{U}\big) \ge \frac{c\,a(Q)}{4} > \frac{c}{4}\Big(\frac{c}{4} - q(\pi')\Big) \ge \frac{c^2}{32}.
\]
This contradicts the assumption of the lemma. Lemma 1.6.1 is proved.

Proof of Theorem 1.6.1. It suffices to examine the case $\mu(\overline{U}) \le c^2/32$; otherwise the claim holds with $C = 32c^{-2}$.

A vector $l \in \overline{V} \cap \Omega_n$ is nonzero. It admits a decomposition $l = \pi + k$ with some $\pi = r e_j$, $r \ge 1$, $1 \le \ell(\pi) = jr \le n$, where $\pi \perp k$. We have
\[
\ell(\pi + k) = \ell(\pi) + \ell(k) = n.
\]
Moreover, $\pi \in Q''$ or $k \in \overline{U}$. Indeed, in the converse case, by the definition of $Q'$, there would exist a vector $k_1 \in U$ such that $k_1 \perp \pi$, $\pi + k_1 \in U$, and
\[
l = k + (\pi + k_1) - k_1 \in V.
\]
This is impossible. Now, the above notation, the equality
\[
\sum_{\pi \parallel l} \ell(\pi) = n \qquad \text{for } l \in \Omega_n,
\]
and (1.13) imply
\[
P_n\,\mu\big(\overline{V}\,\big|\,\Omega_n\big)
= \frac{1}{n}\sum_{l \in \overline{V} \cap \Omega_n} \mu(l)\sum_{\pi \parallel l} \ell(\pi)
= \frac{1}{n}\sum_{\substack{\pi + k \in \overline{V} \cap \Omega_n \\ \pi \perp k}} q(\pi)\,\ell(\pi)\,\mu(k)
\]
\[
\le \sum_{\pi \in Q''} q(\pi)\sum_{\ell(k) = n - \ell(\pi)} \mu(k)
+ \frac{1}{n}\sum_{k \in \overline{U}} \mu(k)\big(n - \ell(k)\big)\sum_{\ell(\pi) = n - \ell(k)} q(\pi)
=: \Sigma_1 + \Sigma_2. \tag{1.18}
\]
Using Lemma 1.6.1 and the conditions, we obtain
\[
\Sigma_1 = \sum_{\pi \in Q''} q(\pi)\,\mu\big(\Omega_{n - \ell(\pi)}\big) \le \frac{4C_1}{c}\,P_n\,\mu\big(\overline{U}\big)
\]
and
\[
\Sigma_2 \le \frac{C_2}{n}\,\mu\big(\overline{U}\big) \le \frac{C_2}{c_1}\,P_n\,\mu\big(\overline{U}\big).
\]
Inserting the last two estimates into (1.18) and dividing by $P_n$, we have
\[
\mu\big(\overline{V}\,\big|\,\Omega_n\big) \le \Big(\frac{4C_1}{c} + \frac{C_2}{c_1}\Big)\mu\big(\overline{U}\big).
\]
Recalling the already examined case, we see that the claim of Theorem 1.6.1 holds with
\[
C = \max\Big\{\frac{32}{c^2},\ \frac{4C_1}{c} + \frac{C_2}{c_1}\Big\}. \tag{1.19}
\]
The theorem is proved.
1.6.1 Corollary. Theorem 1.6.1 holds for
\[
p_j(k) = e^{-1/j}\,\frac{1}{j^k k!}, \qquad 1 \le j \le n,\ k \ge 0. \tag{1.20}
\]
Proof. We have to verify the conditions of the last theorem. Condition (i) holds with $c = e^{-1}$, since $p_j(0) = e^{-1/j} \ge e^{-1}$.

The relation $\ell(k) = m \le n$ implies $k_j = 0$ for $m < j \le n$. Hence Cauchy's formula yields
\[
\mu(\Omega_m) = \sum_{\ell(k) = m}\ \prod_{j=1}^{n} \frac{e^{-1/j}}{j^{k_j}k_j!}
= \exp\Big\{-\sum_{j=1}^{n} \frac{1}{j}\Big\}\sum_{\ell(k) = m}\ \prod_{j=1}^{m} \frac{1}{j^{k_j}k_j!}
= \exp\Big\{-\sum_{j=1}^{n} \frac{1}{j}\Big\} = P_n
\]
for each $0 \le m \le n$ (the inner sum equals $1$ by Cauchy's formula). Consequently, (ii) and (iii) hold with $C_1 = 1$ and $c_1 = e^{-1}$. Finally,
\[
\sum_{jk = m} \frac{1}{j^k k!}
\le \frac{2}{m} + \frac{1}{m}\sum_{\substack{k \mid m \\ 2 \le k \le m/2}} \frac{1}{(k - 1)!}\Big(\frac{k}{m}\Big)^{k-1}
\le \frac{2}{m} + \frac{1}{m}\sum_{k=0}^{\infty} \frac{1}{k!\,2^k}
\le \frac{2}{m} + \frac{e^{1/2}}{m}.
\]
We can take $C_2 = 4$ in condition (iv). The corollary is proved.

Exercises. 1. Improve the estimate of $C$ given in (1.19). Our result for the Poisson r.vs in the corollary gives only $C = 32e^2$.

2. Generalize conditions (i)-(iv) of Theorem 1.6.1. The attempt made in [?] seems to be unsatisfactory.
24 I. ELEMENTARY THEORY
1.7 Additive functions and the tail probabilities
In Sections 1.5 and 1.6, we have developed the tools necessary to investigate the value distribution of mappings defined on $S_n$ via $k(\sigma)$. We now confine ourselves to one class of such mappings.

Given a two-dimensional real array $h_j(k)$, where $k \in \mathbb{Z}_+$ and $j \in N_n$, satisfying the condition $h_j(0) \equiv 0$, we define an additive function $h\colon S_n \to \mathbb{R}$ by
\[
h(\sigma) = \sum_{j=1}^{n} h_j\big(k_j(\sigma)\big). \tag{1.21}
\]
Evidently, the definition extends to additive functions $h\colon S_n \to G$, where $(G, +)$ is an abelian group. If $h_j(k) = k a_j$ for some $a_j \in \mathbb{R}$, for each $k \in \mathbb{Z}_+$ and $j \in N_n$, such a function is called completely additive. The simplest instance of this class, $h(\sigma) = w(\sigma)$, has been examined in Goncharov's theorem. If $h_j(k) = \mathbf{1}\{k \ge 1\}$, the corresponding additive function $h(\sigma)$ counts the number of different cycle lengths in $\sigma$.

Let $\sigma_0$ denote the identity permutation in $S_n$. The group-theoretical order of $\sigma \in S_n$,
\[
\mathrm{Ord}(\sigma) := \min\{k \ge 1\colon\ \sigma^k = \sigma_0\} = \mathrm{l.c.m.}\{j \le n\colon\ k_j(\sigma) \ge 1\},
\]
where l.c.m.\ denotes the least common multiple, is not additive (nor is its logarithm). Nevertheless, it is known (see, for instance, [?], [?], or [?]) that $\log \mathrm{Ord}(\sigma)$ can be approximated by the additive function
\[
h(\sigma) = \sum_{j \le n} \mathbf{1}\{k_j(\sigma) \ge 1\}\log j
\]
for all $\sigma \in S_n$ apart from a small subset. Even the function $\ell(\sigma) \equiv n$ is completely additive in our sense.
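For experimentation, an additive function is easy to evaluate from the cycle structure vector. The sketch below is ours, not the author's code; `cycle_vector` and `additive` are hypothetical helper names, and permutations act on $\{0, \ldots, n-1\}$ in one-line notation.

```python
from math import log

def cycle_vector(perm):
    """perm: a bijection of {0,...,n-1}; returns [k_1(sigma),...,k_n(sigma)]."""
    n = len(perm)
    k, seen = [0] * n, [False] * n
    for i in range(n):
        if not seen[i]:
            length, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                length += 1
            k[length - 1] += 1
    return k

def additive(perm, h):
    """h(j, k) plays the role of h_j(k); h(j, 0) must equal 0."""
    return sum(h(j + 1, kj) for j, kj in enumerate(cycle_vector(perm)))

# sigma = (0 1 2)(3 4) in one-line 0-based notation: cycle lengths 3 and 2
sigma = [1, 2, 0, 4, 3]
approx = additive(sigma, lambda j, k: log(j) if k >= 1 else 0.0)
```

For this $\sigma$ the represented lengths $2$ and $3$ are coprime, so $\mathrm{Ord}(\sigma) = 6$ and the additive approximation equals $\log 2 + \log 3 = \log \mathrm{Ord}(\sigma)$ exactly; in general the two quantities differ.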
To avoid repetitions, we now present a few corollaries of the results obtained in the previous section, applied to an additive function $h\colon S_n \to G$, where $(G, +)$ is an abelian group.

Let $h(\sigma)$ be defined via $h_j(k) \in G$, $j, k \ge 1$. On the set $\mathbb{Z}_+^n$, we can define an associated additive function $H\colon \mathbb{Z}_+^n \to G$ by setting
\[
H(k) := \sum_{j=1}^{n} h_j(k_j),
\]
where $k = (k_1, \ldots, k_n)$. As in the last section, we consider the elementary events taken with the probabilities
\[
\mu(k) = \exp\Big\{-\sum_{j=1}^{n} \frac{1}{j}\Big\}\prod_{j=1}^{n} \frac{1}{j^{k_j}k_j!}.
\]
Now $H(k)$ is a sum of $G$-valued r.vs. Let, as previously, $\xi = (\xi_1, \ldots, \xi_n)$ be the random vector with independent Poisson coordinates, $E\xi_j = 1/j$ for $1 \le j \le n$.
1.7.1 Lemma. If $h\colon S_n \to G$ and $H\colon \mathbb{Z}_+^n \to G$ are additive functions, $B \subset G$, and $\Omega_n = \{k \in \mathbb{Z}_+^n\colon\ \ell(k) = n\}$, then
\[
\nu_n\big(h(\sigma) \in B\big) = \mu\big(H(k) \in B \,\big|\, \Omega_n\big). \tag{1.22}
\]

Proof. Theorem 1.2.2 and relation (1.12) imply
\[
\nu_n\big(h(\sigma) \in B\big) = \sum_{H(k) \in B} \nu_n\big(k(\sigma) = k\big)
= \sum_{H(k) \in B} P\big(\xi = k \,\big|\, \ell(\xi) = n\big)
= P\big(H(\xi) \in B \,\big|\, \ell(\xi) = n\big) = \mu\big(H(k) \in B \,\big|\, \Omega_n\big).
\]
The lemma is proved.

Theorem 1.6.1 applies to the conditional probability on the right-hand side of (1.22). For an arbitrary $A \subset G$, we denote
\[
A + A - A = \{b = a_1 + a_2 - a_3\colon\ a_1, a_2, a_3 \in A\}.
\]
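Identity (1.22) ultimately rests on Theorem 1.2.2: for $\ell(k) = n$, the conditional probability $P(\xi = k \,|\, \ell(\xi) = n)$ reduces to $\prod_j 1/(j^{k_j}k_j!)$, because the factors $e^{-1/j}$ and the normalization cancel (by Cauchy's formula, the latter products sum to $1$ over $\ell(k) = n$), and this is exactly the frequency of the cycle type $k$ in $S_n$. A brute-force check for $n = 5$ (our illustration, not part of the text):

```python
from itertools import permutations
from math import factorial
from collections import Counter

n = 5
counts = Counter()
for p in permutations(range(n)):
    # cycle type of p
    k, seen = [0] * n, [False] * n
    for i in range(n):
        if not seen[i]:
            ln, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = p[j]
                ln += 1
            k[ln - 1] += 1
    counts[tuple(k)] += 1

# frequency of each cycle type equals prod_j 1/(j^{k_j} k_j!)
for k, cnt in counts.items():
    cond = 1.0
    for j, kj in enumerate(k, start=1):
        cond /= j ** kj * factorial(kj)
    assert abs(cnt / factorial(n) - cond) < 1e-12
```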
1.7.1 Theorem. If $A \subset G$ and $h(\sigma)$ is an additive $G$-valued function, then
\[
\nu_n\big(h(\sigma) \notin A + A - A\big) \le C\,\mu\big(H(k) \notin A\big) = C\,P\big(H(\xi) \notin A\big).
\]
Here $C > 0$ is the constant given in Corollary 1.6.1.

Proof. We apply Theorem 1.6.1 to
\[
U := \{k\colon\ H(k) \in A\}.
\]
Examine the extension $V = V(U)$ defined in Section 1.6. If
\[
l = k_1 + k_2 - k_3, \qquad k_1, k_2, k_3 \in U,\ k_1 \perp k_2 - k_3,\ k_3 \parallel k_2,
\]
then, using the additivity, we obtain
\[
H(l) = H(k_1) + H(k_2) - H(k_3).
\]
Hence
\[
V \subset \{k\colon\ H(k) \in A + A - A\}.
\]
For the complements of these sets, the converse inclusion holds; therefore, by Theorem 1.6.1 and Lemma 1.7.1,
\[
\nu_n\big(h(\sigma) \notin A + A - A\big) = \mu\big(H(k) \notin A + A - A \,\big|\, \Omega_n\big)
\le \mu\big(\overline{V}\,\big|\,\Omega_n\big) \le C\,\mu\big(\overline{U}\big) = C\,\mu\big(H(k) \notin A\big).
\]
The claim is proved.

We now examine the cases $G = \mathbb{R}$ and $G = \mathbb{R}^n$.
1.7.2 Theorem. Let $h\colon S_n \to \mathbb{R}$ be an additive function, and let $a \in \mathbb{R}$ and $u \ge 0$ be arbitrary numbers. Then
\[
\nu_n\big(|h(\sigma) - a| \ge 3u\big) \le C\,P\big(|H(\xi) - a| \ge u\big).
\]

Proof. The probability on the right-hand side equals
\[
\mu\big(|H(k) - a| \ge u\big).
\]
If
\[
A = \{x \in \mathbb{R}\colon\ |x - a| < u\},
\]
then every $x \in A + A - A$ has the form $x = x_1 + x_2 - x_3$ with $x_i \in A$, $i = 1, 2, 3$. Moreover,
\[
|x - a| \le |x_1 - a| + |x_2 - a| + |x_3 - a| < 3u
\]
and
\[
A + A - A \subset \{x\colon\ |x - a| < 3u\}.
\]
We now obtain the desired inequality from Theorem 1.7.1:
\[
\nu_n\big(|h(\sigma) - a| \ge 3u\big) \le \nu_n\big(h(\sigma) \notin A + A - A\big) \le C\,\mu\big(|H(k) - a| \ge u\big).
\]
The theorem is proved.
1.7.3 Theorem. Let $h_t\colon S_n \to \mathbb{R}$, $t \in T \subset \mathbb{R}$, be a family of additive functions defined by (1.21) via $h_j(k) = h_{jt}(k)$, and let $H_t(k)$ be the corresponding family of associated additive functions on $\mathbb{Z}_+^n$. For arbitrary $b_t \in \mathbb{R}$, $t \in T$, and $u \ge 0$, we have
\[
\nu_n\Big(\sup_{t \in T}|h_t(\sigma) - b_t| \ge 3u\Big) \le C\,P\Big(\sup_{t \in T}|H_t(\xi) - b_t| \ge u\Big).
\]

Proof. It suffices to repeat the argument given in the proof of the previous theorem. One has only to substitute the additive group of real functions defined on $T$ for $\mathbb{R}$ and to use the supremum norm instead of the absolute value.

1.7.1 Corollary. For an arbitrary real two-dimensional array $h_j(k)$ satisfying the condition $h_j(0) = 0$, a sequence of real numbers $b_1, \ldots, b_n$, and $u \ge 0$, we have
\[
\nu_n\Big(\max_{1 \le m \le n}\Big|\sum_{j \le m} h_j\big(k_j(\sigma)\big) - b_m\Big| \ge 3u\Big)
\le C\,P\Big(\max_{1 \le m \le n}\Big|\sum_{j \le m} h_j(\xi_j) - b_m\Big| \ge u\Big).
\]
Proof. Apply Theorem 1.7.3 to the family of truncated additive functions
\[
h(\sigma, m) := \sum_{j \le m} h_j\big(k_j(\sigma)\big), \qquad m = 1, 2, \ldots, n.
\]
The corollary is proved.

Exercise. Prove an analog of Kolmogorov's inequality:
\[
\nu_n\Big(\max_{m \le n}\Big|\sum_{j \le m} h_j\big(k_j(\sigma)\big) - \sum_{\substack{j \le m \\ k \le n/j}} \frac{h_j(k)}{j^k k!}\Big| \ge u\Big)
\le C_1 u^{-2}\sum_{jk \le n} \frac{|h_j(k)|^2}{j^k k!},
\]
where $u > 0$ is an arbitrary number and $C_1 > 0$ is an absolute constant.
1.8 Moments of additive functions
Let $h\colon S_n \to \mathbb{R}$ be an additive function. We now examine the moments
\[
E_n|h(\sigma) - A_n|^s = \frac{1}{n!}\sum_{\sigma \in S_n} |h(\sigma) - A_n|^s,
\]
where $s > 0$ and $A_n$ is an appropriate centralizing sequence. To warm up, we start with the variance of a completely additive function
\[
h(\sigma) = \sum_{j=1}^{n} a_j k_j(\sigma),
\]
where $a_j \in \mathbb{R}$, $1 \le j \le n$, are arbitrary numbers.

1.8.1 Theorem. For the just defined completely additive function $h(\sigma)$, set
\[
A_n := A_n(h) := \sum_{j=1}^{n} \frac{a_j}{j}.
\]
Then, for all $n \ge 1$,
\[
E_n\big(h(\sigma) - A_n\big)^2 \le 2\sum_{j=1}^{n} \frac{a_j^2}{j}. \tag{1.23}
\]
Proof. Firstly, recall the relations obtained in Theorem 1.2.5:
\[
E_n k_j(\sigma) = \frac{1}{j}
\]
and
\[
E_n k_j^2(\sigma) = E_n k_j(\sigma)\big(k_j(\sigma) - 1\big) + E_n k_j(\sigma)
= \mathbf{1}\{2j \le n\}\frac{1}{j^2} + \frac{1}{j}
\]
for each $j \le n$, and
\[
E_n k_i(\sigma)k_j(\sigma) = \mathbf{1}\{i + j \le n\}\frac{1}{ij}
\]
for each pair $i, j \le n$, $i \ne j$. Hence
\[
E_n h(\sigma) = \sum_{j=1}^{n} a_j E_n k_j(\sigma) = A_n
\]
and
\[
E_n h(\sigma)^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j E_n\big(k_i(\sigma)k_j(\sigma)\big)
= \sum_{j=1}^{n} a_j^2 E_n k_j(\sigma)^2 + \sum_{\substack{i, j \le n \\ i \ne j}} a_i a_j E_n\big(k_i(\sigma)k_j(\sigma)\big)
\]
\[
= \sum_{j \le n} \frac{a_j^2}{j} + \sum_{j \le n/2} \frac{a_j^2}{j^2} + \sum_{\substack{i + j \le n \\ i \ne j}} \frac{a_i a_j}{ij}
= \sum_{j \le n} \frac{a_j^2}{j} + \sum_{i + j \le n} \frac{a_i a_j}{ij}.
\]
Consequently,
\[
E_n\big(h(\sigma) - A_n\big)^2 = E_n h(\sigma)^2 - A_n^2
= \sum_{j \le n} \frac{a_j^2}{j} - \sum_{\substack{i, j \le n \\ i + j > n}} \frac{a_i a_j}{ij}. \tag{1.24}
\]
If the $a_j$, $1 \le j \le n$, are nonnegative, then, omitting the last (nonpositive) term in (1.24), we see that the claim of Theorem 1.8.1 holds even with the constant one.

If the $a_j$ take both positive and negative values, we can separate them. Set $u^+ = u$ if $u \ge 0$, $u^+ = 0$ if $u < 0$, and $u^- = u - u^+$. The function $h(\sigma)$ decomposes into the sum $h^+(\sigma) + h^-(\sigma)$, where $h^{\pm}(\sigma)$ are the completely additive functions defined by the sequences $a_j^{\pm}$, respectively. By the same argument, $A_n = A_n^+ + A_n^-$, where $A_n^{\pm} = A_n(h^{\pm})$. Since
\[
E_n\big(h^{\pm}(\sigma) - A_n^{\pm}\big)^2 \le \sum_{j=1}^{n} \frac{(a_j^{\pm})^2}{j},
\]
applying the inequality $(a + b)^2 \le 2a^2 + 2b^2$, $a, b \in \mathbb{R}$, and the equality $(a_j^+)^2 + (a_j^-)^2 = a_j^2$, we complete the proof of the theorem.
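Inequality (1.23) can be confirmed by exact enumeration for small $n$. The check below is our illustration (not part of the text), with an arbitrary sign-mixed sequence $a_j$:

```python
from itertools import permutations
from math import factorial

n = 6
a = [1.0, -2.0, 3.0, -1.0, 0.5, -0.5]            # arbitrary test values a_1..a_6
A_n = sum(a[j] / (j + 1) for j in range(n))      # A_n = sum a_j / j

total = 0.0
for p in permutations(range(n)):
    # cycle structure vector of p
    k, seen = [0] * n, [False] * n
    for i in range(n):
        if not seen[i]:
            ln, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = p[j]
                ln += 1
            k[ln - 1] += 1
    h = sum(a[j] * k[j] for j in range(n))
    total += (h - A_n) ** 2

variance = total / factorial(n)
assert variance <= 2 * sum(a[j] ** 2 / (j + 1) for j in range(n)) + 1e-9
```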
Inequality (1.23) may be attributed to the theory of quadratic forms. Denoting $a_j = x_j\sqrt{j}$ and
\[
\gamma(j, \sigma) = \sqrt{j}\Big(k_j(\sigma) - \frac{1}{j}\Big), \qquad \sigma \in S_n,\ 1 \le j \le n,
\]
we can rewrite it as
\[
\sum_{\sigma \in S_n}\Big(\sum_{j \le n} \gamma(j, \sigma)x_j\Big)^2 \le 2\,n!\sum_{j \le n} x_j^2, \tag{1.25}
\]
where $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$.

For brevity, one can use the matrix language. Set $X = (x_1, \ldots, x_n)$ and define the matrix $K_n = ((\gamma(j, \sigma)))$, where the columns are indexed by the permutations $\sigma$ written in an arbitrary fixed order. The dimension of $K_n$ is $n \times n!$. Let $'$ denote transposition of matrices and vectors. Calculating the Euclidean norms, we obtain
\[
\|XK_n\|^2 = (XK_n)(XK_n)' = X(K_nK_n')X'
= \sum_{\sigma \in S_n}\Big(\sum_{j \le n} \gamma(j, \sigma)x_j\Big)^2. \tag{1.26}
\]
Comparing this with inequality (1.25), we obtain a matrix reformulation of Theorem 1.8.1.

1.8.2 Theorem. For each vector $X \in \mathbb{R}^n$,
\[
\|XK_n\|^2 = X(K_nK_n')X' \le 2\,n!\,\|X\|^2.
\]
In addition, this shows that the maximal eigenvalue of the matrix $K_nK_n'$, which has dimension $n \times n$, grows not faster than $2\,n!$ as $n \to \infty$. It is proved (see [?]) that the constant $2$ cannot be replaced by a constant less than $3/2$. In 2018 the author and his student J. Klimavicius established the sharp inequality, replacing $2$ by $3/2$.
Continuing the investigation of the quadratic forms, we will derive a useful combinatorial result. In the sequel, we will need a complex version of the above inequalities.

1.8.1 Lemma. Let $Q = ((q_{ij}))$, where $1 \le i \le m$ and $1 \le j \le n$, be a real matrix and $\lambda > 0$. The inequality
\[
\sum_{j=1}^{n}\Big|\sum_{i=1}^{m} q_{ij}x_i\Big|^2 \le \lambda\sum_{i=1}^{m}|x_i|^2
\]
holds for all vectors $X = (x_1, \ldots, x_m) \in \mathbb{C}^m$ if and only if the inequality
\[
\sum_{i=1}^{m}\Big|\sum_{j=1}^{n} q_{ij}y_j\Big|^2 \le \lambda\sum_{j=1}^{n}|y_j|^2
\]
holds for all $Y = (y_1, \ldots, y_n) \in \mathbb{C}^n$.
Proof. Using the argument applied above, we can rewrite the inequalities of the lemma in the matrix form
\[
\|XQ\|^2 \le \lambda\|X\|^2 \tag{1.27}
\]
and
\[
\|YQ'\|^2 \le \lambda\|Y\|^2. \tag{1.28}
\]
Let us denote the scalar product by $\langle \cdot, \cdot \rangle$. It follows from Cauchy's inequality and (1.27) that
\[
|\langle Y, XQ \rangle|^2 \le \|Y\|^2\|XQ\|^2 \le \lambda\|Y\|^2\|X\|^2.
\]
The square of the scalar product on the left-hand side equals
\[
\big|Y\,\overline{(XQ)}'\big|^2 = \big|YQ'\,\overline{X}'\big|^2,
\]
where the bar stands for the componentwise complex conjugate and we have used the fact that $Q$ is real. Hence
\[
\big|YQ'\,\overline{X}'\big|^2 \le \lambda\|Y\|^2\|X\|^2 \tag{1.29}
\]
for arbitrary $X \in \mathbb{C}^m$ and $Y \in \mathbb{C}^n$. Choosing $X = YQ'$, we obtain from inequality (1.29) that
\[
\big|YQ'\,\overline{(YQ')}'\big|^2 = \|YQ'\|^4 \le \lambda\|Y\|^2\|YQ'\|^2.
\]
If $\|YQ'\| = 0$, inequality (1.28) is trivial. Otherwise, dividing by $\|YQ'\|^2$, we obtain inequality (1.28).

By the same argument, one derives (1.27) from inequality (1.28). The lemma is proved.
1.8.3 Theorem. Let $y(\sigma) \in \mathbb{C}$, $\sigma \in S_n$, be arbitrary. Then
\[
\sum_{j \le n} j\,\Big|\sum_{\sigma \in S_n} y(\sigma)\Big(k_j(\sigma) - \frac{1}{j}\Big)\Big|^2
\le 2\,n!\sum_{\sigma \in S_n} |y(\sigma)|^2. \tag{1.30}
\]

Proof. Apply the lemma with the matrix $Q = K_n$. Considering the real and imaginary parts separately, from (1.25) we have
\[
\sum_{\sigma \in S_n}\Big|\sum_{j \le n} \gamma(j, \sigma)x_j\Big|^2 \le 2\,n!\sum_{j=1}^{n}|x_j|^2
\]
for all $x_j \in \mathbb{C}$. Hence its dual, by Lemma 1.8.1, becomes
\[
\sum_{j \le n}\Big|\sum_{\sigma \in S_n} \gamma(j, \sigma)y(\sigma)\Big|^2 \le 2\,n!\sum_{\sigma \in S_n} |y(\sigma)|^2.
\]
It remains to use the definition of $\gamma(j, \sigma)$. The theorem is proved.

Recalling $E_n k_j(\sigma) = 1/j$, we see that inequality (1.30) is an estimate of a variance. Again, the constant $2$ cannot be replaced by $1$; this reflects the stochastic dependence of the $k_j(\sigma)$, $1 \le j \le n$.
The direct method just described becomes rather involved when estimating moments of higher order. There is an indirect approach based on the tail probability estimate given in the Corollary of Theorem 1.7.2. It reduces the problem to one for sums of independent r.vs. For brevity, we will use $\ll$ as an analog of $O(\cdot)$ with a constant depending on $s$ only.

Rosenthal's Inequality. Let $X_1, \ldots, X_n$ be independent r.vs, $EX_j = 0$, $E|X_j|^s < \infty$, $1 \le j \le n$, and $s \ge 2$. Then
\[
E\Big|\sum_{j \le n} X_j\Big|^s \ll \Big(\sum_{j \le n} EX_j^2\Big)^{s/2} + \sum_{j \le n} E|X_j|^s.
\]

The proof can be found in V.V. Petrov's book [?].
1.8.4 Theorem. Let $h(\sigma)$ be an additive function, $s \ge 2$, and
\[
E_n(h) := \sum_{jk \le n} \frac{h_j(k)}{j^k k!}, \qquad D_n(h; s) := \sum_{jk \le n} \frac{|h_j(k)|^s}{j^k k!}.
\]
Then
\[
E_n|h(\sigma) - E_n(h)|^s \ll D_n(h; s) + \big(D_n(h; 2)\big)^{s/2}.
\]
Proof. We apply Rozenthal’s inequality and the following formula
E|Y |s = s
∫ ∞0
ys−1P (|Y | ≥ u)du
1.8. MOMENTS OF ADDITIVE FUNCTIONS 31
which is valid for arbitrary r.v. Y having the s-th moment.The values hj(k), which define the additive function h(σ), for jk > n, do not
have influence to the considered quantity En|h(σ)−En(h)|s; therefore, we may takehj(k) = 0 if jk > n.
As previously, let ξj, 1 ≤ j ≤ n, be independent Poisson r.vs with Eξj = 1/j,
Xj = hj(ξj)− E(hj(ξj)) , En(h) =∑j≤n
Ehj(ξj) .
Denote Y = X1 + · · · + Xn. From the Corollary of Theorem 1.7.2 and Rozenthal’sinequality, we obtain
En|h(σ)− En(h)|s = s
∫ ∞0
us−1νn(|h(σ)− En(h)| ≥ u)du
≤ Cs
∫ ∞0
us−1P (|Y | ≥ u/3)du = C3sE|Y |s
(∑j≤n
E|Xj|2)s/2
+∑j≤n
E|Xj|s. (1.31)
Further, only technical calculations remain. Firstly, we can change the centralizing sequence on the left-hand side. Cauchy's inequality implies
\[
\big|E_n(h) - \overline{E}_n(h)\big| = \Big|\sum_{jk \le n} \frac{h_j(k)}{j^k k!}\big(e^{-1/j} - 1\big)\Big|
\le \sum_{jk \le n} \frac{|h_j(k)|}{j^{k+1}k!}
\]
\[
\le \Big(\sum_{jk \le n} \frac{1}{j^{k+1}k!}\Big)^{1/2} D_n(h; 2)^{1/2}
\le C_3\,D_n(h; 2)^{1/2}.
\]
Here $C_3 > 0$ is an absolute constant. Now, using the inequality $|a + b|^s \le 2^{s-1}(|a|^s + |b|^s)$, $a, b \in \mathbb{R}$, we have
\[
E_n|h(\sigma) - E_n(h)|^s \le 2^{s-1}E_n\big|h(\sigma) - \overline{E}_n(h)\big|^s + 2^{s-1}C_3^s\,D_n(h; 2)^{s/2}. \tag{1.32}
\]
Consequently, it suffices to estimate the quantities on the right-hand side of (1.31). Since
\[
E|X_j|^s = E\big|h_j(\xi_j) - Eh_j(\xi_j)\big|^s \le 2^s E|h_j(\xi_j)|^s \le 2^s\sum_{k \le n/j} \frac{|h_j(k)|^s}{j^k k!},
\]
we have
\[
\sum_{j \le n} E|X_j|^s \le 2^s\sum_{jk \le n} \frac{|h_j(k)|^s}{j^k k!} = 2^s D_n(h; s)
\]
for $s \ge 2$. Combining this with (1.32) and (1.31), we complete the proof.
if s ≥ 2. Combining this with (1.32) and (1.31), we complete the proof. Are these estimate sharp? Yes, but the constants. The worst function for which
the moment estimates are really bad is `(σ) ≡ n, σ ∈ S. In this case, trivially
32 I. ELEMENTARY THEORY
centralizing by n, we have the zero central moments. In general, an additive functioncan have an expression h(σ) = h(σ) + λ`(σ) with some λ ∈ R. In such cases, it isbetter to get rid from the second summand by changing the centralizing sequencesand apply the just derived inequalities for the first summand.
Exercise. Find a sequence α(n) and estimate the moment En|h(σ)− α(n)|s, if0 < s < 2. The necessary results on sums of independent r.vs can be found in [?] orin [?].
1.9 The law of large numbers
As in the probabilistic laws of large numbers, we can seek conditions under which, for a real additive function $h(\sigma)$, a real sequence $\alpha(n)$, and $\beta(n) > 0$,
\[
\nu_n\big(|h(\sigma) - \alpha(n)| \ge \delta\beta(n)\big) \to 0
\]
for each $\delta > 0$ as $n \to \infty$. More generally, we can examine sequences of additive functions $h_n\colon S_n \to \mathbb{R}$ defined via
\[
h_n(\sigma) = \sum_{j=1}^{n} h_{nj}\big(k_j(\sigma)\big). \tag{1.33}
\]
Now $h_{nj}(k)$ is even a three-dimensional array, with $h_{nj}(0) \equiv 0$ for $j \le n$ and $n \in \mathbb{N}$. In this section we will find sufficient conditions assuring the weak law of large numbers:
\[
\nu_n\big(|h_n(\sigma) - \alpha(n)| \ge \delta\big) = o(1) \tag{1.34}
\]
as $n \to \infty$. Here $\alpha(n)$ is a real centralizing sequence and $\delta > 0$ is an arbitrary fixed number. Observe that the values $h_{nj}(k)$ with $jk > n$ are not involved in $h_n(\sigma)$ for $\sigma \in S_n$; therefore, they can be changed in an arbitrary way.

Let us compare our problem with the classical weak law of large numbers for the sum
\[
Y_n := X_{n1} + \cdots + X_{nn}
\]
of independent r.vs $X_{nj}$, $1 \le j \le n$. Recall that they are infinitesimal if
\[
p_n := \max_{j \le n} P\big(|X_{nj}| \ge \varepsilon\big) = o(1)
\]
for every $\varepsilon > 0$ as $n \to \infty$. In what follows, we take $X_{nj} = h_{nj}(\xi_j)$, where, as previously, $\xi_j$, $j = 1, 2, \ldots$, are independent Poisson r.vs with $E\xi_j = 1/j$ for $1 \le j \le n$.

1.9.1 Lemma. The r.vs $X_{nj}$, $1 \le j \le n$, are infinitesimal if and only if
\[
h_{nj}(k) = o(1) \tag{1.35}
\]
for each fixed $j \ge 1$ and $k \ge 1$ as $n \to \infty$.
Proof. We have
\[
p_n = \max_{j \le n}\ \sum_{\substack{k \ge 1 \\ |h_{nj}(k)| \ge \varepsilon}} e^{-1/j}\,\frac{1}{j^k k!}.
\]
If $K \ge 1$ is an arbitrary fixed number, then, for arbitrary $\varepsilon > 0$, we can choose $n_0$ sufficiently large so that $|h_{nj}(k)| < \varepsilon$ for all $j \le K$ and $k \le K$ whenever $n \ge n_0$. Hence, for $n \ge n_0$, we need only examine the summands with $j > K$ or $k > K$. Consequently,
\[
p_n \le \max_{K < j \le n}\ \sum_{k \ge 1} \frac{1}{j^k k!} + \max_{j \ge 1}\ \sum_{k > K} \frac{1}{j^k k!} = O\Big(\frac{1}{K}\Big).
\]
Hence $p_n \to 0$ as $n \to \infty$.

The converse claim follows by obtaining a contradiction. The lemma is proved.

Condition (1.35) will be used frequently in the sequel; it fairly simplifies our calculations. Set, for brevity, $u^* = \min\{|u|, 1\}\,\mathrm{sgn}\,u$ and $a_{nj} = h_{nj}(1)$, where $1 \le j \le n$. We first recall a special case of the classical probabilistic result (see, for instance, [?]).
1.9.1 Theorem. Let $n \to \infty$ and let condition (1.35) hold. If
\[
\sum_{j \le n} \frac{a_{nj}^{*2}}{j} \to 0, \tag{1.36}
\]
then
\[
P\big(|Y_n - A_n| \ge \delta\big) = o(1),
\]
where
\[
A_n := \sum_{j \le n} \frac{a_{nj}^*}{j}.
\]

It is worth observing that, under condition (1.35), the values $h_{nj}(k)$ for $k \ge 2$ are negligible. The next assertion concerns the additive functions.

1.9.2 Theorem. Let the conditions of the previous theorem be satisfied. Then
\[
\nu_n(\delta) := \nu_n\big(|h_n(\sigma) - A_n| \ge \delta\big) = o(1)
\]
as $n \to \infty$.

Proof. It suffices to apply Theorem 1.7.2 to obtain
\[
\nu_n(\delta) \le C\,P\big(|Y_n - A_n| \ge \delta/3\big).
\]
The claim now follows from Theorem 1.9.1.

We now include a more transparent version of the law of large numbers, this time formulated for one fixed additive function.
1.9.3 Theorem. Let $h(\sigma)$ be an additive function defined via $h_j(k)$. As previously, set
\[
E_n(h) = \sum_{jk \le n} \frac{h_j(k)}{j^k k!}, \qquad
D_n(h) := \big(D_n(h; 2)\big)^{1/2} = \Big(\sum_{jk \le n} \frac{h_j^2(k)}{j^k k!}\Big)^{1/2}.
\]
For arbitrary positive $\psi_n \to \infty$, we have
\[
\nu_n\big(|h(\sigma) - E_n(h)| \ge \psi_n D_n(h)\big) = o(1)
\]
as $n \to \infty$.

Proof. It suffices to apply the previous theorem with
\[
h_n(\sigma) = h(\sigma)/\big(\psi_n D_n(h)\big), \qquad a_{nj} = h_j(1)/\big(\psi_n D_n(h)\big).
\]
Condition (1.36) is evidently satisfied. The change of the centralizing sequence can be substantiated by the inequalities
\[
\sum_{\substack{jk \le n \\ k \ge 2}} \frac{|h_j(k)|}{j^k k!} \ll D_n(h), \qquad
\sum_{\substack{j \le n \\ |a_{nj}| \ge 1}} \frac{|a_{nj}|}{j} \le \psi_n^{-2} = o(1).
\]
The theorem is proved.

Necessary and sufficient conditions for the weak law of large numbers for additive functions have been found in the author's paper [?]. Actually, we have presented all the ideas leading to the sufficiency part of this result. The only refinement needed is mentioned in the exercise below.

Exercise. Extend Theorems 1.9.2 and 1.9.3, taking into account the possible part $\lambda\ell(\sigma) = \lambda n$ which $h_n(\sigma)$ could contain.
1.10 The central limit theorem
We again examine a sequence of additive functions
\[
h_n(\sigma) = \sum_{j=1}^{n} h_{nj}\big(k_j(\sigma)\big).
\]
The problem is to find conditions under which the relation
\[
\nu_n\big(h_n(\sigma) - \alpha(n) < x\big)
= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-u^2/2}\,du + o(1) =: \Phi(x) + o(1), \qquad x \in \mathbb{R},
\]
holds as $n \to \infty$. Here $\alpha(n) \in \mathbb{R}$ is a centralizing sequence. For simplicity, we assume condition (1.35). Under it, the problem reduces to the case of the completely additive functions defined by $a_{nj} := h_{nj}(1)$, that is, to
\[
\tilde h_n(\sigma) = \sum_{j \le n} a_{nj}k_j(\sigma). \tag{1.37}
\]
To substantiate this, we check that the difference of additive functions is additive and, by Theorem 1.9.2, satisfies the weak law of large numbers
\[
\nu_n\big(|h_n(\sigma) - \tilde h_n(\sigma)| \ge \delta\big) = o(1)
\]
for every $\delta > 0$ as $n \to \infty$. This shows that the asymptotic distributions of $h_n(\sigma) - \alpha(n)$ and $\tilde h_n(\sigma) - \alpha(n)$ can exist only simultaneously, and then they coincide.

We again find an appropriate result in probability theory. As previously, let $\xi_j$ be the Poisson r.vs with $E\xi_j = 1/j$, $1 \le j \le n$. Set $X_{nj} := a_{nj}\xi_j$ and
\[
Y_u := \sum_{j \le u} X_{nj}, \qquad A_u := \sum_{j \le u} \frac{a_{nj}^*}{j}, \qquad 1 \le u \le n.
\]
By $V_a(x)$ we denote the distribution function of the law degenerate at the point $a$.
1.10.1 Theorem. Let $n \to \infty$. The relations
\[
P\big(Y_n - \alpha(n) < x\big) = \Phi(x) + o(1) \tag{1.38}
\]
uniformly in $x \in \mathbb{R}$ and
\[
a_{nj} = o(1) \tag{1.39}
\]
for each fixed $j \ge 1$ hold if and only if the following conditions are satisfied:
\[
\sum_{j \le n} \frac{a_{nj}^{*2}}{j}\,\mathbf{1}\{a_{nj} < x\} = V_0(x) + o(1), \qquad x \in \mathbb{R}, \tag{1.40}
\]
and
\[
\alpha(n) = A_n + o(1). \tag{1.41}
\]

Proof. This is a particular case of the Lindeberg--Feller theorem (see [?]).

1.10.1 Remark. The Lindeberg type condition (1.40) is equivalent to
\[
\sum_{j \le n} \frac{1}{j}\,\mathbf{1}\{|a_{nj}| \ge \varepsilon\} = o(1) \tag{1.42}
\]
for every $\varepsilon > 0$. The latter remains valid for some $\varepsilon = \varepsilon_n \to 0$. Hence, for each fixed $0 < \delta < 1$,
\[
\sum_{\delta n < j \le n} \frac{a_{nj}^{*2}}{j}
= o(1) + \varepsilon_n^2\sum_{\delta n < j \le n} \frac{1}{j}\,\mathbf{1}\{|a_{nj}| < \varepsilon_n\}
\ll o(1) + \varepsilon_n^2\log(1/\delta) = o(1).
\]
Now, by Theorem 1.9.1 applied with $a_{nj} = 0$ for $j \le \delta n$,
\[
P(\varepsilon) := P\big(|Y_n - Y_{\delta n} - (A_n - A_{\delta n})| \ge \varepsilon\big) = o(1) \tag{1.43}
\]
for every $\varepsilon > 0$. This also remains valid for some $\delta = \delta_n \to 0$. In other words, the Lindeberg condition also assures the convergence
\[
P\big(Y_r - A_r < x\big) = \Phi(x) + o(1) \tag{1.44}
\]
uniformly in $x \in \mathbb{R}$ for some $r = r_n = o(n)$ as $n \to \infty$.
We will need the so-called lemma on the convergence of types of distributions.

1.10.1 Lemma. Let $F_n(x)$, $G(x) \ne V_a(x)$, and $F(x) \ne V_a(x)$, where $a \in \mathbb{R}$, be distribution functions, let $\alpha(n)$ and $\beta(n) > 0$ be sequences of constants, and let $\Rightarrow$ denote weak convergence. If
\[
F_n(x) \Rightarrow F(x) \quad \text{and} \quad F_n\big(\beta(n)x + \alpha(n)\big) \Rightarrow G(x),
\]
then $G(x) = F(\beta x + \alpha)$, $\beta(n) \to \beta$, and $\alpha(n) \to \alpha$, where $\alpha$ and $\beta$ are constants.

The proof can be found in [?]. We now return to combinatorics.

1.10.2 Theorem. Let $h_n(\sigma)$ be a sequence of additive functions defined in (1.33) via $h_{nj}(k)$ satisfying condition (1.35), and let $a_{nj} := h_{nj}(1)$.

If conditions (1.40) and (1.41) are satisfied, then
\[
\nu_n(x) := \nu_n\big(h_n(\sigma) - \alpha(n) < x\big) = \Phi(x) + o(1) \tag{1.45}
\]
uniformly in $x \in \mathbb{R}$ as $n \to \infty$.

Conversely, if for every $0 < \delta < 1$
\[
\sum_{\delta n < j \le n} \frac{a_{nj}^{*2}}{j} = o(1) \tag{1.46}
\]
and (1.45) holds with some $\alpha(n) \in \mathbb{R}$, then conditions (1.40) and (1.41) are satisfied.
Proof. Condition (1.35) allows us to deal with the completely additive function defined via $h_{nj}(k) = a_{nj}k$, $1 \le j \le n$ and $k \ge 0$. This concerns both the sufficiency and the necessity parts of the proof.

Sufficiency. We first see that, by the central limit theorem above, relations (1.38) and (1.44) hold. As we have remarked, we also have (1.46), which follows from condition (1.40). By Theorem 1.7.2, this leads to
\[
\nu_n(\varepsilon) := \nu_n\big(|(h_n(\sigma) - h_r(\sigma)) - (A_n - A_r)| \ge \varepsilon\big) \ll P(\varepsilon/3) = o(1) \tag{1.47}
\]
for every $\varepsilon > 0$ and some $r = o(n)$ as $n \to \infty$. Here and in the sequel,
\[
h_r(\sigma) := \sum_{j \le r} a_{nj}k_j(\sigma)
\]
is a truncated additive function. Now, by the Fundamental Lemma and (1.44),
\[
\nu_n\big(h_r(\sigma) - A_r < x\big) = P\big(Y_r - A_r < x\big) + o(1) = \Phi(x) + o(1)
\]
uniformly in $x \in \mathbb{R}$ as $n \to \infty$. Recalling the just proved estimate (1.47), we obtain (1.45) with $\alpha(n) = A_n$. The sufficiency is proved.
Necessity. We now have (1.45). Condition (1.46), assumed a fortiori, implies (1.43). Hence there exists a sequence $r = \delta n = o(n)$ such that, as $n \to \infty$,
\[
\Phi(x) + o(1) = \nu_n\big(h_n(\sigma) - \alpha(n) < x\big)
= \nu_n\big(h_r(\sigma) - (\alpha(n) - A_n + A_r) + (h_n(\sigma) - h_r(\sigma) - A_n + A_r) < x\big)
\]
\[
= \nu_n\big(h_r(\sigma) - \alpha_r < x\big) + o(1)
= P\big(Y_r - \alpha_r < x\big) + o(1),
\]
where $\alpha_r := \alpha(n) - A_n + A_r$. In the last step, we used the Fundamental Lemma. The central limit theorem for $Y_r$ holds with the centralizing sequence $A_r$; therefore, Lemma 1.10.1 implies
\[
\alpha_r = \alpha(n) - A_n + A_r = A_r + o(1),
\]
that is, $\alpha(n) = A_n + o(1)$ as $n \to \infty$. Thus, we have proved relation (1.41).

By Remark 1.10.1, condition (1.46) shows that the last but one relation can be extended to (1.38). Now, Theorem 1.10.1 implies the necessity of condition (1.40). The theorem is proved.

The just proved Theorem 1.10.2 becomes simpler if a single normalized function $h(\sigma)$ is considered. In such cases, we examine the convergence of
\[
\nu_n\big(h(\sigma) - \alpha(n) < x\beta(n)\big)
\]
as $n \to \infty$. By virtue of (1.40), one can define the normalizing sequence $\beta(n) > 0$, $\beta(n) \to \infty$, by asymptotically solving the relation
\[
\sum_{j=1}^{n}\Big(\frac{a_j}{\beta(n)}\Big)^{*2}\frac{1}{j} = 1 + o(1).
\]
Here $a_j := h_j(1)$. If the latter values are comparatively small, one can use the sequences
\[
\alpha(n) = A_n := \sum_{j=1}^{n} \frac{a_j}{j}, \qquad
\beta(n) = B_n := \Big(\sum_{j=1}^{n} \frac{a_j^2}{j}\Big)^{1/2}.
\]
In some sense, this idea has already been supported by the moment analysis and by the weak law of large numbers. Check that, if $B_n \to \infty$ as $n \to \infty$, then $h_{nj}(k) = h_j(k)/B_n = o(1)$ for each fixed $j, k \ge 1$. As in the proof of Theorem 1.10.2, we can confine ourselves to the case of completely additive functions.
1.10.3 Theorem. Assume that $B_n \to \infty$ as $n \to \infty$. If
\[
\frac{1}{B_n^2}\sum_{j \le n} \frac{a_j^2}{j}\,\mathbf{1}\{|a_j| \ge \varepsilon B_n\} = o(1) \tag{1.48}
\]
for every $\varepsilon > 0$, then
\[
\nu_n\big(h(\sigma) - A_n < xB_n\big) = \Phi(x) + o(1)
\]
uniformly in $x \in \mathbb{R}$ as $n \to \infty$. Conversely, the last relation and the extra condition $B_r \sim B_n$ for some $r = o(n)$ as $n \to \infty$ imply (1.48).
Proof. As an exercise, we leave it for the Reader.

1.10.1 Example. Examine the asymptotic distribution of
\[
w(\sigma) := \sum_{j=1}^{n} \mathbf{1}\{k_j(\sigma) \ge 1\},
\]
expressing the number of different cycle lengths of a random $\sigma \in S_n$. This function is additive and defined via the array
\[
w_j(k) = \mathbf{1}\{k \ge 1\}, \qquad j \ge 1,\ k \ge 0.
\]
Now $a_j \equiv 1$ and $A_n = B_n^2 \sim \log n$ as $n \to \infty$. Theorem 1.10.3 implies
\[
\nu_n\big(w(\sigma) - \log n < x\sqrt{\log n}\big) = \Phi(x) + o(1)
\]
as $n \to \infty$.
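The example invites a quick simulation. The code below is our illustration, not a proof; it samples uniform random permutations and compares the empirical mean of $w(\sigma)$ with $\log n$ under a deliberately loose tolerance.

```python
import random
from math import log

def distinct_cycle_lengths(n, rng):
    """w(sigma) for a uniform random permutation of {0,...,n-1}."""
    perm = list(range(n))
    rng.shuffle(perm)
    seen, lengths = [False] * n, set()
    for i in range(n):
        if not seen[i]:
            ln, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = perm[j]
                ln += 1
            lengths.add(ln)
    return len(lengths)

rng = random.Random(7)
n, reps = 2000, 1000
mean = sum(distinct_cycle_lengths(n, rng) for _ in range(reps)) / reps
assert abs(mean - log(n)) < 1.0          # loose tolerance; log(2000) ~ 7.6
```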
In contrast to the central limit theorem for independent r.vs, our Theorem 1.10.2 is not a final solution of the posed problem. The necessity part is proved only under the extra condition (1.46). The author, jointly with G.J. Babu [?], has constructed an instance of an additive function which does not satisfy (1.40) but whose asymptotic distribution, after an appropriate normalization, is nevertheless normal. The problem still awaits new ideas.

Apart from the normal law, one can seek other limit distributions. Recent results of the author [?] give necessary and sufficient conditions when the normalizing sequence $\beta(n)$ satisfies some additional regularity conditions. There is some progress in finding the necessary conditions for the convergence of $\nu_n(h_n(\sigma) < x)$ to the Poisson or other discrete laws when $h_{nj}(k) \in \mathbb{Z}$ (see [?] and [?]).

Exercise. Find normalizing sequences $\alpha(n)$ and $\beta(n) > 0$ such that
\[
\nu_n\Big(\sum_{j=1}^{n} k_j(\sigma)\log j - \alpha(n) < x\beta(n)\Big) = \Phi(x) + o(1), \qquad n \to \infty.
\]
1.11 Mean values of multiplicative functions
A multiplicative function $f\colon S_n \to \mathbb{C}$ is defined via a two-dimensional array of complex numbers $f_j(k)$, where $j \ge 1$ and $k \ge 0$, by setting
\[
f(\sigma) = \prod_{j=1}^{n} f_j\big(k_j(\sigma)\big).
\]
Here the condition $f_j(0) \equiv 1$, $j \ge 1$, is also assumed. If $f_j(k) = b_j^k$ with some $b_j \in \mathbb{C}$ for all $j \ge 1$ and $k \ge 0$, keeping the agreement $0^0 := 1$, we have a completely multiplicative function.

The asymptotic behavior of the mean values
\[
M_n := M_n(f) := \frac{1}{n!}\sum_{\sigma \in S_n} f(\sigma)
\]
as $n \to \infty$ is now the main target. The previously touched problems on the value distribution of sequences of additive functions can be reduced, via Fourier transforms, to such mean values. In that case the function $f(\sigma)$ depends on the sequence parameter $n$ and on $t \in \mathbb{R}$, and we need asymptotic formulae with remainder estimates uniform in $t \in \Delta$ for various $\Delta \subset \mathbb{R}$. This possible obstacle is taken into account in this section.
1.11.1 Lemma. Let $f(\sigma)$ be a multiplicative function defined via $f_j(s)$, $j \ge 1$ and $s \ge 0$, where $f_j(0) \equiv 1$. Then
\[
M_n = M_n(f) = \sum_{\ell(s) = n}\ \prod_{j=1}^{n} \frac{f_j(s_j)}{j^{s_j}s_j!}. \tag{1.49}
\]
Here the summation is over $s = (s_1, \ldots, s_n) \in \mathbb{Z}_+^n$ such that $\ell(s) = n$.

Proof. Group the summands in $M_n$ over the classes $S_n(s)$, which are indexed by $s$. All values of $f(\sigma)$ for $\sigma \in S_n(s)$ are the same. Applying Theorem 1.1.1, we complete the proof.

1.11.2 Lemma. In the notation of Lemma 1.11.1, we have the following formal power series relation:
\[
M(z) = \sum_{m=0}^{\infty} M_m z^m = \prod_{j=1}^{\infty}\Big(1 + \sum_{s=1}^{\infty} \frac{f_j(s)}{j^s s!}z^{js}\Big).
\]
For completely multiplicative functions defined via $f_j(s) = b_j^s$, this reduces to
\[
M(z) = m(z) := \exp\Big\{\sum_{j=1}^{\infty} \frac{b_j}{j}z^j\Big\} =: \exp\{B(z)\}. \tag{1.50}
\]

Proof. Multiply the given formal power series term by term and use Lemma 1.11.1 to compare the coefficients.

1.11.3 Lemma. In the case of a completely multiplicative function $f(\sigma)$, we have the following recurrence relation:
\[
M_n = \frac{1}{n}\sum_{k=0}^{n-1} M_k b_{n-k}, \qquad M_0 := 1. \tag{1.51}
\]

Proof. Exploit the formal equality
\[
M'(z) = M(z)B'(z).
\]
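The recurrence (1.51) gives a direct way to compute the mean values. The sketch below is ours; `mean_values` is a hypothetical name. For $b_j \equiv 1$ (i.e. $f \equiv 1$) it returns $M_n \equiv 1$, and for $b_j \equiv 2$, where $m(z) = (1 - z)^{-2}$, it returns $M_n = n + 1$.

```python
def mean_values(b, N):
    """b[i] plays the role of b_{i+1}; returns [M_0, ..., M_N] by (1.51)."""
    M = [1.0]
    for n in range(1, N + 1):
        # M_n = (1/n) * sum_{k=0}^{n-1} M_k b_{n-k}
        M.append(sum(M[k] * b[n - k - 1] for k in range(n)) / n)
    return M

M1 = mean_values([1.0] * 10, 10)   # f = 1: every mean value equals 1
M2 = mean_values([2.0] * 10, 10)   # b_j = 2: M_n = n + 1
```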
The main traditional ways to investigate $M_n$ are based on complex analysis, Cauchy's formula, and contour integration involving the function $M(z)$, provided that it is well defined in a nontrivial vicinity of zero. We now present a comparatively new approach elaborated by V. Zacharovas in his thesis [?]. Its first ingredient is the following inequality, of interest in itself.

1.11.4 Lemma. Let the series
\[
g(z) = \sum_{k=0}^{\infty} g_k z^k, \qquad g_k \in \mathbb{C},
\]
converge in $|z| < 1$. If
\[
S(m) = \sum_{k=0}^{m} k g_k, \qquad m \ge 0,
\]
then, for every $n \ge 1$,
\[
\Big|\sum_{k=0}^{n} g_k - g\big(e^{-1/n}\big) - \frac{S(n)}{n}\Big|
\le \frac{2}{n}\sum_{k=1}^{n} \frac{|S(k)|}{k} + \frac{2}{n}\sum_{k > n} \frac{|S(k)|}{k}e^{-k/n}.
\]
Proof. To the finite sum and the infinite series we apply Abel's summation. The needed reordering of the infinite series is justified by the absolute convergence on the circle $|z| = e^{-1/n}$. Since $kg_k = S(k) - S(k - 1)$ and $S(0) = 0$, we obtain
\[
R := \sum_{k=0}^{n} g_k - g\big(e^{-1/n}\big) - \frac{S(n)}{n}
= \sum_{k=1}^{n-1} S(k)\Big(\frac{1}{k} - \frac{1}{k+1}\Big)
- \sum_{k=1}^{\infty} S(k)\Big(\frac{e^{-k/n}}{k} - \frac{e^{-(k+1)/n}}{k+1}\Big). \tag{1.52}
\]
The modulus of the part of the last series taken over $k \ge n$ does not exceed
\[
\sum_{k \ge n} \frac{|S(k)|}{k}e^{-k/n}\Big|1 - e^{-1/n} + \frac{e^{-1/n}}{k+1}\Big|
\le \frac{2}{n}\sum_{k \ge n} \frac{|S(k)|}{k}e^{-k/n}.
\]
The remaining summands on the right-hand side of (1.52) can be combined. They give
\[
R_1 := \sum_{k=1}^{n-1} S(k)\Big[\frac{1}{k}\big(1 - e^{-k/n}\big) - \frac{1}{k+1}\big(1 - e^{-(k+1)/n}\big)\Big].
\]
Inserting the terms $\pm e^{-(k+1)/n}/k$ inside the brackets and grouping appropriately, we see that the absolute value of the quantity within them equals
\[
\Big|\frac{1}{k}e^{-k/n}\big(e^{-1/n} - 1\big) + \frac{1}{k(k+1)}\big(1 - e^{-(k+1)/n}\big)\Big| \le \frac{2}{kn}.
\]
Consequently,
\[
|R_1| \le \frac{2}{n}\sum_{k=1}^{n-1} \frac{|S(k)|}{k}.
\]
Inserting the obtained estimates into formula (1.52), and observing that one of the summands is attributed to the other sum, we complete the proof.
The next theorem gives an approximation of the mean values of completelymultiplicative functions. Apart from the combinatorial meaning the result providesan asymptotic formula for the Taylor coefficients of a power series if some informationon the Taylor coefficients of its logarithmic derivative is available.
1.11.1 Theorem. Assume that $b_j\in\mathbb C$, $|b_j|\le 1$, $j\ge 1$, $n\ge 1$, and
\[
m(z):=\sum_{n=0}^{\infty} m_n z^n=\exp\bigg\{\sum_{j=1}^{\infty}\frac{b_j}{j}z^j\bigg\}.
\]
Then
\[
\Big|m_n-\exp\bigg\{\sum_{j\le n}\frac{b_j-1}{j}\bigg\}\Big|
\le \frac{5}{n}\sum_{j\le n}|b_j-1|\Big(1+\log\frac nj\Big).
\]
Moreover, for every $p>1$,
\[
\Big|m_n-\exp\bigg\{\sum_{j\le n}\frac{b_j-1}{j}\bigg\}\Big|
\le \Big(2+\frac{3p}{p-1}\Big)\bigg(\frac1n\sum_{j\le n}|b_j-1|^p\bigg)^{1/p}. \tag{1.53}
\]
Proof. We will apply Lemma 1.11.4 to the function
\[
g(z):=\sum_{n=0}^{\infty} g_n z^n:=(1-z)m(z)=\exp\bigg\{\sum_{j=1}^{\infty}\frac{b_j-1}{j}z^j\bigg\},
\]
which is defined in the disk $|z|<1$. The values $b_j$ for $j>n$ have no influence on $m_n$; therefore, we replace them by $1$ and obtain a function $g(z)$ defined in the whole complex plane. Now $S(k)=1g_1+\cdots+kg_k$ and
\[
\sum_{k=1}^{\infty}S(k)z^k=\frac{zg'(z)}{1-z}
=\frac{1}{1-z}\exp\bigg\{\sum_{j=1}^{\infty}\frac{b_j-1}{j}z^j\bigg\}\sum_{j=1}^{\infty}(b_j-1)z^j
=m(z)\sum_{j=1}^{\infty}(b_j-1)z^j.
\]
By (1.49), the condition $|b_j|\le 1$, $j\ge 1$, implies $|m_n|\le 1$. Consequently,
\[
|S(k)|=\Big|\sum_{j=1}^{k}(b_j-1)m_{k-j}\Big|\le\sum_{j=1}^{k}|b_j-1|
\le\min\bigg\{2k,\ \sum_{j=1}^{n}|b_j-1|\bigg\}
\]
for every $k\ge 1$. The conditions of Lemma 1.11.4 are satisfied. It yields
\[
|m_n-g(1)|=\Big|m_n-\exp\bigg\{\sum_{j=1}^{n}\frac{b_j-1}{j}\bigg\}\Big|
\le |g(e^{-1/n})-g(1)|+\frac{|S(n)|}{n}
+\frac{2}{n}\sum_{k=1}^{n}\frac{|S(k)|}{k}
+\frac{2}{n}\sum_{k>n}\frac{|S(k)|}{k}\,e^{-k/n}. \tag{1.54}
\]
Applying the above estimate of $|S(k)|$, we obtain
\[
|g(1)-g(e^{-1/n})|\le\int_{e^{-1/n}}^{1}|g'(x)|\,dx
\le\int_{e^{-1/n}}^{1}\frac{1-x}{x}\sum_{k=1}^{\infty}|S(k)|x^k\,dx
\le\sum_{j\le n}|b_j-1|\int_{e^{-1/n}}^{1}\frac{1-x}{x}\sum_{k=1}^{\infty}x^k\,dx
=(1-e^{-1/n})\sum_{j\le n}|b_j-1|\le\frac1n\sum_{j\le n}|b_j-1|=:\rho_n.
\]
As we have observed, $|S(n)|/n\le\rho_n$. The next-to-last term in (1.54) does not exceed
\[
\frac{2}{n}\sum_{k=1}^{n}\frac1k\sum_{j=1}^{k}|b_j-1|
=\frac{2}{n}\sum_{j=1}^{n}|b_j-1|\sum_{j\le k\le n}\frac1k
\le\frac{2}{n}\sum_{j=1}^{n}|b_j-1|\Big(1+\log\frac nj\Big).
\]
Finally, the last term of (1.54) does not exceed
\[
\frac{2}{n}\sum_{j=1}^{n}|b_j-1|\int_{1}^{\infty}\frac{e^{-v}}{v}\,dv\le(2/e)\rho_n.
\]
Adding the estimates just obtained, we complete the proof of the first claim of the theorem.
To prove the second claim, it suffices to apply (1.54) and the inequality
\[
|S(k)|\le k^{1/q}\bigg(\sum_{j\le k}|b_j-1|^p\bigg)^{1/p}
\le k^{1/q}\bigg(\sum_{j\le n}|b_j-1|^p\bigg)^{1/p},
\]
where $q=p/(p-1)$ and $k\ge 1$. The extra sums of monotonically decreasing summands that appear can be approximated by integrals; for instance,
\[
\sum_{k\le n}k^{-1/p}\le 1+\int_{1}^{n}u^{-1/p}\,du\le 1+qn^{1/q}-q\le qn^{1/q}.
\]
The theorem is proved.
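Theorem 1.11.1 also lends itself to a numerical check: comparing coefficients in $zm'(z)=m(z)\sum_j b_jz^j$ gives the recurrence $nm_n=\sum_{j\le n}b_jm_{n-j}$. The sketch below (the choice $b_j=e^{i/j}$, which satisfies $|b_j|\le 1$, is our own illustration) verifies the first bound of the theorem:

```python
import cmath
import math

# Numerical check of the first bound of Theorem 1.11.1 (illustration
# b_j = e^{i/j}). The coefficients m_n obey n*m_n = sum_{j<=n} b_j*m_{n-j}.
n = 400
b = [cmath.exp(1j / j) for j in range(1, n + 1)]
m = [1.0 + 0j]
for k in range(1, n + 1):
    m.append(sum(b[j - 1] * m[k - j] for j in range(1, k + 1)) / k)

approx = cmath.exp(sum((b[j - 1] - 1) / j for j in range(1, n + 1)))
bound = (5.0 / n) * sum(abs(b[j - 1] - 1) * (1 + math.log(n / j))
                        for j in range(1, n + 1))
assert abs(m[n] - approx) <= bound
```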
As the reader sees, we have rather pedantically kept track of the constants in Theorem 1.11.1. This was done to stress that the result holds uniformly in the functions $m(z)$ satisfying the conditions of Theorem 1.11.1. The example below demonstrates how we can extend the classical problem of counting permutations missing some cycles; in it, the forbidden set of cycle lengths may depend on $n$.
1.11.1 Example. We seek an asymptotic formula for the number of permutations $\sigma\in S_n$ whose cycle lengths belong to $J:=J_n\subset\{1,\dots,n\}=N_n$. If, as previously, $k_j(\sigma)\ge 0$ denotes the number of cycles of length $j$, then the problem is to examine the cardinality of
\[
S_n(J):=\{\sigma\in S_n:\ k_j(\sigma)=0\ \ \forall j\not\in J\}.
\]
The indicator function of this set of permutations is the completely multiplicative function defined via $b_j=1$ if $j\in J$ and $b_j=0$ if $j\in\bar J:=N_n\setminus J$. Hence
\[
1+\sum_{m=1}^{\infty}\frac{|S_m(J)|}{m!}z^m=\exp\bigg\{\sum_{j\in J}\frac{z^j}{j}\bigg\}.
\]
Theorem 1.11.1 yields the inequality
\[
\bigg|\frac{|S_n(J)|}{n!}-\exp\bigg\{-\sum_{j\in\bar J}\frac1j\bigg\}\bigg|
\le\frac{5}{n}\sum_{j\in\bar J}\Big(1+\log\frac nj\Big).
\]
The result is nontrivial if $\bar J$ is not large.
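For small $n$, the count in the example can be checked directly: the ratio $c_n:=|S_n(J)|/n!$ satisfies the recurrence $nc_n=\sum_{j\in J,\,j\le n}c_{n-j}$, obtained by differentiating the exponential generating series above. The sketch below (the set $J$ and the value of $n$ are our own illustration) compares this with brute-force enumeration:

```python
import math
from itertools import permutations

# Brute-force check of the example (illustrative choices of n and J).
def cycle_lengths(p):
    seen, lengths = set(), []
    for i in range(len(p)):
        if i not in seen:
            length, v = 0, i
            while v not in seen:
                seen.add(v)
                v = p[v]
                length += 1
            lengths.append(length)
    return lengths

n, J = 7, {1, 3, 4}
count = sum(all(L in J for L in cycle_lengths(p))
            for p in permutations(range(n)))

# EGF recurrence: n*c_n = sum_{j in J, j<=n} c_{n-j}, with c_0 = 1.
c = [1.0]
for k in range(1, n + 1):
    c.append(sum(c[k - j] for j in J if j <= k) / k)
assert abs(count / math.factorial(n) - c[n]) < 1e-12
```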
The function $m(z)$ examined in Theorem 1.11.1 has a very special shape. For many applications, we need asymptotic formulae for the Taylor coefficients of the product
\[
M(z):=\sum_{n=0}^{\infty}M_nz^n=m(z)H(z):=\exp\bigg\{\sum_{j=1}^{\infty}\frac{b_j}{j}z^j\bigg\}\sum_{k=0}^{\infty}H_kz^k,
\]
where $H(z)$ is a function defined in the closed disk $|z|\le 1$ and, in some sense, "better" than $m(z)$. We start with a very simple case.
1.11.2 Theorem. Assume that $b_j\in\mathbb C$, $|b_j|\le 1$ if $j\ge 1$, and set
\[
r_n:=\frac1n\sum_{j\le n}|b_j-1|^2.
\]
If
\[
|H_k|\le C_3/(k+1)^2 \tag{1.55}
\]
for some constant $C_3>0$ and all $k\ge 0$, then
\[
\Big|M_n-H(1)\exp\bigg\{\sum_{j\le n}\frac{b_j-1}{j}\bigg\}\Big|
\le\frac{4C_3}{n}+\frac{7\pi^2C_3}{3}\sqrt{r_n}.
\]
Proof. It suffices to apply estimate (1.53) to the first part of the sum
\[
M_n=\bigg(\sum_{k\le n/2}+\sum_{n/2<k\le n}\bigg)H_km_{n-k}
\]
and to estimate the second part. Since $|m_k|\le 1$ for all $k\ge 0$, the contribution of the second sum is not greater than $C_3\sum_{k>n/2}1/(1+k)^2\le 2C_3/n$. Inserting formula (1.53) with $p=2$, we have
\[
\bigg|\sum_{k\le n/2}H_km_{n-k}-\sum_{k\le n/2}H_k\exp\bigg\{\sum_{j\le n-k}\frac{b_j-1}{j}\bigg\}\bigg|
\le 8\sqrt{r_n}\sum_{k\le n/2}|H_k|\le(4/3)\pi^2C_3\sqrt{r_n}.
\]
Here we have inserted the value $\zeta(2)=\pi^2/6$ of the Riemann zeta-function $\zeta(s)=\sum_{n\ge 1}n^{-s}$. If $k\le n/2$ and $n\ge 1$, then
\[
\sum_{n-k<j\le n}\frac{|b_j-1|}{j}\le 2\log\Big(1-\frac kn\Big)^{-1}\le\log 4.
\]
Hence, by the inequality $|e^z-1|\le|z|e^{|z|}$, where $z\in\mathbb C$,
\[
\bigg|\exp\bigg\{\sum_{n-k<j\le n}\frac{1-b_j}{j}\bigg\}-1\bigg|
\le 4\bigg(\sum_{j\le n}|b_j-1|^2\bigg)^{1/2}\bigg(\sum_{j>n-k}\frac{1}{j^2}\bigg)^{1/2}
\le 6\sqrt{r_n}
\]
and
\[
\sum_{k\le n/2}H_k\exp\bigg\{\sum_{j\le n-k}\frac{b_j-1}{j}\bigg\}
=\sum_{k\le n/2}H_k\exp\bigg\{\sum_{j\le n}\frac{b_j-1}{j}\bigg\}\big(1+6\theta\sqrt{r_n}\big)
=\Big(H(1)+\frac{2\theta C_3}{n}\Big)\exp\bigg\{\sum_{j\le n}\frac{b_j-1}{j}\bigg\}+C_3\theta\pi^2\sqrt{r_n}
\]
with some $|\theta|\le 1$, not the same at different places. Collecting all the estimates, we obtain the desired result.
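Theorem 1.11.2 can also be tested numerically. All concrete choices in the sketch below are our own, not the text's: $b_j=e^{i/j}$, and $H(z)=(1+z/2)^{-2}$ with coefficients $H_k=(k+1)(-1/2)^k$, for which (1.55) holds with $C_3=8$ because $(k+1)^3\le 8\cdot 2^k$ for all $k\ge 0$.

```python
import cmath
import math

# Numerical sketch of Theorem 1.11.2 (illustrative b_j and H).
n, C3 = 300, 8.0
b = [cmath.exp(1j / j) for j in range(1, n + 1)]
m = [1.0 + 0j]                         # n*m_n = sum_{j<=n} b_j m_{n-j}
for k in range(1, n + 1):
    m.append(sum(b[j - 1] * m[k - j] for j in range(1, k + 1)) / k)

H = [(k + 1) * (-0.5) ** k for k in range(n + 1)]   # H(z) = (1+z/2)^{-2}
Mn = sum(H[k] * m[n - k] for k in range(n + 1))
H1 = 1.5 ** -2                                       # H(1)
approx = H1 * cmath.exp(sum((b[j - 1] - 1) / j for j in range(1, n + 1)))
rn = sum(abs(b[j - 1] - 1) ** 2 for j in range(1, n + 1)) / n
bound = 4 * C3 / n + (7 * math.pi ** 2 * C3 / 3) * math.sqrt(rn)
assert abs(Mn - approx) <= bound
```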
The theorem is proved.

Theorem 1.11.2 enables us to find necessary and sufficient conditions under which a multiplicative function $f(\sigma)$ such that $|f(\sigma)|\le 1$ possesses a limit mean value $\lim_{n\to\infty}M_n(f)$. We just have to analyze condition (1.55) more carefully. Afterwards we will exploit some technical lemmata which are essentially due to H.-K. Hwang [?].
Let $[z^m]q(z)$ denote the $m$-th Taylor coefficient of a function $q(z)$ defined in a vicinity of zero.
1.11.5 Lemma. Let $n\ge 0$. If $|[z^n]q(z)|\le C_4/n!$ and $|[z^n]g(z)|\le C_5/n!$, then
\[
|[z^n](q(z)g(z))|\le C_4C_52^n/n!.
\]
If $|[z^n]q(z)|\le C_4(n+1)^{-2}$ and $|[z^n]g(z)|\le C_5(n+1)^{-2}$, then
\[
|[z^n](q(z)g(z))|\le 8\zeta(2)C_4C_5(n+1)^{-2},\qquad
|[z^n]\exp\{g(z)\}|\le C_6(n+1)^{-2}.
\]
Here $C_6$ is a positive constant depending on $C_4$ and $C_5$ only.
Proof. The first claim is easy. To prove the second one, we exploit the given conditions as follows:
\[
|[z^n](q(z)g(z))|\le C_4C_5\bigg(\sum_{k\le n/2}+\sum_{n/2<k\le n}\bigg)\frac{1}{(k+1)^2(n-k+1)^2}
\le 8\zeta(2)C_4C_5(n+1)^{-2}.
\]
Further, by induction, we obtain
\[
|[z^n](g(z)^r)|\le(8\zeta(2))^{r-1}C_5^r(n+1)^{-2}
\]
for each $r\ge 1$ and $n\ge 0$. Now, if $g(0)=0$, then
\[
|[z^n]\exp\{g(z)\}|=\Big|\sum_{r=0}^{n}[z^n]g^r(z)\frac{1}{r!}\Big|
\le(n+1)^{-2}\sum_{r=0}^{\infty}\frac{(8\zeta(2))^{r-1}C_5^r}{r!}
<(n+1)^{-2}\exp\{8\zeta(2)C_5\}=:(n+1)^{-2}C_7.
\]
In the remaining case, we have
\[
|[z^n]e^{g(z)}|=|e^{g(0)}|\,\big|[z^n]e^{g(z)-g(0)}\big|\le e^{C_5}C_7(n+1)^{-2}.
\]
The lemma is proved.
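The convolution bound of the second claim can be tested numerically at the extreme of the hypotheses, taking real coefficients $q_k=g_k=(k+1)^{-2}$, i.e. $C_4=C_5=1$ (our own check, not from the text):

```python
import math

# Check of the convolution bound in Lemma 1.11.5 (second claim)
# with q_k = g_k = (k+1)^{-2}, i.e. C_4 = C_5 = 1.
zeta2 = math.pi ** 2 / 6
N = 300
q = [1.0 / (k + 1) ** 2 for k in range(N + 1)]
for n in range(N + 1):
    conv = sum(q[k] * q[n - k] for k in range(n + 1))
    assert conv <= 8 * zeta2 / (n + 1) ** 2
```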
1.11.6 Lemma. Assume that
\[
\chi_j(z)=\sum_{n\ge 2}a_{jn}z^n,\qquad j\ge 1,
\]
are entire functions satisfying $|a_{jn}|\le C^n/n!$ for all $j\ge 1$ and $n\ge 0$, where $C>0$ is a constant. Then
\[
A_n:=[z^n]\prod_{j\ge 1}\big(1+\chi_j(z^j/j)\big),\qquad
|A_n|\le\frac{C_1}{n^2},\quad n\ge 1,
\]
where $C_1$ is a positive constant depending on $C$ only.
Proof. We have
\[
A_n=\sum_{\substack{\ell(\bar k)=n\\ k_j\ne 1}}\prod_{j=1}^{n}\frac{a_{jk_j}}{j^{k_j}},
\qquad
|A_n|\le\sum_{\substack{\ell(\bar k)=n\\ k_j\ne 1}}\prod_{j=1}^{n}\Big(\frac Cj\Big)^{k_j}\frac{1}{k_j!}
=[z^n]\prod_{j\ge 1}\bigg(1+\Big(e^{Cz^j/j}-1-\frac{Cz^j}{j}\Big)\bigg)
\le[z^n]\exp\bigg\{\sum_{j\ge 1}\Big(e^{Cz^j/j}-1-\frac{Cz^j}{j}\Big)\bigg\}.
\]
Check that
\[
[z^n]\sum_{j\ge 1}\Big(e^{Cz^j/j}-1-\frac{Cz^j}{j}\Big)
=\sum_{d\mid n,\ d\ge 2}\frac{C^d}{(n/d)^d\,d!}
\le\frac{2C^2}{n^2}\sum_{d\mid n,\ d\ge 2}\frac{C^{d-2}}{(d-2)!}\Big(\frac dn\Big)^{d-2}
\le\frac{2C^2e^C}{n^2}.
\]
The desired estimate now follows from the previous lemma. The lemma is proved.
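The divisor-sum identity used in this proof can be confirmed by expanding the truncated series directly; the value of $C$ and the truncation degree $N$ below are our own choices:

```python
import math

# Check of the divisor-sum identity in the proof of Lemma 1.11.6.
C, N = 1.5, 24
# Taylor coefficients of sum_{j>=1} (e^{C z^j / j} - 1 - C z^j / j):
coef = [0.0] * (N + 1)
for j in range(1, N + 1):
    for d in range(2, N // j + 1):
        coef[j * d] += (C / j) ** d / math.factorial(d)

for n in range(2, N + 1):
    dsum = sum(C ** d / ((n / d) ** d * math.factorial(d))
               for d in range(2, n + 1) if n % d == 0)
    assert abs(coef[n] - dsum) < 1e-12
```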
1.11.3 Theorem. Let $f(\sigma)$ be a complex-valued multiplicative function such that $|f_j(s)|\le 1$ for $j,s\ge 1$. Denote
\[
r_n=\frac1n\sum_{j\le n}|f_j(1)-1|^2.
\]
Then
\[
M_n(f)-\prod_{j\le n}e^{-1/j}\bigg(1+\sum_{s=1}^{\infty}\frac{f_j(s)}{j^ss!}\bigg)\ \ll\ \sqrt{r_n}+\frac1n. \tag{1.56}
\]
The constant implied in the symbol $\ll$ is absolute.
Proof. In the previous notation, the generating series $M(z)$ defined in Lemma 1.11.2 can be written as $M(z)=m(z)H(z)$ with
\[
H(z)=\prod_{j\ge 1}e^{-b_jz^j/j}\bigg(1+\sum_{s=1}^{\infty}\frac{f_j(s)}{j^ss!}z^{sj}\bigg)
=:\prod_{j\ge 1}\big(1+\chi_j(z^j/j)\big)
\]
and $b_j:=f_j(1)$. Here
\[
\chi_j(z)=\sum_{n\ge 2}\frac{z^n}{n!}\bigg((-b_j)^n-n(-b_j)^n
+n!\sum_{\substack{r+s=n\\ r\ge 0,\ s\ge 2}}\frac{(-b_j)^rf_j(s)}{r!s!}\bigg)
=:\sum_{n\ge 2}a_{jn}z^n
\]
are entire functions. Moreover,
\[
|a_{jn}|\le\frac{1}{n!}\bigg(1+n+\sum_{s=0}^{n}\binom ns\bigg)\le\frac{4^n}{n!}.
\]
By Lemma 1.11.6, this implies condition (1.55). Theorem 1.11.2 now yields the approximation
\[
M_n(f)-H(1)\exp\bigg\{\sum_{j\le n}\frac{b_j-1}{j}\bigg\}\ \ll\ \frac1n+\sqrt{r_n}.
\]
Since the values $f_j(s)$ for $j>n$ and $s\ge 1$ have no influence on $M_n(f)$, we may assume that they all equal $1$. The last formula then coincides with (1.56).
The theorem is proved.
Exercise. Find a proof of the classical Tauber theorem based on Lemma 1.11.4.
1.12 The three series theorem
Given a real-valued additive function $h(\sigma)$, its distribution can be treated by the use of the Fourier transform
\[
\int_{\mathbb R}e^{itx}\,d\nu_n(h(\sigma)<x)=\frac{1}{n!}\sum_{\sigma\in S_n}e^{ith(\sigma)}=:\varphi_n(t),\qquad t\in\mathbb R.
\]
This leads to a problem on the asymptotic mean values of multiplicative functions as $n\to\infty$. We can now apply Theorem 1.11.1.
Recall that $u^*=\min\{|u|,1\}\,\mathrm{sgn}\,u$. For simplicity, examine the completely additive function
\[
h(\sigma)=\sum_{j=1}^{n}a_jk_j(\sigma).
\]
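For small $n$, the identification behind this approach is easy to test: since $f(\sigma)=e^{ith(\sigma)}$ is completely multiplicative with $b_j=e^{ita_j}$, the mean value $\varphi_n(t)$ equals the $n$-th Taylor coefficient $m_n$ of $\exp\{\sum_jb_jz^j/j\}$. A brute-force sketch (the weights $a_j$ and the value of $t$ are our own illustration):

```python
import cmath
import math
from itertools import permutations

# Check: phi_n(t) = (1/n!) sum e^{i t h(sigma)} equals the n-th Taylor
# coefficient of exp{sum_j b_j z^j / j} with b_j = e^{i t a_j}.
n, t = 6, 0.7
a = [1.0 / j for j in range(1, n + 1)]    # illustrative weights a_j

def cycle_counts(p):
    """Return [k_0, k_1, ..., k_n]: k_L = number of cycles of length L."""
    seen, counts = set(), [0] * (len(p) + 1)
    for i in range(len(p)):
        if i not in seen:
            length, v = 0, i
            while v not in seen:
                seen.add(v)
                v = p[v]
                length += 1
            counts[length] += 1
    return counts

phi = sum(cmath.exp(1j * t * sum(a[j - 1] * k[j] for j in range(1, n + 1)))
          for k in map(cycle_counts, permutations(range(n))))
phi /= math.factorial(n)

b = [cmath.exp(1j * t * aj) for aj in a]
m = [1.0 + 0j]                             # n*m_n = sum_{j<=n} b_j m_{n-j}
for k in range(1, n + 1):
    m.append(sum(b[j - 1] * m[k - j] for j in range(1, k + 1)) / k)
assert abs(phi - m[n]) < 1e-9
```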
1.12.1 Theorem. The distribution functions $\nu_n(h(\sigma)<x)$ weakly converge to a limit distribution if and only if the series
\[
\sum_{j=1}^{\infty}\frac{a_j^{*2}}{j} \tag{1.57}
\]
and
\[
\sum_{j=1}^{\infty}\frac{a_j^{*}}{j} \tag{1.58}
\]
converge. Under these conditions, the characteristic function of the limit law is
\[
\varphi(t)=\exp\bigg\{\sum_{j=1}^{\infty}\frac{e^{ita_j}-1}{j}\bigg\}.
\]
Proof. Define a completely multiplicative function $f(\sigma)$ via $b_j=e^{ita_j}$, where $j\ge 1$ and $t\in\mathbb R$.
Sufficiency. We have to prove that, under conditions (1.57) and (1.58), the characteristic functions
\[
\varphi_n(t):=\frac{1}{n!}\sum_{\sigma\in S_n}e^{ith(\sigma)}
\]
converge uniformly in $|t|\le T$ for every $T>0$ as $n\to\infty$. Nevertheless, we will prove even more. Namely, assuming only (1.57), we will derive that
\[
\hat\varphi_n(t):=\exp\{-itA_n\}\varphi_n(t)=\hat\varphi(t)+o(1) \tag{1.59}
\]
with the same uniformity, where
\[
A_n=\sum_{j=1}^{n}\frac{a_j^{*}}{j}.
\]
Afterwards we will use the inequalities $|1-z|^2\le 2(1-\Re z)$ if $|z|\le 1$ and
\[
1-\cos\alpha t\le 2\cdot\mathbf 1\{|\alpha|\ge 1\}+|\alpha t|^2\cdot\mathbf 1\{|\alpha|<1\},
\qquad \alpha,t\in\mathbb R.
\]
If (1.57) holds, then
\[
\sum_{j\le n}\frac{1-\cos ta_j}{j}\le 2\sum_{|a_j|\ge 1}\frac1j+T^2\sum_{|a_j|<1}\frac{a_j^2}{j}\le C(T)<\infty
\]
uniformly in $|t|\le T$.
uniformly in |t| ≤ T . Moreover,
1
n
∑j≤n
|1− eitaj |2 ≤ 4 log n
n+ 2
∑logn<j≤n
1− cos tajj
= o(1)
as n→∞ in the same region. Further, from Theorem 1.11.1, we obtain∣∣∣ϕn(t)−exp−itA(n)+
∑j≤n
eitaj − 1
j
∣∣∣ ≤ 12( 1
n
∑j≤n
(1−cos taj))1/2
= o(1) . (1.60)
The real part of the sum in the exponent on the left-hand side converges uniformly in $|t|\le T$. The remaining sums can be rewritten as
\[
i\sum_{\substack{j\le n\\ |a_j|<1}}\frac{\sin ta_j-ta_j}{j}
+i\sum_{\substack{j\le n\\ |a_j|\ge 1}}\frac{\sin ta_j-t\,\mathrm{sgn}\,a_j}{j}.
\]
By the inequality $|\sin x-x|\ll|x|^3$, $x\in\mathbb R$, and (1.57), either of the sums converges as $n\to\infty$ with the desired uniformity (for the second sum it suffices that $\sum_{|a_j|\ge 1}1/j<\infty$). The proof of (1.59) is completed. Estimate (1.60) also shows that
\[
\hat\varphi(t)=\exp\bigg\{\sum_{j=1}^{\infty}\frac{e^{ita_j}-1-ita_j^{*}}{j}\bigg\},\qquad t\in\mathbb R.
\]
If both series in Theorem 1.12.1 converge, the same argument applies to prove the sufficiency part of the theorem, and it gives the claimed expression for $\varphi(t)$ as well.
Necessity. For each $T>0$, if $|t|\le T$, we have
\[
\varphi_m(t)=\varphi(t)+o(1) \tag{1.61}
\]
as $m\to\infty$. Here $\varphi(t)$ is a characteristic function. There exists a $T>0$ such that $|\varphi(t)|\ge 1/2$ for $t\in[-T,T]$. We will use (1.61) for $m=n,n-1,\dots,n-K$, where $K\ge 1$ is an arbitrary fixed natural number.
We now take a rather unexpected step by applying Theorem 1.8.3. If $y(\sigma)=f(\sigma)=e^{ith(\sigma)}$ and $b_j=e^{ita_j}$, inequality (1.30) yields
\[
\sum_{j\le K}j\bigg|\sum_{\sigma\in S_n}f(\sigma)\Big(k_j(\sigma)-\frac1j\Big)\bigg|^2\le 2(n!)^2 \tag{1.62}
\]
for every $1\le K\le n-1$. Calculating the sums, we first reckon the summands with $k_j(\sigma)=1$, where $j\le K$. We obtain
\[
\frac{1}{n!}\sum_{\substack{\sigma\in S_n\\ k_j(\sigma)=1}}f(\sigma)
=\sum_{\substack{\ell(\bar s)=n\\ s_j=1}}\prod_{i=1}^{n}\Big(\frac{b_i}{i}\Big)^{s_i}\frac{1}{s_i!}
=\frac{b_j}{j}\sum_{\substack{\ell(\bar s)=n-j\\ s_j=0}}\prod_{i=1}^{n}\Big(\frac{b_i}{i}\Big)^{s_i}\frac{1}{s_i!}.
\]
Observe that, if $m<j\le n$, then $\ell(\bar s)=m$ implies $s_j=0$. Hence
\[
\frac{1}{n!}\sum_{\substack{\sigma\in S_n\\ k_j(\sigma)=1}}f(\sigma)
=\frac{b_j}{j}\sum_{\substack{1s_1+\cdots+(n-j)s_{n-j}=n-j\\ s_j=0}}
\prod_{i=1}^{n-j}\Big(\frac{b_i}{i}\Big)^{s_i}\frac{1}{s_i!}
=\frac{b_j}{j(n-j)!}\sum_{\substack{\sigma\in S_{n-j}\\ k_j(\sigma)=0}}f(\sigma)
=\frac{b_j}{j}E_{n-j}f(\sigma)+R_1;
\]
here
\[
|R_1|=\bigg|\frac{-b_j}{j(n-j)!}\sum_{\substack{\sigma\in S_{n-j}\\ k_j(\sigma)\ge 1}}f(\sigma)\bigg|
\le\frac{1}{j(n-j)!}\sum_{\substack{\sigma\in S_{n-j}\\ k_j(\sigma)\ge 1}}1
=\frac1j\sum_{1\le m\le(n-j)/j}\nu_{n-j}(k_j(\sigma)=m).
\]
Theorem 1.2.1 now yields
\[
|R_1|\le\frac1j\sum_{1\le m\le(n-j)/j}\frac{1}{j^mm!}\sum_{0\le s\le((n-j)/j)-m}\frac{(-1)^s}{j^ss!}
\le\frac{e^{-1/j}}{j}\sum_{m\ge 1}\frac{1}{j^mm!}
\le\frac1j\big(1-e^{-1/j}\big)\le\frac{1}{j^2}.
\]
Similarly,
\[
|R_2|:=\frac{1}{n!}\bigg|\sum_{\substack{\sigma\in S_n\\ k_j(\sigma)\ge 2}}f(\sigma)k_j(\sigma)\bigg|
\le\sum_{2\le m\le n/j}m\,\nu_n(k_j(\sigma)=m)
=\sum_{2\le m\le n/j}\frac{1}{j^m(m-1)!}\sum_{0\le s\le(n/j)-m}\frac{(-1)^s}{j^ss!}
\le\frac{e^{-1/j}}{j}\sum_{r\ge 1}\frac{1}{j^rr!}\le\frac{1}{j^2}.
\]
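The exact formula of Theorem 1.2.1 for $\nu_n(k_j(\sigma)=m)$, used twice above, can be confirmed by enumeration for small $n$ (our own check; the values of $n$ and $j$ are illustrative):

```python
import math
from itertools import permutations

# Enumeration check of the formula from Theorem 1.2.1:
# nu_n(k_j = m) = (1/(j^m m!)) * sum_{0<=s<=n/j-m} (-1)^s / (j^s s!).
def count_cycles_of_length(p, j):
    seen, c = set(), 0
    for i in range(len(p)):
        if i not in seen:
            length, v = 0, i
            while v not in seen:
                seen.add(v)
                v = p[v]
                length += 1
            c += (length == j)
    return c

n = 6
for j in (1, 2, 3):
    for m in range(n // j + 1):
        freq = sum(count_cycles_of_length(p, j) == m
                   for p in permutations(range(n))) / math.factorial(n)
        formula = sum((-1) ** s / (j ** s * math.factorial(s))
                      for s in range(n // j - m + 1))
        formula /= j ** m * math.factorial(m)
        assert abs(freq - formula) < 1e-12
```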
In the notation just introduced, inequality (1.62) can be rewritten as
\[
\sum_{j\le K}j\bigg|\frac{b_j}{j}E_{n-j}f(\sigma)+R_1+R_2-\frac1jE_nf(\sigma)\bigg|^2\le 2.
\]
Hence, using $|a+b|^2\le 2|a|^2+2|b|^2$, where $a,b\in\mathbb C$, we obtain
\[
\sum_{j\le K}j\Big|\frac{b_j}{j}E_{n-j}f(\sigma)-\frac1jE_nf(\sigma)\Big|^2
\le 4+2\sum_{j\le K}j\big(|R_1|+|R_2|\big)^2
\le 4+8\sum_{j=1}^{\infty}\frac{1}{j^3}\le C<\infty.
\]
It follows from (1.61) that $E_mf(\sigma)=\varphi(t)+o(1)$ uniformly in $|t|\le T$ and $n-K\le m\le n$. Consequently,
\[
\sum_{j\le K}j\bigg|\frac{b_j}{j}\varphi(t)-\frac1j\varphi(t)+o(1)\bigg|^2\le C.
\]
Since $|\varphi(t)|\ge 1/2$, as $n\to\infty$, we obtain
\[
\sum_{j\le K}\frac{|e^{ita_j}-1|^2}{j}\le 4C
\]
for each $K\ge 1$. In other words, the series
\[
\sum_{j=1}^{\infty}\frac{1-\cos ta_j}{j} \tag{1.63}
\]
converges uniformly in $|t|\le T$. We now claim that this implies the convergence of series (1.57). Indeed, applying $1-\cos x\ge 2\pi^{-2}x^2$, where $|x|\le\pi/2$, we obtain the convergence of
\[
\sum_{|a_j|\le\pi/(2T)}\frac{a_j^2}{j}. \tag{1.64}
\]
We may assume here that $\pi/(2T)\ge 2$. Integrating the part of series (1.63) with $|a_j|\ge\pi/(2T)$ over the interval $[0,T]$ and dividing by $T$, we derive the inequality
\[
\sum_{|a_j|\ge\pi/(2T)}\frac1j\Big(1-\frac{\sin Ta_j}{Ta_j}\Big)\le 4C,
\]
which implies
\[
\sum_{|a_j|\ge\pi/(2T)}\frac1j\le 4C\Big(1-\frac2\pi\Big)^{-1}=:C_1<\infty.
\]
Hence and from (1.64) we obtain
\[
\sum_{|a_j|\ge 1}\frac1j\le C_1+\sum_{|a_j|\le\pi/(2T)}\frac{a_j^2}{j}<\infty.
\]
Now, using (1.64), we have the convergence of (1.57). By the sufficiency part, this leads to
\[
\nu_n(h(\sigma)-A_n<x)\Rightarrow G(x),
\]
where $G(x)$ is a distribution function with the characteristic function $\hat\varphi(t)$. Now the relation
\[
\exp\{itA_n\}=\varphi(t)/\hat\varphi(t)+o(1)
\]
as $n\to\infty$ gives (1.58). The theorem is proved.