Divisibility - King's College London


PRIME NUMBERS

YANKI LEKILI

We denote by $\mathbb{N}$ the set of natural numbers: $1, 2, \ldots$ These are constructed using the Peano axioms. We will not get into the philosophical questions related to this and simply assume the usual properties of natural numbers: There is an addition and a multiplication law on numbers. These satisfy the commutative, associative and distributive laws. There is an order on $\mathbb{N}$ so that either $a < b$ or $b < a$ for distinct natural numbers. Furthermore, every non-empty subset of $\mathbb{N}$ has a smallest element, i.e. the order on $\mathbb{N}$ is a well-ordering. Finally, we shall appeal to the principle of mathematical induction.

We write $\mathbb{Z}$ for all integers: $\{\ldots, -2, -1, 0, 1, 2, \ldots\}$ and $\mathbb{Z}_{\geq 0}$ for non-negative integers.

1. Divisibility

Definition 1. An integer $a$ is divisible by $b$ if there is a third integer $c$ such that

$$a = bc.$$

We write $b \mid a$ if $b$ divides $a$ and $b \nmid a$ if $b$ does not divide $a$.

The relation $\mid$ is reflexive, $a \mid a$; transitive, $b \mid a$ and $c \mid b$ implies $c \mid a$; but not symmetric: if $b \mid a$ then it is not usually the case that $a \mid b$. In fact, if $b$ and $a$ are positive integers and $b \mid a$, then we have $b \leq a$.

Definition 2. A positive integer $p$ is said to be a prime number (or simply a prime) if $p > 1$ and $p$ has no positive divisors except $1$ and $p$.

The first few prime numbers are: $2, 3, 5, 7, 11, 13, 17, 19, 23, \ldots$

The primes are the "building blocks" of numbers. The following theorem makes this precise:

Theorem 3. Every positive integer except 1 is a product of primes.

Proof. Let $n \in \mathbb{N}$ with $n > 1$. Either $n$ is prime, in which case there is nothing to prove, or $n$ has divisors strictly between $1$ and $n$. Let $S$ be the set of divisors of $n$ greater than $1$. Then this set has a smallest element $m$. We claim that $m$ is a prime. Otherwise, there would be a natural number $l$ with $1 < l < m$ such that $l \mid m$; since $m \mid n$, transitivity gives $l \mid n$. We have obtained an element of $S$, namely $l$, that is smaller than the smallest element of $S$. This is a contradiction. Hence, $m$ must be a prime. Therefore, $n$ is either prime or divisible by a prime less than $n$, say $p_1$, in which case

$$n = p_1 n_1, \quad 1 < n_1 < n.$$

Here $n_1$ is either a prime, in which case the proof is complete, or it is divisible by a prime $p_2$ less than $n_1$, in which case we have

$$n = p_1 n_1 = p_1 p_2 n_2, \quad 1 < n_2 < n_1 < n.$$

Repeating the argument, we obtain a decreasing sequence of natural numbers $n > n_1 > n_2 > \cdots$, which must stop after finitely many steps. It stops when $n_{k-1}$ is prime for some $k$; writing $p_k = n_{k-1}$, we then have

$$n = p_1 p_2 \cdots p_k. \qquad \square$$

Note that the $p_i$ in the above proof do not have to be distinct. We can group them together and write

$$n = p_1^{e_1} p_2^{e_2} \cdots p_s^{e_s}$$

to get the prime factorisation of the integer $n$. For example, we have

$$666 = 2 \cdot 3^2 \cdot 37.$$

We will see later that the factors $p_i^{e_i}$ are unique apart from rearrangement of factors. But first we need to develop our understanding of division a little more.
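The repeated splitting-off of a smallest prime divisor in the proof of Theorem 3 can be tried out on a computer. The following is a minimal Python sketch, not part of the original notes; the function name `prime_factorisation` is an illustrative choice, and the method (trial division) is only practical for small inputs.

```python
def prime_factorisation(n: int) -> dict:
    """Return {prime: exponent} for an integer n > 1, by trial division."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    factors = {}
    p = 2
    while p * p <= n:
        # Any p that divides n at this point is prime: all smaller
        # prime factors have already been divided out of n.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # What is left has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factorisation(666))  # {2: 1, 3: 2, 37: 1}
```

The printed dictionary corresponds to $666 = 2 \cdot 3^2 \cdot 37$.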

Lemma 4. (Division Algorithm) Given $a \in \mathbb{Z}$ and $b > 0$, there exist unique $q, r \in \mathbb{Z}$ such that $a = qb + r$ and $0 \leq r < b$.

Proof. Consider the arithmetic progression

$$\ldots, a - 3b, a - 2b, a - b, a, a + b, a + 2b, a + 3b, \ldots$$

extending indefinitely in both directions. Let $S$ be the set of nonnegative elements in this list. $S$ is non-empty: if $a$ is nonnegative then $a \in S$, and if $a$ is negative then $a - ab = a(1 - b) \geq 0$, hence $a - ab \in S$. Since $S$ is non-empty, it has a smallest element $r$. Thus, by definition, we have $r = a - qb$ for some $q$ and $r \geq 0$. Also $r < b$, because otherwise $r - b = a - (q + 1)b$ would be an element of $S$ that is smaller than $r$.

Next, to prove uniqueness, let $a = q_1 b + r_1 = q_2 b + r_2$ with both pairs satisfying the same conditions. Then $(q_1 - q_2) b = r_2 - r_1$. Taking absolute values, we get $|q_1 - q_2| b = |r_2 - r_1|$, hence $b \mid |r_2 - r_1|$. But $0 \leq r_1, r_2 < b$, hence $|r_2 - r_1| < b$. Hence, it must be that $|r_2 - r_1| = 0$ and $|q_2 - q_1| = 0$. In other words, $r_1 = r_2$ and $q_1 = q_2$. $\square$
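As a small illustration of Lemma 4, here is a Python sketch (the name `division_algorithm` is an assumption made for this example). It relies on the fact that Python's floor division rounds towards minus infinity, so for $b > 0$ the remainder lands in the range $0 \leq r < b$ even when $a$ is negative.

```python
def division_algorithm(a: int, b: int) -> tuple:
    """Return the unique (q, r) with a == q*b + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q = a // b      # floor division rounds towards minus infinity
    r = a - q * b   # r is the smallest non-negative element of {a - k*b}
    return q, r


print(division_algorithm(17, 5))   # (3, 2)
print(division_algorithm(-17, 5))  # (-4, 3)
```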

Definition 5. Let $a, b \in \mathbb{N}$. The greatest common divisor of $a$ and $b$ is the greatest number $d \in \mathbb{N}$ such that $d \mid a$ and $d \mid b$. We write $(a, b)$ (or $\gcd(a, b)$) for the greatest common divisor of $a$ and $b$. The numbers $a$ and $b$ are said to be coprime (or relatively prime) if $(a, b) = 1$.

For example, by listing all divisors of $12$ and all divisors of $8$, one can easily compute $(12, 8) = 4$, but soon we will do this in a much more efficient way.

Theorem 6. If $d = (a, b)$ then there exist integers $x_0$ and $y_0$ such that

$$d = (a, b) = a x_0 + b y_0.$$

Another way to state this fundamental result is that the greatest common divisor of $a$ and $b$ is a $\mathbb{Z}$-linear combination of $a$ and $b$.

Proof. Consider the set $S$ of all natural numbers of the form $ax + by$ with $x$ and $y$ in $\mathbb{Z}$. The set is non-empty, for example it contains $a$ and $b$. Hence, $S$ has a smallest element $m$. So $m$ is a natural number of the form $m = a x_0 + b y_0$ for some integers $x_0$ and $y_0$. Every common divisor of $a$ and $b$ divides $m$, hence in particular $d \mid m$. To conclude that $d = m$, we shall show that $m \mid a$ and $m \mid b$. Using the division algorithm, write $a = qm + r$ with $0 \leq r < m$. Let $x' = 1 - q x_0$ and $y' = -q y_0$. Then, we have

$$a x' + b y' = a - q a x_0 - q b y_0 = a - qm = r.$$

If $r$ were positive, it would be an element of $S$ smaller than $m$. Hence, by the minimality property of $m$, it follows that $r = 0$. This shows that $m \mid a$; similarly we show that $m \mid b$. Thus $m$ is a common divisor of $a$ and $b$, so $m \leq d$, and together with $d \mid m$ this gives $d = m$. $\square$

Note that the integers $x_0, y_0$ are not uniquely determined. Indeed, given one solution $(x_0, y_0)$ to $d = a x_0 + b y_0$, we can obtain infinitely many other solutions as:

$$d = a(x_0 + nb) + b(y_0 - na) \quad \text{for } n \in \mathbb{Z}.$$

The previous theorem gives a characterization of the greatest common divisor of $a$ and $b$. Namely, it is the least positive integer value of $ax + by$ where $x$ and $y$ range over all integers. But how do we compute this value (and the integers $x_0, y_0$)?

We shall use Euclid’s algorithm. The crucial observation is the following lemma:

Lemma 7. If $a = qb + r$ then $(a, b) = (b, r)$.

Proof. If $d$ is a common divisor of $a$ and $b$, then $d \mid a - qb = r$, hence $d$ is a common divisor of $b$ and $r$. Conversely, if $d$ is a common divisor of $b$ and $r$, then $d \mid qb + r = a$, hence $d$ is a common divisor of $a$ and $b$. Therefore, the set of common divisors of $a$ and $b$ agrees with the set of common divisors of $b$ and $r$, hence the greatest common divisors are the same. $\square$

Now, given $a, b \in \mathbb{N}$, Euclid's algorithm works as follows to determine $(a, b)$. Without loss of generality, suppose $b < a$. We apply the division algorithm to write $a = q_1 b + r_1$ with $0 \leq r_1 < b$. If $r_1 \neq 0$, then we apply the division algorithm to $b$ and $r_1$ to write $b = q_2 r_1 + r_2$ with $0 \leq r_2 < r_1$. We repeat this until we find a remainder which is zero. (This must happen at some finite step, since $b > r_1 > r_2 > \cdots \geq 0$.) Thus, we have a system of equations:

$$\begin{aligned}
a &= q_1 b + r_1, & 0 < r_1 < b \\
b &= q_2 r_1 + r_2, & 0 < r_2 < r_1 \\
r_1 &= q_3 r_2 + r_3, & 0 < r_3 < r_2 \\
&\;\;\vdots \\
r_{n-3} &= q_{n-1} r_{n-2} + r_{n-1}, & 0 < r_{n-1} < r_{n-2} \\
r_{n-2} &= q_n r_{n-1} + r_n, & 0 < r_n < r_{n-1} \\
r_{n-1} &= q_{n+1} r_n + 0.
\end{aligned}$$

We apply the above lemma repeatedly to deduce

$$(a, b) = (b, r_1) = (r_1, r_2) = \cdots = (r_{n-2}, r_{n-1}) = (r_{n-1}, r_n) = (r_n, 0) = r_n.$$

Thus, we proved:

Theorem 8. The last non-zero remainder $r_n$ of Euclid's algorithm is the greatest common divisor of $a$ and $b$.
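Euclid's algorithm translates almost line by line into code. The Python sketch below is an illustration rather than part of the notes (the name `euclid_gcd` is made up here); each pass of the loop is one application of Lemma 7.

```python
def euclid_gcd(a: int, b: int) -> int:
    """Greatest common divisor of natural numbers a and b via Euclid's algorithm."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be natural numbers")
    while b != 0:
        # Lemma 7: (a, b) = (b, r) where a = q*b + r.
        a, b = b, a % b
    return a  # the last non-zero remainder


print(euclid_gcd(12, 8))    # 4
print(euclid_gcd(187, 35))  # 1
```

Note that the code does not need the assumption $b < a$: if $a < b$, the first step simply swaps the two numbers.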

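Here is an example. Let us compute $(187, 35)$. We have

$$\begin{aligned}
187 &= 5 \cdot 35 + 12 \\
35 &= 2 \cdot 12 + 11 \\
12 &= 1 \cdot 11 + 1 \\
11 &= 11 \cdot 1
\end{aligned}$$

Thus, we see that $(187, 35) = 1$. Thus, we should be able to find integers $x_0$ and $y_0$ such that

$$187 x_0 + 35 y_0 = 1.$$

Euclid's algorithm also gives a way to do this. Namely, we have

$$\begin{aligned}
1 &= 12 - 1 \cdot 11 \\
1 &= 12 - 1 \cdot (35 - 2 \cdot 12) \\
1 &= 187 - 5 \cdot 35 - 1 \cdot (35 - 2 \cdot (187 - 5 \cdot 35)) = 3 \cdot 187 - 16 \cdot 35
\end{aligned}$$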
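The back-substitution above can be automated. The following Python sketch is one common way to organise it (the so-called extended Euclidean algorithm); the function name `extended_gcd` and the bookkeeping variables are choices made for this illustration, not notation from the notes. Instead of substituting backwards at the end, it carries coefficients forwards through the divisions.

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (d, x, y) with d = gcd(a, b) and d == a*x + b*y."""
    # Invariant: old_r == a*old_x + b*old_y and r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


print(extended_gcd(187, 35))  # (1, 3, -16)
```

For $(187, 35)$ this reproduces the identity $1 = 3 \cdot 187 - 16 \cdot 35$ found by hand.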

Indeed, one can use Euclid's algorithm to give another proof of Theorem 6.

We will now return to factorisations of natural numbers into primes. We start with the following:

Theorem 9. (Euclid's first theorem) Let $p$ be a prime number and let $a_1, a_2 \in \mathbb{N}$. If $p \mid a_1 a_2$ then $p \mid a_1$ or $p \mid a_2$. More generally, if a prime $p$ divides a product $a_1 a_2 \cdots a_n$, then $p$ divides $a_i$ for some $i$.

Proof. The case of a product with $n$ factors follows easily, by induction, from the case with two factors. Suppose $p$ is a prime and $p \mid a_1 a_2$. If $p \nmid a_1$ then $(a_1, p) = 1$ and therefore, by Theorem 6, there are integers $x_0$ and $y_0$ for which

$$x_0 a_1 + y_0 p = 1.$$

Multiplying this by $a_2$ gives

$$x_0 a_1 a_2 + y_0 p a_2 = a_2.$$

Now, $p \mid x_0 a_1 a_2$ and $p \mid y_0 p a_2$, hence $p \mid a_2$. $\square$

Let's use this to prove a theorem due to Pythagoras.

Theorem 10. $\sqrt{2}$ is irrational.

Proof. If $\sqrt{2}$ is rational, we can write it as $\sqrt{2} = \frac{a}{b}$ for natural numbers $a, b$ such that $(a, b) = 1$. Then, $a$ and $b$ satisfy the equation:

$$a^2 = 2 b^2.$$

Hence, $b \mid a^2$. Therefore, $p \mid a^2$ for any prime factor $p$ of $b$. It follows from Theorem 9 that $p \mid a$. But this is contrary to the assumption that $(a, b) = 1$. Hence $b$ has no prime factor, that is, $b = 1$, and this also is clearly false, since $2$ is not the square of an integer. $\square$

We now come to one of the main tools of elementary number theory:

Theorem 11. (Fundamental theorem of arithmetic) Every positive integer $a > 1$ has a factorisation into prime factors as $a = p_1^{e_1} p_2^{e_2} \cdots p_s^{e_s}$, and apart from rearrangement of factors, this factorisation is unique.

Proof. We have already seen the existence of a factorisation in Theorem 3. Now, we show uniqueness.

Suppose that $a = p_1 \cdots p_k = q_1 \cdots q_j$ are two prime factorisations of $a$. Then, by Theorem 9, $p_1 \mid q_{i_1}$ for some $i_1$. Since $q_{i_1}$ is a prime, this implies that $p_1 = q_{i_1}$. We can then divide out $p_1$ and $q_{i_1}$ from both sides to get two prime factorisations $a / p_1 = p_2 \cdots p_k = q_1 \cdots q_{i_1 - 1} q_{i_1 + 1} \cdots q_j$. We can then match $p_2$ with $q_{i_2}$ for some $i_2$ by the same argument. Continuing this way, we get that for all $s$, $p_s = q_{i_s}$ for some $i_s$. After cancelling out all of $p_1, p_2, \ldots, p_k$, the remaining product must equal $1$. Hence, there are no remaining factors on the right hand side either. Hence $k = j$ and the matching $p_1 = q_{i_1}, p_2 = q_{i_2}, \ldots, p_k = q_{i_k}$ shows that the two factorisations differ by only a rearrangement of factors. $\square$

It is now clear why $1$ should not be counted as a prime. If it were, then Theorem 11 would be false, since we could insert any number of unit factors.

If we know the prime factorisation of a positive integer $a$ then we can immediately write down all positive divisors: if $a = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$ then $b \mid a$ if and only if $b$ has a prime factorisation of the form $b = p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k}$ with $0 \leq f_i \leq e_i$ for all $i$. This observation gives the following lemma:

Lemma 12. If $m, n \in \mathbb{N}$ are coprime then every natural number $d$ with $d \mid mn$ can be written uniquely as $d = d_1 d_2$ where $d_1, d_2 \in \mathbb{N}$ and $d_1 \mid m$ and $d_2 \mid n$.

Proof. Since $m$ and $n$ are coprime, they don't have any prime factors in common. So, $m = p_1^{k_1} \cdots p_r^{k_r}$ and $n = p_{r+1}^{k_{r+1}} \cdots p_{r+s}^{k_{r+s}}$ where all the $p_i$ are distinct. If $d \mid mn$ then $d = p_1^{l_1} \cdots p_{r+s}^{l_{r+s}}$ with $0 \leq l_i \leq k_i$ for $1 \leq i \leq r + s$. Let $d_1 = p_1^{l_1} \cdots p_r^{l_r}$ and $d_2 = p_{r+1}^{l_{r+1}} \cdots p_{r+s}^{l_{r+s}}$. Then, obviously $d = d_1 d_2$, $d_1 \mid m$ and $d_2 \mid n$.

Conversely, if $d = d_1' d_2'$, $d_1' \mid m$ and $d_2' \mid n$ then we must have $d_1' = p_1^{l_1'} \cdots p_r^{l_r'}$ and $d_2' = p_{r+1}^{l_{r+1}'} \cdots p_{r+s}^{l_{r+s}'}$. From $d = d_1' d_2'$ it follows, by the uniqueness of prime factorisation, that $l_i = l_i'$ for $1 \leq i \leq r + s$ and therefore $d_1 = d_1'$ and $d_2 = d_2'$. This shows uniqueness. $\square$

We will also need the following lemma.

Lemma 13. Let $p_1, p_2, \ldots, p_r$ be distinct prime numbers and let $n$ be any integer. If $p_i \mid n$ for all $i$ then $p_1 p_2 \cdots p_r \mid n$.

Proof. For every $r \geq 1$, we must show the following statement: If $p_1, p_2, \ldots, p_r$ are distinct primes, $n$ is any integer and $p_i \mid n$ for $i = 1, 2, \ldots, r$, then $p_1 p_2 \cdots p_r \mid n$. To do this, we use induction on $r$. The case $r = 1$ is obviously true. Now assume $r \geq 2$ and that we have shown the statement for $r - 1$. Let $p_1, \ldots, p_r$ be distinct primes and $n$ an integer such that $p_i \mid n$ for all $i$. By the induction hypothesis, we get that $p_1 p_2 \cdots p_{r-1} \mid n$. So, we can write $n = p_1 p_2 \cdots p_{r-1} m$ for some $m \in \mathbb{Z}$. Now, since $p_r \mid n$ and $p_r \nmid p_i$ for $1 \leq i \leq r - 1$, it follows, by Theorem 9, that $p_r \mid m$. Hence, it follows that $p_1 p_2 \cdots p_r \mid p_1 p_2 \cdots p_{r-1} m = n$ as desired. $\square$

Exercise: Give an alternative proof of Lemma 13 using FTA.

Finally, let us discuss factorials and their prime factorisations. Recall that given a natural number $N$, we have the notation:

$$N! := \prod_{i=1}^{N} i = 1 \cdot 2 \cdots (N - 1) \cdot N$$

This number is equal to the number of permutations of a set of $N$ elements.

Let us observe that if a prime $p$ divides $N!$, then $p$ must divide one of the numbers $1, 2, \ldots, N$ and therefore $p \leq N$. On the other hand, every prime number $p \leq N$ is a prime factor of $N!$. So, to find the prime factorisation of $N!$, we need to determine the exponent of each prime $p \leq N$ which divides $N!$. Let us write this first as:

$$N! = \prod_{p \leq N} p^{e_p}$$

where the product is over all primes $p \leq N$ and the $e_p$ are non-negative integers.

To state the next result, we find it convenient to introduce the following notation:

Definition 14. For any real number $x \in \mathbb{R}$, we denote by $[x]$ the largest integer $\leq x$, that is, the unique integer such that $x - 1 < [x] \leq x$. This function is called the integral part of $x$.

Lemma 15. $N! = \prod_{p \leq N} p^{e_p}$ where $e_p = \sum_{m=1}^{\infty} \left[ \frac{N}{p^m} \right]$.

Note that the sum $\sum_{m=1}^{\infty} \left[ \frac{N}{p^m} \right]$ has only finitely many non-zero terms because if $p^m > N$, then $\left[ \frac{N}{p^m} \right] = 0$.

Proof. Consider a prime number $p \leq N$. We must count how often $p$ appears in the product $N! = 1 \cdot 2 \cdots N$. Clearly, $\left[ \frac{N}{p} \right]$ of the factors $1, 2, \ldots, N$ are multiples of $p$; $\left[ \frac{N}{p^2} \right]$ factors are multiples of $p^2$; etc. Hence, in $e_p = \left[ \frac{N}{p} \right] + \left[ \frac{N}{p^2} \right] + \ldots$ we have counted once the number of factors which are divisible by $p$ but not $p^2$ (as part of $\left[ \frac{N}{p} \right]$), we have counted twice the number of factors which are divisible by $p^2$ but not $p^3$ (as part of $\left[ \frac{N}{p} \right] + \left[ \frac{N}{p^2} \right]$), etc. This completes the proof. $\square$

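Note that it follows easily from the above that

$$e_p \leq \left[ \frac{N}{p - 1} \right]$$

for the sum satisfies

$$\sum_{m=1}^{\infty} \left[ \frac{N}{p^m} \right] < \sum_{m=1}^{\infty} \frac{N}{p^m} = N \left( \frac{1}{p} + \frac{1}{p^2} + \ldots \right) = \frac{N}{p - 1}.$$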

As an example, let us find the largest integer $k$ such that $7^k \mid 50!$. We can compute this as:

$$k = \left[ \frac{50}{7} \right] + \left[ \frac{50}{7^2} \right] = 7 + 1 = 8.$$
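The formula of Lemma 15 is easy to evaluate mechanically. Here is a short Python sketch (the function name `factorial_prime_exponent` is an illustrative assumption); it stops as soon as $p^m > N$, since all later terms vanish.

```python
def factorial_prime_exponent(N: int, p: int) -> int:
    """Exponent of the prime p in N!, computed as the sum over m >= 1 of [N / p^m]."""
    exponent = 0
    power = p
    while power <= N:           # terms with p^m > N are zero
        exponent += N // power  # integer division gives the integral part
        power *= p
    return exponent


print(factorial_prime_exponent(50, 7))  # 8
```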

1.1. Computational problems.

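Lemma 16. A positive integer $n$ is composite if and only if $n$ has a prime divisor $p \leq \sqrt{n}$.

Proof. If $n$ is composite then $n = ab$ with $1 < a < n$ and $1 < b < n$. We can assume that $a \leq b$. Let $p$ be a prime factor of $a$. Then $p$ is also a prime factor of $n$ and $p \leq a = \sqrt{a \cdot a} \leq \sqrt{a \cdot b} = \sqrt{n}$. Conversely, if $n$ has a prime divisor $p \leq \sqrt{n}$, then $n \geq p^2 > p > 1$, so $p$ is a divisor of $n$ strictly between $1$ and $n$, and $n$ is composite. $\square$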

A primality test is an algorithm that determines whether an integer $n > 1$ is prime or composite. The above lemma gives the following test. For every prime $p \leq \sqrt{n}$, test whether $n$ is divisible by $p$ or not. If $p \mid n$ for some $p \leq \sqrt{n}$ then $n$ is composite; otherwise $n$ is prime.

The sieve of Eratosthenes is the method of computing the list of primes up to a number $n$ by using this algorithm. Write down all the integers from $2$ to $n$ and cross out the multiples of $2$ other than $2$ itself, then cross out the multiples of $3$ other than $3$ itself, and continue until we have crossed out the proper multiples of all primes $p \leq \sqrt{n}$. Then the remaining numbers are prime.

This method is useful for small numbers but it is clearly not very efficient for large numbers. In 2002, Agrawal, Kayal and Saxena developed the first polynomial time primality test (known as the AKS primality test). Polynomial time means that there exist constants $C, k$ such that for every integer $n > 1$ the algorithm needs at most $C \cdot (\log n)^k$ steps to decide whether $n$ is prime or not. Note that the AKS test determines whether $n$ is prime or not without finding a prime factor.
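The sieve of Eratosthenes and the primality test based on Lemma 16 can be sketched in Python as follows. This is only an illustration under the obvious assumptions (small inputs, function names chosen here); it is not the AKS test. Crossing out starts at $p^2$, because smaller multiples of $p$ have already been crossed out as multiples of a smaller prime.

```python
from math import isqrt


def sieve_of_eratosthenes(n: int) -> list:
    """Return the list of all primes p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            # Cross out the proper multiples of p, starting from p*p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [k for k in range(2, n + 1) if is_prime[k]]


def is_prime_trial_division(n: int) -> bool:
    """Primality test from Lemma 16: look for a prime divisor p <= sqrt(n)."""
    if n < 2:
        return False
    return all(n % p != 0 for p in sieve_of_eratosthenes(isqrt(n)))


print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```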

2. Basic distribution results

Recall that the fundamental theorem of arithmetic leads us to think of prime numbers as the "building blocks" of all numbers.

We begin with the famous result of Euclid that says that there are infinitely many primes:

Theorem 17. (Euclid’s second theorem) The number of primes is infinite.

We will give two proofs of this result. Here is Euclid’s own proof:

Proof. Let $2, 3, 5, \ldots, p$ be the list of primes up to $p$, and let

$$q = 2 \cdot 3 \cdot 5 \cdots p + 1.$$

Then $q$ is not divisible by any of the numbers $2, 3, 5, \ldots, p$. It is therefore either prime, or divisible by a prime between $p$ and $q$. In either case, there is a prime greater than $p$, which proves the theorem. $\square$

To study the distribution of prime numbers among all natural numbers, we give the following definition:

Definition 18. For a real number $x \in \mathbb{R}$, we let

$$\pi(x) = \text{the number of primes that are not greater than } x.$$

For example, $\pi(7) = 4$, $\pi(10) = 4$, $\pi(\pi) = 2$, $\pi(1) = 0$.

Note that if we know the function $\pi$ then we also know all the prime numbers: an integer $n$ is prime if and only if $\pi(n) > \pi(n - 1)$.

Obviously, $\pi(x) \leq x$ for all $x \geq 0$. Euclid's second theorem implies that $\pi(x) \to \infty$ as $x \to \infty$. Can we show that $\pi(x)$ is given by a formula in terms of more familiar functions?

Let us try to deduce a bit more from Euclid's argument. Let $p_n$ denote the $n$th prime. So $p_1 = 2$, $p_2 = 3$, etc. Let us define

$$q = p_1 p_2 p_3 \cdots p_n + 1.$$

Then, since $p_i < p_n$ for $i = 1, \ldots, n - 1$, we deduce that

$$q < p_n^n + 1$$

for $n > 1$, and so that

$$p_{n+1} < p_n^n + 1,$$

as $q$ is divisible by a prime bigger than $p_1, \ldots, p_n$ (possibly $q$ itself), so that $p_{n+1} \leq q$.

In fact, we can do a bit better. Suppose that $p_n < 2^{2^n}$ for $n = 1, 2, \ldots, N$. Then Euclid's argument shows that

$$p_{N+1} \leq p_1 p_2 \cdots p_N + 1 < 2^{2 + 4 + \cdots + 2^N} + 1 < 2^{2^{N+1}}.$$

Since $p_1 = 2 < 2^2 = 4$, by induction we conclude that

$$p_n < 2^{2^n} \quad \text{for all } n.$$

Therefore, we have that $\pi(2^{2^n}) \geq n$. Now, given any real number $x > e$, we can sandwich it as

$$e^{e^{n-1}} < x \leq e^{e^n} \quad \text{for some natural number } n,$$

since the sequence $e^{e^m}$ for $m = 0, 1, 2, \ldots$ is monotonically increasing and unbounded. In other words, we have:

$$n - 1 < \log \log x \leq n.$$

Let us suppose $n \geq 4$, so that $e^{n-1} > 2^n$ (note that this fails for $n = 3$, since $e^2 < 8$) and so $e^{e^{n-1}} > 2^{2^n}$. Then we see

$$\pi(x) \geq \pi(e^{e^{n-1}}) \geq \pi(2^{2^n}) \geq n \geq \log \log x.$$

Note that the only place where we used the assumption $n \geq 4$ was in deducing $\pi(e^{e^{n-1}}) \geq n$. This can easily be checked directly for $n = 1$, $n = 2$ and $n = 3$ as well. Hence, we proved that $\pi(x) \geq \log \log x$ for all $x > e$. We also note that $\log \log x \leq 0$ for $1 < x \leq e$, hence we get that:

Theorem 19. $\pi(x) \geq \log \log x$ for $x > 1$.

However, $\log \log x$ is a rather weak bound. For example, for $x = 10^9$, it gives $\pi(x) \geq 3$, whereas the value of $\pi(x)$ is over 50 million.

Figure 1 shows the graph of π(x) for 0 ≤ x ≤ 100.

Figure 1. Graph of π(x) for 0 ≤ x ≤ 100

To draw this for yourselves, go to Mathematica and type "Plot[PrimePi[x],{x,0,100}]".

Around 1800, several mathematicians conjectured approximations for $\pi(x)$ as $x \to \infty$. Legendre suggested in 1798 that $\pi(x)$ is approximately equal to the function $\frac{x}{\log x}$. Figure 2 shows how $\pi(x)$ and $\frac{x}{\log x}$ compare for $x \leq 400$.

It is close but does not quite match. Of course, we are able to see this much more easily today with the help of computers. A better approximation to $\pi(x)$ is provided by the "logarithmic integral", defined by:

$$\mathrm{Li}(x) := \int_2^x \frac{dt}{\log t}$$

Here is the Mathematica code for playing with these functions:

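(* LogIntegral[x] integrates from 0 rather than from 2, so it differs from Li(x) above by the constant LogIntegral[2], about 1.05 *)

p1 = Plot[PrimePi[x], {x, 2, 100000}, ImageSize -> Large]

p2 = Plot[x/Log[x], {x, 2, 100000}, PlotStyle -> {Green}, ImageSize -> Large]

p3 = Plot[LogIntegral[x], {x, 2, 100000}, PlotStyle -> {Red}, ImageSize -> Large]

Show[p1, p2, p3]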

We will return to this interesting subject of asymptotic approximation of $\pi(x)$ in a later section.
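For readers without Mathematica, the same comparison can be made numerically. The Python sketch below is an illustration only: the helper names are invented here, $\pi(x)$ is computed with a simple sieve, and $\mathrm{Li}(x)$ is approximated by a plain midpoint rule, which is an assumption of convenience rather than a careful numerical method.

```python
from math import log


def prime_counts(limit: int) -> list:
    """Return a list pi with pi[x] = number of primes <= x, for 0 <= x <= limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    pi, count = [], 0
    for x in range(limit + 1):
        count += is_prime[x]
        pi.append(count)
    return pi


def logarithmic_integral(x: float, steps: int = 100000) -> float:
    """Approximate Li(x) = integral from 2 to x of dt / log t (midpoint rule)."""
    h = (x - 2) / steps
    return h * sum(1 / log(2 + (i + 0.5) * h) for i in range(steps))


pi = prime_counts(100000)
for x in (100, 1000, 10000, 100000):
    # columns: x, pi(x), x / log x, Li(x)
    print(x, pi[x], round(x / log(x), 1), round(logarithmic_integral(x), 1))
```

In this range the printed values of $x / \log x$ lie below $\pi(x)$ and those of $\mathrm{Li}(x)$ lie above it, with $\mathrm{Li}(x)$ much the closer of the two.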


Figure 2. Graph of π(x) vs. x/ log x for 2 ≤ x ≤ 400

Let's discuss a second proof of Euclid's theorem. The proof uses a special sequence of rather fast growing numbers called Fermat numbers. They are defined by:

$$F_n = 2^{2^n} + 1$$

so that

$$F_1 = 5, \quad F_2 = 17, \quad F_3 = 257, \quad F_4 = 65537.$$

It can be checked easily that $F_1, F_2, F_3, F_4$ are prime numbers. Fermat conjectured that all $F_n$ are primes but in fact this was disproved by Euler in 1732 when he showed that

$$F_5 = 2^{2^5} + 1 = 4294967297 = 641 \times 6700417.$$

An easy proof was given by Coxeter. Indeed,

$$641 = 5^4 + 2^4 = 5 \cdot 2^7 + 1,$$

hence $641$ divides each of $5^4 \cdot 2^{28} + 2^{32} = 2^{28}(5^4 + 2^4)$ and $5^4 \cdot 2^{28} - 1$ (since $x^4 - 1 = (x + 1)(x - 1)(x^2 + 1)$, applied with $x = 5 \cdot 2^7$). So, it divides their difference, which is $2^{32} + 1 = F_5$.

In fact, there is no known prime number of the form $F_n$ for $n > 4$. People are still searching for them (why?). Here is a website for it: http://www.fermatsearch.org/

However, we have a more theoretical use of Fermat numbers in mind. Namely, let us prove the following:

Theorem 20. (Goldbach) No two Fermat numbers have a common divisor greater than 1.

Proof. Suppose that $F_n$ and $F_{n+k}$, where $k > 0$, are two Fermat numbers and that $m \mid F_n$ and $m \mid F_{n+k}$. Letting $x = 2^{2^n}$ we can write:

$$\frac{F_{n+k} - 2}{F_n} = \frac{2^{2^{n+k}} - 1}{2^{2^n} + 1} = \frac{x^{2^k} - 1}{x + 1} = x^{2^k - 1} - x^{2^k - 2} + x^{2^k - 3} - \cdots - 1.$$

So, we see that $F_n \mid F_{n+k} - 2$, and hence $m \mid F_{n+k} - 2$. Since also $m \mid F_{n+k}$, we deduce that $m \mid 2$. So $m = 1$ or $2$, but it can't be $2$ since all the Fermat numbers are odd. Hence, the theorem follows. $\square$

Corollary 21. There are infinitely many primes.

Proof. Each $F_i$ is either prime or has an odd prime divisor, and by Theorem 20 this prime does not divide any other Fermat number. So, there are at least as many primes as there are Fermat numbers. There are infinitely many Fermat numbers by definition. $\square$

This proof also gives $p_{n+1} \leq F_n = 2^{2^n} + 1$, which is slightly better than what we have seen before (but not much).
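The statements about Fermat numbers made above are easy to check by machine for the first few indices. The following Python sketch (with an illustrative helper `fermat_number`) verifies Euler's factorisation of $F_5$ and the pairwise coprimality of Theorem 20 for $F_1, \ldots, F_7$; it is a sanity check, not a proof.

```python
from math import gcd


def fermat_number(n: int) -> int:
    """The n-th Fermat number F_n = 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


fermat = [fermat_number(n) for n in range(1, 8)]  # F_1, ..., F_7
print(fermat[:4])                  # [5, 17, 257, 65537]
print(fermat[4] == 641 * 6700417)  # True (Euler's factorisation of F_5)

# Theorem 20: any two distinct Fermat numbers are coprime.
pairwise_coprime = all(
    gcd(fermat[i], fermat[j]) == 1
    for i in range(len(fermat))
    for j in range(i + 1, len(fermat))
)
print(pairwise_coprime)  # True
```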

Can one find a simple function $f : \mathbb{N} \to \mathbb{N}$ such that for each natural number $n$, $f(n)$ is a prime number? Clearly, Fermat's sequence could not do this job. There is no known satisfactory answer to this. Of course, one could define $f(n) = p_n$, the $n$th prime number, but this is by no means a "simple" function.

There is a remarkable polynomial function given by

$$f(n) = n^2 - n + 41.$$

It turns out this polynomial takes prime values for all $n$ with $0 \leq n \leq 40$, but obviously $f(41)$ is composite, since $41 \mid f(41)$.
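The claim about $n^2 - n + 41$ can be confirmed directly. Below is a small Python check; the helper `is_prime` uses plain trial division and both names are chosen for this illustration.

```python
def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def f(n: int) -> int:
    return n * n - n + 41


print(all(is_prime(f(n)) for n in range(0, 41)))  # True
print(f(41), f(41) == 41 * 41)                    # 1681 True
```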

The following theorem says that polynomial functions with integer coefficients are no good for answering the above question.

Theorem 22. No polynomial $f(n)$ with integral coefficients, not a constant, can be a prime for all $n$, or for all sufficiently large $n$.

Proof. Consider a polynomial given by

$$f(x) = a_0 x^k + a_1 x^{k-1} + \cdots + a_k.$$

We may assume that the leading coefficient $a_0 > 0$, so that $f(n) \to \infty$ as $n \to \infty$, since otherwise $f(x)$ will become negative when $x$ is big. Thus, $f(x) > 1$ for all $x$ sufficiently large, so say $f(x) > 1$ for all $x > N$. Let $x_0 > N$ be an integer, and let us take

$$y = f(x_0) > 1.$$

Then, for all integers $r \in \mathbb{Z}$, we see that

$$f(ry + x_0) = a_0 (ry + x_0)^k + \cdots$$

is divisible by $y$: by the binomial theorem, expanding each power $(ry + x_0)^j$ shows that $f(ry + x_0)$ equals $f(x_0) = y$ plus a multiple of $y$. Now, $f(ry + x_0) \to \infty$ as $r \to \infty$, so for all large $r$ the value $f(ry + x_0)$ is a multiple of $y$ bigger than $y$, hence composite. Hence, there are infinitely many composite values of $f(n)$. $\square$

What about simple functions $f : \mathbb{N} \to \mathbb{N}$ such that $f(n)$ is prime for infinitely many $n$? If $f(n)$ is of the form

$$f(n) = an + b \quad \text{for some } a, b \text{ with } a > 0 \text{ and } (a, b) = 1$$

then there is a nice answer to this.

Let's first practice with a few examples.

Theorem 23. There are infinitely many primes of the form $4n + 3$.

Proof. Let $p_1, \ldots, p_n$ be the first $n$ primes, and consider:

$$q = 2^2 \cdot 3 \cdot 5 \cdots p_n - 1.$$

Then $q$ is of the form $4n + 3$, and is not divisible by any of the primes $p_1, \ldots, p_n$, hence it should have prime divisors $p > p_n$. Furthermore, it cannot be that all the prime divisors of $q$ are of the form $4n + 1$, since the product of such numbers is of the same form. Hence there must be at least one prime divisor of $q$ of the form $4n + 3$ and greater than $p_n$. By letting $n \to \infty$, we can construct infinitely many primes of this form. $\square$

Here is a similar result:

Theorem 24. There are infinitely many primes of the form 6n+ 5.

Proof. The proof is similar. We define $q$ by

$$q = 2 \cdot 3 \cdot 5 \cdots p_n - 1$$

and observe that any prime number, except $2$ or $3$, is of the form $6n + 1$ or $6n + 5$ (why?), and the product of two numbers of the form $6n + 1$ is again of the same form. $\square$

All these theorems are particular cases of a famous theorem of Dirichlet:

Theorem 25. (Dirichlet 1837) If $a > 0$ and $b$ are integers such that $(a, b) = 1$, then there are infinitely many primes of the form $an + b$.

The proof of this theorem uses analytical methods too difficult to discuss here. We shall not cover its proof in this course.

That deals with the linear functions. What about quadratic polynomials? The question becomes much harder and we don't even know whether the following conjecture is true or not.

Conjecture 26. There are infinitely many primes of the form $n^2 + 1$.

3. Chebyshev’s theorem

We now return to our study of the prime distribution function $\pi(x)$.

We will give a proof of a theorem due to Chebyshev (also spelled Tchebychef):

Theorem 27. There exist constants $c_1, c_2 > 0$ such that

$$c_1 \frac{x}{\log x} < \pi(x) < c_2 \frac{x}{\log x}.$$

The closer the constants $c_j$ are to $1$, the more technical the proof becomes. Here we will show that one can take $c_1 = \frac{\log 2}{2}$ and $c_2 = 6 \log 2$.

The proof will use the following elementary facts:

(1) $\frac{2^{2n}}{2n} \leq \binom{2n}{n} \leq 2^{2n}$

(2) $\binom{2n}{n}$ is not divisible by any prime $p > 2n$.

(3) $\binom{2n}{n}$ is divisible by all primes $p$ with $n < p \leq 2n$.

The first one follows from $(1 + 1)^{2n} = \sum_{m=0}^{2n} \binom{2n}{m}$; the second and third follow from our formula for the prime factorizations of the factorials $(2n)!$ and $n!$.
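The three facts about $\binom{2n}{n}$ listed above can be checked numerically for small $n$. The Python sketch below does this for $1 \leq n \leq 100$; the function names are invented for this illustration, and fact (2) is tested indirectly by dividing out every prime $p \leq 2n$ and checking that nothing is left.

```python
from math import comb


def primes_up_to(n: int) -> list:
    """All primes p <= n, by trial division (fine for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d != 0 for d in range(2, int(p ** 0.5) + 1))]


def central_binomial_facts_hold(n: int) -> bool:
    c = comb(2 * n, n)
    # Fact (1): 2^(2n) / (2n) <= C(2n, n) <= 2^(2n), checked in integers.
    fact1 = 4 ** n <= 2 * n * c and c <= 4 ** n
    # Fact (3): every prime p with n < p <= 2n divides C(2n, n).
    primes = primes_up_to(2 * n)
    fact3 = all(c % p == 0 for p in primes if p > n)
    # Fact (2): after removing all prime factors p <= 2n nothing is left,
    # so no prime p > 2n divides C(2n, n).
    rest = c
    for p in primes:
        while rest % p == 0:
            rest //= p
    fact2 = rest == 1
    return fact1 and fact2 and fact3


print(all(central_binomial_facts_hold(n) for n in range(1, 101)))  # True
```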

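Proof. (Proof of Chebyshev's theorem)

Upper bound: Any prime $p$ with $n < p \leq 2n$ divides $\binom{2n}{n}$, so by Lemma 13, the product

$$\prod_{n < p \leq 2n} p$$

divides $\binom{2n}{n}$. Therefore, we have

$$n^{\pi(2n) - \pi(n)} \leq \prod_{n < p \leq 2n} p \leq \binom{2n}{n} \leq 2^{2n}.$$

Taking the log gives

$$\pi(2n) - \pi(n) \leq 2 \log(2) \frac{n}{\log n}.$$

Using induction, we now easily see that

$$\pi(2^k) \leq 3 \cdot \frac{2^k}{k}.$$

In fact, this is checked directly for $k \leq 5$; for $k \geq 5$, we argue by induction, using the previous inequality with $n = 2^k$:

$$\pi(2^{k+1}) \leq \pi(2^k) + \frac{2^{k+1}}{k} \leq \frac{3 \cdot 2^k}{k} + \frac{2 \cdot 2^k}{k} \leq \frac{5 \cdot 2^k}{k} \leq \frac{3 \cdot 2^{k+1}}{k + 1}.$$

Next, since the function $f(x) = \frac{x}{\log x}$ is monotonically increasing for $x \geq e$, we have that if $4 \leq 2^k < x \leq 2^{k+1}$, then

$$\pi(x) \leq \pi(2^{k+1}) \leq \frac{6 \cdot 2^k}{k + 1} \leq 6 \log 2 \, \frac{2^k}{\log 2^k} \leq 6 \log 2 \, \frac{x}{\log x}.$$

Since $\pi(x) \leq 6 \log 2 \, \frac{x}{\log x}$ for $1 < x \leq 4$ as well, we have now established the claimed upper bound for all $x$.

Lower bound: Put $N = \binom{2n}{n}$, and let $v_p(N)$ denote the exponent of the highest power of $p$ that divides $N$. By the formula from Lemma 15, we know that

$$v_p(N) = \sum_{m \geq 1} \left( \left[ \frac{2n}{p^m} \right] - 2 \left[ \frac{n}{p^m} \right] \right).$$

Now, we use the following lemma: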

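Lemma 28. For all $x \in \mathbb{R}$, we have $[2x] - 2[x] \in \{0, 1\}$.

Proof. Let us write $x = [x] + \{x\}$, where $\{x\}$ is called the fractional part of $x$ and satisfies $0 \leq \{x\} < 1$. Then $2x = 2[x] + 2\{x\}$. If $\{x\} < 1/2$, then $0 \leq 2\{x\} < 1$, so $[2x] = 2[x]$ and hence $[2x] - 2[x] = 0$. Otherwise, if $\{x\} \geq 1/2$, then $1 \leq 2\{x\} < 2$, so $[2x] = 2[x] + 1$ and we get $[2x] - 2[x] = 1$. $\square$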

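If $p^m > 2n$, or equivalently $m > \frac{\log 2n}{\log p}$, then we have that $\left[ \frac{2n}{p^m} \right] - 2 \left[ \frac{n}{p^m} \right] = 0$. Thus, we find that

$$v_p(N) \leq \left[ \frac{\log 2n}{\log p} \right].$$

Now,

$$2n \log 2 - \log 2n \leq \log \binom{2n}{n}, \quad \text{because } \frac{2^{2n}}{2n} \leq \binom{2n}{n};$$

$$\log \binom{2n}{n} \leq \sum_{p \leq 2n} \left[ \frac{\log 2n}{\log p} \right] \log p, \quad \text{because } N = \prod_{p \leq 2n} p^{v_p(N)} \text{ and } v_p(N) \leq \left[ \frac{\log 2n}{\log p} \right];$$

$$\sum_{p \leq 2n} \left[ \frac{\log 2n}{\log p} \right] \log p \leq \sum_{p \leq 2n} \log 2n = \pi(2n) \log 2n.$$

This yields the lower bound

$$\pi(2n) \geq \log 2 \, \frac{2n}{\log 2n} - 1.$$

We claim that this implies that

$$\pi(x) \geq \frac{\log 2}{2} \frac{x}{\log x} \quad \text{for all } x \geq 2.$$

This inequality can be checked directly for $x \leq 16$, hence it suffices to prove it for $x > 16$. Pick an integer $n$ with $16 \leq 2n < x \leq 2n + 2$. Then,

$$\frac{2n}{\log 2n} - \frac{n + 1}{\log 2n} = \frac{n - 1}{\log 2n} \geq \frac{7}{4 \log 2} > \frac{1}{\log 2},$$

hence,

$$\pi(x) \geq \pi(2n) \geq \log 2 \, \frac{2n}{\log 2n} - 1 \geq \frac{(n + 1) \log 2}{\log(2n)} \geq \frac{(n + 1) \log 2}{\log(2n + 2)} \geq \frac{\log 2}{2} \frac{x}{\log x}$$

as required. $\square$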
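The two bounds of Chebyshev's theorem, with the constants $\frac{\log 2}{2}$ and $6 \log 2$ from the proof, can be tested numerically at integer points. The Python sketch below is a check over a finite range only and proves nothing; the function names are illustrative. The inequalities are written in cross-multiplied form so that the case $x = 2$, where the lower bound holds with equality, is not spoiled by rounding.

```python
from math import log


def prime_counting_table(limit: int) -> list:
    """pi[x] = number of primes <= x for 0 <= x <= limit, via a simple sieve."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = False
    if limit >= 1:
        is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    pi, count = [], 0
    for x in range(limit + 1):
        count += is_prime[x]
        pi.append(count)
    return pi


def chebyshev_bounds_hold(limit: int) -> bool:
    """Check (log 2 / 2) x / log x <= pi(x) <= 6 log 2 * x / log x for 2 <= x <= limit."""
    pi = prime_counting_table(limit)
    for x in range(2, limit + 1):
        lower_ok = log(2) * x <= 2 * pi[x] * log(x)
        upper_ok = pi[x] * log(x) <= 6 * log(2) * x
        if not (lower_ok and upper_ok):
            return False
    return True


print(chebyshev_bounds_hold(100000))  # True
```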

We will next prove Bertrand's postulate. It was conjectured by Bertrand in 1845 and proved by Chebyshev in 1850.

Theorem 29. For every integer $n \geq 1$, there is a prime $p$ satisfying $n < p \leq 2n$.

Chebyshev introduced an auxiliary function, the $\theta$-function. It is defined by

$$\theta(x) = \sum_{p \leq x} \log p$$

for real numbers $x$ (summation over all prime numbers $p \leq x$). For example,

$$\theta(10) = \log 2 + \log 3 + \log 5 + \log 7.$$

Chebyshev proved upper and lower estimates for the function $\theta$, and then deduced upper and lower estimates for the function $\pi$. We have the following upper estimate for $\theta$.

Lemma 30. If $x > 0$, then $\theta(x) < \log(4) \cdot x$.

Proof. The statement is clearly true for $0 < x < 1$, so we can assume that $x \geq 1$. Since $\theta(x) = \theta([x])$, it is enough to prove the statement $\theta(n) < \log(4) \cdot n$ for $n \in \mathbb{N}$.

For this, we use induction on $n$. The cases $n = 1$ and $n = 2$ are obviously true. Now, assume that $n \geq 3$ and that $\theta(m) < \log(4) \cdot m$ for $m < n$. We must distinguish the cases $n$ even and $n$ odd.

If $n$ is even then $n$ is not prime, so $\theta(n) = \theta(n - 1) < \log(4) \cdot (n - 1) < \log(4) \cdot n$, as required.

If $n$ is odd, let $n = 2m + 1$ for $m \geq 1$. We will show below that $\theta(2m + 1) - \theta(m + 1) < \log(4) \cdot m$. It then follows that

$$\theta(n) = \theta(2m + 1) - \theta(m + 1) + \theta(m + 1) < \log(4) \cdot m + \log(4) \cdot (m + 1) = \log(4)(2m + 1) = \log(4) \cdot n$$

as required.

So, it remains to show that $\theta(2m + 1) - \theta(m + 1) < \log(4) \cdot m$ for every $m \geq 1$.

Consider $M = \binom{2m+1}{m} = \frac{(2m + 1) \cdot 2m \cdots (m + 2)}{m!}$. If $p$ is a prime number with $m + 2 \leq p \leq 2m + 1$, then $p$ divides $M$ (because $p$ divides the numerator but not the denominator). Hence, by Lemma 13, the product

$$\prod_{m + 2 \leq p \leq 2m + 1} p$$

divides $M$; in particular, it is less than or equal to $M$.

On the other hand, $M < 2^{2m}$ because

$$2M = \binom{2m + 1}{m} + \binom{2m + 1}{m + 1} < (1 + 1)^{2m + 1}.$$

It follows that

$$\theta(2m + 1) - \theta(m + 1) = \sum_{m + 2 \leq p \leq 2m + 1} \log p = \log \left( \prod_{m + 2 \leq p \leq 2m + 1} p \right) \leq \log M < \log 2^{2m} = m \cdot \log 4. \qquad \square$$
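As a sanity check on Lemma 30, the inequality $\theta(n) < \log(4) \cdot n$ can be tested for small $n$. This Python sketch uses floating-point logarithms and a naive primality test, both of which are assumptions of convenience; the function name is made up for this example.

```python
from math import log


def theta_bound_holds(limit: int) -> bool:
    """Check theta(n) < log(4) * n for every integer 1 <= n <= limit."""
    theta = 0.0
    for n in range(1, limit + 1):
        if n >= 2 and all(n % d != 0 for d in range(2, int(n ** 0.5) + 1)):
            theta += log(n)  # n is prime, so it contributes log n
        if not theta < log(4) * n:
            return False
    return True


print(theta_bound_holds(10000))  # True
```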

Proof. (Proof of Bertrand's postulate) Recall first that any prime number $p$ that divides $N = \binom{2n}{n}$ has to satisfy $p \leq 2n$, and if there is any prime $p$ with $n < p \leq 2n$ then it divides $N$.

Next, let us observe that for $n \geq 3$, if $\frac{2}{3} n < p \leq n$, then $p$ does not divide $N = \binom{2n}{n}$. Indeed, $p > 2$, so $2n < 3p \leq p^2$, hence $2 \leq \frac{2n}{p} < 3$. Thus,

$$v_p(N) = \left[ \frac{2n}{p} \right] - 2 \left[ \frac{n}{p} \right] = 2 - 2 = 0.$$

Now, we prove Bertrand's postulate by contradiction. Suppose that there is an integer $n$ such that there is no prime in the interval $(n, 2n]$. By the discussion above, this implies that there is no prime $p > \frac{2}{3} n$ that divides $N$.

Next, consider primes $p \mid N$ such that $v_p(N) > 1$. They satisfy $p^2 \leq p^{v_p(N)} \leq 2n$, hence we must have $p \leq \sqrt{2n}$ for such primes. The number of such primes is clearly bounded by $\sqrt{2n}$. Now, we have

$$\frac{2^{2n}}{2n} \leq \binom{2n}{n} = \prod_{v_p(N) > 1} p^{v_p(N)} \cdot \prod_{v_p(N) = 1} p \leq (2n)^{\sqrt{2n}} \cdot \prod_{v_p(N) = 1} p.$$

Taking logs, we get:

$$2n \log 2 - \log(2n) \leq \sqrt{2n} \log(2n) + \log \left( \prod_{v_p(N) = 1} p \right) \leq \sqrt{2n} \log(2n) + \theta\left( \frac{2n}{3} \right) \leq \sqrt{2n} \log(2n) + \log(4) \frac{2n}{3}.$$

Reorganizing, we get

$$2n \log 2 \leq 3 (1 + \sqrt{2n}) \log(2n).$$

Now, since $\frac{x}{\log x}$ is monotonically increasing for $x > 3$, this inequality cannot hold for large $n$. In fact, it is false for $n \geq 512$ and we arrive at a contradiction if $n \geq 512$. For $n < 512$, Bertrand's postulate is proved by looking at the sequence of primes $7, 13, 23, 43, 83, 163, 317, 631$, each of which is less than twice its predecessor (together with the primes $2, 3, 5$ for $n \leq 3$). $\square$
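The finite part of this argument can be verified by machine. In the Python sketch below, the chain of primes is the one from the proof, started from $2$ so that $n \leq 3$ is covered as well; the helper names are illustrative. It also confirms that the final inequality already fails at $n = 512$.

```python
from math import log, sqrt


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def has_prime_between(n: int) -> bool:
    """Is there a prime p with n < p <= 2n?"""
    return any(is_prime(p) for p in range(n + 1, 2 * n + 1))


# Each prime in the chain is less than twice its predecessor.
chain = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631]
print(all(is_prime(p) for p in chain))                   # True
print(all(q < 2 * p for p, q in zip(chain, chain[1:])))  # True
print(all(has_prime_between(n) for n in range(1, 512)))  # True

# The inequality 2n log 2 <= 3 (1 + sqrt(2n)) log(2n) fails at n = 512.
n = 512
print(2 * n * log(2) <= 3 * (1 + sqrt(2 * n)) * log(2 * n))  # False
```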

Corollary 31. Let $p_n$ denote the $n$-th prime number. Then $p_n \leq 2^n$.

Proof. By Bertrand's postulate we know that each of the intervals $(1, 2], (2, 4], (4, 8]$, etc. contains at least one prime number. Hence the interval $(1, 2^n]$ contains at least $n$ prime numbers. Thus the $n$-th prime number must be contained in this interval, i.e. $p_n \leq 2^n$. $\square$

4. The prime number theorem

We have mentioned before that $\pi(x)$ is approximately equal to the function $\frac{x}{\log x}$. We will now make this statement more precise.

Definition 32. Let $f$ and $g$ be functions which are defined for all sufficiently large real numbers, and assume that $f(x)$ and $g(x)$ are positive for all large $x$. We say that $f$ and $g$ are asymptotically equal, and write this as
\[ f \sim g \]
if $\frac{f(x)}{g(x)} \to 1$ as $x \to \infty$, i.e. the limit $\lim_{x \to \infty} f(x)/g(x)$ exists and is equal to 1.

We can now state one of the landmark theorems in elementary number theory:


Theorem 33. (Prime number theorem)
\[ \pi(x) \sim \frac{x}{\log x} \]

The prime number theorem was proved in 1896 independently by Hadamard and de la Vallée Poussin using methods of complex analysis. An elementary proof was given in 1948 by Selberg and also by Erdős (based on Selberg’s lemma).

We shall not cover the proof of this theorem in this class, but if you are seriously interested in number theory you should learn its proof.
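To get a feeling for the statement, one can compare π(x) with x / log x numerically. The following Python sketch (with the hypothetical helper `prime_count_table`) computes π(x) by a sieve and prints the ratio for a few powers of 10. The convergence to 1 is slow, so the ratios are still visibly above 1 in this range.

```python
import math


def prime_count_table(limit):
    """Return a list c with c[n] = pi(n), the number of primes <= n (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    counts, running = [0] * (limit + 1), 0
    for n in range(limit + 1):
        running += sieve[n]
        counts[n] = running
    return counts


if __name__ == "__main__":
    counts = prime_count_table(10**6)
    for exponent in range(3, 7):
        x = 10**exponent
        ratio = counts[x] / (x / math.log(x))
        print(f"x = 10^{exponent}: pi(x) = {counts[x]}, pi(x) / (x / log x) = {ratio:.4f}")
```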

Corollary 34. Prime number theorem implies Chebyshev’s theorem.

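Proof. $\lim_{x \to \infty} f(x)/g(x) = 1$ means that for any numbers $c_1 < 1 < c_2$ and all sufficiently large $x$, we have
\[ c_1 < f(x)/g(x) < c_2. \]
Hence,
\[ c_1\, g(x) < f(x) < c_2\, g(x) \quad \text{for all sufficiently large } x. \]
Applying this with $f(x) = \pi(x)$ and $g(x) = x/\log x$ gives the claim. □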

Here is another corollary of the prime number theorem:

Corollary 35. Let $p_n$ denote the $n$-th prime number. Then
\[ p_n \sim n \log n \quad \text{as } n \to \infty. \]
(Here $p_n \sim n \log n$ means that $\lim_{n \to \infty} \frac{p_n}{n \log n} = 1$.)

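Proof. If $y = \frac{x}{\log x}$, then $\log y = \log x - \log\log x$. Therefore,
\[ \lim_{x \to \infty} \frac{\log y}{\log x} = 1 - \lim_{x \to \infty} \frac{\log\log x}{\log x} = 1. \]
Thus, $x = y \log x \sim y \log y$ as $x \to \infty$. Since, by the prime number theorem, $\pi(x) \sim y$, it follows that $x \sim \pi(x) \log \pi(x)$. In other words,
\[ \lim_{x \to \infty} \frac{\pi(x) \log \pi(x)}{x} = 1. \]
Now, let $x = p_n$, then $\pi(x) = n$, so we get
\[ \lim_{n \to \infty} \frac{n \log n}{p_n} = 1 \]
as required. □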
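The asymptotic p_n ∼ n log n can also be observed numerically, although the ratio approaches 1 only slowly. The sketch below generates the first primes with a sieve whose bound is doubled until enough primes are found; `first_primes` is a hypothetical helper written for this illustration.

```python
import math


def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def first_primes(count):
    """Return the list [p_1, ..., p_count], doubling the sieve bound until it suffices."""
    limit = 16
    while True:
        primes = primes_up_to(limit)
        if len(primes) >= count:
            return primes[:count]
        limit *= 2


if __name__ == "__main__":
    primes = first_primes(100_000)
    for n in (10, 100, 1_000, 10_000, 100_000):
        p_n = primes[n - 1]
        print(f"n = {n}: p_n = {p_n}, p_n / (n log n) = {p_n / (n * math.log(n)):.4f}")
```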

In fact, it is not too hard to show that the statement given in the above corollary is equivalent to the prime number theorem.

Finally, we shall give a corollary of the Prime Number Theorem that recovers an asymptotic version of Bertrand’s postulate.

Corollary 36. Let δ > 1, then for all sufficiently large x, the interval (x, δx] contains a prime number.


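(Note that this is not necessarily true for all $x$; for example, $(1, \delta \cdot 1]$ does not contain a prime for $\delta < 2$.)

Proof. The interval $(x, \delta x]$ contains a prime number if and only if $\pi(\delta x) - \pi(x) \ge 1$. Thus, we must show that $\pi(\delta x) - \pi(x) \ge 1$ for all sufficiently large $x$. We have
\[ \frac{\pi(\delta x)}{\pi(x)} = \frac{\pi(\delta x)}{\delta x/\log(\delta x)} \cdot \frac{x/\log(x)}{\pi(x)} \cdot \frac{\delta x/\log(\delta x)}{x/\log(x)}. \]
By the prime number theorem, we see that the first two factors go to 1 as $x \to \infty$. For the third factor, we find
\[ \frac{\delta x/\log(\delta x)}{x/\log(x)} = \delta\, \frac{\log(x)}{\log(\delta) + \log(x)} \to \delta, \quad \text{as } x \to \infty. \]
Hence,
\[ \lim_{x \to \infty} \frac{\pi(\delta x)}{\pi(x)} = \delta. \]
Now, fix any $\gamma$ with $1 < \gamma < \delta$. From $\lim_{x \to \infty} \frac{\pi(\delta x)}{\pi(x)} = \delta$ it follows that $\pi(\delta x) \ge \gamma\, \pi(x)$ for all sufficiently large $x$, and from $\pi(x) \to \infty$ as $x \to \infty$, it follows that $(\gamma - 1)\pi(x) \ge 1$ for all sufficiently large $x$. Thus for all large enough $x$, we obtain
\[ \pi(\delta x) - \pi(x) \ge (\gamma - 1)\pi(x) \ge 1 \]
as required. □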
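To illustrate the corollary for a concrete value, the sketch below counts primes in (x, δx] for δ = 11/10 and integer x. The choice of δ, the use of `Fraction` to avoid floating-point rounding in δx, and the helper names are all assumptions made for this example.

```python
import math
from fractions import Fraction


def prime_count_table(limit):
    """Return a list c with c[n] = pi(n), the number of primes <= n (limit >= 1)."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    counts, running = [0] * (limit + 1), 0
    for n in range(limit + 1):
        running += sieve[n]
        counts[n] = running
    return counts


def primes_in_interval(x, delta, counts):
    """Number of primes in (x, delta * x] for a positive integer x and rational delta > 1.

    The table `counts` must extend at least up to floor(delta * x).
    """
    upper = math.floor(delta * x)
    return counts[upper] - counts[x]


if __name__ == "__main__":
    delta = Fraction(11, 10)
    counts = prime_count_table(12_000)
    empty = [x for x in range(1, 10_000) if primes_in_interval(x, delta, counts) == 0]
    print("largest x < 10000 with no prime in (x, 1.1 x]:", max(empty))
```

For small x the interval is often empty (for instance (113, 124.3] contains no prime), but on the tested range this stops happening once x is moderately large, as the corollary predicts.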

A refinement of the approximation given in the prime number theorem can be obtained if instead of $x/\log x$ one uses the following function:

Definition 37. For $x \ge 2$, define
\[ \mathrm{li}(x) = \int_2^x \frac{1}{\log t}\, dt. \]
The function $\mathrm{li}(x)$ is called the logarithmic integral.

One can then show that
\[ \mathrm{li}(x) \sim \pi(x). \]

The bound on error terms of these approximations is still an important area of research in number theory. We mention the following important conjecture:

Conjecture 38. (Riemann hypothesis) Let $\varepsilon > 0$. Then, for all sufficiently large $x$,
\[ |\pi(x) - \mathrm{li}(x)| < x^{1/2 + \varepsilon}. \]


5. Arithmetic functions and Dirichlet series

Definition 39. A real- or complex-valued function defined on the positive integers is called an arithmetic function.

Definition 40. Given an arithmetic function $f(n) = \alpha_n \in \mathbb{C}$, we define its Dirichlet series:
\[ F(s) = \sum_{n=1}^{\infty} \frac{\alpha_n}{n^s} \]

The most important example of a Dirichlet series is the Riemann zeta function associated to the arithmetic function $u$ defined by $u(n) = 1$ for all $n \in \mathbb{N}$.

Definition 41. The Riemann zeta function, denoted by $\zeta(s)$, is the function of a real variable $s > 1$ defined by the series
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \]

We must check that the series converges for $s > 1$. Since all summands in the series are positive, it suffices to check that the partial sums $\sum_{n=1}^{N} n^{-s}$ are bounded above. These partial sums can be estimated as follows:

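\[ \sum_{n=1}^{N} n^{-s} = 1 + \sum_{n=2}^{N} n^{-s} < 1 + \int_1^N x^{-s}\, dx < 1 + \int_1^{\infty} x^{-s}\, dx = 1 + \frac{1}{s-1}. \]
This bound does not depend on $N$, so the series converges; moreover, $\frac{1}{s-1} \to 0$ as $s \to \infty$.

Here the inequality $\sum_{n=2}^{N} n^{-s} < \int_1^N x^{-s}\, dx$ can be seen by comparing the area under the curve $x^{-s}$ for $1 \le x \le N$ to the sum of the areas of the rectangles of width 1 and height $2^{-s}, 3^{-s}, \dots, N^{-s}$ under this curve.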
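The integral comparison can be illustrated numerically: for several values of s > 1, the partial sums stay below 1 + 1/(s − 1) however large N is taken. This is a small Python sketch with hypothetical helper names; the closing comparison with π²/6 uses the well-known value of ζ(2), which is not derived in these notes.

```python
import math


def zeta_partial_sum(s, N):
    """Partial sum of the zeta series: sum of n^(-s) for 1 <= n <= N."""
    return sum(n ** (-s) for n in range(1, N + 1))


def integral_bound(s):
    """The upper bound 1 + 1/(s - 1) obtained from the integral comparison (s > 1)."""
    return 1 + 1 / (s - 1)


if __name__ == "__main__":
    for s in (1.1, 1.5, 2.0, 3.0):
        partial = zeta_partial_sum(s, 100_000)
        print(f"s = {s}: partial sum (N = 100000) = {partial:.6f}, bound = {integral_bound(s):.6f}")
    print(f"pi^2 / 6 = {math.pi ** 2 / 6:.6f}")
```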

The following important result shows how the Riemann zeta function is related to prime numbers:

Theorem 42. (Euler product formula) If $s > 1$ then
\[ \zeta(s) = \prod_p \frac{1}{1 - p^{-s}} \]

Proof. Since $p \ge 2$, we have
\[ \frac{1}{1 - p^{-s}} = 1 + p^{-s} + p^{-2s} + \dots \]
for $s > 1$ (indeed for $s > 0$). If we take $p = 2, 3, \dots, q$, and multiply the series together, the general term resulting is of the type
\[ 2^{-a_2 s}\, 3^{-a_3 s} \cdots q^{-a_q s} = n^{-s}, \]
where
\[ n = 2^{a_2}\, 3^{a_3} \cdots q^{a_q} \quad (a_2 \ge 0,\ a_3 \ge 0,\ \dots,\ a_q \ge 0). \]


A number $n$ will occur if and only if it has no prime factors greater than $q$, and then, by the Fundamental theorem of arithmetic, once only. Hence,
\[ \prod_{p \le q} \frac{1}{1 - p^{-s}} = \sum_{(q)} n^{-s}, \]
the summation on the right-hand side extending over the numbers formed from the primes up to $q$.

These numbers include all the numbers up to $q$, so that we have
\[ 0 < \sum_{n=1}^{\infty} n^{-s} - \sum_{(q)} n^{-s} < \sum_{n=q+1}^{\infty} n^{-s} \]
and the last sum tends to 0 when $q \to \infty$. Hence,
\[ \sum_{n=1}^{\infty} n^{-s} = \lim_{q \to \infty} \sum_{(q)} n^{-s} = \lim_{q \to \infty} \prod_{p \le q} \frac{1}{1 - p^{-s}}. \]
□

Note that the essential step in the proof was the existence of a unique prime factorisation for every positive integer. Therefore, the Euler product can be considered as an analytic expression of the fundamental theorem of arithmetic.
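The convergence of the truncated Euler products to ζ(s) can be watched numerically. The sketch below takes s = 2 and compares the product over primes p ≤ q with π²/6, the well-known value of ζ(2) (a fact assumed here, not proved in these notes). The function names are illustrative.

```python
import math


def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def euler_product(s, q):
    """Truncated Euler product: product of 1 / (1 - p^(-s)) over primes p <= q."""
    product = 1.0
    for p in primes_up_to(q):
        product *= 1 / (1 - p ** (-s))
    return product


if __name__ == "__main__":
    target = math.pi ** 2 / 6
    for q in (10, 100, 1_000, 10_000):
        value = euler_product(2, q)
        print(f"q = {q}: product = {value:.6f}, zeta(2) - product = {target - value:.2e}")
```

The truncated products increase with q and stay below ζ(2), matching the inequality used in the proof.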

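Proposition 43. $\zeta(s) \to \infty$ as $s \to 1$ and $s > 1$.

Proof. For every $s > 1$ the integral $\int_1^{\infty} x^{-s}\, dx$ is smaller than the sum $\sum_{n=1}^{\infty} n^{-s}$. Thus for every $s > 1$, we have
\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} > \int_1^{\infty} x^{-s}\, dx = \frac{1}{s-1}. \]
Since $\frac{1}{s-1} \to \infty$ as $s \to 1$, $s > 1$, it follows that $\zeta(s) \to \infty$ as $s \to 1$, $s > 1$. □

Note that from this it follows easily that there are infinitely many primes (this proof is due to Euler). Suppose there were finitely many primes, then we have:
\[ \lim_{s \to 1} \zeta(s) = \lim_{s \to 1} \prod_p (1 - p^{-s})^{-1} = \prod_p \lim_{s \to 1} (1 - p^{-s})^{-1} = \prod_p (1 - p^{-1})^{-1}. \]
The right-hand side is a finite product of finite numbers, contradicting $\zeta(s) \to \infty$ as $s \to 1$.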

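Corollary 44. The series $\sum_p \frac{1}{p}$ is divergent.

Proof. Recall that
\[ \log \frac{1}{1 - x} = x + \frac{x^2}{2} + \frac{x^3}{3} + \dots \]
for $|x| < 1$. Applying this to $p^{-s}$ for $s > 1$, we have:
\[ \log \frac{1}{1 - p^{-s}} = p^{-s} + \frac{p^{-2s}}{2} + \frac{p^{-3s}}{3} + \dots \]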


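Taking the logarithm of $\zeta(s)$, we get:
\[ \log \zeta(s) = \log \prod_p \frac{1}{1 - p^{-s}} = \sum_p \log \frac{1}{1 - p^{-s}} = \sum_p \Big( p^{-s} + \frac{p^{-2s}}{2} + \frac{p^{-3s}}{3} + \dots \Big) = \sum_p p^{-s} + \sum_p \sum_{k \ge 2} \frac{p^{-ks}}{k}. \]
Now, we observe that the second sum is bounded above by 1. Namely,
\[ \sum_p \sum_{k \ge 2} \frac{p^{-ks}}{k} < \sum_p \sum_{k \ge 2} p^{-k} = \sum_p p^{-2} \sum_{k \ge 0} p^{-k} = \sum_p \frac{p^{-2}}{1 - p^{-1}} < \sum_{n=2}^{\infty} \frac{n^{-2}}{1 - n^{-1}} = \sum_{n=2}^{\infty} \Big( \frac{1}{n-1} - \frac{1}{n} \Big) = 1. \]
Hence, $\sum_p p^{-s} > \log \zeta(s) - 1$ for all $s > 1$. Since $\zeta(s) \to \infty$ as $s \to 1$ with $s > 1$, this implies that $\sum_p p^{-s} \to \infty$ as $s \to 1$. Now, we have $\sum_p p^{-1} > \sum_p p^{-s}$ for all $s > 1$, hence it follows that $\sum_p p^{-1}$ is divergent. □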
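The divergence of Σ 1/p is extremely slow, which a quick computation makes tangible. The following sketch (hypothetical helper names) prints the partial sums over primes p ≤ x, and also evaluates a truncation of the auxiliary sum Σ_p p^{-2}/(1 − p^{-1}) that the proof bounds by 1.

```python
import math


def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if sieve[i]]


def prime_reciprocal_sum(x):
    """Sum of 1/p over primes p <= x."""
    return sum(1 / p for p in primes_up_to(x))


def tail_bound_sum(x):
    """Sum of p^(-2) / (1 - p^(-1)) = 1 / (p (p - 1)) over primes p <= x."""
    return sum(1 / (p * (p - 1)) for p in primes_up_to(x))


if __name__ == "__main__":
    for exponent in range(2, 7):
        x = 10**exponent
        print(f"x = 10^{exponent}: sum of 1/p over p <= x is {prime_reciprocal_sum(x):.4f}")
    print(f"sum of 1/(p(p-1)) over p <= 10^4 is {tail_bound_sum(10**4):.4f}")
```

Even at x = 10^6 the partial sum is still below 3, so no finite computation would suggest divergence on its own; the argument via ζ(s) is what establishes it.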

We have seen that the trivial arithmetic function $u$ with $u(n) = 1$ for $n \in \mathbb{N}$ gives rise to the important Riemann zeta function. Here are some important arithmetic functions:

For $n \in \mathbb{N}$,

(1) $u(n) = 1$.
(2) $d(n) = \#\{d \in \mathbb{N} : d \mid n\} = \sum_{d \mid n} 1$.
(3) $\sigma(n) = \sum_{d \mid n} d$.
(4) $\sigma_k(n) = \sum_{d \mid n} d^k$.
(5) (Möbius function)
\[ \mu(n) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{if } p^2 \mid n \text{ for some prime } p, \\ (-1)^s, & \text{if } n = p_1 \cdots p_s \text{ where the } p_i \text{ are distinct primes.} \end{cases} \]
(6) (Euler’s totient function) $\varphi(n) = \#\{d \in \mathbb{N} : d \le n,\ (d, n) = 1\}$.
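The functions in this list are straightforward to compute by brute force, which is handy for experimenting with the identities that follow. The Python sketch below is one possible transcription; the names `d_count`, `sigma`, `mobius` and `phi` are chosen for this illustration, and φ is taken to count 1 ≤ d ≤ n, so that φ(1) = 1.

```python
from math import gcd


def divisors(n):
    """All positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]


def d_count(n):
    """d(n): the number of positive divisors of n."""
    return len(divisors(n))


def sigma(n, k=1):
    """sigma_k(n): the sum of the k-th powers of the divisors of n."""
    return sum(d**k for d in divisors(n))


def mobius(n):
    """mu(n): 0 if a square of a prime divides n, else (-1)^(number of prime factors)."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            count += 1
        p += 1
    if n > 1:
        count += 1  # one remaining prime factor larger than sqrt(n)
    return -1 if count % 2 else 1


def phi(n):
    """Euler's totient: the number of 1 <= d <= n with gcd(d, n) = 1."""
    return sum(1 for d in range(1, n + 1) if gcd(d, n) == 1)


if __name__ == "__main__":
    for n in (1, 6, 12, 30):
        print(n, d_count(n), sigma(n), sigma(n, 2), mobius(n), phi(n))
```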

Definition 45. An arithmetic function $f$ is multiplicative if $f(mn) = f(m)f(n)$ whenever $(m, n) = 1$.

Observe that a multiplicative function is uniquely determined by its values on powers of prime numbers. Indeed, if $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$ and $f$ is multiplicative, then
\[ f(n) = \prod_{i=1}^{k} f(p_i^{e_i}). \]

Clearly, the function $n \mapsto u(n) = 1$ is multiplicative. We shall next see that $d(n)$ is also multiplicative by the following lemma:


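Lemma 46. Let $f : \mathbb{N} \to \mathbb{C}$ be an arithmetic function. Define the arithmetic function $g : \mathbb{N} \to \mathbb{C}$ by the formula
\[ g(n) = \sum_{d \mid n} f(d). \]
If $f$ is multiplicative, then so is $g$.

Proof. Let $m, n \in \mathbb{N}$ with $(m, n) = 1$. Every divisor $d$ of $mn$ can be written uniquely as $d = d_1 d_2$ with $d_1 \mid m$ and $d_2 \mid n$, and then $(d_1, d_2) = 1$. Hence
\[ g(mn) = \sum_{d \mid mn} f(d) = \sum_{d_1 \mid m,\ d_2 \mid n} f(d_1 d_2) = \sum_{d_1 \mid m} f(d_1) \sum_{d_2 \mid n} f(d_2) = g(m)\, g(n). \]
□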

Corollary 47. $n \mapsto d(n) = \sum_{d \mid n} 1$ is multiplicative. $n \mapsto \sigma_k(n) = \sum_{d \mid n} d^k$ is multiplicative.

It is easy to check that $\mu(n)$ is multiplicative. As part of the homework, you will see that $\varphi(n)$ is also multiplicative.

Manipulations of Dirichlet series.
Dirichlet series allow one to study an arithmetic function via complex analysis. We shall not delve into analysis here but discuss basic manipulations of Dirichlet series. We shall not care about convergence questions. To justify the rearrangement of terms one often requires absolute convergence of the series.

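Lemma 48. Let $F(s) = \sum_{n=1}^{\infty} \frac{a(n)}{n^s}$ and $G(s) = \sum_{n=1}^{\infty} \frac{b(n)}{n^s}$, then their product is given by
\[ F(s)\, G(s) = \sum_{n=1}^{\infty} \frac{c(n)}{n^s} \]
where
\[ c(n) = \sum_{d \mid n} a(d)\, b(n/d) = \sum_{d \mid n} a(n/d)\, b(d). \]

Proof. The product is
\[ F(s)\, G(s) = \sum_{d=1}^{\infty} \sum_{k=1}^{\infty} \frac{a(d)\, b(k)}{d^s k^s} = \sum_{n=1}^{\infty} \sum_{d \mid n} \frac{a(d)\, b(n/d)}{n^s} = \sum_{n=1}^{\infty} \frac{c(n)}{n^s} \]
where we set $n = dk$. □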
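The coefficient rule c(n) = Σ_{d | n} a(d) b(n/d) is the operation usually called Dirichlet convolution (a name not used in these notes). A minimal Python sketch, with the hypothetical function `dirichlet_convolution`, computes the coefficients c(1), ..., c(limit) by looping over the pairs (d, k) with dk ≤ limit, mirroring the substitution n = dk in the proof.

```python
def dirichlet_convolution(a, b, limit):
    """Return a list c with c[n] = sum over d | n of a(d) * b(n / d), for 1 <= n <= limit.

    `a` and `b` are functions on the positive integers; index 0 of the result is unused.
    """
    c = [0] * (limit + 1)
    for d in range(1, limit + 1):
        for k in range(1, limit // d + 1):
            c[d * k] += a(d) * b(k)  # the pair (d, k) contributes to n = d * k
    return c


def u(n):
    """The constant function u(n) = 1, whose Dirichlet series is zeta(s)."""
    return 1


def identity(n):
    """The identity function n -> n, whose Dirichlet series is zeta(s - 1)."""
    return n


if __name__ == "__main__":
    divisor_counts = dirichlet_convolution(u, u, 12)
    divisor_sums = dirichlet_convolution(identity, u, 12)
    print("coefficients of zeta(s)^2:        ", divisor_counts[1:])
    print("coefficients of zeta(s-1) zeta(s):", divisor_sums[1:])
```

The two printed rows are the values d(n) and σ(n) for n = 1, ..., 12, for instance d(12) = 6 and σ(12) = 28.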

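Lemma 49. Let $F(s) = \sum_{n=1}^{\infty} \frac{a(n)}{n^s}$. Then $F'(s) = \sum_{n=1}^{\infty} \frac{b(n)}{n^s}$ where
\[ b(n) = -\log(n) \cdot a(n). \]

Proof. Since $\frac{d}{ds}(n^{-s}) = -\log(n) \cdot n^{-s}$, we have
\[ F'(s) = \frac{d}{ds} \sum_{n=1}^{\infty} \frac{a(n)}{n^s} = \sum_{n=1}^{\infty} \frac{-\log(n) \cdot a(n)}{n^s}. \]
□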


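Examples:

• The Dirichlet series associated to the identity function $\mathbb{N} \to \mathbb{N}$, $n \mapsto n$ is given by
\[ \sum_{n=1}^{\infty} \frac{n}{n^s} = \sum_{n=1}^{\infty} \frac{1}{n^{s-1}} = \zeta(s-1). \]

• $\zeta(s) \cdot \zeta(s) = \sum_{n=1}^{\infty} \frac{d(n)}{n^s}$. Indeed,
\[ \zeta(s) \cdot \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} \cdot \sum_{n=1}^{\infty} \frac{1}{n^s} = \sum_{n=1}^{\infty} \frac{\sum_{d \mid n} 1 \cdot 1}{n^s} = \sum_{n=1}^{\infty} \frac{d(n)}{n^s}. \]

• $\sum_{n=1}^{\infty} \frac{\sigma(n)}{n^s} = \zeta(s-1)\, \zeta(s)$. Indeed,
\[ \zeta(s-1)\, \zeta(s) = \sum_{n=1}^{\infty} \frac{n}{n^s} \sum_{n=1}^{\infty} \frac{1}{n^s} = \sum_{n=1}^{\infty} \frac{\sum_{d \mid n} d \cdot 1}{n^s} = \sum_{n=1}^{\infty} \frac{\sigma(n)}{n^s}. \]

• $\zeta'(s) = -\sum_{n=1}^{\infty} \frac{\log(n)}{n^s}$.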

Lemma 50. Let $\varepsilon$ be the arithmetic function defined by $\varepsilon(1) = 1$ and $\varepsilon(n) = 0$ for all $n > 1$. We have:
\[ \sum_{d \mid n} \mu(d) = \varepsilon(n). \]

Proof. It is easy to see that by definition $\mu$ is multiplicative. Therefore, the function $n \mapsto \sum_{d \mid n} \mu(d)$ is also multiplicative. Thus, it suffices to compute its value on prime powers. We have
\[ \sum_{d \mid p^k} \mu(d) = \begin{cases} 1, & \text{if } k = 0, \\ 1 - 1 + 0 + \dots + 0 = 0, & \text{if } k > 0. \end{cases} \]
□

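Corollary 51. $\sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)}$.

Proof. We have
\[ \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} \sum_{n=1}^{\infty} \frac{1}{n^s} = \sum_{n=1}^{\infty} \frac{\sum_{d \mid n} \mu(d) \cdot 1}{n^s} = 1. \]
□

In a similar way, using $\sum_{d \mid n} \varphi(d) = n$, one shows that
\[ \sum_{n=1}^{\infty} \frac{\varphi(n)}{n^s} = \frac{\zeta(s-1)}{\zeta(s)}. \]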

We end with the Möbius inversion formula:

Theorem 52. For arithmetic functions $f$ and $g$, the following statements are equivalent:

(1) $g(n) = \sum_{d \mid n} f(d)$ for all $n$.
(2) $f(n) = \sum_{d \mid n} \mu(d)\, g(n/d) = \sum_{d \mid n} \mu(n/d)\, g(d)$ for all $n$.

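Proof. Assume that (1) holds, then
\[ \sum_{d \mid n} \mu(d)\, g(n/d) = \sum_{d \mid n} \mu(d) \sum_{c \mid \frac{n}{d}} f(c) = \sum_{c \mid n} f(c) \sum_{d \mid \frac{n}{c}} \mu(d) = f(n) \]
where we used the fact that a pair of positive integers $(c, d)$ satisfies $d \mid n$ and $c \mid \frac{n}{d}$ if and only if $c \mid n$ and $d \mid \frac{n}{c}$, and that $\sum_{d \mid \frac{n}{c}} \mu(d) = 0$ unless $c = n$.

Assume that (2) holds, then, using the second expression for $f$ in (2),
\[ \sum_{d \mid n} f(d) = \sum_{d \mid n} \sum_{c \mid d} \mu(d/c)\, g(c) = \sum_{c \mid n} \sum_{d' \mid \frac{n}{c}} \mu(d')\, g(c) = g(n) \]
where we used the fact that a pair $(c, d)$ of positive integers satisfies $d \mid n$ and $c \mid d$ if and only if the pair $(c, d') = (c, d/c)$ satisfies $c \mid n$ and $d' \mid \frac{n}{c}$, and again that $\sum_{d' \mid \frac{n}{c}} \mu(d') = 0$ unless $c = n$. □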
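The inversion formula can be tested on examples: start from an arbitrary f, form g(n) = Σ_{d | n} f(d), and recover f from g via Σ_{d | n} µ(d) g(n/d). The Python sketch below does this by brute force; the helper names `summatory` and `mobius_invert` are hypothetical, and the test function f(n) = n² + 1 is an arbitrary choice that is deliberately not multiplicative.

```python
def divisors(n):
    """All positive divisors of n, by trial division."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """mu(n): 0 if a square of a prime divides n, else (-1)^(number of prime factors)."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            count += 1
        p += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1


def summatory(f):
    """Return g with g(n) = sum of f(d) over the divisors d of n."""
    return lambda n: sum(f(d) for d in divisors(n))


def mobius_invert(g):
    """Return f with f(n) = sum of mu(d) * g(n / d) over the divisors d of n."""
    return lambda n: sum(mobius(d) * g(n // d) for d in divisors(n))


if __name__ == "__main__":
    f = lambda n: n * n + 1          # arbitrary, not multiplicative
    recovered = mobius_invert(summatory(f))
    print(all(recovered(n) == f(n) for n in range(1, 201)))
    # Inverting g(n) = n gives a function h with sum of h(d) over d | n equal to n.
    print([mobius_invert(lambda n: n)(n) for n in range(1, 13)])
```

The second line of output lists the values of the function obtained by inverting g(n) = n; by the identity Σ_{d | n} φ(d) = n quoted above, these are the values of Euler's totient function.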