
Math 180A - Notes

Neil Donaldson

March 14, 2018

1 Introduction & Notation

Number Theory is primarily concerned with the properties of integers and with integer solutions to equations, so-called Diophantine Equations in honor of Diophantus of Alexandria, a Greek mathematician of the 3rd century CE. Here are some classic number theory problems and examples.

1. Find all the integer points (x, y) on the line $3x - 2y = 1$. The answer is $(x, y) = (1 + 2n, 1 + 3n)$ where $n \in \mathbb{Z}$. Can you prove right now that these are all the solutions?

2. If n is an odd integer then $n^2 - 1$ is a multiple of 8.

3. Can we find all Pythagorean triples: integers x, y, z such that $x^2 + y^2 = z^2$?

4. Prime numbers: if n is prime, what is the next prime? Is there a formula for the nth prime? Is $n^2 + n + 41$ always prime whenever n is an integer?

5. Which integers can be written as the sums of two squares? Three? Four?

6. Fermat’s Last Theorem:¹ if $n \ge 3$ is an integer, then there are no positive integers x, y, z such that $x^n + y^n = z^n$.

Mathematics restricted to the integers is less intuitive than with the real or rational numbers. The fundamental reason is that division within the integers is often impossible: for instance 7 ÷ 4 is not an integer. Instead, an alternative notion of division involving remainders is used: e.g. 7 ÷ 4 is 1 remainder 3. In algebraic language the integers are merely a ring, not a field like the rationals or reals.

Notation

The Integers: $\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$

The Natural Numbers: $\mathbb{N} = \{1, 2, 3, 4, \ldots\}$

The Whole Numbers: $\mathbb{N}_0 = \{0, 1, 2, 3, \ldots\}$

The Rational Numbers: $\mathbb{Q} = \{\frac{m}{n} : m \in \mathbb{Z}, n \in \mathbb{N}\}$

The real numbers R and complex numbers C will not play much role in this class.

¹ Historical note: In 1637 Pierre de Fermat left a note in the margin of a copy of Diophantus’ Arithmetica famously claiming to have proved his ‘theorem.’ A complete proof took mathematicians another three and a half centuries. . .

Divisibility in the integers

Given two integers m, n, it is unlikely that the ratio $\frac{n}{m}$ is also an integer. E.g. $\frac{2}{3} \notin \mathbb{Z}$. Algebraists would say the following:

Z is not closed under division.

The first order of business is to identify those pairs of integers for which division is allowed.

Definition 1.1. Let m, n ∈ Z. We say that m divides n, and write m | n if:

∃k ∈ Z such that n = km

We say that m is a divisor or factor of n.

A common factor of two integers x, y is any (positive) integer d such that d | x and d | y. We say that x, y are relatively prime or coprime² if the only positive common factor is 1.

Notes and conventions

• Keep the line vertical! m | n is a proposition (a statement which is either true or false), whereas $m/n = \frac{m}{n}$ is (usually) a rational number. Thus:

4 | 12 is true, 7 | 9 is false, $\frac{4}{12}$ is a rational number.

Some version of the following is a very common mistake:

$m \mid n \;\rightsquigarrow\; m/n \;\rightsquigarrow\; \frac{m}{n} \in \mathbb{Z}$

Not only are we confusing propositions with numbers, but the resulting fraction is upside-down!

• The word positive is usually omitted when talking about common factors. For instance, even though −2 is a common factor of 8 and −12, it is common to say that the common factors are only 1, 2 and 4.

• Note that m | 0 for all integers m, since k = 0 satisfies the definition! In particular, 0 | 0 is true.

• The previous observation relates to a subtlety in the definition. It would be tempting to say that m divides n if and only if $\frac{n}{m}$ is an integer. There are two problems with this:

– $\frac{n}{m} \in \mathbb{Z} \implies m \mid n$ is true but its converse is false, 0 | 0 being the sole counter-example.

– Divisibility is a property solely of the integers. It is somehow cleaner not to introduce the concept of a rational number $\frac{n}{m}$ into a discussion purely about integers.

² Colloquially we say that x, y have no common factors.

2 Pythagorean triples

As a motivational problem, we find all positive integers x, y, z for which $x^2 + y^2 = z^2$. It is easy to find many:

1. Take a known Pythagorean triple (3, 4, 5) and multiply it by a constant. Thus $(3n)^2 + (4n)^2 = (5n)^2$ for any $n \in \mathbb{N}$. We immediately have infinitely many triples.

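2. Use a spreadsheet or computer program to run through a large number of pairs (x, y) of integers, take the square root of $x^2 + y^2$, and test whether this is an integer. For example: in C++, the code³

    #include <cmath>
    #include <cstdio>

    int main() {
        for (int x = 1; x <= 100; ++x) {
            for (int y = x; y <= 100; ++y) {
                int s = x * x + y * y;      // not x^2: in C/C++ the ^ operator is bitwise XOR
                int z = (int)std::lround(std::sqrt((double)s));
                if (z * z == s) { std::printf("%d %d %d\n", x, y, z); }   // s is a perfect square
            }
        }
    }

would return all Pythagorean triples where x ≤ y ≤ 100 (the last in the list is (80, 84, 116)).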

But what if we want to find them all? We need to proceed more deviously. Case 1 above is a motivator: in the triple (3, 4, 5), none of x, y, z have any common factors.

Definition 2.1. A Pythagorean triple (x, y, z) is primitive if no pair of x, y, z have a common factor.

We can now state some basic results that help narrow our search:

Lemma 2.2. Suppose that (x, y, z) is a Pythagorean triple.

1. If any pair of x, y, z have a common factor, the third shares this factor.

2. All non-primitive triples are a common multiple of a primitive triple.

3. If (x, y, z) is primitive, then z is odd.

Proof (sketch). 1. This is hard at the moment: it depends on being able to show that $d^2 \mid m^2 \implies d \mid m$. This follows very quickly from unique factorization, which we shall see later. . .

2. If a triple is non-primitive then some pair of x, y, z have a common factor. By part 1 they all do. Divide x, y, z by their greatest common factor d to obtain the primitive triple $(\frac{x}{d}, \frac{y}{d}, \frac{z}{d})$.

3. If (x, y, z) is primitive, then at most one of x, y, z can be even. Moreover, they cannot all be odd, since odd + odd ≠ odd.
If z = 2m is even, then x and y are both odd and we may write x = 2k + 1 and y = 2l + 1 for some integers k, l. But then

$4m^2 = x^2 + y^2 = (2k + 1)^2 + (2l + 1)^2 = 4(k^2 + l^2 + k + l) + 2.$

The right hand side is not divisible by 4 so we have a contradiction. Hence z must be odd.

³ This code is very inefficient but is fine for investigating. A more efficient algorithm could be based on Theorem 2.3.

To summarize the Lemma, we may assume that a primitive Pythagorean triple (x, y, z) has x, z odd and y even. We are now ready to finish things off.

Suppose that (x, y, z) is a primitive triple where y is even. Then

$x^2 = z^2 - y^2 = (z - y)(z + y)$

Observe that z − y and z + y have no common factors, for if they did, such would be a common factor of y and z: a contradiction. (Any common factor is odd, since z − y and z + y are both odd, and it divides their sum 2z and their difference 2y.)
It can now be shown⁴ that z − y and z + y must both be perfect squares. Write

$z - y = t^2, \quad z + y = s^2$

Moreover, s, t must be relatively prime for otherwise y, z have a common factor. We have therefore sketched a proof of the following result.

Theorem 2.3. All primitive triples (x, y, z) with x odd and y even have the form

$x = st, \quad y = \frac{s^2 - t^2}{2}, \quad z = \frac{s^2 + t^2}{2}$

where s > t ≥ 1 are any odd integers with no common factor.

For example, take s = 9, t = 5 to get (45, 28, 53). All Pythagorean triples are simply multiples of these.
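Theorem 2.3 translates directly into a short search. The following is a minimal C++ sketch and not part of the original notes: the names `Triple`, `gcd_of` and `primitive_triples` are hypothetical, and the bound `s_max` is an assumption introduced only to make the search finite.

```cpp
#include <vector>

struct Triple { long x, y, z; };

// Euclid's algorithm on non-negative inputs (the gcd is treated later in the notes).
long gcd_of(long a, long b) {
    while (b != 0) { long r = a % b; a = b; b = r; }
    return a;
}

// List the triples of Theorem 2.3 for all odd, coprime s > t >= 1 with s <= s_max.
std::vector<Triple> primitive_triples(long s_max) {
    std::vector<Triple> out;
    for (long s = 3; s <= s_max; s += 2) {        // s odd
        for (long t = 1; t < s; t += 2) {         // t odd, t < s
            if (gcd_of(s, t) != 1) continue;      // skip pairs with a common factor
            Triple tr = { s * t, (s * s - t * t) / 2, (s * s + t * t) / 2 };
            out.push_back(tr);
        }
    }
    return out;
}
```

For instance, `primitive_triples(9)` starts with (3, 4, 5) and contains the triple (45, 28, 53) from the example above.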

3 Pythagorean Triples and the Unit Circle

Assume that (x, y, z) is a Pythagorean triple. Then

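$x^2 + y^2 = z^2 \implies \left(\frac{x}{z}\right)^2 + \left(\frac{y}{z}\right)^2 = 1$

Since $x, y, z \in \mathbb{N}$, it follows that the point $(\frac{x}{z}, \frac{y}{z})$ is a rational point⁵ on the unit circle. For example,

$\left(\frac{3}{5}\right)^2 + \left(\frac{4}{5}\right)^2 = 1$

whence $(\frac{3}{5}, \frac{4}{5})$ is a rational point on the unit circle.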

Conversely, suppose that (α, β) is a rational point satisfying $\alpha^2 + \beta^2 = 1$. Let d be the product of the denominators of α, β. Then αd and βd are both integers. Moreover,

$(\alpha d)^2 + (\beta d)^2 = d^2$

We therefore have a Pythagorean triple (αd, βd, d).

Indeed there is a correspondence between rational points on the unit circle and Pythagorean triples. The correspondence is not 1–1, but with a little care it can be made so. We state the following without proof.

⁴ Unique factorization again. . .

⁵ A point whose co-ordinates are both rational numbers.

Theorem 3.1. 1. Suppose that (x, y, z) is a primitive Pythagorean triple. Then $(\frac{x}{z}, \frac{y}{z})$ is a rational point in the first quadrant of the unit circle.

2. Suppose that (α, β) is a rational point in the first quadrant of the unit circle. When written in lowest terms, $\alpha = \frac{a}{c}$ and $\beta = \frac{b}{c}$ have the same denominator c. It follows that (a, b, c) is a primitive Pythagorean triple.

To obtain a formula for the rational points we could simply divide the values for (x, y, z) in Theorem 2.3 to obtain

$\left(\frac{x}{z}, \frac{y}{z}\right) = \left(\frac{2st}{s^2 + t^2}, \frac{s^2 - t^2}{s^2 + t^2}\right) = \left(\frac{2m}{m^2 + 1}, \frac{m^2 - 1}{m^2 + 1}\right)$ where $m = \frac{s}{t}$

Noting, for primitive triples, that s > t, we see that $m^2 > 1$ whence the resulting point really does lie in the first quadrant.

Alternative viewpoint

We could instead have started with the geometric problem of finding all rational points (x, y) on the unit circle. For this, imagine drawing a straight line with gradient m through the point (0, −1). Where does this intersect the circle?

We want to solve the simultaneous equations

$\begin{cases} x^2 + y^2 = 1 \\ y = mx - 1 \end{cases}$

Substituting one in the other, we obtain

$x^2 + m^2x^2 - 2mx + 1 = 1 \implies x[(m^2 + 1)x - 2m] = 0 \implies x = 0, \ \frac{2m}{m^2 + 1}$

x = 0 manifestly gives us our base point (0, −1), whereas the other yields

$y = mx - 1 = \frac{2m^2}{m^2 + 1} - 1 = \frac{m^2 - 1}{m^2 + 1}$

We therefore obtain the second point of intersection

$(x, y) = \left(\frac{2m}{m^2 + 1}, \frac{m^2 - 1}{m^2 + 1}\right)$

It is immediate that this is a rational point if and only if m is rational. Indeed we can go a little further: letting m = 0 yields the point (0, −1), while⁶ m = ∞ results in the point (0, 1). We have therefore proved:

⁶ I.e. $\lim_{m \to \infty} (x, y) = (0, 1)$.

Theorem 3.2. There is a bijective correspondence between the set of extended rational numbers $\mathbb{Q} \cup \{\infty\}$ and the rational points on the unit circle according to the formula

$m \mapsto (x, y) = \left(\frac{2m}{m^2 + 1}, \frac{m^2 - 1}{m^2 + 1}\right)$

Indeed m can be interpreted as the gradient of the line joining the south pole (0, −1) with the desired rational point (x, y).

The picture shows the line with gradient $m = -\frac{5}{2}$ through the south pole S, which generates the point $P = (-\frac{20}{29}, \frac{21}{29})$. Note that (20, 21, 29) is a (primitive) Pythagorean triple.

[Figure: the unit circle in the (x, y)-plane with the line through S and P.]
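The map of Theorem 3.2 can be evaluated in exact integer arithmetic. The sketch below is not from the notes; it assumes the gradient is supplied as a fraction m = p/q, with q = 0 standing for m = ∞, and the names `CirclePoint`, `gcd_abs` and `circle_point` are hypothetical. Multiplying the formula through by $q^2$ gives the point $\left(\frac{2pq}{p^2 + q^2}, \frac{p^2 - q^2}{p^2 + q^2}\right)$.

```cpp
#include <cstdlib>   // std::labs

struct CirclePoint { long xn, yn, den; };   // the point (xn/den, yn/den)

// Greatest common divisor of |a| and |b| by Euclid's algorithm.
long gcd_abs(long a, long b) {
    a = std::labs(a); b = std::labs(b);
    while (b != 0) { long r = a % b; a = b; b = r; }
    return a;
}

// Gradient m = p/q (q = 0 encodes m = infinity); p and q must not both be zero.
CirclePoint circle_point(long p, long q) {
    long xn = 2 * p * q;          // numerator of 2m/(m^2+1) after clearing q^2
    long yn = p * p - q * q;      // numerator of (m^2-1)/(m^2+1)
    long den = p * p + q * q;     // common denominator, always positive
    long g = gcd_abs(gcd_abs(xn, yn), den);
    CirclePoint pt = { xn / g, yn / g, den / g };   // reduce to lowest terms
    return pt;
}
```

With p = −5, q = 2 this reproduces the point P of the picture, and (p, q) = (1, 0) gives the north pole (0, 1).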

Generalizing the method

This method may be applied to other quadratic curves. A full discussion requires an introduction to projective geometry which will have to wait until next term, but a simplified version of the idea is as follows.

1. Let C be a curve in the plane whose equation is quadratic with rational coefficients. I.e.

$ax^2 + bxy + cy^2 + dx + ey + f = 0$ where $a, b, c, d, e, f \in \mathbb{Q}$

2. Suppose that S is a rational point on C.

3. All rational points on C may be found by drawing a line through S which is either vertical or has rational gradient and intersecting it with C.

Example Find all the rational points on the hyperbola x(y + x) = 3.

We may choose S = (1, 2). A line with gradient m through S has equation

$y = m(x - 1) + 2$

Substituting into the original curve, we obtain

$(m + 1)x^2 + (2 - m)x - 3 = 0 \implies (x - 1)[(m + 1)x + 3] = 0 \implies x = 1, \ -\frac{3}{m + 1}$

It follows that all rational points on the hyperbola are given by the formula

$(x, y) = \left(-\frac{3}{m + 1}, \frac{2 - 2m - m^2}{m + 1}\right)$ where $m \in \mathbb{Q}$, $m \neq -1$.

In this case, a vertical line (m = ∞) does not yield a further point on the hyperbola.

[Figure: the hyperbola x(y + x) = 3 in the (x, y)-plane with the line through S and P.]

Hopefully these introductory sections convince you that the approaches that may be required in Number Theory are very different to those seen in other courses. We’ve already seen a deep connection to Geometry; there are equally deep links to other areas of Mathematics. It is now time we started a thorough discussion of the integers: of divisibility and of the prime numbers.

5 Divisibility and the Greatest Common Divisor

First we rehash part of Definition 1.1.

Definition 5.1. Let a, b, d be integers: if d satisfies d | a and d | b then d is a common divisor⁷ of a and b.
Suppose that a, b are not both zero. The greatest common divisor⁸ of a, b is written d = gcd(a, b).
We say that a, b are coprime or relatively prime iff gcd(a, b) = 1.

Examples gcd(0, 9) = 9, gcd(45, 33) = 3, gcd(162, 450) = 18.

The definition may be extended to any list of numbers: $\gcd(a_1, \ldots, a_n)$ is the largest divisor of all the numbers $a_1, \ldots, a_n$.

A famous algorithm exists for computing the GCD of a pair of numbers. Since it dates back at least to Euclid it is named for him. The extended Euclidean Algorithm (Bezout’s Identity) will be even more useful to us, for it shows not only how to find the GCD d of two integers a, b, but also how to construct integers x, y satisfying the linear Diophantine equation ax + by = d. Using this approach will allow us to find all solutions to such equations.

Theorem 5.2 (Division algorithm). If $a \in \mathbb{Z}$, $b \in \mathbb{N}$ then there exist unique $q, r \in \mathbb{Z}$ such that

$a = qb + r, \quad 0 \le r < b.$

We call q the quotient and r the remainder. While we can’t divide in the integers, we can calculate using remainders exactly as you did in elementary school:

$13 \div 4 = 3\ \mathrm{r}\ 1 \iff 13 = 3 \cdot 4 + 1, \qquad a \div b = q\ \mathrm{r}\ r \iff a = q \cdot b + r$

Proof. Consider the set $S = \mathbb{N}_0 \cap \{a - bz : z \in \mathbb{Z}\}$. This is a non-empty (take z large and negative) subset of $\mathbb{N}_0$, whence (well-ordering) it has a minimum element. Call this minimum r. Certainly $r \in [0, b)$ for otherwise $r - b \in S$, contradicting minimality. Now let $q = \frac{a - r}{b}$ be the corresponding choice of z. This establishes existence.

For uniqueness, suppose that $a = q_1 b + r_1$ and $a = q_2 b + r_2$ where $0 \le r_1, r_2 < b$. Then

$-b < r_1 - r_2 < b$ and $r_1 - r_2 = (q_2 - q_1)b$

Thus $r_1 - r_2$ is divisible by b and lies in the interval (−b, b). Clearly $r_2 = r_1$, whence $q_2 = q_1$ and we have uniqueness.

⁷ By convention one tends to list only positive common divisors.

⁸ All positive common divisors satisfy d ≤ max(|a|, |b|), hence there are a finite number of them; a greatest such must therefore exist.

While it is known as an algorithm, the presentation of Theorem 5.2 doesn’t seem very algorithmic: indeed we shall simply take it as given that we can find q, r by whatever means we wish (messing with a calculator is fine!). To see it more as an algorithm, consider the case where a > 0 and follow these instructions:

1. Is a < b? If Yes, stop: r = a and q = 0.

2. Otherwise, compute a− b.

3. Is a− b < b? If Yes, stop: r = a− b and q = 1.

4. Otherwise, compute a− 2b, etc.

5. Repeat until the process terminates.

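For example, the following simple program computes q = 34 and r = 2 from a = 240 and b = 7 simply by subtracting 7 from a until it can no longer do so. You can check that 240 = 34 · 7 + 2.

    #include <cstdio>

    int main() {
        int a = 240;
        int b = 7;
        int q = 0;              // quotient: number of subtractions so far
        int r = a;              // running remainder
        while (r >= b) {
            r = r - b;
            q = q + 1;
        }
        std::printf("%d %d\n", q, r);   // q = 34, r = 2
    }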

The Euclidean Algorithm for computing gcd(a, b)

Suppose a > b > 0. By Theorem 5.2 there exist integers $q_1, r_1$ with $0 \le r_1 < b$ such that

$a = q_1 b + r_1$

Supposing $r_1 \neq 0$ and noting that $r_1 < b$ we apply the Division Algorithm again to see that there exist $q_2, r_2$ with $0 \le r_2 < r_1$ and

$b = q_2 r_1 + r_2$

We iterate this process until we reach a remainder⁹ $r_{k+1} = 0$:

(Line 1) $a = q_1 b + r_1$

(Line 2) $b = q_2 r_1 + r_2$

(Line 3) $r_1 = q_3 r_2 + r_3$

⋮

(Line k − 1) $r_{k-3} = q_{k-1} r_{k-2} + r_{k-1}$

(Line k) $r_{k-2} = q_k r_{k-1} + r_k$

(Line k + 1) $r_{k-1} = q_{k+1} r_k + 0$

⁹ To help distinguish quotients from remainders, when working the Algorithm we will type all remainders $a, b, r_1, r_2, \ldots$ in boldface; observe how one can trace the same remainder diagonally ↙ from one line to the next.

We can now state the important result:

Theorem 5.3. The Euclidean Algorithm always terminates with final non-zero remainder $r_k = \gcd(a, b)$.

Proof. First observe that the sequence $a > b > r_1 > r_2 > r_3 > \cdots > 0$ is a decreasing sequence of positive integers. At worst, one might imagine that this sequence takes b steps to reach 0 (in practice it requires far fewer). We may therefore follow the algorithm for any pair of integers a > b > 0 and be assured of its termination.
Now let d = gcd(a, b) and consider the first line of the Algorithm:

$a = q_1 b + r_1$

Certainly $r_1 = a - q_1 b$ is divisible by d, whence d is a common divisor of b and $r_1$. Moreover, if c were any larger common divisor of b and $r_1$, then c would divide $a = bq_1 + r_1$ and necessarily be a larger common divisor of a, b than d = gcd(a, b). This is a contradiction, whence

$\gcd(b, r_1) = \gcd(a, b)$

Iterating this (strictly by induction) we obtain

$\gcd(a, b) = \gcd(b, r_1) = \gcd(r_1, r_2) = \cdots = \gcd(r_{k-1}, r_k) = \gcd(r_k q_{k+1}, r_k) = r_k$

Note that if a or b are negative, you may still apply the Theorem to the pair |a|, |b| before compensating for the sign afterwards.

Example We use the Algorithm to compute gcd(161, 140)

161 = 1 · 140 + 21

140 = 6 · 21 + 14

21 = 1 · 14 + 7

14 = 2 · 7

⟹ gcd(161, 140) = 7

We could easily have done this by listing the positive divisors of 161 (there are only 1, 7, 23, 161) and checking which of these is also a divisor of 140, but it is good to see the Algorithm at work. For larger a, b, finding all the divisors is prohibitively time-consuming, whereas the Euclidean Algorithm will always do the job in a (relatively) efficient manner.
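The lines of the Algorithm become a short loop: each pass replaces the pair (a, b) by (b, r), where r is the remainder from the Division Algorithm. This is a minimal sketch, not part of the original notes; the function name `gcd_euclid` is hypothetical, and taking absolute values first is the sign convention assumed here for negative inputs.

```cpp
#include <cstdlib>   // std::llabs

// gcd(a, b) by the Euclidean Algorithm; a and b must not both be zero.
long long gcd_euclid(long long a, long long b) {
    a = std::llabs(a);
    b = std::llabs(b);
    while (b != 0) {
        long long r = a % b;   // a = q*b + r with 0 <= r < b
        a = b;                 // move one line down the Algorithm
        b = r;
    }
    return a;                  // the last non-zero remainder
}
```

For example, `gcd_euclid(161, 140)` passes through the remainders 21, 14, 7, 0 exactly as in the worked example.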

Bezout’s Identity

The next result is of great importance: not only does it allow us to write the GCD of two numbers in a special way, it tells us how we can construct those numbers. There are a great many existence theorems in Mathematics, but few of them tell you explicitly how to construct the desired objects.

Theorem 5.4 (Extended Euclidean Algorithm/Bezout’s Identity). Suppose that $a, b \in \mathbb{Z}$ are not both zero. Then there exist integers x, y such that

$\gcd(a, b) = ax + by$

Proof. Suppose that d = gcd(a, b). In the Euclidean Algorithm this appears in the penultimate line (line k), which can be rearranged to write d as an integer linear combination of the remainders $r_{k-2}$ and $r_{k-1}$:

$d = r_k = r_{k-2} - q_k r_{k-1}$

Move one line up the Algorithm: we can substitute for $r_{k-1}$ (using line k − 1):

$r_{k-1} = r_{k-3} - q_{k-1} r_{k-2} \implies d = r_{k-2} - q_k(r_{k-3} - q_{k-1} r_{k-2}) = (1 + q_{k-1} q_k) r_{k-2} - q_k r_{k-3}$

We now have an expression for d as an integer linear combination of the remainders $r_{k-2}$ and $r_{k-3}$. Simply continue moving up the Algorithm and substituting: after substituting for $r_j$ using line j, we will obtain an expression

$d = \alpha_{j-1} r_{j-1} + \alpha_{j-2} r_{j-2}$

where $\alpha_{j-1}, \alpha_{j-2} \in \mathbb{Z}$. Eventually one reaches the first line of the Algorithm resulting in an integer linear combination for d in terms of a and b.

The proof is much easier to follow with our above example where $d = r_3 = 7$.

7 = 21 − 1 · 14 (rearrange line 3)

= 21 − (140 − 6 · 21) (substitute for $r_2 = 14$ using line 2)

= −140 + 7 · 21

= −140 + 7 · (161 − 140) (substitute for $r_1 = 21$ using line 1)

= 7 · 161 − 8 · 140

We therefore obtain 7 = 161x + 140y where (x, y) = (7, −8).

Example Find d = gcd(1132, 490) and integers x, y such that

d = 1132x + 490y

Simply apply the Algorithm:

1132 = 2 · 490 + 152

490 = 3 · 152 + 34

152 = 4 · 34 + 16

34 = 2 · 16 + 2

16 = 8 · 2

⟹ gcd(1132, 490) = 2

We therefore have d = 2. Now reverse the steps of the Algorithm:

2 = 34 − 2 · 16 (line 4)

= 34 − 2 · (152 − 4 · 34) = 9 · 34 − 2 · 152 (line 3)

= 9 · (490 − 3 · 152) − 2 · 152 = 9 · 490 − 29 · 152 (line 2)

= 9 · 490 − 29 · (1132 − 2 · 490) = 67 · 490 − 29 · 1132 (line 1)

Hence (x, y) = (−29, 67) is a solution to d = 1132x + 490y.
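The back-substitution above can also be organised as a single forward pass. The sketch below is an iterative variant rather than the reversal used in the text: alongside each remainder it carries coefficients expressing that remainder in terms of the original a and b. The names `Bezout` and `extended_gcd` are hypothetical, and the sketch assumes a, b ≥ 0 are not both zero.

```cpp
struct Bezout { long long d, x, y; };   // d = gcd(a, b) = a*x + b*y

// Extended Euclidean Algorithm, assuming a, b >= 0 and not both zero.
Bezout extended_gcd(long long a, long long b) {
    long long x0 = 1, y0 = 0;   // current a equals (original a)*x0 + (original b)*y0
    long long x1 = 0, y1 = 1;   // current b equals (original a)*x1 + (original b)*y1
    while (b != 0) {
        long long q = a / b;
        long long r = a - q * b;                       // one line of the Algorithm
        long long x2 = x0 - q * x1, y2 = y0 - q * y1;  // coefficients of r
        a = b;  x0 = x1;  y0 = y1;
        b = r;  x1 = x2;  y1 = y2;
    }
    Bezout result = { a, x0, y0 };
    return result;
}
```

On the two worked examples this returns (d, x, y) = (7, 7, −8) for the pair (161, 140) and (2, −29, 67) for the pair (1132, 490), agreeing with the hand calculations.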


As an example of the immediate theoretical power of Theorem 5.4 we prove the following:

Corollary 5.5. Suppose that gcd(a, b) = 1 and a | bc. Then a | c.

Proof. Since gcd(a, b) = 1, there exist integers x, y such that ax + by = 1. But then (ac)x + (bc)y = c, whence

a | bc ⟹ a | LHS ⟹ a | c

Well-ordering, or the Least Integer Principle

Recall that a set (of numbers) is well-ordered if every non-empty subset has a minimum element. In particular the natural numbers form a well-ordered set. In this context, well-ordering is also known as the least integer principle: any non-empty subset of the positive integers has a minimum element.
We have now used this concept twice:

1. In the proof of the Division Algorithm, to guarantee the existence of r = min S.

2. To obtain a contradiction in the proof of the Euclidean Algorithm. The set of remainders $\{b, r_1, r_2, \ldots\}$ is a non-empty set of natural numbers: this has a minimum and since the remainders are decreasing, the minimum must be the last remainder.

This second application of well-ordering is used repeatedly in Number Theory, in particular in the method of descent. In short, any decreasing sequence of positive integers must have a minimum and therefore a finite length. The observation depends crucially on the terms of the sequence being positive integers; a decreasing sequence of positive rational numbers can be infinitely long (e.g. $(1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots)$).

6 Linear Diophantine Equations

A Linear Diophantine equation is an equation of the form ax + by = c where a, b, c ∈ Z are given and we are interested only in integer solutions (x, y). As the previous section shows we have already found solutions to some such equations: if c = gcd(a, b) then Bezout’s Identity tells us how to find a solution. As this section shows, Bezout’s Identity is essentially all one needs to deal with all linear equations. To see this, we use Bezout’s Identity to obtain an important visualization of the GCD of two numbers.

Theorem 6.1. If a, b ∈ Z are not both zero, then d = gcd(a, b) is the least positive member of the set

D = {ax + by : x, y ∈ Z}

Moreover, if E = {md : m ∈ Z} is the set of all integer multiples of d, then E = D.

Proof. If one of a or b is zero then the GCD is the absolute value of the other and the theorem is trivial. If either is negative, consider |a|, |b| and observe that the set D is independent of the signs of a, b. We thus assume without loss of generality that a > b > 0 and that we have applied the Euclidean Algorithm and its Extension to obtain integers x, y such that

d = ax + by

We have therefore shown that d ∈ D. Moreover, we easily see that

md = a(mx) + b(my) ∈ D ⟹ E ⊆ D

Conversely, d | aX + bY for all X, Y ∈ Z; every element aX + bY ∈ D is therefore a multiple of d and so D ⊆ E. The two sets are identical.
Finally observe that d is clearly the least positive element of E.

Corollary 6.2. The Diophantine equation ax + by = c has a solution if and only if gcd(a, b) | c.

Proof. We have a solution iff c ∈ D which, by the Theorem, is iff c is a multiple of d = gcd(a, b).

Example Show that 147x − 45y = 2 has no solutions in integers.

147 = 3 · 45 + 12

45 = 3 · 12 + 9

12 = 1 · 9 + 3

9 = 3 · 3

⟹ gcd(147, 45) = 3

Since {147x − 45y : x, y ∈ Z} = {3n : n ∈ Z} does not contain 2, there are no solutions to the equation.

General Solutions

We have already seen (Corollary 6.2) that ax + by = c has a solution in integers iff d = gcd(a, b) | c, and how, when a solution exists, to find one using the Extended Euclidean algorithm (Theorem 5.4). Here we find all solutions to such an equation.
Suppose that d | c so that we have a solution $(x_0, y_0)$ to ax + by = c. Moreover, suppose that $(x_1, y_1)$ is another solution. Then

$a(x_1 - x_0) + b(y_1 - y_0) = (ax_1 + by_1) - (ax_0 + by_0) = c - c = 0$

It follows that the difference $(x_1 - x_0, y_1 - y_0)$ is a solution to the associated homogeneous equation¹⁰

$ax + by = 0$

Indeed we see that

$ax + by = c \iff (x, y) = (x_0, y_0) + (x_h, y_h)$ where $ax_h + by_h = 0$

It remains to solve the homogeneous equation. For this, divide by d to obtain

$\frac{a}{d} x_h + \frac{b}{d} y_h = 0 \implies \frac{b}{d} y_h = -\frac{a}{d} x_h \qquad (\ast)$

Note that the coefficients are integers and that $\gcd\left(\frac{a}{d}, \frac{b}{d}\right) = 1$. Since $\frac{b}{d}$ divides the left side of (∗), it divides $\frac{a}{d} x_h$, and we may appeal to Corollary 5.5 to see that $\frac{b}{d}$ divides $x_h$. We quickly conclude that

$x_h = \frac{b}{d} t$ and $y_h = -\frac{a}{d} t$ for some $t \in \mathbb{Z}$.

Indeed we have proved the following:

¹⁰ This method of solution is analogous to the standard approach to inhomogeneous linear ordinary differential equations, and to non-homogeneous linear algebra problems Ax = b.

Theorem 6.3. The Diophantine equation ax + by = c has a solution iff d | c where d = gcd(a, b). In such a case there are infinitely many solutions: if $(x_0, y_0)$ is a given solution then all may be found using the formula

$(x, y) = \left(x_0 + \frac{b}{d} t, \; y_0 - \frac{a}{d} t\right)$ where $t \in \mathbb{Z}$

We have therefore reduced the problem to finding the GCD d = gcd(a, b) and a single solution $(x_0, y_0)$ to ax + by = c. Thankfully the (Extended) Euclidean Algorithm does both for us! Remember to take care to solve the correct equation; Bezout’s Identity only solves ax + by = d: if d ≠ c then multiply your solution $(x_0, y_0)$ accordingly. Moreover, if the signs of a, b are not positive take this into account in your final answer.

1. Find all the solutions to the Diophantine equation 161x + 140y = −14.
From before we have d = gcd(161, 140) = 7 and a solution (7, −8) to 161x + 140y = 7. Multiplying this by −2 to obtain a solution to the desired equation, we see that the general solution to 161x + 140y = −14 is

$(x, y) = \left(-14 + \frac{140}{7} t, \; 16 - \frac{161}{7} t\right) = (-14 + 20t, \; 16 - 23t) : t \in \mathbb{Z}$

2. Find all the solutions in integers to the equation 490x − 1132y = 4.
We know that d = gcd(1132, 490) = 2 and that (−29, 67) is a solution to 1132x + 490y = 2. Rearranging this and taking the signs into account, we see that $(x_0, y_0) = (134, 58)$ is a solution to the equation of interest. Hence the general solution is

$(x, y) = \left(134 + \frac{1132}{2} t, \; 58 + \frac{490}{2} t\right) = (134 + 566t, \; 58 + 245t) : t \in \mathbb{Z}.$
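Theorem 6.3 and the recipe above can be bundled into one routine. This is a minimal sketch under stated assumptions, not the notes' own code: the names `LinearSolution` and `solve_linear` are hypothetical, a and b are assumed not both zero, and signs are handled by running the extended algorithm with C++'s truncating division and normalising the gcd to be positive at the end (a different bookkeeping from the "take the signs into account" step done by hand above).

```cpp
struct LinearSolution {
    bool solvable;          // false when gcd(a, b) does not divide c
    long long x0, y0;       // one particular solution
    long long dx, dy;       // every solution is (x0 + dx*t, y0 + dy*t), t an integer
};

// Solve a*x + b*y = c in integers; a and b must not both be zero.
LinearSolution solve_linear(long long a, long long b, long long c) {
    long long r0 = a, x0 = 1, y0 = 0;     // invariant: r0 = a*x0 + b*y0
    long long r1 = b, x1 = 0, y1 = 1;     // invariant: r1 = a*x1 + b*y1
    while (r1 != 0) {
        long long q = r0 / r1;
        long long r2 = r0 - q * r1, x2 = x0 - q * x1, y2 = y0 - q * y1;
        r0 = r1; x0 = x1; y0 = y1;
        r1 = r2; x1 = x2; y1 = y2;
    }
    if (r0 < 0) { r0 = -r0; x0 = -x0; y0 = -y0; }   // make d = r0 positive
    LinearSolution s = { false, 0, 0, 0, 0 };
    if (c % r0 != 0) return s;                      // Corollary 6.2: no solutions
    long long k = c / r0;
    s.solvable = true;
    s.x0 = x0 * k;  s.y0 = y0 * k;                  // scale the Bezout solution by c/d
    s.dx = b / r0;  s.dy = -a / r0;                 // solutions of the homogeneous equation
    return s;
}
```

For 161x + 140y = −14 this returns the particular solution (−14, 16) with step (20, −23), matching the first example. For equations with negative coefficients the particular solution it finds may differ from the one chosen by hand, but it describes the same solution set.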

7 Primes and Unique Factorization

Now we turn to the building blocks of the integers, the prime numbers. Intuitively we understand what a prime is and that a positive integer can be decomposed into a product of primes: e.g.

$156 = 2^2 \cdot 3 \cdot 13$

The primary question for this section involves establishing that 156 cannot be factored into primes in any other way. Indeed the same is true for all positive integers: up to reordering there is one, and only one, decomposition as a product of primes. This famous result is known as the Unique Factorization Theorem or the Fundamental Theorem of Arithmetic.

You have probably come across two different notions of a prime number:

1. A prime is an integer ≥ 2 whose only positive divisors are 1 and itself.

2. A prime is an integer ≥ 2 which, if it divides a product of integers, must divide at least one of them.

In abstract algebra, the first notion is known as irreducibility and the second primality. The challenge of proving unique factorization is showing the uniqueness part which, in essence, amounts to showing that these two concepts are the same. We restate the definitions for clarity.

Definition 7.1. Let z ≥ 2 be an integer. We say that z is:

Irreducible if, for any positive k, we have k | z =⇒ k = 1 or k = z.

Composite if z is not irreducible.

Prime if z | ab =⇒ z | a or z | b.

We also refer to ±1 as units.¹¹

We will build up to Unique Factorization in two stages. First we show that every positive integer may be factored in terms of irreducibles. Then, by showing that primes and irreducibles are identical, we see that said factorization is unique.

Irreducibles and Composites

First observe that a composite number z must have a positive divisor which is neither 1 nor z. That is, z is composite iff there exist positive integers a, b, neither of which is a unit, such that z = ab.

Lemma 7.2. Every composite number is divisible by an irreducible.

Proof. Suppose that z ≥ 2 is composite, but has no irreducible factors. Then:

• We may write $z = a_1 b_1$ where $a_1, b_1 \ge 2$ are not irreducible, and thus must be composites.

• If $a_1$ had an irreducible factor then this would be an irreducible factor of z. Hence $a_1$ is composite and may be written $a_1 = a_2 b_2$ for $a_2, b_2 \ge 2$ composite.

• We may repeat the process ad infinitum:

$z = a_1 b_1 = a_2 b_2 b_1 = a_3 b_3 b_2 b_1 = \cdots$

Since each $b_n \ge 2$ we see that $(a_1, a_2, a_3, a_4, \ldots)$ is an infinite decreasing sequence of positive integers. This is a contradiction.

We conclude that z must have an irreducible factor.

We can use Lemma 7.2 to prove Euclid’s famous theorem that the set of irreducibles (primes) is infinite.

Theorem 7.3. There are infinitely many irreducibles.

Proof. Suppose that $\{p_1, \ldots, p_n\}$ constitutes all the irreducibles and consider $P := p_1 \cdots p_n + 1$. By Lemma 7.2, P has an irreducible factor p which, by assumption, is one of our irreducibles $p_i$. But then

$p \mid P$ and $p \mid p_1 \cdots p_n \implies p \mid 1$

This contradicts the fact that p ≥ 2.

¹¹ In a pure algebra sense, we should also deal with negative numbers and state that, for instance, −2 is prime/irreducible. Don’t worry if this makes you uncomfortable: we won’t do this!

Theorem 7.4 (Fundamental Theorem of Arithmetic, part 1 (existence)). Every integer a ≥ 2 may be factorized into irreducibles: that is

$a = p_1 \cdots p_n$

where $p_1, \ldots, p_n$ are a list of irreducibles.

Proof. This is algorithmic. If a is irreducible, we are done. Otherwise (Lemma 7.2) a has an irreducible factor $p_1$. But then $a = p_1 a_1$ for some $a_1 \in \mathbb{N}$.
If $a_1$ is an irreducible $p_2$, we are done. If $a_1$ is composite, apply Lemma 7.2 again to obtain an irreducible factor $p_2$ and write $a_1 = p_2 a_2$.
Continue until the process terminates: we have our factorization

$a = p_1 p_2 \cdots p_n$

If the process never terminates, then we have an infinite sequence $(a, a_1, a_2, a_3, \ldots)$ of decreasing positive integers; a contradiction.
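The proof above can be run as a program once we fix a way to find an irreducible factor. The sketch below uses trial division, which is a choice made here rather than anything specified in the notes: the smallest divisor p ≥ 2 of a is necessarily irreducible, so repeatedly splitting it off follows the steps of the proof. The function name `factorize` is hypothetical and the input is assumed to satisfy a ≥ 2.

```cpp
#include <vector>

// Factor a >= 2 into irreducibles, listed in non-decreasing order.
std::vector<long long> factorize(long long a) {
    std::vector<long long> factors;
    for (long long p = 2; p * p <= a; ++p) {
        while (a % p == 0) {        // p is the smallest divisor >= 2, hence irreducible
            factors.push_back(p);
            a /= p;                 // continue with the cofactor, as in the proof
        }
    }
    if (a > 1) factors.push_back(a);   // whatever remains has no divisor <= sqrt(a)
    return factors;
}
```

For example, `factorize(156)` yields the list 2, 2, 3, 13. Trial division is only practical for small inputs; it is meant to illustrate the existence argument, not to be efficient.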

Primes and Irreducibles

Lemma 7.5. Every prime is irreducible.

Proof. Suppose that $p = k_1 k_2$ is prime where $k_1, k_2$ are positive. Then $p \mid k_1$ or $p \mid k_2$; without loss of generality suppose the former. Then $k_1 = p\alpha$ for some $\alpha \in \mathbb{Z}$. But then

$p = p\alpha k_2 \implies \alpha k_2 = 1$

Since we are working in the integers and $k_2 > 0$, it follows that $k_2 = 1$ and $k_1 = p$.

Lemma 7.6. Every irreducible is prime.

Proof. This is a consequence of Bezout’s Identity and the Euclidean Algorithm. Suppose that z is irreducible and that z | ab for some integers a, b. Let d = gcd(a, z). Since z is irreducible, there are only two possibilities:

• d = 1: in this case gcd(a, z) = 1 and z | ab implies (Corollary 5.5) that z | b.

• d = z: in this case z | a.

From now on we can simply refer to irreducibles as primes.

Theorem 7.7 (Fundamental Theorem of Arithmetic, part 2 (uniqueness)). Every integer a ≥ 2 may be uniquely factorized

$a = p_1^{\mu_1} \cdots p_n^{\mu_n}$

where $p_1 < \cdots < p_n$ are a list of primes and each $\mu_i \in \mathbb{N}$.

Proof. Theorem 7.4 says that we can factor a into irreducibles. Now suppose that we have two distinct such factorizations of a¹²

$p_1^{\mu_1} \cdots p_n^{\mu_n} = p_1^{\nu_1} \cdots p_n^{\nu_n}$

Since the factorizations are distinct, we may suppose WLOG that $\mu_1 > \nu_1$. But then

$p_1^{\mu_1 - \nu_1} p_2^{\mu_2} \cdots p_n^{\mu_n} = p_2^{\nu_2} \cdots p_n^{\nu_n}$

Clearly $p_1 \mid$ LHS whence $p_1 \mid$ RHS. Since $p_1$ is prime (this is where we use Lemma 7.6) we see that $p_1$ divides at least one of $p_2, \ldots, p_n$. This is a contradiction.

The result is often stated as follows:

Theorem 7.8 (Unique Prime Factorization/Fundamental Theorem of Arithmetic). Every integer a is either 0, a unit, or may be written uniquely as

\[ a = u p_1^{\mu_1} \cdots p_n^{\mu_n} \]

where $p_1 < \cdots < p_n$ is a list of primes, u is a unit, and each $\mu_i \in \mathbb{N}$.

Now that we have unique factorization, all manner of ‘obvious’ things are seen to be true. For instance, suppose that

\[ a = p_1^{\mu_1} \cdots p_n^{\mu_n} \quad \text{and} \quad b = p_1^{\nu_1} \cdots p_n^{\nu_n} \]

are written in terms of their factorizations, where some of the exponents may need to be zero in order to have the same list of primes. The following should be immediate:

1. b | a ⇐⇒ νi ≤ µi for all i. In essence, ‘all the primes’ in b must also be in a.

2. $\gcd(a, b) = p_1^{\min(\mu_1, \nu_1)} \cdots p_n^{\min(\mu_n, \nu_n)}$.

3. $a$ is a perfect square if and only if every $\mu_i$ is even (if $a = b^2$ then $\mu_i = 2\nu_i$).

Indeed, look back to our discussion of Pythagorean triples where we used the facts that

4. $d^2 \mid m^2 \implies d \mid m$.

5. If ab is a perfect square and gcd(a, b) = 1 then both a and b are perfect squares.

These facts are also very easy to prove using unique factorization.

12 Note that some of the exponents $\mu_i, \nu_i$ could be zero if the supposed lists of primes were different.


Least Common Multiple

Definition 7.9. The least common multiple lcm(a, b) of two positive integers a, b is the smallest positive integer divisible by both a and b.

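In terms of the unique prime decompositions of a and b we clearly have

\[ \left. \begin{aligned} a &= p_1^{\mu_1} \cdots p_n^{\mu_n} \\ b &= p_1^{\lambda_1} \cdots p_n^{\lambda_n} \end{aligned} \right\} \implies \operatorname{lcm}(a, b) = p_1^{\max(\mu_1, \lambda_1)} \cdots p_n^{\max(\mu_n, \lambda_n)}. \]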

As ever, we allow some of the $\mu_i, \lambda_i$ to be zero so as to simultaneously list all primes appearing in both decompositions.

Recalling observation 2 above, we see that

lcm(a, b) · gcd(a, b) = ab.

This follows since $\max(\mu_i, \lambda_i) + \min(\mu_i, \lambda_i) = \mu_i + \lambda_i$ for each $i$.

Warning: this formula does not hold for gcd’s or lcm’s of three or more integers.

Example Find lcm(110, 154).

We can either do this by brute force, listing the multiples of each number and looking for the smallest in the list, or we may proceed by calculating the GCD instead. By the Euclidean Algorithm we have

\[ \begin{aligned} 154 &= 110 \cdot 1 + 44 \\ 110 &= 44 \cdot 2 + 22 \\ 44 &= 22 \cdot 2 \end{aligned} \implies \gcd(110, 154) = 22 \]

Using the above formula we see that

\[ \operatorname{lcm}(110, 154) = \frac{110 \cdot 154}{22} = 5 \cdot 154 = 770 \]
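The same computation can be sketched in Python. The helper names `euclid_gcd` and `lcm` are illustrative choices, not notation from these notes; the last few lines check the warning about three integers on one arbitrarily chosen triple.

```python
from functools import reduce


def euclid_gcd(a, b):
    """Greatest common divisor of two positive integers via the Euclidean Algorithm."""
    while b:
        a, b = b, a % b
    return a


def lcm(a, b):
    """Least common multiple, using lcm(a, b) * gcd(a, b) = a * b."""
    return a * b // euclid_gcd(a, b)


print(euclid_gcd(110, 154))  # 22
print(lcm(110, 154))         # 770

# The product formula fails for three integers, for example 4, 6, 10:
triple = [4, 6, 10]
g3 = reduce(euclid_gcd, triple)  # 2
l3 = reduce(lcm, triple)         # 60
print(g3 * l3, 4 * 6 * 10)       # 120 240
```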

8 Congruences and Zn

A great many problems in number theory rely only on remainders when dividing by an integer. Recall the Division Algorithm: given a ∈ Z and n ∈ N there exists a unique quotient q and remainder r (both integers) such that

a = qn + r, 0 ≤ r < n (∗)

Motivated by this, we make a definition:

Definition 8.1. For each n ∈ N, the set of residues modulo n is $\mathbb{Z}_n = \{0, 1, \ldots, n-1\}$.

(∗) says that every integer a ∈ Z has a unique residue $r \in \mathbb{Z}_n$.

Integers a, b ∈ Z are said to be congruent modulo n if they have the same residue modulo n. We write this as

a ≡ b mod n


Example We may write 7 ≡ −3 mod 5, since applying the Division Algorithm yields

7 = 5 × 1 + 2 and −3 = 5 × (−1) + 2

Indeed both 7 and −3 have residue 2 modulo 5.

As a further example of using just this definition, we prove a simple result.

Proposition 8.2. All perfect squares of integers have remainders 0 or 1 upon dividing by 3.

Proof. By the definition, every integer x ∈ Z has remainder 0, 1 or 2 upon division by 3. We therefore have three mutually exclusive cases to check:

(Remainder zero) We may write x = 3y for some integer y. But then

\[ x^2 = 9y^2 = 3(3y^2) \equiv 0 \bmod 3 \]

(Remainder one) We may write x = 3y + 1 for some integer y. Then

\[ x^2 = 9y^2 + 6y + 1 = 3(3y^2 + 2y) + 1 \equiv 1 \bmod 3 \]

(Remainder two) We may write x = 3y + 2 for some integer y. Then

\[ x^2 = 9y^2 + 12y + 4 = 3(3y^2 + 4y + 1) + 1 \equiv 1 \bmod 3 \]

A perfect square can therefore never have remainder 2.

This is tedious notation, and we will shortly make it less unwieldy. To start this process we observe that there is an easier way to check whether two integers are congruent modulo n.

Theorem 8.3. a ≡ b mod n ⇐⇒ n | (a− b)

Proof. Suppose that $a = nq_1 + r_1$ and $b = nq_2 + r_2$ are the results of applying the Division Algorithm to a, b modulo n. We prove each direction separately:

(⇒) First note that

\[ a \equiv b \bmod n \implies r_1 = r_2 \implies a - nq_1 = b - nq_2 \implies a - b = n(q_1 - q_2) \]

Since $q_1 - q_2$ is an integer, this forces $a - b$ to be a multiple of $n$.

(⇐) Conversely, suppose that $a - b = kn$ is a multiple of $n$. Then

\[ r_1 - r_2 = (a - nq_1) - (b - nq_2) = (a - b) + n(q_2 - q_1) = n(k + q_2 - q_1) \]

This says that $r_1 - r_2$ is an integer multiple of $n$. Recalling the proof of the Division Algorithm, the fact that $-n < r_1 - r_2 < n$ forces $r_1 - r_2 = 0$, whence $a \equiv b \bmod n$.


For instance, we can now prove that 7 ≡ −3 mod 5 simply by observing that 7 − (−3) = 10 is divisible by 5. The advantage should be clear: Theorem 8.3 says that we can compare remainders without computing quotients.

Our next goal is to define an arithmetic with remainders: that is, we want to be able to add and multiply remainders without calculating quotients. For instance, it certainly seems reasonable that if x and y have remainders 3 and 5 (respectively) modulo 7, then

x ≡ 3, y ≡ 5 =⇒ xy ≡ 3 · 5 ≡ 15 ≡ 1 mod 7

so that the product has remainder 1. At present, we have to justify this in laborious fashion:

If x ≡ 3 and y ≡ 5 modulo 7, then there exist integers k, l such that x = 7k + 3 and y = 7l + 5; but then

xy = 7(7kl + 5k + 3l) + 15 = 7(7kl + 5k + 3l + 2) + 1 =⇒ xy ≡ 1 mod 7

We now establish this in general.

Theorem 8.4. Suppose that x ≡ a, y ≡ b modulo n. Then

1. x± y ≡ a± b mod n

2. xy ≡ ab mod n

3. For any m ∈ N, $x^m \equiv a^m \bmod n$

Proof. We just prove the second: the first is similar, and the third is by induction using the second as the induction step.

By Theorem 8.3, there exist integers k, l such that x = kn + a and y = ln + b. But then

xy = (kn + a)(ln + b) = n(kln + al + bk) + ab =⇒ xy ≡ ab mod n

According to the theorem, we can now easily compute remainders of complex arithmetic objects; for instance:

1. What is the remainder when $17^{113}$ is divided by 3?

Don’t bother asking your calculator: $17^{113}$ is 140 digits long so a calculator won’t help! Instead we use modular arithmetic:

\[ \begin{aligned} 17 \equiv -1 \bmod 3 \implies 17^{113} &\equiv (-1)^{113} && \text{(Theorem 8.4, part 3)} \\ &\equiv -1 \bmod 3 && \text{(since 113 is odd)} \end{aligned} \]

Since −1 ≡ 2, we conclude that $17^{113}$ has remainder 2 when divided by 3.

2. Similarly, calculating remainders modulo 10 gives

\[ 219^{45} - 43^{12} \equiv (-1)^{45} - 3^{12} \equiv -1 - 9^6 \equiv -1 - (-1)^6 \equiv -1 - 1 \equiv -2 \equiv 8 \bmod 10 \]


3. In this lengthy example, we first search for a power of 4 which is small modulo n = 67: the obvious choice is $4^3 = 64$.

\[ 4^{49} \equiv 4 \cdot (4^3)^{16} \equiv 4 \cdot (-3)^{16} \equiv 4 \cdot 3^{16} \bmod 67 \]

Next we search for a power of 3 which is small: since $3^4 = 81 \equiv 14 \bmod 67$ we obtain

\[ 4^{49} \equiv 4 \cdot (3^4)^4 \equiv 4 \cdot 14^4 \bmod 67 \]

Now observe that $14^2 = 196 \equiv -5 \bmod 67$ and we are almost finished:

\[ 4^{49} \equiv 4 \cdot (-5)^2 \equiv 4 \cdot 25 \equiv 100 \equiv 33 \bmod 67 \]

Don’t try this without a calculator!
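These three calculations can be checked mechanically. The sketch below is one possible approach, not the method used in the notes: a hypothetical helper `power_mod` applies Theorem 8.4 by reducing modulo n after every multiplication (repeated squaring), so that no large intermediate number ever appears.

```python
def power_mod(base, exponent, modulus):
    """Compute base**exponent mod modulus, reducing after every multiplication."""
    result = 1 % modulus
    base %= modulus
    while exponent > 0:
        if exponent & 1:                  # this binary digit of the exponent is 1
            result = (result * base) % modulus
        base = (base * base) % modulus    # square, then reduce (Theorem 8.4, part 2)
        exponent >>= 1
    return result


print(power_mod(17, 113, 3))                                # 2
print((power_mod(219, 45, 10) - power_mod(43, 12, 10)) % 10)  # 8
print(power_mod(4, 49, 67))                                 # 33
```

Python's built-in three-argument `pow(base, exponent, modulus)` returns the same values.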

Now that we have some better notation, here is a much faster proof of Proposition 8.2.

Proof. Modulo 3 we have:

\[ 0^2 \equiv 0, \quad 1^2 \equiv 1, \quad 2^2 \equiv 4 \equiv 1 \]

Hence squares can only have remainders 0 or 1 modulo 3.

As an application, we can easily show that in a primitive Pythagorean triple (a, b, c) exactly one of a or b is a multiple of three: just think about the remainders modulo 3:

\[ a^2 + b^2 \equiv c^2 \quad \text{where each square is congruent to 0 or 1.} \]

The only possibilities are 0 + 0 ≡ 0, 0 + 1 ≡ 1 and 1 + 0 ≡ 1; however, the first of these says that all three of a, b, c are divisible by three, whence the triple is non-primitive.

Aside: What is Zn? (for those with some abstract algebra experience)

Our notation $\mathbb{Z}_n = \{0, 1, \ldots, n-1\}$ for the set of residues is strictly incorrect, for it makes it appear that the elements of $\mathbb{Z}_n$ are integers. In fact the symbol $\mathbb{Z}_n$ is used to denote something subtly different. Here is some of the detail:

Strictly speaking, ‘congruence modulo n’ is an equivalence relation on the ring $(\mathbb{Z}, +, \cdot)$ of integers. Write $\mathbb{Z}/n\mathbb{Z} = \{[0], [1], \ldots, [n-1]\}$ for the set of equivalence classes: that is

\[ [x] = [a] \iff x \equiv a \bmod n \]

In this language the subring $n\mathbb{Z} = [0]$ of multiples of n is an ideal in $\mathbb{Z}$. It follows that the set of equivalence classes $\mathbb{Z}/n\mathbb{Z}$ inherits a ring structure from $\mathbb{Z}$ where addition $+_n$ and multiplication $\cdot_n$ are defined by

\[ [x] +_n [y] := [x + y], \qquad [x] \cdot_n [y] := [xy] \]

We call the triple $(\mathbb{Z}/n\mathbb{Z}, +_n, \cdot_n)$ a quotient or factor ring. Since the notation is very ugly, it is customary to omit the square brackets and subscripts and to denote the new ring by $\mathbb{Z}_n$. Thus $\mathbb{Z}_n$ is the quotient ring of residues modulo n. It is perfectly legitimate to denote the elements of this ring by

\[ \mathbb{Z}_n = \{0, 1, \ldots, n-1\} \]


as long as one understands that each element is an equivalence class and may be represented by any other element in the class. Thus it is perfectly acceptable to write −1 = 4 in the ring $\mathbb{Z}_5$.

This discussion means that we now have three competing notations: for example, if n = 6:

Congruence notation: 4 + 5 ≡ 3 mod 6

Factor ring notation: $[4] +_6 [5] = [3]$

Zn notation: 4 + 5 = 3

This last is clearly the most succinct, but it is very easy to be confused: 4, 5 and 3 are not integers in this context, they are elements of a new algebraic structure, namely the ring $\mathbb{Z}_6$. Unless you make it absolutely clear in which ring $\mathbb{Z}_n$ you are working, you should avoid this notation.

Congruence and Division

We are able to add, subtract, multiply and take positive integer powers of remainders without issue. Division is another matter entirely. For example, since 8 ≡ 20 mod 6, we know that

4 × 2 ≡ 4 × 5 mod 6 (∗)

We’d like to be able to divide by four, however 2 ≢ 5 mod 6. What can we try instead? To motivate the next result, we follow the definition (see footnote 13):

4× 2 ≡ 4× 5 mod 6 =⇒ 4× 2 = 4× 5 + 6m for some m ∈ Z

Dividing this by 2 we see that

2× 2 = 2× 5 + 3m =⇒ 2 | m =⇒ m = 2l for some l ∈ Z

But then we may divide by 2 again to correctly conclude

2 = 5 + 3l =⇒ 2 ≡ 5 mod 3

It appears that we were able to divide (∗) by four, but at the cost of dividing the modulus by 2: it just so happens that 2 = gcd(4, 6).

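Theorem 8.5. Suppose that $k \neq 0$. If $\gcd(k, n) = d$ then

\[ ka \equiv kb \bmod n \implies a \equiv b \bmod \frac{n}{d} \]

Proof. $\gcd(k, n) = d \iff \gcd\left(\frac{k}{d}, \frac{n}{d}\right) = 1$. Therefore

\[ ka \equiv kb \bmod n \implies n \mid k(a - b) \implies \frac{n}{d} \,\Big|\, \frac{k}{d}(a - b) \]

Since $\frac{n}{d}$ and $\frac{k}{d}$ are coprime integers, an appeal to Corollary 5.5 tells us that $\frac{n}{d} \mid a - b$. Otherwise said

\[ a \equiv b \bmod \frac{n}{d}. \]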

13 It is obvious that m = −2 but leaving this unsaid makes it easier to see a proof of the following theorem.


Examples

1. We divide by 4 in the congruence 12 ≡ 28 mod 8. Since gcd(4, 8) = 4 we also divide the modulus by 4 to obtain

12 ≡ 28 mod 8 =⇒ 3 ≡ 7 mod 2

2. We divide by 12 in the congruence 12 ≡ 72 mod 30. Since gcd(12, 30) = 6, we conclude that

12 ≡ 72 mod 30 =⇒ 1 ≡ 6 mod 5
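As a quick sanity check of Theorem 8.5, here is a small Python sketch. The function name `cancel_factor` is hypothetical; it simply returns the reduced modulus n/d promised by the theorem after confirming that the hypothesis ka ≡ kb mod n holds.

```python
from math import gcd


def cancel_factor(k, a, b, n):
    """Given k*a ≡ k*b (mod n) with k != 0, return the modulus n // gcd(k, n)
    for which Theorem 8.5 guarantees a ≡ b."""
    if k == 0:
        raise ValueError("k must be non-zero")
    if (k * a - k * b) % n != 0:
        raise ValueError("hypothesis k*a ≡ k*b (mod n) does not hold")
    return n // gcd(k, n)


# The motivating example and the two examples above:
for k, a, b, n in [(4, 2, 5, 6), (4, 3, 7, 8), (12, 1, 6, 30)]:
    reduced = cancel_factor(k, a, b, n)
    print(f"{a} ≡ {b} mod {reduced}:", (a - b) % reduced == 0)
```

Each line printed should end in `True`, with reduced moduli 3, 2 and 5 respectively.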

Aside: Rings and Fields

While considering division, it is worth revisiting Corollary 5.5 and Bezout’s identity. We know that

gcd(a, p) = 1 =⇒ ∃x, y ∈ Z such that ax + py = 1

Looking at this modulo p, we obtain

ax ≡ 1 mod p

Otherwise said, if $a \in \mathbb{Z}_p$ is relatively prime to p then a has a multiplicative inverse x. If p is prime then every non-zero element in the ring $\mathbb{Z}_p$ has a multiplicative inverse. This is precisely what it means for a ring to be a field.

For example: in $\mathbb{Z}_5$ we have

\[ 1 = 1 \cdot 1 = 2 \cdot 3 = 4 \cdot 4 \implies 1^{-1} = 1, \quad 2^{-1} = 3, \quad 3^{-1} = 2, \quad 4^{-1} = 4 \]

In $\mathbb{Z}_6$ however, we see that the remainder 2 has no multiplicative inverse:

x     0  1  2  3  4  5
2x    0  2  4  0  2  4

I.e. there is no x such that 2x ≡ 1 mod 6. In general this approach gives us a converse for composite numbers.

Suppose that n = ab is composite, where 2 ≤ a, b < n: if $a \in \mathbb{Z}_n$ had a multiplicative inverse c then we would have

ac ≡ 1 mod n =⇒ abc ≡ b =⇒ b ≡ 0 mod n

But this says that b is divisible by n: a contradiction. We conclude:

Theorem. Zn is a field if and only if n is prime.
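For small n the theorem can be checked by exhaustive search. The following Python sketch is only an illustration: the names `inverses` and `is_field` are hypothetical, and the search is deliberately naive rather than based on Bezout's identity.

```python
def inverses(n):
    """Map each invertible residue a in Z_n to its multiplicative inverse (brute force)."""
    table = {}
    for a in range(1, n):
        for x in range(1, n):
            if (a * x) % n == 1:
                table[a] = x
                break
    return table


def is_field(n):
    """Z_n is a field exactly when every non-zero residue has an inverse."""
    return n > 1 and len(inverses(n)) == n - 1


print(inverses(5))  # {1: 1, 2: 3, 3: 2, 4: 4}
print(inverses(6))  # {1: 1, 5: 5}
print([n for n in range(2, 13) if is_field(n)])  # [2, 3, 5, 7, 11]
```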

Tying this to Theorem 8.5, we see that in a field $\mathbb{Z}_p$ we can divide by any non-zero remainder while remaining in the same set of remainders.


Congruence Equations

We can rephrase our discussion of Linear Diophantine Equations in the language of congruences:

ax ≡ c mod m has a solution x ⇐⇒ ∃y s.t. ax − c = my ⇐⇒ ax − my = c has a solution.

But this holds iff gcd(a, m) | c. Indeed:

Theorem 8.6. Let d = gcd(a, m). The equation ax ≡ c mod m has a solution iff d | c. If $x_0$ is such a solution, then all solutions are

\[ x = x_0 + k\frac{m}{d} : k \in \mathbb{Z}. \]

Indeed, modulo m, there are exactly d solutions $x_0,\ x_0 + \frac{m}{d},\ x_0 + \frac{2m}{d},\ \ldots,\ x_0 + \frac{(d-1)m}{d}$.

Example 1288x ≡ 21 mod 1575 has a solution since d = gcd(1575, 1288) = 7 and 7 | 21. Indeed Bezout’s identity says

\[ 7 = 1575 \cdot 9 - 1288 \cdot 11 \implies 7 \equiv 1288 \cdot (-11) \bmod 1575 \]

Multiplying by 3 gives $21 \equiv 1288 \cdot (-33) \bmod 1575$, so $x = -33 \equiv 1542$ is a solution. Since $\frac{m}{d} = \frac{1575}{7} = 225$ in this case, we see that all solutions are then

\[ \{x \equiv -33 + 225k : k = 0, \ldots, 6\} = \{192, 417, 642, 867, 1092, 1317, 1542\}. \]
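Theorem 8.6 translates directly into a short solver. This is a sketch under assumptions: the function names `extended_gcd` and `solve_linear_congruence` are hypothetical, a and m are taken to be positive integers, and the Bezout coefficients are obtained from the extended Euclidean Algorithm.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for positive a, b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def solve_linear_congruence(a, c, m):
    """All solutions x in {0, ..., m-1} of a*x ≡ c (mod m), in increasing order."""
    d, x, _ = extended_gcd(a, m)
    if c % d != 0:
        return []                      # no solution unless d | c
    step = m // d
    x0 = (x * (c // d)) % step         # smallest non-negative solution
    return [x0 + k * step for k in range(d)]


print(solve_linear_congruence(1288, 21, 1575))
# [192, 417, 642, 867, 1092, 1317, 1542]
print(solve_linear_congruence(4, 1, 6))  # [] since gcd(4, 6) = 2 does not divide 1
```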

Polynomial Congruence Equations

Consider the quadratic equation $x^2 + 3x \equiv 0 \bmod 10$. One can easily check by plugging in the remainders 0, . . . , 9 that the solutions to this equation are

x ≡ 0, 2, 5, 7 mod 10

This is perhaps surprising. We are used to quadratic equations having at most two solutions.

Now consider the same equation modulo the two prime divisors of 10, namely 2 and 5. Indeed it should be clear that

\[ x^2 + 3x \equiv 0 \bmod 10 \iff \begin{cases} x^2 + 3x \equiv 0 \bmod 2, \text{ and} \\ x^2 + 3x \equiv 0 \bmod 5. \end{cases} \]

Again we can check by substituting values for x, that

\[ x^2 + 3x \equiv 0 \bmod 2 \iff x \equiv 0, 1 \bmod 2, \]

\[ x^2 + 3x \equiv 0 \bmod 5 \iff x \equiv 0, 2 \bmod 5. \]

Sanity is restored! Indeed, we can even factorize like we are used to:

\[ x^2 + 3x \equiv x^2 - x \equiv x(x - 1) \bmod 2, \]

\[ x^2 + 3x \equiv x^2 - 2x \equiv x(x - 2) \bmod 5. \]

Modulo 10, we have two distinct factorizations:

\[ x^2 + 3x \equiv x(x - 7) \equiv (x - 2)(x - 5) \bmod 10. \]

For general polynomial congruences, the same sort of thing is true, but only when the modulus is prime.
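The "plug in every remainder" check used above is easy to automate. The Python sketch below is illustrative only: `roots_mod` is a hypothetical helper that takes the coefficients from the highest power down to the constant term and evaluates the polynomial with Horner's rule, reducing modulo n at each step.

```python
def roots_mod(coeffs, n):
    """Roots in {0, ..., n-1} of the polynomial with the given coefficients
    (highest degree first), found by trying every remainder modulo n."""
    def evaluate(x):
        value = 0
        for c in coeffs:
            value = (value * x + c) % n   # Horner's rule, reduced modulo n
        return value

    return [x for x in range(n) if evaluate(x) == 0]


quadratic = [1, 3, 0]             # x^2 + 3x
print(roots_mod(quadratic, 10))   # [0, 2, 5, 7]
print(roots_mod(quadratic, 2))    # [0, 1]
print(roots_mod(quadratic, 5))    # [0, 2]
```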


Theorem 8.7 (Lagrange). Let p be prime and f(x) a degree n polynomial with integer coefficients. Then the congruence f(x) ≡ 0 mod p has at most n distinct roots modulo p.

Of course Lagrange’s Theorem is useless for polynomial congruences such as $x^{39} + 25x^2 + 1 \equiv 0 \bmod 17$. There are only 17 distinct values of x to try, and so the congruence can have at most 17 solutions, not the 39 allowed by Lagrange’s Theorem.

Aside: a (sketch) proof of Lagrange’s Theorem

Since we’ve not done all the preliminaries for a proof of Lagrange’s Theorem, we provide only a sketch. One needs a little more algebra than we have, in particular the division algorithm in the ring $\mathbb{Z}[x]$ of polynomials with integer coefficients. Suppose that $f(c_1) \equiv 0 \bmod p$. Then there exist polynomials q(x), r(x) satisfying

\[ \begin{cases} f(x) = (x - c_1) q(x) + r(x) \\ 0 \le \deg(r) < \deg(x - c_1) = 1 \end{cases} \]

Since the degree of the remainder r(x) must be zero, we see that it is constant. Moreover $f(c_1) \equiv 0 \implies r \equiv 0 \bmod p$. We conclude that $(x - c_1)$ is a factor of f(x) modulo p.

Proof. Suppose that we find n roots $c_1, \ldots, c_n$ of the congruence. According to the division algorithm above, we may therefore totally factorize f(x) modulo p as

\[ f(x) \equiv a(x - c_1) \cdots (x - c_n) \bmod p. \]

Since the degree of both sides is n we cannot divide out by any further linear factors. Now suppose that $\xi \not\equiv c_1, \ldots, c_n \bmod p$. Then $\xi - c_i \not\equiv 0 \bmod p$ for all i. Since products of non-zero elements in a field $\mathbb{Z}_p$ are non-zero we must have $f(\xi) \not\equiv 0 \bmod p$. There are thus at most n roots of the polynomial congruence.

In fact, the ring of polynomials $\mathbb{Z}_p[x]$ with coefficients in the field $\mathbb{Z}_p$ has a Euclidean Algorithm, and therefore a unique factorization theorem. This means that there is only one way to factorize any polynomial modulo p, but this takes us beyond the scope of the course. The practical upshot is that you can hunt for roots of f(x) ≡ 0 modulo p by extracting a linear factor $f(x) \equiv (x - c_1) q(x)$, then searching for roots of q(x) ≡ 0, exactly as you would for polynomials with real coefficients.

Lagrange’s Theorem is completely useless in the situation when n ≥ p as there can only be at most p solutions to any equation modulo p.

Examples

1. Factorize $f(x) = x^3 + 2x^2 + 4x + 3$ over $\mathbb{Z}_5$. By inspection we see that x ≡ ±1, −2 are solutions. By Lagrange’s Theorem these are the only solutions and we can factorize

\[ f(x) \equiv (x - 1)(x + 1)(x + 2) \bmod 5. \]

We know that the factorization is unique and there are no other solutions, but it is worth seeing it played out in stages.

\[ \begin{aligned} f(x) \equiv x^3 + 2x^2 + 4x + 3 &\equiv (x - 1)(x^2 + 3x + 7) && \text{(spot } x \equiv 1 \text{ and factorize)} \\ &\equiv (x - 1)(x^2 + 3x + 2) && \text{(simplify)} \\ &\equiv (x - 1)(x + 1)(x + 2) && \text{(spot } x \equiv -1 \text{ and factorize)} \end{aligned} \]

2. Note that Lagrange only says that there are at most n solutions modulo p. Consider the polynomial $f(x) = x^2 + x + 1$ mod 2. It is easy to check that this has no solutions (see footnote 14).

3. Here is another example of a quadratic with four roots: modulo 6 we have

\[ f(x) \equiv x^2 - 5x \equiv x(x - 5) \equiv (x - 2)(x - 3). \]

Comparing with example 1, note that we can’t simply factor out (x − 0) from $x^2 - 5x$ and expect the quotient to account for all remaining roots, because the factorization need not be unique. This is because 6 is not prime.

4. We find all solutions to $x^2 + 14x - 3 \equiv 0 \bmod 18$. While you may feel it is fastest to try all remainders 0, 1, . . . , 17 with your calculator, we give a more systematic approach.

x is a solution if and only if both

\[ \begin{cases} x^2 + 14x - 3 \equiv x^2 - 1 \equiv 0 \bmod 2 \iff x \text{ odd, and} \\ x^2 + 14x - 3 \equiv x^2 + 5x - 3 \equiv 0 \bmod 9. \end{cases} \]

The second condition implies that $x^2 + 2x \equiv 0 \bmod 3$ which, by factoring, yields x ≡ 0, 1 mod 3. We therefore try x ≡ 0, 1, 3, 4, 6, 7 mod 9 and observe that only x ≡ 6, 7 mod 9 work. We therefore have to solve two different sets of equations:

\[ \begin{cases} x \equiv 1 \bmod 2, \\ x \equiv 6 \bmod 9, \end{cases} \quad \text{or} \quad \begin{cases} x \equiv 1 \bmod 2, \\ x \equiv 7 \bmod 9. \end{cases} \]

We have two sets of simultaneous equations. In general, the Chinese Remainder Theorem (later) can deal with these, but these are so simple that there’s no need. For instance

x ≡ 6 mod 9 =⇒ x ≡ 6, 15 mod 18

If x must also be odd (and 18 is even), only x ≡ 15 mod 18 will do. Similarly, the second simultaneous congruence has solution x ≡ 7 mod 18.

5. Find all solutions to $x^3 - 2x + 1 \equiv 0 \bmod 12$.

We easily spot that x ≡ 1 mod 12 is a solution. Are there others? Considering the primes dividing 12 we see that any solution must satisfy

\[ x^3 - 2x + 1 \equiv (x - 1)(x^2 + x - 1) \equiv 0 \bmod 2 \text{ and } \bmod 3. \]

It is clear by inspection that the only solutions modulo 2 and 3 are x ≡ 1. It follows that any solution must satisfy x ≡ 1 mod 6. Stepping this up to modulo 12, we should try x ≡ 1 and x ≡ 7 mod 12. The first is certainly a solution. As for the latter,

\[ 7^3 - 2 \cdot 7 + 1 \equiv 7 \cdot 49 - 14 + 1 \equiv 7 - 2 + 1 \equiv 6 \bmod 12 \]

It follows that the only solution is x ≡ 1 mod 12.

14 In the language of Section 7, f is an irreducible polynomial in the ring $\mathbb{Z}_2[x]$.
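The splitting strategy of Examples 4 and 5 can be mimicked in a few lines of Python. This is a rough sketch rather than the Chinese Remainder Theorem proper: the hypothetical helper `combine` merely scans the remainders modulo the product of two coprime moduli and keeps those with permitted residues modulo each factor.

```python
def combine(residues_m, m, residues_n, n):
    """Remainders x modulo m*n (m, n coprime) with x mod m in residues_m
    and x mod n in residues_n, found by a direct scan."""
    return [x for x in range(m * n)
            if x % m in residues_m and x % n in residues_n]


# Example 4: x odd and x ≡ 6 or 7 mod 9.
candidates = combine([1], 2, [6, 7], 9)
print(candidates)  # [7, 15]
print(all((x * x + 14 * x - 3) % 18 == 0 for x in candidates))  # True

# Example 5, checked by brute force over all remainders modulo 12.
print([x for x in range(12) if (x**3 - 2 * x + 1) % 12 == 0])   # [1]
```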


9 Congruences, Powers and Fermat’s Little Theorem

Fermat’s Little Theorem (see footnote 15) provides a useful trick for simplifying large powers in congruence equations. Perhaps the simplest proof relies on a simple fact about the residues modulo a prime.

Lemma 9.1. Let p be a prime and a be a positive integer less than p. Then the numbers $a, 2a, 3a, 4a, \ldots, (p-1)a$ constitute all the non-zero remainders modulo p.

Otherwise said, modulo p these are the numbers 1, 2, 3, . . . , p − 1 though probably in a different order.

Example If you’re having trouble believing this, try an example. Let p = 5 and we can create a table:

 a   2a   3a   4a
 1    2    3    4
 2    4    1    3
 3    1    4    2
 4    3    2    1

Notice that every remainder appears exactly once in each row. If we try to repeat with a non-prime, say 6, we get a different story:

 a   2a   3a   4a   5a
 1    2    3    4    5
 2    4    0    2    4
 3    0    3    0    3
 4    2    0    4    2
 5    4    3    2    1

The only lines in which all the non-zero remainders appear are when a = 1 or a = 5. This will be important in the next section: these are precisely the remainders a for which gcd(a, 6) = 1. With primes, we always have gcd(a, p) = 1, and this forms the heart of the proof.

Proof. If 1 ≤ a ≤ p − 1 and p is prime, then gcd(a, p) = 1. Suppose that two of the remainders xa, ya were equal, where 1 ≤ x, y ≤ p − 1. Appealing to Theorem 8.5, we can divide by a to obtain

xa ≡ ya =⇒ x ≡ y mod p

so that x = y. It follows that the numbers a, 2a, . . . , (p − 1)a are distinct modulo p. Moreover, none are zero, since none are divisible by p.

Corollary 9.2 (Fermat’s Little Theorem). If p is prime and $p \nmid a$ then

\[ a^{p-1} \equiv 1 \bmod p \]

Proof. Multiply the remainders $a, 2a, \ldots, (p-1)a$ together. Since these are just the remainders $1, 2, \ldots, p-1$ in a different order, we obtain

\[ a^{p-1} (p-1)! \equiv (p-1)! \bmod p \]

Since p is prime and gcd((p − 1)!, p) = 1 we must be able to divide by (p − 1)!. The result follows.
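Both Lemma 9.1 and the Corollary are easy to test numerically for small primes. The Python sketch below is purely illustrative; `multiples_mod` is a hypothetical helper that reproduces one row of the tables above.

```python
def multiples_mod(a, n):
    """Residues of a, 2a, ..., (n-1)a modulo n, in that order."""
    return [(k * a) % n for k in range(1, n)]


# Modulo the prime 5, every row is a rearrangement of 1, 2, 3, 4.
for a in range(1, 5):
    assert sorted(multiples_mod(a, 5)) == [1, 2, 3, 4]

# Modulo 6 the row for a = 2 repeats values and hits zero.
print(multiples_mod(2, 6))  # [2, 4, 0, 2, 4]

# Fermat's Little Theorem for every non-zero remainder modulo a few small primes.
for p in (2, 3, 5, 7, 11, 13):
    assert all(pow(a, p - 1, p) == 1 for a in range(1, p))
print("checks passed")
```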

15 To distinguish it from his famous ‘last.’


Examples Here are a few examples of using Fermat’s Little Theorem to simplify calculations. Doing these without the Theorem is very tedious!

1. Since 239 is not divisible by the prime 137, we instantly see that

\[ 239^{136} \equiv 1 \bmod 137 \]

2. Compute the remainder when $66^{98}$ is divided by 97.

Since 97 is prime and 66 is coprime to it, we can apply Fermat’s Little Theorem:

\[ \begin{aligned} 66^{98} \equiv 66^{97-1} \cdot 66^2 &\equiv 66^2 \bmod 97 \\ &\equiv (-31)^2 \equiv 961 \equiv -9 \equiv 88 \bmod 97 \end{aligned} \]

3. This time we employ the Theorem to help solve the high-powered congruence $x^{74} \equiv 12 \bmod 37$. First note that $x \not\equiv 0$. If there is a solution, we see that the theorem applies. But then $x^{37-1} \equiv x^{36} \equiv 1 \bmod 37$. Since 74 = 36 × 2 + 2 we conclude that

\[ 12 \equiv x^{74} \equiv (x^{36})^2 \cdot x^2 \equiv x^2 \bmod 37 \]

We have therefore reduced the congruence to something much more manageable. Finally, we consider numbers congruent to 12 modulo 37: we don’t have far to look before we find a perfect square!

12, 49, . . .

Thus x ≡ 7 is a solution, which says that x ≡ −7 ≡ 30 is another. By Lagrange’s Theorem, there are at most two solutions to this congruence: we conclude

\[ x^{74} \equiv 12 \iff x \equiv 7, 30 \bmod 37 \]
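The trick common to these three examples is to reduce the exponent modulo p − 1 before doing any arithmetic. A minimal Python sketch follows; the function name `power_mod_prime` is hypothetical, and it assumes that p really is prime and does not divide a.

```python
def power_mod_prime(a, e, p):
    """a**e mod p for a prime p not dividing a, reducing the exponent
    modulo p - 1 as Fermat's Little Theorem permits."""
    if a % p == 0:
        raise ValueError("p must not divide a")
    return pow(a, e % (p - 1), p)


print(power_mod_prime(239, 136, 137))  # 1
print(power_mod_prime(66, 98, 97))     # 88

# Example 3: the exponent 74 reduces to 2 modulo 36, leaving x^2 ≡ 12 mod 37.
print([x for x in range(1, 37) if power_mod_prime(x, 74, 37) == 12])  # [7, 30]
```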

Theorem 9.3 (Wilson’s Theorem). If p is prime then (p− 1)! ≡ −1 mod p.

Proof. Consider the polynomial congruence

\[ g(x) \equiv (x^{p-1} - 1) - (x - 1)(x - 2) \cdots (x - (p-1)) \equiv 0 \bmod p \]

We can attack this using two theorems:

• Multiplying out and cancelling the $x^{p-1}$ terms, we see that g has degree at most p − 2. Lagrange’s Theorem says that, unless every coefficient of g is divisible by p, the congruence g(x) ≡ 0 can have at most p − 2 distinct roots.

• Fermat’s Little Theorem says that the congruence has at least p − 1 distinct roots, namely x ≡ 1, 2, . . . , p − 1.

The only way to make sense of this is if g(x) is the zero polynomial modulo p: every one of its coefficients is divisible by p. It follows that

\[ x^{p-1} - 1 \equiv (x - 1)(x - 2) \cdots (x - (p-1)) \bmod p \]

Evaluating at x ≡ 0 yields $-1 \equiv (-1)^{p-1} (p-1)! \bmod p$. If p is odd then $(-1)^{p-1} = 1$, while if p = 2 then −1 ≡ 1; in either case the result follows.


If you’re having trouble understanding the proof, try an example! When p = 3 we have

\[ g(x) \equiv x^2 - 1 - (x - 1)(x - 2) \equiv x^2 - 1 - x^2 + 3x - 2 \equiv 3x - 3 \equiv 0 \bmod 3 \]

The point is that while g(x) might look like a degree ≤ 1 polynomial, it is in fact the zero polynomial.

For those of you with more taste in algebra, the same proof may be viewed as follows: By Fermat’s Little Theorem the equation

\[ x^{p-1} - 1 \equiv 0 \bmod p \]

has p − 1 distinct roots, namely x ≡ 1, . . . , p − 1 mod p. Therefore $x^{p-1} - 1$ may be factorized uniquely in the ring $\mathbb{Z}_p[x]$ as

\[ x^{p-1} - 1 \equiv a(x - 1)(x - 2) \cdots (x - (p-1)) \bmod p \]

for some a. Clearly a ≡ 1 by considering the $x^{p-1}$ terms. Now evaluate at x ≡ 0 as before.
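The polynomial identity at the heart of this proof can be verified for small primes by expanding the product of linear factors with every coefficient reduced modulo p. The Python sketch below is only a numerical check, not part of the proof; the helper name `product_of_linear_factors` is hypothetical.

```python
def product_of_linear_factors(p):
    """Coefficients (highest degree first) of (x-1)(x-2)...(x-(p-1)), reduced modulo p."""
    coeffs = [1]
    for c in range(1, p):
        shifted = coeffs + [0]                    # coeffs multiplied by x
        scaled = [0] + [-c * a for a in coeffs]   # coeffs multiplied by -c
        coeffs = [(s + t) % p for s, t in zip(shifted, scaled)]
    return coeffs


for p in (2, 3, 5, 7, 11, 13):
    expected = [1] + [0] * (p - 2) + [p - 1]   # x^(p-1) - 1, with -1 written as p - 1
    assert product_of_linear_factors(p) == expected

print(product_of_linear_factors(5))  # [1, 0, 0, 0, 4]
```

The constant term of the product is $(-1)^{p-1}(p-1)!$ reduced modulo p, so the final entry p − 1 is exactly the statement of Wilson's Theorem.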

10 Euler’s Formula

In this section we see a generalization of Fermat’s Little Theorem: recall that if p is prime and $a \not\equiv 0 \bmod p$ then $a^{p-1} \equiv 1 \bmod p$. This is a very tight theorem, in that it applies to all non-zero remainders modulo p. What can we say about powers of remainders modulo a composite?

To search for some patterns, consider powers $a^k$ modulo 6, 7 and 8:

modulo 6

  k      1  2  3  4  5
a = 1    1  1  1  1  1
a = 2    2  4  2  4  2
a = 3    3  3  3  3  3
a = 4    4  4  4  4  4
a = 5    5  1  5  1  5

modulo 7

  k      1  2  3  4  5  6
a = 1    1  1  1  1  1  1
a = 2    2  4  1  2  4  1
a = 3    3  2  6  4  5  1
a = 4    4  2  1  4  2  1
a = 5    5  4  6  2  3  1
a = 6    6  1  6  1  6  1

modulo 8

  k      1  2  3  4  5  6  7
a = 1    1  1  1  1  1  1  1
a = 2    2  4  0  0  0  0  0
a = 3    3  1  3  1  3  1  3
a = 4    4  0  0  0  0  0  0
a = 5    5  1  5  1  5  1  5
a = 6    6  4  0  0  0  0  0
a = 7    7  1  7  1  7  1  7

Hopefully there are enough terms in each table for you to spot the patterns for larger values of k. Note first that the final column (k = 6) of the modulo 7 table represents Fermat’s Little Theorem. Unfortunately there don’t seem to be many 1’s in the other tables: for instance, modulo 6 we will never have $2^k$, $3^k$ or $4^k$ congruent to 1. Indeed the tables suggest that remainders of 1 appear precisely in those rows where a is relatively prime to the modulus. This motivates the following:

Lemma 10.1. If $a^k \equiv 1 \bmod n$, then gcd(a, n) = 1.

Proof. If $a^k \equiv 1 \bmod n$, then $a^k = 1 + ln$ for some $l \in \mathbb{Z}$. But then

\[ a \cdot a^{k-1} - n \cdot l = 1 \]

Thus any common divisor of a and n also divides 1.


Now that we’ve ruled out the possibility of $a^k$ being congruent to 1 unless gcd(a, n) = 1, it remains to convince ourselves that, for these a, there is some k such that $a^k \equiv 1$. One way of thinking about this is to recall Fermat’s Little Theorem and consider what p − 1 refers to modulo p: it is precisely the number of non-zero remainders relatively prime to p. Now look at the other tables:

Modulo 6 The remainders a such that gcd(a, 6) = 1 are precisely a ≡ 1, 5. For such a we see that $a^2 \equiv 1 \bmod 6$.

Modulo 8 The remainders a such that gcd(a, 8) = 1 are precisely a ≡ 1, 3, 5, 7. For such a we see that $a^4 \equiv 1 \bmod 8$ (see footnote 16).

Given all this, we make a definition and make a hypothesis:

Definition 10.2. For n ≥ 1 define $\varphi(n) = \left|\{0 < a \le n : \gcd(a, n) = 1\}\right|$: for $n \neq 1$ this is the number of remainders relatively prime to n. The function $\varphi : \mathbb{N} \to \mathbb{N}$ is called Euler’s Totient Function.

Theorem 10.3 (Euler’s Theorem). If gcd(a, n) = 1 then $a^{\varphi(n)} \equiv 1 \bmod n$.

Here are the first few values of Euler’s function, where we list the elements a which are relatively prime to n:

ϕ(1) = 1 = |{1}|                ϕ(7) = 6 = |{1, 2, 3, 4, 5, 6}|
ϕ(2) = 1 = |{1}|                ϕ(8) = 4 = |{1, 3, 5, 7}|
ϕ(3) = 2 = |{1, 2}|             ϕ(9) = 6 = |{1, 2, 4, 5, 7, 8}|
ϕ(4) = 2 = |{1, 3}|             ϕ(10) = 4 = |{1, 3, 7, 9}|
ϕ(5) = 4 = |{1, 2, 3, 4}|       ϕ(11) = 10 = |{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}|
ϕ(6) = 2 = |{1, 5}|             ϕ(12) = 4 = |{1, 5, 7, 11}|

We clearly have ϕ(p) = p − 1 whenever p is prime, from which we see that Fermat’s Little Theorem is simply a special case of Euler’s Theorem. You should mentally check that Euler’s Theorem holds for several of the values listed above. . . Perhaps unsurprisingly, we can prove Euler’s Theorem analogously to how we proved Fermat’s.
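The suggested mental check can also be delegated to a few lines of Python. This sketch computes ϕ(n) straight from Definition 10.2 by counting; the function name `phi` is an illustrative choice, and no attempt is made at efficiency.

```python
from math import gcd


def phi(n):
    """Euler's totient, counted directly from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


print([phi(n) for n in range(1, 13)])
# [1, 1, 2, 2, 4, 2, 6, 4, 6, 4, 10, 4]

# Euler's Theorem: a^phi(n) ≡ 1 mod n whenever gcd(a, n) = 1.
for n in range(1, 50):
    for a in range(1, n + 1):
        if gcd(a, n) == 1:
            assert pow(a, phi(n), n) == 1 % n
print("Euler's Theorem holds for all n < 50")
```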

Proof of Euler’s Theorem. The proof hinges on the fact that $a \in \mathbb{Z}_n$ has a multiplicative inverse iff gcd(a, n) = 1. If n = 1, then the theorem is trivial. Otherwise, consider the set of remainders relatively prime to n

\[ X := \{x \in \mathbb{Z}_n : \gcd(x, n) = 1\} \]

Let a ∈ X. We claim that the function $f_a : X \to X : x \mapsto ax \bmod n$ is a bijection.

Well-defined: we require $f_a(x) = ax \in X$, equivalently gcd(ax, n) = 1. But this is trivial: if a, x share no prime divisors with n, neither can ax.

Injectivity: Since a ∈ X, Bezout’s identity tells us that ∃b ∈ X with ba ≡ 1 mod n. Therefore

\[ f_a(x) \equiv f_a(y) \implies ax \equiv ay \implies bax \equiv bay \implies x \equiv y \bmod n \]

16 In fact $a^2 \equiv 1 \bmod 8$ for such a as can easily be seen from the table. We are only looking for existence of an exponent k, not necessarily the smallest such.


Surjectivity: This either comes for free since X is a finite set and $f_a : X \to X$ is injective, or we simply observe that $x = f_a(bx)$ for all x ∈ X.

To finish the proof, first observe that |X| = ϕ(n). Since $f_a : X \to X$ is bijective, we may list the elements of the set X in two ways:

\[ X = \{x_1, x_2, \ldots, x_{\varphi(n)}\} = \{ax_1, ax_2, \ldots, ax_{\varphi(n)}\} \]

Multiplying the elements of X together we obtain

\[ x_1 x_2 \cdots x_{\varphi(n)} \equiv a x_1 \, a x_2 \cdots a x_{\varphi(n)} \equiv a^{\varphi(n)} x_1 x_2 \cdots x_{\varphi(n)} \bmod n \]

Since the $x_i$ are all relatively prime to n, we may cancel both sides by each of them in turn (Theorem 8.5) thus obtaining the result.

Example It should be clear that gcd(a, 35) = 1 ⇐⇒ gcd(a, 5) = 1 and gcd(a, 7) = 1. We see that the set of remainders modulo 35 which are relatively prime to 35 is

\[ \begin{aligned} X &= \{1, 2, 3, 4, 6, 8, 9, 11, 12, 13, 16, 17, 18, 19, 22, 23, 24, 26, 27, 29, 31, 32, 33, 34\} \\ &= \mathbb{Z}_{35} \setminus \{0, 5, 10, 15, 20, 25, 30, 7, 14, 21, 28\} \end{aligned} \]

We clearly have ϕ(35) = 24 = 35 − 11 (an easier method of calculation shall be seen shortly. . . ). We may now employ this to simplify congruences as we did with Fermat’s Theorem. For instance, suppose you wanted to solve the congruence equation

\[ x^{49} \equiv 12 \bmod 35 \]

First we observe that if gcd(x, 35) = d then we must have d | 12 and so d is a common divisor (see footnote 17) of 12 and 35: manifestly d = 1. It follows that, if a solution exists, it satisfies gcd(x, 35) = 1 and so Euler’s Theorem applies. We now observe that

\[ x^{24} \equiv 1 \implies x^{49} \equiv (x^{24})^2 \cdot x \equiv x \equiv 12 \bmod 35 \]

Riffle-shuffling

As an example of using Euler’s Theorem (in fact Fermat’s Little Theorem in the standard example) consider a standard shuffle of a 52-card deck of playing cards. The process is as follows:

• Label the cards 1, 2, 3, . . . 52 from bottom to top.

• Cut the deck into two stacks of 26 cards.

• Alternate cards from the bottom of each stack: the card in position x moves to position s(x),where

x       1   2   3   · · ·   25   26   27   28   · · ·   50   51   52
s(x)    2   4   6   · · ·   50   52    1    3   · · ·   47   49   51

It is not hard to give a formula for this function:

\[ s : \{1, 2, \ldots, 52\} \to \{1, 2, \ldots, 52\} : x \mapsto 2x \pmod{53} \]


[Figure: the 52-card deck cut into two stacks of 26 cards and interleaved by the shuffle.]

We can now ask some questions:

• If we keep perfectly shuffling the pack, will it eventually end up in the starting arrangement?

• How many shuffles are required to put the pack back as it began?

• Of all the possible arrangements of a deck of 52 cards, how many can be achieved just by shuffling?

Fermat’s Little Theorem makes this easy to answer: shuffling n times produces the function

\[ s^n : x \mapsto 2^n x \pmod{53} \]

Every card ends up in its starting position after n shuffles if and only if $2^n \equiv 1 \bmod 53$. Since 53 is prime, Fermat’s Little Theorem says that $2^{52} \equiv 1 \bmod 53$, whence the deck will sort itself after 52 shuffles. In fact this is the minimum number of shuffles required, though it is a bit tedious to check! Even though there are $52! \approx 10^{68}$ potential arrangements of 52 cards in a deck, perfect shuffling (see footnote 18) of a new pack can only result in a comparatively tiny 52 distinct arrangements.

The problem can be generalized in several ways:

1. Repeat the questions for decks with an even number of cards.

2. What about shuffles of odd-numbered decks or other types of shuffle? This is harder, as defining the shuffle is more complicated!

As another example, if we have a deck of 1000 cards, the shuffle has formula $x \mapsto 2x \pmod{1001}$. By Euler’s Theorem, $2^{\varphi(1001)} \equiv 1 \pmod{1001}$. Since ϕ(1001) = 720, the deck is guaranteed to sort itself after 720 riffle-shuffles. In fact it can be checked that only 60 shuffles are required!
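The claims about 52 and 60 shuffles can be checked by simulating the shuffle directly. The Python sketch below assumes the position rule x ↦ 2x mod (N + 1) described above for a deck of N cards (N even); the function names `shuffle` and `shuffles_to_restore` are hypothetical.

```python
def shuffle(deck):
    """One perfect riffle shuffle: the card in position x (counted from 1 at the
    bottom) moves to position 2x mod (N + 1), where N = len(deck) is even."""
    n = len(deck)
    new_deck = [None] * n
    for x in range(1, n + 1):
        new_deck[(2 * x) % (n + 1) - 1] = deck[x - 1]
    return new_deck


def shuffles_to_restore(n):
    """Smallest number of perfect shuffles returning an n-card deck to its start."""
    start = list(range(1, n + 1))
    deck = shuffle(start)
    count = 1
    while deck != start:
        deck = shuffle(deck)
        count += 1
    return count


print(shuffle(list(range(1, 53)))[:4])  # [27, 1, 28, 2]
print(shuffles_to_restore(52))          # 52
print(shuffles_to_restore(1000))        # 60
```

The two counts are the multiplicative orders of 2 modulo 53 and modulo 1001 respectively.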

17 We have x = λd and 35 = µd for some λ, µ ∈ Z. Thus ∃n ∈ Z such that $\lambda^{49} d^{49} = 12 + 35n = 12 + n\mu d \implies d \mid 12$.

18 Of course shuffling is rarely perfect even when performed by a pro, though combining shuffling methods is necessary to produce a well-mixed pack.


11 Euler’s Function and the Chinese Remainder Theorem

Everything Fermat’s Theorem can do, Euler’s can do better! Given this, it becomes important to be able to compute the value of Euler’s function ϕ(n) for given n. Sadly this is very difficult to do, at least for large n. We build this up in stages using a classic number-theory approach: worry about primes first, then glue everything together.

n = p prime: Clearly ϕ(p) = p− 1.

$n = p^2$ where p is prime: We want to count the remainders in the set $\{1, 2, 3, \ldots, p^2\}$ which are relatively prime to $p^2$. Since the only divisors of $p^2$ are 1, p and $p^2$, we are simply looking to delete the multiples of p. Thus

\[ \varphi(p^2) = \left| \{1, \ldots, p^2\} \setminus \{p, 2p, 3p, \ldots, (p-1)p, p^2\} \right| = p^2 - p \]

$n = p^k$: More generally

\[ \varphi(p^k) = \left| \{1, \ldots, p^k\} \setminus \{ap : 1 \le a \le p^{k-1}\} \right| = p^k - p^{k-1} = p^{k-1}(p - 1) = p^k \left(1 - \frac{1}{p}\right) \]

We now know how to compute ϕ(n) whenever n is a power of a prime; it remains to investigate composites. To get started, look for a pattern in the table of small values in Section 10. We see that

ϕ(6) = ϕ(2) · ϕ(3), ϕ(10) = ϕ(2) · ϕ(5), ϕ(12) = ϕ(3) · ϕ(4)

Moreover, we’ve also calculated ϕ(35) = 24 and we observe that this equals ϕ(5)ϕ(7). We thereforehave a hypothesis: if m, n are relatively prime, then ϕ(mn) = ϕ(m)ϕ(n). If this is true, then we onlyneed to know the primes that divide an integer to be able to compute ϕ. For example

$$\varphi(60) = \varphi(2^2)\,\varphi(3)\,\varphi(5) = 2 \cdot 2 \cdot 4 = 16$$

This property is important:

Definition 11.1. A function f : Z→ Z is multiplicative (also called arithmetic or number-theoretic) if

gcd(m, n) = 1 =⇒ f (mn) = f (m) f (n)

There are many very simple examples of multiplicative functions, for example the constant function f(x) = 1, the identity function f(x) = x, the square function f(x) = x², etc. These examples satisfy the product formula even if m, n are not coprime. The Euler function is a more exotic example, and one which really requires the coprime restriction.

Theorem 11.2. ϕ is multiplicative.

We first consider an example to motivate the proof. We list all the remainders modulo 36 = 9 × 4 in a rectangle:

     0    1*   2    3    4    5*   6    7*   8
     9   10   11*  12   13*  14   15   16   17*
    18   19*  20   21   22   23*  24   25*  26
    27   28   29*  30   31*  32   33   34   35*

The remainders coprime to 36 are marked with an asterisk: there are clearly ϕ(36) = 12 of them. Observe that 12 = ϕ(9)ϕ(4) = 6 × 2. Also observe how the marked remainders are distributed: they lie in six columns containing two each. Now imagine counting the remainders which are coprime to 36: clearly

gcd(x, 36) = 1 ⇐⇒ gcd(x, 9) = 1 = gcd(x, 4)

Now think about rewriting the above table modulo 4 and 9:

Modulo 4:

    0 1 2 3 0 1 2 3 0
    1 2 3 0 1 2 3 0 1
    2 3 0 1 2 3 0 1 2
    3 0 1 2 3 0 1 2 3

Modulo 9:

    0 1 2 3 4 5 6 7 8
    0 1 2 3 4 5 6 7 8
    0 1 2 3 4 5 6 7 8
    0 1 2 3 4 5 6 7 8

We can now make an argument for why ϕ(36) = ϕ(9)× ϕ(4).

1. Since all the elements of a column are congruent modulo 9, all remainders coprime to 36 must lie in one of the ϕ(9) = 6 indicated columns.

2. Since every column contains a complete set of remainders modulo 4, exactly ϕ(4) = 2 elements in each column are coprime to 4.

3. Putting this together, a remainder is coprime to 36 iff it is coprime to both 9 and 4, whence it must be one of the ϕ(4) = 2 entries lying in one of the ϕ(9) = 6 columns of interest: thus ϕ(36) = ϕ(9) × ϕ(4).

The proof of the multiplicativity of Euler’s function is merely an abstraction of the above.

Proof of Theorem 11.2. If either of m, n is equal to 1, then gcd(m, n) = 1 and ϕ(mn) = ϕ(m)ϕ(n) is trivial. We therefore suppose that gcd(m, n) = 1 where m, n > 1 and list all the elements of $\mathbb{Z}_{mn}$ in an n × m table:

$$\begin{array}{ccccc}
0 & 1 & 2 & \cdots & m-1 \\
m & m+1 & m+2 & \cdots & m+(m-1) \\
2m & 2m+1 & 2m+2 & \cdots & 2m+(m-1) \\
\vdots & \vdots & \vdots & & \vdots \\
(n-1)m & (n-1)m+1 & (n-1)m+2 & \cdots & (n-1)m+(m-1)
\end{array} \tag{$*$}$$

Of the entries in the table, ϕ(mn) are relatively prime to mn. We aim to count these in a different way, by first observing that

gcd(x, mn) = 1 ⇐⇒ gcd(x, m) = 1 = gcd(x, n)

Look at the first row of the table. There are ϕ(m) entries in this row which are coprime to m. However the entries of each column are congruent modulo m, hence there exist ϕ(m) columns, all of whose entries are coprime to m. Moreover, no other entries in the table are coprime to m.

Now consider any column. This consists of a set of elements

$$\{j,\ m + j,\ 2m + j,\ \ldots,\ (n-1)m + j\}$$

for some j. Since gcd(m, n) = 1, Theorem 8.5 tells us that no two of these elements are congruent modulo n: indeed

km + j ≡ lm + j mod n =⇒ km ≡ lm mod n =⇒ k ≡ l mod n

It follows that each column consists of a complete set of remainders modulo n. Consequently, ϕ(n) of the elements in each column are relatively prime to n.

Putting this together we see that ϕ(m)ϕ(n) of the elements in the table are relatively prime to both m and n. Otherwise said

gcd(m, n) = 1 =⇒ ϕ(mn) = ϕ(m)ϕ(n)

Computing ϕ(n) Suppose that we have the unique prime decomposition of n. Theorem 11.2 tells us that

$$\varphi(p_1^{\mu_1} \cdots p_l^{\mu_l}) = \varphi(p_1^{\mu_1}) \cdots \varphi(p_l^{\mu_l})$$

Hence ϕ is determined completely by its values on prime powers. However, we already know the value of $\varphi(p^{\mu})$, whence we obtain a theorem:

Theorem 11.3. $\varphi(n) = n \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr)$ for any n.

In words: ϕ(n) is the product of n with all the terms $1 - \frac{1}{p}$ for each prime p dividing n.

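Proof. Suppose that $n = p_1^{\mu_1} \cdots p_l^{\mu_l}$, written in terms of its prime decomposition. Then

$$\varphi(n) = \varphi(p_1^{\mu_1} \cdots p_l^{\mu_l}) = \prod_{p_i \mid n} \varphi(p_i^{\mu_i}) = \prod_{p_i \mid n} p_i^{\mu_i - 1}(p_i - 1) = \prod_{p_i \mid n} p_i^{\mu_i}(1 - p_i^{-1}) = n \prod_{p_i \mid n} \Bigl(1 - \frac{1}{p_i}\Bigr)$$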

Example $\varphi(200) = \varphi(2^3 \cdot 5^2) = 200 \Bigl(1 - \frac{1}{2}\Bigr)\Bigl(1 - \frac{1}{5}\Bigr) = 80$.
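As a sanity check on the product formula, here is a small sketch that computes ϕ(n) from the primes dividing n (found by trial division) and compares it with a direct count of coprime remainders. The names `phi` and `phi_by_counting` are illustrative choices; trial division is only sensible for small n.

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's function via phi(n) = n * product of (1 - 1/p) over primes p | n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:      # strip every factor of p from m
                m //= p
            result -= result // p  # multiply result by (1 - 1/p)
        p += 1
    if m > 1:                      # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result

def phi_by_counting(n: int) -> int:
    """Euler's function straight from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

print(phi(200), phi(60), phi(36))
```

Comparing `phi(m * n)` with `phi(m) * phi(n)` for coprime and non-coprime pairs is a quick way to see why the coprime restriction matters.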

The Chinese Remainder Theorem

Return to the argument for Euler’s function being multiplicative: we make an additional observation. Suppose x ≡ c mod m and x ≡ d mod n where gcd(m, n) = 1. Such an x must lie in the cth column of the table (∗) in the proof. But this column comprises precisely the residues modulo n, exactly one of which is d. It follows that the simultaneous congruence

$$\begin{cases} x \equiv c \mod m \\ x \equiv d \mod n \end{cases} \tag{$\dagger$}$$

has a unique solution modulo mn. Armed with this knowledge, it is not too difficult to find it. Since gcd(m, n) = 1, there exist integers λ, µ such that

λm + µn = 1

Now observe that the integer

x = µnc + λmd

manifestly satisfies both congruences in (†).


Example Solve the simultaneous congruence

$$\begin{cases} x \equiv 4 \mod 50 \\ x \equiv 15 \mod 33 \end{cases}$$

Applying the Euclidean Algorithm (or by divine intervention), we see that

(λ, µ) = (2,−3) satisfies 50λ + 33µ = 1

We conclude that the congruence has unique solution

x ≡ −3 · 33 · 4 + 2 · 50 · 15 ≡ −396 + 1500 ≡ 1104 mod 1650
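The two-modulus recipe can be sketched in code as follows. This is an illustrative implementation, assuming the standard iterative extended Euclidean algorithm; the names `extended_gcd` and `crt_pair` are not from the notes, and `lam`, `mu` play the roles of λ, µ.

```python
def extended_gcd(a: int, b: int):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def crt_pair(c: int, m: int, d: int, n: int) -> int:
    """Solve x ≡ c (mod m), x ≡ d (mod n) for coprime m, n; result is reduced mod m*n."""
    g, lam, mu = extended_gcd(m, n)   # lam*m + mu*n = 1
    if g != 1:
        raise ValueError("moduli must be coprime")
    return (mu * n * c + lam * m * d) % (m * n)

print(extended_gcd(50, 33))      # the coefficients (lambda, mu) for 50 and 33
print(crt_pair(4, 50, 15, 33))
```

Any other pair (λ, µ) with 50λ + 33µ = 1 would give the same answer modulo 1650.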

For the theorem in full generality, we consider multiple congruences, where the moduli are pairwise coprime:

x ≡ b1 mod n1, x ≡ b2 mod n2, . . . x ≡ bk mod nk (‡)

Theorem 11.4 (Chinese Remainder Theorem). Suppose that the moduli $n_1, \ldots, n_k$ in the simultaneous congruence (‡) satisfy $\gcd(n_i, n_j) = 1$ for all $i \ne j$. Then (‡) has a unique solution modulo $N = n_1 \cdots n_k$.

The proof is nasty, but constructive: it will likely be easier to read in conjunction with the example which follows.

Proof. Let $N_i = \frac{N}{n_i}$ for each i. Since $\gcd(N_i, n_i) = 1$, it follows (Bézout) that there exists $\lambda_i \in \mathbb{Z}$ such that $\lambda_i N_i \equiv 1 \mod n_i$. Now let

$$x \equiv \lambda_1 N_1 b_1 + \lambda_2 N_2 b_2 + \cdots + \lambda_k N_k b_k \mod N$$

Since

$$\lambda_i N_i \equiv \begin{cases} 0 \mod n_j & \text{if } i \ne j \\ 1 \mod n_i \end{cases}$$

it is easy to check that x satisfies the simultaneous congruence (‡).

Now suppose that y is another solution to (‡). Then $x - y \equiv 0 \mod n_i$ for all i which, since the $n_i$ are pairwise coprime, says that x ≡ y mod N.

Example Find all solutions x ∈ Z to the simultaneous congruences

x ≡ 3 mod 5, x ≡ 5 mod 7, x ≡ 2 mod 8

The moduli 5, 7 and 8 are pairwise coprime hence the theorem applies. We start by computing the terms $N, N_1, N_2, N_3$:

N = 5 · 7 · 8 = 280, N1 = 56, N2 = 40, N3 = 35

We must therefore solve:

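$$\begin{aligned}
56\lambda_1 &\equiv 1 \mod 5 \implies \lambda_1 \equiv 1 \\
40\lambda_2 &\equiv 1 \mod 7 \implies \lambda_2 \equiv 3 \\
35\lambda_3 &\equiv 1 \mod 8 \implies \lambda_3 \equiv 3
\end{aligned}$$

Now define x = 3 · 56 · 1 + 5 · 40 · 3 + 2 · 35 · 3 = 978. The general solution is x ≡ 978 ≡ 138 mod 280. Otherwise said: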

x = 138 + 280t : t ∈ Z
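The constructive proof translates almost line by line into code. The sketch below assumes Python 3.8 or later, where `pow(a, -1, n)` returns a modular inverse; the function name `crt` is an illustrative choice.

```python
from math import prod

def crt(residues, moduli):
    """Solve x ≡ b_i (mod n_i) for pairwise coprime n_i.

    Returns (x, N) with 0 <= x < N = n_1 * ... * n_k.
    """
    N = prod(moduli)
    x = 0
    for b_i, n_i in zip(residues, moduli):
        N_i = N // n_i
        lam_i = pow(N_i, -1, n_i)   # lambda_i with lambda_i * N_i ≡ 1 (mod n_i)
        x += lam_i * N_i * b_i
    return x % N, N

print(crt([3, 5, 2], [5, 7, 8]))
```

If two moduli share a factor, `pow` raises a `ValueError`, which matches the hypothesis of the theorem.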

Aside: non-coprime moduli? A generalization of the Chinese Remainder Theorem is available when the $n_i$ are not pairwise coprime.

Theorem 11.5. A system of simultaneous congruences (‡) may be solved iff $\gcd(n_i, n_j) \mid (b_i - b_j)$ for all $i \ne j$. In such a situation, all solutions are congruent modulo $\operatorname{lcm}(n_1, \ldots, n_k)$.

The method is essentially to remove superfluous congruences so that we can apply the Chinese Remainder Theorem. For example, given the simultaneous congruences

$$\begin{cases} x \equiv 1 \mod 3 \\ x \equiv 2 \mod 4 \\ x \equiv 8 \mod 10 \end{cases}$$

we see that the generalization applies: the only divisor property we have to check is

gcd(4, 10) | (2− 8)

Tackling the final congruence: x ≡ 8 mod 10 holds iff x ≡ 0 mod 2 and x ≡ 3 mod 5. The first of these is unnecessary, as x ≡ 2 mod 4 already implies it. We therefore need only solve the congruence system

$$\begin{cases} x \equiv 1 \mod 3 \\ x \equiv 2 \mod 4 \\ x \equiv 3 \mod 5 \end{cases} \implies x \equiv 58 \mod 60$$

using the standard Chinese Remainder Theorem. Note that the modulus is 60 = lcm(3, 4, 10).
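A different but equivalent way to handle non-coprime moduli is to merge the congruences two at a time, checking the divisibility condition of Theorem 11.5 at each step. This is a sketch of that alternative, not the reduction used above; `merge` and `solve_system` are hypothetical helper names.

```python
from math import gcd

def merge(a: int, m: int, b: int, n: int):
    """Combine x ≡ a (mod m) and x ≡ b (mod n) into one congruence mod lcm(m, n).

    Returns (x, lcm), or None when gcd(m, n) does not divide b - a.
    """
    g = gcd(m, n)
    if (b - a) % g != 0:
        return None
    lcm = m // g * n
    n_red = n // g
    # Write x = a + m*t and solve (m/g) * t ≡ (b - a)/g (mod n/g).
    inv = pow(m // g, -1, n_red) if n_red > 1 else 0
    t = ((b - a) // g * inv) % n_red
    return (a + m * t) % lcm, lcm

def solve_system(congruences):
    """congruences: list of (residue, modulus) pairs; returns (x, lcm) or None."""
    x, modulus = congruences[0]
    for b, n in congruences[1:]:
        merged = merge(x, modulus, b, n)
        if merged is None:
            return None
        x, modulus = merged
    return x, modulus

print(solve_system([(1, 3), (2, 4), (8, 10)]))
```

A system such as x ≡ 1 mod 4, x ≡ 2 mod 6 returns `None`, since gcd(4, 6) = 2 does not divide 2 − 1.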

Counting residues which are non-coprime to n

Euler’s function records how many integers in $\mathbb{Z}_n$ are relatively prime to n. What about counting the remainders which have other gcd’s with n? In fact Euler’s function does this as well.

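Lemma 11.6. Suppose that d | n. Then there are exactly $\varphi\bigl(\frac{n}{d}\bigr)$ integers a such that 1 ≤ a ≤ n and gcd(a, n) = d.

Proof. $\gcd(a, n) = d \iff d \mid a$ and $\gcd\bigl(\frac{a}{d}, \frac{n}{d}\bigr) = 1$. By definition there are $\varphi\bigl(\frac{n}{d}\bigr)$ values of $\frac{a}{d} \le \frac{n}{d}$ such that the above holds. Hence there are $\varphi\bigl(\frac{n}{d}\bigr)$ possible values of a with gcd(a, n) = d.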


Example There are ϕ(136/4) = ϕ(34) = 16 integers 1 ≤ a ≤ 136 for which gcd(136, a) = 4. With a bit of thinking, these are precisely the numbers 4b where 1 ≤ b ≤ 34 is coprime to 34:

4, 12, 20, 28, 36, 44, 52, 60, 76, 84, 92, 100, 108, 116, 124, 132
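A brute-force enumeration is an easy way to confirm a list like this. The snippet below is a minimal check; the variable name `matches` is an illustrative choice.

```python
from math import gcd

n, d = 136, 4
# All a in 1..n whose gcd with n is exactly d.
matches = [a for a in range(1, n + 1) if gcd(a, n) == d]
print(len(matches), matches)
```

Note that 68 = 4 · 17 is excluded, because gcd(68, 136) = 68.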

Theorem 11.7. For n > 0 we have

$$\sum_{d \mid n} \varphi(d) = n$$

where the sum is over all positive divisors of n.

Proof. Partition {1, . . . , n} into subsets according to the gcd of each number with n. By the above Lemma this gcd is d for exactly $\varphi\bigl(\frac{n}{d}\bigr)$ of the numbers. Hence

$$\sum_{d \mid n} \varphi\Bigl(\frac{n}{d}\Bigr) = n \qquad \text{(we’ve counted the whole set!)}$$

However the values $\frac{n}{d}$ are simply all the divisors of n in the reverse order to d, whence the sums must be identical:

$$\sum_{d \mid n} \varphi\Bigl(\frac{n}{d}\Bigr) = \sum_{d \mid n} \varphi(d)$$

Example 28 has positive divisors 1, 2, 4, 7, 14 and 28. Computing ϕ of each of these yields 1, 1, 2, 6, 6, 12 respectively. The sum of the ϕ values is 28, as required.
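The divisor-sum identity is easy to test numerically. Here is a short sketch using a definition-based ϕ; the helper names `phi` and `divisors` are illustrative, and the brute-force approach is only meant for small n.

```python
from math import gcd

def phi(n: int) -> int:
    """Euler's function straight from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

def divisors(n: int):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

n = 28
phi_values = [phi(d) for d in divisors(n)]
print(divisors(n), phi_values, sum(phi_values))
```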

12 Primes

We have already seen the importance of the concepts of prime and irreducible, in particular in being able to uniquely factorize every integer n ≥ 2 into primes. Given the usefulness of primes as ‘building blocks’ of the integers, Number Theory naturally wants to investigate the distribution of the primes. That is, we want to answer questions such as:

1. How many primes are there?

2. How many primes are there with a certain property? (e.g. congruent to 3 modulo 4)

3. If we have discovered the first n primes, how long might we expect to have to search to findthe next prime?

4. Can we write every even integer ≥ 4 as a sum of two primes?

5. Are there infinitely many primes p such that p + 2 is also prime?

6. Does there exist at least one prime between any consecutive squares?

7. Are there infinitely many primes of the form $N^2 + 1$?


The first three questions can, more or less, be answered. The final four are famous conjectures (the Goldbach, Twin Prime, Legendre’s and $N^2 + 1$ conjectures respectively) that have remained unsolved for over a century.

The first question in our list has already been answered by Euclid’s Theorem on the fact that the set of primes is infinite (Theorem 7.3). We can continue by modifying Euclid’s proof to other situations. For example, it is clear that any prime p ≥ 3 cannot be even and must therefore be congruent to 1 or 3 modulo 4. Consider the following table of the primes p such that 3 ≤ p ≤ 120, arranged by remainder modulo 4:

    p ≡ 1 mod 4:   5  13  17  29  37  41  53  61  73  89  97  101  109  113  · · ·
    p ≡ 3 mod 4:   3   7  11  19  23  31  43  47  59  67  71   79   83  103  107  · · ·

It appears that the primes are fairly evenly distributed between the two classes, and we might reasonably conjecture that there are infinitely many primes of each type. This is indeed the case.
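To experiment with the split of odd primes between the two classes, one can sieve the primes up to a bound and sort them by remainder modulo 4. This is a minimal sketch assuming a basic Sieve of Eratosthenes; `primes_up_to` is an illustrative name and expects a bound of at least 2.

```python
def primes_up_to(limit: int):
    """Sieve of Eratosthenes: all primes p <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p in range(2, limit + 1) if sieve[p]]

odd_primes = [p for p in primes_up_to(120) if p != 2]
one_mod_four = [p for p in odd_primes if p % 4 == 1]
three_mod_four = [p for p in odd_primes if p % 4 == 3]
print(len(one_mod_four), one_mod_four)
print(len(three_mod_four), three_mod_four)
```

Raising the bound lets you watch how closely the two counts track each other.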

Theorem 12.1. There are infinitely many primes congruent to 3 modulo 4.

To help us, we first prove a lemma.

Lemma 12.2. (a) The set X = {x ∈ Z : x ≡ 1 mod 4} is closed under multiplication.

(b) If Π ≡ 3 mod 4, then at least one of the primes dividing Π is congruent to 3 modulo 4.

Proof. (a) Suppose that x, y ∈ X. Then xy ≡ 1 mod 4.

(b) Suppose that Π ≡ 3 mod 4. Then Π is odd and so is not divisible by 2. Suppose also that the prime decomposition of Π contains only primes congruent to 1 modulo 4. By part (a), Π ≡ 1 mod 4. A contradiction.

Proof of Theorem. The idea is to modify Euclid’s proof. Suppose that there are finitely many primes congruent to 3 modulo 4: list them as $3, p_1, \ldots, p_n$. Now define the number

$$\Pi := 4 p_1 p_2 p_3 \cdots p_n + 3$$

Certainly Π is congruent to 3 modulo 4. By the Lemma, Π must be divisible by some prime p ≡ 3 mod 4. By assumption, we have all of these. There are only two possibilities:

1. p = 3 from which $3 \mid 4 p_1 p_2 p_3 \cdots p_n \implies 3 \mid p_i \implies p_i = 3$ for some i: a contradiction.

2. $p = p_i$ for some i, in which case $p \mid 3 \implies p = 3$: again a contradiction.

We conclude that p is a new prime congruent to 3 modulo 4, whence our claim that we had them all is false.

Before moving on, consider why the above proof can’t be modified to show that there are infinitely many primes congruent to 1 modulo 4. One issue is that the lemma fails: the set of integers

{y ∈ Z : y ≡ 3 mod 4}

is not closed under multiplication. We simply can’t claim that any integer Π ≡ 3 (or indeed Π ≡ 1) modulo 4 must be divisible by a prime congruent to 1. Indeed:


• Π := 21 = 3 · 7 ≡ 1 mod 4 is not divisible by any primes congruent to 1.

• Π := 3 · 7 · 11 = 231 ≡ 3 mod 4 is not divisible by any primes congruent to 1.

We will provide a proof later, using quadratic residues, to show that there are indeed infinitely many primes congruent to 1 modulo 4.

In general, there is a much harder theorem:

Theorem 12.3 (Dirichlet). If gcd(a, m) = 1 then there are infinitely many primes p satisfying p ≡ a mod m.

13 Counting Primes

Now we turn to the third in our list of questions from the previous section. To think about this, we introduce the concept of a counting function. This is a function $f : \mathbb{N} \to \mathbb{N}_0$ for which f(x) is the number of positive integers less than or equal to x with a certain property. Euler’s totient function ϕ is an example:

$$\varphi(x) = \bigl|\{n \in \mathbb{N}_{\le x} : \gcd(x, n) = 1\}\bigr|$$

Here is another, simpler, example: for each x ∈ N, define

$$f(x) = \bigl|\{n \in \mathbb{N}_{\le x} : n \equiv 4 \mod 7\}\bigr|$$

The purpose of this function is to try to make sense of the following intuitive statement:

One seventh of the integers are congruent to 4 modulo 7

The difficulty is that there are infinitely many integers, so what exactly does a ‘seventh’ of infinity look like? To get a feel for f, compute the first few values:

    x      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
    f(x)   0  0  0  1  1  1  1  1  1  1  2  2  2  2  2  2  2  3  3  3

It seems reasonable to claim that, for large x, f(x) is approximately a seventh of x. More precisely, we could obtain a formula using the ceiling function:

$$f(x) = \left\lceil \frac{x-3}{7} \right\rceil = \text{‘least integer greater than or equal to } \frac{x-3}{7}\text{’}$$

In particular,

$$\frac{x-3}{7} \le f(x) < \frac{x+4}{7}$$

Now consider the Squeeze Theorem applied to $\frac{f(x)}{x}$:

$$\frac{x-3}{7x} \le \frac{f(x)}{x} < \frac{x+4}{7x} \implies \lim_{x \to \infty} \frac{f(x)}{x} = \frac{1}{7}$$


This gives precision to the idea that one seventh of the integers are congruent to 4 modulo 7. There is even a notation for this:

$$f(x) \sim \frac{1}{7} x$$

which is read, ‘f(x) is asymptotic to $\frac{1}{7} x$.’

Armed with this notation we can ask a similar question of the primes.

Definition 13.1. π(x) := |{p prime : p ≤ x}| is the number of primes less than or equal to x.

Theorem 13.2 (Prime number theorem). $\lim_{x \to \infty} \frac{\pi(x)}{x / \ln x} = 1$. Otherwise said $\pi(x) \sim \frac{x}{\ln x}$.

You might find it easier to understand the prime number theorem in terms of probabilities. In the interval [1, x] there are π(x) primes: the chance that a random integer in this interval is prime is therefore

$$P\bigl(y \in \mathbb{N}_{\le x} \text{ prime}\bigr) = \frac{\pi(x)}{x} \approx \frac{1}{\ln x}$$

The proof is too difficult for us. Of course, $\frac{x}{\ln x}$ is only an estimate of the function π(x) (it is actually an under-estimate for every x ≥ 17). A marginally more accurate estimate, albeit at the cost of having to estimate an integral, is

$$\pi(x) \sim \int_2^x \frac{1}{\ln t}\, dt$$

To check the veracity of these claims, consider the 1000th prime $p_{1000} = 7919$. We have

$$\pi(7919) = 1000, \qquad \frac{7919}{\ln 7919} \approx 882, \qquad \int_2^{7919} \frac{1}{\ln t}\, dt \approx 1016$$

A little extra algebra actually tells us that the nth prime should be located around

$$p_n \approx n \ln n$$

Indeed 1000 ln 1000 ≈ 6908, which is a 13% under-estimate.
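These numerical claims can be reproduced with a short script. The sketch below assumes a simple sieve for π(x) and composite Simpson’s rule for the integral; the names `primes_up_to` and `log_integral` and the step count of 20000 are illustrative choices rather than anything prescribed in the notes.

```python
from math import log

def primes_up_to(limit: int):
    """Sieve of Eratosthenes: all primes p <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p in range(2, limit + 1) if sieve[p]]

def log_integral(x: float, steps: int = 20000) -> float:
    """Composite Simpson approximation of the integral of 1/ln(t) over [2, x].

    steps must be even.
    """
    h = (x - 2) / steps
    total = 1 / log(2) + 1 / log(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) / log(2 + i * h)
    return total * h / 3

x = 7919
pi_x = len(primes_up_to(x))        # exact prime count
simple_estimate = x / log(x)       # x / ln x
integral_estimate = log_integral(x)
nth_prime_estimate = 1000 * log(1000)
print(pi_x, round(simple_estimate), round(integral_estimate), round(nth_prime_estimate))
```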

14 Mersenne Primes

Definition 14.1. A prime of the form $2^p - 1$, where p is prime, is called a Mersenne prime.

Mersenne primes¹⁹ make the headlines: whenever the ‘world’s largest prime’ is announced, it is usually a Mersenne prime.²⁰ Indeed the current largest known prime is the Mersenne prime $2^{77,232,917} - 1$ with 23,249,425 digits. . .

It may seem strange for people to hunt for Mersenne primes, but there is a reason: if you want to find large primes, Mersenne primes are an easy way to get large examples. Indeed, as far as using exponentiation is concerned, they’re just about our only simple option.

¹⁹ Named for Marin Mersenne, a 17th century French music theorist and mathematician.

²⁰ The major project searching for large primes is known as the Great Internet Mersenne Prime Search, or GIMPS for the humorously-minded. . .


Theorem 14.2. If $p = a^n - 1$ is prime for some a, n ≥ 2, then p is a Mersenne prime (i.e. a = 2 and n is prime).

Proof. Recall the geometric series formula:

$$x^n - 1 = (x - 1)(x^{n-1} + \cdots + 1)$$

Clearly $(a - 1) \mid (a^n - 1)$, so $p = a^n - 1$ is prime only if a − 1 = 1, that is, a = 2.

Now suppose that n = mk is composite, with m, k ≥ 2. Then

$$2^n - 1 = (2^m)^k - 1 = (2^m - 1)\bigl((2^m)^{k-1} + (2^m)^{k-2} + \cdots + 1\bigr)$$

is also composite. So n must be prime.

Note that the theorem has no converse: in fact, relatively few primes p produce a Mersenne prime $2^p - 1$. For instance $2^{11} - 1 = 2047 = 23 \cdot 89$ isn’t prime. As an idea of their rarity, the current largest known prime is only the 50th Mersenne prime to be discovered: it is merely conjectured that there are infinitely many of them.
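For small exponents the rarity is easy to observe directly. This sketch uses plain trial division, which is only practical for small p; serious searches use specialised tests that are not covered here. The name `is_prime` is an illustrative choice.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for n up to a few billion."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

# Prime exponents p <= 31 for which 2^p - 1 is itself prime.
mersenne_exponents = [p for p in range(2, 32)
                      if is_prime(p) and is_prime(2 ** p - 1)]
print(mersenne_exponents)
print(2 ** 11 - 1, 23 * 89)
```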

15 Mersenne Primes and Perfect Numbers

Definition 15.1. A positive integer is perfect if it equals the sum of its proper (positive) divisors.

For example, the number 6 = 1 + 2 + 3 and the number 28 = 1 + 2 + 4 + 7 + 14 are both perfect. These are intimately related to Mersenne primes. We use this as an excuse to introduce two further multiplicative functions.

Definition 15.2. Let n ∈N and define:

$\tau(n) = |\{d : d \mid n\}|$ = ‘the number of positive divisors of n’

$\sigma(n) = \sum_{d \mid n} d$ = ‘the sum of the positive divisors of n’

It is considered implicit in the definitions that we are only dealing with the positive divisors of n: if we summed over all positive and negative divisors, we’d always get zero!

Theorem 15.3. σ and τ are multiplicative functions.

Proof. Everything relies on the following simple fact: since gcd(m, n) = 1,

Every divisor d of mn is uniquely a product $d = d_1 d_2$ of a divisor $d_1$ of m and a divisor $d_2$ of n.

It is then immediate that τ(mn) = τ(m)τ(n). Moreover

$$\sum_{d \mid mn} d = \sum_{d_1 \mid m,\ d_2 \mid n} d_1 d_2 = \sum_{d_1 \mid m} d_1 \cdot \sum_{d_2 \mid n} d_2$$

from which σ(mn) = σ(m)σ(n).


We can easily compute the values of τ and σ when $n = p^m$ is a power of a prime:

$$\tau(p^m) = \bigl|\{1, p, p^2, \ldots, p^m\}\bigr| = m + 1$$

$$\sigma(p^m) = \sum_{i=0}^{m} p^i = \frac{p^{m+1} - 1}{p - 1}$$

Since both functions are multiplicative, we have proved:

Corollary 15.4. Suppose that $n = p_1^{m_1} \cdots p_k^{m_k}$ is the prime decomposition of n. Then,

$$\tau(n) = \prod_{i=1}^{k} (m_i + 1), \qquad \sigma(n) = \prod_{i=1}^{k} \frac{p_i^{m_i + 1} - 1}{p_i - 1}$$

Examples There are $\tau(260) = \tau(2^2 \cdot 5 \cdot 13) = (2 + 1)(1 + 1)(1 + 1) = 12$ positive divisors of 260, and their sum is

$$\sigma(260) = \frac{2^3 - 1}{1} \cdot \frac{5^2 - 1}{4} \cdot \frac{13^2 - 1}{12} = 588$$

This can tediously be checked since the divisors of 260 are 1, 2, 4, 5, 10, 13, 20, 26, 52, 65, 130, 260.

Repeating with $n = 1000 = 2^3 \cdot 5^3$, we see that

$$\tau(1000) = (3 + 1)(3 + 1) = 16, \qquad \sigma(1000) = \frac{2^4 - 1}{2 - 1} \cdot \frac{5^4 - 1}{5 - 1} = 2340$$
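The formulas of Corollary 15.4 can be sketched directly in code. The helper `factorize` below uses trial division and is an illustrative assumption, suitable only for small n; `tau` and `sigma` then apply the corollary to the resulting prime powers.

```python
def factorize(n: int) -> dict:
    """Prime decomposition of n >= 1 as {prime: exponent}, by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def tau(n: int) -> int:
    """Number of positive divisors: product of (m_i + 1)."""
    result = 1
    for m in factorize(n).values():
        result *= m + 1
    return result

def sigma(n: int) -> int:
    """Sum of positive divisors: product of (p^(m+1) - 1) / (p - 1)."""
    result = 1
    for p, m in factorize(n).items():
        result *= (p ** (m + 1) - 1) // (p - 1)
    return result

print(tau(260), sigma(260), tau(1000), sigma(1000))
```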

Perfect Numbers

We can now redefine the concept of a perfect number:²¹

n is perfect if and only if σ(n) = 2n

There is an intimate relation between perfect numbers and Mersenne primes: half of it indeed appears in Euclid’s Elements.

Theorem 15.5. If $2^p - 1$ is a Mersenne prime, then $2^{p-1}(2^p - 1)$ is perfect.

Proof. Suppose that $2^p - 1$ is a Mersenne prime. Then

$$\sigma\bigl(2^{p-1}(2^p - 1)\bigr) = \frac{2^p - 1}{2 - 1} \cdot \frac{(2^p - 1)^2 - 1}{(2^p - 1) - 1} = (2^p - 1)\, 2^p = 2 \cdot 2^{p-1}(2^p - 1)$$

so that $2^{p-1}(2^p - 1)$ is perfect.²²

²¹ Note that σ(n) sums all the positive divisors of n including n itself.

²² For a simpler argument omitting σ, note that the primality of $2^p - 1$ means that the proper divisors of $2^{p-1}(2^p - 1)$ are

$$1, 2, 2^2, \ldots, 2^{p-1}, \quad \text{and} \quad 2^p - 1,\ 2(2^p - 1),\ \ldots,\ 2^{p-2}(2^p - 1)$$

These can be summed using geometric series.


For small values of p we have the following table

    p     2^p − 1    n
    2     3          6
    3     7          28
    5     31         496
    7     127        8128
    13    8191       33550336
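The table can be regenerated, and each entry checked against the definition of a perfect number, with a few lines of code. This is a sketch; `sum_proper_divisors` is an illustrative helper that pairs each divisor d with n/d.

```python
def sum_proper_divisors(n: int) -> int:
    """Sum of the positive divisors of n other than n itself."""
    total = 1 if n > 1 else 0
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:      # avoid double-counting a square root
                total += n // d
        d += 1
    return total

# Rows (p, 2^p - 1, n) with n = 2^(p-1) * (2^p - 1).
table = [(p, 2 ** p - 1, 2 ** (p - 1) * (2 ** p - 1)) for p in (2, 3, 5, 7, 13)]
for p, mersenne, n in table:
    print(p, mersenne, n, sum_proper_divisors(n) == n)
```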

It was conjectured in the middle ages and finally proved by Euler that all even perfect numbers have this form; it is not known if there are any odd perfect numbers.

Theorem 15.6 (Euler). Every even perfect number has the form $2^{p-1}(2^p - 1)$ for some Mersenne prime $2^p - 1$.

Proof. Suppose that $n = 2^k m$ is an even perfect number, where k ≥ 1 and m is odd. Our goal is to prove that m is prime; we will do this by showing that σ(m) = m + 1.

By our assumptions, we have two expressions for σ(n):

$$\sigma(n) = \begin{cases} \sigma(2^k)\sigma(m) = (2^{k+1} - 1)\sigma(m) & \text{since } \gcd(2^k, m) = 1, \\ 2n = 2^{k+1} m & \text{since } n \text{ is perfect.} \end{cases}$$

It follows that

$$(2^{k+1} - 1)\sigma(m) = 2^{k+1} m$$

Since $2^{k+1} - 1$ is odd, we see that $2^{k+1} \mid \sigma(m)$, whence we write $\sigma(m) = 2^{k+1}\alpha$ for some α ∈ N. We now have

$$(2^{k+1} - 1)\alpha = m$$

If we can show that α = 1 then we are finished: in such a case

$$\sigma(m) = 2^{k+1} = (2^{k+1} - 1) + 1 = m + 1$$

whence m is prime.

For a contradiction, assume that α > 1. Then m is divisible by the distinct divisors 1, α, m. Clearly

$$2^{k+1}\alpha = \sigma(m) \ge 1 + \alpha + m = 1 + \alpha + (2^{k+1} - 1)\alpha = 1 + 2^{k+1}\alpha$$

Contradiction!

We conclude that $m = 2^{k+1} - 1$ is prime. By Theorem 14.2 we see that k + 1 = p is also prime, whence m is a Mersenne prime. We therefore see that $n = 2^k(2^{k+1} - 1) = 2^{p-1}(2^p - 1)$ where $2^p - 1$ is a Mersenne prime, as required.

Thus far, fifty Mersenne primes have been discovered, and thus fifty perfect numbers: no others are known, although the conjecture regarding infinitely many Mersenne primes would imply the existence of infinitely many perfect numbers.


16 Powers Modulo m and successive squaring

In this section we construct an algorithm for computing large powers modulo m. At present, if we were confronted with $14^{217} \mod 67$, we have a couple of options:

1. Hunt for a small power of 14 which has small remainder modulo 67. This could take a while!

2. Use Euler’s Theorem (really Fermat’s Little Theorem since the modulus 67 is prime), to see that $14^{66} \equiv 1 \mod 67$. This allows us to reduce the problem to computing $14^{19} \mod 67$, but now we’re back to option 1.

As the modulus gets larger these options become markedly less attractive: in particular, it could be very difficult to compute ϕ(m) for a large value of m. Rather than finish the calculation by trying to find a pleasantly small remainder for $14^k$, here is a more systematic approach where we compute squares:

• $14^2 \equiv 196 \equiv -5 \mod 67$

• $14^4 \equiv 25 \mod 67$

• $14^8 \equiv 25^2 \equiv 625 \equiv 22 \mod 67$

• $14^{16} \equiv 22^2 \equiv 484 \equiv 15 \mod 67$

Each squaring is not very difficult, and now we have enough information to compute:

$$14^{19} \equiv 14^{16+2+1} \equiv 15 \cdot (-5) \cdot 14 \equiv -75 \cdot 14 \equiv -8 \cdot 14 \equiv -112 \equiv 22 \mod 67$$

The method relies on the binary decomposition of the exponent: $19 = 16 + 2 + 1 = 2^4 + 2^1 + 2^0$. Here is the method in general.

Successive Squaring Algorithm to compute ak mod m

1. Write $k = \mu_0 + 2\mu_1 + 2^2\mu_2 + \cdots + 2^r\mu_r$ where each $\mu_j \in \{0, 1\}$. This is the binary decomposition²³ of k.

2. Make a list of powers modulo m: for each j = 0, . . . , r we compute

$$A_j \equiv a^{2^j} \mod m$$

iteratively, via $A_{j+1} \equiv A_j^2 \mod m$.

3. Then $a^k \equiv A_0^{\mu_0} \cdots A_r^{\mu_r} \mod m$.
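The three steps above can be sketched as a short loop that reads the binary digits of k from the least significant end, squaring as it goes. The function name `power_mod` is an illustrative choice; Python’s built-in three-argument `pow` does the same job and is used here only as a cross-check.

```python
def power_mod(a: int, k: int, m: int) -> int:
    """Compute a**k mod m by successive squaring (k >= 0, m >= 1)."""
    result = 1 % m
    square = a % m                 # A_0
    while k > 0:
        if k & 1:                  # current binary digit mu_j is 1
            result = (result * square) % m
        square = (square * square) % m   # A_{j+1} = A_j^2 mod m
        k >>= 1                    # move to the next binary digit
    return result

print(power_mod(14, 19, 67), power_mod(14, 217, 67))
print(power_mod(14, 217, 67) == pow(14, 217, 67))
```

The loop runs once per binary digit of k, so the work grows with the number of digits of the exponent rather than with its size.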

²³ For those keeping track of computing time, it is likely that a computer already has k stored this way.


Example To find $6^{73} \mod 25$ using successive squaring, we follow the algorithm:

1. $73 = 64 + 8 + 1 = 2^6 + 2^3 + 2^0$.

2. Starting with $A_0 = a = 6$ we square:

$$\begin{aligned}
A_1 &\equiv 6^2 \equiv 36 \equiv 11 \\
A_2 &\equiv 11^2 \equiv 121 \equiv -4 \\
A_3 &\equiv (-4)^2 \equiv 16 \equiv -9 \\
A_4 &\equiv 16^2 \equiv 256 \equiv 6 \\
A_5 &\equiv 6^2 \equiv 11 \\
A_6 &\equiv 11^2 \equiv -4
\end{aligned}$$

Notice that the pattern starts to repeat once we reach $A_4 \equiv A_0$.

3. $6^{73} \equiv A_0 A_3 A_6 \equiv 6 \cdot (-9) \cdot (-4) \equiv 6 \cdot 11 \equiv 16 \mod 25$.

Of course we could have streamlined the calculation by starting with Euler’s Theorem: ϕ(25) = 5 · 4 = 20, whence

$$6^{73} \equiv (6^{20})^3 \cdot 6^{13} \equiv 6^{2^3 + 2^2 + 2^0} \equiv A_3 A_2 A_0 \equiv (-9) \cdot (-4) \cdot 6 \equiv 16$$

If you think about the table of values and how it repeats, this didn’t save us much time.

Efficiency While tedious to perform by hand, the algorithm is very efficient for a computer. Observe that an exponent k has a binary expansion with r terms (up to $2^{r-1}\mu_{r-1}$) if and only if

$$2^{r-1} \le k < 2^r \iff r - 1 \le \log_2 k < r$$

We have to square and compute each of the remainders $A_j$ for j = 1, . . . , r − 1, then a final computation of the remainder of the product $A_0^{\mu_0} \cdots A_{r-1}^{\mu_{r-1}}$. The algorithm therefore requires approximately $\log_2 k$ ‘take the remainder’ steps to complete, which is roughly 3.32 times the number of digits of k. Compare this with $\log_2 73 \approx 6.19$ for the above computation where we needed seven steps.

17 kth-roots modulo m

In this section we apply successive squaring to the problem of finding kth-roots modulo m: otherwise said, we want to solve congruences of the form $x^k \equiv b \mod m$. In particular, we want to consider the cases when a remainder b has a unique kth root. For instance, here is an example where we find the unique 53rd-root of 7 modulo 26.

Example Solve the congruence $x^{53} \equiv 7 \mod 26$.

• First note that gcd(7, 26) = 1. If d | x and d | 26, then d | 7, and so d = 1.

• Apply Euler’s Theorem: any solution x satisfies gcd(x, 26) = 1, and so $x^{\varphi(26)} \equiv 1$.

• Since ϕ(26) = 12 we see that $x^{12} \equiv 1$.

• Put it together:

$$x^{53} \equiv x^{4 \cdot 12 + 5} \equiv x^5 \mod 26$$

so we need to solve $x^5 \equiv 7 \mod 26$.

• Now what? Hunt for a multiple of 5 which is congruent to 1 modulo ϕ(m) = 12. In this case $5^2 = 25 = 2\varphi(26) + 1$. We conclude

$$x \equiv x^{1 + 2\varphi(26)} \equiv x^{25} \equiv 7^5 \equiv 11 \mod 26$$

In the last step we may appeal to successive squaring to compute $7^5$ or simply hack at it, starting with $7^2 = 49 \equiv -3$. . .

We lucked out in the example: in the final step we relied on being able to solve the congruence

5u ≡ 1 mod 12

which we know we can do because gcd(5, 12) = 1. In general, this trick will not succeed, but we have identified the critical ingredient necessary to being able to find a unique kth root.

Theorem 17.1. Suppose that $x^k \equiv b \mod m$, where gcd(b, m) = gcd(k, ϕ(m)) = 1. Then there is exactly one solution, which can be found by applying the following steps:

1. Find ϕ(m).

2. Find u ∈ N such that ku ≡ 1 mod ϕ(m).

3. Find $x \equiv b^u \mod m$.

Proof. Step 2 is possible since gcd(k, ϕ(m)) = 1; we can therefore write ku = 1 + λϕ(m) for some λ ∈ N. For step 3, we first observe that any solution x must have gcd(x, m) = 1: if not, then $b \equiv x^k$ and m would have a common divisor greater than 1. Now apply Euler’s Theorem:

$$b^u \equiv (x^k)^u \equiv x^{1 + \lambda\varphi(m)} \equiv x \mod m$$

It should be clear that $x \equiv b^u$ is the unique kth-root of b since we found x by doing the same thing (raising to the power u) to both sides of the congruence $x^k \equiv b \mod m$.
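The three steps of the theorem can be sketched as a short function. This is an illustrative implementation: it computes ϕ(m) by brute force (so it only suits small moduli), assumes Python 3.8 or later for `pow(k, -1, t)`, and the names `phi` and `kth_root` are not from the notes.

```python
from math import gcd

def phi(m: int) -> int:
    """Euler's function straight from the definition (small m only)."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)

def kth_root(k: int, b: int, m: int) -> int:
    """Unique solution of x**k ≡ b (mod m) when gcd(b, m) = gcd(k, phi(m)) = 1."""
    t = phi(m)                                  # step 1
    if gcd(b, m) != 1 or gcd(k, t) != 1:
        raise ValueError("need gcd(b, m) = gcd(k, phi(m)) = 1")
    u = pow(k, -1, t)                           # step 2: k*u ≡ 1 (mod phi(m))
    return pow(b, u, m)                         # step 3: x ≡ b^u (mod m)

x = kth_root(53, 7, 26)
print(x, pow(x, 53, 26))
```

Raising the returned value to the kth power and reducing modulo m is a cheap way to confirm the answer.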

Example Find the solution to $x^{283} \equiv 29 \mod 42$.

A quick check shows that gcd(29, 42) = 1 and ϕ(42) = 12. We therefore need to solve

283u ≡ 1 mod 12

This is straightforward: since 283 ≡ 7 mod 12, we easily spot that u ≡ 7 solves the congruence.²⁴ Both gcd conditions are met, whence there is a unique solution to the problem. We compute

$$x^{283} \equiv 29 \implies x^{283 \cdot 7} \equiv 29^7 \mod 42 \implies x \equiv x^{1 + 165\varphi(42)} \equiv 29^7 \mod 42$$

It remains to compute the final power: applying the successive squaring algorithm, we have $7 = 2^0 + 2^1 + 2^2$, and

$$A_0 = 29, \qquad A_1 \equiv 29^2 \equiv (-13)^2 \equiv 169 \equiv 1, \qquad A_2 \equiv 1$$

whence

$$x \equiv 29^7 \equiv 29 \cdot 1 \cdot 1 \equiv 29 \mod 42$$

²⁴ If this makes you nervous, you can do it the slow way, either by applying the Euclidean Algorithm to solve 7u = 1 + 12λ, or indeed by applying it to 283u = 1 + 12λ:

$$\begin{aligned}
283 &= 12 \cdot 23 + 7 \\
12 &= 7 \cdot 1 + 5 \\
7 &= 5 \cdot 1 + 2 \\
5 &= 2 \cdot 2 + 1
\end{aligned}
\implies \gcd(283, 12) = 1 = 12 \cdot 118 - 283 \cdot 5 \implies 283 \cdot 7 = 1 + 12 \cdot 165$$

where we reversed the algorithm to obtain the final result. Of course this is very tedious!

It is a reasonable question to ask what can happen in the case that either or both of the conditions gcd(b, m) = 1 or gcd(k, ϕ(m)) = 1 fails. The short answer is that anything is possible; you could have no kth-root, a unique kth-root, or several kth-roots. Some of the details are in the homework.

Aside: Bijections and the set of Units To extend the uniqueness thought from the Theorem a little further, consider the set of units

$$U_m = \{x \in \mathbb{Z}_m : \gcd(x, m) = 1\}$$

Any solution x to $x^k \equiv b \mod m$ where b is a unit must also be a unit. Supposing that gcd(k, ϕ(m)) = 1, we have a unique solution, from which it follows that the function $f : x \mapsto x^k \mod m$ is a bijection of the set $U_m$ with itself; that is, f simply rearranges the elements of $U_m$. Indeed the inverse function is $f^{-1} : x \mapsto x^u \mod m$. As an example, consider

$$U_{14} = \{1, 3, 5, 9, 11, 13\}$$

and k = 5 (we need k to be coprime to ϕ(14) = 6). Then the function f can be written in tabular form:

    x       1   3   5   9  11  13
    f(x)    1   5   3  11   9  13
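The tabulated map can be reproduced, and its inverse checked, in a few lines. This is a small sketch; the variable names `units`, `f` and `u` are illustrative, and `pow(k, -1, ...)` assumes Python 3.8 or later.

```python
from math import gcd

m, k = 14, 5
units = [x for x in range(1, m) if gcd(x, m) == 1]   # U_14
f = {x: pow(x, k, m) for x in units}                 # x -> x^k mod m
u = pow(k, -1, len(units))                           # k*u ≡ 1 (mod phi(14))

print(units)
print([f[x] for x in units])
print(u, all(pow(f[x], u, m) == x for x in units))
```

Since `len(units)` equals ϕ(14) = 6, the exponent `u` is exactly the one that undoes raising to the power k.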

More is true, for $U_m$ forms an abelian group under multiplication (the product of any units is still a unit and multiplication is commutative) and f satisfies

$$f(xy) \equiv (xy)^k \equiv x^k y^k \equiv f(x) f(y)$$

This says that $f : U_m \to U_m$ is a homomorphism and, being bijective, is an automorphism of $U_m$.

In the case of our example, it is easy to see that 3 is a generator of $U_{14}$ (5 is another) and so the group of units is cyclic of order 6 and thus isomorphic to $\mathbb{Z}_6$. There are only two automorphisms of this group (a generator must be mapped to a generator. . . ), the trivial automorphism f(x) ≡ x and the non-trivial one we found above, namely $f(x) \equiv x^5$. The generators 3 and 5 are known as primitive roots modulo 14. Study of these objects is very important in more advanced Number Theory.

18 Powers, Roots and Unbreakable codes (RSA)

The famous RSA cryptosystem is perhaps the modern world’s most utilised.²⁵ It is likely that you (indirectly) use some version of it every day, when your phone or computer connects securely to another, for instance using the https protocol. Here is how the method works.

²⁵ The acronym is formed from the initials of Rivest, Shamir and Adleman who discovered the system while working at MIT in 1977. The system was in fact first described in 1973 by Clifford Cocks while working for GCHQ, the British version of the NSA. Its discovery was classified; it was deemed to have no practical application at the time due to the lack of available computing power.


Encoding

1. Start with a semiprime: this is a number m = pq which is a product of two distinct primes.

2. Calculate ϕ(m) = (p− 1)(q− 1).

3. Choose an integer s such that 1 < s < ϕ(m) and gcd(s, ϕ(m)) = 1.

4. Encode a numerical message by mapping $x \mapsto x^s \mod m$.

Decoding This is based on the following.

Theorem 18.1. Let u ∈ N satisfy us ≡ 1 mod ϕ(m). Then $(x^s)^u \equiv x \mod m$ for all x.

Proof. Since m = pq is a semiprime, we have that

$$x^{su} \equiv x \mod m \iff x^{su} \equiv x \mod p \text{ and } \mod q$$

However su ≡ 1 mod ϕ(m) ⟺ su = 1 + j(p − 1)(q − 1) for some j ∈ Z. Hence

$$x^{su} \equiv x \cdot (x^{p-1})^{j(q-1)} \equiv x \mod p,$$

by Fermat’s little theorem.²⁶ The calculation is similar for the other modulus q.

The process is very simple:

$$x \ \overset{\text{encode}}{\longmapsto}\ x^s \mod m \ \overset{\text{decode}}{\longmapsto}\ (x^s)^u \equiv x \mod m$$

Think of s for ‘scramble’ and u for ‘unscramble.’

A Simple Example

We first encode a string of letters by the substitution A ↦ 1, B ↦ 2, etc. For a more sophisticated encoding you might use the UTF code of a character (including punctuation), or assign numbers to different words in a prearranged table. We start with the message:

     I   T   S   A   L   L   G   R   E   E   K   T   O   M   E
     9  20  19   1  12  12   7  18   5   5  11  20  15  13   5

Given x, we now encode by replacing x by $x^s \mod m$. For example if we choose the semiprime m = 5 × 7 = 35, then ϕ(m) = 4 × 6 = 24, and we can choose, say, s = 5. Then:

$$\begin{aligned}
9 &\mapsto 9^5 \equiv 81^2 \cdot 9 \equiv 11^2 \cdot 9 \equiv 16 \cdot 9 \equiv 4 \mod 35 \\
20 &\mapsto 20^5 \equiv 20 \\
19 &\mapsto 19^5 \equiv 24 \\
1 &\mapsto 1^5 \equiv 1 \\
&\ \ \vdots
\end{aligned}$$

²⁶ Either x ≡ 0 in which case the result is trivial, or gcd(x, p) = 1 and so Fermat’s little theorem applies.


We would now transmit the resulting string of numbers:

4, 20, 24, 1, 17, 17, 7, 23, 10, 10, 16, 20, 15, 13, 10

or by translating them back into letters,

D, T, X, A, Q, Q, G, W, J, J, P, T, O, M, J

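To decode, we must choose u such that 5u ≡ 1 mod 24: take u = 5. Now

$$\begin{aligned}
4^5 &\equiv 4^3 \cdot 4^2 \equiv 64 \cdot 4^2 \equiv -6 \cdot 4^2 \equiv -24 \cdot 4 \equiv 44 \equiv 9 \mod 35 \\
20^5 &\equiv 20 \\
24^5 &\equiv 19 \\
&\ \ \vdots
\end{aligned}$$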

and we recover the original string of numbers and the message ITSALLGREEKTOME.
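The whole toy example fits in a few lines of code. This sketch follows the A ↦ 1, B ↦ 2, ... substitution and the choices m = 35, s = 5 from the text; the variable names are illustrative, and `pow(s, -1, phi_m)` assumes Python 3.8 or later.

```python
message = "ITSALLGREEKTOME"
m, s = 35, 5                  # public key: modulus and 'scramble' exponent
phi_m = (5 - 1) * (7 - 1)     # phi(35) = 24
u = pow(s, -1, phi_m)         # private 'unscramble' exponent: s*u ≡ 1 (mod 24)

numbers = [ord(ch) - ord("A") + 1 for ch in message]   # A -> 1, ..., Z -> 26
encoded = [pow(x, s, m) for x in numbers]
decoded = [pow(y, u, m) for y in encoded]
recovered = "".join(chr(x + ord("A") - 1) for x in decoded)

print(numbers)
print(encoded)
print(recovered)
```

With such a small modulus the scheme only works because every letter value is below 35; it is purely a demonstration of the arithmetic.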

Public and Private Keys To decode messages you need u with us ≡ 1 mod ϕ(m). If you keep this number secret and give everyone else the modulus m and s, then they can send you messages, but only you can read them. Since s, m may be made freely available without (seriously) compromising the system, they are known as the public key. u is known as the private key.

One idea would be that a group of friends each have different keys. They all keep their own decoder u but share their encoders s, m with the others. Then all friends can send messages to each other but, once encoded, only the intended recipient can read each one.

Example: ‘Cracking’ RSA Suppose you intercept the following message

59, 57, 59, 4, 2, 82, 4, 86, 43, 4, 43, 57, 4

which you know has been encoded using the public key $s = 11$ and with modulus $m = 119$. You also know that the message may be read by virtue of the translation

11 ↔ A, 12 ↔ B, . . . , 36 ↔ Z

The first job is computing the totient. This is easy, since we can quickly factorize $m = 7 \times 17$ to see that $\phi(m) = 6 \cdot 16 = 96$.

We now need to find the private key $u$, which satisfies $11u \equiv 1 \bmod 96$. A relatively short application of the Euclidean algorithm says that

\[ 1 = 11 \cdot 35 - 4 \cdot 96 \implies u = 35 \]

We now compute²⁷

\[ 59 \mapsto 59^{35} \equiv 19 \bmod 119, \quad \text{etc.} \]

²⁷ If you don’t want to beg the help of a calculator, $35 = 2^5 + 2^1 + 2^0$ and the successive squaring algorithm for 59 mod 119 (with $A_k \equiv 59^{2^k}$) yields

\[ A_0 = 59, \quad A_1 = 30, \quad A_2 = -52, \quad A_3 = -33, \quad A_4 = 18, \quad A_5 = -33 \]

\[ \implies 59^{35} \equiv A_5 A_1 A_0 \equiv -33 \cdot 30 \cdot 59 \equiv 19 \bmod 119 \]

Don’t knock it: it’s what your computer has to do for every element of the code!

The full decode is

19, 29, 19, 30, 25, 24, 30, 18, 15, 30, 15, 29, 30

which you can translate, if you’re so inclined.
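The whole attack can be sketched in code. This is an illustration only, assuming Python 3; the function names `factor_semiprime`, `extended_gcd` and `private_key` are hypothetical helpers, and trial division is used purely because $m = 119$ is tiny.

```python
def factor_semiprime(m):
    """Trial division: fine for tiny m, hopeless for real RSA moduli."""
    p = 2
    while p * p <= m:
        if m % p == 0:
            return p, m // p
        p += 1
    raise ValueError("m has no nontrivial factor")

def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def private_key(s, phi):
    """Solve u*s = 1 (mod phi) using the Euclidean algorithm."""
    g, x, _ = extended_gcd(s, phi)
    if g != 1:
        raise ValueError("s must be coprime to phi(m)")
    return x % phi

m, s = 119, 11
cipher = [59, 57, 59, 4, 2, 82, 4, 86, 43, 4, 43, 57, 4]

p, q = factor_semiprime(m)            # step 1: factorize the modulus
phi = (p - 1) * (q - 1)               # totient of a semiprime
u = private_key(s, phi)               # step 2: invert s modulo phi(m)
decoded = [pow(y, u, m) for y in cipher]
message = "".join(chr(n - 11 + ord("A")) for n in decoded)  # 11 <-> A, ..., 36 <-> Z

print(p, q, phi, u)
print(decoded)
print(message)
```

Only the first step is expensive for a large modulus; everything after the factorization is quick.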

Security of the RSA system

The security of the RSA cryptosystem is based on the fact that it is relatively easy to compute powers and roots compared to the difficulty of computing a totient $\phi(m)$. There is nothing secure about our example above: if you have made $m = 35$ and $s = 5$ public knowledge, then anyone with rudimentary knowledge can quickly compute $\phi(m) = 24$ and see that $u = 5$ solves $us \equiv 1 \bmod \phi(m)$. In practical applications, the semiprime $m$ will be 300 to 600 digits long; this makes a massive difference!

Think about what we know so far about the steps involved in the RSA system: we assume that $s$ and $m$ are public knowledge so that anyone can encode, and that $u$ is known only to the receiver.

1. To encode we compute each power $x^s \bmod m$. Using successive squaring, this requires roughly $\log_2 s$ applications of the Division algorithm together with a little multiplication and addition. This is quick for a computer.

2. Decoding requires the receiver to compute each power $y^u \bmod m$; again this requires roughly $\log_2 u$ applications of the Division algorithm.
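Successive squaring itself is short to write down. The following is a minimal sketch, assuming Python 3; the name `power_mod` is illustrative, and in practice Python's built-in `pow(x, e, m)` does the same job.

```python
def power_mod(x, e, m):
    """Compute x**e mod m by successive squaring (binary exponentiation)."""
    result = 1 % m
    square = x % m              # holds x^(2^k) mod m at step k
    while e > 0:
        if e & 1:               # the current binary digit of e is 1
            result = (result * square) % m
        square = (square * square) % m
        e >>= 1                 # move to the next binary digit
    return result

print(power_mod(59, 35, 119))
```

The loop runs once per binary digit of the exponent, which is where the estimate of roughly $\log_2 s$ reductions comes from.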

And we really do mean quick. While such calculations were infeasible in 1973, a modern computer performs billions of calculations per second. For example, mine took milliseconds when asked to compute the following calculation modulo a 90-digit number:

293816492876398263289730 ≡ 666650015402419428677242104677012151260663205245393274812775759690507953856390011570848512 (mod $10^{90} + 1723$)

Now suppose that you want to attack the RSA system. You know the values of $m$ and $s$ but don’t know $u$. In the abstract this means that you need to do two things:

1. Find $\phi(m)$. Since $m = pq$ is a semiprime, this is as time-consuming as finding the prime decomposition of $m$: no polynomial-time algorithm is known for this on a classical computer, so for large $m$ this is painfully slow.

2. Find $u \in \mathbb{N}$ such that $us \equiv 1 \bmod \phi(m)$. Using the Euclidean Algorithm, this requires no more than $2 \log_2(k)$ applications of the Division algorithm and some back substitution, so this is quick for a computer, provided it already knows $\phi(m)$.

To compare with the above, my computer took 7 minutes to find the following totient:

$\phi(10^{90} + 1723)$ = 666650015402419428677242104677012151260663205245393274812775759690507953856390011570848512

More generally, a desktop computer running current factoring algorithms will factor any 75-digit semiprime within hours. The largest semiprime that has been factored by a general algorithm on a supercomputer has 232 digits.²⁸ The fact that modern implementations use semiprimes that are 300–600 digits long makes them very secure.

While the RSA system is resilient against general attack, it is not foolproof. The main drawback is that RSA is a table cipher: if 18 ↦ 120 during encoding, then 18 is always mapped to 120. If someone intercepts a decoded message, they would know how to decode future messages without calculation, just by transcribing numbers. Moreover, long messages are far less secure than short ones. In English the most common letter is ‘e’ while ‘the’ is likely to be a commonly occurring three-letter combination. By correctly guessing even a few letters, a hacker might be able to decode the rest of a message by inspection, and then decode any future message sent using the same keys. For these reasons it is common to send a new encoding key²⁹ each time you request a message from a source.

There are other issues. For instance, 1 will always encode to 1; if you are using simple translations of numbers for letters, you will need to scramble the letters up or in some way make 1 redundant, otherwise you are giving away cheap information. There are also more technical considerations about choosing your public key, since efficient factoring algorithms exist in certain situations: for example if the primes $p$ and $q$ are ‘too close’ together.

20 Squares Modulo p

We want to solve equations of the form $x^2 \equiv a \bmod p$, where $p$ is prime. More precisely, we first want to identify when such an equation has a solution. The first thing to do is to think about the distribution of squares modulo $p$.

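Lemma 20.1. If $p$ is an odd prime, then the numbers $0^2, 1^2, 2^2, \ldots, \left(\frac{p-1}{2}\right)^2$ are distinct modulo $p$.

Proof. If $x^2 \equiv y^2 \bmod p$ then $(x - y)(x + y) \equiv 0$. Since $p$ is prime it divides one of the two factors, so $x \equiv \pm y$. If $0 \le x, y \le \frac{p-1}{2}$ then $0 \le x + y \le p - 1$, so $x \equiv -y$ forces $x = y = 0$; otherwise $x \equiv y$ and so $x = y$.

If we ignore zero, this says that the congruence $x^2 \equiv a$ may be solved for $\frac{p-1}{2}$, or exactly half, of the non-zero remainders $a$ modulo $p$. Such numbers $a$ are given a name: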

Definition 20.2. $a \not\equiv 0$ is a quadratic residue (QR) modulo $p$ if $x^2 \equiv a \bmod p$ is solvable. Otherwise it is a quadratic non-residue (QNR, or just NR).

²⁸ For years RSA Labs offered cash prizes for factoring large numbers: the number in question, RSA-768, was worth \$50,000 when it was factored in 2009 after two years of calculation. A modern desktop computer running the same algorithm would take well over a lifetime.

²⁹ Suitably encrypted of course, perhaps using a different semiprime or via a secure key-exchange protocol such as Diffie–Hellman.

Examples Here are some tables of what happens modulo 3, 5 and 7.

a   equation         solutions   QR?
1   x² ≡ 1 mod 3     x ≡ 1, 2    yes
2   x² ≡ 2 mod 3     none        no

a   equation         solutions   QR?
1   x² ≡ 1 mod 5     x ≡ 1, 4    yes
2   x² ≡ 2 mod 5     none        no
3   x² ≡ 3 mod 5     none        no
4   x² ≡ 4 mod 5     x ≡ 2, 3    yes

a   equation         solutions   QR?
1   x² ≡ 1 mod 7     x ≡ 1, 6    yes
2   x² ≡ 2 mod 7     x ≡ 3, 4    yes
3   x² ≡ 3 mod 7     none        no
4   x² ≡ 4 mod 7     x ≡ 2, 5    yes
5   x² ≡ 5 mod 7     none        no
6   x² ≡ 6 mod 7     none        no
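Tables like these are easy to generate by machine. Here is a minimal sketch, assuming Python 3; `quadratic_residues` is a hypothetical helper name, and it relies on Lemma 20.1 to square only the numbers $1, \ldots, \frac{p-1}{2}$.

```python
def quadratic_residues(p):
    """Non-zero squares modulo an odd prime p, from x = 1, ..., (p-1)/2."""
    return sorted({(x * x) % p for x in range(1, (p - 1) // 2 + 1)})

for p in (3, 5, 7):
    print(p, quadratic_residues(p))
```

Trying a few larger primes is a quick way to see that the list always has $\frac{p-1}{2}$ entries.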

The evidence confirms that half of the non-zero remainders are quadratic residues. In fact more is true: think about multiplying quadratic residues together: you always get another one! Indeed multiplication of residues and non-residues behaves very like the multiplication of ±1.

Theorem 20.3. QR×QR=QR, QR×NR=NR, NR×NR=QR.

Proof. We will get this as a corollary of a more complex result in the next section. For now we prove it directly; suppose throughout that $a, b$ are non-zero modulo $p$, so that a general quadratic residue may be written as $a^2$.

1. $a^2 b^2 \equiv (ab)^2$, so the product of QRs is a QR.

2. Suppose $a^2 n \equiv b^2$. Then $n \equiv b^2 (a^{-1})^2$ is a QR (all $a \not\equiv 0$ are invertible in $\mathbb{Z}_p$). Thus if $n$ is a NR, then $a^2 n$ must also be a non-residue.

3. Let $n$ be a NR. Since $n \not\equiv 0$, we see that the map³⁰

\[ \mu : \mathbb{Z}_p \setminus \{0\} \to \mathbb{Z}_p \setminus \{0\}, \qquad x \mapsto xn \]

is bijective. For any QR $a^2$, we have that $\mu(a^2) = a^2 n$ is a NR by part 2. Moreover, by Lemma 20.1 the sets of QRs and NRs have the same cardinality. It follows that $\mu$ maps the set of QRs bijectively to the NRs and must therefore map NRs back to QRs. In particular, if $m$ is a NR, then $\mu(m) = mn$ is a QR.

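Definition 20.4. For an odd prime $p$, the Legendre symbol $\left(\frac{a}{p}\right)$ is defined by

\[
\left(\frac{a}{p}\right) :=
\begin{cases}
0 & \text{if } p \mid a, \\
1 & \text{if } p \nmid a \text{ and } x^2 \equiv a \text{ is solvable,} \\
-1 & \text{if } p \nmid a \text{ and } x^2 \equiv a \text{ is unsolvable.}
\end{cases}
\]

Otherwise said, $a$ is a quadratic residue if $\left(\frac{a}{p}\right) = 1$, and a quadratic non-residue if $\left(\frac{a}{p}\right) = -1$.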

Legendre symbols will prove very useful for checking whether we have a quadratic residue or not. To see how, we develop a little, hopefully obvious, algebra.

³⁰ If you’ve done group theory, the argument in part 3 of the proof should remind you of the comparison of even and odd cycles in the group $S_n$, where we see that the sets of such have the same cardinality.


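Theorem 20.5. For $a, b \in \mathbb{Z}$ and $p$ an odd prime, we have:

1. $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right) \left(\frac{b}{p}\right)$

2. $a \equiv b \bmod p \implies \left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$

3. $p \nmid a \implies \left(\frac{a^2}{p}\right) = 1$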

Part 1 is simply a restatement of Theorem 20.3, while 2 and 3 are immediate from the definition.

Example To check whether 27 is a QR modulo 61, we compute the Legendre symbol.

\[ \left(\frac{27}{61}\right) = \left(\frac{3^2}{61}\right) \left(\frac{3}{61}\right) = \left(\frac{3}{61}\right) \]

We are left to decide whether 3 is a QR modulo 61; equivalently, we want to solve $x^2 \equiv 3 \bmod 61$. Manifestly $x \equiv 8$ is a solution (as is $x \equiv -8 \equiv 53$). Thus 27 is a QR modulo 61.

We can actually go further:

\[ 8^2 \equiv 3 \implies (3 \cdot 8)^2 \equiv 3^3 \implies 24^2 \equiv 27 \bmod 61 \]

It follows that the solutions to the original congruence are

\[ x^2 \equiv 27 \bmod 61 \iff x \equiv \pm 24 \equiv 24, 37 \bmod 61 \]

Computing Legendre Symbols

While Legendre symbols were helpful for simplifying the problem of checking that 27 was a quadratic residue modulo 61, they weren’t quite enough. We still needed to be able to spot that 3 was a quadratic residue. Thankfully this was easy: in general we need some way of computing the value of a Legendre symbol without simply relying on Theorem 20.5 to reduce the problem to something trivial. For instance:

\[ \left(\frac{32}{101}\right) = \left(\frac{16}{101}\right) \left(\frac{2}{101}\right) = \left(\frac{2}{101}\right) \]

But can we solve $x^2 \equiv 2 \bmod 101$? It just so happens that we can’t, but how can we see this without trying lots of options? One way to proceed is to consider Fermat’s Little Theorem. Suppose that we could solve the congruence: then $x^{100} \equiv 1 \bmod 101$. Raising both sides of $x^2 \equiv 2$ to the power 50, we must have

\[ 1 \equiv x^{100} \equiv 2^{50} \bmod 101 \]

A bit of calculating should convince you that $2^{50} \equiv -1 \bmod 101$, whence 2, and also 32, are quadratic non-residues modulo 101.

This approach works in general. Suppose that $a$ is a quadratic residue modulo $p$: thus there exists some $x$ such that $x^2 \equiv a \bmod p$ where both $x$ and $a$ are non-zero. If $p$ is an odd prime, then $p - 1 = 2k$ is even. But then, by Fermat’s Little Theorem,

\[ 1 \equiv x^{p-1} \equiv (x^2)^k \equiv a^k \equiv a^{\frac{p-1}{2}} \bmod p \]

We have proved one direction of the following.


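Theorem 20.6. If $p$ is an odd prime and $p \mid a$ then $x^2 \equiv a \bmod p$ is trivially solvable ($x \equiv 0 \bmod p$ is the only solution). Otherwise, $x^2 \equiv a \bmod p$ is

\[
\begin{cases}
\text{solvable} \iff a^{\frac{p-1}{2}} \equiv 1 \bmod p \\
\text{unsolvable} \iff a^{\frac{p-1}{2}} \equiv -1 \bmod p
\end{cases}
\]

Otherwise said, $\left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \bmod p$.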

Proof. We’ve already seen that if $a$ is a quadratic residue then $a^{\frac{p-1}{2}} \equiv 1 \bmod p$. Conversely, consider the equation

\[ y^{\frac{p-1}{2}} \equiv 1 \bmod p \]

By Lagrange’s Theorem there are at most $\frac{p-1}{2}$ solutions to this equation. However, every quadratic residue $a$ is a solution, and by Lemma 20.1 there are $\frac{p-1}{2}$ of these. Hence $a$ is a quadratic residue iff $a^{\frac{p-1}{2}} \equiv 1 \bmod p$.

Now observe that, for any non-zero $a$, be it a QR or not, Fermat’s Little Theorem may be factorized:

\[ 0 \equiv a^{p-1} - 1 \equiv \left(a^{\frac{p-1}{2}} - 1\right) \left(a^{\frac{p-1}{2}} + 1\right) \bmod p \]

Since $p$ is prime, one of the two factors is zero modulo $p$. We conclude that $a$ is a NR iff $a^{\frac{p-1}{2}} \equiv -1 \bmod p$.

Examples 3 is a QR modulo 13 since $3^{\frac{13-1}{2}} \equiv 3^6 \equiv 27^2 \equiv 1^2 \equiv 1 \bmod 13$. It is easy to see that the solutions to $x^2 \equiv 3 \bmod 13$ are $x \equiv 4, 9$.

Returning to our earlier problem: 3 is also a QR modulo 61, since $3^{\frac{61-1}{2}} \equiv 3^{30} \equiv 1 \bmod 61$.
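Theorem 20.6 translates directly into a short computation. This is a sketch under the assumption that $p$ really is an odd prime (the function does not check this); the name `legendre` is ours, and Python's built-in `pow` does the modular exponentiation.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via a^((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)      # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r   # represent p - 1 as -1

print(legendre(3, 13), legendre(3, 61), legendre(2, 101))
```

The same function can be used to spot-check the multiplicativity in Theorem 20.5 for any particular prime.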

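As a final observation, return to Theorem 20.3 and note how it may be viewed as a corollary of this result. In particular

\[ \left(\frac{a}{p}\right) \left(\frac{b}{p}\right) \equiv a^{\frac{p-1}{2}} b^{\frac{p-1}{2}} \equiv (ab)^{\frac{p-1}{2}} \equiv \left(\frac{ab}{p}\right) \bmod p \]

Since both sides lie in $\{-1, 0, 1\}$ and $p \ge 3$, the congruence forces equality.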

21 Are −1 and/or 2 Quadratic Residues?

As an application of Theorem 20.6 we find those primes for which −1 or 2 are quadratic residues.

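Corollary 21.1. If $p$ is an odd prime, then $-1$ is a QR iff $p \equiv 1 \bmod 4$. Indeed

\[
\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} =
\begin{cases}
1 & \text{if } p \equiv 1 \bmod 4 \\
-1 & \text{if } p \equiv 3 \bmod 4
\end{cases}
\]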

As a nice by-product, we can use this observation to prove the counterpart to Theorem 12.1.

Theorem 21.2. There are infinitely many primes congruent to 1 modulo 4.


The idea of the proof is to obtain a contradiction by producing a solution $x$ to a quadratic equation $x^2 \equiv -1 \bmod q$ where $q \equiv 3 \bmod 4$.

Proof. Suppose that $p_1, \ldots, p_n$ constitute all the primes congruent to 1 modulo 4. Construct

\[ \Pi = (2 p_1 \cdots p_n)^2 + 1 \]

Certainly $\Pi$ is a product of primes, none of which can be in the original list. Since $\Pi$ is odd and we’re assuming all the primes congruent to 1 modulo 4 are accounted for, it follows that every prime $q$ dividing $\Pi$ must satisfy $q \equiv 3 \bmod 4$.

Now observe that $\Pi \equiv 0 \bmod q$, whence $x \equiv 2 p_1 \cdots p_n$ solves $x^2 \equiv -1 \bmod q$. Thus $-1$ is a QR modulo $q$, which contradicts Corollary 21.1.
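To see the construction in action, one can start from a short list of primes congruent to 1 modulo 4 and factorize $\Pi$. The sketch below assumes Python 3.8+ (for `math.prod`); the starting list `known` and the helper `prime_factors` are hypothetical choices for illustration.

```python
from math import prod

def prime_factors(n):
    """Prime factors of n (with repetition) by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

known = [5, 13, 17]                    # pretend this list is complete
big = (2 * prod(known)) ** 2 + 1       # the number called Pi in the proof
new_primes = prime_factors(big)
print(big, new_primes, [q % 4 for q in new_primes])
```

Every prime found this way is new and is congruent to 1 modulo 4, exactly as Corollary 21.1 forces.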

Is 2 a Quadratic Residue?

This is much harder than dealing with $-1$, although a nice answer is still available. At issue is computing the value of $2^{\frac{p-1}{2}}$ modulo $p$. One trick for how to do this is attributable to Gauss.³¹ Here are two examples of the idea.

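1. Let $p = 23$. We multiply the even remainders together in two ways:

\[ 2 \cdot 4 \cdot 6 \cdots 22 \equiv 2^{11} \cdot 11! \bmod 23 \]

and

\[
\begin{aligned}
2 \cdot 4 \cdot 6 \cdots 22 &\equiv 2 \cdot 4 \cdots 10 \cdot 12 \cdot 14 \cdots 22 \\
&\equiv 2 \cdot 4 \cdots 10 \cdot (-11) \cdot (-9) \cdots (-1) \\
&\equiv (-1)^6 \cdot 11! \bmod 23
\end{aligned}
\]

It follows that $2^{11} \equiv 2^{\frac{23-1}{2}} \equiv (-1)^6 \equiv 1 \bmod 23$, whence 2 is a quadratic residue modulo 23.

2. Let $p = 37$. This time $\frac{p-1}{2} = 18$ so we break the even remainders at 18:

\[
\begin{aligned}
2^{18} \cdot 18! &\equiv 2 \cdot 4 \cdot 6 \cdots 36 \\
&\equiv 2 \cdot 4 \cdots 18 \cdot 20 \cdot 22 \cdots 36 \\
&\equiv 2 \cdot 4 \cdots 18 \cdot (-17) \cdot (-15) \cdots (-1) \\
&\equiv (-1)^9 \cdot 18! \bmod 37
\end{aligned}
\]

We therefore have that $2^{18} \equiv 2^{\frac{37-1}{2}} \equiv (-1)^9 \equiv -1 \bmod 37$, whence 2 is a quadratic non-residue modulo 37.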

To get a clean theorem, we simply need to do this in the abstract!

³¹ I.e. Carl Friedrich Gauss (1777–1855), arguably the most consequential mathematician in history, and a major contributor to number theory. Incidentally, Gauss was allegedly such a perfectionist that he preferred to hide all scratch work, thereby disguising the inspiration for his proofs. We take a somewhat different approach here. . .


Theorem 21.3. Let $p$ be an odd prime. Then 2 is a quadratic residue modulo $p$ if and only if $p$ is congruent to 1 or 7 modulo 8. Otherwise said,

\[
\left(\frac{2}{p}\right) =
\begin{cases}
1 & \text{if } p \equiv 1, 7 \bmod 8 \\
-1 & \text{if } p \equiv 3, 5 \bmod 8
\end{cases}
\]

Proof. Since $p$ is odd, we can define the integer $P = \frac{p-1}{2}$. Now multiply together the even remainders modulo $p$ to obtain

\[ 2 \cdot 4 \cdot 6 \cdots (p-1) = 2^P \cdot 1 \cdot 2 \cdots P = 2^P P! \tag{$*$} \]

Now consider the same product of numbers, split at $P$:

\[ \underbrace{2 \cdot 4 \cdot 6 \cdots}_{\le P} \; \underbrace{\cdots (p-5)(p-3)(p-1)}_{> P} \equiv 2 \cdot 4 \cdot 6 \cdots \cdots (-5) \cdot (-3) \cdot (-1) \]

To finish the proof, we need to make sure that the RHS has the form $(-1)^m \cdot P!$ and we need to count the number of negative signs $m$. There are two cases to consider:

• $P = 2k$ is even. Then $p = 2P + 1 = 4k + 1$ is congruent to 1 modulo 4. There are $P = 2k$ even remainders modulo $p$, whence the split is as follows:

\[
\begin{aligned}
2 \cdot 4 \cdot 6 \cdots (p-1) &\equiv \underbrace{2 \cdot 4 \cdot 6 \cdots P}_{k \text{ terms} \le P} \cdot \underbrace{(P+2) \cdots (p-5)(p-3)(p-1)}_{k \text{ terms} > P} \\
&\equiv 2 \cdot 4 \cdots P \cdot \bigl(-(P-1)\bigr) \cdots (-3) \cdot (-1) \\
&\equiv (-1)^k \cdot P! \bmod p
\end{aligned}
\]

where we used the fact that $(P+2) - p = P + 2 - 2P - 1 = -(P-1)$. Combining with $(*)$, and cancelling $P!$ (which is coprime to $p$), it follows that

\[
2^{\frac{p-1}{2}} \equiv 2^P \equiv 2^{2k} \equiv (-1)^k \equiv
\begin{cases}
1 & \text{if } k \text{ is even} \iff p \equiv 1 \bmod 8 \\
-1 & \text{if } k \text{ is odd} \iff p \equiv 5 \bmod 8
\end{cases}
\]

The case where $P = 2k + 1$ is odd is similar, so we omit the calculation: check it yourself!
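For comparison with your own working, here is a sketch of how the omitted case can be set out, following the same steps as the even case (the bookkeeping is a reconstruction rather than part of the original proof). If $P = 2k + 1$ is odd, then $p = 2P + 1 = 4k + 3$. Of the $P$ even remainders, the $k$ values $2, 4, \ldots, P - 1$ are at most $P$, and the remaining $k + 1$ values $P + 1, P + 3, \ldots, p - 1$ exceed $P$. Since $(P + 1) - p = -P$, we obtain

\[
\begin{aligned}
2^P P! &\equiv 2 \cdot 4 \cdots (P-1) \cdot (P+1)(P+3) \cdots (p-1) \\
&\equiv 2 \cdot 4 \cdots (P-1) \cdot (-P) \cdot \bigl(-(P-2)\bigr) \cdots (-1) \\
&\equiv (-1)^{k+1} \cdot P! \bmod p
\end{aligned}
\]

so that

\[
2^{\frac{p-1}{2}} \equiv (-1)^{k+1} \equiv
\begin{cases}
1 & \text{if } k \text{ is odd} \iff p \equiv 7 \bmod 8 \\
-1 & \text{if } k \text{ is even} \iff p \equiv 3 \bmod 8
\end{cases}
\]

As a check, $p = 23$ has $P = 11$ and $k = 5$, giving $(-1)^6 = 1$, in agreement with the first example above.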

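Example We could use this to see whether, say, 95 is a QR modulo 127. Observe that

\[ \left(\frac{95}{127}\right) = \left(\frac{-32}{127}\right) = \left(\frac{-1}{127}\right) \left(\frac{4^2}{127}\right) \left(\frac{2}{127}\right) = \left(\frac{-1}{127}\right) \left(\frac{2}{127}\right) \]

Since $127 = 15 \cdot 8 + 7 \equiv 7 \bmod 8$ and therefore $127 \equiv 3 \bmod 4$, we see that

\[ \left(\frac{95}{127}\right) = (-1) \cdot 1 = -1 \]

Thus 95 is a QNR modulo 127.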


22 Quadratic Reciprocity

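Suppose we are asked to decide whether $-1500$ is a QR modulo 997. We start to compute, using all our knowledge from the previous sections:

\[
\begin{aligned}
\left(\frac{-1500}{997}\right) &= \left(\frac{-1}{997}\right) \left(\frac{3}{997}\right) \left(\frac{2^2}{997}\right) \left(\frac{5^3}{997}\right) && \text{(Theorem 20.5, part 1)} \\
&= \left(\frac{-1}{997}\right) \left(\frac{3}{997}\right) \left(\frac{5}{997}\right) && \text{(Theorem 20.5, part 3)} \\
&= \left(\frac{3}{997}\right) \left(\frac{5}{997}\right) && \text{(since } 997 \equiv 1 \bmod 4 \text{; Corollary 21.1)} \\
&\equiv 3^{498} \cdot 5^{498} \bmod 997
\end{aligned}
\]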

A very nasty calculation still remains! Thankfully it can be made very easy by appealing to the following.

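Theorem 22.1 (Quadratic Reciprocity). If $p \ne q$ are odd primes, then

\[ \left(\frac{p}{q}\right) \cdot \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \]

Otherwise said, $\left(\frac{q}{p}\right) = -\left(\frac{p}{q}\right) \iff$ both $p, q \equiv 3 \bmod 4$.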

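We’ll give a proof in the next section. Meanwhile, we continue with our example:

\[ \left(\frac{997}{3}\right) = \left(\frac{1}{3}\right) = 1, \qquad \left(\frac{997}{5}\right) = \left(\frac{2}{5}\right) = -1 \]

Moreover, $997 = 996 + 1 \equiv 1 \bmod 4$ so neither application of quadratic reciprocity changes the sign of the Legendre symbols. We conclude that

\[ \left(\frac{-1500}{997}\right) = \left(\frac{3}{997}\right) \left(\frac{5}{997}\right) = \left(\frac{997}{3}\right) \left(\frac{997}{5}\right) = -1 \]

whence $-1500$ is a quadratic non-residue modulo 997.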

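Examples Here is another example where we use reciprocity repeatedly: we check whether 563 is a quadratic residue modulo 997.

\[
\begin{aligned}
\left(\frac{563}{997}\right) &= \left(\frac{997}{563}\right) = \left(\frac{-129}{563}\right) && \text{(563 is prime and } 997 \equiv 1 \bmod 4) \\
&= \left(\frac{-1}{563}\right) \left(\frac{3}{563}\right) \left(\frac{43}{563}\right) \\
&= (-1)(-1) \left(\frac{563}{3}\right) (-1) \left(\frac{563}{43}\right) && \text{(3, 43 and } 563 \equiv 3 \bmod 4) \\
&= -\left(\frac{2}{3}\right) \left(\frac{4}{43}\right) = 1
\end{aligned}
\]

Note that this calculation gives us no idea what the solutions to the congruence $x^2 \equiv 563 \bmod 997$ are;³² it merely tells us that they exist!

³² In fact the solutions are $x \equiv \pm 470 \equiv 470, 527 \bmod 997$.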


Jacobi Symbols Legendre symbols have one big weakness: you can only apply the reciprocity formula when you have two primes. This means that you have to factorize your numerators, and you might still be left with large computations of the form $a^{\frac{p-1}{2}}$. Both of these can be difficult for large numbers. Thankfully, with a small extension to the definition, this problem can be overcome, and the issue of checking whether we have a quadratic residue can be turned into a straightforward algorithm.

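Definition 22.2. Let $a$ be an integer and $n$ an odd positive integer. If $n = p_1^{\alpha_1} \cdots p_k^{\alpha_k}$ is the unique prime factorization of $n$, then the Jacobi Symbol $\left(\frac{a}{n}\right)$ is defined by

\[ \left(\frac{a}{n}\right) = \left(\frac{a}{p_1}\right)^{\alpha_1} \cdots \left(\frac{a}{p_k}\right)^{\alpha_k} \]

where each $\left(\frac{a}{p_i}\right)$ is a Legendre symbol.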

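Clearly the notions of Legendre and Jacobi symbol correspond if $n$ is an odd prime. Moreover, the ‘obvious’ facts of Theorem 20.5 translate immediately: if $a, b \in \mathbb{Z}$ and $n$ is an odd positive integer, we have:

1. $\left(\frac{ab}{n}\right) = \left(\frac{a}{n}\right) \left(\frac{b}{n}\right)$ (by definition, Jacobi symbols also satisfy $\left(\frac{a}{mn}\right) = \left(\frac{a}{m}\right) \left(\frac{a}{n}\right)$)

2. $a \equiv b \bmod n \implies \left(\frac{a}{n}\right) = \left(\frac{b}{n}\right)$

3. $\gcd(n, a) = 1 \implies \left(\frac{a^2}{n}\right) = 1$.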

One disadvantage of working modulo a composite $n$ is that the definition of a quadratic residue is more complex³³ and doesn’t strictly correspond to a Jacobi symbol being 1. We will only use Jacobi symbols to assist with computing Legendre symbols and to facilitate a famous primality test.

The final pieces of information that make Jacobi symbols easy to work with are the Reciprocity Theorems, which are also exactly the same as for Legendre symbols.

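Theorem 22.3. Let $m, n$ be odd positive integers.

(a) $\left(\frac{-1}{n}\right) = \begin{cases} 1 & \text{if } n \equiv 1 \bmod 4 \\ -1 & \text{if } n \equiv 3 \bmod 4 \end{cases}$

(b) $\left(\frac{2}{n}\right) = \begin{cases} 1 & \text{if } n \equiv 1, 7 \bmod 8 \\ -1 & \text{if } n \equiv 3, 5 \bmod 8 \end{cases}$

(c) If $\gcd(m, n) = 1$, then

\[
\left(\frac{m}{n}\right) \left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}} =
\begin{cases}
1 & \iff m \text{ or } n \equiv 1 \bmod 4 \\
-1 & \iff m \text{ and } n \equiv 3 \bmod 4
\end{cases}
\]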

³³ If $\gcd(a, n) = 1$ we say that $a$ is a QR modulo $n$ if $x^2 \equiv a \bmod n$ has a solution. This is if and only if $a$ is a quadratic residue modulo every prime dividing $n$. We can prove that $\left(\frac{a}{n}\right) = -1 \implies a$ is a NR, but the converse is false.

Proof. We just prove (c). The other proofs are straightforward, if tedious, exercises in counting how many primes dividing $m, n$ are congruent to 1 or 3 modulo 4, or to 1, 3, 5 or 7 modulo 8.

Let $m = \prod p_i$ and $n = \prod q_j$ be the prime decompositions, where there are no primes in common. Then

\[ \left(\frac{m}{n}\right) \left(\frac{n}{m}\right) = \prod_{i,j} \left(\frac{p_i}{q_j}\right) \left(\frac{q_j}{p_i}\right) = \prod_{i,j} (-1)^{\frac{p_i - 1}{2} \cdot \frac{q_j - 1}{2}} \]

Since the only way we can have negatives appearing on the RHS is if $p_i$ and $q_j$ are both congruent to 3 modulo 4, we count the number of such primes (with multiplicity) in each of $m$ and $n$. Suppose there are $a$ and $b$ such in $m$ and $n$ respectively. Then there are $ab$ factors of $-1$ on the RHS. Thus

\[
\left(\frac{m}{n}\right) \left(\frac{n}{m}\right) = (-1)^{ab} =
\begin{cases}
1 & \iff a \text{ or } b \text{ even} \iff m \text{ or } n \equiv 1 \bmod 4 \\
-1 & \iff a \text{ and } b \text{ odd} \iff m \text{ and } n \equiv 3 \bmod 4
\end{cases}
\]

as required.

The usefulness of Jacobi symbols is that one can apply the quadratic reciprocity formula without knowing or bothering to check that both terms are prime! One need only use the three rules in the theorem, together with reduction modulo the denominator, and one may compute the value of any Legendre (or Jacobi) symbol. The only factorization required is to divide out negatives and 2’s.

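Example We know that 317 is prime, and we want to check whether 246 is a quadratic residue modulo 317. We compute, indicating which part of the theorem we’re using each time.

\[
\begin{aligned}
\left(\frac{246}{317}\right) &= \left(\frac{2}{317}\right) \left(\frac{123}{317}\right) && \text{(factor out 2's)} \\
&= -\left(\frac{123}{317}\right) && \text{((b): since } 317 \equiv 5 \bmod 8) \\
&= -\left(\frac{317}{123}\right) && \text{((c): since } 317 \equiv 1 \bmod 4) \\
&= -\left(\frac{71}{123}\right) && \text{(reduce modulo 123)} \\
&= \left(\frac{123}{71}\right) && \text{((c): since 123 and } 71 \equiv 3 \bmod 4) \\
&= \left(\frac{52}{71}\right) && \text{(reduce modulo 71)} \\
&= \left(\frac{2}{71}\right)^2 \left(\frac{13}{71}\right) = \left(\frac{13}{71}\right) && \text{(factor out 2's)} \\
&= \left(\frac{71}{13}\right) && \text{((c): since } 13 \equiv 1 \bmod 4) \\
&= \left(\frac{6}{13}\right) = \left(\frac{2}{13}\right) \left(\frac{3}{13}\right) && \text{(reduce modulo 13, factor out 2's)} \\
&= -\left(\frac{3}{13}\right) && \text{((b): since } 13 \equiv 5 \bmod 8) \\
&= -\left(\frac{13}{3}\right) && \text{((c): since } 13 \equiv 1 \bmod 4) \\
&= -\left(\frac{1}{3}\right) = -1 && \text{(reduce modulo 3)}
\end{aligned}
\]

Therefore 246 is a quadratic non-residue modulo 317.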
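The procedure in this example (factor out 2's, flip using reciprocity, reduce) is mechanical enough to code. The following is a sketch of one standard way to organise it, assuming Python 3; the function name `jacobi` is ours, and negative numerators are handled by the initial reduction modulo $n$ rather than by rule (a).

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n                                 # reduce modulo the denominator
    result = 1
    while a != 0:
        while a % 2 == 0:                  # factor out 2's, rule (b)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                        # reciprocity, rule (c)
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n                             # reduce modulo the new denominator
    return result if n == 1 else 0         # 0 when gcd(a, n) > 1

print(jacobi(246, 317), jacobi(563, 997), jacobi(-1500, 997))
```

No factorization beyond dividing out 2's is ever needed, which is what makes this fast even for very large inputs.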

Primality testing

Jacobi symbols can be used to test whether a number is a prime. Recall that if $n$ is an odd prime, and $n \nmid a$ then

\[ \left(\frac{a}{n}\right) \equiv a^{\frac{n-1}{2}} \bmod n \]


If we are not sure whether $n$ is prime, we could simply compute both sides of this congruence for some $a$. Both are relatively fast for a computer: if the two are not congruent, then there is no way that $n$ is prime.

Example We test to see if $n = 3599$ is composite. It is easy to take $a = 2$. Since $8 \mid 200$ and $n = 18 \cdot 200 - 1$, it is quickly clear that

\[ n \equiv -1 \equiv 7 \bmod 8 \implies \left(\frac{2}{3599}\right) = 1 \]

Now use successive squaring to compute

\[ 2^{\frac{n-1}{2}} \equiv 2^{1799} \equiv 946 \bmod 3599 \]

Manifestly $n$ is not prime! Indeed $n = 59 \times 61$, but we did not need this information.

The above process is known as the Solovay–Strassen primality test. A value $a$ for which

\[ \left(\frac{a}{n}\right) \not\equiv a^{\frac{n-1}{2}} \bmod n \]

is known as an Euler witness for the compositeness of $n$. One witness is all you need to prove that $n$ is composite. Moreover, it is a theorem that if $n$ is composite, at least half of the remainders coprime to $n$ will be Euler witnesses, so you shouldn’t have to guess for long before you find one. Of course this is only a probabilistic test of primality. If you try 4 values of $a$ and find no Euler witnesses, then you have roughly a $\frac{1}{2^4} = \frac{1}{16}$ chance that $n$ is composite; try it 10 times without finding a witness and the probability drops to roughly $\frac{1}{1024}$. However this is no good for actually proving that a large number is prime: you would have to try half the remainders without finding a witness before this could be your conclusion!
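A compact version of the test might look as follows. This is a sketch rather than a production implementation, assuming Python 3; the names `is_euler_witness` and `probably_prime`, the default of 10 rounds, and the use of the standard `random` module to pick bases are all illustrative choices.

```python
import random

def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def is_euler_witness(a, n):
    """True if a proves that the odd number n > 2 is composite."""
    j = jacobi(a, n) % n                  # represent -1 as n - 1
    # j == 0 means a shares a factor with n; otherwise compare with a^((n-1)/2).
    return j == 0 or pow(a, (n - 1) // 2, n) != j

def probably_prime(n, rounds=10, rng=random):
    """False means certainly composite; True means no witness was found."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)       # random base with 2 <= a <= n - 2
        if is_euler_witness(a, n):
            return False
    return True

print(pow(2, 1799, 3599), is_euler_witness(2, 3599), probably_prime(997))
```

A return value of `True` only says that no witness turned up in the bases tried, in line with the probabilistic caveat above.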

23 Proof of Quadratic Reciprocity

The Quadratic Reciprocity Law has many different proofs (around 200 are claimed). Even Gauss gave at least 6 different proofs in his lifetime, the first when he was only 18 years old! The main thrust of the proof we give is to count the number of integer points lying in a hexagon. A crucial preliminary result is required though.

Definition 23.1. Let $p$ be an odd prime and define $P = \frac{p-1}{2}$. It is clear that any integer $a$ is congruent to a unique integer $r$ satisfying $-P \le r \le P$. We call $r$ the least residue of $a$ modulo $p$ and say that $a$ has negative least residue if $r < 0$.

Now define a counting function: if $p$ is an odd prime and $p \nmid a$,

\[ \mu(a, p) = \Bigl| \bigl\{ x \in \{a, 2a, 3a, \ldots, Pa\} : x \text{ has negative least residue modulo } p \bigr\} \Bigr| \]

Example First think about an example where we compute $\mu(8, 13)$.

\[ \{8k : 1 \le k \le \tfrac{13-1}{2}\} = \{8, 16, 24, 32, 40, 48\} \equiv \{-5, 3, -2, 6, 1, -4\} \]

We get a complete list of residues $(\pm 1), (\pm 2), \ldots, (\pm 6)$ where exactly one of the signs appears. With three negative signs, we see that $\mu(8, 13) = 3$.

The purpose of the function µ is to provide a general way of computing Legendre symbols.

Theorem 23.2 (Gauss’ Lemma). $\left(\frac{a}{p}\right) = (-1)^{\mu(a,p)}$

Note that this fits with our example:

\[ (-1)^{\mu(8,13)} = -1, \quad \text{and,} \quad \left(\frac{8}{13}\right) = \left(\frac{2}{13}\right) = -1 \]

since $13 \equiv 5$ modulo 8.

The proof is little more than a generalization of part of Theorem 21.3.

Proof. We first prove that the least residues of the elements of the set $\{a, 2a, 3a, \ldots, Pa\}$ comprise the remainders $(\pm 1), (\pm 2), \ldots, (\pm P)$ where exactly one of each $\pm$ remainder appears. Then $\mu(a, p)$ is the number of negative signs in this list.

Since $a$ is invertible modulo $p$ it is clear that the remainders $ax$ are distinct and non-zero for $1 \le x \le P$. Now suppose that we have computed their least residues, and that $ax \equiv -ay$ so that $ax$ and $ay$ have corresponding $\pm$ least residues. But then $a(x + y) \equiv 0 \bmod p$. Since $p \nmid a$ we see that $y \equiv -x$. But this is impossible if $1 \le x, y \le P$, since then $2 \le x + y \le p - 1$.

Now multiply together the remainders:

\[ a^{\frac{p-1}{2}} P! \equiv a \cdot 2a \cdot 3a \cdots Pa \equiv (-1)^{\mu(a,p)} P! \bmod p \]

Since $P!$ is coprime to $p$, we cancel to obtain $a^{\frac{p-1}{2}} \equiv (-1)^{\mu(a,p)} \bmod p$, which is the result by Theorem 20.6.
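Gauss' Lemma is easy to test numerically. The sketch below assumes Python 3 and that $p$ is an odd prime not dividing $a$; the helper names `least_residue`, `mu` and `legendre` are ours.

```python
def least_residue(a, p):
    """Representative r of a modulo p with -(p-1)/2 <= r <= (p-1)/2."""
    r = a % p
    return r - p if r > (p - 1) // 2 else r

def mu(a, p):
    """Number of a, 2a, ..., Pa (P = (p-1)/2) with negative least residue."""
    P = (p - 1) // 2
    return sum(1 for k in range(1, P + 1) if least_residue(k * a, p) < 0)

def legendre(a, p):
    """Legendre symbol via a^((p-1)/2) mod p, for comparison."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

print([least_residue(8 * k, 13) for k in range(1, 7)], mu(8, 13))
print(all((-1) ** mu(a, 13) == legendre(a, 13) for a in range(1, 13)))
```

The first line of output reproduces the worked example for $\mu(8, 13)$; the second checks the lemma for every non-zero remainder modulo 13.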

The quadratic reciprocity formula may now be rewritten as

\[ \left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\mu(p,q) + \mu(q,p)} = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}} \]

To complete the proof, it suffices to prove that

\[ \mu(p, q) + \mu(q, p) \equiv \frac{p-1}{2} \cdot \frac{q-1}{2} \bmod 2 \]

We do this in a rather sneaky way, by counting the number of integer points lying within a particular hexagonal region of the plane. Suppose that $p, q$ are distinct odd primes. Construct a hexagon $H$ with boundaries as shown.

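All points inside $H$ satisfy four inequalities:

\[
\begin{aligned}
&\tfrac{1}{2} < x < \tfrac{p}{2} \\
&\tfrac{1}{2} < y < \tfrac{q}{2} \\
&y < \tfrac{q}{p} x + \tfrac{1}{2} \\
&y > \tfrac{q}{p} x - \tfrac{q}{2p}
\end{aligned}
\]

The circled point has co-ordinates $\left(\frac{p+1}{4}, \frac{q+1}{4}\right)$.

[Figure: the hexagon $H$ in the $xy$-plane, bounded by the lines $x = \frac{1}{2}$, $x = \frac{p}{2}$, $y = \frac{1}{2}$, $y = \frac{q}{2}$, $y = \frac{q}{p} x + \frac{1}{2}$ and $y = \frac{q}{p} x - \frac{q}{2p}$, with corner $\left(\frac{p}{2}, \frac{q}{2}\right)$ and the circled point $\left(\frac{p+1}{4}, \frac{q+1}{4}\right)$ marked.]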


We count the number of points with positive integer co-ordinates lying inside H and determinewhether the number is even or odd. Here are two examples where we can see what is going on.

[Figure: two panels showing the lattice points inside $H$, one for $p = 17$, $q = 11$ and one for $p = 19$, $q = 11$.]

Notice that the lattice points appear to satisfy two properties:

• None lie on the boundaries of H.

• They are distributed symmetrically around the circled point $\left(\frac{p+1}{4}, \frac{q+1}{4}\right)$.

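Both of these facts are easily verifiable. If $(x, y)$ has positive integer co-ordinates then it certainly cannot lie on any of the four horizontal or vertical edges of $H$. Moreover, clearing denominators in the equations of the two slanted edges, we manifestly have

\[ 2py - 2qx \ne p \quad \text{and} \quad 2py - 2qx \ne -q \]

since the LHS in both cases is even, while $p$ and $q$ are odd.

For the symmetry of the lattice points, simply compute: if $(x, y)$ is an integer point lying in $H$, then its reflection in the circled point is

\[ \left( \frac{p+1}{2} - x, \; \frac{q+1}{2} - y \right) \]

which is easily seen to satisfy the four defining inequalities of $H$. For example,

\[
\begin{aligned}
\frac{1}{2} < x < \frac{p}{2} &\implies -\frac{p}{2} < -x < -\frac{1}{2} \\
&\implies \frac{p+1}{2} - \frac{p}{2} < \frac{p+1}{2} - x < \frac{p+1}{2} - \frac{1}{2} \\
&\implies \frac{1}{2} < \frac{p+1}{2} - x < \frac{p}{2}
\end{aligned}
\]

Similarly, the new point satisfies the third defining inequality,

\[
\begin{aligned}
\left( \frac{q+1}{2} - y \right) - \frac{q}{p} \left( \frac{p+1}{2} - x \right) &= \frac{q+1}{2} - \frac{q(p+1)}{2p} - y + \frac{q}{p} x \\
&= \frac{p(q+1) - q(p+1)}{2p} - y + \frac{q}{p} x = \frac{p - q}{2p} - y + \frac{q}{p} x \\
&< \frac{p - q}{2p} + \frac{q}{2p} = \frac{1}{2}
\end{aligned}
\]

since $(x, y)$ satisfies the fourth inequality of $H$. Now we make the crucial observation: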


The circled point is a lattice point if and only if both p ≡ 3 and q ≡ 3 modulo 4.

Given this and our observations, we see that we have proved the following.

Theorem 23.3. The number of positive integer points lying inside $H$ is odd if and only if both $p$ and $q$ are congruent to 3 modulo 4. Otherwise said:

\[ \text{Number of integer points} \equiv \frac{p-1}{2} \cdot \frac{q-1}{2} \bmod 2 \]

We now count the integer points in H in a different way.

• First observe that there are no integer points inside $H$ which lie on the diagonal $y = \frac{q}{p} x$. This is clear since $p, q$ are coprime.

• $(x, y) \in H$ lies above the diagonal if and only if

\[ \frac{q}{p} x < y < \frac{q}{p} x + \frac{1}{2} \iff 0 < py - qx < \frac{p}{2} \iff -\frac{p}{2} < qx - py < 0 \]

which is if and only if $qx$ has negative least residue modulo $p$. For a given $qx$ with negative least residue modulo $p$, there is clearly only one integer $y$ satisfying the last inequality.

We conclude that there are precisely as many lattice points lying above the diagonal in $H$ as there are elements in the set

\[ \left\{ q, 2q, 3q, \ldots, \frac{p-1}{2} q \right\} \]

with negative least residue modulo $p$. Otherwise said, there are $\mu(q, p)$ such points $(x, y)$.

• Mirroring the argument we see that there are µ(p, q) points in H lying below the diagonal.

We have therefore proved.

Theorem 23.4. The number of positive integer points lying inside H is µ(p, q) + µ(q, p).

Combining the Theorems, we see that

\[ \mu(p, q) + \mu(q, p) \equiv \frac{p-1}{2} \cdot \frac{q-1}{2} \bmod 2 \]

which completes the proof of Quadratic Reciprocity.
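Both counting theorems can be checked by brute force for small primes. The following is a sketch, assuming Python 3; the function names `mu` and `lattice_points_in_H` are ours, and the four inequalities defining $H$ are multiplied through by $2p$ so that only integer arithmetic is used.

```python
def mu(a, p):
    """Number of a, 2a, ..., Pa (P = (p-1)/2) with negative least residue mod p."""
    P = (p - 1) // 2
    return sum(1 for k in range(1, P + 1) if (k * a) % p > P)

def lattice_points_in_H(p, q):
    """Integer points (x, y) strictly inside H for distinct odd primes p, q."""
    points = []
    for x in range(1, (p - 1) // 2 + 1):            # 1/2 < x < p/2
        for y in range(1, (q - 1) // 2 + 1):        # 1/2 < y < q/2
            below_upper = 2 * p * y < 2 * q * x + p  # y < (q/p)x + 1/2
            above_lower = 2 * p * y > 2 * q * x - q  # y > (q/p)x - q/(2p)
            if below_upper and above_lower:
                points.append((x, y))
    return points

for p, q in [(17, 11), (19, 11), (3, 5), (3, 7), (7, 11)]:
    count = len(lattice_points_in_H(p, q))
    parity = (((p - 1) // 2) * ((q - 1) // 2)) % 2
    print(p, q, count, mu(p, q) + mu(q, p), count % 2 == parity)
```

For each pair the count should agree with $\mu(p, q) + \mu(q, p)$ and have the parity predicted by Theorem 23.3.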
