
PartII Number Theory

zc231

This is based on the lecture notes given by Dr. T. A. Fisher, with some other topics in number theory (possibly not covered in the lectures). Some of the theorems here are non-examinable; I include them in case someone is interested. Solutions to the exercises are in a separate file.

Contents

1 Division
    1.1 Division Algorithm
    1.2 Greatest common divisor
    1.3 Euclidean Algorithm
    1.4 Fundamental theorem of arithmetic
    1.5 Exercises

2 Arithmetical functions
    2.1 Binomial coefficients
    2.2 The function τ(n) and Multiplicative functions
    2.3 Euler’s (totient) function φ(n)
    2.4 The function σ(n)
    2.5 The Möbius function µ(n)
    2.6 Estimate
    2.7 Dirichlet series
    2.8 Exercises

3 Congruences
    3.1 Definition
    3.2 Chinese remainder theorem
    3.3 Wilson’s theorem
    3.4 Lagrange’s theorem
    3.5 Primitive roots
    3.6 Chevalley’s theorem
    3.7 Exercises

4 Quadratic Residues
    4.1 Legendre’s symbol
    4.2 Euler’s criterion
    4.3 Gauss’s lemma
    4.4 Law of quadratic reciprocity
    4.5 Jacobi’s symbol
    4.6 Hensel’s lemma
    4.7 Exercises

5 Binary quadratic forms
    5.1 Sum of two squares
    5.2 Definition and equivalence
    5.3 Reduction
    5.4 Representations by binary quadratic forms
    5.5 Exercises

6 Distribution of primes
    6.1 The sum $\sum_p p^{-1}$ and the product $\prod_p (1 - p^{-1})^{-1}$
    6.2 Legendre’s formula
    6.3 Bertrand’s postulate
    6.4 Partial summation formula
    6.5 Merten’s results
    6.6 Riemann zeta function
    6.7 Gamma function and the functional equation
    6.8 Riemann Hypothesis
    6.9 Bernoulli numbers
    6.10 Exercises

7 Continued fraction
    7.1 Dirichlet’s theorem
    7.2 Convergents
    7.3 Period
    7.4 Pell’s equation
    7.5 A set of real numbers modulo 1
    7.6 Exercises

8 Primality testing and factoring
    8.1 Fermat pseudoprime
    8.2 Euler pseudoprime
    8.3 Strong pseudoprime
    8.4 Fermat factorisation
    8.5 Factor bases
    8.6 The Continued fraction method
    8.7 Pollard’s p − 1 method
    8.8 Exercises

1 Division

1.1 Division Algorithm

Definition 1.1. Let a, b ∈ Z. We say b divides a (written b|a) if there exists an integer c such that a = bc. Then b is called a factor (or divisor) of a.

Lemma 1.2. For all a, b ∈ Z with b > 0, there exist q, r ∈ Z such that a = bq + r and 0 ≤ r < b.

Proof. Let S = {a − nb : n ∈ Z}. If 0 ∈ S then there exists q such that a − bq = 0, and we take r = 0. If 0 ∉ S, then clearly S contains some positive integer; let r be the smallest positive integer in S, say r = a − qb. For any x ∈ S we have x − b ∈ S. If r = b then 0 = r − b ∈ S, contradicting 0 ∉ S. If r > b then r − b is a positive integer in S with r − b < r, contradicting the minimality of r. Hence 0 < r < b.

1.2 Greatest common divisor

Definition 1.3. Let a1, . . . , an ∈ Z. The greatest common divisor of a1, . . . , an (written (a1, . . . , an)) is a positive integer d such that d|ai for all i, and every common divisor of a1, . . . , an also divides d. The integers a1, . . . , an are called coprime if (a1, . . . , an) = 1.

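Lemma 1.4. Given $a_1, \dots, a_n \in \mathbb{Z}$, not all zero, let $I = \{\sum_i b_i a_i : b_i \in \mathbb{Z}\}$. Then $I = d\mathbb{Z}$ for some positive integer $d$.

Proof. Let $d$ be the smallest positive integer in $I$ (one exists because the $a_i$ are not all zero). Then clearly $d\mathbb{Z} \subset I$. Let $a \in I$; by Lemma 1.2 there exist $q$ and $r$ such that $a = qd + r$ where $0 \le r < d$. Then $r = a - qd \in I$, and since $d$ is the smallest positive integer in $I$ we must have $r = 0$. Therefore $d \mid a$ and so $I \subset d\mathbb{Z}$.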

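Corollary 1.5. Let $a_1, \dots, a_n \in \mathbb{Z}$, not all zero. Then the greatest common divisor of $a_1, \dots, a_n$ is the positive integer $d$ such that

\[ d\mathbb{Z} = \Big\{ \sum_i b_i a_i : b_i \in \mathbb{Z} \Big\}. \]

Proof. Let $I = \{\sum_i b_i a_i : b_i \in \mathbb{Z}\}$, so $I = d\mathbb{Z}$ for some positive integer $d$ by the previous lemma. Since $a_i \in I$ for all $i$, we have $d \mid a_i$ for all $i$. Let $d'$ be a common factor of $a_1, \dots, a_n$; then $a_i \in d'\mathbb{Z}$ for all $i$ and so $d\mathbb{Z} = I \subset d'\mathbb{Z}$. Therefore $d' \mid d$.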

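Corollary 1.6. Let $a_1, \dots, a_n \in \mathbb{Z}$, not all zero, and let $c \in \mathbb{Z}$. Then there exist $b_1, \dots, b_n \in \mathbb{Z}$ such that $\sum_i a_i b_i = c$ if and only if $d \mid c$, where $d$ is the greatest common divisor of $a_1, \dots, a_n$.

Proof. By the previous corollary, $d$ is the positive integer such that

\[ d\mathbb{Z} = \Big\{ \sum_i a_i b_i : b_i \in \mathbb{Z} \Big\}. \]

So there exist integers $b_1, \dots, b_n$ such that $\sum_i a_i b_i = d$, and so, if $d \mid c$,

\[ \sum_i a_i \Big( b_i \frac{c}{d} \Big) = c. \]

Conversely, if there exist $b_1, \dots, b_n \in \mathbb{Z}$ such that $\sum_i a_i b_i = c$, then $c \in \{\sum_i a_i b_i : b_i \in \mathbb{Z}\} = d\mathbb{Z}$ and so $d \mid c$.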

Corollary 1.7. If (a, b) = 1 then there exist x, y ∈ Z such that ax + by = 1.

1.3 Euclidean Algorithm

We now give an algorithm to compute the greatest common divisor of two positive integers $a, b$. Since $(a, b, c) = ((a, b), c)$, it can also be used to compute the greatest common divisor of $a_1, \dots, a_n$.

If $a = b$ then $(a, b) = a$, so we may assume $a > b > 0$. Let $r_0 = a$ and $r_1 = b$. For each $i \ge 1$, use Lemma 1.2 to take integers $q_i, r_{i+1}$ such that

\[ r_{i-1} = r_i q_i + r_{i+1}, \quad 0 \le r_{i+1} < r_i, \]

until $r_{k+1} = 0$ for some $k$. This must happen because the remainders are strictly decreasing. So we have

\[ r_0 = r_1 q_1 + r_2, \quad 0 < r_2 < r_1; \]

\[ r_1 = r_2 q_2 + r_3, \quad 0 < r_3 < r_2; \]

\[ \cdots \]

\[ r_{k-2} = r_{k-1} q_{k-1} + r_k, \quad 0 < r_k < r_{k-1}; \]

\[ r_{k-1} = r_k q_k. \]

We claim that $r_k = (a, b)$. Indeed, each equation shows that $r_{i-1}, r_i$ have the same common divisors as $r_i, r_{i+1}$, so

\[ (a, b) = (r_0, r_1) = (r_1, r_2) = \cdots = (r_{k-1}, r_k) = r_k. \]

Moreover, by Corollary 1.6 there exist $x, y$ such that $ax + by = d$ where $d = (a, b)$. The Euclidean algorithm gives a way to compute the integers $x, y$. We simply work backwards:

\[ d = r_k = r_{k-2} - r_{k-1} q_{k-1} = r_{k-2} - q_{k-1}(r_{k-3} - r_{k-2} q_{k-2}) = \cdots. \]

Example 1.8. Compute x and y such that 20x+ 12y = 4. We have

20 = 12 + 8, 12 = 8 + 4, 8 = 4 · 2

and so 4 = 12− 8 = 12− (20− 12) = 2 · 12− 20.
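The back-substitution in Example 1.8 can also be carried out forwards, updating the coefficients at each division step. The following is a minimal Python sketch of that idea; the function name `extended_gcd` is illustrative and does not come from the notes, and non-negative inputs that are not both zero are assumed.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y = d.

    Assumes a, b >= 0 and not both zero.
    """
    # Invariants: old_r = a*old_x + b*old_y and r = a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r  # the quotient q_i of the current division step
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


d, x, y = extended_gcd(20, 12)
print(d, x, y)  # 4 -1 2, i.e. 4 = 2 * 12 - 20 as in Example 1.8
```

Keeping the two invariants in step with the remainders avoids storing the whole chain of quotients before working backwards.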

1.4 Fundamental theorem of arithmetic

Definition 1.9. An integer n > 1 is called a prime if whenever a positive integer b divides n, b = 1 or b = n. Otherwise n is called composite.

Lemma 1.10. Let p be a prime. If p|ab then p|a or p|b.

Proof. If p|ab and p ∤ a, then (p, a) = 1. By Corollary 1.7, there exist x, y such that px + ay = 1. Therefore, pxb + aby = b. Since p|ab and p|pxb, we get p|b.

Similarly,

Lemma 1.11. Let n be a positive integer. If n|ab then (n, a) > 1 or n|b.

Proof. If (n, a) = 1, then there exist x, y such that nx + ay = 1. Therefore, nxb + aby = b. Since n|nxb and n|ab, we get n|b.

Theorem 1.12 (Fundamental Theorem of Arithmetic). Every integer n > 1 can be written as a product of primes. The representation is unique up to order.

Proof. Existence: if $n$ is a prime, then we are done. Otherwise there exists a factor $m$ of $n$ such that $1 < m < n$. Let $m$ be the smallest such factor. Then $m$ must be a prime, since any factor of $m$ is also a factor of $n$. Now repeat this with $n/m$. For uniqueness, suppose

\[ n = p_1 \cdots p_r = q_1 \cdots q_k \]

where the $p_i$ and $q_j$ are primes. Since $p_1 \mid n$, we have $p_1 \mid q_1 \cdots q_k$. Applying Lemma 1.10 repeatedly we conclude $p_1 \mid q_i$ for some $i$. We may relabel $q_1, \dots, q_k$ so that $p_1 \mid q_1$, and since $q_1$ is prime this forces $p_1 = q_1$. Then we have

\[ \frac{n}{p_1} = p_2 \cdots p_r = q_2 \cdots q_k. \]

Repeating the above, we conclude $r = k$ and $p_i = q_i$ after relabelling $q_1, \dots, q_k$.

Definition 1.13. Let $a_1, \dots, a_n$ be positive integers. A positive integer $m$ is a common multiple of $a_1, \dots, a_n$ if $a_i \mid m$ for all $i$. It is called the least common multiple of $a_1, \dots, a_n$ (written $\{a_1, \dots, a_n\}$) if $m$ is a common multiple of $a_1, \dots, a_n$ and $m \mid m'$ for any other common multiple $m'$ of $a_1, \dots, a_n$.

Remark 1.14. Let $a = \prod_{i=1}^r p_i^{a_i}$ and $b = \prod_{i=1}^r p_i^{b_i}$ where $p_1, \dots, p_r$ are distinct primes and $a_i, b_i \ge 0$. Then $(a, b) = \prod_{i=1}^r p_i^{c_i}$ and $\{a, b\} = \prod_{i=1}^r p_i^{d_i}$ where $c_i = \min\{a_i, b_i\}$ and $d_i = \max\{a_i, b_i\}$. Note that $a_i + b_i = c_i + d_i$, so

\[ (a, b)\{a, b\} = ab. \]

Therefore we can compute $\{a, b\}$ by computing $(a, b)$.

One of the important topics in number theory is to study the distribution of prime numbers.

Definition 1.15. The function π(x) is defined to be the number of primes not exceeding x, that is,

π(x) = |{p : p ≤ x, p is a prime}|.

Theorem 1.16 (Euclid). There are infinitely many primes. In other words, π(x)→∞ as x→∞.

Proof. Suppose there are only finitely many primes $p_1, \dots, p_n$. Let $N = \prod_{i=1}^n p_i + 1$ and let $q$ be a prime factor of $N$. Since $N \equiv 1 \bmod p_i$, we have $p_i \nmid N$ for all $i$. So $q \ne p_i$ for all $i$, and this gives a contradiction.

Definition 1.17. A prime $p$ is called a Mersenne prime if $p = 2^n - 1$ for some $n$.

The largest known primes are Mersenne primes.

1.5 Exercises

1. (i) Find integers $x, y$ such that $6x + 10y = 2$. (ii) Find integers $x, y, z$ such that $6x + 10y + 15z = 1$.

2. For each $n > 1$, let $S_n = \sum_{i=2}^n \frac{1}{i}$. Show that $S_n$ is not an integer.

3. Let $n$ be a positive integer. Prove that $n$ can be written as a sum of (at least two) consecutive positive integers if and only if $n$ is not a power of 2, that is, $n \ne 2^k$ for every $k \ge 0$.

4. Prove that if $g_1, g_2, \dots$ are integers $> 1$, then every natural number can be expressed uniquely in the form

\[ a_0 + a_1 g_1 + a_2 g_1 g_2 + \cdots + a_k g_1 g_2 \cdots g_k \]

where the $a_j$ are integers satisfying $0 \le a_j < g_{j+1}$. In particular, if $g_i = g$ for all $i$ then we recover the expansion in base $g$.

5. Given $a \ge b \ge 1$, let $\lambda(a, b)$ be the number of steps needed to find $(a, b)$ by the Euclidean algorithm. Show that

\[ \lambda(a, b) \le 2\left[\frac{\log b}{\log 2}\right]. \]

6. (i) By considering the factorisation of the form $n = a^2 b$ where $b$ is square free, show that $\pi(x) \ge \frac{\log x}{2 \log 2}$ for $x \ge 2$. (ii) By using the Fundamental theorem of arithmetic, show that in fact

\[ \pi(x) \ge \frac{\log x}{2 \log\log x} \]

for $x$ large enough.

7. Show that there exist infinitely many primes of the form 4n+ 3.


2 Arithmetical functions

2.1 Binomial coefficients

Definition 2.1. Let x be a real number. [x] = max{n ∈ Z : n ≤ x}. The fractional part of x, written{x} is defined to be x− [x].

For each $n$, we use the notation $n! = \prod_{i=1}^n i$.

Lemma 2.2. Let $p$ be a prime and $n$ be a positive integer. The largest $l$ such that $p^l$ divides $n!$ can be expressed as

\[ l = \sum_{k=1}^{\infty} [n/p^k]. \]

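Proof. For each $k \ge 0$, let $a_k$ be the size of the set

\[ \{m \in \mathbb{Z} : 1 \le m \le n,\ p^k \mid m,\ p^{k+1} \nmid m\}. \]

Each number in the above set contributes exactly $k$ factors of $p$ to $n!$, so $l = \sum_{k=0}^{\infty} k a_k$. But $a_k = [n/p^k] - [n/p^{k+1}]$. So

\[ l = \sum_{k=0}^{\infty} k\big([n/p^k] - [n/p^{k+1}]\big) = \sum_{k=0}^{\infty} (k+1)[n/p^{k+1}] - \sum_{k=0}^{\infty} k[n/p^{k+1}] = \sum_{k=0}^{\infty} [n/p^{k+1}] = \sum_{k=1}^{\infty} [n/p^k]. \]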
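As a quick numerical illustration of Lemma 2.2, the sum $\sum_{k \ge 1} [n/p^k]$ can be compared with the exponent obtained by dividing $n!$ by $p$ repeatedly. This is only a sketch; the helper names `legendre_exponent` and `exponent_by_division` are made up for the example.

```python
from math import factorial


def legendre_exponent(n: int, p: int) -> int:
    """Largest l with p**l dividing n!, computed as sum over k >= 1 of floor(n / p**k)."""
    total, power = 0, p
    while power <= n:  # terms with p**k > n are zero, so the sum is finite
        total += n // power
        power *= p
    return total


def exponent_by_division(m: int, p: int) -> int:
    """Largest l with p**l dividing m, found by repeated division (cross-check)."""
    l = 0
    while m % p == 0:
        m //= p
        l += 1
    return l


print(legendre_exponent(100, 5))                 # 24 = 20 + 4
print(exponent_by_division(factorial(100), 5))   # 24
```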

Definition 2.3. Given two integers $m \ge n > 0$, the binomial coefficient $\binom{m}{n}$ is defined to be

\[ \binom{m}{n} = \frac{m!}{n!(m-n)!}. \]

Proposition 2.4. For all $m \ge n > 0$, $\binom{m}{n}$ is an integer.

Proof. Let $p$ be a prime number. If $p \mid n!(m-n)!$ then $p \mid m!$ because $p \le m$. By the previous lemma, if $l_1, l_2, l_3$ are the largest integers such that $p^{l_1} \mid n!$, $p^{l_2} \mid (m-n)!$ and $p^{l_3} \mid m!$, then

\[ l_1 = \sum_{k=1}^{\infty} [n/p^k], \quad l_2 = \sum_{k=1}^{\infty} [(m-n)/p^k], \quad l_3 = \sum_{k=1}^{\infty} [m/p^k]. \]

For each $k \ge 1$, we have

\[ [m/p^k] \ge [n/p^k] + [(m-n)/p^k] \]

(because $[x + y] \ge [x] + [y]$ for all real $x, y$) and so $l_3 \ge l_1 + l_2$. This shows that if $p^k \mid n!(m-n)!$ then $p^k \mid m!$ for any prime $p$, and so by the Fundamental Theorem of Arithmetic, $\binom{m}{n}$ is an integer.

2.2 The function τ(n) and Multiplicative functions

Definition 2.5. A real function $f$ defined on the positive integers is said to be multiplicative if $f(mn) = f(m)f(n)$ for all $m, n$ with $(m, n) = 1$. It is said to be completely multiplicative if $f(mn) = f(m)f(n)$ for all positive integers $m, n$. In particular, if $f$ is multiplicative and $n = \prod_i p_i^{a_i}$ with distinct primes $p_i$, then

\[ f(n) = \prod_i f(p_i^{a_i}). \]

Definition 2.6. The function $\tau(n)$ is the number of all positive factors of $n$. That is,

\[ \tau(n) = \sum_{d \mid n} 1. \]

Lemma 2.7. For all $m, n$ with $(m, n) = 1$, there is a bijection

\[ \{(d_1, d_2) \in \mathbb{N}^2 : d_1 \mid m,\ d_2 \mid n\} \to \{d \in \mathbb{N} : d \mid mn\}, \quad (d_1, d_2) \mapsto d_1 d_2. \]

In particular, $\tau$ is multiplicative.

Proof. Indeed, $d_1 d_2 \mid mn$ so the map is well-defined. It is injective: suppose $d_1 d_2 = d_3 d_4$ where $d_1, d_3 \mid m$ and $d_2, d_4 \mid n$. Then $d_1 \mid d_3 d_4$. But $(d_1, d_4) = 1$ so $d_1 \mid d_3$ by Lemma 1.11. Similarly, $d_3 \mid d_1$ and so $d_1 = d_3$ and $d_2 = d_4$. It is surjective: by the Fundamental Theorem of Arithmetic we have $m = \prod_i p_i^{a_i}$ and $n = \prod_j q_j^{b_j}$ where $p_i \ne q_j$ for all $i, j$. Then $mn = \prod_i p_i^{a_i} \prod_j q_j^{b_j}$. For any $d \mid mn$ we have $d = \prod_i p_i^{a_i'} \prod_j q_j^{b_j'}$ where $a_i' \le a_i$ and $b_j' \le b_j$. Let $d_1 = \prod_i p_i^{a_i'}$ and $d_2 = \prod_j q_j^{b_j'}$; then $d_1 \mid m$, $d_2 \mid n$ and $d = d_1 d_2$. So the map is a bijection.

By comparing the sizes of the two sets we conclude that $\tau(mn) = \tau(m)\tau(n)$.

Corollary 2.8. Let $n = \prod_i p_i^{a_i}$. Then

\[ \tau(n) = \prod_i (a_i + 1). \]

Proof. Since $\tau$ is multiplicative, it suffices to show that $\tau(p^a) = a + 1$ for every prime $p$ and every $a \ge 0$. Indeed, the factors of $p^a$ are $p^i$, $i = 0, 1, \dots, a$.
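The product formula for $\tau$ is easy to test numerically. The sketch below factorises $n$ by trial division (a simple choice made for this example, not a method from the notes) and compares the formula with a direct divisor count; `factorise`, `tau_formula` and `tau_brute` are illustrative names.

```python
def factorise(n: int) -> dict[int, int]:
    """Prime factorisation of n >= 1 by trial division, as {prime: exponent}."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever is left is a single prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def tau_formula(n: int) -> int:
    """tau(n) as the product of (a_i + 1) over the prime factorisation of n."""
    result = 1
    for exponent in factorise(n).values():
        result *= exponent + 1
    return result


def tau_brute(n: int) -> int:
    """tau(n) by counting the divisors of n directly."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


print(tau_formula(360), tau_brute(360))  # 24 24, since 360 = 2^3 * 3^2 * 5
```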

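Lemma 2.9. If $f$ is multiplicative and $g(n) = \sum_{d \mid n} f(d)$, then the function $g$ is also multiplicative.

Proof. By Lemma 2.7, we have a bijection

\[ \{(d_1, d_2) \in \mathbb{N}^2 : d_1 \mid m,\ d_2 \mid n\} \to \{d \in \mathbb{N} : d \mid mn\}, \quad (d_1, d_2) \mapsto d_1 d_2 \]

for any $(m, n) = 1$. Moreover $(d_1, d_2) = 1$ whenever $d_1 \mid m$ and $d_2 \mid n$, so $f(d_1 d_2) = f(d_1) f(d_2)$. Therefore,

\[ g(mn) = \sum_{d \mid mn} f(d) = \sum_{d_1 \mid m,\ d_2 \mid n} f(d_1 d_2) = \sum_{d_1 \mid m} f(d_1) \sum_{d_2 \mid n} f(d_2) = g(m) g(n). \]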

2.3 Euler’s (totient) function φ(n)

Definition 2.10. The Euler’s (totient) function φ(n) is defined to be the size of the set

{m ∈ N : m ≤ n, (m, n) = 1}.

By convention, φ(1) = 1.

Lemma 2.11. Let $m, n$ be coprime positive integers. There is a bijection

\[ \{d \in \mathbb{N} : d \le mn,\ (d, mn) = 1\} \to \{(d_1, d_2) \in \mathbb{N}^2 : d_1 \le m,\ (d_1, m) = 1,\ d_2 \le n,\ (d_2, n) = 1\} \]

by $d \mapsto (d_1, d_2)$ where $0 \le d_1 < m$, $0 \le d_2 < n$ are such that

\[ d = q_1 m + d_1, \quad d = q_2 n + d_2 \]

for some integers $q_1, q_2$. In particular, $\varphi$ is multiplicative.

Proof. Since $(d, mn) = 1$ and

\[ d = q_1 m + d_1, \quad d = q_2 n + d_2, \]

we conclude that $(d_1, m) = 1$ and $(d_2, n) = 1$. So the map is well-defined. It is injective: suppose $d = q_1 m + d_1$, $d = q_2 n + d_2$ and $d' = q_3 m + d_1$, $d' = q_4 n + d_2$. Then

\[ m \mid (q_1 - q_3) m = d - d' \quad \text{and} \quad n \mid (q_2 - q_4) n = d - d'. \]

Since $(m, n) = 1$ we get $mn \mid d - d'$, and so $d - d' = 0$ because $0 < d, d' \le mn$.

It is surjective: given $d_1, d_2$ with $d_1 \le m$, $(d_1, m) = 1$ and $d_2 \le n$, $(d_2, n) = 1$, let $d = d_1 n y + d_2 m x + z m n$ where $mx + ny = 1$ and $z$ is an integer chosen such that $0 < d \le mn$. Writing $ny = 1 - mx$ we have $d = d_1 + (d_2 - d_1) m x + z m n$ and so

\[ d = ((d_2 - d_1) x + z n) m + d_1, \]

and similarly by writing $mx = 1 - ny$ we have

\[ d = ((d_1 - d_2) y + z m) n + d_2. \]

So $d \mapsto (d_1, d_2)$. Finally,

\[ (d, m) = (d_1 n y, m) = (d_1 - d_1 m x, m) = (d_1, m) = 1 \]

and

\[ (d, n) = (d_2 m x, n) = (d_2 - d_2 n y, n) = (d_2, n) = 1, \]

so $(d, mn) = 1$. So $d$ is the preimage of $(d_1, d_2)$. By comparing the sizes of the two sets we conclude that $\varphi$ is multiplicative.


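Corollary 2.12. Let $n = \prod_i p_i^{a_i}$. Then

\[ \varphi(n) = \prod_i p_i^{a_i - 1}(p_i - 1). \]

Proof. Since $\varphi$ is multiplicative, it suffices to show that $\varphi(p^a) = p^{a-1}(p - 1)$ for any prime $p$ and any $a \ge 1$. Indeed, for any $m$, $(m, p^a) = 1$ if and only if $(m, p) = 1$. Since the number of positive integers not exceeding $p^a$ which are divisible by $p$ is $p^{a-1}$,

\[ \varphi(p^a) = p^a - p^{a-1} = p^{a-1}(p - 1). \]

Note that $\varphi(n)$ is even for all $n > 2$. Indeed, $(a, n) = 1$ if and only if $(n - a, n) = 1$, and $a \ne n - a$ whenever $(a, n) = 1$ and $n > 2$, so the integers counted by $\varphi(n)$ pair off.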

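Corollary 2.13. We have the identity

\[ \sum_{d \mid n} \varphi(d) = n. \]

Proof. Since $\varphi$ is multiplicative, $\sum_{d \mid n} \varphi(d)$ is also multiplicative by Lemma 2.9, and so it suffices to show that

\[ \sum_{d \mid p^a} \varphi(d) = p^a \]

for any prime $p$. Indeed,

\[ \sum_{d \mid p^a} \varphi(d) = \sum_{i=0}^{a} \varphi(p^i) = 1 + \sum_{i=1}^{a} (p^i - p^{i-1}) = p^a. \]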
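The statements about $\varphi$ in this section can be checked by brute force for small $n$. The sketch below counts coprime residues directly from Definition 2.10; `phi` and `divisors` are illustrative helper names, and the direct count is chosen for clarity rather than speed.

```python
from math import gcd


def phi(n: int) -> int:
    """Euler's totient by direct counting: |{m : 1 <= m <= n, gcd(m, n) = 1}|."""
    return sum(1 for m in range(1, n + 1) if gcd(m, n) == 1)


def divisors(n: int) -> list[int]:
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


# phi(p^a) = p^(a-1) (p - 1), here with p = 3, a = 3.
print(phi(27), 3**2 * (3 - 1))              # 18 18
# Multiplicativity on a coprime pair: phi(72) = phi(8) * phi(9).
print(phi(72), phi(8) * phi(9))             # 24 24
# The divisor-sum identity for n = 12.
print(sum(phi(d) for d in divisors(12)))    # 12
```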

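Remark 2.14. One can show the above identity directly by considering, for each $d \mid n$, the bijection

\[ \{c \in \mathbb{N} : c \le d,\ (c, d) = 1\} \to \{c \in \mathbb{N} : c \le n,\ (c, n) = \tfrac{n}{d}\}, \quad c \mapsto c \cdot \frac{n}{d}, \]

and $\sum_{d \mid n} \varphi(d) = \sum_{d \mid n} \varphi(n/d)$.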

2.4 The function σ(n)

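Definition 2.15. The function $\sigma(n)$ is defined to be the sum of all positive factors of $n$. In other words, $\sigma(n) = \sum_{d \mid n} d$.

Lemma 2.16. $\sigma(n)$ is multiplicative.

Proof. By Lemma 2.7, for all $(m, n) = 1$ we have

\[ \sigma(mn) = \sum_{d \mid mn} d = \sum_{d_1 \mid m,\ d_2 \mid n} d_1 d_2 = \sum_{d_1 \mid m} d_1 \sum_{d_2 \mid n} d_2 = \sigma(m)\sigma(n). \]

Corollary 2.17. Let $n = \prod_i p_i^{a_i}$. Then

\[ \sigma(n) = \prod_i (p_i^{a_i + 1} - 1)/(p_i - 1). \]

Proof. It suffices to show that $\sigma(p^a) = (p^{a+1} - 1)/(p - 1)$ for all primes $p$. Indeed,

\[ \sigma(p^a) = \sum_{i=0}^{a} p^i = (p^{a+1} - 1)/(p - 1). \]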

2.5 The Möbius function µ(n)

Definition 2.18. The Möbius function $\mu(n)$ is defined to be 1 if $n = 1$, 0 if $n$ is divisible by $p^2$ for some prime $p$, and $(-1)^k$ if $n = p_1 \cdots p_k$ where $p_1, \dots, p_k$ are distinct primes.

Lemma 2.19. $\mu(n)$ is multiplicative.

Proof. Let $(m, n) = 1$. If one of $m, n$ is divisible by $p^2$ for some prime $p$, so is $mn$, and then $\mu(mn) = 0 = \mu(m)\mu(n)$. If $m, n$ are both square free (i.e. not divisible by $p^2$ for any prime $p$), with $k_1$ and $k_2$ prime factors respectively, then $mn$ is also square free with $k_1 + k_2$ prime factors. Since $(-1)^{k_1 + k_2} = (-1)^{k_1}(-1)^{k_2}$, $\mu$ is multiplicative.

Corollary 2.20. Let $\nu(n) = \sum_{d \mid n} \mu(d)$. Then $\nu$ is multiplicative, $\nu(1) = 1$ and $\nu(n) = 0$ for all $n \ge 2$.

Proof. By the previous lemma and Lemma 2.9, $\nu$ is multiplicative. If $n = 1$ then $\nu(1) = \mu(1) = 1$. For any prime $p$ and any $a \ge 1$,

\[ \nu(p^a) = \sum_{d \mid p^a} \mu(d) = 1 - 1 + 0 + 0 + \cdots = 0. \]

Therefore $\nu(n) = 0$ for all $n \ge 2$.

Definition 2.21. Let $f, g$ be real functions defined on positive integers. The convolution of $f$ and $g$, written $f * g$, is defined to be

\[ (f * g)(n) = \sum_{d \mid n} f(d) g(n/d). \]

By convention, the function $1(n)$ is defined to be $1(n) = 1$ for all $n$. Therefore, $\sum_{d \mid n} f(d)$ can be written as $(f * 1)(n)$.

Lemma 2.22. f ∗ g = g ∗ f for any f and g. Further, (f ∗ g) ∗ h = f ∗ (g ∗ h) for any f, g and h.

Proof. For any $f$ and $g$,

\[ (f * g)(n) = \sum_{d \mid n} f(d) g(n/d) = \sum_{d \mid n} f(n/d) g(d) = (g * f)(n). \]

For any $f, g$ and $h$, we have

\[ ((f * g) * h)(n) = \sum_{d \mid n} (f * g)(d)\, h(n/d) = \sum_{d \mid n} \sum_{e \mid d} f(e) g(d/e) h(n/d) \]

\[ = \sum_{dd' = n} \sum_{ee' = d} f(e) g(e') h(d') \]

\[ = \sum_{ee'd' = n} f(e) g(e') h(d'). \]

Since $f * (g * h) = (g * h) * f$, by symmetry

\[ ((g * h) * f)(n) = \sum_{ee'd' = n} f(e) g(e') h(d') \]

and so $(f * g) * h = f * (g * h)$.

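Theorem 2.23 (Möbius Inversion Formula). Let $f$ be any real function defined on positive integers (not necessarily multiplicative), and let $g = f * 1$. Then $f = g * \mu$. Conversely, if $f = g * \mu$ then $g = f * 1$.

Proof. We have

\[ (g * \mu)(n) = \sum_{d \mid n} g(n/d) \mu(d) = \sum_{d \mid n} \sum_{e \mid n/d} f(e) \mu(d) \]

\[ = \sum_{dd' = n} \sum_{e \mid d'} f(e) \mu(d) \]

\[ = \sum_{dee' = n} f(e) \mu(d) \]

\[ = \sum_{e \mid n} f(e) \sum_{d \mid n/e} \mu(d) = \sum_{e \mid n} f(e) \nu(n/e) = f(n), \]

where the last step uses Corollary 2.20: $\nu(n/e) = 0$ unless $e = n$. Conversely, if $f = g * \mu$ then

\[ (f * 1)(n) = \sum_{d \mid n} f(d) = \sum_{d \mid n} \sum_{e \mid d} g(e) \mu(d/e) \]

\[ = \sum_{dee' = n} g(e) \mu(e') \]

\[ = \sum_{e \mid n} g(e) \nu(n/e) = g(n). \]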
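To see the inversion formula at work, one can implement $\mu$ and Dirichlet convolution directly and verify $f = (f * 1) * \mu$ for an arbitrary test function. This is a small sketch: the names `mobius`, `convolve` and `one`, and the test function $f(n) = n^2 + 1$, are choices made for the example only.

```python
def divisors(n: int) -> list[int]:
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n: int) -> int:
    """mu(n): 0 if a square p^2 divides n, otherwise (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result


def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum over d | n of f(d) g(n / d)."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def one(n: int) -> int:
    """The constant function 1(n) = 1."""
    return 1


def f(n: int) -> int:
    """An arbitrary, non-multiplicative test function."""
    return n * n + 1


g = convolve(f, one)             # g = f * 1
recovered = convolve(g, mobius)  # g * mu, which should equal f
print([mobius(n) for n in range(1, 11)])  # [1, -1, -1, 0, -1, 1, -1, 0, 0, 1]
print(all(recovered(n) == f(n) for n in range(1, 100)))  # True
```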

The following is the multiplicative version of Möbius inversion.

Corollary 2.24. If $F(n) = \prod_{d \mid n} f(d)$, then $f(n) = \prod_{d \mid n} F(d)^{\mu(n/d)}$.

Proof. Let $G(n) = \log F(n)$ and $g(n) = \log f(n)$. Then

\[ G(n) = \sum_{d \mid n} g(d) \]

and so by the Möbius inversion formula

\[ g(n) = \sum_{d \mid n} G(d) \mu(n/d). \]

Therefore,

\[ f(n) = \exp g(n) = \prod_{d \mid n} F(d)^{\mu(n/d)}. \]

2.6 Estimate

Often we want to estimate $\sum_{n \le x} f(n)$, where $x \in \mathbb{R}$.

Definition 2.25. For any real function $f(x)$, $g(x) = O(f(x))$ means that there exist positive constants $c, M$ such that

\[ |g(x)| \le c|f(x)| \quad \text{for all } x \ge M. \]

In other words, $|g(x)/f(x)|$ is bounded for large $x$. The little o notation $h(x) = o(f(x))$ means that $h(x)/f(x) \to 0$ as $x \to \infty$. The asymptotic notation $r(x) \sim f(x)$ means that $r(x)/f(x) \to 1$ as $x \to \infty$.

We shall give several examples.

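Proposition 2.26. $\sum_{n \le x} \tau(n) = x \log x + O(x)$.

Proof. We have

\[ \sum_{n \le x} \tau(n) = \sum_{n \le x} \sum_{d \mid n} 1 = \sum_{dm \le x} 1 = \sum_{d \le x} [x/d] = \sum_{d \le x} x/d + O(x). \]

But $\sum_{d \le x} 1/d = \log x + O(1)$ by integral approximation, so

\[ \sum_{n \le x} \tau(n) = x \log x + O(x). \]

Therefore the average order of $\tau(n)$ is about $\log n$.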
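The identity $\sum_{n \le x} \tau(n) = \sum_{d \le x} [x/d]$ used in the proof also gives a quick way to compute the sum and compare it with $x \log x$. The sketch below does this for a few integer values of $x$; `tau_sum` is an illustrative name, and the displayed ratio is only a numerical sanity check of the $O(x)$ error term, not a proof.

```python
from math import log


def tau_sum(x: int) -> int:
    """Sum of tau(n) for n <= x, computed as the sum over d <= x of floor(x / d)."""
    return sum(x // d for d in range(1, x + 1))


print(tau_sum(10))  # 27 = 1 + 2 + 2 + 3 + 2 + 4 + 2 + 4 + 3 + 4

for x in (10, 100, 1000, 10000):
    error = tau_sum(x) - x * log(x)
    # The proposition says error / x stays bounded as x grows.
    print(x, tau_sum(x), round(error / x, 4))
```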


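Proposition 2.27. $\sum_{n \le x} \sigma(n) = \frac{\pi^2}{12} x^2 + O(x \log x)$.

Proof. We observe that

\[ \sum_{n \le x} \sigma(n) = \sum_{n \le x} \sum_{d \mid n} d = \sum_{dm \le x} d = \sum_{m \le x} \sum_{d \le x/m} d. \]

Since

\[ \sum_{d \le x/m} d = \frac{1}{2}[x/m]([x/m] + 1) = \frac{1}{2}(x/m)^2 + O(x/m), \]

we have

\[ \sum_{m \le x} \sum_{d \le x/m} d = \sum_{m \le x} \Big( \frac{1}{2}(x/m)^2 + O(x/m) \Big) = \frac{x^2}{2} \sum_{m \le x} \frac{1}{m^2} + \sum_{m \le x} O(x/m). \]

We have seen in the previous proposition that $\sum_{m \le x} O(x/m) = O(x \log x)$. Finally we use the fact that

\[ \sum_{m \le x} \frac{1}{m^2} = \sum_{m=1}^{\infty} \frac{1}{m^2} + O(1/x) \]

and the result follows by the identity $\sum_{m=1}^{\infty} \frac{1}{m^2} = \pi^2/6$.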


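Proposition 2.28. $\sum_{n \le x} \varphi(n) = \frac{3}{\pi^2} x^2 + O(x \log x)$.

Proof. By Corollary 2.13 and the Möbius inversion formula, $\varphi(n) = \sum_{d \mid n} \mu(d)(n/d)$. Hence

\[ \sum_{n \le x} \varphi(n) = \sum_{n \le x} \sum_{d \mid n} \mu(d)(n/d) = \sum_{de \le x} \mu(d) e = \sum_{d \le x} \mu(d) \sum_{e \le x/d} e. \]

Since we have seen that $\sum_{e \le x/d} e = \frac{1}{2}(x/d)^2 + O(x/d)$,

\[ \sum_{d \le x} \mu(d) \sum_{e \le x/d} e = \frac{x^2}{2} \sum_{d \le x} \frac{\mu(d)}{d^2} + x \sum_{d \le x} \mu(d)\, O(1/d), \]

and the last term is $O(x \log x)$. But $\sum_{d \le x} \mu(d)/d^2 = \sum_{d=1}^{\infty} \mu(d)/d^2 + O(1/x)$. We will show later (Corollary 2.32) that $\sum_{d=1}^{\infty} \mu(d)/d^2 = 6/\pi^2$. So we conclude that

\[ \sum_{n \le x} \varphi(n) = \frac{3}{\pi^2} x^2 + O(x \log x). \]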

Corollary 2.29. The probability of two randomly selected positive integers being coprime is $6/\pi^2$.

Proof. For each positive integer $x$, the sum $\sum_{n \le x} \varphi(n)$ is the number of unordered pairs of coprime integers $a, b$ with $0 < a, b \le x$. There are $\binom{x}{2}$ ways to select two positive integers $a, b$ with $0 < a, b \le x$. So the probability that two randomly selected integers less than or equal to $x$ are coprime is $\sum_{n \le x} \varphi(n) / \binom{x}{2}$. By the previous proposition, letting $x \to \infty$ we conclude that

\[ \lim_{x \to \infty} \frac{\sum_{n \le x} \varphi(n)}{\binom{x}{2}} = \frac{6}{\pi^2}. \]
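The limit in Corollary 2.29 can be observed numerically by counting coprime pairs up to a bound. The sketch below follows the ratio used in the proof; `totient_sum` is an illustrative name, and the brute-force double loop is chosen for transparency, so only modest values of $x$ are practical.

```python
from math import comb, gcd, pi


def totient_sum(x: int) -> int:
    """Sum of phi(n) for n <= x, counted as pairs 1 <= a <= b <= x with gcd(a, b) = 1."""
    return sum(
        1
        for b in range(1, x + 1)
        for a in range(1, b + 1)
        if gcd(a, b) == 1
    )


target = 6 / pi**2  # approximately 0.6079
for x in (50, 100, 300):
    ratio = totient_sum(x) / comb(x, 2)
    print(x, round(ratio, 4), round(target, 4))
```

The ratio approaches $6/\pi^2$ slowly, in line with the $O(x \log x)$ error term in Proposition 2.28.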

2.7 Dirichlet series

We introduce Dirichlet series of the form $\sum_{n=1}^{\infty} \frac{f(n)}{n^s}$ where $f(n) \in \mathbb{Z}$ for each $n$ and $s \in \mathbb{C}$. By convention, we write $s = \sigma + it$ with $\sigma, t \in \mathbb{R}$.

Definition 2.30. The Riemann zeta-function $\zeta(s)$ is defined to be

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]

It converges absolutely for $\sigma > 1$.

Proposition 2.31. Given Dirichlet series $F(s) = \sum_n f(n)/n^s$ and $G(s) = \sum_n g(n)/n^s$, if $F(s), G(s)$ both converge absolutely for $s \in S$ for some set $S$, then $F(s)G(s) = \sum_n (f * g)(n)/n^s$ for $s \in S$.

Proof. Since $F, G$ both converge absolutely, we are free to rearrange the sum. Indeed, we have

\[ F(s)G(s) = \sum_k f(k)/k^s \sum_m g(m)/m^s = \sum_{m,k} f(k) g(m)/(km)^s = \sum_n \sum_{k \mid n} f(k) g(n/k)/n^s. \]

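Corollary 2.32. $\frac{1}{\zeta(s)} = \sum_n \mu(n)/n^s$ for $\sigma > 1$. In particular, $\sum_n \mu(n)/n^2 = 6/\pi^2$.

Proof. It suffices to prove that $\zeta(s) \sum_n \mu(n)/n^s = 1$. Since $\zeta(s)$ and $\sum_n \mu(n)/n^s$ both converge absolutely for $\sigma > 1$,

\[ \zeta(s) \sum_n \mu(n)/n^s = \sum_n (1 * \mu)(n)/n^s = \sum_n \nu(n)/n^s = 1. \]

The second statement follows by taking $s = 2$ and using $\zeta(2) = \pi^2/6$.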

2.8 Exercises

1. Show that the number of ordered pairs of positive integers whose least common multiple is $n$ equals $\tau(n^2)$.

2. Show that $\sum_{d^2 \mid n} \mu(d) = |\mu(n)|$.

3. Show that if $\sigma(n)$ is odd then $n$ is a square or twice a square.

4. For each $n \ge 2$, show that

\[ \sigma(n) + \varphi(n) = n\tau(n) \]

if and only if $n$ is a prime.

5. For each $n \ge 2$, let

\[ T(n) = \{a : 1 \le a \le n,\ (a, n) = 1\} \]

and $f(n) = \frac{1}{n} \sum_{a \in T(n)} a$. (i) Show that $f(n) = \frac{1}{2}\varphi(n)$.

(ii) By evaluating $\prod_{a \le n} \frac{a}{n}$ in two different ways, show also that

\[ \prod_{a \in T(n)} a = n^{\varphi(n)} \prod_{d \mid n} (d!/d^d)^{\mu(n/d)}. \]

6. Show that if $n$ has $k$ distinct prime factors then $\sum_{d \mid n} |\mu(d)| = 2^k$.

7. Find all positive integers $n$ such that (i) $\varphi(n) \mid n$ (ii) $\varphi(n) = \frac{1}{2} n$ (iii) $\varphi(n) = \varphi(2n)$.

8. For $\mathrm{Re}(s) > 1$, compute the Dirichlet series of (i) $\zeta(s)^2$ (ii) $1/\zeta(s)$ (iii) $\zeta(s - 1)/\zeta(s)$.

9. Let $A$ be the $n \times n$ matrix with $A_{ij} = (i, j)$. (i) Let $g(i, j) = 1$ if $j \mid i$ and 0 otherwise. Show that

\[ A_{ij} = \sum_{d \le n} g(i, d) g(j, d) \varphi(d). \]

(ii) By considering the matrices $B, C$ where

\[ B_{ij} = g(i, j) \quad \text{and} \quad C_{ij} = g(j, i)\varphi(i), \]

show that

\[ \det A = \prod_{k=1}^{n} \varphi(k). \]

10. (i) Prove the following generalised version of the Möbius inversion formula. Let $f, g$ be functions defined over $\mathbb{R}_{\ge 1}$. Show that if

\[ g(x) = \sum_{n \le x} f(x/n), \]

then

\[ f(x) = \sum_{n \le x} \mu(n) g(x/n). \]

(ii) Show that $\sum_{n \le x} \mu(n)[x/n] = 1$. Hence prove that

\[ \Big| \sum_{n \le x} \mu(n)/n \Big| \le 1. \]

3 Congruences

In this chapter we will introduce the concept of congruences. We shall assume n ≥ 2.

3.1 Definition

Definition 3.1. We say a is congruent to b mod n, written

a ≡ b mod n

if n|a− b. It is easy to check that this is an equivalence relation.

Lemma 3.2. If a ≡ a′ mod n and b ≡ b′ mod n then a + b ≡ a′ + b′ mod n and ab ≡ a′b′ mod n. For any integer c, ca ≡ ca′ mod n. Conversely, if ca ≡ ca′ mod n and (c, n) = 1, then a ≡ a′ mod n.

Further, if f is a polynomial with integer coefficients, then f(a) ≡ f(a′) mod n.

Proof. The first statement is clear because if n|a − a′ and n|b − b′ then n|(a + b) − (a′ + b′). Now ab − a′b′ = ab − ab′ + ab′ − a′b′ = a(b − b′) + (a − a′)b′ so n|ab − a′b′. It is clear that if n|a − a′ then n|ca − ca′ for any c. Conversely, if n|c(a − a′) and (c, n) = 1 then n|a − a′ by Lemma 1.11. Finally, a − a′ divides f(a) − f(a′) for any polynomial f with integer coefficients, so n|f(a) − f(a′).

Definition 3.3. Let n ≥ 1 be an integer. Z/nZ is the quotient ring, in which addition and multiplication are understood in terms of modular arithmetic. That is,

(a+ nZ) + (b+ nZ) = (a+ b) + nZ, (a+ nZ)(b+ nZ) = ab+ nZ.

Lemma 3.4. The following are equivalent. (i) $(a, n) = 1$. (ii) There exists $x$ such that $ax \equiv 1 \bmod n$. (iii) $a$ is a generator for the additive group $\mathbb{Z}/n\mathbb{Z}$.

Proof. We shall prove (i) implies (ii), (ii) implies (iii) and (iii) implies (i). If $(a, n) = 1$ then there exist $x, y$ such that $ax + ny = 1$ and so $ax \equiv 1 \bmod n$. Suppose there exists $x$ such that $ax \equiv 1 \bmod n$, so there exists $y$ such that

\[ ax + ny = 1. \]

Let $d$ be the order of $a$ in $\mathbb{Z}/n\mathbb{Z}$, so $n \mid ad$. Since $axd + nyd = d$ and $n \mid axd$, $n \mid nyd$, we get $n \mid d$. Therefore $d = n$ and $a$ generates $\mathbb{Z}/n\mathbb{Z}$. Finally, suppose $a$ generates $\mathbb{Z}/n\mathbb{Z}$ and $(a, n) = d > 1$. Then $n \mid \frac{n}{d} a$ and so the order of $a$ is at most $\frac{n}{d} < n$, which is a contradiction.

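Lemma 3.5. The multiplicative group of the quotient ring $\mathbb{Z}/n\mathbb{Z}$, written $(\mathbb{Z}/n\mathbb{Z})^\times$, has size $\varphi(n)$.

Proof. $a \in (\mathbb{Z}/n\mathbb{Z})^\times$ if and only if there exists $x$ such that $ax \equiv 1 \bmod n$, if and only if $(a, n) = 1$ by the previous lemma.

Corollary 3.6. $\mathbb{Z}/n\mathbb{Z}$ is a field if and only if $n$ is prime.

Proof. $\mathbb{Z}/n\mathbb{Z}$ is a field if and only if every non-zero element is a unit, if and only if $\varphi(n) = n - 1$, if and only if $n$ is a prime.

Corollary 3.7 (Fermat-Euler theorem). For any $(a, n) = 1$, we have

\[ a^{\varphi(n)} \equiv 1 \bmod n. \]

In particular, if $p$ is a prime and $p \nmid a$, then $a^{p-1} \equiv 1 \bmod p$.

Proof. The group $(\mathbb{Z}/n\mathbb{Z})^\times$ has size $\varphi(n)$, and the order of any element of a finite group divides the size of the group, so $a^{\varphi(n)} \equiv 1 \bmod n$.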

3.2 Chinese remainder theorem

Lemma 3.8. The linear congruence $ax \equiv b \bmod n$ is soluble for some integer $x$ if and only if $(a, n) \mid b$.

Proof. Suppose such $x$ exists; then $n \mid ax - b$ and so $(a, n)$ divides $ax - b$. Since $(a, n) \mid ax$, we get $(a, n) \mid b$. Conversely, if $(a, n) \mid b$ then there exist $x, y$ such that $ax + ny = (a, n)$ and so

\[ a x \frac{b}{(a, n)} + n y \frac{b}{(a, n)} = b, \]

and so $a \cdot x \frac{b}{(a, n)} \equiv b \bmod n$.
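Lemma 3.8 translates into a short solver. The sketch below is a slight variant of the proof: instead of scaling a Bezout identity, it divides the congruence through by $g = (a, n)$ and inverts $a/g$ modulo $n/g$ using Python's built-in `pow(x, -1, m)` (available from Python 3.8). The function name `solve_linear_congruence` is illustrative.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, n: int):
    """Return one x with a*x ≡ b (mod n), or None if gcd(a, n) does not divide b."""
    g = gcd(a, n)
    if b % g != 0:
        return None  # not soluble, by Lemma 3.8
    # Divide through by g: (a/g) x ≡ b/g (mod n/g), where a/g is a unit mod n/g.
    a1, b1, n1 = a // g, b // g, n // g
    if n1 == 1:
        return 0  # every integer is a solution
    return (pow(a1, -1, n1) * b1) % n1


print(solve_linear_congruence(6, 4, 10))  # 4, since 6 * 4 = 24 ≡ 4 (mod 10)
print(solve_linear_congruence(6, 3, 10))  # None, since gcd(6, 10) = 2 does not divide 3
```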

We now turn to simultaneous linear congruences.

Theorem 3.9 (Chinese remainder theorem). Let $n_1, \dots, n_k$ be natural numbers and suppose that they are pairwise coprime, that is $(n_i, n_j) = 1$ for all $i \ne j$. Then, for any integers $c_1, \dots, c_k$, the congruences $x \equiv c_j \bmod n_j$ with $1 \le j \le k$ are soluble simultaneously for some integer $x$. The solution is unique modulo $n = \prod_i n_i$.

Proof. Existence: let $n = \prod_i n_i$ and $m_j = n/n_j$. Then $n_i \mid m_j$ for all $i \ne j$ and $m_j$ is coprime to $n_j$. So there exists $x_j$ such that $m_j x_j \equiv c_j \bmod n_j$ by the previous lemma. Let

\[ x = \sum_j x_j m_j. \]

Then $x \equiv x_j m_j \equiv c_j \bmod n_j$ for each $j$, because $n_j \mid m_i$ for all $i \ne j$.

Uniqueness: suppose $x, y$ both satisfy the condition. Then $n_j \mid x - y$ for all $j$, and since the $n_j$ are pairwise coprime, $n \mid x - y$.
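The existence argument is constructive, and a direct transcription into code might look as follows. This is a sketch under the theorem's assumptions (pairwise coprime moduli, each at least 2); the function name `crt` is illustrative, and `pow(m, -1, n)` is Python's built-in modular inverse (Python 3.8+).

```python
from math import prod


def crt(residues: list[int], moduli: list[int]) -> int:
    """Solve x ≡ c_j (mod n_j) for pairwise coprime moduli n_j >= 2.

    Returns the unique solution x with 0 <= x < n_1 * ... * n_k.
    """
    n = prod(moduli)
    x = 0
    for c_j, n_j in zip(residues, moduli):
        m_j = n // n_j
        # Choose x_j with m_j * x_j ≡ c_j (mod n_j); m_j is invertible mod n_j.
        x_j = (c_j * pow(m_j, -1, n_j)) % n_j
        x += x_j * m_j
    return x % n


print(crt([2, 3, 2], [3, 5, 7]))  # 23
```

Here 23 leaves remainders 2, 3 and 2 on division by 3, 5 and 7 respectively, and it is the only such integer in the range from 0 to 104.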

Here is another version of the Chinese remainder theorem.

Corollary 3.10. Let $n = m_1 \cdots m_k$ where $m_1, \dots, m_k$ are pairwise coprime. Then we have a ring isomorphism

\[ \mathbb{Z}/n\mathbb{Z} \cong \prod_i \mathbb{Z}/m_i\mathbb{Z}, \quad a + n\mathbb{Z} \mapsto (a + m_i\mathbb{Z})_i. \]

In particular, we have a group isomorphism

\[ (\mathbb{Z}/n\mathbb{Z})^\times \cong \prod_i (\mathbb{Z}/m_i\mathbb{Z})^\times \]

and this gives another proof that $\varphi$ is multiplicative by comparing the sizes of the groups.

Proof. It is clearly a well-defined ring homomorphism. Injectivity follows from the uniqueness part of the Chinese remainder theorem and surjectivity follows from the existence part.

3.3 Wilson’s theorem

Theorem 3.11. For any prime p, (p− 1)! ≡ −1 mod p.

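Proof. The case $p = 2$ is trivial, so let $p$ be odd. For each $a$, $a^2 \equiv 1 \bmod p$ if and only if $p \mid (a - 1)(a + 1)$, if and only if $a \equiv \pm 1 \bmod p$. Therefore, if $a \not\equiv 0, \pm 1 \bmod p$, then there exists $b \not\equiv a \bmod p$ such that $ab \equiv 1 \bmod p$. So the integers $2, \dots, p - 2$ pair off into pairs with product $1 \bmod p$, and

\[ (p - 1)! \equiv 1 \cdot (p - 1) \equiv -1 \bmod p. \]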

Corollary 3.12. Let $p$ be an odd prime. There exists $x$ such that

\[ x^2 \equiv -1 \bmod p \]

if and only if $p \equiv 1 \bmod 4$.

Proof. Suppose there exists $x$ such that

\[ x^2 \equiv -1 \bmod p. \]

Then

\[ 1 \equiv x^{p-1} = (x^2)^{\frac{p-1}{2}} \equiv (-1)^{\frac{p-1}{2}} \bmod p. \]

This implies that $\frac{p-1}{2}$ is even and so $p \equiv 1 \bmod 4$. Conversely, suppose $p \equiv 1 \bmod 4$. Let $r = \frac{p-1}{2}$, so $r$ is even. Then

\[ -1 \equiv (p - 1)! \equiv (r!)^2 (-1)^r \equiv (r!)^2 \bmod p \]

where we write every integer $i$ between $r + 1$ and $p - 1$ as $p - j$ for some $j \le r$. So $x = r!$ is a solution.
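Both Wilson's theorem and the explicit square root of $-1$ from this proof are easy to check for small primes. The sketch below uses illustrative helper names (`factorial_mod`, `sqrt_minus_one`) and assumes the argument $p$ really is prime; it does not test primality.

```python
def factorial_mod(k: int, p: int) -> int:
    """k! reduced modulo p."""
    result = 1
    for i in range(2, k + 1):
        result = result * i % p
    return result


def sqrt_minus_one(p: int) -> int:
    """For a prime p ≡ 1 (mod 4), return x = ((p - 1)/2)! mod p, so that x^2 ≡ -1 (mod p)."""
    if p % 4 != 1:
        raise ValueError("p must be congruent to 1 modulo 4")
    return factorial_mod((p - 1) // 2, p)


# Wilson's theorem: (p - 1)! ≡ -1 ≡ p - 1 (mod p).
print([factorial_mod(p - 1, p) == p - 1 for p in (2, 3, 5, 7, 11, 13)])

x = sqrt_minus_one(13)
print(x, (x * x) % 13)  # 5 12, and 12 ≡ -1 (mod 13)
```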

3.4 Lagrange’s theorem

Definition 3.13. Let $R$ be a (commutative) ring. We write $R[X]$ for the polynomial ring consisting of polynomials of the form

\[ a_0 + a_1 X + \cdots + a_n X^n, \quad n \in \mathbb{Z}_{\ge 0},\ a_i \in R. \]

Remark 3.14. The map

\[ R[X] \to \{\text{functions } R \to R\}, \quad f \mapsto (\alpha \mapsto f(\alpha)) \]

is not always injective. For example, let $R = \mathbb{Z}/p\mathbb{Z}$ and $f = X^p - X$. Then $f(\alpha) = 0$ for all $\alpha \in R$ by the Fermat-Euler theorem. But $f \ne 0$ in $R[X]$.

Lemma 3.15 (Division Algorithm). Let $f, g \in R[X]$. Suppose the leading coefficient of $g$ is a unit in $R$. Then there exist $q, r \in R[X]$ with $\deg(r) < \deg(g)$ such that $f = gq + r$.

Proof. If $\deg f < \deg g$ then the result is obvious (take $q = 0$ and $r = f$). Let $\deg f \ge \deg g$ and let $n = \deg f - \deg g$. Let $a$ be the leading coefficient of $f$ and $b$ be the leading coefficient of $g$. Since $b$ is a unit, there exists $c \in R$ such that $bc = a$. Then $f_1 = f - c g X^n$ has degree less than $\deg f$. Now repeat the above for $f_1$ and $g$. Since the degrees strictly decrease, the process stops, and with $q_1 = cX^n$ we obtain a sequence

\[ f = g q_1 + f_1, \quad \deg f_1 \ge \deg g \]

\[ f_1 = g q_2 + f_2, \quad \deg f_2 \ge \deg g \]

\[ \cdots \]

\[ f_{k-1} = g q_k + f_k, \quad \deg f_k < \deg g. \]

Therefore, $f = g(\sum_i q_i) + f_k$.

Corollary 3.16. If $f \in R[X]$ has a root $\alpha \in R$, then $f(X) = (X - \alpha) q(X)$ for some $q(X) \in R[X]$.

Proof. Apply the division algorithm to $f$ and $X - \alpha$, so

\[ f(X) = (X - \alpha) q(X) + r(X), \quad \deg r(X) < \deg(X - \alpha) = 1. \]

Therefore $r(X)$ is a constant in $R$. Since $f(\alpha) = 0$ we have $r(\alpha) = 0$ and so $r(X) = 0$.

Theorem 3.17 (Lagrange’s theorem). Let $R$ be an integral domain (that is, $ab = 0$ if and only if $a = 0$ or $b = 0$) and let $f \in R[X]$ be a non-zero polynomial of degree $n$. Then $f$ has at most $n$ roots in $R$.

Proof. We prove the theorem by induction on $n = \deg f$. If $n = 0$ then $f$ is a non-zero constant and has no roots. If $n = 1$, then $f = aX - b$ for some $a, b \in R$ with $a \ne 0$; if $\alpha, \beta$ are both roots then $a(\alpha - \beta) = 0$ and so $\alpha = \beta$, hence $f$ has at most one root. Suppose the statement is true for $n$. Now let $\deg f = n + 1$. If $f$ has no root we are done. If $f$ has a root $\alpha$ then by the previous corollary, there exists $q(X)$ such that

\[ f(X) = (X - \alpha) q(X). \]

Then $\deg q(X) = n$ and so $q$ has at most $n$ roots. Any root $\beta \ne \alpha$ of $f$ satisfies $(\beta - \alpha) q(\beta) = 0$, so $q(\beta) = 0$ because $R$ is an integral domain. Therefore, $f$ has at most $n + 1$ roots.

Remark 3.18. By considering $R = \mathbb{Z}/p\mathbb{Z}$ and $f(X) = X^{p-1} - 1 - \prod_{i=1}^{p-1}(X - i)$ for an odd prime $p$, we have another proof of Wilson’s theorem by using Lagrange’s theorem. Indeed, $f$ has degree at most $p - 2$ but $X = 1, \dots, p - 1$ are roots of $f$. Therefore, $f = 0$ by Lagrange’s theorem. By considering the constant term we have $-1 \equiv (p - 1)! \bmod p$.

3.5 Primitive roots

We will show that $(\mathbb{Z}/p^n\mathbb{Z})^\times$ is cyclic for every odd prime p.

Lemma 3.19. For each prime p, $(\mathbb{Z}/p\mathbb{Z})^\times$ is cyclic.

Proof. Each element $a \in (\mathbb{Z}/p\mathbb{Z})^\times$ satisfies $a^{p-1} = 1$, so the order of a is d for some $d \mid p - 1$. Let $S_d$ be the set of elements of order d.

Suppose $S_d \ne \emptyset$. Let $a \in S_d$ and let $G_d$ be the subgroup generated by a. Then $G_d = \{1, a, \ldots, a^{d-1}\}$. Let $R = \mathbb{Z}/p\mathbb{Z}$ and $f(X) = X^d - 1$. Every element of order d is a root of f. By Lagrange’s theorem f has at most d roots. But each element of $G_d$ is a root of f, and so these are all the roots of f. Therefore each element of order d is $a^i$ for some i < d.

Under the isomorphism $G_d \cong \mathbb{Z}/d\mathbb{Z}$, $a^i \mapsto i$, Lemma 3.4 shows that $a^i$ has order d if and only if (i, d) = 1. So $|S_d| = \varphi(d)$.

Therefore $|S_d| = 0$ or $\varphi(d)$ for every $d \mid p - 1$. Recall that $\sum_{d \mid p-1} \varphi(d) = p - 1$, so

\[ \sum_{d \mid p-1} \varphi(d) = p - 1 = \sum_{d \mid p-1} |S_d|. \]

This shows that $|S_d| = \varphi(d)$ for all $d \mid p - 1$, and so in particular $|S_{p-1}| = \varphi(p-1) \ge 1$, which means there exists an element of order p − 1.
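
The counting statement $|S_d| = \varphi(d)$ can be verified by brute force for small primes. This is a sketch for illustration only; the function names are hypothetical and the order computation is deliberately naive.

```python
from math import gcd

def multiplicative_order(a, p):
    """Order of a in (Z/pZ)^x, found by repeated multiplication (a must be coprime to p)."""
    order, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        order += 1
    return order

def phi(n):
    """Euler's totient function, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def order_counts(p):
    """Map each d to |S_d|, the number of elements of order d in (Z/pZ)^x."""
    counts = {}
    for a in range(1, p):
        d = multiplicative_order(a, p)
        counts[d] = counts.get(d, 0) + 1
    return counts
```

For p = 13 one can compare `order_counts(13)` with `{d: phi(d) for d in (1, 2, 3, 4, 6, 12)}`.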

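Theorem 3.20. $(\mathbb{Z}/p^n\mathbb{Z})^\times$ is cyclic for every odd prime p.

We will prove this by several lemmas.

Lemma 3.21. For each n ≥ 2, g generates $(\mathbb{Z}/p^n\mathbb{Z})^\times$ if and only if g generates $(\mathbb{Z}/p\mathbb{Z})^\times$ and

\[ g^{p^{n-2}(p-1)} \not\equiv 1 \bmod p^n. \]

Proof. Suppose g generates $(\mathbb{Z}/p^n\mathbb{Z})^\times$. Its order is $p^{n-1}(p-1)$, so clearly $g^{p^{n-2}(p-1)} \not\equiv 1 \bmod p^n$. Let d be the order of g in $(\mathbb{Z}/p\mathbb{Z})^\times$, so $g^d = 1 + pz$ for some z. Then

\[ g^{dp^{n-1}} \equiv 1 \bmod p^n \]

and so the order of g in $(\mathbb{Z}/p^n\mathbb{Z})^\times$ divides $dp^{n-1}$. If d < p − 1 this is a contradiction, so d = p − 1 and g generates $(\mathbb{Z}/p\mathbb{Z})^\times$.

Conversely, let g be a generator of $(\mathbb{Z}/p\mathbb{Z})^\times$ with $g^{p^{n-2}(p-1)} \not\equiv 1 \bmod p^n$. Let d be the order of g in $(\mathbb{Z}/p^n\mathbb{Z})^\times$. Since $\varphi(p^n) = p^{n-1}(p-1)$, we have $d \mid p^{n-1}(p-1)$. Also $g^d \equiv 1 \bmod p^n$ implies that $g^d \equiv 1 \bmod p$. So $p - 1 \mid d$ and hence $d = (p-1)p^k$ for some k ≤ n − 1. But $g^{p^{n-2}(p-1)} \not\equiv 1 \bmod p^n$, so k > n − 2 and hence k = n − 1. Therefore $d = \varphi(p^n)$.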


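Lemma 3.22. For each odd prime p, $(\mathbb{Z}/p^2\mathbb{Z})^\times$ is cyclic. In particular, g generates $(\mathbb{Z}/p\mathbb{Z})^\times$ and $g^{p-1} \not\equiv 1 \bmod p^2$ if and only if g is a generator for $(\mathbb{Z}/p^2\mathbb{Z})^\times$.

Proof. Consider the natural reduction

\[ (\mathbb{Z}/p^2\mathbb{Z})^\times \to (\mathbb{Z}/p\mathbb{Z})^\times, \qquad a + p^2\mathbb{Z} \mapsto a + p\mathbb{Z}. \]

It is clearly surjective. By considering the sizes of the groups, the kernel is a subgroup of order p and so it must be cyclic. This gives an element of order p. A lift of a generator of $(\mathbb{Z}/p\mathbb{Z})^\times$ has order divisible by p − 1, so a suitable power of it has order p − 1. Since the group is abelian and p, p − 1 are coprime, the product of these two elements has order p(p − 1). The second statement is Lemma 3.21 with n = 2.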

We now prove Theorem 3.20.

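Proof. Let g be a generator of $(\mathbb{Z}/p^2\mathbb{Z})^\times$. So g is a generator of $(\mathbb{Z}/p\mathbb{Z})^\times$ and $g^{p-1} \not\equiv 1 \bmod p^2$. Let $g^{p-1} = 1 + pz$ where (z, p) = 1. Then for each k ≥ 0, by induction on k (using that p is odd),

\[ g^{p^k(p-1)} = (1 + pz)^{p^k} \equiv 1 + p^{k+1}z \bmod p^{k+2}. \]

In particular, $g^{p^k(p-1)} \not\equiv 1 \bmod p^{k+2}$. Let k = n − 2 and apply Lemma 3.21.

Remark 3.23. The proof of the theorem implies that, for an odd prime p and n ≥ 2, g is a generator of $(\mathbb{Z}/p^n\mathbb{Z})^\times$ if and only if g is a generator of $(\mathbb{Z}/p^2\mathbb{Z})^\times$, if and only if g is a generator of $(\mathbb{Z}/p\mathbb{Z})^\times$ and $g^{p-1} = 1 + pz$ where (z, p) = 1.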
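
Remark 3.23 gives a practical test: a single check modulo p and one modulo $p^2$ decide whether g generates $(\mathbb{Z}/p^n\mathbb{Z})^\times$ for every n ≥ 2. The sketch below assumes p is an odd prime; the function names are hypothetical.

```python
def prime_factors(n):
    """Set of distinct prime factors of n, by trial division."""
    factors, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            factors.add(q)
            n //= q
        q += 1
    if n > 1:
        factors.add(n)
    return factors

def is_primitive_root_mod_p(g, p):
    """g generates (Z/pZ)^x iff g^((p-1)/q) != 1 mod p for every prime q dividing p-1."""
    if g % p == 0:
        return False
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))

def generates_all_prime_powers(g, p):
    """Criterion of Remark 3.23 for an odd prime p: generator mod p and g^(p-1) != 1 mod p^2."""
    return is_primitive_root_mod_p(g, p) and pow(g, p - 1, p * p) != 1
```

For instance, 7 is a primitive root mod 5, but $7^4 = 2401 \equiv 1 \bmod 25$, so 7 fails the test, whereas 2 passes it.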

Remark 3.24. Theorem 3.20 is not true for p = 2. For example, $(\mathbb{Z}/2^3\mathbb{Z})^\times \cong (\mathbb{Z}/2\mathbb{Z})^2$.

3.6 Chevalley’s theorem

We briefly discuss congruences of general polynomials mod p. Throughout, let p be a prime and $R = \mathbb{Z}/p\mathbb{Z}$.

Definition 3.25. For any f, g ∈ R[X1, . . . , Xn], we say f is equivalent to g, (written f ∼ g) if

f(a1, . . . , an) = g(a1, . . . , an)

for all (a1, . . . , an) ∈ Rn.

Example 3.26. Note that f ∼ g does not imply f = g. For example, $X^p - X \sim X^{p+1} - X^2$ but $X^p - X \ne X^{p+1} - X^2$.

Definition 3.27. A polynomial $f \in R[X_1, \ldots, X_n]$ is called reduced if f has degree less than p in each variable $X_i$.

Lemma 3.28. For each polynomial $f \in R[X_1, \ldots, X_n]$ there exists a reduced polynomial f′ such that f′ ∼ f.

Proof. Since $a^p = a$ for all $a \in R$, we have $X_i^p \sim X_i$. Repeatedly replace each power $X_i^{d}$ with d ≥ p occurring in f by $X_i^{d-(p-1)}$ until every exponent is less than p, and let f′ be the polynomial obtained after this reduction. Then f′ is reduced and f′ ∼ f.

Lemma 3.29. For any f, g ∈ R[X1, . . . , Xn], if f and g are both reduced and f ∼ g, then f = g.

Proof. By considering f − g, we can assume that g = 0, and so it suffices to prove that if f is reduced and f ∼ 0 then f = 0.

When n = 1 the result follows from Lagrange’s theorem, since f has degree less than p and p roots. Suppose the result is true for n. Then for n + 1, consider f as a polynomial in $R[X_1, \ldots, X_n][X_{n+1}]$. Write

\[ f = b_m X_{n+1}^m + \cdots + b_0, \qquad \text{where } b_i \in R[X_1, \ldots, X_n]. \]


Suppose $b_i \not\sim 0$ for some i. Then there exist $a_1, \ldots, a_n$ such that $b_i(a_1, \ldots, a_n) \ne 0$, and so for this $(a_1, \ldots, a_n)$ the polynomial

\[ f(a_1, \ldots, a_n, X_{n+1}) = b_m(a_1, \ldots, a_n)X_{n+1}^m + \cdots + b_0(a_1, \ldots, a_n) \]

is a non-zero polynomial in $R[X_{n+1}]$ of degree less than p. But every element of R is a root of this polynomial because f ∼ 0, which contradicts Lagrange’s theorem. Therefore $b_i \sim 0$ for all i. Each $b_i$ is reduced, so by the inductive hypothesis $b_i = 0$ for each i, and so f = 0.

Theorem 3.30 (Chevalley’s theorem). Let $f \in R[X_1, \ldots, X_n]$ with deg f < n, and call a point $(a_1, \ldots, a_n) \in R^n$ with $f(a_1, \ldots, a_n) = 0$ a solution of f. (i) If f has a solution, then f has at least two solutions. (ii) If f is homogeneous (that is, each monomial in f has the same degree), then f has a non-trivial solution.

Proof. (i) Let r = deg f < n and $h = 1 - f^{p-1}$. If $X_i = a_i$, i = 1, . . . , n, is a solution of f then $h(a_1, \ldots, a_n) = 1$. Suppose f has no other solution. Then for any other values $x_1, \ldots, x_n \in R$,

\[ f^{p-1}(x_1, \ldots, x_n) = 1 \]

and so $h(x_1, \ldots, x_n) = 0$.

By Lemma 3.28, there exists a reduced polynomial h′ such that h ∼ h′. Define a reduced polynomial

\[ h''(X_1, \ldots, X_n) = \prod_{i=1}^{n} \left(1 - (X_i - a_i)^{p-1}\right). \]

Then h′′ takes the value 1 at $(a_1, \ldots, a_n)$ and 0 elsewhere, so h′′ ∼ h and hence h′′ ∼ h′. Since h′, h′′ are reduced, the previous lemma gives h′′ = h′. Now h has degree at most r(p − 1). Since h′ is obtained from h by reduction, which does not increase the degree,

\[ \deg h' \le r(p-1) < n(p-1) = \deg h'', \]

which gives a contradiction. Therefore f has at least two solutions.

(ii) Since f is homogeneous, (0, . . . , 0) is a solution, and so by (i) f has a non-trivial solution.
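
For small p and n the conclusion of Chevalley’s theorem can be observed by brute force. The following sketch counts solutions over $(\mathbb{Z}/p\mathbb{Z})^n$; the name `count_zeros` and the sample form $x^2 + y^2 + z^2$ (degree 2 in 3 variables) are illustrative choices, not taken from the notes.

```python
from itertools import product

def count_zeros(f, n, p):
    """Number of points (x_1, ..., x_n) in (Z/pZ)^n with f(x_1, ..., x_n) = 0 mod p."""
    return sum(1 for point in product(range(p), repeat=n) if f(*point) % p == 0)

def quadratic_form(x, y, z):
    """A homogeneous polynomial of degree 2 < 3 variables."""
    return x * x + y * y + z * z

# The trivial solution (0, 0, 0) always exists, so the theorem predicts at least two.
zeros_mod_7 = count_zeros(quadratic_form, 3, 7)
```

The search is exponential in n, so this is only a sanity check for very small cases.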

3.7 Exercises

1. Prove that if (a, m) = (a − 1, m) = 1, then

\[ \sum_{i=0}^{\varphi(m)-1} a^i \equiv 0 \bmod m. \]

Let $S_1 = \{1, 11, 111, \ldots\}$ and let p be a prime with p ≠ 2, 5. Show that there are infinitely many elements of $S_1$ which are divisible by p.

2. Show that $a^{kp-k+1} \equiv a \bmod p$ for all primes p, integers a and positive integers k. Deduce that 798 divides $a^{19} - a$ for all integers a.

3. Show that if m > 4 then $(m-1)! \equiv -1 \bmod m$ if and only if m is a prime. Show further that if p is an odd prime and 0 < k < p then

\[ (p-k)!\,(k-1)! \equiv (-1)^k \bmod p. \]

4. Show that if (m, n) = 1 then $m^{\varphi(n)} + n^{\varphi(m)} \equiv 1 \bmod mn$.


5. (i) Prove that for a prime p > 3 the product of all the distinct primitive roots mod p is congruent to 1 mod p. (ii) Prove that for any prime p, the sum of all the distinct primitive roots mod p is congruent to µ(p − 1) mod p.

6. Find a counterexample to Lagrange’s theorem when R is not an integral domain.

7. For each odd prime p and n ≥ 2, show that the kernel of the natural reduction

\[ (\mathbb{Z}/p^n\mathbb{Z})^\times \to (\mathbb{Z}/p\mathbb{Z})^\times \]

is cyclic. This gives another proof that $(\mathbb{Z}/p^n\mathbb{Z})^\times$ is cyclic.

By considering the natural reduction

\[ (\mathbb{Z}/2^n\mathbb{Z})^\times \to (\mathbb{Z}/4\mathbb{Z})^\times \]

where n ≥ 2, show that $(\mathbb{Z}/2^n\mathbb{Z})^\times \cong (\mathbb{Z}/2\mathbb{Z}) \times (\mathbb{Z}/2^{n-2}\mathbb{Z})$.

8. Let a and n be integers greater than 1, and put $N = a^n - 1$. Show that the order of $a + N\mathbb{Z}$ in $(\mathbb{Z}/N\mathbb{Z})^\times$ is exactly n and deduce that $n \mid \varphi(N)$. If n is a prime, deduce that there are infinitely many primes q such that $q \equiv 1 \bmod n$.

9. Show that for any odd prime p,

\[ 1^2 \cdot 3^2 \cdot 5^2 \cdots (p-2)^2 \equiv (-1)^{\frac{p+1}{2}} \bmod p. \]

10. Let p be a prime > 3. By considering $\frac{1}{i} + \frac{1}{p-i}$ for each i ≤ (p − 1)/2, show that the numerator of $u_p = 1 + \frac{1}{2} + \cdots + \frac{1}{p-1}$ is divisible by $p^2$.


4 Quadratic Residues

We shall study the quadratic congruences $x^2 \equiv a \bmod n$.

4.1 Legendre’s symbol

Definition 4.1. Let $a \in \mathbb{Z}$ and $n \in \mathbb{N}$ such that (a, n) = 1. Then a is called a quadratic residue (QR) mod n if the congruence $x^2 \equiv a \bmod n$ has a solution. Otherwise a is called a quadratic non-residue (QNR). Note that if (a, n) > 1 the congruence $x^2 \equiv a \bmod n$ may still be soluble (for example $x^2 \equiv 4 \bmod 12$), though in this case a is not a QR mod n by definition.

Definition 4.2. Let p be a prime. The Legendre symbol $\left(\frac{a}{p}\right)$ is defined as 1 if a is a QR mod p, −1 if a is a QNR mod p, and 0 if $p \mid a$. Clearly, if $a \equiv b \bmod p$, then

\[ \left(\frac{a}{p}\right) = \left(\frac{b}{p}\right). \]

Lemma 4.3. For each n, the set of QR mod n forms a group under multiplication.

Proof. 1 is a QR mod n because $1 \equiv 1^2 \bmod n$. If $x^2 \equiv a \bmod n$ and $y^2 \equiv b \bmod n$ then

\[ (xy)^2 \equiv ab \bmod n. \]

If $x^2 \equiv a \bmod n$ then $x^{-2} \equiv a^{-1} \bmod n$.

Lemma 4.4. For each n > 2, the set of QR mod n has size φ(n)/t(n), where t(n) is the number of elements $x \in (\mathbb{Z}/n\mathbb{Z})^\times$ such that $x^2 \equiv 1 \bmod n$. In particular, the set of QR mod p has size (p − 1)/2 for any odd prime p.

Proof. The map

\[ (\mathbb{Z}/n\mathbb{Z})^\times \to (\mathbb{Z}/n\mathbb{Z})^\times, \qquad x \mapsto x^2 \]

is a group homomorphism and its image is the set of QR. The kernel is the set of elements x such that $x^2 \equiv 1 \bmod n$. The result follows from the isomorphism theorem.

4.2 Euler’s criterion

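Theorem 4.5 (Euler’s criterion). Let p be an odd prime and $a \in \mathbb{Z}$. Then

\[ \left(\frac{a}{p}\right) \equiv a^{\frac{p-1}{2}} \bmod p. \]

Proof. If $p \mid a$ both sides are 0, so assume (a, p) = 1. Let g be a generator for $(\mathbb{Z}/p\mathbb{Z})^\times$. Then $g^{2k}$, k = 1, . . . , (p − 1)/2, are QR mod p. Since there are (p − 1)/2 QR mod p, these are all the QR mod p, and so every QNR has the form $g^{2k+1}$ for some k. If a is a QR mod p then $a = g^{2k}$ and so

\[ \left(\frac{a}{p}\right) = 1 \equiv g^{k(p-1)} = a^{(p-1)/2} \bmod p. \]

Since $x^2 \equiv 1 \bmod p$ if and only if $x \equiv \pm 1 \bmod p$, and g has order p − 1, we have $g^{(p-1)/2} \equiv -1 \bmod p$. Therefore, if a is a QNR and $a = g^{2k+1}$ for some k, then

\[ \left(\frac{a}{p}\right) = -1 \equiv g^{k(p-1)+(p-1)/2} = a^{(p-1)/2} \bmod p. \]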
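
Euler’s criterion gives a fast way to compute the Legendre symbol with modular exponentiation. The sketch below, with the hypothetical name `legendre`, assumes p is an odd prime and compares the result against a direct enumeration of squares.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, computed as a^((p-1)/2) mod p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r   # r is 0, 1 or p-1; p-1 represents -1

def legendre_by_definition(a, p):
    """Legendre symbol computed directly from Definition 4.2."""
    if a % p == 0:
        return 0
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1
```

The first function needs only O(log p) multiplications, whereas the second enumerates all squares mod p.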


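Corollary 4.6. For all integers a, b and every odd prime p,

\[ \left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right). \]

In particular, the map

\[ \chi : (\mathbb{Z}/p\mathbb{Z})^\times \to \{\pm 1\}, \qquad a \mapsto \left(\frac{a}{p}\right) \]

is a group homomorphism whose kernel is the set of QR mod p.

Proof. By Euler’s criterion,

\[ \left(\frac{ab}{p}\right) \equiv (ab)^{(p-1)/2} \equiv a^{(p-1)/2}\, b^{(p-1)/2} \equiv \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) \bmod p. \]

Since both sides lie in {0, ±1} and p > 2, we get $\left(\frac{ab}{p}\right) = \left(\frac{a}{p}\right)\left(\frac{b}{p}\right)$.

Corollary 4.7. Let p be an odd prime. Then −1 is a square mod p if and only if $p \equiv 1 \bmod 4$.

Proof. Apply Euler’s criterion with a = −1.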

We give some applications of the above corollary.

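Proposition 4.8. Let p be an odd prime. Then

(i) $\sum_{a=1}^{p-1} \left(\frac{a}{p}\right) = 0$.

(ii) $\sum_{a=1}^{p-1} a\left(\frac{a}{p}\right) \equiv 0 \bmod p$ if p > 3.

(iii) $\sum_{a=1}^{p-1} \left(\frac{a(a+1)}{p}\right) = -1$.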

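Proof. (i) Let b be a QNR mod p (which exists as p > 2). Then

\[ (\mathbb{Z}/p\mathbb{Z})^\times \to (\mathbb{Z}/p\mathbb{Z})^\times, \qquad a \mapsto ab \]

is a bijection. So

\[ \sum_{a=1}^{p-1} \left(\frac{a}{p}\right) = \sum_{a=1}^{p-1} \left(\frac{ab}{p}\right) = \left(\frac{b}{p}\right) \sum_{a=1}^{p-1} \left(\frac{a}{p}\right) = -\sum_{a=1}^{p-1} \left(\frac{a}{p}\right) \]

and so $\sum_{a=1}^{p-1} \left(\frac{a}{p}\right) = 0$.

(ii) Since p > 3, we can pick $b \not\equiv \pm 1, 0 \bmod p$, and by the above bijection we have

\[ \sum_{a=1}^{p-1} a\left(\frac{a}{p}\right) \equiv \sum_{a=1}^{p-1} ab\left(\frac{ab}{p}\right) \equiv \left(\frac{b}{p}\right) b \sum_{a=1}^{p-1} a\left(\frac{a}{p}\right) \bmod p \]

and so

\[ \left(1 - \left(\frac{b}{p}\right) b\right) \sum_{a=1}^{p-1} a\left(\frac{a}{p}\right) \equiv 0 \bmod p. \]

Since $b \not\equiv \pm 1 \bmod p$, the first factor is invertible mod p. Therefore $\sum_{a=1}^{p-1} a\left(\frac{a}{p}\right) \equiv 0 \bmod p$.

(iii) There is a bijection

\[ (\mathbb{Z}/p\mathbb{Z})^\times \to (\mathbb{Z}/p\mathbb{Z})^\times, \qquad a \mapsto a^{-1}. \]

So we have

\[ \sum_{a=1}^{p-1} \left(\frac{a(a+1)}{p}\right) = \sum_{a=1}^{p-1} \left(\frac{a^2}{p}\right)\left(\frac{1 + a^{-1}}{p}\right) = \sum_{b=2}^{p} \left(\frac{b}{p}\right) \]

where in the last step we set $b = 1 + a^{-1}$, which runs over all residues mod p except 1. Then by (i)

\[ \sum_{b=2}^{p} \left(\frac{b}{p}\right) = \sum_{b=1}^{p} \left(\frac{b}{p}\right) - 1 = -1. \]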


4.3 Gauss’s lemma

Lemma 4.9 (Gauss’s lemma). Let p be an odd prime and $a \in \mathbb{Z}$ such that (a, p) = 1. For each j let $a_j$ be the integer such that $a_j \equiv a \cdot j \bmod p$ and $-(p-1)/2 \le a_j \le (p-1)/2$. Then

\[ \left(\frac{a}{p}\right) = (-1)^{\nu} \]

where ν is the size of $\{j : 1 \le j \le (p-1)/2,\ a_j < 0\}$.

Proof. For 1 ≤ i, j ≤ (p − 1)/2, if $a_i = \pm a_j$, then

\[ (j \pm i)a \equiv 0 \bmod p \]

for one choice of sign. Since −p < j ± i < p and (a, p) = 1, we get j ± i = 0, and as i, j ≥ 1 this forces i = j. Therefore $|a_i| \ne |a_j|$ for all 1 ≤ i ≠ j ≤ (p − 1)/2. So

\[ \{|a_1|, \ldots, |a_{(p-1)/2}|\} = \{1, 2, \ldots, (p-1)/2\}. \]

Therefore,

\[ a^{(p-1)/2}\left(\frac{p-1}{2}\right)! \equiv \prod_j a_j = \left(\frac{p-1}{2}\right)!\,(-1)^{\nu} \bmod p \]

where ν is the size of $\{j : 1 \le j \le (p-1)/2,\ a_j < 0\}$. Cancelling the factorial, which is coprime to p, the result follows from Euler’s criterion.
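
Gauss’s lemma can be checked numerically against Euler’s criterion. This is a minimal sketch assuming p is an odd prime and (a, p) = 1; the names `gauss_nu` and `legendre_euler` are hypothetical.

```python
def gauss_nu(a, p):
    """Number of j in 1..(p-1)/2 whose residue a*j has negative least absolute residue."""
    half = (p - 1) // 2
    nu = 0
    for j in range(1, half + 1):
        r = (a * j) % p          # representative in 0..p-1
        if r > half:             # least absolute residue is r - p < 0
            nu += 1
    return nu

def legendre_euler(a, p):
    """Legendre symbol via Euler's criterion, for comparison."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def legendre_gauss(a, p):
    """Legendre symbol via Gauss's lemma: (-1)^nu."""
    return -1 if gauss_nu(a, p) % 2 else 1
```

For a = 2 the parity of `gauss_nu(2, p)` depends only on p mod 8, which is the content of the corollary on $\left(\frac{2}{p}\right)$.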

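Corollary 4.10. Let p be an odd prime. Then $\left(\frac{2}{p}\right) = 1$ if and only if $p \equiv \pm 1 \bmod 8$.

Proof. Let a = 2, so $a_j = 2j$ for $1 \le j \le [\tfrac{1}{4}p]$ and $a_j = 2j - p$ for $[\tfrac{1}{4}p] < j \le (p-1)/2$. So $\nu = (p-1)/2 - [\tfrac{1}{4}p]$, and ν is even if and only if $p \equiv \pm 1 \bmod 8$.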

4.4 Law of quadratic reciprocity

We shall study the relation between $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$ for odd primes p and q.

Theorem 4.11. Let p, q be distinct odd primes. Then

\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}. \]

We give a proof using Gauss’s lemma. An alternative proof can be found in the Exercises. The first step is to interpret the number ν of Gauss’s lemma in another way.

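Lemma 4.12. Let a, p, ν be the numbers as in Gauss’s lemma, with a > 0. Then

\[ \nu = \sum_{i=1}^{2m} (-1)^i \left[\frac{ip}{2a}\right], \qquad \text{where } m = \left[\frac{a}{2}\right]. \]

Proof. In Gauss’s lemma, ν is the number of j with 1 ≤ j ≤ (p − 1)/2 such that the product a · j lies in one of the intervals

\[ [p/2,\ p],\quad [3p/2,\ 2p],\quad \ldots,\quad [(m - 1/2)p,\ mp]. \]

Therefore it is the number of j such that j is in one of the intervals

\[ \left[\frac{p}{2a}, \frac{2p}{2a}\right],\quad \left[\frac{3p}{2a}, \frac{4p}{2a}\right],\quad \ldots,\quad \left[\frac{(2m-1)p}{2a}, \frac{2mp}{2a}\right]. \]

The end points are not integers because (a, p) = 1 and p is odd. Since the number of integers inside an interval [α, β] with $\alpha, \beta \notin \mathbb{Z}$ is [β] − [α], this proves the statement.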


Lemma 4.13. Let p, q be distinct odd primes and $a \in \mathbb{Z}$ such that (a, p) = (a, q) = 1. Let $\nu_1, \nu_2$ be the numbers for p, q respectively as in Gauss’s lemma. If $p \equiv \pm q \bmod 4a$, then $\left(\frac{a}{p}\right) = \left(\frac{a}{q}\right)$.

Proof. By Gauss’s lemma, it suffices to prove that $\nu_1$ and $\nu_2$ have the same parity. Suppose $p \equiv q \bmod 4a$. Then $\left[\frac{ip}{2a}\right]$ and $\left[\frac{iq}{2a}\right]$ have the same parity for all i, so by the previous lemma $\nu_1, \nu_2$ have the same parity.

Suppose $p \equiv -q \bmod 4a$. Then $\left[\frac{ip}{2a}\right]$ and $\left[\frac{-iq}{2a}\right]$ have the same parity. Since [−α] = −[α] − 1 for every non-integer real number α, $\left[\frac{ip}{2a}\right]$ and $\left[\frac{iq}{2a}\right]$ have different parity for each i. The sum in the previous lemma has an even number 2m of terms, so again $\nu_1$ and $\nu_2$ have the same parity.

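We now prove the law of quadratic reciprocity.

Proof. Suppose $p \not\equiv q \bmod 4$. Then $4 \mid p + q$; let p + q = 4a for some a. Since 4 is a square,

\[ \left(\frac{p}{q}\right) = \left(\frac{4a - q}{q}\right) = \left(\frac{a}{q}\right) \]

and

\[ \left(\frac{q}{p}\right) = \left(\frac{4a - p}{p}\right) = \left(\frac{a}{p}\right). \]

By the previous lemma, since $p \equiv -q \bmod 4a$,

\[ \left(\frac{p}{q}\right) = \left(\frac{a}{q}\right) = \left(\frac{a}{p}\right) = \left(\frac{q}{p}\right). \]

So $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = 1$ if $p + q \equiv 0 \bmod 4$.

Suppose now $p \equiv q \bmod 4$. By symmetry we may assume p > q. Then $4 \mid p - q$; let p − q = 4a for some a > 0. Then

\[ \left(\frac{p}{q}\right) = \left(\frac{4a + q}{q}\right) = \left(\frac{a}{q}\right) \]

and

\[ \left(\frac{q}{p}\right) = \left(\frac{p - 4a}{p}\right) = \left(\frac{-a}{p}\right). \]

By the previous lemma, since $p \equiv q \bmod 4a$,

\[ \left(\frac{p}{q}\right) = \left(\frac{a}{q}\right) = \left(\frac{a}{p}\right) = (-1)^{(p-1)/2}\left(\frac{q}{p}\right). \]

So $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = 1$ if $p \equiv q \equiv 1 \bmod 4$ and −1 if $p \equiv q \equiv 3 \bmod 4$.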

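Corollary 4.14. Let p > 3 be a prime. Then −3 is a quadratic residue mod p if and only if $p \equiv 1 \bmod 3$.

Proof. By the law of quadratic reciprocity, we have

\[ \left(\frac{-3}{p}\right) = \left(\frac{-1}{p}\right)\left(\frac{3}{p}\right) = (-1)^{(p-1)/2}(-1)^{(p-1)/2}\left(\frac{p}{3}\right) = \left(\frac{p}{3}\right). \]

Therefore $\left(\frac{-3}{p}\right) = 1$ if and only if $\left(\frac{p}{3}\right) = 1$, if and only if $p \equiv 1 \bmod 3$.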


4.5 Jacobi’s symbol

This is a generalisation of Legendre’s symbol.

Definition 4.15. Let n > 1 be a positive odd integer and $n = \prod_i p_i$ as a product of primes, not necessarily distinct. Then for any integer a, the Jacobi symbol $\left(\frac{a}{n}\right)$ is

\[ \left(\frac{a}{n}\right) = \prod_i \left(\frac{a}{p_i}\right). \]

Remark 4.16. The above definition implies that $\left(\frac{a}{n}\right) = 0$ if (a, n) > 1. Also by convention we define $\left(\frac{a}{n}\right) = 1$ if n = 1. It is clear that if $a \equiv b \bmod n$ then

\[ \left(\frac{a}{n}\right) = \left(\frac{b}{n}\right). \]

Remark 4.17. $\left(\frac{a}{n}\right) = 1$ does not imply that a is a square mod n. For example, if a = 2 and n = 15 then

\[ \left(\frac{2}{15}\right) = \left(\frac{2}{3}\right)\left(\frac{2}{5}\right) = (-1)(-1) = 1. \]

But if 2 were a square mod 15 then 2 would be a square mod 3, which is a contradiction. However, if $\left(\frac{a}{n}\right) = -1$ then a is a QNR mod n, because by definition $\left(\frac{a}{n}\right) = -1$ implies $\left(\frac{a}{p}\right) = -1$ for some prime $p \mid n$. So a is a QNR mod p and hence a QNR mod n.

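Proposition 4.18. Let n be a positive odd integer. Then $\left(\frac{2}{n}\right) = 1$ if and only if $n \equiv \pm 1 \bmod 8$.

Proof. Let $n = \prod_i p_i$, so

\[ \left(\frac{2}{n}\right) = \prod_i \left(\frac{2}{p_i}\right) = (-1)^t \]

where t is the number of $p_i$ such that $p_i \equiv \pm 3 \bmod 8$. So t is even if and only if $n \equiv \pm 1 \bmod 8$.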

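Theorem 4.19 (Law of quadratic reciprocity for the Jacobi symbol). Let m, n be odd positive integers such that (m, n) = 1. Then

\[ \left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2}\cdot\frac{n-1}{2}}. \]

Proof. Let $n = \prod_i p_i$ and $m = \prod_j q_j$, where $p_i \ne q_j$ for all i, j. Then

\[ \left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = \prod_i \prod_j \left(\frac{p_i}{q_j}\right)\left(\frac{q_j}{p_i}\right) = \prod_i \prod_j (-1)^{\frac{p_i-1}{2}\cdot\frac{q_j-1}{2}} = \prod_i \prod_j (-1)^{a_{ij}} \]

where $a_{ij} = 1$ if $p_i \equiv q_j \equiv 3 \bmod 4$ and $a_{ij} = 0$ otherwise.

Therefore $\prod_i \prod_j (-1)^{a_{ij}} = (-1)^{uv}$, where u is the number of $p_i$ such that $p_i \equiv 3 \bmod 4$ and v is the number of $q_j$ such that $q_j \equiv 3 \bmod 4$. Note that u is even if and only if $n \equiv 1 \bmod 4$, and v is even if and only if $m \equiv 1 \bmod 4$. So uv has the same parity as $\frac{m-1}{2}\cdot\frac{n-1}{2}$.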
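
Proposition 4.18 and Theorem 4.19, together with multiplicativity in the upper argument, allow the Jacobi symbol to be computed without factorising n. The following is a sketch of that standard procedure; the function name `jacobi` is an assumption of this example.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of 2 from a using (2/n) = -1 iff n = +-3 mod 8.
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: swapping flips the sign iff both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    # If the loop ends with n > 1, the original arguments had a common factor.
    return result if n == 1 else 0
```

For example `jacobi(2, 15)` returns 1, matching Remark 4.17, even though 2 is not a square mod 15.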


4.6 Hensel’s lemma

We want to determine whether a is a QR mod n for given integers a and n. By the Chinese remainder theorem, if $n = \prod_i p_i^{a_i}$ then a is a QR mod n if and only if a is a QR mod $p_i^{a_i}$ for each i.

Theorem 4.20 (Hensel’s lemma). For each odd prime p, x is a QR mod p if and only if x is a QR mod $p^n$ for all n ≥ 1.

Proof. If x is a QR mod $p^n$ for all n ≥ 1 then x is a QR mod p. Conversely, if x is a QR mod p, then we prove by induction that x is a QR mod $p^n$ for all n ≥ 1. This is true for n = 1. Suppose x is a QR mod $p^{n-1}$ where n ≥ 2. Then there exist y, k such that

\[ y^2 = x + p^{n-1}k \]

where (y, p) = 1 because (x, p) = 1.

We search for an element of the form $y + p^{n-1}b$ such that $(y + p^{n-1}b)^2 \equiv x \bmod p^n$. Then we must have

\[ (y + p^{n-1}b)^2 = x + p^{n-1}(k + 2by) + p^{2n-2}b^2 \equiv x \bmod p^n. \]

Since 2n − 2 ≥ n, we need to pick b such that

\[ k + 2by \equiv 0 \bmod p \]

and so $b \equiv -2^{-1}y^{-1}k \bmod p$. Such b exists because (2, p) = 1 and (y, p) = 1, so $2^{-1}$ and $y^{-1}$ exist mod p.
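
The inductive step of the proof translates directly into a procedure that lifts a square root mod p to a square root mod $p^n$. This is a sketch under the assumptions of the theorem (p an odd prime, (x, p) = 1, $y^2 \equiv x \bmod p$); the name `lift_sqrt` is hypothetical and `pow(..., -1, p)` needs Python 3.8 or later.

```python
def lift_sqrt(x, p, y, n):
    """Given y with y^2 = x mod p, return a square root of x modulo p^n."""
    modulus = p                                  # current modulus p^(j-1), starting at p
    for _ in range(n - 1):
        k = (y * y - x) // modulus               # y^2 = x + modulus * k, division is exact
        b = (-k * pow(2 * y, -1, p)) % p         # solves k + 2*b*y = 0 mod p
        y = (y + modulus * b) % (modulus * p)    # new root modulo the next power of p
        modulus *= p
    return y
```

For example, starting from $3^2 \equiv 2 \bmod 7$, `lift_sqrt(2, 7, 3, 5)` returns a square root of 2 modulo $7^5$ that is congruent to 3 mod 7.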

Proposition 4.21. Let x be an odd integer. Then x is a QR mod $2^n$ for all n ≥ 1 if and only if $x \equiv 1 \bmod 8$.

Proof. If x is a QR mod $2^n$ for all n ≥ 1 then $x \equiv 1 \bmod 8$ because

\[ 1^2 \equiv 3^2 \equiv 5^2 \equiv 7^2 \equiv 1 \bmod 8. \]

Conversely, let $x \equiv 1 \bmod 8$. Clearly, x is a QR mod 2 and mod 4, and we prove by induction that x is a QR mod $2^n$ for all n ≥ 3. It is clearly true for n = 3. Suppose x is a QR mod $2^{n-1}$ where n ≥ 4. Then there exist integers y, k such that

\[ y^2 = x + 2^{n-1}k \]

where y is odd. We search for an element of the form $y + 2^{n-2}b$ such that $(y + 2^{n-2}b)^2 \equiv x \bmod 2^n$. Then we must have

\[ y^2 + 2^{n-1}by + 2^{2n-4}b^2 = x + 2^{n-1}(k + by) + 2^{2n-4}b^2 \equiv x \bmod 2^n. \]

Since 2n − 4 ≥ n and y is odd, it suffices to pick b such that k and b have the same parity.

4.7 Exercises

1. Show that if p is a prime with $p \equiv 3 \bmod 4$ and if p′ = 2p + 1 is a prime, then $2^p \equiv 1 \bmod p'$. Deduce that $2^{251} - 1$ is not a Mersenne prime.

2. Show that if $p \equiv 1 \bmod 4$, then

\[ \sum_{a \in \mathrm{QR}} a = \frac{1}{4}p(p-1) = \sum_{a \in \mathrm{QNR}} a. \]

Give a counterexample when $p \equiv 3 \bmod 4$.


3. Let p be a prime of the form $2^n + 1$. Show that a is a quadratic non-residue mod p if and only if a is a primitive root mod p.

4. By considering integers of the form $n^2 + 4$, show that there are infinitely many primes congruent to 5 mod 8. By considering $n^2 + 2$ and $n^2 - 2$, show further that there are infinitely many primes congruent to 3 or 7 mod 8.

5. Let $f(x) = ax^2 + bx + c$ where a, b, c are integers, and let p be an odd prime which does not divide a. Prove that the number of solutions of the congruence $f(x) \equiv 0 \bmod p$ is $1 + \left(\frac{d}{p}\right)$ where $d = b^2 - 4ac$.

6. Show that an integer a is a square if and only if the congruence $x^2 \equiv a \bmod p$ is soluble for every prime p.

7. Let $f(x) = ax^2 + bx + c$ where a, b, c are integers. Let p be an odd prime which does not divide a. Further let $d = b^2 - 4ac$. Show that if $p \nmid d$ then

\[ \sum_{x=1}^{p} \left(\frac{f(x)}{p}\right) = -\left(\frac{a}{p}\right). \]

Evaluate the sum when $p \mid d$.

8. Let p be a prime with $p \equiv 3 \bmod 8$. Show that

\[ \sum_{a=1}^{p-1} a\left(\frac{a}{p}\right) = \sum_{a=1}^{(p-1)/2} (2a - p)\left(\frac{a}{p}\right) = \sum_{a=1}^{(p-1)/2} (p - 4a)\left(\frac{a}{p}\right). \]

Deduce that if p > 3 then

\[ \sum_{a=1}^{(p-1)/2} \left(\frac{a}{p}\right) \equiv 0 \bmod 3. \]

9. Let p be an odd prime and $\zeta = e^{2\pi i/p}$. (i) Let

\[ \tau = \sum_{a=1}^{p-1} \left(\frac{a}{p}\right)\zeta^a. \]

Show that $\tau^2 = p'$ where $p' = (-1)^{(p-1)/2}p$.

(ii) Let q be an odd prime with q ≠ p. Show that $\left(\frac{p'}{q}\right) = 1$ if and only if

\[ \tau^{q-1} \equiv 1 \bmod q. \]

(iii) Show that $\tau^{q-1} \equiv \left(\frac{q}{p}\right) \bmod q$. Hence give an alternative proof of the law of quadratic reciprocity.

10. Prove the following general version of Hensel’s lemma. Suppose f(x) is a polynomial with integer coefficients and m, k are positive integers such that m ≤ k. If r is an integer such that

\[ f(r) \equiv 0 \bmod p^k \quad \text{and} \quad f'(r) \not\equiv 0 \bmod p \]

then (by considering the Taylor expansion) there exists an integer s such that

\[ f(s) \equiv 0 \bmod p^{k+m} \quad \text{and} \quad r \equiv s \bmod p^k. \]

In particular, setting $f(x) = x^2 - a$ and applying this repeatedly gives the Hensel’s lemma of the last section.


5 Binary quadratic forms

5.1 Sum of two squares

Which positive integers can be written as a sum of two squares? By studying the Gaussian integers $\mathbb{Z}[i]$, we know that a prime p is a sum of two squares if and only if $p \equiv 1 \bmod 4$ or p = 2. What about composite numbers? $21 \equiv 1 \bmod 4$, but it cannot be written as a sum of two squares. So we need a further condition for composite numbers.

Theorem 5.1. Let n be a positive integer. Then n is the sum of two squares if and only if every prime number $p \mid n$ with $p \equiv 3 \bmod 4$ divides n to an even power.

Proof. Suppose $n = x^2 + y^2$ for some x, y and $p \mid n$ where $p \equiv 3 \bmod 4$. Then $x^2 \equiv -y^2 \bmod p$. Since −1 is not a square mod p, we must have

\[ x \equiv y \equiv 0 \bmod p. \]

This means $p \mid x, y$ and so $p^2 \mid x^2 + y^2$. Then $n/p^2$ is a sum of two squares, and we repeat the above until $p \nmid n$. Therefore p divides n to an even power.

Conversely, let $n = mb^2$ where m is square-free. It suffices to prove that m can be written as a sum of two squares. If $p \mid n$ where $p \equiv 3 \bmod 4$ then p divides n to an even power. Therefore $p \nmid m$, and so if $p \mid m$ then p = 2 or $p \equiv 1 \bmod 4$. Each such p can be written as a sum of two squares, and since

\[ (x^2 + y^2)(x'^2 + y'^2) = (xx' + yy')^2 + (xy' - yx')^2, \]

m can be written as a sum of two squares.
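
Theorem 5.1 can be compared with a direct search for small n. The sketch below is illustrative; both function names are hypothetical, and 0 is allowed as one of the squares, as in the theorem.

```python
from math import isqrt

def is_sum_of_two_squares(n):
    """Search directly for x, y >= 0 with x^2 + y^2 = n."""
    for x in range(isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            return True
    return False

def satisfies_criterion(n):
    """Check that every prime p = 3 mod 4 divides n to an even power."""
    p = 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if p % 4 == 3 and exponent % 2 == 1:
            return False
        p += 1
    # Any remaining factor is a prime dividing the original n exactly once.
    return not (n > 1 and n % 4 == 3)
```

For example both functions return `False` for n = 21 and `True` for n = 45 = 36 + 9.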

5.2 Definition and equivalence

Definition 5.2. f(x, y) is called a binary quadratic form if

f(x, y) = ax2 + bxy + cy2, where a, b, c ∈ Z.

The discriminant of f is $d = b^2 - 4ac$. An integer n is represented by f if there exist integers x and y such that f(x, y) = n. An integer n is properly represented by f if there exist integers x and y with (x, y) = 1 such that f(x, y) = n.

Remark 5.3. The discriminant d of a binary quadratic form is congruent to either 0 or 1 mod 4, and d has the same parity as b.

Definition 5.4. The forms $x^2 - \frac{1}{4}dy^2$ for $d \equiv 0 \bmod 4$ and $x^2 + xy + \frac{1}{4}(1-d)y^2$ for $d \equiv 1 \bmod 4$ are called the principal forms with discriminant d. Therefore for each $d \equiv 0, 1 \bmod 4$ there exists a binary quadratic form with discriminant d.

Note that $4af(x, y) = (2ax + by)^2 - dy^2$, and so if d < 0 the non-zero values taken by f all have the same sign as a, while if d > 0 (and a ≠ 0) then f takes values of both signs. Indeed $4af(1, 0) = 4a^2 > 0$ and $4af(b, -2a) = -4a^2d < 0$.

Definition 5.5. A binary quadratic form is called positive definite if a > 0, d < 0 and negative definite if a < 0, d < 0. It is called indefinite if d > 0.

We will mainly focus on positive definite binary quadratic forms in this chapter.

Remark 5.6. We can write f as (a, b, c), or in matrix notation as

\[ f(x, y) = \begin{pmatrix} x & y \end{pmatrix} M_f \begin{pmatrix} x \\ y \end{pmatrix}, \qquad M_f = \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix}. \]

Then the discriminant d is $-4 \det M_f$.


When do two binary quadratic forms f and g represent the same numbers?

Definition 5.7. A unimodular substitution is one of the form

X = αx+ γy, Y = βx+ δy

where α, β, γ, δ ∈ Z and αδ − βγ = 1.

Definition 5.8. Two binary quadratic forms f, g are equivalent (written f ∼ g) if they are related by a unimodular substitution. It is not hard to check that ∼ is an equivalence relation.

Lemma 5.9. In matrix notation, if $f(x, y) = ax^2 + bxy + cy^2$ and

\[ g(x, y) = f(\alpha x + \gamma y, \beta x + \delta y) = Ax^2 + Bxy + Cy^2 \]

where $\alpha, \beta, \gamma, \delta \in \mathbb{Z}$ and $\alpha\delta - \beta\gamma = 1$, then

\[ \begin{pmatrix} A & B/2 \\ B/2 & C \end{pmatrix} = T^t \begin{pmatrix} a & b/2 \\ b/2 & c \end{pmatrix} T \]

where $T = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \in SL_2(\mathbb{Z})$.

Proof. Write $\begin{pmatrix} X \\ Y \end{pmatrix} = T \begin{pmatrix} x \\ y \end{pmatrix}$ and substitute into the matrix expression for f.

Corollary 5.10. Equivalent binary quadratic forms have the same discriminant and they represent the same set of integers.

Proof. Let $M_f$ and $M_g$ be the matrices for f and g respectively. Then $M_g = U^t M_f U$ for some $U \in SL_2(\mathbb{Z})$. Therefore $\det M_g = \det M_f$. Since U is invertible over $\mathbb{Z}$, $M_f = (U^t)^{-1} M_g U^{-1}$, so f and g represent the same set of integers.

Remark 5.11. The converse of the above corollary is not true. For example, f = (1, 0, 6) and g = (2, 0, 3) have the same discriminant but they are not equivalent. Indeed, f represents 1 but $g(x, y) = 2x^2 + 3y^2 \ge 2$ unless x = y = 0.

5.3 Reduction

We have an equivalence relation for binary quadratic forms. It will help to study the equivalence classes if we specify a ‘special’ form in each equivalence class. We begin with several examples.

Example 5.12. Let $f(x, y) = ax^2 + bxy + cy^2$. Then

\[ g(x, y) = f(x + ty, y) = a(x + ty)^2 + b(x + ty)y + cy^2 = ax^2 + (2at + b)xy + (at^2 + bt + c)y^2. \]

It is clear that f ∼ g, and by picking t = ±1 we have

\[ (a, b, c) \sim (a, b \pm 2a, a \pm b + c). \]

We can also try the unimodular substitution

\[ g(x, y) = f(y, -x) = ay^2 - bxy + cx^2 \]

and so (a, b, c) ∼ (c, −b, a).

The above example shows the following.


Lemma 5.13. Every equivalence class of binary quadratic forms contains a binary quadratic form (a, b, c) with |b| ≤ a and a binary quadratic form with a ≤ c.

Definition 5.14. A positive definite binary quadratic form is reduced if either −a < b ≤ a < c or 0 ≤ b ≤ a = c.

For example, (2, ±1, 3) are both reduced. (2, 1, 2) is reduced but (2, −1, 2) is not reduced.

Lemma 5.15. Every positive definite binary quadratic form is equivalent to a reduced form.

Proof. Define the unimodular operations

\[ S : (a, b, c) \mapsto (c, -b, a), \qquad T_{\pm} : (a, b, c) \mapsto (a, b \pm 2a, a \pm b + c). \]

If a > c, apply S to decrease a while leaving |b| fixed. If a ≤ c and |b| > a, apply $T_{\pm}$, with the sign opposite to that of b, to decrease |b| while leaving a fixed. Repeat these steps. Since a and |b| are non-negative integers, the process stops, and we eventually obtain a form with

\[ |b| \le a \le c. \]

If b = −a, then apply $T_+$ to replace (a, −a, c) by (a, a, c). If a = c then apply S (if necessary) to ensure b ≥ 0.
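
The reduction procedure in the proof can be written out as a loop over the operations S and $T_\pm$. The following is a sketch for positive definite forms only; the function name `reduce_form` is hypothetical.

```python
def reduce_form(a, b, c):
    """Return the reduced form equivalent to the positive definite form (a, b, c)."""
    if a <= 0 or b * b - 4 * a * c >= 0:
        raise ValueError("form must be positive definite")
    while True:
        if a > c:
            a, b, c = c, -b, a                    # S: decreases a
        elif abs(b) > a:
            if b > 0:
                a, b, c = a, b - 2 * a, a - b + c   # T_-: decreases |b|
            else:
                a, b, c = a, b + 2 * a, a + b + c   # T_+: decreases |b|
        elif b == -a:
            b = a                                 # T_+ sends (a, -a, c) to (a, a, c)
        elif a == c and b < 0:
            b = -b                                # S sends (a, b, a) to (a, -b, a)
        else:
            return a, b, c
```

For example `reduce_form(13, 10, 2)`, a form of discriminant −4, passes through (2, −10, 13), (2, −6, 5), (2, −2, 1) and (1, 2, 2) before reaching (1, 0, 1).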

The following lemma gives a useful algorithm to find the reduced forms of discriminant d.

Lemma 5.16. Let f = (a, b, c) be a reduced binary quadratic form of discriminant d. Then

\[ |b| \le a \le \sqrt{\frac{|d|}{3}} \]

and $b \equiv d \bmod 2$.

Proof. We have |b| ≤ a ≤ c and so

\[ d = b^2 - 4ac \le ac - 4ac \le -3a^2. \]

Therefore $a^2 \le \frac{|d|}{3}$. Since $d = b^2 - 4ac$, b and d have the same parity.

Example 5.17. If d = −4, then $a \le \sqrt{4/3}$ and so a = 1. Since b is even and |b| ≤ a, we get b = 0, and then c = 1. So the only reduced form of discriminant −4 is $x^2 + y^2$.
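
The bound in Lemma 5.16 makes the list of reduced forms of a given negative discriminant finite and easy to enumerate. Here is a minimal sketch of that search; the function name `reduced_forms` is hypothetical.

```python
def reduced_forms(d):
    """List all reduced positive definite forms (a, b, c) with b^2 - 4ac = d < 0."""
    if d >= 0 or d % 4 not in (0, 1):
        raise ValueError("d must be a negative integer congruent to 0 or 1 mod 4")
    forms = []
    a = 1
    while 3 * a * a <= -d:                  # a <= sqrt(|d|/3)
        for b in range(-a + 1, a + 1):      # -a < b <= a
            if (b - d) % 2 != 0:            # b and d have the same parity
                continue
            numerator = b * b - d           # equals 4ac
            if numerator % (4 * a) != 0:
                continue
            c = numerator // (4 * a)
            if c > a or (c == a and b >= 0):
                forms.append((a, b, c))
        a += 1
    return forms
```

For d = −4 this returns only (1, 0, 1), in agreement with Example 5.17.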

Reduced forms can be used to study integers represented by binary quadratic forms.

Corollary 5.18. A prime p is a sum of two squares if and only if p = 2 or p ≡ 1 mod 4.

Proof. If p = 2 then $p = 1^2 + 1^2$. If $p \equiv 1 \bmod 4$ then −1 is a square mod p, and so there exist u, v such that $u^2 = -1 + pv$. Let f = (p, 2u, v); then f has discriminant $4u^2 - 4pv = -4$. Since there is only one reduced form of discriminant −4, f ∼ (1, 0, 1) and so they represent the same set of integers. In particular, p is represented by f and hence p is represented by (1, 0, 1). Conversely, a square is congruent to 0 or 1 mod 4, so an odd sum of two squares is congruent to 1 mod 4.

Here is a useful fact about reduced forms.

Lemma 5.19. The smallest three positive integers properly represented by a reduced form f = (a, b, c) are a, c, a + c − |b|.


Proof. If x, y are non-zero integers with |x| ≥ |y| then

\[ f(x, y) \ge |x|(a|x| - |b||y|) + c|y|^2 \ge |x|^2(a - |b|) + c|y|^2 \ge a - |b| + c. \]

Similarly, if |y| ≥ |x| > 0 then

\[ f(x, y) \ge a|x|^2 + |y|(c|y| - |b||x|) \ge a|x|^2 + (c - |b|)|y|^2 \ge a - |b| + c. \]

The only proper representations with xy = 0 are f(±1, 0) = a and f(0, ±1) = c, and a ≤ c ≤ a + c − |b|.

Remark 5.20. Note these are the smallest integers properly represented by f. For example, if $f = x^2 + 5y^2$ then the smallest three positive integers properly represented by f are 1, 5, 6, but f also represents 4 (not properly).

Can reduced forms be equivalent?

Theorem 5.21. Let f, g be two distinct reduced forms. Then f is not equivalent to g.

Proof. Let

\[ f = (a, b, c), \qquad g = (a', b', c') \]

be reduced forms. Suppose f, g are equivalent. Then f, g properly represent the same set of integers. In particular, the smallest three positive integers properly represented by f and g are the same. So a = a′, c = c′ and a + c − |b| = a′ + c′ − |b′|. So b = ±b′. It remains to check that b = b′.

If b = a then |b′| = a′ and so b′ = ±a′. But g is reduced, so b′ = a′ = b. If a = c then a′ = c′, so b, b′ ≥ 0 and hence b = b′. So we may assume that −a < b < a < c. Then by the previous lemma, for all non-zero integers x, y,

\[ f(x, y) \ge a - |b| + c > c > a. \]

So f(x, y) = a if and only if (x, y) = (±1, 0), and g(x, y) = a′ if and only if (x, y) = (±1, 0). Since f ∼ g, we have g(x, y) = f(X, Y) = f(αx + γy, βx + δy). Therefore

\[ a = a' = g(\pm 1, 0) = f(\pm\alpha, \pm\beta) \]

and so β = 0 and α = ±1. Since αδ − βγ = 1, δ = ±1 and the sign of δ is the same as the sign of α. Similarly f(x, y) = c if and only if (x, y) = (0, ±1), and c = c′ = g(0, ±1) = f(±γ, ±δ), so γ = 0. So the only possible substitutions are g(x, y) = f(x, y) or g(x, y) = f(−x, −y). Therefore b = b′.

Therefore each positive definite binary quadratic form is equivalent to a unique reduced form.

Definition 5.22. Let d < 0 be an integer. The class number of d (written h(d)) is the number of reduced forms of discriminant d.

Definition 5.23. An integer $d \equiv 0, 1 \bmod 4$ is called a fundamental discriminant if it is not of the form $d = k^2d'$ for some integer k > 1 and $d' \equiv 0, 1 \bmod 4$.

5.4 Representations by binary quadratic forms

We study the set of integers represented by a binary quadratic form.

Lemma 5.24. Let f = (a, b, c) and n ∈ Z. n is properly represented by f if and only if f is equivalentto a binary quadratic form with first coefficient n.


Proof. Suppose f ∼ (n, p, q) for some p, q. Then n is properly represented by (n, p, q), at (x, y) = (1, 0), and so properly represented by f. Conversely, if f(α, β) = n where (α, β) = 1, then there exist $\gamma, \delta \in \mathbb{Z}$ such that αδ − βγ = 1. Let

\[ g(x, y) = f(\alpha x + \gamma y, \beta x + \delta y) \]

and so g(1, 0) = f(α, β) = n. Therefore g = (n, p, q) for some p, q, and f ∼ g.

Theorem 5.25. Let n be a positive integer and d < 0 a discriminant. Then n is properly represented by a binary quadratic form of discriminant d if and only if the congruence

\[ x^2 \equiv d \bmod 4n \]

is soluble.

Proof. Suppose n is properly represented by (a, b, c) with $b^2 - 4ac = d$. By the previous lemma, (a, b, c) ∼ (n, p, q) for some p, q, and since equivalent forms have the same discriminant, $p^2 - 4qn = d$. Therefore,

\[ p^2 \equiv d \bmod 4n. \]

Conversely, if there exists x such that $x^2 \equiv d \bmod 4n$, then there exists y such that $x^2 = d + 4ny$. Take f = (n, x, y); then f has discriminant d, and n is properly represented by f.

5.5 Exercises

1. Determine all odd positive integers that can be expressed in the form (i) $x^2 + 2y^2$ (ii) $x^2 + 3y^2$ (iii) $x^2 - y^2$.

2. Which primes are represented by $x^2 + 5y^2$? Which primes are represented by $2x^2 + 2xy + 3y^2$?

3. Prove that n and 2n, where n is any positive integer, have the same number of representations as the sum of two squares.

4. Is there a positive definite binary quadratic form that represents 2 and the primes congruent to 1 or 3 modulo 8, but no other primes? Is there such a form representing the primes congruent to 1 modulo 4 only?

5. (i) Let p be an odd prime. Show that there exist x, y such that

\[ x^2 \equiv -1 - y^2 \bmod p. \]

Hence show that there exist $x_0, y_0, u_0, v_0$ and k such that

\[ x_0^2 + y_0^2 + u_0^2 + v_0^2 = kp. \]

(ii) If $x_0^2 + y_0^2 + u_0^2 + v_0^2 = 2mp$, then show that there exist $x_1, y_1, u_1, v_1$ such that

\[ x_1^2 + y_1^2 + u_1^2 + v_1^2 = mp. \]

(iii) If $x_0^2 + y_0^2 + u_0^2 + v_0^2 = kp$ where k is odd, then show that there exist $x_1, y_1, u_1, v_1$ such that

\[ x_1^2 + y_1^2 + u_1^2 + v_1^2 = k'p, \qquad \text{for some } k' < k. \]

You may use the identity

\[ (x^2 + y^2 + z^2 + w^2)(x'^2 + y'^2 + z'^2 + w'^2) = (xx' + yy' + zz' + ww')^2 + (xy' - yx' + wz' - zw')^2 + (xz' - zx' + yw' - wy')^2 + (xw' - wx' + zy' - yz')^2. \]

(iv) Hence show that every prime can be written as a sum of four squares.

(v) Finally show that every positive integer can be written as a sum of four squares.


6 Distribution of primes

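6.1 The sum $\sum_p p^{-1}$ and the product $\prod_p (1 - p^{-1})^{-1}$

Theorem 6.1. The sum $\sum_p p^{-1}$ and the product $\prod_p (1 - p^{-1})^{-1}$ both diverge.

Proof. Let $S(x) = \sum_{p \le x} p^{-1}$ and $P(x) = \prod_{p \le x} (1 - p^{-1})^{-1}$ for x ≥ 2. Every positive integer n ≤ x is a product of primes p ≤ x, so

\[ P(x) = \prod_{p \le x} (1 - p^{-1})^{-1} = \prod_{p \le x} \sum_{i=0}^{\infty} p^{-i} \ge \sum_{n=1}^{[x]} n^{-1} \to \infty \]

as x → ∞.

Now using the Taylor expansion of log(1 − t) for |t| < 1, we have

\[ \log P(x) = -\sum_{p \le x} \log(1 - p^{-1}) = \sum_{p \le x} \sum_{i=1}^{\infty} \frac{p^{-i}}{i} = S(x) + \sum_{p \le x} \sum_{i=2}^{\infty} \frac{p^{-i}}{i}. \]

For any p ≤ x,

\[ \sum_{i=2}^{\infty} \frac{p^{-i}}{i} \le \frac{1}{2} \sum_{i=2}^{\infty} p^{-i} = \frac{1}{2} \cdot \frac{1}{p(p-1)}. \]

Therefore,

\[ 0 \le \log P(x) - S(x) \le \sum_{p \le x} \frac{1}{2} \cdot \frac{1}{p(p-1)} \le \sum_{n=2}^{\infty} \frac{1}{2} \cdot \frac{1}{n(n-1)} = \frac{1}{2} \]

and so S(x) → ∞ as x → ∞.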

In particular, this gives another proof of the following statement.

Corollary 6.2. There are infinitely many primes.

Moreover,

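Corollary 6.3. For each x ≥ 2, we have

\[ P(x) \ge \log x, \qquad S(x) \ge \log\log x - \frac{1}{2}. \]

Proof. We have seen that $P(x) \ge \sum_{n=1}^{[x]} \frac{1}{n}$, and by integral approximation we have

\[ P(x) \ge \sum_{n=1}^{[x]} \frac{1}{n} > \int_1^{[x]+1} \frac{1}{t}\,dt = \log([x] + 1) > \log x. \]

For each 0 < u < 1 we have

\[ -\log(1 - u) - u = \frac{u^2}{2}\left(1 + \frac{2u}{3} + \frac{2u^2}{4} + \cdots\right) < \frac{u^2}{2}(1 + u + u^2 + \cdots) = \frac{u^2}{2(1 - u)}. \]

Let $u = \frac{1}{p}$ and sum over all primes p ≤ x. We conclude that

\[ 0 \le \log P(x) - S(x) \le \frac{1}{2} \sum_{p \le x} \frac{1}{p(p-1)} < \frac{1}{2} \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = \frac{1}{2} \]

and so $S(x) \ge \log P(x) - \frac{1}{2} \ge \log\log x - \frac{1}{2}$.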
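
The bounds of Corollary 6.3 can be checked numerically for moderate x. This is a sketch in floating-point arithmetic, so it illustrates the inequalities rather than proving them; the names `primes_up_to` and `sum_and_product` are hypothetical.

```python
from math import isqrt, log

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n."""
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def sum_and_product(x):
    """Return (S(x), P(x)) for an integer x >= 2."""
    s, prod = 0.0, 1.0
    for p in primes_up_to(x):
        s += 1.0 / p
        prod *= 1.0 / (1.0 - 1.0 / p)
    return s, prod
```

For a given x one can then compare `prod` with `log(x)` and `s` with `log(log(x)) - 0.5`.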

We will give explicit approximations of $\sum_{p \le x} p^{-1}$ later.


6.2 Legendre’s formula

Recall that π(x) is the number of primes less than or equal to x. We give a method to compute π(x)effectively.

Definition 6.4. Let pn be the nth prime. For x ≥ 2, r ≥ 1, define Nr(x) to be the size of

{n : 1 ≤ n ≤ x, pi - n for all i ≤ r}.

Lemma 6.5. For each x ≥ 2, r ≥ 1, we have

Nr(x) = [x]−r∑i=1

[x/pi] +∑i 6=j≤r

[x/pipj] + · · ·+ (−1)r[x/p1 · · · pr].

Proof. Apply inclusion-exclusion formula to compute [x]−Nr(x).

Corollary 6.6. Let $r = \pi(\sqrt{x})$. Then
\[ N_r(x) = \pi(x) - \pi(\sqrt{x}) + 1 \]
and hence
\[ 1 + \pi(x) = \pi(\sqrt{x}) + [x] - \sum_{i=1}^{r} [x/p_i] + \sum_{1 \le i < j \le r} [x/p_i p_j] - \cdots + (-1)^r [x/p_1 \cdots p_r]. \]

Proof. Each composite number in $[2, x]$ must be divisible by one of the primes $p_1, \ldots, p_r$, where $r = \pi(\sqrt{x})$. Therefore, the set
\[ \{n : 1 \le n \le x,\ p_i \nmid n \text{ for all } i \le r\} \]
consists of 1 and the primes bigger than $\sqrt{x}$. So
\[ N_r(x) + \pi(\sqrt{x}) = \pi(x) + 1. \]
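Corollary 6.6 translates directly into a (very inefficient, but exact) algorithm for $\pi(x)$. The sketch below is a minimal illustration, assuming an integer $x \ge 2$; the function names `small_primes` and `legendre_pi` are our own. The inclusion-exclusion sum has $2^r$ terms, so this is only practical for small $x$.

```python
from itertools import combinations
from math import isqrt, prod

def small_primes(n):
    """Primes <= n by trial division (enough for the few primes up to sqrt(x))."""
    return [p for p in range(2, n + 1)
            if all(p % q for q in range(2, isqrt(p) + 1))]

def legendre_pi(x):
    """Compute pi(x) for an integer x >= 2 via Legendre's formula (Corollary 6.6)."""
    ps = small_primes(isqrt(x))      # p_1, ..., p_r with r = pi(sqrt(x))
    r = len(ps)
    N = 0                            # N_r(x), built by inclusion-exclusion
    for k in range(r + 1):
        for subset in combinations(ps, k):
            # prod(()) == 1, so k = 0 contributes the leading term [x].
            N += (-1) ** k * (x // prod(subset))
    return N + r - 1                 # pi(x) = N_r(x) + pi(sqrt(x)) - 1

print(legendre_pi(100))
```

For $x = 100$ the primes up to $\sqrt{x}$ are 2, 3, 5, 7, and the sum is $100 - 117 + 45 - 6 + 0 = 22$, giving $\pi(100) = 22 + 4 - 1 = 25$.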

6.3 Bertrand’s postulate

Lemma 6.7. Let $C = \binom{2n}{n}$. Then
\[ 2n \log 2 - \log(2n+1) \le \log C \le 2n \log 2. \]

Proof. Consider $2^{2n} = (1+1)^{2n}$; the largest term in its binomial expansion is $C = \binom{2n}{n}$. Since there are $2n+1$ terms in the expansion,
\[ \frac{2^{2n}}{2n+1} \le C \le 2^{2n}. \]
Taking logarithms gives the required inequalities.

Lemma 6.8. Let $C = \binom{2n}{n}$. Then
\[ (\pi(2n) - \pi(n)) \log n \le \log C \le \pi(2n) \log(2n). \]

Proof. Let $r(p)$ be the integer such that
\[ p^{r(p)} \le 2n < p^{r(p)+1}. \]
Recall from 2.2 that the exponent to which any prime $p$ divides $n!$ is $\sum_{j=1}^{\infty} [n/p^j]$. So the exponent to which $p$ divides $C$ is
\[ \sum_{j=1}^{\infty} \bigl([2n/p^j] - 2[n/p^j]\bigr) = \sum_{j=1}^{r(p)} \bigl([2n/p^j] - 2[n/p^j]\bigr) \le \sum_{j=1}^{r(p)} 1 = r(p) \]
because $[2n/p^j] - 2[n/p^j] = 0$ or $1$ for each $j$. Indeed, if the fractional part of $n/p^j$ is less than $1/2$ then $[2n/p^j] - 2[n/p^j] = 0$, and $[2n/p^j] - 2[n/p^j] = 1$ otherwise. Since every prime dividing $C$ is at most $2n$,
\[ C = \binom{2n}{n} \le \prod_{p \le 2n} p^{r(p)} \le \prod_{p \le 2n} 2n = (2n)^{\pi(2n)} \]
and so $\log C \le \pi(2n) \log(2n)$. It is also clear that $p \mid C$ for each prime $p$ with $n < p \le 2n$, so the product of these primes divides $C$ and
\[ n^{\pi(2n) - \pi(n)} = \prod_{n < p \le 2n} n \le \prod_{n < p \le 2n} p \le \binom{2n}{n} = C, \]
and so $(\pi(2n) - \pi(n)) \log n \le \log C$.

Theorem 6.9 (Chebychev's estimate). There exist constants $a, b > 0$ such that for all $x \ge 2$, we have
\[ \frac{ax}{\log x} < \pi(x) < \frac{bx}{\log x}. \]

Proof. By the previous two lemmas, we conclude that
\[ \pi(2n) \ge \frac{2n \log 2 - \log(2n+1)}{\log(2n)} \]
and
\[ \pi(2n) - \pi(n) \le \frac{2n \log 2}{\log n}. \]
The first inequality implies that there exists $a > 0$ such that
\[ \pi(x) > \frac{ax}{\log x} \]
and the second inequality implies that there exists $c > 0$ such that
\[ \pi(2x) - \pi(x) < \frac{cx}{\log x}. \]
Since $\pi(x) = \sum_{j=1}^{\infty} [\pi(x/2^{j-1}) - \pi(x/2^j)]$, we estimate $\pi(x/2^{j-1}) - \pi(x/2^j)$ for each $j$. Replacing $x$ by $x/2^j$ we conclude that for each $j$,
\[ \pi(x/2^{j-1}) - \pi(x/2^j) \le \frac{cx}{2^j \log(x/2^j)}. \]
If $2^j \le x^{1/2}$ then $x/2^j \ge x^{1/2}$, so $\log(x/2^j) \ge \frac{1}{2} \log x$ and
\[ \pi(x/2^{j-1}) - \pi(x/2^j) \le \frac{cx}{2^{j-1} \log x}. \]
If $2^j > x^{1/2}$ then
\[ \pi(x/2^{j-1}) - \pi(x/2^j) < \pi(x/2^{j-1}) < x/2^{j-1} < 2\sqrt{x}. \]
Since there are at most $\frac{\log x}{\log 2}$ non-zero terms, we conclude that
\[ \pi(x) \le \sum_{j=1}^{\log x/\log 2} [\pi(x/2^{j-1}) - \pi(x/2^j)] < \frac{cx}{\log x} \sum_{j=1}^{\infty} \frac{1}{2^{j-1}} + 2\sqrt{x}\,\frac{\log x}{\log 2} < \frac{bx}{\log x} \]
for some $b > 0$.

Remark 6.10. In fact Chebychev showed that one can take $a = 0.92129\ldots$ and $b = 1.1055\ldots$ if $x \ge 30$.

Corollary 6.11 (Bertrand's postulate). For every integer $x \ge 1$, there exists a prime $p$ such that $x < p \le 2x$.

Proof. By the above remark, we have
\[ \pi(x) < \frac{bx}{\log x} < \frac{2ax}{\log 2x} < \pi(2x) \]
for $a = 0.92129\ldots$, $b = 1.1055\ldots$ and $x \ge 30$. Therefore, $\pi(2x) - \pi(x) \ge 1$ and so there exists a prime $p$ with $x < p \le 2x$. It is easy to check that the statement holds for $1 \le x < 30$.
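The finite check left to the reader in the proof can be done mechanically. The following sketch (with hypothetical helper names `is_prime` and `bertrand_witness` chosen for this illustration) finds, for each $x < 30$, the smallest prime $p$ with $x < p \le 2x$.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def bertrand_witness(x):
    """Return the smallest prime p with x < p <= 2x, or None if there is none."""
    for p in range(x + 1, 2 * x + 1):
        if is_prime(p):
            return p
    return None

# The range of x not covered by the analytic argument in the proof.
for x in range(1, 30):
    assert bertrand_witness(x) is not None
    print(x, bertrand_witness(x))
```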

6.4 Partial summation formula

We give the following theorem in real analysis.

Theorem 6.12 (Partial summation formula). Let $a_1, a_2, \ldots$ be any real sequence and let $s(x) = \sum_{n \le x} a_n$. Further, let $f(x)$ be a real function with continuous derivative $f'(x)$ for all real $x > 0$. Then
\[ \sum_{n \le x} a_n f(n) = s(x) f(x) - \int_1^x s(u) f'(u)\,du. \]

Proof. By convention we write $s(0) = 0$. Write $a_n = s(n) - s(n-1)$, so we have
\begin{align*}
\sum_{n \le x} a_n f(n) &= \sum_{n \le x} (s(n) - s(n-1)) f(n) = \sum_{n \le x} s(n) f(n) - \sum_{n \le x} s(n-1) f(n) \\
&= \sum_{n \le x} s(n) f(n) - \sum_{n \le x-1} s(n) f(n+1) \\
&= \sum_{n \le x-1} s(n) (f(n) - f(n+1)) + s(x) f([x]),
\end{align*}
using $s([x]) = s(x)$. Further, since $s$ is constant on each interval $[n, n+1)$,
\[ \sum_{n=1}^{[x]-1} s(n) (f(n) - f(n+1)) = -\sum_{n=1}^{[x]-1} \int_n^{n+1} s(y) f'(y)\,dy = -\int_1^{[x]} s(y) f'(y)\,dy. \]
Finally, write $s(x) f([x]) = s(x) f(x) - \int_{[x]}^{x} s(y) f'(y)\,dy$ and so the result follows.

6.5 Mertens' results

We now give explicit approximations of $\sum_{p \le x} p^{-1}$ and $\prod_{p \le x} (1-p^{-1})^{-1}$.

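Theorem 6.13. We have
\[ \sum_{p \le x} \frac{\log p}{p} = \log x + O(1). \]

Proof. We have
\[ \sum_{n \le x} \log n = \log [x]! = \sum_{p \le x} \sum_{j=1}^{\infty} [x/p^j] \log p \]
because the exponent to which a prime $p$ divides $[x]!$ is $\sum_{j=1}^{\infty} [x/p^j]$ by Lemma 2.2. The contribution from the term $j = 1$ is
\[ \sum_{p \le x} [x/p] \log p = \sum_{p \le x} (x/p) \log p + O(\pi(x) \log x). \]
Since $\pi(x) < bx/\log x$ by Chebychev's estimate, $O(\pi(x) \log x) = O(x)$. For the terms with $j \ge 2$, since
\[ \sum_{j=2}^{\infty} \frac{1}{p^j} = \frac{1}{p(p-1)}, \]
we have
\[ \sum_{p \le x} \sum_{j=2}^{\infty} [x/p^j] \log p \le x \sum_{p \le x} \frac{\log p}{p(p-1)} \le x \sum_{2 \le n \le x} \frac{\log n}{n(n-1)} \le x \sum_{n=2}^{\infty} \frac{\log n}{n(n-1)}, \]
which is $O(x)$. Therefore,
\[ \sum_{n \le x} \log n = x \sum_{p \le x} \frac{\log p}{p} + O(x). \]
Finally, it's an easy exercise (see Exercises) that
\[ \sum_{n \le x} \log n = x \log x + O(x) \]
and so the result follows on dividing by $x$.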

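Theorem 6.14. For some constant $c$ we have
\[ S(x) = \sum_{p \le x} \frac{1}{p} = \log\log x + c + O\Bigl(\frac{1}{\log x}\Bigr). \]

Proof. Apply the partial summation formula with $f(x) = \frac{1}{\log x}$ and
\[ a_n = \frac{\log n}{n} \text{ if } n \text{ is a prime } p, \text{ and } a_n = 0 \text{ otherwise.} \]
Then $\sum_{n \le x} a_n f(n) = \sum_{p \le x} \frac{1}{p}$. The function $f$ has a continuous derivative only for $x > 1$, but $a_1 = 0$ so the theorem still applies in the form
\[ \sum_{2 \le n \le x} a_n f(n) = s(x) f(x) - \int_2^x s(y) f'(y)\,dy. \]
By the previous theorem, we have
\[ s(x) = \sum_{p \le x} \frac{\log p}{p} = \log x + r(x) \]
where $r(x) = O(1)$, and so $s(x) f(x) = 1 + O\bigl(\frac{1}{\log x}\bigr)$. Since $f'(y) = -\frac{1}{y (\log y)^2}$, we have
\[ -\int_2^x s(y) f'(y)\,dy = \int_2^x \frac{s(y)}{y (\log y)^2}\,dy = \int_2^x \frac{dy}{y \log y} + \int_2^x \frac{r(y)}{y (\log y)^2}\,dy. \]
Here $\int_2^x \frac{dy}{y \log y} = [\log\log y]_2^x = \log\log x - \log\log 2$ and
\[ \int_2^x \frac{r(y)}{y (\log y)^2}\,dy = \int_2^{\infty} \frac{r(y)}{y (\log y)^2}\,dy - \int_x^{\infty} \frac{r(y)}{y (\log y)^2}\,dy = c_0 + O\Bigl(\frac{1}{\log x}\Bigr), \]
where $c_0$ denotes the value of the convergent integral over $[2, \infty)$. The result follows by taking the constant to be $c = c_0 + 1 - \log\log 2$.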
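Theorems 6.13 and 6.14 can be observed numerically. The sketch below is an illustration only (the helper names are ours): it tabulates $\sum_{p \le x} \frac{\log p}{p} - \log x$, which should stay bounded, and $S(x) - \log\log x$, which should settle towards the constant $c$. The numerical value of $c$ (roughly 0.2615, usually called the Meissel-Mertens constant) is not derived in these notes and is quoted here as an outside fact.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def mertens_differences(x):
    """Return (sum log p / p - log x, sum 1/p - log log x) over primes p <= x."""
    ps = primes_up_to(x)
    first = sum(math.log(p) / p for p in ps) - math.log(x)
    second = sum(1.0 / p for p in ps) - math.log(math.log(x))
    return first, second

for x in (10**2, 10**3, 10**4, 10**5):
    d1, d2 = mertens_differences(x)
    print(x, round(d1, 4), round(d2, 4))
```

The second column converges slowly, in line with the $O(1/\log x)$ error term of Theorem 6.14.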


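Theorem 6.15. For some constant $b > 0$, we have
\[ \prod_{p \le x} (1-p^{-1})^{-1} = b \log x + O(1). \]

Proof. We have
\[ \prod_{p \le x} (1-p^{-1})^{-1} = \prod_{p \le x} \exp(-\log(1-p^{-1})) = \exp\Bigl(\sum_{p \le x} -\log(1-p^{-1})\Bigr). \]
We use
\[ -\sum_{p \le x} \log(1-p^{-1}) = \sum_{p \le x} \sum_{i=1}^{\infty} \frac{p^{-i}}{i} = S(x) + \sum_{p \le x} r(p) \]
where $r(p) = \sum_{i=2}^{\infty} \frac{p^{-i}}{i}$. Since
\[ r(p) \le \frac{1}{2} \sum_{i=2}^{\infty} p^{-i} = \frac{1}{2} \cdot \frac{1}{p(p-1)}, \]
we have
\[ \sum_{p > x} r(p) \le \sum_{n > x} r(n) \le \sum_{n > x} \frac{1}{n(n-1)} = O\Bigl(\frac{1}{x}\Bigr). \]
Therefore,
\[ \sum_{p \le x} r(p) = c' + O\Bigl(\frac{1}{x}\Bigr) \]
where $c' = \sum_p r(p)$ is a constant. Therefore, by the previous theorem,
\[ S(x) + \sum_{p \le x} r(p) = \log\log x + c + O(1/\log x) + c' + O(1/x) = \log\log x + \beta + O(1/\log x) \]
with $\beta = c + c'$. The theorem follows by taking exponentials and setting $b = e^{\beta} > 0$:
\[ P(x) = \exp\Bigl(S(x) + \sum_{p \le x} r(p)\Bigr) = e^{\beta} \log x \cdot \exp(O(1/\log x)), \]
together with the fact that $\exp(O(1/\log x)) = 1 + O(1/\log x)$.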

6.6 Riemann zeta function

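Recall from chapter 2 that the Riemann zeta function is $\zeta(s) = \sum_n n^{-s}$ where $s \in \mathbb{C}$. We again write $s = \sigma + it$ where $\sigma, t \in \mathbb{R}$.

Lemma 6.16. For $\sigma > 1$, the series $\sum_{n=1}^{\infty} n^{-s}$ converges absolutely.

Proof. $|n^{-s}| = |\exp(-s \log n)| = |\exp(-\sigma \log n - it \log n)|$. Since $|e^{i\theta}| = 1$ for all $\theta \in \mathbb{R}$, we get $|n^{-s}| = |\exp(-\sigma \log n)| = n^{-\sigma}$. Therefore
\[ \sum_{n=1}^{\infty} |n^{-s}| = \sum_{n=1}^{\infty} n^{-\sigma} < \infty. \]

Proposition 6.17 (Euler's product). For $\sigma > 1$, we have
\[ \zeta(s) = \prod_p (1-p^{-s})^{-1}. \]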


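Proof. Let $T(N, s) = \prod_{p \le N} (1-p^{-s})^{-1}$; we will show that $\zeta(s) - T(N, s) \to 0$ as $N \to \infty$ for any $s$ with $\sigma > 1$. We have
\[ T(N, s) = \prod_{p \le N} (1-p^{-s})^{-1} = \prod_{p \le N} \sum_{i=0}^{\infty} p^{-is}. \]
Expanding the product, $T(N, s)$ is the sum of $n^{-s}$ over all $n$ whose prime factors are at most $N$; in particular it contains every term with $n \le N$. So
\[ |\zeta(s) - T(N, s)| \le \sum_{n \ge N+1} |n^{-s}| = \sum_{n \ge N+1} n^{-\sigma} \le \int_N^{\infty} \frac{dy}{y^{\sigma}} = \frac{1}{\sigma - 1} N^{1-\sigma}. \]
Therefore $|\zeta(s) - T(N, s)| \to 0$ as $N \to \infty$ when $\sigma > 1$.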
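The error bound in this proof can be checked numerically for a real exponent. The sketch below is illustrative only: it takes $s = 2$, uses the value $\zeta(2) = \pi^2/6$ (Corollary 6.33), and verifies $|\zeta(2) - T(N, 2)| \le N^{1-\sigma}/(\sigma - 1) = 1/N$. The function names are our own.

```python
import math

def primes_up_to(n):
    """Return all primes <= n using the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if is_prime[i]:
            is_prime[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(n + 1) if is_prime[i]]

def euler_product(s, N):
    """T(N, s) = product over primes p <= N of (1 - p^(-s))^(-1), for real s > 1."""
    T = 1.0
    for p in primes_up_to(N):
        T /= 1.0 - p ** (-s)
    return T

zeta2 = math.pi ** 2 / 6
for N in (10, 100, 1000, 10**4):
    error = abs(zeta2 - euler_product(2, N))
    bound = N ** (1 - 2) / (2 - 1)   # N^(1 - sigma) / (sigma - 1) with sigma = 2
    assert error <= bound
    print(N, error, bound)
```

In practice the error is much smaller than the bound, since only integers with a prime factor above $N$ are missing from $T(N, s)$.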

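Corollary 6.18. For $\sigma > 1$, $\zeta(s) \ne 0$.

Proof. By using Euler's product, we have
\[ \prod_{p \le x} (1-p^{-s})\,\zeta(s) = \prod_{p > x} (1-p^{-s})^{-1} = \prod_{p > x} \Bigl(1 + \sum_{i=1}^{\infty} p^{-is}\Bigr). \]
The right-hand side is $1$ plus a sum of terms $n^{-s}$ over certain integers $n > x$, and so
\[ \Bigl| \prod_{p \le x} (1-p^{-s})\,\zeta(s) \Bigr| \ge 1 - \sum_{n > x} n^{-\sigma}. \]
For each $\sigma > 1$, the sum $\sum_{n=1}^{\infty} n^{-\sigma}$ converges and so by taking $x$ large enough, we conclude that $1 - \sum_{n > x} n^{-\sigma} \ge \frac{1}{2}$, and so $\zeta(s) \ne 0$.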

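Theorem 6.19. $\zeta(s) - \frac{1}{s-1}$ has an analytic continuation to $\sigma > 0$.

Proof. First let $\sigma > 1$. Writing $1 = n - (n-1)$ we have
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{n}{n^s} - \sum_{n=1}^{\infty} \frac{n-1}{n^s} = \sum_{n=1}^{\infty} \frac{n}{n^s} - \sum_{n=1}^{\infty} \frac{n}{(n+1)^s} = \sum_{n=1}^{\infty} n \bigl(n^{-s} - (n+1)^{-s}\bigr). \]
Then
\[ \zeta(s) = s \sum_{n=1}^{\infty} \int_n^{n+1} n x^{-s-1}\,dx = s \int_1^{\infty} [x] x^{-s-1}\,dx \]
because $n = [x]$ for $x \in [n, n+1)$. But $[x] = x - \{x\}$ where $\{x\}$ is the fractional part of $x$ and so
\[ \zeta(s) = s \int_1^{\infty} (x - \{x\}) x^{-s-1}\,dx = \frac{s}{s-1} - s \int_1^{\infty} \{x\} x^{-s-1}\,dx. \]
Since $|\{x\}| \le 1$ for all $x$, the integral above converges for $\sigma > 0$ and so
\[ \zeta(s) - \frac{1}{s-1} = 1 - s \int_1^{\infty} \{x\} x^{-s-1}\,dx \]
defines an analytic function on $\sigma > 0$.

Using a similar method (and integration by parts) one can show

Theorem 6.20. $\zeta(s) - \frac{1}{s-1}$ has an analytic continuation to $\sigma > -1$.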


6.7 Gamma function and the functional equation

Let s = σ + it as usual.

Definition 6.21. For $\sigma > 0$, the Gamma function is defined by
\[ \Gamma(s) = \int_0^{\infty} e^{-x} x^{s-1}\,dx. \]

Using integration by parts we conclude that

Lemma 6.22. $s\Gamma(s) = \Gamma(s+1)$ for all $s$ with $\sigma > 0$. In particular, for each natural number $s > 0$, we have $\Gamma(s+1) = s!$.

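Lemma 6.23. $\int_0^{\infty} \frac{\sin y}{y^{s+1}}\,dy = -\sin(\tfrac{1}{2} s\pi)\,\Gamma(-s)$ for $-1 < \sigma < 0$.

Proof. Write $\sin y = \frac{1}{2i}(e^{iy} - e^{-iy})$; we will compute $\int_0^{\infty} e^{\pm iy} y^{-s-1}\,dy$.

Let $C$ be the boundary of the region
\[ \{z \in \mathbb{C} : \Re(z) \ge 0,\ \Im(z) \ge 0,\ r \le |z| \le R\} \]
where $r < R$. Then
\[ \int_C \frac{e^{iz}}{z^{s+1}}\,dz = \int_r^R \frac{e^{iy}}{y^{s+1}}\,dy - i \int_r^R \frac{e^{-t}}{(it)^{s+1}}\,dt + \int_0^{\pi/2} \frac{e^{iRe^{i\theta}}}{(Re^{i\theta})^{s+1}}\, iRe^{i\theta}\,d\theta - \int_0^{\pi/2} \frac{e^{ire^{i\theta}}}{(re^{i\theta})^{s+1}}\, ire^{i\theta}\,d\theta. \]
When $R \to \infty$, we have $\int_0^{\pi/2} \frac{e^{iRe^{i\theta}}}{(Re^{i\theta})^{s+1}}\, iRe^{i\theta}\,d\theta \to 0$ by Jordan's lemma. When $r \to 0$, $e^{ire^{i\theta}} = 1 + O(r)$ and so
\[ \int_0^{\pi/2} \frac{e^{ire^{i\theta}}}{(re^{i\theta})^{s+1}}\, ire^{i\theta}\,d\theta = \int_0^{\pi/2} \frac{1 + O(r)}{(re^{i\theta})^{s}}\, i\,d\theta \to 0 \]
because $\sigma < 0$. Therefore, we can write
\[ \int_C \frac{e^{iz}}{z^{s+1}}\,dz = \int_r^R \frac{e^{iy}}{y^{s+1}}\,dy - i \int_r^R \frac{e^{-t}}{(it)^{s+1}}\,dt + f(r, R) \]
where $f(r, R) \to 0$ as $r \to 0$, $R \to \infty$.

Similarly, by considering the contour $C'$ which is the reflection of $C$ in the real axis, we have
\[ \int_{C'} \frac{e^{-iz}}{z^{s+1}}\,dz = \int_r^R \frac{e^{-iy}}{y^{s+1}}\,dy + i \int_r^R \frac{e^{-t}}{(-it)^{s+1}}\,dt + g(r, R) \]
where $g(r, R) \to 0$ as $r \to 0$, $R \to \infty$.

By Cauchy's integral theorem both contour integrals vanish. Taking the difference of the two equations, dividing by $2i$ and letting $r \to 0$, $R \to \infty$, we obtain
\[ \int_0^{\infty} \frac{\sin y}{y^{s+1}}\,dy = \frac{1}{2} \Bigl( \int_0^{\infty} \frac{e^{-t}}{(it)^{s+1}}\,dt + \int_0^{\infty} \frac{e^{-t}}{(-it)^{s+1}}\,dt \Bigr) = \frac{1}{2i} \bigl(i^{-s} - (-i)^{-s}\bigr) \Gamma(-s) = -\sin(\tfrac{1}{2} s\pi)\,\Gamma(-s). \]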

The following lemma is an exercise of integration and we omit the proof.

Lemma 6.24. For all $s \in \mathbb{C}$,
\[ \Gamma(\tfrac{1}{2} - \tfrac{1}{2}s) / \Gamma(\tfrac{1}{2}s) = \pi^{-1/2}\, 2^s \sin(\tfrac{1}{2} s\pi)\, \Gamma(1-s). \]


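Theorem 6.25 (The functional equation). Let $\Xi(s) = \pi^{-\frac{1}{2}s}\, \Gamma(\tfrac{1}{2}s)\, \zeta(s)$. Then $\Xi(s) = \Xi(1-s)$.

Proof. Assume $-1 < \sigma < 0$. Recall that for $\sigma > 1$ we can write
\[ \zeta(s) = s \int_1^{\infty} [x] x^{-s-1}\,dx \]
and now we write $f(x) = [x] - x + \frac{1}{2}$ so that
\[ \zeta(s) = s \int_1^{\infty} f(x) x^{-s-1}\,dx + s \int_1^{\infty} (x - \tfrac{1}{2}) x^{-s-1}\,dx = s \int_1^{\infty} f(x) x^{-s-1}\,dx + \frac{1}{s-1} + \frac{1}{2}, \]
and the right-hand side continues $\zeta(s)$ to $\sigma > -1$. If $\sigma < 0$, then
\[ s \int_0^1 f(x) x^{-s-1}\,dx = s \int_0^1 (-x + \tfrac{1}{2}) x^{-s-1}\,dx = \frac{1}{s-1} + \frac{1}{2} \]
and so
\[ \zeta(s) = s \int_0^{\infty} f(x) x^{-s-1}\,dx. \]
Using the Fourier expansion
\[ f(x) = \sum_{n=1}^{\infty} \frac{\sin(2n\pi x)}{n\pi} \]
we have
\[ \zeta(s) = \frac{s}{\pi} \sum_{n=1}^{\infty} \frac{1}{n} \int_0^{\infty} \sin(2n\pi x)\, x^{-s-1}\,dx = \frac{s}{\pi} \sum_{n=1}^{\infty} \frac{(2\pi n)^s}{n} \int_0^{\infty} \frac{\sin y}{y^{s+1}}\,dy. \]
By Lemma 6.23 we conclude that
\[ \zeta(s) = -\frac{s}{\pi} (2\pi)^s\, \Gamma(-s) \sin(\tfrac{1}{2} s\pi) \sum_{n=1}^{\infty} n^{s-1} = -\frac{s}{\pi} (2\pi)^s\, \Gamma(-s) \sin(\tfrac{1}{2} s\pi)\, \zeta(1-s). \]
But $-s\Gamma(-s) = \Gamma(1-s)$ so we have
\[ \zeta(s) = 2^s \pi^{s-1} \sin(\tfrac{1}{2} s\pi)\, \Gamma(1-s)\, \zeta(1-s). \]
Finally, since $\Xi(s) = \pi^{-\frac{1}{2}s}\, \Gamma(\tfrac{1}{2}s)\, \zeta(s)$, we conclude $\Xi(s) = \Xi(1-s)$ by using Lemma 6.24.

We have proved the theorem for $-1 < \sigma < 0$. Since $\Xi(1-s)$ is analytic for $\sigma < 0$, the equation above implies that $\zeta(s)$ extends to an analytic function for $\sigma \le -1$, and so $\zeta(s)$ extends to a meromorphic function on $\mathbb{C}$. Therefore the functional equation holds for all $s$.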

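Remark 6.26. The functional equation shows that $\zeta(s)$ is analytic throughout the complex plane except for a simple pole at $s = 1$ with residue 1. Together with Corollary 6.18, it also shows that the only zeroes of $\zeta(s)$ with $\sigma < 0$ are the trivial zeroes $s = -2, -4, -6, \ldots$ coming from the factor $\sin(\tfrac{1}{2} s\pi)$.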

6.8 Riemann Hypothesis

There are many interesting results on the Riemann zeta function. For example, it is known that $\zeta(s)$ has infinitely many zeroes on the line $\sigma = \frac{1}{2}$ (Hardy). At least $10^9$ zeroes (ordered by $t$) satisfy $\sigma = \frac{1}{2}$. The Riemann hypothesis asserts that in fact all non-trivial zeroes of $\zeta$ (those with $0 < \sigma < 1$) lie on the line $\sigma = \frac{1}{2}$.

Definition 6.27. The logarithmic integral function is defined by
\[ \mathrm{li}(x) = \int_2^x \frac{dt}{\log t}. \]

It is well-known that the Riemann hypothesis is equivalent to the assertion $\pi(x) = \mathrm{li}(x) + O(\sqrt{x} \log x)$. It is known that
\[ \pi(x) = \mathrm{li}(x) + O\bigl(x e^{-c\sqrt{\log x}}\bigr) \]
for some constant $c > 0$. Another assertion known to be equivalent to the Riemann hypothesis is
\[ \sum_{n \le x} \mu(n) = O\bigl(x^{\frac{1}{2} + \varepsilon}\bigr) \]
for any $\varepsilon > 0$.

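Proposition 6.28. $\mathrm{li}(x) \sim \frac{x}{\log x}$ as $x \to \infty$.

Proof. Using integration by parts, we have
\[ \mathrm{li}(x) = \int_2^x \frac{dt}{\log t} = \Bigl[\frac{t}{\log t}\Bigr]_2^x + \int_2^x \frac{dt}{(\log t)^2} = \frac{x}{\log x} - \frac{2}{\log 2} + \int_2^x \frac{dt}{(\log t)^2}. \]
But
\[ \int_2^x \frac{dt}{(\log t)^2} = \int_2^{\sqrt{x}} \frac{dt}{(\log t)^2} + \int_{\sqrt{x}}^x \frac{dt}{(\log t)^2} \le \frac{\sqrt{x}}{(\log 2)^2} + \frac{x}{(\log \sqrt{x})^2}, \]
and both terms on the right are $o(x/\log x)$. Therefore $\mathrm{li}(x) \sim \frac{x}{\log x}$ as $x \to \infty$.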

6.9 Bernoulli numbers

We will study the values ζ(2t) where t is a positive integer.

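Definition 6.29. The $n$th Bernoulli number $B_n$ is defined by
\[ B_0 = 1, \qquad (m+1) B_m = -\sum_{j=0}^{m-1} \binom{m+1}{j} B_j \text{ for all } m > 0. \]
For example, $2B_1 = -1$, $3B_2 = -3B_1 - 1$, $\ldots$.

Lemma 6.30. If we expand $\frac{t}{e^t - 1}$ as a power series in $t$, then
\[ \frac{t}{e^t - 1} = \sum_{n=0}^{\infty} \frac{B_n t^n}{n!}. \]

Proof. Let $\frac{t}{e^t - 1} = \sum_{n=0}^{\infty} \frac{b_n t^n}{n!}$. Using $e^t = \sum_{m=0}^{\infty} \frac{t^m}{m!}$, we have
\[ t = \sum_{m=1}^{\infty} \frac{t^m}{m!} \sum_{n=0}^{\infty} \frac{b_n t^n}{n!}. \]
Comparing coefficients of $t$ on both sides, we conclude that $b_0 = 1$. Now comparing the coefficients of $t^{k+1}$ on both sides for $k > 0$, we conclude that
\[ \sum_{j=0}^{k} \binom{k+1}{j} \frac{1}{(k+1)!}\, b_j = 0. \]
Therefore $(k+1) b_k = -\sum_{j=0}^{k-1} \binom{k+1}{j} b_j$ and so, by induction, $b_k = B_k$ for each $k$.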

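Corollary 6.31. $B_{2n+1} = 0$ for each $n > 0$.

Proof. Since $B_0 = 1$ and $B_1 = -\frac{1}{2}$, we have
\[ \frac{t}{e^t - 1} + \frac{t}{2} = 1 + \sum_{n=2}^{\infty} \frac{B_n t^n}{n!}. \]
But
\[ \frac{t}{e^t - 1} + \frac{t}{2} = \frac{t(e^t + 1)}{2(e^t - 1)} \]
which is an even function of $t$. Therefore $B_{2n+1} = 0$.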
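The recurrence in Definition 6.29 is straightforward to run with exact rational arithmetic. The following is a minimal sketch (the function name `bernoulli` is our own choice) that lists the first few Bernoulli numbers and confirms that the odd-indexed ones beyond $B_1$ vanish, as Corollary 6.31 asserts.

```python
from fractions import Fraction
from math import comb

def bernoulli(n_max):
    """Return [B_0, ..., B_{n_max}] from (m+1) B_m = -sum_{j<m} C(m+1, j) B_j."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        total = sum(comb(m + 1, j) * B[j] for j in range(m))
        B.append(-total / (m + 1))
    return B

B = bernoulli(12)
for n, value in enumerate(B):
    print(n, value)

# Corollary 6.31: B_3 = B_5 = B_7 = ... = 0.
assert all(B[n] == 0 for n in range(3, 13, 2))
```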

The following theorem gives a method to compute ζ(2t).

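Theorem 6.32 (Euler). Let $t$ be a positive integer. Then
\[ 2 (2t)!\, \zeta(2t) = (-1)^{t+1} (2\pi)^{2t} B_{2t}. \]

Proof. We quote a standard result from analysis,
\[ \sin x = x \prod_{n=1}^{\infty} \Bigl(1 - \frac{x^2}{n^2 \pi^2}\Bigr), \]
and so
\[ \log \sin x = \log x + \sum_{n=1}^{\infty} \log\Bigl(1 - \frac{x^2}{n^2 \pi^2}\Bigr). \]
Differentiating both sides we have
\[ \cot x = \frac{1}{x} - 2 \sum_{n=1}^{\infty} \frac{x}{n^2 \pi^2 - x^2} \]
and so
\[ x \cot x = 1 - 2 \sum_{n=1}^{\infty} \frac{x^2}{n^2 \pi^2 - x^2}. \]
Now for each $n$ and $|x| < \pi$, write
\[ \frac{x^2}{n^2 \pi^2 - x^2} = \frac{x^2}{n^2 \pi^2} \cdot \frac{1}{1 - (x/n\pi)^2} = \sum_{t=1}^{\infty} \Bigl(\frac{x}{n\pi}\Bigr)^{2t}. \]
Therefore,
\[ x \cot x = 1 - 2 \sum_{n=1}^{\infty} \sum_{t=1}^{\infty} \Bigl(\frac{x}{n\pi}\Bigr)^{2t} = 1 - 2 \sum_{t=1}^{\infty} \zeta(2t) \Bigl(\frac{x}{\pi}\Bigr)^{2t}. \]
Now write $\cot x = i\, \frac{e^{ix} + e^{-ix}}{e^{ix} - e^{-ix}}$, so
\[ x \cot x = ix\, \frac{e^{2ix} + 1}{e^{2ix} - 1} = ix + \frac{2ix}{e^{2ix} - 1} = ix + \sum_{t=0}^{\infty} \frac{B_t (2ix)^t}{t!} = 1 + \sum_{t=2}^{\infty} \frac{B_t (2ix)^t}{t!} \]
by Lemma 6.30. Comparing both expressions for $x \cot x$, we conclude that
\[ -\frac{2 \zeta(2t)}{\pi^{2t}} = (-1)^t\, 2^{2t}\, \frac{B_{2t}}{(2t)!}. \]

As a simple application, we substitute $t = 1$ into the identity above and obtain

Corollary 6.33. $\zeta(2) = \frac{\pi^2}{6}$.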
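Euler's formula can be compared against direct partial sums of the series. The sketch below is an illustration under our own naming (`zeta_even`, `zeta_partial`); it evaluates $\zeta(2t) = (-1)^{t+1} (2\pi)^{2t} B_{2t} / (2\,(2t)!)$ using the recurrence of Definition 6.29 and checks it against $\sum_{n \le N} n^{-2t}$.

```python
import math
from fractions import Fraction

def bernoulli(n_max):
    """[B_0, ..., B_{n_max}] via the recurrence of Definition 6.29."""
    B = [Fraction(1)]
    for m in range(1, n_max + 1):
        B.append(-sum(math.comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_even(t):
    """zeta(2t) for a positive integer t, from Theorem 6.32."""
    B2t = bernoulli(2 * t)[2 * t]
    return (-1) ** (t + 1) * (2 * math.pi) ** (2 * t) * float(B2t) / (2 * math.factorial(2 * t))

def zeta_partial(s, N):
    """Partial sum of n^(-s) for n = 1..N (real s > 1)."""
    return sum(n ** (-s) for n in range(1, N + 1))

for t in (1, 2, 3):
    print(2 * t, zeta_even(t), zeta_partial(2 * t, 10**5))
```

For $t = 2$ the formula gives $\zeta(4) = \pi^4/90$, since $B_4 = -\frac{1}{30}$.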


6.10 Exercises

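1. Define the von Mangoldt function as
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n \text{ is a power of a prime } p, \\ 0 & \text{otherwise.} \end{cases} \]
(i) Compute $\sum_{d \mid n} \Lambda(d)$. (ii) Show that
\[ \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = -\frac{\zeta'(s)}{\zeta(s)}. \]

2. Define the Liouville function as $\lambda(n) = (-1)^{\omega(n)}$ where $\omega(n)$ is the number of (not necessarily distinct) prime factors of $n$. Show that
\[ \sum_{n=1}^{\infty} \frac{\lambda(n)}{n^s} = \frac{\zeta(2s)}{\zeta(s)}. \]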

3. Show that every integer N > 6 can be written as a sum of distinct primes.

4. By considering the cases $k < n$ and $k \ge n$ separately, show that
\[ \frac{1}{n} + \frac{1}{n+1} + \cdots + \frac{1}{n+k} = m \]
is not soluble for any integer $n > 1$.

5. (i) Show that $\sum_{n \le x} \log n = x \log x - x + O(\log x)$.

(ii) Show also that $\sum_{n \le x} \log^2 n = x \log^2 x - 2x \log x + 2x + O(\log^2 x)$.

6. (i) Show that the number of primes $q$ that divide an integer $n > 3$ and exceed $\log n$ is at most $\frac{\log n}{\log\log n}$.

(ii) By considering the product $\prod_{q \mid n} (1 - q^{-1})$, deduce that
\[ \varphi(n) > \frac{cn}{\log\log n} \]
for some constant $c$.

7. Use the functional equation to calculate $\zeta(0)$.

8. Show that
\[ \lim_{x \to \infty} \frac{\pi(x)}{x} = 0. \]
Show further that if $\frac{\pi(x) \log x}{x}$ tends to a limit as $x \to \infty$, then the limit must be 1.

9. Show that ζ(−2n) = 0 where n is a positive integer.

10. Let
\[ f(t) = \sum_{n=0}^{\infty} \frac{a_n t^n}{n!} \quad \text{and} \quad g(t) = \sum_{n=0}^{\infty} \frac{b_n t^n}{n!}. \]
We say $f$ is integral if $a_n \in \mathbb{Z}$ for all $n$. For each $m \ge 2$, we say $f \equiv g \bmod m$ if $a_n \equiv b_n \bmod m$ for all $n$.

(i) Suppose $f$ is integral and $f(0) = 0$. Show that $\frac{f^m}{m!}$ is integral.

(ii) Deduce that if $m > 4$ is a composite number, then
\[ (e^t - 1)^{m-1} \equiv 0 \bmod m \]
and
\[ (e^t - 1)^3 \equiv 2 \sum_{k=1}^{\infty} \frac{t^{2k+1}}{(2k+1)!} \bmod 4. \]
If $p$ is an odd prime, show that
\[ (e^t - 1)^{p-1} \equiv -\sum_{k=1}^{\infty} \frac{t^{kp-k}}{(kp-k)!} \bmod p. \]

(iii) Write $t = \log(1 + (e^t - 1))$ and expand $\frac{t}{e^t - 1}$ as a power series in $e^t - 1$. Show that
\[ \sum_{p-1 \mid 2n} \frac{1}{p} + B_{2n} \]
is an integer (where the sum is taken over all primes $p$ such that $p - 1 \mid 2n$).

(iv) Deduce that the denominator of $B_{2n}$ is divisible by 6.

7 Continued fraction

The continued fraction algorithm systematically produces the best rational approximations to a given real number.

7.1 Dirichlet’s theorem

We begin by introducing a simple result in diophantine approximation.

Theorem 7.1 (Dirichlet's theorem). For any $\theta \in \mathbb{R}$ and any integer $Q > 1$, there exist integers $p, q$ with $0 < q < Q$ such that
\[ |q\theta - p| \le \frac{1}{Q}. \]

Proof. Consider the numbers $0$, $1$, $\{k\theta\}$ for $k = 1, 2, \ldots, Q-1$, where $\{k\theta\}$ is the fractional part of $k\theta$. These numbers are between 0 and 1 and there are $Q + 1$ numbers in total. So if we divide $[0, 1]$ into $Q$ intervals of equal size, then two of the above numbers must be in the same interval, and hence their difference is at most $1/Q$. Write $\{k\theta\} = k\theta - a_k$ for some integer $a_k$; the difference of the two numbers then has the form $q\theta - p$ with integers $p, q$ and $0 < q < Q$, so
\[ |q\theta - p| \le \frac{1}{Q}. \]

Corollary 7.2. For any $\theta \in \mathbb{R}$ and any real $Q > 1$, there exist integers $p, q$ with $0 < q \le Q$ such that
\[ |q\theta - p| < \frac{1}{Q}. \]

Proof. Consider $[Q] + 1$ and repeat the above proof.
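Dirichlet's theorem guarantees that a good denominator $q < Q$ exists; for small $Q$ one can simply search for it. The sketch below does this by brute force rather than by the pigeonhole argument of the proof (the function name `dirichlet_pair` is ours, and the computation is in floating point, so it is only a rough illustration).

```python
import math

def dirichlet_pair(theta, Q):
    """Return (p, q) with 0 < q < Q minimising |q*theta - p| (brute-force search)."""
    best_q = min(range(1, Q), key=lambda q: abs(q * theta - round(q * theta)))
    return round(best_q * theta), best_q

p, q = dirichlet_pair(math.pi, 10)
print(p, q, abs(q * math.pi - p))
# The minimiser must satisfy the bound promised by Theorem 7.1.
assert abs(q * math.pi - p) <= 1 / 10
```

For $\theta = \pi$ and $Q = 10$ the search returns the familiar approximation $22/7$.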

7.2 Convergents

Fix $\theta \in \mathbb{R}_{>0}$. Intuitively we want to write
\[ \theta = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}. \]

Definition 7.3. Let $\theta_0 = \theta$. For each $i \ge 0$, define $a_i = [\theta_i]$. If $a_i = \theta_i$ then stop; otherwise let $\theta_{i+1} = \frac{1}{\theta_i - a_i}$. The numbers $a_0, a_1, \ldots$ are called the partial quotients and the numbers $\theta_0, \theta_1, \ldots$ are called the complete quotients. For each $n \ge 0$ we write
\[ [a_0, a_1, \ldots, a_n] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\cdots + \cfrac{1}{a_n}}}}. \]
We say the continued fraction of $\theta$ is finite if there exist $n$ and $a_0, \ldots, a_n$ such that $\theta = [a_0, \ldots, a_n]$; otherwise we say the continued fraction of $\theta$ is infinite.

Lemma 7.4. The continued fraction of $\theta$ is finite if and only if $\theta \in \mathbb{Q}$.

Proof. Suppose the continued fraction of $\theta$ is finite; then clearly $\theta$ is rational. Conversely, let $\theta = \frac{a}{b}$ with $a, b \in \mathbb{Z}_{>0}$. Run the Euclidean algorithm for $a, b$, so we have
\[ a = a_0 b + r_1, \quad b = a_1 r_1 + r_2, \quad \cdots, \quad r_{n-1} = a_n r_n. \]
Then $\theta = \frac{a}{b} = a_0 + \frac{r_1}{b}$ and so
\[ \theta_1 = \frac{b}{r_1} = a_1 + \frac{r_2}{r_1}. \]
Continuing in this way, since $r_{n-1} = a_n r_n$ the process eventually stops and we obtain a finite continued fraction.
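The proof of Lemma 7.4 is effectively an algorithm: the partial quotients of $a/b$ are the successive quotients in the Euclidean algorithm. A minimal sketch (the name `cf_rational` is hypothetical):

```python
def cf_rational(a, b):
    """Partial quotients [a_0, ..., a_n] of a/b for integers a >= 0, b > 0."""
    quotients = []
    while b:
        q, r = divmod(a, b)   # a = q*b + r, one step of the Euclidean algorithm
        quotients.append(q)
        a, b = b, r
    return quotients

print(cf_rational(415, 93))
```

Here $415 = 4 \cdot 93 + 43$, $93 = 2 \cdot 43 + 7$, $43 = 6 \cdot 7 + 1$ and $7 = 7 \cdot 1$, so $\frac{415}{93} = [4, 2, 6, 7]$.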


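Definition 7.5. The convergents $\frac{p_n}{q_n}$ of $\theta$ are defined by
\[ p_0 = a_0, \quad p_1 = a_0 a_1 + 1, \quad p_n = a_n p_{n-1} + p_{n-2} \text{ for all } n \ge 2 \]
and
\[ q_0 = 1, \quad q_1 = a_1, \quad q_n = a_n q_{n-1} + q_{n-2} \text{ for all } n \ge 2. \]

Remark 7.6. $(p_n)_{n=1}^{\infty}$ and $(q_n)_{n=1}^{\infty}$ are strictly increasing sequences and so $p_n, q_n \to \infty$ as $n \to \infty$.

Lemma 7.7. (i) Let $\beta \in \mathbb{R}$ and $n \ge 2$. Then
\[ \frac{\beta p_{n-1} + p_{n-2}}{\beta q_{n-1} + q_{n-2}} = [a_0, \ldots, a_{n-1}, \beta]. \]

(ii) For each $n \ge 0$, $[a_0, \ldots, a_n] = \frac{p_n}{q_n}$.

Proof. (i) Induction on $n$. If $n = 2$, then
\[ [a_0, a_1, \beta] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{\beta}} = \frac{(a_0 a_1 + 1)\beta + a_0}{a_1 \beta + 1} = \frac{\beta p_1 + p_0}{\beta q_1 + q_0}. \]
Suppose this is true for $n$. Then for $n + 1$, observe that
\[ [a_0, \ldots, a_n, \beta] = \Bigl[a_0, \ldots, a_{n-1}, a_n + \frac{1}{\beta}\Bigr] \]
and so by the inductive hypothesis
\[ [a_0, \ldots, a_n, \beta] = \frac{(a_n + \frac{1}{\beta}) p_{n-1} + p_{n-2}}{(a_n + \frac{1}{\beta}) q_{n-1} + q_{n-2}} = \frac{a_n p_{n-1} + p_{n-2} + \frac{1}{\beta} p_{n-1}}{a_n q_{n-1} + q_{n-2} + \frac{1}{\beta} q_{n-1}} = \frac{\beta p_n + p_{n-1}}{\beta q_n + q_{n-1}}. \]

(ii) For $n \le 1$ we check that $[a_0] = \frac{p_0}{q_0}$ and $[a_0, a_1] = \frac{a_0 a_1 + 1}{a_1} = \frac{p_1}{q_1}$. For $n \ge 2$, let $\beta = a_n$ in (i) and we have
\[ [a_0, \ldots, a_n] = \frac{a_n p_{n-1} + p_{n-2}}{a_n q_{n-1} + q_{n-2}} = \frac{p_n}{q_n}. \]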

Lemma 7.8. (i) For each $n \ge 1$ we have
\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}. \]

(ii) For each $n \ge 2$ we have
\[ p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n. \]

(iii) $p_n, q_n$ are coprime for all $n$.

Proof. (i) Induction on $n$. When $n = 1$ we have
\[ p_1 q_0 - p_0 q_1 = (a_0 a_1 + 1) - a_0 a_1 = 1. \]
Suppose this is true for $n$. Then
\[ p_{n+1} q_n - p_n q_{n+1} = (a_{n+1} p_n + p_{n-1}) q_n - p_n (a_{n+1} q_n + q_{n-1}) = p_{n-1} q_n - p_n q_{n-1} = (-1)^n. \]

(ii) We have
\[ p_n q_{n-2} - p_{n-2} q_n = (a_n p_{n-1} + p_{n-2}) q_{n-2} - p_{n-2} (a_n q_{n-1} + q_{n-2}) = a_n (p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) = (-1)^n a_n. \]

(iii) Use (i).
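The recurrences of Definition 7.5 and the identity of Lemma 7.8 (i) can be checked on a concrete example. The sketch below is illustrative; it uses the common convention $p_{-1} = 1$, $q_{-1} = 0$ (an assumption of ours, not stated in the notes), which reproduces $p_1 = a_0 a_1 + 1$ and $q_1 = a_1$ from the same recurrence.

```python
def convergents(quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] for partial quotients a_0, a_1, ..."""
    p_prev, q_prev = 1, 0            # conventional p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0 = a_0, q_0 = 1
    result = [(p, q)]
    for a in quotients[1:]:
        p, p_prev = a * p + p_prev, p    # p_n = a_n p_{n-1} + p_{n-2}
        q, q_prev = a * q + q_prev, q    # q_n = a_n q_{n-1} + q_{n-2}
        result.append((p, q))
    return result

cs = convergents([4, 2, 6, 7])
print(cs)

# Lemma 7.8 (i): p_n q_{n-1} - p_{n-1} q_n = (-1)^(n-1) for n >= 1.
for n in range(1, len(cs)):
    (p, q), (p_prev, q_prev) = cs[n], cs[n - 1]
    assert p * q_prev - p_prev * q == (-1) ** (n - 1)
```

The last convergent is $\frac{415}{93}$, which is the value of $[4, 2, 6, 7]$, as Lemma 7.7 (ii) predicts.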


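Corollary 7.9. $\lim_{n \to \infty} \frac{p_n}{q_n}$ exists.

Proof. By (i) of the previous lemma we have
\[ \frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}}{q_n q_{n-1}} \to 0 \]
as $n \to \infty$. By (ii) of the previous lemma we have
\[ \frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}} = \frac{(-1)^n a_n}{q_n q_{n-2}}. \]
Therefore, if $c_n = \frac{p_n}{q_n}$, then
\[ c_0 < c_2 < \cdots < c_3 < c_1 \]
and since $c_n - c_{n-1} \to 0$ as $n \to \infty$, $c_n$ tends to a limit.

Theorem 7.10. Let $\theta \in \mathbb{R} \setminus \mathbb{Q}$. Then for all $n \ge 0$, we have
\[ \Bigl|\theta - \frac{p_n}{q_n}\Bigr| < \frac{1}{q_n q_{n+1}} < \frac{1}{q_n^2}. \]

Proof. Since $\theta \notin \mathbb{Q}$, the continued fraction is infinite. By definition of the continued fraction, we have
\[ \theta = [a_0, a_1, \ldots, a_n, \theta_{n+1}] \]
and $a_{n+1} = [\theta_{n+1}]$. Then we have
\[ \theta - \frac{p_n}{q_n} = \frac{\theta_{n+1} p_n + p_{n-1}}{\theta_{n+1} q_n + q_{n-1}} - \frac{p_n}{q_n} = \frac{p_{n-1} q_n - p_n q_{n-1}}{q_n (\theta_{n+1} q_n + q_{n-1})} = \frac{(-1)^n}{q_n (\theta_{n+1} q_n + q_{n-1})}. \]
Since $\theta_{n+1} q_n + q_{n-1} > a_{n+1} q_n + q_{n-1} = q_{n+1}$, the result follows.

Corollary 7.11. $\lim_{n \to \infty} \frac{p_n}{q_n} = \theta$.

Proof. The above theorem shows that
\[ \lim_{n \to \infty} \Bigl(\frac{p_n}{q_n} - \theta\Bigr) = 0 \]
because $q_n \to \infty$ as $n \to \infty$.

Theorem 7.12. Let $\theta \in \mathbb{R} \setminus \mathbb{Q}$ and $p, q \in \mathbb{Z}$ with $0 < q < q_{n+1}$. Then
\[ |q\theta - p| \ge |q_n \theta - p_n|. \]

Proof. Since $p_{n+1} q_n - p_n q_{n+1} = \pm 1$ for all $n$, there exist $u, v \in \mathbb{Z}$ such that
\[ p = u p_n + v p_{n+1}, \quad q = u q_n + v q_{n+1} \]
because
\[ \begin{pmatrix} p_{n+1} & p_n \\ q_{n+1} & q_n \end{pmatrix} \in GL_2(\mathbb{Z}). \]
If $v = 0$ then $u \ne 0$ and
\[ |q\theta - p| = |u|\,|q_n \theta - p_n| \ge |q_n \theta - p_n|. \]
Suppose $v \ne 0$. Since $0 < q < q_{n+1}$, we have $u \ne 0$ and $u, v$ have opposite signs. But $q_n \theta - p_n$ and $q_{n+1} \theta - p_{n+1}$ have opposite signs, as in the proof of Theorem 7.10, so $u(q_n \theta - p_n)$ and $v(q_{n+1} \theta - p_{n+1})$ have the same sign. Therefore
\[ |q\theta - p| = |u(q_n \theta - p_n)| + |v(q_{n+1} \theta - p_{n+1})| \ge |q_n \theta - p_n|. \]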


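Corollary 7.13. If $p, q \in \mathbb{Z}$, $q > 0$ and
\[ \Bigl|\theta - \frac{p}{q}\Bigr| < \Bigl|\theta - \frac{p_n}{q_n}\Bigr|, \]
then $q > q_n$.

Proof. Suppose $q \le q_n < q_{n+1}$. Then by the previous theorem
\[ \Bigl|\theta - \frac{p}{q}\Bigr| \ge \frac{q_n}{q} \Bigl|\theta - \frac{p_n}{q_n}\Bigr| \ge \Bigl|\theta - \frac{p_n}{q_n}\Bigr|, \]
which is a contradiction.

Theorem 7.14. Let $\theta \in \mathbb{R} \setminus \mathbb{Q}$. (i) At least one of any two successive convergents satisfies
\[ \Bigl|\theta - \frac{p}{q}\Bigr| < \frac{1}{2q^2}. \]

(ii) Conversely, if $p, q \in \mathbb{Z}$ with $q > 0$ and
\[ \Bigl|\theta - \frac{p}{q}\Bigr| < \frac{1}{2q^2}, \]
then $\frac{p}{q} = \frac{p_n}{q_n}$ for some $n$.

Proof. (i) Since $\theta - \frac{p_n}{q_n}$ and $\theta - \frac{p_{n+1}}{q_{n+1}}$ have opposite signs,
\[ \Bigl|\theta - \frac{p_n}{q_n}\Bigr| + \Bigl|\theta - \frac{p_{n+1}}{q_{n+1}}\Bigr| = \Bigl|\frac{p_n}{q_n} - \frac{p_{n+1}}{q_{n+1}}\Bigr| = \frac{1}{q_n q_{n+1}} \]
by Lemma 7.8. Since $q_n^2 + q_{n+1}^2 \ge 2 q_n q_{n+1}$, we get
\[ \Bigl|\theta - \frac{p_n}{q_n}\Bigr| + \Bigl|\theta - \frac{p_{n+1}}{q_{n+1}}\Bigr| \le \frac{1}{2} \Bigl(\frac{1}{q_n^2} + \frac{1}{q_{n+1}^2}\Bigr). \]
If neither convergent satisfied the required inequality, equality would hold throughout and in particular $\bigl|\theta - \frac{p_n}{q_n}\bigr| = \frac{1}{2q_n^2}$, which is impossible because $\theta$ is irrational. So the result follows.

(ii) Assume $\bigl|\theta - \frac{p}{q}\bigr| < \frac{1}{2q^2}$. Since $q_0 = 1$ and $q_n \to \infty$, there exists $n$ such that $q_n \le q < q_{n+1}$. Then
\[ \Bigl|\frac{p}{q} - \frac{p_n}{q_n}\Bigr| \le \Bigl|\theta - \frac{p}{q}\Bigr| + \Bigl|\theta - \frac{p_n}{q_n}\Bigr| = \frac{1}{q} |q\theta - p| + \frac{1}{q_n} |q_n \theta - p_n|. \]
Since $q < q_{n+1}$, by Theorem 7.12 we have
\[ |q\theta - p| \ge |q_n \theta - p_n| \]
and so
\[ \Bigl|\frac{p}{q} - \frac{p_n}{q_n}\Bigr| \le \Bigl(\frac{1}{q} + \frac{1}{q_n}\Bigr) |q\theta - p| < \Bigl(\frac{1}{q} + \frac{1}{q_n}\Bigr) \frac{1}{2q}. \]
Since $q_n \le q$, we have $\bigl|\frac{p}{q} - \frac{p_n}{q_n}\bigr| < \frac{1}{q q_n}$. Finally, we have
\[ \Bigl|\frac{p}{q} - \frac{p_n}{q_n}\Bigr| = \Bigl|\frac{p q_n - q p_n}{q q_n}\Bigr| \]
and so if this is not zero, then it is at least $\frac{1}{q q_n}$ because the numerator is an integer. This gives a contradiction and so $\frac{p}{q} = \frac{p_n}{q_n}$.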


7.3 Period

We give an example to compute the partial quotients of θ.

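Example 7.15. $\theta = \sqrt{14}$. Then $a_0 = 3$ and $\theta_1 = \frac{1}{\sqrt{14} - 3} = \frac{\sqrt{14} + 3}{5}$. So $a_1 = 1$ and $\theta_2 = \frac{5}{\sqrt{14} - 2} = \frac{\sqrt{14} + 2}{2}$. Then $a_2 = 2$ and so $\theta_3 = \frac{2}{\sqrt{14} - 2} = \frac{\sqrt{14} + 2}{5}$. Then $a_3 = 1$ and so $\theta_4 = \sqrt{14} + 3$. Finally, $a_4 = 6$ and so $\theta_5 = \frac{\sqrt{14} + 3}{5} = \theta_1$. Since $\theta_5 = \theta_1$ we get $a_5 = a_1$ etc., and so we have
\[ \theta_{i+4} = \theta_i, \quad a_{i+4} = a_i \]
for all $i \ge 1$.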
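The computation in Example 7.15 can be automated with integer arithmetic only. The sketch below rests on a standard observation that is not proved in these notes: every complete quotient of $\sqrt{D}$ has the form $\theta_k = \frac{\sqrt{D} + m_k}{d_k}$ with integers $m_k, d_k$, exactly as in the example. The function name `sqrt_cf` and the variable names are ours.

```python
from math import isqrt

def sqrt_cf(D):
    """Return (a_0, period) with sqrt(D) = [a_0, period repeated], D a non-square."""
    a0 = isqrt(D)
    if a0 * a0 == D:
        raise ValueError("D must not be a perfect square")
    m, d, a = 0, 1, a0           # theta_0 = (sqrt(D) + 0) / 1
    period = []
    first_state = None
    while True:
        # Next complete quotient theta_{k+1} = (sqrt(D) + m) / d.
        m = d * a - m
        d = (D - m * m) // d
        a = (a0 + m) // d        # a_{k+1} = floor(theta_{k+1})
        if first_state is None:
            first_state = (m, d)             # remember theta_1
        elif (m, d) == first_state:
            return a0, period                # theta_1 has recurred
        period.append(a)

print(sqrt_cf(14))
```

For $D = 14$ the successive pairs $(m, d)$ are $(3, 5), (2, 2), (2, 5), (3, 1)$ and then $(3, 5)$ again, matching $\theta_1, \ldots, \theta_5$ in the example.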

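We have seen that the continued fraction of $\sqrt{14}$ is 'periodic' and so we wonder if this is true in general.

Definition 7.16. A continued fraction is called eventually periodic if it is of the form
\[ [a_0, a_1, \ldots, a_m, \overline{a_{m+1}, \ldots, a_{m+n-1}}], \]
where the bar marks the block of partial quotients that repeats indefinitely, and purely periodic if there is no initial non-repeating part, i.e. if it is of the form $[\overline{a_0, \ldots, a_{n-1}}]$.

Definition 7.17. $\varphi$ is said to be a quadratic irrational if $\varphi$ is a root of $ax^2 + bx + c = 0$ for some $a, b, c \in \mathbb{Z}$ with $a \ne 0$, and $\varphi \notin \mathbb{Q}$.

Lemma 7.18. Let $\theta$ be eventually periodic. Then there exist integers $a, b, c, d$ such that $\frac{a\theta + b}{c\theta + d}$ is purely periodic.

Proof. Suppose $\theta$ is eventually periodic and
\[ \theta = [a_0, a_1, \ldots, a_m, \overline{a_{m+1}, \ldots, a_{m+n-1}}]. \]
Let $\theta_{m+1} = [\overline{a_{m+1}, \ldots, a_{m+n-1}}]$, which is purely periodic. Since $a_0, \ldots, a_m$ are integers and $\theta = [a_0, \ldots, a_m, \theta_{m+1}]$, there exist integers $a, b, c, d$ such that $\frac{a\theta + b}{c\theta + d} = \theta_{m+1}$.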

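Theorem 7.19 (Lagrange). Let $\theta \in \mathbb{R}$. Then $\theta$ is a quadratic irrational if and only if the continued fraction of $\theta$ is eventually periodic.

Proof. Suppose the continued fraction of $\theta$ is eventually periodic. By the previous lemma, take $a, b, c, d \in \mathbb{Z}$ so that $\alpha = \frac{a\theta + b}{c\theta + d}$ is purely periodic. Then
\[ \alpha = [\overline{a_0, a_1, \ldots, a_m}] = [a_0, \ldots, a_m, \alpha] = \frac{\alpha p_m + p_{m-1}}{\alpha q_m + q_{m-1}}, \]
where $a_i$, $p_i$, $q_i$ now denote the partial quotients and convergents of $\alpha$. Therefore $\alpha$ satisfies a quadratic polynomial with integer coefficients, and since $\alpha$ has an infinite continued fraction, $\alpha \notin \mathbb{Q}$. Then $\theta$ also satisfies a quadratic equation and is irrational.

Conversely, assume $\theta$ is a root of $ax^2 + bx + c = 0$. Let $f(x, y) = ax^2 + bxy + cy^2$. Then $f(\theta, 1) = 0$. Write
\[ \theta = [a_0, a_1, \ldots, a_n, \theta_{n+1}] = \frac{p_n \theta_{n+1} + p_{n-1}}{q_n \theta_{n+1} + q_{n-1}}. \]
Then $f(p_n \theta_{n+1} + p_{n-1}, q_n \theta_{n+1} + q_{n-1}) = 0$. So $\theta_{n+1}$ is a root of $A_n x^2 + B_n x + C_n = 0$ for some $A_n, B_n, C_n \in \mathbb{Z}$. Explicitly, we have
\[ A_n = a p_n^2 + b p_n q_n + c q_n^2, \quad B_n = 2a p_n p_{n-1} + b p_{n-1} q_n + b p_n q_{n-1} + 2c q_n q_{n-1}, \quad C_n = a p_{n-1}^2 + b p_{n-1} q_{n-1} + c q_{n-1}^2 \]
and so
\[ A_n = f(p_n, q_n), \quad C_n = f(p_{n-1}, q_{n-1}), \quad B_n^2 - 4 A_n C_n = b^2 - 4ac \]
because $p_n q_{n-1} - q_n p_{n-1} = \pm 1$.

Since $f(\theta, 1) = 0$,
\[ f\Bigl(\frac{p_n}{q_n}, 1\Bigr) = f\Bigl(\frac{p_n}{q_n}, 1\Bigr) - f(\theta, 1) = a \Bigl(\Bigl(\frac{p_n}{q_n}\Bigr)^2 - \theta^2\Bigr) + b \Bigl(\frac{p_n}{q_n} - \theta\Bigr) = \Bigl(a \frac{p_n}{q_n} + a\theta + b\Bigr) \Bigl(\frac{p_n}{q_n} - \theta\Bigr). \]
As $n \to \infty$, $a \frac{p_n}{q_n} + a\theta + b \to 2a\theta + b$ and so it is bounded in absolute value by some constant $M$. Since
\[ \Bigl|\frac{p_n}{q_n} - \theta\Bigr| < \frac{1}{q_n^2} \]
by Theorem 7.10, we get $\bigl|f\bigl(\frac{p_n}{q_n}, 1\bigr)\bigr| < \frac{M}{q_n^2}$ and so $|f(p_n, q_n)| < M$ for all $n$. This means that the coefficients $A_n, C_n$ are bounded. Since $B_n^2 - 4 A_n C_n = b^2 - 4ac$ is constant, $B_n$ is bounded as well, so there are only finitely many triples $(A_n, B_n, C_n)$. Therefore some triple occurs for three distinct indices; since a quadratic equation has at most two roots, two of the corresponding complete quotients coincide, say $\theta_{m+1} = \theta_{n+1}$ with $n \ne m$, and hence the continued fraction is eventually periodic.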

7.4 Pell’s equation

We study the integer solutions of $x^2 - dy^2 = 1$ where $d > 1$ is a square-free integer. The trivial solutions are $(x, y) = (\pm 1, 0)$. It suffices to study the solutions $x, y$ with $x, y > 0$.

Lemma 7.20. Suppose $x, y \in \mathbb{Z}_{>0}$ satisfy $x^2 - dy^2 = 1$. Then $\frac{x}{y}$ is a convergent of $\sqrt{d}$.

Proof. $1 = x^2 - dy^2 = (x - \sqrt{d} y)(x + \sqrt{d} y)$ and since $x + \sqrt{d} y > 0$, we have $x - \sqrt{d} y > 0$. Therefore
\[ 0 < x - \sqrt{d} y = \frac{1}{x + \sqrt{d} y} < \frac{1}{2\sqrt{d} y} < \frac{1}{2y} \]
and so $\bigl|\frac{x}{y} - \sqrt{d}\bigr| < \frac{1}{2y^2}$. Then apply Theorem 7.14 (ii).

We leave the following lemma as an exercise.

Lemma 7.21. Let $\theta$ be a quadratic irrational with conjugate $\theta'$ (which has the same quadratic minimal polynomial as $\theta$). Then $\theta$ has purely periodic continued fraction if and only if $\theta > 1$ and $-1 < \theta' < 0$.

Corollary 7.22. Let $d > 1$ be a square-free integer. Then
\[ \sqrt{d} = [a_0, \overline{a_1, \ldots, a_n}]. \]

Proof. Consider $\sqrt{d} = [a_0, \theta_1]$. Then $\theta_1 = \frac{1}{\sqrt{d} - a_0} > 1$ and the conjugate satisfies
\[ 0 > \theta_1' = \frac{1}{-\sqrt{d} - a_0} > -1 \]
and so by the previous lemma we conclude that $\theta_1$ is purely periodic.

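Theorem 7.23. Let $d \in \mathbb{Z}_{>1}$ be square-free. Then $x^2 - dy^2 = 1$ has a non-trivial solution.

Proof. Write $\sqrt{d} = [a_0, \theta_1]$ where $\theta_1$ is purely periodic. Since $\sqrt{d} = a_0 + \frac{1}{\theta_1}$, we have $\theta_1 = \frac{1}{\sqrt{d} - a_0}$. Let
\[ \theta_1 = [\overline{a_1, \ldots, a_n}] \]
so we have
\[ \sqrt{d} = \frac{\theta_1 p_n + p_{n-1}}{\theta_1 q_n + q_{n-1}} = \frac{p_n + p_{n-1} (\sqrt{d} - a_0)}{q_n + q_{n-1} (\sqrt{d} - a_0)}. \]
This shows that
\[ q_{n-1} d + (q_n - a_0 q_{n-1}) \sqrt{d} = (p_n - a_0 p_{n-1}) + p_{n-1} \sqrt{d}. \]
Since $\sqrt{d} \notin \mathbb{Q}$, we must have
\[ q_{n-1} d = p_n - a_0 p_{n-1} \quad \text{and} \quad q_n - a_0 q_{n-1} = p_{n-1}. \]
Therefore,
\[ p_{n-1}^2 - d q_{n-1}^2 = p_{n-1} q_n - a_0 q_{n-1} p_{n-1} - (p_n q_{n-1} - a_0 p_{n-1} q_{n-1}) = (-1)^n. \]
So $(p_{n-1}, q_{n-1})$ is a solution if $n$ is even, and $(p_{2n-1}, q_{2n-1})$ is a solution if $n$ is odd.

Remark 7.24. The above proof also shows that $x^2 - dy^2 = -1$ has a solution when the period of the continued fraction of $\sqrt{d}$ is odd.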

Corollary 7.25. Let $d > 1$ be a square-free integer and $G = \{x + y\sqrt{d} : x, y \in \mathbb{Z},\ x^2 - dy^2 = 1\}$. Then

(i) $G$ is a subgroup of $\mathbb{R}^{\times}$. In particular, the set of solutions to $x^2 - dy^2 = 1$ is a group isomorphic to $G$.

(ii) For any $0 < a < b$, $G \cap [a, b]$ is finite.

(iii) $G \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}$.

Proof. (i) It is easy to see that $G$ is closed under multiplication, and the inverse of $x + \sqrt{d} y$ is $x - \sqrt{d} y$. Therefore $G$ is a subgroup of $\mathbb{R}^{\times}$. The map $(x, y) \mapsto x + \sqrt{d} y$ gives a bijection between the set of solutions to $x^2 - dy^2 = 1$ and $G$. Define the group operation on the set of solutions by the group operation of $G$.

(ii) If $a < x + \sqrt{d} y < b$ then using $x - \sqrt{d} y = \frac{1}{x + \sqrt{d} y}$ we have
\[ \frac{1}{b} < x - \sqrt{d} y < \frac{1}{a}. \]
Adding the two inequalities gives $a + \frac{1}{b} < 2x < b + \frac{1}{a}$, so there are only finitely many integers $x$, and for each $x$ there are at most two integers $y$ with $x^2 - dy^2 = 1$.

(iii) Since there is a non-trivial solution to $x^2 - dy^2 = 1$ with $x, y > 0$, by (ii) we can take the solution $x, y$ with $x + \sqrt{d} y > 1$ and $x + \sqrt{d} y$ smallest (let $a = 1$ and keep increasing $b$ until we have a solution). Let $u + \sqrt{d} v > 1$ be such that $u^2 - dv^2 = 1$. Then there exists $n$ such that
\[ (x + \sqrt{d} y)^n \le u + \sqrt{d} v < (x + \sqrt{d} y)^{n+1} \]
and so
\[ 1 \le (u + \sqrt{d} v)(x - \sqrt{d} y)^n < x + \sqrt{d} y. \]
By the choice of $x + \sqrt{d} y$, we conclude that
\[ (u + \sqrt{d} v)(x - \sqrt{d} y)^n = 1 \]
and so $u + \sqrt{d} v = (x + \sqrt{d} y)^n$. This means that every $u + \sqrt{d} v > 1$ with $u^2 - dv^2 = 1$ is a power of $x + \sqrt{d} y$.

Every element of $G$ in $(0, 1)$ is the inverse of an element of $G$ bigger than 1, and hence is a negative power of $x + \sqrt{d} y$. So $G = \{\pm (x + \sqrt{d} y)^n : n \in \mathbb{Z}\}$ and so
\[ G \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}, \quad \pm (x + \sqrt{d} y)^n \mapsto (\pm, n). \]

Remark 7.26. The above corollary helps to generate solutions to $x^2 - dy^2 = 1$. For example, if $x, y$ is a solution then so is $x' = x^2 + dy^2$, $y' = 2xy$ because
\[ x' + \sqrt{d} y' = (x + \sqrt{d} y)^2 = x^2 + dy^2 + 2xy \sqrt{d}. \]
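Lemma 7.20 says that every positive solution of Pell's equation is a convergent of $\sqrt{d}$, so the smallest solution can be found by walking through the convergents until $p^2 - dq^2 = 1$. The sketch below does this, and then applies the squaring step of Remark 7.26. It is an illustration only: the names `pell_fundamental` and `next_by_squaring` are ours, and the integer recurrence for the complete quotients $(\sqrt{d} + m)/e$ is a standard device not developed in the notes.

```python
from math import isqrt

def pell_fundamental(d):
    """Smallest positive solution (x, y) of x^2 - d*y^2 = 1, d a non-square."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, e, a = 0, 1, a0               # complete quotient (sqrt(d) + m) / e
    p_prev, q_prev = 1, 0            # conventional p_{-1}, q_{-1}
    p, q = a0, 1                     # p_0, q_0
    while p * p - d * q * q != 1:
        m = e * a - m
        e = (d - m * m) // e
        a = (a0 + m) // e            # next partial quotient
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return p, q

def next_by_squaring(x, y, d):
    """From a solution (x, y), the solution given by (x + y*sqrt(d))^2 (Remark 7.26)."""
    return x * x + d * y * y, 2 * x * y

x, y = pell_fundamental(14)
print(x, y)
x2, y2 = next_by_squaring(x, y, 14)
assert x2 * x2 - 14 * y2 * y2 == 1
```

For $d = 14$ this stops at the convergent $\frac{p_3}{q_3} = \frac{15}{4}$, in agreement with Theorem 7.23 for the even period $n = 4$, and squaring gives the further solution $(449, 120)$.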


7.5 A set of real numbers modulo 1

Let $\theta$ be an irrational number. We will study the distribution of $\{\{n\theta\} : n \in \mathbb{N}\}$ where $\{n\theta\}$ is the fractional part of $n\theta$. Intuitively we would expect that these fractional parts are uniformly distributed in $[0, 1]$. We now formally define uniform distribution.

Definition 7.27. Let $R = \{a_1, a_2, \ldots\}$ be a set of real numbers in the interval $[0, 1]$ and let $0 \le \beta_1 < \beta_2 \le 1$. Let $T(n, \beta_1, \beta_2)$ be the size of
\[ \{a_i : i \le n,\ \beta_1 \le a_i \le \beta_2\}. \]
$R$ is said to be uniformly distributed in the unit interval $[0, 1]$ if
\[ \lim_{n \to \infty} \frac{T(n, \beta_1, \beta_2)}{n} = \beta_2 - \beta_1 \]
for all $\beta_1, \beta_2$ with $0 \le \beta_1 < \beta_2 \le 1$.

We leave the following theorem as an exercise.

Theorem 7.28 (Hurwitz). There are infinitely many convergents of $\theta$ such that
\[ \Bigl|\theta - \frac{p_n}{q_n}\Bigr| < \frac{1}{q_n^2 \sqrt{5}}. \]

We first prove a theorem which shows that the set $\{\{i\theta\} : i \in \mathbb{N}\}$ is dense in $[0, 1]$.

Theorem 7.29. Let $\theta$ be an irrational number and $\beta \in \mathbb{R}$. Then there exist infinitely many pairs of integers $x$ and $y$ such that
\[ |x\theta - y - \beta| < \frac{3}{x}. \]
In particular, take $\beta \in [0, 1]$ and then $y = [x\theta]$, so we have infinitely many $x$ such that $|\{x\theta\} - \beta| < \frac{3}{x}$.

Proof. By Hurwitz's theorem, there exist infinitely many positive integers $a, b$ with $(a, b) = 1$ such that
\[ \theta = \frac{a}{b} + \frac{\gamma}{b^2}, \text{ where } |\gamma| \le 1. \]
For each $b$, let $c$ be the integer such that $|c - b\beta|$ is smallest and so
\[ \beta = \frac{c}{b} + \frac{\delta}{2b}, \text{ where } |\delta| \le 1. \]
Since $(a, b) = 1$ there exist integers $x, y$ such that
\[ ax - by = c \]
and we can take $x$ with $b \le 2x < 3b$ because we can replace $x$ by $x + kb$ and $y$ by $y + ka$. Therefore
\[ |x\theta - y - \beta| = \Bigl|\frac{xa}{b} + \frac{x\gamma}{b^2} - y - \frac{c}{b} - \frac{\delta}{2b}\Bigr| = \Bigl|\frac{x\gamma}{b^2} - \frac{\delta}{2b}\Bigr| < \frac{x}{b^2} + \frac{1}{2b} < \frac{3}{x} \]
because $b > \frac{2x}{3}$. Since we have infinitely many such $b$, we have infinitely many $x$ and $y$.

We now prove the set is uniformly distributed.


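Lemma 7.30. Let $a, b$ be integers such that $b > 0$, $(a, b) = 1$ and
\[ \theta = \frac{a}{b} + \frac{\delta}{b^2}, \text{ where } |\delta| < 1. \]
For each $j$, let $J_j = \{jb + k : k = 0, 1, \ldots, b-1\}$ and
\[ K_j = \{\{i\theta\} : i \in J_j\}. \]
Then
\[ K_j = \Bigl\{ \Bigl\{\frac{k + \gamma}{b}\Bigr\} : k = 0, 1, \ldots, b-1 \Bigr\}, \]
where $\gamma$ depends on $j$ and $k$ and satisfies $|\gamma| < 2$.

Proof. We have
\[ \theta(jb + k) = \Bigl(\frac{a}{b} + \frac{\delta}{b^2}\Bigr)(jb + k) = ja + \frac{ka + j\delta}{b} + \frac{k\delta}{b^2}. \]
Since $ja$ is an integer,
\[ \{\theta(jb + k)\} = \Bigl\{\frac{ka + j\delta}{b} + \frac{k\delta}{b^2}\Bigr\} = \Bigl\{\frac{ka + [j\delta]}{b} + \frac{\gamma}{b}\Bigr\} \]
where
\[ \gamma = \frac{k\delta}{b} + \{j\delta\}. \]
Since $k < b$, we have $|\gamma| < 2$. Since $k$ ranges over a complete system of residues modulo $b$ and $(a, b) = 1$, $ka + [j\delta]$ also ranges over a complete system of residues modulo $b$. Therefore,
\[ K_j = \Bigl\{ \Bigl\{\frac{k + \gamma}{b}\Bigr\} : k = 0, 1, \ldots, b-1 \Bigr\}. \]

Lemma 7.31. Let $[\beta_1, \beta_2] \subset [0, 1]$ be a subinterval and let $b$ be sufficiently large so that we can choose integers $v, w$ to satisfy
\[ v - 1 < b\beta_1 \le v < w \le b\beta_2 < w + 1. \]
If $W_j$ is the size of
\[ \{x \in K_j : x \in [\beta_1, \beta_2]\}, \]
then $w - v - 2 \le W_j \le w - v + 2$.

Proof. $[\beta_1, \beta_2] \supset [v/b, w/b]$. So $W_j$ is bounded below by the number of elements of $K_j$ in the interval $[v/b, w/b]$. By the previous lemma,
\[ K_j = \Bigl\{ \Bigl\{\frac{k + \gamma}{b}\Bigr\} : k = 0, 1, \ldots, b-1 \Bigr\} \]
and so $W_j$ is bounded below by the number of integers $0 \le k < b$ such that
\[ v \le k + \gamma \le w \]
and so $W_j \ge w - v - 2$ because $|\gamma| < 2$. Similarly, $W_j$ is bounded above by the number of integers $0 \le k < b$ such that
\[ v - 1 < k + \gamma < w + 1 \]
and so $W_j \le w - v + 2$.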


Theorem 7.32. Let θ be an irrational number. Then the set

R = {{iθ} : i ∈ N}

is uniformly distributed in [0, 1].

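Proof. By Hurwitz's theorem, there exist infinitely many integers $a, b$ with $(a, b) = 1$ and $b > 0$ such that

$$\theta = \frac{a}{b} + \frac{\delta}{b^2}, \quad\text{where } |\delta| < 1.$$

For each subinterval $[\beta_1, \beta_2] \subset [0, 1]$ and $\varepsilon > 0$, pick $b$ large enough so that $b > 1/\varepsilon$ and we can choose integers $v, w$ to satisfy

$$v - 1 < b\beta_1 \le v < w \le b\beta_2 < w + 1.$$

Then pick $n > b$ such that $\frac{b(\beta_2 - \beta_1)}{n} < \varepsilon$, so that

$$\frac{1}{n} \le \frac{w - v}{n} < \varepsilon$$

and so $1 = O(n\varepsilon)$. Write $n = rb + s$ where $0 \le s < b$. By the definition of $W_j$ in the previous lemma, we have

$$\sum_{j=0}^{r-1} W_j \le T(n, \beta_1, \beta_2) \le \sum_{j=0}^{r} W_j$$

and so by the previous lemma,

$$r(w - v - 2) \le T(n, \beta_1, \beta_2) \le (r + 1)(w - v + 2).$$

Since $r = \frac{n-s}{b}$ and $w - v \le b(\beta_2 - \beta_1) < w - v + 2$, we conclude that

$$\frac{n - s}{b}\,\bigl(b(\beta_2 - \beta_1) - 4\bigr) \le T(n, \beta_1, \beta_2) \le \frac{n - s + b}{b}\,\bigl(b(\beta_2 - \beta_1) + 2\bigr).$$

But $s < b$ and $1/b < \varepsilon$, and so we have the following approximation

$$n(\beta_2 - \beta_1) + O(n\varepsilon) \le T(n, \beta_1, \beta_2) \le n(\beta_2 - \beta_1) + O(n\varepsilon)$$

and so

$$\left|\frac{T(n, \beta_1, \beta_2)}{n} - (\beta_2 - \beta_1)\right| = O(\varepsilon).$$

Now let $\varepsilon \to 0$, and so $n \to \infty$; then

$$\lim_{n\to\infty} \frac{T(n, \beta_1, \beta_2)}{n} = \beta_2 - \beta_1.$$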
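As a numerical illustration of Theorem 7.32 (not part of the proof), the following sketch counts how often $\{i\theta\}$ falls in a subinterval. The choice $\theta = \sqrt{2}$, the interval $[0.2, 0.5]$ and the function name `fraction_in_interval` are assumptions made for this example only, and floating-point arithmetic is used, so the output is approximate.

```python
import math

def fraction_in_interval(theta, n, lo, hi):
    """Return T(n, lo, hi) / n, where T counts i in 1..n with {i*theta} in [lo, hi]."""
    hits = sum(1 for i in range(1, n + 1) if lo <= (i * theta) % 1.0 <= hi)
    return hits / n

# theta = sqrt(2) is irrational, so the ratio should approach hi - lo = 0.3.
ratio = fraction_in_interval(math.sqrt(2), 100_000, 0.2, 0.5)
print(ratio)
```

The printed ratio should be close to the interval length $0.3$, in line with the limit in the theorem.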


7.6 Exercises

1. (i) Let $d$ and $m$ be positive integers such that $d$ is not a square and such that $m \le \sqrt{d}$. Prove that if $x$ and $y$ are positive integers satisfying $x^2 - dy^2 = m$ then $x/y$ is a convergent of $\sqrt{d}$.

(ii) Let $d$ and $m$ be positive integers such that $d$ is not a cube. If $x, y$ are integers satisfying $x^3 - dy^3 = m$ and $3d^{2/3}y > 8m$, then $x/y$ is a convergent of $d^{1/3}$.

2. Let $d$ be a positive integer that is not a square. Let $\theta_n$, $p_n/q_n$ be the complete quotients and convergents for $\theta = \sqrt{d}$. Show that for all $n \ge 1$ we have

$$p_{n-1} - q_{n-1}\sqrt{d} = (-1)^n \Big/ \prod_{i=1}^{n} \theta_i.$$

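3. Let $u_n$ be the $n$th Fibonacci number, i.e.

$$u_1 = 1, \quad u_2 = 1 \quad\text{and}\quad u_{n+2} = u_{n+1} + u_n, \quad n \ge 1.$$

Let $\alpha = \frac{1+\sqrt5}{2}$ and let $p_n/q_n$ be the $n$th convergent of $\alpha$. (i) Show that $\frac{p_n}{q_n} = \frac{u_{n+2}}{u_{n+1}}$. (ii) It is known that if $\alpha' = \frac{1-\sqrt5}{2}$, then

$$u_n = \frac{\alpha^n - \alpha'^n}{\sqrt5}.$$

Use this fact to show that $p \mid u_{p-1}$ if $p \equiv \pm 1 \bmod 5$ and $p \mid u_{p+1}$ if $p \equiv \pm 2 \bmod 5$.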

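4. Write $\beta_n = \frac{q_{n-1}}{q_{n-2}}$ where $\frac{p_n}{q_n}$ are the convergents of $\theta$. Let $\theta_n$ be the $n$th complete quotient. Show that if $\theta_n + \beta_n \le \sqrt5$ for three consecutive convergents, then $\beta_n + \frac{1}{\beta_n} < \sqrt5$. Prove Hurwitz's theorem.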

5. Assume that $\theta$ is a quadratic irrational which satisfies $\theta > 1$ and $-1 < \theta' < 0$, where $\theta'$ is the conjugate of $\theta$. Show that (i) $-1 < \theta'_n < 0$ where

$$\theta = [a_0, a_1, \ldots, a_{n-1}, \theta_n].$$

(ii) $a_n = \left[-\frac{1}{\theta'_{n+1}}\right]$.

(iii) $\theta$ has a purely periodic continued fraction.

(iv) Show further that if $\theta = [\overline{a_0, \ldots, a_n}]$ then

$$-\frac{1}{\theta'} = [\overline{a_n, a_{n-1}, \ldots, a_1, a_0}].$$

6. Assume $\theta > 1$ is irrational and has a purely periodic continued fraction with convergents $\frac{p_n}{q_n}$. Show that for some suitably chosen $n$, $f(\theta) = 0$ where

$$f(x) = x^2 q_{n-1} + x(q_{n-2} - p_{n-1}) - p_{n-2}.$$

Hence show that $-1 < \theta' < 0$. Together with the previous question, we conclude that an irrational number $\theta > 1$ has a purely periodic continued fraction if and only if $-1 < \theta' < 0$, which proves Lemma 7.21.


7. Show that there exist infinitely many $n$ such that $\sum_{i=1}^{n} i$ is a square.

8. Let $d$ be a rational number which is not a square. Then

$$\sqrt{d} = [a_0, \overline{a_1, a_2, \ldots, a_{n-1}, 2a_0}]$$

for some $n \ge 1$ where $a_{n-i} = a_i$ for $1 \le i < n$. Conversely, any continued fraction of this form is the square root of a rational number.


8 Primality testing and factoring

We discuss the following questions in this chapter: (i) Given a large $N$, can we effectively determine whether $N$ is prime? (ii) Given a large composite integer $N$, can we find a non-trivial factor of $N$? We may assume $N$ is odd in both questions.

Trial division up to $\sqrt{N}$ answers both questions but is not efficient for large $N$, although small factors are easily found this way. For (i) we can sometimes use the Fermat-Euler theorem to show that $N$ is not a prime, for if $N$ is a prime, then

$$a^{N-1} \equiv 1 \bmod N$$

for all $a$ with $(a, N) = 1$. For example, 9 is not a prime because

$$2^8 = (16)^2 \equiv (-2)^2 \equiv 4 \bmod 9.$$
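The compositeness check just described can be sketched in a few lines. The function name `fermat_test` is an assumption for this illustration; Python's built-in three-argument `pow` performs the modular exponentiation.

```python
from math import gcd

def fermat_test(N, a):
    """Return True if a^(N-1) is congruent to 1 mod N, for a base a coprime to N."""
    if gcd(a, N) != 1:
        raise ValueError("base must be coprime to N")
    return pow(a, N - 1, N) == 1

print(pow(2, 8, 9))        # 4, matching the computation above
print(fermat_test(9, 2))   # False, so 9 is composite
print(fermat_test(13, 2))  # True, as it must be for a prime
```

A `False` result proves that $N$ is composite; a `True` result proves nothing on its own, which is what motivates the pseudoprime definitions that follow.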

8.1 Fermat pseudoprime

Definition 8.1. We say b is a base for N if (b,N) = 1. We usually take b ∈ {1, 2, . . . , N − 1}.

Definition 8.2. An odd composite number $N$ is a (Fermat) pseudoprime to the base $b$ if $b^{N-1} \equiv 1 \bmod N$.

Theorem 8.3. For every integer b > 1, there exist infinitely many pseudoprimes to the base b.

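Proof. Let $p$ be an odd prime which does not divide any of $b$, $b + 1$, $b - 1$ and let $N = \frac{b^{2p}-1}{b^2-1}$. It is clear that $N$ is an integer. Moreover,

$$N = \frac{b^p - 1}{b - 1} \cdot \frac{b^p + 1}{b + 1}$$

and since $p$ is odd, $N$ is a product of two integers greater than 1, and so it is composite. Next, $N - 1 = \frac{b^{2p} - b^2}{b^2 - 1}$. By Fermat-Euler, $b^p \equiv b \bmod p$ and so $b^{2p} \equiv b^2 \bmod p$. Since $p \nmid b^2 - 1$, we get $N - 1 = kp$ for some $k$. But

$$N - 1 = b^2\,\frac{b^{2(p-1)} - 1}{b^2 - 1} = b^2\bigl(1 + b^2 + \cdots + b^{2(p-2)}\bigr)$$

and so $N - 1$ is even, because the sum has $p - 1$ terms (an even number), all of the same parity. Therefore $k$ is even and so $k = 2m$ for some $m$. On the other hand, $b^{2p} - 1 = (b^2 - 1)N$ and so $b^{2p} \equiv 1 \bmod N$. So

$$b^{N-1} = b^{2mp} = (b^{2p})^m \equiv 1 \bmod N.$$

Distinct primes $p$ give distinct values of $N$, so this produces infinitely many pseudoprimes to the base $b$.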
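The construction in this proof is easy to try out. The sketch below (the function name `pseudoprime_from_prime` is an assumption for illustration) builds $N = (b^{2p}-1)/(b^2-1)$ and checks the congruence $b^{N-1} \equiv 1 \bmod N$. It does not verify that $p$ is prime; the caller is assumed to supply an odd prime.

```python
def pseudoprime_from_prime(b, p):
    """N = (b^(2p) - 1) / (b^2 - 1) for an odd prime p not dividing b, b - 1 or b + 1."""
    if p % 2 == 0 or (b * (b - 1) * (b + 1)) % p == 0:
        raise ValueError("p must be an odd prime not dividing b, b - 1 or b + 1")
    return (b ** (2 * p) - 1) // (b * b - 1)

for p in (5, 7, 11):
    N = pseudoprime_from_prime(2, p)
    # Each N is composite and satisfies 2^(N-1) = 1 mod N.
    print(p, N, pow(2, N - 1, N) == 1)
```

For $b = 2$ and $p = 5$ this gives $N = 341 = 11 \cdot 31$, the smallest pseudoprime to the base 2.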

Definition 8.4. Let $N$ be an odd composite number. $N$ is called a Carmichael number if $N$ is a pseudoprime to all bases $b$.

A result of Alford, Granville and Pomerance of 1992 gives the existence of infinitely many Carmichael numbers.
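For small $N$ the definition can be checked by brute force. The sketch below uses a hypothetical helper name and assumes the caller passes an odd composite $N$; it tests every base coprime to $N$. The number 561 = 3 · 11 · 17 is the standard smallest example of a Carmichael number.

```python
from math import gcd

def is_pseudoprime_to_all_bases(N):
    """Check b^(N-1) = 1 mod N for every base 1 <= b < N coprime to N.
    N is assumed to be an odd composite number."""
    return all(pow(b, N - 1, N) == 1 for b in range(1, N) if gcd(b, N) == 1)

print(is_pseudoprime_to_all_bases(561))  # True
print(is_pseudoprime_to_all_bases(341))  # False: 341 fails to the base 3
```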

8.2 Euler pseudoprime

Recall that Euler’s criterion states that

$$a^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right) \bmod p.$$


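Definition 8.5. An odd composite integer $N$ is called an Euler pseudoprime to the base $b$ if

$$b^{\frac{N-1}{2}} \equiv \left(\frac{b}{N}\right) \bmod N,$$

where $\left(\frac{b}{N}\right)$ is the Jacobi symbol.

Note that squaring both sides gives $b^{N-1} \equiv 1 \bmod N$, and so every Euler pseudoprime is a Fermat pseudoprime.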

Using Euler’s criterion we have the following conclusion.

Proposition 8.6. The set

{b ∈ (Z/NZ)× : N is an Euler pseudoprime to b}

is a subgroup of $(\mathbb{Z}/N\mathbb{Z})^\times$.

Theorem 8.7. Let $N$ be an odd composite integer. Then there exists $b$ such that $N$ is not an Euler pseudoprime to the base $b$.

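Proof. If $N$ is square free, then $N = pm$ where $(p, m) = 1$, $p$ is a prime and $m \ge 3$. Pick $u$ such that $\left(\frac{u}{p}\right) = -1$. Then take $b$ such that

$$b \equiv u \bmod p, \quad b \equiv 1 \bmod m.$$

Then $\left(\frac{b}{N}\right) = \left(\frac{b}{p}\right)\left(\frac{b}{m}\right) = -1$. If $b^{\frac{N-1}{2}} \equiv -1 \bmod N$ then $b^{\frac{N-1}{2}} \equiv -1 \bmod m$. But $b \equiv 1 \bmod m$, so $b^{\frac{N-1}{2}} \equiv 1 \bmod m$, which is a contradiction because $m \ge 3$.

If $N$ is not square free, then $N = p^r m$ where $r \ge 2$ and $(p, m) = 1$. So $(1 + p)^{N-1} \equiv 1 + (N - 1)p \equiv 1 - p \bmod p^2$ and so

$$(1 + p)^{N-1} \not\equiv 1 \bmod p^r.$$

Take $b$ such that

$$b \equiv 1 + p \bmod p^r, \quad b \equiv 1 \bmod m.$$

Then $(b, N) = 1$ and $b^{N-1} \not\equiv 1 \bmod N$. So $N$ is not a Fermat pseudoprime to the base $b$ and hence not an Euler pseudoprime to the base $b$.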

Using the above theorem, we have the following primality test: Given an odd integer $N$, test whether

$$b^{\frac{N-1}{2}} \equiv \left(\frac{b}{N}\right) \bmod N$$

for several randomly chosen $b$. If any of these fail then $N$ is composite.
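A minimal sketch of this test follows. The Jacobi symbol is computed with the standard reciprocity-based algorithm; the function names `jacobi` and `passes_euler_test` are assumptions made for this illustration, and bases sharing a factor with $N$ are treated as failing.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:            # pull out factors of 2 using (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                  # reciprocity step
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def passes_euler_test(N, b):
    """True if b^((N-1)/2) is congruent to (b/N) mod N, with (b, N) = 1."""
    j = jacobi(b, N)
    if j == 0:                       # b shares a factor with N
        return False
    return pow(b, (N - 1) // 2, N) == j % N

print(passes_euler_test(9, 2))    # False, so 9 is composite
print(passes_euler_test(561, 2))  # True: 561 is an Euler pseudoprime to the base 2
```

In practice one would draw several bases at random, as the text says; fixed bases are used here only to keep the output reproducible.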

8.3 Strong pseudoprime

Definition 8.8. Let $N$ be an odd integer and write $N - 1 = 2^s t$ where $s \ge 1$ and $t$ is odd. $N$ is called a strong pseudoprime to the base $b$ if either $b^t \equiv 1 \bmod N$ or

$$b^{2^r t} \equiv -1 \bmod N, \quad\text{for some } 0 \le r < s.$$

Remark 8.9. If $N$ is a prime, then $N$ is a strong pseudoprime to the base $b$ for all bases $b$. Indeed, either $b^t \equiv 1 \bmod N$, or $b^t \not\equiv 1 \bmod N$. In the latter case, by Fermat-Euler,

$$b^{2^s t} \equiv b^{N-1} \equiv 1 \bmod N.$$

Let $k$ be the largest integer such that $b^{2^k t} \not\equiv 1 \bmod N$. Then

$$b^{2^{k+1} t} \equiv 1 \bmod N.$$

But $N$ is prime and so $b^{2^k t} \equiv -1 \bmod N$.
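Definition 8.8 translates directly into code. The following sketch (the function name is an assumption) computes $b^t$ and then squares up to $s - 1$ times, looking for $-1$.

```python
def is_strong_pseudoprime_to_base(N, b):
    """Check the condition of Definition 8.8 for an odd integer N > 2 and base b."""
    s, t = 0, N - 1
    while t % 2 == 0:                # write N - 1 = 2^s * t with t odd
        s, t = s + 1, t // 2
    x = pow(b, t, N)
    if x == 1 or x == N - 1:         # b^t = 1, or the case r = 0
        return True
    for _ in range(s - 1):           # the cases r = 1, ..., s - 1
        x = x * x % N
        if x == N - 1:
            return True
    return False

print(is_strong_pseudoprime_to_base(2047, 2))  # True: 2047 = 23 * 89
print(is_strong_pseudoprime_to_base(341, 2))   # False, although 341 is a Fermat pseudoprime
```

As in Remark 8.9, every prime passes for every base, so a `False` result proves compositeness.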


We now give a theorem which gives an upper bound for the size of

$$S(N) = \{b : 1 \le b < N,\ (b, N) = 1,\ N \text{ is a strong pseudoprime to the base } b\}.$$

Theorem 8.10 (Mercer). Assume $N$ is an odd composite integer and $N > 9$. Then $|S(N)| \le \frac{1}{4}\varphi(N)$.

The idea is to estimate the size of the following set.

Definition 8.11. For each $N$, write $N - 1 = 2^s t$ with $t$ odd, let $u(N)$ be the largest integer such that $2^{u(N)}$ divides $p - 1$ for every prime factor $p$ of $N$, and let

$$U(N) = \{b : 1 \le b < N,\ (b, N) = 1,\ b^{2^{u(N)-1} t} \equiv \pm 1 \bmod N\}.$$

Lemma 8.12. S(N) ⊂ U(N) and so |S(N)| ≤ |U(N)|.

Proof. Let $b \in S(N)$. So either (i) $b^t \equiv 1 \bmod N$ or (ii) there exists $0 \le r < s$ such that

$$b^{2^r t} \equiv -1 \bmod N.$$

If (i) holds then clearly $b \in U(N)$. Assume (ii) holds. Take $p \mid N$ and let $k_p$ be the order of $b$ in $(\mathbb{Z}/p\mathbb{Z})^\times$. We have

$$b^{2^{r+1} t} = (b^{2^r t})^2 \equiv 1 \bmod N$$

and so $b^{2^{r+1} t} \equiv 1 \bmod p$. So $k_p \mid 2^{r+1} t$. But $k_p \nmid 2^r t$, so $k_p$ must be divisible by $2^{r+1}$. On the other hand, $k_p \mid p - 1$ by Fermat-Euler. Therefore,

$$2^{r+1} \mid k_p \mid p - 1$$

and so $r + 1 \le u(N)$.

Suppose $r + 1 = u(N)$; then $u(N) - 1 = r$ and so $b^{2^{u(N)-1} t} \equiv -1 \bmod N$. Suppose $r + 1 < u(N)$; then $u(N) - 1 \ge r + 1$ and so $b^{2^{u(N)-1} t} \equiv 1 \bmod N$. So $b \in U(N)$.

Theorem 8.13. For each $N$, let $w(N)$ be the number of distinct prime factors of $N$. Write $N - 1 = 2^s t$ with $t$ odd; then

$$|U(N)| = 2 \cdot 2^{(u(N)-1)w(N)} \prod_{p \mid N} (t, p - 1).$$

Proof. Let $m = 2^{u(N)-1} t$; we count the number of solutions of

$$x^m \equiv \pm 1 \bmod N.$$

Let $N = p_1^{a_1} \cdots p_k^{a_k}$ with $a_i \ge 1$, so $w(N) = k$.

We first count the number of solutions of $x^m \equiv 1 \bmod N$ by counting the number of solutions of $x^m \equiv 1 \bmod p_i^{a_i}$ for each $i$. Since $(\mathbb{Z}/p_i^{a_i}\mathbb{Z})^\times$ is a cyclic group of order $p_i^{a_i-1}(p_i - 1)$, we have $d$ solutions where $d = (m, p_i^{a_i-1}(p_i - 1))$. Since $p_i \mid N$ and $(2, N) = (t, N) = 1$, we have $(m, p_i^{a_i-1}) = 1$. Therefore, $d = (m, p_i - 1)$. But $2^{u(N)} \mid p_i - 1$ and $t$ is odd, and so

$$(m, p_i - 1) = 2^{u(N)-1}(t, p_i - 1).$$

Since this is true for each $i$, by the Chinese remainder theorem we have

$$2^{(u(N)-1)w(N)} \prod_{p \mid N} (t, p - 1)$$

solutions of $x^m \equiv 1 \bmod N$.

Now we count the number of solutions of $x^m \equiv -1 \bmod N$. We again consider the number of solutions of $x^m \equiv -1 \bmod p_i^{a_i}$ for each $i$. The number of solutions of

$$x^{2m} \equiv 1 \bmod p_i^{a_i}$$

is

$$(2m, p_i^{a_i-1}(p_i - 1)) = (2m, p_i - 1) = 2^{u(N)}(t, p_i - 1)$$

by the same argument as above. Therefore the number of solutions of

$$x^m \equiv -1 \bmod p_i^{a_i}$$

is $2^{u(N)-1}(t, p_i - 1)$, because the odd prime $p_i$ cannot divide both $x^m - 1$ and $x^m + 1$, and so $x^{2m} \equiv 1 \bmod p_i^{a_i}$ if and only if

$$x^m \equiv \pm 1 \bmod p_i^{a_i}.$$

Therefore the number of solutions of $x^m \equiv \pm 1 \bmod N$ is $2 \cdot 2^{(u(N)-1)w(N)} \prod_{p \mid N} (t, p - 1)$.
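The formula of Theorem 8.13 can be sanity-checked against a direct count for small $N$. This is only a brute-force sketch; the helper names (`prime_factors`, `two_adic_valuation`, `U_size_bruteforce`, `U_size_formula`) are assumptions for this illustration.

```python
from math import gcd

def prime_factors(n):
    """Distinct prime factors of n by trial division."""
    ps, d = [], 2
    while d * d <= n:
        if n % d == 0:
            ps.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        ps.append(n)
    return ps

def two_adic_valuation(n):
    """Largest e with 2^e dividing n (n > 0)."""
    e = 0
    while n % 2 == 0:
        n, e = n // 2, e + 1
    return e

def parameters(N):
    """Return (t, u, primes): odd part t of N - 1, u(N), and the primes dividing N."""
    primes = prime_factors(N)
    t = (N - 1) >> two_adic_valuation(N - 1)
    u = min(two_adic_valuation(p - 1) for p in primes)
    return t, u, primes

def U_size_bruteforce(N):
    t, u, _ = parameters(N)
    m = 2 ** (u - 1) * t
    return sum(1 for b in range(1, N)
               if gcd(b, N) == 1 and pow(b, m, N) in (1, N - 1))

def U_size_formula(N):
    t, u, primes = parameters(N)
    size = 2 * 2 ** ((u - 1) * len(primes))
    for p in primes:
        size *= gcd(t, p - 1)
    return size

for N in (15, 25, 65, 91):
    print(N, U_size_bruteforce(N), U_size_formula(N))
```

For example, for $N = 65$ both counts give 8: four square roots of $1$ and four square roots of $-1$ modulo 65.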

We now prove the result of Mercer.

Proof. Let $\delta(N) = \frac{\varphi(N)}{|U(N)|}$; we shall prove $\delta(N) \ge 4$. Then $\frac{\varphi(N)}{|S(N)|} \ge 4$ because we have shown that $|S(N)| \le |U(N)|$. Let $N = \prod_i p_i^{a_i}$, and so

$$\varphi(N) = \prod_i p_i^{a_i-1}(p_i - 1).$$

Then by the previous theorem, we have

$$\delta(N) = \frac{1}{2} \prod_i \frac{p_i^{a_i-1}(p_i - 1)}{2^{u(N)-1}(t, p_i - 1)}.$$

It is clear that $\frac{p-1}{2^{u(N)-1}(t, p-1)} \ge 2$ for each prime $p \mid N$. If $w(N) \ge 2$ and $a_i \ge 2$ for some $i$, then

$$\delta(N) \ge \frac{1}{2} \cdot 2^2 \cdot 3 = 6.$$

Similarly, if $w(N) \ge 3$ then

$$\delta(N) \ge \frac{1}{2} \cdot 2^3 = 4.$$

So the theorem follows in these cases. It remains to check the theorem for the following cases: (i) $w(N) = 2$ and $a_i = 1$ for all $i$, i.e. $N = pq$ with $p < q$; (ii) $w(N) = 1$. In the first case, we have

$$\delta(N) = \frac{1}{2} \cdot \frac{(p - 1)(q - 1)}{2^{2u(N)-2}(t, p - 1)(t, q - 1)}$$

where $2^{u(N)} \mid p - 1$ and $2^{u(N)} \mid q - 1$. If $2^{u(N)+1} \mid q - 1$ then

$$\frac{q - 1}{2^{u(N)-1}(t, q - 1)} \ge 4$$

and so the result follows. If not, since

$$N - 1 = pq - 1 \equiv p - 1 \bmod q - 1$$

and $0 < p - 1 < q - 1$, we have $q - 1 \nmid N - 1$, while $2^{u(N)} \mid N - 1$. So there must be some odd factor $r \mid q - 1$ such that $r \nmid N - 1$, because $2^{u(N)+1} \nmid q - 1$. Then

$$\frac{q - 1}{2^{u(N)}(t, q - 1)} \ge r \ge 3$$

and so $\delta(N) \ge 6$.

In the second case, since $N$ is composite, we have $N = p^j$ with $j \ge 2$. Then

$$\delta(N) = \frac{p^{j-1}(p - 1)}{2^{u(N)}(t, p - 1)} \ge p^{j-1}.$$

$p$ is odd and so $p \ge 3$. If $p = 3$ then $j \ge 3$ because we assume $N > 9$, and so $\delta(N) \ge 9$. If $p > 3$ then $\delta(N) \ge 4$.

Remark 8.14. Mercer's result shows that the strong pseudoprime test is much more efficient than the Euler pseudoprime test.
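The bound $|S(N)| \le \frac{1}{4}\varphi(N)$ can be observed numerically for small odd composite $N$. The sketch below is a brute-force check under hypothetical helper names; it also shows why the hypothesis $N > 9$ is needed.

```python
from math import gcd, isqrt

def is_strong(N, b):
    """Strong pseudoprime condition of Definition 8.8 for odd N and base b."""
    s, t = 0, N - 1
    while t % 2 == 0:
        s, t = s + 1, t // 2
    x = pow(b, t, N)
    if x == 1 or x == N - 1:
        return True
    for _ in range(s - 1):
        x = x * x % N
        if x == N - 1:
            return True
    return False

def is_odd_composite(N):
    return N % 2 == 1 and any(N % d == 0 for d in range(3, isqrt(N) + 1, 2))

def S_and_phi(N):
    """Return (|S(N)|, phi(N)) by direct enumeration of the bases."""
    units = [b for b in range(1, N) if gcd(b, N) == 1]
    return sum(1 for b in units if is_strong(N, b)), len(units)

print(S_and_phi(9))   # (2, 6): the bound fails for N = 9
print(S_and_phi(91))  # (18, 72): equality |S(N)| = phi(N) / 4
```

The case $N = 91 = 7 \cdot 13$ shows that the constant $\frac14$ cannot be improved in general.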

Finally, we show that every strong pseudoprime is an Euler pseudoprime.

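Lemma 8.15. Let $N$ be an odd composite number and $N - 1 = 2^s t$ with $t$ odd. Suppose

$$b^{2^r t} \equiv -1 \bmod N, \quad\text{for some } 0 \le r < s.$$

Then for each prime $p \mid N$, if $p - 1 = 2^{s'} t'$ with $t'$ odd, then $s' \ge r + 1$ and

$$\left(\frac{b}{p}\right) = -1 \text{ if } s' = r + 1, \quad\text{and}\quad \left(\frac{b}{p}\right) = 1 \text{ if } s' > r + 1.$$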

Proof. The fact $s' \ge r + 1$ can be deduced by a similar argument as in the proof of Lemma 8.12: indeed, if $k_p$ is the order of $b$ in $(\mathbb{Z}/p\mathbb{Z})^\times$, we have shown that

$$2^{r+1} \mid k_p \mid p - 1$$

and so $r + 1 \le s'$. Since $t'$ is odd, we have

$$b^{2^r t t'} \equiv -1 \bmod p.$$

By Euler's criterion, we have

$$\left(\frac{b}{p}\right)^t \equiv b^{(p-1)t/2} = b^{2^{s'-1} t t'} \bmod p.$$

If $s' = r + 1$ then $\left(\frac{b}{p}\right)^t \equiv -1 \bmod p$, and if $s' > r + 1$ then $\left(\frac{b}{p}\right)^t \equiv 1 \bmod p$. The result follows because $t$ is odd.

Theorem 8.16. Let $N$ be an odd composite number. If $N$ is a strong pseudoprime to the base $b$, then $N$ is an Euler pseudoprime to the base $b$.

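Proof. Let $N - 1 = 2^s t$. Suppose $b^t \equiv 1 \bmod N$. Then

$$b^{\frac{N-1}{2}} = (b^t)^{2^{s-1}} \equiv 1 \bmod N \quad\text{and}\quad 1 = \left(\frac{b^t}{N}\right) = \left(\frac{b}{N}\right)^t.$$

Since $t$ is odd, $\left(\frac{b}{N}\right) = 1$ and so $b^{\frac{N-1}{2}} \equiv \left(\frac{b}{N}\right) \bmod N$.

Now suppose $b^{2^{s-1} t} \equiv -1 \bmod N$. Since $\frac{N-1}{2} = 2^{s-1} t$, we need to show that $\left(\frac{b}{N}\right) = -1$. By the previous lemma with $r = s - 1$, if $N = \prod_{p \mid N} p$ (the primes not necessarily distinct), then

$$\left(\frac{b}{N}\right) = \prod_{p \mid N} \left(\frac{b}{p}\right) = (-1)^k$$

where $k$ is the number of $p$ with $s' = s$. For those $p$ with $s' \ge s + 1$, we have $p \equiv 1 \bmod 2^{s+1}$, and for those $p$ with $s' = s$ we have

$$p = 1 + 2^s t' \equiv 1 + 2^s \bmod 2^{s+1}$$

because $t'$ is odd. Similarly, $N = 1 + 2^s t \equiv 1 + 2^s \bmod 2^{s+1}$. So

$$1 + 2^s \equiv N \equiv \prod_{p \mid N} p \equiv (1 + 2^s)^k \equiv 1 + k 2^s \bmod 2^{s+1}$$

and so $k$ must be odd. This shows that $\left(\frac{b}{N}\right) = -1$.

Finally, suppose $b^{2^r t} \equiv -1 \bmod N$ where $0 \le r < s - 1$. Then $\frac{N-1}{2} = 2^{s-1} t$ is a multiple of $2^{r+1} t$ and so

$$b^{\frac{N-1}{2}} \equiv 1 \bmod N.$$

So we need to show that $\left(\frac{b}{N}\right) = 1$. By the previous lemma, if $N = \prod_{p \mid N} p$ (the primes not necessarily distinct), then

$$\left(\frac{b}{N}\right) = \prod_{p \mid N} \left(\frac{b}{p}\right) = (-1)^k$$

where $k$ is the number of $p$ with $s' = r + 1$. For those $p$ with $s' \ge r + 2$, we have $p \equiv 1 \bmod 2^{r+2}$, and for those $p$ with $s' = r + 1$, we have

$$p = 1 + 2^{r+1} t' \equiv 1 + 2^{r+1} \bmod 2^{r+2}$$

because $t'$ is odd. Since $r \le s - 2$, we have $r + 2 \le s$ and so

$$N = 1 + 2^s t \equiv 1 \bmod 2^{r+2}.$$

So

$$1 \equiv N \equiv \prod_{p \mid N} p \equiv (1 + 2^{r+1})^k \equiv 1 + k 2^{r+1} \bmod 2^{r+2}.$$

This shows that $k$ must be even and so $\left(\frac{b}{N}\right) = 1$.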

8.4 Fermat factorisation

The idea of Fermat factorisation is to write a composite number $N$ as a difference of two perfect squares.

Lemma 8.17. Let $N$ be an odd integer. There is a bijection

$$\{(a, b) \in \mathbb{N}^2 : a > b,\ ab = N\} \to \{(r, s) \in \mathbb{N}^2 : r^2 - s^2 = N\}, \quad (a, b) \mapsto \left(\frac{a + b}{2}, \frac{a - b}{2}\right).$$

Proof. Let $r = \frac{a+b}{2}$, $s = \frac{a-b}{2}$; then clearly $r^2 - s^2 = N$, and $r, s \in \mathbb{N}$ because $a, b$ are both odd. The inverse is given by $a = r + s$, $b = r - s$.

Fermat factorisation is described by the following algorithm. For each $i \ge 1$, let $r_i = [\sqrt{N}] + i$. If $r_i^2 - N$ is a perfect square, then stop. This will eventually stop if we assume $N$ is a composite odd number.

Example 8.18. $N = 15$. $r_1 = 4$, and $4^2 - 15 = 1^2$, so $r = 4$ and $s = 1$, giving $15 = (4 - 1)(4 + 1) = 3 \cdot 5$.

Remark 8.19. In general this method is no better than trial division. But it is efficient if $N$ has a factor close to $\sqrt{N}$.
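The algorithm above can be sketched as follows, using exact integer square roots. The function name `fermat_factor` is an assumption for illustration, and $N$ is assumed to be odd.

```python
from math import isqrt

def fermat_factor(N):
    """Return (r - s, r + s) with r^2 - s^2 = N, trying r = [sqrt(N)] + i for i = 1, 2, ...
    N is assumed odd; for prime N the search ends with the trivial pair (1, N)."""
    r = isqrt(N)
    if r * r == N:
        return r, r
    while True:
        r += 1
        s2 = r * r - N
        s = isqrt(s2)
        if s * s == s2:
            return r - s, r + s

print(fermat_factor(15))      # (3, 5), from r = 4 and s = 1 as in Example 8.18
print(fermat_factor(200819))  # (409, 491), from r = 450 and s = 41
```

The second call stops at $i = 2$, which illustrates Remark 8.19: the two factors are close to $\sqrt{200819} \approx 448$.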


Here are some possible improvements. Suppose we want to factorise $200819 = r^2 - s^2$. Since $200819 \equiv 2 \bmod 3$ and squares are $0$ or $1 \bmod 3$, we must have $r \equiv 0 \bmod 3$ and $s \equiv \pm 1 \bmod 3$.

We can also try to factorise $kN$ for some integer $k$, and consider $r_i = [\sqrt{kN}] + i$. If $N = ab$ and $\frac{a}{b}$ is approximately $\frac{u}{v}$ for some small $u, v$, then let $k = uv$ and we have

$$kN = uvab = \left(\frac{av + bu}{2}\right)^2 - \left(\frac{av - bu}{2}\right)^2$$

where $\frac{av - bu}{2}$ is much smaller than $\frac{a - b}{2}$.

8.5 Factor bases

The factor base method is one application of linear algebra in factorisation. We generalise Fermat factorisation in the following way: find integers $r, s$ such that $r^2 \equiv s^2 \bmod N$. Then $(r - s, N)$ is a proper divisor of $N$ unless $r \equiv s \bmod N$. Similarly, $(r + s, N)$ is a proper divisor of $N$ unless $r \equiv -s \bmod N$.

Definition 8.20. A factor base $B$ is a set $\{p_1, \ldots, p_k\}$ of distinct primes where we allow $p_1$ to be $-1$. For a given odd composite $N$, the square of an integer $b$ is called a $B$-number if $b'$ can be expressed as a product of powers of elements of $B$, where $b' \equiv b^2 \bmod N$ and $|b'| < \frac{1}{2}N$.

The following algorithm describes the factor base method:

(i) Choose a suitable factor base B.

(ii) Find some $B$-numbers $b_1, \ldots, b_m$ and record $b'_1, \ldots, b'_m$ where $b'_i \equiv b_i^2 \bmod N$ and $|b'_i| < \frac{1}{2}N$.

(iii) Find some $I \subset \{1, \ldots, m\}$ such that $\prod_{i \in I} b'_i$ is a square, say $c^2$.

(iv) Let $b = \prod_{i \in I} b_i$, and so $b^2 \equiv c^2 \bmod N$.

(v) Compute $(N, b - c)$ and $(N, b + c)$ and hope to get a non-trivial factor of $N$.

Example 8.21. $N = 4633$. We have

$$67^2 \equiv -144, \quad 68^2 \equiv -9 \bmod 4633$$

and so we take $B = \{-1, 2, 3\}$. Then $(67 \cdot 68)^2 \equiv (12 \cdot 3)^2$ and so we have

$$b = 67 \cdot 68 = 4556 \equiv -77 \bmod 4633$$

and $c = 12 \cdot 3 = 36$. We have $(4633, b + c) = (4633, -41) = 41$ and so $4633 = 41 \cdot 113$.
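The arithmetic of this example can be replayed directly; the variable names below are only for illustration.

```python
from math import gcd

N = 4633
# Least absolute residues of 67^2 and 68^2 mod N.
residues = [(x * x) % N - N for x in (67, 68)]
print(residues)                      # [-144, -9]

b = 67 * 68 % N                      # 4556, which is -77 mod N
c = 12 * 3                           # square root of (-144) * (-9) = 1296
print(gcd(N, b - c), gcd(N, b + c))  # 113 41
```

Both gcds are non-trivial here, and together they give the full factorisation $4633 = 41 \cdot 113$.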

Remark 8.22. In step (iii), let $\mathbb{F}_2 = \mathbb{Z}/2\mathbb{Z} = \{0, 1\}$ and write $B = \{r_1, \ldots, r_k\}$. Let $S(N, B)$ be the set of $B$-numbers mod $N$. So if $b \in S(N, B)$ then $b' = r_1^{a_1} \cdots r_k^{a_k}$ for some $a_i$. Define

$$\lambda : S(N, B) \to \mathbb{F}_2^k, \quad b \mapsto (\bar{a}_1, \ldots, \bar{a}_k)$$

where $\bar{a}_i$ is $a_i \bmod 2$. Then in step (iii) we only need to seek a non-empty $I \subset \{1, 2, \ldots, m\}$ such that $\sum_{i \in I} \lambda(b_i) = 0$. So we seek a linear dependence relation satisfied by the $m$ elements $\lambda(b_1), \ldots, \lambda(b_m)$ in the $k$-dimensional vector space $\mathbb{F}_2^k$. If $m \ge k + 1$ then we are guaranteed to find such a relation. It is the same as finding a non-zero vector in the left kernel of the matrix whose $i$th row is $\lambda(b_i)$.
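The map $\lambda$ and the search for a dependence can be sketched as follows. For simplicity this illustration tries subsets by brute force rather than performing Gaussian elimination over $\mathbb{F}_2$, which is what one would use for a factor base of realistic size. All function names are assumptions made for the example.

```python
from itertools import combinations

def least_abs_residue(x, N):
    """Residue of x mod N lying in the interval (-N/2, N/2]."""
    r = x % N
    return r - N if r > N // 2 else r

def parity_vector(value, base):
    """Exponent vector mod 2 of `value` over `base` (which may contain -1),
    or None if `value` is not a product of powers of elements of `base`."""
    if value == 0:
        return None
    vec = []
    for p in base:
        e = 0
        if p == -1:
            if value < 0:
                value, e = -value, 1
        else:
            while value % p == 0:
                value //= p
                e += 1
        vec.append(e % 2)
    return vec if value == 1 else None

def find_dependency(vectors):
    """Smallest non-empty index set whose vectors sum to zero mod 2, or None."""
    k = len(vectors[0])
    for size in range(1, len(vectors) + 1):
        for idx in combinations(range(len(vectors)), size):
            if all(sum(vectors[i][j] for i in idx) % 2 == 0 for j in range(k)):
                return idx
    return None

N, B = 4633, [-1, 2, 3]
vectors = [parity_vector(least_abs_residue(x * x, N), B) for x in (67, 68)]
print(vectors)                   # [[1, 0, 0], [1, 0, 0]]
print(find_dependency(vectors))  # (0, 1)
```

With the data of Example 8.21, the two vectors coincide, so $I = \{1, 2\}$ works and $b'_1 b'_2 = 1296 = 36^2$.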


8.6 The Continued fraction method

As we can see, the important thing in the factor base method is to pick a 'good' factor base $B$. We want numbers $b$ such that $b'$ is a product of small primes. One approach is to try integers close to $\sqrt{kN}$. Another approach is to use continued fractions. We assume $N$ is not a square, so $\sqrt{N}$ is an irrational number.

Lemma 8.23. Let $\frac{p_n}{q_n}$ be the convergents of $\sqrt{N}$. Then

$$|p_n^2 - N q_n^2| \le 2\sqrt{N}.$$

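Proof. Recall that $\left|\sqrt{N} - \frac{p_n}{q_n}\right| < \frac{1}{q_n q_{n+1}}$ by Theorem 7.10. So $\left|\sqrt{N} + \frac{p_n}{q_n}\right| < \frac{1}{q_n q_{n+1}} + 2\sqrt{N}$ by the triangle inequality, and so

$$|p_n^2 - N q_n^2| = q_n^2 \left|\sqrt{N} - \frac{p_n}{q_n}\right| \left|\sqrt{N} + \frac{p_n}{q_n}\right| < q_n^2 \cdot \frac{1}{q_n q_{n+1}} \left(2\sqrt{N} + \frac{1}{q_n q_{n+1}}\right) = \frac{1}{q_{n+1}} \left(2 q_n \sqrt{N} + \frac{1}{q_{n+1}}\right).$$

Since $q_n < q_{n+1}$,

$$2 q_n \sqrt{N} + \frac{1}{q_{n+1}} - 2 q_{n+1} \sqrt{N} = \frac{2 q_{n+1}(q_n - q_{n+1})\sqrt{N} + 1}{q_{n+1}} < 0.$$

So

$$|p_n^2 - N q_n^2| < \frac{1}{q_{n+1}} \cdot 2 q_{n+1} \sqrt{N} = 2\sqrt{N}.$$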

Remark 8.24. For all $N \ge 16$, $2\sqrt{N} \le \frac{1}{2}N$, and so the above lemma actually shows that, if $p'_n \equiv p_n^2 \bmod N$ and $|p'_n| < \frac{1}{2}N$, then $p'_n = p_n^2 - N q_n^2$. For large $N$, $2\sqrt{N}$ is much smaller than $\frac{1}{2}N$, and so this gives a clue why this method is efficient. Also we only need to record $p_n \bmod N$ instead of $p_n$ itself.

Example 8.25. $N = 12403$ and try $B = \{-1, 2, 3, 5, 7, 11, 13\}$. We compute the convergents for $\sqrt{N}$,

$$p_0 \equiv 111,\ p_1 \equiv 223,\ p_2 \equiv 334,\ p_3 \equiv 891,\ p_4 \equiv 2116,\ p_5 \equiv 3300,\ p_6 \equiv 5416 \bmod N$$

and so

$$p'_0 = -82,\ p'_1 = 117,\ p'_2 = -71,\ p'_3 = 89,\ p'_4 = -27,\ p'_5 = 166,\ p'_6 = -39.$$

Then we see

$$(p_1 p_4 p_6)^2 \equiv 117 \cdot (-27) \cdot (-39) = (3^3 \cdot 13)^2 \bmod N$$

and we have $p_1 p_4 p_6 \equiv -1062 \bmod N$. Therefore, we may take $b = 1062$ and $c = 351$. We try $(N, 711) = 79$ and so $N = 79 \cdot 157$.
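The numbers in this example can be regenerated with the usual recurrence for the continued fraction of $\sqrt{N}$. The sketch below tracks only $p_n \bmod N$, as Remark 8.24 suggests; the function name `convergent_data` and its return format are assumptions for this illustration.

```python
from math import gcd, isqrt

def convergent_data(N, count):
    """Return [(p_n mod N, least absolute residue of p_n^2 mod N)] for the first
    `count` convergents p_n/q_n of sqrt(N), where N is not a perfect square."""
    a0 = isqrt(N)
    m, d, a = 0, 1, a0          # complete quotient (m + sqrt(N)) / d, partial quotient a
    p_prev, p = 1, a0 % N       # p_{-1} = 1, p_0 = a_0
    data = []
    for _ in range(count):
        r = p * p % N
        if r > N // 2:
            r -= N
        data.append((p, r))
        m = d * a - m           # next complete quotient
        d = (N - m * m) // d
        a = (a0 + m) // d
        p_prev, p = p, (a * p + p_prev) % N
    return data

N = 12403
data = convergent_data(N, 7)
print([p for p, _ in data])  # [111, 223, 334, 891, 2116, 3300, 5416]
print([r for _, r in data])  # [-82, 117, -71, 89, -27, 166, -39]

b = data[1][0] * data[4][0] * data[6][0] % N   # p_1 * p_4 * p_6 mod N
c = 3 ** 3 * 13
factors = sorted({gcd(N, b - c), gcd(N, b + c)})
print(factors)               # [79, 157]
```

Choosing which residues to multiply (here indices 1, 4 and 6) is the linear algebra step of Remark 8.22; it is hard-coded here to match the example.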

8.7 Pollard’s p− 1 method

Pollard's $p - 1$ method works well when $N$ has a prime factor $p$ such that $p - 1$ is a product of small primes. It is described by the following algorithm.

(i) Choose $k$ that is a product of many small prime powers. For example, $k = m!$ or $k$ is the least common multiple of $1, 2, \ldots, m$.

(ii) Choose at random a small integer $a$, coprime to $N$.

(iii) Compute $a^k \bmod N$ via repeated squaring.

(iv) Compute $(a^k - 1, N)$ and hope this is a non-trivial factor of $N$.

(v) If not, repeat for other choices of $a$ and $k$.

The theory behind the algorithm is the following. If $p \mid N$ is a prime and $p - 1 \mid k$, then $a^k \equiv 1 \bmod p$ by Fermat-Euler. Then $p \mid (N, a^k - 1)$, in which case we get a non-trivial factor, unless it so happens that $a^k \equiv 1 \bmod N$.

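Example 8.26. $N = 540143$. Try $m = 8$ and $a = 2$. Then $k = \mathrm{lcm}(1, 2, \ldots, 8) = 840$. We compute that

$$a^{840} \equiv 53047 \bmod N$$

and we find that $(53047 - 1, 540143) = (53046, 540143) = 421$. So $N = 421 \cdot 1283$. Note that $421 - 1 = 420$ divides $840$, as the theory requires.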
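A minimal sketch of a single round of the method, with $k = \mathrm{lcm}(1, \ldots, m)$; the function name and its return convention (`None` when the gcd is trivial) are assumptions for illustration.

```python
from functools import reduce
from math import gcd

def pollard_p_minus_1(N, m, a=2):
    """One round of Pollard's p - 1 method with k = lcm(1, ..., m).
    Returns a non-trivial factor of N, or None if gcd(a^k - 1, N) is 1 or N."""
    k = reduce(lambda x, y: x * y // gcd(x, y), range(1, m + 1), 1)
    g = gcd(pow(a, k, N) - 1, N)
    return g if 1 < g < N else None

print(pollard_p_minus_1(540143, 8))  # 421
```

If the call returns `None`, step (v) applies: retry with a different base `a` or a larger bound `m`.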

8.8 Exercises

1. Show that if $p$ is a prime, then $p^2$ is a pseudoprime with respect to a base $b$ if and only if $b^{p-1} \equiv 1 \bmod p^2$. What about $p^3$?

2. Show that if $n$ is a pseudoprime to the base 2, then so is $N = 2^n - 1$. Show further that $N$ is a strong pseudoprime to the base 2.

3. Show that if $n$ is a strong pseudoprime to the base $b$, then it is also a strong pseudoprime to the base $b^k$.

4. Let $n$ be an odd composite integer. (i) Show that $n$ is a Carmichael number if and only if $n$ is square free and $p - 1 \mid n - 1$ for all $p \mid n$.

(ii) Show that if $n$ is a Carmichael number, then $n$ is the product of at least three distinct primes.

(iii) Show that $N = (6t + 1)(12t + 1)(18t + 1)$ is a Carmichael number where $6t + 1$, $12t + 1$, $18t + 1$ are all prime numbers.

(iv) Find all Carmichael numbers of the form $91p$ where $p$ is a prime.

5. Let $p$ be a prime $> 5$. Prove that $N = (4^p + 1)/5$ is a composite integer. Prove that $N$ is a strong pseudoprime to the base 2.

6. Let $N > 1$ be an odd composite integer. Show that the congruence

$$a^{\frac{N-1}{2}} \equiv \left(\frac{a}{N}\right) \bmod N$$

fails to hold for at least half of the bases.

7. If $n \equiv 3 \bmod 4$ then $n$ is a strong pseudoprime to the base $b$ if and only if it is an Euler pseudoprime to the base $b$. Show also that the same holds if $\left(\frac{b}{n}\right) = -1$.


8. Prove that if $N$ has a factor which is within $\sqrt[4]{N}$ of $\sqrt{N}$, then Fermat factorisation must work on the first try.
