Why the IRS cares about the Riemann Zeta Function and Number … · 2019. 9. 9. · Intro General...

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

Why the IRS cares about the Riemann Zeta Function and Number Theory

(and why you should too!)

Steven J. Miller
sjm1@williams.edu,

Steven.Miller.MC.96@aya.yale.edu

http://web.williams.edu/Mathematics/sjmiller/public_html/

Stresa, Italy, July 11, 2019

Introduction

A. Berger and T. P. Hill, An Introduction to Benford’s Law, Princeton University Press, Princeton, 2015. See also http://www.benfordonline.net/.

A. E. Kossovsky, Benford’s Law: Theory, the General Law of Relative Quantities, and Forensic Fraud Detection Applications, WSPC, 2014.

S. J. Miller (editor), Theory and Applications of Benford’s Law, Princeton University Press, 2015.

M. Nigrini, Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection, 1st Edition, Wiley, 2014.

Interesting Question

Motivating Question: For a nice data set, such as the Fibonacci numbers, stock prices, street addresses of college employees and students, ..., what percent of the leading digits are 1?

Natural guess: 10% (but immediately correct to 11%!).

Answer: Benford’s law!
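The motivating question can be tried directly on the Fibonacci numbers. The sketch below is an illustration only: the helper names `leading_digit` and `fibonacci` and the choice of 1000 terms are assumptions, not part of the talk.

```python
def leading_digit(n: int) -> int:
    """Return the first decimal digit of a positive integer."""
    return int(str(n)[0])


def fibonacci(count: int) -> list[int]:
    """Return the first `count` Fibonacci numbers, starting 1, 1."""
    values, a, b = [], 1, 1
    for _ in range(count):
        values.append(a)
        a, b = b, a + b
    return values


fibs = fibonacci(1000)  # 1000 terms is an arbitrary illustrative choice
share_of_ones = sum(leading_digit(f) == 1 for f in fibs) / len(fibs)
print(f"share of leading 1s: {share_of_ones:.3f}")
```

The printed share should sit near 30%, well above the natural guess of 11%.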

Examples with First Digit Bias

Fibonacci numbers

Most common iPhone passcodes

Twitter users by # followers

Distance of stars from Earth

Summary

Explain Benford’s Law.

Discuss examples and applications.

Sketch proofs.

Describe open problems.

Caveats!

A math test indicating fraud is not proof of fraud: unlikely events, alternate reasons.

Examples

recurrence relations

special functions (such as n!)

iterates of power, exponential, rational maps

products of random variables

L-functions, characteristic polynomials

iterates of the 3x + 1 map

differences of order statistics

hydrology and financial data

many hierarchical Bayesian models

Applications

Analyzing round-off errors.

Determining the optimal way to store numbers.

Detecting tax and image fraud, and data integrity.

General Theory

Benford’s Law: Newcomb (1881), Benford (1938)

Statement: For many data sets, the probability of observing a first digit of $d$ base $B$ is $\log_B\left(\frac{d+1}{d}\right)$; base 10 about 30% are 1s.

[Figure: Benford’s Law (probabilities).]
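The statement can be tabulated in a few lines. This is a minimal sketch; the function name `benford_probability` is an assumption made for illustration.

```python
import math


def benford_probability(d: int, base: int = 10) -> float:
    """Probability of first digit d in the given base: log_base((d + 1) / d)."""
    return math.log((d + 1) / d, base)


probs = {d: benford_probability(d) for d in range(1, 10)}
for d, p in probs.items():
    print(d, round(p, 3))
```

The nine base-10 probabilities sum to 1 because the logarithms telescope to $\log_{10} 10$.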

Background Material

Modulo: $a = b \bmod c$ if $a - b$ is an integer times $c$; thus $17 = 5 \bmod 12$, and $4.5 = .5 \bmod 1$.

Significand: $x = S_{10}(x) \cdot 10^k$, $k$ integer, $1 \le S_{10}(x) < 10$.

$S_{10}(x) = S_{10}(\tilde{x})$ if and only if $x$ and $\tilde{x}$ have the same leading digits. Note $\log_{10} x = \log_{10} S_{10}(x) + k$.

Key observation: $\log_{10}(x) = \log_{10}(\tilde{x}) \bmod 1$ if and only if $x$ and $\tilde{x}$ have the same leading digits.

Thus often study $y = \log_{10} x \bmod 1$.
Advanced: $e^{2\pi i u} = e^{2\pi i (u \bmod 1)}$.
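A short numerical check of these definitions, as a sketch only: the function names `significand` and `log_mod_1` are assumptions, and floating-point arithmetic means the comparisons hold only up to rounding.

```python
import math


def significand(x: float) -> float:
    """Return S_10(x) in [1, 10) with x = S_10(x) * 10**k for an integer k."""
    k = math.floor(math.log10(x))
    return x / 10**k


def log_mod_1(x: float) -> float:
    """Return log10(x) mod 1, which depends only on the significand of x."""
    return math.log10(x) % 1.0


print(17 % 12, 4.5 % 1)                   # the two modulo examples from the slide
print(significand(314.0))                 # close to 3.14
print(log_mod_1(314.0), log_mod_1(3.14))  # agree up to rounding
```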

Equidistribution and Benford’s Law

Equidistribution: $\{y_n\}_{n=1}^{\infty}$ is equidistributed modulo 1 if the probability that $y_n \bmod 1 \in [a, b]$ tends to $b - a$:

$$\frac{\#\{n \le N : y_n \bmod 1 \in [a, b]\}}{N} \to b - a.$$

Thm: if $\beta \notin \mathbb{Q}$, then $n\beta$ is equidistributed mod 1.

Examples: $\log_{10} 2$, $\log_{10}\left(\frac{1+\sqrt{5}}{2}\right) \notin \mathbb{Q}$.

Proof: if rational: $2 = 10^{p/q}$.
Thus $2^q = 10^p$ or $2^{q-p} = 5^p$, impossible.
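The theorem can be checked numerically for $\beta = \log_{10} 2$. The sketch below is illustrative: the function name `fraction_in_interval`, the cutoff of 10,000 terms, and the test interval are assumptions.

```python
import math


def fraction_in_interval(beta: float, a: float, b: float, n_max: int) -> float:
    """Fraction of n <= n_max with (n * beta) mod 1 in [a, b]."""
    hits = sum(1 for n in range(1, n_max + 1) if a <= (n * beta) % 1.0 <= b)
    return hits / n_max


beta = math.log10(2)
# The interval [0, log10 2] corresponds to a leading digit of 1 for 2**n.
observed = fraction_in_interval(beta, 0.0, math.log10(2), 10_000)
print(observed, "vs interval length", math.log10(2))
```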

Example of Equidistribution: n√π mod 1

[Figures: $n\sqrt{\pi} \bmod 1$ for $n \le 10$, $n \le 100$, $n \le 1000$, and $n \le 10{,}000$.]

Logarithms and Benford’s Law

Fundamental Equivalence: Data set $\{x_i\}$ is Benford base $B$ if $\{y_i\}$ is equidistributed mod 1, where $y_i = \log_B x_i$.

If $x = S_{10}(x) \cdot 10^k$ then

$$\log_{10} x = \log_{10} S_{10}(x) + k = \log_{10} S_{10}(x) \bmod 1.$$

Logarithms and Benford’s Law

$$\mathrm{Prob}(\text{leading digit } d) = \log_{10}(d+1) - \log_{10}(d) = \log_{10}\left(\frac{d+1}{d}\right) = \log_{10}\left(1 + \frac{1}{d}\right).$$

Have Benford’s law ↔ mantissa of logarithms of data are uniformly distributed.

The Power of the Right Perspective

Examples

$2^n$ is Benford base 10 as $\log_{10} 2 \notin \mathbb{Q}$.

Fibonacci numbers are Benford base 10.
$a_{n+1} = a_n + a_{n-1}$.
Guess $a_n = r^n$: $r^{n+1} = r^n + r^{n-1}$ or $r^2 = r + 1$.
Roots $r = (1 \pm \sqrt{5})/2$.

General solution: $a_n = c_1 r_1^n + c_2 r_2^n$.

Binet: $a_n = \frac{1}{\sqrt{5}}\left(\frac{1+\sqrt{5}}{2}\right)^n - \frac{1}{\sqrt{5}}\left(\frac{1-\sqrt{5}}{2}\right)^n$.

Most linear recurrence relations Benford:
⋄ $a_{n+1} = 2a_n - a_{n-1}$
⋄ take $a_0 = a_1 = 1$ or $a_0 = 0$, $a_1 = 1$.
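Binet's formula and the closing recurrence can both be checked by direct computation. This is a sketch under stated assumptions: the function names are hypothetical, Binet's formula is evaluated in floating point (so only small $n$ are compared), and the Fibonacci indexing uses $a_0 = 0$, $a_1 = 1$.

```python
import math


def fib(n: int) -> int:
    """Fibonacci number with a_0 = 0, a_1 = 1, computed from the recurrence."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def binet(n: int) -> float:
    """Binet's closed form, evaluated in floating point."""
    sqrt5 = math.sqrt(5)
    return (((1 + sqrt5) / 2) ** n - ((1 - sqrt5) / 2) ** n) / sqrt5


def two_a_minus(a0: int, a1: int, count: int) -> list[int]:
    """First `count` terms of a_{n+1} = 2 a_n - a_{n-1}."""
    seq = [a0, a1]
    while len(seq) < count:
        seq.append(2 * seq[-1] - seq[-2])
    return seq[:count]


print(all(round(binet(n)) == fib(n) for n in range(40)))
print(two_a_minus(1, 1, 8))  # constant sequence
print(two_a_minus(0, 1, 8))  # a_n = n
```

With the stated initial values the recurrence $a_{n+1} = 2a_n - a_{n-1}$ produces a constant sequence or $a_n = n$, which is why it is singled out after the word "most".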

Digits of $2^n$

First 60 values of $2^n$ (only displaying 30): $2^{10} = 1024 \approx 10^3$.

    1      1024     1048576
    2      2048     2097152
    4      4096     4194304
    8      8192     8388608
   16     16384    16777216
   32     32768    33554432
   64     65536    67108864
  128    131072   134217728
  256    262144   268435456
  512    524288   536870912

  digit   #    Obs Prob   Benf Prob
    1     18     .300       .301
    2     12     .200       .176
    3      6     .100       .125
    4      6     .100       .097
    5      6     .100       .079
    6      4     .067       .067
    7      2     .033       .058
    8      5     .083       .051
    9      1     .017       .046
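The observed counts in the table can be recomputed from exact integer powers. This is a minimal sketch; the variable names are assumptions, and the range $n = 0, \ldots, 59$ is taken from the "first 60 values" wording.

```python
import math
from collections import Counter

# Leading digit of 2**n for n = 0, ..., 59 (exact integer arithmetic).
counts = Counter(int(str(2**n)[0]) for n in range(60))

for d in range(1, 10):
    observed = counts[d] / 60
    benford = math.log10(1 + 1 / d)
    print(d, counts[d], f"{observed:.3f}", f"{benford:.3f}")
```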

Logarithms and Benford’s Law

$\chi^2$ values for $\alpha^n$, $1 \le n \le N$ (5% critical value: 15.5).

     N    χ²(γ)   χ²(e)   χ²(π)
   100    0.72    0.30    46.65
   200    0.24    0.30     8.58
   400    0.14    0.10    10.55
   500    0.08    0.07     2.69
   700    0.19    0.04     0.05
   800    0.04    0.03     6.19
   900    0.09    0.09     1.71
  1000    0.02    0.06     2.90
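A statistic of this kind can be computed as follows. The sketch assumes the standard Pearson form $\sum_d (\text{obs}_d - \text{exp}_d)^2/\text{exp}_d$ over the nine first digits; the talk does not spell out its exact normalization, so the values need not match the table digit for digit. The function name `chi_square` is hypothetical.

```python
import math

BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def chi_square(alpha: float, n_max: int) -> float:
    """Pearson chi-square of the first digits of alpha**n, 1 <= n <= n_max."""
    counts = [0] * 9
    log_alpha = math.log10(alpha)
    for n in range(1, n_max + 1):
        frac = (n * log_alpha) % 1.0   # log10 of the significand
        digit = int(10**frac)          # first digit, in 1..9
        counts[digit - 1] += 1
    return sum(
        (counts[i] - n_max * BENFORD[i]) ** 2 / (n_max * BENFORD[i])
        for i in range(9)
    )


chi_e = chi_square(math.e, 100)
chi_pi = chi_square(math.pi, 100)
print(chi_e, chi_pi)
```

Working with $n \log_{10}\alpha \bmod 1$ avoids forming the huge numbers $\alpha^n$ explicitly.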

Logarithms and Benford’s Law: Base 10 (5%: log(χ2) ≈ 2.74)

[Figure: $\log(\chi^2)$ vs $N$ for $\pi^n$ (red) and $e^n$ (blue), $n \in \{1, \ldots, N\}$, with $N$ up to 1000.] Note $\pi^{175} \approx 1.0028 \cdot 10^{87}$.

New Result: Linear Recurrence Relations of Degree 2

$a_{n+1} = f(n)a_n + g(n)a_{n-1}$ with non-constant coefficients $f(n)$ and $g(n)$.

Explore conditions on $f$ and $g$ such that the sequence generated obeys Benford’s Law for all initial values.

First find a closed form for the sequence $(a_n)$, then analyze its main term.

Main idea: reduce the degree of recurrence.

$a_{n+1} = (\lambda(n) + \mu(n))a_n - \mu(n)\lambda(n-1)a_{n-1}$, and compare the coefficients:

$f(n) = \lambda(n) + \mu(n)$

$g(n) = -\lambda(n-1)\mu(n)$.

We show that for any given pair of $f$ and $g$, such $\lambda$ and $\mu$ always exist.

Linear Recurrence Relations of Degree 2

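Recurrence relations of degree 1:

$a_{n+1} = \lambda(n)a_n + b_n$

$b_n = \mu(n)b_{n-1}$.

$$a_{n+1} = r(n)\left(1 + \sum_{k=3}^{n} \prod_{i=k}^{n} \frac{\lambda(i)}{\mu(i)} + \frac{a_2}{b_1} \prod_{i=2}^{n} \frac{\lambda(i)}{\mu(i)}\right),$$

where $r(n) := b_1 \prod_{i=2}^{n} \mu(i)$.

Find conditions on $\mu$, $\lambda$ such that main term dominates; Benford if $\prod \mu(i)$ is.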
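The reduction can be verified exactly with rational arithmetic. In the sketch below the particular choices of $\lambda$, $\mu$, $a_2$, and $b_1$ are hypothetical values picked only for illustration; the check is that the pair of degree-1 recurrences reproduces both the degree-2 recurrence $a_{n+1} = (\lambda(n) + \mu(n))a_n - \mu(n)\lambda(n-1)a_{n-1}$ and the closed form, reading the garbled products as ratios $\lambda(i)/\mu(i)$.

```python
from fractions import Fraction
from math import prod


def lam(n: int) -> Fraction:
    return Fraction(n + 1)        # hypothetical lambda(n)


def mu(n: int) -> Fraction:
    return Fraction(2 * n + 1)    # hypothetical mu(n)


a2, b1 = Fraction(3), Fraction(2)  # hypothetical initial values
N = 12

# b_n = mu(n) * b_{n-1}, starting from b_1.
b = {1: b1}
for n in range(2, N + 1):
    b[n] = mu(n) * b[n - 1]

# a_{n+1} = lambda(n) * a_n + b_n, starting from a_2.
a = {2: a2}
for n in range(2, N + 1):
    a[n + 1] = lam(n) * a[n] + b[n]


def closed_form(n: int) -> Fraction:
    """Closed form for a_{n+1} with r(n) = b_1 * prod_{i=2}^{n} mu(i)."""
    def ratio(lo: int) -> Fraction:
        return prod((lam(i) / mu(i) for i in range(lo, n + 1)), start=Fraction(1))

    r = b1 * prod((mu(i) for i in range(2, n + 1)), start=Fraction(1))
    return r * (1 + sum(ratio(k) for k in range(3, n + 1)) + (a2 / b1) * ratio(2))


degree_two_ok = all(
    a[n + 1] == (lam(n) + mu(n)) * a[n] - mu(n) * lam(n - 1) * a[n - 1]
    for n in range(3, N + 1)
)
closed_form_ok = all(closed_form(n) == a[n + 1] for n in range(2, N + 1))
print(degree_two_ok, closed_form_ok)
```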

Examples when f and g are functions

If $\mu(k) = k$, then $r(n) = n!$.

If $\mu(k) = k^{\alpha}$ where $\alpha \in \mathbb{R}$, then $r(n) = (n!)^{\alpha}$.

If $\mu(k) = \exp(\alpha h(k))$ where $\alpha$ is irrational and $h(k)$ is a monic polynomial, then

$$\log r(n) = \alpha \sum_{k=1}^{n} h(k).$$

Lemma: The sequence $\{\alpha p(n)\}$ is equidistributed mod 1 if $\alpha \notin \mathbb{Q}$ and $p(n)$ is a monic polynomial.

Examples when f and g are random variables

Take $\mu(n) \sim h(n)U_n$ where the $U_n$’s are independent uniform random variables on $[0, 1]$, and $h(n)$ is a deterministic function in $n$ such that $\prod_{i=1}^{n} h(i)$ is Benford.

Then $r(n) = \prod_{i=1}^{n} h(i) \prod_{i=1}^{n} U_i$ is Benford.

Take $\mu(n) \sim \exp(U_n)$ where the $U_n$’s are i.i.d. random variables. Then take logarithms and sum up $\log(\mu(n))$. Apply the Central Limit Theorem and get a Gaussian distribution with increasing variance.

Linear Recurrences of Higher Degree

Use recurrence relation of degree 3 as an example. Similar main idea: reduce the degree.

Define the sequence $\{a_n\}_{n=1}^{\infty}$ by $a_{n+1} = f_1(n)a_n + f_2(n)a_{n-1} + f_3(n)a_{n-2}$.

Define an auxiliary sequence $(b_n)_{n=1}^{\infty}$ by $b_n = a_{n+1} - \lambda(n)a_n$. Then $(b_n)$ is degree 2.

Why Benford’s Law?

Streets

Not all data sets satisfy Benford’s Law.
Long street $[1, L]$: $L = 199$ versus $L = 999$.
Oscillates b/w 1/9 and 5/9 with first digit 1.

[Figures: Probability first digit 1 versus street length $L$; probability first digit 1 versus log(street length $L$).]

What if we have many streets of different lengths?
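The two street lengths named on the slide can be checked by counting. A minimal sketch; the function name `share_leading_one` is an assumption.

```python
def share_leading_one(length: int) -> float:
    """Fraction of house numbers 1..length whose first digit is 1."""
    ones = sum(1 for house in range(1, length + 1) if str(house)[0] == "1")
    return ones / length


print(share_leading_one(199))  # 111/199, close to 5/9
print(share_leading_one(999))  # 111/999 = 1/9
```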

Amalgamating Streets

All houses: 1000 streets, with four choices of street length:
⋄ each from 1 to 10000.
⋄ each from 1 to rand(10000).
⋄ each 1 to rand(rand(10000)).
⋄ each 1 to rand(rand(rand(10000))).

[Figures, one per choice: First digit and first two digits vs Benford.]

Conclusion: More processes, closer to Benford.

Probability Review

Let $X$ be a random variable with density $p(x)$:
⋄ $p(x) \ge 0$; $\int_{-\infty}^{\infty} p(x)\,dx = 1$;
⋄ $\mathrm{Prob}(a \le X \le b) = \int_a^b p(x)\,dx$.

Mean $\mu = \int_{-\infty}^{\infty} x\,p(x)\,dx$.

Variance $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 p(x)\,dx$.

Independence: knowledge of one random variable gives no knowledge of the other.

Central Limit Theorem

Normal $N(\mu, \sigma^2)$: $p(x) = e^{-(x-\mu)^2/2\sigma^2}/\sqrt{2\pi\sigma^2}$.

Theorem: If $X_1, X_2, \ldots$ are independent, identically distributed random variables (mean $\mu$, variance $\sigma^2$, finite moments) then

$$S_N := \frac{X_1 + \cdots + X_N - N\mu}{\sigma\sqrt{N}}$$

converges to $N(0, 1)$.

Central Limit Theorem: Sums of Uniform Random Variables

$X_i \sim \mathrm{Unif}(-1/2, 1/2)$ (adjusted to mean 0, variance 1)

$Y_1 = X_1/\sigma_{X_1}$ vs $N(0, 1)$.

$Y_2 = (X_1 + X_2)/\sigma_{X_1+X_2}$ vs $N(0, 1)$.

$Y_4 = (X_1 + X_2 + X_3 + X_4)/\sigma_{X_1+X_2+X_3+X_4}$ vs $N(0, 1)$.

$Y_8 = (X_1 + \cdots + X_8)/\sigma_{X_1+\cdots+X_8}$ vs $N(0, 1)$.

[Figures: density of $Y_k$ versus $N(0, 1)$ for $k = 1, 2, 4, 8$.]

[Formula image: density of $Y_4 = (X_1 + \cdots + X_4)/\sigma_{X_1+\cdots+X_4}$.]

(Don’t even think of asking to see $Y_8$’s!)
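The standardized sums $Y_k$ are easy to simulate. The sketch below uses only the standard library; the sample size, seed, and function name are assumptions. It relies on the fact that $\mathrm{Unif}(-1/2, 1/2)$ has variance $1/12$, so a sum of $k$ copies has standard deviation $\sqrt{k/12}$.

```python
import math
import random
import statistics


def standardized_sum(k: int, rng: random.Random) -> float:
    """One draw of Y_k = (X_1 + ... + X_k) / sqrt(k / 12), X_i ~ Unif(-1/2, 1/2)."""
    total = sum(rng.uniform(-0.5, 0.5) for _ in range(k))
    return total / math.sqrt(k / 12)


rng = random.Random(0)  # arbitrary seed for reproducibility
samples = {k: [standardized_sum(k, rng) for _ in range(20_000)] for k in (1, 2, 4, 8)}

for k, ys in samples.items():
    # Share of draws within one standard deviation; about 0.683 for N(0, 1).
    within_one = sum(abs(y) <= 1 for y in ys) / len(ys)
    print(k, round(statistics.fmean(ys), 3), round(statistics.pvariance(ys), 3), round(within_one, 3))
```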

Normal Distributions Mod 1

As $\sigma \to \infty$, $N(0, \sigma^2) \bmod 1 \to \mathrm{Unif}(0, 1)$.

[Figures: $N(0, \sigma^2) \bmod 1$ when the variance is .01, .1, and .5.]
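How quickly the wrapped normal flattens can be seen in a small simulation using the three variances from the figures. This is a sketch: the ten-bin histogram, sample size, seed, and function name are assumptions.

```python
import random


def max_bin_deviation(variance: float, samples: int = 50_000,
                      bins: int = 10, seed: int = 0) -> float:
    """Largest gap between a bin's share of N(0, variance) mod 1 and 1/bins."""
    rng = random.Random(seed)
    sigma = variance**0.5
    counts = [0] * bins
    for _ in range(samples):
        y = rng.gauss(0.0, sigma) % 1.0
        # min(...) guards against y rounding to exactly 1.0.
        counts[min(int(y * bins), bins - 1)] += 1
    return max(abs(c / samples - 1 / bins) for c in counts)


for variance in (0.01, 0.1, 0.5):
    print(variance, round(max_bin_deviation(variance), 4))
```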

Products and Benford’s Law

Pavlovian Response: See a product, take a logarithm.

$X_1, X_2, \ldots$ nice, $W_N = X_1 \cdot X_2 \cdots X_N$.

$Y_i = \log_{10} X_i$, $V_N := \log_{10} W_N$.

$$V_N = \log_{10}(X_1 \cdot X_2 \cdots X_N) = \log_{10} X_1 + \log_{10} X_2 + \cdots + \log_{10} X_N = Y_1 + Y_2 + \cdots + Y_N.$$

Need distribution of $V_N \bmod 1$, which by CLT becomes uniform, implying Benfordness!
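The "see a product, take a logarithm" recipe translates directly into a simulation. The sketch assumes, purely for illustration, that the $X_i$ are independent and uniform on $(0, 1]$ and that $N = 20$ factors are multiplied; the function name and seed are hypothetical.

```python
import math
import random


def product_first_digit_shares(factors: int = 20, trials: int = 20_000,
                               seed: int = 0) -> list[float]:
    """First-digit shares of W_N = X_1 * ... * X_N, computed through V_N mod 1."""
    rng = random.Random(seed)
    counts = [0] * 9
    for _ in range(trials):
        # V_N = sum of log10 X_i; 1 - random() lies in (0, 1], so the log is finite.
        v = sum(math.log10(1.0 - rng.random()) for _ in range(factors))
        digit = int(10 ** (v % 1.0))
        counts[min(digit, 9) - 1] += 1  # min(...) guards a rounding edge case
    return [c / trials for c in counts]


shares = product_first_digit_shares()
for d, share in enumerate(shares, start=1):
    print(d, round(share, 3), round(math.log10(1 + 1 / d), 3))
```

Summing logarithms rather than multiplying also avoids floating-point underflow in the product.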

Applications

Applications for the IRS: Detecting Fraud

A Tale of Two Steve Millers....

Detecting Fraud

Bank Fraud: Audit of a bank revealed a huge spike of numbers starting with 48 and 49, most due to one person.

Write-off limit of $5,000. Officer had friends applying for credit cards, who ran up balances just under $5,000; then he would write the debts off.

Can you see the cat in the tree?

Transmitting Images

How to transmit an image?

Have an L × W grid with LW pixels.

Each pixel a triple: (Red, Green, Blue).

Often each value in $\{0, 1, 2, 3, \ldots, 2^n - 1\}$.

$n = 8$ gives 256 choices for each, or 16,777,216 possibilities.

Steganography

Steganography: Concealing a message in another message: https://en.wikipedia.org/wiki/Steganography.

Take one of the colors, say red, a number from 0 to 255.

Write in binary: $r_7 2^7 + r_6 2^6 + \cdots + r_1 2 + r_0$.

If change just the last or last two digits, very minor change to image.

Can hide an image in another.
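The last-digits idea can be sketched on a single 8-bit color value. This is an illustration under assumptions: the function names are hypothetical, and the scheme shown (storing the top bits of the hidden value in the bottom bits of the cover value) is one common choice, not necessarily the one used for the images in the talk.

```python
def embed_bits(cover: int, secret: int, bits: int = 2) -> int:
    """Replace the lowest `bits` bits of `cover` with the highest bits of `secret`."""
    mask = (1 << bits) - 1
    return (cover & ~mask & 0xFF) | (secret >> (8 - bits))


def extract_bits(stego: int, bits: int = 2) -> int:
    """Recover an approximation of the secret value from the lowest bits."""
    mask = (1 << bits) - 1
    return (stego & mask) << (8 - bits)


cover_red = 0b10110110   # hypothetical red value of a cover pixel
secret_red = 0b11000000  # hypothetical red value of the hidden pixel
stego_red = embed_bits(cover_red, secret_red)

print(cover_red, stego_red, extract_bits(stego_red))
```

Changing the last two binary digits moves the cover value by at most 3 out of 255, which is why the change to the image is very minor.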

Can you see the cat in the tree?

Benford Good Processes

A. Kontorovich and S. J. Miller, Benford’s Law, values of L-functions and the 3x + 1 problem, Acta Arithmetica 120 (2005), no. 3, 269–297.

Poisson Summation and Benford’s Law: Definitions

Feller, Pinkham (often exact processes)

data $\overrightarrow{Y}_{T,B} = \log_B \overrightarrow{X}_T$ (discrete/continuous):

$$P(A) = \lim_{T \to \infty} \frac{\#\{n \in A : n \le T\}}{T}$$

Poisson Summation Formula: $f$ nice:

$$\sum_{\ell=-\infty}^{\infty} f(\ell) = \sum_{\ell=-\infty}^{\infty} \hat{f}(\ell),$$

Fourier transform $\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i x \xi}\,dx$.
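Poisson summation can be checked numerically for a Gaussian. The sketch assumes the test function $f(x) = e^{-\pi t x^2}$, whose Fourier transform under the convention above is $\hat{f}(\xi) = t^{-1/2} e^{-\pi \xi^2 / t}$; the function names and the truncation at $|\ell| \le 50$ are assumptions.

```python
import math


def f(x: float, t: float) -> float:
    """Gaussian test function exp(-pi * t * x**2)."""
    return math.exp(-math.pi * t * x * x)


def f_hat(xi: float, t: float) -> float:
    """Its Fourier transform: t**(-1/2) * exp(-pi * xi**2 / t)."""
    return math.exp(-math.pi * xi * xi / t) / math.sqrt(t)


def lattice_sum(g, t: float, cutoff: int = 50) -> float:
    """Sum g over the integers -cutoff..cutoff (the tails are negligible here)."""
    return sum(g(ell, t) for ell in range(-cutoff, cutoff + 1))


for t in (0.3, 1.0, 2.0):
    print(t, lattice_sum(f, t), lattice_sum(f_hat, t))
```

For small $t$ the function is spread out and its transform is concentrated, so the right-hand sum is dominated by the $\ell = 0$ term. That is the mechanism behind the "spreading out" step in the proof sketch.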

Benford Good Process

$X_T$ is Benford Good if there is a nice $f$ st

$$\mathrm{CDF}_{\overrightarrow{Y}_{T,B}}(y) = \int_{-\infty}^{y} \frac{1}{T} f\left(\frac{t}{T}\right) dt + E_T(y) := G_T(y)$$

and monotonically increasing $h$ ($h(|T|) \to \infty$):

Small tails: $G_T(\infty) - G_T(Th(T)) = o(1)$, $G_T(-Th(T)) - G_T(-\infty) = o(1)$.

Decay of the Fourier Transform: $\sum_{\ell \ne 0} \left|\frac{\hat{f}(T\ell)}{\ell}\right| = o(1)$.

Small translated error: $E(a, b, T) = \sum_{|\ell| \le Th(T)} \left[E_T(b + \ell) - E_T(a + \ell)\right] = o(1)$.

Main Theorem

Theorem (Kontorovich and M–, 2005): $X_T$ converging to $X$ as $T \to \infty$ (think spreading Gaussian). If $X_T$ is Benford good, then $X$ is Benford.

Examples:
⋄ L-functions
⋄ characteristic polynomials (RMT)
⋄ 3x + 1 problem
⋄ geometric Brownian motion.

Sketch of the proof

Structure Theorem:
⋄ main term is something nice spreading out
⋄ apply Poisson summation

Control translated errors:
⋄ hardest step
⋄ techniques problem specific

Sketch of the proof (continued)

$$\sum_{\ell=-\infty}^{\infty} P\left(a + \ell \le \overrightarrow{Y}_{T,B} \le b + \ell\right)$$

$$= \sum_{|\ell| \le Th(T)} \left[G_T(b + \ell) - G_T(a + \ell)\right] + o(1)$$

$$= \int_a^b \sum_{|\ell| \le Th(T)} \frac{1}{T} f\left(\frac{t + \ell}{T}\right) dt + E(a, b, T) + o(1)$$

$$= \hat{f}(0) \cdot (b - a) + \sum_{\ell \ne 0} \hat{f}(T\ell)\, \frac{e^{2\pi i b \ell} - e^{2\pi i a \ell}}{2\pi i \ell} + o(1).$$

Riemann Zeta Function (for real part of s greater than 1)

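$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \left(1 - \frac{1}{p^s}\right)^{-1}, \quad \mathrm{Re}(s) > 1.$$

Geometric Series Formula: $(1 - x)^{-1} = 1 + x + x^2 + \cdots$.
Unique Factorization: $n = p_1^{r_1} \cdots p_m^{r_m}$.

$$\prod_{p} \left(1 - \frac{1}{p^s}\right)^{-1} = \left[1 + \frac{1}{2^s} + \left(\frac{1}{2^s}\right)^2 + \cdots\right]\left[1 + \frac{1}{3^s} + \left(\frac{1}{3^s}\right)^2 + \cdots\right] \cdots = \sum_{n} \frac{1}{n^s}.$$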
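The sum and the Euler product can be compared numerically at $s = 2$, where $\zeta(2) = \pi^2/6$. This is a sketch with truncated sums and products; the cutoffs and function names are assumptions.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


def zeta_sum(s: float, terms: int) -> float:
    """Partial sum of 1/n**s for n = 1..terms."""
    return sum(1 / n**s for n in range(1, terms + 1))


def euler_product(s: float, limit: int) -> float:
    """Partial product of (1 - p**(-s))**(-1) over primes p <= limit."""
    return math.prod(1 / (1 - p ** (-s)) for p in primes_up_to(limit))


print(zeta_sum(2, 100_000), euler_product(2, 10_000), math.pi**2 / 6)
```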

Riemann Zeta Function

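$\left|\zeta\left(\frac{1}{2} + \frac{ik}{4}\right)\right|$, $k \in \{0, 1, \ldots, 65535\}$.

[Figure: plot for these values; horizontal axis ticks 2 to 8, vertical axis ticks 0.05 to 0.3.]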

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

106

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

7

107

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

7 →1 11

108

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

7 →1 11 →1 17

109

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

7 →1 11 →1 17 →2 13

110

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

7 →1 11 →1 17 →2 13 →3 5

111

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

7 →1 11 →1 17 →2 13 →3 5 →4 1

112

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, T (x) = 3x+12k , 2k ||3x + 1.

Conjecture: for some n = n(x), T n(x) = 1.

7 →1 11 →1 17 →2 13 →3 5 →4 1 →2 1,

113

Intro General Theory Why Benford? Apps Benford Good Stick Refs Products F Chains

3x + 1 Problem

Kakutani (conspiracy), Erdös (not ready).

x odd, $T(x) = \frac{3x+1}{2^k}$, $2^k \,\|\, 3x + 1$.

Conjecture: for some $n = n(x)$, $T^n(x) = 1$.

$7 \to_1 11 \to_1 17 \to_2 13 \to_3 5 \to_4 1 \to_2 1$,
2-path $(1, 1)$, 5-path $(1, 1, 2, 3, 4)$.
$m$-path: $(k_1, \dots, k_m)$.
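The map and the path notation can be sketched in a few lines of Python. This is only an illustration: the function names `collatz_step` and `m_path` are made up here and do not come from the slides.

```python
def collatz_step(x: int) -> tuple[int, int]:
    """Apply T to an odd x; return (T(x), k) where 2^k exactly divides 3x + 1."""
    if x <= 0 or x % 2 == 0:
        raise ValueError("x must be a positive odd integer")
    y = 3 * x + 1
    k = (y & -y).bit_length() - 1  # exponent of the largest power of 2 dividing y
    return y >> k, k


def m_path(x: int, m: int) -> tuple[int, ...]:
    """Return the m-path (k_1, ..., k_m) of the odd seed x."""
    ks = []
    for _ in range(m):
        x, k = collatz_step(x)
        ks.append(k)
    return tuple(ks)


print(m_path(7, 2))  # (1, 1)
print(m_path(7, 5))  # (1, 1, 2, 3, 4)
```

The two printed tuples are the 2-path and 5-path of the seed 7 shown on the slide.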


3x + 1 and Benford

Theorem (Kontorovich and M–, 2005): As $m \to \infty$, $x_m / ((3/4)^m x_0)$ is Benford.

Theorem (Lagarias-Soundararajan, 2006): Let $X \ge 2^N$. For all but at most $c(B) N^{-1/36} X$ initial seeds, the distribution of the first $N$ iterates of the $3x + 1$ map is within $2N^{-1/36}$ of the Benford probabilities.


3x + 1 Data: random 10,000 digit number, $2^k \,\|\, 3x + 1$

80,514 iterations ($(4/3)^n = a_0$ predicts 80,319); $\chi^2 = 13.5$ (5% 15.5).

Digit   Number   Observed   Benford
1       24251    0.301      0.301
2       14156    0.176      0.176
3       10227    0.127      0.125
4        7931    0.099      0.097
5        6359    0.079      0.079
6        5372    0.067      0.067
7        4476    0.056      0.058
8        4092    0.051      0.051
9        3650    0.045      0.046
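An experiment of this kind can be imitated on a smaller scale. The sketch below is not the code behind the table: it uses a 200-digit seed instead of a 10,000-digit one, an arbitrary random seed, and it assumes the orbit reaches 1 (which is the conjecture itself). Whether the slide's counts include the starting value is not stated, so here only the iterates are tallied.

```python
import random
from collections import Counter


def leading_digit_counts(seed_value: int) -> Counter:
    """Tally leading digits of the odd iterates x_1, x_2, ... until reaching 1."""
    counts = Counter()
    x = seed_value
    while x != 1:
        y = 3 * x + 1
        x = y >> ((y & -y).bit_length() - 1)  # divide out every factor of 2
        counts[int(str(x)[0])] += 1
    return counts


rng = random.Random(2019)  # arbitrary fixed seed so the run is repeatable
x0 = rng.randrange(10**199, 10**200) | 1  # random odd 200-digit starting value
counts = leading_digit_counts(x0)
total = sum(counts.values())
for d in range(1, 10):
    print(d, counts[d], round(counts[d] / total, 3))
```

With far fewer iterates than in the table, the observed frequencies should only roughly follow the Benford column.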


3x + 1 Data: random 10,000 digit number, $2 \mid 3x + 1$

241,344 iterations, $\chi^2 = 11.4$ (5% 15.5).

Digit   Number   Observed   Benford
1       72924    0.302      0.301
2       42357    0.176      0.176
3       30201    0.125      0.125
4       23507    0.097      0.097
5       18928    0.078      0.079
6       16296    0.068      0.067
7       13702    0.057      0.058
8       12356    0.051      0.051
9       11073    0.046      0.046
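The $\chi^2$ values quoted with the two data tables can be recomputed from the counts. The sketch assumes the usual Pearson statistic with expected counts $n \log_{10}(1 + 1/d)$; the slides do not spell out the formula, and the variable names are illustrative.

```python
import math

# Benford probabilities for leading digits 1..9
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def chi_square(observed: list[int]) -> float:
    """Pearson chi-square statistic of digit counts against Benford's law."""
    total = sum(observed)
    return sum((o - total * p) ** 2 / (total * p) for o, p in zip(observed, BENFORD))


# Counts copied from the two tables (digits 1 through 9)
all_twos_removed = [24251, 14156, 10227, 7931, 6359, 5372, 4476, 4092, 3650]
single_halving = [72924, 42357, 30201, 23507, 18928, 16296, 13702, 12356, 11073]

print(round(chi_square(all_twos_removed), 1))
print(round(chi_square(single_halving), 1))
```

Both values should land close to the quoted 13.5 and 11.4, below the 5% threshold of 15.5 for 8 degrees of freedom.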


Stick Decomposition

T. Becker, D. Burt, T. C. Corcoran, A. Greaves-Tunnell, J. R. Iafrate, J. Jing, S. J. Miller, J. D. Porfilio, R. Ronan, J. Samranvedhya, F. W. Strauch and B. Talbut, Benford’s Law and Continuous Dependent Random Variables, Annals of Physics 388 (2018), 350–381.

J. Iafrate, S. J. Miller and F. W. Strauch, Equipartitions and a distribution for numbers: A statistical model for Benford’s law, Physical Review E 91 (2015), no. 6, 062138 (6 pages).

Fixed Proportion Decomposition Process

Decomposition Process

1. Consider a stick of length $L$.

2. Uniformly choose a proportion $p \in (0, 1)$.

3. Break the stick into two pieces of lengths $pL$ and $(1 - p)L$.

4. Repeat $N$ times (using the same proportion).
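The four steps can be sketched directly for small $N$. This is a brute-force illustration under stated assumptions: the function name `decompose` is hypothetical, and $p$ is passed in rather than drawn uniformly, so that the output is reproducible.

```python
from collections import Counter
from fractions import Fraction


def decompose(length: Fraction, p: Fraction, levels: int) -> list[Fraction]:
    """Break every stick into pieces p*len and (1-p)*len, `levels` times."""
    sticks = [length]
    for _ in range(levels):
        sticks = [piece for s in sticks for piece in (p * s, (1 - p) * s)]
    return sticks


sticks = decompose(Fraction(1), Fraction(3, 11), 4)
print(len(sticks))                       # 16
print(sorted(Counter(sticks).values()))  # [1, 1, 4, 4, 6]
```

Exact fractions keep equal lengths equal, so the multiplicities come out as binomial coefficients. The list doubles at every level, so this approach is only practical for small $N$.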


Fixed Proportion Conjecture (Joy Jing ’13)

Conjecture: The above decomposition process is Benford as $N \to \infty$ for any $p \in (0, 1)$, $p \ne \frac{1}{2}$.


Counterexample (SMALL REU ’13): $p = \frac{1}{11}$, $1 - p = \frac{10}{11}$.

Benford Analysis

At the $N$th level:

$2^N$ sticks.

$N + 1$ distinct lengths: write $p^{N-j}(1 - p)^j$ as $p^N \left(\frac{1-p}{p}\right)^j$, $j \in \{0, \dots, N\}$; the $j$th length occurs $\binom{N}{j}$ times.

(Weighted) geometric with ratio $\frac{1-p}{p} = 10^y$; behavior depends on irrationality of $y$!

Theorem: Benford if and only if $y$ is irrational.
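Because only $N + 1$ distinct lengths occur, the leading-digit distribution at level $N$ can be computed from binomial weights without listing all $2^N$ sticks. The sketch below assumes a unit stick ($L = 1$) and base 10; `digit_distribution` is an illustrative name, not something from the slides.

```python
import math


def digit_distribution(p: float, levels: int) -> list[float]:
    """Leading-digit probabilities (digits 1..9) of the stick lengths at a level."""
    probs = [0.0] * 9
    for j in range(levels + 1):
        # log10 of the length p^(N-j) (1-p)^j
        log_len = (levels - j) * math.log10(p) + j * math.log10(1 - p)
        mantissa = 10 ** (log_len % 1.0)   # significand in [1, 10)
        digit = min(int(mantissa), 9)      # guard against rounding up to 10.0
        probs[digit - 1] += math.comb(levels, j) / 2**levels
    return probs


irrational_case = digit_distribution(3 / 11, 1000)  # y = log10(8/3)
rational_case = digit_distribution(1 / 11, 1000)    # y = 1
print([round(q, 3) for q in irrational_case])
print([round(q, 3) for q in rational_case])
```

For $p = 1/11$ every length is a power of 10 times $11^{-N}$, so all the mass sits on one digit; for $p = 3/11$ the output should be close to, though not exactly, the Benford probabilities at this finite level.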


Benford Analysis (cont)

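Say $\frac{1-p}{p} = 10^{r/q}$ for $r, q$ integers.

All terms with index $j \bmod q$ have the same leading digit; the probability that the index is $j \bmod q$ is

$$\frac{1}{2^N}\left[\binom{N}{j} + \binom{N}{j+q} + \binom{N}{j+2q} + \cdots\right]$$

$$= \frac{1}{q}\sum_{s=0}^{q-1}\left(\cos\frac{\pi s}{q}\right)^N \cos\frac{\pi(N-2j)s}{q}$$

$$= \frac{1}{q}\left[1 + \sum_{s=1}^{q-1}\left(\cos\frac{\pi s}{q}\right)^N \cos\frac{\pi(N-2j)s}{q}\right]$$

$$= \frac{1}{q}\left(1 + \mathrm{Err}\left[(q-1)\left(\cos\frac{\pi}{q}\right)^N\right]\right),$$

where $\mathrm{Err}[X]$ indicates an absolute error of size at most $X$.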
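The equality between the binomial sum and the cosine sum is easy to check numerically. The following is a sanity check rather than a proof; the function names are invented for the illustration, and $j$ is taken in $\{0, \dots, q-1\}$.

```python
import math


def binomial_side(N: int, j: int, q: int) -> float:
    """(1/2^N) * [C(N, j) + C(N, j+q) + C(N, j+2q) + ...]."""
    return sum(math.comb(N, k) for k in range(j, N + 1, q)) / 2**N


def cosine_side(N: int, j: int, q: int) -> float:
    """(1/q) * sum over s of cos(pi s/q)^N * cos(pi (N-2j) s / q)."""
    return sum(
        math.cos(math.pi * s / q) ** N * math.cos(math.pi * (N - 2 * j) * s / q)
        for s in range(q)
    ) / q


N, q = 50, 10
for j in range(q):
    print(j, binomial_side(N, j, q), cosine_side(N, j, q))
```

Each row should show two matching values within $\frac{q-1}{q}\left(\cos\frac{\pi}{q}\right)^N$ of $1/q$.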


Examples

$p = 3/11$, 1000 levels; $y = \log_{10}(8/3) \notin \mathbb{Q}$ (irrational).

$p = 1/11$, 1000 levels; $y = 1 \in \mathbb{Q}$ (rational).

$p = 1/(1 + 10^{33/10})$, 1000 levels; $y = 33/10 \in \mathbb{Q}$ (rational).


Random Cuts

$L$

$LK_1$, $L(1-K_1)$

$LK_1K_2$, $LK_1(1-K_2)$, $L(1-K_1)K_3$, $L(1-K_1)(1-K_3)$

$LK_1K_2K_4$, $LK_1K_2(1-K_4)$, $LK_1(1-K_2)K_5$, $LK_1(1-K_2)(1-K_5)$, $L(1-K_1)K_3K_6$, $L(1-K_1)K_3(1-K_6)$, $L(1-K_1)(1-K_3)K_7$, $L(1-K_1)(1-K_3)(1-K_7)$

Figure: Unrestricted Decomposition: Breaking $L$ into pieces, $N = 3$.


Conclusions and References


Conclusions and Future Investigations

Many different systems exhibit Benford behavior.

Ingredients of proofs (logarithms, equidistribution).

Applications to fraud detection / data integrity.

A. K. Adhikari, Some results on the distribution of the most significant digit, Sankhya: The Indian Journal of Statistics, Series B 31 (1969), 413–420.

A. K. Adhikari and B. P. Sarkar, Distribution of most significant digit in certain functions whose arguments are random variables, Sankhya: The Indian Journal of Statistics, Series B 30 (1968), 47–58.

R. N. Bhattacharya, Speed of convergence of the n-fold convolution of a probability measure on a compact group, Z. Wahrscheinlichkeitstheorie verw. Geb. 25 (1972), 1–10.

F. Benford, The law of anomalous numbers, Proceedings of the American Philosophical Society 78 (1938), 551–572. http://www.jstor.org/stable/984802?seq=1#page_scan_tab_contents.

A. Berger, L. A. Bunimovich and T. Hill, One-dimensional dynamical systems and Benford’s Law, Trans. AMS 357 (2005), no. 1, 197–219. http://www.ams.org/journals/tran/2005-357-01/S0002-9947-04-03455-5/.

A. Berger and T. Hill, Newton’s method obeys Benford’s law, The Amer. Math. Monthly 114 (2007), no. 7, 588–601. http://digitalcommons.calpoly.edu/cgi/viewcontent.cgi?article=1058&context=rgp_rsr.

A. Berger and T. Hill, Benford on-line bibliography, http://www.benfordonline.net/.

J. Boyle, An application of Fourier series to the most significant digit problem, Amer. Math. Monthly 101 (1994), 879–886. http://www.jstor.org/stable/2975136?seq=1#page_scan_tab_contents.

J. Brown and R. Duncan, Modulo one uniform distribution of the sequence of logarithms of certain recursive sequences, Fibonacci Quarterly 8 (1970), 482–486.

P. Diaconis, The distribution of leading digits and uniform distribution mod 1, Ann. Probab. 5 (1977), 72–81. http://statweb.stanford.edu/~cgates/PERSI/papers/digits.pdf.

W. Feller, An Introduction to Probability Theory and its Applications, Vol. II, second edition, John Wiley & Sons, Inc., 1971.

R. W. Hamming, On the distribution of numbers, Bell Syst. Tech. J. 49 (1970), 1609–1625. https://archive.org/details/bstj49-8-1609.

T. Hill, The first-digit phenomenon, American Scientist 86 (1998), 358–363. http://www.americanscientist.org/issues/feature/1998/4/the-first-digit-phenomenon/99999.

T. Hill, A statistical derivation of the significant-digit law, Statistical Science 10 (1995), 354–363. https://projecteuclid.org/euclid.ss/1177009869.

P. J. Holewijn, On the uniform distribution of sequences of random variables, Z. Wahrscheinlichkeitstheorie verw. Geb. 14 (1969), 89–92.

W. Hurlimann, Benford’s Law from 1881 to 2006: a bibliography, http://arxiv.org/abs/math/0607168.

D. Jang, J. U. Kang, A. Kruckman, J. Kudo and S. J. Miller, Chains of distributions, hierarchical Bayesian models and Benford’s Law, Journal of Algebra, Number Theory: Advances and Applications, volume 1, number 1 (March 2009), 37–60. http://arxiv.org/abs/0805.4226.

E. Janvresse and T. de la Rue, From uniform distribution to Benford’s law, Journal of Applied Probability 41 (2004), no. 4, 1203–1210. http://www.jstor.org/stable/4141393?seq=1#page_scan_tab_contents.

A. Kontorovich and S. J. Miller, Benford’s Law, Values of L-functions and the 3x + 1 Problem, Acta Arith. 120 (2005), 269–297. http://arxiv.org/pdf/math/0412003.pdf.

D. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, Addison-Wesley, third edition, 1997.

J. Lagarias and K. Soundararajan, Benford’s Law for the 3x + 1 Function, J. London Math. Soc. (2) 74 (2006), no. 2, 289–303. http://arxiv.org/pdf/math/0509175.pdf.

S. Lang, Undergraduate Analysis, 2nd edition, Springer-Verlag, New York, 1997.

P. Levy, L’addition des variables aleatoires definies sur une circonference, Bull. de la S. M. F. 67 (1939), 1–41.

E. Ley, On the peculiar distribution of the U.S. Stock Indices Digits, The American Statistician 50 (1996), no. 4, 311–313. http://www.jstor.org/stable/2684926?seq=1#page_scan_tab_contents.

R. M. Loynes, Some results in the probabilistic theory of asymptotic uniform distributions modulo 1, Z. Wahrscheinlichkeitstheorie verw. Geb. 26 (1973), 33–41.

S. J. Miller, Benford’s Law: Theory and Applications, Princeton University Press, 2015. http://web.williams.edu/Mathematics/sjmiller/public_html/benford/.

S. J. Miller and M. Nigrini, The Modulo 1 Central Limit Theorem and Benford’s Law for Products, International Journal of Algebra 2 (2008), no. 3, 119–130. http://arxiv.org/pdf/math/0607686v2.

S. J. Miller and M. Nigrini, Order Statistics and Benford’s law, International Journal of Mathematics and Mathematical Sciences, Volume 2008 (2008), Article ID 382948, 19 pages. http://arxiv.org/pdf/math/0601344v5.pdf.

S. J. Miller and R. Takloo-Bighash, An Invitation to Modern Number Theory, Princeton University Press, Princeton, NJ, 2006. http://web.williams.edu/Mathematics/sjmiller/public_html/book/index.html.

S. Newcomb, Note on the frequency of use of the different digits in natural numbers, Amer. J. Math. 4 (1881), 39–40. http://www.jstor.org/stable/2369148?seq=1#page_scan_tab_contents.

M. Nigrini, Digital Analysis and the Reduction of Auditor Litigation Risk. Pages 69–81 in Proceedings of the 1996 Deloitte & Touche / University of Kansas Symposium on Auditing Problems, ed. M. Ettredge, University of Kansas, Lawrence, KS, 1996.

M. Nigrini, The Use of Benford’s Law as an Aid in Analytical Procedures, Auditing: A Journal of Practice & Theory 16 (1997), no. 2, 52–67.

M. Nigrini and S. J. Miller, Benford’s Law applied to hydrology data – results and relevance to other geophysical data, Mathematical Geology 39 (2007), no. 5, 469–490. http://link.springer.com/article/10.1007%2Fs11004-007-9109-5?LI=true.

M. Nigrini and S. J. Miller, Data diagnostics using second order tests of Benford’s Law, Auditing: A Journal of Practice and Theory 28 (2009), no. 2, 305–324. http://accounting.uwaterloo.ca/uwcisa/symposiums/symposium_2007/AdvancedBenfordsLaw7.pdf.

R. Pinkham, On the Distribution of First Significant Digits, The Annals of Mathematical Statistics 32 (1961), no. 4, 1223–1230.

R. A. Raimi, The first digit problem, Amer. Math. Monthly 83 (1976), no. 7, 521–538.

H. Robbins, On the equidistribution of sums of independent random variables, Proc. Amer. Math. Soc. 4 (1953), 786–799. http://projecteuclid.org/download/pdf_1/euclid.aoms/1177704862.

H. Sakamoto, On the distributions of the product and the quotient of the independent and uniformly distributed random variables, Tohoku Math. J. 49 (1943), 243–260.

P. Schatte, On sums modulo 2π of independent random variables, Math. Nachr. 110 (1983), 243–261.

P. Schatte, On the asymptotic uniform distribution of sums reduced mod 1, Math. Nachr. 115 (1984), 275–281.

P. Schatte, On the asymptotic logarithmic distribution of the floating-point mantissas of sums, Math. Nachr. 127 (1986), 7–20.

E. Stein and R. Shakarchi, Fourier Analysis: An Introduction, Princeton University Press, 2003.

M. D. Springer and W. E. Thompson, The distribution of products of independent random variables, SIAM J. Appl. Math. 14 (1966), 511–526. http://www.jstor.org/stable/2946226?seq=1#page_scan_tab_contents.

K. Stromberg, Probabilities on a compact group, Trans. Amer. Math. Soc. 94 (1960), 295–309. http://www.jstor.org/stable/1993313?seq=1#page_scan_tab_contents.

P. R. Turner, The distribution of leading significant digits, IMA J. Numer. Anal. 2 (1982), no. 4, 407–412.

Products of Random Variables

S. J. Miller and M. Nigrini, The Modulo 1 Central Limit Theorem and Benford’s Law for Products, International Journal of Algebra 2 (2008), no. 3, 119–130.


Preliminaries

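$X_1 \cdots X_n \Leftrightarrow Y_1 + \cdots + Y_n \bmod 1$, $Y_i = \log_B X_i$.

Density of $Y_i$ is $g_i$; density of $Y_i + Y_j$ is

$$(g_i * g_j)(y) = \int_0^1 g_i(t)\, g_j(y - t)\, dt.$$

$h_n = g_1 * \cdots * g_n$, $\hat{h}_n(\xi) = \hat{g}_1(\xi) \cdots \hat{g}_n(\xi)$.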


Modulo 1 Central Limit Theorem

Theorem (M– and Nigrini 2007): $\{Y_m\}$ independent continuous random variables on $[0, 1)$ (not necessarily i.i.d.), densities $\{g_m\}$. $Y_1 + \cdots + Y_M \bmod 1$ converges to the uniform distribution as $M \to \infty$ in $L^1([0, 1])$ if and only if for all $n \ne 0$, $\lim_{M\to\infty} \hat{g}_1(n) \cdots \hat{g}_M(n) = 0$.

⋄ Gives info on rate of convergence.


Generalizations

Levy proved this for i.i.d. random variables just one year after Benford’s paper.

Generalized to other compact groups, with estimates on the rate of convergence.

⋄ Stromberg: the $n$-fold convolution of a regular probability measure on a compact Hausdorff group $G$ converges to normalized Haar measure in the weak-star topology iff the support of the distribution is not contained in a coset of a proper normal closed subgroup of $G$.


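Distribution of digits (base 10) of 1000 products $X_1 \cdots X_{1000}$, where $g_{10,m} = \phi_{11^m}$ and $\phi_m(x) = m$ if $|x - 1/8| \le 1/(2m)$ (0 otherwise).

[Figure: plot of this digit distribution; horizontal axis 2 to 10, vertical axis 0.1 to 0.5.]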
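One way to see why this example fails to be Benford is to look at the Fourier condition of the Modulo 1 CLT. The sketch below assumes the densities are $\phi_{11^m}$ as read from the slide, and uses the fact that a density equal to $w$ on an interval of length $1/w$ has Fourier coefficient of absolute value $|\sin(\pi n / w)/(\pi n / w)|$ (the center $1/8$ only affects the phase). Function names are illustrative.

```python
import math


def phi_hat_abs(w: float, n: int) -> float:
    """|Fourier coefficient at n| of the density equal to w on an interval of length 1/w."""
    t = math.pi * n / w
    return abs(math.sin(t) / t)


def product_abs(n: int, M: int) -> float:
    """|g_1^(n) * ... * g_M^(n)| for g_m = phi_{11^m}; keep M small enough that 11.0**M is finite."""
    prod = 1.0
    for m in range(1, M + 1):
        prod *= phi_hat_abs(11.0**m, n)
    return prod


for M in (1, 2, 5, 100):
    print(M, product_abs(1, M))
```

The product at $n = 1$ settles near 0.986 instead of tending to 0, so the "if and only if" condition fails and the sums modulo 1 do not become uniform.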


Proof under stronger conditions

Use standard CLT to show $Y_1 + \cdots + Y_M$ tends to a Gaussian.

Use Poisson Summation to show the Gaussian tends to the uniform modulo 1.

Figure: Plot of normal (mean 0, stdev 1).

Figure: Plot of normal (mean 0, stdev .1) modulo 1.

Figure: Plot of normal (mean 0, stdev .5) modulo 1.
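The densities in the modulo 1 plots can be evaluated in two ways: by summing shifted Gaussians, or by the Fourier series that Poisson summation produces. The sketch below is illustrative; the function names and the truncation at 50 terms are choices made here, not taken from the slides.

```python
import math


def wrapped_normal_direct(x: float, sigma: float, terms: int = 50) -> float:
    """Density of N(0, sigma^2) mod 1 at x, summing the shifted Gaussians."""
    norm = 1.0 / (sigma * math.sqrt(2 * math.pi))
    return sum(norm * math.exp(-((x + n) ** 2) / (2 * sigma**2))
               for n in range(-terms, terms + 1))


def wrapped_normal_dual(x: float, sigma: float, terms: int = 50) -> float:
    """Same density via Poisson summation: 1 + 2 * sum_k exp(-2 pi^2 sigma^2 k^2) cos(2 pi k x)."""
    return 1.0 + 2.0 * sum(
        math.exp(-2 * math.pi**2 * sigma**2 * k**2) * math.cos(2 * math.pi * k * x)
        for k in range(1, terms + 1))


for sigma in (0.1, 0.5, 1.0):
    values = [wrapped_normal_dual(i / 20, sigma) for i in range(20)]
    print(sigma, round(min(values), 4), round(max(values), 4))
```

The dual form makes the convergence visible: every non-constant term carries a factor $e^{-2\pi^2\sigma^2 k^2}$, which is already below 0.01 at $\sigma = 0.5$.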


Inputs

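Poisson Summation Formula: for $f$ nice,

$$\sum_{\ell=-\infty}^{\infty} f(\ell) = \sum_{\ell=-\infty}^{\infty} \hat{f}(\ell),$$

with Fourier transform $\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x \xi}\, dx$.

Lemma: $\frac{2}{\sqrt{2\pi\sigma^2}} \int_{\sigma^{1+\delta}}^{\infty} e^{-x^2/2\sigma^2}\, dx \ll e^{-\sigma^{2\delta}/2}$.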


Proof Under Weaker Conditions

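Lemma: As $N \to \infty$, $p_N(x) = \frac{e^{-\pi x^2/N}}{\sqrt{N}}$ becomes equidistributed modulo 1.

$$\int_{\substack{x=-\infty \\ x \bmod 1 \in [a,b]}}^{\infty} p_N(x)\, dx = \frac{1}{\sqrt{N}} \sum_{n\in\mathbb{Z}} \int_{x=a}^{b} e^{-\pi(x+n)^2/N}\, dx.$$

$$e^{-\pi(x+n)^2/N} = e^{-\pi n^2/N} + O\left(\frac{\max(1,|n|)}{N}\, e^{-n^2/N}\right).$$

Can restrict sum to $|n| \le N^{5/4}$.

$$\frac{1}{\sqrt{N}} \sum_{n\in\mathbb{Z}} e^{-\pi n^2/N} = \sum_{n\in\mathbb{Z}} e^{-\pi n^2 N}.$$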


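$$\frac{1}{\sqrt{N}} \sum_{|n| \le N^{5/4}} \int_{x=a}^{b} e^{-\pi(x+n)^2/N}\, dx$$

$$= \frac{1}{\sqrt{N}} \sum_{|n| \le N^{5/4}} \int_{x=a}^{b} \left[e^{-\pi n^2/N} + O\left(\frac{\max(1,|n|)}{N}\, e^{-n^2/N}\right)\right] dx$$

$$= \frac{b-a}{\sqrt{N}} \sum_{|n| \le N^{5/4}} e^{-\pi n^2/N} + O\left(\frac{1}{N} \sum_{n=0}^{N^{5/4}} \frac{n+1}{\sqrt{N}}\, e^{-\pi(n/\sqrt{N})^2}\right)$$

$$= \frac{b-a}{\sqrt{N}} \sum_{|n| \le N^{5/4}} e^{-\pi n^2/N} + O\left(\frac{1}{N} \int_{w=0}^{N^{3/4}} (w+1)\, e^{-\pi w^2} \sqrt{N}\, dw\right)$$

$$= \frac{b-a}{\sqrt{N}} \sum_{|n| \le N^{5/4}} e^{-\pi n^2/N} + O\left(N^{-1/2}\right).$$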


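Extend sums to $n \in \mathbb{Z}$, apply Poisson Summation:

$$\frac{1}{\sqrt{N}} \sum_{n\in\mathbb{Z}} \int_{x=a}^{b} e^{-\pi(x+n)^2/N}\, dx \approx (b-a) \cdot \sum_{n\in\mathbb{Z}} e^{-\pi n^2 N}.$$

For $n = 0$ the right hand side is $b - a$. For all other $n$, we trivially estimate the sum:

$$\sum_{n \ne 0} e^{-\pi n^2 N} \le 2\sum_{n \ge 1} e^{-\pi n N} \le \frac{2e^{-\pi N}}{1 - e^{-\pi N}},$$

which is less than $4e^{-\pi N}$ for $N$ sufficiently large.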
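Both the Poisson summation identity for $e^{-\pi n^2/N}$ and the tail estimate can be checked numerically for small $N$. The sketch truncates the infinite sums at $|n| \le 200$, which is an arbitrary cutoff chosen here and is ample for the values of $N$ tried.

```python
import math


def theta_lhs(N: float, terms: int = 200) -> float:
    """(1/sqrt(N)) * sum over |n| <= terms of exp(-pi n^2 / N)."""
    return sum(math.exp(-math.pi * n * n / N) for n in range(-terms, terms + 1)) / math.sqrt(N)


def theta_rhs(N: float, terms: int = 200) -> float:
    """Sum over |n| <= terms of exp(-pi n^2 N)."""
    return sum(math.exp(-math.pi * n * n * N) for n in range(-terms, terms + 1))


for N in (1, 2, 5):
    tail = theta_rhs(N) - 1.0  # contribution of all n != 0
    print(N, theta_lhs(N), theta_rhs(N), tail, 4 * math.exp(-math.pi * N))
```

In each row the first two values should agree, and the tail should sit below $4e^{-\pi N}$.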


Proof in General Case: Fourier input

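Fejér kernel:

$$F_N(x) = \sum_{n=-N}^{N} \left(1 - \frac{|n|}{N}\right) e^{2\pi i n x}.$$

Fejér series $T_N f(x)$ equals

$$(f * F_N)(x) = \sum_{n=-N}^{N} \left(1 - \frac{|n|}{N}\right) \hat{f}(n)\, e^{2\pi i n x}.$$

Lebesgue’s Theorem: $f \in L^1([0, 1])$. As $N \to \infty$, $T_N f$ converges to $f$ in $L^1([0, 1])$.

$T_N(f * g) = (T_N f) * g$: convolution is associative.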
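The Fejér kernel is a non-negative function, which is part of why Fejér means behave well in $L^1$. The sketch below compares the defining sum with the standard closed form $\frac{1}{N}\left(\frac{\sin \pi N x}{\sin \pi x}\right)^2$; that closed form is a known identity and does not appear on the slide. Function names are illustrative.

```python
import cmath
import math


def fejer_sum(x: float, N: int) -> float:
    """F_N(x) from the definition: sum of (1 - |n|/N) e^{2 pi i n x} for |n| <= N."""
    total = sum((1 - abs(n) / N) * cmath.exp(2j * math.pi * n * x)
                for n in range(-N, N + 1))
    return total.real  # imaginary parts cancel in pairs


def fejer_closed(x: float, N: int) -> float:
    """Closed form (1/N) * (sin(pi N x) / sin(pi x))^2, valid for non-integer x."""
    return (math.sin(math.pi * N * x) / math.sin(math.pi * x)) ** 2 / N


for x in (0.1, 0.37, 0.5):
    print(x, fejer_sum(x, 16), fejer_closed(x, 16))
```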


Proof of Modulo 1 CLT

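Density of sum is $h_\ell = g_1 * \cdots * g_\ell$.

Suffices to show $\forall \epsilon$: $\lim_{M\to\infty} \int_0^1 |h_M(x) - 1|\, dx < \epsilon$.

Lebesgue’s Theorem: $N$ large,

$$\|h_1 - T_N h_1\|_1 = \int_0^1 |h_1(x) - T_N h_1(x)|\, dx < \frac{\epsilon}{2}.$$

Claim: above holds for $h_M$ for all $M$.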


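Proof of Modulo 1 CLT: Proof of Claim

$$T_N h_{M+1} = T_N(h_M * g_{M+1}) = (T_N h_M) * g_{M+1}$$

$$\|h_{M+1} - T_N h_{M+1}\|_1 = \int_0^1 |h_{M+1}(x) - T_N h_{M+1}(x)|\, dx$$

$$= \int_0^1 |(h_M * g_{M+1})(x) - (T_N h_M) * g_{M+1}(x)|\, dx$$

$$= \int_0^1 \left| \int_0^1 (h_M(y) - T_N h_M(y))\, g_{M+1}(x - y)\, dy \right| dx$$

$$\le \int_0^1 \int_0^1 |h_M(y) - T_N h_M(y)|\, g_{M+1}(x - y)\, dx\, dy$$

$$= \int_0^1 |h_M(y) - T_N h_M(y)|\, dy \cdot 1 < \frac{\epsilon}{2}.$$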


Proof of Modulo 1 CLT

Show $\lim_{M\to\infty} \|h_M - 1\|_1 = 0$.

Triangle inequality:

$$\|h_M - 1\|_1 \le \|h_M - T_N h_M\|_1 + \|T_N h_M - 1\|_1.$$

Choices of $N$ and $\epsilon$:

$$\|h_M - T_N h_M\|_1 < \epsilon/2.$$

Show $\|T_N h_M - 1\|_1 < \epsilon/2$.


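$$\|T_N h_M - 1\|_1 = \int_0^1 \left| \sum_{\substack{n=-N \\ n \ne 0}}^{N} \left(1 - \frac{|n|}{N}\right) \hat{h}_M(n)\, e^{2\pi i n x} \right| dx \le \sum_{\substack{n=-N \\ n \ne 0}}^{N} \left(1 - \frac{|n|}{N}\right) |\hat{h}_M(n)|$$

$\hat{h}_M(n) = \hat{g}_1(n) \cdots \hat{g}_M(n) \to 0$ as $M \to \infty$.

For fixed $N$ and $\epsilon$, choose $M$ large so that $|\hat{h}_M(n)| < \epsilon/4N$ whenever $n \ne 0$ and $|n| \le N$.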


Products and Chains of Random Variables

D. Jang, J. U. Kang, A. Kruckman, J. Kudo and S. J. Miller, Chains of distributions, hierarchical Bayesian models and Benford’s Law, Journal of Algebra, Number Theory: Advances and Applications, volume 1, number 1 (March 2009), 37–60.


Key Ingredients

Mellin transform and Fourier transform related by logarithmic change of variable.

Poisson summation from collapsing to modulo 1 random variables.

Preliminaries

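$\Xi_1, \dots, \Xi_n$ nice independent r.v.’s on $[0, \infty)$. Density of $\Xi_1 \cdot \Xi_2$:

$$\int_0^\infty f_2\left(\frac{x}{t}\right) f_1(t)\, \frac{dt}{t}$$

⋄ Proof: $\mathrm{Prob}(\Xi_1 \cdot \Xi_2 \in [0, x])$:

$$\int_{t=0}^{\infty} \mathrm{Prob}\left(\Xi_2 \in \left[0, \frac{x}{t}\right]\right) f_1(t)\, dt = \int_{t=0}^{\infty} F_2\left(\frac{x}{t}\right) f_1(t)\, dt,$$

differentiate.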


Mellin Transform

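$$(\mathcal{M}f)(s) = \int_0^\infty f(x)\, x^s\, \frac{dx}{x}$$

$$(\mathcal{M}^{-1}g)(x) = \frac{1}{2\pi i} \int_{c-i\infty}^{c+i\infty} g(s)\, x^{-s}\, ds$$

$g(s) = (\mathcal{M}f)(s)$, $f(x) = (\mathcal{M}^{-1}g)(x)$.

$$(f_1 \star f_2)(x) = \int_0^\infty f_2\left(\frac{x}{t}\right) f_1(t)\, \frac{dt}{t}$$

$$(\mathcal{M}(f_1 \star f_2))(s) = (\mathcal{M}f_1)(s) \cdot (\mathcal{M}f_2)(s).$$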


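Mellin Transform Formulation: Products of Random Variables

Theorem: $X_i$’s independent, densities $f_i$. $\Xi_n = X_1 \cdots X_n$,

$$h_n(x_n) = (f_1 \star \cdots \star f_n)(x_n)$$

$$(\mathcal{M}h_n)(s) = \prod_{m=1}^{n} (\mathcal{M}f_m)(s).$$

As $n \to \infty$, $\Xi_n$ becomes Benford: $Y_n = \log_B \Xi_n$,

$$|\mathrm{Prob}(Y_n \bmod 1 \in [a, b]) - (b - a)| \le \left| (b - a) \cdot \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{n} (\mathcal{M}f_m)\left(1 - \frac{2\pi i \ell}{\log B}\right) \right|.$$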


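Proof of Kossovsky’s Chain Conjecture for certain densities

Conditions: $\{D_i(\theta)\}_{i \in I}$: one-parameter distributions, densities $f_{D_i(\theta)}$ on $[0, \infty)$.

$p : \mathbb{N} \to I$, $X_1 \sim D_{p(1)}(1)$, $X_m \sim D_{p(m)}(X_{m-1})$.

For $m \ge 2$,

$$f_m(x_m) = \int_0^\infty f_{D_{p(m)}(1)}\left(\frac{x_m}{x_{m-1}}\right) f_{m-1}(x_{m-1})\, \frac{dx_{m-1}}{x_{m-1}}$$

$$\lim_{n\to\infty} \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{n} (\mathcal{M}f_{D_{p(m)}(1)})\left(1 - \frac{2\pi i \ell}{\log B}\right) = 0$$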


Chains of Random Variables

Return to street problem: chain of uniforms.

Let $D_{\mathrm{unif}}(\theta)$ be the density of a uniform random variable on $[0, \theta]$.

Let $X_1 \sim D_{\mathrm{unif}}(1)$ and $X_{n+1} \sim D_{\mathrm{unif}}(X_n)$.
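The chain of uniforms is simple to simulate. The sketch below is a Monte Carlo illustration with arbitrary choices made here (chain length 10, 20,000 samples, a fixed random seed). It uses the fact that a uniform draw on $[0, X_n]$ has the same distribution as $X_n$ times a uniform draw on $(0, 1]$.

```python
import math
import random


def chain_of_uniforms(n: int, rng: random.Random) -> float:
    """Sample X_n where X_1 ~ Unif(0, 1) and X_{k+1} ~ Unif(0, X_k)."""
    x = 1.0
    for _ in range(n):
        x *= 1.0 - rng.random()  # uniform on (0, 1], so x never hits 0
    return x


def leading_digit(x: float) -> int:
    """First significant digit of a positive float."""
    mantissa = 10 ** (math.log10(x) % 1.0)
    return min(int(mantissa), 9)  # guard against rounding up to 10.0


rng = random.Random(0)  # arbitrary seed for repeatability
samples = 20000
counts = [0] * 9
for _ in range(samples):
    counts[leading_digit(chain_of_uniforms(10, rng)) - 1] += 1

for d in range(1, 10):
    print(d, round(counts[d - 1] / samples, 3), round(math.log10(1 + 1 / d), 3))
```

The second column is the simulated frequency and the third is the Benford probability; they should be close but will differ by sampling noise.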


Proof of Kossovsky’s Chain Conjecture for certain densities

Theorem (JKKKM): If conditions hold, as $n \to \infty$ the distribution of leading digits of $X_n$ tends to Benford’s law.

The error is a nice function of the Mellin transforms: if $Y_n = \log_B X_n$, then

$$|\mathrm{Prob}(Y_n \bmod 1 \in [a, b]) - (b - a)| \le \left| (b - a) \cdot \sum_{\substack{\ell=-\infty \\ \ell \ne 0}}^{\infty} \prod_{m=1}^{n} (\mathcal{M}f_{D_{p(m)}(1)})\left(1 - \frac{2\pi i \ell}{\log B}\right) \right|$$


Example: All Xi ∼ Exp(1)

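$X_i \sim \mathrm{Exp}(1)$, $Y_n = \log_B \Xi_n$.

Needed ingredients:

⋄ $\int_0^\infty \exp(-x)\, x^{s-1}\, dx = \Gamma(s)$.

⋄ $|\Gamma(1 + ix)| = \sqrt{\pi x / \sinh(\pi x)}$, $x \in \mathbb{R}$.

$$|P_n(s) - \log_{10}(s)| \le \log_B s \sum_{\ell=1}^{\infty} \left(\frac{2\pi^2 \ell / \log B}{\sinh(2\pi^2 \ell / \log B)}\right)^{n/2}.$$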


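Bounds on the error: $|P_n(s) - \log_{10} s| \le$

⋄ $3.3 \cdot 10^{-3} \log_B s$ if $n = 2$,

⋄ $1.9 \cdot 10^{-4} \log_B s$ if $n = 3$,

⋄ $1.1 \cdot 10^{-5} \log_B s$ if $n = 5$, and

⋄ $3.6 \cdot 10^{-13} \log_B s$ if $n = 10$.

Error at most

$$\log_{10} s \sum_{\ell=1}^{\infty} \left(\frac{17.148\,\ell}{\exp(8.5726\,\ell)}\right)^{n/2} \le .057^n \log_{10} s$$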
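The constants in these bounds can be compared with a direct evaluation of the series $\sum_{\ell \ge 1} \left(\frac{2\pi^2\ell/\log B}{\sinh(2\pi^2\ell/\log B)}\right)^{n/2}$. The sketch assumes $B = 10$ and that $\log B$ is the natural logarithm (consistent with $2\pi^2/\ln 10 \approx 8.5726$), and it truncates the series at 50 terms; the function name is made up for the illustration.

```python
import math


def exp_chain_coefficient(n: int, base: float = 10.0, terms: int = 50) -> float:
    """Sum over l >= 1 of ((c l) / sinh(c l))^(n/2), with c = 2 pi^2 / ln(base)."""
    c = 2 * math.pi**2 / math.log(base)
    return sum((c * l / math.sinh(c * l)) ** (n / 2) for l in range(1, terms + 1))


print(2 * math.pi**2 / math.log(10))  # the constant c, about 8.5726
for n, stated in ((2, 3.3e-3), (3, 1.9e-4), (5, 1.1e-5), (10, 3.6e-13)):
    print(n, exp_chain_coefficient(n), stated)
```

Each computed coefficient should be no larger than the stated constant; the $\ell = 1$ term dominates, since later terms decay like $e^{-8.57\ell}$.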
