Lempel-Ziv and Related Algorithms∗
W. Szpankowski†
Department of Computer Science
Purdue University
W. Lafayette, IN 47907
April 29, 2008
Wroclaw 2008
∗Research supported by NSF, AFOSR, and NIH. †Joint work with M. Drmota, P. Jacquet, C. Knessl, and M. Ward.
Outline
1. Universal Source Coding
2. Error-Resilient Lempel-Ziv’77 (Analytic Pattern Matching)
3. Method of Types (Nonlinear Functional Equations)
Algorithms: at the heart of virtually all computing technologies.
Combinatorics: provides indispensable tools for finding patterns and structures.
Information: permeates every corner of our lives and shapes our universe.
Goals of Source Coding
The basic problem of source coding (i.e., data compression) is to
find codes with shortest descriptions (lengths) either on average or for
individual sequences when the source (i.e., statistics of the underlying
probability distribution) is unknown (i.e., universal source coding).
Definition: A block-to-variable (BV) length code $C_n : \mathcal{A}^n \to \{0,1\}^*$ is a bijective mapping from the set of all sequences of length $n$ over the alphabet $\mathcal{A}$ to the set $\{0,1\}^*$ of binary sequences.
For a probabilistic source model S and a code Cn we let:
• $P(x_1^n)$ be the probability of $x_1^n = x_1 \ldots x_n$;
• $L(C_n, x_1^n)$ be the code length for $x_1^n$;
• Entropy $H_n(P) = -\sum_{x_1^n} P(x_1^n) \lg P(x_1^n)$; entropy rate $h \sim H(X_1^n)/n$.
Outline Update
1. Universal Source Coding
2. Algorithms: Error-Resilient Lempel-Ziv’77
(a) Redundant Bits in LZ’77
(b) Design of Encoder and Decoder
(c) Analysis through the Suffix Tree
3. Combinatorics: Method of Types
4. Information: Non-Prefix Codes
LZ’77 Scheme
The popular Lempel-Ziv’77 scheme works on-line: It compresses phrases by
consecutively replacing the longest prefix of the non-compressed portion
of a file with a pointer and the length of its copy.
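The greedy longest-prefix step can be sketched as follows. This is a quadratic-time toy implementation for illustration only (real gzip uses a sliding window and hash chains); the phrase format (position, length, next symbol) is one common convention, and the function names are ours:

```python
def lz77_compress(text: str):
    """Greedy LZ'77 parsing: repeatedly replace the longest prefix of the
    uncompressed portion with (position, length) of its copy in the history,
    plus the next symbol."""
    out, i, n = [], 0, len(text)
    while i < n:
        best_pos, best_len = 0, 0
        for j in range(i):                      # scan the history
            length = 0
            while i + length < n and text[j + length] == text[i + length]:
                length += 1                     # matches may overlap position i
            if length > best_len:
                best_pos, best_len = j, length
        nxt = text[i + best_len] if i + best_len < n else ""
        out.append((best_pos, best_len, nxt))
        i += best_len + 1
    return out

def lz77_decompress(phrases):
    s = ""
    for pos, length, nxt in phrases:
        for k in range(length):
            s += s[pos + k]                     # char-by-char copy handles overlap
        s += nxt
    return s
```

A round trip (`lz77_decompress(lz77_compress(s)) == s`) recovers the original string.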
The devastating effect of errors in LZ’77 is a long-standing open problem.
Castelli and Lastras proved in 2004 that a single error in LZ'77 corrupts $O(n^{2/3})$ phrases, hence about $O(n^{2/3}\log n)$ symbols, where $n$ is the size of the file to be compressed.
Figure 1: LZ'77 pointers into the history from the current position (also for LZRS'77 we have $M_n = 4$).
Our Main Idea of Error Resilient LZ’77
1. We observe that there are usually multiple copies of the longest prefix.
By Mn we denote the number of copies of the longest prefix of the
uncompressed string that appear in the database.
2. By a judicious choice of pointers in the LZ’77 scheme, we can recover
⌊log2 Mn⌋ bits without losing a bit in compression.
3. Use parity bits recovered from the multiple copies (redundancy) for the
Reed-Solomon channel coding.
Note: If the greediness of LZ'77 is relaxed (say, by looking for the 10th longest prefix), then the number of copies found in the database increases significantly, which would allow even more errors to be corrected.
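The idea can be illustrated with a small sketch (function names are ours): among the $M_n$ copies of the longest matching prefix, every pointer decompresses identically, so the index of the chosen copy carries $\lfloor \log_2 M_n \rfloor$ hidden bits at no cost in compressed size:

```python
import math

def longest_match_positions(history: str, future: str):
    """Length of the longest prefix of `future` occurring in `history`,
    together with every starting position of a copy (M_n = len(positions))."""
    best_len, positions = 0, []
    for j in range(len(history)):
        length = 0
        while (length < len(future) and j + length < len(history)
               and history[j + length] == future[length]):
            length += 1
        if length > best_len:
            best_len, positions = length, [j]
        elif length == best_len and best_len > 0:
            positions.append(j)
    return best_len, positions

def embed_bits(positions, bits: int):
    """Pick which of the M_n equivalent pointers to emit; the choice
    encodes floor(log2(M_n)) recoverable bits."""
    capacity = int(math.log2(len(positions)))
    return positions[bits % (1 << capacity)], capacity
```

For example, with history `"abab"` and upcoming text `"abc..."`, the longest prefix `"ab"` occurs twice, so one extra bit can be recovered from the pointer choice.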
Experimental Results: I
Table 1: The compression of "gzip -3" (which we also call LZS'77) versus "gzipS -3" for the files of the Calgary corpus; the last column shows the total number of available bytes for error correction.

file     size      gzip      gzipS     redundant
bib      111,261   39,473    39,511    1,721
book1    768,771   333,776   336,256   14,524
book2    610,856   228,321   228,242   10,361
geo      102,400   69,478    71,168    4,101
news     377,109   155,290   156,150   5,956
obj1     21,504    10,584    10,783    353
obj2     246,814   89,467    89,757    3,628
paper1   53,161    20,110    20,204    937
paper2   82,199    32,529    32,507    1,551
paper3   46,526    19,450    19,567    893
paper4   13,286    5,853     5,898     249
paper5   11,954    5,252     5,294     210
paper6   38,105    14,433    14,506    738
pic      513,216   62,357    61,259    3,025
progc    39,611    14,510    14,660    736
progl    71,646    18,310    18,407    1,106
progp    49,379    12,532    12,572    741
trans    93,695    22,178    22,098    1,201
Encoder and Decoder of LZRS’77
We use the family of Reed-Solomon codes RS(255, 255− 2e) that contains
blocks of 255 bytes, of which 255− 2e are data and 2e are parity.
Encoder: The data is broken into blocks of size 255 − 2e. Blocks are processed in reverse order, beginning with the very last. When processing block i, the encoder first computes the Reed-Solomon parity bits for block i + 1 and then embeds these extra bits in the pointers of block i.
Decoder: The decoder receives a sequence of pointers, preceded by the parity bits of the first block, which are used to correct block B1. Once block B1 is correct, the decoder decompresses it using LZS'77. The redundant bits recovered from block B1 serve as parity bits to correct block B2, and so on.
Figure 2: The right-to-left sequence of operations on the blocks B1, B2, B3, . . . , Bb: each block passes through the Reed-Solomon (RS) encoder, and the pointers of the preceding block are adjusted to store the resulting parity bits.
Error-Resilient Algorithm LZRS’77
LZRS’77 ENCODER (X, e)
    let b, j, n ← 1, 1, |X|
    while j < n do
        append LZ’77 COMPRESS(Xj) to Bb
        if |Bb| = 255 − 2e then let b ← b + 1
    for i ← b, . . . , 2 do
        let RSi ← REED SOLOMON ENCODER(Bi, e)
        embed the bits RSi in block Bi−1 using LZS’77
    let RS1 ← REED SOLOMON ENCODER(B1, e)
    return RS1, B1, B2, . . . , Bb

LZRS’77 DECODER (RS1, B1, B2, . . . , Bb, e)
    D ← empty string
    if REED SOLOMON DECODER(B1 + RS1, e) = errors then correct B1
    append LZ’77 DECOMPRESS(B1) to D
    recover RS2 from the pointers used in B1 using LZS’77
    for i ← 2, . . . , b do
        if REED SOLOMON DECODER(Bi + RSi, e) = errors then correct Bi
        append LZ’77 DECOMPRESS(Bi) to D
        recover RSi+1 from the pointers in Bi using LZS’77
    return D
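The parity chaining can be modeled with a minimal sketch. Here a plain XOR checksum stands in for the Reed-Solomon code (it can only detect corruption, not correct it), and the "embedding in pointers" is modeled as a side field carried with each block; all names are illustrative:

```python
def parity(block: bytes) -> int:
    """Toy stand-in for Reed-Solomon parity: a one-byte XOR checksum."""
    p = 0
    for byte in block:
        p ^= byte
    return p

def encode(blocks):
    """Right-to-left pass: block i carries the parity of block i+1."""
    embedded = [None] * len(blocks)
    for i in range(len(blocks) - 1, 0, -1):
        embedded[i - 1] = parity(blocks[i])
    rs1 = parity(blocks[0])          # parity of the first block is sent in clear
    return rs1, list(zip(blocks, embedded))

def decode(rs1, stream):
    """Left-to-right pass: verify each block, then pull out the parity
    it carries for the next one (where real LZS'77 decompression would run)."""
    expected, out = rs1, []
    for block, embedded_parity in stream:
        assert parity(block) == expected, "corrupted block detected"
        out.append(block)
        expected = embedded_parity
    return out
```

The sketch reproduces the control flow (encode backwards, decode forwards) rather than the actual error-correcting power of RS(255, 255 − 2e).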
Experimental Results II

[Four plots of the probability that the file could not be recovered vs. the number of errors injected: (t=1, 10 buffers, 'error10_1.dat'), (t=1, 100 buffers, 'error100_1.dat'), (t=2, 10 buffers, 'error10_2.dat'), (t=2, 100 buffers, 'error100_2.dat').]
Figure 3: The probability that a file of b blocks could not be recovered correctly vs. the number of errors distributed over the blocks. Top-left: e = 1 and b = 10; top-right: e = 1 and b = 100; lower-left: e = 2 and b = 10; lower-right: e = 2 and b = 100. (E.g., for e = 2 and b = 100, LZRS'77 decompresses correctly with 20 uniformly distributed errors 90% of the time.)
Analysis of Mn Via Suffix Trees
Performance of LZRS’77 depends on Mn. How does Mn typically behave?
Build a suffix tree from the first $n$ suffixes of the database $X$ (i.e., $S_1 = X_1^\infty$, $S_2 = X_2^\infty$, \ldots, $S_n = X_n^\infty$). Then insert the $(n+1)$st suffix, $S_{n+1} = X_{n+1}^\infty$.

Observe: the depth of insertion of $S_{n+1}$ is the $(n+1)$st phrase length. Also, $M_n$ is the size of the subtree that starts at the insertion point of the $(n+1)$st suffix.
Figure 4: $M_4\,(=2)$ is the size of the subtree at the insertion point of $S_5$.
Analyzing Mn
The $i$th suffix of $X$ is $X^{(i)} = X_i X_{i+1} X_{i+2}\ldots$. Consider the longest prefix $w$ of $X^{(n+1)}$ such that $X^{(i)}$ also has $w$ as a prefix, for some $1 \le i \le n$. Then
$$M_n = \#\{1 \le i \le n \mid X^{(i)} = X_i X_{i+1} X_{i+2}\ldots \text{ has } w \text{ as a prefix}\}.$$
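This definition can be computed directly by brute force (a sketch with 0-based Python indexing standing in for the 1-based suffixes above):

```python
def M_n(X: str, n: int) -> int:
    """M_n from the definition: w is the longest prefix of the (n+1)st suffix
    that also prefixes some earlier suffix; count earlier suffixes starting
    with w."""
    future = X[n:]                               # the (n+1)st suffix
    suffixes = [X[i:] for i in range(n)]         # the first n suffixes
    best = 0
    for s in suffixes:
        l = 0
        while l < len(s) and l < len(future) and s[l] == future[l]:
            l += 1
        best = max(best, l)
    w = future[:best]
    return sum(1 for s in suffixes if s.startswith(w))
```

For instance, in X = "abcabcab" with n = 6 the longest prefix "ab" of the 7th suffix occurs in two earlier suffixes, so M_6 = 2.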
1. Independent Strings: Tries built over n independent strings.
(i) The average $E[M_n^I]$ satisfies the recurrence ($p = 1-q$ is the probability of a "1"):
$$E[M_n^I] = p^n\bigl(qn + pE[M_n^I]\bigr) + q^n\bigl(pn + qE[M_n^I]\bigr) + \sum_{k=1}^{n-1}\binom{n}{k} p^k q^{n-k}\bigl(pE[M_k^I] + qE[M_{n-k}^I]\bigr);$$

(ii) The probability generating functions $E[u^{M_n^I}]$ satisfy
$$E[u^{M_n^I}] = p^n\bigl(qu^n + pE[u^{M_n^I}]\bigr) + q^n\bigl(pu^n + qE[u^{M_n^I}]\bigr) + \sum_{k=1}^{n-1}\binom{n}{k} p^k q^{n-k}\bigl(pE[u^{M_k^I}] + qE[u^{M_{n-k}^I}]\bigr).$$

The pattern matching approach also gives ($\beta = \alpha\oplus 1$)
$$M^I(z,u) = \sum_{n=1}^{\infty}\sum_{k=1}^{\infty} P(M_n^I = k)\,u^k z^n = \sum_{w\in\mathcal{A}^*}\sum_{\alpha\in\mathcal{A}} \frac{uP(\beta)P(w)}{1 - z(1 - P(w))}\cdot\frac{zP(w)P(\alpha)}{1 - z\bigl(1 + uP(w)P(\alpha) - P(w)\bigr)}.$$
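The recurrence in (i) is implicit in $E[M_n^I]$; isolating that term on the left gives a direct numerical scheme (a sketch; the choice p = 1/2 and the range are illustrative). Consistent with the results below, the values climb from $E[M_1^I]=1$ toward roughly $1/h$:

```python
from math import comb

def mean_MI(N: int, p: float):
    """Solve the recurrence for E[M_n^I] (independent tries) for n <= N.
    Moving the E[M_n^I] terms to the left gives
    E[M_n^I] * (1 - p^{n+1} - q^{n+1}) = known quantities."""
    q = 1 - p
    E = [0.0] * (N + 1)
    for n in range(1, N + 1):
        rhs = p**n * q * n + q**n * p * n
        for k in range(1, n):
            rhs += comb(n, k) * p**k * q**(n - k) * (p * E[k] + q * E[n - k])
        E[n] = rhs / (1 - p**(n + 1) - q**(n + 1))
    return E
```

A quick sanity check: for n = 1 the recurrence gives E[M_1^I] = 2pq / (1 − p² − q²) = 1, as it must, since a single database string always yields exactly one copy.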
Analyzing Mn for Dependent Strings
2. Suffix Trees: Using analytic combinatorics on words we prove that
$$M(z,u) = \sum_{n=1}^{\infty}\sum_{k=1}^{\infty} P(M_n = k)\,u^k z^n = \sum_{w\in\mathcal{A}^*}\sum_{\alpha\in\mathcal{A}} \frac{uP(\beta)P(w)}{D_w(z)}\cdot\frac{D_{w\alpha}(z) - (1-z)}{D_w(z) - u\bigl(D_{w\alpha}(z) - (1-z)\bigr)}$$
where $D_w(z) = (1-z)S_w(z) + z^m P(w)$ and $S_w(z)$ is the autocorrelation polynomial, namely
$$S_w(z) = \sum_{k\in\mathcal{P}(w)} P(w_{k+1}^m)\, z^{m-k},$$
where $\mathcal{P}(w)$ denotes the set of positions $k$ of $w$ satisfying $w_1\ldots w_k = w_{m-k+1}\ldots w_m$.

For any $\varepsilon > 0$ there exists $\beta > 1$ such that (all the hard analytic work is here!)
$$\Pr(M_n = k) - \Pr(M_n^I = k) = O(n^{-\varepsilon}\beta^{-k}).$$

Random suffix trees resemble random independent tries (cf. P. Jacquet, W.S., 1994; Ward, 2005).
Main Results
Theorem 1 (Ward, W.S., 2005). Let $z_k = \frac{2kr\pi i}{\ln p}$ for all $k \in \mathbb{Z}$, where $\frac{\ln p}{\ln q} = \frac{r}{s}$ for some relatively prime $r, s \in \mathbb{Z}$ (i.e., $\frac{\ln p}{\ln q}$ is rational). The $j$th factorial moment $E[(M_n)_j] = E[M_n(M_n-1)\cdots(M_n-j+1)]$ is
$$E[(M_n)_j] = \frac{\Gamma(j)\bigl(q(p/q)^j + p(q/p)^j\bigr)}{h} + \delta_j(\log_{1/p} n) + O(n^{-\eta}),$$
where $h = -p\log p - q\log q$ is the entropy rate, $\eta > 0$, $\Gamma$ is the Euler gamma function, and
$$\delta_j(t) = \sum_{k\neq 0} \frac{-e^{2kr\pi i t}\,\Gamma(z_k + j)\bigl(p^j q^{-z_k-j+1} + q^j p^{-z_k-j+1}\bigr)}{p^{-z_k+1}\ln p + q^{-z_k+1}\ln q}.$$
$\delta_j$ is a periodic function of small magnitude that contributes fluctuations when $\frac{\ln p}{\ln q}$ is rational.
Note: On average there are $E[M_n] \sim 1/h$ additional pointers. The amplitude $\frac{1}{\ln 2}\sum_{k\neq 0}\bigl|\Gamma\bigl(j - \frac{2ki\pi}{\ln 2}\bigr)\bigr|$ of the fluctuations grows with $j$:

j    amplitude
1    1.4260 × 10^{-5}
3    1.2072 × 10^{-3}
5    1.1421 × 10^{-1}
6    1.1823 × 10^{0}
8    1.4721 × 10^{2}
9    1.7798 × 10^{3}
10   2.2737 × 10^{4}
Distribution of Mn
Theorem 2 (Ward, W.S., 2005). Let $z_k = \frac{2kr\pi i}{\ln p}$ for all $k\in\mathbb{Z}$, where $\frac{\ln p}{\ln q} = \frac{r}{s}$ for some relatively prime $r, s \in \mathbb{Z}$. Then
$$P(M_n = j) = \frac{p^j q + q^j p}{jh} + \sum_{k\neq 0} \frac{-e^{2kr\pi i \log_{1/p} n}\,\Gamma(z_k)\,(p^j q + q^j p)\,(z_k)_j}{j!\,\bigl(p^{-z_k+1}\ln p + q^{-z_k+1}\ln q\bigr)} + O(n^{-\eta}),$$
where $\eta > 0$ and $\Gamma$ is the Euler gamma function.

Therefore, $M_n$ follows the logarithmic series distribution with mean $1/h$ (plus some fluctuations). The logarithmic series distribution $\bigl((p^j q + q^j p)/(jh)\bigr)$ is well concentrated around its mean $E[M_n] \approx 1/h$.
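One can check numerically that the logarithmic series weights above form a proper distribution with mean $1/h$, where $h$ is the entropy rate in nats (a quick sketch with an illustrative $p$ and a truncated sum):

```python
from math import log

# The logarithmic series distribution P(M_n = j) ~ (p^j q + q^j p) / (j h):
# it sums to 1 and has mean 1/h when h = -p ln p - q ln q (nats).
p = 0.3
q = 1 - p
h = -p * log(p) - q * log(q)

# truncate the sum; the tail decays geometrically, so j < 2000 is ample
probs = [(p**j * q + q**j * p) / (j * h) for j in range(1, 2000)]
total = sum(probs)
mean = sum(j * pj for j, pj in enumerate(probs, start=1))
```

The identities behind the check: $\sum_j p^j/j = -\ln q$ gives total mass $(-q\ln q - p\ln p)/h = 1$, and $\sum_j p^j = p/q$ gives mean $(p+q)/h = 1/h$.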
Outline Update
1. Universal Source Coding
2. Algorithms: Error-Resilient Lempel-Ziv’77
3. Combinatorics: Method of Types
(a) Markov Types and Eulerian Paths
(b) Universal Types and Enumeration of Binary Trees
(c) Nonlinear Functional Equations Arising in AofA
4. Information: Non-Prefix Codes
Method of Types
The method of types is a powerful technique in information theory; it
reduces calculations of the probability of rare events to combinatorics.
Sequences are of the same type if they have the same empirical
distribution.
Warm-up Problem: How many binary strings $x_1^n$ of length $n$ generated by a memoryless source have $k$ "1"s (i.e., have the same Bernoulli type)? All such strings have the same probability
$$P(x_1^n) = p^k (1-p)^{n-k},$$
where $p$ is the probability of generating a 1.

Answer: Certainly, the answer is $\binom{n}{k}$.
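The warm-up answer is easy to confirm by brute force for small parameters (a sketch; n and k are illustrative):

```python
from itertools import product
from math import comb

# Count binary strings of length n with exactly k ones and compare against
# binomial(n, k); every such string has the same probability p^k (1-p)^(n-k)
# under a memoryless source.
n, k = 10, 4
count = sum(1 for s in product((0, 1), repeat=n) if sum(s) == k)
assert count == comb(n, k)
```
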
Markov Types
Consider a Markov source over an $m$-ary alphabet with transition matrix $P = \{p_{ij}\}_{i,j=1}^m$, that is, $P(X_{t+1} = j \mid X_t = i) = p_{ij}$.

The probability of $x_1^n$ is
$$P(x_1^n) = p_{11}^{k_{11}}\cdots p_{mm}^{k_{mm}},$$
where $k_{ij}$ is the number of pairs $ij$ in $x_1^n$, that is, occurrences of $i$ followed by $j$.

Example: Let $x_1^5 = 01101$; then $P(01101) = p_{01}^2\, p_{11}\, p_{10}$.

For circular strings (i.e., after the $n$th symbol we revisit the first symbol of $x_1^n$), the matrix $[k_{ij}]$ satisfies the following constraints, denoted $\mathcal{F}_n$:
$$\sum_{1\le i,j\le m} k_{ij} = n, \qquad \sum_{j=1}^m k_{ij} = \sum_{j=1}^m k_{ji}\ \ \forall\, i \quad \text{(balance property)}.$$
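The balance property is easy to verify computationally for any circular string (a small sketch for the binary alphabet; function names are ours):

```python
def pair_counts_circular(x):
    """Frequency matrix k_ij of a binary circular string: the pair after
    the last symbol wraps around to the first."""
    n = len(x)
    k = [[0, 0], [0, 0]]
    for t in range(n):
        k[x[t]][x[(t + 1) % n]] += 1
    return k

x = [0, 1, 1, 0, 1]                       # the string 01101, read circularly
k = pair_counts_circular(x)
row = [sum(k[i]) for i in range(2)]       # out-degrees sum_j k_ij
col = [k[0][i] + k[1][i] for i in range(2)]  # in-degrees  sum_j k_ji
```

For 01101 the circular pairs are 01, 11, 10, 01, 10, so k = [[0, 2], [2, 1]], and row sums equal column sums as the balance property demands.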
Markov Types and Eulerian Cycles
Problem. Let $\mathbf{k} = [k_{ij}]_{i,j=1}^m$ be a given frequency matrix satisfying the balance property.

A: How many strings of a given frequency matrix $\mathbf{k}$ (a given type) are there?

Example: Let $\mathcal{A} = \{0, 1\}$ and
$$\mathbf{k} = \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix}.$$

B: How do we enumerate Eulerian paths (types) in a multigraph with $|\mathcal{A}|$ vertices and $k_{ij}$ edges between vertices $i$ and $j$?

We are interested in:
$N_{\mathbf{k}}$ — the number of (cyclic) strings $x_1^n$ belonging to the same type $\mathbf{k}$;
$N_{\mathbf{k}}^a$ — the number of strings $x_1^n$ of type $\mathbf{k}$ starting with symbol $a$;
$N_{\mathbf{k}}^{ab}$ — the number of strings $x_1^n$ of type $\mathbf{k}$ starting with symbol $a$ and ending with $b$.
Main Technical Tool
Let $g_{\mathbf{k}}$ be a sequence of scalars indexed by matrices $\mathbf{k}$, let
$$g(\mathbf{z}) = \sum_{\mathbf{k}} g_{\mathbf{k}}\, \mathbf{z}^{\mathbf{k}}$$
be its regular generating function, and let
$$Fg(\mathbf{z}) = \sum_{\mathbf{k}\in\mathcal{F}} g_{\mathbf{k}}\, \mathbf{z}^{\mathbf{k}} = \sum_{n\ge 0}\sum_{\mathbf{k}\in\mathcal{F}_n} g_{\mathbf{k}}\, \mathbf{z}^{\mathbf{k}}$$
be the $\mathcal{F}$-generating function of $g_{\mathbf{k}}$, restricted to $\mathbf{k}\in\mathcal{F}$.

Lemma 1. Let $g(\mathbf{z}) = \sum_{\mathbf{k}} g_{\mathbf{k}}\mathbf{z}^{\mathbf{k}}$. Then
$$Fg(\mathbf{z}) := \sum_{n\ge0}\sum_{\mathbf{k}\in\mathcal{F}_n} g_{\mathbf{k}}\mathbf{z}^{\mathbf{k}} = \left(\frac{1}{2\pi i}\right)^{m} \oint \frac{dx_1}{x_1}\cdots\oint\frac{dx_m}{x_m}\; g\!\left(\left[z_{ij}\frac{x_j}{x_i}\right]\right),$$
where the $ij$th coefficient of $\bigl[z_{ij}\frac{x_j}{x_i}\bigr]$ is $z_{ij}\frac{x_j}{x_i}$.

Proof. It suffices to observe that
$$g\!\left(\left[z_{ij}\frac{x_j}{x_i}\right]\right) = \sum_{\mathbf{k}} g_{\mathbf{k}}\mathbf{z}^{\mathbf{k}} \prod_{i=1}^m x_i^{\sum_j k_{ji} - \sum_j k_{ij}}.$$
Thus $Fg(\mathbf{z})$ is the coefficient of $g([z_{ij} x_j/x_i])$ at $x_1^0 x_2^0\cdots x_m^0$.
Enumeration of Eulerian Paths
1. For an $m$-ary alphabet define
$$B_{\mathbf{k}} = \binom{k_1}{k_{11}\,\cdots\, k_{1m}}\cdots\binom{k_m}{k_{m1}\,\cdots\, k_{mm}},$$
where $k_i = \sum_j k_{ij}$.

2. Let $N^a_{\mathbf{k},\mathbf{k}'}$ be the number of ways matrix $\mathbf{k}$ is transformed into another matrix $\mathbf{k}'$ when the Eulerian path starts with symbol $a$:
$$N^a_{\mathbf{k},\mathbf{k}'} = N^a_{\mathbf{k}-\mathbf{k}'}\times B_{\mathbf{k}'}, \qquad k'_a = 0.$$
Since $\sum_{\mathbf{k}'} N^a_{\mathbf{k},\mathbf{k}'} = B_{\mathbf{k}}$, hence
$$B_{\mathbf{k}} = \sum_{\mathbf{k}'\in\mathcal{F},\, k'_a = 0} N^a_{\mathbf{k}-\mathbf{k}'}\times B_{\mathbf{k}'}.$$

3. We find
$$N^{b,a}_{\mathbf{k}} = [\mathbf{z}^{\mathbf{k}}]\; B(\mathbf{z})\, z_{ba}\cdot \det{}_{bb}(\mathbf{I}-\mathbf{z}),$$
and using Cauchy's formula we can prove that
$$N^{b,a}_{\mathbf{k}} = \frac{k_{ba}}{k_b}\, B_{\mathbf{k}}\cdot\det{}_{bb}(\mathbf{I}-\mathbf{k}^*)\left(1 + O\!\left(\frac{1}{n}\right)\right),$$
where $\mathbf{k}^* = [k_{ij}/k_i]$ is the normalized matrix.

4. For example, for a binary Markov source we have
$$N^{0,0}_{\mathbf{k}} \sim \frac{k_{10}}{k_{10}+k_{11}}\binom{k_{00}+k_{01}}{k_{00}}\binom{k_{10}+k_{11}}{k_{10}} = \frac{k_{10}}{k_{10}+k_{11}}\, B_{\mathbf{k}}.$$
Universal Types
Seroussi introduced universal types for stationary ergodic sources in 2003: sequences of the same length p are said to be of the same universal type if they generate the same set of phrases in the Lempel-Ziv'78 parsing.

Figure 5: Two universal types and the corresponding binary trees.
Number of Types and Binary Trees
The Lempel-Ziv'78 parsing of a sequence of length $p$ can be represented by a binary tree of path length $p$. Let
– $\mathcal{T}_n$ be the set of binary trees built on $n$ nodes;
– $\mathcal{T}_p$ be the set of binary trees with path length equal to $p$.
The number of universal types over $\mathcal{A}^p$ equals $|\mathcal{T}_p|$: the number of trees of a given path length $p$.
How to enumerate binary trees of a given path length p?
Enumeration of Binary Trees: Tn vs Tp
Let $b(n,p)$ be the number of binary trees with $n$ nodes and path length $p$. It satisfies
$$b(n,p) = \sum_{k+\ell=n-1}\;\sum_{r+s+n-1=p} b(k,r)\, b(\ell,s).$$

Define $B_n(w) = \sum_{p=0}^{\infty} b(n,p)\, w^p$ and $B(z,w) = \sum_{n=0}^{\infty} z^n B_n(w)$. Then
$$B(z,w) = 1 + zB^2(zw, w).$$
This functional equation is asymmetric with respect to $z$ and $w$.

We want to study the number of trees in $\mathcal{T}_p$ (of a given path length $p$). Observe that
$$|\mathcal{T}_p| = \sum_{n\ge 0} b(n,p) = [w^p]\, B(1,w).$$
Setting $z = 1$ in the functional equation leads to
$$B(1,w) = 1 + B^2(w,w),$$
which is not algebraically solvable.
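The recurrence for $b(n,p)$ is straightforward to iterate exactly (a sketch; the helper names are ours, and by the convention $b(0,0)=1$ the empty tree is counted):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def b(n: int, p: int) -> int:
    """Number of binary trees with n nodes and path length p, via
    b(n,p) = sum over k+l = n-1, r+s+(n-1) = p of b(k,r) * b(l,s)."""
    if n == 0:
        return 1 if p == 0 else 0
    if p < 0:
        return 0
    total = 0
    for k in range(n):                     # left subtree size k, right n-1-k
        for r in range(p - (n - 1) + 1):   # split the remaining path length
            s = p - (n - 1) - r
            total += b(k, r) * b(n - 1 - k, s)
    return total

def T_size(p: int) -> int:
    """|T_p|: trees of path length exactly p (at most p+1 nodes are possible)."""
    return sum(b(n, p) for n in range(p + 2))
```

Summing over p recovers the Catalan numbers: the 5 trees on 3 nodes split as b(3,2) = 1 (the balanced tree) and b(3,3) = 4 (the four chains).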
Generalization: Knuth’s Problem
During the 10th Seminar on Analysis of Algorithms, MSRI, 2004, Knuth posed the problem of analyzing the left and right path lengths in a random binary tree.

Let $N(p,q;n)$ be the number of binary trees with $n$ nodes, total right path length $p$, and total left path length $q$. Define
$$B_n(w,v) = \sum_p\sum_q N(p,q;n)\, w^p v^q,$$
which satisfies the recurrence ($B_0(w,v) = 1$)
$$B_{n+1}(w,v) = \sum_{i=0}^n w^i v^{n-i}\, B_i(w,v)\, B_{n-i}(w,v), \qquad n\ge 0.$$
Thus the triple transform $B(w,v,z) = \sum_{n=0}^\infty B_n(w,v)\, z^n$ satisfies Knuth's functional equation
$$B(w,v,z) = 1 + zB(w,v,wz)\, B(w,v,vz).$$
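The recurrence above can be iterated symbolically by representing each $B_n(w,v)$ as a dictionary of coefficients (a sketch; names are ours):

```python
from collections import defaultdict

def B_poly(nmax: int):
    """Compute N(p, q; n) for n <= nmax from the recurrence
    B_{n+1}(w,v) = sum_i w^i v^{n-i} B_i(w,v) B_{n-i}(w,v),  B_0 = 1.
    Each polynomial is a dict mapping (p, q) -> coefficient."""
    B = [defaultdict(int) for _ in range(nmax + 1)]
    B[0][(0, 0)] = 1
    for n in range(nmax):
        for i in range(n + 1):
            for (p1, q1), c1 in B[i].items():
                for (p2, q2), c2 in B[n - i].items():
                    # multiply by w^i v^(n-i), as in the recurrence
                    B[n + 1][(p1 + p2 + i, q1 + q2 + (n - i))] += c1 * c2
    return B
```

For n = 3 this yields $B_3 = w^3 + w^2 v + wv + v^2 w + v^3$: five trees in total, with the single balanced tree contributing the $wv$ term (right and left path lengths 1 each) and the four chains the rest.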
Catalan Numbers and Uniform Model
1. Setting $w = v = 1$ we get
$$B(1,z) = 1 + zB^2(1,z),$$
which can be solved explicitly:
$$B(1,z) = \frac{1-\sqrt{1-4z}}{2z},$$
leading to the Catalan numbers $C_n = [z^n]\, B(1,z)$.

2. Set $w = v$ (and define $B(w,z) := B(w,w,z)$) to find
$$B(w,z) = 1 + zB^2(w, zw).$$
This describes the total path length $L_n$ in the $\mathcal{T}_n$-uniform model, that is,
$$P(L_n = p) = \frac{b(n,p)}{C_n},$$
where $b(n,p)$ is the number of trees with $n$ nodes and path length $p$.
Path Length Distribution
Louchard (1984) and Takács (1991) showed that
$$\frac{E[L_n^r]}{(2n^3)^{r/2}} \sim \frac{2\sqrt{\pi}}{\Gamma((3r-1)/2)}\cdot\frac{w_r}{2^{r/2}},$$
where $w_r$ satisfies the following nonlinear recurrence ($w_0 = -1$) for $r\ge 1$:
$$2w_r = (3r-4)\,r\,w_{r-1} + \sum_{j=1}^{r-1}\binom{r}{j} w_j w_{r-j},$$
or, setting $c_r = w_r/r!$,
$$2c_r = (3r-4)\,c_{r-1} + \sum_{j=1}^{r-1} c_j c_{r-j}.$$
In other words, the limiting distribution of the total path length satisfies
$$P\!\left(\frac{L_n}{\sqrt{2n^3}} \le x\right) \to W(x),$$
where $W(x)$ is the Airy distribution, defined by its moments through $w_r$.
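The coefficient recurrence can be iterated exactly with rational arithmetic (a small sketch; the function name is ours):

```python
from fractions import Fraction

def airy_moment_coeffs(R: int):
    """c_r = w_r / r! from the recurrence
    2 c_r = (3r - 4) c_{r-1} + sum_{j=1}^{r-1} c_j c_{r-j},
    with c_0 = w_0 = -1."""
    c = [Fraction(-1)]
    for r in range(1, R + 1):
        s = (3 * r - 4) * c[r - 1] + sum(c[j] * c[r - j] for j in range(1, r))
        c.append(s / 2)
    return c
```

The first values follow immediately from the recurrence: $2c_1 = (-1)c_0 = 1$ gives $c_1 = 1/2$ (so $w_1 = 1/2$), and $2c_2 = 2c_1 + c_1^2 = 5/4$ gives $c_2 = 5/8$ (so $w_2 = 5/4$).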
Right Path Length, Area under Bernoulli Walk
3. Let us now set $v = 1$ in the triple transform equation. Then
$$B(w,z) = 1 + zB(w,wz)\, B(w,z),$$
while $B_n(w) = [z^n]\,B(w,z)$ satisfies
$$B_{n+1}(w) = \sum_{i=0}^{n} w^i B_i(w)\, B_{n-i}(w).$$
Observe that $B_n(w)$ is the generating function of the right path length $R_n$ in the $\mathcal{T}_n$ model.

It was also studied by Takács, who analyzed the area $2nA_n$ under a Bernoulli excursion of $2n$ steps:
$$P\!\left(\frac{A_n}{\sqrt{2n}}\le x\right) = P\!\left(\frac{2nA_n}{2n\sqrt{2n}}\le x\right) = P\!\left(\frac{R_n}{\sqrt{2n^3}}\le x\right) = P\!\left(\frac{L_n}{2\sqrt{2n^3}}\le x\right) = W(x),$$
where $W(x)$ is the Airy distribution.

Finally, it appears in the Kleitman-Winston conjecture.
Back to Types: Number of Trees with a Given Path Length
4. Setting $z = 1$ in the previous equation we arrive at
$$B(w,1) = 1 + B^2(w,w).$$
Observe that $[w^p]\,B(w,1)$ is the number of binary trees with path length $p$.

Seroussi (2004) and Knessl & W.S. (2004) prove that ($c_1, c_2$ are constants)
$$|\mathcal{T}_p| = \frac{1}{(\log_2 p)\sqrt{\pi p}}\; 2^{2p/\log_2 p}\left(1 + c_1\log^{-2/3} p + c_2\log^{-1} p + O(\log^{-4/3} p)\right).$$

When selecting a tree at random from $\mathcal{T}_p$ we may define $N_p$, the number of nodes in the $\mathcal{T}_p$-model. Surprisingly, we can prove that $N_p$ is asymptotically normal, that is,
$$\Pr\{N_p = n\} = \frac{b(n,p)}{\sum_{n=0}^{\infty} b(n,p)} \sim \frac{1}{\sqrt{2\pi\,\mathrm{Var}[N_p]}}\exp\!\left(-\frac{(n - E[N_p])^2}{2\,\mathrm{Var}[N_p]}\right),$$
where
$$E[N_p] \sim \frac{p}{\log_2 p}, \qquad \mathrm{Var}[N_p] \sim \frac{(\log 2)\,A_0}{6\cdot 2^{1/3}}\cdot\frac{p}{(\log_2 p)^{5/3}},$$
and $A_0$ is a constant.
WKB Method – Open Problems
Knessl and W.S. use the so-called WKB method (a heuristic). The WKB method assumes that the solution $B(\xi; n)$ of a functional equation has the asymptotic form
$$B(\xi; n) \sim e^{n\varphi(\xi)}\left[A(\xi) + \frac{1}{n}A^{(1)}(\xi) + \frac{1}{n^2}A^{(2)}(\xi) + \cdots\right],$$
where $\varphi(\xi)$ and $A(\xi), A^{(1)}(\xi), \ldots$ are unknown functions. These functions must be determined from the equation itself, often in conjunction with the asymptotic matching principle.

For example, for $w = 1 + a/n^{3/2}$ with $a = -Y^{3/2}$ we found that
$$B_n(w) \sim \frac{4^{n+1}}{n^{3/2}(-a)}\sum_{j=0}^{\infty}\exp\!\left(-|r_j|\, 4^{1/3}\, Y\right),$$
where the $r_j$ are the roots of the Airy function, $\mathrm{Ai}(r_j) = 0$.

Open Problems: The above two results concerning the size of $\mathcal{T}_p$ and the distribution of $N_p$ do not have analytic solutions.
Back to Knuth’s Problem
5. Let $D_n$ be a random variable representing the difference between the left and the right path lengths in a binary tree.

We observe that the distribution of $D_n$ can only be characterized by its moments. Janson (2006), Knessl & W.S. (2005), and Panholzer (2006) proved that
$$\frac{E[D_n^{2m+2}]}{n^{5(m+1)/2}} \to \frac{(2m+2)!}{\sqrt{\pi}}\cdot\frac{\Delta_m}{\Gamma\!\left(\frac{5m}{2}+2\right)},$$
where the $\Delta_m$ satisfy the nonlinear recurrence
$$\Delta_{m+1} = \frac{(5m+6)(5m+4)}{8}\,\Delta_m + \frac{1}{4}\sum_{\ell=0}^{m}\Delta_\ell\,\Delta_{m-\ell}, \qquad m\ge 0.$$
Again, a nonlinear recurrence for the coefficients of the normalized moments!
Page 3 of Flajolet’s Talk in Princeton’98
Distributions: Difference equations

For nonlinear parameters, especially quadratic ones, decomposability implies a functional equation:
$$X_n = n + X_{\text{smaller}} + X'_{\text{smaller}}$$

Trie sort (Jacquet, Régnier):
$$F(z,w) = F\!\left(\frac{wz}{2}, w\right)^{2} + az + b$$

Digital search and compression (Jacquet, Szpankowski, Louchard):
$$\frac{\partial}{\partial z}F(z,w) = F\!\left(\frac{wz}{2}, w\right)^{2}$$

Quicksort (Hennequin; Régnier, Rösler):
$$\frac{\partial}{\partial z}F(z,w) = F(wz, w)^{2}$$

In situ permutation (Knuth, Prodinger):
$$\frac{\partial}{\partial z}F(z,w) = F(z,w)\cdot F(wz, q)$$

Linear probing hashing (Knuth, Flajolet-Poblete-Viola):
$$\frac{\partial}{\partial z}F(z,w) = F(z,w)\cdot\frac{F(z,w) - wF(wz,q)}{1-w}$$

Path length in trees (Louchard, Takács):
$$F(z,w) = \frac{z}{1 - F(wz, w)}$$
Open Problem – A Conjecture
Consider problems characterized by a nonlinear differential-functional equation
$$A_1\frac{\partial}{\partial w}B(w,z) + A_2 B(w,z) = a(w,z) + b(w,z)\,B\bigl(w^{\alpha_1}z^{\beta_1}, w^{\alpha_2}z^{\beta_2}\bigr)\,B\bigl(w^{\alpha_3}z^{\beta_3}, w^{\alpha_4}z^{\beta_4}\bigr),$$
where $a(w,z)$, $b(w,z)$ are slowly growing functions and $\alpha_i, \beta_i \in \{0,1\}$.

Let $Z_n$ be a random variable such that for some $a_m \to \infty$
$$\frac{E[Z_n^m]}{a_m} \to c_m,$$
where in general $c_m$ satisfies
$$c_{m+1} = \alpha_m + \beta_m c_m + \sum_{i=0}^{m-1}\gamma_i\, c_i c_{m-i}$$
with some initial conditions and given $\alpha_m$, $\beta_m$, $\gamma_m$.

Similar recurrences appear in quicksort, linear probing hashing, the path length in binary trees, the area under a Bernoulli walk, the enumeration of trees with a given path length, and many others.

A new class of distributions? Can we characterize it?
Analytic Information Theory
• In the 1997 Shannon Lecture Jacob Ziv presented compelling
arguments for “backing off” from first-order asymptotics in order to
predict the behavior of real systems with finite length description.
• To overcome these difficulties we propose replacing first-order analyses
by full asymptotic expansions and more accurate analyses (e.g., large
deviations, central limit laws).
• Following Knuth and Hadamard’s precept1, we study information theory
problems using techniques of complex analysis such as generating
functions, combinatorial calculus, Rice’s formula, Mellin transform,
Fourier series, sequences distributed modulo 1, saddle point methods,
analytic poissonization and depoissonization, and singularity analysis.
• This program, which applies complex-analytic tools to information
theory, constitutes analytic information theory.
1 The shortest path between two truths on the real line passes through the complex plane.