
Asymptotic Enumerators of Protograph LDPCC Ensembles

Jeremy Thorpe, joint work with Bob McEliece and Sarah Fogal

Outline

Motivation: What causes error floors in codes? How can we predict error floors in advance of simulation?

“Bad Sets”: Codewords, Stopping Sets, Other Sets

Protograph Ensembles and Asymptotic Enumerators

Computation of Enumerators

Why error-correcting codes?

Error-correcting codes are designed to transmit information as efficiently as possible and with very low probability of error.

What is an Error Floor?

An error floor occurs when the probability of error doesn’t improve “fast enough” as a channel improves.

What’s going on?

Error floors occur when the typical mode of failure changes.

In the “waterfall” region, performance depends on global channel statistics.

In the “error floor” region, performance depends on the channel output near:
Low-weight codewords
Low-weight stopping sets [FKV 98]
Low-weight “trapping sets” [R. 04]

Can we predict error floors without simulation?

Predict the “bad sets”: low-weight codewords, low-weight stopping sets, low-weight “trapping sets”.

This is difficult to do for particular codes. However, for ensembles of codes, the problem becomes feasible, and has been solved for:
Codeword WE, Regular Ensembles [G 63]
Codeword WE, Unstructured Irregular Ensembles [LS 98]
Codeword and Stopping Set WE, Unstructured Irregular Ensembles [Di 04]

Weight Enumerators Defined

Code Basics

Recall that a code is a set of vectors of length n.

The code below is the (7,4) Hamming code.

C = [table of the 16 codewords of the (7,4) Hamming code]

Linear Codes

Linear Codes can be represented by their parity check matrices.

C = [table of the 16 codewords of the (7,4) Hamming code]

H =
1 0 1 0 1 0 1
0 0 1 1 1 1 0
1 1 0 0 1 1 0

C = { x : x Hᵀ = 0 }

Representation by a Graph

Parity check matrices can be represented by a graph.

H =
1 0 1 0 1 0 1
0 0 1 1 1 1 0
1 1 0 0 1 1 0

[Figure: the Tanner graph of H, with a variable node for each of the 7 code symbols and a check node for each of the 3 parity checks.]

Codeword weight Enumerator for the (7,4) Hamming Code

[Plot: A(w) versus w for the (7,4) Hamming code; the nonzero values are A(0) = 1, A(3) = 7, A(4) = 7, A(7) = 1.]
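As a quick sanity check (not part of the original slides), here is a minimal Python sketch that brute-forces the code C = { x : xHᵀ = 0 } from the parity-check matrix reconstructed above and tallies A(w); for a (7,4) Hamming code it should report A(0) = 1, A(3) = 7, A(4) = 7, A(7) = 1.

```python
from itertools import product
from collections import Counter

# Parity-check matrix of the (7,4) Hamming code as reconstructed above.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 1, 1, 0],
]

def is_codeword(x):
    """x is a codeword iff every parity check is satisfied (x H^T = 0 mod 2)."""
    return all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)

# Enumerate C = {x : x H^T = 0} by brute force over {0,1}^7.
codewords = [x for x in product((0, 1), repeat=7) if is_codeword(x)]

# Weight enumerator A(w): number of codewords of Hamming weight w.
A = Counter(sum(x) for x in codewords)
print("|C| =", len(codewords))     # expect 16
print(sorted(A.items()))           # expect [(0, 1), (3, 7), (4, 7), (7, 1)]
```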

Protograph Ensembles

The protograph is expanded by a factor of N to obtain the code graph.

The randomness comes from the permutations applied to each edge type.

[Figure: a protograph and its expansion for N = 4.]
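As a sketch of the construction (the talk's actual protograph appears only in a figure, so the Hamming graph from the earlier slides stands in for it here), the following Python lifts a 0/1 base matrix by N, replacing every edge with a random N × N permutation block; protographs with parallel edges would need one permutation per parallel edge.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative base (protograph) matrix; a stand-in for the talk's protograph.
H_proto = np.array([
    [1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 1, 1, 0],
])

def lift(H_proto, N, rng):
    """Expand the protograph by N: each edge (1 entry) becomes a random
    N x N permutation matrix; each 0 entry becomes an N x N zero block."""
    rows, cols = H_proto.shape
    H = np.zeros((rows * N, cols * N), dtype=int)
    for i in range(rows):
        for j in range(cols):
            if H_proto[i, j]:
                H[i*N:(i+1)*N, j*N:(j+1)*N] = np.eye(N, dtype=int)[rng.permutation(N)]
    return H

H_lifted = lift(H_proto, N=4, rng=rng)
print(H_lifted.shape)   # (12, 28): a code graph with n = 28 variable nodes
```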

Average codeword weight Enumerator of expanded code N=2

For N = 2, there are (2!)^|E| = 2^12 = 4096 codes in the ensemble.

The “ensemble average” weight enumerator is shown.

[Plot: the ensemble-average weight enumerator Ā(w) versus w for N = 2 (block length n = 14).]
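For N = 2 this ensemble is small enough to average exactly, but a simple way to approximate such a curve is to sample random liftings and average their weight enumerators. The sketch below does this, reusing the lift routine from the previous sketch and the same stand-in base graph, so the numbers are illustrative rather than the talk's.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

H_proto = np.array([          # illustrative stand-in base graph (see earlier sketch)
    [1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 1, 1, 0],
])

def lift(H_proto, N, rng):
    rows, cols = H_proto.shape
    H = np.zeros((rows * N, cols * N), dtype=int)
    for i in range(rows):
        for j in range(cols):
            if H_proto[i, j]:
                H[i*N:(i+1)*N, j*N:(j+1)*N] = np.eye(N, dtype=int)[rng.permutation(N)]
    return H

N = 2
n = N * H_proto.shape[1]                              # block length 14
words = np.array(list(product((0, 1), repeat=n)))     # all 2^14 candidate words

samples = 200                                         # Monte Carlo estimate of the
avg = np.zeros(n + 1)                                 # ensemble-average A(w)
for _ in range(samples):
    H = lift(H_proto, N, rng)
    is_cw = (words @ H.T % 2 == 0).all(axis=1)        # which words are codewords
    avg += np.bincount(words[is_cw].sum(axis=1), minlength=n + 1)
avg /= samples

print(np.round(avg, 2))                               # estimated average A(w), w = 0..14
```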

Asymptotic average codeword weight enumerator

Plot on a log scale…

[Plot: log Ā(w) versus w.]

Asymptotic average codeword weight enumerator

Plot on a log scale; make the quantities intrinsic…

[Plot: (1/n) log Ā(w) versus w/n.]

Asymptotic average codeword weight enumerator

Plot on a log scale; make the quantities intrinsic; take the limit:

lim_{n→∞} (1/n) log Ā(Θn)

[Plot: lim_{n→∞} (1/n) log Ā(Θn) versus Θ = w/n.]

Codewords, Stopping Sets, Trapping Sets

Codewords (on a graph)

An assignment x of the variables of a graph such that each check node is adjacent an even number of times to variables assigned 1.

Example: x = 1 0 0 1 1 0 0

If half of the variables of a codeword are “flipped” by a BSC, ML decoding fails.

Stopping Sets

An assignment x of the variables of a graph such that each check node is adjacent to variables assigned 1 either zero times, or two or more times.

Examples: x and y

x = 1 0 0 1 1 0 0
y = 1 0 1 0 1 0 0

If all of a stopping set is erased, BP decoding cannot continue.
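A minimal sketch of the two definitions as executable checks, assuming the parity-check matrix reconstructed earlier: with that H, x = 1001100 satisfies both tests, while y = 1010100 is a stopping set but not a codeword.

```python
# Per-check counts of neighbors assigned 1, against the reconstructed H.
H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 1, 1, 0],
]

def ones_seen_by_checks(x):
    """For each check, how many of its adjacent variables are assigned 1."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]

def is_codeword(x):
    # every check is adjacent an even number of times to variables assigned 1
    return all(d % 2 == 0 for d in ones_seen_by_checks(x))

def is_stopping_set(x):
    # every check is adjacent zero times, or two or more times, to variables assigned 1
    return all(d == 0 or d >= 2 for d in ones_seen_by_checks(x))

x = [1, 0, 0, 1, 1, 0, 0]
y = [1, 0, 1, 0, 1, 0, 0]
print(is_codeword(x), is_stopping_set(x))   # True True
print(is_codeword(y), is_stopping_set(y))   # False True
```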

Stopping Set Enumerators

Below is a “sneak peek” at a stopping set enumerator.

Stopping set enumerators are uniformly at least as large as codeword enumerators, because every codeword is a stopping set.

[Plot: the asymptotic stopping set enumerator lim_{n→∞} (1/n) log Ā(Θn) versus Θ.]

Trapping Sets

Trapping sets are sets of variables that cause quantized decoders to get “stuck”.

Usually, trapping sets have a small number of checks connected exactly once to variables assigned 1.

Unfortunately, this is not a good combinatorial characterization, so we forget about trapping sets for now.

Trapping Set Enumerators

Words like “usually” and “small number” usually don’t lead to useful combinatorial characterizations… ?

Formalizing “Bad Sets”

For a given “word” x, define xc as the vector of assignments adjacent to check c.

For each check node c, define a set Ωc ⊆ {0,1}^deg(c).

Define Ω = { x : xc ∈ Ωc for every check c }.

If the Ωc are the sets of even-weight vectors, then Ω is the set of codewords.

If the Ωc are the sets of vectors of non-unit weight, then Ω is the set of stopping sets.

Example

If the Ωc are the sets of even-weight vectors, then x is not in Ω, because xc1 is not in Ωc1.

If the Ωc are the sets of vectors of non-unit weight, then x is in Ω.

x = 1 0 1 0 1 0 0

xc1 = (1, 1, 1, 0)
xc2 = (1, 0, 1, 0)
xc3 = (1, 0, 1, 0)
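As a sketch of this formalization (again assuming the reconstructed H), the code below extracts the local restriction xc at each check and tests membership in Ω for an arbitrary family of local sets Ωc; instantiating Ωc as the even-weight patterns or the non-unit-weight patterns recovers the codeword and stopping-set tests and reproduces the example above.

```python
from itertools import product

H = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 1, 1, 0],
]

def restriction(row, x):
    """x_c: the assignments of x at the variables adjacent to check c."""
    return tuple(xi for h, xi in zip(row, x) if h)

def omega_even(deg):
    """Omega_c for codewords: local patterns of even weight."""
    return {w for w in product((0, 1), repeat=deg) if sum(w) % 2 == 0}

def omega_non_unit(deg):
    """Omega_c for stopping sets: local patterns of weight != 1."""
    return {w for w in product((0, 1), repeat=deg) if sum(w) != 1}

def in_Omega(x, omega_c):
    """x is in Omega iff x_c lies in Omega_c for every check c."""
    return all(restriction(row, x) in omega_c(sum(row)) for row in H)

x = [1, 0, 1, 0, 1, 0, 0]
print(restriction(H[0], x))            # x_c1 = (1, 1, 1, 0): odd weight
print(in_Omega(x, omega_even))         # False: x is not a codeword
print(in_Omega(x, omega_non_unit))     # True: x is a stopping set
```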

Types of words

Consider an arbitrary vector x.

Define the vector θ, where θv is the fraction of 1’s assigned to variables of type v.

x =
1 0 1 0 0 1 0
1 0 0 0 0 1 1
1 0 0 0 1 0 0
1 0 0 0 1 0 0

θ = (1, 0, .25, 0, .5, .5, .25)

Θ = 10/28
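A minimal sketch of the type computation for this example: laying x out as an N × (number of variable types) array, θ is the column-wise fraction of 1's and Θ is the overall fraction.

```python
import numpy as np

# The example word x for N = 4: one row per protograph copy,
# one column per variable type (as in the slide above).
x = np.array([
    [1, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0],
])

theta = x.mean(axis=0)    # fraction of 1's per variable type
Theta = x.mean()          # overall weight fraction

print(theta)                               # [1.  0.  0.25 0.  0.5 0.5 0.25]
print(x.sum(), "/", x.size, "=", Theta)    # 10 / 28 = 0.357...
```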

A Lemma about types

Lemma: All words of a given type are equally likely to be members of Ω, with respect to our ensemble.

Sketch of proof: Suppose x and y have the same type. If x is in Ω for a certain set of permutations (i.e., a code in the ensemble), then y is in Ω for some other set of permutations, and this correspondence between sets of permutations is one-to-one, so the two probabilities are equal.

Applying method of Types

There is usually a “typical” type θ* among the words of any given weight Θ.

Even if all types were equally numerous, there are only polynomially many types, so summing over them cannot change the exponent.

E(Θ) = lim_{n→∞} (1/n) log Ā(Θn)

E(Θ) = max { E(θ) : θ has overall weight fraction Θ }, where

E(θ) = lim_{N→∞} (1/n) log Ā(Nθ), with n = N · (number of variable types).

Computing E(θ)

Ā(Nθ) can be broken down as shown below.

It is easy to compute the number of words of a given type.

The exponent of the probability that all checks of a given type are satisfied is defined as Φ.

Ā(Nθ) = #{ x of type θ } · P[ x ∈ Ω ]

#{ x of type θ } ≈ e^{N Σv H(θv)}, where H is the binary entropy function

P[ x ∈ Ω ] ≈ e^{−N Σc Φ(θc)}

log Ā(Nθ) ≈ N [ Σv H(θv) − Σc Φ(θc) ]
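The counting step is a standard Stirling estimate: the number of words of type θ is a product of binomial coefficients, Πv C(N, Nθv), whose exponent converges to Σv H(θv) (binary entropy, in nats). The sketch below, using the example type from the earlier slide, compares (1/N) times the log of the exact count with the entropy sum as N grows.

```python
from math import comb, log

def binary_entropy(t):
    """Binary entropy in nats, with H(0) = H(1) = 0."""
    return 0.0 if t in (0.0, 1.0) else -t * log(t) - (1 - t) * log(1 - t)

theta = [1, 0, 0.25, 0, 0.5, 0.5, 0.25]     # the example type from above
bound = sum(binary_entropy(t) for t in theta)

for N in (4, 16, 64, 256, 1024):            # multiples of 4, so N*theta_v is integral
    exact = sum(log(comb(N, round(N * t))) for t in theta) / N
    print(f"N={N:5d}  (1/N) log #(type theta) = {exact:.4f}   sum_v H(theta_v) = {bound:.4f}")
```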

Computing Φ(θc)

Sanov’s theorem (or something like it) tells us that Φ(θc) is the minimum KL-distance to a certain distribution qθc.

Pθc is the set of distributions on Ωc with marginals equal to θc.

distributions on Ωc with marginals equal to θc.

Φ(θc) = min { D(p || qθc) : p ∈ Pθc }
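The minimization can be carried out numerically. The sketch below does this for a single degree-4 check with the codeword sets Ωc (even-weight patterns); both the target marginals θc = (0.3, 0.3, 0.3, 0.3) and the reference distribution q (taken here to be the product Bernoulli(θc) distribution, since the slides do not spell out qθc in the text) are illustrative assumptions.

```python
import numpy as np
from itertools import product
from scipy.optimize import minimize

deg = 4
# Omega_c for codewords at a degree-4 check: even-weight local patterns.
Omega_c = np.array([w for w in product((0, 1), repeat=deg) if sum(w) % 2 == 0],
                   dtype=float)                      # shape (8, 4)

theta_c = np.array([0.3, 0.3, 0.3, 0.3])             # illustrative target marginals

# Assumed reference distribution q: product Bernoulli(theta_c), evaluated on Omega_c.
q = np.prod(np.where(Omega_c == 1, theta_c, 1 - theta_c), axis=1)

def kl(p):
    """D(p || q) with a small floor to avoid log(0)."""
    p = np.maximum(p, 1e-12)
    return float(np.sum(p * np.log(p / q)))

constraints = (
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},            # p is a distribution
    {"type": "eq", "fun": lambda p: Omega_c.T @ p - theta_c},   # marginals equal theta_c
)
p0 = np.full(len(Omega_c), 1.0 / len(Omega_c))
res = minimize(kl, p0, method="SLSQP",
               bounds=[(0.0, 1.0)] * len(Omega_c), constraints=constraints)

print("Phi(theta_c) ~", round(res.fun, 6))
print("optimal p:", np.round(res.x, 4))
```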

Finding “p”

We can do a constrained minimization to show that the optimal p must have a “Boltzmann distribution” with parameter s.

It is easy to compute θc from s, but difficult in reverse.

It is easy to compute D(p||q) from s.

p_s(ω) = e^{⟨s, ω⟩} / Σ_{ω′ ∈ Ωc} e^{⟨s, ω′⟩},   ω ∈ Ωc

[Diagram: s ↦ p ↦ θc, D(p||q)]
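The “easy” directions the slide mentions can be made concrete. Assuming the Boltzmann family ps(ω) ∝ exp(⟨s, ω⟩) on Ωc as reconstructed above, and the same illustrative q as in the previous sketch, the code below computes θc and D(ps||q) directly from s; recovering s from a prescribed θc would instead require numerically inverting the marginal map, e.g. with a root finder.

```python
import numpy as np
from itertools import product

deg = 4
Omega_c = np.array([w for w in product((0, 1), repeat=deg) if sum(w) % 2 == 0],
                   dtype=float)                       # even-weight local patterns

def boltzmann(s):
    """p_s(w) proportional to exp(<s, w>) on Omega_c."""
    logits = Omega_c @ s
    p = np.exp(logits - logits.max())                 # stabilized softmax
    return p / p.sum()

def marginals(s):
    """The easy direction: theta_c as a function of s."""
    return Omega_c.T @ boltzmann(s)

def kl_to_q(s, q):
    """D(p_s || q), also easy to compute from s."""
    p = boltzmann(s)
    return float(np.sum(p * np.log(p / q)))

s = np.array([-0.5, -0.5, -0.5, -0.5])                # an arbitrary illustrative parameter
theta_c = marginals(s)
q = np.prod(np.where(Omega_c == 1, theta_c, 1 - theta_c), axis=1)   # assumed q, as before

print("theta_c(s) =", np.round(theta_c, 4))
print("D(p_s || q) =", round(kl_to_q(s, q), 6))
```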

Optimizing over θ

E(θ) is not a convex function; thus, in principle, we have to evaluate it everywhere to find its maximum E(Θ).

In practice, we still use convex optimization techniques.

Using weight enumerators to design codes!

Why are Enumerators Interesting?

The zero-crossings of weight enumerators give us lower bounds on the typical size of bad sets.

In this example, the zero-crossing of the stopping set enumerator is about 0.1, so we would expect a code of block length 10^6 to have a minimum stopping set of size about 10^5 (or possibly bigger).

Optimizing Protograph Ensembles

Currently, many ensembles are designed only for density evolution threshold.

Such optimized codes are often ignored by actual implementers because the desired error rates are around 10^-10.

By optimizing simultaneously for threshold and asymptotic enumerator, we may be able to find efficient codes that can be used in these applications.

Open Questions

How fast can we compute the weight enumerator (or just the zero-crossing)?

What’s the achievable region of threshold and zero-crossing for protograph ensembles?