
Design and Analysis of Nonbinary LDPC Codes for Arbitrary Discrete-Memoryless Channels

Amir Bennatan, Student Member, IEEE, and David Burshtein, Senior Member, IEEE

Abstract—We present an analysis under iterative decoding of coset low-density parity-check (LDPC) codes over GF(q), designed for use over arbitrary discrete-memoryless channels (particularly nonbinary and asymmetric channels). We use a random-coset analysis to produce an effect that is similar to output symmetry with binary channels. We show that the random selection of the nonzero elements of the GF(q) parity-check matrix induces a permutation-invariance property on the densities of the decoder messages, which simplifies their analysis and approximation. We generalize several properties, including symmetry and stability, from the analysis of binary LDPC codes. We show that under a Gaussian approximation, the entire (q-1)-dimensional distribution of the vector messages is described by a single scalar parameter (like the distributions of binary LDPC messages). We apply this property to develop extrinsic information transfer (EXIT) charts for our codes. We use appropriately designed signal constellations to obtain substantial shaping gains. Simulation results indicate that our codes outperform multilevel codes at short block lengths. We also present simulation results for the additive white Gaussian noise (AWGN) channel, including results within 0.56 dB of the unrestricted Shannon limit (i.e., not restricted to any signal constellation) at a spectral efficiency of 6 bits/s/Hz.

Index Terms—Bandwidth-efficient coding, coset codes, iterative decoding, low-density parity-check (LDPC) codes.

I. INTRODUCTION

In their seminal work, Richardson et al. [29], [28] developed an extensive analysis of low-density parity-check (LDPC) codes over memoryless binary-input output-symmetric (MBIOS) channels. Using this analysis, they designed edge-distributions for LDPC codes at rates remarkably close to the capacity of several such channels. However, their analysis is mostly restricted to MBIOS channels. This rules out many important channels, including bandwidth-efficient channels, which require nonbinary channel alphabets.

To design nonbinary codes, Hou et al. [18] suggested starting off with binary LDPC codes, either as components of a multilevel code (MLC) or of a bit-interleaved coded modulation (BICM) scheme. Nonbinary channels are typically not output

Manuscript received October 3, 2004; revised November 6, 2005. This work was supported in part by the Israel Science Foundation under Grant 22/01–1, in part by an Equipment Grant from the Israel Science Foundation to the School of Computer Science at Tel-Aviv University, and in part by a Fellowship from The Yitzhak and Chaya Weinstein Research Institute for Signal Processing at Tel-Aviv University. The material in this paper was presented in part at the 41st Annual Allerton Conference on Communications, Control and Computing, Monticello, IL, October 2003, and at the IEEE International Symposium on Information Theory, Adelaide, Australia, September 2005.

The authors are with the School of Electrical Engineering, Tel-Aviv University, Ramat-Aviv 69978, Tel-Aviv, Israel (e-mail: [email protected]; [email protected]).

Communicated by M. P. Fossorier, Associate Editor for Coding Techniques.
Digital Object Identifier 10.1109/TIT.2005.862080

symmetric, thus posing a problem to their analysis. To overcome this problem, Hou et al. used coset-LDPC codes rather than plain LDPC codes. The use of coset-LDPC codes was first suggested by Kavčić et al. [19] in the context of LDPC codes for channels with intersymbol interference (ISI).

MLC and BICM codes are frequently decoded using multistage and parallel decoding, respectively. Both methods are suboptimal in comparison to methods that rely only on belief-propagation decoding.1 Full belief-propagation decoding was considered by Varnica et al. [37] for MLC and by ourselves in [1] (using a variant of BICM LDPC called BQC-LDPC). However, both methods involve computations that are difficult to analyze.

An alternative approach to designing nonbinary codes starts off with nonbinary LDPC codes. Gallager [16] defined arbitrary-alphabet LDPC codes using modulo-q arithmetic. Nonbinary LDPC codes were also considered by Davey and MacKay [10] in the context of codes for binary-input channels. Their definition uses Galois field (GF(q)) arithmetic. In this paper, we focus on GF(q) LDPC codes similar to those suggested in [10].

In [1], we considered coset GF(q) LDPC codes under maximum-likelihood (ML) decoding. We showed that appropriately designed coset GF(q) LDPC codes are capable of achieving the capacity of any discrete-memoryless channel. In this paper, we examine coset GF(q) LDPC codes under iterative decoding.

A straightforward implementation of the nonbinary belief-propagation decoder has a very large decoding complexity. However, we discuss an implementation method suggested by Richardson and Urbanke [28, Sec. V] that uses the multidimensional discrete Fourier transform (DFT). Coupled with an efficient algorithm for computing the multidimensional DFT, this method reduces the complexity dramatically, to that of the above-discussed binary-based MLC and BICM schemes (when full belief-propagation decoding is used).

With binary LDPC codes, the binary phase-shift keying (BPSK) signals {+1, -1} are typically used instead of the symbols {0, 1} of the code alphabet when transmitting over the AWGN channel. Similarly, with nonbinary LDPC codes, a straightforward choice would be to use a pulse amplitude modulation (PAM) or quadrature amplitude modulation (QAM) signal constellation (which we indeed use in some of our simulations). However, with such constellations, the codes exhibit a shaping loss which, at high signal-to-noise ratio (SNR), approaches 1.53 dB [13]. By carefully selecting the signal constellation, a substantial shaping gain can be achieved.

1Multistage decoding involves transferring a hard decision on the decoded codeword (rather than a soft decision) from one component code to the next. It further does not benefit from feedback on this decision from subsequent decoders. Parallel decoding of BICM codes is bounded away from capacity, as discussed in [7].


Two approaches that we discuss are quantization mapping, which we have used in [1] (based on ideas by Gallager [17] and McEliece [25]), and nonuniform spacing (based on Sun and van Tilborg [33] and Fragouli et al. [14]).

An important aid in the analysis of binary LDPC codes is density evolution, proposed by Richardson and Urbanke [28]. Density evolution enables computing the exact threshold of binary LDPC codes asymptotically at large block lengths. Using density evolution, Chung et al. [8] were able to present irregular LDPC codes within 0.0045 dB of the Shannon limit of the binary-input additive white Gaussian noise (AWGN) channel. Efficient algorithms for computing density evolution were proposed in [28] and [8].

Density evolution is heavily reliant on the output symmetry of typical binary channels. In this paper, we show that focusing on coset codes enables extension of the concepts of density evolution to nonbinary LDPC codes. We examine our codes in a random-coset setting, where the average performance is evaluated over all possible realizations of the coset vector. Our approach is similar to the one used by Kavčić et al. [19] for binary channels with ISI. Random-coset analysis enables us to generalize several properties from the analysis of binary LDPC codes, including the all-zero codeword assumption,2 and the symmetry property of densities.

In [9] and [35], approximations of density evolution were proposed that use a Gaussian assumption. These approximations track one-dimensional surrogates rather than the true densities, and are easier to implement. A different approach was used in [6] to develop one-dimensional surrogates that can be used to compute lower bounds on the decoding threshold.

Unlike the binary case, the problem of finding an efficient algorithm for computing density evolution for nonbinary LDPC codes remains open. This results from the fact that the messages transferred in nonbinary belief propagation are multidimensional vectors rather than scalar values. Just storing the density of a nonscalar random variable requires an amount of memory that is exponential in the alphabet size. Nevertheless, we show that approximation using surrogates is very much possible.

With LDPC codes over GF(q), the nonzero elements of the sparse parity-check matrix are selected at random from GF(q). In this paper, we show that this random selection induces an additional symmetry property on the distributions tracked by density evolution, which we call permutation-invariance. We use permutation-invariance to generalize the stability property from binary LDPC codes.

Gaussian approximation of nonbinary LDPC codes was first considered by Li et al. [22] in the context of transmission over binary-input channels. Their approximation uses (q-1)-dimensional vector parameters to characterize the densities of messages, under the assumption that the densities are approximately Gaussian. We show that assuming permutation-invariance, the densities may in fact be described by scalar, one-dimensional parameters, like the densities of binary LDPC messages.

2Note that in [38], an approach to generalizing density evolution to asymmetric binary channels was proposed that does not require the all-zero codeword assumption.

Finally, binary LDPC codes are commonly designed using extrinsic information transfer (EXIT) charts, as suggested by ten Brink et al. [35]. EXIT charts are based on the Gaussian approximation of density evolution. In this paper, we therefore use the generalization of this approximation to extend EXIT charts to coset GF(q) LDPC codes. Using EXIT charts, we design codes at several spectral efficiencies, including codes at a spectral efficiency of 6 bits/s/Hz within 0.56 dB of the unrestricted Shannon limit (i.e., when transmission is not restricted to any signal constellation). To the best of our knowledge, these are the best codes designed for this spectral efficiency. We also compare coset GF(q) LDPC codes to codes constructed using multilevel coding and turbo trellis-coded modulation (turbo-TCM), and provide simulation results that indicate that our codes outperform these schemes at short block lengths.

Our work is organized as follows: We begin by introducing some notation in Section II.3 In Section III, we formally define coset LDPC codes over GF(q) and ensembles of codes, and discuss mappings to the channel alphabet. In Section IV, we present belief-propagation decoding of coset GF(q) LDPC codes, and discuss its efficient implementation. In Section V, we discuss the all-zero codeword assumption, symmetry, and channel equivalence. In Section VI, we present density evolution for nonbinary LDPC codes and permutation-invariance. We also develop the stability property and Gaussian approximation. In Section VII, we discuss the design of LDPC codes using EXIT charts and present simulation results. In Section VIII, we compare our codes with multilevel coding and turbo-TCM. Section IX presents ideas for further research and concludes the paper.

II. NOTATION

A. General Notation

Vectors are typically denoted by boldface, e.g., x. Random variables are denoted by uppercase letters, e.g., X, and their instantiations in lowercase, e.g., x. We allow an exception to this rule with random variables over GF(q), to enable neater notation.

For simplicity, throughout this paper, we generally assume discrete random variables (with one exception involving the Gaussian approximation). The generalization to continuous variables is immediate.

B. Probability and Log-Likelihood-Ratio (LLR) Vectors

An important difference between nonbinary and binary LDPC decoders is that the former use messages that are multidimensional vectors, rather than scalar values. Like the binary decoders, however, there are two possible representations for the messages: plain-likelihood probability vectors or log-likelihood-ratio (LLR) vectors.

A q-dimensional probability vector x = (x_0, x_1, ..., x_{q-1}) is a vector of real numbers such that x_i >= 0 for all i and Σ_{i=0}^{q-1} x_i = 1. The indices of each message vector's components are also interpreted as elements of GF(q). That is, each index i is taken to mean the ith element

3We have placed this section first for easy reference, although none of the notations are required to understand Section III.


of GF(q), given some enumeration of the field elements (we assume that indices 0 and 1 correspond to the zero and one elements of the field, respectively).

Given a probability vector x, the LLR values associated with it are defined as

    w_i = log(x_0 / x_i),    i = 0, ..., q-1

(a definition borrowed from [22]). Notice that for all x, w_0 = 0. We define the LLR-vector representation of x as the (q-1)-dimensional vector w = (w_1, ..., w_{q-1}). For convenience, although w_0 is not defined as belonging to this vector, we will allow ourselves to refer to it with the implicit understanding that it is always equal to zero.

Given an LLR vector w, the components of the corresponding probability vector (the probability vector from which w was produced) can be obtained by

    x_i = LLR^{-1}(w)_i = e^{-w_i} / Σ_{j=0}^{q-1} e^{-w_j}    (1)

We use the shorthand notation LLR(x) to denote the LLR-vector representation of a probability vector x. Similarly, if w is an LLR vector, then LLR^{-1}(w) is its corresponding probability-vector representation.

A probability-vector random variable X is defined to be a q-dimensional random variable that takes only valid probability-vector values. An LLR-vector random variable W is simply a (q-1)-dimensional random variable.
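As an illustration, the two representations and the maps between them fit in a few lines of Python; the helper names (llr, llr_inv) are ours, not notation from the paper, and this is only a sketch of the definitions above.

import numpy as np

def llr(x):
    # LLR-vector representation of a probability vector x = (x_0,...,x_{q-1}):
    # w_i = log(x_0 / x_i); the component w_0 = 0 is dropped.
    return np.log(x[0] / x[1:])

def llr_inv(w):
    # Inverse map (1): x_i = exp(-w_i) / sum_j exp(-w_j), with w_0 = 0 implicit.
    e = np.exp(-np.concatenate(([0.0], w)))
    return e / e.sum()

x = np.array([0.5, 0.25, 0.125, 0.125])   # a probability vector over GF(4)
w = llr(x)
assert np.allclose(llr_inv(w), x)         # the two representations are equivalent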

C. The Operations +g and ×g

Given a probability vector x and an element g of GF(q), we define the operator x^{+g} in the following way (note that a different definition will shortly be given for LLR vectors):

    (x^{+g})_i = x_{i+g}    (2)

where addition is performed over GF(q). X^{+}(x) is defined as the set

    X^{+}(x) = { x^{+g} : g ∈ GF(q) }    (3)

We define n(x) as the number of elements g ∈ GF(q) satisfying x^{+g} = x. For example, assuming GF(2) addition, (0.3, 0.7)^{+1} = (0.7, 0.3), and n((0.5, 0.5)) = 2. Note that n(x) >= 1 for all x, because x^{+0} = x.

Similarly, we define

    (x^{×g})_i = x_{i·g}    (4)

Note that the +g operation is reversible, and (x^{+g})^{+(-g)} = x. Similarly, x^{×g} is reversible for all g ≠ 0, and (x^{×g})^{×g^{-1}} = x. In Appendix I, we summarize some additional properties of these operators that are used in this paper.

In the context of LLR vectors, we define the +g operation differently. Given an LLR vector w, we define w^{+g} using the corresponding probability vector. That is,

    w^{+g} = LLR( (LLR^{-1}(w))^{+g} )

Thus, we obtain

    (w^{+g})_i = w_{i+g} - w_g    (5)

The ×g operation is similarly defined as

    w^{×g} = LLR( (LLR^{-1}(w))^{×g} )

However, unlike the +g operation, the resulting definition coincides with the definition for probability vectors, and (w^{×g})_i = w_{i·g}.

Fig. 1. Schematic diagram of a GF(q) LDPC bipartite graph.
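A small sketch of the two operators, restricted for simplicity to a prime field GF(q) so that index arithmetic is ordinary integer arithmetic mod q (for q = 2^m, field addition and multiplication tables would take its place); the helper names are ours.

import numpy as np

q = 5  # a prime, so GF(q) arithmetic is integer arithmetic mod q

def plus_g(x, g):
    # (x^{+g})_i = x_{i+g}, addition over GF(q); see (2)
    return x[(np.arange(q) + g) % q]

def times_g(x, g):
    # (x^{×g})_i = x_{i·g}, multiplication over GF(q), g != 0; see (4)
    return x[(np.arange(q) * g) % q]

x = np.array([0.4, 0.3, 0.1, 0.1, 0.1])
g = 2
# Reversibility: (x^{+g})^{+(-g)} = x and (x^{×g})^{×g^{-1}} = x.
assert np.allclose(plus_g(plus_g(x, g), (-g) % q), x)
g_inv = pow(g, -1, q)   # modular inverse of g (Python 3.8+)
assert np.allclose(times_g(times_g(x, g), g_inv), x)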

III. COSET GF(q) LDPC CODES DEFINED

We begin in Section III-A by defining LDPC codes over GF(q). We proceed in Section III-B to define coset GF(q) LDPC codes. In Section III-C, we define the concept of mappings, by which coset GF(q) LDPC codes are tailored to specific channels. In Section III-D, we discuss ensembles of coset GF(q) LDPC codes.

A. LDPC Codes Over GF(q)

A GF(q) LDPC code is defined in a way similar to binary LDPC codes, using a bipartite Tanner graph [34]. The graph has n variable (left) nodes, corresponding to codeword symbols, and m check (right) nodes, corresponding to parity checks.

Two important differences distinguish GF(q) LDPC codes from their binary counterparts. First, the codeword elements are selected from the entire field GF(q). Hence, each variable node is assigned a symbol from GF(q), rather than just a binary digit. Second, at each edge of the Tanner graph, a label from GF(q)\{0} is defined. Fig. 1 illustrates the labels at the edges adjacent to some check node of an LDPC code's bipartite graph (the digits shown represent nonzero elements of GF(q)).

A word x = (x_1, ..., x_n) with components from GF(q) is a codeword if at each check node c, the following equation holds:

    Σ_{v ∈ N(c)} g_{v,c} · x_v = 0

where N(c) is the set of variable nodes adjacent to c and g_{v,c} is the label on the edge connecting v and c. The GF(q) LDPC code's parity-check matrix can easily be obtained from its bipartite graph (see [1]).

As with binary LDPC codes, we say that a GF(q) LDPC code is regular if all variable nodes in its Tanner graph have the same degree, and all check nodes have the same degree. Otherwise, we say it is irregular.
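To make the codeword condition concrete, here is a minimal sketch that checks a word against a labeled parity-check matrix, again assuming a prime field so that GF(q) arithmetic reduces to mod-q integer arithmetic; the matrix H below is an arbitrary illustrative example, not taken from the paper.

import numpy as np

q = 5
# Row j of H is a parity check; entry H[j, v] is the label on edge (v, j),
# or 0 if variable node v does not participate in check j.
H = np.array([[1, 2, 0, 3],
              [0, 1, 4, 2]])

def is_codeword(x, H, q):
    # x is a codeword iff at each check node: sum_v H[j, v] * x_v = 0 over GF(q)
    return bool(np.all(H.dot(x) % q == 0))

print(is_codeword(np.array([0, 0, 0, 0]), H, q))  # True: the all-zero word
print(is_codeword(np.array([3, 1, 1, 0]), H, q))  # True: 3+2*1=5≡0, 1+4*1=5≡0 (mod 5)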


B. Coset GF(q) LDPC Codes

As mentioned in Section I, rather than using plain GF(q) LDPC codes, it is useful instead to consider coset codes. In doing so, we follow the example of Elias [12] with binary codes.

Definition 1: Given a length-n linear code C and a length-n vector v over GF(q), the code C + v (i.e., obtained by adding v to each of the codewords of C) is called a coset code. Note that the addition is performed componentwise over GF(q). v is called the coset vector.

The use of coset codes, as we will later see, is a valuable asset to rigorous analysis and is easily accounted for in the decoding process.

C. Mapping to the Channel Signal Set

With binary LDPC codes, the BPSK signals {+1, -1} are typically used instead of the symbols {0, 1} of the code alphabet. With nonbinary LDPC codes, we denote the signal constellation by A and the mapping from the code alphabet (GF(q)) to A by δ. When designing codes for transmission over an AWGN channel, a PAM or QAM constellation is a straightforward choice for A. In Section VIII, we present codes where A is a PAM signal constellation. However, we now show that more careful attention to the design of the signal constellation can produce a substantial gain in performance.

In [1], we have shown that ensembles of GF(q) LDPC codes resemble uniform random-coding ensembles. That is, the empirical distribution of GF(q) symbols in nearly all codewords is approximately uniform. Equivalently, for a given codeword, Pr(X = a) ≈ 1/q for all a ∈ GF(q), where X is a randomly selected codeword symbol. Such codes are useful for transmission over symmetric channels, where the capacity-achieving distribution is uniform [17]. However, to approach capacity over asymmetric channels (and overcome the shaping gap [13]), we need the symbol distribution to be nonuniform. For example, to approach capacity over the AWGN channel, we need the distribution to resemble a Gaussian distribution.

One solution to this problem is a variant of an idea by Gallager [17]. The approach begins with a mapping of symbols from GF(q) (the code alphabet) into the channel input alphabet. We typically use a code alphabet that is larger than the channel input alphabet. By mapping several GF(q) symbols into each channel symbol (rather than using a one-to-one mapping), we can control the probability of each channel symbol. For example, in Fig. 2, we examine a channel alphabet A and a quantization mapping that is designed to achieve a prescribed nonuniform distribution over it (the digits in the figure represent elements of GF(q)). We call this a quantization mapping because the mapping is many-to-one.

Formally, we define quantization mapping as follows.

Definition 2: Let {p_a : a ∈ A} be a rational probability assignment of the form p_a = k_a / q, for all a ∈ A, where the k_a are nonnegative integers. A quantization associated with {p_a} is a mapping from a set of q GF(q) elements to A such that the number of elements mapped to each a ∈ A is k_a = q · p_a.

Fig. 2. An example of quantization mapping.

Quantizations are designed for finite channel input alphabets and rational-valued probability assignments. However, other probability assignments can be approximated arbitrarily closely. Independently of our work, a similar approach was developed by Ratzer and MacKay [26] (note that their approach does not involve coset codes).
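A minimal sketch of a quantization in the sense of Definition 2, with an illustrative choice of alphabet and probability assignment of our own (q = 8 code symbols mapped many-to-one onto three channel symbols with probabilities 1/2, 1/4, 1/4):

import numpy as np

q = 8
channel_alphabet = [-1.0, 0.0, +1.0]   # hypothetical 3-point constellation
counts = [4, 2, 2]                      # k_a symbols per channel symbol: p_a = k_a / q

# Build the quantization: the first 4 field elements map to -1, the next 2 to 0, etc.
quantization = np.repeat(channel_alphabet, counts)
assert len(quantization) == q

# A uniformly distributed GF(8) symbol now induces the nonuniform channel distribution.
symbols = np.random.randint(0, q, size=100000)
signals = quantization[symbols]
for a, k in zip(channel_alphabet, counts):
    print(a, (signals == a).mean(), "target:", k / q)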

A similar approach to designing mappings is based on Sun and van Tilborg [33] and Fragouli et al. [14], and is suitable for channels with continuous input alphabets (like the AWGN channel). Instead of mapping many code symbols into each channel symbol, they used a one-to-one mapping to a set of channel input signals that are nonuniformly spaced. To approximate a Gaussian input distribution, for example, the signals could be spaced more densely around zero.

Our analysis in this paper, however, is not restricted to the above mappings. Given a mapping δ over GF(q), we define the mapping of a vector with symbols in GF(q) as the vector obtained by applying δ to each of its symbols. The mapping of a code is the code obtained by applying the mapping to each of the codewords.

It is useful to model coset GF(q) LDPC encoding as a sequence of operations, as shown in Fig. 3. An incoming message is encoded into a codeword of the underlying GF(q) LDPC code. The coset vector is then added, and a mapping δ is applied. In the sequel, we will refer to the resulting codeword as a coset GF(q) LDPC codeword, although strictly speaking, the mapping is not included in Definition 1. Finally, the resulting codeword is transmitted over the channel.

D. Ensembles of Coset GF(q) LDPC Codes

As in the case of standard binary LDPC codes, the analysis of coset GF(q) LDPC codes focuses on the average behavior of codes selected at random from an ensemble of codes.

The following method, due to Luby et al. [24], is used to construct irregular bipartite Tanner graphs. The graphs are characterized by two probability vectors

    λ = (λ_2, ..., λ_c)    and    ρ = (ρ_2, ..., ρ_d)

For convenience, we also define the polynomials

    λ(x) = Σ_i λ_i x^{i-1}    and    ρ(x) = Σ_i ρ_i x^{i-1}

In a Tanner graph, for each i a fraction λ_i of the edges has left degree i, and for each i a fraction ρ_i of the edges has right degree i. Letting E denote the total number of edges, we obtain that there are E·λ_i/i left nodes with degree i, and E·ρ_i/i right nodes with degree i.


Fig. 3. Encoding of coset GF(q) LDPC codes.

Letting n denote the number of left nodes and m the number of right nodes, we have

    n = E · Σ_i λ_i / i    and    m = E · Σ_i ρ_i / i

Luby et al. suggested the following method for constructing bipartite graphs. The edges originating from left nodes are numbered from 1 to E. The same procedure is applied to the edges originating from right nodes. A permutation π is then chosen with uniform probability from the space of all permutations of {1, ..., E}. Finally, for each i, the edge numbered i on the left side is associated with the edge numbered π(i) on the right side. Note that occasionally, multiple edges may link a pair of nodes.
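The socket-permutation construction can be sketched directly; the degree assignments below are an arbitrary illustrative example of ours, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
left_degrees = [2, 2, 3, 3]    # example variable-node degrees; E = 10 edges
right_degrees = [5, 5]         # example check-node degrees; must also sum to E

# Edge sockets: socket i on the left belongs to variable node left_of[i], etc.
left_of = np.repeat(np.arange(len(left_degrees)), left_degrees)
right_of = np.repeat(np.arange(len(right_degrees)), right_degrees)
assert len(left_of) == len(right_of)

# Uniformly random permutation pi: left socket i is joined to right socket pi(i).
pi = rng.permutation(len(left_of))
edges = list(zip(left_of, right_of[pi]))
print(edges)   # the same (v, c) pair may occasionally appear twice: a multiple edge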

A GF(q) LDPC code is constructed from a Tanner graph by random independent and identically distributed (i.i.d.) selection of the labels, with uniform probability from GF(q)\{0}, at each edge. Given a mapping δ, a coset GF(q) LDPC code is created by applying δ to a coset of a GF(q) LDPC code. The coset vector is generated by random uniform i.i.d. selection of its components from GF(q).

Summarizing, a random selection of a code from a coset GF(q) LDPC ensemble amounts to a random construction of its Tanner graph, a random selection of its labels, and a random selection of a coset vector.

The rate of a coset GF(q) LDPC code is equal to the rate of its underlying GF(q) LDPC code. The design rate of a GF(q) LDPC code is defined as

    r = 1 - (Σ_i ρ_i / i) / (Σ_i λ_i / i)    (6)

This value is a lower bound on the true rate of the code, measured in q-ary symbols per channel use.
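The design rate (6) depends only on the edge-degree distributions, as the following sketch illustrates (hypothetical helper name):

def design_rate(lam, rho):
    # lam[i] / rho[i]: fraction of edges with left / right degree i,
    # given as dictionaries {degree: fraction}.
    # r = 1 - (sum_i rho_i / i) / (sum_i lam_i / i), as in (6).
    return 1 - sum(f / d for d, f in rho.items()) / sum(f / d for d, f in lam.items())

# A (3,6)-regular code: all edges have left degree 3 and right degree 6.
print(design_rate({3: 1.0}, {6: 1.0}))   # 0.5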

IV. BELIEF-PROPAGATION DECODING OF COSET GF(q) LDPC CODES

A. Definition of the Decoder

The coset GF(q) LDPC belief-propagation decoder is based on Gallager [16] and Kschischang et al. [21]. The decoder attempts to recover c, the codeword of the underlying GF(q) LDPC code. Decoding consists of alternating rightbound and leftbound iterations. In a rightbound iteration, messages are sent from variable nodes to check nodes. In a leftbound iteration, the opposite occurs. Note that with this terminology, a rightbound message is produced at a left node (a variable node) and a leftbound message is produced at a right node (a check node).

As mentioned in Section II, the decoder's messages are q-dimensional probability vectors, rather than scalar values as in standard binary LDPC.

Algorithm 1: Perform the following steps, alternately:

1) Rightbound iteration. For all edges e = (v, c), do the following in parallel: If this is iteration zero, set the rightbound message m_{v→c} to the initial message m^{(0)}_v, whose components are defined as follows:

    (m^{(0)}_v)_i ∝ Pr(y_v was received | δ(i + v_v) was transmitted)    (7)

y_v and v_v are the channel output, and the element of the coset vector, corresponding to variable node v. The addition operation is performed over GF(q). Otherwise (iteration number 1 and above)

    (m_{v→c})_i = (1/K) · (m^{(0)}_v)_i · Π_{c' ∈ N(v)\{c}} (m_{c'→v})_i    (8)

where K is a normalization constant, d_v is the degree of the node v, and the m_{c'→v} denote the incoming (leftbound) messages across the edges (v, c'), c' ∈ N(v)\{c}, N(v) denoting the set of nodes adjacent to v.

2) Leftbound iteration. For all edges e = (v, c), do the following in parallel: Set the components of the leftbound message m_{c→v} as follows:

    (m_{c→v})_i = Σ Π_{k=1}^{d_c-1} (m_{v_k→c})_{i_k}    (9)

where the sum is over all index sequences i_1, ..., i_{d_c-1} satisfying g·i + g_1·i_1 + ... + g_{d_c-1}·i_{d_c-1} = 0. Here d_c is the degree of node c, m_{v_1→c}, ..., m_{v_{d_c-1}→c} denote the rightbound messages across the edges (v_1, c), ..., (v_{d_c-1}, c) (the edges adjacent to c other than (v, c)), and g_1, ..., g_{d_c-1} are the labels on those edges. g denotes the label on the edge (v, c). The summations and multiplications of the indices and the labels are performed over GF(q). Note that an equivalent, simpler expression will be given shortly.

If m is a rightbound (leftbound) message from (to) a variable node, then element m_i represents an estimate of the a posteriori probability (APP) that the corresponding code symbol is i, given the channel observations in a corresponding neighborhood graph (we will elaborate on this in Section IV-C). The decision associated with m is defined as follows: the decoder decides on the symbol i that maximizes m_i. If the maximum is obtained at several indices, a uniform random selection is made among them.

In our analysis, we focus on the probability that a rightbound or leftbound message is erroneous (i.e., corresponds to an incorrect decision). However, in a practical setting, the decoder stops after a fixed number of decoding iterations and computes, at each variable node v, a final vector of APP values m_v. The vector is computed using (8), replacing N(v)\{c} with N(v). m_v is unique to each variable node (unlike rightbound or leftbound messages), and can thus be used to compute a final decision on its value.


Consider expression (9) for computing the leftbound messages. A useful, equivalent expression is given by

    m_{c→v} = ( (m_{v_1→c})^{×g_1^{-1}} ⊛ ... ⊛ (m_{v_{d_c-1}→c})^{×g_{d_c-1}^{-1}} )^{×(-g)}    (10)

where m_{c→v} is the entire leftbound vector (rather than a component as in (9)) and the ×g operator is defined as in (4). The GF(q) convolution operator ⊛ is defined as an operation between two vectors, which produces a vector whose components are given by

    (x ⊛ y)_i = Σ_{j ∈ GF(q)} x_j · y_{i-j}    (11)

where the subtraction i - j is evaluated over GF(q). Throughout the paper, the following definitions are useful:

    m̃_k = (m_{v_k→c})^{×g_k^{-1}},    k = 1, ..., d_c - 1    (12)

Using these definitions, (10) may be further rewritten as

    m_{c→v} = ( m̃_1 ⊛ m̃_2 ⊛ ... ⊛ m̃_{d_c-1} )^{×(-g)}    (13)

Like the standard binary LDPC belief-propagation decoder, the coset GF(q) LDPC decoder also has an equivalent formulation using LLR messages.

Algorithm 2: Perform the following steps, alternately:

1) Rightbound iteration. For all edges e = (v, c), do the following in parallel: If this is iteration zero, set the LLR rightbound message to w^{(0)}_v = LLR(m^{(0)}_v), whose components are defined as follows:

    (w^{(0)}_v)_i = log( (m^{(0)}_v)_0 / (m^{(0)}_v)_i )    (14)

Otherwise (iteration number 1 and above)

    w_{v→c} = w^{(0)}_v + Σ_{c' ∈ N(v)\{c}} w_{c'→v}    (15)

where d_v is the degree of the node v and the w_{c'→v} denote the incoming (leftbound) LLR messages across the edges (v, c'), c' ∈ N(v)\{c}. Addition between vectors is performed componentwise.

2) Leftbound iteration. All rightbound messages are converted from LLR to plain-likelihood representation. Expression (9) is applied to obtain the plain-likelihood representation of the leftbound messages. Finally, the leftbound messages are converted back to their corresponding LLR representation.

Both versions of the decoder have similar execution times. However, the LLR representation is sometimes useful in the analysis of the decoders' performance. Note that Wymeersch et al. [39] have developed an alternative decoder that uses LLR representation, which does not require the conversion to plain-likelihood representation that is used in the leftbound iteration of the above algorithm.

B. Efficient Implementation

To compute rightbound messages, we can save time by computing the numerators separately, and then normalizing the sum to 1. At a variable node of degree d_v, the computation of each rightbound message takes O(q · d_v) computations.

A straightforward computation of the leftbound messages at a check node of degree d_c has a complexity of O(q^{d_c-1}) per leftbound message, and a total of O(d_c · q^{d_c-1}) for all messages combined. We will now review a method due to Richardson and Urbanke [28] (developed for the decoding of standard GF(q) LDPC codes) that significantly reduces this complexity. This method assumes plain-likelihood representation of messages. It is nonetheless relevant to the implementation of Algorithm 2, which uses LLR representation, because with this algorithm the leftbound messages are computed by converting them to plain-likelihood representation, applying (9), and converting back to LLR representation.

We first recount some properties of Galois fields (see, e.g., [5] for a more extensive discussion). Galois fields GF(q) exist for values of q equal to p^m, where p is a prime number and m is a positive integer. Each element of GF(q) can be represented as an m-dimensional vector over {0, 1, ..., p-1}. The sum (difference) of two GF(q) elements corresponds to the sum (difference) of the vectors, evaluated as the modulo-p sums (differences) of the vectors' components.

Consider the GF(q) convolution operator, defined by (11) and used in the process of computing the leftbound message in (10). We now replace the GF(q) indices i and j in (11) with their vector representations, i = (i_1, ..., i_m) and j = (j_1, ..., j_m). The expression can be rewritten as

    (x ⊛ y)_{i_1,...,i_m} = Σ_{j_1,...,j_m} x_{j_1,...,j_m} · y_{i_1-j_1, ..., i_m-j_m}    (16)

where each difference i_t - j_t is evaluated modulo p. Consider, for example, the simple case of q = 4 (p = 2, m = 2):

    (x ⊛ y)_{i_1,i_2} = Σ_{j_1=0}^{1} Σ_{j_2=0}^{1} x_{j_1,j_2} · y_{i_1-j_1, i_2-j_2}    (17)

The right-hand side of (17) is the output of the two-dimensional cyclic convolution of x and y, evaluated at (i_1, i_2). In the general case, we have the m-dimensional cyclic convolution. This convolution can equivalently be evaluated using the m-dimensional DFT (m-DFT) and the inverse DFT (m-IDFT) [11, p. 71]. Thus, (13) can be rewritten as

    m_{c→v} = ( IDFT( DFT(m̃_1) · DFT(m̃_2) · ... · DFT(m̃_{d_c-1}) ) )^{×(-g)}

where the multiplication of the DFT vectors is performed componentwise (the final ×(-g) operation amounts to a reordering of components, as in (4)). Let x̂ denote the DFT vector of a q-dimensional probability vector x.


The components of x and x̂ are related by the equations [11, p. 65]

    x̂ = DFT(x):    x̂_{j_1,...,j_m} = Σ_{i_1,...,i_m} x_{i_1,...,i_m} · e^{-2π√-1·(i_1 j_1 + ... + i_m j_m)/p}

    x = IDFT(x̂):   x_{i_1,...,i_m} = (1/q) Σ_{j_1,...,j_m} x̂_{j_1,...,j_m} · e^{2π√-1·(i_1 j_1 + ... + i_m j_m)/p}

Efficient computation of the m-DFT is possible by successively applying the single-dimensional p-point DFT on each of the dimensions in turn, as shown in the following algorithm [11, p. 76]:

Algorithm 3:

for i = 1 to m
    for each choice of indices (j_1, ..., j_{i-1}, j_{i+1}, ..., j_m)
        apply the single-dimensional p-DFT to the p-element vector
            ( x_{j_1,...,j_{i-1}, t, j_{i+1},...,j_m} ),  t = 0, ..., p-1
    end
end
return the transformed array

At each iteration of the preceding algorithm, p^{m-1} p-DFTs are computed. Each p-DFT requires p^2 floating-point multiplications and p(p-1) floating-point additions (to compute all p components), and thus the entire algorithm requires m·p^{m+1} multiplications and m·p^m·(p-1) additions. The m-IDFT can be computed in a similar manner. Note that a further reduction in complexity could be obtained by using number-theoretic transforms, such as the Winograd fast Fourier transform (FFT).

We can use these results to reduce the complexity of leftbound computation at each check node, by first computing the m-DFTs of all rightbound messages, then using the DFT vectors to compute the convolutions. The resulting complexity at each check node is now O(d_c·m·p·q + d_c^2·q). The first element of the sum is the computation of the m-DFTs and m-IDFTs, the second is the componentwise multiplications of the m-DFTs for all messages. This is a significant improvement in comparison to the straightforward approach.

Note that the m-DFT is particularly attractive when p = 2, i.e., when q is a power of 2. The elements of the form e^{-2π√-1·(i·j)/p} become ±1. Thus, the floating-point multiplications are eliminated, and the DFT involves only additions and subtractions. The above complexity figure per check node thus becomes O(d_c·m·q + d_c^2·q). Furthermore, all quantities are real valued and no complex-valued arithmetic is needed.
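A sketch of the resulting check-node computation for q = 2^m: probability vectors are reshaped into m-dimensional binary arrays and transformed with a multidimensional FFT (for p = 2 this is, in effect, the Walsh-Hadamard transform, whose twiddle factors are all ±1). The sketch implements the convolution in (13) but omits, for brevity, the label-induced ×g reorderings; names are ours.

import numpy as np

m = 3
q = 2 ** m   # GF(8); index i <-> binary vector (i_1, i_2, i_3)

def gf_convolve(msgs):
    # Cyclic convolution over GF(2^m): componentwise product in the m-DFT domain.
    spectra = [np.fft.fftn(v.reshape((2,) * m)) for v in msgs]
    prod = np.prod(spectra, axis=0)
    out = np.fft.ifftn(prod).real.reshape(q)
    return out / out.sum()   # renormalize against floating-point drift

rng = np.random.default_rng(1)
msgs = rng.random((4, q))                      # four incoming rightbound messages
msgs /= msgs.sum(axis=1, keepdims=True)        # normalize to probability vectors
print(gf_convolve(msgs))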

An additional improvement (in the general case where p is not necessarily 2) can be achieved using a method suggested by Davey and MacKay [10]. This method produces a negligible improvement except at very high values of q, and is therefore not elaborated here.

C. Neighborhood Graphs and the Tree Assumption

Before we conclude this section, we briefly review the concepts of neighborhood graphs and the tree assumption. These concepts were developed in the context of standard binary LDPC codes and carry over to coset GF(q) LDPC codes as well.

Definition 3: (Richardson and Urbanke [28]) The neighborhood graph of depth d, spanned from an edge e, is the induced graph containing e and all edges and nodes on directed paths of length d that end with e.

At iteration ℓ, a rightbound message produced from a variable node v to a check node c is a vector of APP values for the code symbol at v, given the information observed in the neighborhood of e = (v, c) of depth 2ℓ. Similarly, a leftbound message from c to v is based on the information observed in the neighborhood of e, of depth 2ℓ + 1.

The APP values produced by belief-propagation decoders are

computed under the tree assumption.4 We say that the tree assumption is satisfied at a node in the context of computing a message, if the neighborhood graph on which the message is based is a tree. Asymptotically, at large block lengths, the tree assumption is satisfied with high probability at any particular node [28].

At finite block lengths, the neighborhood graph frequently contains cycles and is therefore not a tree. Such cases are discussed in Appendix II. Nevertheless, simulation results indicate that the belief-propagation decoder produces remarkable performance even when the tree assumption is not strictly satisfied.

V. COSET GF(q) LDPC ANALYSIS IN A RANDOM-COSET SETTING

One important aid in the analysis of coset GF(q) LDPC codes is the randomly selected coset vector that was used in their construction. Rather than examine the decoder of a single coset GF(q) LDPC code, we focus on a set of codes. That is, given a fixed GF(q) LDPC code C and a mapping δ, we consider the behavior of a coset GF(q) LDPC code constructed using a randomly selected coset vector v. We refer to this as random-coset analysis.

With this approach, the random space consists of random channel transitions as well as random realizations of the coset vector v. The random-coset vector produces an effect that is similar to the output symmetry that is usually required in the analysis of standard LDPC codes [28], [29]. Note that although v is random, it is assumed to have been selected in advance and is thus known to the decoder.

Unlike the coset vector, in this section we keep the underlying GF(q) LDPC code fixed. In Section VI, we will consider several of these concepts in the context of selecting the underlying LDPC code at random from an ensemble.

A. The All-Zero Codeword Assumption

An important property of standard binary LDPC decoders [28] is that the probability of decoding error is equal for any

4In [28] it is called the independence assumption.


Fig. 4. Equivalent channel model for coset GF(q) LDPC codes.

transmitted codeword. This property is central to many analysis methods, and enables conditioning the analysis on the assumption that the all-zero5 codeword was transmitted.

With coset GF(q) LDPC codes, we have the following lemma.

Lemma 1: Assume a discrete memoryless channel. Consider the analysis, in a random-coset setting, of a coset GF(q) LDPC code constructed from a fixed GF(q) LDPC code C. For each x ∈ C, let P_ℓ(x) denote the conditional (bit or block) probability of decoding error after iteration ℓ, assuming the codeword x was sent, averaged over all possible values of the coset vector v. Then P_ℓ(x) is independent of x.

The proof of the lemma is provided in Appendix III-B.

Lemma 1 enables us to condition our analysis results on the assumption that the transmitted codeword corresponds to the all-zero codeword of the underlying LDPC code.

B. Symmetry of Message Distributions

The symmetry property, introduced by Richardson and Urbanke [29], is a major tool in the analysis of standard binary LDPC codes. In this subsection, we generalize its definition to q-ary random variables as used in the analysis of coset GF(q) LDPC decoders. We provide two versions of the definition, the first using probability-vector random variables and the second using LLR-vector random variables.

Definition 4: A probability-vector random variable X is symmetric if for any probability vector x, the following expression holds:

    Pr(X = x^{+g}) = (x_g / x_0) · Pr(X = x),    for all g ∈ GF(q)    (18)

where x^{+g} and x_g are as defined in Section II.

In the context of LLR-vector random variables, we have the following lemma.

Lemma 2: Let W be an LLR-vector random variable. The random variable X = LLR^{-1}(W) is symmetric if and only if W satisfies

    Pr(W = w^{+g}) = e^{-w_g} · Pr(W = w)    (19)

for all LLR vectors w and all g ∈ GF(q).

5In [28] a BPSK alphabet is used and thus the codeword is referred to as the“all-one” codeword.

The proof of this lemma is provided in Appendix III-C. In the sequel, we adopt the lemma as a definition of symmetry when discussing variables in LLR representation. Note that in the simple case of q = 2, the LLR vector degenerates to a scalar value w and from (5) we have w^{+1} = -w. Thus, (19) becomes

    Pr(W = -w) = e^{-w} · Pr(W = w)    (20)

This coincides with symmetry for binary codes as defined in [29].

We now examine the message produced at a node of the Tanner graph.

Theorem 1: Assume a discrete memoryless channel and consider a coset GF(q) LDPC code constructed in a random-coset setting from a fixed GF(q) LDPC code C. Let m denote the message produced at a node v of the Tanner graph of C (and of the coset GF(q) LDPC code), at some iteration of belief-propagation decoding. Let the tree assumption be satisfied at v. Then under the all-zero codeword assumption, the random variable m is symmetric.

The proof of the theorem is provided in Appendix III-D.

C. Channel Equivalence

Simple GF(q) LDPC codes, although unsuitable for arbitrary channels, are simpler to analyze than coset GF(q) LDPC codes and decoders. Fig. 4 presents the structure of coset GF(q) LDPC encoding/decoding. x is the transmitted symbol (of the underlying code) and v is the coset symbol. x + v (evaluated over GF(q)) is the input to the mapper, t = δ(x + v) is the mapper's output, and y is the physical channel's output. z will be discussed shortly.

Comparing a coset GF(q) LDPC decoder with the decoder of its underlying GF(q) LDPC code, we may observe that a difference exists only in the computation (7) of the initial messages. The messages are APP values corresponding to a single channel observation. After they are computed, both decoders proceed in exactly the same way. It would thus be desirable to abstract the operations that are unique to coset GF(q) LDPC codes into the channel, and examine an equivalent model, which employs simple GF(q) LDPC codes and decoders.

Consider the channel obtained by encapsulating the addition of a random coset symbol, the mapping, and the computation of the APP values into the channel model. The input to the channel


is a symbol x from the code alphabet6 and the output is a probability vector z of APP values. The decoder of a GF(q) LDPC code, if presented with z as raw channel output, would first compute a new vector of APP values. We will soon show that the computed vector would in fact be identical to z.

We begin with the following definition.

Definition 5: Let p(z | x) denote the transition probabilities of a channel whose input alphabet is GF(q) and whose output alphabet consists of q-dimensional probability vectors. Then the channel is cyclic-symmetric if there exists a probability function f (defined over sets of probability vectors of the form (3)), such that

    p(z | x) = f(X^{+}(z)) · z_x    (21)

Lemma 3: Assume a cyclic-symmetric channel. Let APP(z) denote the vector of APP values for the channel output z. Then APP(z) = z.

The proof of this lemma is provided in Appendix III-F. Returning to the context of our equivalent model, we have the following lemma.

Lemma 4: The equivalent channel of Fig. 4 is cyclic-symmetric.

The proof of this lemma is provided in Appendix III-G.

Once the initial messages are computed, the performance of both the coset GF(q) LDPC and GF(q) LDPC decoding algorithms is a function of these messages alone. Therefore, we have obtained that the performance of a coset GF(q) LDPC decoder in a random-coset setting over the original physical channel is identical to the performance of the underlying GF(q) LDPC decoder over the equivalent channel. This result enables us to shift our discussion from coset GF(q) LDPC codes over arbitrary channels to GF(q) LDPC codes over cyclic-symmetric channels.

Note that a cyclic-symmetric channel is symmetric in the sense defined by Gallager [17, p. 94]. Hence, its capacity-achieving distribution is uniform. This indicates that GF(q) LDPC codes, which have an approximately uniformly distributed code spectrum (see [1]), are suitably designed for it.

We now relate the capacity of the equivalent channel to that of the physical channel. More precisely, we show that the equivalent channel's capacity is equal to the equiprobable-signaling capacity of the physical channel with the mapping δ, denoted C_ep and defined below. Let X, T, and Y be random variables corresponding to the code symbol, the mapper output, and the channel output in Fig. 4. Y is related to T through the physical channel's transition probabilities. Assuming that T is uniformly distributed in A, we define C_ep by C_ep = I(T; Y). C_ep is equal to the capacity of transmission over the physical channel with an input alphabet A using a code whose codewords were generated by random uniform selection.

Lemma 5: The capacity of the equivalent channel of Fig. 4 is equal to C_ep.

6In most cases of interest, x will be a symbol from a GF(q) LDPC codeword. However, in this section we also consider the general, theoretical case, where the input to the channel is an arbitrary GF(q) symbol.

The proof of this lemma is provided in Appendix III-H.

Finally, the following lemma can be viewed as a generalization of the Channel Equivalence Lemma of [29].

Lemma 6: Let P be the probability function of a symmetric probability-vector random variable. Consider the cyclic-symmetric channel whose transition probabilities are given by p(z | x) = P(z) · z_x / z_0. Then, assuming that the symbol zero is transmitted over this cyclic-symmetric channel, the initial messages of a GF(q) LDPC decoder are distributed as P.

The proof of this lemma is straightforward from Definitions 4 and 5 and from Lemma 3. We will refer to the cyclic-symmetric channel defined in Lemma 6 as the equivalent channel corresponding to P.

Remark 1: Note that Lemma 6 remains valid if we switch to LLR representation. That is, we replace P with its LLR equivalent P_W = P ∘ LLR^{-1} and define p(w | x) = P_W(w) · e^{-w_x} (where the +g operation for LLR vectors is defined by (5)).

VI. ANALYSIS OF DENSITY EVOLUTION

In this section, we consider density evolution for coset GF(q) LDPC codes and its analysis. The precise computation of the coset GF(q) LDPC version of the algorithm is generally not possible in practice. The algorithm is, however, valuable as a reference for analysis purposes. We begin by defining density evolution in Section VI-A and examine the application of the concentration theorem of [28] and of symmetry to it. We proceed in Section VI-B to consider permutation-invariance, which is an important property of the densities tracked by the algorithm. We then apply permutation-invariance in Section VI-C to generalize the stability property to coset GF(q) LDPC codes and in Section VI-D to obtain an approximation of density evolution under a Gaussian assumption.

A. Density Evolution

The definition of coset GF(q) LDPC density evolution is based on that of binary LDPC codes. The description below is intended for completeness of this text, and focuses on the differences that are unique to coset GF(q) LDPC codes. The reader is referred to [28] and [29] for a complete rigorous development.

Density evolution tracks the distributions of messages produced in belief propagation, averaged over all possible neighborhood graphs on which they are based. The random space is comprised of random channel transitions, the random selection of the code from a coset GF(q) LDPC ensemble (see Section III-D), and the random selection of an edge from the graph. The random space does not include the transmitted codeword, which is assumed to be fixed at the all-zero codeword (following the discussion of Section V-A). We denote by M_0 the initial message across the edge, by R_ℓ the rightbound message at iteration ℓ, and by L_ℓ the leftbound message at iteration ℓ. The neighborhood graph associated with R_ℓ and L_ℓ is always assumed to be tree-like, and the case that it is not so is neglected.

lihood representation of density evolution. When using LLR-

Page 10: Bennatan2006.pdf

558 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 2, FEBRUARY 2006

vector representation, we let , , and denote the LLR-vector representations of , , and . To simplify our nota-tion, we assume that all random variables are discrete valued andthus track their probability functions rather than their densities.The following discussion focuses on plain-likelihood represen-tation. The translation to LLR representation is straightforward.

1) The initial message. The probability function of M_0 is computed in the following manner:

    Pr(M_0 = z) = Σ_{g ∈ GF(q)} Pr(V = g) · Σ_{y : m^{(0)}(y, g) = z} Pr(y was received | δ(g) was transmitted)

where Y and V are random variables denoting the channel output and coset-vector components, y ranges over the channel output alphabet, and the components of m^{(0)}(y, g) are defined by (7), replacing y_v and v_v with y and g (under the all-zero codeword assumption, the transmitted signal at the node is δ(0 + g) = δ(g)).

2) Leftbound messages. L_ℓ is obtained from (9). The rightbound messages in (9) are replaced by random variables, distributed as R_ℓ and assumed to be independent. Similarly, the labels in (9) are also replaced by independent random variables, uniformly distributed in GF(q)\{0}.

Formally, let d_c be the maximal right degree. Then for each j = 2, ..., d_c, we first define L^{(j)}_ℓ by

    Pr(L^{(j)}_ℓ = z) = Σ Π_{k=1}^{j-1} Pr(R_ℓ = x_k) · Pr(G_k = g_k)

where the sum is over all choices of probability vectors x_1, ..., x_{j-1} (taken from the set of all probability vectors) and labels g_1, ..., g_{j-1} for which the leftbound message, whose components are defined as in (9), equals z. G_k is a random variable corresponding to the kth label, and thus Pr(G_k = g) = 1/(q-1) for all g ∈ GF(q)\{0}. The distribution of R_ℓ is obtained recursively from the previous iteration of belief propagation.

The probability function of L_ℓ is now obtained by

    Pr(L_ℓ = z) = Σ_{j=2}^{d_c} ρ_j · Pr(L^{(j)}_ℓ = z)

3) Rightbound messages. The probability function of R_0 is equal to that of M_0. For ℓ ≥ 1, R_ℓ is obtained from (8). The leftbound messages and the initial message in (8) are replaced by independent random variables, distributed as L_{ℓ-1} and M_0, respectively.

Formally, let d_v be the maximal left degree. Then for each j = 2, ..., d_v, we first define R^{(j)}_ℓ by

    Pr(R^{(j)}_ℓ = z) = Σ Pr(M_0 = x_0) · Π_{k=1}^{j-1} Pr(L_{ℓ-1} = x_k)

where the sum is over all choices of x_0, x_1, ..., x_{j-1} for which the rightbound message, whose components are defined as in (8), equals z. The distributions of L_{ℓ-1} and M_0 are obtained recursively from the previous iterations of belief propagation.

The probability function of R_ℓ is now obtained by

    Pr(R_ℓ = z) = Σ_{j=2}^{d_v} λ_j · Pr(R^{(j)}_ℓ = z)

Theoretically, the preceding algorithm is sufficient to compute the desired densities. In practice, a major problem is the fact that the quantity of memory required to store the probability density of a q-dimensional message grows exponentially with q. For instance, with 100 quantization7 levels per dimension, the amount of memory required for a q-ary code is of the order of 100^{q-1}. Hence, unless an alternative method for describing the densities is found, the algorithm is not realizable. It is noteworthy, however, that the algorithm can be approximated using Monte Carlo simulations.

We now discuss the probability that a message examined in density evolution is erroneous. That is, the message corresponds to an incorrect decision regarding the variable node to which it is directed or from which it was sent. Under the all-zero codeword assumption, the true transmitted code symbol (of the underlying LDPC code), at the relevant variable node, is assumed to be zero.

We first assume that the message is a fixed probability vector x. Suppose x_0 is greater than all other elements x_1, ..., x_{q-1}. Given the decision criterion used by the belief-propagation decoder, described in Section IV-A, the decoder will correctly decide zero. Similarly, if there exists an index i ≠ 0 such that x_i > x_0, then the decoder will incorrectly decide i (or another maximizing index). However, if the maximum is achieved at zero as well as at other indices, the decoder will correctly decide zero with probability 1/k, where k is the number of indices achieving the maximum.

Definition 6: Given a probability vector x, P_e(x) is the probability of error in a decision according to the vector x.

Thus, for example, P_e((0.2, 0.5, 0.3)) = 1 and P_e((0.5, 0.5, 0)) = 1/2.

Given a probability-vector random variable Z, we define

    P_e(Z) = Σ_x P_e(x) · Pr(Z = x)    (22)

where the sum is over all probability vectors.

Consider P_e(R_ℓ). This corresponds to the probability of error

at a randomly selected edge at iteration ℓ. Richardson and Urbanke [28] proved a concentration theorem that states that as the block length approaches infinity, the bit-error rate at iteration ℓ

7“Quantization” here means the operation performed by a discrete quantizer,not in the context of Definition 2.



converges to a similarly defined probability of error. The convergence is in probability, exponentially in the block length. Replacing bit-error rate with symbol-error rate, this theorem carries over to coset GF(q) LDPC density evolution unchanged.

Let {P_e(R_ℓ)}, ℓ = 0, 1, 2, ..., be the sequence of error probabilities produced by density evolution. A desirable property of this sequence is given by the following theorem.

Theorem 2: P_e(R_ℓ) is nonincreasing with ℓ.

The proof of this theorem is similar to that of Theorem 7 of[29] and is omitted.
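Definition 6 and (22) translate directly into code; in the sketch below (our names), the sum in (22) is replaced by a Monte Carlo average over samples of a probability-vector random variable.

import numpy as np

def pe(x):
    # Probability of error when deciding by the maximum of x (Definition 6):
    # zero error if index 0 is the unique maximum; with k tied maxima that
    # include index 0, the decoder picks uniformly among them.
    ties = np.flatnonzero(np.isclose(x, x.max()))
    return 1.0 if 0 not in ties else 1.0 - 1.0 / len(ties)

print(pe(np.array([0.2, 0.5, 0.3])))   # 1.0
print(pe(np.array([0.5, 0.5, 0.0])))   # 0.5

# (22) by Monte Carlo: average pe over samples of a random probability vector Z.
rng = np.random.default_rng(2)
samples = rng.dirichlet(np.ones(3), size=10000)
print(np.mean([pe(z) for z in samples]))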

Finally, in Section V-B, we considered symmetry in the context of the message corresponding to a fixed underlying GF(q) LDPC code and across a fixed edge of its Tanner graph. We now consider its relevance in the context of density evolution, which assumes a random underlying LDPC code and a random edge.

Theorem 3: The random variables M_0, R_ℓ, and L_ℓ (for all ℓ) are symmetric.

The proof of this theorem is provided in Appendix IV-A.

B. Permutation-Invariance Induced by Labels

Permutation-invariance is a key property of coset GF(q) LDPC codes that allows the approximation of their densities using one-dimensional functionals, thus greatly simplifying their analysis. The definition is based on the permutation, induced by the operation ×g, on the elements of a probability vector.

Before we provide the definition, let us consider (10), by which a leftbound message is computed in the process of belief-propagation decoding. Let g_0 ∈ GF(q)\{0} be fixed, and consider

    Z = m_{c→v}^{×g_0}    (23)

With density evolution, the label g on the edge is a random variable, independent of the other labels, of the rightbound messages, and consequently of the convolution m̃_1 ⊛ ... ⊛ m̃_{d_c-1} in (13). Similarly, g·g_0 (where g_0 is fixed) is distributed identically with g, and is independent of that convolution. Thus, the random variable Z is distributed identically with m_{c→v}. This leads us to the following definition.

Definition 7: A probability-vector random variable X is permutation-invariant if for any fixed g ∈ GF(q)\{0}, the random variable X^{×g} is distributed identically with X.

Although this definition assumes plain-likelihood representation, it carries over straightforwardly to LLR representation, and the following lemma is easy to verify.

Lemma 7: Let $w$ be a probability-vector random variable and $W = \mathrm{LLR}(w)$. Then $w$ is permutation-invariant if and only if, for any fixed $g \in \mathrm{GF}(q) \setminus \{0\}$, the random variable $W^{\times g}$ is distributed identically with $W$.

To give an idea of why permutation-invariance is so useful, we now present two important lemmas involving permutation-invariant random variables. Both lemmas examine marginal random variables. The first lemma is valid for both probability-vector and LLR-vector representations.

Lemma 8: Let $w$ ($W$) be a probability-vector (LLR-vector) random variable. If $w$ ($W$) is permutation-invariant, then for any $i, j \neq 0$, the random variables $w_i$ and $w_j$ ($W_i$ and $W_j$) are identically distributed.

The proof of this lemma is provided in Appendix IV-B.

Lemma 9: Let $W$ be a symmetric LLR-vector random variable. Assume that $W$ is also permutation-invariant. Then, for all $i \neq 0$, the marginal $W_i$ is symmetric in the binary sense, as defined by (20).

Note that this lemma does not apply to plain-likelihood representation. The proof of the lemma is provided in Appendix IV-C. Consider the following definition.

Definition 8: Given a probability-vector random variable $W$, we define the random-permutation of $W$, denoted $\mathcal{P}(W)$, as the random variable equal to $W^{\times g}$ where $g$ is randomly selected from $\mathrm{GF}(q) \setminus \{0\}$ with uniform probability, and is independent of $W$.

The definition with LLR-vector representation is identical. The following lemma links permutation-invariance with random-permutation.

Lemma 10: A probability-vector (LLR-vector) random variable $W$ is permutation-invariant if and only if there exists a probability-vector (LLR-vector) random variable $V$ such that $W = \mathcal{P}(V)$.
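For concreteness, the following sketch implements the $\times g$ permutation and the random-permutation operation of Definition 8 for prime $q$ (the index map $i \mapsto g\,i \bmod q$ stands in for GF$(q)$ multiplication and is exact in that case):

```python
import numpy as np

q = 5
rng = np.random.default_rng(1)

def times_g(m, g):
    # component i of the permuted vector is taken from index (g * i) mod q;
    # note that index 0 is always left in place
    return m[(np.arange(q) * g) % q]

def random_permutation(m):
    # Definition 8: apply times_g with g uniform over the nonzero elements
    return times_g(m, rng.integers(1, q))

m = rng.dirichlet(np.ones(q))
print(m, times_g(m, 2), random_permutation(m))
# Permutation-invariance (Definition 7) asks that times_g(W, g) have the same
# law as W for every fixed nonzero g; Lemma 10 says this holds exactly for
# variables of the form random_permutation(V).
```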

In Appendix IV-E, we present some additional useful lemmas that involve permutation-invariance.

Finally, the following theorem discusses the relevance of permutation-invariance to the distributions tracked by density evolution.

Theorem 4: Let $R_0$, $L_\ell$, and $R_\ell$ be defined as in Section VI-A. Then we have the following.

1) $L_\ell$ is permutation-invariant.

2) Let $\tilde{R}_\ell = R_\ell^{\times g}$, where $g$ is the label on the edge associated with the message. Then $\tilde{R}_\ell$ is symmetric, permutation-invariant, and satisfies $P_e(\tilde{R}_\ell) = P_e(R_\ell)$.

3) Let $\tilde{R}_0$ be a random-permutation of $R_0$. Then replacing $R_0$ by $\tilde{R}_0$ in the computation of density evolution does not affect the densities of $L_\ell$ and $\tilde{R}_\ell$. The random variable $\tilde{R}_0$ is symmetric, permutation-invariant, and satisfies $P_e(\tilde{R}_0) = P_e(R_0)$.

The proof of this theorem is provided in Appendix IV-F. Although not all distributions involved in density evolution are permutation-invariant, Theorem 4 enables us to focus our attention on permutation-invariant random variables alone. Our interest in the distribution of the rightbound message $R_\ell$ is confined to


the error probability implied by it. Thus, we may instead examine $\tilde{R}_\ell$. Similarly, our interest in the initial message $R_0$ is confined to its effect on the distributions of $L_\ell$ and $\tilde{R}_\ell$. Thus, we may instead examine $\tilde{R}_0$.

C. Stability

The stability condition, introduced by Richardson et al. [29], is a necessary and sufficient condition for the probability of error to approach arbitrarily close to zero, assuming it has already dropped below some value at some iteration. Thus, this condition is an important aid in the design of LDPC codes with low error floors. In this subsection, we generalize the stability condition to coset GF$(q)$ LDPC codes.

Given a discrete memoryless channel with transition probabilities $P(y \mid x)$ and a mapping $\delta$, we define the channel parameter $\Delta$ given by (24). For example, consider an AWGN channel with a noise variance of $\sigma^2$. $\Delta$ for this case is obtained in a similar manner to that of [29, Example 12].

In Appendix IV-G, we present the concept of nondegeneracy for mappings and channels (taken from [1]). Under these assumptions, $\Delta$ is strictly smaller than $1$. We assume these nondegeneracy definitions in the following theorem.

Finally, we are now ready to state the stability condition for coset GF$(q)$ LDPC codes.

Theorem 5: Assume we are given the triplet $(\lambda, \rho, \delta)$ for a coset GF$(q)$ LDPC ensemble designed for a discrete-memoryless channel. Let $P_0$ denote the probability distribution function of $R_0$, the initial message of density evolution. Let $P_e^{(\ell)}$ denote the average probability of error at iteration $\ell$ under density evolution.

Assume that $\mathrm{E}\big[e^{-s Z_i}\big]$ is finite for $s$ in some neighborhood of zero (where $Z_i$ denotes element $i$ of the LLR representation of $R_0$). Then we have the following.

1) If $\lambda'(0)\rho'(1) > 1/\Delta$, then there exists a positive constant $\xi$ such that $P_e^{(\ell)} > \xi$ for all iterations $\ell$.

2) If $\lambda'(0)\rho'(1) < 1/\Delta$, then there exists a positive constant $\xi$ such that if $P_e^{(\ell)} < \xi$ at some iteration $\ell$, then $P_e^{(\ell)}$ approaches zero as $\ell$ approaches infinity.

Note that the finiteness requirement above is typically satisfied in channels of interest. The proof of Part 1 of the theorem is provided in Appendix V and the proof of Part 2 is provided in Appendix VI. Outlines of both proofs are provided below.

The proof of Part 1 is a generalization of a proof provided by Richardson et al. [29]. The proof of [29] begins by observing that since the distributions at some iteration $\ell$ are symmetric, they may equivalently be modeled as APP values corresponding to the outputs of an MBIOS channel. By an erasure decomposition lemma, the output of an MBIOS channel can be modeled as the output of a degraded erasure channel. The proof proceeds by replacing the distributions at iteration $\ell$ by erasure-channel equivalents, and shows that the probability of error with the new distributions is lower-bounded by some nonzero constant. Since the true MBIOS channel is a degraded version of the erasure channel, the true probability of error must be lower-bounded by the same nonzero constant as well.

Returning to the context of coset GF$(q)$ LDPC codes, we first observe that by Theorem 1 the random variable at iteration $\ell$ is symmetric and hence, by Lemma 6, it can be modeled as APP values of the outputs of a cyclic-symmetric channel. We then show that any cyclic-symmetric channel can be modeled as a degraded erasurized channel, appropriately defined. The continuation of the proof follows in the lines of [29].

The proof of Part 2 is a generalization of a proof by Khandekar [20]. As in [20] (and also [6]), our proof tracks a one-dimensional functional of the distribution of a message $m$, denoted $D(m)$. We show that the rightbound messages at two consecutive iterations satisfy a recursive bound relating $D(R_{\ell+1})$ to $D(R_\ell)$. Using first-order Taylor expansions of $\lambda$ and $\rho$, we proceed to show that, since $\lambda'(0)\rho'(1) < 1/\Delta$ by the theorem's conditions, for small enough $D(R_\ell)$ we have $D(R_{\ell+1}) \leq c \cdot D(R_\ell)$ where $c < 1$, and thus $D(R_\ell)$ descends to zero. Further details, including the relation between $D(m)$ and $P_e(m)$, are provided in Appendix VI.

D. Gaussian Approximation

With binary LDPC, Chung et al. [9] observed that the rightbound messages of density evolution are well approximated by Gaussian random variables. Furthermore, the symmetry of the messages in binary LDPC decoding implies that the mean $\mu$ and variance $\sigma^2$ of the random variable are related by $\sigma^2 = 2\mu$. Thus, the distribution of a symmetric Gaussian random variable may be described by a single parameter: $\sigma$. This property was also observed by ten Brink et al. [35] and is essential to their development of EXIT charts. In the context of nonbinary LDPC codes, Li et al. [22] obtained a description of the $(q-1)$-dimensional messages, under a Gaussian assumption, by $q-1$ parameters.

In the following theorem, we use symmetry and permutation-invariance as defined in Sections V-B and VI-B to reduce the number of parameters from $q-1$ to one. This is a key property that enables the generalization of EXIT charts to coset GF$(q)$ LDPC codes.

Note that the theorem assumes a continuous Gaussian distribution. The definition of symmetry for LLR-vector random variables (Lemma 2) is extended to continuous distributions by replacing the probability function in (19) with a probability density function.

Theorem 6: Let $Z$ be an LLR-vector random variable, Gaussian distributed with a mean $m$ and covariance matrix $\Sigma$. Assume that the probability density function of $Z$


exists and that $\Sigma$ is nonsingular. Then $Z$ is both symmetric and permutation-invariant if and only if there exists $\sigma \geq 0$ such that

$$m = \frac{\sigma^2}{2}\,\mathbf{1}, \qquad \Sigma = \frac{\sigma^2}{2}\left(I + \mathbf{1}\mathbf{1}^{T}\right) \tag{25}$$

That is, $m_i = \sigma^2/2$ for all $i$, and $\Sigma_{ij} = \sigma^2$ if $i = j$ and $\Sigma_{ij} = \sigma^2/2$ otherwise.

The proof of this theorem is provided in Appendix VII. A Gaussian symmetric and permutation-invariant random variable is thus completely described by a single parameter $\sigma$. In Sections VII-B and VII-D, we discuss the validity of the Gaussian assumption with coset GF$(q)$ LDPC codes.
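The structure asserted by Theorem 6 is easy to instantiate numerically. A minimal sketch, assuming the $(q-1)$-dimensional convention for LLR vectors:

```python
import numpy as np

# Sample the symmetric, permutation-invariant Gaussian of Theorem 6 for a
# given sigma: every component has mean sigma^2/2, the diagonal of the
# covariance is sigma^2, and the off-diagonal entries are sigma^2/2.
def symmetric_pi_gaussian(sigma, q, n, rng):
    d = q - 1
    mean = np.full(d, sigma ** 2 / 2)
    cov = np.full((d, d), sigma ** 2 / 2) + np.eye(d) * (sigma ** 2 / 2)
    return rng.multivariate_normal(mean, cov, size=n)

rng = np.random.default_rng(0)
Z = symmetric_pi_gaussian(sigma=2.0, q=4, n=100_000, rng=rng)
print(Z.mean(axis=0))   # ~ [2, 2, 2]      (sigma^2 / 2)
print(np.cov(Z.T))      # ~ 4 on the diagonal, 2 off it (cf. Lemma 9 marginals)
```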

VII. DESIGN OF COSET GF$(q)$ LDPC CODES

With binary LDPC codes, design of edge distributions is frequently done using EXIT charts [35]. EXIT charts are particularly suited for designing LDPC codes for AWGN channels. In this section, we develop EXIT charts for coset GF$(q)$ codes. We assume throughout the section transmission over AWGN channels.

A. EXIT Charts

Formally, EXIT charts track the mutual information between the transmitted code symbol at an average variable node8 and the rightbound (leftbound) message transmitted across an edge emanating from it. If this information is zero, then the message is independent of the transmitted code symbol and thus the probability of error is $(q-1)/q$. As the information approaches $1$, the probability of error approaches zero. Note that we assume that the base of the $\log$ function in the mutual information is $q$, and thus the mutual information takes values in $[0, 1]$.

$I(C; m)$ is taken to represent the distribution of the message $m$. That is, unlike density evolution, where the entire distribution of the message at each iteration is recorded, with EXIT charts $I(C; m)$ is assumed to be a faithful surrogate for the distribution (we will shortly elaborate on how this is done).

With EXIT charts, two curves (functions) are computed: the variable-node decoder (VND) curve and the check-node decoder (CND) curve, corresponding to the rightbound and leftbound steps of density evolution, respectively. The argument to each curve is denoted $I_A$ and the value of the curve is denoted $I_E$. With the VND curve, $I_A$ is interpreted as equal to the functional $I(C; \cdot)$ when applied to the distribution of the leftbound messages at a given iteration $\ell$. The output $I_E$ is interpreted as equal to $I(C; R)$, where $R$ is the rightbound message produced at the following rightbound iteration. With the CND curve, the opposite occurs.

Note that unlike density evolution, where the densities are tracked from one iteration to another, the VND and CND curves are evaluated for every possible value of their argument $I_A$. However, a decoding trajectory that produces an approximation of the functionals $I(C; L_\ell)$ and $I(C; R_\ell)$ at each iteration may be computed (see [36] for a discussion of the trajectory).

8 In Definition 1, the notation $C$ was used to denote a code rather than a codeword symbol. The distinction between the two meanings is to be made based on the context of the discussion.

The decoding process is predicted to converge if, after each decoding iteration (comprised of a leftbound and a rightbound iteration), the resulting $I(C; R_{\ell+1})$ is increased in comparison to $I(C; R_\ell)$ of the previous iteration. We therefore require that the composition of the two curves increase the mutual information for all $I \in [0, 1)$. Equivalently, $I_{EV}(I) > I_{EC}^{-1}(I)$, where $I_{EV}$ and $I_{EC}$ denote the VND and CND curves. In an EXIT chart, the CND curve is plotted with its $I_A$ and $I_E$ axes reversed (see, for example, Fig. 7). The decoding process is thus predicted to converge if and only if the VND curve is strictly greater than the reversed-axes CND curve.
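The convergence test reduces to a pointwise comparison on a grid. A minimal numerical sketch, with toy stand-in curves:

```python
import numpy as np

# The VND curve must lie strictly above the reversed-axes CND curve.
# vnd and cnd stand for already-computed monotonically increasing curves on
# [0, 1); the two lambdas below are illustrative, not from a real design.
def predicts_convergence(vnd, cnd, grid):
    cnd_vals = np.array([cnd(i) for i in grid])       # increasing in i
    cnd_inv = lambda i: np.interp(i, cnd_vals, grid)  # reversed-axes CND curve
    return all(vnd(i) > cnd_inv(i) for i in grid)

grid = np.linspace(0.0, 0.99, 200)
print(predicts_convergence(lambda i: 0.6 + 0.4 * i, lambda i: i ** 2, grid))
```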

B. Using $I(C; m)$ as a Surrogate

Let $m$ be a leftbound or rightbound message at some iteration of belief propagation. Strictly speaking, an approximation of $I(C; m)$ requires not only the knowledge of the distribution of $m$ but primarily the knowledge of the conditional distribution $\Pr[m \mid C = c]$ for all $c$ (we assume that $C$ is uniformly distributed). However, as shown in Lemma 17 (Appendix III-A), the messages of the coset GF$(q)$ LDPC decoder satisfy a symmetry relation that determines every conditional distribution from the one conditioned on $C = 0$. Thus, we may restrict ourselves to an analysis of the conditional distribution $\Pr[m \mid C = 0]$.

Lemma 11: Under the tree assumption, the above-defined $I(C; m)$ satisfies the expression given in (26).

The proof of this lemma is provided in Appendix VIII-A. Note that by Lemma 16 (Appendix III-A), we may replace the conditioning on $C = 0$ in (26) by a conditioning on the transmission of the all-zero codeword. In the remainder of this section, we will assume that all distributions are conditioned on the all-zero codeword assumption.

In their development of EXIT charts for binary LDPC codes, ten Brink et al. [35] confine their attention to LLR message distributions that are Gaussian and symmetric. Under these assumptions, a message distribution is uniquely described by its variance $\sigma^2$. For every value of $\sigma$, they evaluate (26) (with $q = 2$) when applied to the corresponding Gaussian distribution. The result, denoted $J(\sigma)$, is shown to be monotonically increasing in $\sigma$. Thus, $J^{-1}(\cdot)$ is well defined. Given $I(C; m)$, $J^{-1}$ can be applied to obtain the $\sigma$ that describes the corresponding distribution of $m$. Thus, $I(C; m)$ uniquely defines the entire distribution of $m$.
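For reference, the binary $J(\sigma)$ function and its inverse can be approximated as follows (a Monte Carlo sketch of the $q = 2$ case; the sample size and bisection tolerances are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def J(sigma, n=400_000):
    # mutual information of a symmetric Gaussian LLR Z ~ N(sigma^2/2, sigma^2)
    # under the all-zero assumption: J = 1 - E[log2(1 + exp(-Z))]
    if sigma <= 0:
        return 0.0
    z = rng.normal(sigma ** 2 / 2, sigma, size=n)
    return 1.0 - np.mean(np.log2(1.0 + np.exp(-z)))

def J_inv(I, lo=1e-3, hi=40.0):
    # J is monotonically increasing, so invert it by bisection
    while hi - lo > 1e-3:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if J(mid) < I else (lo, mid)
    return (lo + hi) / 2

print(J(1.0), J_inv(J(1.0)))   # approximately recovers sigma = 1.0
```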

The Gaussian assumption is not strictly true. With binary LDPC codes, assuming transmission over an AWGN channel, the distributions of rightbound messages are approximately Gaussian mixtures (with irregular codes). The distributions of the leftbound messages resemble "spikes." The EXIT method in [35] nonetheless continues to model the distributions as Gaussian. Simulation results are provided, which indicate that this approach still produces a very close prediction of the performance of binary LDPC codes.

With coset GF$(q)$ LDPC codes, we discuss two methods for designing EXIT charts. The first method models the LLR-vector message distributions as Gaussian random variables, following


the example of [35]. This modeling also enables us to evaluate the VND and CND curves using approximations that were developed in [35], thus greatly simplifying their computation.

However, the modeling of the rightbound message distributions of coset GF$(q)$ LDPC codes as Gaussian is less accurate than it is with binary LDPC codes. As we will explain in Section VII-D, this results from the distribution of the initial messages, which is not Gaussian even on an AWGN channel. In Section VII-D, we will therefore develop an alternative approach, which models the rightbound distributions more accurately. We will then apply this approach in Section VII-E to produce an alternative method for computing EXIT charts. With this method, the VND and CND curves are more difficult to compute. However, the method produces codes with approximately 1 dB better performance.

C. Computation of EXIT Charts, Method 1

With this method, we confine our attention to distributions that are permutation-invariant,9 symmetric, and Gaussian. By Theorem 6, under these assumptions, a $(q-1)$-dimensional LLR-vector message distribution is uniquely defined by a parameter $\sigma$. We proceed to define $J(\sigma)$ in a manner similar to that of [35]. In Appendix VIII-D, we show that $J(\sigma)$ is monotonically increasing and thus $J^{-1}(\cdot)$ is well defined. Given $I(C; m)$, the distribution of $m$ may be obtained in the same way as in [35].

We use the following method to compute the VND and CND curves, based on a development of ten Brink et al. [35] for binary LDPC codes (a code sketch of both curves follows this list).

1) The VND curve. By (15), a rightbound message is a sum of incoming leftbound messages and an initial message. Let $I_A$ and $I_0$ denote the mutual-information functionals of the incoming leftbound messages and initial messages, respectively. By Lemma 5, $I_0$ equals the equiprobable-signaling capacity of the channel with the mapping $\delta$. It may be obtained by numerically evaluating the capacity expression defined in Section V-C.

For each left-degree $d$, we let $I_{EV,d}(I_A)$ denote the value of the VND curve when confined to the distribution of rightbound messages across edges whose left degree is $d$. We now employ the following approximation, which holds under the tree assumption, when both the initial and the incoming leftbound messages are Gaussian:

$$I_{EV,d}(I_A) = J\!\left(\sqrt{(d-1)\,[J^{-1}(I_A)]^2 + [J^{-1}(I_0)]^2}\,\right).$$

The validity of this approximation relies on the observation that a rightbound message (15) is equal to a sum of $d - 1$ i.i.d. leftbound messages and an independently distributed initial message (under the tree assumption). $[J^{-1}(I_A)]^2$ is the variance of each of the leftbound messages and $[J^{-1}(I_0)]^2$ is the variance of the initial message, and hence, the variance of the rightbound message is $(d-1)[J^{-1}(I_A)]^2 + [J^{-1}(I_0)]^2$.

9 Strictly speaking, rightbound messages are not permutation-invariant. However, in Appendix VIII-B, we show that this does not pose a problem to the derivation of EXIT charts.

2) The CND curve. Let $I_{EC,d}(I_A)$ denote the value of the CND curve when confined to the distribution of leftbound messages across edges whose right degree is $d$. We use the approximation

$$I_{EC,d}(I_A) = 1 - J\!\left(\sqrt{d - 1}\; J^{-1}(1 - I_A)\right).$$

This approximation is based on a similar approximation that was used in [35] and relies on Sharon et al. [31]. In the context of coset GF$(q)$ LDPC codes, we have verified its effectiveness empirically.
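Combining the two approximations with the $J$/$J^{-1}$ sketch above gives a compact implementation of the per-degree curves. These binary formulas are stand-ins for illustration; the exact $q$-ary curves are computed from (26):

```python
# Assumes the J and J_inv helpers from the previous sketch are defined.
import numpy as np

def I_EV(IA, d, I0):
    # variance of (d-1) i.i.d. leftbound messages plus the initial message
    return J(np.sqrt((d - 1) * J_inv(IA) ** 2 + J_inv(I0) ** 2))

def I_EC(IA, d):
    # dual approximation for right-degree-d check nodes
    return 1.0 - J(np.sqrt(d - 1) * J_inv(1.0 - IA))

print(I_EV(0.5, d=3, I0=0.4), I_EC(0.5, d=6))
```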

Given an edge distribution pair $(\lambda, \rho)$, we have

$$I_{EV}(I_A) = \sum_{d} \lambda_d\, I_{EV,d}(I_A), \qquad I_{EC}(I_A) = \sum_{d} \rho_d\, I_{EC,d}(I_A). \tag{27}$$

Code design may be performed by fixing the right distribution $\rho$ and computing $\lambda$. Like [35], the following constraints are used in the design.

1) $\lambda$ is required to be a valid probability vector. That is, $\lambda_d \geq 0$ for all $d$, and $\sum_d \lambda_d = 1$.

2) To ensure decoding convergence, we require $I_{EV}(I) > I_{EC}^{-1}(I)$ (as explained in Section VII-A) for all $I$ belonging to a discrete, fine grid over $[0, 1)$.

The design process seeks to maximize $\sum_d \lambda_d / d$, which by (6) is equivalent to maximizing the design rate of the code. Typically, this can be done using a linear program. A similar process can be used to design $\rho$ with $\lambda$ fixed.
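The resulting linear program is small. A sketch using scipy, built on the per-degree curve helpers above; the left degrees, the check degree, and the value of $I_0$ are illustrative, and the LP may be infeasible at a poor operating point, in which case one adjusts the channel parameter and retries:

```python
import numpy as np
from scipy.optimize import linprog

degrees, dc, I0 = [2, 3, 4, 10], 8, 0.4
grid = np.linspace(0.02, 0.98, 60)

cnd_vals = np.array([I_EC(I, dc) for I in grid])
cnd_inv = np.interp(grid, cnd_vals, grid)         # reversed-axes CND curve

c = [-1.0 / d for d in degrees]                   # maximize sum_d lambda_d / d
A_ub = [[-I_EV(I, d, I0) for d in degrees] for I in grid]
b_ub = [-ci - 1e-4 for ci in cnd_inv]             # sum_d lambda_d I_EV_d(I) > cnd_inv(I)
res = linprog(c, A_ub=A_ub, b_ub=b_ub,
              A_eq=[np.ones(len(degrees))], b_eq=[1.0], bounds=(0, 1))
print(res.status, res.x if res.success else None) # res.x holds the designed lambda
```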

D. More Accurate Modeling of Message Distributions

We now provide a more accurate model for the rightbound messages, as mentioned in Section VII-B. We focus, for simplicity, on regular LDPC codes. Observe that the computation of the rightbound message using (15) involves the summation of $d - 1$ i.i.d. leftbound messages. This sum is typically well approximated by a Gaussian random variable.10 To this sum, the initial message is added. With binary LDPC codes, transmission over an AWGN channel results in an initial message which is also Gaussian distributed (assuming the all-zero codeword was transmitted). Thus, the rightbound messages are very closely approximated by a Gaussian random variable.

With coset GF$(q)$ LDPC codes, the initial message is not well approximated by a Gaussian random variable, as illustrated in the following lemma.

Lemma 12: Consider the initial message produced at some variable node, under the all-zero codeword assumption, using LLR representation. Assume the transmission is over an AWGN channel with noise variance $\sigma^2$ and with a mapping $\delta$. Let the coset symbol at the variable node be $v$. Then the initial message is given by $a \cdot n + b$, where $n$ is the noise

10 Quantification of the quality of the approximation is beyond the scope of this discussion. "Well approximated" is to be understood in a heuristic sense, in the context of suitability to design using EXIT charts.


Fig. 5. Empirical distributions of the messages of a (3, 6) ternary coset LDPC code. (a) Initial messages. (b) Leftbound messages. (c) Sum of leftbound messages, prior to the addition of the initial message. (d) Rightbound messages.

produced by the channel and $a$ and $b$ are $(q-1)$-dimensional vectors, dependent on $v$, whose components are given by

$$a_i = \frac{\delta(v) - \delta(i + v)}{\sigma^2}, \qquad b_i = \frac{\big(\delta(v) - \delta(i + v)\big)^2}{2\sigma^2}.$$

The proof of this lemma is straightforward from the observation that the received channel output is $y = \delta(v) + n$.

In our analysis, we assume a random coset symbol $v$ that is uniformly distributed in GF$(q)$. Thus, $a$ and $b$ are random variables, whose values are determined by the mapping $\delta$ and by the noise variance $\sigma^2$. The distribution of the channel noise is determined by $\sigma^2$. The distribution of the initial messages is therefore determined by $\delta$ and $\sigma^2$.
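Lemma 12 gives a direct recipe for sampling initial messages. A minimal sketch, with an arbitrary example mapping and integer addition mod $q$ standing in for GF$(q)$ addition (exact for prime $q$):

```python
import numpy as np

q, sigma = 5, 0.8
delta = np.array([-1.2, -0.6, 0.0, 0.6, 1.2])    # example mapping GF(q) -> reals
rng = np.random.default_rng(0)

def initial_messages(n_samples):
    # under the all-zero codeword, y = delta(v) + n; component i of the LLR
    # message is a_i * n + b_i with the components of Lemma 12
    v = rng.integers(0, q, n_samples)            # uniform random coset symbols
    n = rng.normal(0.0, sigma, n_samples)
    i = np.arange(1, q)                          # nonzero LLR indices
    diff = delta[v][:, None] - delta[(i[None, :] + v[:, None]) % q]
    a, b = diff / sigma ** 2, diff ** 2 / (2 * sigma ** 2)
    return a * n[:, None] + b

Z = initial_messages(100_000)
# Each coset symbol contributes a one-dimensional line in LLR space, producing
# the mixture of one-dimensional Gaussian curves seen in Fig. 5(a).
```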

Fig. 5 presents the empirical distribution of LLR messages at several stages of the decoding process, as observed by simulations. The code was a $(3, 6)$ coset GF$(3)$ LDPC code. Since $q = 3$, the LLR messages in this case are two-dimensional. The distribution of the initial messages (Fig. 5(a)) is seen to be a mixture of one-dimensional Gaussian curves, as predicted by Lemma 12. The leftbound messages at the first iteration are shown in Fig. 5(b). We model their distribution as Gaussian, although it resembles a "spike" and not the distribution of a Gaussian random variable (this situation is similar to the one with binary LDPC [9]). Fig. 5(c) presents the sum of leftbound messages computed in the process of evaluating (15). As predicted, this sum is well approximated by a Gaussian random variable. Finally, the rightbound messages at the first iteration are given in Fig. 5(d).

Following the above discussion, we model the distribution of the rightbound messages as the sum of two random vectors. The first is distributed as the initial messages above, and the second (the intermediate sum of leftbound messages) is modeled as Gaussian.11

The intermediate value (the second random variable) is symmetric and permutation-invariant. This may be seen from the fact that the leftbound messages are symmetric and permutation-invariant (by Theorems 3 and 4) and from Lemmas 18 (Appendix III-E) and 22 (Appendix IV-E). Thus, by Theorem 6, it is characterized by a single parameter $\sigma_G$.

Summarizing, the approximate distribution of rightbound messages is determined by three parameters: $\delta$ and $\sigma$, which determine the distribution of the initial message, and $\sigma_G$, which determines the distribution of the intermediate value.

11 Note that with irregular codes, the number of i.i.d. leftbound variables that is summed is a random variable itself (distributed as $\{\lambda_d\}$), and thus the distribution of this random variable resembles a Gaussian mixture rather than a Gaussian random variable. However, we continue to model it as Gaussian, following the example that was set with binary codes [35].


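The method-2 model can then be sampled by combining the two pieces. A sketch reusing the initial_messages helper from the previous sketch; the value of $\sigma_G$ is an arbitrary illustration:

```python
import numpy as np

def rightbound_samples(n, sigma_G, rng):
    # Lemma-12 initial message plus a symmetric, permutation-invariant Gaussian
    # intermediate sum with parameter sigma_G (Theorem 6)
    d = q - 1
    mean = np.full(d, sigma_G ** 2 / 2)
    cov = np.full((d, d), sigma_G ** 2 / 2) + np.eye(d) * (sigma_G ** 2 / 2)
    return initial_messages(n) + rng.multivariate_normal(mean, cov, size=n)

R = rightbound_samples(50_000, sigma_G=1.5, rng=np.random.default_rng(1))
# R is determined by the three parameters of the text: delta and sigma
# (through initial_messages) and sigma_G (through the Gaussian intermediate).
```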

E. Computation of EXIT Charts, Method 2

The second method for designing EXIT charts differs from the first (Section VII-C) in its modeling of the initial and rightbound message distributions, following the discussion in Section VII-D. We continue, however, to model the leftbound messages as Gaussian.

For every value of $\sigma_G$, we define $J_{\delta,\sigma}(\sigma_G)$ ($\delta$ and $\sigma$ are fixed parameters) in a manner analogous to $J(\sigma)$ as discussed in Section VII-C. That is, $J_{\delta,\sigma}(\sigma_G)$ equals (26) when applied to the rightbound distribution corresponding to $\delta$, $\sigma$, and $\sigma_G$. In an EXIT chart, $\delta$ and $\sigma$ are fixed. The remaining parameter that determines the rightbound distribution is thus $\sigma_G$, and $J_{\delta,\sigma}^{-1}(\cdot)$ is well defined.12 The computation of $J_{\delta,\sigma}$ and $J_{\delta,\sigma}^{-1}$ is discussed in Appendix VIII-E.

The following method is used to compute the VND and CND curves.

1) The VND curve. For each left-degree $d$, we evaluate $I_{EV,d}(I_A)$ (defined in a manner analogous to that of Section VII-C) using the following approximation:

$$I_{EV,d}(I_A) = J_{\delta,\sigma}\!\left(\sqrt{d - 1}\; J^{-1}(I_A)\right).$$

2) The CND curve. Let $I_{EC,d}(I_A)$ be defined in a manner analogous to that of Section VII-C. The parameters $\delta$ and $\sigma$ are used in conjunction with $J_{\delta,\sigma}^{-1}(I_A)$ to characterize the distribution of the rightbound messages at the input of the check nodes. The computation of $I_{EC,d}(I_A)$ is done empirically and is elaborated in Appendix VIII-F.

Given an edge distribution pair $(\lambda, \rho)$, we evaluate $I_{EV}(I_A)$ and $I_{EC}(I_A)$ from the above-computed $I_{EV,d}(I_A)$ and $I_{EC,d}(I_A)$ using expressions similar to (27). Note that $J_{\delta,\sigma}$ needs to be computed once for each choice of $\delta$ and $\sigma$. $I_{EC,d}$ needs to be computed also for each value of $d$. $J(\sigma)$ needs to be computed once for each choice of $q$.

Design of the edge distributions $\lambda$ and $\rho$ may be performed by linear programming in the same manner as in Section VII-C. Further details are provided in Section VII-F below.

F. Design Examples

We designed codes for spectral efficiencies of 6 bits/s/Hz (3 bits/dimension) and 8 bits/s/Hz (4 bits/dimension) over the AWGN channel. In all our constructions, we used method 2 (Section VII-E) to compute the EXIT charts. Our Matlab source code is provided in [4].

12 See Appendix VIII-E for a more accurate discussion of this matter.

For the code at 6 bits/s/Hz, we set the alphabet size at $q = 64$. We used a nonuniformly spaced signal constellation (following the discussion of Section III-C). The constellation was obtained by applying the following method, which is a variation of a method suggested by Sun and van Tilborg [33]. First, the unique constellation points were computed so as to satisfy a pair of conditions adapted from [33]. The signal constellation was obtained by scaling the result so that the average energy was $1$. The mapping from the code alphabet is given below, with its elements listed in ascending order using the representation of GF$(q)$ elements as binary numbers. Note, however, that our simulations indicate that for a given constellation, different mappings typically render the same performance.

We fixed $\rho$ and iteratively applied linear programming, first to obtain $\lambda$, and then, fixing $\lambda$, to obtain a better $\rho$.

Rather than require $I_{EV}(I) > I_{EC}^{-1}(I)$ as in Sections VII-A and VII-C, we enforced a more stringent condition when designing $\lambda$. We required $I_{EV}(I) > I_{EC}^{-1}(I) + \epsilon(I)$, where the margin $\epsilon(I)$ is a piecewise-constant function that is positive on part of the range of $I$ and zero elsewhere. Similarly, when designing $\rho$, we required an analogous margin. After a few linear programming iterations, we obtained the edge distributions of the final design.

The code rate is $1/2$ GF$(64)$ symbols per channel use, equal to 3 bits/channel use, and a spectral efficiency of 6 bits/s/Hz. Interestingly, this code is right-irregular, unlike typical binary LDPC codes. Fig. 6 presents the EXIT chart for the code (computed by method 2). Note that the CND curve in Fig. 6 does not begin at the origin. This is discussed in Appendix VIII-F.

Simulation results (50 simulation runs) indicate successful decoding at an SNR of 18.55 dB, with decoding typically converging after approximately 150-200 iterations. The unrestricted Shannon limit (i.e., not restricted to any signal constellation) at this rate is 17.99 dB, and thus our gap from this limit is 0.56 dB. This result is well beyond the shaping gap, which at 6 bits/s/Hz is approximately 1.1 dB.


Fig. 6. An EXIT chart, computed using method 2, for a code at a spectral efficiency of 6 bits/s/Hz and an SNR of 18.5 dB.

We can obtain some interesting insight on these figures by considering the equiprobable-signaling Shannon limit for our constellation (defined based on the equiprobable-signaling capacity, which was introduced in Section V-C). At 6 bits/s/Hz, this limit equals 18.25 dB. The equiprobable-signaling Shannon limit is the best we can hope for with any design method for the edge distributions of our code. The gap between our code's threshold and this limit is just 0.3 dB, indicating the effectiveness of our EXIT chart design method.

The equiprobable-signaling Shannon limit for a 32-PAM constellation at 6 bits/s/Hz is 19.11 dB. The gap between this limit and the above-discussed limit for our constellation is 0.86 dB. This is the shaping gain obtained from the use of a nonuniform signal constellation.

For the code at 8 bits/s/Hz, we set the alphabet size at $q = 256$. We used the same method to construct a nonuniformly spaced signal constellation. The mapping to the signal constellation is given below.

We fixed $\rho$ and applied one iteration of linear programming to obtain $\lambda$. The code rate is $1/2$ GF$(256)$ symbols/channel use, equal to 4 bits/channel use, and a spectral efficiency of 8 bits/s/Hz. Fig. 7 presents the EXIT charts for the code using the two methods.

Simulation results indicate successful decoding at an SNR of 25.06 dB over the AWGN channel; decoding typically converged after approximately 70 iterations, and the symbol error rate, after 100 simulations, was exactly zero. We also applied an approximation of density evolution by Monte Carlo simulations, as mentioned in Section VI-A, and obtained similar results. The gap between our code's threshold and the unrestricted Shannon limit, which at 8 bits/s/Hz is 24.06 dB, is 1 dB. This result is beyond the shaping gap, which at 8 bits/s/Hz is 1.3 dB. The equiprobable-signaling Shannon limit for our signal constellation at 8 bits/s/Hz is 24.34 dB. The gap between our code's threshold and this limit is thus only 0.72 dB.

VIII. COMPARISON WITH OTHER BANDWIDTH-EFFICIENT CODING SCHEMES

The simulation results presented in Section VII-F indicate that coset GF$(q)$ LDPC codes have remarkable performance over bandwidth-efficient channels. In this section, we compare their performance with multilevel coding using binary LDPC component codes and with turbo-TCM.

A. Comparison With MLC

Hou et al. [18] presented simulations for MLC over the AWGN channel at a spectral efficiency of 2 bits/s/Hz (equal to 1 bit/dimension), using a 4-PAM constellation. The equiprobable-signaling Shannon limit13 for 4-PAM at this rate is 5.12 dB (SNR). Their best results were obtained using multistage decoding (MSD). At the block length they considered, their best code is capable of transmission within 1 dB of the Shannon limit at a low average bit-error rate (BER). It is composed of binary LDPC component codes with maximum left-degrees of 15.

We designed edge distributions for two coset GF$(4)$ LDPC codes at the same spectral efficiency, signal constellation, and BER as [18]. Our simulations at a comparable block length indicate that our first code is capable of transmission within 0.55 dB of the Shannon limit (100 simulations), and thus has a substantial advantage over the above MLC LDPC code, which is capable of transmission only within 1 dB of the Shannon limit.

The above code has obtained its superior performance at the price of increased decoding complexity, in comparison with the MLC code of [18]. We also designed a second code, with a lower decoding complexity, in order to compare the two schemes when the complexity is restricted. Our simulation results indicate that this code is capable of reliable transmission within 0.8 dB of the Shannon limit. The code's maximum left-degree is lower than that of the MLC code of [18]. Consequently, it has a lower level of connectivity in its Tanner graph, implying

13 Throughout this section, we assume equiprobable signaling whenever we refer to the Shannon limit.


Fig. 7. EXIT charts for a code at a spectral efficiency of 8 bits/s/Hz. (a) An EXIT chart computed using method 1 at an SNR of 26.06 dB. (b) An EXIT chart computed using method 2 at an SNR of 25.06 dB.

that its slightly better performance was achieved at a comparable decoding complexity. A precise comparison between the decoding complexities of the two codes must account for the entire edge distributions (rather than just the maximum left-degrees), and for the number of decoding iterations. Such a comparison is beyond the scope of this work.

Hou et al. [18] also experimented at a large block length. Their best code is capable of transmission within 0.14 dB of the Shannon limit. At a slightly smaller block length, our above-discussed first code is capable of transmission within 0.2 dB of the Shannon limit (14 simulations), and thus has slightly inferior performance. This may be attributed either to the smaller block length that we used, or to the availability of density evolution for the design of binary MLC component LDPC codes at large block lengths.

Hou et al. [18] obtained their remarkable performance at large block lengths also at the price of increased decoding complexity (the maximum left-degrees of their component codes are high). It could be argued that increasing the decoding complexity could produce improved performance also at the above-mentioned shorter block length. We believe this not to be true, because increasing the maximum left-degree would also result in an increase in the Tanner graph connectivity. This, at short block lengths, would dramatically increase the number of cycles in the graph, thus reducing performance.

Summarizing, our simulations indicate that coset GF$(q)$ LDPC codes have an advantage over MLC LDPC codes at short block lengths in terms of the gap from the Shannon limit. This result assumes no restriction on decoding complexity. The simulations also indicate that when decoding complexity is restricted, both schemes admit comparable performance. In this case, however, further research is required in order to provide a more accurate comparison of the two schemes.

B. Comparison With Turbo-TCM

Robertson and Wörz [30] experimented with turbo-TCM at several spectral efficiencies and block lengths. The highest spectral efficiency they experimented at was 5 bits/s/Hz. They used a 64-QAM constellation, and their best results were achieved at a block length of 3000 QAM symbols, at an SNR of about 16.85 dB. The equiprobable-signaling Shannon limit at 5 bits/s/Hz is 16.14 dB, and thus their result is within approximately 0.7 dB of the Shannon limit.

We experimented with an 8-PAM constellation and a block length of 6000 PAM symbols, which are the one-dimensional equivalents of two-dimensional 64-QAM and of 3000 QAM symbols. Simulation results for our code indicate a low symbol error rate at an SNR of 16.6 dB (100 simulations). This result is within 0.46 dB of the Shannon limit, and thus exceeds the above result of 0.7 dB.

IX. CONCLUSION

A. Suggestions for Further Research

1) Nonuniform labels. The labels of GF$(q)$ LDPC codes, as defined in Section III-A, are randomly selected from GF$(q) \setminus \{0\}$ with uniform probability. Davey and MacKay [10], in their work on GF$(q)$ LDPC codes for binary channels, suggested selecting them differently. It would be interesting to investigate their approach (and possibly other approaches to the selection of the labels) when applied to coset GF$(q)$ LDPC codes for nonbinary channels.

2) Density evolution. In Section VI-A, we discussed the difficulty in efficiently computing density evolution for nonbinary codes. An assumption in that discussion is that the densities would be represented on a uniform grid (assuming LLR-vector representation), requiring an amount of memory that is exponential in $q$. However, a more efficient approach would be to experiment with other forms of quantization, perhaps tailored to each density. We have tried applying the Lloyd-Max algorithm to design such quantizers for each density. However, the computation of the algorithm, coupled with the actual application of the quantizer, are too


computationally complex. An alternative approach would perhaps make use of a Gaussian approximation as described in Section VI-D to design effective quantizers.

3) Other surrogates for distributions. In [6], a scalar functional of a message $m$ of a binary LDPC decoder was used to lower-bound (rather than approximate) the asymptotic performance of binary LDPC codes. It would be interesting to find a similar scalar functional that can be used to bound the performance of coset GF$(q)$ LDPC codes. Another possibility is to experiment with the functional $D(\cdot)$, which is defined in Appendix VI.

4) Comparison with the $q$-ary erasure channel (QEC). In a QEC, the output symbol is equal to the input with a probability of $1 - p$ and to an erasure with a probability of $p$. Much of the analysis of Luby et al. [23] for LDPC codes over binary erasure channels is immediately applicable to GF$(q)$ LDPC codes over QEC channels. It may be possible to gain insight on coset GF$(q)$ LDPC codes from an analysis of their use over the QEC.

5) Better mappings. The mapping function $\delta$ that was presented in Section VII-F was designed according to a concept that was developed heuristically. Further research may perhaps uncover better mapping methods.

6) Additional channels. The development in Section VII focuses on AWGN channels. It would be interesting to extend this development to additional types of channels.

7) Additional applications. In [3], coset GF$(q)$ LDPC codes were used for transmission over the binary dirty-paper channel. Applying an appropriately designed quantization mapping (as discussed in Section III-C), a binary code was produced whose codewords' empirical distribution was approximately Bernoulli with the desired parameter. There are many other applications, beside bandwidth-efficient transmission, that could similarly profit from codewords with a nonuniform empirical distribution.

B. Other Coset LDPC Codes

In [1], other nonbinary LDPC ensembles, called BQC-LDPC and MQC-LDPC, are considered (beside coset GF$(q)$ LDPC). Random-coset analysis, as defined in Section V, applies to these codes as well. Similarly, the all-zero codeword assumption (Lemma 1) and the symmetry of message distributions (Definition 4 and Theorem 1) apply to these codes. With MQC-LDPC, the sum in (2) is evaluated using modulo-$q$ arithmetic instead of over GF$(q)$. With BQC-LDPC decoders, which use scalar messages, symmetry coincides with the standard binary definition of [29]. Channel equivalence as defined in Section V-C applies to MQC-LDPC codes, but not to BQC-LDPC.

C. Concluding Remarks

Coset GF$(q)$ LDPC codes are a natural extension of binary LDPC codes to nonbinary channels. Our main contribution in this paper is the generalization of much of the analysis that was developed by Richardson et al. [28], [29], Chung et al. [9], ten Brink et al. [35], and Khandekar [20] from binary LDPC codes to coset GF$(q)$ LDPC codes.

Random-coset analysis helps overcome the absence of output symmetry. With it, we have generalized the all-zero codeword assumption, the symmetry property, and channel equivalence. The random selection of the nonzero elements of the parity-check matrix (the labels) induces permutation invariance on the messages. Although density evolution is not realizable, permutation invariance enables its analysis (e.g., the stability property) and approximation (e.g., EXIT charts).

Analysis of GF$(q)$ LDPC codes would not be interesting if their decoding complexity were prohibitive. Richardson and Urbanke [28] have suggested using the multidimensional DFT. This, coupled with an efficient recursive algorithm for the computation of the DFT, dramatically reduces the decoding complexity and makes coset GF$(q)$ LDPC an attractive option.

Although our focus in this work has been on the decoding problem, it is noteworthy that the work done by Richardson and Urbanke [27] on efficient encoding of binary LDPC codes is immediately applicable to coset GF$(q)$ LDPC codes. For simulation purposes, however, a pleasing side effect of our generalization of the all-zero codeword assumption is that no encoder needs to be implemented. In a random-coset setting, simulations may be performed on the all-zero codeword alone (of the underlying LDPC code).

Using quantization or nonuniformly spaced mapping produces a substantial shaping gain. This, coupled with our generalization of EXIT charts, has enabled us to obtain codes at 0.56 dB of the Shannon limit, at a spectral efficiency of 6 bits/s/Hz. To the best of our knowledge, these are the best codes found for this spectral efficiency. However, further research (perhaps in the lines of Section IX-A) may possibly narrow this gap to the Shannon limit even further.

APPENDIX I
PROPERTIES OF THE $+g$ AND $\times g$ OPERATORS

Lemma 13: For $g \in \mathrm{GF}(q)$ and $h \in \mathrm{GF}(q)$:

1) ,
2) ,
3) ,
4) .

Proof: The first two identities are proved by examining the $i$th index of both sides of the equation. The third identity is obtained from the second by observing that the two membership conditions coincide. The fourth identity is straightforward.

Lemma 14: For $g \in \mathrm{GF}(q)$ and $h \in \mathrm{GF}(q)$:

1) , where the superscript notation denotes the result of applying the operation on all elements of the vector;
2) .

The proof of the first identity is obtained from Lemma 13, identity 2. The second identity is straightforward.

APPENDIX II
NEIGHBORHOOD GRAPHS WITH CYCLES

Fig. 8(b) gives an example of a case where a neighborhood graph contains cycles. The neighborhood graph corresponds to the Tanner graph of Fig. 8(a).

When the neighborhood graph contains cycles, the APP values computed by a belief-propagation decoder correspond


Fig. 8. A neighborhood graph with cycles. (a) The Tanner graph. (b) A neighborhood graph. (c) The virtual neighborhood graph.

to a virtual neighborhood graph. In this graph, nodes that are contained in cycles are duplicated to artificially create a tree structure. For example, in Fig. 8(c), a variable node was produced by duplicating an existing one. The APP values are computed according to the virtual code14 implied by this graph. The code is virtual in the sense that it is based on false assumptions regarding the channel model and the transmitted code. In Fig. 8(c), the channel model falsely assumes that the duplicated nodes correspond to different channel observations.

APPENDIX III
PROOFS FOR SECTION V

A. Preliminary Lemmas

The proofs in this section focus on the properties of a message produced at some iteration of coset GF$(q)$ LDPC belief propagation at a node. Assuming the underlying code is fixed, this message is a function of the channel output $y$ and the coset vector $v$. We therefore denote it by $m(y, v)$.

$m(y, v)$ may be either a rightbound message from a variable node or a leftbound message to a variable node. In both cases, we denote the variable node involved by $u$. We begin with the following lemma.

Lemma 15: Let $x$ be a codeword of $C$, $y$ some given channel output, and $v$ an arbitrary coset vector. Then

$$m(y, v + x) = m^{+x_u}(y, v) \tag{28}$$

where $x_u$ is the value of $x$ at the codeword position $u$. In the left-hand side of (28), $v + x$ is evaluated componentwise over GF$(q)$. In the right-hand side, we are using the notation of (2).

Proof: $m(y, v)$ satisfies (29) at the bottom of the page. Expression (29) is only an estimate of the true APP value. The code used by the decoder is not the LDPC code $C$, but rather the code $C'$ defined by the parity checks of the neighborhood graph spanned from $u$, as defined in Section IV-C and Appendix II.

14 See Frey et al. [15] for an elaborate discussion.

$X$ is a random variable representing the transmitted codeword of $C'$ (prior to the addition of the coset vector) and $X_u$ is its value at position $u$. The vectors $\bar{y}$ and $\bar{v}$ are constructed from $y$ and $v$ by including only values at nodes contained in the neighborhood graph of node $u$. We define $\bar{x}$ similarly. If the neighborhood graph contains cycles, we use the virtual neighborhood graph defined in Appendix II. For each variable node that has duplicate copies in this graph, elements of the true $y$, $v$, and $x$ will have duplicate entries in $\bar{y}$, $\bar{v}$, and $\bar{x}$.

The decoder assumes that all codewords are equally likely; hence, (29) becomes a sum, over the codewords of $C'$, of the probabilities that $\bar{y}$ was received given that each codeword (shifted by the coset vector) was transmitted. The word $\bar{x}$, having been constructed from a true codeword $x$, satisfies all parity checks in the neighborhood graph and is therefore a codeword of $C'$. Changing variables, we replace each codeword in the sum by its shift by $\bar{x}$; since $C'$ is linear, the sum ranges over the same set, and the conditioning is shifted accordingly. This establishes (28).

We now examine $m(Y, V)$, which denotes the rightbound (leftbound) message from (to) a variable node $u$, at some iteration of belief propagation. $V$ and $Y$ are random variables representing the coset vector and channel-output vectors, respectively.

Lemma 16: For any $c \in \mathrm{GF}(q)$, the value $\Pr[m(Y, V) = m \mid C_u = c]$ is well defined, in the sense that for any two codewords $x^1$, $x^2$ that satisfy $x^1_u = x^2_u = c$, the probability that the message equals $m$ given that $x^1$ was transmitted equals the probability given that $x^2$ was transmitted, for all probability vectors $m$.

Proof: Let $x = x^2 - x^1$. Consider transmission of $x^1$

with an arbitrary coset vector of $v + x$, compared to transmission of $x^2$ with a coset vector of $v$. In both cases, the transmitted signal over the channel is the same, and hence the probability of obtaining any particular $y$ is identical. The word $x$ satisfies $x_u = 0$. Since the code is linear, we have $x \in C$. Therefore, Lemma 15 (above) implies (30): the messages produced in the two cases are identical. We therefore obtain that the two conditional probabilities of the lemma statement coincide for every fixed coset vector.


Since $V$ is uniformly distributed, averaging over all possible values of $v$ completes the proof.

The following lemma will be useful in Section VII-A.

Lemma 17: For any $c \in \mathrm{GF}(q)$, the conditional distribution of the message given $C_u = c$ equals the conditional distribution of the correspondingly shifted message given $C_u = 0$.

Proof: The proof follows almost in direct lines as that of Lemma 16. Let $x^1$ be the all-zero codeword, and $x^2$ a codeword that satisfies $x^2_u = c$. With $x = x^2 - x^1$ we have $x_u = c$, and (30) now becomes $m(y, v + x) = m^{+c}(y, v)$. Thus, the probability that the message equals $m$ given that $x^2$ was transmitted equals the probability that it equals $m^{+c}$ given that the all-zero codeword was transmitted. Averaging over all possible values of $v$ completes the proof.

B. Proof of Lemma 1

Let $x$ be some codeword. Consider the event of error at a message produced at a variable node $u$ after iteration $\ell$, assuming the channel output was $y$, the coset vector was $v$, and the true codeword was $x$. Recalling the decision rule of Section IV-A, the decoder decides by maximizing the components of the message (where $m(y, v)$ is defined as in Appendix III-A). Using Lemma 15 (Appendix III-A), we obtain that the maximum of $m(y, v + x)$ is obtained at an index $i$ if and only if the maximum of $m(y, v)$ is obtained at the correspondingly shifted index. Therefore, an error occurs under transmission of $x$ with coset vector $v + x$ if and only if an error occurs under transmission of the all-zero codeword with coset vector $v$. In both cases, the word transmitted over the channel is the same, and hence the probability of obtaining any channel output is the same. Finally, averaging over all instances of $v$, we obtain the lemma.

C. Proof of Lemma 2

We first assume symmetry and prove (19). Let $z$ be an arbitrary LLR vector and let the corresponding quantities be defined using (2) and (5), respectively. A direct computation, relying on Lemmas 13 and 14 (Appendix I), then proves (19).

We now assume (19) and prove symmetry. Let $z$ and the corresponding probability vector be defined as above, and consider (31). The last equality in (31) is obtained from the identity of Lemma 13 (Appendix I), by which each term is added the same number of times. Continuing the derivation, the equality before last results from (1), recalling that the component at index zero is zero in all LLR vectors. We thus obtain symmetry, as desired.

D. Proof of Theorem 1

Let $u$ be the variable node associated with the message produced at the given iteration, defined as in Lemma 15 (Appendix III-A). Let $\bar{y}$, $\bar{v}$, and $C'$ be defined as in the proof of the lemma. Using this notation, we may equivalently denote the message produced at $u$ by $m(\bar{y}, \bar{v})$. This is because the message is in fact a function only of the channel observations and coset vector elements contained in the neighborhood graph spanning from $u$. The following corollary follows immediately from the proof of Lemma 15.

Corollary 1: Let $x$ be a codeword of $C'$. Then for any $\bar{y}$ and $\bar{v}$ as defined above

$$m(\bar{y}, \bar{v} + x) = m^{+x_u}(\bar{y}, \bar{v}) \tag{32}$$

where $x_u$ is the value of $x$ at the codeword position corresponding to the variable node $u$.

We now return to $m(\bar{Y}, \bar{V})$, a random variable corresponding to the message produced at $u$. We assume plain-likelihood representation of messages. Let $m$ be an arbitrary probability vector. Since we assume the all-zero codeword was transmitted, the random space consists of the random selection of $\bar{V}$ and the random channel transitions. Therefore, we may expand $\Pr[m(\bar{Y}, \bar{V}) = m]$ as in (33).

Let $n'$ be the block length of the code $C'$ (note that, like $C'$, $n'$ is a function of the neighborhood graph spanning from $u$, which is in turn a function of the iteration number). The set of all vectors in $\mathrm{GF}(q)^{n'}$ can be presented as a union of nonintersecting cosets of $C'$. That is,

$$\mathrm{GF}(q)^{n'} = \bigcup_{t \in T} \left( t + C' \right)$$


where $T$ is a set of coset representatives with respect to $C'$. For each vector $w \in \mathrm{GF}(q)^{n'}$, we let $t(w) \in T$ and $x(w) \in C'$ denote the unique vectors that satisfy $w = t(w) + x(w)$.

Let $\bar{y}$ be a channel output portion and $\bar{v}$ a coset vector. From Corollary 1, we have that $m(\bar{y}, \bar{v})$ is a shift of $m(\bar{y}, t(\bar{v}))$ by $x(\bar{v})_u$. Therefore, $m(\bar{y}, \bar{v}) = m$ if and only if $m(\bar{y}, t(\bar{v}))$ equals the correspondingly shifted vector. We can thus rewrite (33) as (34).

Examining the outer term, we have (35) at the bottom of the page. We now examine the inner conditional probability for a fixed representative $t \in T$. The random space is confined to the random selection of the coset vector from $t + C'$ or, equivalently, a random selection of a codeword $x \in C'$.

Applying Corollary 1 again, we have (36), where the message is shifted by $x_u$. Consider first the case in which the shift relating the two events is unique, so that no other index satisfies the same relation. From (36), we have that the shifted message equals $m$ if and only if the unshifted message equals the correspondingly shifted vector. Therefore, we have the second equation at the bottom of the page. Now the key observation in this proof is that, under the tree assumption, the above corresponds to a selection of the coset vector with the same distribution. Therefore, the two probabilities are equal.

We now consider the general case, in which there are several indices satisfying the above relation. Using the same arguments as before, we have the third equation at the bottom of the page. Recalling (34) and (35), we now have the desired equality. This proves (18).

E. The Sum of Two Symmetric Variables

The following lemma is used in Section VI-D.

Lemma 18: Let $W_1$ and $W_2$ be two independent LLR-vector random variables. If $W_1$ and $W_2$ are symmetric, then $W_1 + W_2$ is symmetric as well.

Proof: The proof relies on the observation that for all $g \in \mathrm{GF}(q)$ and LLR vectors $z_1$ and $z_2$, $(z_1 + z_2)^{+g} = z_1^{+g} + z_2^{+g}$. Let $z$ be an LLR vector and $g \in \mathrm{GF}(q)$ an arbitrary element; the claim then follows by a direct computation.

F. Proof of Lemma 3

By definition, component $i$ of the APP vector satisfies an expression of the form given at the bottom of the page, where the constant is independent of $i$ (but dependent on the channel output), and is selected such that the sum of the vector components is $1$. Using (21), we rewrite this expression in terms of the equivalent channel.


The output of the equivalent channel is a probability vector. Thus, the sum of all its components is $1$. Hence, the normalization constant equals $1$. We therefore obtain our desired result: the APP vector coincides with the equivalent channel output.

G. Proof of Lemma 4

Let $W$ be a random variable denoting the equivalent channel output, and assume the equivalent channel's input (denoted $C$ in Fig. 4) was zero. $W$ thus corresponds to a vector of APP probabilities, computed using the physical channel output and the coset vector component. We can therefore invoke Theorem 1 and obtain that the symmetry relation (18) holds for any probability vector.

Note that Theorem 1 requires that the entire transmitted codeword be zero and not only the symbol at a particular discrete channel time. However, since the initial message is a function of a single channel output, we can relax this requirement by considering a code that contains a single symbol.

Let $c$ be an arbitrary symbol from the code alphabet. Applying Lemma 17 (Appendix III-A) to the single-symbol code, we obtain the required relation between the conditional output distributions. Therefore, the equivalent channel is cyclic-symmetric.

H. Proof of Lemma 5

Consider the following set of random variables, defined as in Fig. 4. $C$ is the input to the equivalent channel. $V$ is the coset symbol, and $X = C + V$, evaluated over GF$(q)$. $\delta(X)$ is the physical channel input and $Y$ is the physical channel output, related to $\delta(X)$ through the channel transition probabilities. $\mathrm{APP}(Y, V)$ equals the output of the equivalent channel, which is a deterministic function of $Y$ and $V$.

Since the equivalent channel is symmetric, a choice of $C$ that is uniformly distributed renders a mutual information equal to the equivalent channel's capacity. This choice of $C$ renders $X$ uniformly distributed as well, and thus the capacity equals the equiprobable-signaling capacity. We will now show (37), where the sums range over the physical channel's output alphabet and over the set of all probability vectors. Using Lemma 4 and Definition 5 we have, for some probability function, the intermediate equality. By the definition of the output as a probability vector, its components sum to $1$, and thus we obtain (38).

Combining (37) with (38) completes the proof.

APPENDIX IV
PROOFS FOR SECTION VI

A. Proof of Theorem 3

We prove the theorem for $R_\ell$, the message at iteration $\ell$ averaged over all possibilities of the neighborhood tree. The symmetry condition follows by applying Theorem 1 to each tree and averaging; the last equation in the resulting derivation is obtained from Theorem 1. Hence, $R_\ell$ is symmetric as desired ($R_0$ is obtained as a special case). The proof for $L_\ell$ is similar.

B. Proof of Lemma 8

Let $i, j \neq 0$ and let $g = i^{-1} j$ (evaluated over GF$(q)$). Permutation-invariance under the operation $\times g$ maps the component at index $i$ to the component at index $j$, and hence the two marginals are identically distributed. The proof for the LLR-vector representation is identical.


C. Proof of Lemma 9

First, we observe the identity required for the derivation; the claim then follows by a direct computation, the last step of which is obtained from Lemma 8.

D. Proof of Lemma 10

We prove the lemma for the probability-vector representation. The proof for the LLR-vector representation is identical. We first assume $W = \mathcal{P}(V)$ and show that $W$ is permutation-invariant. Let $u \in \mathrm{GF}(q) \setminus \{0\}$ be randomly selected as in Definition 8, such that $W = V^{\times u}$. Let $g \in \mathrm{GF}(q) \setminus \{0\}$ be arbitrary, so that

$$W^{\times g} = V^{\times (ug)} \tag{39}$$

$ug$ is a random variable, independent of $V$, that is distributed identically with $u$. Thus, $W^{\times g}$ is identically distributed with $W$. Since $g$ was arbitrary, we obtain that $W$ is permutation-invariant.

We now assume that $W$ is permutation-invariant. Consider $W^{\times u}$, where $u$ is uniformly random in $\mathrm{GF}(q) \setminus \{0\}$ and independent of $W$. Equivalently, consider $\mathcal{P}(W)$. Conditioned on any fixed value of $u$, the distribution of $W^{\times u}$ equals that of $W$, the last result having been obtained by the definition of $W$ as permutation-invariant. Since the above is true for all $u$, $W^{\times u}$ is distributed identically with $W$. Thus, $W = \mathcal{P}(W)$ in distribution, as desired.

E. Some Lemmas Involving Permutation Invariance

We now present some lemmas that are used in Appendices IV-F, V, and VI and in Section VI-D. The first three lemmas apply to both the probability-vector and LLR representations of vectors.

Lemma 19: If $\tilde{W}$ is a random permutation of $W$, then $P_e(\tilde{W}) = P_e(W)$.

The proof of this lemma is obtained from the fact that the operation $\times g$, for all $g \neq 0$, leaves the element at index $0$ unchanged.

Lemma 20: If $W$ is a symmetric random variable, and $\tilde{W}$ is a random permutation of $W$, then $\tilde{W}$ is also symmetric.

Proof: Write the symmetry condition for $\tilde{W}$ as (40). In the following derivation, we make use of the facts established in Lemma 13 (Appendix I) and Lemma 14 (Appendix I), which yield (41). Combining (40) and (41), we obtain the symmetry condition for $\tilde{W}$, and thus conclude the proof.

Lemma 21: If $W$ is permutation-invariant and $\tilde{W}$ is a random permutation of $W$, then $W$ and $\tilde{W}$ are identically distributed.

The proof of this lemma is straightforward from Definitions 7 and 8.

The following lemmas discuss permutation-invariance in the context of the LLR representation of random variables.

Lemma 22: Let $W_1$ and $W_2$ be two independent, permutation-invariant LLR-vector random variables. Then $W_1 + W_2$ is also permutation-invariant.

Proof: Let $g \in \mathrm{GF}(q) \setminus \{0\}$ and $Z = W_1 + W_2$, and let $z$ be an arbitrary LLR vector. Since the operation $\times g$ permutes components, $(W_1 + W_2)^{\times g} = W_1^{\times g} + W_2^{\times g}$, which by permutation-invariance and independence is distributed identically with $W_1 + W_2$. Since $g$ and $z$ are arbitrary, this implies that $Z$ is permutation-invariant, as desired.

Lemma 23: Let $W_1$ and $W_2$ be two LLR-vector random variables. Let $u_1$, $u_2$, and $u$ be independent random variables, uniformly distributed in $\mathrm{GF}(q) \setminus \{0\}$ and independent of $W_1$ and $W_2$. Let $Z_1 = W_1^{\times u_1} + W_2^{\times u_2}$, $Z_2 = \big(W_1^{\times u_1} + W_2^{\times u_2}\big)^{\times u}$, and $Z_3 = \big(W_1 + W_2^{\times u_2}\big)^{\times u_1}$. Then $Z_1$, $Z_2$, and $Z_3$ are identically distributed.

Proof: We begin with the following equalities: $Z_2 = W_1^{\times (u_1 u)} + W_2^{\times (u_2 u)}$ and $Z_3 = W_1^{\times u_1} + W_2^{\times (u_2 u_1)}$. Consider the expressions for $Z_1$ and $Z_2$. $u_1 u$ is identically distributed with $u_1$, and $u_2 u$ is identically distributed with $u_2$. $u_1 u$ is independent of $u_2 u$, and both are independent of $W_1$ and $W_2$. The same holds if we replace $u_1 u$ and $u_2 u$ with $u_1$ and $u_2$. Thus, $Z_1$ and $Z_2$ are identically distributed. The proof for $Z_3$ is similar.

F. Proof of Theorem 4

$L_\ell$ is permutation-invariant following the discussion at the beginning of Section VI-B, and thus part 1 of the theorem is proved.


Next, $\tilde{R}_\ell = R_\ell^{\times g}$, where the label $g$ is randomly selected, uniformly from $\mathrm{GF}(q) \setminus \{0\}$. Thus, $\tilde{R}_\ell$ is a random permutation of $R_\ell$, and by Lemma 10, it is permutation-invariant. $\tilde{R}_\ell$ is symmetric by Lemma 20 (Appendix IV-E), and $P_e(\tilde{R}_\ell) = P_e(R_\ell)$ by Lemma 19 (Appendix IV-E). This proves part 2 of the theorem.

$\tilde{R}_0$ is permutation-invariant by its construction: it is a random permutation of $R_0$. Switching to LLR representation, $R_\ell$ is obtained by applying (15). The leftbound messages are permutation-invariant; hence, by Lemma 22 (Appendix IV-E) their sum is also permutation-invariant. Using Lemma 23 (Appendix IV-E), the distribution of $\tilde{R}_\ell$ may equivalently be computed by replacing the instantiation of $R_0$ in (15) with an instantiation of $\tilde{R}_0$. The distribution of the leftbound message is computed in density evolution recursively from $\tilde{R}_\ell$, using (10). Thus, the preceding discussion implies that replacing $R_0$ with $\tilde{R}_0$ would not affect this density either. The remainder of part 3 of the theorem is obtained from Lemmas 20 and 19.

G. Nondegeneracy of Channels and Mappings

A mapping is nondegenerate if there exists no integer $l > 1$ such that, for all points $x$ in its image, the number of elements mapped to $x$ is a multiple of $l$. With a quantization mapping, a degenerate mapping could be replaced by a simpler quantization over an alphabet of size $q/l$ that would equally attain the desired input distribution. With nonuniform-spaced mapping, the number of elements mapped to each $x$ is $1$ and thus this requirement is satisfied by definition.

A channel is nondegenerate if there exist no two distinct input values whose transition probabilities coincide for all $y$ belonging to the channel output alphabet. The proof that $\Delta < 1$ when both the mapping and the channel are nondegenerate ($\Delta$ having been defined in (24)) follows in direct lines as the one provided in [1, Appendix I-A].

APPENDIX V
PROOF OF PART 1 OF THEOREM 5

In this appendix, we prove the necessity condition of Theorem 5. Our proof is a generalization of the proof provided by Richardson et al. [29]. An outline of the proof was provided in Section VI-C.

A. The Erasurized Channel

We begin by defining the erasurized channel for a given cyclic-symmetric channel and examining its properties. Our development in this subsection is general, and will be put into the context of the proof in the following subsection.

Definition 9: Let $P(w \mid x)$ denote the transition probabilities of a cyclic-symmetric channel (see Definition 5). Then the corresponding erasurized channel is defined by the following.

The input alphabet is GF$(q)$. The output alphabet is the union of the original (cyclic-symmetric) channel's output alphabet with a set of added symbols. The transition probabilities are defined as follows.

For all probability vectors $w$, the transition probabilities are given by (42), where

• the first quantity is defined as in Definition 5;

• the second quantity is obtained by ordering the elements of the sequence $\{w_i\}$ in descending order and selecting the second largest. This means that if the maximum of the sequence elements is obtained more than once, then the selected value would be equal to this maximum.

For the added output symbols, we define the transition probabilities by (43), with the residual probability term defined accordingly.

The following lemma discusses the erasurized channel.

Lemma 24: The erasurized channel satisfies the followingproperties.

1) The transition probability function is valid.
2) The original cyclic-symmetric channel can be represented as a degraded version of the erasurized channel. That is, it can be represented as a concatenation of the erasurized channel with another channel, whose input would be the erasurized channel’s output.

Proof:

1) It is easy to verify that , and hence for all by definition. The rest of the proof follows from the observation that for all vectors (recall that ) .

2) We define a transition probability function where and as shown at the bottom of the page. It is easy to verify that the concatenation of the erasurized channel with produces the transition probabilities of the original cyclic-symmetric channel.
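The degradation claim of part 2 is a statement about composing transition matrices: the original channel’s matrix must factor as the erasurized channel’s matrix times another stochastic matrix. A small numeric sketch of such a composition, with hypothetical row-stochastic matrices:

    import numpy as np

    # Hypothetical transition matrices: E for the erasurized channel,
    # Q for the channel appended to its output.
    E = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.7, 0.2],
                  [0.2, 0.1, 0.7]])
    Q = np.array([[0.90, 0.05, 0.05],
                  [0.05, 0.90, 0.05],
                  [0.05, 0.05, 0.90]])

    W = E @ Q  # transition matrix of the concatenated channel
    assert np.allclose(W.sum(axis=1), 1.0)  # rows still sum to one
    # The original channel is degraded with respect to E precisely when
    # its matrix admits such a factorization W = E @ Q.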

The erasurized channel is no longer cyclic symmetric. Hence, if we apply a belief-propagation decoder on the outputs of an erasurized channel, Lemma 3 does not apply, and the initial messages are not identical to the channel outputs. However, the following lemma summarizes some important properties of the initial message distribution, under the all-zero codeword assumption.

Lemma 25: Let denote the message distribution at the initial iteration of belief propagation decoding over an erasurized channel (under the assumption that the zero symbol was transmitted). Then can be written as

(44)

where is a probability function that satisfies

(45)

and is a distribution that takes the vector (i.e., the vector where and ) with probability ( must not be confused with defined by (24)).

Proof: For any probability vector , we define

the channel output was

and

the channel output was

We now have

(46)

We first examine . Let denote the channel output. By definition we have

(47)

where is a normalization constant, dependent on but not on , selected so that the sum of the vector elements is . We now examine all possibilities for .

First assume that the maximum of is obtained at and at only. Let be an index where the second largest element of is obtained. Then by (47) and (42)

Now assume that the maximum is obtained at and also at where . Then it is easy to observe that . Finally, assume that the maximum of is not obtained at . Let be an index such that obtains the maximum. Then

In all cases, there exists an index such that , as required by (45).

We now examine . Assuming the symbol was transmitted, then by (43), the probability of obtaining any output symbol of the set other than is zero. Also, the only input symbol capable of producing the output with probability greater than zero is the input . Hence, the decoder produces the initial message with probability , and as required.

Consider transmission over the original, cyclic-symmetric channel. Let be the uncoded maximum a posteriori (MAP) probability of error. Let be the corresponding probability over the erasurized channel.

In the erasure decomposition lemma of [29], similarly defined and are both equal to , where is the erasure channel’s erasure probability. In the following lemma, we examine of the erasurized channel.

Lemma 26: The following inequalities hold:

Proof:

1) The erasurized channel is symmetric (although not cyclic-symmetric): for all we have

and for all we have

Hence, the decoding error is independent of the transmitted symbol, and we may assume that the symbol was .

Consider the erasurized channel output . The MAP decoder decides on the symbol with the maximum APP value. If more than one such symbol exists, a random decision among the maximizing symbols is made. Let denote the vector of APP values corresponding to . By Lemma 25, we have that with probability , is distributed as . Recalling (45), we have that for messages distributed as , an error is made with probability at least . Therefore, .

2) By Lemma 24, the cyclic-symmetric channel is a degraded version of the erasurized channel. Hence, .
3) We now prove . Let us assume once more that the symbol was transmitted. Recall that we are now examining the decoder’s performance over the cyclic-symmetric channel (and not the erasurized channel). Therefore, by Lemma 3, the vector of APP values (according to which the MAP decision is made) is identical to the channel output. Let be defined as in Definition 6. We will now show that the following inequality holds:

(48)

• If is such that the maximum of is obtained only at , we have from (42) that . However, in this case the decoder correctly decides . Hence, and (48) is satisfied.
• In any other case, we have . Using , we obtain (48) trivially.

We now have

B. The Remainder of the Proof

To complete the proof, we would like to show that the probability of error at iteration cannot be too small. Let , denote the rightbound messages at iteration , where .


By Lemma 4 (in a manner similar to [29]), may equivalently be obtained as the initial message of a cyclic-symmetric channel. We now replace this channel with the corresponding erasurized channel, and obtain a lower bound on the probability of error at subsequent iterations. We let , , denote the respective messages following the replacement.

In the remainder of the proof, we switch to log-likelihood representation of messages. We let denote the LLR-vector representation of , . Adopting the notation of [29], we let denote the distribution of . denotes the distribution of the initial message of the true cyclic-symmetric channel.

Using LLR messages, Lemma 25 becomes

now satisfies

(49)

After iterations of density evolution, the density becomes (in a manner similar to the equivalent binary case [29])

where is defined in Theorem 5. and correspond to the random permutations of and (resulting from the effect of randomly selected labels), respectively, and denotes convolution. Let denote the distribution of , where is the random label on the edge along which is sent. Then

where we have used Lemma 23 (Appendix IV-E) to obtain that a random permutation of is distributed as . Using Lemma 19 (Appendix IV-E), the probability of error (assuming the zero symbol was selected) is the same for and . Letting denote this probability of error, we have

Defining the probability function , we have

(50)

Recalling (49), satisfies that with probability there exists at least one index such that . A random permutation would transfer to index with probability . Hence,

(51)

Let denote the marginal distribution of the element of . By Lemma 9, is symmetrically distributed in the binary sense. Following the development of [29] (similarly relying on results from [32, p. 14]), we obtain

(52)

For the above limit to be valid, we first need (see [32]) that in some neighborhood of zero, as appears in the conditions of the theorem. We also need to show that (also see [32]). This will be proven shortly. We first examine .

Lemma 27:

(53)

Proof: Recalling that is a random permutation of the initial message, we first observe

(54)

We now examine . Recalling (14), where denotes the random channel output and denotes the random coset symbol

(55)

Combining (54), (55), and the definition (24), we obtain (53).

We are now ready to show . Recall from the discussion in Section VI-C that . Using (53) and Jensen’s inequality, we obtain

We now proceed with the proof. By (53), (52) becomes

(56)

The remainder of the proof follows along the same lines as in [29] and is provided primarily for completeness. Combining (50) with (51) and (56), we obtain that for arbitrary and large enough

If , by appropriately selecting , we obtain that for large enough

(57)

denotes a function, dependent on , , and , such that for some constant . Hence, there exists a constant such that if , then

(58)

We now return to examine and , the probabilities of error over the true channel, prior to the replacement of messages with those of an erasurized channel. Since the true channel is degraded in relation to the erasurized channel, we must have for , . By Lemma 26, . Hence, there exists such that if , then and hence, (58) is satisfied. However, Lemma 26 also asserts . Hence, and consequently, . This contradicts Theorem 2. Thus, we obtain our desired result of for all .

APPENDIX VI
PROOF OF PART 2 OF THEOREM 5

In this section, we prove the sufficiency condition of Theorem 5. Our proof is a generalization of the proof provided by Khandekar [20] from binary to coset GF LDPC codes. An outline of the proof was provided in Section VI-C.

Note that throughout the proof, we denote by functions for which there exists a constant , not dependent on the iteration number , such that .

We are interested in (defined as in (22)), where is the rightbound message as defined in Section VI-A. We begin, however, by analyzing a differently defined .

Let be a probability-vector random variable. The operator is defined as follows:

(59)

where is a random permutation of . By definition of the random permutation, the above definition is equivalent to

(60)

for all . Letting LLR , we obtain that

Note that when , this equation coincides with the Bhattacharyya parameter used in [20, eq. (4.4)]. From Lemma 27 (Appendix V-B) we obtain that

(61)

where is the initial message as defined in Section VI-A. We now develop a convenient expression for .
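Before doing so, a point of orientation: in the binary case the quantity above is the Bhattacharyya parameter B = E[exp(-L/2)] of the LLR L, which can be estimated directly by Monte Carlo. A minimal sketch of that q = 2 special case, using the standard symmetric-Gaussian LLR model L ~ N(m, 2m) as an assumed input (the model is for the example only, not part of the proof):

    import numpy as np

    rng = np.random.default_rng(0)

    def bhattacharyya(llr_samples):
        # Monte Carlo estimate of B = E[exp(-L/2)], the q = 2 special
        # case mentioned in the text.
        return float(np.mean(np.exp(-np.asarray(llr_samples) / 2.0)))

    m = 4.0  # symmetric-Gaussian LLRs: L ~ N(m, 2m)
    llrs = rng.normal(m, np.sqrt(2.0 * m), size=200_000)
    print(bhattacharyya(llrs))  # close to exp(-m/4) = 0.3679 for m = 4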

Lemma 28: Let denote a probability-vector symmetric random variable. Then , where is given by

(62)

Proof: From (59) we have

(63)

The outer expectation is over all sets . The inner expectation is conditioned on a particular set . We first focus on the inner expectation

(64)

The last equality was obtained in the same way as (31). In the following, we use the fact that (Lemma 13, Appendix I)

is invariant under any permutation of the elements. It is therefore constant for all vectors of the set . Thus, we can rewrite the preceding expression as

Plugging the above into (63) completes the proof.

We now examine the function .

Lemma 29: For any probability vector , .


Proof: is obtained trivially from (62) by observing that all elements of the sum are nonnegative. To prove

we have

Applying Jensen’s inequality we obtain

Given a probability vector , we define

The following lemma relates the functions and .

Lemma 30:

Proof: Let be an index that achieves the maximum in .

Consider (62). For a particular element , assume without loss of generality . By definition of , we have

By definition we also have . Therefore, . We now have

By definition of , . Also, there must exist such that . We now have

Combining both inequalities proves the lemma.

We now state our main lemma of the proof.

Lemma 31: Let be a set of probability vectors. Then

where denotes GF convolution, defined in (11) and used in (13).

Proof: We begin by examining the case of . We denote and by and . To simplify our analysis, we assume that . We may assume this, because otherwise we can apply a shift by to move the maximum to zero. This operation does not affect . It is easy to verify that , and hence the operation does not affect either. Similarly, we assume .

By the definition of , we have

(65)

We now examine elements of the sum. We first examine the case that and

The result for the case of and is similarly obtained. We now assume , (the element does not participate in the sum)

Inserting the above into (65) we obtain

The last equality was obtained from Lemma 30. Finally, from the above we easily obtain the desired result of

For the case of , we begin by observing that

The remainder of the proof is obtained by induction, using Lemma 29.


We now use the above lemma to obtain the following result.

Lemma 32: satisfies

(66)

Proof: Consider . Since is obtained from it by applying a random permutation , we obtain, using Lemma 28 and the fact that is invariant under a permutation on , that . Thus, we may instead examine . Similarly, we examine instead of .

Assume the right-degree at a check node is . By (13) we have

where are i.i.d. and distributed as . In the following, we make use of Lemma 31:

Averaging over all possible values of , we obtain

(67)

We now turn to examine . Assume the variable-node degree at which is produced is . Applying (59) and (8), we have

where are i.i.d. and distributed as . By Theorem 4, are permutation invariant, and thus, by Lemma 21 (Appendix IV-E), are distributed identically with their random permutations . Thus, we obtain

Applying (60) and reordering the elements, we obtain

The second equality was obtained from (59). The last equality is obtained from (61). Averaging over all values of , we obtain

(68)

The function is by definition a polynomial with nonnegative coefficients. It is thus nondecreasing in the range . Using (67) and (68), we obtain (66).
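In the binary setting of [20], the analogous statement yields the scalar recursion b_{l+1} <= b_0 * lambda(1 - rho(1 - b_l)) on the Bhattacharyya parameter, and iterating it numerically exhibits the kind of convergence that Lemma 33 below establishes. A sketch of that binary analogue, shown for orientation only (the q-ary recursion (66) replaces the Bhattacharyya parameter with the operator studied here):

    def edge_poly(coeffs, x):
        # Evaluate an edge-distribution polynomial sum_i c_i * x**(i-1),
        # given as {degree: fraction of edges}.
        return sum(c * x ** (d - 1) for d, c in coeffs.items())

    def track(lam, rho, b0, iters=50):
        # Iterate b_{l+1} = b0 * lam(1 - rho(1 - b_l)); convergence of
        # b_l to zero plays the role of (66) in driving the error
        # probability to zero.
        b = b0
        for _ in range(iters):
            b = b0 * edge_poly(lam, 1.0 - edge_poly(rho, 1.0 - b))
        return b

    # A (3,6)-regular ensemble starting from b0 = 0.4 converges to zero:
    print(track({3: 1.0}, {6: 1.0}, b0=0.4))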

The following lemma examines convergence to zero of.

Lemma 33: If , then there exists such that if at some iteration , then .

Proof: Using the Taylor expansion of the function

around

where the equality is obtained by the definition of the function . Plugging the above into (66), we obtain

Using the Taylor expansion of around , we obtain

Since , there exists such that if , then

where is a positive constant smaller than . By induction, this holds for all . We have by definition, and therefore the sequence converges to zero.

Finally, the following lemma links the operator with our desired , defined as in (22).

Lemma 34: Let be a symmetric probability-vector random variable. Then


Proof: We begin by showing that

The last result was obtained in the same way as (63) and (64). The outer sum is over all sets . Let denote the indices that achieve . Then

if and otherwise. Using this and the symmetry of , we obtain the equation at the bottom of the page. By Lemma 13 (Appendix I), . We thus continue our development

The result is obtained from the fact that is constant over all vectors in .

We now have, using Lemmas 28 and 30 and Jensen’s inequality

This proves . For the second inequality, we observe

The last inequality is obtained by Markov’s inequality. Combining the above with (59), we obtain our desired result of .

Finally, consider the value of Lemma 33. Setting , we have from Lemma 34 that if , then and thus converges to zero. Applying Lemma 34 again, this implies that converges to zero, which completes the proof of part 2 of the theorem.

APPENDIX VII
PROOF OF THEOREM 6

We begin by observing that since is Gaussian, is symmetric if and only if for all and arbitrary LLR vector

(69)

We first assume that is symmetric and permutation invariant and prove (25). Since is permutation invariant, by Lemma 8 we have for all . We therefore denote .

We begin by proving that . We prove this by contradiction, and hence we first assume . Consider the marginal distribution of for , which must also be Gaussian. Since , the probability density function (pdf) of satisfies . By Lemma 9, is symmetric in the binary sense. Hence . Combining both equations yields for all . Hence, is deterministic, with zero variance, for all . This leads to , which contradicts the theorem’s condition that is nonsingular.

We now show that conditions (69) uniquely define . Since is symmetric, so is . Assume and are two symmetric matrices such that (69) is satisfied, substituting with and with , respectively. We now show that . Let . Subtracting the equation for from that of , we obtain, for

(70)

For convenience, we let denote the matrix corresponding to the linear transformation . Differentiating (70) twice with respect to , we obtain that . Equation (70) may now be rewritten as

Let . Observe that , like , is arbitrary. Simple algebraic manipulations lead us to


Letting , we obtain that , where denotes the Euclidean norm. Thus, . Consider the vectors . We wish to show that these vectors are linearly independent. From (5), we have . Recall from Section II that is evaluated over GF and that . From our previous discussion, for all . Therefore, for all ,

We now put the vectors in a matrix such that . The matrix is now given by

Let the matrix be defined by

That is, . It is easy to verify that is the inverse of . Hence, is nonsingular, and its columns, the vectors , are thus linearly independent. We now have linearly independent vectors that satisfy

. Hence, , and we obtain that as desired.

Consider the matrix . If we could show that , we would obtain (25) for ( would be implied by ). For this purpose, we show that the choice

satisfies (69)

(71)

We now treat each of the above sums separately

(72)

The set of indices. Recalling , we have

(73)

Equation (72) now becomes

(74)

We now turn to the second sum of (71). In a development similar to that of the first sum, we obtain

(75)

Finally, the last sum of (71) becomes

(76)

Combining (71), (74), (75), and (76) we obtain

(77)

Thus satisfies (69), as desired. This completes the proof of (25).

We now assume (25) and prove that is symmetric and permutation invariant. From (25) it is clear that any reordering of the elements of has no effect on its distribution, and thus is permutation invariant. To prove symmetry, we observe that the development ending with (77) relies on (25) alone, and thus remains valid.


APPENDIX VIII
PROOFS FOR SECTION VII

A. Proof of Lemma 11

By Lemma 17 (Appendix III-A)

The second summation in the above equations is over all LLR vectors with nonzero probability.

By the lemma’s condition, the tree assumption is satisfied. Thus, by Theorem 1, the conditional distribution of given is symmetric (recalling Lemma 16, Appendix III-A). Using (19), we have the equation at the bottom of the page. By (5), . Since the third summation is over all , we obtain by changing variables (evaluated over GF )

Changing variables in the second summation, , we obtain

Since the sum over is independent of , we obtain

Equation (26) now follows from the fact that by definition (see Section II).

B. The Permutation-Invariance Assumption With EXIT Method 1

In this subsection, we discuss a fine point of the assumption of permutation invariance used in the development of EXIT charts by Method 1 (Section VII-C). Strictly speaking, the initial message and rightbound messages are not permutation invariant. However, we now show that we may shift our attention to and , defined as in Theorem 4, which are symmetric and permutation invariant.

We first show that and , evaluated using (26), are equal to and (respectively). It is straightforward to observe that the right-hand side of (26) is invariant to any fixed permutation of the elements of the random vector . Thus, a random permutation will also have no effect on its value. By the discussion in Appendix IV-F, and are random permutations of and , respectively. Thus, we have obtained our desired result.

We proceed to show that the derivation of the approximation of in Section VII-C is justified if we replace and with and . By the discussion in Appendix IV-F, may be obtained by replacing the instantiation of in (15) with an instantiation of . Thus, is obtained from and using the same expressions through which is obtained from and . Therefore, the discussion of the derivation of the approximation for (see Appendix VII-C) remains justified.

By the discussion in Appendix IV-F, the distribution of is obtained from using (10), and the distribution of is not required for its computation. Finally, the approximation for in Section VII-C has been verified empirically, and therefore does not require any further justification.

C. Gaussian Messages as Initial Messages of an AWGN Channel

Let be a Gaussian LLR-vector random variable defined as in Theorem 6. Let be the transition probabilities of the cyclic-symmetric channel defined by (see Lemma 6 and Remark 1, Section V-C). We will now show that this channel is in effect a -dimensional AWGN channel.

We begin by examining

Thus, the channel output, conditioned on transmission of , is distributed as . The operation , as defined by (5), is linear. Thus, is Gaussian with a mean of ( being defined by (25)) and a covariance matrix which we will denote by . Let

(78)


where is given by (25) and we define, for convenience, for all (also, recall from Section II that and are evaluated over GF ). Evaluating (78) for all , it is easily observed that .

The above implies that the cyclic-symmetric channel defined by is distributed as a -dimensional AWGN channel whose noise is distributed as and whose input alphabet is given by . Both the noise and the input alphabet are functions of . By definition, this channel is cyclic symmetric, and thus the LLR-vector initial messages of LDPC decoding satisfy , where is the channel output.

In the sequel, we would like to consider channels whose input alphabet is independent of . For this purpose, we consider a channel whose output is obtained from by . The result is equivalent to an AWGN channel whose input alphabet is given by , where and whose noise is distributed as , where . Letting , we obtain that is defined as the matrix of (25) with substituted by .

The multiplication by does not affect the initial messages of LDPC decoding, and thus . We summarize these results in the following lemma.

Lemma 35: Consider transmission over a -dimensional AWGN channel, and assume zero-mean noise with a covariance matrix defined as the matrix of (25) with substituted by . Assume the following mapping from the code alphabet: , where is defined using LLR representation and is defined above.

1) Let denote the -dimensional channel output and denote the LLR-vector initial message. Then .

2) Let the random variable denote the initial message, conditioned on the all-zero codeword assumption. Then is Gaussian distributed, and satisfies (25) with .

D. Properties and Computation of

We examine along lines analogous to the development of ten Brink [36] for binary codes. In Appendix VIII-C, we showed that a Gaussian distributed as in Theorem 6 and characterized by may equivalently be obtained as the initial message, under the all-zero codeword assumption, of a -dimensional AWGN channel characterized by a parameter . The capacity of this channel is . The parameter induces an ordering on the AWGN channels such that channels with a greater are degraded with respect to channels with a lower . Thus, is monotonically increasing and is well defined. As , approaches zero. Thus,

Similarly

To compute and , we need to evaluate (26) for a Gaussian random variable as defined in Theorem 6. Following [35], we evaluate (26) for values of along a fine grid in the range ( being selected because ), and then apply a polynomial best fit to obtain an approximation of and (note that this operation is performed once: the resulting polynomial approximations of and are the same for all codes).

In [35], the equivalent was evaluated by numerically computing the one-dimensional integral by which the expectation is defined. In our case, the distribution of is multidimensional, and is more difficult to evaluate. We therefore evaluate the right-hand side of (26) empirically, by generating random samples of according to Theorem 6.
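For orientation, the binary analogue of this procedure estimates J(sigma) = 1 - E[log2(1 + exp(-L))] from symmetric-Gaussian LLR samples L ~ N(sigma^2/2, sigma^2), evaluates it on a grid of sigma, and fits a polynomial once; the q-ary computation is identical in structure, with the scalar samples replaced by LLR vectors drawn per Theorem 6. A minimal sketch of the binary case (names illustrative):

    import numpy as np

    rng = np.random.default_rng(1)

    def J_estimate(sigma, n=100_000):
        # Monte Carlo estimate of the binary mutual information for
        # symmetric-Gaussian LLRs L ~ N(sigma**2 / 2, sigma**2).
        L = rng.normal(sigma**2 / 2.0, sigma, size=n)
        return 1.0 - np.mean(np.log2(1.0 + np.exp(-L)))

    # Evaluate on a fine grid and fit a polynomial once, as in the text.
    grid = np.linspace(0.1, 10.0, 60)
    values = [J_estimate(s) for s in grid]
    J_poly = np.poly1d(np.polyfit(grid, values, deg=8))
    print(J_poly(2.0), J_estimate(2.0))  # the two agree closely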

E. Computation of

The computation of is performed along lines analogous to the computation of as described in Appendix VIII-D. We compute for fixed values of and , and for values of along a fine grid in the range . We then apply a polynomial best fit to obtain an approximation of for all and an approximation of .

To compute at a point of the above-discussed grid, we evaluate the right-hand side of (26) (replacing with a rightbound LLR-vector message ) empirically. Samples of are obtained by adding samples of initial messages to those of intermediate values. The samples of the initial messages are produced using Lemma 12 (with the coset symbol randomly selected with uniform probability). The samples of the intermediate values, for a given , are produced using Theorem 6.

Note that unlike , which satisfies , is greater than zero. This results from the fact that the distribution of the rightbound message corresponding to is equal to the initial message , and . Letting , we have that is not defined in the range .
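A sketch of the sampling step just described, again in the binary-LLR analogue: a rightbound sample at a degree-d variable node is one initial-message sample plus d - 1 intermediate-value samples (the scalar Gaussian forms are illustrative stand-ins for the paper’s LLR vectors):

    import numpy as np

    rng = np.random.default_rng(2)

    def rightbound_samples(sigma_ch, sigma_iv, d, n=100_000):
        # Each rightbound sample = one initial (channel) LLR sample plus
        # (d - 1) i.i.d. intermediate-value samples, each drawn from a
        # symmetric Gaussian N(sigma**2 / 2, sigma**2).
        init = rng.normal(sigma_ch**2 / 2.0, sigma_ch, size=n)
        inter = rng.normal(sigma_iv**2 / 2.0, sigma_iv, size=(n, d - 1))
        return init + inter.sum(axis=1)

    s = rightbound_samples(sigma_ch=1.5, sigma_iv=2.0, d=3)
    print(s.mean())  # about sigma_ch**2/2 + 2 * sigma_iv**2/2 = 5.125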

F. Computation of

Our development begins along the lines of Appendices VIII-D and VIII-E. We compute for fixed values of and , and for values of along a fine grid. We then apply a polynomial best fit to obtain an approximation of for all in this range.

To compute at a point of the above-discussed grid, we again evaluate the right-hand side of (26) empirically. We begin by applying to obtain the value of which (together with and ) characterizes the LLR-vector rightbound message distribution. We then produce samples of rightbound messages as described in Appendix VIII-E. We also produce samples of labels GF that are required to compute the leftbound samples of . The label samples are generated by uniform random selection. We use the samples of to empirically evaluate the right-hand side of (26) (replacing with ) and obtain . Note that computing (26) with instead of had no effect on the final result.

Finally, as defined in Section VII-E, like (discussed in Appendix VIII-E), is not defined for . This interval is not used in the EXIT chart analysis of Section VII-E.
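A label g acts on an LLR vector by permuting its coordinates through GF(q) multiplication of the indices. A minimal sketch for prime q, where field multiplication reduces to multiplication mod q (the permutation direction and indexing are illustrative; a general extension field would require a multiplication table):

    import numpy as np

    def apply_label(llr_vec, g, q):
        # llr_vec has length q - 1 and is indexed by the nonzero field
        # elements 1..q-1 (element 0 is the reference symbol). The entry
        # for symbol k becomes the entry for g*k mod q. Prime q only.
        assert 1 <= g < q
        return np.array([llr_vec[(g * k) % q - 1] for k in range(1, q)])

    rng = np.random.default_rng(3)
    q = 5
    w = rng.normal(size=q - 1)      # an LLR vector over GF(5) \ {0}
    g = int(rng.integers(1, q))     # label drawn uniformly, as in the text
    print(w, apply_label(w, g, q))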


ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers and the Associate Editor for their comments and help.

REFERENCES

[1] A. Bennatan and D. Burshtein, “On the application of LDPC codes to arbitrary discrete-memoryless channels,” IEEE Trans. Inf. Theory, vol. 50, no. 3, pp. 417–438, Mar. 2004.

[2] A. Bennatan and D. Burshtein, “Iterative decoding of LDPC codes over arbitrary discrete-memoryless channels,” in Proc. 41st Annu. Allerton Conf. Communication, Control, and Computing, Monticello, IL, Oct. 2003, pp. 1416–1425.

[3] A. Bennatan, D. Burshtein, G. Caire, and S. Shamai (Shitz), “Superposition coding for side-information channels,” IEEE Trans. Inf. Theory, submitted for publication.

[4] Matlab source code for EXIT charts. [Online]. Available: http://www.eng.tau.ac.il/~burstyn/publications.htm

[5] R. E. Blahut, Theory and Practice of Error Control Codes. Reading, MA: Addison-Wesley, 1984.

[6] D. Burshtein and G. Miller, “Bounds on the performance of belief propagation decoding,” IEEE Trans. Inf. Theory, vol. 48, no. 1, pp. 112–122, Jan. 2002.

[7] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” IEEE Trans. Inf. Theory, vol. 44, no. 3, pp. 927–946, May 1998.

[8] S.-Y. Chung, G. D. Forney, Jr., T. Richardson, and R. Urbanke, “On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit,” IEEE Commun. Lett., vol. 5, no. 2, pp. 58–60, Feb. 2001.

[9] S.-Y. Chung, T. J. Richardson, and R. L. Urbanke, “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 657–670, Feb. 2001.

[10] M. C. Davey and D. MacKay, “Low-density parity check codes over GF(q),” IEEE Commun. Lett., vol. 2, no. 6, pp. 165–167, Jun. 1998.

[11] D. E. Dudgeon and R. M. Mersereau, Multidimensional Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1984.

[12] P. Elias, “Coding for noisy channels,” in IRE Conv. Rec., vol. 3, Mar. 1955, pp. 37–46.

[13] G. D. Forney, Jr. and G. Ungerboeck, “Modulation and coding for linear Gaussian channels,” IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2384–2415, Oct. 1998.

[14] C. Fragouli, R. D. Wesel, D. Sommer, and G. Fettweis, “Turbo codes with nonuniform QAM constellations,” in Proc. IEEE Int. Conf. Communication, vol. 1, Helsinki, Finland, Jun. 2001, pp. 70–73.

[15] B. J. Frey, R. Koetter, and A. Vardy, “Signal-space characterization of iterative decoding,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 766–781, Feb. 2001.

[16] R. G. Gallager, Low Density Parity Check Codes. Cambridge, MA: MIT Press, 1963.

[17] R. G. Gallager, Information Theory and Reliable Communication. New York: Wiley, 1968.

[18] J. Hou, P. H. Siegel, L. B. Milstein, and H. D. Pfister, “Capacity-approaching bandwidth-efficient coded modulation schemes based on low-density parity-check codes,” IEEE Trans. Inf. Theory, vol. 49, no. 9, pp. 2141–2155, Sep. 2003.

[19] A. Kavcic, X. Ma, and M. Mitzenmacher, “Binary intersymbol interference channels: Gallager codes, density evolution and code performance bounds,” IEEE Trans. Inf. Theory, vol. 49, no. 7, pp. 1636–1652, Jul. 2003.

[20] A. Khandekar, “Graph-based codes and iterative decoding,” Ph.D. dissertation, Calif. Inst. Technol., Pasadena, CA. [Online]. Available: http://etd.caltech.edu

[21] F. R. Kschischang, B. J. Frey, and H.-A. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 498–519, Feb. 2001.

[22] G. Li, I. J. Fair, and W. A. Krzymien, “Analysis of nonbinary LDPC codes using Gaussian approximation,” in Proc. IEEE Int. Symp. Information Theory, Yokohama, Japan, Jun./Jul. 2003, p. 234.

[23] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman, “Efficient erasure correcting codes,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 569–584, Feb. 2001.

[24] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman, “Improved low-density parity-check codes using irregular graphs,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 585–598, Feb. 2001.

[25] R. J. McEliece, “Are turbo-codes effective on nonstandard channels?,” IEEE Information Theory Society Newsletter, vol. 51, no. 4, pp. 1–8, Dec. 2001.

[26] E. A. Ratzer and D. J. C. MacKay, “Sparse low-density parity-check codes for channels with cross-talk,” in Proc. ITW2003, Paris, France, Mar./Apr. 2003.

[27] T. Richardson and R. Urbanke, “Efficient encoding of low-density parity-check codes,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 638–656, Feb. 2001.

[28] T. Richardson and R. Urbanke, “The capacity of low-density parity check codes under message-passing decoding,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.

[29] T. Richardson, A. Shokrollahi, and R. Urbanke, “Design of capacity-approaching irregular low-density parity-check codes,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 619–637, Feb. 2001.

[30] P. Robertson and T. Wörz, “Bandwidth-efficient turbo trellis-coded modulation using punctured component codes,” IEEE J. Sel. Areas Commun., vol. 16, no. 2, pp. 206–218, Feb. 1998.

[31] E. Sharon, A. Ashikhmin, and S. Litsyn, “EXIT functions for the Gaussian channel,” in Proc. 40th Annu. Allerton Conf. Communication, Control, and Computing, Allerton, IL, Oct. 2003, pp. 972–981.

[32] A. Shwartz and A. Weiss, Large Deviations for Performance Analysis. London, U.K.: Chapman & Hall, 1995.

[33] F.-W. Sun and H. C. A. van Tilborg, “Approaching capacity by equiprobable signaling on the Gaussian channel,” IEEE Trans. Inf. Theory, vol. 39, no. 5, pp. 1714–1716, Sep. 1993.

[34] R. M. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inf. Theory, vol. IT-27, no. 5, pp. 533–547, Sep. 1981.

[35] S. ten Brink, G. Kramer, and A. Ashikhmin, “Design of low-density parity-check codes for modulation and detection,” IEEE Trans. Commun., vol. 52, no. 4, pp. 670–678, Apr. 2004.

[36] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Trans. Commun., vol. 49, no. 10, pp. 1727–1737, Oct. 2001.

[37] N. Varnica, X. Ma, and A. Kavcic, “Iteratively decodable codes for bridging the shaping gap in communications channels,” in Proc. Asilomar Conf. Signals, Systems and Computers, Pacific Grove, CA, Nov. 2002.

[38] C. C. Wang, S. R. Kulkarni, and H. V. Poor, “Density evolution for asymmetric channels,” IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4216–4236, Dec. 2005.

[39] H. Wymeersch, H. Steendam, and M. Moeneclaey, “Log-domain decoding of LDPC codes over GF(q),” in Proc. IEEE Int. Conf. Communications, Paris, France, Jun. 2004, pp. 772–776.

