
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 44, NO. 7, NOVEMBER 1998 2977

Analysis and Design of Trellis Codes Optimized for a Binary Symmetric Markov Source with MAP Detection

James M. Kroll, Member, IEEE, and Nam C. Phamdo, Member, IEEE

Abstract—We consider the problem of transmitting a binary symmetric Markov source (BSMS) over the additive white Gaussian noise (AWGN) channel. The coding technique considered is trellis-coded modulation (TCM), where we utilize decoders which implement the maximum-likelihood (ML) and maximum a posteriori (MAP) criteria. Employing 8-PSK Ungerboeck codes on a BSMS with state transition probability 0.1, we first show that the MAP decoder realizes a 0.8–2.1-dB coding gain over the ML decoder. Motivated by these gains, we consider the design of trellis codes optimized for the BSMS/AWGN/MAP system. An approximate union bound is established for this system. Using this bound, we found codes which exhibit additional 0.4–1.1-dB gains over Ungerboeck codes. Finally, we compare the proposed TCM system with a tandem coding system. At a normalized signal-to-noise ratio (SNR) of 10.8 dB and below, the proposed system significantly outperforms the tandem system.

Index Terms—MAP detection, Markov source, trellis-coded modulation.

I. INTRODUCTION

TRELLIS-coded modulation (TCM), as conceived by Ungerboeck [1], was originally investigated as a coding scheme for the additive white Gaussian noise (AWGN) channel. Under the assumption of a binary symmetric (equally likely) source, producing an independent and identically distributed (i.i.d.) bit stream, the optimal decoder is one which utilizes the maximum-likelihood (ML) criterion. In practice, this is implemented using the Viterbi algorithm [2]. For a given code complexity, optimal codes for this source and channel were defined as codes which maximized the free squared Euclidean distance d_free^2. This is an asymptotic assumption; that is, we assume the signal-to-noise ratio (SNR) is large enough to ignore the effect of error paths with squared distance exceeding d_free^2. As the SNR decreases, this assumption becomes inappropriate, and we must resort to a more detailed analysis of the code to predict its performance. Typically, this is done by determining the distance spectrum

Manuscript received July 20, 1996; revised August 2, 1997. This work was supported in part by Nippon Telegraph and Telephone Corporation and by Northrop Grumman Corporation. The material in this paper was presented at the IEEE International Symposium on Information Theory, Ulm, Germany, June–July 1997.

J. M. Kroll is with 3Com’s Carrier Systems Division, Mount Prospect, IL 60056 USA (e-mail: [email protected]).

N. C. Phamdo is with the Department of Electrical and Computer Engineering, State University of New York, Stony Brook, NY 11794-2350 USA (e-mail: [email protected]).

Publisher Item Identifier S 0018-9448(98)06894-1.

of the code, and applying a union bound to characterize the effect of all possible error paths at the SNR of interest [3].

In this paper, we again consider the application of TCM to the AWGN channel. However, we assume the source exhibits a certain redundancy. Specifically, we assume the source is a binary symmetric Markov source (BSMS). This assumption is motivated by the fact that many practical source encoders fail to remove all the redundancy inherent in the source. The remaining redundancy can be modeled, and utilized at the decoder. One such example is the code excited linear predictive (CELP) speech coder [4], in which a residual redundancy exists in the line spectral pair (LSP) parameters. Other examples include the residual redundancy in ADPCM image coding [5], and the residual redundancy which exists after discrete cosine transformation or subband coding of images [6]. Given a coded-channel sequence, produced from a source which generates data according to some quantifiable model, we may now utilize the maximum a posteriori (MAP) criterion when determining survivor paths through the trellis. The MAP decoder can utilize the appropriate model of source redundancy to more effectively combat channel noise. This provides performance gains over those obtainable by ML decoding (i.e., if we disregard the redundancy inherent in the information sequences).

This paper provides a method by which we may predict the performance of trellis codes given a BSMS, AWGN channel, and MAP detection. In practice, many sources may be described by such Markovian models. One example is the bit stream resulting from FAX documents. Another is the residual redundancy observed after vector quantization of grey-scaled images [7]. Because the gains associated with MAP detection are especially significant at low SNR, our interest is in developing a bound based on a detailed analysis of the distance spectrum [3]. This bound is then used to find trellis codes which are optimized for this source/channel/decoder model. Here, our optimization criterion is the minimization of the error event probability.

Section II discusses the system model and presents the ML and MAP criteria. In Section III, we investigate the coding gains achieved by MAP detection. This is done by simulating standard Ungerboeck codes for both the ML and MAP decoders. The coding gains achieved provide the motivation to find trellis codes which are optimized for the BSMS/AWGN/MAP system. Section IV begins with a review of error-bounding results for trellis coding and ML detection.

0018–9448/98$10.00 1998 IEEE


Fig. 1. Generalized systematic convolutional encoder with ν delays.

We then provide a result for the pairwise error probability assuming MAP detection. Finally, a computationally efficient, approximate bound is developed for the case of trellis coding on the BSMS/AWGN/MAP system. In Section V, we utilize this bound to search over an ensemble of trellis codes, and identify codes which are optimum for this system over a range of SNR. We consider 8-PSK modulation, at a data rate of 2 bits per modulation interval. However, the search could easily be modified to consider other modulations and data rates. The resulting codes achieve gains of up to 1.09 dB relative to Ungerboeck codes of comparable complexity. Additionally, simulation studies suggest that the optimized codes are robust to both source/code and source/decoder mismatch. Finally, Section VI offers a summary of the relevant results and suggestions for future work.

II. SYSTEM MODEL

A BSMS is a stationary stochastic process which may be completely characterized by the source probability transition matrix

P(x_{n+1} = j | x_n = i) = p if i = j, and 1 − p if i ≠ j (1)

and the initial distribution P(x_0 = 0) = P(x_0 = 1) = 1/2. When p = 1/2, (1) reduces to a binary symmetric i.i.d. source. Throughout this paper, we are mainly interested in p = 1/2 (binary symmetric i.i.d. source) and p = 0.9 (BSMS).

The sequence is broken up into blocks of two bits per modulation interval. Assuming 8-PSK modulation, the generalized systematic encoder with ν delays is illustrated in Fig. 1. Here, the connections at level

of the encoder are completely described by the parity-check polynomials

for

and

for

Vectors spanning modulation intervals are denoted in the following manner:

Here and represent the input and output of the convolutional encoder at time . For the systematic convolutional encoder in Fig. 1, for

. Hence is a three-bit vector which is mapped into the 8-PSK signal-space point according to:

. At the decoder, we denote the complex signal received at time as , where

(2)

In this equation, is assumed to be a complex zero-mean i.i.d. Gaussian random process with diagonal covariance matrix σ²I, where I is the identity matrix. In some cases, we use to denote . Focusing again on our vector notation, corresponds to an -dimensional binary information sequence composed of two-bit vectors ( is bits long). It is convolutionally coded over modulation intervals to form , a complex -dimensional codeword. Hence maps the binary codeword sequence into the complex channel sequence . Finally, is the noise-corrupted observation of the transmitted sequence at the receiver.
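As a concrete illustration of the source and channel model above, the following Python sketch generates a BSMS bit stream, maps three-bit labels onto unit-energy 8-PSK points, and adds complex AWGN. The natural-binary phase mapping and all function names are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def bsms(n, p=0.9, rng=np.random.default_rng(0)):
    """Binary symmetric Markov source: each bit repeats the previous
    bit with probability p; the initial bit is equally likely 0 or 1."""
    bits = np.empty(n, dtype=int)
    bits[0] = rng.integers(2)
    stay = rng.random(n - 1) < p
    for k in range(1, n):
        bits[k] = bits[k - 1] if stay[k - 1] else 1 - bits[k - 1]
    return bits

def psk8(labels):
    """Map integer labels 0..7 onto unit-energy 8-PSK signal points
    (natural-binary phase mapping assumed)."""
    return np.exp(2j * np.pi * np.asarray(labels) / 8)

def awgn(x, sigma2, rng=np.random.default_rng(1)):
    """Complex zero-mean white Gaussian noise with variance sigma2
    per real dimension, matching the covariance sigma^2 * I above."""
    noise = rng.normal(0.0, np.sqrt(sigma2), np.shape(x)) \
        + 1j * rng.normal(0.0, np.sqrt(sigma2), np.shape(x))
    return x + noise
```

For p = 0.9, roughly 90% of adjacent output bits agree; for p = 0.5 the output is a binary symmetric i.i.d. stream.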

Two decoding rules are considered. Both are implemented using the Viterbi algorithm.

• ML—Choose the sequence which minimizes the likelihood metric

(3)

• MAP—Choose the sequence which minimizes the a posteriori metric

(4)

Here, is the set of all possible coded sequences, and is the probability of the corresponding information sequence

. Details of the implementation of the MAP decoder using the Viterbi algorithm can be found in [8]. We note that the implementation of the MAP decoder using (4) requires knowledge of the noise parameter σ², whereas for the ML decoder of (3) it is not needed.
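The two rules differ only in their path metrics. A minimal sketch of the branch metrics behind (3) and (4), up to additive constants, assuming a serial first-order Markov prior on the information bits (the names and signatures are ours, not the paper's):

```python
import numpy as np

def ml_metric(r, c):
    """ML branch metric (3): squared Euclidean distance between the
    received value r and the candidate signal point c.  No knowledge
    of the noise variance is required."""
    return np.abs(r - c) ** 2

def map_metric(r, c, last_bit, new_bits, sigma2, p=0.9):
    """MAP branch metric (4), up to additive constants:
    ||r - c||^2 - 2*sigma2*ln P(new_bits | last_bit), where the prior
    is a BSMS with stay probability p.  Unlike the ML metric, the
    noise variance sigma2 must be known at the decoder."""
    log_prior = 0.0
    prev = last_bit
    for b in new_bits:              # serial first-order Markov prior
        log_prior += np.log(p if b == prev else 1 - p)
        prev = b
    return np.abs(r - c) ** 2 - 2 * sigma2 * log_prior
```

With p = 0.5 the prior term is a constant, so the MAP metric reduces to the ML metric; the σ²-weighting of the prior is what makes knowledge of the noise parameter essential for MAP decoding.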

For the case of MAP detection, we assume the decoder knows σ² precisely. In practice, this may be achieved by first calculating and then transmitting this parameter with


Fig. 2. Performance evaluation of the 16-state Ungerboeck code, assuming a BSMS with p = 0.9. Simulation (sim) and bound results for the ML and MAP decoders. Simulation results for the ML decoder with Markov and binary symmetric i.i.d. sources are essentially identical.

TABLE I
MAP GAINS (IN DECIBELS) OVER THE ML DECODER FOR THE 8-, 16-, 32-, AND 64-STATE UNGERBOECK CODES FOR VARIOUS ERROR EVENT PROBABILITIES. THE SNR (IN dB) AT WHICH THE CODE ACHIEVES A PARTICULAR Pe IS GIVEN IN PARENTHESES

adequate error coding to ensure it is received correctly. The bandwidth efficiency is not compromised since the required overhead should be negligible relative to the size of the data stream.

III. MAP VERSUS ML PERFORMANCE FOR UNGERBOECK CODES

In this section, we consider the performance of several Ungerboeck codes for the source and channel of interest. Specifically, the BSMS has a probability p = 0.9 of staying in the same state, and we quantify the relative performance of the ML and MAP decoders for the 8-, 16-, 32-, and 64-state Ungerboeck codes [1].

The performance of the 16-state Ungerboeck code is plotted in Fig. 2. This figure provides performance predictions based on computer simulation, as well as union bounds for each decoder. The bound for the ML decoder is calculated using the algorithm described in [3]. The bound for the MAP decoder is calculated using an algorithm developed in Sections IV and V. At an error event probability of , the MAP decoder achieves a 1.74-dB gain over the ML decoder. More significant gains are obtained at lower SNR. As the SNR increases, we notice the power gains achieved by the MAP decoder slowly diminish. The curve for ML decoding of the i.i.d. source is important to the discussion in Section IV-C2.

Table I summarizes the coding gains achieved by the MAP decoder over the ML decoder for the aforementioned codes. All coding gains cited are determined by computer simulation. The MAP decoder achieves coding gains in the range of 0.80–2.13 dB, with the most significant gains realized at low SNR. Recently, the problem of transmitting a BSMS over discrete memoryless channels, utilizing ML and MAP decoders, was addressed in [8]. In that paper, the superiority of the MAP decoder, especially at low SNR, was likewise observed. A similar relationship between MAP and ML performance was observed in [9], where these decoders were utilized to detect the CELP LSP parameters, assuming convolutional and block FEC schemes. Finally, similar relative performance was observed in [10], where trellis coding with ML and MAP detection was utilized to code both Markov and CELP LSP sources, assuming both AWGN and Rayleigh channels. Hence, the results reported here are characteristic of the MAP decoder.

As previously mentioned, the Ungerboeck codes were designed to maximize d_free^2. This criterion is only appropriate at high SNR, and for the ML decoder. This fact, along with the coding gains obtained by the MAP decoder, provides motivation to find trellis codes which are optimized for the BSMS/AWGN/MAP system. In particular, the performance of TCM on this system at very low SNR is of primary interest. The remainder of this paper is devoted to solving this problem, and quantifying the coding gains obtained by the optimum TCM codes over Ungerboeck codes for this system.

IV. PERFORMANCE BOUNDS AND ANALYSIS

A. AWGN Channel, with ML Decoding

This subsection reviews several well-known results for the performance of TCM given a binary symmetric i.i.d. source, AWGN channel, and ML decoder. We denote the probability of a pairwise error event as

(5)


This is the probability that the received sequence is closer to the incorrect sequence than to the correct sequence

, given was transmitted. Here is the squared Euclidean distance between the sequences and , and Q is the Gaussian integral function Q(x) = (1/√(2π)) ∫_x^∞ e^(−t²/2) dt.

A union bound on the error event probability may be written as

(6)

Here, is the a priori probability of transmitting the sequence , and each summation is over all coded sequences in . Such a bound is useful given the need to predict code performance at very high SNR, where simulation becomes impractical due to the small error rates involved, and at low SNR where asymptotic approximations based on d_free^2 are no longer valid.

Traditionally, the analysis of quasiregular trellis-coded systems is accomplished by determining the set of “error paths” relative to the all-zeros sequence [3]. The significance of quasiregularity is that for a binary symmetric i.i.d. source, the previous union bound simplifies to a summation over all nonzero error sequences

(7)

where is the collection of nonzero binary codewords [3]. Furthermore, is the set of squared Euclidean distances resulting from codewords which are paired over . Equivalently, the upper bound on may be written as a summation over all lines in the distance spectrum, where such lines are defined by a squared Euclidean distance and multiplicity . Notice that the multiplicity of paths at may be determined using the distance polynomial in the following manner:

(8)

where is provided in Fig. 3(b). Furthermore, the polynomial coefficient is the conditional probability that the squared Euclidean distance between two coded sequences is given . Clearly, these conditional probabilities sum to one.
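Numerically, the bound (7) is a weighted sum of pairwise error probabilities over the lines of the distance spectrum. A sketch, assuming the standard pairwise term Q(d/(2σ)) for the AWGN/ML case with σ² the per-dimension noise variance, and a spectrum supplied as a {squared distance: multiplicity} map (the representation is our choice):

```python
from math import erfc, sqrt

def qfunc(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * erfc(x / sqrt(2.0))

def union_bound_ml(spectrum, sigma2):
    """Union bound in the spirit of (7): each spectral line with
    squared Euclidean distance d2 and multiplicity mult contributes
    mult * Q(sqrt(d2) / (2*sigma)) to the error event probability."""
    sigma = sqrt(sigma2)
    return sum(mult * qfunc(sqrt(d2) / (2.0 * sigma))
               for d2, mult in spectrum.items())
```

Truncating the spectrum to its first few lines is what makes the bound computable in practice; the neglected high-distance lines matter most at low SNR, which is one reason the bound loosens there.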

B. Pairwise Error Event Probability—MAP Decoder

In this section, an expression for the pairwise error event probability is provided for the MAP decoder. Given the correct sequence and an incorrect sequence , the MAP decoder chooses over if

Fig. 3. (a) One possible trellis error pair, with (b) 8-PSK mapping and applicable distance polynomials. Outer constellation labels correspond to encoded bits as defined in Fig. 1. Inner labels correspond to signal-space labels. δ_i for i ∈ {0, 1, 2, 3} are squared distances.

Making the appropriate substitutions from (4), we obtain

(9)

In [11], we show that the probability density function (pdf) of the random variable is normally distributed as

This pdf can then be integrated over the region to find the corresponding error event probability assuming a MAP decoder

(10)

where and . Observe that when and are equally likely, , and


is equivalent to . The definition in (10) is made because the pairwise error probability depends on two codeword parameters: the squared Euclidean distance between and , and the ratio of their probabilities.
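A sketch of this pairwise error probability in code, assuming the standard form Q(d/(2σ) + (σ/d)·ln λ) for (10), where d² is the squared Euclidean distance and λ is the ratio of the correct-sequence probability to the incorrect-sequence probability (the parameterization is our assumption):

```python
from math import erfc, sqrt, log

def qfunc(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * erfc(x / sqrt(2.0))

def p2_map(d2, sigma2, ratio):
    """Pairwise error probability under MAP detection, assumed form:
    Q(d/(2*sigma) + (sigma/d) * ln(ratio)), with ratio = P(correct
    sequence) / P(incorrect sequence).  For ratio = 1 (equally likely
    sequences) this reduces to the ML expression Q(d/(2*sigma))."""
    d = sqrt(d2)
    sigma = sqrt(sigma2)
    return qfunc(d / (2.0 * sigma) + (sigma / d) * log(ratio))
```

The prior enters through the term σ·ln λ/d: a more probable correct sequence (λ > 1) pushes the decision threshold away from it, and the effect is strongest at low SNR, consistent with the gains observed in Section III.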

C. A Bound on Pe for the BSMS/AWGN/MAP System

1) Efficient Calculation of Given : The union bound on Pe given in (6) is a computationally burdensome expression. The assumptions of a binary symmetric i.i.d. source, quasiregularity, and ML decoding are essential when this equation is reduced to (7). For the trellis-coded BSMS with MAP decoding, it would appear we must resort to an expression analogous to (6) to upper-bound Pe. Our goal, therefore, is to develop a computationally efficient bound on Pe for the BSMS/AWGN/MAP system.

Fig. 3 illustrates one particular pair of paths through a

trellis. Traditionally, we are concerned with determining error sequences relative to the all-zeros information sequence. Let us apply this approach to Fig. 3 and assess the information the error sequence provides. In this figure, . This provides all the necessary information to determine the set of squared Euclidean distances between 8-PSK channel sequences which are paired over . However, it does not appear to provide us with information regarding the ratio of sequence probabilities required in (10). To this end, we define some additional terminology and parameters. We consider and to be paired over provided

where

Hence denotes a modulo-2 addition over the corresponding bits. For example, in Fig. 3, and

. Let and represent the number of bit changes along the sequences and , respectively, and let

. For the example in Fig. 3, we have and . Utilizing these parameters, the required

ratio of sequence probabilities is written as

(11)

Given our definition for , this equation would be useful for the calculations required in (6) with (10).

To develop a bound analogous to (7), we need to find a relationship between the parameter and the error sequence. Consider the pair of trellis paths in Fig. 3, and the corresponding error sequence. For an arbitrary channel sequence two modulation intervals long, provides information regarding the set of possible squared Euclidean distances between and , but we cannot guarantee that . However, does provide information concerning the distribution of . This is demonstrated in Table II. In this table, we consider the probability of transmitting each of the possible four-bit

TABLE II
THE DISTRIBUTION OF Δ GIVEN e = (001, 010). P(Δ = z) IS OBTAINED BY SUMMING OVER ALL P(I) FOR WHICH Δ = z, WHERE z ∈ {−2, 0, 2}

information sequences, the resulting sequences, and the corresponding values of assuming . This table provides the probability mass function for , given the error sequence. This suggests an approach to calculating a bound analogous to (7). Specifically,

(12)

where

(13)

Here is the set of values for given , and is the pairwise error probability for the sequence . Notice that even if we know the error sequence, there is an ambiguity with respect to the values of and . In (12), we resolve this ambiguity by averaging over all possible values of and associated with the error sequence.

The alternative to using (12) and (13) as our bound on Pe is to utilize (6), where is replaced by (10). In practice, it is impossible to use an infinite number of codeword pairs as implied in (6). Typically, researchers define some cutoff on the maximum length of the codewords in each sum. Notice that the number of terms (codewords) in each sum increases exponentially with codeword length. For example, if we increase the maximum length of an allowable codeword by one modulation interval, the number of terms in each sum increases by a factor of . Hence the overall complexity increases by a factor of . The expression in (12) and (13) requires a triple summation. Notice, however, that the number of terms in the two right-most sums increases approximately linearly with error vector length (both increase at most by a factor of ). Therefore, the complexity of (12) and (13) increases by a factor of for each one


TABLE III
SOURCE-CHANNEL OPTIMIZED (SCO) TRELLIS CODES FOR THE BSMS/AWGN/MAP SYSTEM. SNR_opt IS THE OPERATING POINT AT WHICH THE PARTICULAR CODE IS OPTIMUM. THESE CODES WERE OPTIMIZED FOR A BSMS WITH p = 0.9

modulation-interval increase in error vector length. Clearly, the bound in (12) and (13) is substantially more efficient than the alternative approach when the maximum vector length and number of states become large.

This approach to calculating is still cumbersome, though, since it appears a table similar to Table II is required for all possible error sequences. The following theorem provides a method of determining the distribution of for a given error sequence.

Theorem 1: Let be a BSMS with probability of staying in the same state. Let be a fixed binary error sequence and let . Let and be the number of bit changes in and , respectively. Let

Finally, let . Then

i) takes values in the set

(14)

and ii) is binomially distributed according to

(15)

where .

Therefore, the distribution of depends only on and . Furthermore, is conditionally independent of and given . We may thus write (14) as and (15) as . The proof of Theorem 1 is provided in the Appendix.
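Theorem 1 can be checked by brute-force enumeration. The sketch below computes the exact distribution of Δ = (bit changes in x ⊕ e) − (bit changes in x) when x is drawn from a BSMS; identifying the binomial order with the number of bit changes in e is our assumption about the notation.

```python
import itertools

def changes(bits):
    """Number of bit changes along a binary sequence."""
    return sum(a != b for a, b in zip(bits, bits[1:]))

def bsms_prob(bits, p):
    """BSMS probability of a sequence: uniform initial bit, then each
    bit repeats the previous one with probability p."""
    pr = 0.5
    for a, b in zip(bits, bits[1:]):
        pr *= p if a == b else 1 - p
    return pr

def delta_distribution(e, p):
    """Exact distribution of Delta = changes(x XOR e) - changes(x)
    over all source sequences x, weighted by their BSMS probability."""
    dist = {}
    for x in itertools.product((0, 1), repeat=len(e)):
        xh = tuple(xi ^ ei for xi, ei in zip(x, e))
        d = changes(xh) - changes(x)
        dist[d] = dist.get(d, 0.0) + bsms_prob(x, p)
    return dist
```

For e = (0, 0, 0, 1, 0), which has two bit changes, the enumeration yields support {−2, 0, 2} with probabilities (1−p)², 2p(1−p), and p², a binomial distribution of order two, in agreement with the theorem and with the support shown in Table II.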

2) Estimating : The calculation of in (12) requires the distribution . In the case of a binary symmetric i.i.d. source, can be easily obtained from the distance polynomial coefficient in (8). Obtaining for the BSMS is generally more difficult, since may not equal for the BSMS. In some cases, can be obtained easily. For example, if is monomial for , then for . In the following, we derive further conditions on such that .

Define the set . This set contains the two three-bit error sequences which yield an ambiguity in squared distance between 8-PSK points which are paired over

( and are not monomial). This results from the mapping and distance polynomials of Fig. 3. For example, given two information bits and at time , the squared distance between 8-PSK points corresponding to and with depends on , and will be either or . This implies that given a known information sequence, the distance between and at a time corresponding to the occurrence of is actually dependent on the value .

Assuming occurrences of over arbitrary time intervals , the coefficients of the distance polynomial may be written as follows:

(16)

where is the deterministic component of squared distance obtained from those .

The above implies that during the time intervals such that , all possible sequences of and are equally likely. For example, if , the nondeterministic distances are , and each of these four elements is equiprobable.

Define the vector , which is a length- binary vector specifying the values for for the codeword at the times such that , where and diverge from state . Hence knowing the vectors and , the squared distance between codewords diverging from state containing the information sequences and is known exactly.

Observation 1: For a code with ν delays, an information vector is capable of producing codewords . The codeword produced depends on the encoder state when initially


enters the encoder. Given an error sequence with , these codewords will produce a set of possibly nonunique sequences which we denote as . If each codeword produced by is equiprobable, the sequences in are equiprobable as well. Consider an error sequence with occurrences of and an information vector . If all possible binary -tuples are in and they occur with equal likelihood, then is equal to , regardless of the source distribution (likelihood of ).

Utilizing this observation, we now develop a theorem which provides conditions on which guarantee that and are equal to .

Lemma 1: At any arbitrary point in time , a systematic encoder is equally likely to be in any of the states, regardless of the source distribution.

Because the identical ensemble of information sequences converges into each state, the probability flux into each state is identical, and Lemma 1 must be true.
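This claim can be verified numerically for a small example. The sketch below tracks the joint distribution of (source bit, encoder state) for a hypothetical two-delay recursive systematic encoder driven by a BSMS (the feedback taps are our assumption, not those of Fig. 1), and checks that the uniform state distribution is invariant:

```python
import numpy as np

NU = 2          # encoder memory (number of delays), an assumed value
P_STAY = 0.9    # BSMS probability of staying in the same state

def next_state(s, b):
    """State update of an assumed recursive systematic encoder with
    NU = 2: shift in the feedback bit b XOR s0 XOR s1."""
    fb = b ^ (s & 1) ^ ((s >> 1) & 1)
    return ((s << 1) | fb) & ((1 << NU) - 1)

def step(pi):
    """One step of the joint (source bit, encoder state) Markov chain;
    pi[b, s] is the probability of source bit b and encoder state s."""
    out = np.zeros_like(pi)
    for b in range(2):
        for s in range(1 << NU):
            for b2 in range(2):
                pr = P_STAY if b2 == b else 1.0 - P_STAY
                out[b2, next_state(s, b2)] += pi[b, s] * pr
    return out
```

Because the state update is invertible in the state for each input bit, the uniform state distribution is stationary for any stationary source, and an arbitrary starting distribution mixes toward it; this is the probability-flux argument above in matrix form.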

Lemma 2: Define the sequence

For any fixed information sequence, each of the vectors is possible. The sequence produced depends only on the encoder state when initially enters the encoder.

Lemma 2 is a consequence of the encoder memory and is easily verified.

Theorem 2: If an error sequence contains occurrences of over or fewer modulation intervals, then , regardless of the source model and relative likelihood of the information vectors.

Using the previous observation, it is clear that if and the constraint of Theorem 2 is satisfied, any information sequence will produce a set of equiprobable (Lemma 1) sequences , such that each -tuple occurs an equal number of times (Lemma 2). As a result, the set of codewords produced by yields a distribution on the set of squared Euclidean distances between codewords which are paired over that is in agreement with . This implies that .

Now consider averaging over all possible information vectors

(17)

Since is a function of and . If the constraint of Theorem 2 is satisfied, and (17) may be rewritten as

(18)

The summation in (18) is equal to one. Therefore, . An analogous set of equations and arguments is easily used to show that . This proves Theorem 2.

In [11], we show that for a BSMS, and may be code-dependent and are not necessarily described by . If and the condition of Theorem 2 is not satisfied, a calculation such as

is required. Such a calculation would add an enormous complexity to the proposed search method of Section V and is not of practical interest.

In light of Theorem 2, we choose to approximate by using the coefficients of the distance polynomial . For practical purposes, this should yield an excellent approximation. From Theorem 2, we know that even if our approximation for is incorrect, the error sequence must be at least modulation intervals long ( cannot occur at the beginning or end of an error pair). Such an error sequence is very likely to cause the codewords paired over to have accumulated a distance far exceeding . Furthermore, because , we would expect worst weight spectral lines associated with to have low multiplicity. Given these two observations, it is clear that error sequences for which Theorem 2 does not hold will not yield a significant contribution to Pe. To support this claim, we offer the following two results. First, notice that if is not correctly described by when , then the distance spectrum of a code is dependent on . For this reason, an ML decoder designed for the code would in theory have different performance for different values of . In Fig. 2, however, we notice that simulation results for the Ungerboeck 16-state ML decoder for two sources where are essentially identical. Second, notice that the bounding results for the BSMS implemented with our approximations to (for ML decoding) and (for MAP decoding using a result from Section IV-C3) are in excellent agreement with the simulation results of Fig. 2 for . At very low SNR, both bounds are quite loose. This, of course, is characteristic of the union bound. For the remainder of this paper, all references to the distance spectrum of a code refer to the approximation of the true distance spectrum by setting for all .

3) A Computationally Efficient, Approximate Upper Bound on : The expression for in (12) and (13) may be rewritten in a more manageable form by realizing that the resulting sum is really a summation over the set of lines

in the applicable distance spectrum. There is one subtle problem we must account for, however. In practice, the ratio of decoded sequence probabilities will depend on the value of the last bit along the survivor into the state from which the two sequences diverge. Assume the information sequences and begin at time index , and denote the last decoded bit along the survivor as . Clearly, the value of will alter the decoder's estimates for and . Notice also that if , then the sequence probabilities are altered in the same way, and may be disregarded. In the discussion which follows, we simplify the notation in the following manner: .


A further complication results from the fact that may have been decoded incorrectly. Let the probability of the event

be denoted as . In [11], we show that the pairwise error event probability can be written as a function of three parameters: , , and

(19)

To calculate the union bound, we use in place of in (13). The bound can be efficiently computed using spectral lines. A spectral line is a triplet: . Note that

is a function of and . Define

(20)

as the pairwise error probability for a given spectral line, and

(21)

as the multiplicity of a given line. Then the bound in (12) and (13) reduces to

(22)

We will utilize this result in the code search described in the next section.
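Because the paper's expressions (19)–(22) are elided in this extraction, the following Python sketch only illustrates the general shape of such a spectral-line union bound. It uses the standard sequence-MAP pairwise error form on the AWGN channel (cf. [10]), in which the a priori log-ratio of the two sequences shifts the argument of the Gaussian Q-function; all function and parameter names here are ours, not the paper's.

```python
import math

def q(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def map_pairwise(d2, log_ratio, sigma):
    """Pairwise error probability for one spectral line under sequence-MAP
    decoding on the AWGN channel (standard form, cf. [10]):
        Q( d/(2*sigma) + (sigma/d) * ln(P(correct)/P(competitor)) ).
    d2        -- squared Euclidean distance between the codeword pair
    log_ratio -- log of the a priori probability ratio of the two sequences
    sigma     -- channel noise standard deviation
    """
    d = math.sqrt(d2)
    return q(d / (2.0 * sigma) + (sigma / d) * log_ratio)

def union_bound(lines, sigma):
    """Sum multiplicity-weighted pairwise terms over the spectral lines,
    in the spirit of (22).  Each line is (d2, log_ratio, multiplicity)."""
    return sum(m * map_pairwise(d2, lr, sigma) for d2, lr, m in lines)
```

A sequence that is a priori more likely than its competitor (positive `log_ratio`) gets a smaller pairwise error probability, which is the mechanism the MAP decoder exploits.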

V. CODE SEARCH—PROCEDURES AND RESULTS

A. Code Analysis

Equation (22) provides the bound on for trellis coding on the BSMS/AWGN/MAP system. Now, a method of determining the spectral lines in must be chosen. In [3], a computationally efficient bidirectional algorithm is described which is capable of identifying error sequences and terminates when the desired number of sequences is found. Here, we require a larger set of error parameters when quantifying the error vectors in each code. The modifications to the algorithm in [3] are quite straightforward. We develop these extensions in [11].

B. Code Search Method

The code search procedure is simple. Choose a value for , the number of delays in the generalized encoder structure of Fig. 1. The choice of is associated with an ensemble of codes which are generated by considering all possible encoders of the form in Fig. 1, where the order of equals . For each code in this ensemble, determine the approximate distance spectrum and apply the bound indicated in (22). Because we are interested in a range of SNR, we choose to apply (22) at several values of SNR for each code. Utilizing this approach,

we may easily determine a set of codes which obtain the best error performance over a range of SNR. For the code search reported here, we consider spectral lines.
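The search loop described above can be sketched as follows. Here `spectrum` and `bound` are hypothetical stand-ins for the bidirectional spectrum algorithm of [3] and the bound of (22), and the worst-case-over-SNR selection rule is one plausible reading of "best error performance over a range of SNR", not necessarily the paper's exact criterion.

```python
def search_codes(candidate_encoders, snr_points, spectrum, bound):
    """Exhaustive-search sketch over an ensemble of candidate encoders.
    For each encoder, compute its (approximate) spectral lines and
    evaluate the union bound at several SNR values; keep the encoder
    whose worst-case bound over the SNR range is smallest.
    candidate_encoders -- iterable of encoder descriptions
    snr_points         -- SNR values at which to evaluate the bound
    spectrum(enc)      -- returns the encoder's approximate spectral lines
    bound(lines, snr)  -- evaluates the union bound of (22)
    """
    best, best_score = None, float("inf")
    for enc in candidate_encoders:
        lines = spectrum(enc)
        score = max(bound(lines, snr) for snr in snr_points)
        if score < best_score:
            best, best_score = enc, score
    return best, best_score
```

With toy stand-ins for `spectrum` and `bound`, the function simply returns the candidate minimizing the worst-case score.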

C. Estimating

The bound in (22) requires that we have an estimate of . There are several ways we may approach this problem. After considering several methods, we have decided on the following scheme. First, simulate an Ungerboeck code with delays, and determine over a range of SNR for that particular code. We then assume the values for are the same for all trellis codes with delays. These values for will tend to be underestimated for codes which are less powerful than the Ungerboeck code for this system. The opposite is true for codes which are superior to the Ungerboeck code. This assumption, however, will have little or no effect on the optimal codes chosen when the code search is implemented. For instance, for the set of optimal codes found at a particular code complexity, it can be shown by simulation that the values of for the set of optimal codes and the Ungerboeck code are almost identical. Furthermore, the bound in (22) is robust to small changes in the value of over the range of SNR we consider. Thus we believe our approach to estimating to be quite practical.

D. Code Search—Results

Table III summarizes the source channel optimized (SCO) codes which result from the code search described in Section V-B. The results reported here assume a BSMS with . Codes which minimize are found for , over a range of SNR. Specifically, the operating point at which a particular code was found to be optimum is labeled SNR . The parity-check polynomials for each code are displayed in octal notation. The reader unfamiliar with this convention is referred to [1]. Included in this table are the values of for the SCO codes. It is important to note that for , the SCO codes generally do not realize maximum . The codes reported in [1] realize of and for and , respectively. This confirms our claim that the optimization of trellis codes for this system is not necessarily accomplished by maximizing .

It is also important to note that our code search was implemented for the mapping of Fig. 3(b), as well as the natural binary mapping proposed in [1]. The codes optimized for the natural mapping realized only marginal coding gains over standard Ungerboeck codes. The mapping utilized in Fig. 3(b) better supports MAP detection. First, notice that the error three-tuple associated with distance has a bit change. Furthermore, note that the high-probability 8-PSK points of Fig. 3(b) tend to have neighboring points of lower probability (true in six out of eight cases). As a result, codeword pairs with small squared Euclidean distance tend to have disparate probabilities. Such a property allows for the design of trellis codes where high-probability sequence pairs have good Euclidean distance. Neither of the two aforementioned properties is true of the natural mapping.
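The mapping argument above rests on the squared Euclidean distances between 8-PSK constellation points. For unit-energy 8-PSK these take the standard values d²(k) = 2 − 2·cos(2πk/8) for points k steps apart; a minimal sketch follows (the specific labeling of Fig. 3(b) is not reproduced in this extraction, so only the geometry is shown).

```python
import math

def psk8_sq_dist(i, j):
    """Squared Euclidean distance between unit-energy 8-PSK points
    with indices i and j (points spaced 45 degrees apart):
        d^2 = 2 - 2*cos(2*pi*k/8),  k = (i - j) mod 8.
    The four distinct values are approximately 0.586, 2.0, 3.414, 4.0."""
    k = (i - j) % 8
    return 2.0 - 2.0 * math.cos(2.0 * math.pi * k / 8.0)
```

Adjacent points are separated by only d² ≈ 0.586, which is why pairing a high-probability point with low-probability neighbors (as Fig. 3(b) does in six of eight cases) keeps high-probability sequence pairs at good Euclidean distance.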


TABLE IV
CODING GAINS (IN DECIBELS) FOR MAP DETECTION OF THE OPTIMAL CODES OF TABLE III OVER MAP DETECTION OF UNGERBOECK CODES OF EQUIVALENT COMPLEXITY, FOR A RANGE OF ERROR EVENT RATES Pe. THE SNR (IN dB) AT WHICH THE CODE ACHIEVES A PARTICULAR Pe IS GIVEN IN PARENTHESES

TABLE V
CODING GAINS (IN DECIBELS) OF THE CODE SCO7 OVER THE 32-STATE UNGERBOECK CODE, FOR VARIOUS DEGREES OF CODE OPTIMIZATION AND DECODER MISMATCH. MAP DECODING IS ASSUMED; p IS THE TRUE SOURCE TRANSITION PROBABILITY; p̂ IS THE TRANSITION PROBABILITY ASSUMED BY THE MAP DECODER. THE SNR (IN DECIBELS) AT WHICH THE CODE ACHIEVES A PARTICULAR Pb IS GIVEN IN PARENTHESES

In Table IV, we present the coding gains achieved by the trellis codes optimized for this system over standard Ungerboeck codes of equivalent complexity. MAP detection is used in both cases. For each code, the coding gains are listed over a range of . In general, the SCO codes achieve approximately 1-dB gains at low SNR. As the SNR increases, the coding gains slowly diminish but remain significant. For the 64-state codes, the coding gain at is omitted, since over the range of SNR considered (SNR 0 dB). All coding gains cited are determined by computer simulation. Also given in this table are the SNR's at which each code achieves the given error event rates.

E. Source/Code and Source/Decoder Mismatch

The system we propose is susceptible to two main types of communication system mismatch. Here, we consider each type of mismatch and report simulation results for the code SCO7. In [11], we report our findings for a more exhaustive set of simulations. However, the results for the SCO7 code are representative of these findings. In all cases, coding gains are determined in terms of the bit-error rate.

The first phenomenon of concern is source/code mismatch. Here, the question is whether a code optimized for maintains good coding gains relative to an Ungerboeck code when the source is highly redundant ( or ),

but . Simulation results which are representative of our findings are presented in the top half of Table V. Observe that the coding gain increases as the source redundancy increases. Furthermore, even though SCO7 was optimized for , it still performs well relative to the Ungerboeck codes for

or . This implies that SCO7 is robust to source mismatch. Of course, as approaches 0.5, SCO7 is slightly inferior to the Ungerboeck code due to its smaller value of .

A second type of mismatch occurs when the true source distribution is , the code is optimized for , but the MAP decoder assumes a source distribution . We call this source/decoder mismatch. At the bottom of Table V, we present results which address this problem. Notice that the gains over the Ungerboeck code are maintained as long as

. Hence this type of mismatch affects both the Ungerboeck code and the SCO code in a similar manner. In fact, the absolute performance of each code can be seen to decrease only marginally. As expected, as approaches , the Ungerboeck code is slightly more powerful.

F. Performance/Complexity Tradeoffs

To study the performance/complexity characteristics of the proposed BSMS/AWGN/MAP system, we considered an alternative approach to coding the BSMS. The alternative system


Fig. 4. Bit-error rate performance of the 16-state Ungerboeck code for both a binary symmetric i.i.d. source and a BSMS, assuming ML or MAP detection. For comparison, Pb is plotted for the code SCO4 (MAP) and the 16-state tandem system.

is a tandem coding scheme which consists of a fourth-order Huffman encoder [12], followed by a rate- eight-state maximum convolutional encoder [13], an interleaver of length , and an eight-state Ungerboeck trellis encoder [1]. At the receiver, the reverse operations are performed. The interleaver serves to disperse error bursts which remain at the output of the TCM decoder. Furthermore, to avoid long error bursts resulting from channel errors and Huffman decoding, the system is reset every 1000 bits. Such a system is consistent with the more traditional approach of independent source and channel coding. Our main comparison is between the tandem scheme and 16-state trellis-coded systems with MAP detection, as the overall complexity of these two approaches is approximately equal.
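The kind of block interleaver the tandem system uses to disperse the bursty errors left by the TCM decoder can be sketched as follows; the write-row-wise/read-column-wise structure is the textbook form, and the dimensions and names here are illustrative, not taken from the paper.

```python
def block_interleave(bits, rows, cols):
    """Block interleaver: write the sequence into a rows x cols array
    row by row, then read it out column by column.  Adjacent input
    symbols end up `rows` positions apart, so a channel error burst is
    spread across many codewords after deinterleaving."""
    assert len(bits) == rows * cols, "input must fill the array exactly"
    return [bits[r * cols + c] for c in range(cols) for r in range(rows)]
```

Deinterleaving is the same operation with the dimensions swapped, so applying `block_interleave(x, rows, cols)` followed by `block_interleave(y, cols, rows)` recovers the original sequence.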

The BSMS has an entropy rate of bits [12]. Hence the transmission rate for the MAP system is

b/s/Hz. For the tandem system, the Huffman encoder produces codewords with an average codeword length of 2.4885 bits. After convolutional and trellis coding, the final transmission rate is b/s/Hz. Fig. 4 displays the bit-error rate (BER) performance of the tandem system, and of the 16-state Ungerboeck and SCO4 codes with MAP detection. Also plotted in this figure is the BER performance of ML decoding, assuming both binary symmetric i.i.d. and binary symmetric Markov sources, with 16-state Ungerboeck trellis coding. In this figure, we plot as a function of SNR SNR . SNR is used to compare two or more systems operating at different rates. Its rationale can be found in [14]. Relative to the tandem system, the SCO MAP system realizes coding gains of 3.37 dB and 2.77 dB at and , respectively. In fact, the SCO MAP system yields the best possible performance for the BSMS among the schemes considered, for all SNR 10.8 dB (at which point the tandem scheme achieves reliable communication). Though it is difficult to see from this figure, at SNR 10.8 dB, the tandem system is actually the best choice. However, for low SNR , the proposed system is clearly superior. This demonstrates the benefits of joint source channel coding of the BSMS at low SNR. The normalized performance of the ML decoder for the i.i.d. source is actually

superior to that of all systems operating on the BSMS. This implies it is more difficult to design systems which minimize when the source is a BSMS.
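The rate comparison above rests on two standard quantities: the entropy rate of a BSMS, H(p) = −p·log₂(p) − (1−p)·log₂(1−p) bits per source bit, and the normalized SNR of [14], SNRnorm = SNR / (2^(2R) − 1) for a transmission rate of R b/s/Hz. A minimal sketch of both follows; the specific transition probability p used in the paper is elided in this extraction, so any value passed in below is illustrative only.

```python
import math

def bsms_entropy_rate(p):
    """Entropy rate (bits per source bit) of a binary symmetric Markov
    source with transition probability p: the binary entropy of p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def snr_norm_db(snr_db, rate):
    """Normalized SNR in dB (cf. [14]): SNRnorm = SNR / (2**(2*R) - 1),
    where `rate` is the transmission rate R in b/s/Hz.  This lets systems
    operating at different rates be compared on one axis, as in Fig. 4."""
    return snr_db - 10.0 * math.log10(2.0 ** (2.0 * rate) - 1.0)
```

For example, a memoryless equiprobable source gives `bsms_entropy_rate(0.5) == 1.0`, and any redundancy (p away from 0.5) lowers the entropy rate, which is what the MAP decoder exploits.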

VI. CONCLUSION

We have considered the application of TCM as a coding scheme for the BSMS/AWGN/MAP system. MAP detection was shown to provide 0.80–2.13-dB coding gains over that obtainable by the ML decoder for standard Ungerboeck codes, assuming a BSMS with . We then developed a method by which the performance of TCM could be predicted for this system. An expression for the pairwise error probability was first developed. This expression is not dependent on the source model and could be utilized for an arbitrary source, assuming the application of trellis coding on the AWGN channel with MAP detection. An approximation for the Euclidean distance spectrum is obtained by considering the distance polynomials for error sequences. The approximation is expected to be excellent assuming a systematic trellis encoder as described in [1]. Finally, a bound was developed for the BSMS/AWGN/MAP system, where we utilize TCM as the coding scheme. The bound is written in such a way that we only need to quantify distance parameters for paths relative to the all information sequence. This greatly simplifies the complexity associated with the union bound and makes such a bound practical for use in a code search. Such a search is implemented, and the codes which are found realize gains of up to 1.09 dB over Ungerboeck codes of equivalent complexity.

Future work in this area could take several directions. Certainly, the performance and optimization of codes for other modulations and data rates are of interest. An expression for the pairwise error event probability could be developed assuming channel models other than the AWGN channel, and a similar code search could be implemented. The optimization of multidimensional trellis codes for this system would also be of interest. Such constructions are characterized by long error pairs. This should allow for the design of trellis codes with error pairs that realize large values of . This would facilitate


good error performance assuming a MAP decoder. Finally, it is hoped that this work provides the motivation to consider the application of trellis coding with MAP detection to more practical source models. In the past, such a problem seemed impractical due to the overwhelming complexity associated with quantifying all codeword pairs in the trellis. The ideas developed in this paper could provide the basic framework by which systematic trellis codes are optimized for other source models.

APPENDIX

We now provide a proof of Theorem 1. Re-index the bits in and , and remove the redundant bits

in to get

where . For , define and

. It can easily be shown that

(23)

and . Furthermore, it can be shown that is an i.i.d. binary sequence with .

Observe that takes values in the set . Let and consider the conditional probability

(24)

The first equality is the chain rule; the last two equalities are due to (23) and the fact that is i.i.d. From (23), note that

if
if
if
if

(25)

Thus if for each , then

(26)

where is the number of times that and is the number of times that . Otherwise (if for some ), . Since , must take values in the set , which proves (14). Finally, note that

(27)

If and , then there must be exactly ones in . Furthermore, there are ways to

distribute the ones among the nonzero locations in . Thus

(28)

where , which proves (15) (where is equal to ).

An alternative proof of this Theorem is provided in [11].

ACKNOWLEDGMENT

The authors thank the anonymous reviewers for their comments and suggestions, which greatly enhanced the paper. Specifically, they wish to thank the reviewer who suggested a simplified approach to proving Theorem 1.

REFERENCES

[1] G. Ungerboeck, “Channel coding with multilevel/phase signals,” IEEE Trans. Inform. Theory, vol. IT-28, pp. 55–67, Jan. 1982.

[2] G. D. Forney, “The Viterbi algorithm,” Proc. IEEE, vol. 61, pp. 268–278, Mar. 1973.

[3] M. Rouanne and D. J. Costello, Jr., “An algorithm for computing the distance spectrum of trellis codes,” IEEE J. Select. Areas Commun., vol. 7, pp. 929–940, Aug. 1989.

[4] National Communications System (NCS), “Details to assist in implementation of federal standard 1016 CELP,” Tech. Inform. Bull. 92-1, Office of the Manager, NCS, Arlington, VA 22204-2198.

[5] K. Sayood and J. C. Borkenhagen, “Use of residual redundancy in the design of joint source-channel coders,” IEEE Trans. Commun., vol. 39, pp. 838–846, June 1991.

[6] W. Xu, J. Hagenauer, and J. Hollmann, “Joint source-channel decoding using the residual redundancy in compressed images,” in Proc. ICC, 1996, pp. 142–148.

[7] R. Wang, E. A. Riskin, and R. Ladner, “Codebook organization to enhance maximum a posteriori detection of progressive transmission of vector quantized images over noisy channels,” IEEE Trans. Image Processing, vol. 5, pp. 37–48, Jan. 1996.

[8] N. Phamdo and N. Farvardin, “Optimal detection of discrete Markov sources over discrete memoryless channels—Applications to combined source channel coding,” IEEE Trans. Inform. Theory, vol. 40, pp. 186–193, Jan. 1994.

[9] F. Alajaji, N. Phamdo, and T. Fuja, “Channel codes that exploit the residual redundancy in CELP-encoded speech,” IEEE Trans. Speech Audio Processing, vol. 4, pp. 325–336, Sept. 1996.

[10] S. A. Al-Semari, F. Alajaji, and T. Fuja, “Sequence MAP decoding of trellis codes for Gaussian and Rayleigh channels,” IEEE Trans. Veh. Technol., to be published.

[11] J. M. Kroll, “Source channel optimization of trellis codes for binary Markov sources with MAP detection,” Ph.D. dissertation, Dep. Elec. Eng., State Univ. New York, Stony Brook, Dec. 1997.

[12] T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: Wiley, 1991.

[13] S. Lin and D. J. Costello, Jr., Error Control Coding. Englewood Cliffs, NJ: Prentice-Hall, 1983.

[14] M. V. Eyuboglu and G. D. Forney, “Trellis precoding: Combined coding, precoding and shaping for intersymbol interference channels,” IEEE Trans. Inform. Theory, vol. 38, pp. 301–314, Mar. 1992.

