IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 35, NO. 2, MARCH 1989

Maximum Likelihood Sequence Estimators: A Geometric View

LINEU C. BARBOSA

Abstract — Communication issues are described via macro operations between the data and the observation spaces. The problem of recovering the data is related to the inversion of an operator (the channel mapping); for this reason results available in linear algebra and functional analysis are applicable. Traditional concepts in communications are identified with these operations. A new approach to the maximum likelihood sequence estimator based on a sufficient statistic derived from these concepts is proposed. With this approach, intersymbol interference is removed by linear equalization, and a Viterbi-like dynamic programming algorithm takes into account the correlated noise in the metric evaluation. The performance of suboptimal receivers obtained by means of metric simplification is analyzed.

I. INTRODUCTION

WE present a geometric interpretation of the concepts related to maximum likelihood sequence estimators (MLSE). The treatment and notation used here are not standard. In general, these problems are treated in the literature with an emphasis on the computational algorithms; hence the mapping concepts are lost. Here, the focus will be on the macro operations involved, without regard to computational efficiency.

The matched filter is introduced here in an unusual but natural way. It is well-known that the sampled output of such a filter forms a sufficient statistic for the input sequence; this fact is reviewed here to familiarize the reader with the approach. It is shown that 1) the sampled output of the so-called zero forcing equalizer (ZFE) also forms a sufficient statistic for the data sequence, and 2) the ZFE coincides with the unconstrained maximum likelihood sequence estimator of the data. These last two facts are not stressed in the literature. The sufficient statistic property of the ZFE samples suggests that the constrained MLSE can be derived from them. In fact, the constrained estimate is the point in the constraint set closest to the unconstrained estimate if the proper metric is introduced in the sequence space. The traditional approach is also interpreted geometrically. This shows the mappings involved in the whitened matched filter and how they relate to the presently proposed approach.

The interpretation of the proper metric for the MLSE in the sequence space S involves a combination of the channel response and the noise covariance operator. This interpretation sheds light on attempts to simplify the MLSE metric. In particular, it is easily shown that partial removal of intersymbol interference (ISI) by linear equalization does not lead to metric simplification unless the effect of the equalization on the noise is completely ignored. In fact, the approach developed here uses linear equalization to remove all the intersymbol interference.

Estimates of the performance of the MLSE are presented. These estimates take into account the fact that the MLSE true metric is replaced by an approximate metric to simplify the receiver complexity. It is shown that, under the simplified metric, the concept of minimum distance loses its meaning as a criterion for performance.

For the case of stationary processes, the operators involved in the discussions below are of the Toeplitz type. It is well-known that the Toeplitz operators and the z-expansions (or polynomials) of sampled data systems are two equivalent representations of the underlying algebra. Some of the statements made in this paper can readily be verified by substituting the z-expansions corresponding to the operators. The author finds that the operator representation provides an easier, and potentially more powerful, tool for the derivations, besides providing more insight into the geometric understanding of the matter. This approach permits a unified treatment of several important cases: it includes single user as well as cross-coupled multiuser and crosstalk in communication and adjacent track interference in digital recording, for both stationary and nonstationary cases. The unified solution of equalizers for decision feedback, partial response, and canceling is an example of the flexibility of the approach and was demonstrated in [1].

We assume here that the input sequences are finite. To avoid some well-known theoretical technicalities, the observation space is taken to be a finite dimensional space as well. One may think, for example, that the observations are a finite number of samples of the actual continuous-time observations, taken at a very fast rate so that a negligible amount of energy is left outside the corresponding Nyquist band. Alternatively, one may expand the continuous-time observation space in terms of some basis so that the neglected tails contain negligible energies. These are standard procedures and the reader is referred to, e.g., [2], for a proper treatment of the subject. The results obtained here, however, can be extended to more general

Manuscript received November 3, 1986; revised February 3, 1988. The author is with the IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099. IEEE Log Number 8926625.

0018-9448/89/0300-0419$01.00 © 1989 IEEE


spaces by a limiting process. We give some of the examples in terms of continuous-time observation to relate them to well-known results.

Section II describes the basic spaces, norms, and maps used in the paper. Section III deals with the maximum likelihood sequence estimators, giving a geometric view of both the unconstrained and the constrained cases. Section IV describes how to obtain the solution to the constrained case from the solution of the unconstrained one. Section V shows a dynamic programming computation for the proper metric in the sequence space, very similar to the Viterbi algorithm. The traditional whitened matched filter approach to the metric computation problem is also described in this section and related to the present approach. Section VI describes the estimation of performances. In this section the performance is estimated for a suboptimal detector which uses an approximation to the real metric, and some examples are given. Section VII contains the main conclusions of this work.

II. DEFINITION OF BASIC MAPS

A. The Channel

The input sequences are points of a Euclidean space S. Furthermore, these sequences are constrained to belong to a constraint set C in S. The channel is modeled as a linear map H from S to the space of the observations Y. The final output y is corrupted with additive noise, according to the equation

y = Hb + n

where b ∈ C ⊂ S is the input sequence and y and n are points in Y.

The above model, illustrated in Fig. 1, is quite general and covers a variety of digital systems with constrained inputs. The input constraints can be the result, for example, of restricting the components of the sequence to values from a finite set of numbers, such as the set {0, 1} or {−1, +1}. Further, some code constraints can be imposed on the sequence (such as that the number of consecutive zeros be limited, or that two consecutive nonzero components have opposite signs, etc.).

Fig. 1. Map involved in the definition of the basic system.

B. The Noise and the Norm in the Observation Space

We assume the additive noise to be normally distributed, i.e., any finite collection of arbitrary samples has a joint normal probability density function. Furthermore, a nonsingular noise process is also assumed, implying that the covariance matrix of the above samples is of full rank and hence invertible.

At this point, we assume that the observations are actually a finite set of samples from the output of the channel. Therefore, they are points of a Euclidean space Y of possibly much higher dimension than that of the input sequences S. This is done to avoid well-known technicalities, as indicated in the Introduction, but the results obtained here can be extended by a limiting process to the continuous-time observation case.

Let the covariance matrix of the observation noise be R. Then the joint probability density of the noise samples is

p(n) = K exp[−(1/2)(n, R^{-1}n)].

In the above equation, as well as in the rest of this paper, (·,·) indicates the regular Euclidean inner products for the appropriate spaces. The quadratic form in the exponent of the above density has the properties of a norm. We introduce here this quadratic as the norm for Y according to the following definitions. For y and z in Y,

(y, z) = (y, R^{-1}z),    ||y||² = (y, y).

The noise probability density function is then

p(n) = K exp(−(1/2)||n||²).    (1)

C. The Adjoint Operator

The adjoint of a linear map H from S to Y is a linear map H* from Y to S defined by the equation

(Hb, z) = (b, H*z),    for all b ∈ S, z ∈ Y.

Note that the first inner product is taken in Y whereas the second is in S. An important property of the adjoint operator is the following. Let R_H be the range of H, i.e., the image of S under H:

R_H = {z ∈ Y; for some a in S, z = Ha}.

Then

z ⊥ R_H  ⟺  {(z, Hu) = 0 for all u in S}

by definition. However, the expression in brackets is equivalent to

{(H*z, u) = 0 for all u in S}

and therefore H*z = 0, that is, z ∈ N_{H*}, the null space of H*.

D. An Example

Take a pulse-amplitude modulation (PAM) channel as an example; let b = (··· b_k ···) be a sequence in S, and let h(t) ∈ Y be the channel's response to a sequence in which b_0 = 1 and b_k = 0 for k ≠ 0. For a continuous-time
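The weighted inner product and the adjoint defined above can be checked in a small finite-dimensional instance. In the sketch below the dimensions, the map H, and the covariance R are invented for illustration; with the Euclidean product on S and the R^{-1}-weighted product on Y, the adjoint works out to H* = Hᵀ R^{-1} (a matched filter preceded by a whitener), and the range-orthogonality property z ⊥ R_H ⟹ H*z = 0 holds.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: S = R^4 (sequences), Y = R^8 (observations).
H = np.asarray(rng.standard_normal((8, 4)))
B = rng.standard_normal((8, 8))
R = B @ B.T + 8 * np.eye(8)        # symmetric positive definite, hence invertible
Rinv = np.linalg.inv(R)

def inner_Y(y, z):
    """Inner product on Y induced by the noise: (y, z) = y^T R^{-1} z."""
    return y @ Rinv @ z

# Adjoint with respect to these products: H* = H^T R^{-1}.
Hstar = H.T @ Rinv

b = rng.standard_normal(4)
z = rng.standard_normal(8)
# Defining identity (Hb, z)_Y == (b, H*z)_S.
assert np.isclose(inner_Y(H @ b, z), b @ (Hstar @ z))

# Orthogonality property: project z onto the complement of R_H (orthogonal
# in the (., .)_Y sense) and check that H* annihilates the result.
P = H @ np.linalg.inv(Hstar @ H) @ Hstar    # projection onto R_H
z_perp = z - P @ z
assert np.allclose(Hstar @ z_perp, np.zeros(4))
```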


function, this means that h(t) must belong to the reproducing kernel Hilbert space induced by the noise process (see [2]). For the white noise case, h(t) has to be square integrable. Define

(Hb)(t) = Σ_k b_k h(t − kΔ)

and let the noise be white, i.e., the inner product in Y is the regular dot product, here written as the integral of the product of the vectors (for the sake of comparison, we are momentarily representing the observation space as a continuous-time function space). Then, for z ∈ Y,

(Hb, z) = ∫ (Hb)(t) z(t) dt = Σ_k b_k ∫ h(t − kΔ) z(t) dt = Σ_k b_k w_k = (b, w) = (b, H*z).

Hence,

w_k = (H*z)_k = ∫ h(t − kΔ) z(t) dt,

which are the sampled values of the output of the filter matched to h(t), sampled at the points kΔ. In a similar way it can be shown that in the case of colored noise, the adjoint operator coincides with the sampled output of the classical matched filter preceded by a whitening filter. Note, however, that h(t) is not necessarily the channel's impulse response, but the channel's response to the signal which is being modulated by the b_k.
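As a numerical sanity check on this matched-filter interpretation, the sketch below discretizes time (pulse shape, rates, and sizes are invented) and confirms that applying Hᵀ, the adjoint in the white noise case, to an observation gives the same numbers as correlating the observation with the pulse h(t) and sampling the correlator output at t = kΔ.

```python
import numpy as np

dt = 0.01
t = np.arange(0.0, 6.0, dt)                  # discretized time axis
Delta = 1.0                                  # symbol period
shift = int(round(Delta / dt))               # Delta in samples
h = np.exp(-((t - 1.5) ** 2) / 0.1)          # an invented pulse h(t)

K = 4                                        # number of symbols
H = np.zeros((t.size, K))                    # columns are h(t - k*Delta)
for k in range(K):
    H[k * shift:, k] = h[:t.size - k * shift]

rng = np.random.default_rng(1)
z = rng.standard_normal(t.size)              # an arbitrary observation z(t)

# Adjoint under the (Riemann-sum) inner product: (H*z)_k = ∫ h(t - kΔ) z(t) dt.
adjoint = H.T @ z * dt

# Same numbers via a matched filter: correlate z with h, sample at t = kΔ.
corr = np.correlate(z, h, mode="full") * dt
matched = corr[np.arange(K) * shift + t.size - 1]
assert np.allclose(adjoint, matched)
```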

III. MAXIMUM LIKELIHOOD SEQUENCE ESTIMATORS

The maximum likelihood sequence estimator (MLSE) is a procedure for estimating an entire input sequence b ∈ S from the observation y ∈ Y. It maximizes over b ∈ C, where C is a constraint set in S, the probability density of y conditioned on b (i.e., the likelihood of y). After observing y, the MLSE of b is the element b̂ ∈ C that maximizes this likelihood.

A. Linear Channel with Additive Gaussian Noise

As in Section II-A, let us assume the linear channel

y = Hb + n

where b ∈ C ⊂ S is the input random sequence, y ∈ Y is the observation, and n is the Gaussian noise, not necessarily white, having the probability density given in (1). Assuming that b and n are stochastically independent, the conditional density of y given b can then be written as

p(y/b) = K exp(−(1/2)||y − Hb||²).

Hence the MLSE b̂ of b satisfies

min_{b ∈ C} ||y − Hb||² = ||y − Hb̂||².    (2)

Case 1: C = S, i.e., the data sequence is unconstrained in S.

Equation (2) shows that the unconstrained MLSE (UMLSE) b̂_u is the point in S that is taken by the map H to the point in Y closest to the observation y (see Fig. 2). Hence the vector y − Hb̂_u must be orthogonal to the range of H, i.e.,

H*(y − Hb̂_u) = 0.

Fig. 2. UMLSE mapping.

Therefore,

H*y − H*Hb̂_u = 0  ⟹  b̂_u = (H*H)^{-1}H*y.

The invertibility of H*H is considered in Appendix II. Fig. 2 shows the geometric relationships involved in the case of the UMLSE. The map that brings y into b̂_u is an instance of the Moore-Penrose pseudoinverse of H.

Observations:

1) In the absence of noise, that is, for y = Hb, the estimate of b is

b̂_u = (H*H)^{-1}H*Hb = b.

For the PAM case (see Section II-D), if the input sequence e_0 consists of only one nonzero element, at position k = 0, the output of the channel is the channel function h(t), which may extend through several clock intervals, creating intersymbol interference. The output of the above estimator being equal to e_0 indicates that the estimator is equivalent to a filter that eliminates such ISI, followed by a sampler. In fact, it can be shown [1] that the elimination of the ISI is done in an optimum way (minimum noise enhancement). Such a filter is known in the communication literature as the zero forcing equalizer. When the ZFE is used as a detection method, a thresholding operation follows the linear equalization sampler. This forces a decision satisfying the hypercube constraint C. Geometrically, the thresholding operation is equivalent to choosing the element b̂_0 of the hypercube C closest in the (Euclidean) norm of S to b̂_u, as indicated in Figs. 3 and 4.

2) As a consequence of the factorization theorem (see the Corollary of Appendix I), the unconstrained estimate b̂_u is a sufficient statistic for the input sequence.
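Observation 1) can be illustrated with a small convolutional channel (the taps and sizes below are invented; the noise is white, so H* reduces to Hᵀ): the pseudoinverse, i.e., the sampled ZFE output, recovers the data exactly in the noiseless case and removes all ISI.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy ISI channel acting by convolution: a tall banded matrix H from S=R^K to Y.
K = 6
taps = np.array([1.0, 0.9, 0.2])             # illustrative channel taps
H = np.zeros((K + taps.size - 1, K))
for k in range(K):
    H[k:k + taps.size, k] = taps

b = rng.choice([-1.0, 1.0], size=K)          # data sequence

# Noiseless observation: b_u = (H*H)^{-1} H*(Hb) = b, i.e., all ISI is removed.
y = H @ b
b_u = np.linalg.solve(H.T @ H, H.T @ y)
assert np.allclose(b_u, b)

# With noise, b_u = b + (H*H)^{-1} H* n: the data plus a correlated perturbation.
n = 0.1 * rng.standard_normal(y.size)
b_u_noisy = np.linalg.solve(H.T @ H, H.T @ (y + n))
print(np.round(b_u_noisy - b, 3))
```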

    Ca s e 2: The input sequence is constrained to belong toa set C in S. This is the case, for example, when the data

elements are binary: in such a case, the set C is a hypercube in S.

The constrained MLSE (CMLSE) b̂_c is the solution given by, e.g., the Viterbi algorithm. b̂_c is the element in C such that Hb̂_c is the point in the image of C (HC) closest to the observation y or, equivalently, closest to the projection of y into the range of H. Fig. 3 shows the geometric interpretation of the above. Notice that the range of H (hence the whole Y) can be subdivided into decision areas. A full description of the operations involved in arriving at the CMLSE b̂_c is given in Sections IV and V.

Fig. 3. Mappings involved in UMLSE and CMLSE.

Fig. 4. Balls for the MLSE. ZFE: soccer ball; CMLSE and PR: footballs.

IV. GOING FROM UMLSE TO CMLSE

Now that we understand the geometry and mappings involved in both the UMLSE and the CMLSE, the relation between the two is simple: b̂_u is mapped by H into the point in the range of H closest to the observation y, i.e., into the projection of y on R_H, the range of H. The CMLSE subdivides R_H into decision regions and picks the point in C which is mapped into the center of the region containing the projection of y. The decision regions are based on the metric of Y, i.e., its unit ball. It is a simple matter of redefining the metric of S to make the point corresponding to b̂_c be the point of C closest to b̂_u. This is done by finding the shape of the new "football" in S that is mapped into the unit ball in Y, as follows:

y ∈ R_H and (y, y) = 1  ⟺  there is a ∈ S with Ha = y

and

(Ha, Ha) = (a, H*Ha) = 1.

Hence the CMLSE can be found as follows:

a) get the samples of the output of the ZFE, b̂_u;
b) pick b̂_c in the cube C closest to b̂_u in the metric induced by (·, H*H ·).

An algorithm which may be used to compute b) recursively as more data are collected is presented in the next section.

Observation: The proper norm in the sequence space S could have been anticipated: since we want the metric in S to behave like the one in Y, the inner product in S could have been defined as, for a, b in S, ((a, b)) = (Ha, Hb), where ((·,·)) is the new inner product. If we select this approach, then the adjoint operator of H, call it H̃, is the ZFE rather than the matched filter. Its statistical sufficiency would then follow directly from the factorization theorem. In fact, any map from Y to S preserving the statistical sufficiency could be called a matched filter [3].

At this point an interpretation of some partial realizations of the CMLSE is useful. In general (white noise case), the complexity of the metric calculation grows exponentially with the amount of ISI (see Section V). To limit the span of interference, and hence reduce the complexity of the algorithm that computes the CMLSE, sometimes an equalization of the channel is performed. Fig. 5 shows the case of preceding the CMLSE by one of the so-called partial response (PR) equalizers. The usual rationale is that part of the interference is removed by linear equalization and the remainder by the MLSE algorithm. However, it is not difficult to see that any information preserving linear transformation cannot reduce the complexity of the MLSE algorithm. In the case of using a PR equalization as above, the simplification is accomplished by ignoring the effect of the equalization on the noise. The solution to the MLSE proposed here reduces every channel to a noninterfering channel and then uses a metric that reflects the noise correlation (see also Section VI). Since any two solutions of the MLSE problem must be the same, the selection of PR equalizers (or any other filter) to precede the traditional MLSE algorithm would be equivalent to the use of a different (wrong) metric for picking the point of C closest to b̂_u in the S space. Equalizers change the noise spectrum; if the change in the noise spectrum were taken into account by modifying the metric of Y appropriately, the same complexity would result, and no simplifications would be achieved. Simplifications in the metric must then be done taking into account their full effect. The amount of information thrown away must be carefully estimated. Performance calculations taking these facts into account are described in Section VI.

Fig. 4 shows examples of common simplifications. The simplest metric (but lowest performer) is, of course, the regular Euclidean metric in S, which corresponds to the ZFE followed by a slicer. This is equivalent to assuming

Fig. 5. Partial response equalization followed by its simplified CMLSE is equivalent to the ZFE and use of the PR metric in S.

that the noise is white after the ZFE equalization. Similarly, equalizing the channel for PR and using the corresponding reduced state space is equivalent to the assumption that the noise is white after the PR equalizer.

V. COMPUTING THE METRIC FOR THE CMLSE

A. The New Metric in S

A recursive implementation of the metric induced by the map H in S can be found by application of the dynamic programming technique. For concreteness, let us assume that H*H is a matrix whose elements satisfy (two-symbol interference length in the white noise case)

m_{i,j} = m_{|i−j|},    m_{i,j} = 0 for |i − j| > 2.

Let x = (··· x_{n−2}, x_{n−1}, x_n) be the n-truncation of a sequence in S. For the present application, identify x with the n-truncation of c − b̂_u, where c ∈ C is a sequence of binary (for example) elements in S. Define

L(x_n, x_{n−1}) = min over x_l, l ≤ n−2, of (x, H*Hx)
= min over x_l, l ≤ n−2, of { m_0 Σ_{l≤n} x_l² + 2m_1 Σ_{l≤n} x_l x_{l−1} + 2m_2 Σ_{l≤n} x_l x_{l−2} }
= m_0 x_n² + 2m_1 x_n x_{n−1} + min over x_{n−2} of { L(x_{n−1}, x_{n−2}) + 2m_2 x_n x_{n−2} }.

The minimizations are over all indicated x_l compatible with the constraint set C. The above minimization can be performed by the trellis of Fig. 6. In the above example, the value of L depends on x_n = c_n − b̂_n and x_{n−1} = c_{n−1} − b̂_{n−1}, where b̂_n and b̂_{n−1} are the values of the output of the ZFE, and c_n and c_{n−1} are the bits represented by the states of the trellis.

Fig. 6. Trellis for computation of the metric (·, H*H ·).
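The recursion above can be sketched as follows. The kernel taps m_0, m_1, m_2 and the stand-in ZFE output are invented for illustration; the dynamic program over the trellis states (c_{n−1}, c_n) is checked against exhaustive minimization over all binary sequences.

```python
import itertools
import numpy as np

# Banded Toeplitz kernel m0, m1, m2 (two-symbol ISI, white noise case).
m0, m1, m2 = 1.85, 1.08, 0.2
K = 8
rng = np.random.default_rng(3)
# Stand-in for the ZFE output b_u: data plus a small perturbation.
b_u = rng.choice([-1.0, 1.0], size=K) + 0.3 * rng.standard_normal(K)

def quad(x):
    """(x, H*H x) for the banded kernel, written out term by term."""
    return (m0 * np.dot(x, x)
            + 2 * m1 * np.dot(x[1:], x[:-1])
            + 2 * m2 * np.dot(x[2:], x[:-2]))

symbols = (-1.0, 1.0)

# Dynamic programming: L keyed by the trellis state (c_{n-1}, c_n).
L = {}
for c0, c1 in itertools.product(symbols, repeat=2):
    x0, x1 = c0 - b_u[0], c1 - b_u[1]
    L[(c0, c1)] = m0 * (x0 * x0 + x1 * x1) + 2 * m1 * x0 * x1
for n in range(2, K):
    Lnew = {}
    for cprev, cn in itertools.product(symbols, repeat=2):
        xn, xprev = cn - b_u[n], cprev - b_u[n - 1]
        Lnew[(cprev, cn)] = min(
            L[(cpp, cprev)]
            + m0 * xn * xn + 2 * m1 * xn * xprev + 2 * m2 * xn * (cpp - b_u[n - 2])
            for cpp in symbols)
    L = Lnew
dp_min = min(L.values())

# Exhaustive minimization over all 2^K binary sequences agrees with the recursion.
brute = min(quad(np.array(c) - b_u) for c in itertools.product(symbols, repeat=K))
assert np.isclose(dp_min, brute)
```

Tracking the argmin alongside the min (as the Viterbi algorithm does with survivor paths) turns this metric computation into the CMLSE decision itself.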


This procedure is similar to the one used by the Viterbi algorithm. In the above example the two most recent components of the output sequence have to be saved to compute the current contribution to the distances, in contrast with the Viterbi algorithm, which requires the availability of the noise-free observations as a function of the state transitions. A bonus with the present approach is the fact that clocking and automatic gain control information are readily available from the output of the intersymbol interference free ZFE. For other types of code-constrained sequences, the reader is referred to [4].

B. The Whitened Matched Filter

One can modify the computation of the above metric by noting that H*H is a map from S to S and defining a map A from S to S satisfying

A*A = H*H.

Then the problem is once more reduced to finding the point in AC (the image of C under A) which is closest to Ab̂_u, now in the regular soccer ball metric of S, since for a ∈ S,

(Aa, Aa) = (a, A*Aa) = (a, H*Ha).

One such map is the square root of H*H. However, a more common choice is to define A as a lower triangular operator, i.e., the image of a sequence (under A) at position k depends only on the elements of the sequence up to position k. Another way to state this condition is to say that the response of A is causal, independent of future elements in the input sequence. In the stationary, time-invariant case, this triangular factorization corresponds to the well-known Wiener factorization.

Note that A = H is not a good choice, since H maps S into Y, which is a big space (possibly of continuous-time functions), and the metric in Y is not computationally attractive. The use of a sufficient statistic for b permitted the reduction of the observation space from Y back to S.

The price to be paid by introducing the transformation A is that, as mentioned before, the noiseless information contained in AC, the image of the cube under A, has to be stored for comparison with Ab̂_u. In this case, the sampled observations Ab̂_u have to be compared with the elements of AC in the Euclidean metric. Observe that, although the metric is the simplest possible, no reduction of algorithm complexity is accomplished: the number of states required to carry the image of the cube AC is exactly the same as the one due to the span of the ISI (white noise case).

The total mapping

A(H*H)^{-1}H* = A(A*A)^{-1}H* = AA^{-1}A*^{-1}H* = A*^{-1}H*

from Y to S resulting from selecting the lower triangular A can be identified as the so-called causal whitened matched filter [6]. It can be shown that, if (H*H)^{-1} exists, A^{-1} and A*^{-1} also exist [5].
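A minimal sketch of the factorization A*A = H*H, with H invented for illustration. NumPy's Cholesky routine returns a lower-triangular L with H*H = LLᵀ, so A = Lᵀ is one triangular factor satisfying AᵀA = H*H (which of the two factors plays the "causal" role depends on the indexing convention; the time-invariant analogue is the Wiener factorization mentioned above). The block also checks the identity A(H*H)^{-1}H* = A*^{-1}H* defining the whitened matched filter.

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented sizes: H maps S = R^5 into Y = R^10; white noise, so H* = H^T.
H = rng.standard_normal((10, 5))
M = H.T @ H                          # M = H*H, symmetric positive definite

# Triangular factorization: M = L L^T with L lower triangular, so A = L^T
# satisfies A^T A = M.
L = np.linalg.cholesky(M)
A = L.T
assert np.allclose(A.T @ A, M)

# M invertible implies A and A* invertible: the diagonal of L is positive.
assert np.all(np.diag(L) > 0)

# Whitened matched filter: A*^{-1} H* equals A (H*H)^{-1} H* applied to any y.
y = rng.standard_normal(10)
lhs = np.linalg.solve(A.T, H.T @ y)  # A*^{-1} H* y
rhs = A @ np.linalg.solve(M, H.T @ y)  # A (H*H)^{-1} H* y
assert np.allclose(lhs, rhs)
```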

VI. ESTIMATION OF THE PERFORMANCE OF THE CMLSE

A. General Case

In standard MLSE situations with high signal-to-noise ratios, the probability of transmitting b_0 and obtaining b̂ closer to b_1, b_1 ≠ b_0, is a good estimation of the probability of error in case b_0 and b_1 are sequences at minimum distance, the number of which represents a nonnegligible percentage of the totality of possible sequences.

In this section the probability of making a wrong decision as described above is examined under the condition that the metric used in S for making the decision is not the proper metric induced by (·, H*H ·) as indicated in Section IV. This is, for example, the case of suboptimal receivers using a modified metric to simplify the detection algorithm, as mentioned in Section IV. As a special case, the error probability under the proper metric is derived. It will be seen that the concept of minimum distance does not apply in the case of a modified metric, and the selection of the sequences will pose a problem in the performance estimation.

Let M_0 = H*H and let M be the kernel of the quadratic form used in place of M_0 by a suboptimal receiver (see examples in Section VI-B). Assume that b_0 was transmitted, and let b_1 ≠ b_0. Let us calculate the probability of the event that b̂_u is closer to b_1 than to b_0 in the metric M. Let Δ = b̂_u − b_0. Then the above event is equivalent to

(Δ, M(b_1 − b_0)) = (Δ, Me) > 0.5||b_1 − b_0||²_M = 0.5(e, Me)    (3)

where e = b_1 − b_0. The scalar (Δ, Me) is a random variable, the variance of which is (Me, K_Δ Me), where K_Δ is the covariance of Δ. However,

b̂_u = M_0^{-1}H*y = M_0^{-1}H*(Hb_0 + n) = b_0 + M_0^{-1}H*n.

Hence Δ = M_0^{-1}H*n. K_Δ can then be computed as follows. For every a and b in S,

(a, K_Δ b) = E(a, Δ)(Δ, b) = E(a, M_0^{-1}H*n)(b, M_0^{-1}H*n)
= E(HM_0^{-1}a, n)(n, HM_0^{-1}b)
= (HM_0^{-1}a, HM_0^{-1}b) = (a, M_0^{-1}b).

Therefore,

K_Δ = M_0^{-1}.

An interesting property of the inner product in Y which was used above is that the noise behaves as white:

E(y, n)(n, z) = E(R^{-1}y, n)(n, R^{-1}z) = (R^{-1}y, RR^{-1}z) = (y, z).

It follows that the variance of (Δ, Me) is (e, MM_0^{-1}Me). The probability of error is then the probability that this random variable is greater than the value prescribed by (3), which is given by the error function Q:

Pr[error] = Q[0.5(e, Me)/√(e, MM_0^{-1}Me)].    (4)
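Formula (4) is straightforward to evaluate numerically. In the sketch below the kernels M and M_0 are small symmetric positive definite matrices (the Toeplitz entries are borrowed from Example 3 purely for illustration); with M = M_0 the argument collapses to the 0.5·distance criterion of Example 1.

```python
import numpy as np
from math import erfc, sqrt

def q(x):
    """Gaussian tail function, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * erfc(x / sqrt(2.0))

def pr_error(e, M, M0):
    """Formula (4): probability of preferring b0 + e over b0 when the
    receiver uses metric kernel M while the true kernel is M0 = H*H."""
    num = 0.5 * (e @ M @ e)
    var = e @ M @ np.linalg.solve(M0, M @ e)   # (e, M M0^{-1} M e)
    return q(num / sqrt(var))

# Toy kernels (symmetric positive definite Toeplitz section).
M0 = np.array([[1.85, 1.08, 0.2],
               [1.08, 1.85, 1.08],
               [0.2, 1.08, 1.85]])
e = np.array([1.0, -1.0, 0.0])

# Proper metric M = M0: the argument reduces to 0.5*sqrt((e, M0 e)).
assert np.isclose(pr_error(e, M0, M0), q(0.5 * sqrt(e @ M0 @ e)))
# A mismatched metric (here the identity, i.e., a plain slicer) degrades it.
print(pr_error(e, M0, M0), pr_error(e, np.eye(3), M0))
```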


Note that the probability of error depends on two quadratic forms of the error sequence. This makes it difficult to minimize the argument of the Q function to estimate the overall probability of error. The suboptimal receiver resulting from the use of the simplified metric as described is different from the suboptimal receiver proposed and analyzed in [8], as shown in Example 3 below.

B. Examples

1) As the first example, take the correct metric, that is, M = M_0; then the probability of b̂_u being closer to b_1 = b_0 + e than to b_0 is simply Q[0.5||e||_{M_0}]. Further, for uncorrelated noise, R = σ²I, yielding ||e||²_{M_0} = (He, He)/σ² = d²/σ². The above is the traditional distance/sigma criterion.

2) Let now M = I, that is, a simple slicer is used after the ZFE. Then,

Pr[error] = Q[0.5(e, e)/√(e, M_0^{-1}e)].

Putting e = e_1 (e_1 is a sequence of all zeros except for the value 1 at position 0) in the above expression, one gets the probability of having the bit at position zero in error:

Pr[bit error at zero] = Q[0.5/√(e_1, M_0^{-1}e_1)].

The above expression is equivalent to a well-known estimate for the probability of error for the ZFE detector [7]. However, a more accurate estimate can be obtained by minimizing, over all possible error sequences e, the argument of the Q function in the preceding expression, yielding a possibly higher probability of error.

3) As a final example, let the noise be stationary and white, and let the channel be a minimum bandwidth channel with response, sampled at clock times and expressed as a polynomial in z,

h(z) = 1 + 0.9z + 0.2z².

Then,

σ²M_0(z) = h(z)h(z^{-1}) = 0.2z^{-2} + 1.08z^{-1} + 1.85 + 1.08z + 0.2z².

The implementation of the related metric involves a trellis of four states. To reduce the number of states of the trellis, the channel is equalized to a partial response class 1 channel (denoted by PR), with response

h_1(z) = 1 + z.

A Viterbi detector for the PR channel is then used, yielding a two-state trellis. With binary inputs from the set {1/2, −1/2}, the performance of the PR receiver is calculated as Q[0.5||e||_M], where σ² is the in-band noise variance. In the above calculation, the minimum distance for sequences in the PR channel is achieved by the error sequences e_1 consisting of a single error, i.e., with

σ²M(z) = (1 + z)(1 + z^{-1}) = z^{-1} + 2 + z,

d²_min/σ² = (e_1, Me_1) = 2/σ² (the diagonal term).

This calculation does not take into account the effect of the equalization on the noise. To take this into account, (4) must be used. For the same error sequence e_1, the calculation yields

(e_1, MM_0^{-1}Me_1) = 2.262/σ².    (5)

The quadratic (5) was computed by multiplication and division of polynomials. It could also be computed by integration in the frequency domain, as described in Appendix II.

By focusing attention on the PR channel one fails to recognize that the minimum distance error sequence for the actual channel is not e_1, but e = (··· 1 −1 0 0 0 ···), i.e., two consecutive errors of opposite sign. For this error sequence the performance of the optimal receiver is

Pr[error] = Q(√1.54/(2σ)) = Q(1.241/(2σ)).

To obtain the performance of the suboptimal receiver for the error sequence e, compute

(e, MM_0^{-1}Me) = 2.752/σ².

It is possible that yet another error sequence could yield a worse performance for this suboptimal receiver.

If the PR receiver were used without any attempt actually to equalize the channel to the PR response, the performance would be measured by the method suggested in [8, chs. 3 and 8], yielding

Pr[error] = Q(0.5 d_A)

with

d_A² = (e, Me) − ((H_1 − H)a, He) = (e, Me) − (H*(H_1 − H)a, e).

Here the performance is not only a function of the error sequence e, but for each e one can look for an actual sequence a yielding the smallest d_A². The error sequence

e = (··· 0 0 0 1 −1 0 0 0 ···),

together with

a = (··· x x 0.5 −0.5 −0.5 0.5 0 0 0 x x ···),

where the x's correspond to "don't care" entries, yields 1.14.
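The polynomial arithmetic of Example 3 can be checked numerically (white noise, σ² factored out). The coefficients of h(z)h(z^{-1}) are the autocorrelation of the channel taps, and the two candidate error events give the squared distances quoted above.

```python
import numpy as np

# Channel of Example 3: h(z) = 1 + 0.9z + 0.2z^2.
h = np.array([1.0, 0.9, 0.2])

# sigma^2 M0(z) = h(z) h(1/z): coefficients are the autocorrelation of h.
autocorr = np.convolve(h, h[::-1])
assert np.allclose(autocorr, [0.2, 1.08, 1.85, 1.08, 0.2])

# Squared distances (e, M0 e) for the two candidate error sequences.
M0 = np.array([[1.85, 1.08],
               [1.08, 1.85]])            # 2x2 section of the Toeplitz M0
e_single = np.array([1.0, 0.0])          # e1: a single error
e_double = np.array([1.0, -1.0])         # two consecutive errors, opposite sign
assert np.isclose(e_single @ M0 @ e_single, 1.85)
assert np.isclose(e_double @ M0 @ e_double, 1.54)   # sqrt(1.54) ≈ 1.241

# The true minimum-distance event is the double error, which the PR-equalized
# analysis (d_min^2 = 2 under the PR metric) does not reveal.
```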


VII. CONCLUSION

A geometric view of a digital communication channel is developed by considering the channel as a map from the input sequence space to the observation space. A new approach to the computation of the MLSE under input constraints is developed. This approach is based on the zero forcing equalizer's sampled output, which constitutes a sufficient statistic for the estimation of the input sequence. This new approach has the side benefit, not explored here, of providing a signal which is interference-free for clocking and automatic gain control. The new approach not only gives insight into the macro operations involved but also shows that information preserving transformations do not simplify the complexity of the MLSE algorithm. It also provides a way of evaluating the information losses from a simplified metric via the estimation of performance. It is shown that, in case of metric simplification, the concept of minimum distance must be replaced by a more complex functional. In these suboptimal receivers, the functional (4) can be used to search for the metric simplification and proper input constraint codes. Such a task, although difficult, should replace the standard Hamming and Euclidean distances in such suboptimal cases.

    ACKNOWLEDGMENT

    The author wishes to thank his colleagues Richard G. Hirko, Thomas D. Howell, and Paul H. Siegel for valuable comments which proved instrumental in improving the clarity and understanding of the concepts presented in this paper, Sylvia S. Fujii for the skillful formatting of the manuscript, John R. De Lany for the artwork, and the reviewers for many constructive suggestions that considerably improved the paper.

    APPENDIX I
    REVIEW OF SUFFICIENCY WITH APPLICATIONS

    A. Sufficient Statistics

    Let y and b be random variables (RVs) (vectors, processes). A statistic ỹ = T(y) (hence an RV) is said to be sufficient for b if

        a) p(b/y, ỹ) = p(b/ỹ)   (independent of y)

    or

        b) p(y/b, ỹ) = p(y/ỹ)   (independent of b).

    Here, p(·) is the probability density function for the variables inside the parentheses. Slashes are used, as usual, to denote conditional densities. The following theorem is immediate.

    Theorem 1: a) ⇔ b).

    The proof follows from the application of Bayes' relation in bringing the left side of a) to the left side of b) and vice versa.

    Theorem 2, the factorization theorem, suggests a way of arriving at sufficient statistics.

    Theorem 2: ỹ = T(y) is a sufficient statistic (s.s.) for b if and only if

        p(y/b) = f(ỹ, y) · g(b, ỹ).

    For completeness, a short proof follows.

    1) Let ỹ be an s.s. From

        p(ỹ, y/b) = p(y/ỹ, b) · p(ỹ/b)

    it follows that, for ỹ = T(y),

        p(y/b) = p(ỹ, y/b) = p(y/ỹ, b) · p(ỹ/b) = p(y/ỹ) · p(ỹ/b)

    (the second equality since ỹ is a function of y, the last by b)), and the expression on the right has the factorization property.

    2) Now, assume that the factorization holds. Then

        p(ỹ/b) = ∫_{T(y) = ỹ} p(y/b) dy = g(b, ỹ) ∫_{T(y) = ỹ} f(ỹ, y) dy

    and

        p(y/b, ỹ) = p(y, ỹ/b) / p(ỹ/b)
                  = f(ỹ, y) g(b, ỹ) / [ g(b, ỹ) ∫ f(ỹ, y) dy ]
                  = f(ỹ, y) / ∫ f(ỹ, y) dy

    and the expression on the right is independent of b. The theorem follows.

    Corollary: Let λ = h(ỹ) where h is invertible. Then, ỹ s.s. ⇔ λ s.s.

    This is an immediate consequence of the factorization theorem.

    B. Application: Linear System with Additive Gaussian Noise

    Let S be the (Euclidean) space of finite input sequences of real numbers with inner product (·, ·)_S and Y the space of observations, with inner product (·, ·). Let H be a linear mapping from S to Y. The linear system under consideration is characterized by the relation

        y = Hb + n

    where b ∈ C ⊂ S is a random input sequence constrained to belong to a set C in S, n is an additive Gaussian noise taking values in Y, and y is the observed output of the system, an RV taking values also in Y.

    The probability density of the noise n is

        p(n) = K exp(−(1/2)||n||²).

    The density of y, given b, becomes

        p(y/b) = p(n = y − Hb) = K exp(−(1/2)||y − Hb||²)
               = [K exp(−(1/2)||y||²)] · exp((H*y, b)_S − (1/2)||Hb||²).

    Here, H* is the adjoint of H, hence mapping Y into S.

    Conclusion: From the factorization theorem, ỹ = H*y is an s.s. for b. This s.s. is an element of the input sequence space S. The above is a restatement of the fact that the samples (at clock rate) of the output of matched filters (regardless of the color of the noise) constitute an s.s. for the input sequence.
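The factorization argument above can be verified numerically. The sketch below is not from the paper; the pulse, dimensions, and noise level are illustrative assumptions. It takes white noise, so the adjoint H* reduces to the matrix transpose, and checks that the b-dependent part of ||y − Hb||² reaches y only through the matched-filter samples H*y.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative channel: convolution matrix H mapping a length-4 input
# sequence in S to a length-5 observation in Y (white noise assumed,
# so the adjoint H* is simply the transpose).
h = np.array([1.0, 0.5])
n_in = 4
H = np.zeros((n_in + len(h) - 1, n_in))
for k in range(n_in):
    H[k:k + len(h), k] = h

b = rng.choice([-1.0, 1.0], size=n_in)               # data sequence
y = H @ b + 0.1 * rng.standard_normal(H.shape[0])    # observed output

# Expansion behind the factorization:
#   ||y - Hb||^2 = ||y||^2 - 2 (H*y, b) + ||Hb||^2,
# so K exp(-||y - Hb||^2 / 2) = [K exp(-||y||^2 / 2)] * exp((H*y, b) - ||Hb||^2 / 2),
# i.e., f(y) * g(b, H*y): the b-dependent factor sees y only through H*y.
y_tilde = H.T @ y
lhs = np.sum((y - H @ b) ** 2)
rhs = np.sum(y ** 2) - 2.0 * (y_tilde @ b) + np.sum((H @ b) ** 2)
assert np.isclose(lhs, rhs)
```

Since the identity holds for every b, any monotone function of the likelihood ordering over candidate sequences can be computed from H*y alone, which is the sufficiency claim restated above.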


    APPENDIX II
    THE INVERTIBILITY OF H*H

    Let H: S → Y. It was already seen that

        Y = X ⊕ N_{H*}

    (see Section II-C), where X is the range of H. That is, the null space of H* is orthogonal to the range of H. Hence the null space of

        H*H: S → S

    coincides with N_H. Hence a necessary and sufficient condition for the existence of (H*H)⁻¹ is

        N_H = {0},

    i.e., no two distinct sequences are mapped by H into the same point in Y. Equivalently, this implies that a column of H is not the linear combination of the remaining columns.

    For the time-invariant/stationary PAM case, the matrices H and H*H = M₀ are Toeplitz, and the existence of the inverse of M₀ can be characterized in terms of H(ω), the Fourier transform of h(t). The characterization can be found in [7], and is briefly reviewed here for completeness.

    Let a ≠ 0 be an arbitrary nonnull sequence in S. M₀ is invertible if and only if

        ||Ha||² = (Ha, Ha) = (a, H*Ha)_S = (a, M₀ a)_S > 0,

    but

        (a, M₀ a)_S = Σ_k Σ_l a_k a_l m_{k−l} = (Δ/2π) ∫_{−π/Δ}^{π/Δ} M₀(ω) |Σ_{k=1}^{n} a_k z^k|² dω

    where z = e^{−jωΔ}, n is the dimension of S,

        m_k = (Δ/2π) ∫_{−π/Δ}^{π/Δ} M₀(ω) e^{jωkΔ} dω,

    M₀(ω) is the spectrum associated with the Toeplitz matrix M₀ (determined by H(ω) and R(ω)), and R(ω) is the average power spectrum of the noise.

    Then a sufficient condition for (a, M₀ a)_S > 0 for arbitrary a ≠ 0 is M₀(ω) ≠ 0 a.e. in ω ∈ (U, V), where (U, V) is some interval of nonzero measure inside (−π/Δ, π/Δ). This is because Σ_{k=1}^{n} a_k z^k is a polynomial in z, hence can have at most n roots inside (U, V), i.e., cannot vanish identically there.

    Although this condition is sufficient for the existence of the inverse of M₀, this inverse may become ill-conditioned when the dimension n of S is large, i.e., n → ∞ may cause (H*H)⁻¹ to become unbounded. A necessary and sufficient condition for a bounded inverse is that 1/M₀(ω) be integrable, ω ∈ (−π/Δ, π/Δ) [7]. To see the sufficiency, note that the integrability of 1/M₀(ω), i.e.,

        ∫_{−π/Δ}^{π/Δ} dω / M₀(ω) < ∞,

    implies that M₀(ω) can be zero only on a set of measure zero; hence the previous sufficient condition for the existence of the inverse for finite sequences is satisfied. For any n, consider the problem

        min_{a ∈ S} ||Ha||²   subject to   (a, e₀)_S = 1

    where again e₀ is a sequence of all zeros except for a 1 at position 0. The uniform boundedness of M₀⁻¹ is equivalent to the fact that the above minimum is uniformly bounded away from zero as n → ∞. The minimum value of ||Ha||² is obviously decreasing with n. The above minimization yields

        min ||Ha||² = 1 / (e₀, M₀⁻¹ e₀)_S.

    Hence (e₀, M₀⁻¹ e₀)_S increases with n. Its limit, as n → ∞, is

        (Δ/2π) ∫_{−π/Δ}^{π/Δ} dω / M₀(ω).

    This is so because the Fourier transform of the rows of M₀⁻¹, as n → ∞, converges to 1/M₀(ω) (as can be easily verified). The integrability of 1/M₀(ω) is also necessary, since the existence of a bounded M₀⁻¹ implies that (e₀, M₀⁻¹ e₀)_S, and hence its limit (Δ/2π) ∫ dω / M₀(ω), remains bounded.
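The spectral conditions of this appendix can be illustrated numerically. The sketch below is not from the paper: it assumes white noise and Δ = 1, so that M₀ = H*H is Toeplitz with symbol M₀(ω) = |H(ω)|², and the pulse values, sizes, and helper names are illustrative. It checks (1) that a nonvanishing folded spectrum gives an invertible M₀, (2) that a spectral null makes M₀ ill-conditioned as n grows, and (3) the variational characterization min ||Ha||² = 1/(e₀, M₀⁻¹ e₀) together with its limiting value (1/2π) ∫ dω/M₀(ω).

```python
import numpy as np

def conv_matrix(h, n):
    # Convolution (Toeplitz) matrix H mapping a length-n input sequence
    # to its full convolution with the pulse h.
    H = np.zeros((n + len(h) - 1, n))
    for k in range(n):
        H[k:k + len(h), k] = h
    return H

# 1) Nonvanishing spectrum M0(w) = |H(w)|^2  =>  M0 = H*H invertible.
h = np.array([1.0, -0.4])
n = 65
H = conv_matrix(h, n)
M0 = H.T @ H                                     # Toeplitz, entries m_{k-l}
w = np.linspace(-np.pi, np.pi, 4096)
M0_w = np.abs(h[0] + h[1] * np.exp(-1j * w)) ** 2
assert M0_w.min() > 0                            # spectrum never vanishes ...
assert np.linalg.eigvalsh(M0).min() > 0          # ... and M0 is invertible

# 2) A spectral null (h = [1, -1] has H(0) = 0) keeps finite-n M0
#    invertible but ill-conditioned: the smallest eigenvalue shrinks
#    with n, mirroring the non-integrability of 1/M0(w).
def eig_min(h, n):
    Hn = conv_matrix(h, n)
    return np.linalg.eigvalsh(Hn.T @ Hn).min()

assert eig_min(np.array([1.0, -1.0]), 32) < eig_min(np.array([1.0, -1.0]), 8)

# 3) Variational characterization: min ||Ha||^2 s.t. (a, e0) = 1 equals
#    1/(e0, M0^{-1} e0); for h = [1, -0.4] the limit of (e0, M0^{-1} e0)
#    is (1/2pi) * int dw / |1 - 0.4 e^{-jw}|^2 = 1/(1 - 0.4^2).
e0 = np.zeros(n)
e0[n // 2] = 1.0                       # unit pulse at "position 0" (centered)
q = e0 @ np.linalg.solve(M0, e0)       # (e0, M0^{-1} e0)
a_star = np.linalg.solve(M0, e0) / q   # minimizing sequence
assert np.isclose(a_star @ M0 @ a_star, 1.0 / q)

rng = np.random.default_rng(1)
v = rng.standard_normal(n)
v[n // 2] = 0.0                        # perturbation preserving (a, e0) = 1
a = a_star + 0.1 * v
assert a @ M0 @ a >= a_star @ M0 @ a_star

assert np.isclose(q, 1.0 / (1.0 - 0.4 ** 2), rtol=1e-6)
```

In check (3) any feasible perturbation a = a* + v with (v, e₀) = 0 increases the quadratic form by exactly (v, M₀v), since M₀a* is proportional to e₀; this is why the optimality assertion holds with no tolerance issues.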

    REFERENCES

    [1] L. C. Barbosa, "Equalizers for sampling detectors: From zero forcing to Viterbi," IBM Res. Rep. RJ 4363, July 18, 1984.
    [2] E. Parzen, "Reproducing kernel Hilbert spaces," J. SIAM Contr., Ser. A, vol. 1, no. 1, 1962.
    [3] C. Heegard, Cornell Univ., Ithaca, NY, private conversation.
    [4] H. Burkhardt and L. C. Barbosa, "Contributions to the application of the Viterbi algorithm," IEEE Trans. Inform. Theory, vol. IT-31, no. 5, pp. 626-634, Sept. 1985.
    [5] L. C. Barbosa, "Factorization of positive operators, with application to prediction theory," IBM Res. Rep. RJ655, Dec. 15, 1969.
    [6] G. D. Forney, Jr., "Maximum likelihood sequence estimation of digital sequences in the presence of intersymbol interference," IEEE Trans. Inform. Theory, pp. 363-378, May 1972.
    [7] D. Messerschmitt, "A geometric theory of intersymbol interference," Bell Syst. Tech. J., vol. 52, pp. 1483-1519, Nov. 1973.
    [8] J. B. Anderson, T. Aulin, and C. E. Sundberg, Digital Phase Modulation. New York: Plenum, 1986.