
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 41, NO. 1, JANUARY 1995

Two-Stage Vector Quantization-Lattice Vector Quantization

Jianping Pan and Thomas R. Fischer, Senior Member, IEEE

Abstract-A two-stage vector quantizer is introduced that uses an unstructured first-stage codebook and a second-stage lattice codebook. Joint optimum two-stage encoding is accomplished by exhaustive search of the parent codebook of the two-stage product code. Due to the relative ease of lattice vector quantization, optimum encoding is feasible for moderate-to-large encoding rates and vector dimensions, provided the first-stage codebook size is kept reasonable. For memoryless Gaussian and Laplacian sources, encoding rates of 2 to 3 b/sample, and vector dimensions of 8 to 32, the signal-to-noise ratio performance is comparable or superior to equivalent-delay encoding results previously reported. For Gaussian sources with memory, the effectiveness of the encoding method is dependent on the feasibility of using a large enough first-stage vector quantizer codebook to exploit most of the source memory.

Index Terms-Vector quantization, lattice, source coding.

I. INTRODUCTION

THIS PAPER introduces a vector quantizer for applications requiring moderate-to-high encoding rate and vector dimension, but low encoding delay. A two-stage vector quantizer-lattice vector quantizer (VQ-LVQ) structure is used, where the first-stage VQ generally has an unstructured codebook (such as results from using the Linde, Buzo, and Gray (LBG) design algorithm [1]) and the second-stage LVQ uses a lattice codebook with (for the mean-square error (mse) distortion measure) a spherical codebook boundary region. The novel feature of the VQ-LVQ formulation is the use of what amounts to "soft decision" first-stage encoding, allowing an optimum search over the joint two-stage codebook. More precisely, jointly optimum two-stage encoding is accomplished by exhaustive search of the parent codebook of the two-stage product code. The structured LVQ codebook is essential to making the optimum codebook search feasible for large encoding rate and vector dimension.

Unstructured vector quantizers (such as those designed using the LBG algorithm [1]) can perform well for a variety of sources, but practical applications are limited by the complexity of codebook search and codebook storage. Tree-search VQ [2] reduces the codebook search complexity with a small increase in distortion, but fails to reduce the codebook storage requirement (and typically doubles it). Multistage vector

Manuscript received March 11, 1993; revised March 31, 1994. This work was supported, in part, by the National Science Foundation under Grants MIP-9116683 and NCR-9303868. Part of the material in this paper was presented at the International Symposium on Information Theory, June 1994.

The authors are with the School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164 USA.

IEEE Log Number 9406685.

quantization (MSVQ) [2] divides the quantization task into several successive stages, resulting in a reduction of codebook search and storage complexity, but also a significant increase in encoding distortion. The performance can be improved using a multi-survivor-path search algorithm, but the per-stage complexity then increases in proportion to the number of retained survivor paths. Vector quantization with direct sum codebooks [3] has a complexity equivalent to that of LBG VQ, but reduces the codebook storage complexity.

Lattice vector quantization (LVQ) [4]-[6], with a shaped boundary region [7]-[11], offers the granular gain [11] of the lattice codebook, and the boundary gain [11] provided by appropriate shaping of the codebook region of support. Piecewise-uniform lattice vector quantization (PULVQ) [10] has been proposed for improving (uniform) LVQ performance by using a lattice point density dependent on the source distribution.

Entropy-coded scalar quantization [12] and entropy-coded trellis-coded quantization [13], [14] provide, for memoryless sources, performance within 1.53 and 0.5 dB of the rate-distortion function, respectively, but typically require a variable-length noiseless code to be used with the quantization. The scalar-vector quantizer (SVQ) [15] may be thought of as a truncated (to fixed-rate) version of an entropy-coded scalar quantizer, and captures much of the available boundary gain, but none of the possible granular gain. The trellis-based scalar-vector quantizer (TBSVQ) [16] combines trellis-coded quantization [17] (for granular gain) with the SVQ (for boundary gain), and provides excellent fixed-rate encoding performance. However, as with all trellis encoding methods, some encoding delay is required.

The objective of this work is to introduce the VQ-LVQ for tasks of low-delay, fixed-rate encoding, requiring moderate-to-high encoding rates and vector dimensions. In Section II, the motivation for VQ-LVQ is presented and the basic structure is described. In Section III, two VQ-LVQ codebook design algorithms are presented. In Section IV, the VQ-LVQ encoding performance is presented for memoryless Gaussian and Laplacian sources and the first-order Gauss-Markov source. Comparisons with alternative fixed-length encoding methods demonstrate the effectiveness of the approach.

II. VQ-LVQ FORMULATION

A. Motivation

The motivation for the VQ-LVQ structure rests on two well-known results from rate-distortion theory. First, it is known that for a wide class of memoryless sources and the mse distortion measure, as the encoding rate becomes large



Fig. 1. (Normalized) histogram of the VQ encoding error for M_1 = 16, 64, 256, and 1024, for a memoryless Laplacian source and using 8-dimensional LBG VQs (solid/star, dotted, dash-dotted, and dashed lines are for the cases of M_1 = 16, 64, 256, and 1024, respectively; the solid line is the density function for a Gaussian random variable with zero mean and variance 0.16).

the Shannon lower bound [18] coalesces with the source rate-distortion function [19, ch. 4]. As a consequence, as the rate becomes large the backward channel model holds and the (ideal) encoding noise is memoryless and Gaussian. Second, for many Gaussian sources with memory, there exists a critical encoding rate, say R_c, such that for any encoding rate above R_c the (ideal) encoding error is, again, white and Gaussian. What is unclear from this theory, however, is how quickly (with vector dimension or encoding rate) the vector quantization encoding error tends to become Gaussian-distributed. Naturally, this tendency is source-dependent, but some simple experiments are illustrative.

Let X be a memoryless source, and consider the L-dimensional vector quantization (VQ) of such a source, using a VQ codebook of size M_1 = 2^{R_1 L}, where R_1 is the encoding rate, in bits per dimension. Fig. 1 shows the histogram of the VQ encoding error for M_1 = 16, 64, 256, and 1024, for a memoryless Laplacian source and using 8-dimensional VQ designed with the LBG algorithm. As the encoding rate increases, the VQ encoding error distribution begins to look Gaussian. Fig. 2 shows the histogram of the VQ encoding error for M_1 = 256, and 8-, 16-, and 32-dimensional LBG VQ. Even in the 32-dimensional case, where the effective bit rate is just 0.25 b/dimension, modeling the vector quantization error distribution as Gaussian seems acceptable. Similar experiments with other sources support the contention that, in many cases, the VQ encoding error distribution can be reasonably modeled as Gaussian.

Lee and Neuhoff [20] and Lee, Neuhoff, and Paliwal [21] consider multistage VQ with very different modeling assumptions from those just presented. If the source density is smooth and the first-stage VQ encoding rate large, then it is apparent that the first-stage encoding error is (asymptotically) uniform over each quantization cell. Hence, as shown in [20], conditioned on each first-stage codeword, the best second-stage VQ is a uniform quantizer. Further, it is shown in [20] that, asymptotic in the size of the first-stage codebook, the overall encoding distortion is equivalent to that of a single-stage VQ.

Fig. 2. (Normalized) histogram of the VQ encoding error for M_1 = 256, and 8-, 16-, and 32-dimensional LBG VQs, for a memoryless Laplacian source (dashed, dash-dotted, and dotted lines are for the cases of 8, 16, and 32 dimensions, respectively; the solid line is the density function for a Gaussian random variable with zero mean and variance 0.25).

The apparent inconsistencies between our modeling assumptions and those in [20], [21] can be resolved as follows. First, for large first-stage VQ encoding rate, although the quantization error is uniform over each cell, it is not, in general, uniform in a per-dimension sense. As the vector dimension gets large, the per-dimension quantization error distribution must approach Gaussian if the VQ performance is to approach the rate-distortion function. Second, the approach in [20], [21] is to use a nearest-codeword first-stage VQ encoding. In contrast, VQ-LVQ uses, effectively, a soft-decision first-stage quantization (more precisely, an exhaustive search of the parent codebook of the two-stage product codebook), so that for a given input vector, the closest first-stage codeword does not necessarily yield the overall optimum VQ-LVQ reproduction vector.

Based on the modeling assumption of a Gaussian-distributed first-stage VQ encoding error, the second-stage VQ can be selected as a lattice vector quantizer with a spherical codebook. The VQ-LVQ (with spherical LVQ codebook) can achieve much of the boundary gain (in the second stage) [11], compared to the vector-scalar quantizer [22]. The highly structured LVQ codebook allows optimum two-stage encoding. Practical designs can use a relatively small number of first-stage VQ codewords.

B. Structure of Two-Stage VQ-LVQ

A diagram of the VQ-LVQ encoder is shown in Fig. 3.

The first-stage VQ codebook, Y_1, has M_1 codewords, and the second-stage lattice VQ codebook, Y_2, has M_2 codewords (all of the lattice points within a sphere), with an overall encoding rate of R = \lceil \log_2 M_1 M_2 \rceil / L bits per dimension, where \lceil a \rceil denotes the smallest integer no smaller than a. Let S_i, i = 1, 2, \ldots, M_1, be diagonal scaling matrices. VQ-LVQ encoding is accomplished as follows. Given an input vector x, for each y_{1,i} \in Y_1 the difference x - y_{1,i} is computed. The scaled vector e_i = S_i^{-1}(x - y_{1,i}) is then quantized as a lattice codeword y_{2,j} in Y_2. The quantized vector is

\hat{x}_{i,j} = y_{1,i} + S_i y_{2,j}


Fig. 3. The structure of two-stage VQ-LVQ.

and the corresponding squared error is

d_{i,j} = \| x - (y_{1,i} + S_i y_{2,j}) \|^2.

Jointly optimum encoding means selection of the index pair (i^*, j^*) that minimizes d_{i,j}:

(i^*, j^*) = \arg\min_{i,j} \{ d_{i,j} \}.

In the special case S_i = \alpha_i I, i = 1, 2, \ldots, M_1, jointly optimum encoding can be accomplished by a two-step encoding algorithm, since then d_{i,j} = \alpha_i^2 \| e_i - y_{2,j} \|^2. The encoding algorithm is the following.

i) For each input vector x, form e_i = S_i^{-1}(x - y_{1,i}), i = 1, 2, \ldots, M_1.

ii) For each e_i, find the closest LVQ codeword, say y_{2,j^*(i)}.

iii) Determine i^* to minimize \alpha_i^2 \| e_i - y_{2,j^*(i)} \|^2. (A C sketch of this two-step search is given below.)
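As a concrete illustration, here is a minimal C sketch of the two-step search for the case S_i = \alpha_i I. The function and parameter names are ours, and lvq_nearest stands in for the lattice codebook search of Section II-D; this is an illustrative sketch under those assumptions, not the authors' implementation.

#include <float.h>
#include <stddef.h>

/* Hypothetical lattice search (assumed supplied elsewhere): writes the
   LVQ codeword closest to e into y2, standing in for Section II-D. */
void lvq_nearest(const double *e, size_t L, double *y2);

/* Two-step jointly optimum VQ-LVQ encoding, case S_i = alpha_i * I:
   i)   form e_i = (x - y1_i) / alpha_i for every first-stage codeword;
   ii)  find the closest LVQ codeword y2_{j*(i)} to each e_i;
   iii) pick i* minimizing alpha_i^2 * ||e_i - y2_{j*(i)}||^2.
   Returns i*; the winning lattice codeword is left in y2_best (length L). */
size_t vq_lvq_encode(const double *x, size_t L,
                     const double *y1,     /* M1 x L, row i is y_{1,i} */
                     const double *alpha,  /* M1 scale factors alpha_i */
                     size_t M1, double *y2_best)
{
    double e[64], y2[64];          /* scratch; assumes L <= 64 */
    double d_best = DBL_MAX;
    size_t i_best = 0;

    for (size_t i = 0; i < M1; i++) {
        for (size_t l = 0; l < L; l++)                     /* step i)   */
            e[l] = (x[l] - y1[i * L + l]) / alpha[i];
        lvq_nearest(e, L, y2);                             /* step ii)  */
        double d = 0.0;
        for (size_t l = 0; l < L; l++) {
            double t = e[l] - y2[l];
            d += t * t;
        }
        d *= alpha[i] * alpha[i];                          /* step iii) */
        if (d < d_best) {
            d_best = d;
            i_best = i;
            for (size_t l = 0; l < L; l++)
                y2_best[l] = y2[l];
        }
    }
    return i_best;
}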

C. Selection of Lattice VQ Codebook

Let \Lambda be a lattice (see, e.g., [23], [24] for a general description of lattice VQ). Since the (scaled) error vector to be quantized is assumed to be (roughly) zero-mean and Gaussian-distributed, the LVQ codebook is selected as the M_2 lattice points closest to the origin. Typically, Y_2 consists of all lattice points on or within an L-dimensional sphere, and the codebook size M_2 is generally not an integral power of two. One strategy for handling the enumeration of lattice points is then to allocate R_2 = (1/L) \lceil \log_2 M_2 \rceil bits per dimension for encoding the LVQ codewords, with R_1 = (1/L) \lceil \log_2 M_1 \rceil bits per dimension assigned for representing the first-stage VQ codewords, subject to an overall rate constraint of R \geq R_1 + R_2, where R is the average encoding rate in bits per dimension for the VQ-LVQ. A slightly more effective coder results from selecting M_1 as the largest integer satisfying M_1 M_2 \leq 2^{RL}, and then enumerating the VQ-LVQ codeword pairs as integers in 0, \ldots, 2^{RL} - 1. If we let m_1 \in \{0, 1, \ldots, M_1 - 1\} index the first-stage VQ codewords and m_2 \in \{0, 1, \ldots, M_2 - 1\} index the LVQ codewords, then the index for the codeword pair (m_1, m_2) is m = m_1 + M_1 m_2. This enumeration strategy is used for all the simulations in this paper.
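The enumeration rule transcribes directly into code. A small sketch follows (the function names are ours; the index m is assumed to fit in an unsigned long):

/* Pack the codeword pair (m1, m2) into m = m1 + M1 * m2 in
   {0, ..., M1*M2 - 1}, valid whenever M1 * M2 <= 2^(RL), and unpack. */
unsigned long pair_to_index(unsigned long m1, unsigned long m2,
                            unsigned long M1)
{
    return m1 + M1 * m2;
}

void index_to_pair(unsigned long m, unsigned long M1,
                   unsigned long *m1, unsigned long *m2)
{
    *m1 = m % M1;   /* first-stage VQ codeword index   */
    *m2 = m / M1;   /* second-stage LVQ codeword index */
}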


TABLE I
GRANULAR GAIN OF THE D_L LATTICE, OR COSET OF D_L, OVER THE CUBIC LATTICE (IN dB, BY DIMENSION)

Let Z^L be the L-dimensional cubic (integer) lattice. The D_L lattice is a sublattice of Z^L, consisting of all y \in Z^L satisfying \sum_{l=1}^{L} y_l = even. The coset of D_L, denoted D_L', consists of all y \in Z^L satisfying \sum_{l=1}^{L} y_l = odd. For L \geq 1, k \geq 0, let B(L, k) be the set of all y \in Z^L of squared norm k, i.e.,

B(L, k) = \{ y \in Z^L : \sum_{l=1}^{L} y_l^2 = k \},

and let N(L, k) be the number of cubic lattice points in B(L, k). The D_L lattice then consists of the "shells" B(L, k) for k even, and D_L' consists of the shells with k odd.

There is a slight performance advantage to using the D_L lattice (or D_L') for quantization, compared to the cubic lattice. Using the known value of mean-square error per Voronoi region [6, eq. (33)], the granular gain of the D_L lattice is easily computed, and is listed in Table I for several dimensions. The granular gain for D_L' is identical. This gain is achievable for large encoding rate, but for moderate or small encoding rates, less gain is possible. Although the gains in Table I are small, they are achieved with little increase in complexity, compared to cubic lattice quantization. As the dimension gets large, the granular gain of the D_L lattice vanishes.

The LVQ codebooks used in this paper consist of either the points in D_L, or the points in D_L', that lie inside or on an L-sphere. The number of cubic lattice points that lie on a sphere of squared radius k is easily computed [8]. Table II summarizes the values of squared radius k, \lceil \log_2 M_2 \rceil for possible LVQ codebooks, and the size of the corresponding VQ codebook M_1, for several dimensions and at three encoding rates.
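The shell sizes N(L, k) are easy to compute with a dynamic program over the dimensions. The sketch below is our own construction (with a fixed scratch bound k < 4096 assumed); the main() shows how the even shells would be summed to size a D_8-based codebook of squared radius 10.

#include <stdio.h>

/* N(L, k): number of points of the cubic lattice Z^L with squared norm
   exactly k (the shells B(L, k)).  Recurrence over the last coordinate:
   N(l, j) = N(l-1, j) + 2 * sum_{t >= 1, t*t <= j} N(l-1, j - t*t). */
unsigned long long shell_count(int L, int k)
{
    static unsigned long long N[4096], next[4096];  /* scratch; k < 4096 */
    for (int j = 0; j <= k; j++) N[j] = 0;
    N[0] = 1;                                       /* N(0, 0) = 1 */
    for (int l = 1; l <= L; l++) {
        for (int j = 0; j <= k; j++) {
            unsigned long long c = N[j];            /* last coordinate 0   */
            for (int t = 1; t * t <= j; t++)
                c += 2ULL * N[j - t * t];           /* last coordinate +-t */
            next[j] = c;
        }
        for (int j = 0; j <= k; j++) N[j] = next[j];
    }
    return N[k];
}

int main(void)
{
    /* D_L keeps the even-norm shells, so a D_8 codebook of squared
       radius 10 has M2 = sum of N(8, k) over k = 0, 2, ..., 10. */
    unsigned long long M2 = 0;
    for (int k = 0; k <= 10; k += 2) M2 += shell_count(8, k);
    printf("M2 = %llu\n", M2);
    return 0;
}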

D. Encoding Algorithm for LVQ

The LVQ codebooks consist of spherically bounded subsets of either the D_L lattice or the coset D_L', so that every codeword in Y_2 satisfies \| y_2 \|^2 \leq k. Assuming the codebook uses points in D_L, a given vector, say e, is LVQ-encoded as follows.

1) Use the Conway and Sloane algorithm [25] to find the closest point in D_L, say \hat{e}, to e.

2) If \| \hat{e} \|^2 \leq k, then the LVQ codeword is y_2 = \hat{e}. Otherwise, go to 3).

3) Project e onto the sphere of squared radius k, and find the closest point in Y_2 (using a technique similar to the method in [26]). (A C sketch of step 1) follows.)
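Step 1) is the well-known rounding rule for D_L: round every coordinate to the nearest integer, and if the coordinate sum has the wrong parity, re-round the single worst coordinate the other way. A minimal C sketch (our naming; the boundary handling of steps 2) and 3) is omitted):

#include <math.h>

/* Conway-Sloane closest-point search in D_L [25].  For the coset D_L'
   (odd coordinate sum), invert the parity test below. */
void closest_point_DL(const double *e, int L, long *y)
{
    long sum = 0;
    int worst = 0;
    double worst_err = -1.0;

    for (int l = 0; l < L; l++) {
        y[l] = lround(e[l]);                  /* nearest integer */
        sum += y[l];
        double err = fabs(e[l] - (double)y[l]);
        if (err > worst_err) { worst_err = err; worst = l; }
    }
    if (sum % 2 != 0)                         /* wrong parity: fix it by   */
        y[worst] += (e[worst] > (double)y[worst]) ? 1 : -1;  /* re-rounding */
}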

E. Sequential and Jointly Optimum Two-Stage VQ Encoding

The VQ-LVQ structure is a special case of multistage VQ [2].

TABLE II
SQUARED RADIUS OF THE LVQ CODEBOOK (D_L LATTICE FOR EVEN k, D_L' FOR ODD k) AND CORRESPONDING SIZE OF THE FIRST-STAGE VQ CODEBOOK, FOR SEVERAL VECTOR DIMENSIONS AND ENCODING RATES. (For every case, M_1 M_2 \leq 2^{RL}.)

Traditional multistage VQ uses sequential or "hard decision" encoding at each stage. In the VQ-LVQ case, this implies first computing the y_1 \in Y_1 to minimize \| x - y_1 \|^2, say y_{1,i}, and then computing the y_2 \in Y_2 to minimize \| x - y_{1,i} - S_i y_2 \|^2 for only this single y_{1,i}. In general, there is a significant performance advantage to selecting the jointly optimum codeword pair (y_{1,i^*}, y_{2,j^*}). In multistage VQ with unstructured codebooks, jointly optimum encoding can be very complex, although "soft-decision" encoding methods, such as the M-L algorithm [2], can achieve much of the potential improvement. On the other hand, jointly optimum encoding is relatively simple in the VQ-LVQ, due to the structured codebook of the second-stage LVQ.

Table III compares optimum and hard-decision first-stage encoding performance of the VQ-LVQ for a memoryless Gaussian source and several encoding rates and vector dimensions. Each case uses S_i = \alpha_i I, so that the jointly optimum VQ-LVQ encoding complexity is not significantly different from the sequential, hard-decision two-stage VQ encoding. The performance advantage of the jointly optimum encoding is obvious.

F. Complexity

The structural constraints of the VQ-LVQ imply that the performance must generally be inferior to that of an optimum single-stage VQ.

TABLE III
PERFORMANCE COMPARISON OF VQ-LVQ USING SEQUENTIAL (HARD DECISION) AND JOINTLY OPTIMUM ENCODING SCHEMES, FOR THE MEMORYLESS GAUSSIAN SOURCE (70 000 TESTING VECTORS). (The codebook size satisfies \lceil \log_2 M_1 \rceil = 8 for the first-stage VQ, except \lceil \log_2 M_1 \rceil = 7 for a rate of 2.0 and dimension of 8.)

However, the VQ-LVQ formulation has a large complexity advantage over a single-stage unstructured-codebook VQ.

An implementation of the VQ-LVQ must include the VQ-LVQ codebook search and the enumeration encoding (as a bit string) of the selected codeword. The VQ-LVQ codebook search complexity is proportional to M_1, the number of first-stage VQ codewords. For each first-stage VQ error e_i, the


closest point in the D_L lattice, say \hat{e}_i, either satisfies the spherical codebook boundary constraint (\| \hat{e}_i \|^2 \leq k) or it does not (\| \hat{e}_i \|^2 > k). If the former condition holds, then \| (x - y_{1,i}) - S_i \hat{e}_i \|^2 is computed and used to select the best first-stage codeword. This distortion computation and comparison is little different from that required in standard LBG VQ. In the second case, after projecting \hat{e}_i onto the sphere of squared radius k, the closest point in the Z^L lattice is first found. If this point is not in Y_2, a sequential adjustment of the vector components is done, similar to the method in [26], concluding with the closest point in Y_2. Since the projected vector has squared norm k, at most L adjustments are necessary for convergence to a vector in Y_2.

The enumeration encoding and decoding algorithms have been implemented in the C programming language. The main idea is to enumerate the lattice points on each spherical shell (layer), with the union of all such shells making up the second-stage LVQ codebook Y_2. The enumeration coding of the spherical LVQ is similar to the enumeration coding of the pyramid VQ in [26], [27]. The complexity of the enumeration encoding of the VQ-LVQ codewords is much lower than that of the VQ-LVQ codebook search.
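One way to realize the layered enumeration (sketched here under our own naming) is to index a codeword by the cumulative size of the smaller shells of the same parity plus its rank within its own shell; the within-shell rank function, which would follow the pyramid-VQ enumeration of [26], [27], is not reproduced here.

/* Offset of shell k in the codeword index space: the number of codebook
   points in all smaller shells of the same parity (0 for D_L, 1 for D_L').
   shell_count() is the N(L, k) routine sketched in Section II-C. */
unsigned long long shell_count(int L, int k);

unsigned long long shell_offset(int L, int k, int parity)
{
    unsigned long long off = 0;
    for (int j = parity; j < k; j += 2)
        off += shell_count(L, j);
    return off;
    /* The full index of y2 with ||y2||^2 = k would then be
       shell_offset(L, k, parity) + rank_within_shell(y2). */
}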

III. DESIGN ALGORITHMS FOR VQ-LVQ CODEBOOK

It is complex to design jointly optimal codebooks for (unstructured) multistage VQ [28], [3], unless the overall encoding rate is small. In the design of the VQ-LVQ, the second-stage LVQ is defined by the lattice, the spherical boundary, and an allocated bit rate. Thus a joint codebook design for the VQ-LVQ requires design of the first-stage VQ codebook and a set of scaling matrices, based on a given LVQ.

A. Locally Optimal Codebook Design Algorithm

Suppose that a training set of source vectors X is used for the design of the first-stage VQ codebook Y_1 = \{ y_{1,i}, i = 1, \ldots, M_1 \} and the set of diagonal scaling matrices S = \{ S_i, i = 1, \ldots, M_1 \}, where

S_i = diag[\sigma_{i,1}, \ldots, \sigma_{i,L}].

In encoding, for each input x \in X, the error vector x - y_{1,i} is formed, scaled by S_i^{-1}, and the best LVQ codeword y_2 is selected. The reproduction vector is

\hat{x} = y_1 + S y_2, (2)

where the pair y_1 and y_2 (with S dependent on y_1) minimizes

\| x - (y_1 + S y_2) \|^2. (3)

Assign x to class C_i if the best first-stage codeword is y_{1,i}. The average squared error distortion is then

J = \frac{1}{|X|} \sum_{x \in X} \| x - (y_1 + S y_2) \|^2 = \frac{1}{|X|} \sum_{i=1}^{M_1} \sum_{x \in C_i} \| x - (y_{1,i} + S_i y_2) \|^2. (4)

The necessary conditions for minimizing J over Y_1 and S are

y_{1,i} = \frac{1}{|C_i|} \sum_{x \in C_i} (x - S_i y_2), \quad i = 1, \ldots, M_1, (5)

and

\sigma_{i,l} = \frac{\sum_{x \in C_i} y_{2,l} (x_l - y_{1,i,l})}{\sum_{x \in C_i} y_{2,l}^2}, \quad l = 1, \ldots, L, \; i = 1, \ldots, M_1, (6)

where x_l, y_{2,l}, and y_{1,i,l} are the l-th components of x, y_2, and y_{1,i}, respectively. If S_i = \alpha_i I, i = 1, 2, \ldots, M_1, then (6) is replaced by

\alpha_i = \frac{\sum_{x \in C_i} y_2^T (x - y_{1,i})}{\sum_{x \in C_i} y_2^T y_2}. (7)

An iterative algorithm, denoted Algorithm 1, for designing locally optimum Y_1 and S is summarized as follows.

1) Let Y_1^k, S^k denote the VQ codebook and diagonal scaling matrices at iteration index k. Initialize with X, Y_1^0, S^0, set d_0 = \infty, set iteration index k = 1, and select a convergence threshold \delta > 0.

2) Encode x \in X using Y_1^{(k-1)}, S^{(k-1)}, and compute the resulting distortion d_k. If (d_{k-1} - d_k)/d_k < \delta, then stop, with Y_1^{(k-1)}, S^{(k-1)} the final design. Otherwise, go to 3).

3) Update Y_1^k using (5) as

y_{1,i}^k = \frac{1}{|C_i^{(k-1)}|} \sum_{x \in C_i^{(k-1)}} (x - S_i y_2), \quad i = 1, \ldots, M_1.

4) Encode x \in X using Y_1^k, S^{(k-1)}. Update S^k using (6) (or (7)) as

\sigma_{i,l}^k = \frac{\sum_{x \in C_i^k} y_{2,l} (x_l - y_{1,i,l}^k)}{\sum_{x \in C_i^k} y_{2,l}^2}, \quad l = 1, \ldots, L, \; i = 1, \ldots, M_1.

Replace k \leftarrow k + 1; return to 2).

B. Separate Codebook Design Algorithm

An alternative first-stage codebook design algorithm is to simply use the LBG algorithm to design the codebook, and select the scaling matrices based on the conditional variance of each quantization region. The codebook design algorithm, denoted as Algorithm 2, is summarized as follows.

1) Given a set of training data, design the codebook of the first-stage VQ using the LBG design algorithm.

2) Classify the training set into subsets using the partition regions represented by the codewords. For each subset, say i, find the mean squared quantization error in each dimension, say \sigma_{i,l}^2, l = 1, \ldots, L.

3) The scaling matrix for the codeword y_{1,i} is then S_i = \beta \, diag[\sigma_{i,l}], where \beta is a scale factor setting the resolution of the spherical LVQ codebook. For S_i = \alpha_i I, let \alpha_i, i = 1, 2, \ldots, M_1, be the average of \sigma_{i,l} over the dimensions. (A C sketch of steps 2) and 3) follows.)
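A minimal C sketch of steps 2) and 3) for the case S_i = \alpha_i I follows; the names and signature are ours, and cls[n] is assumed to hold the LBG class of training vector n.

#include <stdlib.h>
#include <math.h>

/* Per-class, per-dimension mean squared quantization error, then one
   scale alpha_i per class as beta times the average of sigma_{i,l}
   over the dimensions (Algorithm 2, steps 2 and 3). */
void algorithm2_scales(const double *x, const int *cls, size_t Ntrain,
                       const double *y1, size_t M1, size_t L,
                       double beta, double *alpha)
{
    double *mse = calloc(M1 * L, sizeof *mse);
    size_t *cnt = calloc(M1, sizeof *cnt);

    for (size_t n = 0; n < Ntrain; n++) {     /* step 2: accumulate errors */
        size_t i = (size_t)cls[n];
        cnt[i]++;
        for (size_t l = 0; l < L; l++) {
            double d = x[n * L + l] - y1[i * L + l];
            mse[i * L + l] += d * d;
        }
    }
    for (size_t i = 0; i < M1; i++) {         /* step 3: alpha_i */
        double s = 0.0;
        for (size_t l = 0; l < L; l++)
            s += sqrt(mse[i * L + l] / (double)(cnt[i] ? cnt[i] : 1));
        alpha[i] = beta * s / (double)L;
    }
    free(mse);
    free(cnt);
}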

TABLE IV
SIGNAL-TO-NOISE RATIO PERFORMANCE (IN dB) OF TWO VQ CODEBOOK DESIGN ALGORITHMS FOR THE I.I.D. GAUSSIAN SOURCE (70 000 EIGHT-DIMENSIONAL TESTING VECTORS)

TABLE V
SIGNAL-TO-NOISE RATIO PERFORMANCE (IN dB) OF TWO VQ CODEBOOK DESIGN ALGORITHMS FOR THE MEMORYLESS LAPLACIAN SOURCE (70 000 EIGHT-DIMENSIONAL TESTING VECTORS)

TABLE VI
SIGNAL-TO-NOISE RATIO PERFORMANCE (IN dB) OF TWO VQ CODEBOOK DESIGN ALGORITHMS FOR THE GAUSS-MARKOV SOURCE WITH CORRELATION COEFFICIENT 0.9 (70 000 EIGHT-DIMENSIONAL TESTING VECTORS)

C. Comparison of Two Codebook Design Algorithms

Tables IV-VI compare the quantization performance of the two codebook design algorithms for memoryless Gaussian and Laplacian sources, and the first-order Gauss-Markov source with correlation coefficient of 0.9, using 8-dimensional vectors. The performance of the two design algorithms is similar, although the joint design algorithm (Algorithm 1) has slightly better performance in most cases. Since the two algorithms provide designs with similar performance, the remainder of the VQ-LVQ performance results will be based on the simpler design Algorithm 2.


IV. COMPARISON OF QUANTIZATION PERFORMANCE

Although the VQ-LVQ formulation presented earlier allows for general diagonal scaling matrices, for the sources considered here there is only a slight performance advantage (less than 0.15 dB) over the simpler case of S_i = \alpha_i I. All of the simulation results to follow use S_i = \alpha_i I, i = 1, 2, \ldots, M_1.

TABLE VII
SIGNAL-TO-NOISE RATIO PERFORMANCE (IN dB) FOR THE MEMORYLESS GAUSSIAN SOURCE (70 000 TESTING VECTORS)

TABLE VIII
SIGNAL-TO-NOISE RATIO PERFORMANCE (IN dB) FOR THE MEMORYLESS LAPLACIAN SOURCE (70 000 TESTING VECTORS). (Entries marked * correspond to an encoding rate slightly larger than the given rate.)

TABLE IX
SIGNAL-TO-NOISE RATIO PERFORMANCE (IN dB) FOR THE FIRST-ORDER GAUSS-MARKOV SOURCE WITH CORRELATION COEFFICIENT 0.9 (70 000 TESTING VECTORS)

A. Memoryless Sources

Tables VII and VIII present the SNR performance of VQ-LVQ and other encoding methods for memoryless Gaussian and Laplacian sources. In these tables, SVQ denotes the best of design Algorithms 1 and 2 of the scalar-vector quantizer [15], and ULVQ and PULVQ denote, respectively, the uniform and piecewise-uniform lattice vector quantizers [10]. In all cases, the second-stage LVQ uses a D_L lattice (or D_L') codebook, and the first-stage VQ codebook is designed using Algorithm 2.

For the memoryless Gaussian and Laplacian sources, the VQ-LVQ performance is consistently better than that of the uniform and piecewise-uniform lattice VQs and the scalar-vector quantizer, with the advantage more pronounced at the larger bit rates.



Fig. 4. Relationship between SNR and (\log_2 M_1)/L. Encoding rate is 3.0. Source is memoryless Gaussian. Solid, dashed, and dash-dotted lines represent the cases of 8-, 16-, and 32-dimensional source vectors, respectively. (The training set size is 50 000.)

Fig. 5. Relationship between SNR and (\log_2 M_1)/L. Encoding rate is 3.0. Source is memoryless Laplacian. Solid, dashed, and dash-dotted lines represent the cases of 8-, 16-, and 32-dimensional source vectors, respectively. (The training set size is 50 000.)

Figs. 4 and 5 show the overall VQ-LVQ encoding perfor- mance as a function of the encoding rate for the first-stage VQ, for the Gaussian and Laplacian sources, respectively. For each source, the VQ-LVQ SNR performance appears to saturate as the first-stage VQ codebook gets large.

Table IX shows the VQ-LVQ performance for the first-order Gauss-Markov source with correlation coefficient ρ = 0.9. ESVQ denotes the extended SVQ [29]. For small vector dimension the VQ-LVQ performance is comparable to, or slightly better than, the ESVQ (which uses a Karhunen-Loève transform to decorrelate the vector prior to SVQ encoding). However, as the vector dimension increases, the relative VQ-LVQ performance decreases.

Fig. 6. Relationship between SNR and (\log_2 M_1)/L. Encoding rate is 3.0. Source is first-order Gauss-Markov with correlation coefficient 0.9. Solid, dashed, and dash-dotted lines represent the cases of 8-, 16-, and 32-dimensional source vectors, respectively. (The training set size is 50 000.)

Fig. 6 shows the overall VQ-LVQ encoding performance as a function of the size M_1 of the first-stage VQ codebook, for the first-order Gauss-Markov source with correlation coefficient ρ = 0.9. For the 8-dimensional vector, the VQ-LVQ SNR saturates as the first-stage VQ encoding rate reaches about 1 b/dimension. For 16- and 32-dimensional vectors, however, Fig. 6 shows a steady increase in SNR, even for M_1 as large as 1024.

The VQ-LVQ performance for the unit-variance first-order Gauss-Markov source can be understood by examining the source power spectral density

\Phi(\omega) = \frac{1 - \rho^2}{1 - 2\rho \cos\omega + \rho^2}

and rate-distortion function [19, p. 113]

R(D) = \frac{1}{2} \log_2 \frac{1 - \rho^2}{D}, \qquad D \leq \frac{1 - \rho}{1 + \rho}.

For this source, the critical encoding rate corresponding to D = (1 - \rho)/(1 + \rho) is R_c = \log_2 (1 + \rho); for \rho = 0.9, this gives R_c \approx 0.93 b/dimension. From a rate-distortion theory viewpoint, if the source encoding rate is R \geq R_c, then the optimum quantization noise is white and Gaussian. Alternatively, if R < R_c, then the optimum encoding noise is not white. The former situation matches the motivation for using a lattice VQ with a spherical codebook, while the latter does not. Hence, we conclude that if the first-stage VQ encoding rate R_1 can be made at least about R_c bits per dimension, then the VQ is capable (at least for large enough vector dimension) of exploiting most of the source memory, and producing a first-stage encoding error that is approximately white and Gaussian. In contrast, if R_1 < R_c, then the first-stage encoding noise is necessarily expected to be colored, and so not a good match to the spherical LVQ codebook support region.

Table X presents average correlation values of the first-stage VQ encoding error for the first-order Gauss-Markov source, and for several sizes of VQ codebook and vector dimension.


TABLE X
AVERAGE (OVER THE VECTOR DIMENSION) CORRELATION VALUES OF THE VQ ENCODING ERROR FOR THE GAUSS-MARKOV SOURCE WITH CORRELATION COEFFICIENT 0.9

DIMENSION   M_1 = 64   128      256      512      1024
8           0.2260     0.1462   0.0865   0.0790   0.0628
16          0.5067     0.4487   0.3818   0.3352   0.2898
32          0.6738     0.6405   0.6002   0.5683   0.5399

From this table, one can see the tendency of the VQ to remove the memory in the source. For the 8-dimensional vector, the first-stage VQ encoding error is reasonably modeled as white, provided the encoding rate is at least about 1 b/dimension. For larger vector dimensions, significant intravector correlation remains in the first-stage encoding error.

Fig. 7 shows the performance of the VQ-LVQ, as a function of the first-stage VQ codebook size, for the first-order Gauss-Markov source, but now with correlation coefficient ρ = 0.5. In this case, R_c = 0.585 b/sample. As expected, the 8-dimensional VQ-LVQ performance saturates at \log_2 M_1 \approx 6 b and the 16-dimensional performance saturates at \log_2 M_1 \approx 9 b.

For Gaussian sources with memory such that for R \geq R_c the optimum (in a rate-distortion sense) encoding noise is white and Gaussian, we reach the following conclusion. For rate-dimension products such that it is feasible for the first-stage VQ to have codebook size M_1 \approx 2^{R_c L}, the VQ-LVQ coder will be very effective. If M_1 \ll 2^{R_c L}, then the first-stage VQ must necessarily do an inadequate job of exploiting the source memory, and the LVQ spherical codebook will generally not be the best choice for encoding the first-stage VQ encoding error.
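This feasibility check is simple to tabulate. The short C program below (our own construction) evaluates R_c = \log_2(1 + \rho) and the implied first-stage codebook exponent R_c L for the sources studied here; for \rho = 0.9 and L = 32 it gives M_1 \approx 2^{29.6}, clearly infeasible, which is consistent with the performance loss observed above.

#include <math.h>
#include <stdio.h>

int main(void)
{
    double rho[] = { 0.5, 0.9 };
    int    dims[] = { 8, 16, 32 };

    for (int r = 0; r < 2; r++) {
        double Rc = log2(1.0 + rho[r]);       /* critical rate, b/dimension */
        for (int d = 0; d < 3; d++)
            printf("rho = %.1f  L = %2d  Rc = %.3f  M1 ~ 2^%.1f\n",
                   rho[r], dims[d], Rc, Rc * dims[d]);
    }
    return 0;
}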

V. CONCLUSIONS

A two-stage vector quantizer has been introduced that uses an unstructured codebook for the first stage, and a spherical lattice codebook for the second stage. The VQ-LVQ structure is based on the assumption that the distribution of the first-stage VQ encoding error is reasonably modeled as Gaussian. Jointly optimum VQ-LVQ encoding is feasible for moderate-to-high encoding rates and vector dimensions, due to the structured LVQ codebook in the second stage. Locally optimum and separate (LBG-algorithm-based) codebook design algorithms are described. The much simpler separate codebook design algorithm provides quantization performance within a few tenths of a decibel of that of the locally optimum codebook design algorithm. The VQ-LVQ encoding complexity is proportional to the number of first-stage VQ codewords.

For memoryless Gaussian and Laplacian sources, the VQ-LVQ encoding performance is superior to that reported for the scalar-vector quantizer [15] and the uniform/piecewise-uniform cubic lattice vector quantizer [10]. For Gaussian sources with memory, the VQ-LVQ performance is dependent on the feasibility of using a large enough first-stage VQ codebook to effectively remove most of the source memory, leaving a near-white, (roughly) Gaussian residual to be encoded by the spherical LVQ.

Fig. 7. Relationship between SNR and (\log_2 M_1)/L. Encoding rate is 3.0. Source is first-order Gauss-Markov with correlation coefficient 0.5. Solid, dashed, and dash-dotted lines represent the cases of 8-, 16-, and 32-dimensional source vectors, respectively. (The training set size is 50 000.)

REFERENCES

[1] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. COM-28, no. 1, pp. 84-95, Jan. 1980.

[2] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA: Kluwer, 1992.

[3] C. F. Barnes and R. L. Frost, "Vector quantizers with direct sum codebooks," IEEE Trans. Inform. Theory, vol. 39, no. 2, pp. 565-580, Mar. 1993.

[4] A. Gersho, "Asymptotically optimal block quantization," IEEE Trans. Inform. Theory, vol. IT-25, no. 4, pp. 373-380, July 1979.

[5] A. Gersho, "On the structure of vector quantizers," IEEE Trans. Inform. Theory, vol. IT-28, no. 2, pp. 157-166, Mar. 1982.

[6] J. H. Conway and N. J. A. Sloane, “Voronoi regions of lattices, second moments of polytopes, and quantization,” IEEE Trans. Inform. Theory, vol. IT-28, no. 2, pp. 211-226, Mar. 1982.

[7] N. J. A. Sloane, "Tables of sphere packings and spherical codes," IEEE Trans. Inform. Theory, vol. IT-27, no. 3, pp. 327-338, May 1981.

[8] T. R. Fischer, "Geometric source coding and vector quantization," IEEE Trans. Inform. Theory, vol. 35, no. 1, pp. 137-145, Jan. 1989.

[9] J.-P. Adoul and M. Barth, "Nearest neighbor algorithm for spherical codes from the Leech lattice," IEEE Trans. Inform. Theory, vol. 34, no. 5, pp. 1188-1202, Sept. 1988.

[10] D. G. Jeong and J. D. Gibson, "Uniform and piecewise uniform lattice vector quantization for memoryless Gaussian and Laplacian sources," IEEE Trans. Inform. Theory, vol. 39, no. 3, pp. 786-804, May 1993.

[11] M. V. Eyuboglu and G. D. Forney, Jr., "Lattice and trellis quantization with lattice- and trellis-bounded codebooks-High-rate theory for memoryless sources," IEEE Trans. Inform. Theory, vol. 39, no. 1, pp. 46-59, Jan. 1993.

[12] N. Farvardin and J. W. Modestino, "Optimum quantizer performance for a class of non-Gaussian memoryless sources," IEEE Trans. Inform. Theory, vol. IT-30, no. 3, pp. 485-497, May 1984.

[13] T. R. Fischer and M. Wang, "Entropy-constrained trellis coded quantization," IEEE Trans. Inform. Theory, vol. 38, no. 2, pp. 415-426, Mar. 1992.

[14] M. W. Marcellin, "On entropy-constrained trellis coded quantization," IEEE Trans. Commun., vol. 42, pp. 14-16, Jan. 1994.

[15] R. Laroia and N. Farvardin, "A structured fixed-rate vector quantizer derived from a variable-length scalar quantizer: Part I-Memoryless sources," IEEE Trans. Inform. Theory, vol. 39, no. 3, pp. 851-867, May 1993.

[16] R. Laroia and N. Farvardin, "Trellis-based scalar-vector quantizer for memoryless sources," IEEE Trans. Inform. Theory, vol. 40, pp. 860-870, May 1994.

[17] M. W. Marcellin and T. R. Fischer, "Trellis coded quantization of memoryless and Gauss-Markov sources," IEEE Trans. Commun., vol. 38, no. 1, pp. 82-93, Jan. 1990.


[18] C. E. Shannon and W. Weaver, The Mathematical Theory of Communication. Urbana, IL: Univ. of Illinois Press, 1949.

[19] T. Berger, Rate-Distortion Theory. Englewood Cliffs, NJ: Prentice-Hall, 1971.

[20] D. H. Lee and D. L. Neuhoff, "Conditionally corrected two-stage vector quantization," in Proc. Conf. on Information Sciences and Systems, 1990.

[21] D. H. Lee, D. L. Neuhoff, and K. K. Paliwal, "Cell-conditioned multistage vector quantization," in Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, May 1991, pp. 653-656.

[22] J. Grass and P. Kabal, "Methods of improving vector-scalar quantization of LPC coefficients," in Proc. Int. Conf. on Acoustics, Speech, and Signal Processing, May 1991, pp. 657-660.

[23] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices, and Groups. New York: Springer-Verlag, 1988.

[24] J. D. Gibson and K. Sayood, “Lattice quantization,” in Advances in Electronics and Electron Physics, 1988, pp. 259-332.

[25] J. H. Conway and N. J. A. Sloane, “Fast quantization and decoding algorithms for lattice quantizers and codes,” IEEE Trans. Inform. Theory, vol. IT-28, no. 2, pp. 227-232, Mar. 1982.

[26] T. R. Fischer, “A pyramid vector quantizer,” IEEE Trans. Inform. Theory, vol. IT-32, no. 4, pp. 568-583, July 1986.

[27] T. R. Fischer and J. Pan, “Enumeration encoding and decoding algo- rithms for pyramid cubic lattice and trellis codes,” submitted to IEEE Trans. Inform. Theory.

[28] W.-Y. Chan, S. Gupta, and A. Gersho, “Enhanced multistage vector quantization by joint codebook design,” IEEE Trans. Commun., vol. 40, no. 11, pp. 1693-1697, Nov. 1992.

[29] R. Laroia and N. Farvardin, "A structured fixed-rate vector quantizer derived from a variable-length scalar quantizer: Part II-Vector sources," IEEE Trans. Inform. Theory, vol. 39, no. 3, pp. 851-867, May 1993.

