
IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 44, NO. 6, JUNE 1996

DCRVQ: A New Strategy for Efficient Entropy Coding of Vector-Quantized Images

Francesco G. B. De Natale, Stefano Fioravanti, and Daniele D. Giusto, Member, IEEE

Abstract-This paper presents a novel predictive coding scheme for image-data compression by vector quantization (VQ). On the basis of a prediction, further compression is achieved by using a dynamic codebook-reordering strategy that allows a more efficient Huffman encoding of vector addresses. The proposed method is lossless, for it increases the compression performance of a baseline vector quantization scheme without causing any further image degradation. Results are presented and a comparison with Cache-VQ is made.

Index Terms- Vector quantization, block prediction, lossless data compression.

Fig. 1. Spatial correlation between neighboring codevectors.

I. INTRODUCTION

VECTOR QUANTIZATION is a well-known coding strategy [1], [2] widely used in many application fields, as well as for image-data compression [3]. The main reasons for its success are a simple implementation and good rate-distortion performance [4].

In short, it consists of an extension of scalar quantization to an ordered set of real numbers (a block in the original data set, that is, a vector) related to both one-dimensional and multidimensional signals. More precisely, it is defined as an operator $S$ that maps a vector belonging to an $n$-dimensional Euclidean space $\mathbb{R}^n$ (for example, an image block made up of $n = n_1 \times n_2$ pixels) into a finite subset $C \subset \mathbb{R}^n$ made up of only $N$ vectors (that is, the codewords or codevectors, $C$ being the codebook)

$$S: \mathbb{R}^n \to C \subset \mathbb{R}^n; \qquad C = \{c_v\}, \quad v = 1, 2, \ldots, N. \qquad (1)$$

Vector quantization involves two crucial steps: the mapping operation, which searches for the nearest codevector of a given vector, and the identification of the subset $C$, that is, the optimal codebook generation. Two common strategies are usually employed: for the former step, the mean square error (MSE) is a very popular criterion, whereas for the latter, the classical suboptimal strategy proposed in [5] is adopted in many cases.

Paper approved by M. R. Civanlar, the Editor for Image Processing of the IEEE Communications Society. Manuscript received November 10, 1993; revised January 4, 1995. This paper was presented in part at the International Symposium on Information Theory, Trondheim, Norway, June 27-July 1, 1994. This work was supported by CNR, the National Research Council of Italy, within the framework of the Progetto Finalizzato Telecomunicazioni.

F. G. De Natale is with the Department of Biophysical and Electronic Engineering, University of Genova 16145, Italy.

D. D. Giusto is with the Department of Electrical and Electronic Engineering, University of Cagliari, Cagliari 09123, Italy (e-mail: giusto@elettro1.unica.it).

S. Fioravanti is with the Saclant Research Center, La Spezia, Italy. Publisher Item Identifier S 0090-6778(96)04348-6.

The simple vector quantizer described by (1) operates on vectors as single entities and generates only a series of codevectors; therefore, it is called memoryless VQ. From the point of view of data compression, the number of bits $B$ needed to code a pixel by memoryless VQ equals

$$B = \frac{\log_2 N}{n}. \qquad (2)$$
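A quick numerical check of (2), using the codebook and block sizes adopted later in Section V (the function name is illustrative):

```python
import math

def memoryless_vq_rate(N: int, n: int) -> float:
    """Bits per pixel for memoryless VQ, as in (2): B = log2(N) / n."""
    return math.log2(N) / n

# Setup used in Section V: 3x3 blocks, 256 codevectors
print(memoryless_vq_rate(N=256, n=9))  # 8/9 ~ 0.89 bit/pixel, i.e., compression factor 9 for 8-bit pixels
```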

However, in applications dealing with highly correlated sources (in particular, images), rate-distortion theory states that high performance is reachable only by employing prohibitively large codevectors and codebooks. Nevertheless, one can achieve lower bit rates also by exploiting the notable amount of redundancy between neighboring codevectors (as in Fig. 1), which are usually highly correlated. It is sufficient to calculate the transition matrices $[T_d]$ to realize that memoryless VQ is not capable of exploiting such an amount of redundancy. These matrices are defined as

$$[T_d]_{p,q} = P\{c_p \mid c_q, d\} \qquad (3)$$

where a generic entry $(p, q)$ is proportional to the probability of finding the codevector $c_p$ given the codevector $c_q$ at the previous position in one of the main directions $d$ shown in Fig. 1 (i.e., $0$, $\pi/4$, $\pi/2$, and $3\pi/4$). Inter-codevector redundancies result in very sparse matrices, as demonstrated by the example given in Fig. 2.
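For illustration, the transition matrices of (3) can be estimated from a map of codevector indices roughly as follows (a hypothetical helper; the column normalization and edge handling are our assumptions):

```python
import numpy as np

def transition_matrix(idx: np.ndarray, N: int, d: tuple) -> np.ndarray:
    """Estimate [T_d] from a 2-D map of codevector indices (one per block).

    Entry (p, q) approximates P{c_p | c_q at the neighboring position given
    by the offset d}. A sketch for illustration, not the authors' code.
    """
    T = np.zeros((N, N))
    di, dj = d
    H, W = idx.shape
    for i in range(H):
        for j in range(W):
            qi, qj = i + di, j + dj            # position of the 'previous' block
            if 0 <= qi < H and 0 <= qj < W:
                T[idx[i, j], idx[qi, qj]] += 1  # count occurrences of c_p given c_q
    col = T.sum(axis=0, keepdims=True)          # normalize per conditioning codevector
    return np.divide(T, col, out=np.zeros_like(T), where=col > 0)

# Example: the horizontal direction (d = 0) corresponds to the left neighbor
# T0 = transition_matrix(index_map, N=256, d=(0, -1))
```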

Therefore, in order to overcome this intrinsic limitation of memoryless VQ and to improve its performance, considerable effort has been devoted to exploiting interblock correlation. A widely known strategy is finite-state vector quantization (FSVQ), which was first proposed in [6] and [7], and subsequently used as a basis for various algorithms (for instance, [8] and [9]). In [10], an FSVQ-like strategy was implemented by using a cache-based lossless prediction scheme (Cache-VQ).


Fig. 2. A typical transition matrix.

In short, it consists of the generation and the exploitation of a reduced codebook, obtained by benefiting from the information provided by the neighboring blocks' transition matrices. Subsequently, the same strategy was implemented using a two-layer neural network as a predictor [11].

A slightly different and more effective implementation of the Cache-VQ strategy is presented in this paper, followed by the proposal of a novel predictive VQ scheme. According to this scheme, the prediction step is aimed at reordering the codebook dynamically (dynamic codebook reordering vector quantization, DCRVQ), thus making possible a more efficient Huffman encoding of codevector addresses, as the entropy of the address data is sharply reduced.

This paper is organized as follows: an introduction to the prediction problem is first provided. Then, the Cache-VQ strategy is outlined and the novel DCRVQ coding strategy is defined. Some different prediction schemes are analyzed and the choice of a neural approach is explained; the structure of the neural predictor is then outlined, and the training phase is described. The performances of the new approach and a comparison to other prediction schemes are dealt with in Section V. Finally, some conclusions are drawn.

II. VECTOR QUANTIZATION AND VECTOR PREDICTION

The output of a memoryless vector quantizer is a stream of data representing the codevectors' addresses in the codebook. As these addresses are generated independently of one another, the resulting data stream is characterized by notable entropy, as can be deduced by analyzing the behavior of the address histogram shown in Fig. 3.

On the other hand, this implies that the memoryless-VQ output is not optimal, as the process does not exploit the interblock redundancy. The VQ process is optimal with respect to the codebook search, but this advantage may be strongly reduced if one does not consider the interblock correlations; in other words, it is the distortion, not the bit rate, that is optimized.

Fig. 3. An example of the address histogram for memoryless VQ.

However, the data stream cannot be further compressed except by using an entropy coder, whose performance cannot exceed the Shannon bound. The problem is then one of identifying a new process that transforms the data stream; that is, some relations among the codebook addresses must be defined and exploited. Better yet, the natural correlations shown by the transition matrices must be exploited.

The solution lies in the use of a predictor. As a first idea, one may try to organize the codebook by applying a strategy that optimizes a DPCM coding of the codevector addresses. However, this approach involves many problems and is far from optimal, as it is based not on a current prediction but only on initial statistics; consequently, after the initial ordering of the codebook, none of its drawbacks can be overcome. For instance, depending on the starting codevector, the addresses may jump back and forth. One may try to reorder the codebook for each image, but this produces overhead information to be transmitted and only partially reduces the drawbacks.

Transition matrices do not pose such a problem; even though they are large, they can be compressed efficiently. A residual redundancy after vector quantization means that not all the possible transitions between codevectors occur in a given image. This results in transition matrices characterized by many empty locations, that is, sparse matrices, which can be compressed with good results.

A first solution to obtain an efficient predictor for vector quantization might be to exploit the information carried by the transition matrices.

A. The Cache-VQ Strategy

In [10], the Cache-VQ strategy was proposed: a predicted VQ scheme (conceptually based on FSVQ) that allows one to obtain higher compression factors without causing any further losses in the SNR. The basic idea stems from the observation of the arrays of conditional probabilities $[T_d]$. In practice, on the basis of the four previously decoded neighboring codevectors, a prediction $\hat{b}(i, j)$ of the next block $b(i, j)$ is computed (where $i$ and $j$ are the spatial coordinates of the block), and the $2^r - 1$ codevectors $\{c_u\}$ (where $r < \log_2 N$, $N$ being the size of the codebook) most similar to $\hat{b}(i, j)$ are stored in a codebook of reduced dimensions, called the cache codebook. If the correct codevector is contained in the cache, one can achieve a bit-rate reduction by addressing such a smaller codebook.
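A sketch of how such a cache codebook could be built from the prediction (one of the $2^r$ configurations is reserved for the fault signal, as explained below; the helper is illustrative, not the authors' code):

```python
import numpy as np

def cache_codebook(codebook: np.ndarray, predicted: np.ndarray, r: int) -> np.ndarray:
    """Indices of the 2**r - 1 codevectors nearest to the predicted block.

    Only 2**r - 1 cache entries are kept, since one r-bit configuration
    serves as the fault escape. A sketch, not the authors' code.
    """
    dists = np.sum((codebook - predicted) ** 2, axis=1)
    return np.argsort(dists)[: 2 ** r - 1]
```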


Fig. 4. The architecture of the Cache-VQ encoder/decoder.

In other words, if the current codevector $c_z(i, j)$ is one of the codevectors $\{c_u\}$, only $r$ bits are required to transmit its address within the cache codebook; otherwise, a fault configuration (i.e., the first one) is transmitted, followed by the $\log_2 N$-bit address of the codevector $c_z(i, j)$ in the overall codebook.

In the case of a correct prediction, the bit rate is multiplied by a factor $F_{cp}$ equal to

$$F_{cp} = \frac{r}{\log_2 N} \qquad (4)$$

whereas, in the case of a wrong prediction, it is multiplied by a factor $F_{wp}$ equal to

$$F_{wp} = \frac{r + \log_2 N}{\log_2 N}. \qquad (5)$$

Denoting by $P_{cp}$ the percentage of correct predictions for an image, and by $P_{wp}$ the percentage of wrong ones (i.e., $P_{cp} = 1 - P_{wp}$), the resulting final bit rate $B_f$ is equal to

$$B_f = B \cdot F_{cp} \cdot P_{cp} + B \cdot F_{wp} \cdot P_{wp}. \qquad (6)$$

Therefore, the threshold for $P_{cp}$, above which the Cache-VQ method is advantageous, can be computed by setting $B_f = B$ in (6), that is,

$$P_{cp} > \frac{r}{\log_2 N}. \qquad (7)$$

Equation (7) gives the condition on the percentage of correct predictions needed to achieve a bit-rate reduction.
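To see what (4)-(7) imply numerically, a small worked sketch (parameter values match the experiments of Section V; the function is illustrative):

```python
import math

def cache_vq_bitrate(B: float, r: int, N: int, p_cp: float) -> float:
    """Final bit rate of Cache-VQ, eq. (6), using the factors of (4)-(5)."""
    log2N = math.log2(N)
    F_cp = r / log2N                # correct prediction: r-bit cache address
    F_wp = (r + log2N) / log2N      # fault: r-bit escape + full address
    return B * (F_cp * p_cp + F_wp * (1.0 - p_cp))

B = 8 / 9                           # memoryless rate for 3x3 blocks, N = 256
print(3 / math.log2(256))           # threshold of (7): P_cp > 0.375
print(cache_vq_bitrate(B, r=3, N=256, p_cp=0.8))  # ~0.51 bit/pixel
```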

Concerning the architecture, the decoder must, of course, be synchronous with the encoder, in that it has to accomplish the same sequence of operations to find the current codevector. For each codevector, the prediction is computed by the decoder in the same way as by the encoder, and the same cache codebook is generated. If the $r$-bit configuration received is a valid address, the vector is selected from the reduced codebook; otherwise (fault condition), the next $\log_2 N$ bits are used to address the overall codebook directly.

The global architecture of the Cache-VQ encoder/decoder is displayed in Fig. 4.

III. DYNAMIC CODEBOOK REORDERING VECTOR QUANTIZATION (DCRVQ)

Despite its good performance, the Cache-VQ technique exhibits a major drawback: its prediction results are not fully exploited.


Fig. 5. Graphical representation of a predicted vector in a 2-D space. (Legend: o codevector; x predicted vector.)

This is reflected in a more or less fixed bit rate: it may vary depending on the global correct-prediction probability $P_{cp}$, but it cannot vary much, so it cannot be locally tailored to the predictor performance. In other words, it cannot be adapted to different situations, nor give the optimal performance. The problem lies in the size of the reduced codebook: the size is fixed when the vector quantizer starts coding an image. Of course, the reduced codebook dimensions could vary in a more dynamic way.

In the following, we shall address the problem of making the codebook size vary dynamically for the purpose of defining a coding system capable of optimizing the use of prediction results.

In a two-dimensional case, that is, for vectors completely defined by two components only, $b = (b_1, b_2)$, we can dynamically adapt the reduced codebook size to the predictor output in the following way. Let $\hat{b}(i, j) = \{\hat{b}_1(i, j), \hat{b}_2(i, j)\}$ be the result of the prediction process; we assume that it is a generic vector included in the whole admissible range. Such a vector is a point in the multidimensional space prototyped by the codevectors $\{c_v;\ v = 1, 2, \ldots, N\}$, as one can see in Fig. 5.

Tailoring the codebook to the prediction output means building a local codebook made up of all the codevectors contained in a circle centered at $\hat{b}(i, j)$ and passing through the point $c_z(i, j)$ (i.e., the circle of radius $\|c_z(i, j) - \hat{b}(i, j)\|$), $c_z(i, j)$ being the codevector chosen by the vector quantizer to represent the actual block in the image. In other words, the adaptivity is achieved by means of a variable radius, as shown in Fig. 6.

More formally, we denote by $E_z$ the Euclidean distance between the current codevector $c_z(i, j)$ and the predicted vector $\hat{b}(i, j)$; in the two-dimensional case

$$E_z = \sqrt{\left(c_{z,1}(i,j) - \hat{b}_1(i,j)\right)^2 + \left(c_{z,2}(i,j) - \hat{b}_2(i,j)\right)^2} \qquad (8)$$

which, in a generic multidimensional case, becomes

$$E_z = \left[\sum_{k=1}^{n} \left(c_{z,k}(i,j) - \hat{b}_k(i,j)\right)^2\right]^{1/2}. \qquad (9)$$

Fig. 6. The concept of local codebook adaptivity expressed in a graphical way. (Legend: o codevector; x predicted vector; + current codevector.)

Fig. 7. Generation of the predictor's output by counting the number of steps.

The set $S' \subset S$ of codevectors contained in the circle centered at $\hat{b}(i, j)$ and of radius $\|c_z(i, j) - \hat{b}(i, j)\|$ is made up of the codevectors $\{c_t\}$ that comply with the rule

$$E_t < E_z. \qquad (10)$$

In practice, if we consider each vector as a location where we can stop the best-search path, we can use the predictor's output in an optimal way by transmitting only the number of steps that we have to perform in order to reach the goal [i.e., the codevector $c_z(i, j)$] by starting from the vector $\hat{b}(i, j)$ and moving to the nearest codevector, and so on, as shown in Fig. 7.

In other words, for each image block $b(i, j)$, the coder has to compute the MSE $E_v$ between each codevector $c_v$ and the predicted vector $\hat{b}(i, j)$

$$E_v = \|c_v - \hat{b}(i, j)\|^2, \qquad v = 1, 2, \ldots, N \qquad (11)$$

as well as the MSE $E_z$ between the predicted vector $\hat{b}(i, j)$ and the current codevector $c_z(i, j)$

$$E_z = \|c_z(i, j) - \hat{b}(i, j)\|^2. \qquad (12)$$

As a result, the ordered address $A$ is given by the number of vectors $\{c_t\}$ whose $E_v$ is smaller than $E_z$

$$A = \operatorname{card}\{S'\}; \qquad S' = \{c_t = c_v : E_v < E_z;\ v \in (1, 2, \ldots, N)\}. \qquad (13)$$

If we analyze such a strategy, we can deduce that it can be easily implemented just by reordering the codebook in a dynamic way, that is, for each vector coded by the vector quantizer, and by transmitting the number of locations at which the current codevector $c_z(i, j)$ is found in the reordered codebook.
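A compact sketch of the address computation (13) and of the corresponding decoder-side extraction (tie-breaking between equidistant codevectors is an assumption not specified here):

```python
import numpy as np

def dcrvq_address(codebook: np.ndarray, predicted: np.ndarray, z: int) -> int:
    """Ordered address A of (13): how many codevectors lie strictly closer
    to the predicted vector than the chosen codevector c_z does."""
    E = np.sum((codebook - predicted) ** 2, axis=1)   # E_v of (11)
    return int(np.sum(E < E[z]))                      # card{S'}

def dcrvq_extract(codebook: np.ndarray, predicted: np.ndarray, A: int) -> int:
    """Decoder side: reorder the codebook by distance to the predicted
    vector and return the index of the codevector at address A."""
    E = np.sum((codebook - predicted) ** 2, axis=1)
    return int(np.argsort(E, kind="stable")[A])
```

Since encoder and decoder compute the same prediction, both obtain the same ordering, so no side information beyond the address itself is needed.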


Fig. 8. An example of the address histogram for DCRVQ.

We assume that the codevector nearest to the predicted vector corresponds to the first address, that the second-nearest codevector corresponds to the second address, and so on.

Under this assumption, if the predictor works well, the output is quite often an address close to the first one, that is, close to the ideal case where the predictor is exactly on target. As a result, a stream of addresses is obtained whose histogram is of the kind shown in Fig. 8.

Fig. 9. The global architecture of the DCRVQ codec.


In other words, the entropy of the resulting stream is sharply reduced, as compared with the stream before prediction, whose histogram is shown in Fig. 3. This may result in a good performance if an entropy-coding strategy is applied. The decoder has only to compute the values of $\{E_v\}$ and $E_z$, reorder the codebook, and extract the appropriate codevector by using the received addresses.
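The entropy reduction can be quantified directly on the address stream; the empirical first-order entropy below lower-bounds the average Huffman code length (a generic sketch, not the authors' measurement code):

```python
import numpy as np

def address_entropy(addresses: np.ndarray) -> float:
    """First-order entropy (bits/address) of a stream of codebook addresses;
    a lower bound on the average Huffman code length for that stream."""
    _, counts = np.unique(addresses, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())
```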

A. System Architecture

The global architecture of the encoding-decoding system is displayed in Fig. 9. It consists of a standard vector quantizer, a predictor, a codebook-reordering module, and an entropy coder. The entropy coder compresses the codevector addresses in the ordered codebook by the classical Huffman algorithm [12].

The predictor must be identical on the transmitter and on the receiver; it has to accomplish the task of reconstructing a codevector on the basis of the four neighboring codevectors previously decoded. In order to decrease the bit rate without any further losses, the encoder and decoder predictors work in a synchronous way, in that they use the same data as input (that is, the previous codevectors), thus producing the same predicted vector. As the decoder is synchronous with the encoder, it has to compute the predicted vector and the MSE's ($\{E_v\}$ and $E_z$) and to reorder the codebook on the basis of the $\{E_v\}$'s; i.e., it assigns the first address to the codevector corresponding to the smallest $E_v$, and so on, until address $N$ is assigned to the codevector corresponding to the largest $E_v$. Finally, it has to extract from the reordered codebook the codevector corresponding to the transmitted address $A$. The computational complexity of the coding phase is comparable to that of standard vector quantizers, whereas the decoder has to execute a larger number of operations.

IV. THE PREDICTION PROBLEM AND THE NEURAL APPROACH

Both Cache-VQ and DCRVQ require a module that performs the prediction of a block, given a set of previously coded vectors. Equation (6) shows how strongly the performance of Cache-VQ depends on the accuracy of the predictor estimate (a successful prediction yields a small number of faults, with a direct bit-rate reduction); likewise, a good prediction produces shorter addresses in the DCRVQ scheme, thus resulting in a more effective entropy reduction for the Huffman coder.

Among the various methods proposed in the literature to predict an image block, the Address-VQ scheme proposed in [13] and the Cache-VQ scheme proposed in [10] are interesting. They are statistical approaches based on the evaluation of a score function for each block to be predicted. In the latter case, the score function $f_b$ of a candidate codevector consists of a summation of elements of the transition matrices

$$f_b\{c_z(i,j)\} = [M_0]_{c_z(i,j),\,c_d(i,j-1)} + [M_{\pi/4}]_{c_z(i,j),\,c_a(i-1,j-1)} + [M_{\pi/2}]_{c_z(i,j),\,c_b(i-1,j)} + [M_{3\pi/4}]_{c_z(i,j),\,c_c(i-1,j+1)} \qquad (14)$$

where the $[M_d]$'s are the transition matrices of (3) for the four main directions.

Besides the utilization of statistical score functions, it should be remarked that both such prediction schemes exhibit some drawbacks that make them unattractive for software and hardware implementations. First of all, they require the storage of the transition matrices, which are quite large. Moreover, the computation of the score functions for all the codevectors poses severe problems: this task requires accessing the transition matrices too many times, hence it is very time-consuming.

The most powerful approach to the prediction problem is to find a law underlying the given dynamic process or phenomenon [14]. If this law can be discovered and analytically described (for instance, by a set of ordinary differential equations), then solving such equations allows one to predict the future, once the initial conditions have been completely specified. Unfortunately, the information about a dynamic process is often partial and incomplete, so a prediction cannot be based on a known analytical model. A less powerful approach aims to discover some empirical regularities in the observation of a series. According to this approach, the unknown dynamic process is described by a generic (usually nonlinear) multivariate function

$$z_\lambda = \Phi(z_{\lambda-1}, z_{\lambda-2}, \ldots, z_{\lambda-K}) \qquad (15)$$

where the $\{z_{\lambda-i}\}$'s are given samples of the series and $\Phi(\cdot)$ is an unknown function. In the simplest case, this function is linear, so a standard autoregressive (AR) model can be used [15]

$$\hat{z}_\lambda = \sum_{k=1}^{K} a_k z_{\lambda-k} \qquad (16)$$

where the predicted value is given as a linear combination of a fixed number $K$ of past values of the series. Of course, the AR model provides good results only if the dynamic process is linear or nearly linear. For highly nonlinear processes, an AR model-based prediction may be very poor or completely wrong. Therefore, a more flexible and universal approach is to employ a neural network, which can realize (15) by using nonlinear processing units, as it is capable of approximating any nonlinear continuous function on the basis of training examples [16]. The advantages of a neural-network model are generality and flexibility. The training process produces a multidimensional surface composed of a set of simple nonlinear functions that best fit the training set. The neural network is trained to find such a function by means of the available examples. The capability of neural networks to generalize and predict the future can be proved by approximation theory, as shown in [17]. The strategy adopted in this paper is based on the use of a neural network. In addition to the capability to generalize, such an approach has many other advantages, including an easy hardware realization and a low storage requirement, as only the synaptic weights need to be stored in both the encoder and the decoder.
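For reference, the AR baseline of (16) can be sketched as a least-squares fit (an illustration under our own conventions, not the authors' implementation):

```python
import numpy as np

def ar_predict(z: np.ndarray, K: int) -> float:
    """One-step AR(K) prediction, eq. (16): fit the coefficients a_k by
    least squares on the observed series, then extrapolate one sample."""
    X = np.array([z[t - K:t][::-1] for t in range(K, len(z))])  # rows: z_{t-1}..z_{t-K}
    a, *_ = np.linalg.lstsq(X, z[K:], rcond=None)
    return float(a @ z[-K:][::-1])

# Example: z = np.sin(0.3 * np.arange(50)); print(ar_predict(z, K=4))
```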

The network selected is a perceptron trained by a backpropagation algorithm (the mathematical and implementation details are provided in the Appendix). The values of the synaptic weights computed during the training phase are stored and used by the encoder and decoder to achieve a prediction in the same way. Of course, in this case, too, the decoder must be synchronous with the encoder.

As stated earlier, the encoding/decoding process is independent of the prediction module. Every time a codevector has to be transmitted, the neural network receives the four neighboring codevectors (i.e., the previously transmitted ones) as inputs, and generates a predicted vector (not a codevector) on the output layer. The prediction result is then used by the coder in an appropriate manner, as described in the previous sections. For Cache-VQ, it is utilized to organize a cache codebook that contains the codevectors $\{c_u\}$ ($u = 1, \ldots, 2^r - 1$; $r < \log_2 N$) at the minimum distance from the predicted vector $\hat{b}(i, j)$; for DCRVQ, it is used to organize dynamically the ordered codebook from which the address $A$ to be transmitted is selected.

V. RESULTS AND COMPARISONS

In order to compare the performance of the DCRVQ approach with that of Cache-VQ, both theoretical and experimental analyses were carried out. In particular, we let $p(x)$ be a probability density function (PDF) with non-null values in the range $x \in [0, 1]$, and such that

$$P(i) = \int_{i/2^N}^{(i+1)/2^N} p(x)\,dx \qquad (17)$$

where $P(i)$ is the probability of the $i$th symbol. We can, therefore, impose the following condition

$$P(i < 2^k) = \int_{0}^{2^k/2^N} p(x)\,dx = P_{cp} \qquad (18)$$

and then study the entropy behavior versus the probability of a correct prediction.

It is necessary to define a probabilistic model for the distribution $p(x)$. A first possibility is to use the PDF that maximizes the entropy of the source under the above conditions, namely, the piecewise-uniform density

$$p_u(x) = \begin{cases} P_{cp} \cdot 2^N / 2^k, & 0 \le x < 2^k/2^N \\ (1 - P_{cp}) \cdot 2^N / (2^N - 2^k), & 2^k/2^N \le x \le 1. \end{cases} \qquad (19)$$

More realistic models are an exponential distribution and a hyperbolic one

$$p_e(x) = a e^{-a x} \qquad (20)$$

$$p_h(x) \propto (x + b)^{-1}. \qquad (21)$$

The related parameters can be computed by imposing condition (18).
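For the piecewise-uniform model (19), the entropy of the address source follows in closed form; a small sketch (our own helper, for illustration):

```python
import math

def model_entropy(p_cp: float, k: int, N: int) -> float:
    """Entropy (bits/address) of the piecewise-uniform model of (19):
    probability mass p_cp spread evenly over the first 2**k addresses,
    the remaining mass over the other 2**N - 2**k addresses."""
    n_in, n_out = 2 ** k, 2 ** N - 2 ** k
    H = 0.0
    if p_cp > 0:
        H -= p_cp * math.log2(p_cp / n_in)
    if p_cp < 1:
        H -= (1 - p_cp) * math.log2((1 - p_cp) / n_out)
    return H

print(model_entropy(0.8, k=3, N=8))  # ~4.7 bits vs. 8 bits without prediction
```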

In Fig. 10, the curves related to the proposed models are plotted, with the superposition of some experimental values achieved by the Cache-VQ and DCRVQ coding schemes. One can notice that, as expected, DCRVQ outperforms Cache-VQ in the theoretical case, even under the worst-case [$p_u(x)$] distribution; the advantage of DCRVQ is more evident for low values of the percentage of correct predictions. In the experimental case, it is interesting to observe how well DCRVQ approximates the ideal curves of the more realistic models [the $p_e(x)$ and $p_h(x)$ distributions].

The results reported in the following were obtained by using as a training set the images shown in Fig. 11, and as a test set the images shown in Fig. 12. No decoded images are shown, as the prediction step was performed in a lossless way, so the final quality is the same as for memoryless vector quantization. In particular, the vector quantizer uses vectors of dimension 3 x 3 pixels, 256 gray levels per pixel, and a codebook made up of 256 codevectors (with the same properties as the original vectors); the compression factor without prediction is then equal to nine.


Fig. 10. Theoretical and experimental performances of Cache-VQ and DCRVQ ($k = 3$, $N = 8$): entropy versus probability of correct prediction, for real DCRVQ and Cache-VQ samples and for the uniform, exponential, and hyperbolic DCRVQ models.

Fig. 11. The six images used as training set; from top left: airplane, boat, bridge, kid, mask, Carmen.

Several tests were carried out to assess the performance of DCRVQ, comparing this scheme with the Cache-VQ one, as well as to assess the performance of the neural prediction versus the one based on the transition matrices. In particular, results are presented for a cache codebook containing seven codevectors ($r = 3$). Figs. 13 and 14 compare the performances (in terms of compression ratio) of DCRVQ and Cache-VQ when the neural predictor (Fig. 13) or the statistical one (Fig. 14) is used.

Figure 15 reports on the performance of the neural approach versus that of the classical one based on transition matrices. Since we had verified that the DCRVQ scheme was superior to Cache-VQ independently of the predictor efficiency, we present the results for DCRVQ only. Such results show that the classical approach seems to work generally better.

However, we have to point out that the neural approach does not require any change to the network's synaptic weights when another codebook is used; in the experiments we carried out, the performance of the neural predictor was similar for images belonging and not belonging to the training set. As a result, no overhead information is required when the codebook has to be changed, which yields high flexibility of use; the classical approach, instead, requires such overhead information.

Fig. 12. The six images used as test set; from top left: agave, baboon, lenna, bike, peppers, Zelda.

Fig. 13. Comparison of the performances (in terms of compression ratio) of DCRVQ and Cache-VQ when the neural predictor is used (CR without prediction = 9).

VI. CONCLUSION

A novel predictive coding scheme for efficient entropy coding of vector-quantized images has been presented. The proposed method, called dynamic codebook reordering vector quantization (DCRVQ), is a lossless prediction scheme aimed at exploiting the residual correlations among neighboring blocks by suitably reindexing the codewords to achieve a better performance of the Huffman entropy coder. Moreover, the choice of an artificial neural network to perform the prediction was experimentally justified, and the neural approach was compared with another statistical approach, that is, the use of address transition matrices.

Results show that the new scheme outperforms analogous methods proposed in the literature. In particular, a direct theoretical and experimental comparison of DCRVQ with Cache-VQ proved that the former exhibits higher efficiency.

Fig. 14. Comparison of the performances (in terms of compression ratio) of DCRVQ and Cache-VQ when the classical predictor based on transition matrices is used (CR without prediction = 9).

Fig. 15. Performance comparison of the neural approach versus the classical one based on transition matrices (for the DCRVQ coding scheme) (CR without prediction = 9).

APPENDIX: STRUCTURE AND TRAINING OF THE NEURAL PREDICTOR

The proposed approach involves the implementation of a three-layer perceptron neural network [18], trained to reconstruct a codevector $c_z(i, j)$ on the basis of the four neighboring codevectors previously decoded, $c_a(i-1, j-1)$, $c_b(i-1, j)$, $c_c(i-1, j+1)$, $c_d(i, j-1)$ (as shown by the scheme in Fig. 1), that is,

$$c_z(i, j) = Q[c_a(i-1, j-1), c_b(i-1, j), c_c(i-1, j+1), c_d(i, j-1)]. \qquad (A1)$$

The functional relationship $Q'(\cdot)$ of the neural network approximates the unknown function $Q(\cdot)$ on the basis of the examples given as a training set; therefore, the vector $\hat{b}(i, j)$ is obtained as the prediction of the codevector $c_z(i, j)$ as follows:

$$\hat{b}(i, j) = Q'[c_a(i-1, j-1), c_b(i-1, j), c_c(i-1, j+1), c_d(i, j-1)]. \qquad (A2)$$

In particular, the adopted training set is the same set of images utilized to build up the VQ codebook. To obtain good generalization results, the dimensions of this set have been taken so as to satisfy the constraints imposed in [19], which relate the number of weights to the number of patterns to be learned.

The classical backpropagation training algorithm has been adopted [20]. This algorithm aims at minimizing the distance between the vector $\hat{b}(i, j)$ (generated by the network on the output layer) and the current codevector $c_z(i, j)$, after the input layer has received a configuration made up of the four previous neighboring codevectors $c_a(i-1, j-1)$, $c_b(i-1, j)$, $c_c(i-1, j+1)$, $c_d(i, j-1)$. In other words, the dimensions of the input layer are equal to four times the dimensions of a vector, whereas the dimensions of the output layer are equal to the dimensions of a vector. The dimensions of the hidden layer have been experimentally chosen equal to twice the dimensions of a vector.

Each neuron (except for the input ones) is characterized by a nonlinear activation function $\gamma(\xi)$ of the hyperbolic tangent type, $\gamma(\xi) = \tanh(\xi)$.

To avoid a compression of the output dynamics (which could result from the saturation of $\gamma(\xi)$ at its lower and upper values), the output activation functions are driven to operate in the linear region by subtracting, from the desired values, the mean gray value over the four neighboring codevectors. The targets $t_p = t(i, j)$ of the network are therefore

$$t_p = c_z(i, j) - \bar{m}(i, j)$$

where $\bar{m}(i, j)$ is the mean gray value of the four neighboring codevectors.

We denote by $x$ the input vector, and by $h$ and $y$ the hidden and output vectors, respectively; the output depends on the inputs as follows:

$$h = \gamma(W_1 x), \qquad y = \gamma(W_2 h).$$

The purpose of the network is to produce outputs $y$ that are as close as possible to the targets $t_p$; in other words, it aims to minimize the global error $E$ over all the targets $t_p$ (with $p = 1, \ldots, L$, $L$ being the number of training-set elements) associated with the inputs

$$E = \sum_{p=1}^{L} \|y_p - t_p\|^2 = \Upsilon[W_1, W_2]. \qquad (A6)$$

If E is regarded as a surface that is a potential function of the weights, it is possible to apply the steepest-descent minimization algorithm.

Let us consider only one vector, for the sake of brevity; it follows that

$$E_p = \|y_p - t_p\|^2 = \sum_i \left(y_p(i) - t_p(i)\right)^2, \qquad E = \sum_p E_p. \qquad (A7)$$

To obtain the direction in which the minimization step is to be performed, the gradient of $E_p$ with respect to the weights is computed. As the function $\gamma(\xi)$ is of the sigmoid type, its derivative satisfies $\gamma'(\xi) = 1 - \gamma^2(\xi)$, and applying the chain rule through the output and hidden layers yields the following weight-updating rule:

$$\Delta[W_1]_{k,m} = -2\varepsilon\,\delta_k\,(1 - h_k^2)\,x_m \qquad (A16)$$

$$\Delta[W_2]_{h,k} = -2\varepsilon\,(y_h - t_h)\,(1 - y_h^2)\,h_k \qquad (A17)$$

where $\varepsilon$ is the learning rate and $\delta_k = \sum_h [W_2]_{h,k}\,(y_h - t_h)\,(1 - y_h^2)$ is the error term backpropagated to the $k$th hidden unit.

This learning rule tends to converge to local minima of the energy function, if any. To bypass such minima, a momentum term is included in the descent phase

$$\Delta w(t+1) = -\varepsilon\,\frac{\partial E_p}{\partial w} + \alpha\,\Delta w(t) \qquad (A18)$$

where $t$ denotes the iteration index and $\alpha$ is the momentum coefficient. Moreover, in order to speed up the convergence, some heuristic criteria are applied [21], concerning the variations in the parameters $\alpha$ and $\varepsilon$. Such criteria are:

-The weights are updated according to (A18).
-If the global error turns out to be smaller than the previous one, then $\varepsilon \leftarrow \kappa \cdot \varepsilon$, with $\kappa > 1$, and the weights are updated.
-Otherwise, $\varepsilon \leftarrow \beta \cdot \varepsilon$, with $\beta < 1$, $\alpha = 0$, and the current iteration is discarded.
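A compact sketch of one training step implementing the update rules (A16)-(A17) with the tanh activations described above; the momentum term (A18) and the heuristics of [21] are omitted, and all sizes and numeric values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 9                                        # vector dimension (3x3 blocks, Section V)
W1 = rng.normal(0.0, 0.1, (2 * n, 4 * n))    # hidden layer: twice the vector size
W2 = rng.normal(0.0, 0.1, (n, 2 * n))        # output layer: one vector
eps = 0.01                                   # learning rate (illustrative value)

def train_step(x: np.ndarray, t: np.ndarray) -> float:
    """One plain backpropagation update: tanh units, squared-error loss."""
    global W1, W2
    h = np.tanh(W1 @ x)                      # hidden activations
    y = np.tanh(W2 @ h)                      # network output (predicted vector)
    gy = 2.0 * (y - t) * (1.0 - y ** 2)      # output-layer error term
    gh = (W2.T @ gy) * (1.0 - h ** 2)        # backpropagated hidden error
    W2 -= eps * np.outer(gy, h)              # Delta[W2], cf. (A17)
    W1 -= eps * np.outer(gh, x)              # Delta[W1], cf. (A16)
    return float(np.sum((y - t) ** 2))       # E_p of (A7)
```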


To sum up, at each iteration, when a set of codevectors (that is, the neighboring codevectors that, in a raster scan, precede the one to be predicted) is presented to the network input, the algorithm applies the updating rule to the synaptic weights so as to minimize the MSE between the vector generated by the network and the target.

ACKNOWLEDGMENT

The authors would like to thank the anonymous referees for their comments and suggestions, which notably contributed to improving the paper's structure and content.

REFERENCES

[1] R. M. Gray, "Vector quantization," IEEE Acoust., Speech, Signal Processing Mag., pp. 4-29, Apr. 1984.
[2] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA: Kluwer, 1992.
[3] N. M. Nasrabadi and R. A. King, "Image coding using vector quantization: A review," IEEE Trans. Commun., vol. 36, no. 8, pp. 957-971, 1988.
[4] T. D. Lookabaugh and R. M. Gray, "High-resolution quantization theory and the vector quantizer advantage," IEEE Trans. Inform. Theory, vol. 35, no. 5, pp. 1020-1033, 1989.
[5] Y. Linde, A. Buzo, and R. M. Gray, "An algorithm for vector quantizer design," IEEE Trans. Commun., vol. 28, no. 1, pp. 84-95, 1980.
[6] J. Foster, R. M. Gray, and M. O. Dunham, "Finite-state vector quantization for waveform coding," IEEE Trans. Inform. Theory, vol. 31, no. 3, pp. 348-359, 1985.
[7] M. O. Dunham and R. M. Gray, "An algorithm for the design of labeled-transition finite-state vector quantizers," IEEE Trans. Commun., vol. 33, no. 1, pp. 83-89, 1985.
[8] A. Aravind and A. Gersho, "Image compression based on vector quantization with finite memory," Optical Eng., vol. 26, no. 7, pp. 570-580, 1987.
[9] T. Kim, "New finite state vector quantizers for images," in Proc. IEEE ICASSP'88, Apr. 1988, pp. 1180-1183.
[10] F. G. B. De Natale, G. S. Desoli, D. D. Giusto, and G. Vernazza, "A framework for high-compression coding of color images," in Visual Commun. Image Process. '89, Proc. SPIE, Nov. 1989, vol. 1199, pp. 1430-1439.
[11] S. Fioravanti and D. D. Giusto, "Inter-block redundancy reduction in vector-quantized images by a neural predictor," Euro. Trans. Telecommun., vol. 3, no. 6, pp. 605-607, 1992.
[12] D. A. Huffman, "A method for the construction of minimum redundancy codes," in Proc. IRE, 1952, vol. 40, pp. 1098-1101.
[13] Y. Feng and N. M. Nasrabadi, "Address-vector quantization: An adaptive vector quantization scheme using interblock correlation," in Visual Commun. Image Process. '88, Proc. SPIE, Nov. 1988, vol. 1001, pp. 214-222.
[14] A. S. Weigend, B. A. Huberman, and D. E. Rumelhart, "Predicting the future: A connectionist approach," Int. J. Neural Syst., vol. 1, pp. 193-209, 1990.
[15] J. Makhoul, "Linear prediction: A tutorial review," Proc. IEEE, vol. 63, pp. 561-580, 1975.
[16] A. Cichocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing. New York: Wiley, 1993.
[17] J. J. Hopfield, "Neurons with graded response have collective computational properties like those of two-state neurons," Proc. Nat. Acad. Sci., 1984, vol. 81, pp. 3088-3092.
[18] M. Minsky and S. Papert, Perceptrons. Cambridge, MA: MIT Press, 1972.
[19] E. B. Baum and D. Haussler, "What size net gives valid generalization?" Neural Computation, vol. 1, pp. 151-160, 1989.
[20] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[21] T. P. Vogl, K. J. Mangis, A. K. Rigler, W. T. Zink, and D. L. Alkon, "Accelerating the convergence of the back-propagation method," Biological Cybernetics, vol. 59, pp. 257-263, 1988.

Francesco G. B. De Natale received the Laurea degree in electronic engineering and the Ph.D. degree in communications, in 1990 and 1994, respectively, both from the University of Genova, Italy. He is currently a post-doctoral fellow at the Department of Biophysical and Electronic Engineering of the University of Genova; he is also an Adjunct Professor at the University of Trento, where he teaches Pattern Recognition Theory and Techniques. His research interests are in the areas of video communications, image analysis, and multimedia. Dr. De Natale received an IRI award for his thesis work in 1990. He is a member of AEI.

Stefano Fioravanti received the Laurea degree in electronic engineering and the Ph.D. degree in communications, in 1990 and 1994, respectively, both from the University of Genova, Italy. From 1991 to 1993, he was with DIBE, University of Genova, as a Research Fellow, working on fractal analysis of images and data compression. At present, he holds a research position at the Saclant Research Center in La Spezia, Italy. His research interests are in the area of nonlinear image analysis and processing.

Daniele D. Giusto (M'87) received the Laurea degree (M.S.) in electronics engineering in 1986, and the Dottorato di Ricerca degree (Ph.D.) in communications in 1990, both from the University of Genova, Italy. From 1991 to 1993, he was with DIBE, University of Genova, as a Post-Doctoral Research Fellow, supervising the research activities within some international projects. Since 1994, he has been an Assistant Professor of Communications in the Department of Electrical and Electronic Engineering, University of Cagliari, Italy. He is also a Visiting Professor at the School of Advanced Optics, University of Nuoro, Italy. In the summer of 1995, he was with the Institut für Nachrichtentechnik, Technische Universität Braunschweig, as a Visiting Fellow. His research interests are in the area of signal/image processing and coding. He has been involved as an Auditor/Evaluator in European Commission RACE and LTR research programmes. In 1993, he received the Ottavio Bonazzi best paper award from AEI. He is a member of AEI.

