Novel embedded image coding algorithms based on wavelet difference reduction

Y. Yuan and M.K. Mandal

Abstract: Wavelet difference reduction (WDR) has recently been proposed as a method for efficient embedded image coding. In this paper, the WDR algorithm is analysed and four new techniques are proposed to either reduce its complexity or improve its rate distortion (RD) performance. The first technique, dubbed modified WDR-A (MWDR-A), focuses on improving the efficiency of the arithmetic coding (AC) stage of the WDR. Based on experiments with the statistics of the output symbol sequence, it is shown that the symbols can either be arithmetic coded under different contexts or output without AC. In the second technique, MWDR-B, the AC stage is dropped from the coder. By employing MWDR-B, up to 20% of coding time can be saved without sacrificing the RD performance, when compared to WDR. The third technique focuses on the improvement of RD performance using context modelling. A low-complexity context model is proposed to exploit the statistical dependency among the wavelet coefficients. This technique is termed context-modelled WDR (CM-WDR), and acts without the AC stage to improve the RD performance by up to 1.5 dB over WDR on a set of test images, at various bit rates. The fourth technique combines CM-WDR with AC and achieves a 0.2 dB improvement over CM-WDR in terms of PSNR. The proposed techniques retain all the features of WDR, including low complexity, region-of-interest capability, and embeddedness.

1 Introduction

Good performance at low bit rates and signal-to-noise (SNR) scalability are among the major features of the next-generation image compression algorithms. Many such algorithms are based on the two-dimensional discrete wavelet transform (2D-DWT).

Generally, in a wavelet-based image coder, the DWT stage is followed by an encoding algorithm that exploits the inter-subband and/or intra-subband statistical dependency among wavelet coefficients by employing efficient data structures. In most scalable wavelet image coders, the scalability is produced by the so-called ‘embedded coding’ technique. The scalable output bitstream is fine-granular, where additional bits contribute to an incremental improvement in the reconstruction quality. In other words, the rate distortion (RD) curve of the scalable coding algorithm is continuous.

In his classic paper [1], Shapiro established the 2D-DWT as a competitive tool in progressive still-image coding. Here, the embedded coding is achieved by progressively encoding binary bit planes from the most significant one towards the least significant one. The bit planes implicitly specify a sequence of progressively refined uniform scalar quantisers. In addition, an efficient data structure named ‘zerotree’, which attributes its origin to the quad-tree structure, was proposed [1] to exploit the residual cross-scale dependency among subbands. The zerotree-based wavelet coding performance was further improved in the set partitioning in hierarchical trees (SPIHT) algorithm [2]. The most significant characteristic of the SPIHT is that the zerotree data structure is exploited to encode an entire insignificant region of a given bit plane with only one symbol.

It is worth noting that although many embedded wavelet image coding schemes thereafter have adopted the zerotree data structure, some researchers have proposed methods not relying on zerotrees to encode wavelet coefficients. Among them are the wavelet difference reduction (WDR) algorithm [3] and the low-complexity context modelling approach [4]. In both techniques, the zerotree data structure is precluded, but the embedding principles of lossless bit plane coding and set partitioning are preserved. In the WDR algorithm, instead of employing the zerotrees, each coefficient in a decomposed wavelet pyramid is assigned a linear position index. The mapping of the two-dimensional (2D) co-ordinate pairs of the wavelet coefficients into a one-dimensional (1D) index array follows a fixed scan order, which is similar to a zig-zag scan order. The distinguishing feature of WDR is that the run length between two neighbouring significant coefficients, i.e. the difference between their indices, is efficiently encoded. With the remarkably simple encoding of the precise positions of significant coefficients and embedded coding such as SPIHT, WDR enjoys simplicity as well as a coding performance similar to that of SPIHT. To further improve the rate distortion (RD) performance, Walker [5] incorporated the parent–children relationship into the WDR algorithm. Later, Walker and Nguyen proposed to incorporate neighbour dependency, i.e. ‘siblings’, into the encoding in addition to the parent–children relationship [6].

In this paper, several novel techniques are proposed, with the objective to reduce computational complexity and improve compression performance of the WDR. This paper contains two parts discussing different ways of improving the performance of WDR. In the first part, the statistical properties of the output symbol sequence of WDR are analysed and it is shown that by separately encoding symbols from the sorting pass and the refinement pass of WDR, the output symbols can be arithmetic coded [7] under different contexts for a slightly improved PSNR result, or be output without arithmetic coding, retaining the same, or better, PSNR result. In the second part, we propose to apply a low-complexity context model to wavelet coefficients, based on the observation that statistical dependencies, e.g. parent–children relationships and neighbourhood relationships, were not exploited by WDR. The algorithm with context modelling is termed context-modelled WDR (CM-WDR). CM-WDR always performs better than SPIHT without arithmetic coding (AC) in terms of PSNR. CM-WDR with the AC algorithm, called CM-WDR/AC here, performs equally as well as SPIHT-AC in terms of PSNR.

© IEE, 2005

IEE Proceedings online no. 20051183

doi: 10.1049/ip-vis:20051183

The authors are with the Multimedia Computing and Communications Laboratory, Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada, T6G 2V4

E-mail: [email protected]

Paper first received 2nd July 2003 and in revised form 23rd September 2004

IEE Proc.-Vis. Image Signal Process., Vol. 152, No. 1, February 2005

2 The WDR algorithm

In this Section, we present a brief overview of the WDR algorithm. It is shown that the WDR algorithm borrows ideas from the more popular EZW and SPIHT, while adopting a totally different data structure to exploit the statistical dependency among wavelet coefficients.

Both EZW and SPIHT utilise the zerotree data structure and progressive bit plane coding to encode wavelet coefficients efficiently. The zerotree data structure was first proposed [8] to exploit cross-scale dependency by encoding the quantised-zero coefficients jointly. Especially at low bit rates, most wavelet coefficients will be quantised to zero because of the coarse quantisation step. This makes the zerotree a very efficient data structure. In EZW, Shapiro combines it with progressive bit plane coding to generate an embedded bitstream. SPIHT is very similar to EZW in terms of utilisation of zerotrees, but SPIHT uses a different approach to encode zerotree information and a different organisation of output bits. SPIHT can achieve good compression without an explicit entropy-coding stage. SPIHT conducts a sorting pass and a refinement pass recursively from the highest bit plane towards the lowest bit plane. In a bit plane $p$, wavelet coefficients are partitioned into three lists, i.e. the lists of insignificant sets, insignificant pixels, and significant pixels, according to a significance test. A wavelet coefficient $c$ is labelled as ‘significant’ if $|c| \ge 2^p$.

WDR employs similar encoding stages to SPIHT. It also conducts a sorting pass and a refinement pass for each bit plane. As the counterparts of the three lists in SPIHT, three sets are defined in WDR, i.e. the set of insignificant coefficients (ICS), the set of significant coefficients (SCS), and the temporary set of significant coefficients (TPS). Since WDR does not utilise the zerotree data structure, it does not have a list of insignificant sets as in SPIHT. Instead of directly concatenating the significant coefficients found in a given bit plane to the SCS, like SPIHT adding such coefficients to the LSP, WDR adds newly identified significant coefficients to the TPS. The TPS is later concatenated to the end of the SCS after the refinement pass.

The only difference in bit plane encoding between WDR and SPIHT is in the sorting pass. Instead of using zerotrees to represent insignificant coefficients, WDR defines a scan order of wavelet coefficients, which traverses all subbands in a wavelet pyramid from coarse resolutions to fine resolutions. The scan order maps a 2D wavelet pyramid into a 1D array of wavelet coefficients. A significance test is conducted on the 1D array to search for significant coefficients. By encoding the run length between two consecutive significant coefficients, the output from the sorting pass consists of the signs of significant coefficients along with the run lengths, which represent the positions of significant coefficients.

For example, suppose we have a 4 × 4 subband; first we map it into a 1D array of sixteen coefficients. If only the first, the eighth, and the 14th coefficients are significant against a given threshold T, we need to output the signs of these significant coefficients and the run lengths between every pair of consecutive significant coefficients. In this example, the run lengths are 1, 7, 6. Since the most significant bits (MSB) of these numbers are always 1, we do not need to output the MSBs. Therefore, we will output ‘Null’ for 1, output ‘11’ for 7, and output ‘10’ for 6. In order to indicate the end of a run length, the sign of the significant coefficient is used to mark the end of output numbers. Suppose the signs of the three significant coefficients are ‘+ − +’; the output symbol sequence is then ‘+ 1 1 − 1 0 +’. The symbol sequence will then be encoded into a bitstream by AC [7].
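The worked example above can be reproduced with a minimal sketch of the index coding step. The function names are hypothetical, positions are assumed to be 1-based with the first run length measured from index 0 (which matches the run lengths 1, 7, 6 in the example), and the ASCII character `-` stands in for the minus sign.

```python
def wdr_index_encode(positions, signs):
    """Turn 1-based positions of significant coefficients into WDR symbols."""
    out = []
    prev = 0                      # assumed starting index before the first coefficient
    for pos, sign in zip(positions, signs):
        run = pos - prev          # run length = difference between indices
        out.append(bin(run)[3:])  # binary digits of the run with the leading 1 dropped
        out.append(sign)          # the sign marks the end of the run length
        prev = pos
    return "".join(out)


def wdr_index_decode(symbols):
    """Inverse of wdr_index_encode: recover positions and signs."""
    positions, signs = [], []
    prev, bits = 0, "1"           # re-insert the implicit MSB
    for s in symbols:
        if s in "+-":
            prev += int(bits, 2)
            positions.append(prev)
            signs.append(s)
            bits = "1"
        else:
            bits += s
    return positions, "".join(signs)


print(wdr_index_encode([1, 8, 14], "+-+"))  # +11-10+
```

In a full coder this symbol string would then be passed to the arithmetic coder; that stage is not sketched here.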

3 Modified WDR algorithms based on symbol separation

In the previous Section, we briefly discussed WDR and its similarities to and differences from zerotree-based coding schemes. In the WDR algorithm, there are four symbols, $\{+, -, 0, 1\}$, in the output of a sorting pass. However, there are only two symbols, $\{0, 1\}$, in the output of a refinement pass. While WDR encodes refinement pass symbols together with sorting pass symbols under an identical context, we found that the encoding of refinement pass symbols should be separated from the encoding of sorting pass symbols to reach either a slightly higher RD performance (with AC) within a similar time, or a faster compression speed (without AC) with comparable RD performance.

In the WDR algorithm, the symbol sequence produced in a refinement pass is encoded together with the symbol sequence of a sorting pass under the same context using AC. This means that we have an alphabet $A$ of four symbols, namely, $\{+, -, 0, 1\}$. The arithmetic coder adaptively updates the occurrence frequencies of the symbols, and determines the corresponding code lengths for all symbols.

In the modified WDR (MWDR) algorithms, a refinement pass symbol sequence is separated from a sorting pass symbol sequence and is encoded under a different context that has an alphabet $B$ of only two symbols, $\{0, 1\}$.

Suppose we have a symbol sequence of length $N$ and $n$ ($n < N$) symbols have already been encoded using WDR. With the adaptive model in [7], we can see that the code length for the $(n+1)$th symbol is determined by the occurrence frequencies of the four entries in the alphabet $A$ in the first $n$ symbols, namely, $f_{n,0}$, $f_{n,1}$, $f_{n,+}$, and $f_{n,-}$. The estimated probability of symbol $x$ up to the $n$th symbol will be

\[
\hat{p}_x = \frac{f_{n,x}}{\sum_{x} f_{n,x}}, \qquad x \in A = \{+, -, 0, 1\} \tag{1}
\]

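In the MWDR schemes, there exist two contexts and we have new equations for the estimated probabilities:

\[
\begin{cases}
\hat{p}^{\,s}_x = \dfrac{f^{\,s}_{n,x}}{\sum_{x} f^{\,s}_{n,x}}, & x \in A = \{+, -, 0, 1\} \\[2ex]
\hat{p}^{\,r}_y = \dfrac{f^{\,r}_{n,y}}{\sum_{y} f^{\,r}_{n,y}}, & y \in B = \{0, 1\}
\end{cases} \tag{2}
\]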

where $\hat{p}^{\,s}_x$ denotes an estimated probability in a sorting pass and $\hat{p}^{\,r}_y$ denotes an estimated probability in a refinement pass. Similarly, $f^{\,s}_{n,x}$ denotes the occurrence frequency of symbol $x$ in a sorting pass and $f^{\,r}_{n,y}$ denotes the occurrence frequency of symbol $y$ in a refinement pass, when the first $n$ symbols have been encoded. The occurrence frequencies of the entries satisfy the following conditions

\[
\begin{cases}
f_{n,+} = f^{\,s}_{n,+} \\
f_{n,-} = f^{\,s}_{n,-} \\
f_{n,0} = f^{\,s}_{n,0} + f^{\,r}_{n,0} \\
f_{n,1} = f^{\,s}_{n,1} + f^{\,r}_{n,1}
\end{cases} \tag{3}
\]
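The difference between the single-context model of (1) and the two-context model of (2) can be illustrated with a small sketch that computes ideal adaptive code lengths from running frequency counts. This is not the arithmetic coder of [7]: the function names, the `("s", symbol)` / `("r", symbol)` stream layout, and the initial count of 1 per symbol (used to avoid zero probabilities) are assumptions made for illustration.

```python
from math import log2

SORT_ALPHABET = "+-01"    # alphabet A
REFINE_ALPHABET = "01"    # alphabet B


def adaptive_code_length(symbols, alphabet):
    """Ideal code length in bits of `symbols` under one adaptive context."""
    counts = {a: 1 for a in alphabet}    # assumed initial count of 1 per entry
    bits = 0.0
    for s in symbols:
        total = sum(counts.values())
        bits += log2(total / counts[s])  # -log2 of the estimated probability
        counts[s] += 1                   # update the frequency after coding
    return bits


def wdr_cost(stream):
    """All symbols share one four-symbol context, as in (1)."""
    return adaptive_code_length([s for _, s in stream], SORT_ALPHABET)


def mwdr_cost(stream):
    """Sorting and refinement symbols use separate contexts, as in (2)."""
    sorting = [s for kind, s in stream if kind == "s"]
    refinement = [s for kind, s in stream if kind == "r"]
    return (adaptive_code_length(sorting, SORT_ALPHABET)
            + adaptive_code_length(refinement, REFINE_ALPHABET))


# Toy stream: one sorting pass followed by a few refinement bits.
stream = [("s", c) for c in "+11-10+"] + [("r", c) for c in "0110"]
print(wdr_cost(stream), mwdr_cost(stream))
```

Which of the two costs is smaller for a mixed stream depends on the symbol statistics; the analysis that follows addresses that question.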

We have conducted experiments on two groups of standard test images, which are downloadable from http://links.uwaterloo.ca/bragzone.base.html, to collect the occurrence frequencies of the four output symbols. These two groups of images are all the grey-scale images we can download from the web site. The first group consists of twelve images of size 256 × 256. Among them, five are natural images and the other seven are synthetic images, as shown in Fig. 1. The resultant estimated probabilities are plotted in Fig. 2. To distinguish results for the two categories of images, results are illustrated in two subfigures. All images in the second group are natural images except two, as shown in Fig. 3. Most images in the second group have a size of 512 × 512, although four images have different sizes. The corresponding results are plotted in Fig. 4. All images were compressed at 2 bits per pixel (bpp). The occurrence frequencies of the symbols were updated when a new symbol was output. At any point on the x axis, the sum of the estimated probabilities of the four symbols equals 1. Based on the plotted results, we can generally write

\[
f_{n,+} \approx f_{n,-} \tag{4}
\]

An interesting phenomenon is that the estimated probability of symbol 0 is always larger than that of symbol 1, with $\hat{p}_0 - \hat{p}_1 \approx 0.1$, especially in natural images. To simplify our analysis, we assume $\hat{p}_0 \approx \hat{p}_1$. This is practical because its effect on the average code length is very small. For example, normally we have $\hat{p}_0 + \hat{p}_1 \approx 0.7$. If we assume $\hat{p}_0 \approx \hat{p}_1 \approx 0.35$, the average code length is $\bar{l} = -\hat{p}_0 \log \hat{p}_0 - \hat{p}_1 \log \hat{p}_1 = 1.06$. If we take $\hat{p}_0 \approx 0.4$ and $\hat{p}_1 \approx 0.3$, the average code length is 1.05. After the simplification assumption, we have

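\[
\begin{cases}
\hat{p}_0 \approx \hat{p}_1 \approx p \\
\hat{p}_+ \approx \hat{p}_- \approx q \\
\hat{p}^{\,s}_0 \approx \hat{p}^{\,s}_1 \approx p_s \\
\hat{p}^{\,s}_+ \approx \hat{p}^{\,s}_- \approx q_s \\
\hat{p}^{\,r}_0 \approx \hat{p}^{\,r}_1 \approx 0.5
\end{cases} \tag{5}
\]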

where $p + q = 0.5$ and $p_s + q_s = 0.5$. In the MWDR, the code length for the $(n+1)$th symbol is

\[
l^{\,r}_{n+1,1} = \hat{p}^{\,r}_0 \log \frac{1}{\hat{p}^{\,r}_0} + \hat{p}^{\,r}_1 \log \frac{1}{\hat{p}^{\,r}_1} \approx 1 \tag{6}
\]

if the symbol is produced in the refinement pass. If the symbol is encoded by WDR, the code length is

\[
l^{\,r}_{n+1,2} = \frac{\hat{p}_0}{\hat{p}_0 + \hat{p}_1} \log \frac{1}{\hat{p}_0} + \frac{\hat{p}_1}{\hat{p}_0 + \hat{p}_1} \log \frac{1}{\hat{p}_1} \tag{7}
\]

According to the log-sum inequality [9],

\[
\frac{\hat{p}_0}{\hat{p}_0 + \hat{p}_1} \log \hat{p}_0 + \frac{\hat{p}_1}{\hat{p}_0 + \hat{p}_1} \log \hat{p}_1 \ge \log \frac{\hat{p}_0 + \hat{p}_1}{2} \tag{8}
\]

Fig. 1 First set of twelve standard test images from University of Waterloo still image database. Size of all images is 256 × 256


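therefore, (7) becomes

\[
l^{\,r}_{n+1,2} \le 1 - \log(\hat{p}_0 + \hat{p}_1) \approx \log \frac{1}{p} \tag{9}
\]

Consequently, the difference in code length between WDR and MWDR becomes

\[
\Delta l^{\,r}_{n+1} = l^{\,r}_{n+1,1} - l^{\,r}_{n+1,2} \approx 1 + \log p \le 0 \tag{10}
\]

because $p \le 0.5$.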
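A short numerical sketch can make the refinement pass comparison in (6) to (10) concrete. The function names are hypothetical, logarithms are taken to base 2 (so code lengths are in bits), and the probability values are the illustrative ones quoted earlier in this Section rather than measured data.

```python
from math import log2


def refinement_cost_mwdr():
    """Eq. (6): two-symbol context with both probabilities equal to 0.5."""
    return 0.5 * log2(1 / 0.5) + 0.5 * log2(1 / 0.5)


def refinement_cost_wdr(p0, p1):
    """Eq. (7): digits coded in the shared four-symbol context."""
    total = p0 + p1
    return (p0 / total) * log2(1 / p0) + (p1 / total) * log2(1 / p1)


# Under the simplifying assumption p0 = p1 = p, (10) reduces to 1 + log2(p).
p = 0.35
delta = refinement_cost_mwdr() - refinement_cost_wdr(p, p)
print(delta, 1 + log2(p))

# Without the assumption, (9) still bounds the WDR cost from above.
print(refinement_cost_wdr(0.4, 0.3), 1 - log2(0.4 + 0.3))
```

A negative `delta` corresponds to MWDR spending fewer bits than WDR on a refinement symbol.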

If the symbol is produced from the sorting pass, we have the following code length if we use MWDR:

\[
l^{\,s}_{n+1,1} = \sum_{x} \hat{p}^{\,s}_x \log \frac{1}{\hat{p}^{\,s}_x} \approx 2 p_s \log \frac{1}{p_s} + 2 q_s \log \frac{1}{q_s}, \qquad x \in A \tag{11}
\]

Similarly, we have the following code length if we use the WDR:

Fig. 2 Estimated probabilities of the four symbols when encoding the first set of twelve test images

a Results from encoding five natural images
b Results from encoding seven synthetic images
Note that the probabilities of ‘0’ are slightly higher than the probabilities of ‘1’. The probabilities of ‘+’ and ‘−’ are similar, and are smaller compared to the probabilities of ‘0’ and ‘1’

Fig. 3 Second set of twelve standard test images from University of Waterloo still image database. Eight images have size 512 × 512; four images have various sizes


\[
l^{\,s}_{n+1,2} = \sum_{x} \hat{p}_x \log \frac{1}{\hat{p}_x} \approx 2 p \log \frac{1}{p} + 2 q \log \frac{1}{q}, \qquad x \in A \tag{12}
\]

Now, the difference in code length between WDR and MWDR becomes

\[
\Delta l^{\,s}_{n+1} = l^{\,s}_{n+1,1} - l^{\,s}_{n+1,2} \approx 2 p \log p + 2 q \log q - 2 p_s \log p_s - 2 q_s \log q_s \ge 0 \tag{13}
\]

because the probabilities of the digit symbols (0 and 1) are higher than the probabilities of the sign symbols (+ and −), which leads to $|p - q| \ge |p_s - q_s|$. Now we see that although we decrease $l_{n+1}$ in a refinement pass, we increase it in a sorting pass. It is necessary to show that, on average, the amount of decrease in a refinement pass surpasses the amount of increase in a sorting pass.

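\[
\begin{aligned}
\Delta l_{n+1} &= \Delta l^{\,r}_{n+1} + \Delta l^{\,s}_{n+1} \\
&\approx 1 + \log p + 2 p \log p + 2 q \log q - 2 p_s \log p_s - 2 q_s \log q_s \\
&= 1 + \log p + 2 p \log p + 2 (0.5 - p) \log (0.5 - p) - 2 p_s \log p_s - 2 (0.5 - p_s) \log (0.5 - p_s)
\end{aligned} \tag{14}
\]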

where the relationship between $p$ and $p_s$ satisfies the following conditions

\[
\begin{cases}
0 < p < 0.5 \\
0 < p_s < 0.5 \\
p_s \le p
\end{cases} \tag{15}
\]

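because

\[
\begin{aligned}
p_s - p &\approx \frac{f^{\,s}_{n,0}}{2\,(f^{\,s}_{n,0} + f^{\,s}_{n,+})} - \frac{f_{n,0}}{2\,(f_{n,0} + f_{n,+})} \\
&= \left( \frac{1}{2} - \frac{f^{\,s}_{n,+}}{2\,(f^{\,s}_{n,0} + f^{\,s}_{n,+})} \right) - \left( \frac{1}{2} - \frac{f_{n,+}}{2\,(f_{n,0} + f_{n,+})} \right) \\
&= \frac{f_{n,+}}{2\,(f_{n,0} + f_{n,+})} - \frac{f_{n,+}}{2\,(f^{\,s}_{n,0} + f_{n,+})} \le 0
\end{aligned} \tag{16}
\]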

Figure 5 shows that when (15) holds, the decrease in expected code length in a refinement pass always exceeds the increase in expected code length in a sorting pass, i.e. $\Delta l_{n+1} < 0$. Exactly how much bit budget can be saved in the encoding of the $(n+1)$th symbol is discussed later in the Appendix, Section 9.

With the analysis above, we propose two variants of the WDR algorithm. The difference among the three algorithms is in the AC stage. If we need higher RD performance, we arithmetic encode sorting pass symbols with the four-symbol context, and refinement pass symbols with the two-symbol context. We call this algorithm MWDR-A. If we want to reduce computational complexity, we can simply use two bits to represent each of the four symbols in a sorting pass and output raw symbols without any further processing in a refinement pass. We call this algorithm MWDR-B. The performance of the two algorithms is presented in Section 5.
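The MWDR-B output stage described above can be sketched in a few lines. The particular two-bit code assigned to each sorting pass symbol is not specified in the text, so the `SORT_CODE` table below is an assumption chosen for illustration, as is the function name; the ASCII character `-` stands in for the minus sign.

```python
# Assumed (hypothetical) fixed two-bit code for the four sorting pass symbols.
SORT_CODE = {"+": "00", "-": "01", "0": "10", "1": "11"}


def mwdr_b_bits(sorting_symbols, refinement_bits):
    """Map one bit plane's symbols to output bits without arithmetic coding."""
    # Two bits per sorting pass symbol.
    sort_part = "".join(SORT_CODE[s] for s in sorting_symbols)
    # Refinement bits are already binary and are emitted unchanged.
    return sort_part + refinement_bits


bits = mwdr_b_bits("+11-10+", "01")
print(bits, len(bits))
```

With this scheme a sorting pass of k symbols always costs 2k bits and each refinement symbol costs exactly one bit, which is the source of the speed advantage over an adaptive arithmetic coder.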

4 Context-modelled WDR algorithms

By separating the symbols produced in a sorting pass from symbols produced in a refinement pass, we can improve the efficiency of WDR. However, one major drawback of WDR, i.e. the lack of exploitation of statistical dependencies among wavelet coefficients, remains unaddressed by the MWDR schemes. In zerotree-based schemes, statistical dependencies are efficiently exploited by the zerotree data structure. Although the fixed scan order and the index coding in WDR make it computationally simple, a by-product of the techniques is that inter-subband and intra-subband statistical dependencies, which are proven to be worth exploiting in wavelet image coding, are precluded. To tackle this drawback, we employ a low-complexity context modelling approach to exploit the statistical dependencies at low computational cost.

Fig. 4 Estimated probabilities of the four symbols when encoding second set of twelve test images. The probability trend is similar to the trend shown in Fig. 2

Fig. 5 (a) Plot of $\Delta l_{n+1}$ against $p$ and $p_s$; (b) enlarged portion of (a), where $\Delta l_{n+1} > 0$

Plane $p = p_s$ is also plotted. It is shown that $\Delta l_{n+1} < 0$ when $p_s < p$

A detailed description of the proposed context-modelled WDR (CM-WDR) algorithm is presented below.

1. Transform: perform an L-level 2D-DWT on an image and obtain a wavelet pyramid.
2. Context setting: let i denote a coefficient in the decomposed wavelet pyramid. Let P(i) denote the parent of i in terms of the usual parent–children relationship [1]. Let N(i) denote a set of nearest neighbours of i within the same subband as i. The number of neighbours for i is determined by its own position in the subband. If i is at one of the four corners of a subband, it has three neighbours. If i is along a boundary line of a subband, it has five neighbours. Otherwise, it has eight neighbours.
3. Scan order setting: each subband at level l is partitioned into macroblocks of size $2^{L+1-l} \times 2^{L+1-l}$. Within each block, the index of the ith scanning coefficient is obtained by de-interleaving the odd and even bits of the binary representation of i. This scan order within each subband is different from that of WDR [3], while the scanning priority of orientations and levels remains the same.
4. Initialisation: determine the initial threshold T using the same procedure as in WDR. Initialise the indices of all coefficients in the wavelet pyramid according to the scan order.
5. Sorting pass: record positions for the ‘new’ significant coefficients, i.e. the indices for coefficients whose magnitudes are greater than or equal to the current threshold. Encode these positions with the index coding method as in WDR.
6. Refinement pass: record the refinement bits, i.e. the binary values of the ‘old’ significant coefficients, which fall into the bit plane defined by the current threshold. These refinement bits are inserted into the output bitstream without further encoding.
7. Index update: an updated index array is produced based on the remaining insignificant coefficients. The updated index array consists of three sub-arrays. The first sub-array is called the significant neighbour sub-array (SNS), and contains all insignificant coefficients having at least one significant neighbour. The second sub-array is called the significant parent sub-array (SPS), and contains all insignificant coefficients whose parent is significant. The third sub-array is called the run sub-array (RS), and contains the remaining insignificant coefficients that do not satisfy the above conditions. Note that each insignificant coefficient can be enlisted in only one sub-array. The priority of the SNS is higher than that of the SPS, i.e. a coefficient is first checked to see whether it has significant neighbour(s). The three sub-arrays are then concatenated to generate the updated index array. The concatenation order is: SNS + SPS + RS.
8. Threshold update: divide the current threshold by 2, and repeat from the sorting pass until a given bit budget is reached.
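The de-interleaving in the scan order setting step can be sketched as follows. The text does not say which bit positions map to rows and which to columns, so the choice below (even-numbered bits form the column, odd-numbered bits form the row) is an assumption, and the function name is hypothetical.

```python
def deinterleave(i):
    """Split scan index i into (row, col) inside a macroblock.

    Assumption: bits 0, 2, 4, ... of i build the column and
    bits 1, 3, 5, ... build the row.
    """
    row = col = 0
    bit = 0
    while i:
        col |= (i & 1) << bit   # even-numbered bit goes to the column
        i >>= 1
        row |= (i & 1) << bit   # odd-numbered bit goes to the row
        i >>= 1
        bit += 1
    return row, col


# Scan order for a 4 x 4 macroblock.
for i in range(16):
    print(i, deinterleave(i))
```

Under this assumption the first four indices visit the top-left 2 × 2 quadrant before the scan moves on, so coefficients that are close in the scan are also close in the subband.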

In the decoder, the above steps are followed in the same order to produce a quantised output. The final reconstruction value is set to the midpoint of the quantisation bin.

The encoding of a demo 4 × 4 pyramid with a decomposition level of 2 by CM-WDR is illustrated in Fig. 6. Note that the size of the 1D index array is 16 while an index of 17 is shown. This index represents a ‘virtual’ positive significant coefficient labelling the end of a bit plane. Also, note that in HL subbands, coefficients are scanned column-by-column, which follows the lowpass filtering along the column. The major difference between CM-WDR and WDR is shown in the coding of the second most significant bit plane. Thirteen coefficients remain insignificant after the encoding of the first bit plane. In the sorting pass, coefficients with the significant neighbour 49 are placed at the front of the array. The three neighbours of 49 in the same subband, namely, −25, 14, and 7, are moved to the front. The coefficients with a significant parent follow next. −31 and 23 are chosen because they have the same parent 63. The children of −34 have already been moved to the front because all of them neighbour a significant coefficient. This shows the higher priority of coefficients with significant neighbours.

Fig. 6 Encoding first two bit planes of 4 × 4 demo pyramid using CM-WDR

a Encoding most significant bit plane
b Encoding second most significant bit plane

The major difference between CM-WDR and WDR is the index update step. This index update strategy enhances the performance of WDR in two aspects: (1) the dependency between neighbouring coefficients and the dependency between a parent and its children coefficients are reflected in the updated index array; and (2) the sub-array (fractional bit plane) with the highest probability of generating significant coefficients in the next loop of testing is placed at the front of the updated index array. This increases the ‘skewness’ of the distribution of the significant coefficients, whereby the (lossless) compression ratio of the index coding is higher. The reason that we put the SNS in the first part of the updated index array is that, based on our experiments, new significant coefficients are expected to be identified with higher probability in the SNS than in the SPS in the next loop of testing. This arrangement is also supported by the findings of other researchers [10].
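The index update strategy can be sketched as a three-way partition of the remaining insignificant coefficients. The function name and the two predicate arguments are hypothetical, and keeping the previous scan order inside each sub-array is an assumption, since the text only fixes the concatenation order SNS + SPS + RS.

```python
def update_index(insignificant, has_sig_neighbour, has_sig_parent):
    """Reorder insignificant coefficients as SNS + SPS + RS.

    insignificant      -- coefficient identifiers in the current scan order
    has_sig_neighbour  -- predicate: does this coefficient have a significant neighbour?
    has_sig_parent     -- predicate: is this coefficient's parent significant?
    """
    sns, sps, rs = [], [], []
    for c in insignificant:
        if has_sig_neighbour(c):   # SNS has priority over SPS
            sns.append(c)
        elif has_sig_parent(c):
            sps.append(c)
        else:
            rs.append(c)
    return sns + sps + rs


# Toy example with hypothetical coefficient labels.
near_significant = {"c", "e"}
child_of_significant = {"b", "c"}
order = update_index(
    ["a", "b", "c", "d", "e"],
    lambda c: c in near_significant,
    lambda c: c in child_of_significant,
)
print(order)  # ['c', 'e', 'b', 'a', 'd']
```

Note that `c` satisfies both conditions and lands in the SNS only, which reflects the rule that each coefficient is enlisted in exactly one sub-array.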

Another important feature of CM-WDR is that AC is avoided, which simplifies the coding architecture. Each of the four symbols $\{+, -, 0, 1\}$ produced in a sorting pass is encoded using two bits.

A natural extension of the CM-WDR is to incorporate an AC stage into the encoding of symbol sequences from sorting passes and refinement passes. Based on the analysis in Section 3 on the statistics of the symbols produced, two contexts are employed in the AC as proposed in MWDR-A. We call this AC variant of the context modelling scheme CM-WDR/AC. The performance of the two schemes is discussed in the next Section.

5 Experimental results

The coding performance of the MWDR schemes is presented first, followed by that of the CM-WDR schemes. A lifting implementation [11] of the bi-orthogonal wavelet filter bank proposed in [12] was used with a decomposition level of 6. The implementations ran on a Fedora Core 2 GNU/Linux workstation with a Pentium-III 800 MHz processor and 256 MB memory. The Linux kernel version is 2.6.7, and the GNU C Compiler (GCC) version is 3.3.3.

The reconstruction distortion was measured by the peak signal-to-noise ratio (PSNR)

\[
\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \ \mathrm{dB} \tag{17}
\]

where MSE is the mean squared error between an original image and a reconstructed image. All bit rates given are exact output bit rates.
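For reference, (17) can be computed with a few lines of code. This is a minimal sketch that treats an image as a flat sequence of 8-bit pixel values; the function name and the handling of the zero-error case are choices made here, not part of the original experiments.

```python
from math import log10


def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equally sized sequences of pixel values."""
    n = len(original)
    # Mean squared error between the original and the reconstruction.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n
    if mse == 0:
        return float("inf")   # identical images: distortion-free
    return 10 * log10(peak ** 2 / mse)


print(psnr([52, 55, 61, 59], [50, 55, 60, 62]))
```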

5.1 MWDR

In our experiments, the execution time as well as RD performance was measured. The execution time of the three algorithms (namely, WDR, MWDR-A, MWDR-B) was measured by the getrusage() function provided in the GNU ANSI C library. Since the getrusage() function returns only an approximate value of the processor time used by a program, we ran every program thirty times and calculated the average. Although we took this step to avoid large timing errors, the bar graph plotted in Fig. 7 should be interpreted as only a coarse-grained comparison among the algorithms.

The execution time for the proposed algorithms is shown in Fig. 7. It can be observed that MWDR-B always consumed the least time on both encoding and decoding. This results from the simplicity of the direct mapping from a symbol sequence to an output bitstream. Another observation is that the timing gap between MWDR-B and the other two schemes increases as the coding bit rate increases; this is because the arithmetic coding of the longer output symbol sequence in WDR needs more time. When a near-lossless bit rate is reached, it can be seen that up to 20% of the decoding time can be saved without sacrificing any compression performance when compared to WDR.

The PSNR results for four 512 × 512, 8 bpp grey-scale images, ‘Lena’, ‘Goldhill’, ‘Boat’, and ‘Barbara’, are shown in Table 1. It can be seen that MWDR-A always achieves a better RD performance than WDR by encoding output symbol sequences from sorting passes and refinement passes separately under different contexts. Between WDR and MWDR-B, it was found that at certain breakpoints the performance of MWDR-B exceeds that of WDR. This is so for two reasons. First, MWDR-B saves bit budget by encoding a refinement pass symbol with only one bit, while on average, WDR spends more than one bit for each of the two symbols. Second, by excluding 0 and 1 symbols of a refinement pass from being arithmetic coded together with those symbols of a sorting pass, the skewness of the symbol probability distribution in the sorting pass is decreased, which would have made an arithmetic coder less efficient if we employed AC in MWDR-B. This justifies the exclusion of AC from MWDR-B. Since the PSNR difference among the three algorithms is quite small and MWDR-B is faster and less complex due to dropping of the AC stage, MWDR-B should be chosen over MWDR-A in practice.

Fig. 7 Comparison of timing performance of MWDR schemes when coding ‘Lena’ image (512 × 512, 8 bpp)

a Encoding image at various bit rates
b Decoding image at various bit rates

5.2 CM-WDR

Table 1 also compares the RD performance of CM-WDR schemes against WDR and SPIHT. The seven columns are for WDR, MWDR-A, MWDR-B, CM-WDR, CM-WDR/AC, SPIHT (without AC), and SPIHT-AC (with AC), respectively. The results for SPIHT and SPIHT-AC were obtained using the implementation in QccPack [13].

As shown, CM-WDR always performs better than SPIHT and WDR at all bit rates. The performance gap between CM-WDR and SPIHT-AC rarely exceeds 0.1 dB at low bit rates, while CM-WDR enjoys lower complexity. The performance gap between WDR and CM-WDR ranges from 0.1 dB to 1.5 dB at various bit rates on the test images. It can also be observed that WDR performs nearly as well as SPIHT at low bit rates. Figures 9 and 10 let us visually compare the reconstruction quality of the schemes discussed. It is seen that CM-WDR schemes provide more texture information, especially around the knee in the ‘Barbara’ image.

Table 2 compares the performance of CM-WDR schemes with WDR in terms of the total number of encoded significant coefficients for a specific bit rate. Since the adaptive scan order in CM-WDR schemes increases the skewness of the distribution of the significant coefficients in the to-be-scanned array, the efficiency of index coding is higher. The bit budget saved by the adaptivity of the scan order is used to encode more coefficients.

Figure 8 illustrates the timing performance of our implementation of CM-WDR schemes against Fowler’s implementation of the SPIHT scheme. We compared our implementations with Fowler’s implementations because we also built our work upon Fowler’s QccPack library. In fact, our implementation of the WDR has been integrated into the QccPack library as a module and is available as free software from http://qccpack.sourceforge.net. Generally, it is hard to compare the computational complexity of a coding scheme in this way because the timing is highly dependent on specific implementations of an algorithm. Interestingly, the SPIHT program spends more time on encoding than the SPIHT-AC program, which is conceptually more complex than the SPIHT. Although our implementation of the MWDR schemes did perform well against Fowler’s SPIHT implementation in terms of execution time, our implementation of CM-WDR schemes did lag behind the timing performance of Fowler’s implementation. By employing a profiling tool on CM-WDR implementations, we found that our index update implementation is so inefficient that it consumes 70% of the total execution time. It is obvious that this index update step could use some optimisation. Our implementation used a doubly-linked list data structure for the index array, which provides manipulation simplicity, source code readability, and easy maintenance despite its inefficiency. Above all, we still claim the ‘conceptual’ simplicity of CM-WDR schemes and we believe they could be implemented more efficiently in the future.

Table 1: Rate distortion performance of the proposed techniques (1: WDR, 2: MWDR-A, 3: MWDR-B, 4: CM-WDR, 5: CM-WDR/AC, 6: SPIHT, 7: SPIHT-AC)

                            PSNR, dB
Image     Bit rate, bpp     1      2      3      4      5      6      7
Lena      0.15              31.16  31.34  31.12  31.61  31.72  31.33  31.71
          0.25              33.22  33.38  33.24  33.89  34.01  33.62  34.00
          0.5               36.29  36.50  36.36  36.99  37.10  36.74  37.11
          1.0               39.47  39.66  39.54  40.05  40.19  39.90  40.29
Goldhill  0.15              28.65  28.72  28.64  28.78  28.87  28.59  28.89
          0.25              30.24  30.36  30.25  30.27  30.39  30.16  30.44
          0.5               32.66  32.80  32.61  32.78  32.88  32.57  32.96
          1.0               35.82  36.04  35.86  36.09  36.24  35.86  36.38
Boat      0.15              28.32  28.42  28.32  28.50  28.63  28.37  28.63
          0.25              30.47  30.57  30.34  30.51  30.71  30.34  30.74
          0.5               33.57  33.82  33.55  33.93  34.12  33.74  34.20
          1.0               38.16  38.44  38.23  38.49  38.75  38.34  38.91
Barbara   0.15              25.79  25.85  25.67  26.07  26.15  25.50  26.08
          0.25              27.25  27.32  27.23  28.20  28.25  27.60  28.20
          0.5               31.12  31.35  31.20  32.15  32.28  31.60  32.20
          1.0               35.86  36.25  36.05  37.31  37.42  36.79  37.42

Fig. 8 Comparison of timing performance of CM-WDR schemes and SPIHT schemes when coding ‘Lena’ image (512 × 512, 8 bpp)

a Encoding image at various bit rates
b Decoding image at various bit rates

Fig. 9 Visual comparison of same enlarged area from ‘Barbara’ image

a Original
b WDR, PSNR 27.25 dB
c MWDR-A, PSNR 27.32 dB
d MWDR-B, PSNR 27.23 dB
All algorithms encode at 0.25 bpp

Fig. 10 Visual comparison of same enlarged area from ‘Barbara’ image

a CM-WDR, PSNR 28.20 dB
b SPIHT, PSNR 27.60 dB
c CM-WDR/AC, PSNR 28.25 dB
d SPIHT-AC, PSNR 28.20 dB
All algorithms encode at 0.25 bpp

6 Conclusions

In this paper we propose four embedded wavelet image coding schemes based on WDR. In the discussion of the first two algorithms, we have shown that the symbol sequences from sorting passes and refinement passes should be processed separately to achieve improvements in the RD performance and/or computational complexity. The symbol sequences can either be arithmetic coded under different contexts or simply be output without arithmetic coding. Since we can achieve similar or better compression performance with savings of up to 20% of the processor time, it is justified that the AC stage be dropped in WDR.

In the other two algorithms, we applied a low-complexity context-modelling approach to baseline WDR. The proposed algorithms preserve all features of WDR such as embeddedness, low complexity, region of interest (ROI) capability, and progressive SNR scalability. In terms of PSNR, CM-WDR is superior to SPIHT and WDR, while CM-WDR/AC ties with SPIHT-AC. Future work aims at optimising the implementation of CM-WDR schemes and applying a similar context modelling concept to ROI coding.

7 Acknowledgments

The authors are grateful to the anonymous reviewers for their constructive comments and suggestions.

8 References

1 Shapiro, J.M.: ‘Embedded image coding using zerotrees of wavelet coefficients’, IEEE Trans. Signal Process., 1993, 41, (12), pp. 3445–3462

2 Said, A., and Pearlman, W.A.: ‘A new, fast, and efficient image codec based on set partitioning in hierarchical trees’, IEEE Trans. Circuits Syst. Video Technol., 1996, 6, (3), pp. 243–250

3 Tian, J., and Wells, R.O., Jr.: ‘Embedded image coding using wavelet difference reduction’, in Topiwala, P.N. (Ed.): ‘Wavelet image and video compression’ (Kluwer Academic, Boston, 1998), pp. 289–302

4 Ordentlich, E., Weinberger, M.J., and Seroussi, G.: ‘A low-complexity modeling approach for embedded coding of wavelet coefficients’. Proc. of IEEE Data Compression Conf., 1998, pp. 408–417

5 Walker, J.S.: ‘Lossy image codec based on adaptively scanned wavelet difference reduction’, Opt. Eng., 2000, 39, (7), pp. 1891–1897

6 Walker, J.S., and Nguyen, T.Q.: ‘Adaptive scanning methods for wavelet difference reduction in lossy image compression’. Proc. of IEEE Int. Conf. Image Processing, 1998, pp. 408–417

7 Witten, I.H., Neal, R.M., and Cleary, J.G.: ‘Arithmetic coding for data compression’, Commun. ACM, 1987, 30, (6), pp. 520–540

8 Lewis, A.S., and Knowles, G.: ‘Image compression using the 2-D wavelet transform’, IEEE Trans. Image Process., 1992, 1, pp. 244–250

9 Cover, T.M., and Thomas, J.A.: ‘Elements of information theory’ (John Wiley & Sons, Inc., New York, 1991)

10 Liu, J., and Moulin, P.: ‘Information-theoretic analysis of interscale and intrascale dependencies between image wavelet coefficients’, IEEE Trans. Image Process., 2001, 10, (11), pp. 1647–1658

11 Daubechies, I., and Sweldens, W.: ‘Factoring wavelet transforms into lifting steps’, J. Fourier Anal. Appl., 1998, 4, (3), pp. 245–267

12 Cohen, A., Daubechies, I., and Feauveau, J.C.: ‘Biorthogonal bases of compactly supported wavelets’, Commun. Pure Appl. Math., 1992, 45, (5), pp. 485–560

13 Fowler, J.E.: ‘QccPack: an open-source software library for quantization, compression, and coding’. Proc. of IEEE Data Compression Conf., 2000, p. 554

9 Appendix

Exactly how much bit budget can be saved in the encoding of the $(n+1)$th symbol depends on the difference between $p$ and $p_s$, and on whether the symbol is produced in a sorting pass or a refinement pass. To facilitate the analysis, we further define two parameters: $r$, which represents the ratio of the occurrence frequency of the sign symbols to that of the digit symbols in a sorting pass when the first $n$ symbols have been coded, and $\rho$, which represents the ratio of the occurrence frequency of the digit symbols in a refinement pass to that of the digit symbols in the sorting pass when the first $n$ symbols have been coded.

\[
r \triangleq \frac{f_{n,+}}{f^{s}_{n,0}}, \qquad \rho \triangleq \frac{f^{r}_{n,0}}{f^{s}_{n,0}} \tag{18}
\]

Fig. 11 Typical $r$ and $\rho$ values for natural images. Plot for ‘Lena’ image given as an example. 256 × 256 image compressed at 2 bpp

Table 2: Number of encoded significant coefficients in WDR and CM-WDR

Image     Scheme       Output bit rate, bpp
                       0.15    0.25    0.5     1.0
Lena      CM-WDR       6928    12058   24292   48376
          CM-WDR/AC    7091    12378   24934   50015
          WDR          6351    10124   20285   42008
Goldhill  CM-WDR       7144    10488   23636   50274
          CM-WDR/AC    7351    11210   24150   51483
          WDR          6872    10335   23291   48508
Boat      CM-WDR       6801    11060   23381   46739
          CM-WDR/AC    7109    11151   24021   46739
          WDR          6427    10990   22372   46151
Barbara   CM-WDR       7118    13981   27838   53990
          CM-WDR/AC    7425    14147   28394   54743
          WDR          6936    11603   23421   45589


From (16) we have

\[
\begin{aligned}
\Delta p = p_s - p
&\approx \frac{f_{n,+}}{2\,(f_{n,0}+f_{n,+})} - \frac{f_{n,+}}{2\,(f^{s}_{n,0}+f_{n,+})} \\
&= \frac{f_{n,+}}{2\,(f^{s}_{n,0}+f^{r}_{n,0}+f_{n,+})} - \frac{f_{n,+}}{2\,(f^{s}_{n,0}+f_{n,+})} \\
&= \frac{r}{2(1+r+\rho)} - \frac{r}{2(1+r)}
= -\frac{r}{2(1+r)} \times \frac{\rho}{1+r+\rho}
\end{aligned}
\tag{19}
\]

Taking the ‘Lena’ image as an example, typical values of $r$ and $\rho$ for natural images are plotted in Fig. 11. It is shown that $r$ is fairly constant, increasing gradually with the number of output symbols, while $\rho$ increases with higher fluctuation. The sawtooth shape of $\rho$ is caused by the alternation of sorting passes and refinement passes. For example, no refinement pass symbol is produced during a sorting pass, which causes a monotonic decrease of $\rho$.
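The algebra in (19) can be checked numerically with a short Python sketch. Here `r` denotes the sign-to-digit frequency ratio in sorting passes and `rho` the ratio of refinement-pass digits to sorting-pass digits, as defined in (18). The function names and the symbol counts are hypothetical, chosen only so that `r = 0.5` and `rho = 1`.

```python
def delta_p_from_counts(f_plus, f_s0, f_r0):
    """Second line of (19): difference of the two frequency ratios."""
    return f_plus / (2 * (f_s0 + f_r0 + f_plus)) - f_plus / (2 * (f_s0 + f_plus))


def delta_p_closed_form(r, rho):
    """Last expression of (19), written in terms of r and rho."""
    return -(r / (2 * (1 + r))) * (rho / (1 + r + rho))


# Hypothetical symbol counts after n symbols (not measured data).
f_plus, f_s0, f_r0 = 500, 1000, 1000
r = f_plus / f_s0    # sign symbols vs. sorting-pass digit symbols
rho = f_r0 / f_s0    # refinement-pass digits vs. sorting-pass digits

print(delta_p_from_counts(f_plus, f_s0, f_r0))
print(delta_p_closed_form(r, rho))
```

Both expressions evaluate to about -0.067 for these counts, so the closed form agrees with the count-based definition.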

For natural images, $r$ gradually increases from 0.4 to 0.6, while $\rho$ increases from 0 to more than 1 with significant fluctuation. As an example, if we take $r = 0.5$ and $\rho = 1$, we have $\Delta p = -0.067$. Since the typical value of $p$ is around 0.35, we have $p_s \approx 0.283$. Because the $(n+1)$th symbol is not evenly drawn from the two passes, we need to estimate the probability that it comes from a refinement pass or a sorting pass. It turns out that the probability of the $(n+1)$th symbol coming from a refinement pass is $\rho/(1+r+\rho)$. After we consider the probability of symbols being drawn from the two different passes, (14) becomes

\[
\Delta l_{n+1} = \frac{\rho}{1+r+\rho} \times \Delta l^{r}_{n+1} + \frac{1+r}{1+r+\rho} \times \Delta l^{s}_{n+1} \tag{20}
\]

and we have $\Delta l_{n+1} = -0.14$. However, the average code length using WDR for this symbol will be

\[
l_{n+1} = -2 \times \bigl(p \log_2 p + (0.5 - p) \log_2 (0.5 - p)\bigr) = 1.88 \tag{21}
\]

This typical result shows that at higher bit rates, around 7% of the bit budget can be saved using the symbol separation method. Although this saving may not seem significant, it comes at no cost because the same AC procedure is followed, except that two contexts are used.
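The figures quoted above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, the value -0.14 for the code length change is taken from the text rather than recomputed (its components are defined in equations outside this appendix), and `r = 0.5`, `rho = 1` and `p = 0.35` are the typical values used in the example.

```python
import math


def wdr_code_length(p):
    """Average code length per symbol from (21), in bits."""
    return -2 * (p * math.log2(p) + (0.5 - p) * math.log2(0.5 - p))


def pass_probabilities(r, rho):
    """Weights used in (20): (refinement pass, sorting pass)."""
    total = 1 + r + rho
    return rho / total, (1 + r) / total


p = 0.35
code_length = wdr_code_length(p)           # about 1.88 bits
delta_l = -0.14                            # quoted in the text, not recomputed
saving = -delta_l / code_length            # fraction of the bit budget saved
p_refine, p_sort = pass_probabilities(0.5, 1.0)

print(round(code_length, 2), round(saving, 3), p_refine, p_sort)
```

With these inputs the code length is about 1.88 bits and the saving is about 7.4%, in line with the "around 7%" stated above; the two pass weights are 0.4 and 0.6.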

