MML Image Comp

Experiment 7: IMAGE COMPRESSION

I Introduction

A digital image obtained by sampling and quantizing a continuous-tone picture requires enormous storage. For instance, a 24-bit color image with 512x512 pixels occupies 768 Kbytes on disk, and a picture twice this size will not fit on a single floppy disk. To transmit such an image over a 28.8 Kbps modem would take almost 4 minutes. The purpose of image compression is to reduce the amount of data required to represent sampled digital images and therefore reduce the cost of storage and transmission. Image compression plays a key role in many important applications, including image databases, image communications, remote sensing (the use of satellite imagery for weather and other earth-resource applications), document and medical imaging, facsimile transmission (FAX), and the control of remotely piloted vehicles in military, space, and hazardous waste control applications. In short, an ever-expanding number of applications depend on the efficient manipulation, storage, and transmission of binary, gray-scale, and color images.

An important development in image compression is the establishment of the JPEG standard for compression of color pictures. Using the JPEG method, a 24 bit/pixel color image can be reduced to between 1 and 2 bits/pixel without obvious visual artifacts. Such reduction makes it possible to store and transmit digital imagery at reasonable cost. It also makes it possible to download a color photograph almost in an instant, making electronic publishing/advertising on the Web a reality.

Prior to this, the G3 and G4 standards were developed for compression of facsimile documents, reducing the time for transmitting one page of text from about 6 minutes to 1 minute.

    In this experiment, we will introduce the basics of image compression, including both binary

    images and continuous tone images (gray-scale and color). Video compression will be covered in

    the next experiment.

II Theories and Techniques for Image Compression

In general, coding methods can be classified into lossless and lossy. With lossless coding, the original sample values are retained exactly, and compression is achieved by exploiting the statistical redundancies in the signal. With lossy coding, the original signal is altered to some extent to achieve a higher compression ratio.


II.1 Lossless Coding

II.1.1 Variable Length Coding [1, Chapter 6.4]

In variable length coding (VLC), the more probable symbols are represented with fewer bits (using shorter codewords). Shannon's first theorem [3] states that the average codeword length per symbol is bounded by the entropy of the source, i.e.,

H <= l_bar < H + 1,  with  H = -sum_n p_n log2 p_n  and  l_bar = sum_n p_n l_n,   (10.1)

where p_n is the probability of the n-th symbol, H is the entropy of the source, which represents the average information per symbol, l_n is the length of the codeword for symbol n, and l_bar is the average codeword length.

II.1.2 Huffman Coding

The Shannon theorem only gives the bound, not an actual way of constructing a code that approaches it. One way to accomplish the latter task is a method known as Huffman coding.

Example: Consider an image that is quantized to 4 levels: 0, 1, 2, and 3. Suppose the probabilities of these levels are respectively 1/49, 4/49, 36/49, and 8/49. The design of a Huffman code is illustrated in Figure 1.

Symbol   Prob    Codeword   Length
  2      36/49     1          1
  3       8/49     01         2
  1       4/49     001        3
  0       1/49     000        3

(The code is constructed by repeatedly merging the two least probable nodes: 1/49 + 4/49 = 5/49, then 5/49 + 8/49 = 13/49, then 13/49 + 36/49 = 1.)

Figure 1 An Example of Huffman Coding

In this example, we have


Average length: l_bar = (36/49)(1) + (8/49)(2) + (4/49)(3) + (1/49)(3) = 67/49 ≈ 1.37 bits/symbol.

Entropy of the source: H = -sum_k p_k log2 p_k ≈ 1.16 bits/symbol.

Thus H <= l_bar < H + 1, as Shannon's theorem guarantees.
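This construction and the Shannon bound can be checked numerically. The following Python sketch is purely illustrative (the course software is MATLAB); `huffman_lengths` is a hypothetical helper name, and the tree is built with the standard two-smallest-merge procedure using `heapq`:

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Build a Huffman code and return {symbol: codeword length}."""
    # Heap entries: (probability, tiebreak, list of symbols in subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)   # two least probable subtrees
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:                 # every symbol in the merged subtree
            lengths[s] += 1               # gains one more bit
        heapq.heappush(heap, (p1 + p2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths

probs = {0: 1/49, 1: 4/49, 2: 36/49, 3: 8/49}
lengths = huffman_lengths(probs)                       # {2: 1, 3: 2, 1: 3, 0: 3}
avg_len = sum(probs[s] * lengths[s] for s in probs)    # 67/49 ≈ 1.37
entropy = -sum(p * log2(p) for p in probs.values())    # ≈ 1.16
assert entropy <= avg_len < entropy + 1                # Shannon's bound holds
```

The resulting lengths match Figure 1, and the computed average length 67/49 ≈ 1.37 lies between H ≈ 1.16 and H + 1.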

II.1.3 Other Variable Length Coding Methods

LZW coding (Lempel, Ziv, and Welch) [2] is the algorithm used in several public-domain programs for lossless data compression, such as gzip (UNIX) and pkzip (DOS). One of the most famous graphics file formats, GIF, also incorporates the LZW coding scheme.

Another method, known as arithmetic coding [2], is more powerful than both Huffman coding and LZW coding, but it also requires more computation.

II.1.4 Runlength Coding (RLC) of Bilevel Images [1, Chapter 6.6]

In one-dimensional runlength coding of bilevel images, one scans the pixels from left to right along each scan line. Assuming that a line always starts and ends with white pixels, one counts the number (referred to as the runlength) of white pixels and of black pixels alternately. The last run of white pixels is replaced with a special symbol EOL (end of line). The runlengths of white and black are coded using separate codebooks. The codebook for, say, the white runlengths is designed using the Huffman coding method by treating each possible runlength (including EOL) as a symbol. An example of runlength coding is illustrated in Fig. 2.
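The counting scheme above can be sketched in a few lines of Python (illustrative only; the handout's own code is MATLAB). Here 0 denotes white, 1 denotes black, and -1 stands in for the EOL symbol:

```python
def runlengths(line, eol=-1):
    """Alternating white/black run lengths of one bilevel scan line.

    `line` holds 0 (white) and 1 (black); the line is assumed to start
    with white. The trailing run of white pixels is replaced by EOL.
    """
    runs, color, count = [], 0, 0
    for pixel in line:
        if pixel == color:
            count += 1
        else:
            runs.append(count)
            color, count = pixel, 1
    if color == 0:           # final run is white: replace it with EOL
        runs.append(eol)
    else:                    # final run is black: emit it, then EOL
        runs.append(count)
        runs.append(eol)
    return runs

# 2 white, 4 black, 2 white, 1 black, then a trailing white run -> EOL.
assert runlengths([0,0,1,1,1,1,0,0,1,0,0,0]) == [2, 4, 2, 1, -1]
```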


RUN-LENGTH CODING (EOL = end of line; runlengths listed as white, black, white, ... per line)

-------------------------------------------------------------------
- - x x x x - - x - - - - - x x x x - - x x x x - - x x x x - -
- - x - - - - - x - - - - - x - - - - - x - - x - - - - - x - -
- - x - - - - - x - - - - - x - - - - - x - - x - - - - - x - -
- - x x x x - - x - - - - - x x x x - - x x x x - - x x x x - -
- - x - - - - - x - - - - - - - - x - - - - - x - - - - - x - -
- - x - - - - - x - - - - - - - - x - - - - - x - - - - - x - -
- - x x x x - - x x x x - x x x x - - x x x x - - x x x x - -
-------------------------------------------------------------------

2 4 2 1 5 4 2 4 2 4 EOL
2 1 5 1 5 1 5 1 2 1 2 1 EOL
. . .
2 4 2 4 1 4 2 4 2 4 EOL

II.1.5 Two-Dimensional Runlength Coding [1, Chapter 6.6]

The one-dimensional runlength coding method only exploits the correlation among pixels in the same line. In two-dimensional runlength coding, or relative address coding, the correlation among pixels in the current line as well as the previous line is exploited. With this method, when a transition in color occurs, the distance from this pixel to the closest transition pixel (both before and after this pixel) in the previous line, as well as to the last transition pixel in the same line, is calculated, and the one with the shortest distance is coded, along with an index indicating which type of distance is coded. See Fig. 6.17 in [1].

    Fig. 2 An example of runlength coding


II.1.6 CCITT Group 3 and Group 4 Facsimile Coding Standards - The READ Code [1, Chapter 6.6]

In the Group 3 method, the first line in every K lines is coded using 1-D runlength coding, and the following (K-1) lines are coded using a 2-D runlength coding method known as Relative Element Address Designate (READ). For details of this method and the actual code tables, see [1], Sec. 6.6.1. The reason that 1-D RLC is used for every K-th line is to suppress the propagation of transmission errors: if the READ method were used continuously, a single bit error occurring during transmission would affect the entire page.

The Group 4 method is designed for more secure transmission media, such as leased data lines, where the bit error rate is very low. The algorithm is basically a streamlined version of the Group 3 method, with the 1-D RLC eliminated.

II.1.7 Lossless Predictive Coding

Motivation: The value of the current pixel usually does not differ much from those of adjacent pixels. Thus it can be predicted quite accurately from the previous samples. The prediction error has a non-uniform distribution, concentrated mainly near zero, which has a lower entropy than the original samples, which usually have a roughly uniform distribution. For details see [2], Sec. 9.4. With entropy coding (e.g. Huffman coding), the error values can therefore be specified with fewer bits than are required for the original sample values.
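A quick illustrative check of this motivation in Python (the helper names are hypothetical): for a smoothly varying signal, previous-sample prediction errors cluster near zero and have much lower empirical entropy than the samples themselves:

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Empirical entropy (bits/sample) of a sequence of values."""
    n = len(samples)
    return -sum(c/n * log2(c/n) for c in Counter(samples).values())

# A smoothly varying "scan line": values ramp up and then back down.
signal = list(range(0, 64)) + list(range(64, 0, -1))

# Predict each sample by its left neighbour; code the prediction error.
residuals = [signal[0]] + [signal[i] - signal[i-1]
                           for i in range(1, len(signal))]

# Nearly all residuals are +1 or -1, so their entropy is far lower.
assert entropy(residuals) < entropy(signal)
```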

II.2 Transform Coding (Lossy Coding) [1, Chapter 6.5]

Lossless coding can achieve a compression ratio of 2--3 for most images. To further reduce the amount of data, lossy coding methods apply quantization to the original samples or to the parameters of some transformation of the original signal (e.g. prediction or transformation). The transformation serves to exploit the statistical correlation among the original samples. Popular methods include linear prediction and unitary transforms. We discussed linear predictive coding and its application in speech and audio coding in the previous experiment. You have also learnt about and experimented with uniform and non-uniform quantization in the previous experiment. In this section, we focus on transform coding, which is more effective for images.

One of the most popular lossy coding schemes for images is transform coding. In block-based transform coding, one divides an image into non-overlapping blocks. For each block, one first transforms the original pixel values into a set of transform coefficients using a unitary transform. The transform coefficients are then quantized and coded. In the decoder, one reconstructs the original block from the quantized coefficients through an inverse transform. The transform is designed to compact the energy of the original signal into only a few coefficients, and to reduce the correlation among the variables to be coded. Both contribute to the reduction of the bit rate.


II.2.1 The Discrete Cosine Transform (DCT)

The DCT is popular for image signals because it matches well the statistics of common image signals. The basis vectors of the one-dimensional N-point DCT are defined by:

h_k(n) = alpha(k) cos( (2n+1) k pi / (2N) ),   n = 0, 1, ..., N-1,

with alpha(0) = sqrt(1/N) and alpha(k) = sqrt(2/N) for k = 1, 2, ..., N-1.

    The forward and inverse transforms are described by:

t(k) = sum_{n=0..N-1} h_k(n) f(n) = alpha(k) sum_{n=0..N-1} f(n) cos( (2n+1) k pi / (2N) ),

f(n) = sum_{k=0..N-1} h_k(n) t(k) = sum_{k=0..N-1} alpha(k) t(k) cos( (2n+1) k pi / (2N) ).
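As a numerical cross-check (Python/NumPy, not part of the original MATLAB experiment), one can build the matrix H with entries h_k(n) and verify that it is unitary, so the inverse transform is simply the transpose:

```python
import numpy as np

def dct_matrix(N):
    """H[k, n] = alpha(k) cos((2n+1) k pi / (2N)): the N-point DCT basis."""
    n = np.arange(N)
    H = np.sqrt(2.0 / N) * np.cos((2*n[None, :] + 1) * n[:, None]
                                  * np.pi / (2 * N))
    H[0, :] = np.sqrt(1.0 / N)          # alpha(0) = sqrt(1/N)
    return H

H = dct_matrix(8)
# The basis vectors are orthonormal, so the transform is unitary.
assert np.allclose(H @ H.T, np.eye(8))

f = np.arange(8, dtype=float)           # an arbitrary test signal f(n)
t = H @ f                               # forward transform t(k)
assert np.allclose(H.T @ t, f)          # inverse transform recovers f(n)
```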

Note that the basis vectors vary in a sinusoidal pattern with increasing frequency. Note also that the N-point DCT is related to the 2N-point DFT, but is not simply the real part of it. Each DCT coefficient specifies the contribution of a sinusoidal pattern at a particular frequency to the actual signal. The lowest coefficient, known as the DC coefficient, represents the average value of the signal. The other coefficients, known as AC coefficients, are associated with increasingly higher frequencies.

To obtain the 2-D DCT of an image block, one can first apply the above 1-D DCT to each row of the image block, and then apply the 1-D DCT to each column of the row-transformed block. A 2-D transform is equivalent to representing an image block as a superposition of many basic block patterns. The basic block patterns corresponding to the 8x8 DCT are shown in Fig. 3.

This figure is generated using the Matlab script:

T=dctmtx2(8);
figure; colormap(gray(256)); n=1;
for (i=1:8)
  for (j=1:8)
    subplot(8,8,n),
    imagesc(T(i,:)'*T(j,:));
    axis off; axis image;
    n=n+1;
  end;
end;


II.2.2 Representation of Image Blocks Using the DCT

The reason that the DCT is well suited for image compression is that an image block can often be represented with a few low-frequency DCT coefficients. This is because the intensity values in an image usually vary smoothly, and very high frequency components exist only near edges. Fig. 4 shows the distribution of the DCT coefficient variances (i.e. the energy) as the frequency index increases. Fig. 5 shows the approximation of an image using different numbers of DCT coefficients. We can see that with only 16 out of 64 coefficients, we can already represent the original block quite well. You can experiment with the approximation accuracy obtained with different numbers of DCT coefficients using the Matlab demo program dctdemo.
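The effect illustrated by dctdemo can be sketched in Python/NumPy (illustrative only): for a smoothly varying block, keeping only the k x k lowest-frequency coefficients already gives a good reconstruction, and because the transform is orthonormal the error shrinks monotonically as k grows:

```python
import numpy as np

# 8-point orthonormal DCT matrix (the h_k(n) defined in the text).
n = np.arange(8)
H = np.sqrt(2/8) * np.cos((2*n[None, :] + 1) * n[:, None] * np.pi / 16)
H[0, :] = np.sqrt(1/8)

# A smoothly varying 8x8 block, typical of natural image content.
xx, yy = np.meshgrid(n, n)
block = 128.0 + 8*xx + 4*yy

coeffs = H @ block @ H.T                   # 2-D DCT: rows, then columns

def keep_lowest(c, k):
    """Zero all but the k x k lowest-frequency coefficients."""
    kept = np.zeros_like(c)
    kept[:k, :k] = c[:k, :k]
    return kept

# Reconstruction error (Frobenius norm) shrinks as k grows.
errs = [np.linalg.norm(H.T @ keep_lowest(coeffs, k) @ H - block)
        for k in (1, 2, 4, 8)]
assert errs[-1] < 1e-6                     # all 64 coefficients: exact
assert errs[0] > errs[1] > errs[2] > errs[3]
```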

Figure 3 Two-dimensional DCT basis images (frequencies increase from Low-Low at the top left to High-High at the bottom right)


Figure 5 Images reconstructed with different numbers of DCT coefficients (original, and reconstructions using 4/64, 8/64, and 16/64 coefficients)

Figure 4 DCT coefficient variance (plotted against coefficient index in zig-zag order) decreases as the frequency index increases


II.2.3 JPEG Standard for Still Image Compression

The JPEG standard refers to the international standard for still image compression recommended by the Joint Photographic Experts Group (JPEG) of the ISO (International Standards Organization) [1, Chap. 6]. It consists of three parts: i) a baseline DCT-based lossy compression method for standard-resolution and standard-precision (8 bits/color/pixel) images; ii) an extended coding method for higher resolution/precision images, which uses the baseline algorithm in a progressive manner; and iii) a lossless predictive coding method.

A given image is first divided into 8x8 non-overlapping blocks. An 8x8 DCT is then applied to each block. For the DC coefficient of each block, a predictive coding method is used. That is, the current DC value is predicted from the DC value of the previous block, and the prediction error is quantized using a uniform quantizer. The AC coefficients are quantized directly using different quantizers (i.e. with different step-sizes). The step-sizes for the DC prediction error and the AC coefficients are specified in a normalization matrix. The particular matrix used can be specified at the beginning of the compressed bit stream as side information. Alternatively, one can use the matrix recommended by JPEG as the default, which is shown in Fig. 5. Usually one can trade off quality against compression ratio by scaling the normalization matrix with a scale factor. For example, a scale factor of two means the step-size for every coefficient is doubled, while a scale factor of 0.5 cuts all the step-sizes in half. A smaller scale factor leads to a more accurate representation of the original image, but also a lower compression gain (i.e. a higher bit rate).

    Fig. 5 A typical normalization matrix
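The quantization step can be sketched as follows (an illustrative Python/NumPy fragment, with the standard JPEG luminance table standing in for the matrix of Fig. 5; `quantize` here both quantizes and dequantizes so the reconstruction error is directly visible):

```python
import numpy as np

# The JPEG default luminance normalization matrix (the step-sizes).
Q = np.array([[16, 11, 10, 16, 24, 40, 51, 61],
              [12, 12, 14, 19, 26, 58, 60, 55],
              [14, 13, 16, 24, 40, 57, 69, 56],
              [14, 17, 22, 29, 51, 87, 80, 62],
              [18, 22, 37, 56, 68, 109, 103, 77],
              [24, 35, 55, 64, 81, 104, 113, 92],
              [49, 64, 78, 87, 103, 121, 120, 101],
              [72, 92, 95, 98, 112, 100, 103, 99]], dtype=float)

def quantize(coeffs, scale=1.0):
    """Quantize and dequantize DCT coefficients with step-sizes scale*Q."""
    step = scale * Q
    return np.round(coeffs / step) * step

rng = np.random.default_rng(0)
coeffs = rng.uniform(-200, 200, size=(8, 8))

# The error is at most half a step-size per coefficient, so doubling
# the scale factor doubles the worst-case error.
for scale in (0.5, 1.0, 2.0):
    err = np.abs(quantize(coeffs, scale) - coeffs)
    assert np.all(err <= scale * Q / 2 + 1e-9)

# Coefficients below half a step are zeroed out entirely.
assert np.all(quantize(np.full((8, 8), 4.0)) == 0)
```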


    Fig. 6 A typical zig-zag mask.

For the binary encoding of the quantized DC prediction error and AC coefficients, a combination of fixed and variable length coding is used. All possible DC prediction error values and all AC coefficient values (after quantization) are divided into separate categories according to their magnitude. See Table 6.14 in [1] for the categorization rule. The DC prediction error is encoded by a two-part code. The first part specifies which category it is in, and the second part specifies the actual relative magnitude within that category. The first part is Huffman coded based on the frequencies of the different categories, while the second part is coded using a fixed-length codeword. For the AC coefficients, a runlength coding method is used. It arranges the DCT coefficients into a 1-D array following a zig-zag order, as illustrated in Fig. 6. The AC coefficients are converted into symbols, each symbol being a pair consisting of the runlength of zeros since the last non-zero value and the following non-zero value. The non-zero value is further specified in two parts, similar to the approach for the DC prediction error. The first part specifies which category it belongs to, and the second part specifies the relative magnitude. The symbol consisting of the zero runlength and the category of the non-zero value is Huffman coded, while the relative magnitude of the non-zero value is coded using a fixed-length code. The standard recommends default Huffman tables for the DC prediction error and the AC symbols, but the user can also specify different tables that are optimized for the statistics of the images in a particular application. For a color image, each color component is compressed separately using the same method.
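The zig-zag scan and the (zero-run, value) symbol formation can be sketched in Python (illustrative only; the category/Huffman stage is omitted):

```python
import numpy as np

def zigzag_order(n=8):
    """Positions of an n x n block in JPEG's zig-zag scan order."""
    # Traverse anti-diagonals; alternate the direction on odd/even sums.
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def run_level_symbols(block):
    """(zero-runlength, value) pairs for the AC coefficients."""
    n = len(block)
    ac = [block[i][j] for (i, j) in zigzag_order(n)][1:]  # skip the DC term
    symbols, run = [], 0
    for v in ac:
        if v == 0:
            run += 1
        else:
            symbols.append((run, int(v)))
            run = 0
    return symbols   # trailing zeros are signalled by an end-of-block code

block = np.zeros((8, 8), dtype=int)
block[0, 0] = 90                       # DC coefficient (coded predictively)
block[0, 1], block[2, 0], block[1, 2] = 12, -4, 3
assert run_level_symbols(block) == [(0, 12), (1, -4), (3, 3)]
```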

II.3 Vector Quantization

Vector quantization (VQ) is another approach to image compression. The idea behind VQ is to determine the set of basic block patterns (each called a codeword) that can best represent all image blocks present in an image. The set of all basic patterns is called the codebook. Usually, the best codebook for a class of images is pre-designed. In the encoding process, for each given image block, the best-matching codeword in the codebook is found. Although this searching process is more CPU-intensive than JPEG, VQ is much simpler and faster on the decoding (decompression) side. Decoding vector-quantization-coded information involves looking up the appropriate codeword in the codebook based on the index created during the encoding process. The complexity of VQ grows exponentially with the size of the block (i.e. the number of pixels in a block). In practice, a block size of 4x4 or less is often used. Figure 7 describes the vector quantization architecture.

The simplest approach partitions the input image at the encoder into small, contiguous, non-overlapping rectangular blocks, or vectors, of pixels, each of which is individually quantized. The vector dimension is given by the number of pixels in the block. The


vector of samples is a pattern that must be approximated by a finite set of test patterns. The patterns are stored in a dictionary, the codebook; the patterns in the codebook are called codewords.

During compression, the encoder assigns to each input vector an address, or index, identifying the codeword in the codebook that best approximates the input vector. In the decoder, each index is mapped onto an output vector taken from the codebook. The decoder reconstructs the image as a patchwork quilt by reconstructing the vectors in the same sequence as the input vectors. The Intel Indeo technology is based on a proprietary vector quantization methodology.

Input Image (Input Vectors) --> Encoder (nearest-neighbour search) --> indices --> Decoder (table lookup) --> Output Vectors

Both encoder and decoder share the same codebook of code vectors.

Fig. 7. Vector Quantization
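The encoder/decoder asymmetry of Fig. 7 can be sketched in Python/NumPy (a toy hand-made codebook; real codebooks are trained on a class of images):

```python
import numpy as np

def vq_encode(vectors, codebook):
    """Nearest-neighbour search: index of the closest codeword per vector."""
    dist = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def vq_decode(indices, codebook):
    """Decoding is just a table lookup in the shared codebook."""
    return codebook[indices]

# A toy codebook of four 2x2 patterns, flattened to 4-vectors.
codebook = np.array([[0, 0, 0, 0],          # flat dark
                     [255, 255, 255, 255],  # flat bright
                     [0, 255, 0, 255],      # vertical edge
                     [0, 0, 255, 255]],     # horizontal edge
                    dtype=float)

blocks = np.array([[10, 5, 0, 8],
                   [250, 255, 240, 251],
                   [3, 240, 10, 255]], dtype=float)

indices = vq_encode(blocks, codebook)   # only these indices are transmitted
recon = vq_decode(indices, codebook)    # decoder's patchwork reconstruction
assert list(indices) == [0, 1, 2]
```

Note that all the search cost is on the encoder side; the decoder does a single table lookup per block.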


III Experiment

1) For bi-level runlength coding, if you have an image x=[1 1 1 1;1 2 2 1;1 1 1 1], the output should yield code=[-1 1 2 -1 -1].

Use h:\el593\exp10\e.gif as your input image, perform bi-level runlength coding manually, and count the probability of the different runlengths. Then use Huffman coding to decide the codeword for each symbol and calculate the average code length. You can use dispgif to read the input file and display the data of the image.
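For reference, the coding convention of this example can be expressed in Python (an illustrative sketch with a hypothetical helper name; your own implementation may differ). Here 1 is white, 2 is black, and -1 is the EOL symbol:

```python
def bilevel_rlc(image, white=1, eol=-1):
    """Row-wise runlength code; the trailing white run is replaced by EOL.

    An all-white row is coded as a single EOL, matching the example:
    [[1,1,1,1],[1,2,2,1],[1,1,1,1]] -> [-1, 1, 2, -1, -1].
    """
    code = []
    for row in image:
        color, count = white, 0
        for p in row:
            if p == color:
                count += 1
            else:
                code.append(count)
                color, count = p, 1
        if color != white:       # row ends on a black run: emit it
            code.append(count)
        code.append(eol)         # the trailing white run becomes EOL
    return code

x = [[1, 1, 1, 1], [1, 2, 2, 1], [1, 1, 1, 1]]
assert bilevel_rlc(x) == [-1, 1, 2, -1, -1]
```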

2) Demo10.m is a program that performs a DCT on each 8x8 image block, then quantizes the DCT coefficients. Finally it performs an inverse DCT to obtain a reconstructed image. Play with the program to see the effect of quantization with different quantization scale factors.

3) Modify the demo10.m program so that instead of quantizing the DCT coefficients using a supplied normalization matrix, you retain only the first L coefficients in zig-zag order. Try out different values of L (e.g. 1 <= L <= 16).

Hint: you should modify the mask in the program based on which coefficients are to be retained and which are to be zeroed out.

4) Modify the demo10.m program so that instead of quantizing the DCT coefficients using a supplied normalization matrix, you retain only those coefficients that have a magnitude greater than a threshold T. Try out different values of T (e.g. 1 <= T <= 256).

    NOTE: You must show your work to the instructor and get a signature.

IV Report

1) Submit the programs you wrote. Include the source code and your output results.

2) For experiments 2, 3, and 4, comment on the effect of the different quantization parameters, the number of retained coefficients, and the threshold. What are the maximum scale factor, the minimum number of coefficients, and the maximum threshold, respectively, that can be used while still maintaining an acceptable visual quality of the image?

3) For a video sequence, the image size of each frame is 360x240 and the frame rate is 30 frames/second with full color (3 bytes/pixel). What would the compression ratio have to be for this sequence to be transmitted at 1.5 Mbits/second?


V References

1) R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, 1992.

2) A. N. Netravali and B. G. Haskell, Digital Pictures -- Representation, Compression, and Standards, 2nd ed., Plenum Press, 1995.

3) A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, Prentice Hall, 1989.

4) Matlab User's Guide, The MathWorks Inc., 1993.

5) Matlab Reference Guide, The MathWorks Inc., 1992.


APPENDIX A

******************************************************************
* MATLAB Script file for demonstration of DCT                    *
******************************************************************

function demo10(FileName,dx,dy);
% Demo program for EL593 Exp.10
% usage : demo10('h:\el593\exp10\lena.img',256,256);
Img=fread(fopen(FileName),[dx,dy]);
colormap(gray(256)); image(Img');
set(gca,'XTick',[],'YTick',[]);
title('Original Image');
truesize; drawnow
y=blkproc(Img,[8 8],'dct2');
yy=blkproc(y,[8 8],'mask2');
yq=blkproc(yy,[8,8],'idct2');
figure;
colormap(gray(256));
image(yq');
set(gca,'XTick',[],'YTick',[]);
title('Quantized Image');
truesize; drawnow


******************************************************************
* MATLAB Script file for demonstration of DCT (subroutine 1)     *
******************************************************************

function [y]=mask2(x);
mask=[16 11 10 16 24 40 51 61;
      12 12 14 19 26 58 60 55;
      14 13 16 24 40 57 69 56;
      14 17 22 29 51 87 80 62;
      18 22 37 56 68 109 103 77;
      24 35 55 64 81 104 113 92;
      49 64 78 87 103 121 120 101;
      72 92 95 98 112 100 103 99];
x=x/8;
% Normally c=1
c=16;
mask=c*mask;
z=round(x./mask);
y=mask.*z;
y=8*y;

