Image Compression With Haar Discrete Wavelet Transform
Cory Cox
ME 535: Computational Techniques in Mech. Eng.
Figure 1: An example of the 2D discrete wavelet transform that is used in JPEG2000.
Source: http://en.wikipedia.org/wiki/File:Jpeg2000_2-level_wavelet_transform-lichtenstein.png
Intro
Importance Of Image Compression
In 2010 Google started incorporating web page loading times into its search ranking
algorithms. The reasoning behind the change: “Faster sites create happy users
[...] Like us, our users place a lot of value in speed – that’s why we’ve decided to take site speed
into account in our search rankings.” As the internet becomes a more feature-rich and graphics-intensive
platform, image compression becomes increasingly important, especially as
mobile phones and other devices join the fray alongside more traditional and powerful
laptops and desktop computers. To provide a satisfactory internet experience it’s necessary to
compress images in order to optimize page loading times. Since search engine optimization is
one of the most widely studied marketing strategies, it naturally follows that image compression
has become a vital component of web site design, especially now that Google’s rankings
incorporate loading times.
To further underscore the importance of image compression we can look at some data from a study
performed by the mobile web developer Trilibis.i The study reviewed 155 major websites and
measured the image weight relative to the total weight of each site; the resulting bar graph
shows just how much images contribute to the loading time of web pages.
Figure 2: Plot of web page weight versus loading time
Obviously image weight is a significant contributor to page weight, making up more than half of
the total weight in most cases. Now to illustrate how significant image compression is in
reducing the weight of a web page let’s look at Trilibis’ plot of weight savings after image
compression across different devices.
Figure 3: Page weight savings from image compression
Image compression is clearly a valuable tool for improving web page load times. It’s also
useful in many other applications, such as storing image files on memory cards or hard drives.
Now let’s look at one method for image compression: the Haar discrete wavelet transform
approach.
Haar Discrete Wavelet Transform Method
To begin, let’s assume that we’re working with a grayscale image. This means that each pixel is
represented with an integer value between 0 (black) and 255 (white). Our goal is to save only
the most relevant pixel information with fewer values (smaller file size) while allowing the entire
image to be accurately reconstructed using only those relevant pixels. For a simple example let’s
first look at a row vector; later we’ll move on to a matrix.
Say we have the following row of pixel information that we want to save.
We can represent these numbers in a variety of ways. As an example of what not to do, we could
save the vector as a series of paired averages, taking the first two pixel values and averaging
them, then the next two pixel values and so on. We’d get something like this:
That would give a relatively good representation of the original data, but it would be impossible
to accurately reconstruct the original data from the 4 so-called “approximation coefficients”
above. To improve this method we could save some more values that give us some idea of how
we could use the approximation coefficients to reconstruct the original data. Let’s save 4 more
values, bringing our total number of saved values to 8. This is the same number as we began
with! Why not just save those original 8 values then? We’ll get to that in a second. Let’s say that
we save the following 8 values:
The first 4 values are the values we originally picked to save: the averages of neighboring pairs.
Various sources call those values the “sums” or the “approximation coefficients”. The last 4
values give us the distance from each average to the surrounding points. Some sources denote
those values as the “detail coefficients” or the “differences”. For example, the approximation
coefficient 5 is the average of the first two original points, and the distance to the surrounding
points is given by the detail coefficient 1, so we know the original values were 4 and 6.ii
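The pairing procedure is easy to sketch in code. The report’s implementation is in Matlab, but a minimal pure-Python version makes the idea concrete; the 8-value row below is hypothetical, since the text only specifies that the first pair is 4 and 6:

```python
# One level of the pairwise averaging/differencing described above.
# Sign convention: the detail is half of (second - first), so the pair
# (4, 6) gives a detail coefficient of 1, matching the worked example.

def haar_step(row):
    """Return (approximations, details) for one level of the transform."""
    approx = [(row[i] + row[i + 1]) / 2 for i in range(0, len(row), 2)]
    detail = [(row[i + 1] - row[i]) / 2 for i in range(0, len(row), 2)]
    return approx, detail

def haar_step_inverse(approx, detail):
    """Reconstruct the original row exactly from the saved coefficients."""
    row = []
    for a, d in zip(approx, detail):
        row.extend([a - d, a + d])
    return row

row = [4, 6, 10, 12, 8, 8, 5, 3]          # hypothetical pixel values
approx, detail = haar_step(row)
print(approx)                              # [5.0, 11.0, 8.0, 4.0]
print(detail)                              # [1.0, 1.0, 0.0, -1.0]
print(haar_step_inverse(approx, detail))   # recovers the original row
```

The reconstruction is exact: each pair is recovered as (average minus detail, average plus detail).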
The reason that we choose to save these 8 values is that they can accurately reconstruct the
original data, and they also give us an idea of the rate of change of the data in an area. When the
distance to a surrounding point is small we know that the data at that location is all relatively
similar and not much changes as the location changes. This can be visualized as an area of an
image where the colors are relatively similar. When the distance to a surrounding point is large
we know that the data at that location is very different from the data surrounding it. This
corresponds to an area of an image where colors are changing at some kind of edge.
The fact remains that we started with 8 values and we are claiming that we can accomplish
image compression by saving 8 values, which is not intuitive. The reason that this method is
effective is that the differences, if small, can be approximated as zero and then discarded. By
iterating this process on matrices the Haar discrete wavelet transform focuses the energy of the
matrix in the upper left hand corner, leaving mostly zero values or near zero values elsewhere.
Let’s look at the procedure for Haar wavelet transforms (HWT) for matrices more in depth.
Say we have an image matrix, A, which stores grayscale pixel data for an image using integer
values between 0 and 255:
If we look at the first row of matrix A and follow the procedure we outlined above then we’ll
start by splitting the row up into pairs:
Now we’ll find the approximation coefficients, or the sums, which are the averages of each of the
pairs.
We also need to find the detail coefficients, or the differences, which are the distances from each
average to the corresponding points on either side of it. We can calculate these more explicitly as
half of the difference of each pair.
If we combine the approximation coefficients and the detail coefficients into one row then we’ve
found the first iteration of the Haar discrete wavelet transform of the first row of the matrix.
If we repeat that process on the approximation coefficients from the above row (just the first 4
values) then we’ll start with the following pairs:
We need to find the approximation and detail coefficients the way we did before:
And we’ll combine those into a row vector with the 1st iteration detail coefficients.
We repeat this once more for the first row and then perform the same operation for all of the rest
of the rows, as well as all of the columns. The resulting matrix is shown below:
All of the energy has been concentrated into the upper left hand entry and the rest of the entries
are either zero or relatively close to zero. This is a result of neighboring pixels in images
generally being relatively similar to each other in terms of color or grayscale intensity, with the
exception of pixels that define edges and outlines of shapes. If we count the number of entries
with a value of zero we’ll notice that there are 16 zeros in the above matrix.
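The full procedure (rows, then columns, then repeating on the shrinking block of sums) can be sketched as follows. This is a pure-Python illustration rather than the report’s Matlab code, and it assumes a square matrix whose side is a power of two:

```python
# Multi-level 2D Haar wavelet transform: average/difference every row,
# then every column, then repeat on the shrinking upper-left sums block.

def hwt_1d(vec):
    """One level: averages in the first half, differences in the second."""
    half = len(vec) // 2
    return ([(vec[2*i] + vec[2*i + 1]) / 2 for i in range(half)] +
            [(vec[2*i + 1] - vec[2*i]) / 2 for i in range(half)])

def hwt_2d(A, levels):
    A = [row[:] for row in A]               # work on a copy
    size = len(A)
    for _ in range(levels):
        # transform the rows of the active upper-left block
        for r in range(size):
            A[r][:size] = hwt_1d(A[r][:size])
        # transform the columns of the active block
        for c in range(size):
            col = hwt_1d([A[r][c] for r in range(size)])
            for r in range(size):
                A[r][c] = col[r]
        size //= 2                          # recurse on the sums block only
    return A
```

Running `hwt_2d` on a block of identical pixels, for example, concentrates all of the energy into the single upper left entry, with zeros everywhere else, which is exactly the behavior described above.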
Now let’s compress the image represented by the matrix above by picking a cutoff value such
that any pixel data with an absolute value less than that cutoff value is set to zero. Let’s pick a
cutoff value of 0.25 and set all entries in the above matrix that have an absolute value of less than
0.25 equal to zero. This results in the matrix pictured below:
If we count the number of entries with a value of zero we find that there are now 37 such entries
as opposed to 16 zero entries previously.
To calculate the compression ratio we take the number of non-zero entries in the original matrix
and divide by the number of non-zero entries in the compressed matrix.
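Thresholding and the compression-ratio calculation can be sketched the same way; the small matrix here is made-up data for illustration, not the matrix from the text:

```python
# Hard-threshold a transformed matrix and compute the compression ratio
# as defined above: nonzero entries before versus nonzero entries after.

def compress(A, cutoff):
    """Zero out every entry whose absolute value is below the cutoff."""
    return [[0 if abs(v) < cutoff else v for v in row] for row in A]

def compression_ratio(A, B):
    nnz = lambda M: sum(1 for row in M for v in row if v != 0)
    return nnz(A) / nnz(B)

A = [[12.0, 0.5, 0.1, -0.2],
     [0.3, -0.1, 1.5, 0.0],
     [0.0, 0.2, -0.05, 2.0],
     [0.1, 0.0, 0.3, -0.4]]            # hypothetical transformed data
B = compress(A, 0.25)                  # the cutoff used in the text
print(compression_ratio(A, B))         # nonzeros drop from 13 to 7
```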
Pixel Coding and Usage of Masks
To further optimize the compression of an image we can use different numbers of bits to store
the pixel information from each section of the matrix. As an example we’ll again reference the
compressed matrix, A, that we worked with in the above section. Most of the energy of the
matrix is contained in the upper left hand corner so we should use more bits to store that
information and we can use fewer bits for the sections of the matrix where the entries are mostly
zeros or close to zero.
Compression Analysis
We’ll be looking at a few different criteria for assessing the overall success of image compression.
1. Compression Ratio
2. Mean Square Error
3. Peak Signal to Noise Ratio
The compression ratio is the most obvious quantitative measurement of the success of image
compression. As described above, it is a way to compare the amount of significant information
contained in the original image matrix to the amount of significant information contained in the
compressed image matrix. This can be found simply by comparing the file size of the original
image to the file size of the compressed image. An image that originally has a file size of 5 MB
that is compressed to have a file size of 1 MB would have a compression ratio of 5:1, for
example.
The mean square error is less of a compression evaluation than it is a quality evaluation. It is a
way to directly compare the accuracy of a compressed and reconstructed image to the original
image in terms of individual pixel values.
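The mean square error formula referenced here did not survive in the text; its standard form, written with the same symbols defined just after it, is:

```latex
\mathrm{MSE} = \frac{1}{mn}\sum_{x=0}^{m-1}\sum_{y=0}^{n-1}\bigl[\,I(x,y) - I'(x,y)\,\bigr]^{2}
```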
In the above formula the dimensions of the image are denoted by m and n and I is the intensity of
the individual grayscale pixel values. I(x, y) are the pixel values for the original image and
I’(x, y) are the pixel values for the compressed and reconstructed image.
Whereas the mean square error is indicative of the cumulative error, the peak signal
to noise ratio relates the maximum possible pixel value to that error. In terms of image compression the signal is the
original image and the noise is the error that occurs as a result of the compression and
reconstruction. The peak signal to noise ratio equation is given below, in terms of the mean
square error:
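The PSNR formula is likewise missing from the text; the standard definition for 8-bit grayscale images (maximum pixel value 255) is:

```latex
\mathrm{PSNR} = 10\log_{10}\!\left(\frac{255^{2}}{\mathrm{MSE}}\right)
             = 20\log_{10}\!\left(\frac{255}{\sqrt{\mathrm{MSE}}}\right)
```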
So generally a better image compression will result in lower mean square error and a higher peak
signal to noise ratio.iii
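Both metrics are straightforward to compute; a minimal pure-Python sketch with made-up 2 x 2 pixel data:

```python
import math

def mse(original, reconstructed):
    """Mean square error between two equal-sized grayscale images."""
    m, n = len(original), len(original[0])
    return sum((original[x][y] - reconstructed[x][y]) ** 2
               for x in range(m) for y in range(n)) / (m * n)

def psnr(original, reconstructed, max_i=255):
    """Peak signal to noise ratio in dB (infinite for identical images)."""
    e = mse(original, reconstructed)
    return float('inf') if e == 0 else 10 * math.log10(max_i ** 2 / e)

orig = [[52, 55], [61, 59]]            # hypothetical original pixels
recon = [[54, 55], [60, 58]]           # hypothetical reconstruction
print(mse(orig, recon))                # 1.5
print(round(psnr(orig, recon), 1))     # 46.4
```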
Procedure
The procedure that we’ll follow for compressing the image under study using a Haar discrete
wavelet transform is as follows:
Compressing the Image
1. Start with grayscale image of size 256 x 256
To begin with we’ll pick a picture in full color and crop it to the appropriate size. The starting
image is shown below in Figure 4.
Figure 4
We’ll convert it to a grayscale picture using Matlab. Unfortunately I don’t have access to the
Image Processing Toolbox so there will be a fair amount of coding in the name of finding a
workaround for some of the functions that come standard in the Image Processing Toolbox. The
reason we convert to a grayscale picture is to obtain a simple intensity map which is much easier
to work with. The grayscale image is seen below in Figure 5.
Figure 5: The grayscale conversion
2. Scan a row of the image at a time, finding the sums/differences between neighboring
entries in the image matrix
This is easily accomplished using a “for” loop in Matlab.
3. Split the image matrix into a left side and a right side, storing the sums or approximation
coefficients in one half and the differences or detail coefficients in the other half.
Figure 6: After one iteration of row sums
In Figure 6 above we’ve split the original grayscale image seen in Figure 5 into a left half and
right half. For each entry in the original grayscale image matrix we calculated the sum and
difference between neighboring entries, which we use to generate Figure 6, which contains the
sums of the consecutive entries on one half and the differences between consecutive entries on
the other half. Now we’ll do the same for the columns of our original grayscale image.
4. Scan the image matrix by columns, finding the sums/differences between neighboring
entries
Once again, this is very easily done in Matlab with a simple “for” loop.
5. Split the matrix into a top half and bottom half, storing the sums in one half and the
differences in the other half.
Figure 7: After one iteration of row sums and column sums
Here we’ve taken the sums and differences and stored the sums in the upper half of the new
image matrix and the differences in the lower half of the new image matrix.
6. Repeat steps 2-5 for the smaller matrix where the sums of the column scan and the row
scan overlap. In our case we’ll repeat 4 times to obtain an image matrix where all of the row
and column sums are concentrated in the upper left hand corner in a 16 x 16 sub-matrix.
In Figures 8 and 9 below we’ve shown the resulting sums after the second iteration and after the
fourth iteration, at which point the row and column sums are concentrated into a 16 x 16 area in
the upper left hand corner.
Figure 8: After the second iteration
Figure 9: After 4 iterations
Decompressing the Image
To decompress the image and see what kind of errors are present after the compression and
restoration we follow a similar process but in reverse. The steps are so similar to the compression
process that we won’t go over each step in detail.
1. Start with the compressed image matrix from the final iteration
2. Reverse the sums/differences for each column of the matrix
3. Reverse the sums/differences for each row of the matrix
4. Repeat steps 2 and 3 for successively bigger matrices until we’re back at the original 256
x 256 image
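The decompression steps can be sketched as the exact mirror of the forward pass, again in pure Python rather than the report’s Matlab, assuming a square power-of-two matrix transformed with the given number of iterations:

```python
# Inverse Haar transform: starting from the smallest sums block, undo
# the column pass, then the row pass, doubling the active block each time.

def ihwt_1d(vec):
    """Undo one level: first half holds averages, second half differences."""
    half = len(vec) // 2
    out = []
    for a, d in zip(vec[:half], vec[half:]):
        out.extend([a - d, a + d])     # matches detail = (second - first)/2
    return out

def ihwt_2d(A, levels):
    A = [row[:] for row in A]          # work on a copy
    n = len(A)
    size = n // (2 ** (levels - 1))    # side of the smallest active block
    for _ in range(levels):
        for c in range(size):          # undo the column transform first
            col = ihwt_1d([A[r][c] for r in range(size)])
            for r in range(size):
                A[r][c] = col[r]
        for r in range(size):          # then undo the row transform
            A[r][:size] = ihwt_1d(A[r][:size])
        size *= 2                      # grow back toward the full image
    return A
```

Applied to an untouched transform this recovers the original pixels exactly; applied after thresholding it yields the slightly degraded reconstruction discussed below.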
The results of the decompression are shown below in Figure 10. Note that it is slightly more
pixelated than the original grayscale image in Figure 5.
Figure 10: Decompressed and reconstructed image showing signs of pixelation
Analysis
The point of image compression is obviously to reduce the file size of an image by eliminating
redundant pixels and areas. However it is also important that the image can be decompressed and
reconstructed successfully while minimizing the errors in the image. There are a few ways to
analyze the compressive capabilities and the quality of the compression, as previously discussed
in the method description. Examining the compression ratio is the most obvious way to assess
the compressive qualities and calculating the mean square error between the pixels of the
compressed image and the original image is a good way to analyze the quality of the
compression. In addition to those two analyses we will also look at the peak signal to noise ratio
which is a way to relate the power of the maximum signal in the image to the power of the noise
that corrupts the image’s fidelity. In the instance of image compression the signal is the original
image and the noise is the error that compression causes.
Let’s look first at the compression ratio, using a mask that allocates 8 bits to the highest-energy
16 x 16 block in the upper left hand corner of the compressed image, 6 bits to the surrounding
32 x 32 block, 4 bits to the 64 x 64 block that makes up the next level, 2 bits to the 128 x 128
block, and 0 bits to the remainder of the 256 x 256 matrix. This mask seems to give
a good mix of compression and quality. For this mask we see the following compression ratio:
Original Image Size | Compressed File Size | Compression Ratio
48,469 bytes        | 4,023 bytes          | 12:1
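As a rough sanity check on the mask, we can count the bits it would allocate if every entry in each region were stored at the stated depth. This naive tally is an upper-bound sketch only; it does not attempt to reproduce the 4,023-byte figure, which depends on the actual encoding details:

```python
# Naive bit budget for the nested mask: each region keeps its stated
# bit depth, and each region excludes the smaller block it surrounds.

regions = [(16, 8), (32, 6), (64, 4), (128, 2), (256, 0)]  # (side, bits)
total_bits = 0
prev_area = 0
for side, bits in regions:
    area = side * side
    total_bits += (area - prev_area) * bits
    prev_area = area
print(total_bits // 8)   # 5440 bytes for this naive allocation
```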
Now let’s look at the mean square error for several different masks:
And finally we’ll examine the corresponding peak signal to noise ratio (since it’s a function of
mean square error):
These plots show that using more bits to code each pixel reduces the
cumulative error and decreases the amount of noise relative to the peak signal, which is indicative
of higher quality compression. We can get an idea of how our compression ratio of 12:1
compares to other compression ratios by doing a quick online search. Many online
image optimizers give the user the option to compress images at ratios anywhere from 1:1 to
99:1, so our compression trends towards the higher-quality, lower-compression end of the
spectrum, as opposed to higher compression ratios which sacrifice the quality of the reconstructed
image.
We can compare the mean square error results we got to some other studies that have been done
on image compression to see how our results look comparatively. Let’s look at a plot of mean
square error for several different video compression techniques.
The video compression software uses 5 different coding techniques on 59 different frames from a
video clip of football footage and finds the MSE for each frame. The MSE for the frames ranges
from around 150 to about 450. This makes sense compared to our results, especially if we were
using fewer bits per pixel (maybe 2 or 3) to compress each image. In general the MSE is on the
same order of magnitude.
Next we’ll look at a study where the researcher was varying the bits per pixel being used for
several different images.
Once again we see that the order of magnitude of the mean square error is on par with our image
compression results. They get slightly better results using 1 bit per pixel than we did, but other
than that the results seem quite similar.
References
i Gesenhue, Amy. "Study: Load Times For 69% Of Responsive Design Mobile Sites Deemed 'Unacceptable'."
Marketing Land. MarketingLand, 22 Apr. 2014. Web. 19 May 2014.
ii Khoury, Joseph. "Application to Image Compression." Application to Image Compression. University of Ottawa,
n.d. Web. 3 June 2014.
iii Kumar, Satish. "An Introduction to Image Compression." An Introduction to Image Compression. DebugMode, 22
Oct. 2001. Web. 1 June 2014.
Bibliography
"Discrete Wavelet Transform." Wikipedia. Wikimedia Foundation, 06 June 2014. Web. 08 June 2014.
Emery, Ashley. "Wavelets." ME 535 Course Website. University of Washington, n.d. Web. 4 May 2014.
Gesenhue, Amy. "Study: Load Times For 69% Of Responsive Design Mobile Sites Deemed 'Unacceptable'."
Marketing Land. MarketingLand, 22 Apr. 2014. Web. 19 May 2014.
Husen. "Haar Wavelet Image Compression." Ohio State Mathematics. Ohio State University, Winter 2010. Web. 27
May 2014.
"Image Compression." Wikipedia. Wikimedia Foundation, 06 June 2014. Web. 3 June 2014.
Khoury, Joseph. "Application to Image Compression." Application to Image Compression. University of Ottawa,
n.d. Web. 3 June 2014.
Kumar, Satish. "An Introduction to Image Compression." An Introduction to Image Compression. DebugMode, 22
Oct. 2001. Web. 1 June 2014.