Fractal Techniques for Face Recognition
by
Hossein Ebrahimpour-Komleh
M.Sc. Computer Engineering (With Honours)
B.Sc. Computer Engineering (First Class Honours)
PhD Thesis
Submitted in Fulfilment
of the Requirements
for the Degree of
Doctor of Philosophy
at the
Queensland University of Technology
Research Program in Speech, Audio, Image & Video
Technologies
August 2004
Keywords
fractals, subfractals, fractal image-set coding, image coding, face recognition,
image processing, computer vision
To my wife Soheila
and my little daughter Niloufar
Abstract
Fractals are popular because of their ability to create complex images using only
a few simple codes. This is possible by capturing image redundancy and pre-
senting the image in compressed form using the self-similarity feature. For many
years fractals have been used for image compression. In the last few years they have
also been applied to face recognition. In this research we present new fractal meth-
ods for recognition, especially human face recognition.
This research introduces three new methods for using fractals for face recognition:
the use of fractal codes directly as features, fractal image-set coding, and subfractals.
In the first part, the mathematical principle behind the application of fractal
image codes for recognition is investigated. An image Xf can be represented as
Xf = A × Xf + B, where A and B are fractal parameters of the image Xf. Different
fractal codes can be found for any arbitrary image. With the definition of a
fractal transformation, T(X) = A(X − Xf) + Xf, we can define the relationship
between the images produced in the fractal decoding process, starting with any
arbitrary image X0, as Xn = T^n(X0) = A^n(X0 − Xf) + Xf. We show that some
choices for A or B lead to faster convergence to the final image.
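The convergence behaviour described above can be sketched numerically. The following fragment is an illustrative toy, not the thesis implementation: the image is reduced to a vector and the fractal code is abstracted to a single contractive affine map. It shows that the decoding iteration reaches the same fixed point Xf from an arbitrary starting image, at a rate governed by the contractivity of A.

```python
import numpy as np

# Toy model: a fractal code as one affine map X -> A @ X + B with
# contractive A, so the decoding iteration X_{n+1} = A X_n + B converges
# to the fixed point X_f = (I - A)^{-1} B from any starting image X_0.
rng = np.random.default_rng(0)
n = 16                                    # toy "image" as a length-16 vector
A = 0.5 * np.eye(n)                       # contractive: ||A|| = 0.5 < 1
B = rng.standard_normal(n)
X_f = np.linalg.solve(np.eye(n) - A, B)   # the attractor of the code

X = rng.standard_normal(n)                # arbitrary initial image X_0
errors = []
for _ in range(20):
    X = A @ X + B                         # one decoding iteration
    errors.append(np.linalg.norm(X - X_f))

# The error shrinks geometrically: ||X_n - X_f|| <= ||A||^n ||X_0 - X_f||,
# so a smaller contractivity factor gives faster convergence.
assert errors[-1] < 1e-5 * errors[0]
```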
Fractal image-set coding is based on the fact that a fractal code of an arbitrary
gray-scale image can be divided into two parts – geometrical parameters and lumi-
nance parameters. Because the fractal codes for an image are not unique, we can
change the set of fractal parameters without significant change in the quality of
the reconstructed image. Fractal image-set coding keeps the geometrical parameters
the same for all images in the database. Differences between images are captured
in the non-geometrical or luminance parameters, which are faster to compute.
For recognition purposes, the fractal code of a query image is applied to all the
images in the training set for one iteration. The distance between an image and
the result after one iteration is used to define a similarity measure between this
image and the query image.
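The matching rule just described can be sketched as follows. The code and all names are hypothetical, and the query's fractal code is again abstracted to a single affine map rather than a full block-based PIFS: applying the query's code once moves images near its fixed point only a little, so the per-image distance acts as a dissimilarity score.

```python
import numpy as np

# Hypothetical sketch of the one-iteration matching rule.
def one_iteration_distance(A, B, image):
    """||T_q(X) - X|| after a single decoding step of the query's code."""
    return np.linalg.norm(A @ image + B - image)

rng = np.random.default_rng(1)
n = 16
A = 0.5 * np.eye(n)                       # query's (toy) fractal code
B = rng.standard_normal(n)
x_f = np.linalg.solve(np.eye(n) - A, B)   # query's attractor

noise = rng.standard_normal(n)
near = x_f + 0.01 * noise                 # training image resembling the query
far = x_f + 10.0 * noise                  # unrelated training image

scores = {"near": one_iteration_distance(A, B, near),
          "far": one_iteration_distance(A, B, far)}
best = min(scores, key=scores.get)        # smallest distance = best match
assert best == "near"
```

Since T_q(X) − X = (A − I)(X − x_f), the score is proportional to the distance from the query's fixed point, which is what makes it usable as a similarity measure.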
The fractal code of an image is a set of contractive mappings, each of which
transfers a domain block to its corresponding range block. The distribution of
selected domain blocks for range blocks in an image depends on the content of
the image and the fractal encoding algorithm used for coding. A small variation
in a part of the input image may change the contents of the range and domain
blocks in the fractal encoding process, resulting in a change in the transformation
parameters in the same part or even in other parts of the image. A subfractal is a
set of fractal codes related to the range blocks of one part of the image. These codes
are calculated to be independent of the codes of the other parts of the same
image. In this case the domain blocks nominated for each range block must be
located in the same part of the image from which the range blocks come.
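The domain-restriction idea can be sketched in a few lines. The block sizes, the partition into parts, and all function names here are hypothetical, not taken from the thesis: in plain PIFS coding every range block may pick its domain anywhere in the image, whereas a subfractal restricts candidates to the same image part, so the codes of one part stay independent of changes elsewhere.

```python
# Illustrative sketch of restricting domain-block candidates to one part.
def block_grid(width, height, size):
    """Top-left corners of all size x size blocks on a regular grid."""
    return [(x, y) for y in range(0, height - size + 1, size)
                   for x in range(0, width - size + 1, size)]

def contains(part, block):
    """True if the block's top-left corner lies inside the part."""
    px, py, pw, ph = part
    bx, by = block
    return px <= bx < px + pw and py <= by < py + ph

def candidate_domains(range_block, parts, domains):
    """Keep only domain blocks lying in the same part as the range block."""
    part = next(p for p in parts if contains(p, range_block))
    return [d for d in domains if contains(part, d)]

# Two vertical halves of a 64 x 64 image as the "parts".
parts = [(0, 0, 32, 64), (32, 0, 32, 64)]
domains = block_grid(64, 64, 16)          # 16 x 16 domain blocks
left_range = (8, 8)                       # a range block in the left part
allowed = candidate_domains(left_range, parts, domains)
assert all(x < 32 for x, _ in allowed)    # no domains from the right part
```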
The proposed fractal techniques were applied to face recognition using the MIT
and XM2VTS face databases. Accuracies of 95% were obtained with up to 156
images.
Contents
Abstract i
List of Figures viii
List of Tables xv
Acronyms & Units xvi
Certification of Thesis xvii
Acknowledgments xviii
Chapter 1 Introduction 1
1.1 Chaos and Fractals . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Face recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Publications Resulting from research . . . . . . . . . . . . . . . . 4
Chapter 2 Fractal Encoding and Decoding 6
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Features of Fractals . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Mathematical Foundations . . . . . . . . . . . . . . . . . . . . . . 9
2.3.1 Metric Space . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.2 Contractive Transformations . . . . . . . . . . . . . . . . . 10
2.3.3 Fixed Point Theorem . . . . . . . . . . . . . . . . . . . . . 10
2.3.4 Affine Transformation . . . . . . . . . . . . . . . . . . . . 11
2.4 Iterated Function Systems(IFS) . . . . . . . . . . . . . . . . . . . 11
2.5 Principles of Fractal Coding . . . . . . . . . . . . . . . . . . . . . 12
2.5.1 Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.5.2 Transformations . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Chapter 3 Face Recognition 19
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Facial Feature Detection . . . . . . . . . . . . . . . . . . . . . . . 19
3.3 Geometric Feature Based Methods . . . . . . . . . . . . . . . . . 20
3.3.1 Face Recognition Using Principal Component Analysis
(Eigenfaces) . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3.2 Recognition Using Independent Component Analysis (ICA) 22
3.4 Linear Discriminant-Based Method . . . . . . . . . . . . . . . . . 23
3.4.1 Other Methods . . . . . . . . . . . . . . . . . . . . . . . . 24
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Chapter 4 Fractal Codes Directly as Features 27
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Previous Related Work . . . . . . . . . . . . . . . . . . . . . . . . 28
4.2.1 Shape Recognition Using Fractal Geometry . . . . . . . . . 28
4.2.2 Face Recognition Using Fractal Dimensions . . . . . . . . . 28
4.2.3 Face Recognition Using Fractal Neighbor Distances . . . . 29
4.3 Fractal Codes as Features . . . . . . . . . . . . . . . . . . . . . . 29
4.3.1 Fractal Extraction . . . . . . . . . . . . . . . . . . . . . . 30
4.3.2 Normalizing Fractal Codes . . . . . . . . . . . . . . . . . . 36
4.3.3 Accuracy Tests . . . . . . . . . . . . . . . . . . . . . . . . 37
4.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Chapter 5 Fractal Image-set Coding 43
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2 Mathematical Bases . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3 Fractal Image-set Coding . . . . . . . . . . . . . . . . . . . . . . . 45
5.4 Similarity Measurements . . . . . . . . . . . . . . . . . . . . . . . 50
5.4.1 Minkowski-Form Distance . . . . . . . . . . . . . . . . . . 51
5.4.2 Cosine Distance . . . . . . . . . . . . . . . . . . . . . . . . 51
5.4.3 Fractal Similarity Measures . . . . . . . . . . . . . . . . . 52
5.5 Using Fractal Image-set Coding for Face Recognition . . . . . . . 53
5.6 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Chapter 6 Subfractals 65
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.2 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.3 Subfractal Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.4 Mathematical Basis . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.5 How to Use Subfractals for Face Recognition . . . . . . . . . . . . 77
6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Chapter 7 Future Work and Conclusions 85
7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.1.1 Improving the Robustness . . . . . . . . . . . . . . . . . . 87
7.1.2 Face Location and Detection . . . . . . . . . . . . . . . . . 89
7.1.3 Face Recognition Using Subfractals of Eyes and Mouth Area 91
Appendix A Quick Glance Eye-Gaze Tracking System 93
Appendix B Experimental Details 95
B.1 Fractal Codes as Features . . . . . . . . . . . . . . . . . . . . . . 95
B.2 Fractal Image-Set Coding . . . . . . . . . . . . . . . . . . . . . . 96
Bibliography 101
List of Figures
2.1 One of the best examples for understanding the features of fractals
is the fern. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 von Koch’s snowflake with fractal dimension of 1.26 . . . . . . . 13
2.3 Sierpinski triangle, the attractor of an IFS containing 3 contractive
transformations. . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 Examples of pose(XM2VTS face image database [65]), lighting
(AR face image database [61]) and facial expression variations
(CMU-Pitt facial expression database [48]) in face images. . . . . 25
4.1 Domain (bottom) and range (top left) blocks for an image (top
right) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 The eight possible orientations of a block. The orientations consist
of four 90° rotations, a reflection and four more 90° rotations. . . 33
4.3 An illustration of domain and range blocks . . . . . . . . . . . . . 33
4.4 Fractal features of an image (A=Domain index number,
B=Rotation (orientation) index, C=Brightness shift and
D=Contrast factor) displayed as gray values over the quad-
tree partition of the same image. . . . . . . . . . . . . . . . . . . 36
4.5 Typical images from the MIT face database
(ftp:\\whitechapel.media.mit.edu\pub\eigenfaces\pub\images).
Two different frontal views of each person are included. . . . . . 37
4.6 Recognition accuracy using Rotation, Domain index, brightness
and contrast features, independently, and total accuracy achieved
using all features, plotted against the number of the images in the
database as this is progressively increased. . . . . . . . . . . . . . 38
4.7 A query Image and the first 8 closest matches found by the method
(the first four images are the best hit for each feature and the last
four images are the second best hit of that feature). . . . . . . . 39
4.8 A rotated query image and the first 8 closest matches. Note that
the best match image retrieved using the orientation feature is the
correct person. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.9 Another rotated query image and the first 8 closest matches only
using the orientation feature. . . . . . . . . . . . . . . . . . . . . 41
4.10 A query image inverted in grayscale and the first 8 closest matches.
Note that the rotation feature gets both first and second matches
right. The brightness feature does not find the right match because
the change negatives the brightness feature. The other two features
find the right match. . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.1 An illustration of Get-Block and Put-Block operators . . . . . . . 44
5.2 Illustrations of the function T(x) = A × x + B for a one-dimensional
space ℜ: a) s > 1, b) s = 1, c, d, e) s < 1 . . . . . . . . . . . . . 47
5.3 An example of preprocessing with an image in the data-set.(A)
The original image, (B) grayscale image with orientation normal-
ized, (C) Nominated image with face area marked, (D) normalized
histogram equalized face image . . . . . . . . . . . . . . . . . . . 48
5.4 (A) Average image of the data-set, (B) An arbitrary image from
the data-set, (C) Range blocks for image A, (D) The same range
blocks applied to image B . . . . . . . . . . . . . . . . . . . . . . 49
5.5 The initial image and the first, third and fifth iterates of decoding
transformations corresponding to image 005 4 1. . . . . . . . . . . 50
5.6 The PSNR versus the number of decoding steps for 4 different
128× 128 gray-scale, normalized, encoded images of the XM2VTS
database. The dash-dot line, solid line, dashed line and dotted line
correspond to the images 002 1 1, 000 1 1, 003 1 1 and 005 4 1
of the XM2VTS database, respectively . . . . . . . . . . . . . . . 57
5.7 Euclidean distance takes both angle and vector lengths into ac-
count to calculate the distance, while cosine distance only takes
angle into account. . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.8 Convergence trajectories for three different initial images when the
same fractal code is applied iteratively. Note that the initial image
(x03) closest to the fixed point shows the least distance between
successive iterations (d3 < d2 < d1). The fractal parameters are
A = 0.9 × ρ45 and B = (I − 0.9 × ρ45) × xf . . . . . . . . . . . . . 58
5.9 Convergence trajectories for the same three initial images when
the fractal code parameters are A = 0.9 × ρ15 and B = (I − 0.9 ×
ρ15) × xf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.10 Convergence trajectories for the same three initial images when the
fractal code parameters are A = 0.6×ρ45 and B = (I−0.6×ρ45)×xf . 59
5.11 Convergence trajectories for the same three initial images when
the fractal code parameters are A = 0.6 × ρ15 and B = (I − 0.6 ×
ρ15) × xf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.12 An example showing a query image on top followed by the six
closest images in the database. The best match is on the top-left,
followed by others left to right in row-first order. Note that the
first three matches are images of faces of the correct person and
some change in expression is tolerated by the method. . . . . . . 60
5.13 The error(top left) and the similarity (top right) between the query
image and the images in the training data-set. Errors are all very
small. Normalized error (bottom left) and normalized similarity
(bottom right) for the same images. Note that the normalized
similarity measure clearly shows the best matching face number as
9. Values of this measure for other faces are below 0.7 in this case. 60
5.14 Another example showing a correctly identified case. Note here
that there is a more marked change in facial expression and pose. 61
5.15 Yet another correctly identified case. Note that the first three
matches are images of faces of the correct person. . . . . . . . . . 61
5.16 Yet another correctly identified test case. Note here that the query
image is of a light-skinned individual and so are all the 6 closest
matched images. . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.17 A test case that failed. The second closest match is of the correct
individual but the facial hair change is too severe for the method
to cope. The closest matched face appears to be of a different
person but with very similar expression and features such as the eyes
and mouth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.18 Query image (top) and training images (bottom) for the individual
number 019 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.19 The only other test case that failed. The fourth and sixth closest
matches are of the correct individual. . . . . . . . . . . . . . . . . 63
5.20 Query image (top) and training images (bottom) for the individual
number 005 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.21 A plot showing accuracy versus the number of persons in the
database. Three images are used per person in the training set and
one image per person in the test set. . . . . . . . . . . . . . . . . 64
6.1 A distribution of the difference in the x position of the domains xd
and ranges xr for an encoding of the 512 × 512 Lena image, as well as
the theoretical distribution (dashed line) of the difference of two
randomly selected points. Adapted from [35]. . . . . . . . . . . . 66
6.2 A distribution of the difference in the y position of the domains
yd and ranges yr for an encoding of the 512 × 512 Lena image, as well
as the theoretical distribution (dashed line) of the difference of
two randomly selected points. Adapted from [35]. Note that the
distribution is skewed and also has significantly large values close
to 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.3 Range blocks (top left) in four major subfractal areas (eyes, nose
and lips) and corresponding domain blocks (bottom rows) for an
arbitrary face image. At top right, a plot of pixel values vs. pixel
numbers for the last matched domain and range block is shown. . . 71
6.4 A view of the eye-gaze tracking system . . . . . . . . . . . . . . . 78
6.5 A pair of face images shown to volunteers to verify the identity. . 80
6.6 An illustration showing the results of the eye-gaze tracking system
for 10 viewers. Circles (the centers) show the gaze points and the
radius of each circle shows the duration of gaze on that point. . . 80
6.7 Another pair of face images shown to volunteers to verify the iden-
tity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.8 The results of the eye-gaze tracking system show that the eyes, nose
and lips areas are the most important for viewers to verify the identity. 81
6.9 Another pair of face images. The face images are inverted in
grayscale (negative image). . . . . . . . . . . . . . . . . . . . . . 82
6.10 The results of the eye-gaze tracking system for negative images. . 82
6.11 Yet another pair of face images. Note that the left face image is
inverted in grayscale and the right face image is a semi-drawing. . 83
6.12 The results of the eye-gaze tracking system for negative images. . 83
7.1 Block diagram of the fractal face recognition system with PCA
based feature reduction. . . . . . . . . . . . . . . . . . . . . . . . 90
7.2 Matrix showing differences between faces shown on the two axes.
Darker points indicate larger difference. Entries below the diagonal
are pixel-value differences. Entries above the diagonal are fractal-
feature differences. . . . . . . . . . . . . . . . . . . . . . . . . . . 90
B.1 The results of Fractal image-set coding for a subset of the MIT face
database. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
B.2 The results of Fractal image-set coding for the evaluation subset of
XM2VTS database. Arrows showing the position of the threshold
for FRR=0, FRR=FAR and FAR=0 . . . . . . . . . . . . . . . . 98
B.3 The results of Fractal image-set coding for the test subset of
XM2VTS database. Arrows showing the position of the thresh-
old for FRR=0, FRR=FAR and FAR=0 in the evaluation data
set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
List of Tables
4.1 An example of fractal codes. . . . . . . . . . . . . . . . . . . . . . 35
B.1 Error rates obtained using Fractal image-set coding . . . . . . . . 99
B.2 Error rates reported by T. Tan using fractal neighbor distances . 99
Acronyms & Units
bpp bits per pixel
dB decibels
FA False Acceptance
FR False Rejection
ICA Independent Component Analysis
IFS Iterated Function Systems
KLT Karhunen-Loeve Transform
LDA Linear Discriminant Analysis
LDT Linear Discriminant Transform
LED Light Emitting Diode
PCA Principal Component Analysis
PIFS Partitioned Iterated Function Systems
PSNR Peak Signal-to-Noise Ratio
XM2VTS Extended MultiModal Verification for Teleservices and Security
Certification of Thesis
The work contained in this thesis has not been previously submitted for a degree
or diploma at any other higher educational institution. To the best of my
knowledge and belief, the thesis contains no material previously published or
written by another person except where due reference is made.
Signed:
Date:
Acknowledgments
It is not possible to thank everybody who has had an involvement with me during
the course of my Ph.D. However, there are some people who must be thanked. Firstly,
I would like to thank my family and parents, whose encouragement, support and
prayers have helped me achieve beyond my greatest expectations. I thank them for
their understanding, love and patience, especially through the more
difficult and stressful moments. Without their help and support throughout the
years it would not have been possible for me to come this far.
I would like to thank my principal supervisor Dr. Vinod Chandran for his guid-
ance and encouragement throughout my course of study. In addition I must thank
Dr. Chandran for his conscientious reviewing of my conference and journal papers
as well as my thesis draft.
I would also like to thank my associate supervisor, Prof. Sridha Sridharan for the
research environment he has created, as well as the additional financial support he
has provided me through the scholarship top-ups and the financial travel support
for the many conference travels I have undertaken.
In addition, I am appreciative of the financial support of the Iranian Ministry
of Science, Research and Technology through the PhD scholarship I was
awarded. A special acknowledgement goes to Prof. Javad Farhoudi, former Ira-
nian scientific counsellor in Canberra, for his role, support and help during my
study in Australia. I also thank Dr. Kohian for his consideration and help.
Former and current staff and students of the Image and Video Research Labora-
tory must also be acknowledged; I was fortunate to be able to interact and work
with them. Anthony Ngyuen and Jason Pelecanos have been of particular help
to me during my Ph.D. Simon Lucy, John Dines, Michael Mason, David Cole,
and Eddie Wong all deserve special mention for their help at various times.
Hossein Ebrahimpour-Komleh
Queensland University of Technology
August 2004
Chapter 1
Introduction
This thesis fundamentally addresses four related topics: (i) the study of the possibility
of using fractal codes of grayscale images as features for face recognition, (ii)
the study of the mathematical bases for using fractals for recognition, especially face
recognition, (iii) the possibility of designing a fractal coding system more suitable
for recognition, and (iv) theoretical investigations into the definition and use of
subfractals, which are defined to be independent fractal codes of different parts
of an image.
In this thesis, the emphasis is on the use of fractal codes for recognition. Face
recognition has been chosen as an application for testing this concept and the
intention is not to aim for superior face recognition performance by fractal tech-
niques alone. A short introduction to chaos and fractals, as well as to face recognition,
is given below.
1.1 Chaos and Fractals
A fractal is by definition “a set for which the Hausdorff-Besicovitch dimension
strictly exceeds the topological dimension” [57]. Benoit Mandelbrot, who coined
the term fractal and its definition, developed in his classic book “The Fractal
Geometry of Nature” [58] a new geometry of nature that describes many of the
irregular and fragmented patterns around us using fractals. This ability is based
on special features of fractals and their differences from other known models such as
geometrical models. For example, fractals do not have a characteristic length. A
shape usually has a definite scale that characterizes it. Geometric shapes have
their own characteristic length, such as the radius or circumference of a circle and
the edge or diagonal of a square. The length, size or volume of fractals, on the
other hand, cannot be measured with a single unit, as their surfaces are not smooth
and the closer we look, the more complicated the shape appears.
Mandelbrot used this ability of fractals to describe the geometry of natural shapes
such as clouds, mountains and coastlines, which cannot be modelled by simple
geometrical objects like spheres, cones or circles. After the developments in the
field of dynamical systems and chaos, and the discovery of the close relationship
between chaotic dynamical systems and fractals, it is no surprise that fractals can
describe shapes like the fern very well. It is now understood that chaotic dynamics is
inherent in nonlinear deterministic systems with seemingly random behavior. As
a result, chaos and fractals have fascinated scientists from all fields, not only
because of their importance in applications but also because of the beauty of the
geometric patterns produced. In chapter 2 we explain fractals and fractal image
coding methods further.
1.2 Face recognition
Biometrics is an active area of research with a wide range of applications in surveil-
lance, security systems, human-computer interfaces, etc. This term has been used
to refer to the emerging field of technology devoted to automatic identification of
individuals using physiological or behavioral traits. Techniques such as retinal or
iris scanning, hand geometry, speech recognition, fingerprint scanning, signature
verification and face recognition are examples of biometric methods of identifica-
tion which work by measuring unique human characteristics as a way to confirm
identity. Face recognition has the advantage of requiring very little cooperation
or modification of normal behavior on the part of the subjects in order to collect
useful data. But unlike some other biometrics like fingerprints or irises, faces
do not stay the same over time. Facial recognition systems have to deal with
changes in hairstyle, facial hair, spectacles, make-up and aging. Face recognition
is different from other pattern recognition problems such as character recognition.
This difference arises from the fact that in classical pattern recognition, there are
relatively few classes, and many samples per class. With many samples per class,
algorithms can classify samples not previously seen by interpolating among the
training samples. On the other hand, for a typical face recognition task, not only is
there large intra-class variation, but there are also many classes and only a few
samples per class for training.
A facial recognition system often relies, implicitly, on extrapolation from the
training samples. Some variations such as location in an image frame, size and
pose can be removed by preprocessing to align and normalize the face. Eyes are
detected and eye locations are used for this purpose in many face recognition
systems. Feature extraction is an important stage for a typical facial recognition
system. Many feature or template based methods have been proposed for this
task, but there are still many new developments under way. In chapter 3, a review
of some classical face recognition methods is given.
1.3 Thesis Outline
The goal of this thesis is to advance novel fractal recognition systems and their
application in human face recognition. This idea is built based on the theory of
fractal image encoding and decoding which are discussed in Chap. 2. As face
recognition was used as the main application for testing the concepts presented
in this thesis, we review some classical face recognition methods in Chap. 3.
Our fractal techniques for recognition, including fractal codes directly as features,
fractal image-set coding and subfractals are introduced and discussed in Chap. 4,
Chap. 5 and Chap. 6, respectively. The final chapter summarizes the thesis, draws
conclusions, and points out promising directions for future research.
1.4 Publications Resulting from research
The following fully-refereed publications have been produced as a result of work
in this thesis:
Book Chapter
1- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “An Application of
Fractal Image-set Coding in Facial Recognition,” vol. 3072 of Lecture Notes
in Computer Science, Biometric Authentication, pp. 178-186. Springer Ver-
lag, July 2004.
Conference Publications
2- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Fractal Image-set
encoding for Face Recognition,” Proceedings of the International Conference
on Computational Intelligence for Modelling, Control and Automation, pp.
664-672, July 2004.
3- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Facial Image Re-
trieval Using Fractal Image-Set Coding,” 2nd Workshop on Information
Technology and Its Disciplines, Feb. 2004.
4- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Mathematical ba-
sis for use of fractal codes as features,” Proceedings of Image and Vision
Computing, IVCNZ02, vol. 1, pp. 203-208, Nov. 2002.
5- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Robustness to ex-
pression variations in fractal-based face recognition,” Proceedings of the Sixth
International Symposium on Signal Processing and its Applications, vol.
1, pp. 359-362, 2001.
6- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Face recognition
using fractal codes,” Proceedings of the IEEE International Conference on Im-
age Processing, vol. 3, pp. 58-61, 2001.
7- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Face recognition
using fractal codes,” Proceedings of the Third Australasian Workshop on Signal
Processing Applications (WoSPA) 2000, Brisbane, Australia, 2000.
Journal Publications
8- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Subfractals: A new
concept for fractal image coding and recognition,” Complexity International,
Monash University, ISSN 1320-0682 (Submitted).
Chapter 2
Fractal Encoding and Decoding
2.1 Introduction
Fractals, as interesting mathematical sets, were known and studied by math-
ematicians such as Cantor, Poincaré and Hilbert [14] in the late 19th and early 20th
centuries. It was Mandelbrot [56], however, who is widely recognized as having de-
fined the science of fractal mathematics. Iterated function theory, defined by John
Hutchinson [43], was the second step in the development of fractal compression
systems. This theory was later used by Michael Barnsley [3] to formulate the collage
theorem, which describes what a system of iterated functions must be like in order
to produce a fractal image. Arnaud Jacquin, one of Barnsley’s graduate students,
implemented an algorithm that can automatically convert an image into a Par-
titioned Iterated Function System [45]. This algorithm is the basis for most
current fractal coding algorithms. The goal of these algorithms is to
create a series of mathematical processes which produce an accurate repro-
duction of an image. This reproduction using fractal codes is much more compact
than the original picture. Many algorithms [35], [46], [70], [99] have been proposed to use
than the picture. Many algorithms [35], [46], [70], [99] have been proposed to use
these codes for image compression. The remainder of this chapter is organized as
follows. Section 2.2 explains some common features of fractals. Mathematical
foundations, iterated function systems (IFS) and principles of fractal coding are
presented in sections 2.3, 2.4 and 2.5, respectively.
2.2 Features of Fractals
Fractal shapes are characterized by their statistical self-similarity, regular
processes that appear over a range of scales, and non-integer (fractional) dimen-
sion. Despite the intuitive appeal of the concept and its potential for wide
application, the complexity of fractals and the difficulty of visualizing them hindered
their study until recent advances in computer processing. Fractal dimen-
sion can be measured using various methods, including the box-counting method,
i.e., estimating the complexity from the number of boxes used to approximate
the figure at different scales [90]. Fractal figures generally share the following
features in common:
No characteristic length: A shape usually has a definite scale that charac-
terizes it. Geometric shapes, for instance, have their own characteristic
length, such as the radius or circumference of a circle and the edge or diag-
onal of a square. Fractal figures, on the other hand, have no such length.
Their length, size or volume cannot be measured with a single unit, as their
surfaces are not smooth, and the closer we look, the more complicated the
nested surface shape appears. Consequently, we cannot draw a
tangent line to a fractal figure; i.e. it is non-differentiable.
Self-similarity: - Fractal figures are unique in that they cannot be measured
with a single characteristic length, because of the repeated patterns we con-
tinuously discover at different scale levels. In other words, because fractal
figures possess self-similarity, their shape does not change even when observed
at different scales. One of the best examples for understanding this fea-
ture is the fern. As shown in figure 2.1, a small part of the figure, when
enlarged, reproduces the original figure.
Figure 2.1: One of the best examples for understanding the features of fractals is the fern.
Non-integer dimension (fractal dimension): - We normally consider a
point to have a topological dimension of 0. In this sense, a boundary has a
topological dimension of 1, a surface has a dimension of 2 and a solid has
a dimension of 3. However, a complex curve may wander on a surface.
In the case of van Koch's snowflake, shown in figure 2.2, the curve
becomes 4/3 times longer than the original curve every time it grows. Thus,
such a curve will have a fractal dimension that is a real number between 1 and 2.
A complex curve that approaches surface filling will have a fractal dimen-
sion approaching 2. Therefore, the more complex the geographic boundary,
the higher the fractal dimension (in the case of van Koch's curve, we take
log 4/log 3, or 1.26, as its fractal dimension). The actual values of these fractal
dimensions differ slightly, depending on the method used to define them. Currently,
there are several methods that are physically feasible. We can measure frac-
tal dimension by: changing the coarse-graining level (box-counting methods),
using the fractal measure relations, using the correlation function, using the
distribution function, or using the power spectrum.
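The box-counting method mentioned above can be sketched in a few lines. This is an illustrative example, not code from this thesis: the test set, a chaos-game approximation of the Sierpinski triangle (true dimension log 3/log 2 ≈ 1.585), and all parameter values are assumptions chosen to keep the sketch self-contained.

```python
import numpy as np

def box_counting_dimension(points, scales):
    """Estimate fractal dimension from the number of occupied boxes at several scales."""
    counts = []
    for s in scales:
        # Boxes of side 1/s: snap each point to its box index and count distinct boxes.
        boxes = set(map(tuple, np.floor(points * s).astype(int)))
        counts.append(len(boxes))
    # The box-counting dimension is the slope of log(count) against log(scale).
    slope, _ = np.polyfit(np.log(scales), np.log(counts), 1)
    return slope

# Test set: chaos-game approximation of the Sierpinski triangle.
rng = np.random.default_rng(0)
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
p = np.array([0.1, 0.1])
pts = []
for _ in range(100_000):
    p = (p + vertices[rng.integers(3)]) / 2   # jump halfway to a random vertex
    pts.append(p.copy())
pts = np.array(pts[100:])                     # drop the initial transient

dim = box_counting_dimension(pts, scales=[4, 8, 16, 32, 64])
print(round(float(dim), 2))
```

The estimated slope comes out close to the theoretical value of about 1.585.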
2.3 Mathematical Foundations
This section provides basic notation and definitions related to fractal image cod-
ing.
2.3.1 Metric Space
A space M (e.g. the space of compact subsets of R3) is a metric space
if for any two of its elements x and y there exists a real number d(x, y), called
the distance, that satisfies the following properties:
(1) d(x, y) ≥ 0 (non-negativity)
(2) d(x, y) = 0 if and only if x = y (identity)
(3) d(x, y) = d(y, x) (symmetry)
(4) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality)
Cauchy sequence
A sequence {xn}∞n=0 = {xn ∈ M, n ∈ N} is said to be a Cauchy sequence if,
∀ε > 0,∃K ∈ N, such that d(xn, xm) ≤ ε, for all n,m > K
Complete metric space
A metric space (M, d) is complete if every Cauchy sequence of points {xn}∞n=0 in
M has a limit x ∈ M.
2.3.2 Contractive Transformations
A transformation w : M → M is said to be contractive with contractivity factor
s ∈ [0, 1) if for any two points x, y ∈ M, the distance satisfies
d(w(x), w(y)) ≤ s · d(x, y)
Loosely speaking, this formula says that applying a contractive map always
brings points closer together (by some factor less than 1).
Contractive transformations have the nice property that when they are repeatedly
applied, they converge to a point which remains fixed upon further iteration.
2.3.3 Fixed Point Theorem
If the space (M, d) is a complete metric space and w : M 7→ M is a contractive
transformation with contractivity factor s, then
1- There exists one unique fixed point xf ∈ M, which is invariant under w:
w(xf ) = xf
2- For any point x ∈ M, it holds that
lim_{n→∞} wⁿ(x) = lim_{n→∞} w(w(w(. . . (x)))) = xf
(w applied n times)
3- (Collage theorem) For any point x ∈ M, it holds that
d(x, xf ) ≤ (1/(1 − s)) · d(x, w(x))
The fixed point theorem shows how fractal coding of images can be done. We
consider images as points in a metric space and find a contractive transfor-
mation on that space whose fixed point is the image we wish to encode (in
practice it may be an image very close to it). The fixed point theorem guarantees
that the distance between the transformed point (under the contractive trans-
formation) and the fixed point is less than the distance between the initial point
and the fixed point. If we apply the contractive transformation iteratively to an
initial point, the results come closer and closer to the fixed point.
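As a minimal illustration of the theorem, consider an assumed toy contraction on the real line rather than an image transformation; the map and starting point below are arbitrary choices for the example:

```python
def w(x):
    # A contractive map on the real line with contractivity factor s = 0.5;
    # its unique fixed point solves x = 0.5*x + 2, i.e. x_f = 4.
    return 0.5 * x + 2.0

x0 = 100.0           # arbitrary starting point
x_fixed = 4.0

x = x0
for _ in range(60):
    x = w(x)         # repeated application converges to the fixed point

# Collage theorem bound: d(x0, x_f) <= d(x0, w(x0)) / (1 - s), with s = 0.5.
assert abs(x0 - x_fixed) <= abs(x0 - w(x0)) / (1 - 0.5)
print(x)  # -> 4.0 (to within floating-point error)
```

The error shrinks by the factor s = 0.5 at every step, so 60 iterations leave it far below machine-visible precision.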
2.3.4 Affine Transformation
For a gray scale image I, if z denotes the pixel intensity at the position (x, y),
then affine transformation W can be expressed in matrix form as follows:
      [x]   [a b 0] [x]   [e]
    W [y] = [c d 0] [y] + [f]
      [z]   [0 0 s] [z]   [o]
Where a, b, c, d, e, f are geometrical parameters, s is the contrast and o is the
brightness offset (luminance parameters). This transformation can also be written
in linear form W (X) = AX + B, where A is an n × n matrix (in our case n = 3)
and B is an offset vector of size n × 1. Using an affine transformation, we can
scale, rotate or translate an image, and scale the contrast or shift the pixel intensities.
2.4 Iterated Function Systems(IFS)
An iterated function system {W : wi, i = 1, 2, . . . , N} consists of a collection
of contractive affine transformations wi : M 7→ M with respective contractiv-
ity factor si together with a complete metric space (M, d). This collection of
transformations defines a contractive transformation W with contractivity factor
s = max{si, i = 1, 2, . . . , N}. The contractive transformation W on the complete
metric space (M, d) will have a unique fixed point Xf which is also called the
attractor of this IFS.
W (X) = ⋃_{i=1}^{N} wi(X)
W (Xf ) = ⋃_{i=1}^{N} wi(Xf ) = Xf
Figure 2.3 shows an example of the attractor of an IFS with 3 simple contractive
transformations w1, w2, w3, given by:
wi a b c d e f
w1 0.5 0 0 0.5 0 0
w2 0.5 0 0 0.5 0.5 0
w3 0.5 0 0 0.5 0.25 0.5
where each wi has the following form:
       [x]   [a b 0] [x]   [e]
    wi [y] = [c d 0] [y] + [f]
       [z]   [0 0 0] [z]   [1]
Jacquin's method, as well as many other fractal image coding methods, is based
on partitioned iterated function systems (PIFS), a generalization of IFS.
In a PIFS, each transformation wi applies only to a restricted set of domains.
This makes it possible to encode more general images which are not fully self-similar.
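The attractor of the three-map IFS tabulated above can be visualized by iterating W on an arbitrary starting set; by the fixed point theorem the result is independent of the starting point. The following sketch (illustrative code, not from the thesis) applies only the geometrical part of the maps:

```python
import numpy as np

# The three contractive maps w1, w2, w3 tabulated above,
# each of the form (x, y) -> (a*x + b*y + e, c*x + d*y + f).
maps = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),   # w1
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),   # w2
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),  # w3
]

def apply_W(points):
    """One application of W(X) = w1(X) U w2(X) U w3(X)."""
    out = []
    for a, b, c, d, e, f in maps:
        x, y = points[:, 0], points[:, 1]
        out.append(np.column_stack((a * x + b * y + e, c * x + d * y + f)))
    return np.vstack(out)

# Start from a single arbitrary point; the iterates of W converge to the
# attractor (the Sierpinski triangle) regardless of the starting set.
X = np.array([[0.7, 0.3]])
for _ in range(12):
    X = apply_W(X)

print(X.shape[0])   # 3**12 points approximating the attractor
```

Plotting X would reproduce figure 2.3; every iterate stays inside the unit square because each map is contractive.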
2.5 Principles of Fractal Coding
Various schemes of fractal image compression were proposed, which differ in the
partitioning method, class of transformation or type of search used in locating
suitable domain blocks. The first fully automated algorithm for fractal image
Figure 2.2: Van Koch’s snowflake with fractal dimension of 1.26
Figure 2.3: Sierpinski triangle, the attractor of an IFS containing 3 contractive transformations.
compression was proposed by Jacquin [45] in 1989. Until Jacquin's encoder be-
came available, attempts had been made to design fractal encoders that were
supposed to create transformations with the structure of iterated function sys-
tems. Jacquin's method was based on partitioned iterated function systems
(PIFS), a more general type of transformation which exploits the fact that a
part of an image can be approximated by a transformed and down-sampled
version of another part of the same image; this property is called piecewise
self-similarity. A PIFS consists of a complete metric space X, a collection of
sub-domains Di ⊂ X, i = 1, . . . , n and a collection of contractive mappings
wi : Di → X, i = 1, . . . , n.
The encoder works, in principle, as follows:
Range Blocks: -An image to be encoded is partitioned into non-overlapping
range blocks Ri.
Domain Blocks: - The image is also partitioned into larger blocks Dj , called
domain blocks, which may overlap.
Transformation: - The task of a fractal encoder is to find a domain block DRi
of the same image for every range block Ri such that a transformed ver-
sion of this block w(DRi) is a good approximation of the range block. The
contractive transformation w is a combination of a geometrical transforma-
tion and luminance transformation. The transformed version of the domain
block can be rotated, mirrored, contrast-scaled or translated, so the trans-
formation can be written as an affine transformation.
Various schemes of fractal image coding differ in the partitioning method,
the class of transformation or the type of search used in locating suitable domain
blocks.
2.5.1 Partitioning
The first decision to be made when designing a fractal coding scheme is in the
choice of the type of image partition used for the domain and range blocks.
The simplest possible range partition consists of the fixed size square blocks.
Quadtree partitioning employs the well-known image processing technique based
on recursive splitting of selected image quadrants, enabling the resulting partition
to be represented by a tree structure in which each non-terminal node has four
descendants.
A horizontal-vertical (HV) partition, like the quadtree, produces a tree-structured
partition of the image. Instead of recursively splitting quadrants, however, each
image block is split into two by a horizontal or vertical line. Finally, a number
of different constructions of triangular partitions have been investigated. In the
triangular partitioning scheme, a rectangular image is divided diagonally into two
triangles. Each of these is recursively subdivided into four triangles by segment-
ing the triangle along lines that join partitioning points on the three sides
of the triangle.
2.5.2 Transformations
A critical element of a fractal coding scheme is the type of transform selected,
since it determines the convergence properties on decoding, and its quantized pa-
rameters comprise the majority of the information in the compressed representation.
The fixed point theorem states that contractive transformations, through their
fixed points, can be used to represent points in the space. However, the theorem
does not provide a method for finding such transformations.
If we find a suitable contractive transformation W for image Xf , we know that
the fixed point of W is Xf , so
d(xf ,W (xf )) = d(xf , xf ) = 0
It may be very difficult to find an exact transformation W for an arbitrary image
x. Instead, many fractal image encoders aim only to find a transformation W ∗
with attractor x∗f such that d(x, x∗f ) is as small as possible. If the distance
d(x,W (x)) ≤ δ
then the distance from x to its approximation x∗f , which is the attractor of W ,
will be bounded by:
d(x, x∗f ) ≤ δ/(1 − s)
Hence, both δ and s (which is the contractivity factor of W ) should be as small
as possible. Affine transformations are good candidates for this purpose. Each
transformation has two different parts: geometrical and luminance.
The geometrical part of the transformation scales, rotates and translates a do-
main block to fit the range block. To keep the transformation contractive, the
size of a domain block is always bigger than that of the range block, so the scale
factor is always less than 1.
The luminance part consists of a few simple functions, such as a luminance shift
and contrast scaling (again with a contrast factor less than 1).
2.6 Summary
In this chapter the focus has been on a brief introduction to fractals and their
features, such as self-similarity and non-integer dimension, as well as the basic
concepts of fractal image coding. The mathematical basis, including complete
metric spaces, contractive transformations and the fixed point theorem, has been
introduced. Later in this thesis, the use of fractal codes for face recognition is
proposed and discussed.
Chapter 3
Face Recognition
3.1 Introduction
The face is a unique feature of human beings; at the same time, all faces are similar
in their features and structure. During the past several years, face recognition has developed into
a major research area in pattern recognition and computer vision. As one of the
most challenging applications in these fields, face recognition has received signif-
icant attention. Unlike other biometric systems, facial recognition can be used
for general surveillance, usually in combination with public video cameras. This
chapter overviews some of the classic 2D still image face recognition algorithms.
3.2 Facial Feature Detection
Most of the practical face recognition systems need a face detection stage to
detect the location of the face within a source image. Face recognition systems
also normalize the size and orientation of the face to achieve more robustness.
The normalization methods use the locations of significant facial features such
as the eyes, nose or mouth. For example, once the eyes are detected, one is able to
transfer the eyes into pre-determined locations in an image of pre-defined size
using an affine transformation. The importance of robust facial feature detection
for both detection and recognition has resulted in the development of a variety
of different facial feature detection algorithms [2], [20], [59], [66], [89], [106].
Brunelli and Poggio [15], [17] proposed a facial feature detection method which
uses a set of templates to detect the position of the eyes in an image, by looking
for the maximum absolute values of the normalized correlation coefficient of these
templates at each point in the test image. To cope with scale variations, a set of
templates at different scales was used.
The problems associated with scale variation can be solved by using a set of tem-
plates at different scales or using hierarchical correlation as proposed by Burt [18].
3.3 Geometric Feature Based Methods
The geometric feature based approaches [39], [42], [47], [49] are the earliest ap-
proaches to face recognition and detection. These approaches were focused on
detecting individual features such as eyes, ears, head outline and mouth, and
measuring different properties such as eyebrow thickness and their vertical posi-
tion or nose position and width, in a feature vector that is used to represent a
face. To recognize a face, first feature vectors of the test image and the images in
the database are obtained. Second, a similarity measure between these vectors,
most often a minimum distance criterion, is used to determine the identity of the
face.
Brunelli and Poggio [16] compute a set of geometrical features such as nose width
and length, mouth position, and chin shape. They report a 90% recognition rate
on a database of 47 people. However, they show that a simple template-matching
scheme provides 100% recognition for the same database.
3.3.1 Face Recognition Using Principal Component Anal-
ysis (Eigenfaces)
Principal component analysis (PCA) [37], is a simple statistical dimensional-
ity reducing technique that has perhaps become the most popular and widely
used method for the representation and recognition of human faces. PCA, via the
Karhunen-Loeve transform, can extract the most statistically significant information
for a set of images as a set of eigenvectors (usually called eigenfaces [96] when
applied to faces), which can be used both to recognize and reconstruct face im-
ages. This method proposed by Turk and Pentland [75], [96], [97] is motivated by
the earlier work of Sirovitch [88] and Kirby [50] for efficiently representing face
images. Once the face images are normalized for eye position, they can be treated
as a 1-D array of pixel values. The eigenvectors of the covariance matrix C of the
ensemble of training faces are called eigenfaces. The space spanned by the eigen-
vectors vk, k = 1, . . . , K, corresponding to the K largest eigenvalues of the covariance
matrix, is called the face space. Eigenvectors can be regarded as a set of gen-
eralized features, which characterize the image variations in the database. Each
image has an exact representation via a linear combination of these eigenvectors
and an arbitrarily close approximation using the K most significant eigenvectors.
The number of eigenvectors chosen determines the dimensionality of face space.
A new face image is transformed into its eigenface components by projection
onto the face space. The projections form the feature vector, which describes
the contribution of each eigenface in representing the input image. A test
image is recognized by computing the Euclidean distance in the feature space
and selecting the closest match. The effect of lighting conditions on the
KLT-based method has been detailed in [31]. The eigenface method has also
been used for face detection [67],[68] by measuring the distance from each lo-
cal pattern in a test image to the face space defined by the eigenfaces. In [1],
Akamatsu et al. applied the eigenface method to the magnitude of the Fourier
spectrum of the images after normalization with respect to illumination and scale.
Due to the shift-invariance property of the magnitude of the Fourier spectrum, and
to the illumination and scale normalization, the method, called the Karhunen-
Loeve Transform of the Fourier Spectrum in the Affine Transformed Target Images
(KL-FSAT), performed better than the classical eigenfaces method under variations
in head orientation and shifting.
In summary, PCA is a very efficient signal encoder, designed specifically to
characterize and encode variations rather than ignore them. Thus it may find
the optimal low-dimensional representation, but this may be more useful for re-
construction than for recognition. In addition, the eigenface method is not
invariant to image transformations such as scaling, shifting or rotation in its
original form, and requires complete relearning of the training data to add new
individuals to the database.
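The eigenface pipeline described above (mean subtraction, eigenvectors of the covariance matrix, projection onto the face space, nearest-neighbour matching) can be sketched as follows. This is an illustrative toy, not the systems cited above: the random vectors stand in for normalized, flattened face images, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for a training set: 20 "face images" of 16x16 pixels,
# already normalized and flattened to 1-D arrays of 256 pixel values.
faces = rng.normal(size=(20, 256))

mean_face = faces.mean(axis=0)
A = faces - mean_face                 # centred data, one image per row
# Rows of Vt are the eigenvectors of the covariance matrix (the eigenfaces).
U, S, Vt = np.linalg.svd(A, full_matrices=False)
K = 10
eigenfaces = Vt[:K]                   # face space: the K most significant eigenvectors

def project(img):
    """Feature vector: projection of a flattened image onto the face space."""
    return eigenfaces @ (img - mean_face)

train_feats = np.array([project(f) for f in faces])
query = faces[7] + rng.normal(scale=0.1, size=256)   # noisy copy of face 7
dists = np.linalg.norm(train_feats - project(query), axis=1)
best = int(np.argmin(dists))          # nearest neighbour in feature space
print(best)  # -> 7
```

The noisy query projects close to its original in the K-dimensional face space, so the minimum-distance rule recovers the correct identity.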
3.3.2 Recognition Using Independent Component Analy-
sis (ICA)
Independent Component Analysis (ICA) is a statistical method for transforming
an observed multidimensional random vector into components that are mutually
as independent as possible. This technique can be used for extracting statistically
independent variables from a mixture of them [22]. In a classical example, two
people in the same room speak simultaneously and two microphones are placed
at different locations recording the mixed conversations. ICA can be used to
estimate the contribution coefficients from the two signals, which allows us to
separate the two original signals from each other, assuming that the two speech
signals are statistically independent. The tutorial [44] written by Hyvarinen and
Oja contains more details about the algorithms involved.
Bartlett and Sejnowski have used Independent Component Analysis (ICA) for
face recognition [5], [6], [7]. Two approaches for recognizing faces across changes
in pose were explored using ICA. In the first architecture, a set of statistically
independent basis images for the faces was provided. This set can be viewed as
a set of independent facial features. Unlike the PCA basis vectors, these ICA
basis images were spatially local. The representation consisted of the coefficients
for the linear combination of basis images that comprised each face image. The
second architecture produced independent coefficients. This provided a facto-
rial face code, in which the probability of any combination of features can be
obtained from the product of their individual probabilities. Classification was
performed using nearest neighbour, with similarity measured as the cosine of the
angle between representation vectors. Both ICA representations showed better
recognition scores than PCA when recognizing faces across sessions with changes
in expression and changes in pose.
3.4 Linear Discriminant-Based Method
In [8], [9], [33] the authors proposed a new method for face recognition using
Fisher’s Linear Discriminant Transform (LDT) [34], [37]. The Fisherface method
uses the class membership information and develops a set of feature vectors in
which variations between different faces are emphasized while different instances
of faces due to illumination condition, facial expressions, and orientations, are de-
emphasized. In other words, LDT finds the line that best separates the points.
Each test image is projected onto the optimal LDT space and the resulting set of
coefficients is used to compute the Euclidean distance from the images in training
set. The Fisherface method has also been applied to face detection from color
images [86]. In [1], Akamatsu, Sasaki and Suenaga applied LDA to the magnitude
of the Fourier spectrum of the intensity image. The results reported by the authors
showed that LDA in the Fourier domain is significantly more robust to variations
in lighting than LDA applied directly to the intensity images. In [54], the authors
proposed another LDA-based method.
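Fisher's criterion, finding the projection that maximizes between-class separation relative to within-class scatter, can be illustrated with a minimal two-class sketch. The synthetic 2-D feature vectors below are assumptions for the example, not data from any of the cited works:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two toy classes of 2-D feature vectors (stand-ins for two faces
# measured under varying conditions).
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
X2 = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2))

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
# Within-class scatter matrix; Fisher's discriminant maximizes the
# between-class separation relative to this scatter.
Sw = np.cov(X1.T) * (len(X1) - 1) + np.cov(X2.T) * (len(X2) - 1)
w = np.linalg.solve(Sw, m2 - m1)      # optimal 1-D projection direction

p1, p2 = X1 @ w, X2 @ w               # classes projected onto the discriminant
separated = p1.mean() < (p1.mean() + p2.mean()) / 2 < p2.mean()
print(separated)  # -> True
```

A test point would be projected onto w and assigned by distance to the projected class means, mirroring the Euclidean-distance rule described above.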
3.4.1 Other Methods
Other popular face recognition approaches that will only be mentioned in this re-
port include Dynamic Link Matching [100], [101], Matching Pursuit-Based Meth-
ods [55], [76], [77], [78], Hidden Markov Model Based Methods [71], [85] and
Face Recognition by Elastic Bunch Graph Matching [102], [103], [104]. Fractal-
based approaches [24], [23], [25], [26], [29], [27], [28], [30], [52], [93] are a new
application of fractals which will be presented in the next chapter. Some other
publications that describe the latest achievements as well as currently unsolved
issues of face recognition are as follows: [107], [87], [10], [40], [95], [19], [84].
3.5 Summary
Face recognition systems work very well under constrained conditions, such as
frontal mug-shot images and consistent lighting. In real-world use, face recognition,
like other biometrics, suffers from several usability problems. The face is a changeable
social organ (see figure 3.1) displaying a variety of possible presentations. Human
facial expressions change the shape of facial components such as the eyes,
mouth and eyebrows. Artificial changes include cuts and bandages from injuries
or wearing glasses; fashion-related factors such as makeup and jewelry also change
face images. Some changes occur with time, such as the growth and removal
of facial hair and the wrinkling of the skin caused by aging. It has been shown that
using facial images taken at least one year apart can cause error rates of 43%
[80] to 50% [74]. A facial image is a 2D view (projection) of a 3D surface. Viewing
angle, pose and illumination (changes in sunlight intensity) can affect this projec-
tion. For example, when the face tilts left-right or up-down, the 2D view changes.
These changes in gray level will cause features to change. Face recognition algo-
rithms appear to be sensitive to deviations from ideal conditions. The FRVT evaluation
report [12] shows high error rates even under those ideal conditions.
Figure 3.1: Examples of pose (XM2VTS face image database [65]), lighting (AR face image database [61]) and facial expression variations (CMU-Pitt facial expression database [48]) in face images.
Different performance evaluation tests, such as FERET [82], [81], [83], FRVT [12], [79] and
XM2VTS [63], [65], show significant improvements in face recognition technology.
However, there are still areas which require further research and development.
Chapter 4
Fractal Codes Directly as
Features
4.1 Introduction
Using fractals for object or shape recognition is a relatively new application of
fractal image encoding. The goal of the fractal image encoding algorithms is to be
able to create a series of mathematical processes which would produce an accurate
reproduction of an image. For many years, fractal encoding was a technique
for image compression. Fractal codes are much more compact than the original
image and many algorithms [4], [13], [35], [36], [38], [45], [69], [70] [98], [99], [105]
have been proposed to use these codes for image compression. In this chapter
another application of fractal codes is proposed. Fractal codes have the ability to
reproduce an image (or at least a good approximation of it) through a set of contractive
transformations. These transformations can be written in simple affine form and
recorded with a few simple parameters. This compact representation of
images has proven useful in image compression, but is it possible to use these
codes for recognition too? In this chapter, a brief explanation of some previous
related work is given in section 4.2. In section 4.3, the use of fractal codes as features
for recognition, especially face recognition, is described, and different aspects of
this method, such as fractal extraction, normalization of the fractal codes, accuracy
testing and improving robustness, are discussed. Other original fractal techniques
for face recognition are explained in chapters 5 and 6.
4.2 Previous Related Work
4.2.1 Shape Recognition Using Fractal Geometry
Neil [72], [73] proposed one of the first methods for using fractal techniques in
shape recognition. His method is based on the comparison of a transformation
and an object. To compare two different objects (shapes), the method first finds
an associated transformation for the object being identified. Then a comparison
between the transformation and the other object is made by applying the trans-
formation to that object. The object will remain unchanged if and only if the
transformation is an associated transformation for that object. This method is
based on the binary representation of shapes (black and white images) and achieves
some invariance to rotation by assigning standard orientations to shapes.
4.2.2 Face Recognition Using Fractal Dimensions
Kouzani [51] proposed a face recognition method based on the fractal dimension.
In his method, each pixel of an image is replaced by the fractal dimension of the region
around that pixel. To handle the shortcomings of fractal dimension calculation,
he used the average of the fractal dimensions computed for regions of different
sizes around the pixel. To compare two images, he presented these fractal
dimension maps to a normalized cross-correlation stage in which the best match
is chosen.
In another work [52], Kouzani used two feed-forward neural networks. The first
one implements the search process for matching range and domain blocks in the
face image. The second one compares the fractal code of a query image and the
fractal code of the known face in the database. Kouzani claimed that the second
neural network calculates the degree of similarity between the two fractal face
models, but did not explain how.
4.2.3 Face Recognition Using Fractal Neighbor Distances
Yan and Tan [93], [92], [94] extended Neil's method to gray-scale images. In their
method, a database of fractal codes for the set of training face images is generated
first. Then, for any unknown query image Iq and any known training image I
with fractal code WI , the fractal neighbor distance Υ(WI , Iq) = d(WI(Iq), Iq) is
calculated and compared with the others (d is a Euclidean distance function). The
code Wmin with minimum Υ(Wmin, Iq) is taken as the best match.
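A toy sketch of the fractal neighbor distance follows. The "fractal code" here is a stand-in contraction whose fixed point is the training image, not a real PIFS code; all data are synthetic, chosen only to show why d(W(Iq), Iq) is small when Iq is close to the attractor of W:

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_fractal_code(img, s=0.5):
    """Stand-in for the fractal code of `img`: the contraction
    W(x) = s*x + (1 - s)*img, whose unique fixed point is exactly `img`."""
    return lambda x: s * x + (1 - s) * img

train = [rng.normal(size=64) for _ in range(5)]   # training "images"
codes = [toy_fractal_code(t) for t in train]

# Fractal neighbor distance of query Iq to code W: d(W(Iq), Iq).
query = train[3] + rng.normal(scale=0.05, size=64)
fnd = [np.linalg.norm(W(query) - query) for W in codes]
best = int(np.argmin(fnd))
print(best)  # -> 3
```

Applying the matching code moves the query only slightly (it is already near that code's fixed point), while the other codes pull it far toward their own attractors.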
This method was also used for face verification. The system comprised two com-
ponents: face detection and face verification subsystems. The location of the
head was detected based on the result of a search in the reduced region using the
fractal neighbor distance between a generic face template and a portion of the
image. The verification subsystem also used the fractal neighbor distance to com-
pute and find the minimal distance between the localized head image and the images
stored in the XM2VTS database. The results were reported in [63].
4.3 Fractal Codes as Features
Since fractal encoding algorithms can be applied to any gray-scale image, we can
say that any gray-scale image can be approximated by the attractor of a fractal
code. An image xf is the attractor of the fractal code W (x) if xf is the fixed point (see
sec. 2.3.3) of the fractal code: W (xf ) = xf . Since fractal representations are
transformations that apply between one part of an image and another, some parts
of the code could be robust to many types of degradations that affect both
parts (domain and range blocks) similarly.
This section describes the first system proposed in this thesis, which is based on
the use of the fractal code of an image as a feature for recognition.
4.3.1 Fractal Extraction
In fractal image coding, the code for an image x is an efficient binary representa-
tion of a set of contractive affine transformations W whose unique fixed point xf
is a good approximation to x. The fractal coding algorithm used in this system
can be described as follows:
1- Partition the image to be encoded into non-overlapping range blocks Ri using
quad-tree partitioning.
2- Cover the image with a sequence of possibly overlapping domain blocks Dj.
3- For each range block, find the domain and corresponding transformation that
best match the range block.
4- Save the geometrical positions of the range block and the matched domain block,
as well as the matching transformation parameters, as the fractal code of the image.
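The four steps can be sketched in a deliberately minimal encoder/decoder pair. This is an illustration of the structure under simplifying assumptions, not the thesis's implementation: it uses fixed-size range blocks instead of quadtree partitioning, non-overlapping domains, and only the identity orientation.

```python
import numpy as np

def downsampled_domains(img, rsize):
    """Non-overlapping domain blocks of size 2*rsize, averaged down to rsize."""
    dsize, doms = 2 * rsize, []
    for y in range(0, img.shape[0] - dsize + 1, dsize):
        for x in range(0, img.shape[1] - dsize + 1, dsize):
            D = img[y:y + dsize, x:x + dsize]
            doms.append(D.reshape(rsize, 2, rsize, 2).mean(axis=(1, 3)))
    return doms

def encode(img, rsize=4):
    """For every range block, pick the domain and (s, o) minimizing the error."""
    doms = downsampled_domains(img, rsize)
    code = []
    for y in range(0, img.shape[0], rsize):
        for x in range(0, img.shape[1], rsize):
            R = img[y:y + rsize, x:x + rsize]
            best = None
            for j, d in enumerate(doms):
                db, rb = d.mean(), R.mean()
                beta = ((d - db) ** 2).sum()
                s = 0.0 if beta == 0 else ((d - db) * (R - rb)).sum() / beta
                s = float(np.clip(s, -0.9, 0.9))   # keep the map contractive
                o = rb - s * db
                err = ((s * d + o - R) ** 2).sum()
                if best is None or err < best[0]:
                    best = (err, j, s, o)
            code.append((y, x) + best[1:])
    return code

def decode(code, shape, rsize=4, n_iter=10):
    """Iterate the coded transformation from an arbitrary (here: zero) image;
    by the fixed point theorem it converges to the attractor."""
    img = np.zeros(shape)
    for _ in range(n_iter):
        doms = downsampled_domains(img, rsize)
        out = np.zeros(shape)
        for y, x, j, s, o in code:
            out[y:y + rsize, x:x + rsize] = s * doms[j] + o
        img = out
    return img

# Round-trip on a small synthetic image.
xx, yy = np.meshgrid(np.arange(16.0), np.arange(16.0))
img = xx + yy
rec = decode(encode(img), img.shape)
ok = bool(np.abs(rec - img).mean() < 1.0)
print(ok)  # -> True
```

The decoder never sees the original image, only the code; the reconstruction error shrinks by roughly the contractivity factor at each iteration.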
Quadtree partitioning
The quadtree partitioning method employs the well-known image processing tech-
nique based on recursive splitting of selected image quadrants, enabling the re-
sulting partition to be represented by a tree structure in which each non-terminal
node has four descendants. The usual top-down construction starts by selecting
an initial level in the tree, corresponding to some maximum range block size. In
order to produce contractive transformations, range blocks not smaller than the
largest domain blocks are subdivided into smaller range blocks. Each range block
larger than a preset limit is recursively partitioned if a match with one of the
domain blocks in the domain pool better than some preselected threshold is not
found. In figure 4.1 (top left) a sample of quadtree partitioning is shown.
Note that a region containing detail is split into smaller blocks in the process
of finding a sufficiently good match.
Figure 4.1: Domain (bottom) and range(top left) blocks for an image (top right)
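The recursive splitting can be sketched as follows. This is an assumed simplification: block variance stands in for the "no sufficiently good domain match" test, so the example stays self-contained; in the actual coder the split decision comes from the domain search.

```python
import numpy as np

def quadtree(block, x, y, min_size, threshold, out):
    """Split `block` recursively until its pixel variance drops below
    `threshold` (a stand-in for 'a good enough domain match was found')
    or the block reaches `min_size`; leaves become (x, y, size) range blocks."""
    size = block.shape[0]
    if size <= min_size or block.var() <= threshold:
        out.append((x, y, size))
        return
    h = size // 2
    quadtree(block[:h, :h], x, y, min_size, threshold, out)
    quadtree(block[:h, h:], x + h, y, min_size, threshold, out)
    quadtree(block[h:, :h], x, y + h, min_size, threshold, out)
    quadtree(block[h:, h:], x + h, y + h, min_size, threshold, out)

# 16x16 test image: flat background, detailed top-left 8x8 corner.
img = np.zeros((16, 16))
img[:8, :8] = np.arange(64).reshape(8, 8) % 7
leaves = []
quadtree(img, 0, 0, min_size=4, threshold=0.5, out=leaves)
print(len(leaves))   # the detailed corner splits; the flat regions do not
```

Only the detailed quadrant is subdivided, reproducing the behaviour visible in figure 4.1.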
Domain blocks
The task of a fractal encoder is to find a domain block D of the same image
for every range block such that a transformation of this block W (D) is a good
approximation of the range block. In order to have contractive transformations,
the domain block should be bigger than the range block. The number of different
sizes of domain blocks and how much overlap is allowed are two important pa-
rameters of the system. Figure 4.1 (bottom) shows domain blocks of two different
sizes, 8×8 and 16×16. Note that in this example the domain blocks of the same
size do not overlap, but each domain block of the larger size overlaps 4 domain
blocks of the smaller size.
Mapping domains to ranges
The main computational step in fractal image coding is the mapping of domains
to range blocks. For each range block, the algorithm compares transformed ver-
sions of the domain blocks to the range block. The transformations are typically
affine. Each transformation W is a combination of a geometrical transformation
and a luminance transformation. For a gray-scale image I, if
z denotes the pixel intensity at the position (x, y), then W can be expressed in
matrix form as follows:
      [x]   [a b 0] [x]   [e]
    W [y] = [c d 0] [y] + [f]     (4.1)
      [z]   [0 0 s] [z]   [o]
Coefficients a, b, c, d, e and f control the geometrical aspects of the transformation
(skewing, stretching, rotation, scaling and translation), while the coefficients s and
o determine the contrast and brightness of the transformation and together make
up the luminance parameters. The geometrical parameters of the transformation
are limited to a rigid translation, a contractive size-matching, and one of eight orien-
tations. The orientations consist of four 90° rotations, and a reflection followed
by four 90° rotations, as shown in figure 4.2.
Figure 4.2: The eight possible orientations of a block. The orientations consist of four 90° rotations, and a reflection followed by four more 90° rotations.
Figure 4.3: An illustration of domain and range blocks
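The eight orientations of Figure 4.2 can be generated with NumPy as in the sketch below; the thesis does not specify an implementation, so this is only one possible realization.

```python
import numpy as np

def orientations(block):
    """The 8 orientations of a square block (cf. Figure 4.2): four
    90-degree rotations, and a reflection followed by the same four
    rotations."""
    rots = [np.rot90(block, k) for k in range(4)]
    flipped = np.fliplr(block)
    return rots + [np.rot90(flipped, k) for k in range(4)]
```

These eight maps form the symmetry group of the square, so applying any of them to a square block yields another valid block of the same size.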
Domain-range comparison is a three-step process. First, one of the eight basic orientations is applied to the selected domain block Dj. Next, the rotated domain is shrunk to match the size of the range block Rk; the range must be smaller than the domain in order for the overall mapping to be a contraction. Finally, optimal contrast and brightness parameters are computed using least-squares fitting. Representing the image as a set of transformed blocks does not give an exact copy of the original image, but a close approximation of it. Minimizing the error between W(Dj) and Rk will minimize the error between the original image and the approximation. Let ri and di, i = 1, . . . , n, denote the pixel values of the two equal-size blocks Rk and shrink(Dj). The error Err is defined as:
Err = \sum_{i=1}^{n} (s \cdot d_i + o - r_i)^2    (4.2)
The minimum of Err occurs when the partial derivatives with respect to s and o are zero:

Err = n \cdot o^2 + \sum_{i=1}^{n} (s^2 d_i^2 + 2 s d_i o - 2 s d_i r_i - 2 o r_i + r_i^2)    (4.3)

\frac{\partial Err}{\partial s} = \sum_{i=1}^{n} (2 s d_i^2 + 2 d_i o - 2 d_i r_i) = 0    (4.4)

\frac{\partial Err}{\partial o} = 2 n o + \sum_{i=1}^{n} (2 s d_i - 2 r_i) = 0    (4.5)
which occurs when:
s = \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}    (4.6)

o = \frac{1}{n} \left[ \sum_{i=1}^{n} r_i - s \sum_{i=1}^{n} d_i \right]    (4.7)
These two equations can be simplified as:

s = \frac{\alpha}{\beta}    (4.8)

o = \bar{r} - \left(\frac{\alpha}{\beta}\right) \bar{d}    (4.9)

where:

\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i    (4.10)

\bar{r} = \frac{1}{n} \sum_{i=1}^{n} r_i    (4.11)

\alpha = \sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})    (4.12)

\beta = \sum_{i=1}^{n} (d_i - \bar{d})^2    (4.13)
Proof:

s = \frac{\alpha}{\beta} = \frac{n\alpha}{n\beta} = \frac{n \sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{n \sum_{i=1}^{n} (d_i - \bar{d})^2}

= \frac{n \sum_{i=1}^{n} d_i r_i - n \bar{d} \sum_{i=1}^{n} r_i - n \bar{r} \sum_{i=1}^{n} d_i + n^2 \bar{d} \bar{r}}{n \sum_{i=1}^{n} d_i^2 - 2 n \bar{d} \sum_{i=1}^{n} d_i + n^2 \bar{d}^2}

= \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i - \sum_{i=1}^{n} r_i \sum_{i=1}^{n} d_i + \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - 2 \left(\sum_{i=1}^{n} d_i\right)^2 + \left(\sum_{i=1}^{n} d_i\right)^2}

= \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}

and

o = \bar{r} - \left(\frac{\alpha}{\beta}\right) \bar{d} = \frac{1}{n} \sum_{i=1}^{n} r_i - s \cdot \frac{1}{n} \sum_{i=1}^{n} d_i = \frac{1}{n} \left[ \sum_{i=1}^{n} r_i - s \sum_{i=1}^{n} d_i \right]
Sample fractal codes of an image are as shown here:

Table 4.1: An example of fractal codes.

Quadtree parameters | Domain index | Orientation | Brightness | Contrast
1 1 1 1 0 0 | 1 | 6 | 111 | 0.111
1 1 1 2 0 0 | 1 | 7 | 301 | -0.130
1 1 1 3 0 0 | 1 | 6 | 194 | 0.003
1 1 1 4 0 0 | 1 | 5 | 67 | 0.165
1 1 2 1 0 0 | 1 | 2 | 324 | -0.157
1 1 2 2 0 0 | 1 | 2 | 274 | -0.094
1 1 2 3 0 0 | 1 | 5 | -522 | 0.900
1 1 2 4 0 0 | 1 | 7 | 216 | -0.025
1 1 3 1 0 0 | 1 | 5 | -47 | 0.305
1 1 3 2 1 0 | 7 | 1 | 128 | 0.022
... | ... | ... | ... | ...
The first six columns contain the quadtree parameters, which give the geometrical positions of the range blocks. The next column is the domain index number, which uniquely locates the position of the domain block using some preset parameters such as the size of domain blocks, the number of different domain sizes, and the overlap factor. The next column contains the orientation index, a number between 0 and 7, and the last two columns are the brightness and contrast factors o and s respectively. In this system, the last 4 columns (domain index number, rotation (orientation) index, brightness and contrast factor) are used as fractal features for recognition.
4.3.2 Normalizing Fractal Codes
Each fractal feature used in this system is a vector, so each image has 4 feature vectors of the same size. The size of each vector, however, varies from one image to another: it depends on the number of range blocks, which in turn depends on the partitioning threshold, the size of the image, the image complexity, and the minimum size of range and domain blocks. In order to normalize the size of each vector, we use the quad-tree partitioning geometry and apply each feature value at its geometrical position (as can be seen in figure 4.4). Because quad-tree partitioning can be applied to an image of any arbitrary size, we can resize all feature vectors to the size of the query image. This makes our method robust to size and scale changes. For classification we used the Peak Signal-to-Noise Ratio (PSNR) between the feature vectors of the query image and the feature vectors of all images in the database as a measure of distance, with a minimum-distance classifier.
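As a sketch of the distance measure used here, PSNR between two equal-size feature maps might be computed as follows; the peak value of 255 is an assumption for 8-bit data, and `psnr` is a name chosen for illustration.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio (in dB) between two equal-size
    arrays; higher PSNR means the arrays are more similar."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mse = np.mean((a - b) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

A minimum-distance classifier then simply picks the database image whose feature maps give the highest PSNR against the query's.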
Figure 4.4: Fractal features of an image (A = domain index number, B = rotation (orientation) index, C = brightness shift and D = contrast factor) displayed as gray values over the quad-tree partition of the same image.
4.3.3 Accuracy Tests
To initially test our system, we used a subset of the MIT face database. This version of the MIT face database consists of 2 face images from 90 subjects, for a total of 180 images, with some variation in illumination, scale and head orientation. Figure 4.5 shows some examples from this face database.

Figure 4.5: Typical images from the MIT face database (ftp:\\whitechapel.media.mit.edu\pub\eigenfaces\pub\images). Two different frontal views of each person are included.
We first used each feature separately for classification, to obtain some idea of its information content, that is, its ability to discriminate between faces. Classification accuracy was plotted as a function of the number of images tested as this number grew from 1 to 180, the size of the database. It was found that the orientation parameter alone yielded an accuracy of about 72% and the domain index alone about 64% on this data. The other two features yielded lower accuracy (as can be seen in figure 4.6). The use of all four features resulted in a total accuracy of close to 88.5% for a small database size. The accuracy tended to level off around 85%.
This suggests that, in a fractal representation of the face, the information about which parts are self-similar to which other parts, and the orientation differences between these parts, is more useful for recognition than the transformations between 'averaged' pixel gray-level descriptions, such as brightness and contrast, from domain to range. Lighting variations are also more likely to affect brightness and contrast more significantly than the other two features.

Figure 4.6: Recognition accuracy using the rotation, domain index, brightness and contrast features independently, and the total accuracy achieved using all features, plotted against the number of images in the database as this is progressively increased.
Figure 4.7 shows the results of using this method to retrieve the closest 8 images to
a given image. The first four images are the best hit for each feature and the last
four images are the second-best hit of that feature. In figures 4.8 and 4.9 the robustness of this method to rotation is demonstrated. In the first test, a 180° rotated version of a database image was used as the query; using only the orientation feature vector, the method was able to pick the correct identity as the closest match. In the second test, again using only the orientation feature vector, the method picked the correct identity as the closest match and the second view of the same person as the next closest match. It is interesting to note that the third and fourth best matches are subjectively similar to the query image from a human visual point of view. The other matches are of the wrong gender but share some similarities in overall appearance and shape.

Figure 4.7: A query image and the first 8 closest matches found by the method (the first four images are the best hit for each feature and the last four images are the second-best hit of that feature).
In figure 4.10, the query image is inverted in grayscale. All the fractal features except the brightness feature find the right match; the rotation feature gets both the first and second matches right. This happens because the inversion affects range and domain blocks similarly, and the positions of range and domain blocks are unchanged. Thus, if the domain block Dj was the best match for the range block Rk in the original image, it is still the best match even after inverting the pixel values. This shows that the first two fractal features (domain index number and orientation index) are not affected by this change. The effect of this change on the other two fractal features (brightness and contrast factor) can be shown by an example.
Figure 4.8: A rotated query image and the first 8 closest matches. Note that the best-match image retrieved using the orientation feature is the correct person.
Example
Suppose n = 3 and R = [1, 2, 3], D = [4, 5, 6] are the range and shrunk domain blocks. The brightness and contrast factors o and s are calculated as in equations (4.8) and (4.9):
s = \frac{\alpha}{\beta} = \frac{\sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{\sum_{i=1}^{n} (d_i - \bar{d})^2}

s = \frac{(-1)(-1) + (0)(0) + (1)(1)}{(-1)^2 + (0)^2 + (1)^2} = 1

o = \bar{r} - s\bar{d} = 2 - (1)(5) = -3

If the image is inverted in grayscale, the range and domain blocks become R = [255, 254, 253] and D = [252, 251, 250], and o, s will be:

s = \frac{(1)(1) + (0)(0) + (-1)(-1)}{(1)^2 + (0)^2 + (-1)^2} = 1

o = \bar{r} - s\bar{d} = 254 - (1)(251) = 3
This example clearly shows that inverting the pixel values in a grayscale image
only changes the brightness feature and does not change the other features.
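The arithmetic of this example can be checked numerically; the small helper below simply re-applies equations (4.8) and (4.9) to the two pairs of blocks.

```python
import numpy as np

def s_o(d, r):
    """Contrast s and brightness o via equations (4.8) and (4.9)."""
    d, r = np.asarray(d, float), np.asarray(r, float)
    alpha = np.sum((d - d.mean()) * (r - r.mean()))
    s = alpha / np.sum((d - d.mean()) ** 2)
    return s, r.mean() - s * d.mean()

s1, o1 = s_o([4, 5, 6], [1, 2, 3])              # original blocks
s2, o2 = s_o([252, 251, 250], [255, 254, 253])  # inverted blocks
# s is unchanged (1 in both cases); only o flips from -3 to 3
```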
Figure 4.9: Another rotated query image and the first 8 closest matches, using only the orientation feature.
It can be easily shown that any shift in brightness which affects all the pixel values
equally, only changes the brightness feature. Suppose the pixel value v(x, y) of
the pixel at the position (x, y) is changed to v(x, y) + δ then the new contrast
and brightness features are as follows:
s' = \frac{\alpha'}{\beta'} = \frac{\sum_{i=1}^{n} (d'_i - \bar{d}')(r'_i - \bar{r}')}{\sum_{i=1}^{n} (d'_i - \bar{d}')^2} = \frac{\sum_{i=1}^{n} ((d_i + \delta) - \bar{d} - \delta)((r_i + \delta) - \bar{r} - \delta)}{\sum_{i=1}^{n} ((d_i + \delta) - \bar{d} - \delta)^2}

= \frac{\sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{\sum_{i=1}^{n} (d_i - \bar{d})^2} = \frac{\alpha}{\beta} = s

o' = \bar{r}' - s'\bar{d}' = (\bar{r} + \delta) - s(\bar{d} + \delta) = o + (1 - s)\delta
Thus, all fractal features except the brightness feature are robust to a shift in
brightness.
Figure 4.10: A query image inverted in grayscale and the first 8 closest matches. Note that the rotation feature gets both the first and second matches right. The brightness feature does not find the right match, because the inversion negates the brightness feature. The other two features find the right match.
4.4 Summary
This chapter described a new method for face recognition using fractal codes directly as features. It was shown that the fractal parameters of an image form a self-similarity-based representation of that image and can be used as features for object recognition. The fractal code of an image contains several different parts; some variations in images affect some of the parameters while others remain unchanged, which introduces a degree of robustness into the system. Because the fractal codes of different images differ in the size of their feature vectors, a method for normalizing the features to produce reduced feature vectors of equal size was presented. The details of the experiments are given in Appendix B.
Chapter 5
Fractal Image-set Coding
5.1 Introduction
In this chapter, another method of using fractal codes for facial recognition is
presented. It is shown that the fractal code of an image is not unique and that
certain parameters can be held constant to capture image information in the other
parameters. Fractal codes are calculated keeping geometrical fractal parameters
constant for all images. These parameters are calculated from a set of images.
The proposed method is faster than traditional fractal coding methods, which require time to search for the best domain for each range block. It also lends itself to preprocessing steps that provide robustness to changes in parts of a face, and it produces codes that are more directly comparable. Results on the XM2VTS database are used to demonstrate the performance and capabilities of the method.
5.2 Mathematical Bases

A compact representation of the fractal encoding and decoding process can be provided using the following operators [21]. Let ℑ_m denote the space of m × m digital grayscale images; that is, each element of ℑ_m is an m × m matrix of grayscale values. The get-block operator Γ^k_{n,m} : ℑ_N → ℑ_k, where k ≤ N, is the operator that extracts the k × k block with lower left corner at (n, m) from the original N × N image, as shown in Figure 5.1.

Figure 5.1: An illustration of the get-block and put-block operators

The put-block operator (Γ^k_{n,m})^* : ℑ_k → ℑ_N inserts a k × k image block into an N × N zero image, at the location with lower left corner at (n, m). An N × N image x_f ∈ ℑ_N can be written as

x_f = \sum_{i=1}^{M} (x_f)_i = \sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* (R_i)    (5.1)

where {R_1, . . . , R_M} is a collection of range cell images that partition x_f. Each
R_i has dimension r_i × r_i with lower left corner located at (n_i, m_i) in x_f. If the range cells R_i are the result of fractal image encoding of the image x_f, then for each range cell R_i there is a domain cell D_i and an affine transformation W_i such that

R_i = W_i(D_i) = G_i(D_i) + H_i    (5.2)

Denote the dimension of D_i by d_i, and the lower left coordinates of D_i by (k_i, l_i). G_i : ℑ_{d_i} → ℑ_{r_i} is the operator that shrinks (assuming d_i > r_i), translates (k_i, l_i) → (n_i, m_i) and applies a contrast factor s_i, while H_i is a constant r_i × r_i matrix that represents the brightness offset. We can write D_i = Γ^{d_i}_{k_i,l_i}(x_f). Thus, equation 5.1 can be rewritten as the following approximation:

x_f = \sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* \{ G_i(Γ^{d_i}_{k_i,l_i}(x_f)) + H_i \}

x_f = \underbrace{\sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* \{ G_i(Γ^{d_i}_{k_i,l_i}(x_f)) \}}_{A(x_f)} + \underbrace{\sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* (H_i)}_{B}    (5.3)

Then, if we write the put-block operators (Γ^{r_i}_{n_i,m_i})^*, the get-block operators Γ^{d_i}_{k_i,l_i} and the transformations G_i in their matrix forms, we can simplify equation 5.3 as follows:

x_f = A × x_f + B    (5.4)

In this equation, A and B are the fractal parameters of the image x_f.
5.3 Fractal Image-set Coding
In this section we will use the compact representation (5.4) to show some interest-
ing properties of fractal image encoding and introducing a method for extracting
fractal codes for a set of face images with the same geometrical parameters which
we will call Fractal Image-set Coding. The fundamental principle of fractal im-
age encoding is to represent an image by a set of affine transformations. Images
are represented in this framework by viewing them as vectors. This encoding is
not simple, because there is no known algorithm for constructing the transforms with the smallest possible distance between the image to be encoded and the corresponding fixed point of the transformations. Banach's fixed point theorem guarantees that, within a complete metric space, the unique fixed point of a contractive transformation may be recovered by iterated application of it to an arbitrary initial element of that space. Banach's fixed point theorem gives us an idea of how the decoding process works:

Let T : ℑ_n → ℑ_n be a contractive transformation and (ℑ_n, d) a metric space with metric d; then the sequence {X_k} constructed by X_{k+1} = T(X_k) converges, for any arbitrary initial image X_0 ∈ ℑ_n, to the unique fixed point X_f ∈ ℑ_n of the transformation T.

The contraction condition in this theorem is defined as follows: a transformation T : ℑ_n → ℑ_n is called contractive if there exists a constant 0 < s < 1 such that

∀x, y ∈ ℑ_n,  d(T(x), T(y)) ≤ s · d(x, y)    (5.5)

This condition is sufficient for the existence of a unique fixed point of a fractal transformation, because if there existed two distinct fixed points x_f and x'_f of a contractive transformation T, we would have

T(x_f) = x_f,  T(x'_f) = x'_f

and

d(T(x_f), T(x'_f)) = d(x_f, x'_f) ⇒ s ≥ 1

so the transformation T could not be contractive.
Let us write the fractal transformation in the compact form (5.4):

T(x) = A × x + B

The fractal image coding of an image x_f can then be defined as finding A and B that satisfy the condition

x_f = A × x_f + B

while A and B define a contractive transformation. This condition shows that the fractal code for an image x_f is not unique, because infinitely many pairs (A, B) satisfy it and have the same fixed point x_f, and many of them define a contractive transformation T(x) with |s| < 1. Figure 5.2 shows an illustration of the function T(x) = A × x + B for a one-dimensional space ℑ.
Figure 5.2: Illustration of the function T(x) = A × x + B for a one-dimensional space ℑ. (a) s > 1, (b) s = 1, (c, d, e) s < 1.
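For the one-dimensional case illustrated in Figure 5.2, the convergence of a contractive T(x) = A · x + B to its fixed point can be demonstrated directly; the values of A and the fixed point below are illustrative, not taken from the thesis.

```python
# 1-D fractal transformation T(x) = A*x + B with |A| < 1 (contractive).
# Choosing B = (1 - A) * x_fixed makes x_fixed the fixed point,
# mirroring B = (I - A) x_f in the matrix case.
A = 0.6
x_fixed = 42.0
B = (1 - A) * x_fixed

x = 0.0                  # arbitrary initial "image"
for _ in range(50):
    x = A * x + B        # one decoding iteration
# x has now converged to the fixed point, regardless of the start value
```

The error shrinks by a factor |A| per iteration, which is why the decoding in Figure 5.6 stabilizes after only a handful of steps.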
Different fractal coding algorithms use different A and B for an image, which makes the fractal face recognition process more complex. The aim of fractal image-set coding is to find fractal parameters for several images with the same geometrical part for all of them. In this case, the information in the luminance part of the fractal codes of these images is more comparable. This method is also more efficient and faster than existing methods, because there is no need to search for the best matching domain block for each range block, which is the most computationally expensive part of the traditional fractal coding process.

In this system, a sample image is nominated for finding the geometrical parameters. This image can be an arbitrary image of the database, an image outside the database, or the average image of all or part of the database.
The Fractal image-set coding algorithm can be described as follow:
Step 0 (preprocessing) - For any face image data-set F, use eye locations and histogram equalization to form a normalized face image data-set Fnormal. Every face image in this data-set is a 128 × 128, histogram-equalized, 256-grayscale image, with the positions of the left and right eyes at (32,32) and (96,32) respectively, as shown in Figure 5.3.
Figure 5.3: An example of preprocessing with an image in the data-set. (A) The original image, (B) grayscale image with orientation normalized, (C) nominated image with face area marked, (D) normalized, histogram-equalized face image.
Step 1 - Calculate the fractal codes for the sample image x (which can be the average image of the data-set) using a traditional fractal image coding algorithm [35]. These fractal codes contain the luminance information, the geometrical position information for the range blocks {R1, R2, . . . , Rn}, the domain blocks {D1, D2, . . . , Dm} corresponding to each range block, and the geometrical transformations, such as rotation and resizing, that match each domain block with its range block.
Step 2 - For any image x_i of the data-set, use the same geometrical parameters (range and domain block positions and geometrical transformations) that were used for coding the sample image x, as shown in Figure 5.4. Let (x_{R_i}, y_{R_i}), l_{R_i} be the geometrical position and the block size of range block R_i, and (x_{D_j}, y_{D_j}), l_{D_j} the geometrical position and the size of domain block D_j, which is the best-matched domain block for R_i.
Figure 5.4: (A) Average image of the data-set, (B) an arbitrary image from the data-set, (C) range blocks for image A, (D) the same range blocks applied to image B.
Step 3 - For each range block R_i in image x_i, use the domain block at the same position (x_{D_j}, y_{D_j}) and of the same size l_{D_j}, and calculate the luminance parameters that minimize the error e:

e = \sum_{i=1}^{n} (s \cdot d_i + o - r_i)^2

where d_i and r_i denote the pixel values of the domain block D and range block R. The minimum of e occurs when:

s = \frac{\alpha}{\beta}

o = \bar{r} - \left(\frac{\alpha}{\beta}\right) \bar{d}

where

\alpha = \sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})

\beta = \sum_{i=1}^{n} (d_i - \bar{d})^2

\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i

\bar{r} = \frac{1}{n} \sum_{i=1}^{n} r_i
as proven in section 4.3.1.
Step 4 - Save the geometrical parameters as well as the luminance parameters as the fractal codes of image x_i.
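Steps 2 and 3 can be sketched in code as follows. The shared geometry is given as a list of (range position, range size, domain position) triples; the 2× averaging shrink, the function names and the omission of the orientation step are simplifying assumptions for illustration, not the thesis implementation.

```python
import numpy as np

def shrink(block):
    """2x2 averaging, shrinking a 2k x 2k domain block to k x k."""
    return 0.25 * (block[0::2, 0::2] + block[1::2, 0::2] +
                   block[0::2, 1::2] + block[1::2, 1::2])

def fit_luminance(d, r):
    """Least-squares contrast s and brightness o (Section 4.3.1)."""
    dd, rr = d.ravel(), r.ravel()
    beta = np.sum((dd - dd.mean()) ** 2)
    s = 0.0 if beta == 0 else np.sum((dd - dd.mean()) * (rr - rr.mean())) / beta
    return s, rr.mean() - s * dd.mean()

def image_set_code(image, geometry):
    """Luminance part of the fractal code for one image, given the
    shared geometry: a list of (range_pos, range_size, domain_pos),
    where the domain block is assumed twice the range size."""
    code = []
    for (rx, ry), rs, (dx, dy) in geometry:
        r = image[ry:ry + rs, rx:rx + rs]
        d = shrink(image[dy:dy + 2 * rs, dx:dx + 2 * rs])
        code.append(fit_luminance(d, r))
    return code
```

With the geometry fixed once from the sample image, encoding any further image needs no domain search, which is the speed advantage claimed for the method.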
Figure 5.5: The initial image and the first, third and fifth iterates of the decoding transformation corresponding to image 005 4 1.
In figure 5.5, an example of the decoding result for one of the encoded images of the XM2VTS face database is shown. The PSNR versus iteration number is plotted in figure 5.6 for this image and three other images of the same database. It clearly shows that the fixed point of each set of fractal image codes is reached after only 5 or 6 iterations.
5.4 Similarity Measurements

A similarity measurement τ(x, y) is a method for calculating the similarity between two images. It is normally defined in terms of a metric distance d(x, y): a higher distance between two patterns indicates a lower similarity between them. A similarity measurement is generally a number between 0 and 1, where 0 indicates the lowest similarity and 1 the highest similarity between two patterns. In this section, different similarity measurements are described.
5.4.1 Minkowski-Form Distance
The Minkowski-form distance is defined based on the L_p norm as:

d_p(x, y) = \left( \sum_{i=0}^{N-1} |x_i - y_i|^p \right)^{1/p}

where x = x_0, x_1, . . . , x_{N-1} and y = y_0, y_1, . . . , y_{N-1} are the query and target feature vectors respectively.

When p = 1, d_1(x, y) is the city block distance or Manhattan distance (L_1):

d_1(x, y) = \sum_{i=0}^{N-1} |x_i - y_i|

When p = 2, d_2(x, y) is the Euclidean distance (L_2):

d_2(x, y) = \sqrt{\sum_{i=0}^{N-1} (x_i - y_i)^2}

When p → ∞, we get L_∞:

d_\infty(x, y) = \max_{0 \le i < N} |x_i - y_i|
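The three special cases above can be captured in one function; `minkowski` is a name chosen here for illustration.

```python
import numpy as np

def minkowski(x, y, p):
    """L_p distance between feature vectors x and y;
    p = 1 is city block, p = 2 is Euclidean, p = inf is max-norm."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if np.isinf(p):
        return np.max(np.abs(x - y))
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

x, y = [0.0, 3.0, 1.0], [4.0, 0.0, 1.0]
# minkowski(x, y, 1) -> 7, minkowski(x, y, 2) -> 5,
# minkowski(x, y, np.inf) -> 4
```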
5.4.2 Cosine Distance
The cosine distance measures the difference in direction between two vectors, irrespective of their lengths: the distance is given by the angle between the two vectors. By the rule of the dot product,

\vec{x} \cdot \vec{y} = |x| \cdot |y| \cos(\theta)

d_{cos}(x, y) = 1 - \cos(\theta) = 1 - \frac{\vec{x} \cdot \vec{y}}{|x| \cdot |y|}

The similarity measurement τ_{cos}(x, y) = 1 - d_{cos}(x, y) = \frac{\vec{x} \cdot \vec{y}}{|x| \cdot |y|} takes only the angle between the two vectors into account. Let τ_{cos}(x_1, y) and τ_{cos}(x_2, y) denote the similarities between two query vectors x_1, x_2 and a target vector y. When x_1 and x_2 differ only in length, we have θ_{x_1,y} = θ_{x_2,y} ⇒ τ_{cos}(x_1, y) = τ_{cos}(x_2, y), while the Euclidean distance d_2(x, y) uses both angle and vector lengths, as illustrated in figure 5.7. In some cases, such as matching a domain block with a range block, the cosine distance can be more useful than the Euclidean distance: if the pixel values of a block are multiplied by a contrast factor, the matching result will not change, and only the contrast parameter in the fractal codes may change.
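A minimal sketch of the cosine similarity and its invariance to scaling (such as a contrast factor applied to a block):

```python
import numpy as np

def tau_cos(x, y):
    """Cosine similarity: depends only on the angle between x and y,
    not on their lengths."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
```

Multiplying either vector by a positive scalar leaves tau_cos unchanged, whereas the Euclidean distance between the vectors does change.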
5.4.3 Fractal Similarity Measures
Each image can be represented as a point in image space, RN , where N is the
number of pixels. For the purpose of illustrating convergence and distances in
feature space, we will use a two-dimensional image space, X = [x, y]. Fractal
code parameters can then be represented using matrix A and vector B in the
transformation F (X) = A × X + B. We choose different initial images and
different fractal parameters to show how distances in image space can be used
for classification. When the fractal code image xf is applied iteratively to an
initial image, say x01 the image converges to xf after several iterations. We
want to find that image in the database which is closest to xf . If an Euclidean
distance based on grayscale differences between corresponding pixels is used, the
distance between a database image and the query image is not very reliable
because it can change considerably with noise and with small changes to the
query image. For example, a small misalignment between the images can cause
large differences between grayscale values of corresponding pixels. Therefore, a
more robust distance measure is required. One such distance is that between
two successive iterations of an image when the fractal code for xf is applied to
it. Figure 5.8 shows the trajectories in image space as images x01, x02 and x03
converge towards xf for the simplified two-dimensional case. It can be observed
that image x03 is closest to xf and that d3 is also the shortest of d1, d2 and d3. This relationship holds regardless of the fractal parameters A and B. The fractal parameter A can be decomposed into a rotation matrix ρθ and a scale factor s. Figure 5.9 shows convergence trajectories for the same images when A = 0.9 × ρ15 and B = (I − 0.9 × ρ15) × xf. Figures 5.8 and 5.9 correspond to a scale factor s = 0.9
but different rotation matrices. Figures 5.10 and 5.11 are the same plots with
the same rotation matrices as 5.8 and 5.9, respectively, but with the scale factor
changed to s = 0.6. Note the faster convergence for the lower value of s. We can
use distance d between an image and its first iterate when the fractal code of xf
is applied to it as a measure of the distance of this image to xf . All images in
the database are subjected to this transformation and distances are compared.
The image with the least distance is used to identify the person (there may be
more than one image of the same person in the database used for training). A
similarity score between a database image, xi, and a query image, xf , can also be
defined. One such score could be e−d, which is guaranteed to be between 0 and
1, corresponding to least similar and identical cases, respectively.
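The one-iteration classification rule just described, in the simplified two-dimensional setting of Figure 5.8, might look like the following sketch; the fractal code (A, B), the database points and their labels are all illustrative.

```python
import numpy as np

def identify(query_code, database):
    """Apply the query's fractal code (A, B) to each database image
    for one iteration; the image that moves least is the best match."""
    A, B = query_code
    best, best_d = None, np.inf
    for name, x in database.items():
        d = np.linalg.norm(A @ x + B - x)   # distance to first iterate
        if d < best_d:
            best, best_d = name, d
    return best, best_d

# 2-D toy example: a contractive code with fixed point xf = [1, 1]
s, theta = 0.9, np.deg2rad(45)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A = s * rot
xf = np.array([1.0, 1.0])
B = (np.eye(2) - A) @ xf

db = {"far": np.array([3.0, 0.2]), "near": np.array([1.1, 0.9])}
# identify((A, B), db) picks "near": the point closest to xf moves least
```

The similarity score e^{-d} can then be computed from the returned distance.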
5.5 Using Fractal Image-set Coding for Face
Recognition
For applications such as criminal identification, it would be useful to have a
computer system that understood human conceptions of facial similarity; for
instance, one can imagine wanting a database program that could retrieve similar
faces from a comprehensive mug shot database after a witness selected one face
from a small initial grouping. The fractal codes extracted by Fractal image-set
coding method for a face data-set F have the advantage that all the codes have the
same geometrical parameters, and therefore the luminance parameters are more comparable than with traditional fractal codes. For face recognition applications, we can divide the image database into two image-sets, a training set and a test set. The sample image x can be the average image of the test set, the training set or the entire database. These cases may suit face recognition from a closed set or an open set. The results have been found not to change much with the choice of the image-set from which the geometrical parameters are extracted.
Since some popular face recognition systems, such as Eigenface-based systems, rely heavily on predetermined eye locations, the effect of eye-finding accuracy on these systems is significant [60]. Fractal image-set coding uses eye locations to normalize the face images, so the accuracy of the eye localization process may affect the recognition accuracy. However, using fixed fractal geometrical parameters, as well as applying block-wise operations on blocks of size 16×16 or 8×8, reduces the effect of a one- or two-pixel shift in the images.
5.6 Experimental Results
We selected the first 39 individuals from the XM2VTS database and 4 face images per individual. The first image was used as a test image while the other images were added to the training set. Eye location information is used to normalize and align the images to a 128 × 128 pixel grid; the eye coordinates are then fixed and 64 pixels apart. The average image x over the entire data-set is calculated and used to extract the shared geometrical fractal parameters.

The fractal code of a query (test) image is applied to all the images in the training set for one iteration. The distance d between each of these transformed images and the corresponding initial (target) image is used as a measure of distance between the test image and the target image. This value is divided by the number of pixels and the maximum pixel value (256). A similarity score, e^{−d}, which is
bounded between 0 and 1, indicates the closeness of the match between the target
image and the test image. The target that has the highest similarity score is
the recognized identity. The next 5 best matches reveal the effectiveness of the
method. Several such test cases are shown in figures 5.12 to 5.16.
The distance value d can be normalized using:

d_{norm} = \frac{d - \min(d)}{\max(d) - \min(d)}

The similarity score S_{norm} = e^{−d_{norm}} then better illustrates the similarity between the images, as shown in figure 5.13.
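The normalization above, sketched with made-up distance values:

```python
import numpy as np

d = np.array([0.0021, 0.0019, 0.00002, 0.0025])  # per-image distances

d_norm = (d - d.min()) / (d.max() - d.min())
s_norm = np.exp(-d_norm)    # 1 for the best match, down to e^-1 ~ 0.37
best = int(np.argmin(d))    # index of the recognized identity
```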
The recognition accuracy of this system is 95%: in only 2 of the 39 cases is the best match not a face image of the correct individual, as shown in figures 5.17 and 5.19. However, it should be noted that in figure 5.17 the second closest match is of the correct individual; the facial hair change is simply too severe for the method to cope with. The query image and 3 training images of this individual are shown in figure 5.18. Figure 5.19 shows the second and last failed test. Wearing spectacles and a change in expression are the main differences between the test and training images for this individual, as shown in figure 5.20.
The biggest advantage this method offers is that the effects of changes in parts of a face are confined by the geometrical parameters used and by the number of iterations. Since the parameters are common to all codes, we can choose to emphasize or de-emphasize certain regions, thereby achieving robustness to the presence of spectacles or expression changes. This is not easily possible in other, non-fractal methods, where a change in one part of a face affects all features (such as in the Eigenface approach without part segmentation), or in fractal methods where the geometrical fractal code parameters vary from image to image.
5.7 Summary
In this chapter, fractal image-set encoding and its application in face recognition
and facial image retrieval are explained. A fractal code of an arbitrary gray-
scale image can be divided in two parts – geometrical parameters and luminance
parameters. Because the fractal codes for an image are not unique, we can change the set of fractal parameters without significant change in the quality of the reconstructed image. Fractal image-set coding keeps the geometrical parameters the same for all images in the database. Differences between images are captured in the non-geometrical, or luminance, parameters, which are faster to compute.
For recognition purposes, the fractal code of a query image is applied to all the
images in the training set for one iteration. The distance between an image and
the result after one iteration is used to define a similarity measure between this
image and the query image. Results on a subset of the XM2VTS database are
presented.
Figure 5.6: PSNR versus the number of decoding steps for 4 different 128 × 128, gray-scale, normalized, encoded images of the XM2VTS database. The dash-dot line, solid line, dashed line and dotted line correspond to images 002 1 1, 000 1 1, 003 1 1 and 005 4 1 of the XM2VTS database, respectively.
Figure 5.7: Euclidean distance takes both angle and vector lengths into account to calculate the distance, while cosine distance only takes the angle into account.
58 5.7 Summary
Figure 5.8: Convergence trajectories for three different initial images when the same fractal code is applied iteratively. Note that the initial image (x03) closest to the fixed point shows the least distance between successive iterations (d3 < d2 < d1). The fractal parameters are A = 0.9 × ρ45 and B = (I − 0.9 × ρ45) × xf.
Figure 5.9: Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.9 × ρ15 and B = (I − 0.9 × ρ15) × xf.
Figure 5.10: Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.6 × ρ45 and B = (I − 0.6 × ρ45) × xf .
Figure 5.11: Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.6 × ρ15 and B = (I − 0.6 × ρ15) × xf .
Figure 5.12: An example showing a query image on top followed by the six closest images in the database. The best match is on the top-left, followed by others left to right in row-first order. Note that the first three matches are images of faces of the correct person and some change in expression is tolerated by the method.
Figure 5.13: The error (top left) and the similarity (top right) between the query image and the images in the training data-set. Errors are all very small. Normalized error (bottom left) and normalized similarity (bottom right) for the same images. Note that the normalized similarity measure clearly shows the best matching face number as 9. Values of this measure for other faces are below 0.7 in this case.
Figure 5.14: Another example showing a correctly identified case. Note here that there is a more marked change in facial expression and pose.
Figure 5.15: Yet another correctly identified case. Note that the first three matches are images of faces of the correct person.
Figure 5.16: Yet another correctly identified test case. Note here that the query image is of a light-skinned individual and so are all the 6 closest matched images.
Figure 5.17: A test case that failed. The second closest match is of the correct individual but the facial hair change is too severe for the method to cope. The closest matched face appears to belong to a different person but with very similar expression and features such as the eyes and mouth.
5.7 Summary 63
Test image
01921
.png 01931
.png 01941
.png
Figure 5.18: Query image (top) and training images (bottom) for the individualnumber 019
Figure 5.19: The only other test case that failed. The fourth and sixth closest matches are of the correct individual.
Figure 5.20: Query image (top) and training images (bottom) for individual number 005.
Figure 5.21: A plot showing accuracy versus the number of persons in the database. Three images are used for each person in the training set and one image per person in the test set.
Chapter 6
Subfractals
6.1 Introduction
As shown in previous chapters, the fractal code of an image is a set of contractive mapping transformations, each of which transfers a domain block to its corresponding range block. The distribution of the domain blocks selected for the range blocks of an image depends on the image content and on the fractal encoding algorithm. Some methods search for the best matching domain, while others use the first match. Domain blocks can be square, rectangular, triangular and so on, and the size of the domain blocks in the domain pool can be fixed or variable. All of these parameters can combine to make the fractal codes sensitive to small changes in the image. A small variation in one part of the input image may change the contents of the range and domain blocks in the fractal encoding process, resulting in a change in transformation parameters in the same part or even in other parts of the image. In this chapter, we introduce a new method of fractal image coding that makes the fractal code of each part independent of variations in other parts.
6.2 Basic Concepts
Is there any local relationship between the range and domain blocks of an image? This is one of the first questions that any researcher in this field may ask. Fisher [35], in his book (chapter 3, pages 69-72), tried to show that the position of the corresponding domain block for each range block is random relative to it. Fisher plotted the distributions of the difference in the x and y positions of the domains and ranges for an encoding of the 512 × 512 Lena image, as well as the theoretical distribution of the difference of two randomly selected points, as shown in Figures 6.1 and 6.2. In these figures, (xr, yr) and (xd, yd) are the range and domain positions. Fisher calculated the probability distribution of dx and dy, where dx and dy are the differences in the x and y coordinates of two points chosen randomly in the unit square with uniform probability, as ρ(dx) = 1 − |dx| and ρ(dy) = 1 − |dy|.
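The triangular law ρ(d) = 1 − |d| for the difference of two uniform points can be checked with a quick Monte-Carlo sketch; numpy is assumed, and the sample size is an arbitrary choice of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Differences of two points chosen uniformly in the unit interval
dx = rng.random(n) - rng.random(n)

# Empirical density over 20 bins vs the triangular law rho(d) = 1 - |d|
bins = np.linspace(-1, 1, 21)
hist, edges = np.histogram(dx, bins=bins, density=True)
centers = (edges[:-1] + edges[1:]) / 2
theory = 1 - np.abs(centers)

print(np.max(np.abs(hist - theory)))   # small: empirical density matches 1 - |d|
```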
Figure 6.1: The distribution of the difference in the x position of the domains xd and ranges xr for an encoding of the 512 × 512 Lena image, as well as the theoretical distribution (dashed line), (1 − |x|/512)/512, of the difference of two randomly selected points. Adapted from [35].
Figure 6.2: The distribution of the difference in the y position of the domains yd and ranges yr for an encoding of the 512 × 512 Lena image, as well as the theoretical distribution (dashed line), (1 − |y|/512)/512, of the difference of two randomly selected points. Adapted from [35]. Note that the distribution is skewed and also has significantly large values close to 0.
In the book, Fisher notes: “so even when the points are chosen randomly, it appears that there is a preference for local domains. However, this is an artifact . . . there is a slight preference for local domains, but the effect is small”. The effect may be small for fractal compression, but it plays a big role in fractal recognition. If the relation between range and domain blocks were random, a small variation in one part of the image would change the range and domain blocks in a random area. This change could in turn alter the fractal codes of all the range blocks corresponding to those domain blocks. In other words, if the distribution of domain blocks were random, a small change in some part of an image would affect the fractal codes of other parts, and the change would propagate randomly. On the other hand, as Fisher explained, traditional fractal image coding methods prefer to choose local domain blocks for each range block, although this does not always happen. Our experiments have shown that non-constant range blocks from a given segment tend to use domain blocks from the same segment. As can be inferred from Figures 6.1 and 6.2, for a sample image like Lena (512 × 512) the number of range blocks that match domain blocks within a neighborhood of radius 60 is significantly higher than would be expected for random matching between blocks. This is owing to shared local properties such as texture. This fact makes some uses of fractal codes for recognition (for example, [25]) robust to variations such as expression changes on a face, because these variations cause only small local changes around the lips or eyes that do not affect the entire fractal code. In contrast, the difference between two faces is a large change that affects the block partitioning, the range blocks and the domain blocks, so the entire code changes.
To generalize this beneficial property, we propose a new fractal coding method that chooses the domain block for each range block from the same area as the range block. This guarantees that any change in an area or segment will only affect the fractal codes related to that area and will not propagate anywhere else. In other words, the fractal codes of different areas of the image are independent.
A subfractal is defined to be a set of fractal codes that map a subset of the domain blocks in an image to the range blocks that cover one part of the image. These codes are calculated to be independent of the codes of the other parts of the same image.
6.3 Subfractal Coding
To calculate the subfractals of an image we propose the following algorithm. We assume here that the images are face images from a standard face database such as the Banca face database:
Step 0 (preprocessing) - For all face images use eye locations and histogram
equalization to form a geometrically and photometrically normalized face
image data-set.
Step 1 - Nominate the subfractal area for each part, such as the left and right eyes, nose, lips and the rest of the image, manually for only one arbitrary normalized image of the database. This information will be used for all other normalized images of the database as well.
Step 2 - For each subfractal, partition the area with non-overlapping r×r range
blocks.
Step 3 - Cover the subfractal area with a sequence of overlapping domain blocks in k different sizes 2r × 2r, 2²r × 2²r, . . . , 2ᵏr × 2ᵏr to form a domain pool for that area. Also add the 90°, 180° and 270° rotated versions of each block to the domain pool, and add the mirrored version of each member of the domain pool as well.
Step 4 - For each range block, find the domain block from the domain pool of the same subfractal area that best covers the range block. This can be done by minimizing the distance function E(R, D):
$$E(R,D) = \sqrt{\sum_{i=1}^{r}\sum_{j=1}^{r}\bigl(R(i,j) - T(D)(i,j)\bigr)^{2}}$$
between range block R and domain block D. The transformation
$$T(D) = \mathrm{Flip}\Bigl(F, \mathrm{Rotate}\bigl(\theta, \mathrm{Resize}(\tfrac{1}{L}, D)\bigr)\Bigr)$$
resizes ($L \in \{2, 4, \ldots, 2^{k}\}$), rotates ($\theta \in \{0, \pi/2, \pi, 3\pi/2\}$, matching the rotations of Step 3) and flips ($F \in \{0 = \text{no flip}, 1 = \text{horizontal flip}\}$) the domain block to match the corresponding range block.
Step 5 - Record geometrical positions of the range block and the domain block
as well as parameters L, θ, F as the geometrical part of the fractal code for
the range block.
Step 6 - Calculate the luminance parameters s and o and record them as the other part of the code:
$$s = \frac{\alpha}{\beta}, \qquad o = \bar{R} - \frac{\alpha}{\beta}\,\bar{D}$$
where
$$\alpha = \sum_{i=1}^{r}\sum_{j=1}^{r}\bigl(T(D)(i,j) - \bar{D}\bigr)\bigl(R(i,j) - \bar{R}\bigr), \qquad \beta = \sum_{i=1}^{r}\sum_{j=1}^{r}\bigl(T(D)(i,j) - \bar{D}\bigr)^{2}$$
$$\bar{D} = \frac{1}{r^{2}}\sum_{i=1}^{r}\sum_{j=1}^{r} T(D)(i,j), \qquad \bar{R} = \frac{1}{r^{2}}\sum_{i=1}^{r}\sum_{j=1}^{r} R(i,j)$$
Step 7 - Repeat steps 4-6 for all range blocks in the subfractal area.
Step 8 - Repeat steps 2-7 for all subfractals in the image.
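The matching steps above (Steps 4 to 6) can be sketched in Python. This is a simplified illustration under stated assumptions, not the thesis implementation: blocks are plain numpy arrays, the domain pool is a list of square blocks, and the helper names (`transform_block`, `best_match`, `luminance_params`) are ours.

```python
import numpy as np

def transform_block(D, L, rot_quarter, flip):
    """T(D) of Step 4: shrink an (L*r x L*r) domain block to r x r by
    pixel averaging, rotate by quarter turns, optionally mirror."""
    r = D.shape[0] // L
    small = D.reshape(r, L, r, L).mean(axis=(1, 3))   # Resize(1/L, D)
    small = np.rot90(small, k=rot_quarter)            # Rotate(theta, .)
    return np.fliplr(small) if flip else small        # Flip(F, .)

def block_error(R, TD):
    """E(R, D): root of the summed squared pixel differences."""
    return np.sqrt(np.sum((R - TD) ** 2))

def luminance_params(R, TD):
    """Step 6: least-squares contrast s and offset o so that s*TD + o ~ R."""
    Dm, Rm = TD.mean(), R.mean()
    beta = np.sum((TD - Dm) ** 2)
    s = np.sum((TD - Dm) * (R - Rm)) / beta if beta > 0 else 0.0
    return s, Rm - s * Dm

def best_match(R, domain_pool):
    """Inner loop of Steps 4-7: scan all transformed domain candidates
    of the same subfractal area and keep the lowest-error one."""
    best = None
    for idx, D in enumerate(domain_pool):
        L = D.shape[0] // R.shape[0]
        for rot in range(4):
            for flip in (False, True):
                e = block_error(R, transform_block(D, L, rot, flip))
                if best is None or e < best[0]:
                    best = (e, idx, rot, flip)
    return best

# Hypothetical 4x4 range block and a pool of five 8x8 domain blocks
rng = np.random.default_rng(0)
R = rng.random((4, 4))
pool = [rng.random((8, 8)) for _ in range(5)]
err, idx, rot, flip = best_match(R, pool)
```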
In figure 6.3, the range blocks in four major subfractals (eyes, nose and lips) and the corresponding domain blocks for an arbitrary face image are shown. A plot of pixel values for the last matched domain and range block is also shown. Examination of this plot for all the range blocks shows that, even with the restriction of choosing domain blocks from a subfractal area smaller than the whole image, there is enough freedom of choice to find a good match for most of the range blocks. This arises from the overlapping of the domain blocks, which rapidly increases the number of domain blocks in the domain pool, and from the existence of different transformed versions of each block in the pool. To speed up the coding process, we can encode constant range blocks with only their geometrical parameters and their average pixel values.
Figure 6.3: Range blocks (top left) in four major subfractal areas (eyes, nose and lips) and the corresponding domain blocks (bottom rows) for an arbitrary face image. Top right, a plot of pixel values vs. pixel numbers for the last matched domain and range block.
6.4 Mathematical Basis
As shown in previous chapters, an $N \times N$ image $x_f$ can be represented as the unique attractor of a set of iterated contractive transformations:
$$x_f = A \times x_f + B$$
In this equation A and B are the fractal parameters of the image $x_f$ and are defined as:
$$A \times x_f = \sum_{i=1}^{M} (\Gamma^{r_i}_{n_i,m_i})^{*}\bigl\{G_i\bigl(\Gamma^{d_i}_{k_i,l_i}(x_f)\bigr)\bigr\}, \qquad B = \sum_{i=1}^{M} (\Gamma^{r_i}_{n_i,m_i})^{*}(H_i)$$
Here $\Gamma^{k}_{n,m} : \Im^{N} \to \Im^{k}$, where $k \le N$, is a get-block operator which extracts the $k \times k$ block with lower left corner at $(n, m)$ from the original $N \times N$ image, and $(\Gamma^{k}_{n,m})^{*} : \Im^{k} \to \Im^{N}$ is a put-block operator which inserts a $k \times k$ image block into an $N \times N$ zero image at the location with lower left corner at $(n, m)$. $G_i : \Im^{d_i} \to \Im^{r_i}$ is the operator that shrinks (assuming $d_i > r_i$), translates ($(k_i, l_i) \to (n_i, m_i)$) and applies a contrast factor $s_i$, while $H_i$ is a constant $r_i \times r_i$ matrix that represents the brightness offset.
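The get-block and put-block operators have a direct array interpretation. A minimal numpy sketch, with one deliberate simplification: arrays are indexed from the top-left corner, whereas the text uses the lower-left corner as origin.

```python
import numpy as np

def get_block(img, n, m, k):
    """Gamma^k_{n,m}: extract the k x k block with corner at (n, m)."""
    return img[n:n + k, m:m + k]

def put_block(block, n, m, N):
    """(Gamma^k_{n,m})*: insert a k x k block into an N x N zero image
    at the location with corner at (n, m)."""
    out = np.zeros((N, N))
    k = block.shape[0]
    out[n:n + k, m:m + k] = block
    return out
```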
Because $G_i$ is a combination of a geometrical transformation and a brightness scaling, we can show that the matrix A is the product of a contrast matrix Ψ and another matrix Λ, which we call the distribution matrix:
$$A = \Psi \times \Lambda$$
The values in the contrast matrix Ψ are the contrast factors $s_i$ ($0 \le s_i < 1$). The distribution matrix Λ captures the relationship between each pixel of a range and the corresponding pixels of the domain, so in each column of the matrix we have non-zero values only in the rows corresponding to the domain pixels which affect that range pixel. As the fractal code of an image is not unique, there are many different possible values for Ψ and Λ. We can study these general cases:
many different possible values for Ψ and Λ. We can study these general cases:
Case 1 - Each range pixel is related to only one domain pixel; each column of Λ has only one non-zero value $\lambda_i$:
$$A = \begin{pmatrix}
0 & s_1 & \cdots & 0 \\
\cdots & 0 & \cdots & s_2 \\
s_3 & 0 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & s_n & 0
\end{pmatrix}
\times
\begin{pmatrix}
0 & \cdots & \lambda_3 & \cdots & 0 \\
\lambda_1 & 0 & \cdots & 0 & \vdots \\
\vdots & & \ddots & & \lambda_n \\
0 & \lambda_2 & 0 & \cdots & 0
\end{pmatrix}$$
This case can only happen when the size of the range blocks is equal to the size of the domain blocks, which is not true for most fractal image encoding methods.
Case 2 - Each range pixel is related to all the pixels of the image:
$$A = \begin{pmatrix}
s_1 & s_1 & \cdots \\
s_2 & s_2 & \cdots \\
\vdots & \vdots & \ddots \\
s_n & \cdots & s_n
\end{pmatrix}
\times
\begin{pmatrix}
\lambda_{11} & \lambda_{12} & \cdots & \lambda_{1n} \\
\lambda_{21} & \lambda_{22} & \cdots & \lambda_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\lambda_{n1} & \lambda_{n2} & \cdots & \lambda_{nn}
\end{pmatrix}$$
This case can only happen when the range blocks are derived from the entire image and not only from a portion of the image.
Case 3 - Each range pixel is related to some of the domain pixels of the image. In this case, each column of the distribution matrix has some zero and some non-zero values. The subfractal concept is one special subclass of this case. For subfractals, we choose domain and range blocks from the same portion of the image, so the matrices A and Λ are sparse, but we can re-arrange them into block-diagonal form, with one block per subfractal.
We will illustrate this idea with an example. Suppose image X is the 3 × 3 grayscale image below, with 3 different subfractal areas a, b, and c:
$$X = \begin{pmatrix}
a_1 & b_1 & b_2 \\
a_2 & a_3 & a_4 \\
c_1 & c_2 & a_5
\end{pmatrix}$$
So $x_f$ can be written as the column vector
$$x_f = A \times x_f + B, \qquad x_f = (a_1, b_1, b_2, a_2, a_3, a_4, c_1, c_2, a_5)^{T}$$
$$A = \Psi \times \Lambda$$
$$\Psi = \begin{pmatrix}
s_{a11} & 0 & 0 & s_{a12} & s_{a13} & s_{a14} & 0 & 0 & s_{a15} \\
0 & s_{b11} & s_{b12} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & s_{b21} & s_{b22} & 0 & 0 & 0 & 0 & 0 & 0 \\
s_{a21} & 0 & 0 & s_{a22} & s_{a23} & s_{a24} & 0 & 0 & s_{a25} \\
s_{a31} & 0 & 0 & s_{a32} & s_{a33} & s_{a34} & 0 & 0 & s_{a35} \\
s_{a41} & 0 & 0 & s_{a42} & s_{a43} & s_{a44} & 0 & 0 & s_{a45} \\
0 & 0 & 0 & 0 & 0 & 0 & s_{c11} & s_{c12} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & s_{c21} & s_{c22} & 0 \\
s_{a51} & 0 & 0 & s_{a52} & s_{a53} & s_{a54} & 0 & 0 & s_{a55}
\end{pmatrix}$$
$$\Lambda = \begin{pmatrix}
\lambda_{a11} & 0 & 0 & \lambda_{a12} & \lambda_{a13} & \lambda_{a14} & 0 & 0 & \lambda_{a15} \\
0 & \lambda_{b11} & \lambda_{b12} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \lambda_{b21} & \lambda_{b22} & 0 & 0 & 0 & 0 & 0 & 0 \\
\lambda_{a21} & 0 & 0 & \lambda_{a22} & \lambda_{a23} & \lambda_{a24} & 0 & 0 & \lambda_{a25} \\
\lambda_{a31} & 0 & 0 & \lambda_{a32} & \lambda_{a33} & \lambda_{a34} & 0 & 0 & \lambda_{a35} \\
\lambda_{a41} & 0 & 0 & \lambda_{a42} & \lambda_{a43} & \lambda_{a44} & 0 & 0 & \lambda_{a45} \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c11} & \lambda_{c12} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c21} & \lambda_{c22} & 0 \\
\lambda_{a51} & 0 & 0 & \lambda_{a52} & \lambda_{a53} & \lambda_{a54} & 0 & 0 & \lambda_{a55}
\end{pmatrix}$$
Now, we define a swapping transformation $\Upsilon^{i,j}_{row}(X)$ as a transformation which swaps row(i) and row(j) of a matrix or vector X with each other. In the same way, we define $\Upsilon^{i,j}_{col}(X)$ for swapping col(i) and col(j). Using linear algebra, it can easily be shown that:
$$\Upsilon^{i,j}_{row}(x_f) = \Upsilon^{i,j}_{row}(A \times x_f + B) = \Upsilon^{i,j}_{row}\bigl(\Upsilon^{i,j}_{col}(A)\bigr) \times \Upsilon^{i,j}_{row}(x_f) + \Upsilon^{i,j}_{row}(B)$$
and
$$\Upsilon^{i,j}_{row}\bigl(\Upsilon^{i,j}_{col}(A)\bigr) = \Upsilon^{i,j}_{row}\bigl(\Upsilon^{i,j}_{col}(\Psi)\bigr) \times \Upsilon^{i,j}_{col}\bigl(\Upsilon^{i,j}_{row}(\Lambda)\bigr)$$
So after the series of transformations
$$x_f \leftarrow \Upsilon^{3,2}_{row}\bigl(\Upsilon^{1,2}_{row}\bigl(\Upsilon^{7,8}_{row}\bigl(\Upsilon^{9,8}_{row}(x_f)\bigr)\bigr)\bigr)$$
the forms of Ψ and Λ will be:
$$\Psi = \begin{pmatrix}
s_{b11} & s_{b12} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
s_{b21} & s_{b22} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & s_{a11} & s_{a12} & s_{a13} & s_{a14} & s_{a15} & 0 & 0 \\
0 & 0 & s_{a21} & s_{a22} & s_{a23} & s_{a24} & s_{a25} & 0 & 0 \\
0 & 0 & s_{a31} & s_{a32} & s_{a33} & s_{a34} & s_{a35} & 0 & 0 \\
0 & 0 & s_{a41} & s_{a42} & s_{a43} & s_{a44} & s_{a45} & 0 & 0 \\
0 & 0 & s_{a51} & s_{a52} & s_{a53} & s_{a54} & s_{a55} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & s_{c11} & s_{c12} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & s_{c21} & s_{c22}
\end{pmatrix}$$
$$\Lambda = \begin{pmatrix}
\lambda_{b11} & \lambda_{b12} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\lambda_{b21} & \lambda_{b22} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \lambda_{a11} & \lambda_{a12} & \lambda_{a13} & \lambda_{a14} & \lambda_{a15} & 0 & 0 \\
0 & 0 & \lambda_{a21} & \lambda_{a22} & \lambda_{a23} & \lambda_{a24} & \lambda_{a25} & 0 & 0 \\
0 & 0 & \lambda_{a31} & \lambda_{a32} & \lambda_{a33} & \lambda_{a34} & \lambda_{a35} & 0 & 0 \\
0 & 0 & \lambda_{a41} & \lambda_{a42} & \lambda_{a43} & \lambda_{a44} & \lambda_{a45} & 0 & 0 \\
0 & 0 & \lambda_{a51} & \lambda_{a52} & \lambda_{a53} & \lambda_{a54} & \lambda_{a55} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c11} & \lambda_{c12} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c21} & \lambda_{c22}
\end{pmatrix}$$
The matrices Ψ and Λ can be divided into independent matrices $\Psi_a$, $\Psi_b$, $\Psi_c$ and $\Lambda_a$, $\Lambda_b$, $\Lambda_c$. This is because we used subfractals and, within each subfractal, pixels are related only to other pixels of its own area. Thus
$$x_f = \begin{pmatrix} X_a \\ X_b \\ X_c \end{pmatrix}
= \begin{pmatrix} \Psi_a & 0 & 0 \\ 0 & \Psi_b & 0 \\ 0 & 0 & \Psi_c \end{pmatrix}
\times \begin{pmatrix} \Lambda_a & 0 & 0 \\ 0 & \Lambda_b & 0 \\ 0 & 0 & \Lambda_c \end{pmatrix}
\times \begin{pmatrix} X_a \\ X_b \\ X_c \end{pmatrix}
+ \begin{pmatrix} B_a \\ B_b \\ B_c \end{pmatrix}$$
and finally
$$X_a = \Psi_a \times \Lambda_a \times X_a + B_a$$
$$X_b = \Psi_b \times \Lambda_b \times X_b + B_b$$
$$X_c = \Psi_c \times \Lambda_c \times X_c + B_c$$
These formulas clearly show that the fractal code of an image can be divided into several independent subfractal codes. Each pixel in a subfractal area is related only to other pixels of the same area.
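The independence claim can be checked numerically: with a block-diagonal A, solving $x_f = A x_f + B$ for the whole image gives exactly the per-area attractors, and perturbing one area's parameters leaves the other areas' pixels untouched. Below is a sketch with random contractive blocks whose sizes match areas a, b and c of the 3 × 3 example; all matrices here are hypothetical stand-ins, not real fractal codes.

```python
import numpy as np

rng = np.random.default_rng(0)

def contractive(n):
    """Random n x n matrix rescaled so its spectral radius is 0.8 < 1."""
    M = rng.random((n, n))
    return 0.8 * M / np.abs(np.linalg.eigvals(M)).max()

sizes = [5, 2, 2]                       # pixels of areas a, b, c
As = [contractive(n) for n in sizes]
Bs = [rng.random(n) for n in sizes]

# Fixed point of each independent subsystem: X_i = A_i X_i + B_i
fixed = [np.linalg.solve(np.eye(n) - A, B)
         for n, A, B in zip(sizes, As, Bs)]

# Assemble the block-diagonal A of the whole image and solve once
n_tot = sum(sizes)
A_full = np.zeros((n_tot, n_tot))
B_full = np.concatenate(Bs)
off = 0
for n, A in zip(sizes, As):
    A_full[off:off + n, off:off + n] = A
    off += n
x_full = np.linalg.solve(np.eye(n_tot) - A_full, B_full)

# Changing B of area b moves only b's pixels in the attractor
B2 = B_full.copy()
B2[5:7] += 1.0
x2 = np.linalg.solve(np.eye(n_tot) - A_full, B2)
print(np.abs(x2 - x_full)[[0, 7]])      # pixels of areas a and c: no change
```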
6.5 How to Use Subfractals for Face Recogni-
tion
Hancock’s [41] psychophysical observations show that human face recognition is most likely based on low-level image properties, rather than on an abstract representation of the face. Certain image transformations, such as intensity negation, strange viewpoint changes, and changes in lighting direction, can severely disrupt human face recognition. Fractal codes show some degree of robustness to some of these changes, such as intensity negation. However, in traditional fractal image coding systems, the fractal code of one part of an image is not independent of changes in other parts of the same image. Subfractals, unlike traditional fractal codes, do not have this problem, because the subfractal codes of an image are defined to be independent. This fact can make subfractals more suitable for applications such as image and face recognition.
To determine which parts of the face should form subfractals, we devised a test. In this test, 10 pairs of face images were shown to 10 volunteers (5 males and 5 females), who were asked to verify whether the two images in each pair belong to the same person or not. At the same time, the gaze data of these volunteers was collected using an Eye-Gaze Tracking System (Figure 6.4; see Appendix A for more details). With this system, we can determine where on the computer monitor, and for how long, the user is looking. This information is used to show which parts of the face were compared to verify the person.
Figure 6.4: A view of the eye-gaze tracking system
Figures 6.5 to 6.12 show 4 pairs of face images and the results of the eye-gaze tracking system for 10 viewers. The results are shown as circles on the face images. The center of each circle shows the gaze point and the radius of each circle shows the duration of gaze at that point.
The results in figures 6.6 and 6.8 show that the eye, nose and lip areas are the most important areas for viewers when verifying identity. In figure 6.9 the pair of face images is inverted in grayscale (negative images). About half of the viewers could not verify these images correctly, as shown in figure 6.10. However, the most important areas for viewers were again the nose, eyes and lips. Figure 6.12 shows how viewers compare a face image and a semi-drawing image. These results, together with those from the other 6 image pairs, show that the most important areas for the human face verification task are the eyes, nose and lips. Negative images are more difficult for humans to verify than normal images, while, as shown in the example in section 4.3.3 (page 40), a fractal recognition system can deal with this difficulty very well.
Based on these results, suitable subfractal areas for a face must contain the left and right eyes, the nose and the lips. To generate a complete fractal code of an image, the other parts of the face are also coded.
6.6 Summary
This chapter introduced the concept and underlying mathematics of a new fractal code for an image, called subfractal coding. With this method, the fractal code of an image can be divided into several subfractals. Each subfractal is defined to be independent of the others, so changes in one part of the image do not affect the subfractal codes of other parts of the same image.
Figure 6.5: A pair of face images shown to volunteers to verify the identity.
Figure 6.6: An illustration showing the results of the eye-gaze tracking system for 10 viewers. The center of each circle shows the gaze point and the radius of each circle shows the duration of gaze at that point.
Figure 6.7: Another pair of face images shown to volunteers to verify the identity.
Figure 6.8: The results of the eye-gaze tracking system show that the eye, nose and lip areas are the most important areas for viewers when verifying identity.
Figure 6.9: Another pair of face images. The face images are inverted in grayscale (negative images).
Figure 6.10: The results of the eye-gaze tracking system for negative images.
Figure 6.11: Yet another pair of face images. Note that the left face image is inverted in grayscale and the right face image is a semi-drawing.
Figure 6.12: The results of the eye-gaze tracking system for the face and semi-drawing image pair.
Chapter 7
Future Work and Conclusions
This thesis started to address four research questions:
1- Is it possible to use fractal codes of grayscale images as features
for recognition?
It has been shown throughout this thesis that fractal codes have a great capability to be used for recognition tasks such as face recognition. As described in Chapter 4, the fractal parameters of an image form a self-similarity based representation of that image and can be used as features for face recognition. The fractal codes of different images differ in the number of fractal features, so the system presented for using fractal codes as features contains a method for normalizing the features to generate reduced feature vectors of the same size. As the fractal code of an image contains several different parts, some variations in images, such as a shift in brightness, affect some of the parameters while others remain unchanged. This results in some degree of robustness in the system.
2- What is the mathematical basis for using fractals for recognition?
The extraction of fractal code from an image involves the partitioning of
the image into a set of range blocks. There is also a corresponding set of
domain blocks to choose from. For each range block, a suitable domain
block is found using some prescribed criterion. The mapping between the
domain and range blocks, which is a contractive, similarity transformation,
forms the fractal code for this range block. The fractal code for the image is a collection of the fractal codes for all range blocks. The fractal code of an image is not unique. An image xf can be represented as the attractor of a contractive transformation T of the form T(xf) = A × xf + B = xf.
3- Is it possible to design a more suitable fractal coding system for recognition?
The fractal code of an image is a set of transformations. Each transformation has two parts: a geometrical part and a luminance part. Fractal image-set coding keeps the geometrical parameters the same for all images in the database. Differences between images are captured in the non-geometrical or luminance parameters, which are faster to compute. For recognition purposes, the fractal code of a query image is applied to all the images in the training set for one iteration. The distance between an image and the result after one iteration is used to define a similarity measure between this image and the query image. Experiments show that this system can achieve a 95% accuracy rate on a subset of the XM2VTS database, with only 2 of 39 test cases failing.
4- Are the different parts of a fractal code independent? And if not, how can we
define and extract independent fractal codes of different parts of an image?
Experience with face images shows that changes in one part of the image may affect the fractal codes of that part and also of other parts of the image. Chapter 6 defines the subfractal, which is a new type of fractal code for an image. Each subfractal is defined to be independent of the others. An algorithm is presented for the extraction of subfractal codes.
7.1 Future Work
7.1.1 Improving the Robustness
Faces can vary in size, location in the image, and orientation about the z-axis. Such variation can be removed by normalising the face, after which descriptions for recognition can be obtained. The eyes are commonly detected for normalisation, and some effective eye detectors have been produced. But eye detection cannot always be successfully applied to faces; glasses or other obstacles can hide the eyes. Some methods use a whole-face approach to face normalisation, which is more robust.
Facial expression is another kind of variation that cannot be removed by normalisation. Expressions are commonly decomposed into six main emotions: happiness, sadness, surprise, disgust, anger and fear. Several algorithms have been proposed for facial expression detection [11], [32], [53], [62], [64]. Some of these techniques extract the motion of the nose, mouth, eyebrows and eyes with tracking algorithms, optical flow, motion energy, network criteria, 3D geometric modelling with a range finder, or colour image analysis techniques. Most of these techniques are used to recognize facial expressions, but only little effort has gone into the recognition of faces with varying facial expressions. A combination of the fractal face recognition system and a PCA based feature reduction system (as shown in Figure 7.1) can be used to show how robust this method is to facial variations such as human facial expression.
In this application, the domain index numbers for each range block are used as a feature vector. To normalise the size of each vector, the quadtree partitioning geometrical parameters, which are part of the fractal codes for each image, are used. Because quadtree partitioning can be applied to an image of arbitrary size, the feature vector can be resized to the size of the query image. In a typical image of size 128 × 128, the quadtree decomposition produces about 400 or more range blocks. The feature vectors are of this length and, after normalisation, will be uniformly of size 64 × 64, because the smallest range size used is 4 × 4. This is a large vector and must be reduced to suit classifiers.
The optimal linear method (in the least mean squared error sense) for reducing redundancy in a data set is the Karhunen-Loeve (KL) transform, or eigenvector expansion via Principal Components Analysis (PCA). The basic idea behind the KL transform is to transform possibly correlated variables in a data set into uncorrelated variables. The transformed variables are ordered so that the first one describes most of the variation of the original data set. The second describes the remaining variation under the constraint that it is uncorrelated with the first variable. This continues until all the variation is described by the new transformed variables, which are called principal components. Mathematically, PCA can be described as follows. Suppose X is a vector; let P be the transformation matrix required such that Y has a diagonal covariance matrix:
$$Y = P \times X$$
It has been shown that the rows of P are the eigenvectors of the covariance matrix $E[(X - \bar{X})(X - \bar{X})^{T}]$. The eigenvectors are arranged in descending order of the corresponding eigenvalues. The elements of Y are called the principal components of X. The expectation is computed as an average over all feature vectors from the training set. In this approach, PCA is performed on the fractal feature vectors, not on pixel values directly.
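The transform described above can be sketched directly from the covariance eigendecomposition. This is a generic PCA sketch (numpy assumed), not the exact code used in the experiments; the function name and return values are ours.

```python
import numpy as np

def pca(X, k):
    """PCA via the covariance eigendecomposition: rows of X are feature
    vectors; returns the k principal components of each row, the
    transformation matrix P (rows are eigenvectors), and the mean."""
    mean = X.mean(axis=0)
    Xc = X - mean                              # center the data
    cov = (Xc.T @ Xc) / len(X)                 # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)           # eigh: ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]         # keep the k largest
    P = vecs[:, order].T
    return Xc @ P.T, P, mean

# Hypothetical feature matrix: 50 vectors of dimension 10, reduced to 3
rng = np.random.default_rng(0)
X = rng.random((50, 10))
Y, P, m = pca(X, 3)
```

After the transform the retained components are mutually uncorrelated and ordered by decreasing variance, which is exactly the property the text describes.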
The aim of PCA is to reduce the dimension of the working space. The maximum number of principal components is the number of variables in the original space. However, in order to reduce the dimension, some principal components should be omitted. In order to minimize the error, the eigenvalues are sorted in decreasing order and the last eigenvalues (and their eigenvectors) may be dropped. We use this method to reduce the dimension of the fractal features to the number of individuals in the training database, for example from 16384 to 100. Independent component analysis (ICA) could also be used for feature reduction, but in this thesis we have restricted our attention to PCA.
The eigenface approach uses normalised face images as vectors of pixel values,
which are transformed using PCA into feature vectors. The difference with our
approach is the use of fractal code vectors instead of pixel values as input to
the PCA. Results in figure 7.2 seem to indicate that our method should provide
better robustness to expression variations. We also do not need to normalise the
face images for small changes in size, position and rotation.
We use the reduced fractal features as vectors of equal size. For classification, we use the mean squared error between the feature vector of the query image and the feature vectors of all images in the database as a distance measure, with a minimum distance classifier.
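The minimum distance classifier is then a small function over the reduced vectors. A sketch with hypothetical gallery data; the names are ours.

```python
import numpy as np

def classify(query_vec, gallery, labels):
    """Minimum mean-squared-error classifier: return the label of the
    gallery vector closest (in MSE) to the query feature vector."""
    errs = np.mean((gallery - query_vec) ** 2, axis=1)
    return labels[int(np.argmin(errs))]

# Hypothetical reduced feature vectors for three enrolled individuals
gallery = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
labels = ["person_a", "person_b", "person_c"]
print(classify(np.array([0.9, 1.1]), gallery, labels))
```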
7.1.2 Face Location and Detection
When a face image is captured using a video camera, the face may be located anywhere in the video frame (or still image). Because most face recognition methods rely on some normalisation of size and position, it is important to locate the face and find either its contour or the location of some reference points such as the eyes or the mouth. A segmentation technique, which distinguishes face pixels or blocks from background ones, can be used for this task. However, this is a difficult task and no good segmentation algorithms are known, especially when the background is not uniform in grayscale or texture. The possibility of using the subfractal idea to segment an image and locate the face can be studied.
Figure 7.1: Block diagram of the fractal face recognition system with PCA based feature reduction.
Figure 7.2: Matrix showing differences between faces shown on the two axes. Darker points indicate larger difference. Entries below the diagonal are pixel-value differences. Entries above the diagonal are fractal-feature differences.
7.1.3 Face Recognition Using Subfractals of Eyes and
Mouth Area
Face recognition accuracy can be improved if global features are augmented by features depending only on specific parts such as the eyes or mouth. This can only be done if these parts can be segmented out from the rest of the face, which requires properties of these parts that are distinct from those of the rest of the face. It is our contention that there is self-similarity within these parts, and that range blocks from the eyes will be transformed versions of domain blocks from within the eye, provided the search for the best suited domain is constrained to weight domains inversely with their distance from the range. Under some such constraint the eye region might turn out to be a subfractal within the face. We intend to test and further develop these ideas.
Other future directions include using subfractals for video coding and neural
network based subfractals.
Appendix A
Quick Glance Eye-Gaze Tracking
System
An eye-tracker system is designed to determine the gaze point and the duration
of gaze of the user on the computer monitor. This appendix introduces Quick
Glance, an eye-tracking product from EyeTech Digital Systems that was used for
the tests described in Section 6.5.
The Quick Glance system consists of two infrared LED light sources, a camera,
a power supply and cabling, a PCI bus board and software. The camera and
light sources are mounted on the computer’s monitor. The video capture card
(PCI bus board) is installed in an available computer slot and connected to the
camera with a cable. The software helps users to set up the system, calibrate it
and use it for their purposes.
This system measures the user’s gaze point by examining the pupil center and
corneal reflections from the user’s eye, which is illuminated by two low-power
infrared LEDs mounted on the computer’s monitor. The reflected light is focused
onto a camera, also mounted on the monitor. The image of the eye upon which
the camera is focused is captured at a fast, user-determined rate by the image
capture hardware provided with the system. By analyzing the positions of the
light reflections and the center of the pupil contained in the image, the software
determines the gaze point. Gaze point duration is also derived. With that
information, a gaze tracking program can illustrate the user’s gaze path by
moving the cursor according to the gaze point and its duration.
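As a simplified illustration of pupil-glint gaze estimation: the vector from the corneal reflection to the pupil center can be mapped to screen coordinates by a calibration fit. The affine model and both function names below are assumptions; the actual Quick Glance calibration model is proprietary and richer than this.

```python
import numpy as np

def fit_gaze_mapping(pupil_glint_vectors, screen_points):
    """Least-squares affine fit from pupil-center-minus-glint vectors to
    screen coordinates, a stand-in for the calibration step."""
    v = np.asarray(pupil_glint_vectors, dtype=float)
    A = np.column_stack([v, np.ones(len(v))])          # rows [vx, vy, 1]
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(screen_points, dtype=float),
                                 rcond=None)
    return coeffs                                      # 3x2 matrix

def gaze_point(coeffs, vector):
    """Map one pupil-glint vector to an estimated on-screen gaze point."""
    vx, vy = vector
    return np.array([vx, vy, 1.0]) @ coeffs
```

During calibration the user fixates known screen targets, giving the (vector, screen point) pairs for the fit; afterwards every captured frame yields one estimated gaze point.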
Appendix B
Experimental Details
This dissertation contains several experiments. The details of the experiments,
as well as the results and comparisons between them, are described in this appendix.
B.1 Fractal Codes as Features
Method: direct use of fractal codes as features (Chapter 4).
Coding method: conventional fractal coding.
Domain blocks: overlapping square blocks of two different sizes (8x8 and 16x16).
Range blocks: non-overlapping square blocks, generated by quad-tree partitioning
(Figure 4.1).
Geometrical aspects of transformation: contractive size matching and one of eight
orientations (Figure 4.2).
Number of features: 4 vectors (domain index number, orientation, brightness
shift and contrast factor).
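A minimal sketch of the matching step that produces these four features is given below. The function names are illustrative, and the domain blocks are assumed to be already contracted (averaged down) to the range-block size; production fractal coders differ in many details.

```python
import numpy as np

def eight_orientations(block):
    """The eight isometries of a square block: four rotations of the
    block and four rotations of its transpose (the reflections)."""
    return [np.rot90(b, k) for b in (block, block.T) for k in range(4)]

def encode_range_block(range_block, domains):
    """For each contracted domain block and each of the eight
    orientations, fit range ~ s*domain + o by least squares and keep
    the best match. Returns the four features listed above: domain
    index, orientation, contrast factor s and brightness shift o."""
    best = None
    for d_idx, dom in enumerate(domains):
        for o_idx, oriented in enumerate(eight_orientations(dom)):
            s, o = np.polyfit(oriented.ravel(), range_block.ravel(), 1)
            err = np.mean((s * oriented + o - range_block) ** 2)
            if best is None or err < best[0]:
                best = (err, d_idx, o_idx, s, o)
    return best[1:]
```

Applying this to every range block of a face yields the per-block feature vectors that the experiments below classify.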
Normalization: each of the fractal features is normalized to a specific size
(64x64) using the quad-tree partitioning geometry (Figure 4.4).
Database: a subset of the MIT face database containing 2 face images of each of
90 subjects, with some variation in illumination, scale and head orientation
(Figure 4.5).
Classification: the peak signal-to-noise ratio (PSNR) between the feature vectors
of the query image and the feature vectors of all images in the database is used
as a measure of distance. A minimum-distance classifier is then employed to
determine the recognition accuracy.
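The classification rule can be sketched as follows. The helper names are hypothetical; in the experiment the vectors being compared are the normalized fractal feature vectors.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two feature vectors; higher
    means more similar (infinite for identical vectors)."""
    mse = np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def classify(query, gallery):
    """Minimum-distance rule: the gallery entry with the highest PSNR
    (i.e. smallest distance) to the query wins."""
    return max(gallery, key=lambda label: psnr(query, gallery[label]))
```

Since PSNR grows as the mean squared error shrinks, maximizing PSNR is the same as picking the nearest neighbour in the mean-squared-error sense.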
Results: the classification accuracy for each feature is calculated separately. The
orientation parameter (72%) and the domain index (64%) show higher accuracy
than the other two features. The accuracy can be increased to 88.5% by using
the best of the four features (Figure 4.6).
B.2 Fractal Image-Set Coding
Method: fractal image-set coding (Chapter 5).
Coding method: the geometrical fractal features are calculated only once, from a
mean image or even a single chosen image.
Domain blocks: overlapping square blocks of two different sizes (8x8 and 16x16).
Range blocks: non-overlapping square blocks, generated by quad-tree partition-
ing.
Geometrical aspects of transformation: contractive size matching and one of eight
orientations.
Number of features: 1 vector (luminance parameters).
Normalization: every image in the data set is normalized using histogram
equalization and eye locations to produce 128x128 face images with the left and
right eyes at (32,32) and (96,32) respectively (Figure 5.3).
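The geometric part of this normalization can be expressed as a similarity transform fixed by the two eye correspondences. The sketch below only computes the transform matrix (the histogram equalization and the actual image resampling are not shown), and the function name is illustrative.

```python
import numpy as np

def eye_alignment_transform(left_eye, right_eye,
                            target_left=(32, 32), target_right=(96, 32)):
    """Similarity transform (rotation, scale and shift) sending the
    detected eye centers to the canonical 128x128 positions above.
    Returned as a 2x3 matrix M so that [x', y'] = M @ [x, y, 1]."""
    z1, z2 = complex(*left_eye), complex(*right_eye)
    w1, w2 = complex(*target_left), complex(*target_right)
    # treat points as complex numbers: z -> a*z + b is exactly a
    # rotation+scale (a) followed by a translation (b)
    a = (w2 - w1) / (z2 - z1)
    b = w1 - a * z1
    return np.array([[a.real, -a.imag, b.real],
                     [a.imag,  a.real, b.imag]])
```

Two point correspondences determine a similarity transform uniquely, so the eye locations alone fix the scale, in-plane rotation and translation of the normalized face.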
Databases and results: this method has been tested on two databases: a subset
of the MIT face database and a subset of the XM2VTS face database.
The subset of the MIT face database contains 90 persons with 2 shots per person.
One of the shots is used as test data while the other is used as training data.
The ROC plot in Figure B.1 shows the results of this experiment.
Figure B.1: The results of Fractal image-set coding for the subset of the MIT face database.
The recognition accuracy rate of this system is 83.33%, which is higher than the
result for any of the 4 individual fractal features tested in the first experiment.
The subset of the XM2VTS database contains 39 people with 4 images per person
(the first shot of each of 4 sessions). The image data set is divided into 3 sets: a
training set, an evaluation set and a test set. Three subjects (numbers 000, 002
and 007) are used as impostors in the evaluation set, 8 subjects (numbers 001,
008, 010, 011, 023, 028, 031 and 039) are used as impostors in the test set, and
the other subjects are used as clients. The first image of each client subject is
used as the test image while the other 3 images are used for training. Figure B.2
shows the results of this experiment for the evaluation data in ROC plot format.
Based on this plot, the threshold is set to obtain certain false acceptance rate
(FAR) and false rejection rate (FRR) values. The same threshold is then used
on the test set.
Figure B.2: The results of Fractal image-set coding for the evaluation subset of the XM2VTS database. Arrows show the position of the threshold for FRR=0, FRR=FAR and FAR=0.
To compare these results with the results of other researchers who also used the
XM2VTS database, the test set is evaluated at three different thresholds T:

TFAR=0 = argminT (FRR | FAR = 0)
TFRR=FAR = (T | FRR = FAR)
TFRR=0 = argminT (FAR | FRR = 0)
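These three operating points can be selected programmatically from the evaluation-set error curves. The sketch below assumes distance-style scores (lower means a better match); the function names and the threshold grid are illustrative.

```python
import numpy as np

def error_rates(client_scores, impostor_scores, thresholds):
    """FRR/FAR curves over a grid of thresholds: a client is rejected
    when its score exceeds the threshold, an impostor is accepted when
    it does not."""
    c = np.asarray(client_scores, float)[:, None]
    i = np.asarray(impostor_scores, float)[:, None]
    t = np.asarray(thresholds, float)[None, :]
    return (c > t).mean(axis=0), (i <= t).mean(axis=0)   # FRR, FAR

def pick_thresholds(frr, far, thresholds):
    """The three operating points defined above, chosen on the
    evaluation set and then reused unchanged on the test set."""
    t = np.asarray(thresholds, float)
    t_far0 = t[far == 0][np.argmin(frr[far == 0])]  # argmin FRR s.t. FAR=0
    t_eq = t[np.argmin(np.abs(far - frr))]          # FRR close to FAR
    t_frr0 = t[frr == 0][np.argmin(far[frr == 0])]  # argmin FAR s.t. FRR=0
    return t_far0, t_eq, t_frr0
```

Fixing the thresholds on the evaluation set and reusing them on the test set keeps the reported test-set error rates unbiased.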
Table B.1: Error rates obtained using Fractal image-set coding

Error   FRR=0     FAR=FRR   FAR=0
FAR     53.33%    16.4%     0.3%
FRR     0.0%      9.3%      51.85%
Table B.2: Error rates reported by T. Tan using fractal neighbor distances

Error   FRR=0     FAR=FRR   FAR=0
FAR     94.0%     13.6%     0.0%
FRR     0.0%      12.3%     81.3%
Figure B.3 shows these results in the form of an ROC plot. The error rates are
summarized in Table B.1. Using this information we can compare our results
with other results in the literature. For example, the results of face recognition
using fractal neighbor distances [91] are shown in Table B.2, which indicates that
our results have lower errors in most of the cases and slightly higher errors in
the other cases.
Figure B.3: The results of Fractal image-set coding for the test subset of the XM2VTS database. Arrows show the position of the threshold for FRR=0, FRR=FAR and FAR=0 in the evaluation data set.
Bibliography
[1] S. Akamatsu, T. Sasaki, H. Fukumachi, and Y. Suenaga, “A robust face iden-
tification scheme -KL expansion of an invariant feature space,” Proceedings
of SPIE, vol. 1607: Intelligent Robots and Computer Vision X: Algorithms
and Techniques, pp. 71–84, 1991.
[2] A. Alattar and S. Rajala, “Facial features localization in frontal view head
and shoulders images,” IEEE International Conference on Acoustics, Speech
and Signal Processing, vol. 6, pp. 3557–3560, 1999.
[3] M. Barnsley, Fractals Everywhere. Academic Press, San Diego, 1988.
[4] M. Barnsley and L. Hurd, Fractal Image Compression. AK Peters, Wellesley,
1993.
[5] M. S. Bartlett and T. J. Sejnowski, “Viewpoint invariant face recognition
using independent component analysis and attractor networks,” in Advances
in Neural Information Processing Systems (M. Mozer, M. Jordan, and T.
Petsche, eds.), pp. 817–823, Cambridge, MA: MIT Press, 1997.
[6] M. S. Bartlett and T. J. Sejnowski, “Independent components of face images:
A representation for face recognition,” in Proceedings of the 4th Annual Joint
Symposium on Neural Computation, (Pasadena, CA), May 1997.
[7] M. S. Bartlett, H. M. Lades, and T. J. Sejnowski, “Independent component
representations for face recognition,” in Proceedings of the SPIE Conference
on Human Vision and Electronic Imaging III, vol. 3299, pp. 528–539, 1998.
[8] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. fish-
erfaces: Recognition using class specific linear projection,” Proceedings of
European Conference on Computer Vision, ECCV’96, pp. 45–58, 1996.
[9] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. fisher-
faces: Recognition using class specific linear projection,” IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720,
1997.
[10] T. D. Bie, N. Cristianini, and R. Rosipal, “Eigenproblems in pattern recogni-
tion,” Handbook of Computational Geometry for Pattern Recognition, Com-
puter Vision, Neurocomputing and Robotics, E. Bayro-Corrochano (editor),
Springer-Verlag, April 2004.
[11] M. J. Black and Y. Yacoob, “Tracking and recognizing rigid and non-rigid
facial motion using local parametric models of image motion,” Proceedings
of IEEE International Conference on Computer Vision, ICCV95, Boston,
pp. 374–381, 1995.
[12] D. Blackburn, M. Bone, and P. J. Phillips, “Facial recognition vendor test
2000,” Evaluation report. National Institute of Standards and Technology,
2000.
[13] R. D. Boss and E. W. Jacobs, “Archetype classification in an iterated trans-
formation image compression algorithm,” in Fractal Image Compression -
Theory and Application, (Y. Fisher, ed.), pp. 79–90, Springer-Verlag, New
York, 1994.
[14] Boyer and Merzbach, A History of Mathematics. New York: John Wiley,
2nd ed., 1989.
[15] R. Brunelli and D. Falavigna, “Person identification using multiple cues,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17,
pp. 955–966, 1995.
[16] R. Brunelli and T. Poggio, “Face recognition through geometrical features,”
Proceedings of European Conference on Computer Vision, ECCV92, Santa
Margherita Ligure, pp. 792–800, 1992.
[17] R. Brunelli and T. Poggio, “Face recognition: features versus templates,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15,
1993.
[18] P. Burt, “Smart sensing within a pyramid vision machine,” Proceedings of
the IEEE, vol. 76, pp. 1006–1015, 1988.
[19] L. Chen, H. Liao, J. Lin, and C. Han, “Why recognition in a statistics-
based face recognition system should be based on the pure face portion:
a probabilistic decision-based proof,” Pattern Recognition, vol. 34, no. 5,
pp. 1393–1403, 2001.
[20] G. Chow and X. Li, “Towards a system for automatic facial feature detec-
tion,” Pattern Recognition, vol. 26, no. 12, pp. 1739–1755, 1993.
[21] G. M. Davis, “A wavelet-based analysis of fractal image compression,” IEEE
Transactions on Image Processing, pp. 100–112, 1997.
[22] O. Deniz, M. Castrillon, and M. Hernandez, “Face recognition using inde-
pendent component analysis and support vector machines,” 3rd Interna-
tional Conference on Audio- and Video-based Biometric Person Authentica-
tion 2001, Halmstad, Sweden, June 6-8,, vol. 2091, pp. 59–64, 2001.
[23] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Face recognition
using fractal codes,” Proceedings of WoSPA 2000, Brisbane, Australia, 2000.
[24] H. Ebrahimpour-komleh, V. Chandran., and S. Sridharan, “Face recogni-
tion using fractal codes,” Proceedings of International Conference on Image
Processing, vol. 3, pp. 58–61, 2001.
[25] H. Ebrahimpour-komleh, V. Chandran., and S. Sridharan, “Robustness to
expression variations in fractal-based face recognition,” Sixth International
Symposium on Signal Processing and its Applications, vol. 1, pp. 359–362,
2001.
[26] H. Ebrahimpour-komleh, V. Chandran., and S. Sridharan, “Mathematical
basis for use of fractal codes as features,” Image and Vision Computing ’02
New Zealand, 2002.
[27] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, An Application of
Fractal Image-set Coding in Facial Recognition, vol. 3072 of Lecture Notes in
computer science, Biometric Authentication, pp. 178–186. Springer Verlag,
July 2004.
[28] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Facial image re-
trieval using fractal image-set coding,” Feb. 2004.
[29] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Fractal image-
set encoding for face recognition,” in Proceedings of International Conference
on Computational Intelligence for Modelling Control and Automation, (Gold
Coast, Australia), pp. 664–672, July 2004.
[30] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Subfractals: A
new concept for fractal image coding and recognition,” Submitted to the
Journal of Complexity International, 2004.
[31] R. Epstein, P. Hallinan, and A. Yuille, “5±2 eigenimages suffice: An empir-
ical investigation of low-dimensional lighting models,” Proceedings of the
Workshop on Physics-based Modeling in Computer Vision, pp. 108–116,
1995.
[32] I. A. Essa and A. P. Pentland, “Facial expression recognition using a dynamic
model and motion energy,” Proceedings of IEEE International Conference
on Computer Vision, ICCV95, Boston, pp. 360–367, 1995.
[33] K. Etemad and R. Chellappa, “Face recognition using discriminant eigen-
vectors,” Proceedings of International Conference on Acoustics, Speech and
Signal Processing, pp. 2148–2151, 1996.
[34] R. A. Fisher, “The use of multiple measurements in taxonomic problems,”
Annals of Eugenics, vol. 7, pp. 179–188, 1936.
[35] Y. Fisher, ed., Fractal Image Compression: Theory and Application.
Springer-Verlag , New York, NY, USA, 1995.
[36] Y. Fisher, ed., Fractal Image Encoding and Analysis. NATO ASI Series,
Springer-Verlag, Berlin Heidelberg, 1998.
[37] K. Fukunaga, Introduction to Statistical Pattern Recognition. Academic
Press, 2nd ed., 1990.
[38] M. Gharavi-Alkhansari and T. S. Huang, “A generalized method for im-
age coding using fractal-based techniques,” Journal of Visual Communication
and Image Representation, vol. 8, no. 2, pp. 208–225, 1997.
[39] A. J. Goldstein, L. Harmon, and A. Lesk, “Identification of human faces,”
Proceedings of the IEEE, pp. 748–760, 1971.
[40] R. Gross, J. Shi, and J. Cohn, “The current state of the art in face recog-
nition,” Technical Report, Robotics Institute, Carnegie Mellon University,
Pittsburgh,USA, 2004.
[41] P. Hancock, V. Bruce, and M. Burton, “A comparison of two computer-
based face identification systems with human perceptions of faces,” Vision
Research, vol. 38, 1998.
[42] L. D. Harmon, M. K. Khan, R. Lasch, and P. F. Ramig, “Machine identifi-
cation of human faces,” Pattern Recognition, pp. 97–110, 1981.
[43] J. Hutchinson, “Fractals and self similarity,” Indiana University Mathemat-
ics Journal, vol. 30, no. 5, pp. 713–747, 1981.
[44] A. Hyvarinen and E. Oja, “Independent component analysis: Algorithms
and applications,” Neural Networks, vol. 13, no. 4-5, pp. 411–430, 2000.
[45] A. E. Jacquin, A Fractal Theory of Iterated Markov Operators with Applica-
tions to Digital Image Coding. PhD thesis, Georgia Institute of Technology,
1989.
[46] A. E. Jacquin, “Fractal image coding: A review,” Proceedings of the IEEE,
vol. 81, no. 10, pp. 1451–1465, 1993.
[47] T. Kanade, Picture Processing by Computer Complex and Recognition of
Human Faces. PhD thesis, Kyoto University, 1973.
[48] T. Kanade, J. Cohn, and Y. Tian, “Comprehensive database for facial ex-
pression analysis,” Proceedings of the 4th IEEE International Conference on
Automatic Face and Gesture Recognition (FG’00), pp. 46 – 53, March 2000.
[49] M. D. Kelly, “Visual identification of people by computer,” Technical
report AI-130, Stanford AI Project, Stanford, CA., 1970.
[50] M. Kirby and L. Sirovitch, “Application of the Karhunen-Loeve procedure
for the characterization of human faces,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 12, pp. 103–108, 1990.
[51] A. Kouzani, F. He, and K. Sammut, “Face image matching using fractal
dimension,” IEEE International Conference on Image Processing, pp. 642–
646, 1999.
[52] A. Z. Kouzani, F. He, and K. Sammut, “Fractal face representation and
recognition,” IEEE International Conference on Systems, Man and Cyber-
netics, vol. 2, pp. 1609–1613, 1997.
[53] A. Lanitis, C. J. Taylor, and T. Cootes, “A unified approach to coding and
interpreting face images,” Proceedings of IEEE International Conference on
Computer Vision, ICCV95, Boston, pp. 368–373, 1995.
[54] J. Lu, K. Plataniotis, and A. Venetsanopoulos, “Face recognition using LDA-
based algorithms,” IEEE Trans. on Neural Networks, vol. 14, no. 1, pp. 195–
200, 2003.
[55] S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dic-
tionaries,” IEEE Transactions on Signal Processing, vol. 41, pp. 3397–3415,
1993.
[56] B. Mandelbrot, Les Objets Fractals: Forme, Hasard et Dimension. Paris:
Flammarion, 1975.
[57] B. Mandelbrot, Fractals: Form, Chance and Dimension. Freeman, W. H. and
Company, 1977.
[58] B. Mandelbrot, The Fractal Geometry of Nature. Freeman, W. H. and Com-
pany, 1982.
[59] B. Manjunath, R. Chellappa, and C. von der Malsburg, “A feature based ap-
proach to face recognition,” Proceedings of IEEE Computer Society. Confer-
ence on Computer Vision and Pattern Recognition, pp. 373–378, 1992.
[60] J. Manjunath, N. Orlans, and A. Piszcz, “Effects of eye position on eigenface-
based face recognition scoring,” Technical Report, The MITRE corporation,
7515 Colshire Drive, McLean, VA 22102, USA, 2003.
[61] A. M. Martinez and R. Benavente, “The AR face database,” CVC Tech.
Report 24, 1998.
[62] K. Mase, “Recognition of facial expression from optical flow,” IEICE Trans-
actions, vol. E74, no. 10, pp. 3474–3483, 1991.
[63] J. Matas, M. Hamouz, K. Jonsson, J. Kittler, Y. Li, C. Kotroupolous,
A. Tefas, I. Pitas, T. Tan, H. Yan, F. Smeraldi, J. Bigun, N. Capdevielle,
W. Gerstner, S. Ben-Yacoub, and Y. Abduljaoued, “Comparison of face ver-
ification results on the XM2VTS database,” in Proceedings of the 15th ICPR
(A. Sanfeliu, J. J. Villanueva, M. Vanrell, R. Alqueraz, J. Crowley, and
Y. Shirai, eds.), vol. 4, (Los Alamitos, USA), pp. 858–863, IEEE Computer
Soc Press, 2000.
[64] K. Matsuno, C. Lee, S. Kimura, and S. Tsuji, “Automatic recognition of
human facial expressions,” Proceedings of IEEE International Conference
on Computer Vision, ICCV95, Boston, pp. 352–359, 1995.
[65] K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, “XM2VTSDB:
The extended M2VTS database,” March 1999.
[66] M. Michaelis, R. Herpers, L. Witta, and G. Sommer, “Hierarchical filtering
scheme for the detection of facial keypoints,” International Conference on
Acoustics, Speech, and Signal Processing, vol. 4, pp. 2541–2544, 1997.
[67] B. Moghaddam and A. Pentland, “Probabilistic visual learning for ob-
ject representation,” The 5th International conference on Computer Vision,
Cambridge MA, pp. 786–793, 1995.
[68] B. Moghaddam and A. Pentland, “Probabilistic visual learning for object
representation,” IEEE Transactions on Pattern Analysis and Machine In-
telligence, vol. 19, pp. 676–710, 1997.
[69] D. M. Monro and F. Dudbridge, “Fractal block coding of images,” Electron-
ics Letters, vol. 28, no. 11, pp. 1053–1055, 1992.
[70] D. M. Monro and F. Dudbridge, “Rendering algorithms for deterministic
fractals,” IEEE Computer Graphics and Applications, vol. 15, no. 1, pp. 32–
41, 1995.
[71] A. Nefian, A hidden Markov model-based approach for face detection and
recognition. PhD thesis, Georgia Institute of Technology, Atlanta, GA, 1999.
[72] G. Neil and K. M. Curtis, “Scale and rotationally invariant object recog-
nition using fractal transformations,” Proceedings of IEEE International
Conference on Acoustics, Speech and Signal Processing, ICASSP96, vol. 6,
pp. 3458–3461, 1996.
[73] G. Neil and K. M. Curtis, “Shape recognition using fractal geometry,” Pat-
tern recognition, vol. 30, no. 12, pp. 1957–1969, 1997.
[74] A. Pentland and T. Choudhury, “Face recognition for smart environments,”
IEEE Computer, vol. 33, no. 2, pp. 50–55, 2000.
[75] A. Pentland, R. Picard, and S. Scarloff, “Photobook: content-based ma-
nipulation of image databases,” International Journal of Computer Vision,
vol. 18, pp. 233–254, 1996.
[76] P. Phillips, “Matching pursuit filters design,” 12th International Conference
on pattern recognition, pp. 57–61, 1994.
[77] P. Phillips, “Matching pursuit filters design for face identification,” in SPIE,
vol. 2277, pp. 2–9, 1994.
[78] P. Phillips, “Matching pursuit filters applied to face identification,” IEEE
Transactions on Image Processing, vol. 7, no. 8, pp. 1150–1164, 1998.
[79] P. J. Phillips, P. Grother, R. Micheals, D. M. Blackburn, E. Tabassi, and
J. M. Bone, “Face recognition vendor test 2002: Overview and summary,”
National Institute of Standards and Technology, 2003.
[80] P. J. Phillips, A. Martin, C. L. Wilson, and M. Przybocki, “An introduction
to evaluating biometric systems,” IEEE Computer, vol. 33, no. 2, pp. 56–63,
2000.
[81] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evalua-
tion methodology for face-recognition algorithms,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104,
2000.
[82] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, “The FERET database
and evaluation procedure for face-recognition algorithms,” Image and Vision
Computing, vol. 16, pp. 295–306, 1998.
[83] S. Rizvi, P. Phillips, and H. Moon, “The FERET verification testing protocol
for face recognition algorithms,” Technical report, NISTIR 6281, National
Institute of Standards and Technology, 1998.
[84] J. Ruiz-del-Solar and P. Navarrete, “Eigenspace-based face recognition: a
comparative study of different approaches,” IEEE Transactions on Systems,
Man and Cybernetics, Part C, vol. 35, pp. 315–325, 2005.
[85] F. Samaria and S. Young, “HMM based architecture for face identification,”
Image and Vision Computing, vol. 12, no. 8, pp. 537–543, 1994.
[86] A. W. Senior, “Face and feature finding for a face recognition system,” Sec-
ond International Conference on Audio- and Video-based Biometric Person
Authentication, pp. 154–159, 1999.
[87] G. Shakhnarovich and B. Moghaddam, “Face recognition in subspaces,”
Handbook of Face Recognition, Eds. Stan Z. Li and Anil K. Jain, Springer-
Verlag, pp. 154–159, 2004.
[88] L. Sirovitch and M. Kirby, “Low-dimensional procedure for the character-
ization of human faces,” Journal of the Optical Society of America, vol. 4,
pp. 519–524, 1987.
[89] L. Stringa, “Eyes detection for face recognition,” Applied Artificial Intelli-
gence, vol. 7, pp. 365–382, 1993.
[90] H. Takayasu, Fractals in the Physical Sciences. Manchester University Press,
1990.
[91] T. Tan, “Human face recognition based on fractal image coding,” PhD
thesis, The University of Sydney, 2003.
[92] T. Tan and H. Yan, “Analysis of the contractivity factor in fractal based face
recognition,” IEEE International Conference on Image Processing, vol. 3,
pp. 637–641, 1999.
[93] T. Tan and H. Yan, “Face recognition by fractal transformations,” Pro-
ceedings of IEEE International Conference on Acoustics, Speech and Signal
Processing, ICASSP99, pp. 3537–3540, 1999.
[94] T. Tan and H. Yan, “Object recognition using fractal neighbor distance:
Eventual convergence and recognition rates,” Proceedings of 15th Interna-
tional Conference Pattern Recognition, pp. 781–784, 2000.
[95] L. Torres, “Is there any hope for face recognition?,” Proc. of the 5th Inter-
national Workshop on Image Analysis for Multimedia Interactive Services,
WIAMIS 2004, pp. 21–23, 2004.
[96] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive
Neuroscience, vol. 3, pp. 71–86, 1991.
[97] M. Turk and A. Pentland, “Face recognition using eigenfaces,” Proceedings
of IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–
591, 1991.
[98] L. Vences and I. Rudomin, “Genetic algorithms for fractal image and im-
age sequence compression,” in Proceedings Computacion Visual, pp. 35–44,
Universidad Nacional Autonoma de Mexico, 1997.
[99] S. Welstead, Fractal and Wavelet Image Compression Techniques. SPIE
Press, 1999.
[100] L. Wiskott, J. Fellous, N. Krüger, and C. von der Malsburg, “Face recog-
nition by elastic bunch graph matching,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 19, no. 7, 1997.
[101] L. Wiskott and C. v. d. Malsburg, “Recognizing faces by dynamic link
matching,” Proceedings of International Conference on Artificial Neural Net-
works, ICANN’95, pp. 347–352, 1995.
[102] L. Wiskott and C. von der Malsburg, “Labeled bunch graphs for image
analysis,” Apr. 2001. United States Patent 6,222,939.
[103] L. Wiskott and C. von der Malsburg, “Labeled bunch graphs for image
analysis,” Mar. 2002. United States Patent 6,356,659.
[104] L. Wiskott and C. von der Malsburg, “Labeled bunch graphs for image
analysis,” May 2003. United States Patent 6,563,950.
[105] B. Wohlberg and G. de Jager, “A review of the fractal image coding litera-
ture,” IEEE Transactions on Image Processing, vol. 8, no. 12, pp. 1716–1729,
1999.
[106] A. L. Yuille, P. Hallinan, and D. Cohen, “Feature extraction from faces using
deformable templates,” International Journal of Computer Vision, vol. 8,
no. 2, pp. 99–111, 1992.
[107] W. Zhao and R. Chellappa, “Face recognition: A literature survey,” ACM
Journal of Computing Surveys, pp. 399–458, 2003.