Fractal Techniques for Face Recognition
by
Hossein Ebrahimpour-Komleh
M.Sc. Computer Engineering (With Honours)
B.Sc. Computer Engineering (First Class Honours)
PhD Thesis
Submitted in Fulfilment
of the Requirements
for the Degree of
Doctor of Philosophy
at the
Queensland University of Technology
Research Program in Speech, Audio, Image & Video
Technologies
August 2004
Keywords
fractals, subfractals, fractal image-set coding, image coding, face recognition,
image processing, computer vision
To my wife Soheila
and my little daughter Niloufar
Abstract
Fractals are popular because of their ability to create complex images using only
a few simple codes. This is possible by capturing image redundancy and pre-
senting the image in compressed form using the self-similarity feature. For many
years fractals have been used for image compression. In the last few years they have
also been applied to face recognition. In this research we present new fractal meth-
ods for recognition, especially human face recognition.
This research introduces three new methods for using fractals for face recognition:
the use of fractal codes directly as features, fractal image-set coding, and subfractals.
In the first part, the mathematical principle behind the application of fractal
image codes for recognition is investigated. An image Xf can be represented as
Xf = A × Xf + B, where A and B are fractal parameters of the image Xf. Different
fractal codes can be found for any arbitrary image. With the definition of a
fractal transformation, T(X) = A(X − Xf) + Xf, we can define the relationship
between the images produced in the fractal decoding process, starting with any
arbitrary image X0, as Xn = T^n(X0) = A^n(X0 − Xf) + Xf. We show that some
choices for A or B lead to faster convergence to the final image.
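The convergence behaviour described above can be sketched numerically. The following fragment is an illustrative toy, not the thesis implementation: the image is reduced to a vector and the fractal code is abstracted to a single contractive affine map. It shows that the decoding iteration reaches the same fixed point Xf from an arbitrary starting image, at a rate governed by the contractivity of A.

```python
import numpy as np

# Toy model: a fractal code as one affine map X -> A @ X + B with
# contractive A, so the decoding iteration X_{n+1} = A X_n + B converges
# to the fixed point X_f = (I - A)^{-1} B from any starting image X_0.
rng = np.random.default_rng(0)
n = 16                                    # toy "image" as a length-16 vector
A = 0.5 * np.eye(n)                       # contractive: ||A|| = 0.5 < 1
B = rng.standard_normal(n)
X_f = np.linalg.solve(np.eye(n) - A, B)   # the attractor of the code

X = rng.standard_normal(n)                # arbitrary initial image X_0
errors = []
for _ in range(20):
    X = A @ X + B                         # one decoding iteration
    errors.append(np.linalg.norm(X - X_f))

# The error shrinks geometrically: ||X_n - X_f|| <= ||A||^n ||X_0 - X_f||,
# so a smaller contractivity factor gives faster convergence.
assert errors[-1] < 1e-5 * errors[0]
```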
Fractal image-set coding is based on the fact that a fractal code of an arbitrary
gray-scale image can be divided into two parts – geometrical parameters and lumi-
nance parameters. Because the fractal codes for an image are not unique, we can
change the set of fractal parameters without significant change in the quality of
the reconstructed image. Fractal image-set coding keeps the geometrical parameters
the same for all images in the database. Differences between images are captured
in the non-geometrical or luminance parameters, which are faster to compute.
For recognition purposes, the fractal code of a query image is applied to all the
images in the training set for one iteration. The distance between an image and
the result after one iteration is used to define a similarity measure between this
image and the query image.
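The matching rule just described can be sketched as follows. The code and all names are hypothetical, and the query's fractal code is again abstracted to a single affine map rather than a full block-based PIFS: applying the query's code once moves images near its fixed point only a little, so the per-image distance acts as a dissimilarity score.

```python
import numpy as np

# Hypothetical sketch of the one-iteration matching rule.
def one_iteration_distance(A, B, image):
    """||T_q(X) - X|| after a single decoding step of the query's code."""
    return np.linalg.norm(A @ image + B - image)

rng = np.random.default_rng(1)
n = 16
A = 0.5 * np.eye(n)                       # query's (toy) fractal code
B = rng.standard_normal(n)
x_f = np.linalg.solve(np.eye(n) - A, B)   # query's attractor

noise = rng.standard_normal(n)
near = x_f + 0.01 * noise                 # training image resembling the query
far = x_f + 10.0 * noise                  # unrelated training image

scores = {"near": one_iteration_distance(A, B, near),
          "far": one_iteration_distance(A, B, far)}
best = min(scores, key=scores.get)        # smallest distance = best match
assert best == "near"
```

Since T_q(X) − X = (A − I)(X − x_f), the score is proportional to the distance from the query's fixed point, which is what makes it usable as a similarity measure.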
The fractal code of an image is a set of contractive mappings, each of which
transfers a domain block to its corresponding range block. The distribution of
selected domain blocks for range blocks in an image depends on the content of
the image and the fractal encoding algorithm used for coding. A small variation
in a part of the input image may change the contents of the range and domain
blocks in the fractal encoding process, resulting in a change in the transformation
parameters in the same part or even in other parts of the image. A subfractal is a
set of fractal codes related to the range blocks of one part of the image. These codes
are calculated to be independent of the codes of the other parts of the same
image. In this case the domain blocks nominated for each range block must be
located in the same part of the image from which the range blocks come.
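The domain-restriction idea can be sketched in a few lines. The block sizes, the partition into parts, and all function names here are hypothetical, not taken from the thesis: in plain PIFS coding every range block may pick its domain anywhere in the image, whereas a subfractal restricts candidates to the same image part, so the codes of one part stay independent of changes elsewhere.

```python
# Illustrative sketch of restricting domain-block candidates to one part.
def block_grid(width, height, size):
    """Top-left corners of all size x size blocks on a regular grid."""
    return [(x, y) for y in range(0, height - size + 1, size)
                   for x in range(0, width - size + 1, size)]

def contains(part, block):
    """True if the block's top-left corner lies inside the part."""
    px, py, pw, ph = part
    bx, by = block
    return px <= bx < px + pw and py <= by < py + ph

def candidate_domains(range_block, parts, domains):
    """Keep only domain blocks lying in the same part as the range block."""
    part = next(p for p in parts if contains(p, range_block))
    return [d for d in domains if contains(part, d)]

# Two vertical halves of a 64 x 64 image as the "parts".
parts = [(0, 0, 32, 64), (32, 0, 32, 64)]
domains = block_grid(64, 64, 16)          # 16 x 16 domain blocks
left_range = (8, 8)                       # a range block in the left part
allowed = candidate_domains(left_range, parts, domains)
assert all(x < 32 for x, _ in allowed)    # no domains from the right part
```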
The proposed fractal techniques were applied to face recognition using the MIT
and XM2VTS face databases. Accuracies of 95% were obtained with up to 156
images.
Contents
Abstract i
List of Figures viii
List of Tables xv
Acronyms & Units xvi
Certification of Thesis xvii
Acknowledgments xviii
Chapter 1 Introduction 1
1.1 Chaos and Fractals . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Face recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Thesis Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Publications Resulting from research . . . . . . . . . . . . . . . . 4
Chapter 2 Fractal Encoding and Decoding 6
2.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.2 Features of Fractals . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.3 Mathematical Foundations . . . . . . . . . . . . . . . . . . . . . . 9
2.3.1 Metric Space . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3.2 Contractive Transformations . . . . . . . . . . . . . . . . . 10
2.3.3 Fixed Point Theorem . . . . . . . . . . . . . . . . . . . . . 10
2.3.4 Affine Transformation . . . . . . . . . . . . . . . . . . . . 11
2.4 Iterated Function Systems(IFS) . . . . . . . . . . . . . . . . . . . 11
2.5 Principles of Fractal Coding . . . . . . . . . . . . . . . . . . . . . 12
2.5.1 Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.5.2 Transformations . . . . . . . . . . . . . . . . . . . . . . . . 15
2.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Chapter 3 Face Recognition 19
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.2 Facial Feature Detection . . . . . . . . . . . . . . . . . . . . . . . 19
3.3 Geometric Feature Based Methods . . . . . . . . . . . . . . . . . 20
3.3.1 Face Recognition Using Principal Component Analysis
(Eigenfaces) . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.3.2 Recognition Using Independent Component Analysis (ICA) 22
3.4 Linear Discriminant-Based Method . . . . . . . . . . . . . . . . . 23
3.4.1 Other Methods . . . . . . . . . . . . . . . . . . . . . . . . 24
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Chapter 4 Fractal Codes Directly as Features 27
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.2 Previous Related Work . . . . . . . . . . . . . . . . . . . . . . . . 28
4.2.1 Shape Recognition Using Fractal Geometry . . . . . . . . . 28
4.2.2 Face Recognition Using Fractal Dimensions . . . . . . . . . 28
4.2.3 Face Recognition Using Fractal Neighbor Distances . . . . 29
4.3 Fractal Codes as Features . . . . . . . . . . . . . . . . . . . . . . 29
4.3.1 Fractal Extraction . . . . . . . . . . . . . . . . . . . . . . 30
4.3.2 Normalizing Fractal Codes . . . . . . . . . . . . . . . . . . 36
4.3.3 Accuracy Tests . . . . . . . . . . . . . . . . . . . . . . . . 37
4.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Chapter 5 Fractal Image-set Coding 43
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.2 Mathematical Bases . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.3 Fractal Image-set Coding . . . . . . . . . . . . . . . . . . . . . . . 45
5.4 Similarity Measurements . . . . . . . . . . . . . . . . . . . . . . . 50
5.4.1 Minkowski-Form Distance . . . . . . . . . . . . . . . . . . 51
5.4.2 Cosine Distance . . . . . . . . . . . . . . . . . . . . . . . . 51
5.4.3 Fractal Similarity Measures . . . . . . . . . . . . . . . . . 52
5.5 Using Fractal Image-set Coding for Face Recognition . . . . . . . 53
5.6 Experimental Results . . . . . . . . . . . . . . . . . . . . . . . . . 54
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Chapter 6 Subfractals 65
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.2 Basic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
6.3 Subfractal Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.4 Mathematical Basis . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.5 How to Use Subfractals for Face Recognition . . . . . . . . . . . . 77
6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Chapter 7 Future Work and Conclusions 85
7.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
7.1.1 Improving the Robustness . . . . . . . . . . . . . . . . . . 87
7.1.2 Face Location and Detection . . . . . . . . . . . . . . . . . 89
7.1.3 Face Recognition Using Subfractals of Eyes and Mouth Area 91
Appendix A Quick Glance Eye-Gaze Tracking System 93
Appendix B Experimental Details 95
B.1 Fractal Codes as Features . . . . . . . . . . . . . . . . . . . . . . 95
B.2 Fractal Image-Set Coding . . . . . . . . . . . . . . . . . . . . . . 96
Bibliography 101
List of Figures
2.1 One of the best examples for understanding the features of fractals
is the fern. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 von Koch’s snowflake with fractal dimension of 1.26 . . . . . . . 13
2.3 Sierpinski triangle, the attractor of an IFS containing 3 contractive
transformations. . . . . . . . . . . . . . . . . . . . . . . . . . . 13
3.1 Examples of pose(XM2VTS face image database [65]), lighting
(AR face image database [61]) and facial expression variations
(CMU-Pitt facial expression database [48]) in face images. . . . . 25
4.1 Domain (bottom) and range (top left) blocks for an image (top
right) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.2 The eight possible orientations of a block. The orientations consist
of four 90° rotations, a reflection and four more 90° rotations. . . 33
4.3 An illustration of domain and range blocks . . . . . . . . . . . . . 33
4.4 Fractal features of an image (A=Domain index number,
B=Rotation (orientation) index, C=Brightness shift and
D=Contrast factor) displayed as gray values over the quad-
tree partition of the same image. . . . . . . . . . . . . . . . . . . 36
4.5 Typical images from the MIT face database
(ftp:\\whitechapel.media.mit.edu\pub\eigenfaces\pub\images).
Two different frontal views of each person are included. . . . . . 37
4.6 Recognition accuracy using Rotation, Domain index, brightness
and contrast features, independently, and total accuracy achieved
using all features, plotted against the number of the images in the
database as this is progressively increased. . . . . . . . . . . . . . 38
4.7 A query Image and the first 8 closest matches found by the method
(the first four images are the best hit for each feature and the last
four images are the second best hit of that feature). . . . . . . . 39
4.8 A rotated query image and the first 8 closest matches. Note that
the best match image retrieved using the orientation feature is the
correct person. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.9 Another rotated query image and the first 8 closest matches only
using the orientation feature. . . . . . . . . . . . . . . . . . . . . 41
4.10 A query image inverted in grayscale and the first 8 closest matches.
Note that the rotation feature gets both first and second matches
right. The brightness feature does not find the right match because
the change negatives the brightness feature. The other two features
find the right match. . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.1 An illustration of Get-Block and Put-Block operators . . . . . . . 44
5.2 Illustrations of the function T(x) = A × x + B for a one-dimensional
space ℜ: a) s > 1, b) s = 1, c, d, e) s < 1 . . . . . . . . . . . . . 47
5.3 An example of preprocessing with an image in the data-set.(A)
The original image, (B) grayscale image with orientation normal-
ized, (C) Nominated image with face area marked, (D) normalized
histogram equalized face image . . . . . . . . . . . . . . . . . . . 48
5.4 (A) Average image of the data-set, (B) An arbitrary image from
the data-set, (C) Range blocks for image A, (D) The same range
blocks applied to image B . . . . . . . . . . . . . . . . . . . . . . 49
5.5 The initial image and the first, third and fifth iterates of decoding
transformations corresponding to image 005 4 1. . . . . . . . . . . 50
5.6 The PSNR versus the number of decoding steps for 4 different
128× 128 gray-scale, normalized, encoded images of the XM2VTS
database. The dash-dot line, solid line, dashed line and dotted line
correspond to the images 002 1 1, 000 1 1, 003 1 1 and 005 4 1
of the XM2VTS database, respectively . . . . . . . . . . . . . . . 57
5.7 Euclidean distance takes both angle and vector lengths into ac-
count to calculate the distance, while cosine distance only takes
angle into account. . . . . . . . . . . . . . . . . . . . . . . . . . . 57
5.8 Convergence trajectories for three different initial images when the
same fractal code is applied iteratively. Note that the initial image
(x03) closest to the fixed point shows the least distance between
successive iterations (d3 < d2 < d1). The fractal parameters are
A = 0.9 × ρ45 and B = (I − 0.9 × ρ45) × xf . . . . . . . . . . . . . 58
5.9 Convergence trajectories for the same three initial images when
the fractal code parameters are A = 0.9 × ρ15 and B = (I − 0.9 ×
ρ15) × xf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.10 Convergence trajectories for the same three initial images when the
fractal code parameters are A = 0.6×ρ45 and B = (I−0.6×ρ45)×xf . 59
5.11 Convergence trajectories for the same three initial images when
the fractal code parameters are A = 0.6 × ρ15 and B = (I − 0.6 ×
ρ15) × xf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.12 An example showing a query image on top followed by the six
closest images in the database. The best match is on the top-left,
followed by others left to right in row-first order. Note that the
first three matches are images of faces of the correct person and
some change in expression is tolerated by the method. . . . . . . 60
5.13 The error(top left) and the similarity (top right) between the query
image and the images in the training data-set. Errors are all very
small. Normalized error (bottom left) and normalized similarity
(bottom right) for the same images. Note that the normalized
similarity measure clearly shows the best matching face number as
9. Values of this measure for other faces are below 0.7 in this case. 60
5.14 Another example showing a correctly identified case. Note here
that there is a more marked change in facial expression and pose. 61
5.15 Yet another correctly identified case. Note that the first three
matches are images of faces of the correct person. . . . . . . . . . 61
5.16 Yet another correctly identified test case. Note here that the query
image is of a light-skinned individual and so are all the 6 closest
matched images. . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.17 A test case that failed. The second closest match is of the correct
individual but the facial hair change is too severe for the method
to cope. The closest matched face appears to be of a different
person but with very similar expression and features such as the eyes
and mouth. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.18 Query image (top) and training images (bottom) for the individual
number 019 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.19 The only other test case that failed. The fourth and sixth closest
matches are of the correct individual. . . . . . . . . . . . . . . . . 63
5.20 Query image (top) and training images (bottom) for the individual
number 005 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.21 A plot showing accuracy versus the number of persons in the
database. Three images are used per person in the training set and
one image per person in the test set. . . . . . . . . . . . . . . . . 64
6.1 A distribution of the difference in the x position of the domains xd
and ranges xr for an encoding of the 512 × 512 Lena image, as well as
the theoretical distribution (dashed line) of the difference of two
randomly selected points. Adapted from [35]. . . . . . . . . . . . 66
6.2 A distribution of the difference in the y position of the domains
yd and ranges yr for an encoding of the 512 × 512 Lena image, as well
as the theoretical distribution (dashed line) of the difference of
two randomly selected points. Adapted from [35]. Note that the
distribution is skewed and also has significantly large values close
to 0. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
6.3 Range blocks (top left) in four major subfractal areas (eyes, nose
and lips) and corresponding domain blocks (bottom rows) for an
arbitrary face image. At top right, a plot of pixel values vs. pixel
numbers for the last matched domain and range block is shown. . . 71
6.4 A view of the eye-gaze tracking system . . . . . . . . . . . . . . . 78
6.5 A pair of face images shown to volunteers to verify the identity. . 80
6.6 An illustration showing the results of the eye-gaze tracking system
for 10 viewers. Circles (the centers) show the gaze points and the
radius of each circle shows the duration of gaze on that point. . . 80
6.7 Another pair of face images shown to volunteers to verify the iden-
tity. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
6.8 The results of the eye-gaze tracking system show that the eyes, nose
and lips areas are the most important for viewers to verify the identity. 81
6.9 Another pair of face images. The face images are inverted in
grayscale (negative image). . . . . . . . . . . . . . . . . . . . . . 82
6.10 The results of the eye-gaze tracking system for negative images. . 82
6.11 Yet another pair of face images. Note that the left face image is
inverted in grayscale and the right face image is a semi-drawing. . 83
6.12 The results of the eye-gaze tracking system for negative images. . 83
7.1 Block diagram of the fractal face recognition system with PCA
based feature reduction. . . . . . . . . . . . . . . . . . . . . . . . 90
7.2 Matrix showing differences between faces shown on the two axes.
Darker points indicate larger difference. Entries below the diagonal
are pixel-value differences. Entries above the diagonal are fractal-
feature differences. . . . . . . . . . . . . . . . . . . . . . . . . . . 90
B.1 The results of Fractal image-set coding for a subset of the MIT face
database. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
B.2 The results of Fractal image-set coding for the evaluation subset of
XM2VTS database. Arrows showing the position of the threshold
for FRR=0, FRR=FAR and FAR=0 . . . . . . . . . . . . . . . . 98
B.3 The results of Fractal image-set coding for the test subset of
XM2VTS database. Arrows showing the position of the thresh-
old for FRR=0, FRR=FAR and FAR=0 in the evaluation data
set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
List of Tables
4.1 An example of fractal codes. . . . . . . . . . . . . . . . . . . . . . 35
B.1 Error rates obtained using Fractal image-set coding . . . . . . . . 99
B.2 Error rates reported by T. Tan using fractal neighbor distances . 99
Acronyms & Units
bpp bits per pixel
dB decibels
FA False Acceptance
FR False Rejection
ICA Independent Component Analysis
IFS Iterated Function Systems
KLT Karhunen-Loeve Transform
LDA Linear Discriminant Analysis
LDT Linear Discriminant Transform
LED Light Emitting Diode
PCA Principal Component Analysis
PIFS Partitioned Iterated Function Systems
PSNR Peak Signal-to-Noise Ratio
XM2VTS Extended MultiModal Verification for Teleservices and Security
Certification of Thesis
The work contained in this thesis has not been previously submitted for a degree
or diploma at any other higher educational institution. To the best of my
knowledge and belief, the thesis contains no material previously published or
written by another person except where due reference is made.
Signed:
Date:
Acknowledgments
It is not possible to thank everybody who has had an involvement with me during
the course of my Ph.D. However, there are some people who must be thanked. Firstly,
I would like to thank my family and parents, whose encouragement, support and
prayers have helped me achieve beyond my greatest expectations. I thank them for
their understanding, love and patience, especially through the more
difficult and stressful moments. Without their help and support throughout the
years it would not have been possible for me to come this far.
I would like to thank my principal supervisor Dr. Vinod Chandran for his guid-
ance and encouragement throughout my course of study. In addition I must thank
Dr. Chandran for his conscientious reviewing of my conference and journal papers
as well as my thesis draft.
I would also like to thank my associate supervisor, Prof. Sridha Sridharan for the
research environment he has created, as well as the additional financial support he
has provided me through the scholarship top-ups and the financial travel support
for the many conference travels I have undertaken.
In addition, I am appreciative of the financial support of the Iranian Ministry
of Science, Research and Technology through the PhD scholarship I was
awarded. A special acknowledgement goes to Prof. Javad Farhoudi, former Ira-
nian scientific counsellor in Canberra, for his role, support and help during my
study in Australia. I also thank Dr. Kohian for his consideration and help.
Former and current staff and students of the Image and Video Research Labora-
tory must also be acknowledged; I was fortunate to be able to interact and work
with them. Anthony Ngyuen and Jason Pelecanos have been of particular help
to me during my Ph.D. Simon Lucy, John Dines, Michael Mason, David Cole,
and Eddie Wong all deserve special mention for their help at various times.
Hossein Ebrahimpour-Komleh
Queensland University of Technology
August 2004
Chapter 1
Introduction
This thesis fundamentally addresses four related topics: (i) the study of the possibility
of using fractal codes of grayscale images as features for face recognition, (ii)
the study of the mathematical bases for using fractals for recognition, especially face
recognition, (iii) the possibility of designing a fractal coding system more suitable
for recognition, and (iv) theoretical investigations into the definition and use of
subfractals, which are defined to be independent fractal codes of different parts
of an image.
In this thesis, the emphasis is on the use of fractal codes for recognition. Face
recognition has been chosen as an application for testing this concept and the
intention is not to aim for superior face recognition performance by fractal tech-
niques alone. A short introduction to chaos and fractals, as well as to face recognition,
is given below.
1.1 Chaos and Fractals
A fractal is by definition “a set for which the Hausdorff-Besicovitch dimension
strictly exceeds the topological dimension” [57]. Benoit Mandelbrot, who coined
the term fractal and its definition, developed in his classic book “The Fractal
Geometry of Nature” [58] a new geometry of nature that describes many of the
irregular and fragmented patterns around us using fractals. This ability is based
on special features of fractals and their differences from other known models such as
geometrical models. For example, fractals do not have a characteristic length. A
shape usually has a definite scale that characterizes it. Geometric shapes have
their own characteristic length, such as the radius or circumference of a circle and
the edge or diagonal of a square. The length, size or volume of fractals, on the
other hand, cannot be measured with a single unit, as their surfaces are not smooth
and the closer we look, the more complicated the shape appears.
Mandelbrot used this ability of fractals to describe the geometry of natural shapes
such as clouds, mountains and coastlines, which cannot be modelled by simple
geometrical objects like spheres, cones or circles. After the developments in the
field of dynamical systems and chaos, and the discovery of the close relationship
between chaotic dynamical systems and fractals, it is no surprise that fractals can
describe shapes like the fern very well. It is now understood that chaotic dynamics is
inherent in nonlinear deterministic systems with seemingly random behavior. As
a result, chaos and fractals have fascinated scientists from all fields, not only
because of their importance in applications but also because of the beauty of the
geometric patterns produced. In chapter 2 we explain fractals and fractal image
coding methods further.
1.2 Face recognition
Biometrics is an active area of research with a wide range of applications in surveil-
lance, security systems, human-computer interfaces, etc. This term has been used
to refer to the emerging field of technology devoted to automatic identification of
individuals using physiological or behavioral traits. Techniques such as retinal or
iris scanning, hand geometry, speech recognition, fingerprint scanning, signature
verification and face recognition are examples of biometric methods of identifica-
tion which work by measuring unique human characteristics as a way to confirm
identity. Face recognition has the advantage of requiring very little cooperation
or modification of normal behavior on the part of the subjects in order to collect
useful data. But unlike some other biometrics like fingerprints or irises, faces
do not stay the same over time. Facial recognition systems have to deal with
changes in hairstyle, facial hair, spectacles, make-up and aging. Face recognition
is different from other pattern recognition problems such as character recognition.
This difference arises from the fact that in classical pattern recognition, there are
relatively few classes, and many samples per class. With many samples per class,
algorithms can classify samples not previously seen by interpolating among the
training samples. On the other hand, for a typical face recognition task, not only is
there large intra-class variation, but there are also many classes and only a few
samples per class for training.
A facial recognition system often relies, implicitly, on extrapolation from the
training samples. Some variations such as location in an image frame, size and
pose can be removed by preprocessing to align and normalize the face. Eyes are
detected and eye locations are used for this purpose in many face recognition
systems. Feature extraction is an important stage for a typical facial recognition
system. Many feature or template based methods have been proposed for this
task, but there are still many new developments under way. In chapter 3, a review
of some classical face recognition methods is given.
1.3 Thesis Outline
The goal of this thesis is to advance novel fractal recognition systems and their
application in human face recognition. This idea is built based on the theory of
fractal image encoding and decoding which are discussed in Chap. 2. As face
recognition was used as the main application for testing the concepts presented
in this thesis, we review some classical face recognition methods in Chap. 3.
Our fractal techniques for recognition, including fractal codes directly as features,
fractal image-set coding and subfractals are introduced and discussed in Chap. 4,
Chap. 5 and Chap. 6, respectively. The final chapter summarizes the thesis, draws
conclusions, and points out promising directions for future research.
1.4 Publications Resulting from research
The following fully-refereed publications have been produced as a result of work
in this thesis:
Book Chapter
1- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “An Application of
Fractal Image-set Coding in Facial Recognition,” vol. 3072 of Lecture Notes
in Computer Science, Biometric Authentication, pp. 178-186. Springer Ver-
lag, July 2004.
Conference Publications
2- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Fractal Image-set
encoding for Face Recognition,” Proceedings of the International Conference
on Computational Intelligence for Modelling, Control and Automation, pp.
664-672, July 2004.
3- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Facial Image Re-
trieval Using Fractal Image-Set Coding,” 2nd Workshop on Information
Technology and Its Disciplines, Feb. 2004.
4- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Mathematical ba-
sis for use of fractal codes as features,” Proceedings of Image and Vision
Computing, IVCNZ02, vol. 1, pp. 203-208, Nov. 2002.
5- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Robustness to ex-
pression variations in fractal-based face recognition,” Proceedings of the Sixth
International Symposium on Signal Processing and its Applications, vol.
1, pp. 359-362, 2001.
6- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Face recognition
using fractal codes,” Proceedings of the IEEE International Conference on Im-
age Processing, vol. 3, pp. 58-61, 2001.
7- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Face recognition
using fractal codes,” Proceedings of the Third Australasian Workshop on Signal
Processing Applications (WoSPA) 2000, Brisbane, Australia, 2000.
Journal Publications
8- H. Ebrahimpour-Komleh, V. Chandran, and S. Sridharan, “Subfractals: A new
concept for fractal image coding and recognition,” Complexity International,
Monash University, ISSN 1320-0682 (Submitted).
Chapter 2
Fractal Encoding and Decoding
2.1 Introduction
Fractals, as interesting mathematical sets, were known and studied by math-
ematicians such as Cantor, Poincaré and Hilbert [14] in the late 19th and early 20th
centuries. It was Mandelbrot [56], however, who is widely recognized as having de-
fined the science of fractal mathematics. Iterated function theory, defined by John
Hutchinson [43], was the second step in the development of fractal compression
systems. This theory was later used by Michael Barnsley [3] to formulate the collage
theorem, which describes what a system of iterated functions must be like in order
to produce a fractal image. Arnaud Jacquin, one of Barnsley’s graduate students,
implemented an algorithm that can automatically convert an image into a Par-
titioned Iterated Function System [45]. This algorithm is the basis for most
current fractal coding algorithms. The goal of these algorithms is to
create a series of mathematical processes which produce an accurate repro-
duction of an image. This reproduction using fractal codes is much more compact
than the original picture. Many algorithms [35], [46], [70], [99] have been proposed to use
than the picture. Many algorithms [35], [46], [70], [99] have been proposed to use
these codes for image compression. The remainder of this chapter is organized as
follows. Section 2.2 explains some common features of fractals. Mathematical
foundations, iterated function systems (IFS) and principles of fractal coding are
presented in sections 2.3, 2.4 and 2.5, respectively.
2.2 Features of Fractals
Fractal shapes are characterized by their statistical self-similarity, regular
processes that appear over a range of scales, and non-integer (fractional) dimen-
sion. Despite the intuitive appeal of the concept and its potential for wide
application, the complexity of fractals and the difficulty of visualizing them hindered
their study until recent advances in computer processing. Fractal dimen-
sion can be measured using various methods, including the box-counting method,
i.e., estimating the complexity from the number of boxes used to approximate
the figure at different scales [90]. Fractal figures generally share the following
features in common:
No characteristic length: A shape usually has a definite scale that charac-
terizes it. Geometric shapes, for instance, have their own characteristic
length, such as the radius or circumference of a circle and the edge or diag-
onal of a square. Fractal figures, on the other hand, have no such length.
Their length, size or volume cannot be measured with a single unit, as their
surfaces are not smooth, and the closer we look, the more complicated the
nested surface shape appears. Consequently, we cannot draw a
tangent line to a fractal figure; i.e. it is non-differentiable.
Self-similarity: - Fractal figures are unique in that they cannot be measured
with a single characteristic length, because of the repeated patterns we con-
tinuously discover at different scale levels. In other words, because fractal
figures possess self-similarity, their shape does not change even when observed
at different scales. One of the best examples for understanding this fea-
ture is the fern. As shown in figure 2.1, a small part of the figure, when
enlarged, reproduces the original figure.
Figure 2.1: One of the best examples for understanding the features of fractals is the fern.
Non-integer dimension (fractal dimension): - We normally consider a
point to have a topological dimension of 0. In this sense, a boundary has a
topological dimension of 1, a surface has a dimension of 2 and a solid has
a dimension of 3. However, a complex curve may wander on a surface.
In the case of van Koch's snowflake, shown in figure 2.2, the curve
becomes 4/3 times longer than the original curve every time it grows. Thus,
such a curve will have a fractal dimension that is a real number between 1 and 2.
A complex curve that approaches surface filling will have a fractal dimen-
sion approaching 2. Therefore, the more complex the geographic boundary,
the higher the fractal dimension (in the case of van Koch's curve, we take
log 4/log 3, or 1.26, as its fractal dimension). The actual values of these fractal
dimensions differ slightly, depending on the method used to define them. Currently,
there are several methods that are physically feasible. We can measure frac-
tal dimension by: changing the coarse-graining level (box-counting methods),
using the fractal measure relations, using the correlation function, using the
distribution function, or using the power spectrum.
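The box-counting method mentioned above can be sketched in a few lines. This is an illustrative example, not code from this thesis: the test set, a chaos-game approximation of the Sierpinski triangle (true dimension log 3/log 2 ≈ 1.585), and all parameter values are assumptions chosen to keep the sketch self-contained.

```python
import numpy as np

def box_counting_dimension(points, scales):
    """Estimate fractal dimension from the number of occupied boxes at several scales."""
    counts = []
    for s in scales:
        # Boxes of side 1/s: snap each point to its box index and count distinct boxes.
        boxes = set(map(tuple, np.floor(points * s).astype(int)))
        counts.append(len(boxes))
    # The box-counting dimension is the slope of log(count) against log(scale).
    slope, _ = np.polyfit(np.log(scales), np.log(counts), 1)
    return slope

# Test set: chaos-game approximation of the Sierpinski triangle.
rng = np.random.default_rng(0)
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, np.sqrt(3) / 2]])
p = np.array([0.1, 0.1])
pts = []
for _ in range(100_000):
    p = (p + vertices[rng.integers(3)]) / 2   # jump halfway to a random vertex
    pts.append(p.copy())
pts = np.array(pts[100:])                     # drop the initial transient

dim = box_counting_dimension(pts, scales=[4, 8, 16, 32, 64])
print(round(float(dim), 2))
```

The estimated slope comes out close to the theoretical value of about 1.585.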
2.3 Mathematical Foundations
This section provides basic notation and definitions related to fractal image cod-
ing.
2.3.1 Metric Space
A space M (e.g. the space of compact subsets of R3) is a metric space
if for any two of its elements x and y there exists a real number d(x, y), called
the distance, that satisfies the following properties:
(1) d(x, y) ≥ 0 (non-negativity)
(2) d(x, y) = 0 if and only if x = y (identity)
(3) d(x, y) = d(y, x) (symmetry)
(4) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality)
Cauchy sequence
A sequence {xn}∞n=0 = {xn ∈ M, n ∈ N} is said to be a Cauchy sequence if,
∀ε > 0,∃K ∈ N, such that d(xn, xm) ≤ ε, for all n,m > K
Complete metric space
A metric space (M, d) is complete if every Cauchy sequence of points {xn}∞n=0 in
M has a limit x ∈ M.
2.3.2 Contractive Transformations
A transformation w : M → M is said to be contractive with contractivity factor
s ∈ [0, 1) if for any two points x, y ∈ M, the distance satisfies
d(w(x), w(y)) ≤ s · d(x, y)
Loosely speaking, this formula says that applying a contractive map always
brings points closer together (by some factor less than 1).
Contractive transformations have the nice property that when they are repeatedly
applied, they converge to a point which remains fixed upon further iteration.
2.3.3 Fixed Point Theorem
If the space (M, d) is a complete metric space and w : M 7→ M is a contractive
transformation with contractivity factor s, then
1- There exists one unique fixed point xf ∈ M, which is invariant under w:
w(xf ) = xf
2- For any point x ∈ M, it holds that
lim_{n→∞} wⁿ(x) = lim_{n→∞} w(w(w(. . . (x)))) = xf
(w applied n times)
3- (Collage theorem) For any point x ∈ M, it holds that
d(x, xf ) ≤ (1/(1 − s)) · d(x, w(x))
The fixed point theorem shows how fractal coding of images can be done. We
consider images as points in a metric space and find a contractive transfor-
mation on that space whose fixed point is the image we wish to encode (in
practice it may be an image very close to it). The fixed point theorem guarantees
that the distance between the transformed point (under the contractive trans-
formation) and the fixed point is less than the distance between the initial point
and the fixed point. If we apply the contractive transformation iteratively to an
initial point, the results come closer and closer to the fixed point.
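As a minimal illustration of the theorem, consider an assumed toy contraction on the real line rather than an image transformation; the map and starting point below are arbitrary choices for the example:

```python
def w(x):
    # A contractive map on the real line with contractivity factor s = 0.5;
    # its unique fixed point solves x = 0.5*x + 2, i.e. x_f = 4.
    return 0.5 * x + 2.0

x0 = 100.0           # arbitrary starting point
x_fixed = 4.0

x = x0
for _ in range(60):
    x = w(x)         # repeated application converges to the fixed point

# Collage theorem bound: d(x0, x_f) <= d(x0, w(x0)) / (1 - s), with s = 0.5.
assert abs(x0 - x_fixed) <= abs(x0 - w(x0)) / (1 - 0.5)
print(x)  # -> 4.0 (to within floating-point error)
```

The error shrinks by the factor s = 0.5 at every step, so 60 iterations leave it far below machine-visible precision.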
2.3.4 Affine Transformation
For a gray scale image I, if z denotes the pixel intensity at the position (x, y),
then affine transformation W can be expressed in matrix form as follows:
      [x]   [a b 0] [x]   [e]
    W [y] = [c d 0] [y] + [f]
      [z]   [0 0 s] [z]   [o]
Where a, b, c, d, e, f are geometrical parameters, s is the contrast and o is the
brightness offset (luminance parameters). This transformation can also be written
in linear form W (X) = AX + B, where A is an n × n matrix (in our case n = 3)
and B is an offset vector of size n × 1. Using an affine transformation, we can
scale, rotate or translate an image, and scale the contrast or shift the pixel intensities.
2.4 Iterated Function Systems(IFS)
An iterated function system {W : wi, i = 1, 2, . . . , N} consists of a collection
of contractive affine transformations wi : M 7→ M with respective contractiv-
ity factor si together with a complete metric space (M, d). This collection of
transformations defines a contractive transformation W with contractivity factor
s = max{si, i = 1, 2, . . . , N}. The contractive transformation W on the complete
metric space (M, d) will have a unique fixed point Xf which is also called the
attractor of this IFS.
W (X) = ⋃_{i=1}^{N} wi(X)
W (Xf ) = ⋃_{i=1}^{N} wi(Xf ) = Xf
Figure 2.3 shows an example of the attractor of an IFS with 3 simple contractive
transformations w1, w2, w3, given by:
wi a b c d e f
w1 0.5 0 0 0.5 0 0
w2 0.5 0 0 0.5 0.5 0
w3 0.5 0 0 0.5 0.25 0.5
where each wi has the following form:
       [x]   [a b 0] [x]   [e]
    wi [y] = [c d 0] [y] + [f]
       [z]   [0 0 0] [z]   [1]
Jacquin's method, as well as many other fractal image coding methods, is based
on partitioned iterated function systems (PIFS), a generalization of IFS.
In a PIFS, each transformation wi applies only to a restricted set of domains.
This makes it possible to encode more general images which are not fully self-similar.
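The attractor of the three-map IFS tabulated above can be visualized by iterating W on an arbitrary starting set; by the fixed point theorem the result is independent of the starting point. The following sketch (illustrative code, not from the thesis) applies only the geometrical part of the maps:

```python
import numpy as np

# The three contractive maps w1, w2, w3 tabulated above,
# each of the form (x, y) -> (a*x + b*y + e, c*x + d*y + f).
maps = [
    (0.5, 0.0, 0.0, 0.5, 0.0, 0.0),   # w1
    (0.5, 0.0, 0.0, 0.5, 0.5, 0.0),   # w2
    (0.5, 0.0, 0.0, 0.5, 0.25, 0.5),  # w3
]

def apply_W(points):
    """One application of W(X) = w1(X) U w2(X) U w3(X)."""
    out = []
    for a, b, c, d, e, f in maps:
        x, y = points[:, 0], points[:, 1]
        out.append(np.column_stack((a * x + b * y + e, c * x + d * y + f)))
    return np.vstack(out)

# Start from a single arbitrary point; the iterates of W converge to the
# attractor (the Sierpinski triangle) regardless of the starting set.
X = np.array([[0.7, 0.3]])
for _ in range(12):
    X = apply_W(X)

print(X.shape[0])   # 3**12 points approximating the attractor
```

Plotting X would reproduce figure 2.3; every iterate stays inside the unit square because each map is contractive.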
2.5 Principles of Fractal Coding
Various schemes of fractal image compression were proposed, which differ in the
partitioning method, class of transformation or type of search used in locating
suitable domain blocks. The first fully automated algorithm for fractal image
Figure 2.2: Van Koch’s snowflake with fractal dimension of 1.26
Figure 2.3: Sierpinski triangle, the attractor of an IFS containing 3 contractive transformations.
compression was proposed by Jacquin [45] in 1989. Until Jacquin's encoder be-
came available, attempts had been made to design fractal encoders that were
supposed to create transformations with the structure of iterated function sys-
tems. Jacquin's method was based on partitioned iterated function systems
(PIFS), a more general type of transformation which exploits the fact that a
part of an image can be approximated by a transformed and down-sampled
version of another part of the same image; this property is called piecewise
self-similarity. A PIFS consists of a complete metric space X, a collection of
sub-domains Di ⊂ X, i = 1, . . . , n and a collection of contractive mappings
wi : Di → X, i = 1, . . . , n.
The encoder works, in principle, as follows:
Range Blocks: -An image to be encoded is partitioned into non-overlapping
range blocks Ri.
Domain Blocks: - The image is also partitioned into larger blocks Dj , called
domain blocks, which may overlap.
Transformation: - The task of a fractal encoder is to find a domain block DRi
of the same image for every range block Ri such that a transformed ver-
sion of this block w(DRi) is a good approximation of the range block. The
contractive transformation w is a combination of a geometrical transforma-
tion and luminance transformation. The transformed version of the domain
block can be rotated, mirrored, contrast-scaled or translated, so the trans-
formation can be written as an affine transformation.
Various schemes of fractal image coding differ in the partitioning method,
the class of transformation or the type of search used in locating suitable domain
blocks.
2.5.1 Partitioning
The first decision to be made when designing a fractal coding scheme is in the
choice of the type of image partition used for the domain and range blocks.
The simplest possible range partition consists of the fixed size square blocks.
Quadtree partitioning employs the well-known image processing technique based
on recursive splitting of selected image quadrants, enabling the resulting partition
to be represented by a tree structure in which each non-terminal node has four
descendants.
A horizontal-vertical (HV) partition, like the quadtree, produces a tree-structured
partition of the image. Instead of recursively splitting quadrants, however, each
image block is split into two by a horizontal or vertical line. Finally, a number
of different constructions of triangular partitions have been investigated. In the
triangular partitioning scheme, a rectangular image is divided diagonally into two
triangles. Each of these is recursively subdivided into four triangles by segment-
ing the triangle along lines that join partitioning points on the three sides
of the triangle.
2.5.2 Transformations
A critical element of a fractal coding scheme is the type of transform selected,
since it determines the convergence properties on decoding, and its quantized pa-
rameters comprise the majority of the information in the compressed representation.
The fixed point theorem states that contractive transformations, through their
fixed points, can be used to represent points in the space. However, the theorem
does not provide a method for finding such transformations.
If we find a suitable contractive transformation W for image Xf , we know that
the fixed point of W is Xf , so
d(xf ,W (xf )) = d(xf , xf ) = 0
It may be very difficult to find an exact transformation W for an arbitrary image
x. Instead, many fractal image encoders aim only to find a transformation W ∗
with attractor x∗f such that d(x, x∗f ) is as small as possible. If the distance
d(x,W (x)) ≤ δ
then the distance from x to its approximation x∗f , which is the attractor of W ,
will be bounded by:
d(x, x∗f ) ≤ δ/(1 − s)
Hence, both δ and s (which is the contractivity factor of W ) should be as small
as possible. Affine transformations are good candidates for this purpose. Each
transformation has two different parts: geometrical and luminance.
The geometrical part of the transformation scales, rotates and translates a do-
main block to fit the range block. To keep the transformation contractive, the
size of a domain block is always bigger than that of the range block, so the scale
factor is always less than 1.
The luminance part consists of a few simple functions, such as a luminance shift
and contrast scaling (again with a contrast factor less than 1).
2.6 Summary
In this chapter the focus has been on a brief introduction to fractals and their
features, such as self-similarity and non-integer dimension, as well as the basic
concepts of fractal image coding. The mathematical basis, including complete
metric spaces, contractive transformations and the fixed point theorem, has been
introduced. Later in this thesis, the use of fractal codes for face recognition is
proposed and discussed.
Chapter 3
Face Recognition
3.1 Introduction
The face is a unique feature of human beings; at the same time, all faces are similar
in their features and structure. During the past several years, face recognition has developed into
a major research area in pattern recognition and computer vision. As one of the
most challenging applications in these fields, face recognition has received signif-
icant attention. Unlike other biometric systems, facial recognition can be used
for general surveillance, usually in combination with public video cameras. This
chapter overviews some of the classic 2D still image face recognition algorithms.
3.2 Facial Feature Detection
Most of the practical face recognition systems need a face detection stage to
detect the location of the face within a source image. Face recognition systems
also normalize the size and orientation of the face to achieve more robustness.
The normalization methods use the locations of significant facial features such
as the eyes, nose or mouth. For example, once the eyes are detected, one is able to
transfer the eyes into pre-determined locations in an image of pre-defined size
using an affine transformation. The importance of robust facial feature detection
for both detection and recognition has resulted in the development of a variety
of different facial feature detection algorithms [2], [20], [59], [66], [89], [106].
Brunelli and Poggio [15], [17] proposed a facial feature detection method which
uses a set of templates to detect the position of the eyes in an image, by looking
for the maximum absolute values of the normalized correlation coefficient of these
templates at each point in the test image. To cope with scale variations, a set of
templates at different scales was used.
The problems associated with scale variation can be solved by using a set of tem-
plates at different scales or using hierarchical correlation as proposed by Burt [18].
3.3 Geometric Feature Based Methods
The geometric feature based approaches [39], [42], [47], [49] are the earliest ap-
proaches to face recognition and detection. These approaches were focused on
detecting individual features such as eyes, ears, head outline and mouth, and
measuring different properties such as eyebrow thickness and their vertical posi-
tion or nose position and width, in a feature vector that is used to represent a
face. To recognize a face, first feature vectors of the test image and the images in
the database are obtained. Second, a similarity measure between these vectors,
most often a minimum distance criterion, is used to determine the identity of the
face.
Brunelli and Poggio [16] compute a set of geometrical features such as nose width
and length, mouth position, and chin shape. They report a 90% recognition rate
on a database of 47 people. However, they show that a simple template-matching
scheme provides 100% recognition for the same database.
3.3.1 Face Recognition Using Principal Component Anal-
ysis (Eigenfaces)
Principal component analysis (PCA) [37], is a simple statistical dimensional-
ity reducing technique that has perhaps become the most popular and widely
used method for the representation and recognition of human faces. PCA, via the
Karhunen-Loeve transform, can extract the most statistically significant information
for a set of images as a set of eigenvectors (usually called eigenfaces [96] when
applied to faces), which can be used both to recognize and reconstruct face im-
ages. This method proposed by Turk and Pentland [75], [96], [97] is motivated by
the earlier work of Sirovitch [88] and Kirby [50] for efficiently representing face
images. Once the face images are normalized for eye position, they can be treated
as a 1-D array of pixel values. The eigenvectors of the covariance matrix C of the
ensemble of training faces are called eigenfaces. The space spanned by the eigen-
vectors vk, k = 1, . . . , K, corresponding to the K largest eigenvalues of the covariance
matrix, is called the face space. Eigenvectors can be regarded as a set of gen-
eralized features, which characterize the image variations in the database. Each
image has an exact representation via a linear combination of these eigenvectors
and an arbitrarily close approximation using the K most significant eigenvectors.
The number of eigenvectors chosen determines the dimensionality of face space.
A new face image is transformed into its eigenface components by projection
onto the face space. The projections form the feature vector, which describes
the contribution of each eigenface in representing the input image. A test
image is recognized by computing the Euclidean distance in the feature space
and selecting the closest match. The effect of lighting conditions on the
KLT-based method has been detailed in [31]. The eigenface method has also
been used for face detection [67],[68] by measuring the distance from each lo-
cal pattern in a test image to the face space defined by the eigenfaces. In [1],
Akamatsu et al. applied the eigenface method to the magnitude of the Fourier
spectrum of the images after normalization with respect to illumination and scale.
Due to the shift-invariance property of the magnitude of the Fourier spectrum, and
to the illumination and scale normalization, the method, called the Karhunen-
Loeve Transform of the Fourier Spectrum in the Affine Transformed Target Images
(KL-FSAT), performed better than the classical eigenfaces method under variations
in head orientation and shifting.
In summary, PCA is a very efficient signal encoder, designed specifically to
characterize and encode variations rather than ignore them. Thus it may find
the optimal low-dimensional representation, but this may be more useful for re-
construction than for recognition. In addition, the eigenface method is not
invariant to image transformations such as scaling, shifting or rotation in its
original form, and requires complete relearning of the training data to add new
individuals to the database.
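The eigenface pipeline described above (mean subtraction, eigenvectors of the covariance matrix, projection onto the face space, nearest-neighbour matching) can be sketched as follows. This is an illustrative toy, not the systems cited above: the random vectors stand in for normalized, flattened face images, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for a training set: 20 "face images" of 16x16 pixels,
# already normalized and flattened to 1-D arrays of 256 pixel values.
faces = rng.normal(size=(20, 256))

mean_face = faces.mean(axis=0)
A = faces - mean_face                 # centred data, one image per row
# Rows of Vt are the eigenvectors of the covariance matrix (the eigenfaces).
U, S, Vt = np.linalg.svd(A, full_matrices=False)
K = 10
eigenfaces = Vt[:K]                   # face space: the K most significant eigenvectors

def project(img):
    """Feature vector: projection of a flattened image onto the face space."""
    return eigenfaces @ (img - mean_face)

train_feats = np.array([project(f) for f in faces])
query = faces[7] + rng.normal(scale=0.1, size=256)   # noisy copy of face 7
dists = np.linalg.norm(train_feats - project(query), axis=1)
best = int(np.argmin(dists))          # nearest neighbour in feature space
print(best)  # -> 7
```

The noisy query projects close to its original in the K-dimensional face space, so the minimum-distance rule recovers the correct identity.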
3.3.2 Recognition Using Independent Component Analy-
sis (ICA)
Independent Component Analysis (ICA) is a statistical method for transforming
an observed multidimensional random vector into components that are mutually
as independent as possible. This technique can be used for extracting statistically
independent variables from a mixture of them [22]. In a classical example, two
people in the same room speak simultaneously and two microphones are placed
at different locations recording the mixed conversations. ICA can be used to
estimate the contribution coefficients from the two signals, which allows us to
separate the two original signals from each other, assuming that the two speech
signals are statistically independent. The tutorial [44] written by Hyvarinen and
Oja contains more details about the algorithms involved.
Bartlett and Sejnowski have used Independent Component Analysis (ICA) for
face recognition [5], [6], [7]. Two approaches for recognizing faces across changes
in pose were explored using ICA. In the first architecture, a set of statistically
independent basis images for the faces was provided. This set can be viewed as
a set of independent facial features. Unlike the PCA basis vectors, these ICA
basis images were spatially local. The representation consisted of the coefficients
for the linear combination of basis images that comprised each face image. The
second architecture produced independent coefficients. This provided a facto-
rial face code, in which the probability of any combination of features can be
obtained from the product of their individual probabilities. Classification was
performed using nearest neighbour, with similarity measured as the cosine of the
angle between representation vectors. Both ICA representations showed better
recognition scores than PCA when recognizing faces across sessions with changes
in expression and changes in pose.
3.4 Linear Discriminant-Based Method
In [8], [9], [33] the authors proposed a new method for face recognition using
Fisher’s Linear Discriminant Transform (LDT) [34], [37]. The Fisherface method
uses the class membership information and develops a set of feature vectors in
which variations between different faces are emphasized while different instances
of faces due to illumination condition, facial expressions, and orientations, are de-
emphasized. In other words, LDT finds the line that best separates the points.
Each test image is projected onto the optimal LDT space and the resulting set of
coefficients is used to compute the Euclidean distance from the images in training
set. The Fisherface method has also been applied to face detection from color
images [86]. In [1], Akamatsu, Sasaki and Suenaga applied LDA to the magnitude
of the Fourier spectrum of the intensity image. The results reported by the authors
showed that LDA in the Fourier domain is significantly more robust to variations
in lighting than LDA applied directly to the intensity images. In [54], the authors
proposed another LDA-based method.
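Fisher's criterion, finding the projection that maximizes between-class separation relative to within-class scatter, can be illustrated with a minimal two-class sketch. The synthetic 2-D feature vectors below are assumptions for the example, not data from any of the cited works:

```python
import numpy as np

rng = np.random.default_rng(0)
# Two toy classes of 2-D feature vectors (stand-ins for two faces
# measured under varying conditions).
X1 = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
X2 = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2))

m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
# Within-class scatter matrix; Fisher's discriminant maximizes the
# between-class separation relative to this scatter.
Sw = np.cov(X1.T) * (len(X1) - 1) + np.cov(X2.T) * (len(X2) - 1)
w = np.linalg.solve(Sw, m2 - m1)      # optimal 1-D projection direction

p1, p2 = X1 @ w, X2 @ w               # classes projected onto the discriminant
separated = p1.mean() < (p1.mean() + p2.mean()) / 2 < p2.mean()
print(separated)  # -> True
```

A test point would be projected onto w and assigned by distance to the projected class means, mirroring the Euclidean-distance rule described above.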
3.4.1 Other Methods
Other popular face recognition approaches that will only be mentioned in this re-
port include Dynamic Link Matching [100], [101], Matching Pursuit-Based Meth-
ods [55], [76], [77], [78], Hidden Markov Model Based Methods [71], [85] and
Face Recognition by Elastic Bunch Graph Matching [102], [103], [104]. Fractal-
based approaches [24], [23], [25], [26], [29], [27], [28], [30], [52], [93] are a new
application of fractals which will be presented in the next chapter. Some other
publications that describe the latest achievements as well as currently unsolved
issues of face recognition are as follows: [107], [87], [10], [40], [95], [19], [84].
3.5 Summary
Face recognition systems work very well under constrained conditions, such as
frontal mug-shot images and consistent lighting. In real-world use, face recognition,
like other biometrics, suffers from several usability problems. The face is a changeable
social organ (see figure 3.1) displaying a variety of possible presentations. Human
facial expressions change the shape of facial components such as the eyes,
mouth and eyebrows. Artificial changes include cuts and bandages from injuries
or wearing glasses; fashion-related factors such as makeup and jewelry also change
face images. Some changes occur with time, such as the growth and removal
of facial hair and the wrinkling of the skin caused by aging. It has been shown that
using facial images taken at least one year apart can cause error rates of 43%
[80] to 50% [74]. A facial image is a 2D view (projection) of a 3D surface. Viewing
angle, pose and illumination (changes in sunlight intensity) can affect this projec-
tion. For example, when the face tilts left-right or up-down, the 2D view changes.
These changes in gray level will cause features to change. Face recognition algo-
rithms appear to be sensitive to deviations from ideal conditions. The FRVT evaluation
report [12] shows high error rates even under those ideal conditions.
Figure 3.1: Examples of pose (XM2VTS face image database [65]), lighting (AR face image database [61]) and facial expression variations (CMU-Pitt facial expression database [48]) in face images.
Different performance evaluation tests, such as FERET [82], [81], [83], FRVT [12], [79] and
XM2VTS [63], [65], show significant improvements in face recognition technology.
However, there are still areas which require further research and development.
Chapter 4
Fractal Codes Directly as
Features
4.1 Introduction
Using fractals for object or shape recognition is a relatively new application of
fractal image encoding. The goal of the fractal image encoding algorithms is to be
able to create a series of mathematical processes which would produce an accurate
reproduction of an image. For many years, fractal encoding was a technique
for image compression. Fractal codes are much more compact than the original
image and many algorithms [4], [13], [35], [36], [38], [45], [69], [70] [98], [99], [105]
have been proposed to use these codes for image compression. In this chapter
another application of fractal codes is proposed. Fractal codes have the ability to
reproduce an image (or at least a good approximation of it) through a set of contractive
transformations. These transformations can be written in simple affine form and
recorded with a few simple parameters. This compact representation of
images has proven useful in image compression, but is it possible to use these
codes for recognition too? In this chapter, a brief explanation of some previous
related work is given in section 4.2. In section 4.3, the use of fractal codes as features
for recognition, especially face recognition, is described, and different aspects of
this method, such as fractal extraction, normalization of the fractal codes, accuracy
testing and improving robustness, are discussed. Other original fractal techniques
for face recognition are explained in chapters 5 and 6.
4.2 Previous Related Work
4.2.1 Shape Recognition Using Fractal Geometry
Neil [72], [73] proposed one of the first methods for using fractal techniques in
shape recognition. His method is based on the comparison of a transformation
and an object. To compare two different objects (shapes), the method first finds
an associated transformation for the object being identified. Then a comparison
between the transformation and the other object is made by applying the trans-
formation to that object. The object will remain unchanged if and only if the
transformation is an associated transformation for that object. This method is
based on the binary representation of shapes (black and white images) and achieves
some invariance to rotation by assigning standard orientations to shapes.
4.2.2 Face Recognition Using Fractal Dimensions
Kouzani [51] proposed a face recognition method based on the fractal dimension.
In his method, each pixel of an image is replaced by the fractal dimension of the region
around that pixel. To handle the shortcomings of fractal dimension calculation,
he used the average of the fractal dimensions computed for regions of different
sizes around the pixel. To compare two images, he presented these fractal
dimension maps to a normalized cross-correlation stage in which the best match
is chosen.
In another work [52], Kouzani used two feed-forward neural networks. The first
one implements the search process for matching range and domain blocks in the
face image. The second one compares the fractal code of a query image and the
fractal code of the known face in the database. Kouzani claimed that the second
neural network calculates the degree of similarity between the two fractal face
models, but did not explain how.
4.2.3 Face Recognition Using Fractal Neighbor Distances
Yan and Tan [93], [92], [94] extended Neil's method to gray-scale images. In their
method, a database of fractal codes for the set of training face images is generated
first. Then, for any unknown query image Iq and any known training image I
with fractal code WI , the fractal neighbor distance Υ(WI , Iq) = d(WI(Iq), Iq) is
calculated and compared with the others (d is a Euclidean distance function). The
code Wmin with minimum Υ(Wmin, Iq) is taken as the best match.
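A toy sketch of the fractal neighbor distance follows. The "fractal code" here is a stand-in contraction whose fixed point is the training image, not a real PIFS code; all data are synthetic, chosen only to show why d(W(Iq), Iq) is small when Iq is close to the attractor of W:

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_fractal_code(img, s=0.5):
    """Stand-in for the fractal code of `img`: the contraction
    W(x) = s*x + (1 - s)*img, whose unique fixed point is exactly `img`."""
    return lambda x: s * x + (1 - s) * img

train = [rng.normal(size=64) for _ in range(5)]   # training "images"
codes = [toy_fractal_code(t) for t in train]

# Fractal neighbor distance of query Iq to code W: d(W(Iq), Iq).
query = train[3] + rng.normal(scale=0.05, size=64)
fnd = [np.linalg.norm(W(query) - query) for W in codes]
best = int(np.argmin(fnd))
print(best)  # -> 3
```

Applying the matching code moves the query only slightly (it is already near that code's fixed point), while the other codes pull it far toward their own attractors.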
This method was also used for face verification. The system comprised two com-
ponents: face detection and face verification subsystems. The location of the
head was detected based on the result of a search in the reduced region using the
fractal neighbor distance between a generic face template and a portion of the
image. The verification subsystem also used the fractal neighbor distance to com-
pute and find the minimal distance between the localized head image and the images
stored in the XM2VTS database. The results were reported in [63].
4.3 Fractal Codes as Features
Since fractal encoding algorithms can be applied to any gray-scale image, we can
say that any gray-scale image can be approximated by the attractor of a fractal
code. An image xf is the attractor of the fractal code W (x) if xf is the fixed point (see
sec. 2.3.3) of the fractal code: W (xf ) = xf . Since fractal representations are
transformations that apply between one part of an image and another, some parts
of the code could be robust to many types of degradations that affect both
parts (domain and range blocks) similarly.
This section describes the first system proposed in this thesis, which is based on
the use of the fractal code of an image as a feature for recognition.
4.3.1 Fractal Extraction
In fractal image coding, the code for an image x is an efficient binary representa-
tion of a set of contractive affine transformations W whose unique fixed point xf
is a good approximation to x. The fractal coding algorithm used in this system
can be described as follows:
1- Partition the image to be encoded into non-overlapping range blocks Ri using
quad-tree partitioning.
2- Cover the image with a sequence of possibly overlapping domain blocks Dj.
3- For each range block, find the domain and corresponding transformation that
best match the range block.
4- Save the geometrical positions of the range block and the matched domain block,
as well as the matching transformation parameters, as the fractal code of the image.
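The four steps can be sketched in a deliberately minimal encoder/decoder pair. This is an illustration of the structure under simplifying assumptions, not the thesis's implementation: it uses fixed-size range blocks instead of quadtree partitioning, non-overlapping domains, and only the identity orientation.

```python
import numpy as np

def downsampled_domains(img, rsize):
    """Non-overlapping domain blocks of size 2*rsize, averaged down to rsize."""
    dsize, doms = 2 * rsize, []
    for y in range(0, img.shape[0] - dsize + 1, dsize):
        for x in range(0, img.shape[1] - dsize + 1, dsize):
            D = img[y:y + dsize, x:x + dsize]
            doms.append(D.reshape(rsize, 2, rsize, 2).mean(axis=(1, 3)))
    return doms

def encode(img, rsize=4):
    """For every range block, pick the domain and (s, o) minimizing the error."""
    doms = downsampled_domains(img, rsize)
    code = []
    for y in range(0, img.shape[0], rsize):
        for x in range(0, img.shape[1], rsize):
            R = img[y:y + rsize, x:x + rsize]
            best = None
            for j, d in enumerate(doms):
                db, rb = d.mean(), R.mean()
                beta = ((d - db) ** 2).sum()
                s = 0.0 if beta == 0 else ((d - db) * (R - rb)).sum() / beta
                s = float(np.clip(s, -0.9, 0.9))   # keep the map contractive
                o = rb - s * db
                err = ((s * d + o - R) ** 2).sum()
                if best is None or err < best[0]:
                    best = (err, j, s, o)
            code.append((y, x) + best[1:])
    return code

def decode(code, shape, rsize=4, n_iter=10):
    """Iterate the coded transformation from an arbitrary (here: zero) image;
    by the fixed point theorem it converges to the attractor."""
    img = np.zeros(shape)
    for _ in range(n_iter):
        doms = downsampled_domains(img, rsize)
        out = np.zeros(shape)
        for y, x, j, s, o in code:
            out[y:y + rsize, x:x + rsize] = s * doms[j] + o
        img = out
    return img

# Round-trip on a small synthetic image.
xx, yy = np.meshgrid(np.arange(16.0), np.arange(16.0))
img = xx + yy
rec = decode(encode(img), img.shape)
ok = bool(np.abs(rec - img).mean() < 1.0)
print(ok)  # -> True
```

The decoder never sees the original image, only the code; the reconstruction error shrinks by roughly the contractivity factor at each iteration.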
Quadtree partitioning
The quadtree partitioning method employs the well-known image processing tech-
nique based on recursive splitting of selected image quadrants, enabling the re-
sulting partition to be represented by a tree structure in which each non-terminal
node has four descendants. The usual top-down construction starts by selecting
an initial level in the tree, corresponding to some maximum range block size. In
order to produce contractive transformations, range blocks not smaller than the
largest domain blocks are subdivided into smaller range blocks. Each range block
larger than a preset limit is recursively partitioned if a match with one of the
domain blocks in the domain pool better than some preselected threshold is not
found. In figure 4.1 (top left) a sample of quadtree partitioning is shown.
Note that a region containing detail is split into smaller blocks in the process
of finding a sufficiently good match.
Figure 4.1: Domain (bottom) and range(top left) blocks for an image (top right)
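The recursive splitting can be sketched as follows. This is an assumed simplification: block variance stands in for the "no sufficiently good domain match" test, so the example stays self-contained; in the actual coder the split decision comes from the domain search.

```python
import numpy as np

def quadtree(block, x, y, min_size, threshold, out):
    """Split `block` recursively until its pixel variance drops below
    `threshold` (a stand-in for 'a good enough domain match was found')
    or the block reaches `min_size`; leaves become (x, y, size) range blocks."""
    size = block.shape[0]
    if size <= min_size or block.var() <= threshold:
        out.append((x, y, size))
        return
    h = size // 2
    quadtree(block[:h, :h], x, y, min_size, threshold, out)
    quadtree(block[:h, h:], x + h, y, min_size, threshold, out)
    quadtree(block[h:, :h], x, y + h, min_size, threshold, out)
    quadtree(block[h:, h:], x + h, y + h, min_size, threshold, out)

# 16x16 test image: flat background, detailed top-left 8x8 corner.
img = np.zeros((16, 16))
img[:8, :8] = np.arange(64).reshape(8, 8) % 7
leaves = []
quadtree(img, 0, 0, min_size=4, threshold=0.5, out=leaves)
print(len(leaves))   # the detailed corner splits; the flat regions do not
```

Only the detailed quadrant is subdivided, reproducing the behaviour visible in figure 4.1.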
Domain blocks
The task of a fractal encoder is to find a domain block D of the same image
for every range block such that a transformation of this block W (D) is a good
approximation of the range block. In order to have contractive transformations,
the domain block should be bigger than the range block. The number of different
sizes of domain blocks and how much overlap is allowed are two important pa-
rameters of the system. Figure 4.1 (bottom) shows domain blocks of two different
sizes, 8×8 and 16×16. Note that in this example the domain blocks of the same
size do not overlap, but each domain block of the larger size overlaps 4 domain
blocks of the smaller size.
Mapping domains to ranges
The main computational step in fractal image coding is the mapping of domains
to range blocks. For each range block, the algorithm compares transformed ver-
sions of the domain blocks to the range block. The transformations are typically
affine. Each transformation W is a combination of a geometrical transformation
and a luminance transformation. For a gray-scale image I, if
z denotes the pixel intensity at the position (x, y), then W can be expressed in
matrix form as follows:
      [x]   [a b 0] [x]   [e]
    W [y] = [c d 0] [y] + [f]     (4.1)
      [z]   [0 0 s] [z]   [o]
Coefficients a, b, c, d, e and f control the geometrical aspects of the transformation
(skewing, stretching, rotation, scaling and translation), while the coefficients s and
o determine the contrast and brightness of the transformation and together make
up the luminance parameters. The geometrical parameters of the transformation
are limited to a rigid translation, a contractive size-matching, and one of eight orien-
tations. The orientations consist of four 90° rotations, and a reflection followed
by four 90° rotations, as shown in figure 4.2.
Figure 4.2: The eight possible orientations of a block. The orientations consist of four 90° rotations, and a reflection followed by four more 90° rotations.
Figure 4.3: An illustration of domain and range blocks
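The eight orientations of Figure 4.2 can be generated with NumPy as in the sketch below; the thesis does not specify an implementation, so this is only one possible realization.

```python
import numpy as np

def orientations(block):
    """The 8 orientations of a square block (cf. Figure 4.2): four
    90-degree rotations, and a reflection followed by the same four
    rotations."""
    rots = [np.rot90(block, k) for k in range(4)]
    flipped = np.fliplr(block)
    return rots + [np.rot90(flipped, k) for k in range(4)]
```

These eight maps form the symmetry group of the square, so applying any of them to a square block yields another valid block of the same size.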
Domain-range comparison is a three-step process. First, one of the eight basic orientations is applied to the selected domain block Dj. Next, the rotated domain is shrunk to match the size of the range block Rk; the range must be smaller than the domain in order for the overall mapping to be a contraction. Finally, optimal contrast and brightness parameters are computed using least-squares fitting. Representing the image as a set of transformed blocks does not give an exact copy of the original image, but a close approximation of it. Minimizing the error between W(Dj) and Rk will minimize the error between the original image and the approximation. Let ri and di, i = 1, . . . , n, denote the pixel values of the two equal-size blocks Rk and shrink(Dj). The error Err is defined as:
Err = \sum_{i=1}^{n} (s \cdot d_i + o - r_i)^2    (4.2)
The minimum of Err occurs when the partial derivatives with respect to s and o are zero:

Err = n \cdot o^2 + \sum_{i=1}^{n} (s^2 d_i^2 + 2 s d_i o - 2 s d_i r_i - 2 o r_i + r_i^2)    (4.3)

\frac{\partial Err}{\partial s} = \sum_{i=1}^{n} (2 s d_i^2 + 2 d_i o - 2 d_i r_i) = 0    (4.4)

\frac{\partial Err}{\partial o} = 2 n o + \sum_{i=1}^{n} (2 s d_i - 2 r_i) = 0    (4.5)
which occurs when:
s = \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}    (4.6)

o = \frac{1}{n} \left[ \sum_{i=1}^{n} r_i - s \sum_{i=1}^{n} d_i \right]    (4.7)
These two equations can be simplified as:

s = \frac{\alpha}{\beta}    (4.8)

o = \bar{r} - \left(\frac{\alpha}{\beta}\right) \bar{d}    (4.9)

where:

\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i    (4.10)

\bar{r} = \frac{1}{n} \sum_{i=1}^{n} r_i    (4.11)

\alpha = \sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})    (4.12)

\beta = \sum_{i=1}^{n} (d_i - \bar{d})^2    (4.13)
Proof:

s = \frac{\alpha}{\beta} = \frac{n\alpha}{n\beta} = \frac{n \sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{n \sum_{i=1}^{n} (d_i - \bar{d})^2}

= \frac{n \sum_{i=1}^{n} d_i r_i - n \bar{d} \sum_{i=1}^{n} r_i - n \bar{r} \sum_{i=1}^{n} d_i + n^2 \bar{d} \bar{r}}{n \sum_{i=1}^{n} d_i^2 - 2 n \bar{d} \sum_{i=1}^{n} d_i + n^2 \bar{d}^2}

= \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i - \sum_{i=1}^{n} r_i \sum_{i=1}^{n} d_i + \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - 2 \left(\sum_{i=1}^{n} d_i\right)^2 + \left(\sum_{i=1}^{n} d_i\right)^2}

= \frac{n \sum_{i=1}^{n} d_i r_i - \sum_{i=1}^{n} d_i \sum_{i=1}^{n} r_i}{n \sum_{i=1}^{n} d_i^2 - \left(\sum_{i=1}^{n} d_i\right)^2}

and

o = \bar{r} - \left(\frac{\alpha}{\beta}\right) \bar{d} = \frac{1}{n} \sum_{i=1}^{n} r_i - s \cdot \frac{1}{n} \sum_{i=1}^{n} d_i = \frac{1}{n} \left[ \sum_{i=1}^{n} r_i - s \sum_{i=1}^{n} d_i \right]
Sample fractal codes of an image are as shown here:

Table 4.1: An example of fractal codes.

Quadtree parameters | Domain index | Orientation | Brightness | Contrast
1 1 1 1 0 0 | 1 | 6 | 111 | 0.111
1 1 1 2 0 0 | 1 | 7 | 301 | -0.130
1 1 1 3 0 0 | 1 | 6 | 194 | 0.003
1 1 1 4 0 0 | 1 | 5 | 67 | 0.165
1 1 2 1 0 0 | 1 | 2 | 324 | -0.157
1 1 2 2 0 0 | 1 | 2 | 274 | -0.094
1 1 2 3 0 0 | 1 | 5 | -522 | 0.900
1 1 2 4 0 0 | 1 | 7 | 216 | -0.025
1 1 3 1 0 0 | 1 | 5 | -47 | 0.305
1 1 3 2 1 0 | 7 | 1 | 128 | 0.022
... | ... | ... | ... | ...
The first six columns contain the quadtree parameters, which give the geometrical positions of the range blocks. The next column is the domain index number, which uniquely locates the position of the domain block using some preset parameters such as the size of domain blocks, the number of different domain sizes, and the overlap factor. The next column contains the orientation index, a number between 0 and 7, and the last two columns are the brightness and contrast factors o and s respectively. In this system, the last 4 columns (domain index number, rotation (orientation) index, brightness and contrast factor) are used as fractal features for recognition.
4.3.2 Normalizing Fractal Codes
Each fractal feature used in this system is a vector, so each image has 4 feature vectors of the same size. The size of each vector, however, varies from one image to another: it depends on the number of range blocks, which in turn depends on the partitioning threshold, the size of the image, the image complexity, and the minimum size of range and domain blocks. In order to normalize the size of each vector, we use the quad-tree partitioning geometry and apply each feature value at its geometrical position (as can be seen in figure 4.4). Because quad-tree partitioning can be applied to an image of any arbitrary size, we can resize all feature vectors to the size of the query image. This makes our method robust to size and scale changes. For classification we used the Peak Signal-to-Noise Ratio (PSNR) between the feature vectors of the query image and the feature vectors of all images in the database as a measure of distance, with a minimum-distance classifier.
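As a sketch of the distance measure used here, PSNR between two equal-size feature maps might be computed as follows; the peak value of 255 is an assumption for 8-bit data, and `psnr` is a name chosen for illustration.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio (in dB) between two equal-size
    arrays; higher PSNR means the arrays are more similar."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mse = np.mean((a - b) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

A minimum-distance classifier then simply picks the database image whose feature maps give the highest PSNR against the query's.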
Figure 4.4: Fractal features of an image (A = domain index number, B = rotation (orientation) index, C = brightness shift and D = contrast factor) displayed as gray values over the quad-tree partition of the same image.
4.3.3 Accuracy Tests
To initially test our system, we used a subset of the MIT face database. This version of the MIT face database consists of 2 face images from 90 subjects, for a total of 180 images, with some variation in illumination, scale and head orientation. Figure 4.5 shows some examples from this face database.

Figure 4.5: Typical images from the MIT face database (ftp:\\whitechapel.media.mit.edu\pub\eigenfaces\pub\images). Two different frontal views of each person are included.
We first used each feature separately for classification, to obtain some idea of its information content, that is, its ability to discriminate between faces. Classification accuracy was plotted as a function of the number of images tested as this number grew from 1 to 180, the size of the database. It was found that the orientation parameter alone yielded an accuracy of about 72% and the domain index alone about 64% on this data. The other two features yielded lower accuracy (as can be seen in figure 4.6). The use of all four features resulted in a total accuracy of close to 88.5% for a small database size. The accuracy tended to level off around 85%.
This suggests that, in a fractal representation of the face, the information about which parts are self-similar to which other parts, and the orientation differences between these parts, is more useful for recognition than the transformations between 'averaged' pixel gray-level descriptions, such as brightness and contrast, from domain to range. Lighting variations are also more likely to affect brightness and contrast more significantly than the other two features.

Figure 4.6: Recognition accuracy using the rotation, domain index, brightness and contrast features independently, and the total accuracy achieved using all features, plotted against the number of images in the database as this is progressively increased.
Figure 4.7 shows the results of using this method to retrieve the closest 8 images to
a given image. The first four images are the best hit for each feature and the last
four images are the second-best hit of that feature. In figures 4.8 and 4.9 the robustness of this method to rotation is demonstrated. In the first test, a 180° rotated version of a database image was used as the query; using only the orientation feature vector, the method was able to pick the correct identity as the closest match. In the second test, again using only the orientation feature vector, the method picked the correct identity as the closest match and the second view of the same person as the next closest match. It is interesting to note that the third and fourth best matches are subjectively similar to the query image from a human visual point of view. The other matches are of the wrong gender but share some similarities in overall appearance and shape.

Figure 4.7: A query image and the first 8 closest matches found by the method (the first four images are the best hit for each feature and the last four images are the second-best hit of that feature).
In figure 4.10, the query image is inverted in grayscale. All the fractal features except the brightness feature find the right match; the rotation feature gets both the first and second matches right. This happens because the inversion affects range and domain blocks similarly, and the positions of range and domain blocks are unchanged. Thus, if the domain block Dj was the best match for the range block Rk in the original image, it is still the best match even after inverting the pixel values. This shows that the first two fractal features (domain index number and orientation index) are not affected by this change. The effect of this change on the other two fractal features (brightness and contrast factor) can be shown by an example.
Figure 4.8: A rotated query image and the first 8 closest matches. Note that the best-match image retrieved using the orientation feature is the correct person.
Example
Suppose n = 3 and R = [1, 2, 3], D = [4, 5, 6] are the range and shrunk domain blocks. The brightness and contrast factors o and s are calculated as in equations (4.8) and (4.9):
s = \frac{\alpha}{\beta} = \frac{\sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{\sum_{i=1}^{n} (d_i - \bar{d})^2}

s = \frac{(-1)(-1) + (0)(0) + (1)(1)}{(-1)^2 + (0)^2 + (1)^2} = 1

o = \bar{r} - s\bar{d} = 2 - (1)(5) = -3

If the image is inverted in grayscale, the range and domain blocks become R = [255, 254, 253] and D = [252, 251, 250], and o, s will be:

s = \frac{(1)(1) + (0)(0) + (-1)(-1)}{(1)^2 + (0)^2 + (-1)^2} = 1

o = \bar{r} - s\bar{d} = 254 - (1)(251) = 3
This example clearly shows that inverting the pixel values in a grayscale image
only changes the brightness feature and does not change the other features.
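The arithmetic of this example can be checked numerically; the small helper below simply re-applies equations (4.8) and (4.9) to the two pairs of blocks.

```python
import numpy as np

def s_o(d, r):
    """Contrast s and brightness o via equations (4.8) and (4.9)."""
    d, r = np.asarray(d, float), np.asarray(r, float)
    alpha = np.sum((d - d.mean()) * (r - r.mean()))
    s = alpha / np.sum((d - d.mean()) ** 2)
    return s, r.mean() - s * d.mean()

s1, o1 = s_o([4, 5, 6], [1, 2, 3])              # original blocks
s2, o2 = s_o([252, 251, 250], [255, 254, 253])  # inverted blocks
# s is unchanged (1 in both cases); only o flips from -3 to 3
```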
Figure 4.9: Another rotated query image and the first 8 closest matches, using only the orientation feature.
It can be easily shown that any shift in brightness which affects all the pixel values
equally, only changes the brightness feature. Suppose the pixel value v(x, y) of
the pixel at the position (x, y) is changed to v(x, y) + δ then the new contrast
and brightness features are as follows:
s' = \frac{\alpha'}{\beta'} = \frac{\sum_{i=1}^{n} (d'_i - \bar{d}')(r'_i - \bar{r}')}{\sum_{i=1}^{n} (d'_i - \bar{d}')^2} = \frac{\sum_{i=1}^{n} ((d_i + \delta) - \bar{d} - \delta)((r_i + \delta) - \bar{r} - \delta)}{\sum_{i=1}^{n} ((d_i + \delta) - \bar{d} - \delta)^2}

= \frac{\sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})}{\sum_{i=1}^{n} (d_i - \bar{d})^2} = \frac{\alpha}{\beta} = s

o' = \bar{r}' - s'\bar{d}' = (\bar{r} + \delta) - s(\bar{d} + \delta) = o + (1 - s)\delta
Thus, all fractal features except the brightness feature are robust to a shift in
brightness.
Figure 4.10: A query image inverted in grayscale and the first 8 closest matches. Note that the rotation feature gets both the first and second matches right. The brightness feature does not find the right match, because the inversion negates the brightness feature. The other two features find the right match.
4.4 Summary
This chapter described a new method for face recognition using fractal codes directly as features. It was shown that the fractal parameters of an image form a self-similarity-based representation of that image and can be used as features for object recognition. The fractal code of an image contains several different parts; some variations in images affect some of the parameters while others remain unchanged, which introduces a degree of robustness into the system. Because the fractal codes of different images differ in the size of their feature vectors, a method for normalizing the features to produce reduced feature vectors of equal size was presented. The details of the experiments are given in Appendix B.
Chapter 5
Fractal Image-set Coding
5.1 Introduction
In this chapter, another method of using fractal codes for facial recognition is
presented. It is shown that the fractal code of an image is not unique and that
certain parameters can be held constant to capture image information in the other
parameters. Fractal codes are calculated keeping geometrical fractal parameters
constant for all images. These parameters are calculated from a set of images.
The proposed method is faster than traditional fractal coding methods, which require time to search for the best domain for each range block. It also lends itself to preprocessing steps that provide robustness to changes in parts of a face, and it produces codes that are more directly comparable. Results on the XM2VTS database are used to demonstrate the performance and capabilities of the method.
5.2 Mathematical Bases

A compact representation of the fractal encoding and decoding process can be provided using the following operators [21]. Let ℑ_m denote the space of m × m digital grayscale images; that is, each element of ℑ_m is an m × m matrix of grayscale values. The get-block operator Γ^k_{n,m} : ℑ_N → ℑ_k, where k ≤ N, is the operator that extracts the k × k block with lower left corner at (n, m) from the original N × N image, as shown in Figure 5.1.

Figure 5.1: An illustration of the get-block and put-block operators

The put-block operator (Γ^k_{n,m})^* : ℑ_k → ℑ_N inserts a k × k image block into an N × N zero image, at the location with lower left corner at (n, m). An N × N image x_f ∈ ℑ_N can be written as

x_f = \sum_{i=1}^{M} (x_f)_i = \sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* (R_i)    (5.1)

where {R_1, . . . , R_M} is a collection of range cell images that partition x_f. Each
R_i has dimension r_i × r_i with lower left corner located at (n_i, m_i) in x_f. If the range cells R_i are the result of fractal image encoding of the image x_f, then for each range cell R_i there is a domain cell D_i and an affine transformation W_i such that

R_i = W_i(D_i) = G_i(D_i) + H_i    (5.2)

Denote the dimension of D_i by d_i, and the lower left coordinates of D_i by (k_i, l_i). G_i : ℑ_{d_i} → ℑ_{r_i} is the operator that shrinks (assuming d_i > r_i), translates (k_i, l_i) → (n_i, m_i) and applies a contrast factor s_i, while H_i is a constant r_i × r_i matrix that represents the brightness offset. We can write D_i = Γ^{d_i}_{k_i,l_i}(x_f). Thus, equation 5.1 can be rewritten as the following approximation:

x_f = \sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* \{ G_i(Γ^{d_i}_{k_i,l_i}(x_f)) + H_i \}

x_f = \underbrace{\sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* \{ G_i(Γ^{d_i}_{k_i,l_i}(x_f)) \}}_{A(x_f)} + \underbrace{\sum_{i=1}^{M} (Γ^{r_i}_{n_i,m_i})^* (H_i)}_{B}    (5.3)

Then, if we write the put-block operators (Γ^{r_i}_{n_i,m_i})^*, the get-block operators Γ^{d_i}_{k_i,l_i} and the transformations G_i in their matrix forms, we can simplify equation 5.3 as follows:

x_f = A × x_f + B    (5.4)

In this equation, A and B are the fractal parameters of the image x_f.
5.3 Fractal Image-set Coding
In this section we will use the compact representation (5.4) to show some interest-
ing properties of fractal image encoding and introducing a method for extracting
fractal codes for a set of face images with the same geometrical parameters which
we will call Fractal Image-set Coding. The fundamental principle of fractal im-
age encoding is to represent an image by a set of affine transformations. Images
are represented in this framework by viewing them as vectors. This encoding is
not simple, because there is no known algorithm for constructing the transforms with the smallest possible distance between the image to be encoded and the corresponding fixed point of the transformations. Banach's fixed point theorem guarantees that, within a complete metric space, the unique fixed point of a contractive transformation may be recovered by iterated application of it to an arbitrary initial element of that space. Banach's fixed point theorem gives us an idea of how the decoding process works:

Let T : ℑ_n → ℑ_n be a contractive transformation and (ℑ_n, d) a metric space with metric d; then the sequence {X_k} constructed by X_{k+1} = T(X_k) converges, for any arbitrary initial image X_0 ∈ ℑ_n, to the unique fixed point X_f ∈ ℑ_n of the transformation T.

The contraction condition in this theorem is defined as follows: a transformation T : ℑ_n → ℑ_n is called contractive if there exists a constant 0 < s < 1 such that

∀x, y ∈ ℑ_n,  d(T(x), T(y)) ≤ s · d(x, y)    (5.5)

This condition is sufficient for the existence of a unique fixed point of a fractal transformation, because if there existed two distinct fixed points x_f and x'_f of a contractive transformation T, we would have

T(x_f) = x_f,  T(x'_f) = x'_f

and

d(T(x_f), T(x'_f)) = d(x_f, x'_f) ⇒ s ≥ 1

so the transformation T could not be contractive.
Let us write the fractal transformation in the compact form (5.4):

T(x) = A × x + B

The fractal image coding of an image x_f can then be defined as finding A and B that satisfy the condition

x_f = A × x_f + B

while A and B define a contractive transformation. This condition shows that the fractal code for an image x_f is not unique, because infinitely many pairs (A, B) satisfy it and have the same fixed point x_f, and many of them define a contractive transformation T(x) with |s| < 1. Figure 5.2 shows an illustration of the function T(x) = A × x + B for a one-dimensional space ℑ.
Figure 5.2: Illustration of the function T(x) = A × x + B for a one-dimensional space ℑ. (a) s > 1, (b) s = 1, (c, d, e) s < 1.
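For the one-dimensional case illustrated in Figure 5.2, the convergence of a contractive T(x) = A · x + B to its fixed point can be demonstrated directly; the values of A and the fixed point below are illustrative, not taken from the thesis.

```python
# 1-D fractal transformation T(x) = A*x + B with |A| < 1 (contractive).
# Choosing B = (1 - A) * x_fixed makes x_fixed the fixed point,
# mirroring B = (I - A) x_f in the matrix case.
A = 0.6
x_fixed = 42.0
B = (1 - A) * x_fixed

x = 0.0                  # arbitrary initial "image"
for _ in range(50):
    x = A * x + B        # one decoding iteration
# x has now converged to the fixed point, regardless of the start value
```

The error shrinks by a factor |A| per iteration, which is why the decoding in Figure 5.6 stabilizes after only a handful of steps.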
Different fractal coding algorithms use different A and B for an image, which makes the fractal face recognition process more complex. The aim of fractal image-set coding is to find fractal parameters for several images with the same geometrical part for all of them. In this case, the information in the luminance part of the fractal codes of these images is more comparable. This method is also more efficient and faster than existing methods, because there is no need to search for the best matching domain block for each range block, which is the most computationally expensive part of the traditional fractal coding process.

In this system, a sample image is nominated for finding the geometrical parameters. This image can be an arbitrary image of the database, an image outside the database, or the average image of all or part of the database.
The Fractal image-set coding algorithm can be described as follow:
Step 0 (preprocessing) - For any face image data-set F, use eye locations and histogram equalization to form a normalized face image data-set Fnormal. Every face image in this data-set is a 128 × 128, histogram-equalized, 256-grayscale image, with the positions of the left and right eyes at (32,32) and (96,32) respectively, as shown in Figure 5.3.
Figure 5.3: An example of preprocessing with an image in the data-set. (A) The original image, (B) grayscale image with orientation normalized, (C) nominated image with face area marked, (D) normalized, histogram-equalized face image.
Step 1 - Calculate the fractal codes for the sample image x (which can be the average image of the data-set) using a traditional fractal image coding algorithm [35]. These fractal codes contain the luminance information, the geometrical position information for the range blocks {R1, R2, . . . , Rn}, the domain blocks {D1, D2, . . . , Dm} corresponding to each range block, and the geometrical transformations, such as rotation and resizing, that match each domain block with its range block.
Step 2 - For any image x_i of the data-set, use the same geometrical parameters (range and domain block positions and geometrical transformations) that were used for coding the sample image x, as shown in Figure 5.4. Let (x_{R_i}, y_{R_i}), l_{R_i} be the geometrical position and the block size of range block R_i, and (x_{D_j}, y_{D_j}), l_{D_j} the geometrical position and the size of domain block D_j, which is the best-matched domain block for R_i.
Figure 5.4: (A) Average image of the data-set, (B) an arbitrary image from the data-set, (C) range blocks for image A, (D) the same range blocks applied to image B.
Step 3 - For each range block R_i in image x_i, use the domain block at the same position (x_{D_j}, y_{D_j}) and of the same size l_{D_j}, and calculate the luminance parameters that minimize the error e:

e = \sum_{i=1}^{n} (s \cdot d_i + o - r_i)^2

where d_i and r_i denote the pixel values of the domain block D and range block R. The minimum of e occurs when:

s = \frac{\alpha}{\beta}

o = \bar{r} - \left(\frac{\alpha}{\beta}\right) \bar{d}

where

\alpha = \sum_{i=1}^{n} (d_i - \bar{d})(r_i - \bar{r})

\beta = \sum_{i=1}^{n} (d_i - \bar{d})^2

\bar{d} = \frac{1}{n} \sum_{i=1}^{n} d_i

\bar{r} = \frac{1}{n} \sum_{i=1}^{n} r_i
as proven in section 4.3.1.
Step 4 - Save the geometrical parameters as well as the luminance parameters as the fractal codes of image x_i.
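Steps 2 and 3 can be sketched in code as follows. The shared geometry is given as a list of (range position, range size, domain position) triples; the 2× averaging shrink, the function names and the omission of the orientation step are simplifying assumptions for illustration, not the thesis implementation.

```python
import numpy as np

def shrink(block):
    """2x2 averaging, shrinking a 2k x 2k domain block to k x k."""
    return 0.25 * (block[0::2, 0::2] + block[1::2, 0::2] +
                   block[0::2, 1::2] + block[1::2, 1::2])

def fit_luminance(d, r):
    """Least-squares contrast s and brightness o (Section 4.3.1)."""
    dd, rr = d.ravel(), r.ravel()
    beta = np.sum((dd - dd.mean()) ** 2)
    s = 0.0 if beta == 0 else np.sum((dd - dd.mean()) * (rr - rr.mean())) / beta
    return s, rr.mean() - s * dd.mean()

def image_set_code(image, geometry):
    """Luminance part of the fractal code for one image, given the
    shared geometry: a list of (range_pos, range_size, domain_pos),
    where the domain block is assumed twice the range size."""
    code = []
    for (rx, ry), rs, (dx, dy) in geometry:
        r = image[ry:ry + rs, rx:rx + rs]
        d = shrink(image[dy:dy + 2 * rs, dx:dx + 2 * rs])
        code.append(fit_luminance(d, r))
    return code
```

With the geometry fixed once from the sample image, encoding any further image needs no domain search, which is the speed advantage claimed for the method.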
Figure 5.5: The initial image and the first, third and fifth iterates of the decoding transformation corresponding to image 005 4 1.
In figure 5.5, an example of the decoding result for one of the encoded images of the XM2VTS face database is shown. The PSNR versus iteration number is plotted in figure 5.6 for this image and three other images of the same database. It clearly shows that the fixed point of each set of fractal image codes is reached after only 5 or 6 iterations.
5.4 Similarity Measurements

A similarity measurement τ(x, y) is a method for calculating the similarity between two images. It is normally defined in terms of a metric distance d(x, y): a higher distance between two patterns indicates a lower similarity between them. A similarity measurement is generally a number between 0 and 1, where 0 indicates the lowest similarity and 1 the highest similarity between two patterns. In this section, different similarity measurements are described.
5.4.1 Minkowski-Form Distance
The Minkowski-form distance is defined based on the L_p norm as:

d_p(x, y) = \left( \sum_{i=0}^{N-1} |x_i - y_i|^p \right)^{1/p}

where x = x_0, x_1, . . . , x_{N-1} and y = y_0, y_1, . . . , y_{N-1} are the query and target feature vectors respectively.

When p = 1, d_1(x, y) is the city block distance or Manhattan distance (L_1):

d_1(x, y) = \sum_{i=0}^{N-1} |x_i - y_i|

When p = 2, d_2(x, y) is the Euclidean distance (L_2):

d_2(x, y) = \sqrt{\sum_{i=0}^{N-1} (x_i - y_i)^2}

When p → ∞, we get L_∞:

d_\infty(x, y) = \max_{0 \le i < N} |x_i - y_i|
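The three special cases above can be captured in one function; `minkowski` is a name chosen here for illustration.

```python
import numpy as np

def minkowski(x, y, p):
    """L_p distance between feature vectors x and y;
    p = 1 is city block, p = 2 is Euclidean, p = inf is max-norm."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if np.isinf(p):
        return np.max(np.abs(x - y))
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

x, y = [0.0, 3.0, 1.0], [4.0, 0.0, 1.0]
# minkowski(x, y, 1) -> 7, minkowski(x, y, 2) -> 5,
# minkowski(x, y, np.inf) -> 4
```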
5.4.2 Cosine Distance
The cosine distance measures the difference in direction between two vectors, irrespective of their lengths: the distance is given by the angle between the two vectors. By the rule of the dot product,

\vec{x} \cdot \vec{y} = |x| \cdot |y| \cos(\theta)

d_{cos}(x, y) = 1 - \cos(\theta) = 1 - \frac{\vec{x} \cdot \vec{y}}{|x| \cdot |y|}

The similarity measurement τ_{cos}(x, y) = 1 - d_{cos}(x, y) = \frac{\vec{x} \cdot \vec{y}}{|x| \cdot |y|} takes only the angle between the two vectors into account. Let τ_{cos}(x_1, y) and τ_{cos}(x_2, y) denote the similarities between two query vectors x_1, x_2 and a target vector y. When x_1 and x_2 differ only in length, we have θ_{x_1,y} = θ_{x_2,y} ⇒ τ_{cos}(x_1, y) = τ_{cos}(x_2, y), while the Euclidean distance d_2(x, y) uses both angle and vector lengths, as illustrated in figure 5.7. In some cases, such as matching a domain block with a range block, the cosine distance can be more useful than the Euclidean distance: if the pixel values of a block are multiplied by a contrast factor, the matching result will not change, and only the contrast parameter in the fractal codes may change.
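A minimal sketch of the cosine similarity and its invariance to scaling (such as a contrast factor applied to a block):

```python
import numpy as np

def tau_cos(x, y):
    """Cosine similarity: depends only on the angle between x and y,
    not on their lengths."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
```

Multiplying either vector by a positive scalar leaves tau_cos unchanged, whereas the Euclidean distance between the vectors does change.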
5.4.3 Fractal Similarity Measures
Each image can be represented as a point in image space, RN , where N is the
number of pixels. For the purpose of illustrating convergence and distances in
feature space, we will use a two-dimensional image space, X = [x, y]. Fractal
code parameters can then be represented using matrix A and vector B in the
transformation F (X) = A × X + B. We choose different initial images and
different fractal parameters to show how distances in image space can be used
for classification. When the fractal code image xf is applied iteratively to an
initial image, say x01 the image converges to xf after several iterations. We
want to find that image in the database which is closest to xf . If an Euclidean
distance based on grayscale differences between corresponding pixels is used, the
distance between a database image and the query image is not very reliable
because it can change considerably with noise and with small changes to the
query image. For example, a small misalignment between the images can cause
large differences between grayscale values of corresponding pixels. Therefore, a
more robust distance measure is required. One such distance is that between
two successive iterations of an image when the fractal code for xf is applied to
it. Figure 5.8 shows the trajectories in image space as images x01, x02 and x03
converge towards xf for the simplified two-dimensional case. It can be observed
that image x03 is closest to xf and that d3 is also the shortest of d1, d2 and d3. This relationship holds regardless of the fractal parameters A and B. The fractal parameter A can be decomposed into a rotation matrix ρθ and a scale factor s. Figure 5.9 shows convergence trajectories for the same images when A = 0.9 × ρ15 and B = (I − 0.9 × ρ15) × xf. Figures 5.8 and 5.9 correspond to a scale factor s = 0.9
but different rotation matrices. Figures 5.10 and 5.11 are the same plots with
the same rotation matrices as 5.8 and 5.9, respectively, but with the scale factor
changed to s = 0.6. Note the faster convergence for the lower value of s. We can
use distance d between an image and its first iterate when the fractal code of xf
is applied to it as a measure of the distance of this image to xf . All images in
the database are subjected to this transformation and distances are compared.
The image with the least distance is used to identify the person (there may be
more than one image of the same person in the database used for training). A
similarity score between a database image, xi, and a query image, xf , can also be
defined. One such score could be e−d, which is guaranteed to be between 0 and
1, corresponding to least similar and identical cases, respectively.
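The one-iteration classification rule just described, in the simplified two-dimensional setting of Figure 5.8, might look like the following sketch; the fractal code (A, B), the database points and their labels are all illustrative.

```python
import numpy as np

def identify(query_code, database):
    """Apply the query's fractal code (A, B) to each database image
    for one iteration; the image that moves least is the best match."""
    A, B = query_code
    best, best_d = None, np.inf
    for name, x in database.items():
        d = np.linalg.norm(A @ x + B - x)   # distance to first iterate
        if d < best_d:
            best, best_d = name, d
    return best, best_d

# 2-D toy example: a contractive code with fixed point xf = [1, 1]
s, theta = 0.9, np.deg2rad(45)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A = s * rot
xf = np.array([1.0, 1.0])
B = (np.eye(2) - A) @ xf

db = {"far": np.array([3.0, 0.2]), "near": np.array([1.1, 0.9])}
# identify((A, B), db) picks "near": the point closest to xf moves least
```

The similarity score e^{-d} can then be computed from the returned distance.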
5.5 Using Fractal Image-set Coding for Face
Recognition
For applications such as criminal identification, it would be useful to have a
computer system that understood human conceptions of facial similarity; for
instance, one can imagine wanting a database program that could retrieve similar
faces from a comprehensive mug shot database after a witness selected one face
from a small initial grouping. The fractal codes extracted by Fractal image-set
coding method for a face data-set F have the advantage that all the codes have the
same geometrical parameters, and therefore the luminance parameters are more comparable than with traditional fractal codes. For face recognition applications, we can divide the image database into two image-sets, a training set and a test set. The sample image x can be the average image of the test set, the training set or the entire database. These cases may suit face recognition from a closed set or an open set. The results have been found not to change much with the choice of the image-set from which the geometrical parameters are extracted.
Since some popular face recognition systems, such as Eigenface-based systems, rely heavily on predetermined eye locations, the effect of eye-finding accuracy on these systems is significant [60]. Fractal image-set coding uses eye locations to normalize the face images, so the accuracy of the eye localization process may affect the recognition accuracy. However, using fixed fractal geometrical parameters, as well as applying block-wise operations on blocks of size 16×16 or 8×8, reduces the effect of a one- or two-pixel shift in the images.
5.6 Experimental Results
We selected the first 39 individuals from the XM2VTS database and 4 face images per individual. The first image was used as a test image while the other images were added to the training set. Eye location information is used to normalize and align the images to a 128 × 128 pixel grid; the eye coordinates are then fixed and 64 pixels apart. The average image x over the entire data-set is calculated and used to extract the shared geometrical fractal parameters.

The fractal code of a query (test) image is applied to all the images in the training set for one iteration. The distance d between each of these transformed images and the corresponding initial (target) image is used as a measure of distance between the test image and the target image. This value is divided by the number of pixels and the maximum pixel value (256). A similarity score, e^{−d}, which is
bounded between 0 and 1, indicates the closeness of the match between the target
image and the test image. The target that has the highest similarity score is
the recognized identity. The next 5 best matches reveal the effectiveness of the
method. Several such test cases are shown in figures 5.12 to 5.16.
The distance value d can be normalized using:

d_{norm} = \frac{d - \min(d)}{\max(d) - \min(d)}

The similarity score S_{norm} = e^{−d_{norm}} then better illustrates the similarity between the images, as shown in figure 5.13.
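The normalization above, sketched with made-up distance values:

```python
import numpy as np

d = np.array([0.0021, 0.0019, 0.00002, 0.0025])  # per-image distances

d_norm = (d - d.min()) / (d.max() - d.min())
s_norm = np.exp(-d_norm)    # 1 for the best match, down to e^-1 ~ 0.37
best = int(np.argmin(d))    # index of the recognized identity
```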
The recognition accuracy of this system is 95%: in only 2 of the 39 cases is the best match not a face image of the correct individual, as shown in figures 5.17 and 5.19. However, it should be noted that in figure 5.17 the second closest match is of the correct individual; the facial hair change is simply too severe for the method to cope with. The query image and 3 training images of this individual are shown in figure 5.18. Figure 5.19 shows the second and last failed test. Wearing spectacles and a change in expression are the main differences between the test and training images for this individual, as shown in figure 5.20.
The biggest advantage this method offers is that the effects of changes in parts of a face are confined by the geometrical parameters used and by the number of iterations. Since the parameters are common to all codes, we can choose to emphasize or de-emphasize certain regions, thereby achieving robustness to the presence of spectacles or expression changes. This is not easily possible in other, non-fractal methods, where a change in one part of a face affects all features (such as in the Eigenface approach without part segmentation), or in fractal methods where the geometrical fractal code parameters vary from image to image.
5.7 Summary
In this chapter, fractal image-set encoding and its application in face recognition
and facial image retrieval are explained. A fractal code of an arbitrary gray-
scale image can be divided in two parts – geometrical parameters and luminance
parameters. Because the fractal codes for an image are not unique, we can change the set of fractal parameters without significant change in the quality of the reconstructed image. Fractal image-set coding keeps the geometrical parameters the same for all images in the database. Differences between images are captured in the non-geometrical, or luminance, parameters, which are faster to compute.
For recognition purposes, the fractal code of a query image is applied to all the
images in the training set for one iteration. The distance between an image and
the result after one iteration is used to define a similarity measure between this
image and the query image. Results on a subset of the XM2VTS database are
presented.
Figure 5.6: PSNR versus the number of decoding steps for 4 different 128 × 128, gray-scale, normalized, encoded images of the XM2VTS database. The dash-dot line, solid line, dashed line and dotted line correspond to images 002 1 1, 000 1 1, 003 1 1 and 005 4 1 of the XM2VTS database, respectively.
Figure 5.7: Euclidean distance takes both angle and vector lengths into account to calculate the distance, while cosine distance only takes the angle into account.
58 5.7 Summary
Figure 5.8: Convergence trajectories for three different initial images when the same fractal code is applied iteratively. Note that the initial image (x03) closest to the fixed point shows the least distance between successive iterations (d3 < d2 < d1). The fractal parameters are A = 0.9 × ρ45 and B = (I − 0.9 × ρ45) × xf.
Figure 5.9: Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.9 × ρ15 and B = (I − 0.9 × ρ15) × xf.
Figure 5.10: Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.6 × ρ45 and B = (I − 0.6 × ρ45) × xf .
Figure 5.11: Convergence trajectories for the same three initial images when the fractal code parameters are A = 0.6 × ρ15 and B = (I − 0.6 × ρ15) × xf .
Figure 5.12: An example showing a query image on top followed by the six closest images in the database. The best match is on the top-left, followed by others left to right in row-first order. Note that the first three matches are images of faces of the correct person and some change in expression is tolerated by the method.
Figure 5.13: The error (top left) and the similarity (top right) between the query image and the images in the training data-set. Errors are all very small. Normalized error (bottom left) and normalized similarity (bottom right) for the same images. Note that the normalized similarity measure clearly shows the best matching face number as 9. Values of this measure for other faces are below 0.7 in this case.
Figure 5.14: Another example showing a correctly identified case. Note here that there is a more marked change in facial expression and pose.
Figure 5.15: Yet another correctly identified case. Note that the first three matches are images of faces of the correct person.
Figure 5.16: Yet another correctly identified test case. Note here that the query image is of a light-skinned individual and so are all the 6 closest matched images.
Figure 5.17: A test case that failed. The second closest match is of the correct individual but the facial hair change is too severe for the method to cope. The closest matched face appears to belong to a different person but with very similar expression and features such as the eyes and mouth.
5.7 Summary 63
Test image
01921
.png 01931
.png 01941
.png
Figure 5.18: Query image (top) and training images (bottom) for the individualnumber 019
Figure 5.19: The only other test case that failed. The fourth and sixth closest matches are of the correct individual.
Figure 5.20: Query image (top) and training images (bottom) for individual number 005.
Figure 5.21: A plot showing accuracy versus the number of persons in the database. Three images are used for each person in the training set and one image per person in the test set.
Chapter 6
Subfractals
6.1 Introduction
As shown in previous chapters, the fractal code of an image is a set of contractive mapping transformations, each of which transfers a domain block to its corresponding range block. The distribution of the domain blocks selected for the range blocks of an image depends on the image content and on the fractal encoding algorithm. Some methods search for the best matching domain, while others use the first match. Domain blocks can be square, rectangular, triangular and so on, and the size of the domain blocks in the domain pool can be fixed or variable. All of these parameters can combine to make the fractal codes sensitive to small changes in the image. A small variation in one part of the input image may change the contents of the range and domain blocks in the fractal encoding process, resulting in a change in transformation parameters in the same part or even in other parts of the image. In this chapter, we introduce a new method of fractal image coding that makes the fractal code of each part independent of variations in other parts.
6.2 Basic Concepts
Is there any local relationship between the range and domain blocks of an image? This is one of the first questions that any researcher in this field may ask. Fisher [35], in his book (chapter 3, pages 69-72), tried to show that the position of the corresponding domain block for each range block is random relative to it. Fisher plotted the distributions of the difference in the x and y positions of the domains and ranges for an encoding of the 512 × 512 Lena image, as well as the theoretical distribution of the difference of two randomly selected points, as shown in Figures 6.1 and 6.2. In these figures, (xr, yr) and (xd, yd) are the range and domain positions. Fisher calculated the probability distribution of dx and dy, where dx and dy are the differences in the x and y coordinates of two points chosen randomly in the unit square with uniform probability, as ρ(dx) = 1 − |dx| and ρ(dy) = 1 − |dy|.
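The triangular law ρ(d) = 1 − |d| for the difference of two uniform points can be checked with a quick Monte-Carlo sketch; numpy is assumed, and the sample size is an arbitrary choice of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Differences of two points chosen uniformly in the unit interval
dx = rng.random(n) - rng.random(n)

# Empirical density over 20 bins vs the triangular law rho(d) = 1 - |d|
bins = np.linspace(-1, 1, 21)
hist, edges = np.histogram(dx, bins=bins, density=True)
centers = (edges[:-1] + edges[1:]) / 2
theory = 1 - np.abs(centers)

print(np.max(np.abs(hist - theory)))   # small: empirical density matches 1 - |d|
```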
Figure 6.1: The distribution of the difference in the x position of the domains xd and ranges xr for an encoding of the 512 × 512 Lena image, as well as the theoretical distribution (dashed line), (1 − |x|/512)/512, of the difference of two randomly selected points. Adapted from [35].
Figure 6.2: The distribution of the difference in the y position of the domains yd and ranges yr for an encoding of the 512 × 512 Lena image, as well as the theoretical distribution (dashed line), (1 − |y|/512)/512, of the difference of two randomly selected points. Adapted from [35]. Note that the distribution is skewed and also has significantly large values close to 0.
In the book, Fisher notes: “so even when the points are chosen randomly, it appears that there is a preference for local domains. However, this is an artifact . . . there is a slight preference for local domains, but the effect is small”. The effect may be small for fractal compression, but it plays a big role in fractal recognition. If the relation between range and domain blocks were random, a small variation in one part of the image would change the range and domain blocks in a random area. This change could in turn alter the fractal codes of all the range blocks corresponding to those domain blocks. In other words, if the distribution of domain blocks were random, a small change in some part of an image would affect the fractal codes of other parts, and the change would propagate randomly. On the other hand, as Fisher explained, traditional fractal image coding methods prefer to choose local domain blocks for each range block, although this does not always happen. Our experiments have shown that non-constant range blocks from a given segment tend to use domain blocks from the same segment. As can be inferred from Figures 6.1 and 6.2, for a sample image like Lena (512 × 512) the number of range blocks that match domain blocks within a neighborhood of radius 60 is significantly higher than would be expected for random matching between blocks. This is owing to shared local properties such as texture. This fact makes some uses of fractal codes for recognition (for example, [25]) robust to variations such as expression changes on a face, because these variations cause only small local changes around the lips or eyes that do not affect the entire fractal code. In contrast, the difference between two faces is a large change that affects the block partitioning, the range blocks and the domain blocks, so the entire code changes.
To generalize this beneficial property, we propose a new fractal coding method that chooses the domain block for each range block from the same area as the range block. This guarantees that any change in an area or segment will only affect the fractal codes related to that area and will not propagate anywhere else. In other words, the fractal codes of different areas of the image are independent.
A subfractal is defined to be a set of fractal codes that map a subset of the domain blocks in an image to the range blocks that cover one part of the image. These codes are calculated to be independent of the codes of the other parts of the same image.
6.3 Subfractal Coding
To calculate the subfractals of an image we propose the following algorithm. We assume here that the images are face images from a standard face database such as the Banca face database:
Step 0 (preprocessing) - For all face images use eye locations and histogram
equalization to form a geometrically and photometrically normalized face
image data-set.
Step 1 - Nominate the subfractal area for each part, such as the left and right eyes, nose, lips and the rest of the image, manually for only one arbitrary normalized image of the database. This information will be used for all other normalized images of the database as well.
Step 2 - For each subfractal, partition the area with non-overlapping r×r range
blocks.
Step 3 - Cover the subfractal area with a sequence of overlapping domain blocks in k different sizes 2r × 2r, 2²r × 2²r, . . . , 2ᵏr × 2ᵏr to form a domain pool for that area. Also add the 90°, 180° and 270° rotated versions of each block to the domain pool, and add the mirrored version of each member of the domain pool as well.
Step 4 - For each range block, find the domain block from the domain pool of the same subfractal area that best covers the range block. This can be done by minimizing the distance function E(R, D):
$$E(R,D) = \sqrt{\sum_{i=1}^{r}\sum_{j=1}^{r}\bigl(R(i,j) - T(D)(i,j)\bigr)^{2}}$$
between range block R and domain block D. The transformation
$$T(D) = \mathrm{Flip}\Bigl(F, \mathrm{Rotate}\bigl(\theta, \mathrm{Resize}(\tfrac{1}{L}, D)\bigr)\Bigr)$$
resizes ($L \in \{2, 4, \ldots, 2^{k}\}$), rotates ($\theta \in \{0, \pi/2, \pi, 3\pi/2\}$, matching the rotations of Step 3) and flips ($F \in \{0 = \text{no flip}, 1 = \text{horizontal flip}\}$) the domain block to match the corresponding range block.
Step 5 - Record geometrical positions of the range block and the domain block
as well as parameters L, θ, F as the geometrical part of the fractal code for
the range block.
Step 6 - Calculate the luminance parameters s and o and record them as the other part of the code:
$$s = \frac{\alpha}{\beta}, \qquad o = \bar{R} - \frac{\alpha}{\beta}\,\bar{D}$$
where
$$\alpha = \sum_{i=1}^{r}\sum_{j=1}^{r}\bigl(T(D)(i,j) - \bar{D}\bigr)\bigl(R(i,j) - \bar{R}\bigr), \qquad \beta = \sum_{i=1}^{r}\sum_{j=1}^{r}\bigl(T(D)(i,j) - \bar{D}\bigr)^{2}$$
$$\bar{D} = \frac{1}{r^{2}}\sum_{i=1}^{r}\sum_{j=1}^{r} T(D)(i,j), \qquad \bar{R} = \frac{1}{r^{2}}\sum_{i=1}^{r}\sum_{j=1}^{r} R(i,j)$$
Step 7 - Repeat steps 4-6 for all range blocks in the subfractal area.
Step 8 - Repeat steps 2-7 for all subfractals in the image.
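The matching steps above (Steps 4 to 6) can be sketched in Python. This is a simplified illustration under stated assumptions, not the thesis implementation: blocks are plain numpy arrays, the domain pool is a list of square blocks, and the helper names (`transform_block`, `best_match`, `luminance_params`) are ours.

```python
import numpy as np

def transform_block(D, L, rot_quarter, flip):
    """T(D) of Step 4: shrink an (L*r x L*r) domain block to r x r by
    pixel averaging, rotate by quarter turns, optionally mirror."""
    r = D.shape[0] // L
    small = D.reshape(r, L, r, L).mean(axis=(1, 3))   # Resize(1/L, D)
    small = np.rot90(small, k=rot_quarter)            # Rotate(theta, .)
    return np.fliplr(small) if flip else small        # Flip(F, .)

def block_error(R, TD):
    """E(R, D): root of the summed squared pixel differences."""
    return np.sqrt(np.sum((R - TD) ** 2))

def luminance_params(R, TD):
    """Step 6: least-squares contrast s and offset o so that s*TD + o ~ R."""
    Dm, Rm = TD.mean(), R.mean()
    beta = np.sum((TD - Dm) ** 2)
    s = np.sum((TD - Dm) * (R - Rm)) / beta if beta > 0 else 0.0
    return s, Rm - s * Dm

def best_match(R, domain_pool):
    """Inner loop of Steps 4-7: scan all transformed domain candidates
    of the same subfractal area and keep the lowest-error one."""
    best = None
    for idx, D in enumerate(domain_pool):
        L = D.shape[0] // R.shape[0]
        for rot in range(4):
            for flip in (False, True):
                e = block_error(R, transform_block(D, L, rot, flip))
                if best is None or e < best[0]:
                    best = (e, idx, rot, flip)
    return best

# Hypothetical 4x4 range block and a pool of five 8x8 domain blocks
rng = np.random.default_rng(0)
R = rng.random((4, 4))
pool = [rng.random((8, 8)) for _ in range(5)]
err, idx, rot, flip = best_match(R, pool)
```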
In figure 6.3, the range blocks in four major subfractals (eyes, nose and lips) and the corresponding domain blocks for an arbitrary face image are shown. A plot of pixel values for the last matched domain and range block is also shown. Examination of this plot for all the range blocks shows that, even with the restriction of choosing domain blocks from a subfractal area smaller than the whole image, there is enough freedom of choice to find a good match for most of the range blocks. This arises from the overlapping of the domain blocks, which rapidly increases the number of domain blocks in the domain pool, and from the existence of different transformed versions of each block in the pool. To speed up the coding process, we can encode constant range blocks with only their geometrical parameters and their average pixel values.
Figure 6.3: Range blocks (top left) in four major subfractal areas (eyes, nose and lips) and the corresponding domain blocks (bottom rows) for an arbitrary face image. Top right, a plot of pixel values vs. pixel numbers for the last matched domain and range block.
6.4 Mathematical Basis
As shown in previous chapters, an $N \times N$ image $x_f$ can be represented as the unique attractor of a set of iterated contractive transformations:
$$x_f = A \times x_f + B$$
In this equation A and B are the fractal parameters of the image $x_f$ and are defined as:
$$A \times x_f = \sum_{i=1}^{M} (\Gamma^{r_i}_{n_i,m_i})^{*}\bigl\{G_i\bigl(\Gamma^{d_i}_{k_i,l_i}(x_f)\bigr)\bigr\}, \qquad B = \sum_{i=1}^{M} (\Gamma^{r_i}_{n_i,m_i})^{*}(H_i)$$
Here $\Gamma^{k}_{n,m} : \Im^{N} \to \Im^{k}$, where $k \le N$, is a get-block operator which extracts the $k \times k$ block with lower left corner at $(n, m)$ from the original $N \times N$ image, and $(\Gamma^{k}_{n,m})^{*} : \Im^{k} \to \Im^{N}$ is a put-block operator which inserts a $k \times k$ image block into an $N \times N$ zero image at the location with lower left corner at $(n, m)$. $G_i : \Im^{d_i} \to \Im^{r_i}$ is the operator that shrinks (assuming $d_i > r_i$), translates ($(k_i, l_i) \to (n_i, m_i)$) and applies a contrast factor $s_i$, while $H_i$ is a constant $r_i \times r_i$ matrix that represents the brightness offset.
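The get-block and put-block operators have a direct array interpretation. A minimal numpy sketch, with one deliberate simplification: arrays are indexed from the top-left corner, whereas the text uses the lower-left corner as origin.

```python
import numpy as np

def get_block(img, n, m, k):
    """Gamma^k_{n,m}: extract the k x k block with corner at (n, m)."""
    return img[n:n + k, m:m + k]

def put_block(block, n, m, N):
    """(Gamma^k_{n,m})*: insert a k x k block into an N x N zero image
    at the location with corner at (n, m)."""
    out = np.zeros((N, N))
    k = block.shape[0]
    out[n:n + k, m:m + k] = block
    return out
```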
Because $G_i$ is a combination of a geometrical transformation and a brightness scaling, we can show that the matrix A is the product of a contrast matrix Ψ and another matrix Λ, which we call the distribution matrix:
$$A = \Psi \times \Lambda$$
The values in the contrast matrix Ψ are the contrast factors $s_i$ ($0 \le s_i < 1$). The distribution matrix Λ captures the relationship between each pixel of a range and the corresponding pixels of the domain, so in each column of the matrix we have non-zero values only in the rows corresponding to the domain pixels which affect that range pixel. As the fractal code of an image is not unique, there are many different possible values for Ψ and Λ. We can study these general cases:
many different possible values for Ψ and Λ. We can study these general cases:
Case 1 - Each range pixel is related to only one domain pixel; each column of Λ has only one non-zero value $\lambda_i$:
$$A = \begin{pmatrix}
0 & s_1 & \cdots & 0 \\
\cdots & 0 & \cdots & s_2 \\
s_3 & 0 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & s_n & 0
\end{pmatrix}
\times
\begin{pmatrix}
0 & \cdots & \lambda_3 & \cdots & 0 \\
\lambda_1 & 0 & \cdots & 0 & \vdots \\
\vdots & & \ddots & & \lambda_n \\
0 & \lambda_2 & 0 & \cdots & 0
\end{pmatrix}$$
This case can only happen when the size of the range blocks is equal to the size of the domain blocks, which is not true for most fractal image encoding methods.
Case 2 - Each range pixel is related to all the pixels of the image:
$$A = \begin{pmatrix}
s_1 & s_1 & \cdots \\
s_2 & s_2 & \cdots \\
\vdots & \vdots & \ddots \\
s_n & \cdots & s_n
\end{pmatrix}
\times
\begin{pmatrix}
\lambda_{11} & \lambda_{12} & \cdots & \lambda_{1n} \\
\lambda_{21} & \lambda_{22} & \cdots & \lambda_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
\lambda_{n1} & \lambda_{n2} & \cdots & \lambda_{nn}
\end{pmatrix}$$
This case can only happen when the range blocks are derived from the entire image and not only from a portion of the image.
Case 3 - Each range pixel is related to some of the domain pixels of the image. In this case, each column of the distribution matrix has some zero and some non-zero values. The subfractal concept is one special subclass of this case. For subfractals, we choose domain and range blocks from the same portion of the image, so the matrices A and Λ are sparse, but we can re-arrange them into block-diagonal form, with one block per subfractal.
We will illustrate this idea with an example. Suppose image X is the 3 × 3 grayscale image below, with 3 different subfractal areas a, b, and c:
$$X = \begin{pmatrix}
a_1 & b_1 & b_2 \\
a_2 & a_3 & a_4 \\
c_1 & c_2 & a_5
\end{pmatrix}$$
So $x_f$ can be written as the column vector
$$x_f = A \times x_f + B, \qquad x_f = (a_1, b_1, b_2, a_2, a_3, a_4, c_1, c_2, a_5)^{T}$$
$$A = \Psi \times \Lambda$$
$$\Psi = \begin{pmatrix}
s_{a11} & 0 & 0 & s_{a12} & s_{a13} & s_{a14} & 0 & 0 & s_{a15} \\
0 & s_{b11} & s_{b12} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & s_{b21} & s_{b22} & 0 & 0 & 0 & 0 & 0 & 0 \\
s_{a21} & 0 & 0 & s_{a22} & s_{a23} & s_{a24} & 0 & 0 & s_{a25} \\
s_{a31} & 0 & 0 & s_{a32} & s_{a33} & s_{a34} & 0 & 0 & s_{a35} \\
s_{a41} & 0 & 0 & s_{a42} & s_{a43} & s_{a44} & 0 & 0 & s_{a45} \\
0 & 0 & 0 & 0 & 0 & 0 & s_{c11} & s_{c12} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & s_{c21} & s_{c22} & 0 \\
s_{a51} & 0 & 0 & s_{a52} & s_{a53} & s_{a54} & 0 & 0 & s_{a55}
\end{pmatrix}$$
$$\Lambda = \begin{pmatrix}
\lambda_{a11} & 0 & 0 & \lambda_{a12} & \lambda_{a13} & \lambda_{a14} & 0 & 0 & \lambda_{a15} \\
0 & \lambda_{b11} & \lambda_{b12} & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \lambda_{b21} & \lambda_{b22} & 0 & 0 & 0 & 0 & 0 & 0 \\
\lambda_{a21} & 0 & 0 & \lambda_{a22} & \lambda_{a23} & \lambda_{a24} & 0 & 0 & \lambda_{a25} \\
\lambda_{a31} & 0 & 0 & \lambda_{a32} & \lambda_{a33} & \lambda_{a34} & 0 & 0 & \lambda_{a35} \\
\lambda_{a41} & 0 & 0 & \lambda_{a42} & \lambda_{a43} & \lambda_{a44} & 0 & 0 & \lambda_{a45} \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c11} & \lambda_{c12} & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c21} & \lambda_{c22} & 0 \\
\lambda_{a51} & 0 & 0 & \lambda_{a52} & \lambda_{a53} & \lambda_{a54} & 0 & 0 & \lambda_{a55}
\end{pmatrix}$$
Now, we define a swapping transformation $\Upsilon^{i,j}_{row}(X)$ as a transformation which swaps row(i) and row(j) of a matrix or vector X with each other. In the same way, we define $\Upsilon^{i,j}_{col}(X)$ for swapping col(i) and col(j). Using linear algebra, it can easily be shown that:
$$\Upsilon^{i,j}_{row}(x_f) = \Upsilon^{i,j}_{row}(A \times x_f + B) = \Upsilon^{i,j}_{row}\bigl(\Upsilon^{i,j}_{col}(A)\bigr) \times \Upsilon^{i,j}_{row}(x_f) + \Upsilon^{i,j}_{row}(B)$$
and
$$\Upsilon^{i,j}_{row}\bigl(\Upsilon^{i,j}_{col}(A)\bigr) = \Upsilon^{i,j}_{row}\bigl(\Upsilon^{i,j}_{col}(\Psi)\bigr) \times \Upsilon^{i,j}_{col}\bigl(\Upsilon^{i,j}_{row}(\Lambda)\bigr)$$
So after the series of transformations
$$x_f \leftarrow \Upsilon^{3,2}_{row}\bigl(\Upsilon^{1,2}_{row}\bigl(\Upsilon^{7,8}_{row}\bigl(\Upsilon^{9,8}_{row}(x_f)\bigr)\bigr)\bigr)$$
the forms of Ψ and Λ will be:
$$\Psi = \begin{pmatrix}
s_{b11} & s_{b12} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
s_{b21} & s_{b22} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & s_{a11} & s_{a12} & s_{a13} & s_{a14} & s_{a15} & 0 & 0 \\
0 & 0 & s_{a21} & s_{a22} & s_{a23} & s_{a24} & s_{a25} & 0 & 0 \\
0 & 0 & s_{a31} & s_{a32} & s_{a33} & s_{a34} & s_{a35} & 0 & 0 \\
0 & 0 & s_{a41} & s_{a42} & s_{a43} & s_{a44} & s_{a45} & 0 & 0 \\
0 & 0 & s_{a51} & s_{a52} & s_{a53} & s_{a54} & s_{a55} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & s_{c11} & s_{c12} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & s_{c21} & s_{c22}
\end{pmatrix}$$
$$\Lambda = \begin{pmatrix}
\lambda_{b11} & \lambda_{b12} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\lambda_{b21} & \lambda_{b22} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \lambda_{a11} & \lambda_{a12} & \lambda_{a13} & \lambda_{a14} & \lambda_{a15} & 0 & 0 \\
0 & 0 & \lambda_{a21} & \lambda_{a22} & \lambda_{a23} & \lambda_{a24} & \lambda_{a25} & 0 & 0 \\
0 & 0 & \lambda_{a31} & \lambda_{a32} & \lambda_{a33} & \lambda_{a34} & \lambda_{a35} & 0 & 0 \\
0 & 0 & \lambda_{a41} & \lambda_{a42} & \lambda_{a43} & \lambda_{a44} & \lambda_{a45} & 0 & 0 \\
0 & 0 & \lambda_{a51} & \lambda_{a52} & \lambda_{a53} & \lambda_{a54} & \lambda_{a55} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c11} & \lambda_{c12} \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \lambda_{c21} & \lambda_{c22}
\end{pmatrix}$$
The matrices Ψ and Λ can be divided into independent matrices $\Psi_a$, $\Psi_b$, $\Psi_c$ and $\Lambda_a$, $\Lambda_b$, $\Lambda_c$. This is because we used subfractals and, within each subfractal, pixels are related only to other pixels of its own area. Thus
$$x_f = \begin{pmatrix} X_a \\ X_b \\ X_c \end{pmatrix}
= \begin{pmatrix} \Psi_a & 0 & 0 \\ 0 & \Psi_b & 0 \\ 0 & 0 & \Psi_c \end{pmatrix}
\times \begin{pmatrix} \Lambda_a & 0 & 0 \\ 0 & \Lambda_b & 0 \\ 0 & 0 & \Lambda_c \end{pmatrix}
\times \begin{pmatrix} X_a \\ X_b \\ X_c \end{pmatrix}
+ \begin{pmatrix} B_a \\ B_b \\ B_c \end{pmatrix}$$
and finally
$$X_a = \Psi_a \times \Lambda_a \times X_a + B_a$$
$$X_b = \Psi_b \times \Lambda_b \times X_b + B_b$$
$$X_c = \Psi_c \times \Lambda_c \times X_c + B_c$$
These formulas clearly show that the fractal code of an image can be divided into several independent subfractal codes. Each pixel in a subfractal area is related only to other pixels of the same area.
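The independence claim can be checked numerically: with a block-diagonal A, solving $x_f = A x_f + B$ for the whole image gives exactly the per-area attractors, and perturbing one area's parameters leaves the other areas' pixels untouched. Below is a sketch with random contractive blocks whose sizes match areas a, b and c of the 3 × 3 example; all matrices here are hypothetical stand-ins, not real fractal codes.

```python
import numpy as np

rng = np.random.default_rng(0)

def contractive(n):
    """Random n x n matrix rescaled so its spectral radius is 0.8 < 1."""
    M = rng.random((n, n))
    return 0.8 * M / np.abs(np.linalg.eigvals(M)).max()

sizes = [5, 2, 2]                       # pixels of areas a, b, c
As = [contractive(n) for n in sizes]
Bs = [rng.random(n) for n in sizes]

# Fixed point of each independent subsystem: X_i = A_i X_i + B_i
fixed = [np.linalg.solve(np.eye(n) - A, B)
         for n, A, B in zip(sizes, As, Bs)]

# Assemble the block-diagonal A of the whole image and solve once
n_tot = sum(sizes)
A_full = np.zeros((n_tot, n_tot))
B_full = np.concatenate(Bs)
off = 0
for n, A in zip(sizes, As):
    A_full[off:off + n, off:off + n] = A
    off += n
x_full = np.linalg.solve(np.eye(n_tot) - A_full, B_full)

# Changing B of area b moves only b's pixels in the attractor
B2 = B_full.copy()
B2[5:7] += 1.0
x2 = np.linalg.solve(np.eye(n_tot) - A_full, B2)
print(np.abs(x2 - x_full)[[0, 7]])      # pixels of areas a and c: no change
```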
6.5 How to Use Subfractals for Face Recogni-
tion
Hancock’s [41] psychophysical observations show that human face recognition is most likely based on low-level image properties, rather than on an abstract representation of the face. Certain image transformations, such as intensity negation, strange viewpoint changes, and changes in lighting direction, can severely disrupt human face recognition. Fractal codes show some degree of robustness to some of these changes, such as intensity negation. However, in traditional fractal image coding systems, the fractal code of one part of an image is not independent of changes in other parts of the same image. Subfractals, unlike traditional fractal codes, do not have this problem, because the subfractal codes of an image are defined to be independent. This fact can make subfractals more suitable for applications such as image and face recognition.
To determine which parts of the face should form subfractals, we devised a test. In this test, 10 pairs of face images were shown to 10 volunteers (5 males and 5 females), who were asked to verify whether the two images in each pair belong to the same person or not. At the same time, the gaze data of these volunteers was collected using an Eye-Gaze Tracking System (Figure 6.4; see Appendix A for more details). With this system, we can determine where on the computer monitor, and for how long, the user is looking. This information is used to show which parts of the face were compared to verify the person.
Figure 6.4: A view of the eye-gaze tracking system
Figures 6.5 to 6.12 show 4 pairs of face images and the results of the eye-gaze tracking system for 10 viewers. The results are shown as circles on the face images. The center of each circle shows the gaze point and the radius of each circle shows the duration of gaze at that point.
The results in figures 6.6 and 6.8 show that the eye, nose and lip areas are the most important areas for viewers when verifying identity. In figure 6.9 the pair of face images is inverted in grayscale (negative images). About half of the viewers could not verify these images correctly, as shown in figure 6.10. However, the most important areas for viewers were again the nose, eyes and lips. Figure 6.12 shows how viewers compare a face image and a semi-drawing image. These results, together with those from the other 6 image pairs, show that the most important areas for the human face verification task are the eyes, nose and lips. Negative images are more difficult for humans to verify than normal images, while, as shown in the example in section 4.3.3 (page 40), a fractal recognition system can deal with this difficulty very well.
Based on these results, suitable subfractal areas for a face must contain the left and right eyes, the nose and the lips. To generate a complete fractal code of an image, the other parts of the face are also coded.
6.6 Summary
This chapter introduced the concept and underlying mathematics of a new fractal code for an image, called subfractal coding. With this method, the fractal code of an image can be divided into several subfractals. Each subfractal is defined to be independent of the others, so changes in one part of the image do not affect the subfractal codes of other parts of the same image.
Figure 6.5: A pair of face images shown to volunteers to verify the identity.
Figure 6.6: An illustration showing the results of the eye-gaze tracking system for 10 viewers. The center of each circle shows the gaze point and the radius of each circle shows the duration of gaze at that point.
Figure 6.7: Another pair of face images shown to volunteers to verify the identity.
Figure 6.8: The results of the eye-gaze tracking system show that the eye, nose and lip areas are the most important areas for viewers when verifying identity.
Figure 6.9: Another pair of face images. The face images are inverted in grayscale (negative images).
Figure 6.10: The results of the eye-gaze tracking system for negative images.
Figure 6.11: Yet another pair of face images. Note that the left face image is inverted in grayscale and the right face image is a semi-drawing.
Figure 6.12: The results of the eye-gaze tracking system for the face and semi-drawing image pair.
Chapter 7
Future Work and Conclusions
This thesis started to address four research questions:
1- Is it possible to use fractal codes of grayscale images as features
for recognition?
It has been shown throughout this thesis that fractal codes have a great capability to be used for recognition tasks such as face recognition. As described in Chapter 4, the fractal parameters of an image form a self-similarity based representation of that image and can be used as features for face recognition. The fractal codes of different images differ in the number of fractal features, so the system presented for using fractal codes as features contains a method for normalizing the features to generate reduced feature vectors of the same size. As the fractal code of an image contains several different parts, some variations in images, such as a shift in brightness, affect some of the parameters while others remain unchanged. This results in some degree of robustness in the system.
2- What is the mathematical basis for using fractals for recognition?
The extraction of fractal code from an image involves the partitioning of
the image into a set of range blocks. There is also a corresponding set of
domain blocks to choose from. For each range block, a suitable domain
block is found using some prescribed criterion. The mapping between the
domain and range blocks, which is a contractive, similarity transformation,
forms the fractal code for this range block. The fractal code for the image is a collection of the fractal codes for all range blocks. The fractal code of an image is not unique. An image xf can be represented as the attractor of a contractive transformation T of the form T(xf) = A × xf + B = xf.
3- Is it possible to design a more suitable fractal coding system for recognition?
The fractal code of an image is a set of transformations. Each transformation has two parts: a geometrical part and a luminance part. Fractal image-set coding keeps the geometrical parameters the same for all images in the database. Differences between images are captured in the non-geometrical or luminance parameters, which are faster to compute. For recognition purposes, the fractal code of a query image is applied to all the images in the training set for one iteration. The distance between an image and the result after one iteration is used to define a similarity measure between this image and the query image. Experiments show that this system can achieve a 95% accuracy rate on a subset of the XM2VTS database, with only 2 of 39 test cases failing.
4- Are the different parts of a fractal code independent? And if not, how can we
define and extract independent fractal codes of different parts of an image?
Experience with face images shows that changes in one part of the image may affect the fractal codes of that part and also of other parts of the image. Chapter 6 defines the subfractal, which is a new type of fractal code for an image. Each subfractal is defined to be independent of the others. An algorithm is presented for the extraction of subfractal codes.
7.1 Future Work
7.1.1 Improving the Robustness
Faces can vary in size, location in the image, and orientation about the z-axis. Such variation can be removed by normalising the face, after which descriptions for recognition can be obtained. The eyes are commonly detected for normalisation, and some effective eye detectors have been produced. But eye detection cannot always be successfully applied to faces; glasses or other obstacles can hide the eyes. Some methods use a whole-face approach to face normalisation, which is more robust.
Facial expression is another kind of variation that cannot be removed by normalisation. Expressions are commonly decomposed into six main emotions: happiness, sadness, surprise, disgust, anger and fear. Several algorithms have been proposed for facial expression detection [11], [32], [53], [62], [64]. Some of these techniques extract the motion of the nose, mouth, eyebrows and eyes with tracking algorithms, optical flow, motion energy, network criteria, 3D geometric modelling with a range finder, or colour image analysis techniques. Most of these techniques are used to recognize facial expressions, but only little effort has gone into the recognition of faces with varying facial expressions. A combination of the fractal face recognition system and a PCA based feature reduction system (as shown in Figure 7.1) can be used to show how robust this method is to facial variations such as human facial expression.
In this application, the domain index numbers for each range block are used as a feature vector. To normalise the size of each vector, the quadtree partitioning geometrical parameters, which are part of the fractal codes for each image, are used. Because quadtree partitioning can be applied to an image of arbitrary size, the feature vector can be resized to the size of the query image. In a typical image of size 128 × 128, the quadtree decomposition produces about 400 or more range blocks. The feature vectors are of this length and, after normalisation, will be uniformly of size 64 × 64, because the smallest range size used is 4 × 4. This is a large vector and must be reduced to suit classifiers.
The optimal linear method (in the least mean squared error sense) for reducing redundancy in a data set is the Karhunen-Loeve (KL) transform, or eigenvector expansion via Principal Components Analysis (PCA). The basic idea behind the KL transform is to transform possibly correlated variables in a data set into uncorrelated variables. The transformed variables are ordered so that the first one describes most of the variation of the original data set. The second describes the remaining variation under the constraint that it is uncorrelated with the first variable. This continues until all the variation is described by the new transformed variables, which are called principal components. Mathematically, PCA can be described as follows. Suppose X is a vector; let P be the transformation matrix required such that Y has a diagonal covariance matrix:
$$Y = P \times X$$
It has been shown that the rows of P are the eigenvectors of the covariance matrix $E[(X - \bar{X})(X - \bar{X})^{T}]$. The eigenvectors are arranged in descending order of the corresponding eigenvalues. The elements of Y are called the principal components of X. The expectation is computed as an average over all feature vectors from the training set. In this approach, PCA is performed on the fractal feature vectors, not on pixel values directly.
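The transform described above can be sketched directly from the covariance eigendecomposition. This is a generic PCA sketch (numpy assumed), not the exact code used in the experiments; the function name and return values are ours.

```python
import numpy as np

def pca(X, k):
    """PCA via the covariance eigendecomposition: rows of X are feature
    vectors; returns the k principal components of each row, the
    transformation matrix P (rows are eigenvectors), and the mean."""
    mean = X.mean(axis=0)
    Xc = X - mean                              # center the data
    cov = (Xc.T @ Xc) / len(X)                 # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)           # eigh: ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]         # keep the k largest
    P = vecs[:, order].T
    return Xc @ P.T, P, mean

# Hypothetical feature matrix: 50 vectors of dimension 10, reduced to 3
rng = np.random.default_rng(0)
X = rng.random((50, 10))
Y, P, m = pca(X, 3)
```

After the transform the retained components are mutually uncorrelated and ordered by decreasing variance, which is exactly the property the text describes.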
The aim of PCA is to reduce the dimension of the working space. The maximum number of principal components is the number of variables in the original space. However, in order to reduce the dimension, some principal components should be omitted. In order to minimize the error, the eigenvalues are sorted in decreasing order and the last eigenvalues (and their eigenvectors) may be dropped. We use this method to reduce the dimension of the fractal features to the number of individuals in the training database, for example from 16384 to 100. Independent component analysis (ICA) could also be used for feature reduction, but in this thesis we have restricted our attention to PCA.
The eigenface approach uses normalised face images as vectors of pixel values,
which are transformed using PCA into feature vectors. The difference with our
approach is the use of fractal code vectors instead of pixel values as input to
the PCA. Results in figure 7.2 seem to indicate that our method should provide
better robustness to expression variations. We also do not need to normalise the
face images for small changes in size, position and rotation.
We use the reduced fractal features as vectors of equal size. For classification, we use the mean squared error between the feature vector of the query image and the feature vectors of all images in the database as a distance measure, with a minimum distance classifier.
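The minimum distance classifier is then a small function over the reduced vectors. A sketch with hypothetical gallery data; the names are ours.

```python
import numpy as np

def classify(query_vec, gallery, labels):
    """Minimum mean-squared-error classifier: return the label of the
    gallery vector closest (in MSE) to the query feature vector."""
    errs = np.mean((gallery - query_vec) ** 2, axis=1)
    return labels[int(np.argmin(errs))]

# Hypothetical reduced feature vectors for three enrolled individuals
gallery = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
labels = ["person_a", "person_b", "person_c"]
print(classify(np.array([0.9, 1.1]), gallery, labels))
```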
7.1.2 Face Location and Detection
When a face image is captured using a video camera, the face may be located anywhere in the video frame (or still image). Because most face recognition methods rely on some normalisation of size and position, it is important to locate the face and find either its contour or the location of some reference points such as the eyes or the mouth. A segmentation technique, which distinguishes face pixels or blocks from background ones, can be used for this task. However, this is a difficult task and no good segmentation algorithms are known, especially when the background is not uniform in grayscale or texture. The possibility of using the subfractal idea to segment an image and locate the face can be studied.
Figure 7.1: Block diagram of the fractal face recognition system with PCA based feature reduction.
Figure 7.2: Matrix showing differences between faces shown on the two axes. Darker points indicate larger difference. Entries below the diagonal are pixel-value differences. Entries above the diagonal are fractal-feature differences.
7.1.3 Face Recognition Using Subfractals of Eyes and
Mouth Area
Face recognition accuracy can be improved if global features are augmented by features depending only on specific parts such as the eyes or mouth. This can only be done if these parts can be segmented out from the rest of the face, which requires properties of these parts that are distinct from those of the rest of the face. It is our contention that there is self-similarity within these parts, and that range blocks from the eyes will be transformed versions of domain blocks from within the eye, provided the search for the best suited domain is constrained to weight domains inversely with their distance from the range. Under some such constraint the eye region might turn out to be a subfractal within the face. We intend to test and further develop these ideas.
Other future directions include using subfractals for video coding and neural
network based subfractals.
Appendix A
Quick Glance Eye-Gaze Tracking
System
An eye-tracker system is designed to determine the gaze point and the duration
of gaze of the user on the computer monitor. This appendix introduces Quick
Glance, an eye-tracking product from EyeTech Digital Systems that was used for
the tests described in Section 6.5.
The Quick Glance system consists of two infrared LED light sources, a camera,
a power supply and cabling, a PCI bus board and software. The camera and
light sources are mounted on the computer’s monitor. The video capture card
(PCI bus board) is installed in an available computer slot and connected to the
camera with a cable. The software helps users to set up the system, calibrate it
and use it for their purposes.
This system measures the user’s gaze point by examining the pupil center and
corneal reflections from the user’s eye, which is illuminated by two low-power
infrared LEDs mounted on the computer’s monitor. The reflected light is focused
onto a camera, also mounted on the monitor. The image of the eye upon which
the camera is focused is captured at a fast, user-determined rate by the image
capture hardware provided with the system. By analyzing the positions of the
light reflections and the center of the pupil contained in the image, the software
determines the gaze point. Gaze point duration is also derived. With that
information, a gaze tracking program can illustrate the user’s gaze path by
moving the cursor according to the gaze point and its duration.
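As a simplified illustration of pupil-glint gaze estimation: the vector from the corneal reflection to the pupil center can be mapped to screen coordinates by a calibration fit. The affine model and both function names below are assumptions; the actual Quick Glance calibration model is proprietary and richer than this.

```python
import numpy as np

def fit_gaze_mapping(pupil_glint_vectors, screen_points):
    """Least-squares affine fit from pupil-center-minus-glint vectors to
    screen coordinates, a stand-in for the calibration step."""
    v = np.asarray(pupil_glint_vectors, dtype=float)
    A = np.column_stack([v, np.ones(len(v))])          # rows [vx, vy, 1]
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(screen_points, dtype=float),
                                 rcond=None)
    return coeffs                                      # 3x2 matrix

def gaze_point(coeffs, vector):
    """Map one pupil-glint vector to an estimated on-screen gaze point."""
    vx, vy = vector
    return np.array([vx, vy, 1.0]) @ coeffs
```

During calibration the user fixates known screen targets, giving the (vector, screen point) pairs for the fit; afterwards every captured frame yields one estimated gaze point.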
Appendix B
Experimental Details
This dissertation contains several experiments. The details of the experiments,
as well as the results and comparisons between them, are described in this appendix.
B.1 Fractal Codes as Features
Method: direct use of fractal codes as features (Chapter 4).
Coding method: conventional fractal coding.
Domain blocks: overlapping square blocks of two different sizes (8x8 and 16x16).
Range blocks: non-overlapping square blocks, generated by quad-tree partitioning
(Figure 4.1).
Geometrical aspects of transformation: contractive size matching and one of eight
orientations (Figure 4.2).
Number of features: 4 vectors (domain index number, orientation, brightness
shift and contrast factor).
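A minimal sketch of the matching step that produces these four features is given below. The function names are illustrative, and the domain blocks are assumed to be already contracted (averaged down) to the range-block size; production fractal coders differ in many details.

```python
import numpy as np

def eight_orientations(block):
    """The eight isometries of a square block: four rotations of the
    block and four rotations of its transpose (the reflections)."""
    return [np.rot90(b, k) for b in (block, block.T) for k in range(4)]

def encode_range_block(range_block, domains):
    """For each contracted domain block and each of the eight
    orientations, fit range ~ s*domain + o by least squares and keep
    the best match. Returns the four features listed above: domain
    index, orientation, contrast factor s and brightness shift o."""
    best = None
    for d_idx, dom in enumerate(domains):
        for o_idx, oriented in enumerate(eight_orientations(dom)):
            s, o = np.polyfit(oriented.ravel(), range_block.ravel(), 1)
            err = np.mean((s * oriented + o - range_block) ** 2)
            if best is None or err < best[0]:
                best = (err, d_idx, o_idx, s, o)
    return best[1:]
```

Applying this to every range block of a face yields the per-block feature vectors that the experiments below classify.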
Normalization: each of the fractal features is normalized to a specific size
(64x64) using the quad-tree partitioning geometry (Figure 4.4).
Database: a subset of the MIT face database containing 2 face images of each of
90 subjects, with some variation in illumination, scale and head orientation
(Figure 4.5).
Classification: the peak signal-to-noise ratio (PSNR) between the feature vectors
of the query image and the feature vectors of all images in the database is used
as a measure of distance. A minimum-distance classifier is then employed to
determine the recognition accuracy.
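The classification rule can be sketched as follows. The helper names are hypothetical; in the experiment the vectors being compared are the normalized fractal feature vectors.

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two feature vectors; higher
    means more similar (infinite for identical vectors)."""
    mse = np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def classify(query, gallery):
    """Minimum-distance rule: the gallery entry with the highest PSNR
    (i.e. smallest distance) to the query wins."""
    return max(gallery, key=lambda label: psnr(query, gallery[label]))
```

Since PSNR grows as the mean squared error shrinks, maximizing PSNR is the same as picking the nearest neighbour in the mean-squared-error sense.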
Results: the classification accuracy for each feature is calculated separately. The
orientation parameter (72%) and the domain index (64%) show higher accuracy
than the other two features. The accuracy can be increased to 88.5% by using
the best of the four features (Figure 4.6).
B.2 Fractal Image-Set Coding
Method: fractal image-set coding (Chapter 5).
Coding method: the geometrical fractal features are calculated only once, from a
mean image or even a single chosen image.
Domain blocks: overlapping square blocks of two different sizes (8x8 and 16x16).
Range blocks: non-overlapping square blocks, generated by quad-tree partition-
ing.
Geometrical aspects of transformation: contractive size matching and one of eight
orientations.
Number of features: 1 vector (luminance parameters).
Normalization: every image in the data set is normalized using histogram
equalization and eye locations to produce 128x128 face images with the left and
right eyes at (32,32) and (96,32) respectively (Figure 5.3).
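The geometric part of this normalization can be expressed as a similarity transform fixed by the two eye correspondences. The sketch below only computes the transform matrix (the histogram equalization and the actual image resampling are not shown), and the function name is illustrative.

```python
import numpy as np

def eye_alignment_transform(left_eye, right_eye,
                            target_left=(32, 32), target_right=(96, 32)):
    """Similarity transform (rotation, scale and shift) sending the
    detected eye centers to the canonical 128x128 positions above.
    Returned as a 2x3 matrix M so that [x', y'] = M @ [x, y, 1]."""
    z1, z2 = complex(*left_eye), complex(*right_eye)
    w1, w2 = complex(*target_left), complex(*target_right)
    # treat points as complex numbers: z -> a*z + b is exactly a
    # rotation+scale (a) followed by a translation (b)
    a = (w2 - w1) / (z2 - z1)
    b = w1 - a * z1
    return np.array([[a.real, -a.imag, b.real],
                     [a.imag,  a.real, b.imag]])
```

Two point correspondences determine a similarity transform uniquely, so the eye locations alone fix the scale, in-plane rotation and translation of the normalized face.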
Databases and results: this method has been tested on two databases: a subset
of the MIT face database and a subset of the XM2VTS face database.
The subset of the MIT face database contains 90 persons with 2 shots per person.
One of the shots is used as test data while the other is used as training data.
The ROC plot in Figure B.1 shows the results of this experiment.
Figure B.1: The results of Fractal image-set coding for the subset of the MIT face database.
The recognition accuracy rate of this system is 83.33%, which is higher than the
result for any of the 4 individual fractal features tested in the first experiment.
The subset of the XM2VTS database contains 39 people with 4 images per person
(the first shot of each of 4 sessions). The image data set is divided into 3 sets: a
training set, an evaluation set and a test set. Three subjects (numbers 000, 002
and 007) are used as impostors in the evaluation set, 8 subjects (numbers 001,
008, 010, 011, 023, 028, 031 and 039) are used as impostors in the test set, and
the other subjects are used as clients. The first image of each client subject is
used as the test image while the other 3 images are used for training. Figure B.2
shows the results of this experiment for the evaluation data in ROC plot format.
Based on this plot, the threshold is set to obtain certain false acceptance rate
(FAR) and false rejection rate (FRR) values. The same threshold is then used
on the test set.
Figure B.2: The results of Fractal image-set coding for the evaluation subset of the XM2VTS database. Arrows show the position of the threshold for FRR=0, FRR=FAR and FAR=0.
To compare these results with the results of other researchers who also used the
XM2VTS database, the test set is evaluated at three different thresholds T:

TFAR=0 = argminT (FRR | FAR = 0)
TFRR=FAR = (T | FRR = FAR)
TFRR=0 = argminT (FAR | FRR = 0)
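These three operating points can be selected programmatically from the evaluation-set error curves. The sketch below assumes distance-style scores (lower means a better match); the function names and the threshold grid are illustrative.

```python
import numpy as np

def error_rates(client_scores, impostor_scores, thresholds):
    """FRR/FAR curves over a grid of thresholds: a client is rejected
    when its score exceeds the threshold, an impostor is accepted when
    it does not."""
    c = np.asarray(client_scores, float)[:, None]
    i = np.asarray(impostor_scores, float)[:, None]
    t = np.asarray(thresholds, float)[None, :]
    return (c > t).mean(axis=0), (i <= t).mean(axis=0)   # FRR, FAR

def pick_thresholds(frr, far, thresholds):
    """The three operating points defined above, chosen on the
    evaluation set and then reused unchanged on the test set."""
    t = np.asarray(thresholds, float)
    t_far0 = t[far == 0][np.argmin(frr[far == 0])]  # argmin FRR s.t. FAR=0
    t_eq = t[np.argmin(np.abs(far - frr))]          # FRR close to FAR
    t_frr0 = t[frr == 0][np.argmin(far[frr == 0])]  # argmin FAR s.t. FRR=0
    return t_far0, t_eq, t_frr0
```

Fixing the thresholds on the evaluation set and reusing them on the test set keeps the reported test-set error rates unbiased.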
Table B.1: Error rates obtained using Fractal image-set coding

Error   FRR=0     FAR=FRR   FAR=0
FAR     53.33%    16.4%     0.3%
FRR     0.0%      9.3%      51.85%
Table B.2: Error rates reported by T. Tan using fractal neighbor distances

Error   FRR=0     FAR=FRR   FAR=0
FAR     94.0%     13.6%     0.0%
FRR     0.0%      12.3%     81.3%
Figure B.3 shows these results in the form of an ROC plot. The error rates are
summarized in Table B.1. Using this information we can compare our results
with other results in the literature. For example, the results of face recognition
using fractal neighbor distances [91] are shown in Table B.2, which indicates that
our results have lower errors in most of the cases and slightly higher errors in
the other cases.
Figure B.3: The results of Fractal image-set coding for the test subset of the XM2VTS database. Arrows show the position of the threshold for FRR=0, FRR=FAR and FAR=0 in the evaluation data set.
Bibliography
[1] S. Akamatsu, T. Sasaki, H. Fukumachi, and Y. Suenaga, “A robust face iden-
tification scheme -KL expansion of an invariant feature space,” Proceedings
of SPIE, vol. 1607: Intelligent Robots and Computer Vision X: Algorithms
and Techniques, pp. 71–84, 1991.
[2] A. Alattar and S. Rajala, “Facial features localization in frontal view head
and shoulders images,” IEEE International Conference on Acoustics, Speech
and Signal Processing, vol. 6, pp. 3557–3560, 1999.
[3] M. Barnsley, Fractals Everywhere. Academic Press, San Diego, 1988.
[4] M. Barnsley and L. Hurd, Fractal Image Compression. AK Peters, Wellesley,
1993.
[5] M. S. Bartlett and T. J. Sejnowski, “Viewpoint invariant face recognition
using independent component analysis and attractor networks,” in Advances
in Neural Information Processing Systems (M. Mozer, M. Jordan, and T.
Petsche, eds.), pp. 817–823, Cambridge, MA: MIT Press, 1997.
[6] M. S. Bartlett and T. J. Sejnowski, “Independent components of face images:
A representation for face recognition,” in Proceedings of the 4th Annual Joint
Symposium on Neural Computation, (Pasadena, CA), May 1997.
[7] M. S. Bartlett, H. M. Lades, and T. J. Sejnowski, “Independent component
representations for face recognition,” in Proceedings of the SPIE Conference
on Human Vision and Electronic Imaging III, vol. 3299, pp. 528–539, 1998.
[8] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. fish-
erfaces: Recognition using class specific linear projection,” Proceedings of
European Conference on Computer Vision, ECCV’96, pp. 45–58, 1996.
[9] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman, “Eigenfaces vs. fisher-
faces: Recognition using class specific linear projection,” IEEE Transactions
on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720,
1997.
[10] T. D. Bie, N. Cristianini, and R. Rosipal, “Eigenproblems in pattern recogni-
tion,” Handbook of Computational Geometry for Pattern Recognition, Com-
puter Vision, Neurocomputing and Robotics, E. Bayro-Corrochano (editor),
Springer-Verlag, April 2004.
[11] M. J. Black and Y. Yacoob, “Tracking and recognizing rigid and non-rigid
facial motion using local parametric models of image motion,” Proceedings
of IEEE International Conference on Computer Vision, ICCV95, Boston,
pp. 374–381, 1995.
[12] D. Blackburn, M. Bone, and P. J. Phillips, “Facial recognition vendor test
2000,” Evaluation report. National Institute of Standards and Technology,
2000.
[13] R. D. Boss and E. W. Jacobs, “Archetype classification in an iterated trans-
formation image compression algorithm,” in Fractal Image Compression -
Theory and Application, (Y. Fisher, ed.), pp. 79–90, Springer-Verlag, New
York, 1994.
[14] Boyer and Merzbach, A History of Mathematics. New York: John Wiley,
2nd ed., 1989.
[15] R. Brunelli and D. Falavigna, “Person identification using multiple cues,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17,
pp. 955–966, 1995.
[16] R. Brunelli and T. Poggio, “Face recognition through geometrical features,”
Proceedings of European Conference on Computer Vision, ECCV92, Santa
Margherita Ligure, pp. 792–800, 1992.
[17] R. Brunelli and T. Poggio, “Face recognition: features versus templates,”
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15,
1993.
[18] P. Burt, “Smart sensing within a pyramid vision machine,” Proceedings of
the IEEE, vol. 76, pp. 1006–1015, 1988.
[19] L. Chen, H. Liao, J. Lin, and C. Han, “Why recognition in a statistics-
based face recognition system should be based on the pure face portion:
a probabilistic decision-based proof,” Pattern Recognition, vol. 34, no. 5,
pp. 1393–1403, 2001.
[20] G. Chow and X. Li, “Towards a system for automatic facial feature detec-
tion,” Pattern Recognition, vol. 26, no. 12, pp. 1739–1755, 1993.
[21] G. M. Davis, “A wavelet-based analysis of fractal image compression,” IEEE
Transactions on Image Processing, pp. 100–112, 1997.
[22] O. Deniz, M. Castrillon, and M. Hernandez, “Face recognition using inde-
pendent component analysis and support vector machines,” 3rd Interna-
tional Conference on Audio- and Video-based Biometric Person Authentica-
tion 2001, Halmstad, Sweden, June 6-8,, vol. 2091, pp. 59–64, 2001.
[23] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Face recognition
using fractal codes,” Proceedings of WoSPA 2000, Brisbane, Australia, 2000.
[24] H. Ebrahimpour-komleh, V. Chandran., and S. Sridharan, “Face recogni-
tion using fractal codes,” Proceedings of International Conference on Image
Processing, vol. 3, pp. 58–61, 2001.
[25] H. Ebrahimpour-komleh, V. Chandran., and S. Sridharan, “Robustness to
expression variations in fractal-based face recognition,” Sixth International
Symposium on Signal Processing and its Applications, vol. 1, pp. 359–362,
2001.
[26] H. Ebrahimpour-komleh, V. Chandran., and S. Sridharan, “Mathematical
basis for use of fractal codes as features,” Image and Vision Computing ’02
New Zealand, 2002.
[27] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, An Application of
Fractal Image-set Coding in Facial Recognition, vol. 3072 of Lecture Notes in
computer science, Biometric Authentication, pp. 178–186. Springer Verlag,
July 2004.
[28] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Facial image re-
trieval using fractal image-set coding,” Feb. 2004.
[29] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Fractal image-
set encoding for face recognition,” in Proceedings of International Conference
on Computational Intelligence for Modelling Control and Automation, (Gold
Coast, Australia), pp. 664–672, July 2004.
[30] H. Ebrahimpour-komleh, V. Chandran, and S. Sridharan, “Subfractals: A
new concept for fractal image coding and recognition,” Submitted to the
Journal of Complexity International, 2004.
[31] R. Epstein, P. Hallinan, and A. Yuille, “5±2 eigenimages suffice: An empir-
ical investigation of low-dimensional lighting models,” Proceedings of the
Workshop on Physics-based Modeling in Computer Vision, pp. 108–116,
1995.
[32] I. A. Essa and A. P. Pentland, “Facial expression recognition using a dynamic
model and motion energy,” Proceedings of IEEE International Conference
on Computer Vision, ICCV95, Boston, pp. 360–367, 1995.
[33] K. Etemad and R. Chellappa, “Face recognition using discriminant eigen-
vectors,” Proceedings of International Conference on Acoustics, Speech and
Signal Processing, pp. 2148–2151, 1996.
[34] R. A. Fisher, “The use of multiple measurements in taxonomic problems,”
Annals of Eugenics, vol. 7, pp. 179–188, 1936.
[35] Y. Fisher, ed., Fractal Image Compression: Theory and Application.
Springer-Verlag , New York, NY, USA, 1995.
[36] Y. Fisher, ed., Fractal Image Encoding and Analysis. NATO ASI Series,
Springer-Verlag, Berlin Heidelberg, 1998.
[37] K. Fukunaga, Introduction to Statistical Pattern Recognition. Academic
Press, 2nd ed., 1990.
[38] M. Gharavi-Alkhansari and T. S. Huang, “A generalized method for im-
age coding using fractal-based techniques,” Journal of Visual Communication
and Image Representation, vol. 8, no. 2, pp. 208–225, 1997.
[39] A. J. Goldstein, L. Harmon, and A. Lesk, “Identification of human faces,”
Proceedings of the IEEE, pp. 748–760, 1971.
[40] R. Gross, J. Shi, and J. Cohn, “The current state of the art in face recog-
nition,” Technical Report, Robotics Institute, Carnegie Mellon University,
Pittsburgh,USA, 2004.
[41] P. Hancock, V. Bruce, and M. Burton, “A comparison of two computer-
based face identification systems with human perceptions of faces,” Vision
Research, vol. 38, 1998.
[42] L. D. Harmon, M. K. Khan, R. Lasch, and P. F. Ramig, “Machine identifi-
cation of human faces,” Pattern Recognition, pp. 97–110, 1981.
[43] J. Hutchinson, “Fractals and self similarity,” Indiana University Mathemat-
ics Journal, vol. 30, no. 5, pp. 713–747, 1981.
[44] A. Hyvarinen and E. Oja, “Independent component analysis: Algorithms
and applications,” Neural Networks, vol. 13, no. 4-5, pp. 411–430, 2000.
[45] A. E. Jacquin, A Fractal Theory of Iterated Markov Operators with Applica-
tions to Digital Image Coding. PhD thesis, Georgia Institute of Technology,
1989.
[46] A. E. Jacquin, “Fractal image coding: A review,” Proceedings of the IEEE,
vol. 81, no. 10, pp. 1451–1465, 1993.
[47] T. Kanade, Picture Processing by Computer Complex and Recognition of
Human Faces. PhD thesis, Kyoto University, 1973.
[48] T. Kanade, J. Cohn, and Y. Tian, “Comprehensive database for facial ex-
pression analysis,” Proceedings of the 4th IEEE International Conference on
Automatic Face and Gesture Recognition (FG’00), pp. 46 – 53, March 2000.
[49] M. D. Kelly, “Visual identification of people by computer,” Technical
report AI-130, Stanford AI Project, Stanford, CA., 1970.
[50] M. Kirby and L. Sirovitch, “Application of the Karhunen-Loeve procedure
for the characterization of human faces,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 12, pp. 103–108, 1990.
[51] A. Kouzani, F. He, and K. Sammut, “Face image matching using fractal
dimension,” IEEE International Conference on Image Processing, pp. 642–
646, 1999.
[52] A. Z. Kouzani, F. He, and K. Sammut, “Fractal face representation and
recognition,” IEEE International Conference on Systems, Man and Cyber-
netics, vol. 2, pp. 1609–1613, 1997.
[53] A. Lanitis, C. J. Taylor, and T. Cootes, “A unified approach to coding and
interpreting face images,” Proceedings of IEEE International Conference on
Computer Vision, ICCV95, Boston, pp. 368–373, 1995.
[54] J. Lu, K. Plataniotis, and A. Venetsanopoulos, “Face recognition using LDA-
based algorithms,” IEEE Trans. on Neural Networks, vol. 14, no. 1, pp. 195–
200, 2003.
[55] S. G. Mallat and Z. Zhang, “Matching pursuits with time-frequency dic-
tionaries,” IEEE Transactions on Signal Processing, vol. 41, pp. 3397–3415,
1993.
[56] B. Mandelbrot, Les Objets Fractals: Forme, Hasard et Dimension. Paris:
Flammarion, 1975.
[57] B. Mandelbrot, Fractals: Form, Chance and Dimension. Freeman, W. H. and
Company, 1977.
[58] B. Mandelbrot, The Fractal Geometry of Nature. Freeman, W. H. and Com-
pany, 1982.
[59] B. Manjunath, R. Chellappa, and C. von der Malsburg, “A feature based ap-
proach to face recognition,” Proceedings of IEEE Computer Society. Confer-
ence on Computer Vision and Pattern Recognition, pp. 373–378, 1992.
[60] J. Manjunath, N. Orlans, and A. Piszcz, “Effects of eye position on eigenface-
based face recognition scoring,” Technical Report, The MITRE corporation,
7515 Colshire Drive, McLean, VA 22102, USA, 2003.
[61] A. M. Martinez and R. Benavente, “The AR face database,” CVC Tech.
Report 24, 1998.
[62] K. Mase, “Recognition of facial expression from optical flow,” IEICE Trans-
actions, vol. E74, no. 10, pp. 3474–3483, 1991.
[63] J. Matas, M. Hamouz, K. Jonsson, J. Kittler, Y. Li, C. Kotroupolous,
A. Tefas, I. Pitas, T. Tan, H. Yan, F. Smeraldi, J. Bigun, N. Capdevielle,
W. Gerstner, S. Ben-Yacoub, and Y. Abduljaoued, “Comparison of face ver-
ification results on the XM2VTS database,” in Proceedings of the 15th ICPR
(A. Sanfeliu, J. J. Villanueva, M. Vanrell, R. Alqueraz, J. Crowley, and
Y. Shirai, eds.), vol. 4, (Los Alamitos, USA), pp. 858–863, IEEE Computer
Soc Press, 2000.
[64] K. Matsuno, C. Lee, S. Kimura, and S. Tsuji, “Automatic recognition of
human facial expressions,” Proceedings of IEEE International Conference
on Computer Vision, ICCV95, Boston, pp. 352–359, 1995.
[65] K. Messer, J. Matas, J. Kittler, J. Luettin, and G. Maitre, “XM2VTSDB:
The extended M2VTS database,” March 1999.
[66] M. Michaelis, R. Herpers, L. Witta, and G. Sommer, “Hierarchical filtering
scheme for the detection of facial keypoints,” International Conference on
Acoustics, Speech, and Signal Processing, vol. 4, pp. 2541–2544, 1997.
[67] B. Moghaddam and A. Pentland, “Probabilistic visual learning for ob-
ject representation,” The 5th International conference on Computer Vision,
Cambridge MA, pp. 786–793, 1995.
[68] B. Moghaddam and A. Pentland, “Probabilistic visual learning for object
representation,” IEEE Transactions on Pattern Analysis and Machine In-
telligence, vol. 19, pp. 676–710, 1997.
[69] D. M. Monro and F. Dudbridge, “Fractal block coding of images,” Electron-
ics Letters, vol. 28, no. 11, pp. 1053–1055, 1992.
[70] D. M. Monro and F. Dudbridge, “Rendering algorithms for deterministic
fractals,” IEEE Computer Graphics and Applications, vol. 15, no. 1, pp. 32–
41, 1995.
[71] A. Nefian, A hidden Markov model-based approach for face detection and
recognition. PhD thesis, Georgia Institute of Technology, Atlanta, GA, 1999.
[72] G. Neil and K. M. Curtis, “Scale and rotationally invariant object recog-
nition using fractal transformations,” Proceedings of IEEE International
Conference on Acoustics, Speech and Signal Processing, ICASSP96, vol. 6,
pp. 3458–3461, 1996.
[73] G. Neil and K. M. Curtis, “Shape recognition using fractal geometry,” Pat-
tern recognition, vol. 30, no. 12, pp. 1957–1969, 1997.
[74] A. Pentland and T. Choudhury, “Face recognition for smart environments,”
IEEE Computer, vol. 33, no. 2, pp. 50–55, 2000.
[75] A. Pentland, R. Picard, and S. Scarloff, “Photobook: content-based ma-
nipulation of image databases,” International Journal of Computer Vision,
vol. 18, pp. 233–254, 1996.
[76] P. Phillips, “Matching pursuit filters design,” 12th International Conference
on pattern recognition, pp. 57–61, 1994.
[77] P. Phillips, “Matching pursuit filters design for face identification,” in SPIE,
vol. 2277, pp. 2–9, 1994.
[78] P. Phillips, “Matching pursuit filters applied to face identification,” IEEE
Transactions on Image Processing, vol. 7, no. 8, pp. 1150–1164, 1998.
[79] P. J. Phillips, P. Grother, R. Micheals, D. M. Blackburn, E. Tabassi, and
J. M. Bone, “Face recognition vendor test 2002: Overview and summary,”
National Institute of Standards and Technology, 2003.
[80] P. J. Phillips, A. Martin, C. L. Wilson, and M. Przybocki, “An introduction
to evaluating biometric systems,” IEEE Computer, vol. 33, no. 2, pp. 56–63,
2000.
[81] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, “The FERET evalua-
tion methodology for face-recognition algorithms,” IEEE Transactions on
Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104,
2000.
[82] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, “The FERET database
and evaluation procedure for face-recognition algorithms,” Image and Vision
Computing, vol. 16, pp. 295–306, 1998.
[83] S. Rizvi, P. Phillips, and H. Moon, “The FERET verification testing protocol
for face recognition algorithms,” Technical report, NISTIR 6281, National
Institute of Standards and Technology, 1998.
[84] J. Ruiz-del-Solar and P. Navarrete, “Eigenspace-based face recognition: a
comparative study of different approaches,” IEEE Transactions on Systems,
Man and Cybernetics, Part C, vol. 35, pp. 315–325, 2005.
[85] F. Samaria and S. Young, “HMM based architecture for face identification,”
Image and Vision Computing, vol. 12, no. 8, pp. 537–543, 1994.
[86] A. W. Senior, “Face and feature finding for a face recognition system,” Sec-
ond International Conference on Audio- and Video-based Biometric Person
Authentication, pp. 154–159, 1999.
[87] G. Shakhnarovich and B. Moghaddam, “Face recognition in subspaces,”
Handbook of Face Recognition, Eds. Stan Z. Li and Anil K. Jain, Springer-
Verlag, pp. 154–159, 2004.
[88] L. Sirovitch and M. Kirby, “Low-dimensional procedure for the character-
ization of human faces,” Journal of the Optical Society of America, vol. 4,
pp. 519–524, 1987.
[89] L. Stringa, “Eyes detection for face recognition,” Applied Artificial Intelli-
gence, vol. 7, pp. 365–382, 1993.
[90] H. Takayasu, Fractals in the Physical Sciences. Manchester University Press,
1990.
[91] T. Tan, “Human face recognition based on fractal image coding,” PhD
thesis, The University of Sydney, 2003.
[92] T. Tan and H. Yan, “Analysis of the contractivity factor in fractal based face
recognition,” IEEE International Conference on Image Processing, vol. 3,
pp. 637–641, 1999.
[93] T. Tan and H. Yan, “Face recognition by fractal transformations,” Pro-
ceedings of IEEE International Conference on Acoustics, Speech and Signal
Processing, ICASSP99, pp. 3537–3540, 1999.
[94] T. Tan and H. Yan, “Object recognition using fractal neighbor distance:
Eventual convergence and recognition rates,” Proceedings of 15th Interna-
tional Conference Pattern Recognition, pp. 781–784, 2000.
[95] L. Torres, “Is there any hope for face recognition?,” Proc. of the 5th Inter-
national Workshop on Image Analysis for Multimedia Interactive Services,
WIAMIS 2004, pp. 21–23, 2004.
[96] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive
Neuroscience, vol. 3, pp. 71–86, 1991.
[97] M. Turk and A. Pentland, “Face recognition using eigenfaces,” Proceedings
of IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–
591, 1991.
[98] L. Vences and I. Rudomin, “Genetic algorithms for fractal image and im-
age sequence compression,” in Proceedings Computacion Visual, pp. 35–44,
Universidad Nacional Autonoma de Mexico, 1997.
[99] S. Welstead, Fractal and Wavelet Image Compression Techniques. SPIE
Press, 1999.
[100] L. Wiskott, J. Fellous, N. Krüger, and C. von der Malsburg, “Face recog-
nition by elastic bunch graph matching,” IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 19, no. 7, 1997.
[101] L. Wiskott and C. v. d. Malsburg, “Recognizing faces by dynamic link
matching,” Proceedings of International Conference on Artificial Neural Net-
works, ICANN’95, pp. 347–352, 1995.
[102] L. Wiskott and C. von der Malsburg, “Labeled bunch graphs for image
analysis,” Apr. 2001. United States Patent 6,222,939.
[103] L. Wiskott and C. von der Malsburg, “Labeled bunch graphs for image
analysis,” Mar. 2002. United States Patent 6,356,659.
[104] L. Wiskott and C. von der Malsburg, “Labeled bunch graphs for image
analysis,” May 2003. United States Patent 6,563,950.
[105] B. Wohlberg and G. de Jager, “A review of the fractal image coding litera-
ture,” IEEE Transactions on Image Processing, vol. 8, no. 12, pp. 1716–1729,
1999.
[106] A. L. Yuille, P. Hallinan, and D. Cohen, “Feature extraction from faces using
deformable templates,” International Journal of Computer Vision, vol. 8,
no. 2, pp. 99–111, 1992.
[107] W. Zhao and R. Chellappa, “Face recognition: A literature survey,” ACM
Journal of Computing Surveys, pp. 399–458, 2003.