
Technical report, IDE0942, June 2009

Automatic Imaging for Face Biometrics and Eye Localization

Master’s Thesis in Computer Science and Engineering

Tao Wang, Weifeng Lin

School of Information Science, Computer and Electrical Engineering

Halmstad University
Box 823, S-301 18 Halmstad, Sweden

June 2009

Preface

This master's thesis is part of the research project "Automatic Imaging for Face Biometrics and Eye Localization", which was defined by Bigsafe Technology AB and carried out at the School of Information Science, Computer and Electrical Engineering at Halmstad University. As the authors, we would like to thank our supervisor, Professor Josef Bigun, for always being ready to offer suggestions and ideas at every step, and detailed answers to every question. His valuable support has given us great insights, and the flexibility to work in the best possible way to achieve our goals in this project.

Halmstad, Sweden, June 2009

Abstract

A proposal for a person authentication system, which localizes facial landmarks and extracts biometrical features for face authentication, is presented in this thesis. An efficient algorithm for eye localization, biometrical feature extraction and person identification is developed using Gabor filters. In the eye localization part, we build artificial average eye models for eye localization. In the person identification part, we construct databases of biometrical features around the eye area of clients and, for authentication, the Schwartz inequality and the sum of square error (SSE) are used. This project is implemented in the Matlab programming language on a personal computer system, and experimental results on the proposed system are presented.

Contents

1 Introduction
  1.1 Project background
  1.2 Aim of the study
    1.2.1 Eye localization
    1.2.2 Client identification
  1.3 Environment
  1.4 Outline of thesis
  1.5 The retinotopic sampling grid
  1.6 Gabor decomposition
2 Eye localization
  2.1 System introduction
  2.2 Preprocessing of face images
    2.2.1 Retina radius
    2.2.2 Start and end frequency of Gabor filter
    2.2.3 Picture size
  2.3 Training sample set
    2.3.1 Gabor filters model
      2.3.1.1 Single Gabor filter per retina grid point
      2.3.1.2 Specific filter for each retinal point
    2.3.2 Computing the average eye
    2.3.3 Weights
  2.4 Locate eye center
    2.4.1 The strategy of locating eye center
    2.4.2 Compare testing feature with training feature
      2.4.2.1 The sum of square error (SSE)
      2.4.2.2 Schwartz inequality
    2.4.3 Detection performance
3 Client identification
  3.1 Identification model
  3.2 The matching concept
    3.2.1 Schwartz inequality
    3.2.2 SSE
  3.3 Training by weights
4 Experiments and results
  4.1 Landmark localization tests
    4.1.1 Parameters instruction
    4.1.2 Experimental results
  4.2 Client identification tests
    4.2.1 Determining model parameters
    4.2.2 Identification tests
5 Discussion and conclusion


Chapter 1

Introduction

1.1 Project background

Each person has a variety of unique physiological and behavioral characteristics. Uniqueness is how well those characteristics separate individuals from each other. In today's networked society, in order to prove one's identity, instead of using old-fashioned methods like ID-cards, passwords and PINs, biometric methods have been developed, which allow those unique characteristics of individuals to best represent them. The following definition of biometrics can be found in [1].

Biometrics refers to methods for uniquely recognizing humans based

upon one or more intrinsic physical or behavioral traits.

In information technology, in particular, biometrics is used as a form

of identity access management and access control. It is also used to

identify individuals in groups that are under surveillance.

One example scenario for this project application would be a door entrance system. When an unidentified person approaches, the camera of the entrance system could track the facial characteristics of the person, and our system would open the door or deny access, based on the facial information extracted from this person and the data in the system database.

1.2 Aim of the study

Based on a reasonably good face tracking algorithm, this thesis focuses on two related aspects of implementing this person authentication system.

1.2.1 Eye localization

The most interesting facial landmarks are the eyes, nose and mouth. Eye localization is a well-researched topic in biometrics. The aim of this part is to locate eye centers in face images which are generated by a given face tracking mechanism. In this report, we construct two artificial average eye models with the aid of Gabor filters, and use these models to detect eyes and locate the centers of both the left and the right eye.

1.2.2 Client identification

The aim of this part is to identify a client. After locating the eye centers in the face image of a client, we extract biometrical features around the eye areas and store the feature information which best represents this specific client in our database, along with the feature information of other clients. When information from an unidentified person comes into the system, we compare that information with the data stored in our client database and either give the identity of this person or deny his or her access.


1.3 Environment

The hardware and software environments used for this research are listed below.

Standard desktop systems based on Intel Pentium Dual-Core.

An HP Pavilion dv2000 built-in web camera.

The XM2VTS database, a large multi-modal database captured onto high-quality digital video, is used in this project. It contains 4 recordings of 295 individuals and, in this project, we choose several groups of subjects as our data sets.

The algorithms are programmed in Matlab R14.

The operating systems are Windows XP SP3 and Windows Vista Home Basic.

1.4 Outline of Thesis

This thesis is organized as follows: Chapter 2 describes the theoretical background and the algorithms used for eye localization. Chapter 3 explains the algorithms and ideas of client identification. Chapter 4 presents experiments testing the performance of the proposed methods. The results are then discussed, and we conclude, in Chapter 5. In the next two sections of this chapter, we introduce the basic theoretical background of the retinotopic sampling grid and the Gabor decomposition, which are used throughout this project.

1.5 The retinotopic sampling grid

Figure 1-1: An example of retinotopic sampling grid

When it comes to extracting the features of images, it is not necessary to take every pixel into consideration. A simple mathematical abstraction, based on a sparse retinotopic sampling grid obtained by log-polar mapping, is introduced in [2]. The term 'retinotopic' is used because this method mimics the human visual system, which implements a "focus of attention" mechanism. Figure 1-1 shows a grid consisting of 50 points arranged in 5 concentric circles, where the radius of the innermost circle is 3 pixels and that of the outermost circle is 30 pixels. With the rising radius of each concentric circle, the density of the sampling points decreases exponentially. This means we automatically concentrate the computational effort on the central area of the sampling grid. In our project we focus on analyzing the biometric features of the eye area; other features around a subject's eye area, such as ears, hair and moles on the forehead, could otherwise affect the result of eye detection. This strategy of retinotopic sampling reduces the processing demand spent on unnecessary parts, helping to achieve real-time performance. Further discussion of this technique is presented in [2, 5].

We construct a retinotopic sampling grid placed on a subject's eye. The sampling grid consists of 69 points on 4 concentric circles, with radii ranging from 4 pixels at the innermost circle to 32 pixels at the outermost circle.

The inner and outermost radii are empirically determined mainly by two factors: the proportion of the eye area in an image and its size, and the biometric features we want to cover. On the retinotopic sampling grid, we have 1 point at the eye center, 4 points on the first ring, 8 points on the second ring, 24 points on the third ring and 32 on the fourth ring, as displayed in figure 1-2. In the figure, there is a properly centered face, and a retinotopic sampling grid is placed on this person's right eye. We proceeded as follows. In the training session we placed the grid on the right eye of every person. The positions of those points are stored in a 1-D data structure. Then, the biometric features around those points are extracted. The same strategy is also used on the left eye.

Figure 1-2: A retinotopic sampling grid placed on an eye
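A minimal MATLAB sketch of generating such a grid follows; the point counts per ring come from the description above, while the function name, the uniform angular spacing within each ring and the geometric radius spacing between rMin and rMax are our own assumptions.

% Sketch: generate a retinotopic sampling grid of 1 + 4 + 8 + 24 + 32 = 69
% points around an eye center (cx, cy).
function pts = retina_grid(cx, cy, rMin, rMax)
  counts = [4 8 24 32];                      % points per ring, inner to outer
  radii  = rMin * (rMax/rMin).^((0:3)/3);    % 4 ring radii, log-spaced
  pts = [cx, cy];                            % 1 point at the eye center
  for k = 1:4
    ang = 2*pi*(0:counts(k)-1)'/counts(k);   % uniform angles on ring k
    pts = [pts; cx + radii(k)*cos(ang), cy + radii(k)*sin(ang)];
  end
end

For the grid of this section one would call, for example, pts = retina_grid(cx, cy, 4, 32).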

1.6 Gabor decomposition

In terms of representation, an image can be expressed as a matrix of brightness values in a Cartesian coordinate system, and it can also be represented as a superposition of sinusoids with different frequencies, phases and amplitudes, determined by the Fourier transform of the image [4], as shown in Figure 1-3.

Gabor filters can serve as excellent band-pass filters. Such a filter is defined as the product of a Gaussian kernel and a complex sinusoid, i.e.

g(t) = k e^{jθ} w(at) s(t)    (1)

where

w(t) = e^{-π t²}    (2)

s(t) = e^{j 2π f₀ t}    (3)

e^{jθ} s(t) = e^{j(2π f₀ t + θ)} = cos(2π f₀ t + θ) + j sin(2π f₀ t + θ)    (4)

Here k, θ and f₀ are filter parameters. A Gabor filter can be thought of as two out-of-phase filters, conveniently allocated in the real and imaginary parts of a complex function, with the real part

g_r(t) = w(t) cos(2π f₀ t + θ)    (5)

and the imaginary part (see figure 1-5)

g_i(t) = w(t) sin(2π f₀ t + θ)    (6)

Figure 1-3: An example image (left) and the logarithmically scaled absolute amplitudes of its spectral decomposition (right)


Gabor filters are very powerful tools for processing images. Different Gabor filters respond to different local orientations and wave numbers around a certain point, which is a unique attribute and can be seen as an analogy to the human visual system; a further discussion can be found in [2].

In our case of feature extraction, we use the log-polar separable Gabor decomposition to extract the local features around a certain point in an image [4]. Since the orientations and wave numbers vary in an image, several Gabor filters are needed. This set is also called a Gabor filter bank. Our Gabor filters in the filter bank are designed in the log-polar domain, which is a logarithmically scaled polar space:

f(ξ, η) = A exp(-(ξ - ξ₀)² / (2δ_ξ²)) exp(-(η - η₀)² / (2δ_η²))    (7)

The variables of the filter f(ξ, η) are defined in the log-polar frequency domain [2], shown in equation (7), where A is a normalization constant. The filter f(ξ, η) is tuned to the orientation η₀ and the absolute spatial frequency ξ₀, which represents the absolute angular frequency ω₀ = exp(ξ₀). The log-polar frequency coordinates are defined in equation (8):

(ξ, η) = (log(|w|), tan⁻¹(w_y, w_x))    (8)

where w = (w_x, w_y) denotes the Cartesian frequency coordinates.

Visually, the Gabor filters are two-dimensional Gaussian bell-shaped surfaces. In the log-polar domain the Gabor filters are symmetric 2D Gaussian bells, but in Cartesian frequency coordinates they are egg-shaped bells (see figure 1-4).
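As an illustration, the following MATLAB sketch evaluates equation (7) on a Cartesian frequency grid through the mapping of equation (8); the grid size, the bandwidth parameters dXi and dEta, and the function name are our own assumptions, and the normalization constant A is omitted.

% Sketch: one log-polar Gabor filter sampled on an N-by-N grid of
% Cartesian frequencies (wx, wy), following equations (7) and (8).
function F = logpolar_gabor(N, xi0, eta0, dXi, dEta)
  [wx, wy] = meshgrid(linspace(-pi, pi, N));   % Cartesian frequency plane
  xi  = log(sqrt(wx.^2 + wy.^2) + eps);        % eq. (8): log radial frequency
  eta = atan2(wy, wx);                         % eq. (8): orientation angle
  dE  = angle(exp(1i*(eta - eta0)));           % wrap orientation difference
  F   = exp(-(xi - xi0).^2/(2*dXi^2)) .* exp(-dE.^2/(2*dEta^2));
end

Since the tuning frequency ξ₀ lives on a logarithmic axis, a filter centered on the angular frequency ω₀ is obtained with xi0 = log(ω₀).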


The "daisy" structure of figure 1-4 appears in many published studies. The figure shows a top sectional drawing of Gabor filters in the frequency domain, with orientations from 0 rad to π rad. Five frequency channels and 6 orientation channels, a total of 30 filters, are displayed. Each egg-shaped contour represents one filter, the response of which on the input image is called a channel. A cross marks the apex of each Gaussian filter. Figure 1-6, which is based on the cutting plane of figure 1-4, shows a front sectional drawing of all frequencies.

Figure 1-4: Top sectional drawing of Gabor filters in the frequency domain

A 3D view of a Gabor filter is displayed in figure 1-5, for the highest frequency and lowest orientation channel. In the first row, the magnitude of the frequency spectrum of a Gabor filter (upper left) is displayed; we then transform the filter back to the image domain, where the modulus of the filter is shown (upper right). The real part of this filter is a cosine function whose amplitude is modulated by a Gaussian bell-shaped curve. The imaginary part of the filter is similarly a Gaussian-modulated sine function. As the frequency increases, the modulus of the filter becomes smaller in the spatial domain.

Figure 1-5: 3D view of a Gabor filter, showing the magnitude of the frequency spectrum of the filter (upper left), the modulus of the filter in the spatial domain (upper right), the real part of the filter (left) and the imaginary part of the filter (right)


Figure 1-6: Front sectional drawing of Gabor filters in the frequency domain

After implementing the above filter bank, we can calculate the Gabor filter response at any of the grid points. The Gabor feature vector is arranged according to wave number and orientation. An element of the feature vector (a Gabor filter response magnitude) is calculated by the following equation:

k(ξ₀, η₀) = | Σ_{m=0}^{M-1} Σ_{n=0}^{N-1} IM(m, n) f(m, n, ξ₀, η₀) |    (9)

For a local image IM around a certain point p, the magnitude k is computed for the responses of all Gabor filters f. The local image IM is cut from the original image such that the indices m, n visit the image points inside a rectangle of size M×N centered at p. A single Gabor filter f(m, n, ξ₀, η₀) is a 2-D complex-valued filter corresponding to a certain frequency ξ₀ and orientation η₀. An element of the feature vector is formed by the absolute value of the scalar product of the local image (a cut-out of the input image) and the complex Gabor filter f. The index ξ₀ in the equation determines the absolute frequency to which each filter f is tuned; the higher the frequency, the smaller the filter size. Likewise, η₀ determines the tune-in orientation of the filter. The dimensionality of the feature vector around a grid point p is the product of the number of frequencies and the number of orientations. Note that, in equation (9), the scalar product between IM and f is calculated in the spatial domain, and ξ₀ and η₀ do not denote an actual frequency or orientation value, but the index of the applied channel (the response of a particular filter).
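A minimal MATLAB sketch of equation (9) follows (ours, not from the thesis); it assumes the local patch has already been cut out around p and that the filter is given in the spatial domain at the same size.

% Sketch of eq. (9): IM is the M-by-N local image patch around point p,
% f a complex Gabor filter of the same size in the spatial domain.
function k = gabor_feature(IM, f)
  k = abs(sum(sum(double(IM) .* f)));   % magnitude of the scalar product
end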


Chapter 2

Eye Localization

The eyes and eye regions are the most important facial landmarks on the human face in many respects, including for the recognition of human identities. Eye localization, therefore, is an important step in human face recognition. In this chapter, a novel approach for determining the location of the human eye center using Gabor filters is devised.

2.1 System introduction

The flowchart in figure 2-1 presents the proposed approach and algorithms for eye centre localization.

The accuracy of face normalization is critical to the performance of the subsequent face analysis steps. We therefore first preprocess the human face images and, here, determine three parameters: the retina radius, the starting and ending frequency, and the picture size.

After face normalization, the proposed system is trained on the training set. We studied two models: one is based on a specific frequency and orientation filter response for each point of the artificial retina grid, and the other is an averaged (over 50 people) feature vector, where each vector consists of all Gabor filter responses at a single eye centre of a single individual; this is also called the 'average eye'. In both cases, the resulting model can be represented by a vector.

For testing the system, or when the system is operational, we first extract the feature vector for any image point which is a candidate for being an eye location. The elements of this feature vector are obtained by taking the scalar product between the region determined by the candidate point at hand and the specific Gabor filter model. The region and the specific Gabor filter are determined by the model studied. We then compare this feature vector to the feature vector of the eye model obtained from the training set. We determine the location of the eye centre by either the Sum of Square Error (SSE) method or the Schwartz inequality method.

Figure 2-1: Flow chart of the eye localization system

2.2 Preprocessing of face images

The same parameters are used both in the training part and the testing part.

2.2.1 Retina radius

Retina sampling grids capture important information around the pixels they are placed on. However, the radius of the grid needs to be determined. Our retinotopic grid consists of 68+1 points distributed onto 4 circles [5], as displayed in figure 2-2.

Figure 2-2: The retinotopic grid

From the figure above, we can see that the artificial retina is denser at the centre (fovea) than at the periphery. The grid size is empirically determined by letting it cover the pupil and the eyebrow area [2].

On the other hand, the smaller the radius is, the faster an identification can be performed. Specifically, we chose the area of the pupil as a circle with a radius of 2 pixels, and the average distance from the eye center to the eyebrow, about 15 pixels, was also fixed empirically. As a consequence of these considerations, the radius between the foveal and the peripheral vision in our topology was allowed to vary between 2 pixels and 20 pixels.

2.2.2 Start and end frequency of Gabor filter

Gabor features are widely used for feature extraction to recognize visual

information. The transform coefficients have good discrimination

characteristics, and it is easy to adjust the direction, baseband bandwidth and

center frequency of Gabor filters [23]. Thus, Gabor filters have been widely

utilized to extract components that normally include relatively high energy

in high frequency components, e.g. shapes defined by lines and edges.

However, they are also used to represent and analyze textures. The

fundamental frequencies are used for representing the silhouettes of an

object and can be used to classify objects.

In a face image, eyes have special properties – two gray valleys and rich

edge segments [12]. A Gabor filter in which the center frequency lies in the

high frequency band has a smaller window size, and describes abruptly changing local characteristics of the local image. By contrast, low frequency Gabor filters are more suitable for slowly varying intensity changes. Hence, the high frequencies of the Gabor filter bank must be present for locating facial features which are rich in detail, such as the eye area. Low frequency Gabor filters are more important at the periphery of the eye, where the image intensity changes relatively slowly.

Besides dynamically choosing among filters of different sizes, we must remember to keep the total number of candidate points for eye centers small. The smaller this number, i.e. the picture size, the fewer tests will be performed, reducing the search time. Through empirical experiments on eye center localization, we found that frequencies from 0.4π to 0.9π yield better results, with filter sizes from 25×25 down to 11×11. Table 2-1 shows the different sizes of the Gabor filters.

Table 2-1: Start and end frequency of the Gabor filter and the resulting window sizes

Start-end frequency   Window 1   Window 2   Window 3   Window 4   Window 5
0.1π-0.5π             77×77      51×51      35×35      23×23      15×15
0.4π-0.9π             25×25      21×21      17×17      13×13      11×11
0.1π-0.9π             75×75      43×43      25×25      15×15      9×9


2.2.3 Picture size

The size of each original picture is 205×256 pixels, and the original pictures are assumed to be handed over by a face tracking system. However, the useful part can be smaller, given the known retina radius and the size of the Gabor filter. For a test picture, the scanning direction of the image points tested for being an eye center is left-to-right and top-to-bottom, whereby all pixels of the handed-over picture are tested as eye locations. Thus, the smaller the handed-over picture is, the faster the testing will be.

What is more, geometric constraints are applied when localizing the eye center: we localize the left eye and the right eye separately, because of the similarity of the regions around the two eyes. In our model, we select a square centered at the point of the visual pupil center, whose side length is determined by the retina radius adjusted upwards by the size of the Gabor filter. These parameters were determined empirically, as mentioned before. We set the radius of the innermost circle to 2 pixels and that of the outermost circle to 20 pixels, and the filter frequencies span the range from 0.4π to 0.9π. In order to get the whole feature information around the eye center, the radius of a useful picture is 20 + 25/2 ≈ 33 pixels, which means we should select at least a 66×66-pixel square centered at the eye center. For an acceptable error, and to avoid interference from the other eye, we selected 60×60 pixels centered on the visual pupil for image registration.

Figure 2-3 shows the original face image and the target eye-and-brow region.


2.3 Training sample set

2.3.1 Gabor filters model

In a face image, the eye-and-brow region, as a 2D signal, has specific frequency and orientation content, so this region differs from the other facial regions [12]. Hence, in order to segment the eye-and-brow region, a proper bank of band-pass filters can enhance the signal of this region while suppressing that of the others [13].

In this thesis, we select 5 frequencies and 6 orientations. For frequency, the start frequency is 0.4π and the end frequency is 0.9π, as discussed in section 2.2.2. For orientation, the start orientation is 0 and the end orientation is 5π/6. That means the sensitive direction increases in steps of 30°; for example, the first filter is tuned to the 0° orientation, which is sensitive to vertical structures, the fourth one to 90°, which is sensitive to horizontal structures, and so on. Figure 2-4 illustrates this visually.

Figure 2-3: Original face image and the target eye-and-brow region


2.3.1.1 Single Gabor filter per retina grid point

In order to enhance computational efficiency and robustness, we represent features using Gabor filters, which are non-orthogonal. The Gabor transformation corresponds to a multi-scale, oriented feature representation, and Gabor filters can be used for detecting oriented features at multiple scales.

In the eye-and-brow region, orientation is the salient characteristic, meaning the signal contains more energy in the horizontal orientation than in the vertical one. Gabor filters can capture salient visual properties such as spatial localization and orientation selectivity.

From figure 2-4, we can see that the first orientation is sensitive to vertical and the fourth orientation to horizontal directions. This means that the fourth orientation is a good one for identifying the eye-and-brow region if we had to choose only one filter. The experimental results are shown in figure 2-5.

Figure 2-4: Gabor filters on the frequency plane with 5 frequencies and 6 orientations

Figure 2-5: Scalar product results for sample Gabor filters (left to right: frequency 4, orientation 4; frequency 4, orientation 6; frequency 2, orientation 4)

2.3.1.2 Specific filter for each retinal point

For a face, though the eye and eyebrow look roughly horizontal, the features of other regions are clearly non-horizontal. Accordingly, appropriate filters need to be compiled with the corresponding frequency and orientation in mind. That means that if we used only a single filter, with the same frequency and orientation, to obtain the scalar products with the local images around every grid point, our features would not be as descriptive as if the single filter were chosen in accordance with the dominant orientation and frequency occurring around each grid point.

On the other hand, using all filters (5 frequencies and 6 orientations based on a certain start and end frequency) with the 68+1 retina sampling grid points of each artificial retina, we get a 2070-dimensional feature vector (30×69).

We picked the filter producing the highest response among the 30 dimensions at every grid point. Accordingly, we get a 1×69 vector for the artificial retina. The filter chosen for every grid point corresponds to a frequency and orientation that yields the highest share of the representation compared to using all filters. Table 2-2 shows the automatically chosen filters and the directions they represent. The choice of different filters for different grid points is made on the average of the 2070-dimensional artificial eye feature vectors.

Figure 2-6 shows the specific frequency and orientation filter chosen for each retina point. We circle the regions with the same orientation. In order to verify the capability of the Gabor filters to select the eye-and-brow regions, experiments were conducted. The filter bank used is illustrated in figure 2-7.

Table 2-2: Relationship between the order of the combination (filter number) and the corresponding (frequency, orientation) pair

              Orient. 1   Orient. 2   Orient. 3   Orient. 4   Orient. 5   Orient. 6
Frequency 1   #1 (1,1)    #2 (1,2)    #3 (1,3)    #4 (1,4)    #5 (1,5)    #6 (1,6)
Frequency 2   #7 (2,1)    #8 (2,2)    #9 (2,3)    #10 (2,4)   #11 (2,5)   #12 (2,6)
Frequency 3   #13 (3,1)   #14 (3,2)   #15 (3,3)   #16 (3,4)   #17 (3,5)   #18 (3,6)
Frequency 4   #19 (4,1)   #20 (4,2)   #21 (4,3)   #22 (4,4)   #23 (4,5)   #24 (4,6)
Frequency 5   #25 (5,1)   #26 (5,2)   #27 (5,3)   #28 (5,4)   #29 (5,5)   #30 (5,6)


Figure 2-6: Gabor filters model with the frequency span from 0.4π to 0.9π

Figure 2-7: Probability of each Gabor filter being chosen for the retina grid of an average eye


Figure 2-8 shows the distribution of Gabor filters on a retina grid for frequencies from 0.1π to 0.5π, using the training set discussed in section 2.2.2. We can see that some details are ignored by the low frequency filters.

Figure 2-8: Gabor filters model with the frequency span from 0.1π to 0.5π

2.3.2 Computing the average eye

Using the parameters we have selected, we perform scalar products between each retina point of one eye and the corresponding frequency and orientation filter, and then we obtain 69 feature values for one eye. Looping this procedure over every person in the training set, we obtain a feature matrix, and we then calculate the mean of this matrix such that we obtain a 69-dimensional vector. This average vector contains the 69 average features of one eye, and we call it the 'average eye'. The training procedure is as follows:

For i = 1 : (number of eye pictures in the training set)
  1) Select the visual eye centre O manually.
  2) Place 69 retina points (P1...P69) around this eye centre O.
  3) Compute the scalar product between each retina point and its corresponding Gabor filter Gi to obtain the feature vector FV.
  4) Normalize the feature vector FV.
End
5) Compute the average features over all training people.

At step (1), for a new image, we first select the eye centre manually; at step (2), we place a retina model centred on this eye centre, and every retina grid point Pi is retained for feature extraction. At step (3), the Gabor feature vector is computed using the scalar product; this vector describes the neighborhood of each retina point Pi. A single Gabor filter Gi is a 2-D complex-valued filter [5] with a certain frequency and orientation, as selected in section 2.3.1.2.
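A minimal MATLAB sketch of this training loop follows (ours, under stated assumptions): retina_grid and gabor_feature are the hypothetical helpers sketched earlier, while cut_patch and the per-point filter index chosen(p) stand in for the patch cut-out and the filter selection of section 2.3.1.2.

% Sketch: compute the 'average eye' over a training set of eye images.
% eyeImgs: cell array of images; centers: Kx2 manually picked eye centres.
K  = numel(eyeImgs);
FV = zeros(K, 69);                         % one feature row per training eye
for i = 1:K
  pts = retina_grid(centers(i,1), centers(i,2), 2, 20);  % 69 grid points
  for p = 1:69
    patch   = cut_patch(eyeImgs{i}, pts(p,:));        % local image around Pi
    FV(i,p) = gabor_feature(patch, bank{chosen(p)});  % step 3: one response
  end
  FV(i,:) = FV(i,:) / norm(FV(i,:));       % step 4: normalize
end
avgEye = mean(FV, 1);                      % step 5: the 69-dim average eye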

2.3.3 Weights

The variance of a probability distribution is a measure of statistical dispersion, used to capture its scale, or degree of being spread out [1]. Thus, we compute the variance of each element of the normalized feature vectors FV; the weights are then set as the inverse of the corresponding variances, and finally the weights are normalized such that they sum to 1.
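As a sketch (ours), continuing from the FV matrix above:

% Sketch: inverse-variance weights over the 69 grid-point features.
v = var(FV, 0, 1);     % variance of each feature across the training set
w = 1 ./ v;            % inverse variance: stable features weigh more
w = w / sum(w);        % normalize the weights to sum to 1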

2.4 Locate eye center

We first illustrate the whole localization procedure of the testing part, then introduce two methods for determining the eye center, and finally we set thresholds for evaluating the detection performance.

2.4.1 The strategy of locating eye center

The parameters of the testing part are the same as in the training part. The strategy for determining the eye center consists of comparing the feature vector of the 69 retina points around one pixel in the test picture to the reference feature vector of the eye center. The comparison is done by computing a quadratic sum of vector element differences. This is done for every pixel of the test image, whereby each pixel is assumed to be a candidate eye center. The pixel with the smallest Sum of Square Error (SSE) holds the most similarity (best matches) to the eye center. That point is then marked as an eye (not specific to a person).

The testing procedure is as follows:

For m = 1 : (number of y-coordinates)
  For n = 1 : (number of x-coordinates)
    1) Select the current pixel at (n, m) as the candidate point for an eye. Call it O.
    2) Place 69 retina points (P1...P69) around this test centre O.
    3) Compute the scalar product between each retina point Pi and its corresponding Gabor filter Gi to get the feature vector FV.
    4) Normalize the feature vector FV.
    5) Compute the difference between the feature vector FV and the average eye vector.
  End
End
6) Pick the location with the best SSE or Schwartz inequality score among the differences computed at step 5 as the eye location.

2.4.2 Compare testing feature with training feature

Using a similar strategy, we obtain the feature vector of the test image at a candidate position, and the average feature vector of the training set at the artificial retina centre. After this step we can compare the candidate feature vector to the average vector obtained from the training set, using the different techniques discussed next.


2.4.2.1 The Sum of Square Error (SSE)

The quadratic sum of the differences of the features between the model and the test image gives a global matching score; the pixel with the least difference holds the most similarity to a typical eye, which means a facial landmark has been found.

Note that the Sum of Square Error (SSE), as used here, is a value between 0 and 1: "0" means the two vectors have the highest similarity, and "1" means the lowest similarity. This is because the difference is normalized by the sum of the norms of the two vectors.

2.4.2.2 Schwartz inequality

The Schwartz inequality states that, given two vectors f and g in a vector space V with a scalar product <·,·> over V, |<f, g>| ≤ ||f|| · ||g||.

Note that the Schwartz-inequality similarity measure is also a value between 0 and 1: "1" means the two vectors have the highest similarity, and "0" means the lowest similarity.
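Both measures can be sketched in a few lines of MATLAB; the exact normalization is our reading of the descriptions above, for two feature vectors x and y whose elements are Gabor magnitudes (hence never negative).

% Sketch: the two similarity measures for feature vectors x and y.
x = x / norm(x);                                  % normalize both vectors
y = y / norm(y);
sse      = sum((x - y).^2) / (norm(x) + norm(y)); % in [0,1]; 0 = best match
schwartz = abs(dot(x, y)) / (norm(x) * norm(y));  % in [0,1]; 1 = best match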

2.4.3 Detection performance

Detection performance was evaluated by Euclidean distance and visual inspection. We compared the detected eye positions with the foveal positions which we manually selected beforehand; the performance is described by the success rate of eye localization [15].

We set three thresholds on the distance to the ideal eye centers (marked manually): less than or equal to 5 pixels, 3 pixels and 1 pixel. We circle the threshold regions in figure 2-9; figure 2-9(a) shows the handed-over image and figure 2-9(b) the original picture.


Figure 2-9(a): The threshold region in the handed-over image

Figure 2-9(b): The threshold regions in the original image


Chapter 3

Client identification

3.1 Identification model

Client authentication is a research field related to face recognition. Since the human face can always undergo variations in appearance, and since changes in facial expression, pose, scale, shift and lighting conditions constantly occur, face recognition has long faced many challenges. Our proposed system verifies the claimed identity of the subject, with tolerance to facial expression variations. The flowchart of our identification approach is presented in figure 3-1.

The first goal in our proposed system is to construct a client database that includes the clients we intend to recognize in the future. In our system, we selected 5 groups of subjects from the XM2VTS database as inputs. In the previous chapter, we explained that the facial landmark location in our case is an eye location. We then give the system a prepared frame where a face is already delineated by a face detection technique. With the help of the eye localization techniques discussed previously, the system jumps to the assumed position, either the center of the right or of the left eye. Then the retinotopic sampling grid is placed on the eye area, with its first grid point


right on the eye center. Preferably, the sampling grid will cover the fovea of the eye area. As indicated, we mainly extract the biometric features of the eye fovea. This means that we assume different people have different biometric features around their eyes.

Figure 3-1: Flow chart of the identification system


This is the basic concept of our biometric identification. One can easily see the advantage of using a retinotopic sampling grid, remembering that feature extraction is usually very time-consuming. Closer to the eye center, more features are retained; this means that, by implementing the retinotopic sampling grid, we maximize the discriminative information we want to keep and reduce the computational costs. After positioning the retinotopic sampling grid, we associate with each grid point orientation- and frequency-sensitive cells, each having a receptive field of its own (see figure 3-2), represented by the spatial extensions of the Gabor filters. Here we constructed 6 absolute frequency channels and 6 orientation channels, using a Gabor filter bank of 36 filters. A receptive field can be viewed as a simple model of the V1 cells in the primary visual cortex.

Accordingly, in our experiments, we calculated 36 different Gabor filter responses at each point of the artificial retina, with the help of equation (9). It is worth noting that the Gabor filter magnitude response is invariant to the phase of sinusoids, reaching a maximal response when ξ₀ matches the actual frequency (and η₀ the orientation) of the local signal. When the analysis of local image structures with small details is needed, higher frequency filters will show a higher filter response k. In contrast, applying lower rather than higher frequencies gives greater magnitudes k when a coarser and larger structure of the face is focused on by the feature extraction. Because of the complexity of the eye area we cover, both finer small structures and coarser larger structures can be analyzed simultaneously, and we also hope that more discriminative information for face recognition will be available.

Figure 3-2: Receptive fields of Gabor filters

For each grid point of one eye, we calculate the scalar product between each of the 36 filters and its corresponding local neighborhood. This results in a feature vector of length 2484 (36×69) for representing the identity of each subject. At this step, we store the features representing each client identity on the hard disk for future verification of that client, when she/he wants to be authenticated.

When a new camera frame is given, we follow the same routine that we used to obtain the training reference model of the clients. We then need a similarity measurement technique for matching the current image features against the reference features in the client feature database we constructed. This technique is explained in the next section. The details of the experiments are given in Chapter 4.


3.2 The matching concept

In our proposed system for matching two Gabor feature vectors representing identities, we studied two similarity measurement methods: the Schwartz inequality and the method of the sum of squared errors.

3.2.1 Schwartz inequality

The Schwartz inequality, already discussed in the context of eye localization, states that given two vectors f and g in a vector space V, and a scalar product <·,·> over V, we have the inequality |<f, g>| ≤ ||f|| · ||g||. It can also be used in the context of identity verification. The Schwartz inequality can be interpreted as measuring the angle between two feature vectors, each representing an identity:

cos θ_xy = <x, y> / (||x|| ||y||)    (9)

Here, x and y are two vectors in a vector space V, and the equation results in a value in [-1, 1].

After extracting the biometric features of the eye area of a subject, we calculate the norms of the feature vector and of the previously stored reference vector, and apply equation (9) to obtain a similarity score between the two vectors. Note that we use the absolute value of the scalar product between the two feature vectors; our similarity measure is therefore a value between 0 and 1, because all of our vector elements are magnitudes, meaning that they are never negative. A final score of "1" means the two vectors have the highest similarity, and "0" means the lowest similarity.

3.2.2 SSE

The method of the sum of squared errors is often applied in statistical contexts, particularly regression analysis. It can be interpreted as a method of fitting data. The sum of squared residuals has its least value when the two vectors are very similar, a residual being the difference between an observed value and the value predicted by the model (equation 10). This residual sum was described by Carl Friedrich Gauss around 1794, who attempted to minimize the SSE by changing the model parameters.

Σ_{i=1}^{n} (X_i - Y_i)²    (10)

Here, X and Y are two vectors in a vector space V. In our system, at the matching phase, we derive two Gabor feature vectors, each representing a client or a potential client. Applying (10) to the two normalized biometric feature vectors, we derive our similarity measure between the model (of the previously stored client) and the measurements made on the current picture of the client. Detailed information on the experimental results is given in chapter 4.


3.3 Training by weights

In section 3.1, we discussed placing a retinotopic sampling grid on the fovea of an eye in order to collect feature information of that area. Of all 69 positions on the eye area, which position is more important than the others, and which one is the weakest, is the subject of this section. After establishing an important position, one can give that position a higher weight value to declare its importance for identifying the clients. The other points are, accordingly, treated as weaker and will contribute less to identification; such points are given a small weight value. The key factor in determining which grid points are strong verifiers in the identification process is the stability of the performance of those very points when they are extracted for a number of clients. A point that always performs consistently is considered an important grid position; hence we introduce the variance as an indication for measuring the importance of the grid positions. In probability theory, the variance of a random variable, probability distribution, or sample is a measure of statistical dispersion, averaging the squared distance of its possible values from the expected value (mean). The mean describes the location of a distribution, and the variance captures its scale, or degree of consistency or of being spread out. In general, the population variance of a finite population of size N is given by

σ² = (1/N) Σ_{i=1}^{N} (X_i - X̄)²    (11)

where X̄ is the population mean.

In our system, we construct two groups of images for training the weight of each position. In one training group, we have 40 images of different people, while in the other training group we have 40 images of the same group of subjects, but with different facial expressions or hair changes. For the same person, we extract the Gabor responses at the same one of the 69 positions on the sampling grid in the two training groups, calculate the similarity of the two response vectors, and record it as a score. Then we do the same for that retina point for each other person. After this we get a group of 40 scores for one particular retina point. The variance of this position, σ₁², which represents the stability of this position, is then derived. Variances for the other positions are obtained in the same way. A higher variance means lower stability. Hence, our system uses the reciprocal of the variance, 1/σ², for calculating the weight of a particular grid point. Figure 3-3 illustrates the training of one weight for one particular point of the 69 retinotopic sampling grid points.

Figure 3-3: An example of the training process for the variance of one grid point (Score1, Score2, ..., Score40 yield σ₁²)
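A sketch of this score-based weight training in MATLAB (our formulation; respA{s} and respB{s} are hypothetical 69-by-D Gabor response matrices for subject s in the two recording groups, and scoreOf is a similarity function such as the Schwartz measure above):

% Sketch: per-point weights from the variance of 40 similarity scores.
scores = zeros(40, 69);
for s = 1:40
  for p = 1:69
    scores(s,p) = scoreOf(respA{s}(p,:), respB{s}(p,:));  % one score per pair
  end
end
v = var(scores, 0, 1);         % variance of each position's 40 scores
w = (1 ./ v) / sum(1 ./ v);    % reciprocal variance, normalized weights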


Chapter 4

Experiments and results

In this chapter, the experimental results of all proposed methods and

algorithms are presented, and a detailed analysis of those results is also

included. Detailed information on the laboratory environment is described in

section 1.3.

4.1 Landmark localization tests

4.1.1 Parameters instruction

Facial landmark detection tests were run on a total of 110 images from the XM2VTS database. Each handed-over image includes only one eye, since the two eye centers are located separately; the following experiments use right eyes as samples.

The training set consists of 50 persons, all without glasses, with frontal faces and neutral expressions; the size of the handed-over images (from the face localization system) is 60×60 pixels.

The testing sets are separated into three groups. The first one is the same as the training set, with the same 50 people, and the second and third each consist of 30 different people; the total number of people in the three groups is 110.

Timing was measured under the configuration of our PC: an Intel Core 2 at 2.1 GHz, 800 MHz FSB, 3 MB L2 cache, and 2 GB DDR2 memory.

4.1.2 Experimental results

We mainly adopt four method combinations for testing. The first two methods are the Sum of Square Error (SSE), with and without weights, and the other two are the Schwartz inequality, with and without weights.

Visually, the eyeball region, including the white of the eye, lies within 5 pixels of the fovea centre: detections within this region are what we call acceptable results. The pupil region lies within 3 pixels: detections here are what we call perfect results. The region within 1 pixel is used only for reference, in order to evaluate the accuracy.

Experimental results are shown in tables 4-1 to 4-4. Each table reports the success rates achieved at the thresholds of less than or equal to 5 pixels, 3 pixels and 1 pixel, for the group of 50 people (the same as the training set) and the other two groups of 30 people each.

Table 4-1: Sum of Square Error (SSE) without weights

Group                  ≤5 pixels   ≤3 pixels   ≤1 pixel   Seconds per eye
50p training set       100%        94%         50%        13.845
30p group 1            93%         90%         73%        13.545
30p group 2            100%        100%        70%        13.790
Average success rate   98%         95%         64%        13.727


In this table, the success rate of the acceptable region is 98%, which means the eyes in 2 out of the 110 images are not detected. The success rate of the perfect region is 95%, which means 6 out of 110 images fail, whereas the success rate of the reference region is still 64.43%, which means the detected eye centers of 97 images are exactly the same as the visual fovea.

The average time for detecting one eye is 13.727 seconds, using MATLAB 7.0 running on our PC.

Table 4-2: Sum of Square Error (SSE) with weights

Group                  ≤5 pixels   ≤3 pixels   ≤1 pixel   Seconds per eye
50p training set       100%        94%         36%        13.265
30p group 1            100%        87%         53%        13.759
30p group 2            100%        97%         43%        13.601
Average success rate   100%        93%         44%        13.542

In this table, the success rate of the acceptable region is 100%, which means all 110 images are detected. The success rate of the perfect region is 93%, which means 8 out of 110 images fail, whereas the success rate of the reference region is only 44%, which means the detected eye centers of 48 images are exactly the same as the visual fovea.

The average time for detecting one eye is 13.542 seconds, using MATLAB 7.0 running on our PC.


Table 4-3: Schwartz inequality without weights

Group                  ≤5 pixels   ≤3 pixels   ≤1 pixel   Seconds per eye
50p training set       100%        94%         50%        13.441
30p group 1            93%         90%         73%        14.112
30p group 2            100%        100%        70%        13.941
Average success rate   98%         95%         64%        13.831

In this table, the success rate of the acceptable region is 98%, which means the eyes in 2 out of the 110 images are not detected. The success rate of the perfect region is 95%, which means 6 out of 110 images fail, whereas the success rate of the reference region is still 64%, which means the detected eye centers of 97 images are exactly the same as the visual fovea.

The average time for detecting one eye is 13.831 seconds, using MATLAB 7.0 running on our PC.

Table 4-4: Schwartz inequality with weights

Group                  ≤5 pixels   ≤3 pixels   ≤1 pixel   Seconds per eye
50p training set       100%        90%         42%        13.729
30p group 1            93%         77%         50%        13.828
30p group 2            93%         67%         30%        13.545
Average success rate   95%         78%         41%        13.701

In this table, the success rate of the acceptable region is 95%, which means the eyes in 4 out of the 110 images are not detected. The success rate of the perfect region is 78%, which means 24 out of 110 images fail, whereas the success rate of the reference region is only 41%, which means the detected eye centers of 45 images are exactly the same as the visual fovea.

The average time for detecting one eye is 13.701 seconds, using MATLAB 7.0 running on our PC.

From the tables above, we can observe some regular patterns:

1. In the Sum of Square Error (SSE) method, using weights gives the best results, a 100% success rate, at the ≤5 pixel threshold, but gives worse results at the ≤3 pixel threshold.

2. In the Schwartz inequality method, not using weights gives better results than using weights.

3. Without weights, the results are exactly the same under the Sum of Square Error (SSE) method and the Schwartz inequality method.

4. On the whole, without weights, the result in the perfect region is 95% for both methods, while, with weights, the result in the perfect region is only 86% on average over the two methods.

5. The time required for detecting one eye center is about 13.5 seconds, which also depends on the PC configuration.

The experimental results all show reliable eye detection performance. Figure 4-1 shows some of the identified results in the handed-over image and the original image.

Figure 4-1(a): Successfully identified results in the perfect region

Figure 4-1(b): Successfully identified results in the acceptable region

Figure 4-1(c): Failed identification results


4.2 Client identification tests

In this section, in accordance with the theories and methods proposed in

chapter 3, the implementation of our system for identification and its test

results are presented.

4.2.1 Determining model parameters

A training set of 10 images is formed, of which 5 pictures are captured by a web camera and the other 5 pictures are from the XM2VTS database. In this training set, there are 3 women, and none of the subjects wears glasses. In the testing set, we have 13 images, of which 3 are impostors without glasses. In the identification process, all 10 clients are correctly identified and the 3 impostors are rejected. We use the above small training set and testing set to determine the radius of our retina, the number of frequency channels, the number of orientation channels, and a suitable frequency range, based on the theories and algorithms described in Chapter 3.

Firstly, identification based on alternative retina radius ranges is tested, as mentioned in chapter 3. We empirically choose the radius range of the retina from 10.7 to 50 or 60 pixels, according to the size of the eye area in a camera frame. Basically, this range covers the eye area and excludes information of other biometric features such as ears, forehead or hair. A Gabor filter bank with 5 frequency channels and 6 orientation channels is constructed first, with a frequency range from 0.1π to 0.5π, and the similarity measure is calculated using the Schwartz inequality. Table 4-5 shows the comparison between the two ranges. We can see that, with a radius from 10.7 to 50 pixels, the mean similarity value of the clients has risen compared to the retina radius from 10.7 to 60, and the mean similarity value of the impostors has decreased, which means the retina radius range from 10.7 to 50 performs better at enlarging the gap between the similarity measures of clients and impostors.

Table 4-5: Similarity measure based on alternative retina radius ranges

Radius range   Clients    Impostors
10.7-60        0.93       0.90
10.7-50        0.94 (↑)   0.89 (↓)

Secondly, we test the effect of the frequency channels employed in our system on the similarity measure. We compare using only the higher frequency channels with using all channels (the result is shown in table 4-6). We notice that a rise in the similarity measure of the clients is a good thing, but too large a rise in the similarity measure of the impostors ruins the overall performance.

Table 4-6: Similarity measure based on alternative frequency channels

Frequency channels   Clients    Impostors
[1 2 3 4 5]          0.93       0.90
[3 4 5]              0.94 (↑)   0.94 (↑)


Next, we compare Gabor filter banks of 5 and 6 frequency channels, with the same 6 orientation channels. Table 4-7 shows that there is no change in the mean similarity of the clients, and a welcome decrease in the mean similarity of the impostors, which means a 6 by 6 Gabor filter bank performs better at distinguishing impostors from clients.

Table 4-7: Similarity measure based on alternative Gabor filter banks

Filter bank                Clients   Impostors
5 by 6 Gabor filter bank   0.93      0.90
6 by 6 Gabor filter bank   0.93      0.87 (↓)

The frequency range of the Gabor filter bank proves to be an important factor in deciding the similarity value. The original start frequency of our filter bank is 0.1π and the end frequency is 0.5π. After we enlarge the range to span from 0.05π to 0.7π, the system gains much better performance. Table 4-8 shows the results.

Table 4-8: Similarity measure based on alternative filter bank frequency ranges

    Frequency range    Clients      Impostors
    0.1π - 0.5π        0.93         0.90
    0.05π - 0.7π       0.96 (↑)     0.87 (↓)
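The Matlab sketch below shows one plausible layout of the centre frequencies and orientations for the 6 by 6 filter bank over the enlarged range; the geometric spacing of the frequency channels is an assumption on our part, since the text only fixes the end points 0.05π and 0.7π and the channel counts.

    nFreq  = 6;  nOrient = 6;                % 6 by 6 Gabor filter bank
    fLow   = 0.05*pi;  fHigh = 0.7*pi;       % enlarged frequency range
    % Geometrically spaced centre frequencies from fLow to fHigh:
    freqs  = fLow * (fHigh/fLow).^((0:nFreq-1)/(nFreq-1));
    % Orientation channels evenly covering the half circle [0, pi):
    thetas = (0:nOrient-1) * pi/nOrient;
    % One Gabor filter is then generated per (frequency, orientation) pair.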


Finally, we take another biometric feature into consideration for our system: the nose. The configuration uses a retina radius range from 10.7 to 50 pixels; a Gabor filter bank with 5 frequency channels and 6 orientation channels is constructed with a frequency range from 0.1π to 0.5π, and the similarity measure is calculated using the Schwartz inequality. Since we now have two different biometric features in our system, we assign a weight to each feature: each eye is given a weight of 40%, and the nose a weight of 20%. Introducing the nose feature into our system improves the performance, but only to a limited extent. The result is shown in Table 4-9, followed by a sketch of the fusion rule.

Table 4-9: Similarity measure based on different feature information

    Feature information    Clients      Impostors
    Without nose           0.93         0.90
    With nose              0.94 (↑)     0.89 (↓)
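The weighted decision can be sketched in Matlab as follows; the function and variable names are illustrative, under the assumption that each landmark yields one Schwartz similarity score in [0, 1].

    % Weighted fusion of per-landmark similarities using the weights of
    % Table 4-9: 40% per eye and 20% for the nose, then a threshold test.
    function accept = fuseLandmarks(sLeftEye, sRightEye, sNose, threshold)
        w      = [0.4  0.4  0.2];                    % feature weights
        total  = w * [sLeftEye; sRightEye; sNose];   % weighted sum
        accept = total >= threshold;                 % client if high enough
    end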

4.2.2 Identification tests

In the first identification test, a group of 50 people, of whom 10 wear glasses, is chosen from the XM2VTS database as the reference training set. Another image recording of the same group of 50 people serves as our testing set, to which we add 20 impostors, 4 of whom wear glasses.


Based on the experimental results gained in the last section, we decided to use a 6 by 6 Gabor filter bank with a frequency range from 0.05π to 0.7π, and to employ a retina with a radius range from 10.7 to 50 pixels, in our identification tests. The identification rate indicates how successfully we can recognize a person with our system, and this ratio is obtained as follows:

be obtained as follows:

Identification Rate = (number of correctly identified clients) / (total number of clients)

We use the Schwartz inequality as our similarity measure in the first test. Among the 40 clients without glasses, 33/40 = 82.50% are correctly identified, while only 10% of the clients with glasses are correctly identified. Figure 4-2 illustrates a histogram of the similarity measurements of test 1, where we use a threshold of 0.91. All 20 impostors are rejected by our system.

In the second test, we have only the same 40 people without glasses as our training set, and choose the corresponding 40 images from the testing set of test 1 as the testing set, with an additional 20 impostors. In this test we employ the SSE as our similarity measure; a histogram of the similarity measurements is shown in Figure 4-3. With a false rejection rate of 12.5% and a false acceptance rate of 10%, we use 0.20 as our threshold, and this results in an identification rate of 87.5%. A sketch of the SSE decision rule is given below.
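The sketch assumes that the feature vectors are normalized to unit length, so that an SSE threshold such as 0.20 is comparable across probes; with the SSE, a smaller score means a better match, so a claim is accepted when the score falls below the threshold. The function name and the normalization step are ours, not taken from the thesis implementation.

    % Sum square error (SSE) between normalized feature vectors; unlike
    % the Schwartz measure, lower is better, so we accept below threshold.
    function accept = sseDecision(u, v, threshold)
        u = u(:) / norm(u);  v = v(:) / norm(v);   % unit-length features
        sse = sum(abs(u - v).^2);                  % sum square error
        accept = sse <= threshold;
    end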


In our third test, we introduce weights into our system, as already discussed in Chapter 3. We use the previous 40 people: two groups of recordings are used as training sets for the weights, and another as the testing set, with an additional 20 impostors. The Schwartz inequality is implemented for calculating similarity. A histogram of the similarity measurements is shown in Figure 4-4. With a false rejection rate of 7.5% and a false acceptance rate of 5%, we use 0.91 as our threshold, and this results in an identification rate of 92.5%.

Figure 4-2: Histogram of the similarity measurements of test 1.


Figure 4-3: Histogram of the similarity measurements of test 2.

Figure 4-4: Histogram of the similarity measurements of test 3.


Figure 4-5: Histogram of the similarity measurements of test 4.

In our last test, we implement the weights and use the SSE as the similarity measure in our system. Again, we use the same 40 people as before: two groups of recordings are used as training sets for the weights and another as the testing set, with an additional 20 impostors. A histogram of the similarity measurements is shown in Figure 4-5. With a false rejection rate of 7.5% and a false acceptance rate of 10%, we use 0.0027 as our threshold, and this results in an identification rate of 92.5%.


Chapter 5 Discussion and conclusion

In our proposed system, the test results of the eye localization, achieved with the help of Gabor filter banks, are encouraging; reliable eye localization is a prerequisite for the person authentication system. Because of the strict thresholds we set, we only use the acceptable region when comparing with several similar systems, which are presented in Table 5-1. The table shows that the proposed system compares favourably with the other methods. Ref. [12] uses a Gabor-eye model and a radial symmetry operator to locate the pupil area; Ref. [17] applies Gabor filters within a probabilistic framework to locate the eye centre; Ref. [19] proposes visual routines based on learning and evolution.

Table 5-1: Comparison of the results with other methods

    Method                   Success rate
    Yang & Du [12]           95%
    Ma & Ding [17]           94.44%
    Huang & Wechsler [19]    98.7%
    Ours                     98%

Despite the differences between individual human eyes, the average eye model works at a satisfying level, but an eye localization technique that locates both eyes jointly should be developed to replace the current one-by-one mechanism. The time demand of the eye localization can be reduced by using dedicated hardware.


On client identification, good results have been achieved with Gabor filter banks together with the carefully chosen sampling points and frequency channels employed in the identification part, which enable real-time performance; these results still need to be confirmed with a bigger training set. It is not realistic to compare the performance of different systems quantitatively, because they are implemented in different environments. As a rough indication, the algorithm of [2] took about 8 minutes to perform facial landmark detection and face verification, whereas our proposed system takes no more than 4 seconds for client identification and no more than 40 seconds to perform facial landmark detection and face identification. An EER of 6.0% has been achieved in the identification process, which can be further improved by training SVM experts [2]; a sketch of how the EER can be estimated from the recorded scores is given below.

biometrics, such as nose or mouth, into our system. Further research on the

sensitivity with respect to clients with glasses has to be carried out. For

real-time purposes, the possible future work involves implementation of the

whole system, with a faster language such as ’C#’.


References

[1] Wikipedia, "Biometrics", The Free Encyclopedia. URL: http://en.wikipedia.org/wiki/Biometric. Viewed: May 15, 2009.

[2] F. Smeraldi, J. Bigun. Retinal vision applied to facial features detection and face authentication. Pattern Recognition Letters, 23, pp. 463–475, 2002.

[3] F. Smeraldi, O. Camona, J. Bigun. Real-Time Head Tracking by Saccadic Exploration. Proceedings of the 5th International Workshop, IEEE Cat. Num. 98TH8354, pp. 684–687, 1998.

[4] J. Bigun. Vision with Direction: A Systematic Introduction to Image Processing and Computer Vision. Springer, 2006.

[5] J. Bigun, H. Fronthaler, K. Kollreider. Assuring liveness in biometric identity authentication by real-time face tracking. IEEE International Conference on Computational Intelligence for Homeland Security and Personal Safety, Venice, Italy, 21–22 July 2004.

[6] B. Duc, S. Fischer, J. Bigun. Face authentication with Gabor information on deformable graphs. IEEE Transactions on Image Processing, 8(4):504–516, 1999.

[7] I. R. Fasel, M. S. Bartlett, J. R. Movellan. A comparison of Gabor methods for automatic detection of facial landmarks. International Conference on Automatic Face and Gesture Recognition, pp. 242–248, May 2002.

[8] A.-A. Bhuiyan, C. H. Liu. On Face Recognition using Gabor Filters. Proceedings of World Academy of Science, Engineering and Technology, vol. 22, July 2007.

[9] D. A. Clausi, M. E. Jernigan. Designing Gabor filters for optimal texture separability. Pattern Recognition, 33 (2000), pp. 1835–1849.

[10] J. Bigun. Circular Symmetry Models in Image Processing. Linkoping Studies in Science and Technology, Thesis No. 85, LIU-TEK-LIC-1986:25, Linkoping University, Sweden, September 1986.

[11] J. Bigun. Pattern recognition in images by symmetries and coordinate transformations. Computer Vision and Image Understanding, vol. 68, no. 3, pp. 290–307, 1997.

[12] P. Yang, B. Du, S. Shan, W. Gao. A novel pupil localization method based on GaborEye model and radial symmetry operator. 2004 International Conference on Image Processing (ICIP), 0-7803-8554-3/04.

[13] H. Kim, J. H. Lee, S. C. Kee. A Fast Eye Localization Method for Face Recognition. Proceedings of the 2004 IEEE International Workshop on Robot and Human Interactive Communication, Kurashiki, Okayama, Japan, September 20–22, 2004.

[14] G. Du, F. Su, A. Cai. Eye Location under Various Illumination Conditions. Proceedings of the International Multi-Conference on Computing in the Global Information Technology (ICCGI'06), 0-7695-2629-2/06.

[15] P. Wang, M. B. Green, Q. Ji. Automatic Eye Detection and Its Validation. Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 1063-6919/05.

[16] G.-S. Yang, T. Wang, H.-L. Zhang. Eye location method based on Gabor wavelet and topographic feature extraction. Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, Kunming, 12–15 July 2008, 978-1-4244-2096-4/08.

[17] Y. Ma, X. Ding, Z. Wang, N. Wang. Robust precise eye location under probabilistic framework. Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition (FGR'04), 0-7695-2122-3/04.

[18] S.-H. Lin, S.-Y. Kung, L.-J. Lin. Face Recognition/Detection by Probabilistic Decision-Based Neural Network. IEEE Transactions on Neural Networks, vol. 8, no. 1, January 1997.

[19] J. Huang, H. Wechsler. Visual Routines for Eye Location Using Learning and Evolution. IEEE Transactions on Evolutionary Computation, vol. 4, no. 1, April 2000.

[20] G. Du. Eye location method based on symmetry analysis and high-order fractal feature. IEE Proceedings - Vision, Image and Signal Processing, vol. 153, no. 1, February 2006.

[21] Y. Zhang, N. Sun, Y. Gao, M. Cao. A new eye location method based on Ring Gabor Filter. Proceedings of the IEEE International Conference on Automation and Logistics, Qingdao, China, September 2008, 978-1-4244-2503-7/08.

[22] S. Li, D. Liu, L. Shen. Eye Location Using Gabor Transform. Measurement and Control Techniques, 2006, 25(5): 27–29.

[23] R. Buse, Z.-Q. Liu. Feature measurement and analysis using Gabor filters. International Conference, vol. 4, pp. 2447–2450, 9–12 May 1995.

