Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, Kunming, 12-15 July 2008

EYE BLINK DETECTION BASED ON MULTIPLE GABOR RESPONSE WAVES

JIANG-WEI LI

Intelligence Recognition and Image Processing Laboratory, Beihang University, Beijing, P.R.China E-MAIL: [email protected]

Abstract: Eye blink detection is useful in many applications, such as human-computer interfaces and driver awareness detection. Here we consider eye blink as evidence of liveness in order to reject fakes, mainly 2D photographs, used to spoof face recognition systems. The detection of eye blink is done by signal analysis of multi-scale and multi-orientation Gabor responses of eye images. First, based on Gabor decomposition, we derive multiple Gabor response waves (GRWs) and show that some of them are homo-responsive to the behavior of eye blink. Second, probable blinking signals contained in the GRWs are segmented out and reshaped into pulses. Third, the ratio of the union to the intersection (RUI) of the reshaped pulses is proposed as the indicator of the extent of eye blink. Our method has successfully detected eye blinks under various conditions and can be applied to liveness detection in video-based face recognition systems.

Keywords: Eye Blink Detection; Gabor Decomposition; Live Face Detection

1. Introduction

Eye blink may be one of the most important eye-related behaviors because of its potential applications in many fields, including human-computer interfaces [1,7], driver awareness monitoring [2], and psychological analysis [4].

A few methods have been proposed for blink detection. In Fasel et al. [9], many open and closed eyes were used to train a blink detector with a boosted cascade classifier structure. Some [8,10] used optical flow to determine whether a blink has occurred. In other work [1,7], blink detection was based on correlation scores computed against online or offline templates. Note that some of these methods [1,7,9] are appearance-based, so they depend heavily on precise eye localization and have difficulty handling weak blinks.

Recently, personal identification based on biometrics has made great progress with the advances of information technology. To protect the authentication process, biometric systems should be able to handle the likely case in which an attacker uses a copy of a biometric to fool the system. This functionality is termed “liveness detection”.

Face recognition from still images or video is one of the most widely used biometric authentication technologies. Face recognition systems should likewise have the capability of live face detection [6,11,13,14]. In some methods [11,13], people were requested to speak some words or act as prompted by the system. Other work [6,14] observed that human faces are 3D structures while most fakes are 2D structures and used this cue for liveness detection. Most of these methods need either complex computation or much cooperation from users.

In this paper, we propose a novel approach for eye blink detection. It is based on the observation that, for a live face, the edges along some scales and orientations vary homo-responsively to the behavior of eye blink. By analyzing the varying tendency of multi-scale and multi-orientation Gabor response waves (GRWs), five key Gabor response waves (KGRWs) are selected from the wave set since they are most homo-responsive to the behavior of eye blink. Then we intercept the probable blinking signals from the KGRWs and reshape them into pulses to improve the signal-to-noise ratio (SNR). Finally, the ratio of the union to the intersection of the reshaped pulses (RUI) is proposed as the detection parameter. Its value indicates the extent of eye blink.

Unlike previous methods [1,2,7-10], our purpose is to apply eye blink to liveness detection, so the method focuses only on determining the occurrence of eye blink as accurately as possible. Compared with them, our method is more reliable because it fuses multiple KGRWs for blink detection. Every KGRW can be regarded as a weak detector, and fusing them can improve the final detection rate. Even if the eye blinks weakly, the method can still detect the blinking signal successfully. Furthermore, our method is robust to eye location error because each index in a GRW is the sum of edge energies, which is insensitive to eye location. We apply the method to liveness detection in video. Experimental results show that the proposed method is very promising for rejecting fake faces that show no eye blink.

978-1-4244-2096-4/08/$25.00 ©2008 IEEE

2. Motivation

Researchers have explained why humans blink from both physiological and psychological points of view [12]. Physiologically, humans have to blink to cleanse and moisten the eye; normally we blink every four to six seconds. But blinking is also influenced psychologically. When we pay attention to something, such as reading interesting material, we blink an average of three to eight times per minute, as opposed to about 15 times per minute when we are not engaged in an attention-demanding activity. Even at the lowest rate of three blinks per minute, at least one blink is therefore expected in every 20-second interval, whatever the user's psychological state. So detecting eye blink is an effective way to verify liveness, and the requirement of at least 20 seconds of observation can be easily satisfied in a video-based face recognition system.

3. Our approach

Our approach for eye blink detection is divided into 4 steps: (1) Gabor Decomposition; (2) KGRWs’ Selection; (3) Wave Reshaping; (4) Detection Parameter Proposition.

3.1. Gabor decomposition

Gabor wavelets [5] are a favorable tool for image decomposition and representation, since they capture local features with location selectivity, orientation selectivity and spatial frequency selectivity. The typical Gabor kernels can be defined as:

$$\psi(x, y \mid u, v) = \frac{\|k_{u,v}\|^2}{\sigma^2}\, e^{-\frac{\|k_{u,v}\|^2 (x^2 + y^2)}{2\sigma^2}} \left[ e^{i\, k_{u,v} \cdot (x, y)} - e^{-\frac{\sigma^2}{2}} \right] \qquad (1)$$

where $x$ and $y$ are the coordinates of the pixel in the image, $u$ and $v$ the scale and orientation of the Gabor filter, $\sigma$ the standard deviation of the Gaussian function, and $k_{u,v}$ is defined as:

$$k_{u,v} = \frac{k_{\max}}{f^{u}}\, e^{i \pi v / 8} \qquad (2)$$

where $k_{\max}$ is the maximum frequency, and $f$ is a scaling factor controlling the spacing between two adjacent Gabor kernels. Generally, we use 40 Gabor kernels at 5 scales, $u \in \{0, 1, 2, 3, 4\}$, and 8 orientations, $v \in \{0, 1, 2, 3, 4, 5, 6, 7\}$.
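As an illustration of Eqs. (1) and (2), the following minimal sketch samples the 40 kernels on a small grid using only the Python standard library. The grid size and the values of `sigma`, `k_max` and `f` are assumptions chosen for illustration (the paper does not state them), and the complex wave vector is treated as a 2D vector whose components are its real and imaginary parts.

```python
import cmath
import math


def gabor_kernel(u, v, size=9, sigma=2 * math.pi, k_max=math.pi / 2, f=math.sqrt(2)):
    """Sample psi(x, y | u, v) of Eq. (1) on a size x size grid centred at the origin.

    size, sigma, k_max and f are illustrative assumptions, not values from the paper.
    """
    k = (k_max / f ** u) * cmath.exp(1j * math.pi * v / 8)  # Eq. (2)
    k_sq = abs(k) ** 2
    dc = math.exp(-sigma ** 2 / 2)  # term subtracted inside the brackets of Eq. (1)
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            envelope = (k_sq / sigma ** 2) * math.exp(-k_sq * (x * x + y * y) / (2 * sigma ** 2))
            # k . (x, y): real part pairs with x, imaginary part with y (assumption).
            carrier = cmath.exp(1j * (k.real * x + k.imag * y)) - dc
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


# 5 scales x 8 orientations = 40 kernels, indexed by (u, v).
bank = {(u, v): gabor_kernel(u, v) for u in range(5) for v in range(8)}
```

Each entry of `bank` is a nested list of complex numbers; a larger `size` would be needed in practice for the coarser scales, where the Gaussian envelope is wider.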

Figure 1. Gabor decomposition on eye image

Gabor decomposition of an image seeks a series of Gabor responses obtained with the Gabor kernels. Given an eye image $I(x, y)$, the Gabor decomposition of $I(x, y)$ is defined as:

$$G(x, y \mid u, v) = I(x, y) \otimes \psi(x, y \mid u, v) \qquad (3)$$

where $\otimes$ denotes the convolution operation. Figure 1 is an example of response maps, where each pixel in the maps represents the norm of $G(x, y)$. To measure how much edge energy is contained in an eye image for given $u$ and $v$, we define $O(u, v)$ as the sum of $G(x, y)$ within the image:

$$O(u, v) = \sum_{x, y} G(x, y \mid u, v) \qquad (4)$$

Gabor wavelets have favorable properties of frequency (scale) selectivity and orientation selectivity. A Gabor wavelet responds strongly to an edge if the location, scale, and orientation of the edge match those of the Gabor kernel. The eye contains abundant edges with various orientations, and Gabor wavelets are a fine-grained tool because they can analyze an edge at a certain location, frequency, and orientation. The total response $O(u, v)$ is actually the total edge energy along certain $u$ and $v$, so it is insensitive to eye location error.
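The edge energy of Eqs. (3) and (4) can be sketched as a direct convolution followed by a sum. This is a minimal sketch under two assumptions: the magnitude of the complex response is what gets summed (the text describes the response maps in terms of the norm of G), and only positions where the kernel fits entirely inside the image are used. The function name `edge_energy` is hypothetical.

```python
def edge_energy(image, kernel):
    """O(u, v): sum of |G(x, y | u, v)| over all 'valid' kernel positions.

    image and kernel are nested lists (rows of numbers); kernel may be complex.
    Summing the magnitude is an assumption made for this sketch.
    """
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    total = 0.0
    for top in range(ih - kh + 1):
        for left in range(iw - kw + 1):
            acc = 0j
            for j in range(kh):
                for i in range(kw):
                    # Convolution flips the kernel in both axes.
                    acc += image[top + j][left + i] * kernel[kh - 1 - j][kw - 1 - i]
            total += abs(acc)
    return total


# Toy check: the same vertical edge at two positions gives the same energy.
difference_kernel = [[1, -1]]
edge_at_2 = [[0, 0, 1, 1], [0, 0, 1, 1]]
edge_at_1 = [[0, 1, 1, 1], [0, 1, 1, 1]]
energy_a = edge_energy(edge_at_2, difference_kernel)
energy_b = edge_energy(edge_at_1, difference_kernel)
```

In the toy check the edge moves by one pixel but the summed response is unchanged, which is the property the text relies on when it says the total response is insensitive to eye location error.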

3.2. KGRWs’ selection

In Gabor decomposition, the eye region is cut from the face image and the energy term $O(u, v)$ is then estimated on each scale and orientation, frame by frame, so as to form the Gabor response wave (GRW) $S(u, v)$, which is the collection of $O(u, v)$ over a video:

$$S(u, v) = \{ O_1(u, v), O_2(u, v), \ldots, O_n(u, v), \ldots, O_N(u, v) \} \qquad (5)$$

where $N$ is the length of the video. Figure 2 is an example of 40 GRWs on 5 scales and 8 orientations.

Figure 2. An example of GRWs

In this example, the eye blinks only once in the video and the length of the video $N$ is 100. Figure 2 shows some typical eye images selected from the video and draws the curves of the 40 GRWs, which reflect the variation of $O(u, v)$ over time. The vertical axis is the magnitude of $O(u, v)$, while the horizontal axis is the time index of the video.

Note that in Figure 2, some GRWs, e.g., $S(3, 0)$ to $S(3, 2)$, exhibit the same varying tendency when the eye is blinking. For these GRWs, there is an obvious dip in the wave when the eye closes. The behavior of eye blink consists of three sub-behaviors: eye open, eye closing, and then eye opening again. When the eye closes, $S(u, v)$ along certain $u$ and $v$ decreases to a low level, since the closed eye contains fewer edges than the open eye. When the eye opens again, $S(u, v)$ recovers to its previous level. In our method, five key Gabor response waves (KGRWs), i.e., $S(3, 0)$, $S(3, 1)$, $S(3, 2)$, $S(3, 6)$ and $S(3, 7)$, are selected from the wave set since they are most homo-responsive to the behavior of eye blink. The underlying reasons why only these KGRWs are chosen are: (1) their scale is closer to the radius of the normalized eye, so they can reflect the overall edge variations when the eye blinks; (2) the edges along their orientations are more sensitive to the behavior of blink.
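The construction of the GRWs in Eq. (5) and the selection of the five KGRWs can be sketched as follows. The input layout (one dictionary of energies per frame) and the function names are hypothetical choices for this sketch; only the five (u, v) pairs come from the text.

```python
# (u, v) index pairs of the five KGRWs named in the text.
KGRW_KEYS = [(3, 0), (3, 1), (3, 2), (3, 6), (3, 7)]


def build_grws(frame_energies):
    """Collect per-frame energies into waves S(u, v), as in Eq. (5).

    frame_energies: list with one dict per frame, mapping (u, v) -> O(u, v).
    Returns a dict mapping (u, v) -> list of N values ordered by frame index.
    """
    waves = {}
    for energies in frame_energies:
        for key, value in energies.items():
            waves.setdefault(key, []).append(value)
    return waves


def select_kgrws(waves):
    """Return the five key waves in the order of KGRW_KEYS."""
    return [waves[key] for key in KGRW_KEYS]


# Synthetic energies for a 3-frame video, only to show the data flow.
frames = [
    {(u, v): float(n + u + v) for u in range(5) for v in range(8)}
    for n in range(3)
]
waves = build_grws(frames)
kgrws = select_kgrws(waves)
```

In a full pipeline, each per-frame dictionary would be filled by evaluating the edge energy of the cropped eye region against every kernel in the bank.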

Figure 3 shows four more examples that further illustrate this homo-responsive principle. In order to observe KGRWs arising from other motions and to apply our method to liveness detection, the two videos in the top examples are captured by moving two photos with different resolutions in front of the camera, and the remaining two are captured from live subjects with eye blink. For the bottom two cases, there are obvious dips occurring at the same place in all KGRWs, while no such dips appear in the top two cases. Even if the eye blinks very weakly, as in the last example, all KGRWs are still homo-responsive to this behavior, whereas such a weak blink would be very difficult for the previous blink detection approaches. Because eye blink is one of the physiological characteristics of humans, and because Figure 3 shows an obvious difference between the KGRWs from photos and those from live subjects, we believe our approach can be used for liveness detection, mainly for resisting attacks that use a photo to spoof face recognition systems.

Figure 3. Some videos and their KGRWs

Figure 4. Some results of reshaped waves

3.3. Wave reshaping

By observing the KGRWs, we can intuitively infer whether a video contains a blink signal or not. However, these KGRWs contain not only the useful signal of eye blink but also useless signals caused by noise, eyeball movement, image deformation, and so on. To distinguish eye blink from the combined signals, it is necessary to segment out the probable blinking signals. Furthermore, as shown by the bottom two cases in Figure 3, the dip corresponding to eye blink is not very sharp because of noise. In our method, we reshape the segmented signal of eye blink into rectangular waves to improve the signal-to-noise ratio (SNR). Figure 4 shows the reshaping results of the KGRWs in Figure 3. The steps are as follows:

(1) Apply a median filter to the KGRWs to remove high-frequency noise;

(2) Sum up all KGRWs and find the time index $p$ corresponding to the minimum sum. For each KGRW, take a rectangular window centered at $p$ with a fixed radius; this window may contain the signal of interest, i.e., the eye blink;

(3) Segment the signal of interest out of each KGRW;

(4) For each segmented signal, estimate its minimum and the minimum of the corresponding KGRW. If the difference between the two minima is larger than a certain threshold, reset the segmented signal to zero. Otherwise, calculate the mean of the segmented signal, and set each sample to one if $O_n(u, v) > \text{mean}$ and to zero otherwise.
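The four reshaping steps can be sketched as below. The median filter radius, the window radius and the threshold on the difference of minima are not given in the text, so `filter_radius`, `window_radius` and `min_gap` are hypothetical parameters with arbitrary default values.

```python
from statistics import median


def median_filter(wave, radius=1):
    """Sliding median; the window is truncated at both ends of the wave."""
    n = len(wave)
    return [median(wave[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def reshape_kgrws(kgrws, window_radius=10, min_gap=0.2, filter_radius=1):
    """Turn each KGRW into a 0/1 pulse around the most likely blink.

    window_radius, min_gap and filter_radius are assumed values for illustration.
    """
    smoothed = [median_filter(w, filter_radius) for w in kgrws]      # step (1)
    n = len(smoothed[0])
    total = [sum(w[t] for w in smoothed) for t in range(n)]
    p = min(range(n), key=lambda t: total[t])                        # step (2)
    lo, hi = max(0, p - window_radius), min(n, p + window_radius + 1)
    pulses = []
    for wave in smoothed:
        segment = wave[lo:hi]                                        # step (3)
        if min(segment) - min(wave) > min_gap:                       # step (4)
            # The deepest dip of this wave lies outside the window: no blink evidence.
            pulses.append([0] * len(segment))
        else:
            mean = sum(segment) / len(segment)
            pulses.append([1 if value > mean else 0 for value in segment])
    return pulses


# Synthetic waves: level 1.0 with a dip to 0.2 during frames 18..22.
blink_waves = [[1.0] * 18 + [0.2] * 5 + [1.0] * 17 for _ in range(5)]
blink_pulses = reshape_kgrws(blink_waves)

# Flat waves (no dip at all) collapse to all-zero pulses.
flat_pulses = reshape_kgrws([[1.0] * 40 for _ in range(5)])
```

With these synthetic inputs, the dip is mapped to a run of zeros inside an otherwise all-ones window, while a flat wave has no sample above its own mean and is therefore mapped to zeros everywhere.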

3.4. Detection parameter proposition

To enhance the reliability of blink detection, the detection parameter should fuse all these five reshaped waves. The detection parameter is defined as:

$$RUI = \begin{cases} 0, & l\left(\bigcap_{i=1}^{5} RW_i\right) = L \\[2ex] \dfrac{l\left(\bigcup_{i=1}^{5} RW_i\right)}{l\left(\bigcap_{i=1}^{5} RW_i\right)}, & l\left(\bigcap_{i=1}^{5} RW_i\right) \neq L \end{cases} \qquad (6)$$

where $RW_i$ is the $i$th reshaped wave, $i = 1, 2, 3, 4, 5$, $l(\cdot)$ is the operator that returns the width of the zero signal, and $L$ is fixed as the width of the reshaped waves. So RUI is the ratio of the union to the intersection of the reshaped waves, varying from 0 to 1. It represents the degree of overlap of the five reshaped waves and quantitatively reflects the extent of eye blink: the larger the value of RUI, the more obvious the eye blink. For the four examples in Figure 3, the corresponding RUI values are 0, 0, 91.67%, and 84.62%, respectively.
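A minimal sketch of Eq. (6) for 0/1 pulses follows, assuming that the union and intersection of reshaped waves are taken sample by sample as logical OR and AND, and that l(.) counts the zero samples. The guard for an intersection with no zeros at all is an addition of this sketch, not part of the formula.

```python
def zero_width(wave):
    """l(.): number of samples equal to zero."""
    return sum(1 for value in wave if value == 0)


def rui(pulses):
    """Ratio of the union to the intersection of reshaped waves, Eq. (6)."""
    length = len(pulses[0])
    union = [int(any(column)) for column in zip(*pulses)]         # sample-wise OR
    intersection = [int(all(column)) for column in zip(*pulses)]  # sample-wise AND
    inter_zeros = zero_width(intersection)
    # inter_zeros == 0 cannot be divided by; returning 0 there is an assumption.
    if inter_zeros == length or inter_zeros == 0:
        return 0.0
    return zero_width(union) / inter_zeros


# Three pulses with 12 zero samples and two with 11, all of width L = 20.
wide = [1] * 5 + [0] * 12 + [1] * 3
narrow = [1] * 5 + [0] * 11 + [1] * 4
score = rui([wide, wide, wide, narrow, narrow])

# All-zero pulses (no blink evidence in any wave) give RUI = 0.
no_blink_score = rui([[0] * 20 for _ in range(5)])
```

Under this reading the zero region of the union is where all five pulses agree on a dip, and the zero region of the intersection is where at least one pulse reports a dip, so the ratio stays between 0 and 1 and approaches 1 as the five weak detectors agree.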

4. Experimental results

To verify the performance of our algorithm, we test it on a collected database of 20 videos: 10 of photos and 10 of live subjects with eye blink. In front of the camera, the photos are moved or rotated randomly, and the live subjects move their heads freely. The length of each video is 20 s.

Both faces and eyes are detected automatically. The face detector is developed from the Adaboost algorithm proposed in [3]. The detector can only handle nearly frontal views. This may not be a drawback because most existing face recognition systems work best on nearly frontal faces. Note that in practice, we must also perform face recognition simultaneously to ensure that the detected face belongs to the claimed identity. Eyes in the face are located using gray-level information, a tracking technique and a correlation module. A person may blink more than once in a video; in our method, we pick out only the single most likely blink for analysis.

Figure 5a shows the experimental results. The RUI values of the live subjects are marked with blue circles, while those of the photos are marked with purple diamonds. The horizontal axis denotes the index of the tested video. The result looks very encouraging because there is a clear gap between the live subjects and the photos: the minimum RUI of the live subjects is 84.62%, while the RUI of every fake is 0%. The result demonstrates the effectiveness of the proposed method. Moreover, the wave shapes of one KGRW obtained when the eye is located with errors of 1, 2 and 3 pixels, respectively, are drawn in Figure 5b. They reveal that the location error has only a trivial effect on the shape of the KGRWs. So our method is robust to eye location error, and this property makes the method applicable in practice.

Figure 5. Experimental results: (a) The comparison of RUI values; (b) The wave shapes of S(3,0) with location errors (curves: Standard, Shift1, Shift2, Shift3)

5. Conclusions

This paper has proposed an effective algorithm for eye blink detection. Because eye blink is one of the physiological characteristics of humans, the algorithm is applied to liveness detection. The idea is based on the observation that, for a live face, the edges along some scales and orientations vary homo-responsively to the behavior of eye blink. Compared with previous approaches, our algorithm is expected to be more accurate because it fuses multiple weak detectors. Moreover, it can detect weak eye blink signals and quantitatively estimate the extent of eye blink.

In the future, we will focus on combining eye blink detection with other intrinsic human properties, e.g., 3D structure, for more reliable liveness detection.

Acknowledgements

This work was supported by Chinese Postdoctor Fund (Grant No. 230-21-90).

References

[1] K. Grauman, M. Betke, J. Gips and G. Bradski, “Communication via Eye Blinks - Detection and Duration Analysis in Real Time”, Proc. Conf. CVPR, 2001.

[2] T. Nakano, K. Sugiyama, M. Mizuno and S. Yamamoto, “Blink Measurement by Image Processing and Application to Warning of Driver’s Drowsiness in Automobiles”, IEEE Intelligent Vehicles, 1998.

[3] P. Viola and M.J. Jones, “Robust Real-time Object Detection”, Proc. Conf. ICCV, 2000.

[4] M.K. Holland and G. Tarlow, “Blinking and Mental Load”, Psychological Reports, 1972(31): 119–127.

[5] C. Liu and Wechsler, “A Gabor Feature Classifier for Face Recognition”, Proc. Conf. ICCV, 2001.

[6] T. Choudhury, B. Clarkson, T. Jebara and A. Penland, “Multimodal Person Recognition using Unconstrained Audio and Video”, Proc. Conf. AVBPA, 1999.

[7] M. Chau and M. Betke, “Blink Detection and Eye Tracking for Eye Localization”, Boston University Computer Science Technical Report No. 2005-12, 2005.

[8] T.N. Bhaskar, F.T. Keat, S. Ranganath and Y.V. Venkatesh, “Blink Detection and Eye Tracking for Eye Localization”, Proc. Conf. TENCON, 2003.

[9] I. Fasel, B. Fortenberry and J. Movellan, “A Generative Framework for Real Time Object Detection and Classification”, CVIU, 2005(98): 182–210.

[10] M.J. Black, D.J. Fleet and Y. Yacoob, “A Framework for Modeling Appearance Change in Image Sequences”, Proc. Conf. ICCV, 1998.

[11] R.W. Frischholz and A. Werner, “Avoiding Replay-Attacks in a Face Recognition System using Head-Pose Estimation”, Proc. Conf. AMFG, 2003.

[12] J. Stern, “Why Do We Blink?”, <http://www.msnbc.msn.com/id/3076704/>.

[13] G. Chetty and M. Wagner, “Liveness Verification in Audio-Video Authentication”, Proc. Conf. Speech Science & Technology, 2004.

[14] J. Li, Y. Wang, T. Tan and A.K. Jain, “Live Face Detection Based on the Analysis of Fourier Spectrums”, Proc. Conf. SPIE, Defense and Security Symposium, 2004.
