
J. S. Kim et al.: Fast and Robust Algorithm of Tracking Multiple Moving Objects for Intelligent Video Surveillance Systems. IEEE Transactions on Consumer Electronics, Vol. 57, No. 3, August 2011.

Contributed Paper. Manuscript received 07/15/11. Current version published 09/19/11. Electronic version published 09/19/11. 0098-3063/11/$20.00 © 2011 IEEE

Fast and Robust Algorithm of Tracking Multiple Moving Objects for Intelligent Video Surveillance Systems

Jong Sun Kim, Dong Hae Yeom, and Young Hoon Joo

Abstract — This paper deals with an intelligent image processing method for video surveillance systems. We propose a technique for detecting and tracking multiple moving objects, which can be applied to consumer electronics such as home and business surveillance systems consisting of an internet protocol (IP) camera and a network video recorder (NVR). A real-time surveillance system needs to detect moving objects robustly against noise and environmental changes. The proposed method therefore uses red-green-blue (RGB) color background modeling with a sensitivity parameter to extract moving regions, morphology to eliminate noise, and blob-labeling to group moving objects. To track moving objects quickly, the proposed method predicts the velocity and the direction of the groups formed by moving objects. Finally, experiments show that the proposed method has the robustness against environmental influences and the speed required for a real-time surveillance system.¹

Index Terms — Multiple moving object tracking, IP camera, NVR, background modeling, morphology, blob-labeling, group tracking.

I. INTRODUCTION

Traditional video surveillance systems have the disadvantage that a person must monitor the closed-circuit television (CCTV) screens or search the digital video recorder (DVR) footage when necessary. Because of the high cost and low efficiency of such systems, the need has increased for intelligent video surveillance systems that can monitor and respond to situations in real time. In addition, video surveillance systems using IP cameras have become widespread, and an NVR enables a person to keep watching from anywhere. This paper deals with an intelligent image processing technology for home or business video surveillance systems.

The intelligent video surveillance system is a convergence technology that includes detecting and tracking objects, analyzing their movements, and responding to them [3,13]. We propose a method for detecting and tracking multiple moving objects, which are the basic technologies of intelligent video surveillance systems. To detect and track only the specific moving objects, it is important to eliminate environmental disturbances such as light scattering, leaves, and birds from the input images. Two kinds of methods are mainly used for this purpose. One is the Bayesian methods such as the particle filter (PF) or the extended Kalman filter (EKF); the other is the difference image methods such as background modeling (BM) or the Gaussian mixture model (GMM). Examples of Bayesian methods include applying PF to dynamic mixture models obtained from the parallel EKF [14] and using the kernel PF [1]. However, they are not suitable for a real-time surveillance system because of their high computational cost. Examples of difference image methods include gray-scale BM [5,6,8,9,11], RGB color BM, which prevents gray-scale BM from missing image information [15], and GMM with expectation maximization (EM), which prevents RGB color BM from distorting the color tone [7,10,12]. However, they cannot track moving objects smoothly when the objects are hidden by obstacles or the background has colors similar to theirs.

¹ This work was partially supported by the National Research Foundation of Korea Grant funded by the Korean Government (MEST) (KRF-2009-220-D00034).

Jong Sun Kim is with the School of Electronic and Information Engineering, Kunsan University, Kunsan, Chonbuk, 573-701, Korea.

Dong Hae Yeom is with the PostBK21 Team, Kunsan University, Kunsan, Chonbuk, 573-701, Korea.

Young Hoon Joo is with the Department of Control and Robotics Engineering, Kunsan University, Kunsan, Chonbuk, 573-701, Korea.

In this paper, we propose a fast and robust algorithm for detecting and tracking multiple moving objects in intelligent video surveillance systems. The proposed method is suitable for a real-time surveillance system because it is computationally fast and robust against environmental influences. To detect the moving objects, we employ RGB BM with a new sensitivity parameter to extract moving regions, morphology schemes to eliminate noise, and blob-labeling to group the moving objects. To track the groups of moving objects, we propose a tracking algorithm consisting of the prediction of the position of each group, the recognition of the same group, and the identification of newly appearing and disappearing groups. Finally, we show the efficiency and the applicability of the proposed method through experiments.

II. DETECTING MOVING OBJECTS

This section deals with the procedure of detecting moving objects from the input image. The procedure consists of the extraction stage based on RGB BM and morphology, and the grouping stage based on blob-labeling.

A. Extraction of Moving Objects

In general, the extraction of moving regions from sequential images is carried out by using the BM represented by (1). This kind of BM involves a loss of image information compared with the color BM using the RGB and hue-saturation-intensity (HSI) color space models. Fig. 1 depicts the moving regions extracted by gray-scale BM, which shows that the image information is excessively attenuated.

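$$
\begin{bmatrix} M(x,y) \\ N(x,y) \\ D(x,y) \end{bmatrix}
=
\begin{bmatrix}
\min_z V^z(x,y) \\
\max_z V^z(x,y) \\
\max_z \left| V^z(x,y) - V^{z-1}(x,y) \right|
\end{bmatrix},
\quad
\left| V^z(x,y) - \lambda(x,y) \right| < 2\sigma(x,y),
\tag{1}
$$

where $V^z(x,y)$ is the gray level of the $z$th input image at the pixel position $(x,y)$, and $\lambda(x,y)$ and $\sigma(x,y)$ are the median and the standard deviation of all input images at the corresponding pixel, respectively [5].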

(a) Input image (b) Result image

Fig. 1. Extraction of moving regions by gray-scale BM
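The gray-scale background model of (1) can be sketched in a few lines. The following is a minimal illustration only, assuming frames are given as nested Python lists of gray levels; the function name `train_gray_background`, the fallback for constant pixels, and the rule that a difference is counted only between two consecutive retained samples are assumptions of this sketch, not details given in the paper.

```python
from statistics import median, pstdev


def train_gray_background(frames):
    """Per-pixel minimum M, maximum N, and largest inter-frame difference D.

    frames: list of 2-D lists of gray levels, one per training frame.
    A sample is used only if it lies within two standard deviations of the
    per-pixel median, which is the condition attached to (1).
    """
    height, width = len(frames[0]), len(frames[0][0])
    M = [[0] * width for _ in range(height)]
    N = [[0] * width for _ in range(height)]
    D = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            series = [frame[y][x] for frame in frames]
            lam, sigma = median(series), pstdev(series)
            kept = {z for z, v in enumerate(series) if abs(v - lam) < 2 * sigma}
            if not kept:
                # Assumption: a constant pixel (sigma == 0) keeps all samples.
                kept = set(range(len(series)))
            values = [series[z] for z in kept]
            M[y][x], N[y][x] = min(values), max(values)
            # Differences between consecutive retained samples only.
            diffs = [abs(series[z] - series[z - 1]) for z in kept if z - 1 in kept]
            D[y][x] = max(diffs, default=0)
    return M, N, D
```

Because only one gray level per pixel is modeled, two differently colored surfaces with the same intensity are indistinguishable to this model, which is the information loss that motivates the RGB model.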

In this paper, the RGB color model is employed to prevent this excessive attenuation. The RGB color model also has a shorter execution time because no additional image transformation is required. However, it has the crucial disadvantage of being very sensitive to even small changes caused by light scattering or reflection. The sensitivity parameter $\delta$ is proposed to overcome this problem, as shown in (2).

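$$
\begin{bmatrix} M_i(x,y) \\ N_i(x,y) \end{bmatrix}
=
\begin{bmatrix}
\min_z V_i^z(x,y) - \delta \\
\max_z V_i^z(x,y) + \delta
\end{bmatrix},
\quad i \in \{r, g, b\}, \quad 0 \le \delta \le 255,
\tag{2}
$$

where the $V_i^z$ are the corresponding RGB values of the $z$th input image and $\delta$ is the sensitivity parameter. From the result of the RGB color model with $\delta$, the change of color tone $B_i^z$ in each pixel of the sequential images $V_i^z(x,y)$ is obtained by (3).

$$
B_i^z(x,y) =
\begin{cases}
V_i^z(x,y), & V_i^z(x,y) < M_i(x,y) \ \text{or} \ V_i^z(x,y) > N_i(x,y) \\
0, & \text{else}
\end{cases}
\tag{3}
$$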

The moving regions extracted by (3) are affected by the sensitivity parameter $\delta$, as shown in Fig. 2. To obtain the best image, this parameter can be adjusted according to the circumstances where the camera is installed. In our case, the best value is $\delta = 18 \pm 2$.
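The effect of the sensitivity parameter can be illustrated with a short sketch of (2) and (3). It assumes that the background band of each color channel is widened by the parameter on both sides, and that images are nested lists of `(r, g, b)` tuples; the function names `train_rgb_background` and `extract_moving` are hypothetical.

```python
def train_rgb_background(frames, delta):
    """Per-pixel, per-channel band [min - delta, max + delta] as in (2).

    frames: list of images, where image[y][x] is an (r, g, b) tuple.
    delta: sensitivity parameter, assumed to satisfy 0 <= delta <= 255.
    """
    if not 0 <= delta <= 255:
        raise ValueError("delta must lie in [0, 255]")
    height, width = len(frames[0]), len(frames[0][0])
    M = [[tuple(min(f[y][x][i] for f in frames) - delta for i in range(3))
          for x in range(width)] for y in range(height)]
    N = [[tuple(max(f[y][x][i] for f in frames) + delta for i in range(3))
          for x in range(width)] for y in range(height)]
    return M, N


def extract_moving(image, M, N):
    """Keep a channel value only when it falls outside its band, as in (3)."""
    return [[tuple(v if (v < lo or v > hi) else 0
                   for v, lo, hi in zip(image[y][x], M[y][x], N[y][x]))
             for x in range(len(image[0]))] for y in range(len(image))]
```

With a larger value of the parameter the band grows, so small illumination changes are suppressed, but genuine moving pixels whose colors are close to the background are suppressed as well.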

The noise caused by light scattering or reflection can be eliminated by the proposed sensitivity parameter $\delta$. However, the parameter must be made larger to eliminate the noise caused by natural objects such as leaves and birds, which leads to extra attenuation of the moving regions, as shown in Fig. 2-(d). Morphology, one of the geometric image processing schemes, is therefore used to deal with this kind of noise, which appears as the clusters of pixels indicated by the arrows in Fig. 3-(a). The erosion operation of morphology removes the irregularly spread noise, and the dilation operation recovers the moving regions lost during the erosion [4].

(a) Input image (b) RGB BM with $\delta = 0$

(c) RGB BM with $\delta = 10$ (d) RGB BM with $\delta = 30$

Fig. 2. The results of RGB BM according to the sensitivity parameter $\delta$

Fig. 3 depicts the procedure of morphology. The image extracted by adjusting $\delta$ is binarized by (4), and the erosion and the dilation operations are then applied in order, each to the result of the previous step.

$$
BIN_i^z(x,y) =
\begin{cases}
255, & B_i^z(x,y) > 0 \\
0, & \text{else}
\end{cases}
\tag{4}
$$

(a) Extracted image by RGB BM (b) Binary result

(c) Erosion result (d) Dilation result

Fig. 3. Elimination of natural objects by using morphology
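The binarization, erosion, and dilation sequence can be sketched as follows. The paper does not state the structuring element or how the three color channels are merged, so this sketch assumes a 3 × 3 square element, treats pixels outside the image as background, and marks a pixel as foreground when any channel is nonzero; all function names are hypothetical.

```python
def binarize(B):
    """255 where any color channel of the extracted image is nonzero, else 0."""
    return [[255 if any(pixel) else 0 for pixel in row] for row in B]


def _neighbourhood(img, y, x):
    """Values in the 3 x 3 window around (y, x); outside the image counts as 0."""
    h, w = len(img), len(img[0])
    return [img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
            for yy in (y - 1, y, y + 1) for xx in (x - 1, x, x + 1)]


def erode(img):
    """A pixel survives only if its whole 3 x 3 window is foreground."""
    return [[255 if all(v == 255 for v in _neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]


def dilate(img):
    """A pixel becomes foreground if any pixel in its 3 x 3 window is foreground."""
    return [[255 if any(v == 255 for v in _neighbourhood(img, y, x)) else 0
             for x in range(len(img[0]))] for y in range(len(img))]
```

Applying `dilate(erode(img))` removes isolated specks smaller than the structuring element while restoring the outline of regions that survive the erosion.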

Comparing Fig. 1-(b) with Fig. 3-(d), the proposed method, which combines RGB BM with the sensitivity parameter $\delta$ and morphology, is superior to gray-scale BM in extraction performance under environmental influences.


B. Grouping Moving Objects

The tracking performance deteriorates when each moving object extracted by RGB BM and morphology is tracked individually because the extracted moving regions may be hidden by obstacles as shown in the dotted box of Fig. 4 and be confused with something in similar colors as shown in the solid box of Fig. 4. In addition, the individual tracking of neighboring or overlapping objects requires a lot of computational capacity and may cause misidentification.

In this paper, the group tracking is used to prevent the aforementioned problems of the individual tracking. Before tracking the groups, a grouping scheme is required to classify moving objects into several groups. The 4-directional blob-labeling is employed to group moving objects as shown in Fig. 5, which is suitable for the real-time system because it is implemented easily and needs low computational cost [2]. The yellow boxes in Fig. 6 are the resulting groups of the 4-directional blob-labeling.


Fig. 4. The information loss of the extracted image

Fig. 5. 4-directional blob-labeling


Fig. 6. Grouping moving objects by blob-labeling
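As an illustration of 4-directional grouping, the sketch below labels the connected regions of a binary image and returns one bounding box per group. It uses a plain breadth-first flood fill rather than the contour-tracing algorithm of [2], and the function name `label_blobs` and the box format `(x_min, y_min, x_max, y_max)` are assumptions of this sketch.

```python
from collections import deque


def label_blobs(binary):
    """Label 4-connected foreground regions of a binary image.

    Returns (labels, boxes): labels[y][x] is 0 for background or the group
    number (starting at 1); boxes[k - 1] is the bounding box of group k.
    """
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 or labels[y][x]:
                continue
            label = len(boxes) + 1
            labels[y][x] = label
            queue = deque([(y, x)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cy, cx = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                # Visit only the up, down, left, and right neighbours.
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            boxes.append((x0, y0, x1, y1))
    return labels, boxes
```

Each pixel is enqueued at most once, so the cost is linear in the number of pixels. Pixels that touch only diagonally end up in different groups under 4-directional connectivity.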

III. TRACKING MOVING OBJECTS

This section deals with the procedure of tracking the groups detected in the previous section. The procedure consists of predicting the position of each group, recognizing the homogeneity of each group in the sequential frames, and identifying the newly appearing and disappearing groups.

First, some terms are defined as follows: $CG^l$ is the position of the $l$th group among the $n$ unidentified groups in the current frame, and $IG_j^k$ is the position of group $k$ among the $h$ identified groups in the $j$th frame of the previous $m$ frames, where the indices satisfy $1 \le l \le n$, $1 \le j \le m$, and $1 \le k \le h$. From this information, the variation $IGD_j^k$ and the predicted position $PG_j^k$ of group $k$ in the $j$th frame are obtained by using (5).

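$$
\begin{aligned}
PG_j^k(x) &= IG_{j-1}^k(x) + \cos\!\left( IGD_{j-1}^k(\theta) \right) IGD_{j-1}^k(v) \\
PG_j^k(y) &= IG_{j-1}^k(y) + \sin\!\left( IGD_{j-1}^k(\theta) \right) IGD_{j-1}^k(v) \\
IGD_j^k(v) &= \frac{\left\| IG_j^k - IG_{j-1}^k \right\|}{1 / \text{Frame Rate}} \\
IGD_j^k(\theta) &=
\begin{cases}
\cos^{-1} \dfrac{IG_j^k(x) - IG_{j-1}^k(x)}{\left\| IG_j^k - IG_{j-1}^k \right\|}, & IG_j^k(y) - IG_{j-1}^k(y) \ge 0 \\[2ex]
-\cos^{-1} \dfrac{IG_j^k(x) - IG_{j-1}^k(x)}{\left\| IG_j^k - IG_{j-1}^k \right\|}, & \text{else}
\end{cases}
\end{aligned}
\tag{5}
$$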

where $PG_j^k(x)$, $PG_j^k(y)$, $IG_j^k(x)$, and $IG_j^k(y)$ denote the x-axis and the y-axis values of the corresponding position vectors, respectively, and $IGD_j^k(v)$ and $IGD_j^k(\theta)$ denote the velocity and the direction of the variation vector $IGD_j^k$. The geometric construction of the variation vector and the predicted position vector is shown in Fig. 7.

Fig. 7. The variation vector and the predicted position vector
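The prediction step can be sketched as a constant-velocity extrapolation from the two most recent positions of a group. This is only an illustration: the function names `variation` and `predict` are hypothetical, the sign convention of the angle is an assumption, and the sketch multiplies the velocity by the frame interval so that the predicted step is a displacement in pixels, which is also an assumption of the sketch.

```python
import math


def variation(prev_pos, curr_pos, frame_rate):
    """Velocity (pixels per second) and direction (radians) between two positions."""
    dx = curr_pos[0] - prev_pos[0]
    dy = curr_pos[1] - prev_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0:
        return 0.0, 0.0  # a stationary group has no meaningful direction
    velocity = dist / (1.0 / frame_rate)
    angle = math.acos(dx / dist)
    if dy < 0:
        angle = -angle  # assumed sign convention for motion toward negative y
    return velocity, angle


def predict(curr_pos, velocity, direction, frame_rate):
    """Predicted position one frame ahead under constant velocity."""
    step = velocity / frame_rate  # displacement over one frame interval (assumption)
    return (curr_pos[0] + math.cos(direction) * step,
            curr_pos[1] + math.sin(direction) * step)
```

For example, a group that moved from (10, 10) to (13, 14) between two frames is predicted to continue to approximately (16, 18) in the next frame.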

Based on the predicted information, the homogeneity between the identified groups in the previous frame and the unidentified groups in the current frame is determined. That is, the current position of group $k$ is the $CG^l$ that is closest to the predicted position $PG_j^k$ in Euclidean distance, as shown in Fig. 8.

Fig. 8. The recognition of the homogeneity by Euclidean distance
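The nearest-neighbor recognition described above can be sketched as follows. The function name `match_groups` and the data layout (a dictionary of predicted positions keyed by group ID, and a list of current positions) are assumptions of this sketch.

```python
import math


def match_groups(predicted, current):
    """Map each identified group to the index of the closest current group.

    predicted: dict mapping group ID -> predicted (x, y) position.
    current: list of (x, y) positions of the unidentified groups.
    """
    if not current:
        return {}
    return {
        gid: min(range(len(current)), key=lambda l: math.dist(p, current[l]))
        for gid, p in predicted.items()
    }
```

Note that this sketch lets two identified groups select the same current group, which is what happens when groups merge; resolving such cases belongs to the identification step.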

Meanwhile, the identification of newly appearing and disappearing groups is carried out by comparing the number $h$ of identified groups in the previous frame with the number $n$ of unidentified groups in the current frame.

When a new group appears or an existing group is divided, $n - h$ new identifications (IDs) are given to the groups whose homogeneity is not verified, as shown in Fig. 9-(a). When an existing group disappears or is combined with another group, the current groups inherit the IDs of the previous groups with which they are homogeneous, and the other $h - n$ IDs are discarded, as shown in Fig. 9-(b).

(a) Group creation

(b) Group extinction

Fig. 9. The identification of newly appearing and disappearing groups
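The creation and extinction rules can be sketched in one function. The paper does not say which ID survives when several previous groups claim the same current group, so this sketch assumes that the previous group with the closest predicted position wins; the function name `assign_ids` and the `next_id` iterator of unused IDs are hypothetical.

```python
import math
from itertools import count


def assign_ids(predicted, current, next_id):
    """Give every current group an inherited or a new ID.

    predicted: dict mapping previous group ID -> predicted (x, y) position.
    current: list of (x, y) positions of the groups in the current frame.
    next_id: iterator that yields unused IDs for newly appearing groups.
    Returns (ids, discarded): ids[l] is the ID of current group l, and
    discarded lists the previous IDs that no current group inherited.
    """
    owner = {}  # current index -> (distance, previous ID)
    for gid, p in predicted.items():
        if not current:
            break
        l = min(range(len(current)), key=lambda i: math.dist(p, current[i]))
        d = math.dist(p, current[l])
        if l not in owner or d < owner[l][0]:
            owner[l] = (d, gid)  # assumption: the closest claimant keeps its ID
    ids = [owner[l][1] if l in owner else next(next_id)
           for l in range(len(current))]
    discarded = sorted(set(predicted) - set(ids))
    return ids, discarded
```

A typical call passes `itertools.count(start)` as `next_id`, where `start` is one more than the largest ID issued so far.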

IV. IMPLEMENTATION AND EXPERIMENT

The proposed detecting-tracking method is implemented as shown in Fig. 10. The 33Mbit IP camera provides the input image with 704 × 480 pixels. The surveillance image is transmitted over the Internet, and a consumer PC with a 2.66 GHz CPU and 4 GB of RAM is used for the image signal processing and the proposed algorithm.

Fig. 10. Experimental setup

A. System Implementation

The proposed algorithm consists of two parts: detecting the moving objects and tracking them. The detecting stage is performed through the extraction of moving objects by RGB BM, the elimination of noise by morphology, and the grouping of the objects by blob-labeling, as shown in the left box of Fig. 11.

The tracking stage is activated when a moving object is detected. As shown in the right box of Fig. 11, the tracking stage uses the geometric information of the groups, namely the previous position IG, the variation IGD, the predicted position PG, and the current position CG. In sequential frames, the groups at the shortest Euclidean distance are recognized as the same ones. Finally, newly appearing and disappearing groups are identified by comparing the number of groups in each frame.

Fig. 11. The proposed detecting-tracking algorithm

B. Experiment Results

The results of the proposed method are shown in Fig. 12, where the red boxes denote the groups under tracking and the yellow circles denote the traces of the groups.

In Fig. 12-(a), new groups ① and ② enter the surveillance area. In Fig. 12-(b), a new group ③ enters the area, the group ① of Fig. 12-(a) is divided into ① and ④, and the group ② of Fig. 12-(a) is divided into ② and ⑤. In Fig. 12-(c), the groups ①, ②, ④, and ⑤ of Fig. 12-(b) are combined into ②, and a new group ⑥ enters. In Fig. 12-(d), the group ③ leaves the area. These results show the robust tracking performance of the proposed method in spite of the noise caused by light scattering and reflection, the disturbances of natural objects such as leaves and birds, and the information losses caused by obstacles and background colors.


(a) The 68th frame

(b) The 91st frame

(c) The 152nd frame

(d) The 169th frame

Fig. 12. The resulting images of the proposed method

Fig. 13 shows the error between the actual position and the predicted position of the groups formed by the moving objects. The maximum error is within 3 pixels, which shows that the proposed algorithm provides trustworthy prediction performance. Fig. 14 shows the processing time of the proposed algorithm. The average processing time per frame is 0.0842 seconds; that is, the proposed algorithm can handle more than 11 frames per second. This is fast enough for a real-time surveillance system.

Fig. 13. The error of the predicted position of each group

Fig. 14. The processing time of the proposed method
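The throughput figure quoted above follows directly from the average processing time per frame, as the short calculation below shows.

$$
\text{frame rate} = \frac{1}{0.0842\ \text{s/frame}} \approx 11.9\ \text{frames/s} > 11\ \text{frames/s}
$$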

V. CONCLUSIONS

In this paper, we proposed a technique for detecting and tracking multiple moving objects, which can be applied to consumer electronics such as home and business surveillance systems consisting of an IP camera and an NVR. The robustness and the speed of the proposed method were verified through experiments. Because of its robustness against environmental influences, the proposed method can be used regardless of the place where a camera is installed. Because of its high image processing speed, the proposed method is applicable to real-time surveillance systems. At present, the method is intended for a fixed camera. Further research on a pan-tilt-zoom (PTZ) camera is under consideration, which would make it possible to monitor a wide area with a minimal number of cameras and to track a particular moving object among many.

REFERENCES

[1] C. Chang, R. Ansari, and A. Khokhar, “Multiple Object Tracking with Kernel Particle Filter,” Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 566-573, May 2005.

[2] F. Chang, C. J. Chen, and C. J. Lu, “A Linear-time Component Labeling Algorithm Using Contour Tracing Technique,” Computer Vision and Image Understanding, Vol. 93, No. 2, pp. 206-220, 2004.

[3] A. Hampapur, L. Brown, J. Connell, A. Ekin, N. Haas, M. Lu, H. Merkl, S. Pankanti, A. Senior, C. Shu, and Y. L. Tian, “Smart Video Surveillance,” IEEE Signal Processing Magazine, Vol. 22, No. 2, pp. 38-51, Mar. 2005.

[4] R. M. Haralick, S. R. Sternberg, and X. Zhuang, “Image Analysis Using Mathematical Morphology,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-9, No. 4, pp. 532-550, 1987.

[5] I. Haritaoglu, D. Harwood, and L. S. Davis, “W4: Real-time Surveillance of People and Their Activities,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, pp. 809-830, Aug. 2000.


[6] M. Haseyama and Y. Kaga, “Two-phased Region Integration Approach for Effective Pedestrian Detection in Low Contrast Images,” IEEE International Conference on Consumer Electronics, pp. 1-2, Jan. 2008.

[7] O. Javed and M. Shah, “Tracking and Object Classification for Automated Surveillance,” 7th European Conference on Computer Vision, Lecture Notes in Computer Science 2353, pp. 343–357, 2002.

[8] S. Kang, J. Paik, A. Koschan, B. Abidi, and A. Abidi, “Real-time Video Tracking Using PTZ Cameras,” Proceedings of SPIE 6th International Conference on Quality Control by Artificial Vision, Vol. 5132, pp. 103-111, 2003.

[9] W. Lao, J. Han, and P. H. N. de With, “Automatic Video-based Human Motion Analyzer for Consumer Surveillance System,” IEEE Transactions on Consumer Electronics, Vol. 55, No. 2, pp. 591-598, May 2009.

[10] D. Makris and T. Ellis, “Automatic Learning of an Activity-based Semantic Scene Model,” Proceedings of IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 183-188, Jul. 2003.

[11] M. H. Sedky, M. Moniri, and C. C. Chibelushi, “Classification of Smart Video Surveillance Systems for Commercial Applications,” IEEE Conference on Advanced Video and Signal Based Surveillance, pp. 638-643, Sep. 2005.

[12] C. Stauffer and W. Grimson, “Learning Patterns of Activity Using Real Time Tracking,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22, No. 8, pp. 747-767, Aug. 2000.

[13] M. Valera and S. A. Velastin, “A Review of the State-of-art in Distributed Surveillance Systems,” IEE Intelligent Distributed Video Surveillance Systems, pp. 1-30, 2006.

[14] Y. Zhai, M. B. Yeary, S. Cheng, and N. Kehtarnavaz, “An Object-Tracking Algorithm Based on Multiple-model Particle Filtering with State Partitioning,” IEEE Transactions on Instrumentation and Measurement, Vol. 58, No. 5, pp. 1797-1809, May 2009.

[15] R. Zhang, S. Zhang, and S. Yu, “Moving Objects Detection Method Based on Brightness Distortion and Chromaticity Distortion,” IEEE Transactions on Consumer Electronics, Vol. 53, No. 3, pp. 1177-1185, Aug. 2007.

BIOGRAPHIES

Jong Sun Kim received the B.S. and M.S. degrees from the School of Electronics and Information Engineering, Kunsan National University, Kunsan, Korea, in 2007 and 2009, respectively. He is currently a doctoral student. His research interests include intelligent robots, human-robot interaction, and intelligent surveillance systems.

Dong Hae Yeom received the B.S. degree in Electronic Engineering from Dong-a University in 1998, the M.S. degree in Electronics and Computer Engineering from Hanyang University in 2001, and the Ph.D. degree in Electrical and Computer Science from Seoul National University in 2006. He worked for Samsung Electronics as a senior engineer. Since 2009, he has been with the PostBK21 Team, Kunsan National University, Kunsan, Korea, where he is currently a research professor. His research interests include nonlinear systems, and switching and hybrid control.

Young Hoon Joo received the B.S., M.S., and Ph.D. degrees in Electrical Engineering from Yonsei University, Seoul, Korea, in 1982, 1984, and 1995, respectively. He worked with Samsung Electronics Company, Seoul, Korea, from 1986 to 1995, as a project manager. He was with the University of Houston, Houston, TX, from 1998 to 1999, as a visiting professor in the Department of Electrical and Computer Engineering. He is currently a professor in the School of Electronic and Information Engineering, Kunsan National University, Korea. His major interests are intelligent robots, intelligent control, and human-robot interaction. He served as President of the Korea Institute of Intelligent Systems (KIIS) (2008-2009) and is serving as Editor for the International Journal of Control, Automation, and Systems (IJCAS) (2008-present).
