Anomaly Detection in Data Docent Xiao-Zhi Gao xiao-zhi.gao@aalto.fi.

Post on 20-Jan-2016



Outline

• Introduction

• Anomaly detection in data

• Negative Selection Algorithm (NSA)

• Application examples

• Conclusions

What are Anomalies?

• An anomaly is a pattern in the data that does not conform to the expected behavior

• Anomalies are also referred to as outliers, exceptions, peculiarities, surprises, etc.

• Anomalies translate to significant (often critical) real-life entities
– Credit card fraud
– Cyber intrusions

Simple Example

• N1 and N2 are regions of normal behavior

• Points o1 and o2 are anomalies

• Points in region O3 are anomalies
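The idea of normal regions and isolated anomalies can be sketched in a few lines of Python. This is only an illustration: the circular shape of the regions and all coordinates and radii below are hypothetical stand-ins for N1, N2, o1, and o2, not values taken from the slide.

```python
import math

# Hypothetical (center, radius) pairs standing in for normal regions N1 and N2.
NORMAL_REGIONS = [
    ((2.0, 6.0), 1.5),  # N1
    ((6.0, 2.0), 1.0),  # N2
]

def is_anomaly(point, regions=NORMAL_REGIONS):
    """Return True when the point lies outside every normal region."""
    return all(math.dist(point, center) > radius for center, radius in regions)

# Hypothetical points: two inside the normal regions, two far from both.
points = {
    "inside N1": (2.5, 6.2),
    "inside N2": (6.0, 2.4),
    "o1": (8.0, 8.0),
    "o2": (0.5, 0.5),
}
for name, p in points.items():
    print(name, "anomaly" if is_anomaly(p) else "normal")
```

In practice the normal regions are not known in advance, which is the first of the key challenges listed later in the deck.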

[Figure: scatter plot in the X-Y plane showing normal regions N1 and N2, isolated anomalies o1 and o2, and anomalous region O3]

Real World Anomalies

• Credit Card Fraud
– An abnormally high purchase made on a credit card

• Cyber Intrusions
– A web server involved in FTP traffic

Applications of Anomaly Detection

• Network intrusion detection

• Insurance / Credit card fraud detection

• Healthcare informatics / Medical diagnostics

• Industrial damage detection

• Image processing / Video surveillance

• Novel topic detection in text mining

• …

Key Challenges

• Defining a representative normal region is challenging

• The boundary between normal and outlying behavior is often not precise

• The exact notion of an outlier is different for different application domains

• Data might contain noise

• Normal behavior keeps evolving

Artificial Immune Systems (AIS)

• Artificial Immune Systems (AIS) are an emerging kind of soft computing method
– Inspired by natural immune systems
– Features of pattern recognition, anomaly detection, data analysis, machine learning, etc.

• Negative Selection Algorithm (NSA) is an important partner of AIS
– Maturation of T cells and self/nonself discrimination
– Developed by Forrest in 1994

• NSA is applied to deal with anomaly detection in data

Biological Information Processing Systems

[Figure: Genetic System, Endocrine System, Brain System, Immune System]

Natural Immune System

[Figure: defenses of the natural immune system; labels: Pathogens, Skin, Biochemical barriers, Innate immune response, Adaptive immune response, Lymphocytes]

How Does the Natural Immune System Work?

Negative Selection Algorithm (NSA)

• The immune system (B and T cells) is capable of distinguishing self from nonself
– Negative censoring of T cells in the thymus

• Negative Selection Algorithm (NSA) mimics the mechanism of the immune system
– 1. Define self samples (representative samples)
– 2. Generation of detectors (binary and real-valued)
– 3. Negative selection of detectors
– 4. Employment of detectors in anomaly detection
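The four NSA steps can be sketched for real-valued detectors as follows. This is a minimal sketch under stated assumptions, not the implementation used in the talk: the two-dimensional unit-square data space, Euclidean distance matching, the fixed radii, and the function names `generate_detectors` and `is_anomalous` are all choices made here for illustration.

```python
import math
import random

def generate_detectors(self_samples, n_detectors, self_radius, rng, max_tries=100_000):
    """Steps 2-3: draw random candidates in the unit square and keep only
    those that do not match (lie within self_radius of) any self sample."""
    detectors = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = (rng.random(), rng.random())
        if all(math.dist(candidate, s) > self_radius for s in self_samples):
            detectors.append(candidate)
    return detectors

def is_anomalous(sample, detectors, detector_radius):
    """Step 4: a sample is flagged when at least one detector matches it."""
    return any(math.dist(sample, d) <= detector_radius for d in detectors)

rng = random.Random(0)

# Step 1: hypothetical self samples clustered around (0.5, 0.5).
self_samples = [
    (0.5 + rng.uniform(-0.1, 0.1), 0.5 + rng.uniform(-0.1, 0.1))
    for _ in range(50)
]

RADIUS = 0.1  # assumed value, used both for censoring and for matching
detectors = generate_detectors(self_samples, n_detectors=500, self_radius=RADIUS, rng=rng)

print("detectors kept:", len(detectors))
print("self sample flagged:", is_anomalous(self_samples[0], detectors, RADIUS))
print("corner point flagged:", is_anomalous((0.95, 0.05), detectors, RADIUS))
```

Because every surviving detector is farther than `RADIUS` from all self samples, and matching uses the same radius, no self sample can activate a detector. Whether a given nonself point is flagged depends on how well the random detectors cover the nonself space.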

[Figure: Generation of NSA Detectors]

[Figure: Anomaly Detection Using NSA]

Self and Nonself Samples

[Figure: self samples and nonself samples]

Self and Nonself Coverage in NSA

[Figure: self and nonself samples in the unit square; both axes run from 0 to 1]

Two Examples of Random Detector Groups

[Figure: two panels showing random detector groups in the unit square; both axes run from 0 to 1]

Two Examples of Optimized Detector Groups (Gao, 2004)

[Figure: two panels showing optimized detector groups in the unit square; both axes run from 0 to 1]

Distribution of Fisher’s iris Data in Sepal Length-Sepal Width Dimensions

[Figure: scatter plot; x-axis: Sepal Length (4 to 8), y-axis: Sepal Width (2 to 4.5); classes: setosa, versicolor, virginica]

Distribution of Fisher’s iris Data in Petal Length-Petal Width Dimensions

[Figure: scatter plot; x-axis: Petal Length (0 to 7), y-axis: Petal Width (0 to 2.5); classes: setosa, versicolor, virginica]

Detector Generation in NSA (Gao, 2006)

Anomaly Detection in Fisher’s iris Data with NSA Detectors

Anomaly Detection in Chaotic Time Series

• Mackey-Glass chaotic time series

$$\dot{x}(t) = \frac{a\,x(t-\tau)}{1 + x^{c}(t-\tau)} - b\,x(t)$$

• $\tau$ controls the behavior of the Mackey-Glass time series
– $\tau = 17$ and $\tau = 30$

• Anomaly detection rate can be improved by neural networks-based NSA
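A Mackey-Glass series can be generated numerically from the delay differential equation dx/dt = a x(t − τ) / (1 + x^c(t − τ)) − b x(t). The sketch below uses simple Euler integration. The slide gives only τ = 17 and τ = 30, so the values a = 0.2, b = 0.1, c = 10 (commonly used in the literature), the step size, the constant initial history, and the function name `mackey_glass` are assumptions made here.

```python
def mackey_glass(n_steps, tau, a=0.2, b=0.1, c=10.0, dt=1.0, x0=1.2):
    """Euler integration of dx/dt = a*x(t-tau) / (1 + x(t-tau)**c) - b*x(t).

    a, b, c, dt and x0 are assumed values; only tau comes from the slides.
    """
    delay = int(round(tau / dt))
    history = [x0] * (delay + 1)  # constant initial history (assumption)
    for _ in range(n_steps):
        x_t = history[-1]
        x_delayed = history[-1 - delay]
        dx = a * x_delayed / (1.0 + x_delayed ** c) - b * x_t
        history.append(x_t + dt * dx)
    return history[delay + 1:]

series_17 = mackey_glass(500, tau=17)
series_30 = mackey_glass(500, tau=30)
print(len(series_17), len(series_30))
print(min(series_17), max(series_17))
```

A step of dt = 1 is coarse; a smaller step or a higher-order integrator would follow the continuous-time system more closely.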

Mackey-Glass Time Series

[Figure: Mackey-Glass time series for τ = 17 and τ = 30; the fresh data are marked]

Anomaly Detection in Mackey-Glass Time Series Using Adaptive NSA

Before Training: M = 57, L = 9, 86%

After Training: M = 31, L = 1, 97%

NSA-based Motor Fault Detection

• Monitoring of the working conditions of the running motors is necessary in maintaining their normal status

• Anomaly in the feature signals acquired from the faulty motors is caused by faults

• Motor fault detection is converted to a typical problem of anomaly detection

Normal, Abnormal, and Faulty Feature Signals of Motors

[Figure: three panels: Normal Feature Signal, Abnormal Feature Signal, Faulty Feature Signal]

NSA in Motor Fault Detection

[Diagram, Detector Generation Phase: Feature Signals from Healthy Motors → Signal Preprocessing → Detector Generation → Detectors]

[Diagram, Fault Detection Phase: Feature Signals from Operating Motors → Signal Preprocessing → Anomaly Detection (using the Detectors) → Fault Detection]
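The two phases can be sketched end to end on a toy signal. Everything here is an assumption made for illustration: preprocessing is reduced to cutting the signal into two-sample windows, the "fault" is simply a sine wave with a larger amplitude, and the helper names, radius, and sampling range are hypothetical.

```python
import math
import random

def sliding_windows(signal, width):
    """Cut a 1-D feature signal into overlapping windows of `width` samples."""
    return [tuple(signal[i:i + width]) for i in range(len(signal) - width + 1)]

def generate_detectors(self_windows, n_detectors, radius, rng, low=-2.0, high=2.0):
    """Detector generation phase: keep random candidates that lie farther
    than `radius` from every healthy (self) window."""
    dim = len(self_windows[0])
    detectors = []
    while len(detectors) < n_detectors:
        candidate = tuple(rng.uniform(low, high) for _ in range(dim))
        if all(math.dist(candidate, s) > radius for s in self_windows):
            detectors.append(candidate)
    return detectors

def activated_counts(windows, detectors, radius):
    """Fault detection phase: number of detectors activated by each window."""
    return [sum(math.dist(w, d) <= radius for d in detectors) for w in windows]

rng = random.Random(1)
healthy = [math.sin(2 * math.pi * t / 20) for t in range(200)]
faulty = [1.5 * math.sin(2 * math.pi * t / 20) for t in range(200)]  # toy fault

self_windows = sliding_windows(healthy, width=2)
detectors = generate_detectors(self_windows, n_detectors=300, radius=0.15, rng=rng)

healthy_counts = activated_counts(self_windows, detectors, radius=0.15)
faulty_counts = activated_counts(sliding_windows(faulty, width=2), detectors, radius=0.15)
print("max activated (healthy):", max(healthy_counts))
print("max activated (faulty):", max(faulty_counts))
```

Counting activated detectors per signal window gives the kind of quantity plotted in the detection-result figures later in the deck. The healthy windows used for censoring cannot activate any detector by construction.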

NSA-based Motor Fault Detection

• Anomaly in feature signals is caused by faults
– Healthy feature signals: self
– Faulty feature signals: nonself

• Neural networks are combined with NSA

• NSA detectors are built up on the structure of BP neural networks

• Neural networks training algorithm is applied

Neural Networks-based NSA (Gao, 2010)

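[Figure: three-layer structure of a neural networks-based NSA detector (Layer 1, Layer 2, Layer 3), with inputs $x_1, x_2, \ldots, x_N$, weights $w_1, w_2, \ldots, w_N$, activation functions $f$, weights $v_1, v_2, \ldots, v_N$, output $y$, and matching error $E$]

$$d_i = (x_i - w_i)^2$$

$$y = \sum_{i=1}^{N} v_i f(d_i) = \sum_{i=1}^{N} v_i f\left((x_i - w_i)^2\right)$$

$$E = y$$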
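The forward pass of such a detector is short enough to write out. The sketch reads the slide's formulas as d_i = (x_i − w_i)^2, y = Σ v_i f(d_i), and E = y, which is itself a reconstruction from garbled text. The activation f is not specified in the transcript, so f(d) = 1 − exp(−d) is an assumption chosen so that a perfect match gives zero error. The weight values are hypothetical.

```python
import math

def f(d):
    """Assumed activation: 0 for a perfect match, approaching 1 as d grows."""
    return 1.0 - math.exp(-d)

def matching_error(x, w, v):
    """E = y = sum_i v_i * f((x_i - w_i)**2)."""
    return sum(v_i * f((x_i - w_i) ** 2) for x_i, w_i, v_i in zip(x, w, v))

w = [0.5, 0.5]  # detector centre (hypothetical values)
v = [1.0, 1.0]  # output weights (hypothetical values)

print(matching_error([0.5, 0.5], w, v))  # input equal to w
print(matching_error([0.2, 0.8], w, v))  # input away from w
```

With this choice of f, the matching error grows as the input moves away from the detector weights w, so a small E means the detector matches the input.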

Training of Neural Networks-based NSA

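[Figure: matching error $E(\mathbf{w}, \mathbf{v})$ plotted against the weights $(\mathbf{w}, \mathbf{v})$, with the point $(\mathbf{w}^*, \mathbf{v}^*)$ marked]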

Margin Training Strategy of NSA Detectors

• Case 1 (for faulty plant feature signals only): if [condition on E, garbled in the source as "E0"], detectors are trained using the normal BP learning algorithm to decrease E; otherwise, no training is employed

• Case 2 (for healthy plant feature signals only): if [condition on E, garbled in the source as "0 E"], detectors are trained using the ‘positive’ learning algorithm to increase E; otherwise, no training is employed
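The two training directions can be sketched as a single gradient step whose sign is flipped. This is a sketch under assumptions, not the training rule from the talk: it reads the detector as E = Σ v_i f((x_i − w_i)^2), assumes f(d) = 1 − exp(−d), and uses hypothetical weights and learning rate. The margin conditions that decide when to train are not recoverable from the transcript, so the caller decides when to call `train_step`.

```python
import math

def f(d):
    return 1.0 - math.exp(-d)  # assumed activation

def f_prime(d):
    return math.exp(-d)        # derivative of the assumed activation

def matching_error(x, w, v):
    return sum(vi * f((xi - wi) ** 2) for xi, wi, vi in zip(x, w, v))

def train_step(x, w, v, lr, increase):
    """One gradient step on E.

    increase=False: ordinary BP gradient descent, decreasing E (Case 1).
    increase=True:  'positive' learning, gradient ascent on E (Case 2).
    """
    sign = 1.0 if increase else -1.0
    new_w, new_v = [], []
    for xi, wi, vi in zip(x, w, v):
        d = (xi - wi) ** 2
        grad_w = vi * f_prime(d) * (-2.0) * (xi - wi)  # dE/dw_i
        grad_v = f(d)                                  # dE/dv_i
        new_w.append(wi + sign * lr * grad_w)
        new_v.append(vi + sign * lr * grad_v)
    return new_w, new_v

x = [0.2, 0.8]                      # hypothetical feature vector
w, v = [0.5, 0.5], [1.0, 1.0]       # hypothetical detector weights
e_before = matching_error(x, w, v)

w_dec, v_dec = train_step(x, w, v, lr=0.1, increase=False)
w_inc, v_inc = train_step(x, w, v, lr=0.1, increase=True)
e_decreased = matching_error(x, w_dec, v_dec)
e_increased = matching_error(x, w_inc, v_inc)
print(e_decreased, e_before, e_increased)
```

Descent pulls the detector weights toward a faulty sample so that the detector matches it, while ascent pushes them away from a healthy sample.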

Margin Training of NSA Detectors in Fault Detection

[Figure: detector outputs y, showing training Region I and training Region II]

Inner Raceway Fault Detection of Bearings

• Bearings are important components in rotating machinery

• A defect on the inner raceway is a common and typical fault of bearings

• Fault detection is based on vibration signals of bearings
– A sensor mounted on eight-ball bearings with a motor rotation speed of 1,782 rpm

Inner Raceway Fault of Bearings

Feature Signals of Healthy and Faulty Bearings

[Figure: two panels: Healthy Bearings, Faulty Bearings]

Fault Detection Rate of Neural Networks-based NSA

Before Training: M = 27, L = 7, 79%

After Training: M = 15, L = 0, 100%

Motor Fault Detection using NSA (Gao, 2012)

• Two kinds of motor faults are considered here
– Rotor fault
– Stator fault

• Stator current signals are used as feature signals

• Both healthy and faulty motors are running with/without varying loads

• Fault detection rate is 100%

Feature Signals for Motor Fault Detection

[Figure: Healthy Motor; x-axis: Time in Seconds (0.5 to 1), y-axis: Stator Current (-150 to 150)]

[Figure: Broken Rotor; x-axis: Time in Seconds (0.5 to 1), y-axis: Stator Current (-200 to 150)]

[Figure: Broken Stator; x-axis: Time in Seconds (0.5 to 1), y-axis: Stator Current (-150 to 150)]

Motor Fault Detection Results

[Figure: Healthy Motor; x-axis: Number of Signal Windows (0 to 500), y-axis: Number of Activated Detectors (0 to 8)]

[Figure: Broken Rotor; x-axis: Number of Signal Windows (0 to 500), y-axis: Number of Activated Detectors (0 to 8)]

[Figure: Broken Stator; x-axis: Number of Signal Windows (0 to 500), y-axis: Number of Activated Detectors (0 to 8)]

Detection Results of Rotor and Stator Faults

Current Signals of Healthy Motor with Four Different Loads

[Figure: four panels (a) to (d); x-axis: Time in Seconds (0 to 10000), y-axis: Rotor Current]

Current Signals of Faulty Motor with Four Different Loads

[Figure: four panels (a) to (d); x-axis: Time in Seconds (0 to 10000), y-axis: Rotor Current]

Numbers of Activated NSA Detectors for Healthy Motor

[Figure: four panels for the healthy motor; x-axis: Number of Signal Windows (0 to 500), y-axis: Number of Activated Detectors]

Numbers of Activated NSA Detectors for Faulty Motor

[Figure: four panels for the faulty motor; x-axis: Number of Signal Windows (0 to 500), y-axis: Number of Activated Detectors]

Fault Detection Rates of Faulty Motors with Different Loads (Gao, 2013)

Conclusions

• Anomaly detection in data is an important topic

• NSA can be used for anomaly detection based on only the normal data

• A few application examples have demonstrated the effectiveness of the NSA in anomaly detection

• Performance comparisons need to be made between the NSA and other anomaly detection methods, e.g., Support Vector Machine (SVM)